A Novel Multi-threshold Segmentation Approach Based on Artificial Immune System Optimization
Erik Cuevas, Valentín Osuna-Enciso, Daniel Zaldívar, and Marco Pérez-Cisneros
Departamento de Ciencias Computacionales, Av. Revolucion 1500, Guadalajara, Jal. Mexico
{erik.cuevas, daniel.zaldivar, marco.perez}@cucei.udg.mx
Abstract. Threshold selection is a critical step in computer vision. Immune systems have inspired optimization algorithms known as Artificial Immune Optimization (AIO). AIO methods have been successfully applied to solve optimization problems. The Clonal Selection Algorithm (CSA) is the most widely applied AIO method. It generates a response after an antigenic pattern is identified by an antibody. This work presents an image multi-threshold approach based on artificial immune system (AIS) optimization. The approach considers the segmentation task as an optimization process. The 1-D histogram of the image is approximated by adding several Gaussian functions whose parameters are calculated by the CSA. The mix of Gaussian functions approximates the histogram; each Gaussian function represents a pixel class (threshold point). The proposed approach is computationally efficient and does not require prior assumptions about the image. The algorithm demonstrated the ability to perform automatic threshold selection.
Keywords: Image segmentation, artificial immune system, automatic thresholding, intelligent image processing.
W. Yu and E.N. Sanchez (Eds.): Advances in Computational Intell., AISC 61, pp. 309–317. © Springer-Verlag Berlin Heidelberg 2009
1 Introduction
Biologically inspired methods can be successfully transferred into novel computational paradigms, as shown by the development and successful use of concepts such as artificial neural networks, evolutionary algorithms, swarming algorithms and so on. The human immune system (HIS) is a highly evolved, parallel and distributed adaptive system [1], which exhibits remarkable abilities that can be imported into important aspects of the field of computation. This emerging field is referred to as artificial immune systems (AIS) [2]. AIS can be defined as computational systems inspired by theoretical immunology and its observed immune functions, principles and models. AIS have recently gained considerable research interest from different communities [3], covering several aspects of optimization, pattern recognition, anomaly detection, data analysis, and machine learning. Artificial immune optimization (AIO) has been successfully applied to tackle numerous challenging optimization problems with remarkable performance in comparison to other classical techniques [4].
The clonal selection algorithm (CSA) [5] is one of the most widely employed AIO approaches. The CSA is a relatively novel evolutionary optimization algorithm which has been built on the basis of the clonal selection principle [6] of the HIS. The clonal selection principle explains the immune response when an antigenic pattern is recognized by a given antibody. The CSA is able to escape local minima while operating simultaneously over a population of points within the search space. It does not use derivatives or any related information, as it employs probabilistic transition rules instead of deterministic ones. Moreover, its implementation is simple and straightforward. The CSA has therefore been used extensively in the literature for solving several kinds of challenging engineering problems [7, 8, 9].
For many applications in image processing, abstractions of objects or features, commonly used by other high-level tasks, are derived from images. In particular, for the abstraction analysis, the pixels within the image have to be grouped into meaningful regions by applying image segmentation. In many cases, the gray level value of pixels belonging to the object is substantially different from that of pixels belonging to the background. Thresholding thus becomes a simple but effective tool to separate objects from the background, with applications such as document image analysis, which aims to extract printed characters [10, 11], logos, graphical contents, or musical scores; map processing, which seeks to identify legends and characters [12]; scene processing, which tracks a given target [13]; and quality inspection of materials [14, 15], which aims to detect defective parts.
Thresholding can be classified into two categories: bi-level and multi-level thresholding. The bi-level case segments an image into two simple pixel classes, one representing the object and the other accounting for the background pixels. The multi-level case considers images containing several distinct objects, which require multiple threshold values to accomplish segmentation.
A wide variety of thresholding approaches have been proposed for image segmentation, including conventional methods such as those in [16], [17], [18], [19] and methods based on computational intelligence techniques as in [20], [21] and [9]. When extended to multi-level thresholding, two major drawbacks may arise: (i) they have no systematic and analytic solution when the number of classes in the image increases, and (ii) the number of classes in the image is difficult to define and must be prespecified, although this parameter is unknown for many real-world applications.
In order to contribute towards solving some of these problems, this paper presents an alternative approach using an optimization algorithm based on AIS for multilevel thresholding. In traditional multilevel optimal thresholding, the intensity distributions belonging to the object or to the background pixels are assumed to follow some Gaussian probability function; therefore a combination of probability density functions is usually adopted to model these distributions. The parameters in the combination function are unknown, and the parameter estimation is typically treated as a nonlinear optimization problem [22]. The unknown parameters that give the best fit to the processed histogram are determined by using the clonal selection algorithm.
This paper is organized as follows: Section 2 presents the method following the Gaussian approximation of the histogram. Section 3 provides information about the CSA while Section 4 demonstrates the automatic threshold determination. Section 5 discusses some implementation details. Experimental results for the proposed approach are presented in Section 6, followed by the discussion summarized in Section 7.
2 Gaussian Approximation
Assume an image has $L$ gray levels $[0, \ldots, L-1]$, following a gray-level distribution which can be displayed in the form of the histogram $h(g)$. In order to simplify the description, the histogram is normalized and considered as a probability distribution function, yielding:
$$h(g) = \frac{n_g}{N}, \quad h(g) \ge 0, \quad N = \sum_{g=0}^{L-1} n_g, \quad \text{and} \quad \sum_{g=0}^{L-1} h(g) = 1 \tag{1}$$
Here $n_g$ denotes the number of pixels with gray level $g$, while $N$ is the total number of pixels in the image. The histogram function can thus be approximated by a mix of Gaussian probability functions, which yields:
$$p(x) = \sum_{i=1}^{K} P_i \cdot p_i(x) = \sum_{i=1}^{K} \frac{P_i}{\sqrt{2\pi}\,\sigma_i} \exp\left[\frac{-(x - \mu_i)^2}{2\sigma_i^2}\right] \tag{2}$$
Here $P_i$ is the a priori probability of class $i$, $p_i(x)$ is the probability distribution function of the gray-level random variable $x$ in class $i$, $\mu_i$ and $\sigma_i$ are the mean and standard deviation of the $i$-th probability distribution function, and $K$ is the number of classes within the image. In addition, the constraint $\sum_{i=1}^{K} P_i = 1$ must be satisfied. The typical mean square error criterion is used to estimate the $3K$ parameters $P_i$, $\mu_i$ and $\sigma_i$, $i = 1, \ldots, K$. The mean square error between the composite Gaussian function $p(x_j)$ and the experimental histogram function $h(x_j)$ is defined as follows:
$$E = \frac{1}{n} \sum_{j=1}^{n} \left[ p(x_j) - h(x_j) \right]^2 + \omega \cdot \left| \left( \sum_{i=1}^{K} P_i \right) - 1 \right| \tag{3}$$
Here an $n$-point histogram is assumed, as in [22], and $\omega$ is the penalty associated with the constraint $\sum_{i=1}^{K} P_i = 1$.
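The quantities of Eqs. 1 to 3 can be illustrated with a short sketch. It is only a minimal illustration, not the authors' implementation: the function names, the default `omega=1.0`, the choice of $x_j$ as the gray level $j$, and the use of the absolute deviation of $\sum P_i$ from 1 as the penalty term are all assumptions made here.

```python
import math

def normalized_histogram(pixels, levels=256):
    """Eq. 1: h(g) = n_g / N for g = 0 .. levels-1."""
    if not pixels:
        raise ValueError("at least one pixel is required")
    counts = [0] * levels
    for g in pixels:
        counts[g] += 1
    total = len(pixels)
    return [n_g / total for n_g in counts]

def gaussian_mixture(x, params):
    """Eq. 2: params is a list of (P_i, mu_i, sigma_i) tuples."""
    return sum(
        p / (math.sqrt(2.0 * math.pi) * sigma)
        * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
        for p, mu, sigma in params
    )

def fitting_error(hist, params, omega=1.0):
    """Eq. 3, with x_j assumed to be gray level j of an n-point histogram."""
    n = len(hist)
    mse = sum((gaussian_mixture(x, params) - h) ** 2
              for x, h in enumerate(hist)) / n
    # Assumed penalty: absolute deviation of the priors from summing to 1.
    penalty = omega * abs(sum(p for p, _, _ in params) - 1.0)
    return mse + penalty
```

A candidate solution is then simply a list of $K$ triples, and an optimizer only needs to call `fitting_error(hist, params)` to rank candidates; a parameter set whose priors do not sum to one is penalized regardless of how well the curve fits.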
In general, the determination of parameters that minimize the square error is not a simple problem. A straightforward approach is to set the partial derivatives of the error function to zero, which yields a set of simultaneous transcendental equations [22]. An analytical solution is not available due to the non-linear nature of the equations, and therefore a numerical optimization method must be used. Such a method typically follows an iterative procedure based on gradient information. However, since the gradient descent method may easily get stuck within local minima, the final solution is highly dependent on the initial values. Previous experience has shown that evolutionary-based approaches provide satisfactory performance in image processing problems [9, 20, 23, 24, 25]. The CSA is therefore adopted in order to find the parameters and their corresponding threshold values.
3 Clonal Selection Algorithm
In natural immune systems, only the antibodies (Abs) which are able to recognize the intruding antigens (non-self cells) are selected to proliferate by cloning [6]. Therefore, the fundamental idea of the clonal optimization method is that only capable Abs will proliferate. More precisely, the underlying principles of the CSA borrowed from the clonal selection principle (CSP) are as follows:
• Maintenance of memory cells functionally disconnected from repertoire,
• Selection and cloning of most stimulated Abs,
• Suppression of non-stimulated cells,
• Affinity maturation and reselection of clones with higher affinities, and
• Mutation rate proportional to Abs affinities.
Within immunology, an antigen is any substance that forces the immune system to produce antibodies against it. In the CSA, antigens refer to the pending optimization problem, while B cells, T cells and antigen-specific lymphocytes are generally called antibodies. An antibody is a representation of a candidate solution for an antigen. A selective mechanism guarantees that those offspring cells (solutions) that better recognize the antigen which elicited the response are selected to have long life spans. These cells are therefore named memory cells (M). Figure 1 shows the diagram of the basic CSA steps as follows:
1. Initialize the Ab pool (Pinit) including the subset of memory cells (M).
2. Evaluate the fitness of all the individuals in population P. The fitness here refers to the affinity measure.
3. Select the best candidates (Pr) from P, according to their fitness (affinities to the antigens).
4. Clone these Abs into a temporary pool (C).
5. Generate a mutated Ab pool (C1). The mutation rate of each individual is inversely proportional to its fitness.
6. Evaluate all the Abs in C1.
7. Eliminate those Abs similar to the ones in C, and update C1.
8. From C1, re-select the individuals exhibiting the best fitness to build M. Other individuals with improved fitness in C1 can thus replace other members showing poor fitness in P in order to keep the overall Ab diversity.
The CSA can be terminated once a previously established stopping criterion is satisfied. The clone size in Step 4 is commonly defined as a monotonic function of the affinity measure. Although a single mutation operator is used in Step 5, the amount of mutation applied to each individual is inversely proportional to its fitness, achieved by choosing different mutation variations, i.e. the better the fitness of an Ab, the less it may change.
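The eight steps above can be sketched in a few lines. The following is a minimal real-valued illustration under stated assumptions, not the authors' implementation: the function name `clonal_selection`, all parameter names and defaults, the Gaussian mutation whose step size grows with rank, and the suppression rule (which here compares mutants with each other rather than with pool C) are choices made for this sketch. Lower cost is treated as higher affinity, and the elite part of the pool plays the role of the memory set.

```python
import random

def clonal_selection(cost, bounds, pop_size=30, n_best=8, n_clones=5,
                     generations=100, min_dist=1e-3, seed=0):
    """Minimise `cost` over the box `bounds` (list of (low, high) pairs)."""
    rng = random.Random(seed)

    def clip(ab):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(ab, bounds)]

    # Step 1: initialise the antibody pool P.
    pool = [[rng.uniform(lo, hi) for lo, hi in bounds]
            for _ in range(pop_size)]
    for _ in range(generations):
        # Steps 2-3: evaluate P and select the n best antibodies (Pr).
        pool.sort(key=cost)
        selected = pool[:n_best]
        mutated = []
        for rank, ab in enumerate(selected):
            # Step 5: the better the rank, the smaller the mutation.
            scale = (rank + 1) / n_best
            for _ in range(n_clones):  # Step 4: clone each selected Ab.
                clone = [v + rng.gauss(0.0, 0.1 * scale * (hi - lo))
                         for v, (lo, hi) in zip(ab, bounds)]
                mutated.append(clip(clone))
        # Step 6: evaluate the mutated pool C1 (best first).
        mutated.sort(key=cost)
        # Step 7 (simplified): drop mutants too close to one already kept.
        kept = []
        for ab in mutated:
            if all(max(abs(a - b) for a, b in zip(ab, k)) > min_dist
                   for k in kept):
                kept.append(ab)
        # Step 8: the best mutants replace the worst members of P;
        # the n_best elite antibodies are never overwritten.
        n_replace = min(len(kept), pop_size - n_best)
        pool = pool[:pop_size - n_replace] + kept[:n_replace]
    return min(pool, key=cost)
```

For example, `clonal_selection(lambda v: sum((x - 1.0) ** 2 for x in v), [(-5.0, 5.0)] * 2)` searches for the minimum of a shifted sphere function. Because the elite antibodies are retained in every generation, the best cost found never worsens from one generation to the next.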
The similarity among the Abs can also affect the convergence speed of the CSA. The idea of Ab suppression, based on the immune network theory, is introduced to eliminate the newly generated Abs which are too similar to those already in the candidate pool (Step 7). With such a diverse Ab pool, the CSA can avoid being trapped in local optima. In contrast to the popular genetic algorithms (GA), which usually tend to bias the whole population of chromosomes towards only the best candidate solution [26], the CSA can effectively handle challenging multimodal optimization tasks [27–30].
[Figure 1: block diagram linking the pools Pinit, P, M, Pr, C and C1 through the operations Evaluate, Select, Clone, Mutate, Evaluate, Eliminate and Re-select.]
Fig. 1. Basic diagram of clonal selection algorithm (CSA)
The population management of the CSA provides a simple and direct search algorithm for the global optimization of multi-modal functions. This is a clear difference in comparison to other evolutionary algorithms, like genetic algorithms, because the method does not require crossover but only cloning and hyper-mutation of individuals, with affinity used as the selection mechanism. The CSA is adopted in this work to find the parameters $P$, $\sigma$ and $\mu$ of the Gaussian functions and their corresponding threshold values for the image.
4 Determination of Thresholding Values
The next step is to determine the optimal threshold values. Considering that the data classes are organized such that $\mu_1 < \mu_2 < \cdots < \mu_K$, the threshold values can be calculated by estimating the overall probability error for two adjacent Gaussian functions, as follows:
$$E(T_i) = P_{i+1} \cdot E_1(T_i) + P_i \cdot E_2(T_i), \quad i = 1, 2, \ldots, K-1 \tag{4}$$
considering
$$E_1(T_i) = \int_{-\infty}^{T_i} p_{i+1}(x)\,dx \tag{5}$$
and
$$E_2(T_i) = \int_{T_i}^{\infty} p_i(x)\,dx \tag{6}$$
$E_1(T_i)$ is the probability of mistakenly classifying the pixels in the $(i+1)$-th class to the $i$-th class, while $E_2(T_i)$ is the probability of erroneously classifying the pixels in the $i$-th class to the $(i+1)$-th class. The $P_j$'s are the a priori probabilities within the combined probability density function, and $T_i$ is the threshold value between the $i$-th and the $(i+1)$-th classes. The value of $T_i$ is chosen such that the error $E(T_i)$ is minimized. By differentiating $E(T_i)$ with respect to $T_i$ and equating the result to zero, the following equation is obtained for the optimum threshold value $T_i$:
$$A T_i^2 + B T_i + C = 0 \tag{7}$$
considering
$$A = \sigma_i^2 - \sigma_{i+1}^2, \quad B = 2 \cdot (\mu_i \sigma_{i+1}^2 - \mu_{i+1} \sigma_i^2), \quad C = (\sigma_i \mu_{i+1})^2 - (\sigma_{i+1} \mu_i)^2 + 2 \cdot (\sigma_i \sigma_{i+1})^2 \cdot \ln\left(\frac{\sigma_{i+1} P_i}{\sigma_i P_{i+1}}\right) \tag{8}$$
Although the above quadratic equation has two possible solutions, only one of them is feasible (the positive one which falls within the interval). Figure 2 shows the determination process of the threshold points.
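As an illustration of Eqs. 7 and 8, the sketch below solves the quadratic for one pair of adjacent classes. It is a hedged example rather than the authors' code: the function name is hypothetical, "the interval" is interpreted here as $[\mu_i, \mu_{i+1}]$ (an assumption), and the equal-variance case, where $A = 0$ and the equation becomes linear, is handled separately.

```python
import math

def optimal_threshold(p1, mu1, s1, p2, mu2, s2):
    """Threshold between class i = (p1, mu1, s1) and class i+1 = (p2, mu2, s2).

    Assumes mu1 < mu2. Returns None when no root lies in [mu1, mu2].
    """
    a = s1 ** 2 - s2 ** 2
    b = 2.0 * (mu1 * s2 ** 2 - mu2 * s1 ** 2)
    c = ((s1 * mu2) ** 2 - (s2 * mu1) ** 2
         + 2.0 * (s1 * s2) ** 2 * math.log((s2 * p1) / (s1 * p2)))
    if abs(a) < 1e-12:
        # Equal standard deviations: Eq. 7 degenerates to B*T + C = 0.
        roots = [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return None
        root = math.sqrt(disc)
        roots = [(-b + root) / (2.0 * a), (-b - root) / (2.0 * a)]
    # Assumed feasibility rule: keep the root between the two means.
    feasible = [t for t in roots if mu1 <= t <= mu2]
    return feasible[0] if feasible else None
```

With equal priors and equal standard deviations, the result is the midpoint of the two means, e.g. `optimal_threshold(0.5, 50, 10, 0.5, 150, 10)` gives 100. In the general case, the returned value is the point where the two weighted densities $P_i\,p_i(T)$ and $P_{i+1}\,p_{i+1}(T)$ intersect.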
[Figure 2: two adjacent Gaussian curves $p_i(x)$ and $p_{i+1}(x)$ with means $\mu_i$ and $\mu_{i+1}$ and the threshold $T_i$ marked between them.]
Fig. 2. Determination of the threshold points
5 Implementation Details
In this work, either an antibody or an antigen is represented (in binary form) by a bit chain of the form:
$$c = \langle c_1, c_2, \ldots, c_L \rangle \tag{9}$$
with $c$ representing a point in an $L$-dimensional space, $c \in S^L$. The clonal selection algorithm can be stated as follows:
1) An original population of N individuals (antibodies) is generated, considering a size of 22 bits.
2) The n best individuals based on the affinity measure are selected. They represent the memory set.
3) These n best individuals are cloned m times.
4) The cloned individuals are hyper-mutated at a rate governed by the affinity between antibodies and antigens, generating an improved antibody population.
5) From the hyper-mutated population, the individuals with the highest affinity are re-selected.
6) In the original population, members with poor affinity are replaced by the re-selected individuals with the highest affinity, improving the overall cell set.
Once the above steps are completed, the process is started again, until an individual showing the highest affinity is found, i.e. the minimum stated in Equation 3. In this work, the algorithm considers 3 Gaussians to represent the same number of classes. However, it can easily be expanded to more classes. Each single Gaussian has the variables $P_i$, $\mu_i$, $\sigma_i$ (with $i = 1, 2, 3$), each represented in the Hamming shape-space by means of a 22-bit word over the following ranges:
$$P_i \in [1, \max(h)], \quad \mu_i \in [\min(g), \max(g)], \quad \sigma_i \in [1, \max(g) \cdot 0.5] \tag{10}$$
Here $g$ represents the grey level and $h$ is the histogram of the grey-level image. Hence, the first step is to generate the initial antibody population by means of AB = 2 .* rand(N, S_p) - 1; with $S_p$ representing the bit string assigned to each of the $N$ initial individuals. Later, the binary string is mapped to base 10:
$$(c_L, \ldots, c_2, c_1)_2 = \left( \sum_{i=0}^{21} c_i \cdot 2^i \right)_{10} = r' \tag{11}$$
so that the corresponding real value $r$ is found as
$$r = r' \cdot \frac{r_{\max}}{2^{22} - 1} \tag{12}$$
with $r_{\max}$ representing $\max(h)$, $\max(g)$ or $\max(g)/2$ for $P_i$, $\mu_i$ and $\sigma_i$ respectively.
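The decoding of Eqs. 11 and 12 can be sketched as follows. The helper names are hypothetical, bits are stored least-significant first so that `bits[i]` carries the weight $2^i$, and the random bit generator is an assumption of this sketch that stands in for the initialization expression quoted above.

```python
import random

N_BITS = 22  # word length used for each Gaussian parameter

def random_antibody(rng, n_bits=N_BITS):
    """Draw one random bit string (assumed initialisation scheme)."""
    return [rng.randint(0, 1) for _ in range(n_bits)]

def decode(bits, r_max):
    """Map a bit string to a real value following Eqs. 11 and 12."""
    # Eq. 11: r' = sum of c_i * 2**i, with bits[i] playing the role of c_i.
    r_prime = sum(c * 2 ** i for i, c in enumerate(bits))
    # Eq. 12: scale r' by r_max / (2**22 - 1) for a 22-bit word.
    return r_prime * r_max / (2 ** len(bits) - 1)
```

An all-ones word decodes to `r_max` and an all-zeros word to 0, so each 22-bit word covers its parameter range in steps of `r_max / (2**22 - 1)`.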
The population is set to 100 individuals (N), from which the best 20 individuals (n) are selected. The 20 selected individuals are each cloned 10 times (m). The corresponding mutation probability is proportional to the error given by Equation 3. The algorithm is executed until the minimum possible value of Equation 3 is reached.
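The text states only that the mutation probability is proportional to the error of Equation 3; the proportionality constant is not given. A minimal sketch of such a hyper-mutation, assuming independent bit flips, normalization by the largest error in the current clone pool, and a hypothetical cap `max_rate`, could look like this:

```python
import random

def hypermutate(bits, error, max_error, max_rate=0.5, rng=None):
    """Flip each bit with a probability proportional to the clone's error."""
    rng = rng or random
    # Assumed scaling: the worst clone mutates at max_rate, a perfect one at 0.
    rate = max_rate * error / max_error if max_error > 0 else 0.0
    return [1 - c if rng.random() < rate else c for c in bits]
```

Under this scheme a clone with zero error is left untouched, while the clone with the largest error in the pool is perturbed the most, which matches the rule that better antibodies change less.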
7 Conclusions
This work presents a novel segmentation algorithm which includes an automatic threshold determination approach. The intensity distributions for objects and background within the image are assumed to obey Gaussian distributions with different means and standard deviations. The 1-D histogram of the image is approximated by the addition of several Gaussian probability functions whose parameters are calculated using the CSA. Each Gaussian function represents a pixel class and therefore a threshold point. The clonal selection algorithm is used to estimate the parameters of the mixed density function by minimizing the square error between the density function and the original histogram. Experimental results reveal that the proposed approach can produce satisfactory results. The paper also discusses two benchmark images used to test the performance of the algorithm. The proposed approach is not only computationally efficient but also requires no prior assumptions about the image. The method is likely to be most useful for applications considering different and perhaps initially unknown image classes. Experimental results demonstrate the algorithm's ability to perform automatic threshold selection while preserving the main features of the original image. Further work extending the proposed approach to other techniques, and comparing it with other newly developed state-of-the-art image segmentation techniques, is currently under development.
References
1. Goldsby, G.A., Kindt, T.J., Kuby, J., Osborne, B.A.: Immunology, 5th edn. Freeman, New York (2003)
2. de Castro, L.N., Timmis, J.: Artificial Immune Systems: A New Computational Intelligence Approach. Springer, London (2002)
3. Dasgupta, D.: Advances in artificial immune systems. IEEE Computational Intelligence Magazine 1(4), 40–49 (2006)
4. Wang, X., Gao, X.Z., Ovaska, S.J.: Artificial immune optimization methods and applications - a survey. In: Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, The Hague, The Netherlands (October 2004)
5. de Castro, L.N., von Zuben, F.J.: Learning and optimization using the clonal selection principle. IEEE Transactions on Evolutionary Computation 6(3) (2002)
6. Ada, G.L., Nossal, G.: The clonal selection theory. Sci. Am. 257, 50–57 (1987)
7. Coello Coello, C.A., Cortes, N.C.: Solving multiobjective optimization problems using an artificial immune system. Genet. Program. Evolvable Mach. 6, 163–190 (2005)
8. Campelo, F., Guimaraes, F.G., Igarashi, H., Ramirez, J.A.: A clonal selection algorithm for optimization in electromagnetics. IEEE Trans. Magn. 41, 1736–1739 (2005)
9. Weisheng, D., Guangming, S., Li, Z.: Immune memory clonal selection algorithms for designing stack filters. Neurocomputing 70, 777–784 (2007)
10. Abak, T., Baris, U., Sankur, B.: The performance of thresholding algorithms for optical character recognition. In: Proceedings of International Conference on Document Analytical Recognition, 697–700 (1997)
11. Kamel, M., Zhao, A.: Extraction of binary character/graphics images from grayscale document images. Graph. Models Image Process. 55(3), 203–217 (1993)
12. Trier, O.D., Jain, A.K.: Goal-directed evaluation of binarization methods. IEEE Trans. Pattern Anal. Mach. Intell. 17(12), 1191–1201 (1995)
13. Bhanu, B.: Automatic target recognition: state of the art survey. IEEE Trans. Aerosp. Electron. Syst. 22, 364–379 (1986)
14. Sezgin, M., Sankur, B.: Comparison of thresholding methods for non-destructive testing applications. In: IEEE International Conference on Image Processing (2001)
15. Sezgin, M., Tasaltin, R.: A new dichotomization technique to multilevel thresholding devoted to inspection applications. Pattern Recognition Lett. 21(2), 151–161 (2000)
16. Guo, R., Pandit, S.M.: Automatic threshold selection based on histogram modes and discriminant criterion. Mach. Vis. Appl. 10, 331–338 (1998)
17. Pal, N.R., Pal, S.K.: A review on image segmentation techniques. Pattern Recognit. 26, 1277–1294 (1993)
18. Sahoo, P.K., Soltani, S., Wong, A.K.C., Chen, Y.C.: A survey of thresholding techniques. Comput. Vis. Graph. Image Process. 41 (1988)
19. Snyder, W., Bilbro, G., Logenthiran, A., Rajala, S.: Optimal thresholding: A new approach. Pattern Recognit. Lett. 11, 803–810 (1990)
20. Chen, S., Wang, M.: Seeking multi-thresholds directly from support vectors for image segmentation. Neurocomputing 67(4), 335–344 (2005)
21. Chih-Chih, L.: A Novel Image Segmentation Approach Based on Particle Swarm Optimization. IEICE Trans. Fundamentals 89(1), 324–327 (2006)
22. Gonzalez, R.C., Woods, R.E.: Digital Image Processing. Addison Wesley, Reading (1992)
23. Baştürk, A., Günay, E.: Efficient edge detection in digital images using a cellular neural network optimized by differential evolution algorithm. Expert Systems with Applications 36(8), 2645–2650 (2009)
24. Lai, C.-C., Tseng, D.-C.: An optimal L-filter for reducing blocking artifacts using genetic algorithms. Signal Process. 81(7), 1525–1535 (2001)
25. Tseng, D.-C., Lai, C.-C.: A genetic algorithm for MRF-based segmentation of multispectral textured images. Pattern Recognit. Lett. 20(14), 1499–1510 (1999)
26. Poli, R., Langdon, W.B.: Foundations of Genetic Programming. Springer, Berlin (2002)
27. Yoo, J., Hajela, P.: Immune network simulations in multicriterion design. Structural Optimization 18(2-3), 85–94 (1999)
28. Wang, X., Gao, X.Z., Ovaska, S.J.: A hybrid optimization algorithm in power filter design. In: Proceedings of the 31st Annual Conference of the IEEE Industrial Electronics Society, Raleigh, NC, November 2005, pp. 1335–1340 (2005)
29. Xu, X., Zhang, J.: An improved immune evolutionary algorithm for multimodal function optimization. In: Proceedings of the Third International Conference on Natural Computation, Haikou, China, August 2007, pp. 641–646 (2007)
30. Tang, T., Qiu, J.: An improved multimodal artificial immune algorithm and its convergence analysis. In: Proceedings of the Sixth World Congress on Intelligent Control and Automation, Dalian, China, June 2006, pp. 3335–3339 (2006)