Ant Colony Optimization and a New Particle Swarm Optimization Algorithm for Classification of Microcalcifications in Mammograms

M. Karnan (Tamilnadu College of Engineering, Coimbatore, Tamil Nadu, India), K. Thangavel (Periyar University, Salem, Tamil Nadu, India), P. Ezhilarasu (Hindusthan College of Engineering & Technology, Coimbatore, Tamil Nadu, India)

Abstract—

Genetic Algorithm (GA), Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO) are proposed for feature selection, and their performance is compared. The Spatial Gray Level Dependence Method (SGLDM) is used for feature extraction. The selected features are fed to a three-layer Backpropagation Network hybrid with Ant Colony Optimization and Particle Swarm Optimization (BPN-ACO-PSO) for classification, and Receiver Operating Characteristic (ROC) analysis is performed to evaluate the performance of the feature selection methods through their classification results. The proposed algorithms are tested with 114 abnormal images from the Mammography Image Analysis Society (MIAS) database.

Keywords— GA, ACO, PSO, Feature Extraction, Classification

I. INTRODUCTION

In many western countries breast cancer is the most common form of cancer among women. The World Health Organization's International Agency for Research on Cancer estimates that more than 250,000 women worldwide die of breast cancer each year. Breast cancer is among the top three cancers in American women. In the United States, the American Cancer Society estimated that 315,990 new cases of breast carcinoma would be diagnosed in 2007. It is the leading cause of death due to cancer in women under the age of 65. Thangavel et al. [16] presented a good review of various methods for detection of microcalcifications. It is of crucial importance to design the classification method so as to obtain a high True-Positive Fraction (TPF) while keeping the False-Positive Fraction (FPF) at its minimum level. The texture features are extracted from the segmented mammogram image using the Spatial Gray Level Dependence Method (SGLDM) [3, 8]. In order to reduce the complexity and increase the performance of the classifier, redundant and irrelevant features are removed from the original feature set.
In this paper, GA, ACO and PSO algorithms are proposed to select the optimal features from the original feature set. Only the optimal features are input to the classifier for classification of microcalcifications. The following section presents an overview of the work.

A. Overview of the CAD System

In this paper, the classification is performed in four steps. Initially, the mammogram image is smoothed using a median

filter, and the pectoral muscle region is removed from the breast region [12, 18]. Next, the Particle Swarm Optimization (PSO) algorithm hybridized with the Markov Random Field (MRF) method is used to segment the microcalcifications from the enhanced mammogram image. In the second step, a co-occurrence matrix is generated to extract the texture features from the segmented image. The Haralick features are extracted from the co-occurrence matrix; there are 14 features in total for each image, and these are grouped into four categories according to their characteristics. In the third step, the Genetic Algorithm (GA), Ant Colony Optimization (ACO) algorithm and Particle Swarm Optimization (PSO) are proposed to select the optimal features from each group of the feature set. The optimal features selected from the original feature set are considered for classification. In the fourth step, the selected features are fed to the Backpropagation Network for classification. The weights for the neurons in the BPN are extracted and updated using the Ant Colony Optimization algorithm and Particle Swarm Optimization. The performance of the reduction algorithms is evaluated from the classification results by generating ROC curves using validation methods such as the jack-knife method, the round-robin method and ten-fold validation. The rest of the paper is organized as follows: the following section presents a brief discussion on segmentation of microcalcifications. Section 3 describes the feature extraction method. Section 4 focuses on the feature selection algorithms: the Genetic Algorithm, the Ant Colony Optimization algorithm and Particle Swarm Optimization. Classification is presented in Section 5. Section 6 describes the ROC analysis. The results and performance analysis are presented in Section 7, and conclusions are given in Section 8.

II.
SEGMENTATION OF MICROCALCIFICATIONS

Before extracting the texture features, microcalcifications should be segmented from the background of the mammographic images [6, 7, 17]. In this paper, microcalcifications are segmented using the Particle Swarm Optimization (PSO) algorithm hybridized with a Markov Random Field (MRF). The segmentation process consists of three steps. The first step enhances the mammogram image using median filtering. In the second step, cliques having similar arrangements of pixels are assigned a unique label, and the maximum a posteriori (MAP) function value is estimated for each clique using the MRF. Here, a

clique is a 3×3 window of neighbouring pixels. The segmentation is performed with the optimum label, which minimizes the MAP estimate. The PSO algorithm [1, 4, 9] is implemented in the third step to find the optimum label. The intensity value Iopt of the pixel that generates the optimum label is traced, and the pixels whose intensity is equal to or greater than Iopt are extracted from the original image to create the segmented image. This segmented image is used in the next step for extracting the texture features.

III. FEATURE EXTRACTION

The texture of an image refers to the appearance, structure and arrangement of the parts of an object within the image. Images used for diagnostic purposes in clinical practice are digital. A two-dimensional digital image is made up of small rectangular blocks, or pixels (picture elements), each represented by a set of coordinates in space, and each having a value representing the gray-level intensity at that position. A feature value is a real number which encodes some discriminatory information about a property of an object. In this paper, the Spatial Gray Level Dependence Method is used to extract the features from the segmented mammogram image.

A. Spatial Gray Level Dependence Method (SGLDM)

In this method, a co-occurrence matrix is generated to extract the texture features from the segmented mammogram image. Many co-occurrence matrices may be computed for a single image, one for each pair of distance and direction defined. Here a set of 20 co-occurrence matrices is computed for five different distances (1, 3, 5, 7 and 9) in the horizontal, vertical and two diagonal directions, i.e., the four angles 0°, 45°, 90° and 135° are defined for calculating the matrix at each of the five distances. Since the co-occurrence matrix analyzes the gray-level distribution of pairs of pixels, it is also known as the second-order histogram.
The estimated joint conditional probability density functions are defined as in [3]. The features computed from the co-occurrence matrix are Angular Second Moment (ASM), Contrast (CON), Correlation (COR), Variance (VAR), Inverse Difference Moment (IDM), Sum Average (SA), Sum Variance (SV), Sum Entropy (SE), Entropy (ENT), Difference Variance (DV), Difference Entropy (DE), Information Measure of Correlation I (IMC1), Information Measure of Correlation II (IMC2) and Maximal Correlation Coefficient (MCC). The features based on the co-occurrence matrices should capture characteristics of textures such as homogeneity, coarseness and periodicity. The 14 features can be put into four groups [3]. Consider a 7×7 sub-image of a segmented image. Suppose d = 1 and the angle is 0° (and 180°); then count the number of similar pairs of pixels that are successive in the horizontal direction and enter the count into the matrix, e.g., M(73, 72) = 3. Figure 1 shows a sub-image for constructing the SGLDM. Figure 2 shows a typical SGLDM matrix.

Figure 1: Sub image of a segmented mammogram for constructing SGLDM

Figure 2: A typical SGLDM matrix
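The pair-counting step described above can be sketched in code. The following is a minimal illustration, not the authors' implementation; the function name and the symmetric treatment of the 0° and 180° directions are my own choices:

```python
import numpy as np

def cooccurrence(image, d=1, levels=256):
    # Symmetric horizontal co-occurrence matrix (0 and 180 degrees,
    # distance d): m[a, b] counts how often gray levels a and b occur
    # d pixels apart along a row, in either direction.
    image = np.asarray(image)
    m = np.zeros((levels, levels), dtype=np.int64)
    left = image[:, :-d].ravel()    # left pixel of each horizontal pair
    right = image[:, d:].ravel()    # right pixel of each horizontal pair
    np.add.at(m, (left, right), 1)  # 0-degree pairs
    np.add.at(m, (right, left), 1)  # 180-degree pairs
    return m
```

For the 7×7 sub-image example in the text, `cooccurrence(sub_image, d=1)` would accumulate counts such as M(73, 72); the matrices for 45°, 90° and 135° follow the same pattern with different offsets.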

Fig. 3: (a) Mammogram, (b) Segmented image

IV. FEATURE SELECTION

Feature selection here refers to the problem of dimensionality reduction of data which initially contain a high number of features. The aim is to choose an optimal subset of the original features that still contains the information essential for the classification task, while reducing the computational burden imposed by using many features [5, 8, 10]. In this paper, the Genetic Algorithm (GA), Ant Colony Optimization (ACO) algorithm and Particle Swarm Optimization (PSO) are proposed for feature selection.

A. Feature Selection Using Genetic Algorithm

A GA is a heuristic search or optimization technique for obtaining the best possible solution in a vast solution space [2, 11]. In this paper, 20 co-occurrence matrices are created for each image, one for each pair of distance and direction defined. The Haralick features are extracted for all 114 images, and the features are grouped into four categories as discussed in the earlier section. The values of a single feature across all the images are considered as the initial population for the genetic algorithm.

Table 1: SGLDM-based feature extraction with the MRF-PSO method

An optimum value is found for each individual feature set. Within a group, the optimum values of the individual sets are compared, and the feature that is optimum among the others in the same group is selected for classification. In this way an optimum feature is selected for every group, and the algorithm finally returns four optimum features from the set of 14. Only the selected features are considered for classification. The features selected by the Genetic Algorithm are ASM, VAR, ENT and IMC2.

B. Feature Selection Using Ant Colony Optimization Algorithm

The optimum feature is selected from each group and only those selected features are used further in the classification. The features selected by the ACO algorithm are ASM, IDM, ENT and IMC2.

C. Feature Selection Using Particle Swarm Optimization (PSO)

The optimum feature is selected from each group and only those selected features are used further in the classification. The features selected by this algorithm are ASM, IDM, ENT and IMC1.

Algorithm:
1. Initialize
   (a) Set constants kmax, c1, c2 and w0.
   (b) Randomly initialize particle positions xi0 ∈ D ⊂ IRn for i = 1, ..., p.
   (c) Randomly initialize particle velocities 0 ≤ vi0 ≤ vmax0 for i = 1, ..., p.
   (d) Set k = 1.
2. Optimize
   (a) Evaluate the function value fik at the design-space coordinates xik.
   (b) If fik ≤ fibest then set fibest = fik and pi = xik.
   (c) If fik ≤ fgbest then set fgbest = fik and pg = xik.
   (d) If the stopping condition is satisfied, go to 3.
   (e) Update the particle velocity: vik+1 = wk vik + c1 r1 (pik − xik) + c2 r2 (pgk − xik).
   (f) Update the particle position: xik+1 = xik + vik+1.
   (g) Increment i. If i > p, increment k and set i = 1.
   (h) Go to 2(a).
3. Report results.
4. Terminate.

Figure 3: Feature Selection Using Particle Swarm Optimization

Table 2: Features selected by the feature selection algorithms

Algorithm   Selected Features
GA          ASM, VAR, ENT, IMC2
ACO         ASM, IDM, ENT, IMC2
PSO         ASM, IDM, ENT, IMC1
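The position and velocity updates in steps 2(e) and 2(f) of the PSO algorithm above can be sketched as a generic minimizer. This is an illustrative implementation, not the authors' code; the parameter values, search domain and fitness function are assumptions:

```python
import numpy as np

def pso_minimize(f, dim, n_particles=20, k_max=100,
                 w=0.7, c1=1.5, c2=1.5, v_max=0.5, seed=0):
    """Minimal PSO following v <- w*v + c1*r1*(p_best - x) + c2*r2*(g_best - x).
    Parameter values are illustrative defaults, not the paper's settings."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, (n_particles, dim))       # positions
    v = rng.uniform(-v_max, v_max, (n_particles, dim))   # velocities
    p_best = x.copy()                                    # personal bests
    p_val = np.array([f(p) for p in x])
    g_best = p_best[p_val.argmin()].copy()               # global best
    g_val = float(p_val.min())
    for _ in range(k_max):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        v = np.clip(v, -v_max, v_max)                    # velocity bound
        x = x + v                                        # position update
        vals = np.array([f(p) for p in x])
        improved = vals < p_val
        p_best[improved] = x[improved]
        p_val[improved] = vals[improved]
        if p_val.min() < g_val:
            g_val = float(p_val.min())
            g_best = p_best[p_val.argmin()].copy()
    return g_best, g_val
```

In the paper the quantity being optimized is a feature's fitness within its group; here a generic function f stands in for that fitness.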

V. CLASSIFICATION

Classification of objects is an important area of research and practical application in a variety of fields, including pattern recognition, artificial intelligence and vision analysis. Classifier design can be performed with labeled or unlabeled data. The backpropagation learning algorithm is widely used for multi-layer feed-forward networks, and the classifier employed in this paper is a three-layer Backpropagation Neural Network. The network optimizes the net for correct responses to the training input data set. More than one hidden layer may be beneficial for some applications, but one hidden layer is sufficient if enough hidden neurons are used [4, 8].

A. Back Propagation Network Classifier Hybrid with Ant Colony Optimization Algorithm

Backpropagation is a learning algorithm for multi-layered feed-forward networks that uses the sigmoid function. The error function is calculated after the presentation of each input, and the error is propagated back through the network, modifying the weights before the presentation of the next pattern. This error function is usually the Mean Square Error (MSE) of the difference between the desired and actual responses of the network over all the output units. The new weights then remain fixed while a new image is presented to the network, and this process continues until all the images have been presented. The presentation of all the patterns is usually called one epoch or a single iteration. In practice many epochs are needed before the error becomes acceptably small. The number of hidden neurons is equal to the number of input neurons, and there is only one output neuron.
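The three-layer forward pass described above can be sketched as follows. This is an illustration only; the layer sizes are examples, and the sigmoid with λ = 1 follows the formulas given later in this section:

```python
import numpy as np

def sigmoid(x):
    # Logistic activation 1/(1 + e^(-lambda*x)) with lambda = 1.
    return 1.0 / (1.0 + np.exp(-x))

def forward(inputs, w_ih, w_ho):
    """Forward pass through a three-layer feed-forward network:
    input layer -> hidden layer (same size as input) -> one output neuron."""
    hidden = sigmoid(w_ih @ inputs)   # hidden-layer activations
    output = sigmoid(w_ho @ hidden)   # single output value in (0, 1)
    return output
```

With four selected features as input, w_ih would be 4×4 and w_ho would be 1×4, and the single output is compared against the target values (0.9, 0.5, 0.1) described below.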
Initial weights are extracted using the ACO algorithm as follows. In weight extraction, N random numbers are generated with d digits each, where N is the total number of neurons in the BPN. The weights are extracted from the population of random numbers to determine the fitness values. The actual weight wk is given by:

wk = c (x(kd+2) 10^(d-2) + x(kd+3) 10^(d-3) + ... + x((k+1)d)) / 10^(d-2),

where c = 1 if 5 ≤ x(kd+1) ≤ 9, else c = −1, and k indexes the population. The weights are extracted for each string in the population. The fitness value is calculated as

F = 1 / E, where E = sqrt((E1 + E2 + ... + Em) / m),

m is the total number of training patterns and E1, E2, ..., Em are the errors for each pattern, i.e., Ei = (Ti − Oi)², where Ti is the desired output and Oi is the actual result of the output layer. Thus the fitness value is calculated for a single population. In this way M populations are generated and their fitness values are calculated; the optimum fitness value is then selected using the ACO algorithm. The procedure for finding the optimum value is similar to the algorithm described in Section IV.B. All the minimum fitness values are replaced with the maximum fitness values. The weights are then updated with the new fitness value and the training is performed again. This procedure is repeated until the error from the backpropagation network is less than the tolerance value. The output from each hidden neuron is calculated using the sigmoid function S1 = 1 / (1 + e^(−λx)), where λ = 1 and x = Σi wih ki, with wih the weight between the input and hidden layers and ki the input value. The output from the output layer is calculated using the sigmoid function S2 = 1 / (1 + e^(−λx)), where λ = 1 and x = Σi who Si, with who the weight between the hidden and output layers and Si the output from the hidden neurons. The network is trained to produce an output of 0.9 for malignant, 0.5 for benign and 0.1 for normal.

B. Back Propagation Network Classifier Hybrid with PSO Algorithm

The PSO–BPN is an optimization algorithm combining the PSO with the BPN.
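The weight-extraction and fitness formulas of Section V.A can be sketched as follows. This is a minimal reading of the formulas above, with hypothetical function names; the digit string stands in for one member of the random population:

```python
import math

def extract_weight(digits, k, d):
    """Decode the k-th weight from a flat string of random digits.
    The first digit of each d-digit chunk selects the sign (5-9 -> +,
    0-4 -> -); the remaining d-1 digits form the magnitude, divided by
    10^(d-2) so the weight lies roughly in [-10, 10]."""
    chunk = digits[k * d:(k + 1) * d]
    sign = 1 if int(chunk[0]) >= 5 else -1
    magnitude = int(chunk[1:])           # x(kd+2)...x((k+1)d) as an integer
    return sign * magnitude / 10 ** (d - 2)

def fitness(targets, outputs):
    """F = 1/E with E the root mean of the per-pattern squared errors
    Ei = (Ti - Oi)^2."""
    errors = [(t - o) ** 2 for t, o in zip(targets, outputs)]
    e = math.sqrt(sum(errors) / len(errors))
    return 1.0 / e
```

For example, with d = 5 the chunk "61234" decodes to +1234/1000 = 1.234, and a smaller RMS error yields a larger fitness value.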
Like the GA, the PSO algorithm is a global algorithm with a strong ability to find the global optimum; however, its search in the neighbourhood of the global optimum is slow. The BPN algorithm, on the contrary, has a strong ability to find a local optimum, but its ability to find the global optimum is weak. By combining the PSO with the BPN, a new algorithm referred to as the PSO–BPN hybrid algorithm is formulated in this paper. The fundamental idea of this hybrid algorithm is that, at the beginning of the search for the optimum, the PSO is employed to accelerate training. When the fitness function value has not changed for some generations, or the change is smaller than a predefined number, the search is switched to gradient-descent search according to this heuristic knowledge. As in the PSO algorithm, the PSO–BPN search starts by initializing a group of random particles. First, all the particles are updated according to Eqs. 7 and 8 until a new generation of particles is produced, and these new particles are used to search for the global best position in the solution space. Finally, the BPN algorithm is used to search around the global optimum. In this way, the hybrid algorithm

may find an optimum more quickly. The procedure for the PSO–BPN algorithm can be summarized as follows:

Algorithm:
Step 1: Initialize the positions and velocities of a group of particles randomly in the range [0, 1].
Step 2: Evaluate each initialized particle's fitness value; Pb is set as the positions of the current particles, and Pg is set as the best position of the initialized particles.
Step 3: If the maximum number of iterative generations is reached, go to Step 8; else, go to Step 4.
Step 4: The best of the current particles is stored. The positions and velocities of all the particles are updated according to Eqs. 7 and 8, generating a group of new particles. If a new particle flies beyond the boundary [Xmin, Xmax], its new position is set to Xmin or Xmax; if a new velocity is beyond the boundary [vmin0, vmax0], the new velocity is set to vmin0 or vmax0.
Step 5: Evaluate each new particle's fitness value, and replace the worst particle with the stored best particle. If the ith particle's new position is better than pik, pik is set to the new position of the ith particle. If the best position of all new particles is better than pgk, then pgk is updated.
Step 6: Reduce the inertia weight wk according to the selection strategy.
Step 7: If the current pgk is unchanged for ten generations, go to Step 8; else, go to Step 3.
Step 8: Use the BPN algorithm to search around pgk for some epochs; if the search result is better than pgk, output the current search result, otherwise output pgk.

This is only the first kind of stopping condition; the following steps can replace Steps 6–8 above to obtain the second kind of condition.
Step 6: Use the BPN algorithm to search around pgk for some generations. If the search result is better than pgk, pgk is set to the current search result; otherwise, compare it with the worst of the current particles, and if it is better, use it to replace the worst particle; else go to Step 7.
Step 7: Reduce the inertia weight wk.
Step 8: Output the global optimum pgk.

Figure 4: Back Propagation Network Classifier Hybrid with PSO Algorithm

VI. RECEIVER OPERATING CHARACTERISTIC (ROC) ANALYSIS

The receiver operating characteristic (ROC) curve is a popular tool in medical and imaging research. It conveniently displays diagnostic accuracy expressed in terms of sensitivity (true-positive rate) against 1 − specificity (false-positive rate) at all possible threshold values. The performance of each test is characterized in terms of its ability to identify true positives while rejecting false positives [14, 15].

VII. RESULTS AND DISCUSSION

The images used in this work were taken from the Mammography Image Analysis Society (MIAS) database (2003). The database consists of 322 images, which belong to three

categories: normal, benign and malignant. There are 208 normal images, 63 benign and 51 malignant. In this paper, only the benign and malignant images are considered for feature extraction. All the images also specify the locations of any abnormalities that may be present. The classification results of the backpropagation neural network are tested using the jack-knife method, the round-robin method and the ten-fold validation method, and the results were analyzed using ROC curves.

Table 3: Classification (Az) based on the training and feature selection algorithms

Method                       GA     ACO    PSO
Jack-Knife Method            0.856  0.948  0.952
Round Robin Method           0.861  0.947  0.950
Ten-fold validation method   0.849  0.938  0.949
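The Az values reported here are areas under ROC curves. A minimal sketch of how the (FPF, TPF) points and the area under the curve can be computed from classifier scores, assuming both classes are present (this is illustrative, not the authors' evaluation code):

```python
def roc_points(scores, labels):
    """Sweep all thresholds and return (FPF, TPF) pairs.
    labels: 1 = abnormal (positive), 0 = normal (negative)."""
    thresholds = sorted(set(scores), reverse=True)
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in thresholds:
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Trapezoidal area under the ROC curve (the Az value)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

A perfectly separating classifier yields Az = 1.0, while random scoring hovers around 0.5; the table values near 0.95 indicate strong separation.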


Fig. 5: ROC curves for the feature selection methods GA, ACO and PSO based on (a) the jack-knife method, (b) the round-robin method and (c) the ten-fold validation method. Figure 5(a) shows the ROC curves generated for GA, ACO and PSO based on the jack-knife method; the Az value is 0.85 for GA, 0.92 for ACO and 0.94 for PSO. Figure 5(b) shows the ROC curves based on the round-robin method; the Az value is 0.86 for GA, 0.94 for ACO and 0.95 for PSO. Figure 5(c) shows the ROC curves based on the ten-fold validation method; the Az value is 0.84 for GA, 0.93 for ACO and 0.94 for PSO.

VIII. CONCLUSION

In this paper, SGLDM is used to extract the Haralick features from the segmented mammogram image, and the features are grouped into four categories based on visual texture characteristics, statistics, information theory and information measures of correlation. The Genetic Algorithm, Ant Colony Optimization algorithm and Particle Swarm Optimization are proposed for feature selection. Each algorithm selects the optimum feature from each group, and the selected features are considered for classification. A three-layer Backpropagation Neural Network hybrid with the Ant Colony Optimization algorithm and Particle Swarm Optimization is used for classification; the ACO and PSO algorithms are used for weight extraction during learning. ROC analysis is performed to compare the classification results of the feature selection algorithms. The results show that the PSO algorithm selects better features than GA and ACO.


REFERENCES

[1] R. C. Eberhart and J. Kennedy, "A New Optimizer Using Particle Swarm Theory," Proc. Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, IEEE Service Center, Piscataway, NJ, pp. 39-43, 1995.
[2] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison Wesley, NY, 1989.
[3] R. M. Haralick, K. Shanmugam and I. Dinstein, "Textural features for image classification," IEEE Trans. Syst., Man, Cybern., vol. 3, pp. 610-621, 1973.
[4] J.-R. Zhang, J. Zhang, T.-M. Lok and M. R. Lyu, "A hybrid particle swarm optimization–back-propagation algorithm for feedforward neural network training," Applied Mathematics and Computation (Elsevier), vol. 185, pp. 1026-1037, 2007.
[5] G. John, R. Kohavi and K. Pfleger, "Irrelevant Features and the Subset Selection Problem," Proc. ICML, pp. 121-129, 1994.
[6] M. Karnan, R. Sivakumar, M. Almelumangai, K. Selvanayagi and T. Logeswari, "Hybrid Particle Swarm Optimization for Automatically Detect the Breast Border and Nipple Position to Identify the Suspicious Regions on Digital Mammograms Based on Asymmetries," International Journal of Soft Computing, vol. 3, no. 3, pp. 220-223, 2008.
[7] M. Karnan and K. Thangavel, "Automatic Detection of the Breast Border and Nipple Position on Digital Mammograms Using Genetic Algorithm," Computer Methods and Programs in Biomedicine (Elsevier), vol. 87, pp. 12-20, 2007.
[8] M. Karnan and K. Thangavel, "Weight Updating in BPN Network Using Ant Colony Optimization for Classification of Microcalcifications in Mammograms," International Journal of Computing and Applications, vol. 2, no. 2, pp. 95-109, 2007.
[9] M. Karnan, K. Thangavel, K. Geetha and K. Thanuskodi, "Particle Swarm Optimization for Segmentation of Microcalcifications in Mammograms," Lecture Notes in Engineering and Computer Science, IMECS, Hong Kong, pp. 115-121, June 2006.
[10] M. Karnan, K. Thangavel, R. Sivakumar and K. Geetha, "Ant Colony Optimization Algorithm for Feature Selection and Classification of Microcalcifications in Mammograms," IEEE International Conference on Advanced Computing and Communications, IEEE Press, pp. 298-303, 2006.
[11] M. Karnan, K. Thangavel, K. Thanuskodi and K. Geetha, "Automatic Detection of Suspicious Regions on Digital Mammograms Using Genetic Algorithm," DGTPMT-2006, Dayalbagh Educational Institute, Agra, Pushpak Publications, Feb. 18-19, 2006.
[12] M. Karnan, K. Thangavel, K. Geetha and K. Thanuskodi, "Enhancement of Microcalcifications in Digital Mammograms," Proceedings of GCEVISION'06, Thirunelveli, pp. 363-370, April 2006.
[13] M. Karnan, K. Thangavel, K. Geetha, K. Thanushkodi and R. Sivakumar, "Particle Swarm Optimization for Automatic Detection of the Suspicious Regions on Digital Mammograms," International Conference on Intelligent Systems and Controls, Karpagam College of Engineering, Coimbatore, Aug. 9-11, 2006.
[14] C. E. Metz, "ROC methodology in radiologic imaging," Investigative Radiology, vol. 21, pp. 720-733, 1986.
[15] J. A. Swets, "ROC analysis applied to the evaluation of medical imaging techniques," Investigative Radiology, vol. 14, pp. 109-121, 1979.
[16] K. Thangavel, M. Karnan, R. Sivakumar and A. Kaja Mohideen, "Automatic Detection of Microcalcification in Mammograms - A Review," International Journal on Graphics, Vision and Image Processing, vol. 5, no. 5, pp. 31-61, 2005.
[17] K. Thangavel and M. Karnan, "Computer Aided Diagnosis in Digital Mammograms: Detection of Microcalcifications by Meta Heuristic Algorithms," International Journal on Graphics, Vision and Image Processing, vol. 7, no. 7, pp. 41-55, 2005.
[18] K. Thangavel and M. Karnan, "CAD System for Preprocessing and Enhancement of Digital Mammograms," International Journal on Graphics, Vision and Image Processing, vol. 5, no. 9, pp. 69-74, 2005.