Ubiquitous Computing and Communication Journal
CLASSIFICATION OF MC CLUSTERS IN DIGITAL MAMMOGRAPHY VIA HARALICK DESCRIPTORS AND HEURISTIC EMBEDDED FEATURE SELECTION METHOD
Imad Zyout1, PhD, Department of Electrical Engineering, Tafila Technical University, Tafila 66110, Jordan
Ikhlas Abdel-Qader2, PhD, PE, Department of Electrical and Computer Engineering, Western Michigan University, MI 49008, USA
Christina Jacobs3, MD, Radiology Department, Bronson Methodist Hospital, MI 49007, USA
Email: [email protected], [email protected], [email protected]
ABSTRACT
Characterizing the texture of mammographic tissue is an efficient and robust tool for the diagnosis of microcalcification (MC) clusters in mammography because it does not require a prior MC segmentation stage. This work is intended not only to validate the surrounding-tissue hypothesis, which holds that the breast tissue surrounding MCs can be used to diagnose microcalcifications, but also to improve on existing methods by introducing a new heuristic feature selection method based on particle swarm optimization and a KNN classifier (PSO-KNN). Using MC clusters from mini-MIAS and a local dataset, our results demonstrate the effectiveness of the proposed characterization and feature selection methods.

Keywords: Microcalcification cluster, surrounding tissue, Haralick measures, embedded feature selection.

1 INTRODUCTION
Morphology-based methods are the primary tools for diagnosis and decision making on the nature of mammographic microcalcifications [1]. However, a key and challenging step in analyzing clustered microcalcifications by their shape is the segmentation stage [2]. An alternative and promising method for characterizing MCs is to analyze the texture of the mammographic regions enclosing them [3]. Texture-based computer-aided diagnosis of MCs removes the need for the MC segmentation stage. A texture-based diagnosis approach is also better suited to characterizing the texture dependency and spectral properties that are invisible to the human eye and cannot be described using shape measures. This alternative has been investigated in several studies [3]-[9]. A common shortcoming of [4]-[6] and [8]-[9] is that the texture analysis is biased by the presence of the breast calcifications themselves, which are tiny deposits of calcium and cannot be considered malignant or benign lesions.
A few studies [3], [7] have attempted to minimize the bias of texture-based diagnosis by excluding image locations that correspond to the microcalcifications before characterizing the malignancy of a given mammographic region. In answering the question of whether the texture of breast tissue surrounding MCs can contribute to the diagnosis of MC clusters in mammography, [3] and [7] demonstrated that this texture can indeed be useful for cancer diagnosis. Thiele et al. [7] classified 54 MC clusters by extracting texture and fractal features of the region surrounding each cluster and reported a classification sensitivity of 89% and specificity of 83%. Using Laws' texture measures and an exhaustive feature search method, Karahaliou et al. [3] diagnosed 100 MC clusters from the DDSM dataset by analyzing the texture surrounding the MCs and obtained a classification accuracy of 89%. The results of these studies illustrate the importance of analyzing the texture of tissue surrounding MCs for improving the performance of texture-based CADx of breast cancer, and might provide a diagnosis method that can avoid segmentation of MCs.

The feature selection approaches used in [3] and [7] were based on exhaustive search and linear discriminant analysis, respectively. Both have shortcomings: an exhaustive feature search has a higher tendency to over-fit the training data, and feature selection based on deterministic methods such as linear discriminant analysis suffers from the local minima problem more often than heuristic search methods such as genetic algorithms (GA) [11] and particle swarm optimization (PSO) [12]. In this work, we use a heuristic search based on PSO because it is more efficient than a GA approach [13]; that is, a heuristic search using PSO is easy to implement and has fewer parameters to select during the initialization and optimization stages.

Volume 6 Number 4
www.ubicc.org

2 BACKGROUND
2.1 Heuristic parameter selection using PSO
PSO is a population-based heuristic search method [12], [14] inspired by the social behavior of schools of fish and flocks of birds. In the PSO algorithm, the individuals (particles) of the swarm cooperate to find an optimal solution to the parameter selection problem. As in other population search methods, the PSO algorithm starts with a random initialization of the candidates (particles) in the parameter space. During the optimization process, the algorithm stores the location of the personal best fitness $x_k^p = [x_{k1}^p, x_{k2}^p, \ldots]$ achieved by each individual $k$ and the location of the global best fitness $x^g = [x_1^g, x_2^g, \ldots]$ achieved by all particles in the swarm. This information is then used to update the movement and the location of the particles in the parameter space. The new velocity $v_{ki}(t+1)$ is expressed as

$$v_{ki}(t+1) = w\,v_{ki}(t) + c_1 r_1 \left(x_{ki}^p - x_{ki}(t)\right) + c_2 r_2 \left(x_i^g - x_{ki}(t)\right) \qquad (1)$$

where $w$ is a constant, typically in the interval [0.0, 1.0], that represents the inertia of the movement, $r_1$ and $r_2$ are random numbers in [0.0, 1.0], and $c_1$ and $c_2$ are non-negative constants representing the learning rates. To control the search speed, the ith velocity $v_{ki}(t)$ is constrained by the user to lie in the range $[v_i^{Min}, v_i^{Max}]$. Using the new velocity computed in (1), the new location $x_{ki}(t+1)$ is updated as

$$x_{ki}(t+1) = x_{ki}(t) + v_{ki}(t+1) \qquad (2)$$

This iterative search continues until a predefined termination criterion, either a target fitness value or a maximum number of iterations, is reached.

2.2 Texture features based on Haralick measures
Haralick texture features [15] have been used, for example, to characterize Alzheimer's disease in magnetic resonance images (MRI), and also in the diagnosis of microcalcifications in mammography [4]-[6]. Characterizing texture using Haralick measures is based on analyzing the second-order statistics of the gray-level histogram of the given region. This is accomplished by forming a set of gray-level co-occurrence matrices (GLCMs). A GLCM records how often gray levels i and j occur in pixel pairs separated by distances Δx and Δy along the x and y directions. Co-occurrence matrices are usually computed for specific displacements Δx and Δy and four directional angles: 0°, 45°, 90°, and 135°. This process yields four GLCMs. From each GLCM, we compute a set of twenty Haralick measures, as listed in Table 1. This set includes autocorrelation, energy, entropy, contrast, local homogeneity, correlation, cluster shade, cluster prominence, dissimilarity, sum of squares, sum average, maximum probability, sum entropy, difference entropy, sum variance, difference variance, information measure of correlation I, information measure of correlation II, inverse difference normalized (INN), and inverse difference moment normalized. Using the four GLCMs, the average, range, and standard deviation of each texture measure are calculated and used as texture features. This step yields sixty GLCM features. To test the power of the proposed heuristic feature selection approach, we did not apply any of the common feature filters [16] or dimensionality reduction methods, such as eliminating descriptors of poor discriminative power.

Table 1: List of Haralick measures.

| No. | Measure description | No. | Measure description |
|-----|---------------------|-----|---------------------|
| 1 | Autocorrelation | 11 | Sum of squares: Variance |
| 2 | Contrast | 12 | Sum average |
| 3 | Correlation | 13 | Sum variance |
| 4 | Cluster Prominence | 14 | Sum entropy |
| 5 | Cluster Shade | 15 | Difference variance |
| 6 | Dissimilarity | 16 | Difference entropy |
| 7 | Energy | 17 | Information measure of correlation 1 |
| 8 | Entropy | 18 | Information measure of correlation 2 |
| 9 | Homogeneity | 19 | Inverse difference normalized (INN) |
| 10 | Maximum probability | 20 | Inverse difference moment normalized |
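As a rough illustration of the GLCM construction described in Section 2.2, the following pure-Python sketch builds a normalized co-occurrence matrix for each of the four directions at distance 1, computes three of the twenty measures (contrast, energy, entropy), and summarizes each by its average, range, and standard deviation across directions. The function names, the symmetric pair counting, the offset convention, the use of the population standard deviation, and the toy image are all assumptions made for illustration; they are not taken from the authors' implementation.

```python
import math

# (dy, dx) offsets for 0, 45, 90, and 135 degrees at distance 1 (assumed convention)
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(image, levels, dy, dx):
    """Normalized, symmetric gray-level co-occurrence matrix for one offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                i, j = image[y][x], image[y2][x2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both orders
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]

def measures(p):
    """Three of the Haralick measures listed in Table 1."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }

def texture_features(image, levels):
    """Average, range, and SD of each measure over the four directions."""
    per_dir = [measures(glcm(image, levels, dy, dx)) for dy, dx in OFFSETS.values()]
    feats = {}
    for name in per_dir[0]:
        vals = [m[name] for m in per_dir]
        mean = sum(vals) / len(vals)
        feats["avg_" + name] = mean
        feats["range_" + name] = max(vals) - min(vals)
        feats["sd_" + name] = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
    return feats

# Toy 4x4 image with four gray levels (hypothetical data)
toy = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
features = texture_features(toy, levels=4)
print(len(features))  # 3 measures x 3 statistics
```

With all twenty measures in place of the three shown here, the same summary step would give the sixty features described in the text.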
3 CLASSIFICATION USING PSO-KNN
The process of differentiating between malignant and benign MC clusters is accomplished through three stages: texture analysis of the surrounding tissue using Haralick measures, feature selection, and pattern classification using a KNN classifier. As shown in Figure 1, characterizing the texture of the tissue surrounding MCs involves selecting a mammographic region that best fits each MC cluster, segmenting the MCs' surrounding tissue by removing the image regions corresponding to microcalcifications, and analyzing the texture of the surrounding tissue using Haralick measures. We briefly describe the stages of the diagnosis scheme as follows:

Mammographic region selection: The microcalcification cluster ground truth, which represents the radiologist's interpretation of each mammogram and includes the degree of malignancy and the size of the region that best depicts each cluster, is used to select the mammographic region. Because MC clusters vary in size, we used a single region size that is larger than most of the clusters. For instance, regions of 128×128 and 256×256 pixels are used to analyze mammograms of 200 µm and 100 µm spatial resolution, respectively.

Microcalcification segmentation: We used a dual top-hat morphological transform with two structuring elements to segment individual MCs, as introduced in [9]. Each top-hat filtering stage is followed by a thresholding step that produces a binary representation of the segmented microcalcifications. The segmentation outcomes of the two filtering stages are then combined by a logical OR to produce the final segmentation result; an example is shown in Figure 2b.
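A minimal pure-Python sketch of the dual top-hat idea follows: two white top-hat transforms with flat square structuring elements of different sizes, each thresholded, and the two binary masks combined with a logical OR. The structuring-element sizes, thresholds, border handling, and helper names are illustrative assumptions; the text does not give the values used in [9].

```python
def _filter(image, size, pick):
    """Apply min or max (pick) over a size x size window, clipped at the borders."""
    rows, cols, r = len(image), len(image[0]), size // 2
    return [[pick(image[yy][xx]
                  for yy in range(max(0, y - r), min(rows, y + r + 1))
                  for xx in range(max(0, x - r), min(cols, x + r + 1)))
             for x in range(cols)]
            for y in range(rows)]

def white_tophat(image, size):
    """Image minus its morphological opening (erosion followed by dilation)."""
    opened = _filter(_filter(image, size, min), size, max)
    return [[v - o for v, o in zip(row, orow)] for row, orow in zip(image, opened)]

def segment_mcs(image, sizes=(3, 5), thresholds=(50, 50)):
    """Threshold each of the two top-hat outputs, then OR the binary masks."""
    masks = [[[v > t for v in row] for row in white_tophat(image, s)]
             for s, t in zip(sizes, thresholds)]
    return [[a or b for a, b in zip(r1, r2)] for r1, r2 in zip(*masks)]

# Hypothetical 9x9 region: flat background, one bright pixel, one 3x3 bright blob
img = [[100] * 9 for _ in range(9)]
img[1][1] = 200
for y in range(4, 7):
    for x in range(4, 7):
        img[y][x] = 200
mask = segment_mcs(img)
print(sum(map(sum, mask)))  # number of pixels flagged as microcalcification
```

In this toy case the smaller structuring element responds only to the isolated pixel, while the larger one also picks up the 3x3 blob, which is one plausible reason for combining two scales.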
Surrounding tissue segmentation: Using the results of MC segmentation, this step removes the image regions that correspond to the segmented MCs by means of an image subtraction step, producing a region that contains only the surrounding tissue. An example of surrounding tissue segmentation is shown in Figure 2c.

Texture analysis using Haralick features: This stage exploits the second-order gray-level statistics of the texture surrounding each MC cluster using the sixty Haralick GLCM texture descriptors derived from the measures in Table 1.

Embedded feature selection using PSO-KNN: To select the most discriminative Haralick features and to reduce the dimensionality of the input feature space, we used an embedded feature selection strategy that combines a heuristic parameter search based on PSO with a KNN classifier. An embedded feature selection scheme using a PSO-KNN framework incorporates the feature selection stage into the classifier learning process; hence, we propose a hybrid PSO algorithm. Each PSO particle is represented by N+D coordinates, of which N coordinates are allocated to the feature search and D coordinates are used for classifier parameter selection. Because this work uses a KNN classifier for the classification stage, K is the only parameter to be adjusted during the learning process.

PSO-KNN fitness function: Performance criteria such as the classification accuracy and the area under the receiver operating characteristic (ROC) curve, estimated using a leave-one-out (LOO) training and testing method, are possible choices for the objective function of the heuristic optimization in a PSO-KNN framework.
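The N+D particle layout and the PSO update of Section 2.1 can be sketched as follows, with D = 1 for the KNN parameter K. This is a minimal illustration under stated assumptions: positions are real values kept in [0, 1], a 0.5 cut-off decides whether a feature is selected, and the last coordinate is mapped onto a short list of odd K values. The text does not specify how positions are decoded, so `decode`, `update_particle`, and all constants are hypothetical.

```python
import random

def decode(position, n_features, k_choices=(1, 3, 5, 7, 9)):
    """Split an N+1 particle into a feature mask and a KNN neighbor count."""
    mask = [x > 0.5 for x in position[:n_features]]  # assumed selection cut-off
    idx = min(int(position[n_features] * len(k_choices)), len(k_choices) - 1)
    return mask, k_choices[idx]

def update_particle(x, v, p_best, g_best, w=0.7, c1=1.5, c2=1.5,
                    v_max=0.2, rng=random):
    """One PSO step: inertia plus pulls toward the personal and global bests."""
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, p_best, g_best):
        r1, r2 = rng.random(), rng.random()
        vi = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        vi = max(-v_max, min(v_max, vi))            # velocity clamp
        new_v.append(vi)
        new_x.append(min(1.0, max(0.0, xi + vi)))   # assumed position bounds
    return new_x, new_v

# One update of a single particle over 60 features plus K (hypothetical values)
rng = random.Random(0)
n = 60
x = [rng.random() for _ in range(n + 1)]
v = [0.0] * (n + 1)
x, v = update_particle(x, v, p_best=x[:], g_best=[0.5] * (n + 1), rng=rng)
mask, k = decode(x, n)
print(sum(mask), k)  # number of selected features and the decoded K
```

A full search would repeat this update for every particle in the swarm, score each decoded (mask, K) pair with the fitness function, and refresh the personal and global bests after each iteration.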
Figure 1: Characterization of MC clusters using surrounding tissue and PSO-KNN embedded feature selection. Blocks shown: Digital Mammogram, Region Selection*, MC Segmentation, Surrounding Tissue Segmentation, Haralick Feature Extraction, PSO-KNN Feature Selection, KNN Classifier.
* The region selection stage uses the MC cluster ground truth provided with each mammogram.
Figure 2: Segmentation of surrounding tissue. (a) An original mammographic region of 256×256 pixels containing a malignant MC cluster, (b) segmented microcalcifications, and (c) surrounding tissue obtained by subtracting the gray-level image of the segmented MCs in (b) from the original region in (a).
However, for applications such as the diagnosis of microcalcification clusters, it is important to find a solution that achieves both the best classification accuracy and the best generalization performance. In this study, we selected cross-validation based on the leave-one-out (LOO) technique because it gives a nearly unbiased estimate of the classifier's generalization performance [9]. Furthermore, an LOO approach is a good choice when the dataset is relatively small. The LOO fitness $f_i$ of the ith candidate solution of the PSO-KNN scheme is defined as

$$f_i = TPF_i \times TNF_i \times \exp\left(-\frac{N_i - 1}{N}\right) \qquad (3)$$

where $TPF_i$ and $TNF_i$ are the true positive fraction (sensitivity) and true negative fraction (specificity) of the classification performance of the ith solution, and $N$ and $N_i$ are the dimensionalities of the original and the ith selected feature spaces, respectively.

4 EXPERIMENTAL RESULTS
4.1 Test data
This study uses two datasets to validate the proposed characterization and feature selection methods. The first set of mammograms is from the public screen-film mammography dataset provided by the Mammographic Image Analysis Society (MIAS) [17]. Each mammogram in the mini-MIAS dataset is 1024×1024 pixels with a 200 µm pixel size and 8-bit depth. This dataset contains 23 mammograms with MCs, from which we extracted 33 MC clusters (13 benign and 20 malignant).
The second set of mammograms was obtained from the digital mammography system of Bronson Methodist Hospital (BMH, MI). This dataset consists of 30 digital mammograms with a 100 µm pixel size and 16-bit depth. These digital mammograms contain 35 MC clusters, of which 18 are benign and 17 are malignant.

4.2 Experimental setup
All results presented in this paper were produced using a heuristic search algorithm with a swarm size of 100 and a maximum of 50 iterations. The fitness function of the PSO-KNN heuristic embedded feature selection, introduced in Section 3, is designed to find a learning model that is simple yet provides the best generalization and classification performance while keeping the computational complexity of the algorithm acceptable. Because of the difference in spatial resolution between the digitized screen-film mammograms from mini-MIAS and the digital mammograms from Bronson Methodist Hospital (BMH), regions of 128×128 and 256×256 pixels were used to characterize MC clusters from the mini-MIAS and BMH datasets, respectively. Because MC clusters from the MIAS and BMH datasets vary in size, we selected the region size that best fits most of the clusters in each dataset.

4.3 Results Analysis and Discussion
Results of using GLCM features to characterize the MCs' surrounding tissue and the PSO-KNN embedded feature selection scheme to classify the malignancy of MC clusters from the Bronson and mini-MIAS datasets are presented in Table 2. The large number of features (60) in this study and the relatively small size of the datasets led to many feature subsets that produce the same classification accuracy. This challenge was mitigated through an appropriate choice of the fitness value of the PSO-KNN heuristic parameter selection process. Rather than using the classification accuracy alone, which is the percentage of correctly classified examples, this work incorporates the classification specificity, sensitivity, and the dimensionality of the selected feature space in the selection of the best learning model. When the classification accuracy was used as the objective function of the PSO-KNN scheme, several learning models achieved the same classification accuracy. For instance, two models achieved a perfect classification accuracy of 100% on MC clusters from the MIAS dataset, and several models achieved a classification accuracy of 94% on the Bronson dataset. Hence, including both the classification performance and the dimensionality of the selected feature space helped in selecting the best learning models, as presented in Table 3. As shown in Table 3, the best learning models for MIAS and Bronson use five and four Haralick features, respectively.

As for the most discriminative Haralick features, our results indicate that they differ slightly from one dataset to another. This difference arises mostly because mammographic regions from MIAS and BMH differ in spatial and contrast resolution. For the MCs' surrounding tissue in the MIAS dataset, we found the most discriminative features to be the standard deviation of inverse difference normalized, the average of difference entropy, and the standard deviation of cluster shade, while those for the BMH dataset are the average of inverse difference moment normalized, the standard deviation and range of dissimilarity, the standard deviation of sum average, and the standard deviation of inverse difference normalized.
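The sensitivity and specificity that feed the model selection can be estimated for one candidate feature subset with a leave-one-out KNN loop. The sketch below is illustrative only: it assumes squared Euclidean distance, majority voting, labels coded as 1 (malignant) and 0 (benign), and a toy dataset; the function name `loo_knn_rates` is hypothetical.

```python
def loo_knn_rates(samples, labels, mask, k):
    """Leave-one-out KNN on the selected features; returns (TPF, TNF)."""
    cols = [i for i, keep in enumerate(mask) if keep]
    tp = tn = 0
    for i, (xi, yi) in enumerate(zip(samples, labels)):
        # Distances from the held-out sample to every other sample
        dists = sorted(
            (sum((xi[c] - xj[c]) ** 2 for c in cols), yj)
            for j, (xj, yj) in enumerate(zip(samples, labels)) if j != i)
        votes = sum(y for _, y in dists[:k])   # malignant votes among k neighbors
        pred = 1 if votes * 2 > k else 0
        tp += int(pred == 1 and yi == 1)
        tn += int(pred == 0 and yi == 0)
    return tp / labels.count(1), tn / labels.count(0)

# Toy data: feature 0 separates the classes, feature 1 is noise (hypothetical)
samples = [[0.0, 5], [0.1, 1], [0.2, 9], [1.0, 2], [1.1, 8], [1.2, 4]]
labels = [0, 0, 0, 1, 1, 1]
tpf, tnf = loo_knn_rates(samples, labels, mask=[True, False], k=3)
print(tpf, tnf)
```

Each candidate (feature mask, K) pair proposed by the search would be scored this way, so the cost of one fitness evaluation grows with the square of the number of clusters.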
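Table 2: Classification of MC clusters using the PSO-KNN approach and Haralick features.

| Dataset | K | N | TPF | TNF | Accuracy | Fitness |
|---------|---|---|-----|-----|----------|---------|
| miniMIAS | 3 | 5 | 1 | 1 | 1 | 0.064 |
| miniMIAS | 3 | 7 | 1 | 1 | 1 | 0.095 |
| miniMIAS | 3 | 2 | 0.95 | 0.92 | 0.94 | 0.138 |
| miniMIAS | 3 | 6 | 1 | 0.92 | 0.96 | 0.151 |
| Bronson | 3 | 7 | 0.94 | 0.94 | 0.94 | 0.196 |
| Bronson | 3 | 4 | 0.88 | 1.0 | 0.94 | 0.161 |
| Bronson | 3 | 6 | 0.94 | 0.94 | 0.94 | 0.180 |
| Bronson | 3 | 5 | 0.88 | 1.0 | 0.94 | 0.175 |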
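The fitness of Section 3, $f_i = TPF_i \times TNF_i \times \exp(-(N_i - 1)/N)$, can be evaluated directly for rows of Table 2. The sketch below assumes N = 60, reads the second rate in each row as the specificity, and takes 15/17 as the unrounded sensitivity behind the reported 0.88; the function name is hypothetical. Under these assumptions, the Fitness column appears to correspond to $1 - f_i$, so that lower values are better.

```python
import math

def loo_fitness(tpf, tnf, n_selected, n_total=60):
    """f_i = TPF_i * TNF_i * exp(-(N_i - 1) / N)."""
    return tpf * tnf * math.exp(-(n_selected - 1) / n_total)

# Rows assumed from Table 2: (TPF, TNF, N_i, reported fitness)
rows = [(1.0, 1.0, 5, 0.064), (1.0, 1.0, 7, 0.095), (15 / 17, 1.0, 4, 0.161)]
for tpf, tnf, n_i, reported in rows:
    cost = 1.0 - loo_fitness(tpf, tnf, n_i)
    print(f"N_i={n_i}: 1 - f_i = {cost:.3f} (table: {reported})")
```

The exponential term penalizes larger feature subsets, which is why two models with identical sensitivity and specificity (the first two rows) still receive different fitness values.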
Table 3: Details of the PSO-KNN learning models that achieve the highest classification performance.

| Dataset | K | N | TPF | TNF | Accuracy | Haralick features |
|---------|---|---|-----|-----|----------|-------------------|
| miniMIAS | 3 | 5 | 1.0 | 1.0 | 1.0 | Avg. of inverse difference normalized; Avg. of difference entropy; SD of cluster shade; SD of cluster prominence; Range of inverse difference normalized |
| Bronson | 3 | 4 | 0.88 | 1.0 | 0.94 | Avg. of inverse difference normalized; SD of dissimilarity; Range of dissimilarity; Range of sum average |
In this work, we also examined the impact of the size of the mammographic region (i.e., the region of surrounding tissue) on the performance of the malignancy analysis and on the outcome of classification and feature selection using PSO-KNN. Results from the mini-MIAS dataset indicate the importance of using an appropriate region size for the surrounding tissue. Specifically, when regions of 256×256 pixels were used to analyze the surrounding tissue and classify MC clusters from mini-MIAS, the best result was 94% accuracy, corresponding to a TPF of 1.0, a TNF of 0.90, and three Haralick features. The results also showed significant differences in which Haralick features produce the best classification performance. However, this result needs further validation using other datasets and other sizes of the surrounding tissue.

5 CONCLUSIONS
In this paper, we characterized the malignancy of MC clusters using Haralick features of the MCs' surrounding tissue integrated into an embedded feature selection framework based on a heuristic PSO-KNN approach. Two mammogram datasets were used to validate the surrounding-tissue hypothesis and to investigate the effectiveness of the proposed feature selection and classification methods. Using MC clusters from the mini-MIAS and BMH datasets, we achieved classification accuracies of 100% and 94%, respectively. The results of this study also indicate that the chosen size of the surrounding tissue has some impact on the feature selection and classifier outcomes. This work also reveals the potential of the second-order statistics of the MCs' surrounding tissue as another tool to assist radiologists in the diagnosis of breast cancer.

6 REFERENCES
[1] M. Kallergi: Computer-aided diagnosis of mammographic microcalcification clusters, Medical Physics, Vol. 31, No. 2, pp. 314-326 (2004).
[2] M. Elter and A. Horsch: CADx of mammographic mass and clustered microcalcifications: A review, Medical Physics, Vol. 36, No. 6, pp. 2052-2068 (2009).
[3] A. Karahaliou, I. Boniatis, P. Sakellaropoulos, S. Skiadopoulos, G. Panayiotakis, and L. Costaridou: Can texture of tissue surrounding microcalcifications in mammography be used for breast cancer diagnosis? Nuclear Instruments and Methods in Physics Research, Vol. 580, pp. 1071-1074 (2007).
[4] H. S. Zadeh, F. R. Rad, and S. P. Nejad: Comparison of multiwavelet, wavelet, Haralick, and shape features for microcalcification classification in mammograms, Pattern Recognition, Vol. 37, pp. 1973-1986 (2004).
[5] H. S. Zadeh, P. S. Nezhad, and F. R. Rad: Shape-based and texture-based feature extraction for classification of microcalcifications in mammograms, Proceedings from SPIE Medical Imaging, Vol. 4322, pp. 3010-310 (2001).
[6] H. P. Chan, B. Sahiner, K. L. Lam, N. Petrick, M. A. Helvie, M. M. Goodsitt, and D. D. Adler: Computerized analysis of mammographic microcalcifications in morphological and texture feature spaces, Medical Physics, pp. 2007-2019 (1998).
[7] D. L. Thiele, C. Kimme-Smith, T. D. Johnson, M. McCombs, and L. W. Bassett: Using tissue texture surrounding calcification clusters to predict benign vs malignant outcomes, Medical Physics, Vol. 23, pp. 549-555 (1996).
[8] A. P. Dhawan, Y. Chitre, C. Bonasso, and K. Wheeler: Radial-basis-function-based classification of mammographic microcalcifications using texture features, Proceedings of the 17th Annual International Conference and 21st Canadian Medical and Biological Engineering Conference, pp. 535-536 (1995).
[9] I. Zyout and I. Abdel-Qader: Characterization of clustered microcalcifications using multiscale Hessian based feature extraction, IEEE International Conference on Electro/Information Technology (EIT2010), Normal, IL, USA (2010).
[10] I. Zyout: Toward automated detection and diagnosis of mammographic microcalcifications, Ph.D. dissertation, Dept. of Elect. and Comp. Eng., Western Michigan University (2010).
[11] W. Siedlecki and J. Sklansky: A note on genetic algorithm for large scale feature selection, Pattern Recognition Letters, Vol. 10, pp. 335-347 (1989).
[12] J. Kennedy and R. Eberhart: Particle swarm optimization, IEEE International Conference on Neural Networks, Perth: IEEE Service Center, Piscataway, NJ, Vol. 4, pp. 1942-1948 (1995).
[13] X. C. Guo, J. H. Yang, G. C. Wu, C. Y. Wang, and Y. C. Liang: A novel LS-SVMs hyperparameter selection based on particle swarm optimization, Neurocomputing, Vol. 71, pp. 3211-3215 (2008).
[14] J. Kennedy and R. C. Eberhart: A discrete binary version of the particle swarm algorithm, Conference on Systems, Man, and Cybernetics, Piscataway, NJ, pp. 4104-4109 (1997).
[15] R. M. Haralick: Statistical and structural approaches to texture, Proceedings of the IEEE, Vol. 67, No. 5, pp. 786-804 (1979).
[16] I. Guyon: An introduction to variable and feature selection, Journal of Machine Learning Research, Vol. 3, pp. 1157-1182 (2003).
[17] J. Suckling, J. Parker, D. Dance, S. Astley, I. Hutt, C. Boggis, I. Ricketts, E. Stamatakis, N. Cerneaz, S. Kok, P. Taylor, D. Betal, and J. Savage: The mammographic image analysis society digital mammogram database, Excerpta Medica, Vol. 1069, pp. 375-378 (1994).