Comparison of color clustering algorithms for segmentation of dermatological images

Rudy Melli, Costantino Grana, Rita Cucchiara

Dipartimento di Ingegneria dell'Informazione, University of Modena and Reggio Emilia, 41100 Modena, Italy

ABSTRACT

Automatic segmentation of skin lesions in clinical images is a very challenging task. It is necessary for the visual analysis of the edges, shape and colors of the lesions that supports melanoma diagnosis but, at the same time, it is cumbersome because lesions (both naevi and melanomas) do not have a regular shape, uniform color, or univocal structure. Most approaches adopt unsupervised color clustering. This work compares the most widespread color clustering algorithms, namely median cut, k-means, fuzzy c-means and mean shift, applied to a method for automatic border extraction, and provides an evaluation of the upper bound on the accuracy that can be reached with these approaches. Different tests have been performed to examine the influence of the parameter settings on the performance of the algorithms. Then a new supervised learning phase is proposed to select the best number of clusters and to segment the lesion automatically. Experiments have been carried out on a large database of medical images, manually segmented by dermatologists. In these experiments mean shift proved to be the best technique in terms of sensitivity and specificity. Finally, a qualitative evaluation of the segmentation quality has been carried out by human experts, confirming the results of the quantitative comparison.

Keywords: Dermoscopic images, color clustering, segmentation, boundary extraction, computer-aided diagnosis

1. INTRODUCTION

Many papers have dealt with the problem of segmenting dermatologic images of skin lesions, either macroscopic or dermoscopic, to support melanoma diagnosis. The former are obtained with standard camera acquisition, while the latter, known as epiluminescence microscopy (ELM) or skin-surface microscopy, are taken by placing a liquid contact medium on the skin and then compressing it with a glass-like surface; in addition, incident surface lighting is used. This technique is now widely recognized to enhance the features of pigmented skin lesions and thus the automated clinical diagnosis [1, 2]. Indeed, the oil immersion makes the skin translucent and permits a better visualization of surface and subsurface structures; the resulting images are more detailed and better suited to computer analysis [3] than macroscopic images (see Fig. 1). Lesions, both naevi and melanomas, have neither regular edges and shape nor uniform color, and can be distinguished from the skin because, for biological reasons, the lesion appears darker than the skin. Moreover, most of the lesions for which the diagnosis is uncertain do not present a clear bimodal luminance distribution between skin and lesion, so adaptive thresholds on luminance (e.g. Otsu's threshold [4]) can achieve only an approximate segmentation. Color classification based on absolute color thresholds (learned or manually tuned) is not effective because most current image acquisition systems in dermatology do not really cope with color calibration. This has forced a return to unsupervised approaches, which typically perform worse than trained systems but are much more robust to uncalibrated acquisitions.
In this work we provide a new procedure for lesion boundary extraction that is satisfactory for dermatological experts and can be exploited in further automatic analysis [5]. Four widely employed color clustering algorithms are compared: median cut [6], k-means [7], fuzzy c-means [8] and mean shift [9, 10], without employing any spatial constraint, in order to verify whether they show significant differences in performance in separating the lesion colors from the skin colors. Different tests were performed to verify the influence of the parameter settings and of the lesion characteristics on the algorithms. This fair comparison demonstrates the small but not negligible advantage of mean shift when a high number of clusters is used and the clusters are merged into the two categories skin and lesion. We therefore developed an approach to select the best number of clusters and a simple but effective learning step for the merging stage. The approach has been extensively tested.

Further author information: E-mail: fmelli.rudy, grana.costantino, [email protected], Web: http://imagelab.ing.unimo.it, Telephone: +39 059 2056270

Figure 1. The same image with different acquisitions: (a) macroscopic (clinical) image on the left and (b) dermoscopic image on the right.

The paper is structured as follows: Section 2 describes the related work on boundary detection in dermoscopic images and on clustering-based segmentation; Section 3 describes our proposed approach; Section 4 details the mean shift algorithm; Section 5 describes the training and classification processes; Section 6 reports the experimental results; Section 7 presents the conclusions of our work and the Web platform developed for acquiring human quality assessments.

2. RELATED WORKS

In the next subsection we review the methods most used for border detection in medical images; in the following subsection we discuss clustering theory and, in particular, the four algorithms compared in this work.

2.1. Medical Image Segmentation

A lesion can be distinguished from the surrounding skin using different features such as color, brightness or luminance, texture and shape. Segmentation is the first step towards a correct border extraction. Due to its simplicity, thresholding is the most used operation: Xu et al. [11], after a preprocessing step, determined a threshold from the average intensity of the high-gradient pixels in the previously computed intensity image. Ganster et al. [12] used global thresholding on a bimodal histogram and subsequently [13] adopted a fusion of three different algorithms with heuristic rules: global thresholding, dynamic thresholding and a 3-D color clustering concept. Since color is definitely the most significant feature, and the assumption of color uniformity for the skin is commonly accepted, the most used approach to detect the boundary in a dermoscopic image is unsupervised color clustering. Hance et al. [18] compared the accuracy of six different color segmentation techniques: adaptive thresholding, SCT/center split, split and merge, multiresolution segmentation and, in particular, median cut (combined with the Principal Component Transform) and fuzzy c-means. For all the methods the number of colors for segmentation was kept constant at three and no other multi-cluster analysis was taken into account. Schmid [19] used a modified version of fuzzy c-means in which the number of classes depends on the number of maxima in the histogram information of the three color components [20]. Cucchiara et al. [21] adopted an interactive fuzzy c-means with two clusters to segment the lesion and a recursive version of that clustering algorithm to identify internal structures. No effort was spent on quantifying the performance, on comparing methods, or on assessing the need for different numbers of colors.

2.2. Clustering

In the case of segmentation, clustering is used to quantize the image and reduce the number of colors by selecting the most representative ones. This is a problem of exclusive and unsupervised classification, the core of cluster analysis [22]. Here we provide a brief explanation of the four clustering algorithms used in this work. Median cut, first proposed by Paul Heckbert in 1982 [6], creates the desired number of color classes, each holding roughly the same number of points: the color space is iteratively divided into two boxes at the median point. K-means [7] is one of the simplest unsupervised learning algorithms: k centroids, one for each cluster, are defined, and the algorithm aims at minimizing an objective function that is the sum of the squared Euclidean distances between the k cluster centroids $c_k$ and their data points $x_i^{(k)}$. Fuzzy c-means (FCM), developed by Dunn [8] and improved by Bezdek [23], is a clustering method in which each point of the data set is related to all the clusters at the same time, through weights. These last two methods depend on the initial placement of the centroids; to make a fair comparison, the same equally distributed centroids have been chosen. Finally, mean shift, proposed in 1975 by Fukunaga and Hostetler [9], was applied to color image clustering in 1997 by Comaniciu and Meer [10]. The mean shift algorithm is considered a powerful technique for image segmentation and a revolutionary strategy that performs multistart global optimization.
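As an illustration of the k-means objective described above, the following is a minimal sketch of Lloyd's iteration on RGB pixels. It is not the implementation used in the experiments: the function name `kmeans_colors`, the fixed iteration count and the choice of initial centroids evenly spaced along the gray diagonal of the RGB cube (one possible reading of "equally distributed") are assumptions made for the example.

```python
import numpy as np

def kmeans_colors(pixels, k, n_iter=20):
    """Lloyd's k-means on an (N, 3) array of RGB values (illustrative sketch)."""
    pixels = np.asarray(pixels, dtype=float)
    # Assumed initialisation: k gray levels evenly spaced inside [0, 255].
    levels = np.linspace(0.0, 255.0, k + 2)[1:-1]
    centroids = np.repeat(levels[:, None], 3, axis=1)

    def assign(c):
        # Squared Euclidean distance from every pixel to every centroid.
        d2 = ((pixels[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    return assign(centroids), centroids
```

The dense distance matrix keeps the sketch short but needs memory proportional to N × k, so for a full 768x512 image it would normally be evaluated in chunks or on the color histogram.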

3. PROPOSAL

We propose a mixed approach based on an unsupervised segmentation and a supervised skin classification. The main idea is to obtain two super-classes, skin and lesion, by merging the clusters produced by the clustering process into two groups. This is done through a classification based on the recognition of the skin. This supervised classification is based not only on the appearance of the color clusters but also on their position, using spatial constraints. Indeed, in these medical images the lesion almost always occupies the center of the image without covering the corners.

Figure 2. Flow chart of our proposed border detection process.

Fig. 2 shows the flow chart of our proposed border extraction algorithm. There are four important steps: Training, Clustering, Classification and Border Extraction. The image is quantized by a clustering algorithm, then the levels obtained are partitioned into two groups called skin and lesion. We will demonstrate that, if a correct classification can be performed, the recovered boundary is optimal. Thus a color and spatial classification of the skin obtains a very satisfactory approximation of the ideal cluster partitioning. In the following sections we explain the theory behind each of these four steps.

4. COLOR CLUSTERING WITH MEAN SHIFT

Clustering-based segmentation aims to find the k clusters most representative of the color distribution of the image. Mean shift is a technique for the analysis of feature spaces. When it is used for color image segmentation, the image data is mapped into the feature space (here, the RGB space). The algorithm exploits an iterative procedure that shifts each data point to the average of the data points in its neighborhood. The algorithm is based on a kernel K, which we chose to be flat in our experiments. The sample mean at $x \in X$ is

$$m(x) = \frac{\sum_{s \in S} K(s - x)\, s}{\sum_{s \in S} K(s - x)} \tag{1}$$

where S is a finite set embedded in X. The difference $m(x) - x$ is called mean shift in Fukunaga and Hostetler [9]. The repeated movement of data points to the sample means is called the mean shift algorithm [9, 26]. In each iteration of the algorithm, s is updated with m(s) for all $s \in S$ simultaneously. The algorithm ends when the mean of the mean shift over all points of the data set is less than a given threshold. Differently from the basic implementation of this algorithm, we compute the 3D RGB color histogram of the image and, at each iteration of the mean shift convergence, we update the histogram accordingly. Unlike the other algorithms tested, mean shift does not need a fixed number of clusters. Indeed, its procedure is based on a kernel density estimator that has a different parameter, the bandwidth, i.e. the size of the neighborhood. To compare this algorithm with the others we applied mean shift to the entire database with several different bandwidths, logging the number of resulting clusters. We then computed the interpolation curve that minimizes the mean square error, to determine the approximation function

$$\text{bandwidth} = 986.65\, n^{-1.4787} \tag{2}$$

where n is the desired number of clusters. This function allows the system to match mean shift to the other algorithms for the comparison. Mean shift still remains different from the others because, even when the function is applied, the number of resulting clusters is not constant and equal to n, but can vary in a range around n.

5. SKIN COLOR TRAINING AND CLASSIFICATION
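The clustering stage that produces the clusters classified in this section can be sketched as follows: a flat-kernel mean shift in the sense of Eq. (1), where every point is replaced by the mean of its neighbors simultaneously, together with the bandwidth fit of Eq. (2). This is a hedged sketch, not the authors' implementation: it works directly on points rather than on the 3D RGB histogram, and the names `mean_shift_flat`, `group_modes`, `tol` and `radius` are assumptions introduced for the example.

```python
import numpy as np

def bandwidth_for_clusters(n):
    """Eq. (2): bandwidth expected to yield roughly n clusters."""
    return 986.65 * n ** -1.4787

def mean_shift_flat(points, bandwidth, tol=1e-3, max_iter=100):
    """Update every point s to m(s) of Eq. (1), all points at once."""
    s = np.asarray(points, dtype=float).copy()
    for _ in range(max_iter):
        # Flat kernel: weight 1 for points within `bandwidth`, else 0.
        dist = np.linalg.norm(s[:, None, :] - s[None, :, :], axis=2)
        k = (dist <= bandwidth).astype(float)
        m = (k @ s) / k.sum(axis=1, keepdims=True)
        # Stop when the mean shift, averaged over all points, is small.
        shift = np.linalg.norm(m - s, axis=1).mean()
        s = m
        if shift < tol:
            break
    return s

def group_modes(converged, radius):
    """Greedily merge converged points closer than `radius` into clusters."""
    modes = []
    labels = np.empty(len(converged), dtype=int)
    for i, p in enumerate(converged):
        for j, mode in enumerate(modes):
            if np.linalg.norm(p - mode) <= radius:
                labels[i] = j
                break
        else:
            modes.append(p)
            labels[i] = len(modes) - 1
    return labels, np.array(modes)
```

The pairwise distance matrix grows quadratically with the number of points, which is one practical reason to run the iteration on histogram bins instead of on individual pixels.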

We assume that the lesion occupies mostly the central part of the image, so that only skin clusters should cover the four corners of the image. Clusters are merged into the skin region if they contain a sufficient number of pixels and their colors have been trained as skin colors. We use the colors of the image corners that are not covered by the lesion as training colors for the classification of skin clusters. Fig. 3 gives some examples of the corner pixels used for the training.
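The merging rule can be illustrated with a short sketch. It assumes that a cluster counts as "trained as skin" when enough of the training (corner) pixels fall into it; the function name `classify_skin_clusters` and the threshold `min_pixels` are hypothetical, since the text does not state the actual value or criterion.

```python
import numpy as np

def classify_skin_clusters(labels, corner_mask, min_pixels=50):
    """Return a boolean lesion map from a cluster label image.

    labels      : (H, W) array of non-negative cluster indices
    corner_mask : (H, W) boolean array, True on the training pixels
    min_pixels  : assumed threshold for "a sufficient number of pixels"
    """
    n_clusters = int(labels.max()) + 1
    # Number of training pixels that fall into each cluster.
    counts = np.bincount(labels[corner_mask], minlength=n_clusters)
    skin_clusters = counts >= min_pixels
    # Pixels whose cluster was not recognised as skin are kept as lesion.
    return ~skin_clusters[labels]
```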

Figure 3. Example of skin detection with the corners mask. No lesion pixels are present.

To evaluate the closeness of centroids (clusters), the same metric was used for every algorithm to measure the distance between data: the Euclidean distance in the RGB color space.

Figure 4. Example of large lesions that cover a corner.

Figure 5. Example of the proposed algorithm step by step: (a) original image; (b) clustered image; (c) skin suppression; (d) boundary extraction.

In order to prevent a worse classification in the case of large lesions whose border covers the corners, an approximate initial segmentation based on an adaptive threshold of the luminosity is exploited and a region of interest ROI(I), where the lesion could be present, is selected. Fig. 4 gives some examples of the classification mask, defined using the predicate

$$C_m = \{\, x \in I : (x \in \mathrm{Angle}(I)) \wedge \neg (x \in \mathrm{ROI}(I)) \,\} \tag{3}$$
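A possible construction of the mask of Eq. (3) is sketched below: corner pixels are kept only if they lie outside the region of interest. The corner size (`corner_frac`) and the use of the mean luminance as a stand-in for the adaptive threshold are assumptions for the example, as the text does not specify either.

```python
import numpy as np

def corner_training_mask(luminance, corner_frac=0.15, threshold=None):
    """Boolean mask of corner pixels that are not inside the lesion ROI."""
    h, w = luminance.shape
    ch, cw = max(1, int(h * corner_frac)), max(1, int(w * corner_frac))
    # Angle(I): the four corner rectangles of the image.
    angle = np.zeros((h, w), dtype=bool)
    angle[:ch, :cw] = angle[:ch, -cw:] = True
    angle[-ch:, :cw] = angle[-ch:, -cw:] = True
    # ROI(I): rough lesion estimate; the lesion is darker than the skin.
    if threshold is None:
        threshold = luminance.mean()  # placeholder for the adaptive threshold
    roi = luminance < threshold
    return angle & ~roi
```

With a large lesion reaching one corner, that corner is excluded from the training pixels while the remaining three are still used.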

Boundary extraction is the final step. First a labeling algorithm is applied, and the object with the maximum area and with the barycenter far off the center of the image is taken and filled. The remaining objects are discarded, so the resulting image contains only the lesion (Fig. 5(c)). To extract the boundary from the binary image, the Chain Code algorithm is used to follow the edge (see Fig. 5(d)).

6. EXPERIMENTAL RESULTS

The database used for this work is composed of 117 dermoscopic images with a resolution of 768x512 pixels and 24-bit color depth. For each image a reference boundary, drawn by hand by a group of dermatologists, is available. To measure the accuracy of the tests we used sensitivity and specificity (widely used in medical image analysis [27]) and a score calculated as the average of sensitivity and specificity. The score takes both measures into account and allows a direct comparison between results; indeed, a high value of both sensitivity and specificity means a more accurate boundary.

6.1. Clustering accuracy evaluation

The first aim was to demonstrate the reliability of clustering algorithms in reaching an optimal boundary of skin lesions, and the dependency on the number of clusters of the quantized image. Fig. 6(a) shows the maximum score achievable by each clustering algorithm for 12 different numbers of classes: 2, 3, 4, 5, 6, 7, 8, 16, 24, 32, 64 and 128. The maximum in Fig. 6(a) is an upper bound, reachable only in the best condition (i.e. when the classification of the clusters into skin and lesion is known a priori). It is evident that, as the number of clusters increases, the accuracy of the detected boundary increases for all the algorithms. The resulting values are always higher than 0.93, demonstrating that clustering is a good approach to finding the boundary in skin lesion images, provided that a correct classification of the clusters is available.
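The measures and the upper bound can be made concrete with a small sketch. It assumes the usual pixel-wise definitions of sensitivity and specificity with the lesion as the positive class, and it computes the a priori bound by giving each cluster to the class that contributes more to the score; the text does not say how the maximum was actually obtained, so `oracle_lesion_map` is only one plausible way to do it.

```python
import numpy as np

def sensitivity_specificity_score(pred, ref):
    """Pixel-wise sensitivity, specificity and their average (the score)."""
    pred, ref = np.asarray(pred, bool), np.asarray(ref, bool)
    tp = np.sum(pred & ref)    # lesion pixels detected as lesion
    fn = np.sum(~pred & ref)   # lesion pixels missed
    tn = np.sum(~pred & ~ref)  # skin pixels detected as skin
    fp = np.sum(pred & ~ref)   # skin pixels detected as lesion
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return sens, spec, (sens + spec) / 2

def oracle_lesion_map(labels, ref):
    """Best skin/lesion split of the clusters when the reference is known."""
    ref = np.asarray(ref, bool)
    n = int(labels.max()) + 1
    in_lesion = np.bincount(labels[ref], minlength=n)
    in_skin = np.bincount(labels[~ref], minlength=n)
    # The score is additive over clusters: a cluster adds in_lesion / P to
    # sensitivity if labelled lesion, or in_skin / N to specificity if skin.
    lesion_clusters = in_lesion / ref.sum() > in_skin / (~ref).sum()
    return lesion_clusters[labels]
```

Because the score decomposes cluster by cluster, this per-cluster choice maximizes it exactly for a given clustering, which is why finer clusterings can only raise the bound.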

Figure 6. Experimental results. (a) Quantitative analysis of the clustering algorithms. (b) Score obtained in boundary detection with skin training for different numbers of color clusters.

Table 1. Comparison of the four clustering algorithms in the accuracy of automatic boundary detection with skin training for different numbers of color clusters. In the original table the maximum values are highlighted in gray.

Median cut
Clusters   FP          FN          Specificity   Sensitivity   Score
2          24179.96    6521.53     0.93          0.94          0.93
3          4230.68     11314.56    0.98          0.92          0.95
4          7845.23     10732.09    0.97          0.92          0.94
5          7845.22     11083.93    0.97          0.91          0.94
6          8503.79     7465.41     0.97          0.95          0.96
7          9948.68     7960.23     0.96          0.92          0.94
8          10786.95    7606.62     0.96          0.93          0.94
16         16565.19    5467.97     0.94          0.93          0.93
24         25808.54    4196.79     0.91          0.93          0.92
32         58399.8     2503.21     0.81          0.96          0.89
64         134992.6    1360.83     0.56          0.99          0.77
128        198874      139.92      0.34          1.00          0.67

K-means
Clusters   FP          FN          Specificity   Sensitivity   Score
2          9244.74     7964.87     0.97          0.94          0.95
3          4982.9      15195.43    0.98          0.88          0.93
4          6968.79     15716.53    0.97          0.87          0.92
5          8217.26     11160.79    0.97          0.91          0.94
6          6040.44     7524.17     0.98          0.95          0.96
7          6971.28     10847.01    0.97          0.91          0.94
8          11754.32    8641.42     0.96          0.91          0.93
16         10857.96    6477.53     0.96          0.93          0.94
24         24006.97    5311.06     0.92          0.94          0.93
32         34716.97    4803.89     0.89          0.95          0.92
64         117002.2    2481.32     0.62          0.97          0.8
128        183830      282.46      0.38          1.00          0.69

Fuzzy c-means
Clusters   FP          FN          Specificity   Sensitivity   Score
2          1701.11     10522.83    0.99          0.91          0.95
3          3631.97     15559.44    0.98          0.88          0.93
4          4294.18     9537.66     0.98          0.93          0.96
5          5104.47     10042.95    0.98          0.92          0.95
6          5798.03     8932.19     0.98          0.93          0.96
7          8841.88     8689.14     0.97          0.89          0.93
8          9522.98     7015.92     0.96          0.93          0.95
16         17578.5     6989.09     0.94          0.9           0.92
24         35591.27    7915.88     0.88          0.86          0.87
32         61560.51    4500.73     0.8           0.94          0.87
64         119214.2    2143.94     0.61          0.97          0.79
128        215388      566.35      0.25          0.99          0.62

Mean shift
Clusters   FP          FN          Specificity   Sensitivity   Score
2          4043.44     14028.59    0.98          0.91          0.94
3          4204.03     13694.56    0.98          0.92          0.95
4          5418.34     11055.43    0.97          0.93          0.95
5          5039.74     13366.44    0.98          0.92          0.95
6          10784.61    12527.32    0.96          0.93          0.95
7          7282.47     14421.78    0.97          0.92          0.95
8          6070.23     10828.87    0.97          0.94          0.96
16         9300.1      8054.52     0.96          0.95          0.96
24         9626.96     8484.67     0.96          0.94          0.96
32         9925.64     8272.96     0.96          0.94          0.96
64         10885.43    6718.71     0.96          0.95          0.96
128        12135.26    5635.9      0.95          0.96          0.95

6.2. Comparison

The procedure described in Section 5 was used to extract the boundaries from each image quantized with the four algorithms. The algorithms were applied with the 12 different color cluster configurations and, for each image, the boundary was compared with the reference one by calculating the score. Table 1 shows the results of the tests with our approach (plotted in Fig. 6(b)), which distinguishes the clusters according to a training phase on skin pixels. It is noticeable that, as the number of clusters increases, the sensitivity tends to increase but the specificity decreases, due to the growth of false positives (see Table 1), especially above 24 levels. This behavior is caused by the difficulty of classifying the clusters properly when their number is too high (Fig. 7), except for mean shift. As a matter of fact, its specificity does not decrease appreciably, as can be seen in Fig. 6(b) (dotted bold line). This is an interesting experimental validation of the good performance of the mean shift algorithm. Indeed, its ability to adapt itself to the features of the target image allows it to create clusters that are more distant from each other in the feature space, making their classification more straightforward. Fig. 7 shows an example of automatic boundary detection obtained with 32 clusters; clearly, mean shift (Fig. 7, rightmost image) is still accurate, whereas the other algorithms fail. Fig. 8 depicts some examples where mean shift slightly outperforms the other methods. Table 1 and Fig. 6(b) also show that the best boundaries are obtained with 6 clusters for median cut, k-means and fuzzy c-means and with 64 for mean shift.

Figure 7. Automatic boundary detection on a 32-cluster image with the four algorithms (from left to right: median cut, k-means, fuzzy c-means and mean shift). Mean shift continues to perform a good segmentation, whereas the other algorithms produce an over-segmentation.

Figure 8. Examples of automatic boundary detection with the four clustering algorithms: median cut (first column), k-means (second column), fuzzy c-means (third column) and mean shift (fourth column).

Figure 9. Flow diagram of the votes for the algorithms expressed by expert university dermatologists.

7. CONCLUSION

After performing 5616 segmentation runs for the analysis of the upper bound and 5616 runs with our approach, we can conclude that color clustering is very suitable for dermatological image segmentation: all color clustering methods are sufficiently accurate if the right number of clusters is used, although mean shift has demonstrated a higher stability with respect to the parameter variation. The final thread of this work, in addition to the quantitative comparison, was a qualitative comparison carried out by human experts. We created a Web site with the segmentation results on the whole database, choosing the best configuration for each algorithm, and asked the dermatologists to vote on the skin lesion contours detected by the four algorithms. The data acquired is a verification test of the performance of each algorithm and a qualitative evaluation independent of the previous data obtained with the reference boundary. In this case too, mean shift was evaluated as the best method (Fig. 9). This result is important because it is a double confirmation of the good performance of this algorithm, under both computer and human evaluation.

8. ACKNOWLEDGMENT

This project has been funded by MIUR-PRIN Project 2004-2006. We thank the Dermatologic Department of the University of Modena and Reggio Emilia for the data evaluation.

REFERENCES

1. H. Pehamberger, M. Binder, A. Steiner, and K. Wolff, "In vivo epiluminescence microscopy: Improvement of early diagnosis of melanoma," J Invest Dermatol 100, pp. 356S–362S, 1993.
2. W. Stolz, O. Braun-Falco, M. Landthaler, A. Bilek, and P. Cognetta, Color Atlas of Dermatoscopy, 1994.
3. G. Argenziano, H. Soyer, V. Giorgi, and D. Piccolo, "Dermoscopy: a tutorial," EDRA Medical Publishing and New Media, 2000.
4. N. Otsu, "A threshold selection method from gray-level histograms," IEEE Trans. Systems Man Cybernet 9(1), pp. 62–69, 1979.
5. C. Grana, G. Pellacani, R. Cucchiara, and S. Seidenari, "A new algorithm for border description of polarized light surface microscopic images of pigmented skin lesions," in IEEE Trans. on Medical Imaging, 22, Aug. 2003.
6. P. Heckbert, "Color image quantization for frame buffer display," Computer Graphics 16, July 1982.
7. J. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proc. of 5-th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California, 1967.
8. J. Dunn, "A fuzzy relative of the isodata process and its use in detecting compact well-separated clusters," Journal of Cybernetics (3), pp. 32–57, 1973.
9. K. Fukunaga and L. D. Hostetler, "The estimation of the gradient of a density function, with applications in pattern recognition," in IEEE Trans. Information Theory, 21, pp. 32–40, 1975.
10. D. Comaniciu and P. Meer, "Robust analysis of feature spaces: Color image segmentation," in Proc. of IEEE Int'l Conference on Computer Vision and Pattern Recognition, pp. 750–755, 1997.
11. L. Xu, M. Jackowski, A. Goshtasby, D. Roseman, S. Bines, C. Yu, A. Dhawan, and A. Huntley, "Segmentation of skin cancer images," Image Vision Computing 17, pp. 65–74, 1999.
12. H. Ganster, M. Gelautz, A. Pinz, M. Binder, H. Pehamberger, M. Bammer, and J. Krocza, Initial Results of Automated Melanoma Recognition, 1995.
13. H. Ganster, A. Pinz, R. Röhrer, E. Wildling, M. Binder, and H. Kittler, "Automated melanoma recognition," in Proc. of IEEE Trans. on Medical Imaging, 20(3), 2001.
14. T. Lee, V. Ng, D. McLean, A. Coldman, R. Gallagher, and J. Sale, "A multi-stage segmentation method for images of skin lesions," in Proc. of IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, pp. 602–605, 1995.
15. P. Schmid, "Lesion detection in dermatoscopic images using anisotropic diffusion and morphological flooding," in Proc. of IEEE Int'l Conference on Image Processing, 3, pp. 449–453, 1999.
16. B. Erkol, R. Moss, R. Stanley, W. Stoecker, and E. Hvatum, "Automatic lesion boundary detection in dermoscopy images using gradient vector flow snakes," Skin Research and Technology 11, Feb. 2005.
17. Z. Zhang, W. Stoecker, and R. Moss, "Border detection on digitized skin tumor images," in Proc. of IEEE Trans. on Medical Imaging, 19, Nov. 2000.
18. G. Hance, S. Umbaugh, R. Moss, and W. Stoecker, "Unsupervised color image segmentation: with application to skin tumor borders," IEEE Engineering in Medicine and Biology 15, pp. 104–111, 1996.
19. P. Schmid, "Segmentation of digitized dermatoscopic images by two-dimensional color clustering," in Proc. of IEEE Trans. on Medical Imaging, 18, Feb. 1999.
20. Y. Lim and S. Lee, "On the color image segmentation algorithm based on the thresholding and the fuzzy c-means techniques," Pattern Recognition 23(9), pp. 935–952, 1990.
21. R. Cucchiara, C. Grana, S. Seidenari, and G. Pellacani, "Exploiting color and topological features for region segmentation with recursive fuzzy c-means," Machine Graphics and Vision 11(2-3), pp. 169–182, 2002.
22. A. Jain and R. Dubes, Algorithms for Clustering Data, 1988.
23. J. Bezdek, "Pattern recognition with fuzzy objective function algorithms," Plenum Press, New York, 1981.
24. Y. Cheng, "Mean shift, mode seeking, and clustering," IEEE Trans. Pattern Analysis and Machine Intelligence 17, pp. 790–799, Aug. 1995.
25. C. Grana, G. Pellacani, S. Seidenari, and R. Cucchiara, "Color calibration for a dermatological video camera system," in Proceedings of IAPR International Conference on Pattern Recognition (ICPR 2004), 18, pp. 798–801, Aug. 2004.
26. B. Silverman, Density Estimation for Statistics and Data Analysis, 1986.
27. A. P. Bradley, "The use of the area under the ROC curve in the evaluation of machine learning algorithms," in Pattern Recognition, 30, pp. 1145–1159, 1997.
