Classification of Cervix Lesions Using Filter Bank-Based Texture Models Yeshwanth Srinivasan†, Brian Nutter†, Sunanda Mitra†, Benny Phillips‡, and Eric Sinzinger¦ † Dept. of Electrical and Computer Engineering, Texas Tech University, Lubbock, Texas 79409 ‡ OB/GYN-Lubbock, School of Medicine, Lubbock, Texas 79430 ¦ Dept. of Computer Science, Texas Tech University, Lubbock, Texas 79409
Abstract This paper explores the classification of texture patterns observed in digital images of the cervix. In particular, the problem of identifying and segmenting punctations and mosaic patterns is considered. First, the ability of large scale filter banks in characterizing punctations and mosaic structures is studied using texton models. However, texton-based models fail to consistently classify punctation and mosaic sections obtained from cervix images of different subjects. We present a novel method to segment punctations that combines matched filtering using a Gaussian template with Gaussian Mixture Models. Features extracted from the objects detected using this novel method on punctation and mosaic sections are shown to provide excellent classification between punctation and mosaicism. Results demonstrate the effectiveness of our approach in detecting punctations and separating punctation sections from mosaic sections.
1. Introduction Cervical cancer is the second most common form of cancer in women, affecting over 12,000 women in America and 400,000 women worldwide [1]. While the Papanicolaou (Pap) smear test [2] is the most common screening tool for cervical cancer, optical tests such as visual inspection with acetic acid (VIA) and cervicography and colposcopy for visual examination of the cervix [3] are becoming increasingly popular. In particular, cervicography and colposcopy are considered to be more effective screening tools because of their amenability to automated processing and longitudinal analysis. Automatic segmentation and classification of cervix lesions are extremely desirable tools for automatic,
non-invasive detection of cervical cancer. Such tools greatly enhance the power of colposcopes, which are devices used to photographically record images of the cervix, by adding analytical capabilities to the imaging system, thereby reducing the amount of human input required in making a decision. However, segmentation and classification of precancerous regions in the cervix is a non-trivial task that is complicated by several factors, including the non-uniform surface of the cervix, variations in illumination, viewing direction and scale, and differences in imaging modalities. Cervical cancer is preceded by Cervical Intraepithelial Neoplasia (CIN), which refers to the spectrum of abnormalities of the surface epithelium. The most important CIN features that help in distinguishing between normal and abnormal lesions are acetowhite (AW) change and vascular patterns [4]. The AW region is a white epithelium that appears following the application of acetic acid. It helps in the detection and characterization of cervical abnormality in the early stages because it produces a sharp line of demarcation between normal and abnormal epithelium. Vascular patterns observed in CIN are essentially of three types: Punctations, Mosaicism and Vasculature. Dilated, elongated and widely-spaced twisted hairpin capillaries that extend close to the surface are referred to as punctations. Ordered vascular patterns formed by capillaries arranged parallel to the surface are called mosaic patterns. Examples of punctation and mosaic are shown in figure 1 (a)-(e) and (f)-(j), respectively. Terminal vessels that are irregular in size, shape, coarseness and arrangement, with a greater intercapillary distance than in normal epithelium, are referred to as vasculature. The problem of segmenting an image of the cervix into pathologically meaningful regions, like Squamous Epithelium (SE), Columnar Epithelium (CE), AW, mosaic and punctation, has been addressed by several
Proceedings of the 19th IEEE Symposium on Computer-Based Medical Systems (CBMS'06) 0-7695-2517-1/06 $20.00 © 2006 IEEE
researchers in the past with varying levels of success [5, 6, 7]. However, a fully automatic system that decomposes a digital image of the cervix, obtained under widely varying conditions, into pathologically significant regions is still only in the development phase. In this paper, we try to solve the important problem of classifying image sections containing punctations and mosaic patterns into their respective categories. Once accurately classified, the punctations and mosaic sections can be segmented using methods that are appropriate to each kind of texture. The remainder of the paper is organized as follows. In section 2, the problem of classifying mosaic and punctation sections is treated as a texture classification problem, and a solution is attempted using 2-dimensional (2-D) textons. Results are provided to show the inadequacy of the texton-based models in characterizing mosaic and punctations for accurate classification. Section 3 introduces a novel method based on matched filtering and Gaussian Mixture Models (GMM) to accurately segment punctations. It is shown that geometrical features extracted from the segmented objects can be used for accurate classification of mosaic and punctation sections. Section 4 concludes the paper with a discussion on the significance of the results and future work.
Figure 1. Punctation and mosaic sections used in the analysis. (a)-(e) Punctation sections P1-P5. (f)-(j) Mosaic sections M1-M5.
2. Filter banks and textons
Texture classification using filter banks has been widely applied to classification of images from standard texture databases with greater than 95% accuracy [8, 9, 10]. The basic idea is to characterize a texture by its responses to a set of Nfil linear filters that are orientation- and frequency-selective. The Nfil-dimensional responses at each pixel across training images of a particular texture are clustered to return a small set of prototype response vectors called textons. The response vectors themselves are called appearance vectors because they encode local texture variations. The response at each pixel is then quantized to the closest texton, and the histogram of texton labels is computed and used as a model for that texture. Thus, if a set of textons is trained on images of textured surfaces obtained under some, but not all, possible illumination and viewing conditions, then the texton model can be used to accurately classify images of the same surfaces obtained under all possible lighting and viewing direction variations. More details on filter bank methods, texton models and variations of texton-based approaches can be obtained from [8], [9] and [10]. The Maximum Response 8 (MR8) filter bank of Varma and Zisserman produces a compact set of 8 rotationally invariant filter responses and has been shown to provide excellent discrimination between textures using only one image as input [9]. The filter bank consists of a Gaussian filter with σ = 10, a Laplacian of Gaussian (LoG) filter with σ = 10, and edge and bar filters at 6 orientations and 3 different scales each, (σx, σy) = {(1, 3), (2, 6), (4, 12)}, for a total of 38 filters. However, to achieve rotational
invariance and reduce the dimensionality of the filter response vectors, only the maximum response of the 6 oriented filters at each scale is taken, resulting in 3 edge responses and 3 bar responses for a total of 8 relevant filter responses, including the Gaussian and LoG filters. The premise for using this filter bank to classify mosaicism and punctation is straightforward. The mosaic patterns are formed by enclosed vascular structures that can be approximated by a set of oriented edges. Hence, they respond well to edge and bar filters. On the other hand, the punctations, which are essentially circular objects, respond better to the rotationally symmetric Gaussian and LoG filters. Due to this difference in the type of filters to which mosaic and punctations respond, the texton model is expected to perform acceptably, even though registered training images obtained at multiple viewpoints and illumination for each texture are not available.
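The reduction from 38 filter responses to 8 described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the 38 responses are assumed to be already computed and stored as nested Python lists, and the function name `collapse_to_mr8` and the data layout are hypothetical.

```python
def collapse_to_mr8(oriented, gaussian, log):
    """Collapse 38 filter responses to the 8 maximum responses.

    oriented: dict with keys "edge" and "bar"; each value is a list of
              3 scales, and each scale is a list of 6 response maps
              (one per orientation), every map being a list of rows.
    gaussian, log: response maps of the two rotationally symmetric filters.
    The layout and names are assumptions made for this illustration.
    """
    reduced = []
    for kind in ("edge", "bar"):
        for maps_at_scale in oriented[kind]:
            # Pixel-wise maximum over the 6 orientations at this scale.
            max_map = [
                [max(pixel_values) for pixel_values in zip(*rows)]
                for rows in zip(*maps_at_scale)
            ]
            reduced.append(max_map)
    # 3 edge maxima + 3 bar maxima + Gaussian + LoG = 8 responses.
    return reduced + [gaussian, log]


def toy_maps(scale_index):
    """Six toy 1x2 response maps, one per orientation (synthetic values)."""
    return [[[float(k + scale_index), float(-k)]] for k in range(6)]


oriented = {"edge": [toy_maps(s) for s in range(3)],
            "bar": [toy_maps(s) for s in range(3)]}
mr8 = collapse_to_mr8(oriented, [[0.5, 0.5]], [[-0.5, 0.5]])
print(len(mr8))
```

Because only the maximum over orientations is kept, rotating the texture changes which orientation wins but not, ideally, the value that is retained.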
2.1. Methodology Images used for each kind of texture (mosaicism and punctations) are shown in figure 1. The images are 100 x 100 RGB sections obtained from cervicographic and colposcopic images. The Specular Reflections (SR) due to application-specific illumination difficulties are removed by applying a hard threshold on R, G and B values exceeding 200 and iteratively interpolating for the removed pixels using non-SR pixels in the 7x7 neighborhood. As suggested in [9], each image is converted to grayscale and intensity normalized to have zero mean and unit standard deviation. The filters are normalized so that they have unit L1 norm. Each image is then convolved with the set of 38 filters, and the 8-D filter response vector at each pixel x, obtained after taking the maximum over orientations, is contrast normalized using Weber's law as
\[ F(x) \leftarrow F(x)\, \frac{\log\left(1 + L(x)/0.03\right)}{L(x)}, \qquad (1) \]
where $L(x) = \|F(x)\|_2$ is the L2 norm of the filter response vector at x. Then, the filter response vectors from all five images of each texture are aggregated and clustered into 100 clusters using K-means clustering. The centroids of the clusters are the textons. The closest textons are then progressively combined to reduce the number of textons from 200, with 100 from mosaicism and 100 from punctations, to a compact set of 100 textons. The filter response at each pixel is quantized to the closest texton, and the histogram of texton labels for each image is computed. The histogram of labels becomes the model representing the texture images. If the model adequately represents the difference in texture between mosaicism and punctations, then, at least for the
training images, the similarity in the histogram of labels between images of the same texture should be greater than the similarity in the histogram of labels between images of different texture. The measure of similarity used is the chi-square distance between histograms given by
\[ \chi^2(h_1, h_2) = \frac{1}{2} \sum_{n=1}^{N_{bins}} \frac{\left(h_1(n) - h_2(n)\right)^2}{h_1(n) + h_2(n)}, \qquad (2) \]
where h1 and h2 are the two histograms of labels being compared and Nbins is the number of textons in the model, which is 100 in this case. Filters of support 7 were used.
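The contrast normalization of equation (1), the quantization to textons, and the chi-square distance of equation (2) can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the textons are assumed to be given, and skipping bins that are empty in both histograms is an assumption made here to avoid division by zero.

```python
import math


def weber_normalize(response, eps=0.03):
    """Contrast-normalize one filter response vector as in equation (1)."""
    norm = math.sqrt(sum(v * v for v in response))  # L2 norm of the vector
    if norm == 0.0:
        return list(response)  # an all-zero vector is left unchanged
    scale = math.log(1.0 + norm / eps) / norm
    return [v * scale for v in response]


def texton_histogram(responses, textons):
    """Count how many response vectors are closest to each texton."""
    hist = [0] * len(textons)
    for r in responses:
        dists = [sum((a - b) ** 2 for a, b in zip(r, t)) for t in textons]
        hist[dists.index(min(dists))] += 1
    return hist


def chi_square(h1, h2):
    """Chi-square distance of equation (2) between two histograms."""
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:  # assumption: bins empty in both histograms add nothing
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


# Synthetic 2-D example with two textons.
textons = [[0.0, 0.0], [1.0, 1.0]]
hist_a = texton_histogram([[0.0, 0.1], [0.9, 1.2], [1.1, 0.8]], textons)
hist_b = texton_histogram([[0.1, 0.0], [0.2, 0.1], [1.0, 1.0]], textons)
print(hist_a, hist_b, chi_square(hist_a, hist_b))
```

A distance of zero means the two histograms are identical; larger values mean the images use the textons in different proportions.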
2.2. Results Figure 2 shows the filter responses of one punctation image, P2, and one mosaic image, M2. As expected, the mosaic structures respond strongly to the bar and edge filters, and the vascular structures are also emphasized by the Gaussian and LoG filters due to the small support of the filters. However, the punctations respond strongly only to the Gaussian and LoG filters. Table 1 shows the chi-square distances between each model histogram and every other model histogram in the set. Although the filter responses for mosaicism and punctations are perceptually different, the chi-square distances do not corroborate the observed differences in filter responses. In fact, the chi-square distances indicate that some of the mosaic images are more similar to some punctation images than to other mosaic images in the training set. For example, M1 is closer to P3 (317.36) than to its nearest mosaic image, M5 (417.3). This indicates that the textons fail to characterize the difference in texture between mosaic and punctations. The lack of registered images from multiple viewpoints for each texture prevents the texton-based model from finding textons that are individually specific to a certain texture.
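One simple way to check this observation against Table 1 is to find, for each image, its nearest neighbor among the other nine images and see whether the two belong to the same class. The nearest-neighbor test is an illustration added here and is not part of the method described above; the distances are copied from Table 1.

```python
labels = ["M1", "M2", "M3", "M4", "M5", "P1", "P2", "P3", "P4", "P5"]
# Chi-square distances from Table 1 (row and column order as in `labels`).
table1 = [
    [0, 1455.8, 1434.9, 658.82, 417.3, 706.5, 1188.4, 317.36, 725.06, 591.5],
    [1455.8, 0, 407.17, 624.79, 2240.8, 765.31, 644.76, 1165.3, 861.44, 828.59],
    [1434.9, 407.17, 0, 356.56, 2363.5, 1012.4, 625.76, 1134.2, 807.62, 670.16],
    [658.82, 624.79, 356.56, 0, 1325, 633.2, 567.08, 486.6, 466.98, 299.1],
    [417.3, 2240.8, 2363.5, 1325, 0, 1221.8, 1884.7, 729.28, 1210, 1138.9],
    [706.5, 765.31, 1012.4, 633.2, 1221.8, 0, 270.47, 384.07, 404.65, 387.06],
    [1188.4, 644.76, 625.76, 567.08, 1884.7, 270.47, 0, 677.87, 385.58, 395.57],
    [317.36, 1165.3, 1134.2, 486.6, 729.28, 384.07, 677.87, 0, 591.39, 387.32],
    [725.06, 861.44, 807.62, 466.98, 1210, 404.65, 385.58, 591.39, 0, 171.85],
    [591.5, 828.59, 670.16, 299.1, 1138.9, 387.06, 395.57, 387.32, 171.85, 0],
]


def nearest_other(i):
    """Index of the image closest to image i, excluding i itself."""
    others = [j for j in range(len(labels)) if j != i]
    return min(others, key=lambda j: table1[i][j])


# Collect images whose nearest neighbor belongs to the other class.
mismatches = []
for i, name in enumerate(labels):
    j = nearest_other(i)
    if labels[j][0] != name[0]:  # first letter encodes the class (M or P)
        mismatches.append((name, labels[j]))

print(mismatches)
print(len(labels) - len(mismatches), "of", len(labels), "have a same-class nearest neighbor")
```

Three of the ten images (M1, M4 and P3) have a nearest neighbor from the other class, even though every image was itself used to build the textons.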
3. Matched filtering to detect punctations Punctations appear as roughly circular objects with a heavy reddish hue on the RGB image, or as dark objects on the corresponding grayscale image. Because the shape and size of the objects are fairly uniform and known a priori, matched filtering using a template matching the shape of the object to be detected can be used to accentuate these objects. Punctations also have a tendency to occur in groups with no clear line of demarcation between two or more punctations. Matched filtering using a Gaussian template helps to increase the degree of separation between two close punctations because the detected objects are forced to
be circular. Matched filtering is implemented as a convolution given by
\[ m(x, y) = G(x, y) * f(x, y), \qquad (3) \]
where G(x,y) is a separable 2-D Gaussian kernel with σ = 10, which was found to adequately describe the variations in intensity around punctations in the samples used, f(x,y) is the input image, and m(x,y) is the matched filtered image. Matched filtering essentially serves two purposes: it increases the contrast around individual punctations, and it smooths uniform regions. The resulting image, m(x,y), consists of dark punctations on a bright background. The intensities of m(x,y) can be modeled as a mixture of two Gaussians, one that predominantly models the variations in intensity of the punctations, and the other that predominantly models the variations in intensity in the background. In other words, we can write the Probability Density Function (PDF) of the intensities of m(x,y) as
\[ p(x \mid \Theta) = \sum_{i=1}^{N} \alpha_i \, p(x \mid \theta_i), \qquad (4) \]
where N = 2 is the number of Gaussians in the mixture, αi is the a priori probability of each PDF, θi = (μi, Σi) are the parameters of the PDF, which for the GMM are the mean and the covariance matrix, Θ = (α1, α2, ..., αN, θ1, θ2, ..., θN), and x is the set of gray level intensities of pixels in m. The maximum likelihood estimate for the parameters αi, μi, and Σi can be found using the Expectation-Maximization (EM) algorithm [11, 12]. An important advantage of modeling m(x,y) as a GMM is that it helps to make the detection algorithm independent of the image acquisition process. Because the EM algorithm is applied to each individual image, the punctations will be detected irrespective of variations in illumination and image acquisition modalities as long as the punctations appear darker than the background.
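For scalar gray levels, the mixture of equation (4) with N = 2 can be fitted with a short EM loop. The sketch below is a minimal illustration, not the implementation used in this work: the function names, the initialization, the fixed iteration count and the variance floor are all assumptions.

```python
import math


def gauss(x, mean, var):
    """1-D Gaussian density."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_gaussians(values, iterations=50):
    """EM for a 1-D mixture of two Gaussians: (weights, means, variances)."""
    lo, hi = min(values), max(values)
    # Assumed initialization: means at the quarter points of the range.
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    spread = max((hi - lo) ** 2 / 16.0, 1e-6)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in values:
            p = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])
        # M-step: re-estimate the weight, mean and variance of each component.
        for i in range(2):
            n_i = sum(r[i] for r in resp) or 1e-300
            weights[i] = n_i / len(values)
            means[i] = sum(r[i] * x for r, x in zip(resp, values)) / n_i
            var = sum(r[i] * (x - means[i]) ** 2 for r, x in zip(resp, values)) / n_i
            variances[i] = max(var, 1e-6)  # floor keeps the density finite
    return weights, means, variances


def label_dark_objects(values, weights, means, variances):
    """Label 1 where the lower-mean component is more probable, else 0."""
    dark = 0 if means[0] <= means[1] else 1
    labels = []
    for x in values:
        p = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
        labels.append(1 if p[dark] > p[1 - dark] else 0)
    return labels


# Synthetic intensities: a few dark "object" pixels and a bright background.
pixels = [10, 12, 11, 9, 13, 200, 205, 198, 202, 210, 207, 199]
w, mu, var = fit_two_gaussians(pixels)
print(label_dark_objects(pixels, w, mu, var))
```

Because the parameters are re-estimated for every image, the split between dark and bright pixels adapts to the brightness of that image rather than relying on a fixed threshold.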
3.1 Methodology for detecting punctations First, the SR are eliminated using the procedure outlined in section 2.1. The images are converted to grayscale and preprocessed using anisotropic diffusion as suggested in [13]. This filtering serves to smooth the inconsistent background while preserving the edges that separate the punctations from the background. The preprocessed image is then subjected to matched filtering using a Gaussian kernel with a support of 7. The matched filtered image is then modeled as a mixture of two Gaussians, the parameters of which are estimated using the EM algorithm. Finally, each pixel is assigned to one of two clusters – object and
background – based on the maximum a posteriori probabilities computed from the estimated parameters. The PDF with the lower mean corresponds to the punctations and is assigned the label '1', while the background is labeled '0'.
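The matched filtering step of equation (3) can be sketched as two 1-D convolutions, since the Gaussian kernel is separable. This is a minimal illustration, not the implementation used in this work: the defaults follow the support of 7 and σ = 10 stated in the text, while the function names and the border handling (replicating edge pixels) are assumptions.

```python
import math


def gaussian_kernel(support=7, sigma=10.0):
    """1-D Gaussian kernel of the given support, normalized to unit sum."""
    half = support // 2
    k = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-half, half + 1)]
    total = sum(k)
    return [v / total for v in k]


def convolve_rows(image, kernel):
    """Convolve every row with a symmetric kernel, replicating border pixels."""
    half = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for j, weight in enumerate(kernel):
                xi = min(max(x + j - half, 0), width - 1)  # clamp at the border
                acc += weight * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    """Swap rows and columns of a list-of-rows image."""
    return [list(col) for col in zip(*image)]


def matched_filter(image, support=7, sigma=10.0):
    """Apply the separable Gaussian kernel along rows, then along columns."""
    k = gaussian_kernel(support, sigma)
    rows_done = convolve_rows(image, k)
    return transpose(convolve_rows(transpose(rows_done), k))


# Synthetic 9x9 patch: bright background with one dark pixel in the middle.
patch = [[200.0] * 9 for _ in range(9)]
patch[4][4] = 50.0
filtered = matched_filter(patch)
print(round(filtered[4][4], 2), round(filtered[0][0], 2))
```

The filtered image would then be passed to the two-component mixture model, with the lower-mean component labeled as the object class.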
3.2 Classification of mosaic and punctations To separate mosaic and punctation image sections, the procedure outlined in section 3.1 is applied to all 10 images in figure 1, and the area of each object in the resulting binary image is calculated. Area is the feature used to discriminate mosaic from punctation images. Objects with an area of less than 5 pixels are removed, because punctations on the sample images occupy at least 5 pixels and smaller objects were usually just speckle noise. The average area of the remaining objects in each image is then calculated. For the mosaic images, matched filtering simply accentuates the entire vascular structure while picking up stray punctation-like objects. Hence the average area of the detected objects is generally significantly higher for the mosaic images due to the significant contribution from the vascular structure.
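The area feature can be computed from the binary label image with a flood fill over connected pixels. The sketch below is illustrative only: the function names are hypothetical, and the use of 4-connectivity is an assumption, since the connectivity rule is not stated in the text.

```python
from collections import deque


def object_areas(binary, min_area=5):
    """Pixel counts of 4-connected objects of 1s, dropping small objects."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    areas = []
    for r in range(height):
        for c in range(width):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited object pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:  # objects under 5 pixels are treated as noise
                areas.append(area)
    return areas


def average_area(binary, min_area=5):
    """Mean area of the retained objects, or 0.0 if none remain."""
    areas = object_areas(binary, min_area)
    return sum(areas) / len(areas) if areas else 0.0


# Synthetic mask: a 6-pixel blob, a 1-pixel speck and a 7-pixel line.
mask = [
    [1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
]
print(object_areas(mask), average_area(mask))
```

A long connected vascular structure counts as a single large object, which is why mosaic sections tend to produce a larger average area than isolated punctations.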
3.3 Results The results of punctation detection using matched filtering on image P2 are shown in figure 3. The contours of the detected objects are marked in blue. It can be seen that the algorithm provides an excellent estimate of the punctations and even detects clustered punctations as isolated objects. Table 2 shows the average area of detected objects for the images in figure 1. The values for mosaicism are consistently higher than those for punctations. The mean average area of objects across all mosaic images is significantly greater than the corresponding value for punctation images, and the smallest average area of objects for mosaic images is greater than the largest average area of objects for punctation images. This separation indicates that our algorithm can be used to discriminate mosaic sections from punctation sections. The minor modifications suggested in the following section would improve the degree of separation by removing false positives, which reduce the average area of objects in mosaic images.
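The separation reported in Table 2 can be checked with a few lines of arithmetic. The values below are copied from Table 2; the midpoint threshold is an illustrative assumption derived from these same 10 images and is not a validated decision rule.

```python
# Average object areas (in pixels) from Table 2.
avg_area = {"M1": 49.51, "M2": 68.54, "M3": 92.41, "M4": 35.72, "M5": 35.62,
            "P1": 33.50, "P2": 24.90, "P3": 30.35, "P4": 25.58, "P5": 28.02}

mosaic = [v for name, v in avg_area.items() if name.startswith("M")]
punctation = [v for name, v in avg_area.items() if name.startswith("P")]

mean_mosaic = sum(mosaic) / len(mosaic)
mean_punctation = sum(punctation) / len(punctation)
margin = min(mosaic) - max(punctation)              # gap between the classes
threshold = (min(mosaic) + max(punctation)) / 2.0   # illustrative midpoint

# Classify each image by comparing its average area with the threshold.
predicted = {name: ("M" if v > threshold else "P") for name, v in avg_area.items()}
all_correct = all(predicted[name] == name[0] for name in avg_area)

print(round(mean_mosaic, 2), round(mean_punctation, 2))
print(round(margin, 2), round(threshold, 2), all_correct)
```

The class means differ by roughly a factor of two, but the gap between the closest mosaic and punctation images (M5 and P1) is only about 2 pixels, which is consistent with the stated need to improve the degree of separation.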
4. Discussion and future work In this work, we have demonstrated a method to detect punctations and applied it to discriminate punctation images from those of mosaic sections. Once the two kinds of patterns are distinguished and
separated, the mosaic structures can be segmented using approaches demonstrated in [7, 14, 15]. This is an important step in building a fully automated system that decomposes a given image of the cervix into pathologically meaningful sections, which is the ultimate goal of this research. Future work in this area will include learning the neighborhood characteristics around true positive punctations in order to remove false positives from both mosaic and punctation sections and increase the disparity in average areas of objects in mosaic and punctation images, estimating quantitatively the number of punctations and relating it to its pathological consequences, applying scale-space methods to accurately detect punctations of all sizes, and testing extensively on a larger dataset of cervicographic and colposcopic images.
5. Acknowledgements The authors would like to acknowledge Mr. Rodney Long of the National Library of Medicine and Dr. Daron Ferris of the Medical College of Georgia for providing the images used in this research. The authors also acknowledge Dr. Manik Varma for posting the specifics of the MR8 filter bank on his website, which was helpful in verifying our implementation.
6. References
[1] American Cancer Institute, Cancer Facts and Figures 2005.
[2] L. G. Koss, "The Papanicolaou Test for Cervical Cancer Detection. A Triumph and Tragedy," JAMA, Vol. 261, pp. 773-774, 1989.
[3] M. Anderson, J. Jordan, A. Morse, and F. Sharp, "A Text and Atlas of Integrated Colposcopy," Chapman and Hall, First edition, 1991.
[4] Maj Gary Clark, "Colposcopy Syllabus: Through The Looking Glass: Normal and Abnormal Cervical Transformation Zone," Faculty Development Fellowship, MAMC, Feb. 28-Mar. 5, 1998.
[5] G. Zimmerman, S. Gordon, H. Greenspan, "Content-based indexing and retrieval of uterine cervix images," Proc. of 23rd IEEE Convention of Electrical and Electronics Engineers in Israel 2004, pp. 181-185, Tel-Aviv, Israel, 2004.
[6] S. Gordon, G. Zimmerman, H. Greenspan, "Image segmentation of uterine cervix images for indexing in PACS," Proc. of the 17th IEEE Symposium on Computer-Based Medical Systems, CBMS 2004, Bethesda, MD, 2004.
[7] Q. Ji, J. Engel, E. Craine, "Texture analysis for classification of cervix lesions," IEEE Transactions on Medical Imaging, Vol. 19, No. 11, 2000.
[8] T. Leung, J. Malik, "Representing and Recognizing the Visual Appearance of Materials Using Three-Dimensional Textons," Intl. Journal of Computer Vision, Vol. 43(1), pp. 29-44, 2001.
[9] M. Varma, A. Zisserman, "Classifying Images of Materials: Achieving Viewpoint and Illumination Independence," Proc. of the European Conference on Computer Vision, Vol. 3, pp. 255-271, Springer-Verlag, 2002.
[10] M. Varma, A. Zisserman, "Texture Classification: Are Filter Banks Necessary?," Proc. IEEE Conference on Computer Vision and Pattern Recognition, Vol. 2, pp. 691-8, 2003.
[11] F. Dellaert, "The Expectation Maximization Algorithm," Technical Report Number GIT-GVU-0220, College of Computing, Georgia Institute of Technology.
[12] J. A. Bilmes, "A Gentle Introduction to the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models," Technical Report TR-97-021, International Computer Science Institute (ICSI) and Computer Science Division, Dept. of Electrical Engineering and Computer Science, U.C. Berkeley.
[13] P. Perona, J. Malik, "Scale space and edge detection using anisotropic diffusion," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 7, pp. 629-639, Jul. 1990.
[14] B. Tulpule, "Color and Texture Analysis of Cervix Lesions," Master's thesis, Texas Tech University, 2004.
[15] B. Tulpule, S. Yang, Y. Srinivasan, S. Mitra, B. Nutter, "Segmentation and Classification of Cervix Lesions by Pattern and Texture Analysis," IEEE Conference on Fuzzy Systems, pp. 173-176, May 22-25, 2005.
Table 1. Chi-square distances between model histograms of texton labels for images used in training
    M1      M2      M3      M4      M5      P1      P2      P3      P4      P5
M1  0       1455.8  1434.9  658.82  417.3   706.5   1188.4  317.36  725.06  591.5
M2  1455.8  0       407.17  624.79  2240.8  765.31  644.76  1165.3  861.44  828.59
M3  1434.9  407.17  0       356.56  2363.5  1012.4  625.76  1134.2  807.62  670.16
M4  658.82  624.79  356.56  0       1325    633.2   567.08  486.6   466.98  299.1
M5  417.3   2240.8  2363.5  1325    0       1221.8  1884.7  729.28  1210    1138.9
P1  706.5   765.31  1012.4  633.2   1221.8  0       270.47  384.07  404.65  387.06
P2  1188.4  644.76  625.76  567.08  1884.7  270.47  0       677.87  385.58  395.57
P3  317.36  1165.3  1134.2  486.6   729.28  384.07  677.87  0       591.39  387.32
P4  725.06  861.44  807.62  466.98  1210    404.65  385.58  591.39  0       171.85
P5  591.5   828.59  670.16  299.1   1138.9  387.06  395.57  387.32  171.85  0
Figure 2. Response of punctation and mosaic images to the 8 filters. (a)-(h) Response of punctation image P2. (i)-(p) Response of mosaic image M2. In each group the panels show, in order: Bar S1, Bar S2, Bar S3, Edge S1, Edge S2, Edge S3, Gaussian, LoG. S1, S2 and S3 indicate the 3 different scales used.
Figure 3. Punctation detection using matched filtering and GMM. (a)-(d) use image P2 from figure 1(b). (e)-(h) use image M2 from figure 1(g). In each group the panels show, in order: the pre-processed grayscale image, the image after matched filtering, the result of GMM-based clustering, and the RGB image with contours around the detected objects (punctations in (d), mosaic in (h)).
Table 2. Average area of detected objects for the 10 images in figure 1
Image                  M1     M2     M3     M4     M5     P1     P2     P3     P4     P5
Average area (pixels)  49.51  68.54  92.41  35.72  35.62  33.50  24.90  30.35  25.58  28.02