Classification of Linear Structures in Mammograms Using Random Forests

Zezhi Chen, Michael Berks, Susan Astley, and Chris Taylor

Imaging Science and Biomedical Engineering, School of Cancer and Enabling Sciences, University of Manchester, Oxford Road, Manchester, M13 9PT, UK

Abstract. Classification of linear structures, such as blood vessels, milk ducts, spiculations and fibrous tissue, can be used to aid the automated detection and diagnosis of mammographic abnormalities. We use a combination of dual-tree complex wavelet coefficients and random forest classification to detect and classify different types of linear structure. Encouraging results are presented for synthetic linear structures added to real mammographic backgrounds, and spicules in real mammograms. For spicule/non-spicule classification in real mammograms we report an area Az = 0.764 under the receiver operating characteristic.

Keywords: mammography, linear structures, classification, random forests, dual-tree complex wavelet.

1 Introduction

It has been reported recently that current CAD systems do not detect architectural distortion (AD) with adequate sensitivity or specificity [1]. Previous attempts at detecting patterns of distorted breast tissue – including both patterns of spicules associated with malignant masses and more general cases of AD in which no focal mass is visible – have used a two-stage approach involving (i) detecting linear structures and (ii) analysing the orientation patterns of these structures to determine if AD is present [2-4]. We hypothesise that the sensitivity of AD detection algorithms could be improved if different types of linear structure could be labelled automatically, and used selectively in the second stage of the analysis outlined above. Although there is an extensive literature on detecting linear structures in digital mammograms [2-7], less attention has been paid to classifying different structure types [8]. We present a novel method for classifying linear structures, based on the use of a complex wavelet transform to provide a rich representation in which local shape can be inferred from the phase relationships between coefficients (cf. [5-7]). We use random forest classification [9] applied to this representation to detect linear structures and classify their types. Results are given for synthetic linear structures added to real mammographic backgrounds, and for spicule detection and classification in real mammograms.

J. Martí et al. (Eds.): IWDM 2010, LNCS 6136, pp. 153–160, 2010. © Springer-Verlag Berlin Heidelberg 2010


2 Data and Methods

2.1 Mammogram Data

We used a sequential set of 84 abnormal mammograms with biopsy-proven malignancy, drawn from a screening population (Nightingale Breast Centre, South Manchester University Hospitals Trust, UK), and a set of 89 normal mammograms of the contralateral breasts of the same individuals (where disease was radiologically confirmed to be confined to one breast). All mammograms were digitised to a resolution of 80 µm, using a Vidar CADPRO scanner. A 4 × 4 cm patch was extracted around each abnormality, and a similar patch was sampled randomly from each of the normal mammograms. For each abnormal patch an expert radiologist annotated some (though not necessarily all) of the spicules associated with the abnormality, using in-house software, resulting in a total of 555 spicule annotations.

2.2 Synthetic Data

We generated synthetic images by adding linear structures to 128 × 128 pixel normal mammogram patches, pre-processed to remove naturally occurring linear structure. 6130 normal mammogram patches were sampled randomly from 185 normal screening mammograms, including the 89 described above. For the experiments in Section 3 we added linear structures with Gaussian or rectangular cross-sections. For the experiments in Section 4.1 we added linear structures with elliptical cross-sections, simulating the x-ray projection of uniformly dense cylindrical structures.

2.3 Representing Local Structure Using the DT-CWT

Wavelet transforms have been used extensively in image processing and analysis to provide a rich description of local structure. The dual-tree complex wavelet transform (DT-CWT) has particular advantages because it provides a directionally selective representation with shift-invariant coefficient magnitudes and local phase information [10]. The DT-CWT combines the outputs of two discrete transforms using real wavelets, differing in phase by 90°, to form the real and imaginary parts of complex coefficients.
For 2-D images, the DT-CWT produces 6 directional sub-bands, oriented at ±15°, ±45°, ±75°, at each of a series of scales separated by factors of 2. In the experiments described in Sections 3 and 4, we used the complex coefficients (in phase/magnitude form) from the 6 oriented sub-bands in each of the s finest decomposition scales from a w × w neighbourhood centred on each pixel. This produced feature vectors of length 12sw² (6 sub-bands × s scales × w² locations × 2 values per coefficient). In some experiments we formed more compact vectors of length 2sw² by including only the coefficients at each location and scale for the sub-band that gave the largest response.

2.4 Classifying Structure Using Random Forests

Given a set of training data consisting of N samples, each of which is a D-dimensional feature vector labelled as belonging to one of C classes, a random forest comprises a set of tree predictors constructed from the training data [9]. Each tree in the forest is built from a bootstrap sample of the training data (that is, a set of N samples chosen randomly, with replacement, from the original data). The trees are built using a standard classification and regression tree (CART) algorithm; however, rather than assessing all D dimensions for the optimal split at each tree node, only a random subset of d < D dimensions is considered. The trees are built to full size (i.e. until a leaf is reached containing samples from only one class) and are not pruned. During classification, an unseen feature vector is classified independently by each tree in the forest; each tree casts a unit class vote, and the most popular class can be assigned to the input vector. Alternatively, the proportion of votes assigned to each class can be used to provide a probabilistic labelling of the input vector. Random forests are particularly suited to learning non-linear relationships in high-dimensional multi-class training data, and have been shown to perform as well as classifiers such as Adaboost or support vector machines, whilst being computationally more efficient [9]. For all the experiments described below we followed published guidelines [9], constructing forests containing 200 trees and setting d = √D.
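The three ingredients described in Section 2.4 (a bootstrap sample per tree, a random subset of dimensions per node, and vote counting across trees) can be sketched in a few lines. This is only an illustration and not the implementation used in the experiments: the function names are hypothetical, the subset size is assumed to be the rounded square root of D, the vote list is made up, and the tree-growing step itself (CART) is omitted.

```python
import math
import random
from collections import Counter


def bootstrap_indices(n_samples, rng):
    """Draw N indices with replacement: the training set for one tree."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def candidate_dimensions(n_dims, rng):
    """Pick the random subset of dimensions examined at a single tree node.

    Assumption: d is the rounded square root of D.
    """
    d = max(1, round(math.sqrt(n_dims)))
    return rng.sample(range(n_dims), d)


def forest_vote(tree_votes):
    """Combine one unit vote per tree into a hard label and vote proportions."""
    counts = Counter(tree_votes)
    label = counts.most_common(1)[0][0]
    proportions = {cls: n / len(tree_votes) for cls, n in counts.items()}
    return label, proportions


rng = random.Random(0)
idx = bootstrap_indices(1000, rng)             # some indices repeat, some are absent
dims = candidate_dimensions(432, rng)          # D = 432 gives d = 21 under the assumption
votes = ["line"] * 130 + ["background"] * 70   # hypothetical votes from 200 trees
label, proportions = forest_vote(votes)
print(len(idx), len(dims), label, proportions)
```

The vote proportions are what the later sections call a probability image when they are evaluated at every pixel.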

3 Experimental Results for Synthetic Data

We conducted initial experiments to test our approach, using synthetic images containing linear structures with Gaussian or rectangular cross-sections superimposed on real mammographic backgrounds, as described in Section 2.2. For each image, the bar type and orientation were selected randomly, whilst the contrast and width were randomly sampled from ranges typical of linear structures in real mammograms: widths [4, 32] pixels (0.3 – 2.5 mm), peak contrast [8, 16] grey-levels (relative to images scaled 0 – 255). Fig. 1 shows two synthetic images, and the largest complex coefficient (over the 6 orientation sub-bands) for three different levels in the transform.

Fig. 1. Synthetic images and their DT-CWT coefficients. Top: Gaussian bar, width (SD) 4.33 pixels, contrast 10.12 grey-levels. Bottom: rectangular bar, width 8 pixels, contrast 9.05 grey-levels. Columns L to R show original and maximum response over orientation sub-bands for the 2nd, 3rd and 4th levels of the DT-CWT, using intensity (magnitude), hue (phase) coding.


We generated training sets containing 10, 20, 40, 80 and 160 images and a test set containing 100 images. The pixels in each image were labelled as background, rectangular bar or Gaussian bar, giving a three-class classification problem. We extracted 432-dimensional feature vectors, using all 6 orientation sub-bands, a neighbourhood size of w = 3, and s = 4 scales. We sampled 40,000 vectors randomly from each of the training sets, and constructed a random forest classifier as described in Section 2.4. The forest was used to classify all the pixels in the test images and classification error (misclassified pixels / total pixels) was calculated. The results are summarised in Table 1. Classification accuracy improves as the number of training images increases, reaching 97.7% correct classification for the largest training set tested. Given these promising results, we moved on to real data.

Table 1. Random forest classification error rates for 3-class labelling of synthetic images

Number of training images    Classification error
10                           0.0522 ± 0.0584
20                           0.0354 ± 0.0429
40                           0.0365 ± 0.0425
80                           0.0237 ± 0.0348
160                          0.0231 ± 0.0322
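The error measure used in Table 1 (misclassified pixels / total pixels) and its relation to the quoted 97.7% accuracy can be checked with a short sketch. The function name and the eight example labels are hypothetical; only the value 0.0231 is taken from the table.

```python
def classification_error(predicted, truth):
    """Fraction of pixels whose predicted label differs from the true label."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("label lists must be non-empty and of equal length")
    misclassified = sum(p != t for p, t in zip(predicted, truth))
    return misclassified / len(truth)


# Hypothetical labels for eight pixels (0 = background, 1 = rectangular, 2 = Gaussian).
truth = [0, 0, 0, 1, 1, 2, 2, 0]
predicted = [0, 0, 1, 1, 1, 2, 0, 0]
error = classification_error(predicted, truth)   # 2 of the 8 pixels are wrong

# Error for 160 training images (Table 1) expressed as percentage accuracy.
accuracy_160 = round(100 * (1 - 0.0231), 1)
print(error, accuracy_160)
```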

4 Experimental Results for Real Data

To apply our approach to real mammographic data we proceeded in three stages: i) detecting the linear structures in a set of normal and abnormal training images, ii) building a spicule/non-spicule classifier using the expert annotations of the training images, iii) using the classifier to label pixels in unseen test images, using expert annotations to evaluate classification accuracy. Because we were working with a limited dataset, we used a cross-validation approach to evaluation.

4.1 Detecting Linear Structures in Mammograms

For line detection in real mammograms, we trained a random forest classifier on synthetic images designed to contain similar structures to those found in mammograms. Linear structures with elliptical cross-sections were added to normal mammogram backgrounds as described in Section 2.2, with widths in the range [2, 32] pixels (0.15 – 2.5 mm), and contrasts in the range [4/256, 16/256] grey-levels. We experimented with neighbourhood sizes w = 1, 3 and 5, scales s = 4 and 6, and number of sub-bands all or maximum response only, building in each case a random forest classifier using 200,000 training vectors sampled with equal probability from the line and background classes. We also varied the lower and upper bounds [l, u] of the range of widths used during training. We found that it was possible to achieve close to 100% classification accuracy on unseen synthetic data with virtually all parameter combinations, though all-sub-band classifiers generally outperformed maximum-sub-band classifiers. Example linear structure probability images obtained by applying classifiers built using different training regimes to a real mammogram patch are shown in Fig. 2. All the classifiers produce plausible results, but because we do not have ground truth data for the real linear structures it is difficult to draw firm conclusions. Inspecting the results for a large number of images, we made the following observations regarding different training regimes:

• minimum line width l = 2 gave more noisy results than l = 4;
• maximum line width u = 32 gave less sensitivity to subtle lines than u = 16;
• neighbourhood size w = 3 or 5 gave better signal-to-noise than w = 1, probably because information on phase derivatives is captured;
• scales s = 6 gave better discrimination between lines and edges than s = 4;
• using all orientation sub-bands gave better results near line crossings than the maximum sub-band approach.

Fig. 2. (a) Original mass region; (b)-(f) Line probability maps for varying parameter sets: (b) Bar widths = [4, 16], w = 3, s = 6, all sub-bands; (c) Bar widths = [2, 16], w = 3, s = 6, all sub-bands; (d) Bar widths = [2, 32], w = 3, s = 6, all sub-bands; (e) Bar widths = [2, 16], w = 3, s = 6, maximum sub-band response; (f) Bar widths = [2, 16], w = 1, s = 6, all sub-bands

Based on these observations, we selected for subsequent experiments a classifier trained using synthetic bars of width [4, 16] pixels, with 648-dimensional feature vectors constructed using neighbourhood size w = 3, scales s = 6, and all oriented sub-bands (see Fig. 2(b)). This classifier was used to construct linear structure probability images for the 84 abnormal and 89 normal regions.
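The feature-vector sizes quoted in the text follow directly from the counting rule in Section 2.3: two values (magnitude and phase) per coefficient, for each retained sub-band, scale and neighbourhood location. A minimal sketch of that arithmetic, with a hypothetical function name, reproduces the 432- and 648-dimensional vectors used here:

```python
def feature_length(w, s, all_subbands=True):
    """Length of a DT-CWT feature vector for a w x w neighbourhood and s scales.

    Each complex coefficient contributes two values (magnitude and phase).
    Either all 6 oriented sub-bands are kept, or only the one with the
    largest response at each location and scale.
    """
    n_subbands = 6 if all_subbands else 1
    return 2 * n_subbands * s * w * w


print(feature_length(3, 4))                      # synthetic-data experiments (Section 3)
print(feature_length(3, 6))                      # selected line detector (Section 4.1)
print(feature_length(3, 6, all_subbands=False))  # maximum-response variant
```

The all-sub-band vector is always six times longer than the maximum-response vector for the same w and s, which is the cost paid for the better behaviour near line crossings noted above.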


4.2 Classifying Spicules in Mammograms

We used the linear structure probability images described above to train a random forest classifier to distinguish between spicules and other linear structures in real mammograms, using DT-CWT features. The expert spicule annotations for the abnormal images were used as a basis for selecting spicule pixels, though they were not sufficiently accurate to be used directly. To refine the annotations, we initialised a snake [11] using each original annotation, and iterated it to convergence, using evidence from the linear structure probability image. The 555 refined spicule annotations identified a set of 36,514 spicule pixels. We also randomly sampled an equal number of pixel locations from the 89 normal patches such that the distribution of linear structure probabilities in the normal samples matched the distribution of those in the spicule sample. Random forest classifiers were trained using DT-CWT feature vectors constructed using varying neighbourhood size w, scales s, and number of sub-bands with the spicule/non-spicule labels, and evaluated using a 10-fold cross-validation design. The set of normal and abnormal regions were divided into 10 groups so that the total number of normal and spicule samples in each group were as close as possible to a 10th of the total. The samples in each group were then classified using a random forest trained on the samples from the remaining 9 groups. The classification results from each group were pooled to generate an unbiased class probability for each sampled pixel. These probabilities were used to compute an ROC curve for each training regime, and the area under the curve (Az) was computed and used as a measure of classification performance. The results are tabulated in Table 2.

Table 2. Spicule classification results for varying compositions of feature vectors

Composition of feature vectors
Neighbourhood size (w)   No. of decomposition scales (s)   No. of sub-bands   Size of feature vectors (D)   ROC Az
3 × 3                    4                                 All                432                           0.693
3 × 3                    5                                 All                540                           0.699
3 × 3                    6                                 All                648                           0.755
3 × 3                    6                                 Maximum            108                           0.752
1 × 1                    6                                 All                72                            0.764
5 × 5                    6                                 All                1800                          0.754
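The Az values in Table 2 are areas under ROC curves computed from pooled cross-validated spicule probabilities. The text does not say how the area was evaluated; one standard possibility, sketched below with a hypothetical function name and made-up scores, uses the fact that the area under the empirical ROC curve equals the probability that a randomly chosen spicule pixel scores higher than a randomly chosen non-spicule pixel, with ties counted as one half.

```python
def roc_az(scores, labels):
    """Area under the empirical ROC curve via pairwise comparison.

    scores: classifier outputs (e.g. proportion of trees voting 'spicule').
    labels: 1 for spicule pixels, 0 for non-spicule pixels.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # spicule pixel correctly ranked above
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical pooled probabilities for six sampled pixels.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
az = roc_az(scores, labels)
print(az)
```

The double loop is quadratic in the number of samples, so for sample sets of the size used here a rank-based formulation of the same statistic would be preferable in practice.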

From the results we can see the advantage of including more decomposition scales in the feature vectors and using the responses in all oriented sub-bands as opposed to using only the maximum response. However, somewhat surprisingly, the classification results appear to be slightly better using 1 × 1 neighbourhoods rather than 3 × 3 neighbourhoods, in contrast to the trend observed when performing line detection (see Section 4.1). We also applied the 10-fold cross-validation approach and the best classifier design (neighbourhood size w = 1, scales s = 6, all sub-bands) to generate unbiased spicule probability images for all 89 normal and 84 abnormal regions. Typical results are shown in Fig. 3, where the increased spicule probability in spiculated areas of the abnormal region – relative to both the normal region and the non-spiculated areas of the abnormal region – is clearly visible.


Fig. 3. Left column: a mass region and normal region; centre column: line probability maps of each region; right column: spicule probability depicted as hue from cyan (normal) to pink (spicule), modulated by line strength

5 Conclusion

In this paper we have presented a new method for classifying local structure in mammograms. We have applied the method to detect and differentiate between two types of synthetic linear structures added to real mammographic backgrounds. The accuracy of the classification highlighted the promise of the approach. For real data, we first trained a classifier on synthetic images to perform line detection. We then used the results of this detection scheme, together with radiologist annotations, to perform spicule/non-spicule classification. An ROC Az of 0.764 suggests that a meaningful differentiation can be made between the two classes. Whilst such a classification may not be strong enough to detect abnormal malignant patterns on its own, the spicule probabilities may allow us to assign a weighting to each pixel when it is included in other measures (for example probability maps) designed to detect such patterns, thus improving the specificity of these measures. This will be the subject of further work.

References

1. Prajna, S., et al.: Detection of Architectural Distortion in Mammograms Acquired Prior to the Detection of Breast Cancer using Texture and Fractal Analysis. In: Giger, M.L., Karssemeijer, N. (eds.) Proc. SPIE Medical Imaging, vol. 6915 (2008)
2. Parr, T., et al.: Statistical Modelling of Lines and Structures in Mammograms. In: Duncan, J.S., Gindi, G. (eds.) IPMI 1997. LNCS, vol. 1230, pp. 405–410. Springer, Heidelberg (1997)
3. Karssemeijer, N., te Brake, G.M.: Detection of Stellate Distortions in Mammograms. IEEE Transactions on Medical Imaging 15(5), 611–619 (1996)
4. Bornefalk, H.: Use of Phase and Certainty Information in Automatic Detection of Stellate Patterns in Mammograms. In: Proc. SPIE Medical Imaging, vol. 5370, pp. 97–107 (2004)
5. Wai, L.C., Mellor, M., Brady, M.: A Multi-resolution CLS Detection Algorithm for Mammographic Image Analysis. In: Barillot, C., Haynor, D.R., Hellier, P. (eds.) MICCAI 2004. LNCS, vol. 3217, pp. 865–872. Springer, Heidelberg (2004)
6. Schenk, V., Brady, M.: Finding CLS Using Multiresolution Orientated Local Energy Feature Detection. In: Heitgen, H. (ed.) Proc. 6th International Workshop on Digital Mammography, pp. 64–68 (2002)
7. McLoughlin, et al.: Connective Tissue Representation for Detection of Microcalcifications in Digital Mammograms. In: Sonka, M., Fitzpatrick, J.M. (eds.) Proc. SPIE Medical Imaging, vol. 4684, pp. 1246–1256 (2002)
8. Zwiggelaar, R., Astley, S.M., Boggis, C.R., et al.: Linear Structures in Mammographic Images: Detection and Classification. IEEE Transactions on Medical Imaging 23(9), 1077–1086 (2004)
9. Breiman, L.: Random Forests. Machine Learning 45(1), 5–32 (2001)
10. Selesnick, I., Baraniuk, R.G., Kingsbury, N.G.: The Dual-Tree Complex Wavelet Transform. IEEE Signal Processing Magazine 22(6), 123–151 (2005)
11. Kass, M., Witkin, A., Terzopoulos, D.: Snakes: Active Contour Models. International Journal of Computer Vision 1(4), 321–331 (1988)
