EVALUATION OF GEOMETRIC FEATURE DESCRIPTORS FOR DETECTION AND CLASSIFICATION OF LUNG NODULES IN LOW DOSE CT SCANS OF THE CHEST

Amal Farag, Asem Ali, James Graham, Aly Farag, Salwa Elshazly and Robert Falk*

Computer Vision and Image Processing Laboratory, Department of Electrical and Computer Engineering, University of Louisville
(*) Medical Imaging Division, Jewish Hospital, Louisville, KY, USA
E-mail: [email protected] ; URL: www.cvip.uofl.edu

ABSTRACT

This paper examines the effectiveness of geometric feature descriptors, common in computer vision, for false positive reduction and for classification of lung nodules in low dose CT (LDCT) scans. A data-driven lung nodule modeling approach creates templates for common nodule types using active appearance models (AAM); the templates are then used to detect candidate nodules based on optimum similarity measured by the normalized cross-correlation (NCC). Geometric feature descriptors (e.g., SIFT, LBP and SURF) are applied to the output of the detection step in order to extract features from the nodule candidates, for further enhancement of the output and possible reduction of false positives. Results on the clinical ELCAP database showed that the descriptors, when used in a k-NN classifier, improve the specificity of nodule detection by 2% over the NCC results. Thus quantitative measures of enhancements of the performance of CAD models based on LDCT are now possible and are entirely model-based. Most importantly, our approach is applicable to classification of nodules into categories and pathologies.

Index Terms: Lung nodule classification, LDCT scans, Lung nodule detection, SIFT, LBP

1. INTRODUCTION

In the past two decades numerous screening studies have been conducted worldwide to study early indications of lung cancer. Indeed, survival of lung cancer is strongly dependent on accurate and early diagnosis [1].
Enhancements in image acquisition systems have increased the ability to obtain low-dose computed tomography (LDCT) scans, where significant tissue analysis can now be performed at much lower risk of radiation. In the US, for example, the National Lung Screening Trial (NLST) has just released a report indicating that screening with LDCT may reduce lung cancer deaths relative to screening with regular chest X-ray (e.g., http://www.cancer.gov) [1]. The typical CAD system involves image filtering, segmentation, nodule detection and classification. There is a rich literature on the various steps (e.g., [2]-[8]). We can state with great confidence that the filtering and segmentation steps have

978-1-4244-4128-0/11/$25.00 ©2011 IEEE


been resolved in terms of accuracy, speed and robustness to various uncertainties. We believe, based on nearly a decade of experience with CAD systems, that the detection step has improved significantly, even though false positives are still problematic, especially with low-resolution scans and when the nodules are severely occluded by anatomical structures. Because the detection and classification steps are coupled, there is still a need for improvements in the techniques and in validation. A quantitative measure of progress in these two steps, and in fact in the entire CAD system for LDCT, requires a standard database of nodules and a validation mechanism. Our work using the ELCAP database has shown that data-driven nodule models outperform parametric models (e.g., [6][7]). We use the Active Appearance Modeling (AAM) approach to construct the data-driven templates from an ensemble of nodules. Such models improved the specificity and sensitivity of detection and reduced the false positives relative to parametric nodule models. An issue of interest is the effect of nodule models on classification into categories (i.e., types of nodules) and pathology. What types of features may be extracted from the nodules in order to improve the accuracy of classification? Computer vision methodologies developed for object modeling and matching may be of great benefit in answering this question. Local Binary Pattern was used in [9] to extract feature vectors to classify the area between the thyroid boundaries and then detect thyroid nodules in ultrasound images. Li et al. [10] used a combination of techniques, including a nodule enhancement filter and an automatic rule-based classifier, that allowed for false-positive reduction during detection.
In our previous work [11], we compared the features extracted using an adaptation of the Daugman iris recognition algorithm [13] and the Scale-Invariant Feature Transform (SIFT) algorithm [14] for nodule classification, while in [12] we compared the features of the Multi-resolution Local Binary Pattern (LBP) [15] and the Speeded Up Robust Features (SURF) [16] algorithms for nodule classification. These algorithms have been able to provide very useful features for categorization of lung nodules, which warrant further investigation on larger databases. In this paper, we expand further on our

ISBI 2011

previous work in [11-12] on two fronts. First, an additional fifth class, “non nodule”, is introduced to the possible categories (clusters) of nodules from the detection step, which allows for other possibilities of nodules in the lung tissue (e.g., nodules on whose categorization radiologists may differ). Features from SIFT and LBP are used in a k-NN framework for categorization. Second, the SIFT and LBP feature information is also used during the detection step as a false positive reducer to enhance the results obtained from template matching with the data-driven lung nodule models [6-7].

2. FEATURE DESCRIPTORS FOR CLASSIFICATION

The success of object description hinges on two main conditions: distinction and invariance. The methodology needs to be robust enough to accommodate variations in imaging conditions and, at the same time, produce a distinctive characterization of the desired object. Within the four nodule categories (e.g., Kostis [4]), we recognize variations in shape and texture among the Juxta-Pleural, Pleural-Tail, Vascularized and Well-Circumscribed nodules. Similar distinctions were also found between the nodules and “non nodules”. This work is based on the ELCAP public database [17]. We created a nodule database by using the locations of 397 nodules provided by radiologists. A subset of 294 nodules, those of the original 397 that were accurately categorized, was used. Given the nodule centroid, we extract the SIFT and LBP descriptors. The SIFT approach [14] used in this paper transforms the data in each class (nodule and non nodule) into scale-invariant coordinates relative to local features. The SIFT process produces a feature vector of length 128, which is the descriptor representation of one image. Figure 1 depicts the SIFT recognition process. The extracted SIFT descriptors are projected to a lower-dimensional subspace, where noise is filtered out, using principal component analysis (PCA) and linear discriminant analysis (LDA), techniques abundant in the pattern recognition literature (e.g., [18]).
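For reference, the projection step can be written in its standard textbook form. The symbols below are notation assumed here for illustration: x_i are the N training descriptors (128-dimensional for SIFT), \mu is their mean, e_1, ..., e_d are the eigenvectors of C with the d largest eigenvalues, and S_B and S_W are the between-class and within-class scatter matrices. The paper does not state the retained dimension d.

```latex
\begin{aligned}
\mu &= \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
C = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)(x_i - \mu)^{\top}, \\
y_{\mathrm{PCA}} &= W_d^{\top}(x - \mu), \qquad W_d = [\, e_1, \dots, e_d \,], \\
W_{\mathrm{LDA}} &= \arg\max_{W} \frac{\left| W^{\top} S_B W \right|}{\left| W^{\top} S_W W \right|}.
\end{aligned}
```

PCA keeps the directions of largest variance regardless of class labels, whereas LDA uses the labels to favor directions that separate the classes.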

Fig 1. Visualization of the SIFT recognition process for the Juxta-Pleural, Well-Circumscribed, Vascularized and Pleural-Tail nodules and non nodules.

The extended LBP operator within a (P, R) neighborhood, $LBP^{u2}_{P,R}$ [15], was used in this paper. Fig. 2 illustrates the LBP of both the original images and gradient images, where Sobel filters ($h_x$ and $h_y$) are used to generate the gradient magnitude image:

$$|\nabla I| = \sqrt{(h_x \otimes I)^2 + (h_y \otimes I)^2}$$

The extracted LBP descriptors are also projected using PCA and LDA to filter out the noise artifacts. Once the feature descriptors for each of the five classes were generated, a leave-one-out k-NN classifier with Euclidean distance as the similarity measure was implemented in order to test whether distinctions are in fact apparent between classes. Various training percentages within the classes were used, i.e., x% is the share of ground-truth nodules taken into consideration in the training phase. Training in this paper was performed using a one-time random sampling approach.
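The leave-one-out k-NN evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the choice k=1 and the toy two-dimensional feature vectors are assumptions made here, and the real descriptors are far longer.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, query, k=1):
    """Label `query` by majority vote among its k nearest training vectors.

    On a tied vote the label seen first among the nearest neighbours wins.
    """
    ranked = sorted(range(len(train_x)), key=lambda i: euclidean(train_x[i], query))
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_tpr(features, labels, k=1):
    """Per-class true positive rate when each sample is held out in turn."""
    hits, totals = Counter(), Counter()
    for i, (x, y) in enumerate(zip(features, labels)):
        rest_x = features[:i] + features[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        totals[y] += 1
        if knn_predict(rest_x, rest_y, x, k) == y:
            hits[y] += 1
    return {c: hits[c] / totals[c] for c in totals}


# Hypothetical toy descriptors: two well-separated clusters.
toy_features = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
toy_labels = ["nodule"] * 3 + ["non nodule"] * 3
toy_rates = leave_one_out_tpr(toy_features, toy_labels, k=1)
```

A sample counts as a true positive when its held-out prediction matches its own class, which is the per-class measure reported in Table 1.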

Fig 2. Block diagram of generating the LBP for a juxta-pleural nodule. The equation for the above picture is $LBP^{u2}_{8,1} + LBP^{u2}_{16,2} + LBP^{u2}_{8,1} + LBP^{u2}_{16,2}$, where the first two terms represent the original image and the last two terms represent the gradient image.
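To make the descriptor in Fig. 2 concrete, the sketch below computes a uniform LBP histogram on an image and on its Sobel gradient magnitude and concatenates the two. It is a simplified illustration under stated assumptions: only the (8, 1) neighborhood is covered (the paper also uses (16, 2)), the eight neighbors are taken from the 3x3 square ring rather than interpolated on a circle, border pixels are skipped, and all function names are hypothetical.

```python
import math

# Clockwise 8-neighbour offsets (row, col) on the 3x3 square ring.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def transitions(code, bits=8):
    """Number of 0/1 changes in the circular bit string of `code`."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))
    return bin(code ^ rotated).count("1")


# Uniform patterns (at most two transitions) get bins 0..57;
# every non-uniform pattern shares the last bin, 58.
UNIFORM_BIN = {}
for _code in range(256):
    if transitions(_code) <= 2:
        UNIFORM_BIN[_code] = len(UNIFORM_BIN)
NON_UNIFORM = len(UNIFORM_BIN)


def sobel_magnitude(img):
    """Gradient magnitude on interior pixels; border pixels stay at 0.

    The kernels are applied by correlation; the sign difference from true
    convolution does not change the magnitude.
    """
    hx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    hy = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0.0
            for i in range(3):
                for j in range(3):
                    v = img[r + i - 1][c + j - 1]
                    gx += hx[i][j] * v
                    gy += hy[i][j] * v
            out[r][c] = math.hypot(gx, gy)
    return out


def lbp_u2_histogram(img):
    """59-bin uniform LBP histogram over the interior pixels of `img`."""
    hist = [0] * (NON_UNIFORM + 1)
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            center = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                if img[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[UNIFORM_BIN.get(code, NON_UNIFORM)] += 1
    return hist


def lbp_descriptor(img):
    """Concatenate the histograms of the image and of its gradient magnitude."""
    return lbp_u2_histogram(img) + lbp_u2_histogram(sobel_magnitude(img))
```

Grouping all non-uniform codes into one bin is what keeps the (8, 1) histogram at 59 entries instead of 256.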

Nodule type classification performance was quantified by measuring true positive rates. A classification result is considered a true positive if a sample from a certain class is classified as belonging to that class. The results shown in Table 1 are promising. The raw SIFT and raw LBP fell short in correctly classifying the actual nodules into their respective classes, but classified the non nodule class very well overall. The PCA and LDA SIFT and LBP improved classification results for all classes. PCA SIFT and LBP obtained better results as the training percentage decreased, while the opposite is true of the LDA SIFT and LBP results. All of the results in Table 1 support the conclusion that non nodules do in fact contain descriptor variations that allow them to be correctly classified. In this paper we chose to experiment with the raw SIFT and LBP feature descriptors, for various training percentages, as false positive reducers during detection. Thus, the four nodule classes were merged into one class, nodule, with non nodule as the second class.

Table 1. Classification results obtained from (a) raw SIFT vs. raw LBP, (b) PCA SIFT vs. PCA LBP and (c) LDA SIFT vs. LDA LBP.

3. LUNG NODULE DETECTION WITH FALSE POSITIVE REDUCTION

The predominant templates used in the literature follow a parametric form (usually circular or semi-circular in 2D and spherical or semi-spherical in 3D). This approach has limited sensitivity and specificity, and the false positives are very hard to quantify (e.g., [2][5][6]). The main reason for the limited performance of parametric nodule models is that real-world nodules do not have a uniform shape or fixed size and are not isotropic. In this paper, we use our data-driven templates (shown in Fig. 3) [7]; these are four templates, each representing the mean shape and texture of one of the nodule types described above. These models were generated using a Procrustes-based AAM method [19], which allowed for a more realistic texture and shape description of the nodules. We implement a generic template matching approach that uses the normalized cross-correlation (NCC) as the similarity measure.

Fig 3. Nodule models using the mean shape and texture of co-registered nodules. The resultant nodule models are based on the average of shape and texture from the AAM approach (e.g., [7]).

The detection process was carried out in two ways. The first implements template matching using the data-driven nodule models and computes the sensitivity and specificity of detection. The second uses the results from the previous detection as initial candidate nodule locations. The SIFT and LBP feature extraction methods generate the feature descriptors for each of these candidate nodules, which are then compared with the SIFT and LBP feature descriptors of the nodule and non nodule classes for various training percentages. The non nodule candidates are discarded and no longer treated as candidate lung nodules. Sensitivity and specificity are then computed to obtain the detection results after false-positive reduction. The form of the NCC widely used in the literature for the normalized cross-correlation of a template t(x,y) with a subimage f(x,y) is implemented in this paper. Template matching follows a raster-sweeping methodology that computes the NCC as it proceeds. An NCC threshold value of 0.5 was deemed suitable from previous analysis [6].

4. DETECTION RESULTS

As stated previously, this work is based on the ELCAP public database [17]. Detection results are obtained using the nodules extracted from the 50 sets of low-dose CT lung scans taken at a single breath-hold with slice thickness 1.25 mm. Numerous experiments were conducted to test the robustness and effectiveness of the feature descriptors as false-positive reducers combined with the data-driven template models for detection. Tables 2-5 show some of the results obtained before and after false positive reduction using the raw SIFT or raw LBP descriptors for different training percentages. Before false positive reduction, an overall 86% sensitivity and 97% specificity were obtained. False positive reduction using increased SIFT training data raised the overall specificity by 2%, to 99%, with 85% sensitivity. False positive reduction using LBP yielded results similar to those of SIFT, but the overall specificity increase was 1%. Among the individual nodule types, the well-circumscribed and vascularized nodules had the lowest sensitivity, yet their specificity is very high. Overall, false positive reduction with 50% training data increases detection specificity with minimal changes in sensitivity, both overall and per nodule type.
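The detection and scoring steps can be sketched in a few lines. The code below is an illustration under assumptions, not the authors' implementation: it takes the "widely used form" of the NCC to be the zero-mean normalized cross-correlation, sweeps the template over the image in raster order, keeps positions scoring at or above the 0.5 threshold cited from [6], and computes sensitivity and specificity from confusion counts. Function names and the tiny test image are hypothetical.

```python
import math


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-size 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mp, mt = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mp) * (b - mt) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mp) ** 2 for a in p) * sum((b - mt) ** 2 for b in t))
    # A flat patch or template has no defined correlation; score it as 0.
    return num / den if den > 0 else 0.0


def match_template(image, template, threshold=0.5):
    """Raster-sweep the template; return (row, col, score) for scores >= threshold."""
    th, tw = len(template), len(template[0])
    hits = []
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score >= threshold:
                hits.append((r, c, score))
    return hits


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical 5x5 image containing one copy of a 3x3 blob template.
blob = [[0, 1, 0], [1, 2, 1], [0, 1, 0]]
scan = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 2, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
candidates = match_template(scan, blob, threshold=0.5)
```

In the two-stage scheme, the positions returned by `match_template` would be the candidates passed to the SIFT or LBP classifier, and `sensitivity_specificity` would be evaluated once before and once after the non nodule candidates are discarded.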

Table 2: Template matching results without false positive reduction.

Table 3: Template matching results using 25% of SIFT generated feature descriptors for false positive reduction.

Table 4: Template matching results using 50% of SIFT generated feature descriptors for false positive reduction.

Table 5: Template matching results using 50% of LBP generated feature descriptors for false positive reduction.

5. CONCLUSIONS AND FUTURE WORK

This paper discussed key approaches for nodule and non nodule texture feature extraction using some of the well-known feature descriptors in the computer vision literature, used for the first time in lung nodule detection and classification research. The features from the descriptors were optimized by projection to a lower-dimensional subspace using PCA and LDA in order to decrease noise artifacts in the generated features. Classification between nodules and non-nodules was examined using a k-NN leave-one-out algorithm with Euclidean distance as the similarity measure, in order to test whether significant distinctions exist between classes. Detection using the data-driven template matching approach, before and after false positive reduction via SIFT and LBP feature extraction for various training percentages, was also implemented, and the sensitivity and specificity of the detection results were analyzed. An overall 2% specificity increase was found after template matching followed by false positive reduction by the SIFT descriptor. No significant enhancement in the sensitivity resulted from this process. The enhancement in the specificity is very important, as it enables us to confidently assign nodules to categories, and eventually to pathologies.

Future directions are geared toward generating a larger nodule database from other clinical data to expand our work. We aim to test the detection process using the PCA and LDA results for both SIFT and LBP for false positive reduction; sensitivity and specificity will be computed to obtain a conclusive finding on false positive reduction using the methods discussed in this paper. We also aim to incorporate other classification techniques into the proposed approach for comparison and to obtain the best generalized method.


Acknowledgements: This research has been supported by grants from the Kentucky Lung Cancer Program. The first author has been supported by a NASA Graduate Fellowship.


REFERENCES

1. United States National Institute of Health. www.nih.gov.
2. Y. Lee, T. Hara, H. Fujita, S. Itoh and T. Ishigaki. Automated detection of pulmonary nodules in helical CT images based on an improved template-matching technique. IEEE Transactions on Medical Imaging, 20, 2001.
3. D. F. Yankelevitz, W. J. Kostis, A. P. Reeves and C. I. Henschke. Three dimensional segmentation and growth-rate estimation of small pulmonary nodules in helical CT images. IEEE Transactions on Medical Imaging, 22:1259–1274, 2003.
4. W. J. Kostis et al. Small pulmonary nodules: reproducibility of three-dimensional volumetric measurement and estimation of time to follow-up. Radiology, Vol. 231, pp. 446-52, 2004.
5. I. Sluimer, A. Schilham, M. Prokop and B. van Ginneken. Computer analysis of computed tomography scans of the lung: A survey. IEEE Transactions on Medical Imaging, 25(4):385–405, April 2006.
6. Amal A. Farag, S. Y. Elhabian, S. A. Elshazly and A. A. Farag. Quantification of nodule detection in chest CT: A clinical investigation based on the ELCAP study. Proc. of Second International Workshop on Pulmonary Image Processing in conjunction with MICCAI-09.
7. A. Farag, J. Graham, A. Farag, S. Elshazly and R. Falk. Parametric and Non-Parametric Nodule Models: Design and Evaluation. Proc. of Third International Workshop on Pulmonary Image Processing in conjunction with MICCAI-10, pp. 151-162, 2010.
8. S. Lee, A. Kouzani and E. Hu. Automated detection of lung nodules in computed tomography images: a review. Machine Vision and Applications, 2010.
9. E. G. Keramidas, D. K. Iakovidis, D. Maroulis and S. Karkanis. Efficient and effective ultrasound image analysis scheme for thyroid nodule detection. Lecture Notes in Computer Science, vol. 4633, pp. 1052-1060, 2007.
10. Q. Li, F. Li and K. Doi. Computerized detection of lung nodules in thin-section CT images by use of selective enhancement filters and an automated rule-based classifier. Acad. Radiol. 15, 165–175, 2008.
11. Amal Farag, S. Elhabian, J. Graham, A. Farag and R. Falk. Toward Precise Pulmonary Nodule Descriptors for Nodule Type Classification. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2010. Lecture Notes in Computer Science, pp. 626-633, 2010.
12. Amal Farag, A. Ali, S. Elhabian, J. Graham and A. Farag. Feature-Based Lung Nodule Classification. International Symposium on Visual Computing (ISVC), November 2010.
13. J. Daugman. Probing the uniqueness and randomness of IrisCodes: Results from 200 billion iris pair comparisons. Proceedings of the IEEE, 94(11), pp. 1927-1935.
14. D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 2004.
15. T. Ojala, M. Pietikainen and T. Maenpaa. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-24, pp. 971-987, 2002.
16. Herbert Bay, Andreas Ess, Tinne Tuytelaars and Luc Van Gool. SURF: Speeded Up Robust Features. Computer Vision and Image Understanding (CVIU), Vol. 110, No. 3, pp. 346-359, 2008.
17. ELCAP public lung image database. www.via.cornell.edu/databases/lungdb.html.
18. R. Duda, P. Hart and D. Stork. Pattern Classification, 2nd Edition, Wiley, 2001.
19. T. F. Cootes, G. J. Edwards and C. J. Taylor. Active Appearance Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 2001.
