A System for Computer Aided Detection of Diseases Patterns in High Resolution CT images of the Lungs
T. Zrimec1,2, S. Busayarat2
1 Centre for Health Informatics, University of New South Wales, 2 School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
[email protected]
Abstract Automatic detection of disease patterns in medical images can assist radiologists in image analysis. We present a system for the detection of disease patterns, demonstrated on HRCT images of the lung. Automated image analysis can be improved by incorporating into a program the information and knowledge available to radiologists. Anatomical features and landmarks are first extracted from the images. This information, together with the structure and regions of the lung stored in a model of the lungs, is used in detecting disease patterns. Rules for recognizing different disease patterns are generated using machine learning. The system’s performance is demonstrated on two kinds of disease patterns, one related to structural deformation of the bronchial tree and one showing fibrotic changes of the lung parenchyma. The results show that the system is able to recognize and indicate the existence, size and location of potential lung abnormalities.
1. Introduction High Resolution CT (HRCT) techniques developed in the last decade have become invaluable tools for the detection of subtle diffuse lung disease patterns and for their characterization into multiple possible diseases. HRCT imaging protocols produce 3D volume data. The 16 slice scanner produces up to 40 images per study and the newer 64 slice scanner produces 300 to 600 images. With such a large number of images, which is growing rapidly, computerized image analysis methods can be of great help to radiologists. They can assist diagnosis by indicating the existence and location of potential abnormalities and by providing measurements of their size and distribution. There is a growing number of Computer-Aided Diagnosis (CAD) systems aimed at automating the analysis of lung CT images and supporting diagnosis [2-4]. Uppaluri et al. [2] presented a CAD system for detecting six lung tissue patterns using textural features. A multiple feature method was used to determine the optimal subset among 22 textural features calculated for each 31x31 pixel region of interest in an image. A Bayesian classifier was trained to use the optimal subset of features to recognize six different tissue patterns. They reported that the automated system performed as well as experienced human observers who were told the diagnosis in advance. Uchiyama et al. [3] also divided the lung into square regions and employed neural networks to perform classification of HRCT images into 6 textural classes. The neural network, trained with examples of different tissue patterns, was able to automatically detect images containing abnormalities and to provide good classification. In the work reported by Sluimer et al. [4], a multi-scale filter bank was used to represent the local image texture and structure. They used various classifiers to train the system. They reported that the CAD ROC curve showed very similar performance compared to that of two radiologists.
Twentieth IEEE International Symposium on Computer-Based Medical Systems (CBMS'07) 0-7695-2905-4/07 $20.00 © 2007
Almost all of the existing CAD systems divide the image into small regions, use classical image processing techniques to calculate the image features but do not take advantage of existing anatomical knowledge. Our idea is to first segment and extract anatomical features from the images and then to use that knowledge in detecting abnormalities caused by disease processes. This paper presents the system and describes how it uses specialised knowledge of lung anatomy and pathology to guide the analysis of HRCT images. The system also makes use of machine-learned rules for detecting normal anatomy and pathology. Novel knowledge-guided techniques have been developed for lung segmentation, quick and efficient detection of lung anatomy based on template matching, model-based image analysis and image registration for efficient disease detection.
2. Material and Methods
2.1 HRCT Image data The data used in the development of the system come from a regular radiology practice. The images were taken using a SIEMENS scanner, with an image resolution of 512x512 pixels, a slice thickness of 1.0 mm and an exposure time of 750 ms. The data are stored as DICOM 16-bit greyscale images, with the pixel intensity proportional to tissue density. Alteration of the lung anatomy caused by a disease can be clearly seen in a thin-slice CT image. Figure 1 shows examples of an image with normal anatomy and images showing the presence of diseases. Although those disease patterns are clearly visible to a trained human eye, it is not obvious how to provide an appropriate description that can be used by a computer. The best way is to delineate an example of a disease appearance. Using our specially developed web-based system [5], radiologists were able to access our image database, select and delineate representative examples of different lung disease patterns (see Figure 1.c) and attach labels with the name of the pattern, its severity and the potential disease. The marked images were used to train the computer to distinguish between different disease patterns.
2.2 Analysing and interpreting HRCT images Knowledge-based image analysis can improve detection in many ways. However, this requires enabling the system to use knowledge similar to that used by radiologists when analysing images.
2.2.1 Information used by radiologists when interpreting HRCT images: Knowledge of anatomy and of disease appearance helps radiologists to recognize the part of the lung shown on a 2D axial image. Radiologists also make extensive use of anatomical landmarks, which are objects or features that help determine the location of the imaged part of the body. Further information that helps in detecting pathology is the specific regional distribution of lung diseases.
The same features located in a different region of the lung or distributed in a different way can be linked to different pathologies [1]. Lung regions are extensively used in clinical reporting as adjuncts to the detected disease pattern. For example, cysts in patients with IPF are primarily located in the lung periphery and the disease has a basal distribution. Patients with centrilobular emphysema have an upper lobe distribution of low-attenuation lesions [1].
2.2.2 Information used by a computer aided detection system: We build a digital model of the lung using HRCT images to represent the structure and the content of the
lung anatomy and to provide regional information to the system. We used specialized literature about the visual interpretation of HRCT images of the lungs [1] to acquire knowledge about disease appearance and behaviour. Machine learning was employed to automatically generate knowledge for detecting anatomical features and disease patterns during image analysis. Figure 2 shows a graphical representation of the system.
3. Computer aided detection Techniques for segmentation of relevant anatomical features and landmarks were developed in parallel with the lung model [6]. A lung atlas was built using 418 HRCT images from a relatively normal subject, with no gaps between images. The performance of the algorithms for feature detection was improved with the growth of the knowledge in the model.
Figure 1. HRCT images of the lung: a) normal, b) with a disease, c) marked honeycombing region. (Labels in the figure: Honeycombing, Signet ring, Normal pair, bronchus, artery.)
Figure 2. System overview. (Blocks in the figure: HRCT Images, Image Processing, Knowledge Base (Models, Atlases, Rules), Machine Learning, Interpreted HRCT images.)
3.1 Detection of anatomical features
3.1.1 Lung boundaries and lung lobes: Prior to the analysis of the lung tissue, lung boundaries are determined to separate the pixels belonging to the lung parenchyma from the background. Lung boundaries are often used in providing diagnostic information in measurements such as “the number of cells adjacent to the lung boundary” or “the number of low opacity regions next to the lung boundary”. We have developed a technique for automatic segmentation of the lungs in HRCT images based on morphological operators and active contour snakes. Figures 3.a, 3.b and 3.c show the steps in lung segmentation. The lungs are divided by fissures into lobes (see Figure 3.c). The lobes are relatively independent functional units within the lung, and lung pathology may be confined to one lobe. A knowledge-based method for fissure detection, developed in our previous work, performed well in cases where the fissure was fully visible. However, in almost 30% of the data, the fissures are only partially visible or are not visible at all [7]. Using the information from the lung model, it was possible to determine fissures in the cases where they were partially visible or missing: the model guided fissure detection by predicting the expected fissure location. The detected fissures were used to determine the lung lobes in 2D images (see Figure 3.d) and in 3D models of patient data (see Figure 4, left).
3.1.2 Detecting lung landmarks: Anatomical landmarks used to help determine the location of the imaged part of the lung include the sternum, vertebrae and spinal canal (located on the ribcage), and the trachea, carina, hilum and lung root. Algorithms for the automatic detection of landmarks are based on knowledge of lung anatomy and of the grey level appearance in HRCT images. Using the landmarks, the lungs were divided
into lung regions that are used in clinical reporting: apical vs. basal, central vs. peripheral [7]. Figure 4 (right) shows one of the lung divisions.
3.1.3 Detection of bronchi and accompanying pulmonary arteries: The identification of bronchi and vessels in HRCT images provides valuable clinical information for patients with suspected airway diseases. One of the main signs is the dilation of the bronchi. In axial HRCT images, the bronchi that run perpendicular to the scan plane appear as high-attenuation circular rings and the arteries appear as high-attenuation solid circles (see Figure 1). Automatic detection of bronchi is based on local intensity gradient analysis and rule-based classification. Knowledge from the lung model is used to guide the detection algorithm, providing information about the location and the number of expected bronchi on a particular image. This was found to be helpful for detecting bronchi on a scan where there is a 15 mm gap between successive images. Each bronchus has an accompanying artery. An automatic method for detecting arteries based on [8] had problems with the ambiguous appearance of the adjacent arteries, which presents difficulties even for an experienced radiologist. It also had problems in providing accurate measurements of the size of small arteries due to the pixel rounding effect. A new technique was developed that uses template matching to approximately locate the adjacent artery [9]. A specially developed region growing algorithm with leak correction was then used to accurately segment the arteries. In contrast to other template matching techniques, where predefined templates are used, here the templates are generated on-the-fly using the detected bronchi.
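The idea of region growing with a guard against leaking can be illustrated with a minimal sketch. The paper does not specify its leak correction algorithm, so everything here is an assumption made for illustration: the function names `grow_region` and `grow_with_leak_correction`, the 4-connectivity, the size limit `max_pixels` used as the leak indicator, and the strategy of tightening the lower intensity threshold by `step` until the region fits.

```python
from collections import deque


def grow_region(image, seed, low, high, max_pixels):
    """4-connected region growing from `seed` over pixels in [low, high].

    Returns the set of (row, col) pixels, or None when the region exceeds
    `max_pixels`, which this sketch treats as a sign of leaking.
    """
    rows, cols = len(image), len(image[0])
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and low <= image[nr][nc] <= high:
                region.add((nr, nc))
                if len(region) > max_pixels:
                    return None  # grew past the expected size: assume a leak
                queue.append((nr, nc))
    return region


def grow_with_leak_correction(image, seed, low, high, max_pixels, step=10):
    """Hypothetical correction: raise the lower threshold until no leak occurs."""
    while low <= high:
        region = grow_region(image, seed, low, high, max_pixels)
        if region is not None:
            return region
        low += step
    return {seed}


# Toy image: a bright 2x2 "artery" with a dimmer path leading away from it.
image = [
    [0, 0, 0, 0, 0],
    [0, 200, 210, 0, 0],
    [0, 205, 220, 150, 150],
    [0, 0, 0, 0, 150],
]
artery = grow_with_leak_correction(image, (1, 1), low=140, high=255, max_pixels=5)
print(sorted(artery))
```

In this toy case the first two threshold settings let the region spill along the dimmer path, and the third confines it to the four bright pixels. A size limit derived from the detected bronchus would be one plausible way to choose `max_pixels`, but that choice is not taken from the paper.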
Figure 3. Lung segmentation (a, b and c); lung fissures (arrows in c); d) detected fissures and lung lobes. (Labels in the figure: upper lobes, lower lobes.)
Figure 4. A model of the lung; left: each lobe has a different colour; right: lung regions. (Labels in the figure: Central, Intermediate, Peripheral.)
3.2 Detection of disease patterns There is a substantial number of different disease patterns that can be visually identified in HRCT images of the lungs. The two examples described in this paper were chosen to demonstrate the different detection techniques required by different disease patterns. Rules for classifying the detected patterns were built automatically using J48, the Weka [10] decision tree induction algorithm. The input to an inductive learning algorithm is a set of classified examples represented by a set of attributes. In our case the examples are disease patterns described by image attributes. The result of learning is a classification tree in which the most informative attributes are used to determine the correct class.
3.2.1 Bronchial dilatation as a direct sign for Bronchiectasis: Dilation of a bronchus is detected by comparing its size with the size of the accompanying artery.
Bronchiectasis is considered to be present when the diameter of a bronchus is greater than that of the adjacent pulmonary artery (see Figure 1, “Signet ring”) [1]. A set of parameters calculated to compare the bronchus and its accompanying artery includes the lumen area, the shortest diameter, the ratios of the lumen areas and of the shortest diameters of a broncho-vascular pair, and the distance between the bronchus and the artery. Machine learning was used to automatically determine the severity thresholds and to determine which parameter should be used in assessing the severity for different sizes of bronchi. A comparison of the broncho-vascular pairs classified by a radiologist and the automatically detected and classified pairs is shown in Figure 5.
3.2.2 Honeycombing as a sign for interstitial lung diseases: Honeycombing indicates a disease process characterised by a cluster of air-filled cysts divided by thick walls. The cysts range from a few millimetres to several centimetres and occur predominantly in the periphery of the lung [1]. Honeycombing is common in patients with idiopathic pulmonary fibrosis and other interstitial diseases. In an HRCT image, honeycombing can be seen as a cluster of roughly circular dark patches surrounded by white walls (see Figure 1.c). Honeycombing is a challenging pattern to detect because of its combined textural and structural appearance. We have developed different methods for its detection. A structure-based method first detects potential honeycombing cysts. After clustering, each group of cysts is a potential honeycombing region. Each region is represented by sixty-four (64) image attributes that describe its global and local appearance. A set of potential honeycombing regions was classified by an expert and used for machine learning. J48 decision tree learning [10] produced rules for recognizing honeycombing and non-honeycombing regions. Results of the detection are shown in Figure 5.b.
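The clustering step of the structure-based method, which groups detected cysts into candidate regions, could look roughly like the following sketch. The paper does not state which clustering algorithm was used; this example assumes single-linkage grouping of cyst centroids with a distance threshold, and the names `cluster_cysts`, `candidate_regions`, `max_gap` and `min_cysts` are hypothetical.

```python
import math


def cluster_cysts(centroids, max_gap):
    """Group cyst centroids that are linked by chains of distances <= max_gap.

    Returns a sorted list of clusters, each a list of indices into `centroids`.
    """
    parent = list(range(len(centroids)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            if math.dist(centroids[i], centroids[j]) <= max_gap:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(centroids)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def candidate_regions(centroids, max_gap, min_cysts):
    """Keep only clusters large enough to be a potential honeycombing region."""
    return [g for g in cluster_cysts(centroids, max_gap) if len(g) >= min_cysts]


# Hypothetical cyst centroids in pixel coordinates (not data from the paper).
centroids = [(10, 10), (12, 11), (14, 10), (60, 60), (62, 61), (100, 5)]
print(cluster_cysts(centroids, max_gap=5))
print(candidate_regions(centroids, max_gap=5, min_cysts=3))
```

Each surviving group would then be described by image attributes and passed to the learned classifier; that stage is not shown here.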
In the texture-based method, sixty-four image features are calculated for each pixel and its surrounding 15x15 pixels. Machine learning was also used to generate rules for classifying the pixel-based regions. Figure 5.c shows the original image and regions with detected honeycombing.
Figure 5. Bronchiectasis: a) marked and detected (square: normal pair); honeycombing (HC): b) structure-based detection (left HC, right non-HC) and c) texture-based detection (HC bottom: green, cysts; red, cyst walls). (Labels in the figure: radiologist, computer, cell & regions detection, Classification.)
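The broncho-arterial comparison of Section 3.2.1 can be illustrated with a small sketch that computes the pair parameters named in the text (area ratio, shortest-diameter ratio, and distance) and applies the criterion quoted from [1]. The `Structure` class, the field names and the example measurements are assumptions made for illustration. The learned severity thresholds of the paper are not reproduced, only the basic "bronchus larger than artery" test.

```python
import math
from dataclasses import dataclass


@dataclass
class Structure:
    """A detected bronchus or artery (hypothetical representation)."""
    center: tuple             # (x, y) position in mm
    area: float               # cross-sectional (lumen) area in mm^2
    shortest_diameter: float  # shortest diameter in mm


def pair_features(bronchus, artery):
    """Parameters comparing a bronchus with its accompanying artery."""
    return {
        "area_ratio": bronchus.area / artery.area,
        "diameter_ratio": bronchus.shortest_diameter / artery.shortest_diameter,
        "distance": math.dist(bronchus.center, artery.center),
    }


def is_dilated(features):
    """Basic criterion from the text: bronchus diameter exceeds the artery's."""
    return features["diameter_ratio"] > 1.0


# Illustrative measurements only, not values from the paper.
bronchus = Structure(center=(10.0, 10.0), area=12.0, shortest_diameter=4.0)
artery = Structure(center=(13.0, 14.0), area=8.0, shortest_diameter=3.2)
features = pair_features(bronchus, artery)
print(features, is_dilated(features))
```

In the system described above, a feature dictionary of this kind would be the input to the decision tree learner, which decides which parameter and which threshold to use for a given bronchus size.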
4. Results The result of the automatic detection of bronchi was compared with the 711 manually identified bronchi from 67 images of 18 subjects. It achieved 73% sensitivity and 83% precision on the unseen data. Most of the false negatives occur with small bronchi, which radiologists also have difficulty in identifying. The automatic detection of the presence and severity of bronchial dilatation was evaluated on 442 broncho-arterial
pairs from 64 subjects. A radiologist verified the comparison. The method achieved 90% accuracy for artery detection and 82% accuracy for dilatation assessment. The detection of honeycombing was tested on 42 HRCT images from 8 patients. Using tenfold cross-validation, the structure-based method achieved 91.6% accuracy and the texture-based method achieved 88.6% accuracy.
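For readers unfamiliar with the measures reported above, the following sketch shows how sensitivity and precision are computed from detection counts and how a tenfold split can be formed. The function names and the counts in the example are hypothetical; they are not the counts behind the figures reported in this section, and the fold assignment shown is only one simple way to partition the examples.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of the manually identified objects that were detected."""
    return true_pos / (true_pos + false_neg)


def precision(true_pos, false_pos):
    """Fraction of the detections that correspond to real objects."""
    return true_pos / (true_pos + false_pos)


def kfold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k-fold cross-validation."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n_samples) if i not in held]
        yield train, held_out


# Hypothetical counts: 8 correct detections, 2 misses, 2 false alarms.
print(sensitivity(8, 2), precision(8, 2))

# Each example is held out exactly once across the ten folds.
splits = list(kfold_indices(25, k=10))
print(len(splits), [len(test) for _, test in splits])
```

Accuracy under cross-validation is then the fraction of held-out examples classified correctly, pooled or averaged over the folds.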
5. Conclusion We have presented a system for computer-aided detection of disease patterns. In the proposed framework, normal anatomy and anatomical landmarks are segmented and used in detecting the disease patterns. Recognizing normal anatomy helps in detecting many diseases that have similar appearance. For example, the appearance of honeycombing is similar to normal bronchi and vessels. Because we know the expected location of the bronchi and vessels, they can be eliminated, leaving the honeycombing. Most of the methods developed are knowledge-guided. Knowledge of anatomy came from a model of the lung. Specific knowledge, related to HRCT images, was acquired via machine learning from examples. Knowledge about disease appearance and its distribution in the lungs was encoded in heuristic rules. Having learned the lung anatomy and having developed a model of the lung, we now have the basis for building systems for recognizing patterns created by other lung diseases.
6. Acknowledgements We thank Peter Wilson for his medical knowledge and assistance, James Wong and Jonathan Creenaune for helping with the implementation, and Claude Sammut for his comments on the paper. This research was supported by the Australian Research Council.
7. References
[1] R. W. Webb, N. L. Muller, and D. P. Naidich, High-Resolution CT of the Lung (third edition), Lippincott Williams & Wilkins, 2001.
[2] R. Uppaluri, E.A. Hoffman, M. Sonka, P.G. Hartley, G.W. Hunninghake, and G. McLennan, “Computer recognition of regional lung disease patterns”, American Journal of Respiratory and Critical Care Medicine, Vol. 160(2): 648–654, 1999.
[3] Y. Uchiyama, S. Katsuragawa, H. Abe, J. Shiraishi, F. Li, Q. Li, C.-T. Zhang, K. Suzuki, and K. Doi, “Quantitative computerized analysis of diffuse lung disease in high-resolution computed tomography”, Medical Physics, Vol. 30(9): 2440–2454, 2003.
[4] I. Sluimer, P.F. van Waes, M.A. Viergever, and B. van Ginneken, “Computer-aided diagnosis in high resolution CT of the lungs”, Medical Physics, Vol. 30(12): 3081–3090, 2003.
[5] M. Rudrapatna, A. Sowmya, T. Zrimec, P. Wilson, G. Kossoff, J. Wong, S. Busayarat, A. Misra, P. Lucas, “LMIK – Learning Medical Image Knowledge: An Internet-based medical image knowledge acquisition framework”, Electronic Imaging Science and Technology, IS&T/SPIE's 16th Annual Symposium, Jan 2004, San Jose, CA.
[6] T. Zrimec, S. Busayarat, P. Wilson, “A 3D model of the human lung with lung regions characterization”, ICIP 2004 Proc. IEEE Int. Conf. on Image Processing, Singapore, 2004, pp. 1149–1152.
[7] S. Meenakshi, K.Y. Manjunath, V. Balasubramanyam, “Morphological variations of the lung fissures and lobes”, Indian J. Chest Dis. Allied Sci., Vol. 46(3): 179–182, 2004.
[8] F. Chabat, X. Hu, D.M. Hansell, G.Z. Yang, “ERS Transform for automated detection of bronchial abnormalities on CT”, IEEE Trans. Medical Imaging, Vol. 20(9): 942–952, 2001.
[9] S. Busayarat, T. Zrimec, “Automatic Detection of Pulmonary Arteries and Assessment of Bronchial Dilatation in HRCT Images of the Lungs”, in Proc. of 2005 ICSC Congress on Computational Intelligence: Methods & Applications, IEEE, ISBN: 1-4244-0020-1, 2005.
[10] I.H. Witten and E. Frank, Data Mining: Practical machine learning tools and techniques, Morgan Kaufmann, San Francisco, 2nd ed., 2005.