Automated Image Retrieval of Chest CT Images Based on Local Grey Scale Invariant Features

Marcelo Arrais Porto a, Marcos Cordeiro d'Ornellas b

a Animati Computação Aplicada, Santa Maria-RS, Brasil
b Universidade Federal de Santa Maria, Santa Maria-RS, Brasil
Abstract

Textual tools are regularly employed to retrieve medical images for reading and interpretation in current Picture Archiving and Communication Systems (PACS), but they pose some drawbacks. All-purpose content-based image retrieval (CBIR) systems are limited when dealing with medical images and do not fit well into the PACS workflow and clinical practice. This paper presents an automated image retrieval approach for chest CT images based on local grey scale invariant features. The developed application prototype is integrated with PACS and RIS and uses user-specified image features or user-supplied query images to retrieve the most similar images from an image database. Local correspondences are used for radiological diagnosis to substantially reduce the full pose search space. Performance was measured in terms of precision and recall, average retrieval precision (ARP), and average retrieval rate (ARR). Preliminary results have shown the effectiveness of the proposed approach. The prototype is also a useful tool for radiology research and education, providing valuable information to the medical and broader healthcare community.

Keywords: PACS; Image Retrieval; Chest CT Images; Local Features.
Introduction

Medical imaging has become essential for modern medicine [1]. It is being used extensively for diagnosis, planning, and treatment evaluation, dramatically changing how physicians deal with illnesses. The broad applicability of medical imaging has accelerated its evolution, translating into a standard of medical care for cancer, stroke, trauma, and other conditions [2]. Moreover, with the wide development and deployment of reliable, fast, and user-friendly medical image acquisition devices, physicians can easily generate a large number of images with high spatial resolution. Careful non-invasive observation of anatomical structures in the human body has therefore become possible. Nevertheless, image interpretation and understanding pose a significant and continuous burden on physicians, who need to evaluate large amounts of data, which in the case of chest CT may comprise hundreds of images, in a short period of time. In order to tackle these challenges, a generic integration of computer-aided diagnosis (CAD) and content-based image retrieval (CBIR) is expected. CBIR has been introduced into the radiology interpretation routine in recent years [3]. CBIR usually employs a set of low-level feature descriptors to represent a medical image, over which a group of similarity functions drives different sorts of queries. This helps physicians find similar cases in a variety of archives, thus providing support for medical image interpretation and decision-making. The question of image similarity has an impact in the medical domain, since diagnostic decision-making traditionally involves evidence from the patient's medical records coupled with the physician's prior knowledge.

Picture archiving and communication system (PACS) [4] repositories provide an opportunity for image-based diagnosis, learning, teaching, and research-based comparisons based on image similarities. This may require searching the database for images with characteristics similar to those of the patient under consideration. Even so, PACS searching capabilities are mostly text-driven [5]. Textual descriptions limit these capabilities, requiring physicians to go through several medical records to grasp some keywords. While text-based PACS search continues to be useful, such a search is always limited, since it does not take the visual properties of the images into account. Additionally, the dramatic volume of imaging stored in hospital and clinical environments requires content-based medical image retrieval (CBMIR) systems that provide solutions for a large array of medical imaging modalities, as well as methods to organize such images.

Lung cancer is one of the most common cancers in the world; it is the leading cause of cancer death in both men and women in developing and developed countries [6]. A major issue in the successful treatment of lung cancer is the early detection of the disease. The purpose of a screening chest CT scan is to detect and diagnose an underlying medical condition at an early stage, before any symptoms develop. This paper presents an automatic approach to image retrieval of chest CT images based on local grey scale invariant features, namely the scale-invariant feature transform and multi-scale oriented patches. In order to achieve a convenient interface for integrating image retrieval capability into PACS environments, an adapter was developed. The solution helps with the interpretation and comparison of chest CT images, a domain where diagnosis is fairly hard, especially for non-chest specialists. The proposed approach is evaluated by conducting experiments on distinct open-source chest CT databases.

The remainder of this paper is structured as follows. Related Work presents relevant and up-to-date information on content-based image retrieval in medical databases and medical image retrieval with local descriptors. The following section presents the interface architecture, which aims at CBMIR and PACS integration. Experimental Results reports precision-recall measurements against open-source chest CT databases; results and reasonable outcomes are hypothesized before conducting a statistical analysis. Discussion and Conclusion provide meaningful observations, while further work is outlined in order to guide future experimental activities of this ongoing research.
Related Work

Content-Based Image Retrieval in Medical Databases

Feature vectors, built from low-level features, are fundamental for assessing similarity in the CBMIR searching process. Some methods make use of global features for image retrieval [7], while other methods rely on local features. A Gaussian mixture model framework for matching and categorizing X-ray images was presented in [8]. An evaluation of texture and multi-scale features with several classifiers for medical image classification and retrieval was proposed in [9]. A bag-of-features approach was used to describe patch-based image content in [10], while an enhanced methodology using multiple assignment and visual word weighting of the patches was discussed in [11]. The process whereby fused features assist the matching of scale-invariant feature transform (SIFT) image features from high-contrast scenes was introduced in [12]: local features were randomly sampled modified SIFT descriptors, global features were downsized pixels, and segmentation was carried out by support vector machines (SVM). Though these methods have proved effective in medical image retrieval, there is still a semantic gap between low-level and high-level features. Semi-supervised learning approaches that try to leverage both labeled and unlabeled data were studied in [13]. An image retrieval algorithm based on SVM and local features was developed in [14], where the label mean was employed as prior knowledge to improve semi-supervised results. Linear neighborhood propagation, which can propagate labels from the labeled points to the whole dataset using linear neighborhoods with sufficient smoothness, was presented in [15].

In CAD for lung cancer detection and diagnosis, nodule detection in chest CT [16] is the most valuable and tedious task. The lung is the main structure in chest CT images, composed of vessels, bronchi, and fissures, which are relevant references for lung cancer, pneumonia, and several other lung diseases in clinical diagnosis. Recognition of lung structure is an essential aspect of successful CAD and strongly influences CBMIR systems. Several research works [17] have addressed segmenting a specific target region, such as the lung or the vessels, from chest CT images for medical image retrieval.

CBMIR systems have traditionally applied global features, including color histograms and statistical distributions. Recently, local features have emerged as an effective alternative, with several advantages over global features, including robustness to occlusion and clutter and superior discrimination of fine details. Local descriptors are based on points of interest and their local characterization. Many interest point techniques and performance evaluations have been proposed in the literature [18]; their suitability depends on the application and, more precisely, on the imaging transformations involved. Artifacts are commonly found in clinical CT and may obscure or simulate pathology; they include noise, motion, ring, and metal artifacts. Feature extraction methods must therefore be stable and robust to enable image recognition and retrieval. Local invariant methods, including SIFT [19] and multi-scale oriented patches (MOPS) [20], have been employed to tackle these problems. We adopt the MOPS algorithm in this paper for feature extraction and matching of chest CT images. MOPS is a relatively lightweight scale invariant feature detector compared to SIFT and has the advantage of faster detection. The algorithm extends the Harris detector with rotation and scale invariance; the autocorrelation-based detector used in the implementation makes it prone to finding edge- and corner-like features. MOPS features are well distributed spatially in the image, and gradients are used to extract a dominant orientation in a region centered at each feature point.
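For concreteness, the following is a minimal sketch of a MOPS-style extractor along the lines just described, assuming OpenCV and NumPy. The function name, Harris parameters, and the number of corners kept per level are our illustrative choices, not the prototype's actual implementation.

```python
import cv2
import numpy as np

def mops_features(gray, levels=4, patch=8, spacing=5, max_per_level=200):
    """Return a list of (x, y, level, angle, descriptor) tuples."""
    img = gray.astype(np.float32)
    features = []
    for level in range(levels):
        # Harris corner response at this pyramid level.
        harris = cv2.cornerHarris(img, blockSize=2, ksize=3, k=0.04)
        ys, xs = np.where(harris > 0.01 * harris.max())
        order = np.argsort(harris[ys, xs])[::-1][:max_per_level]
        # Dominant orientation from heavily smoothed gradients.
        blur = cv2.GaussianBlur(img, (0, 0), sigmaX=4.5)
        gx = cv2.Sobel(blur, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(blur, cv2.CV_32F, 0, 1)
        for i in order:
            x, y = float(xs[i]), float(ys[i])
            angle = float(np.degrees(np.arctan2(gy[int(y), int(x)],
                                                gx[int(y), int(x)])))
            # Rotate/scale about the key point, then crop an 8x8 patch:
            # equivalent to sampling a 40x40 window with 5-pixel spacing.
            rot = cv2.getRotationMatrix2D((x, y), angle, 1.0 / spacing)
            warped = cv2.warpAffine(img, rot, (img.shape[1], img.shape[0]))
            x0, y0 = int(x) - patch // 2, int(y) - patch // 2
            p = warped[y0:y0 + patch, x0:x0 + patch]
            if p.shape != (patch, patch):
                continue  # patch fell outside the image
            # Bias/gain normalization: zero mean, unit standard deviation.
            desc = (p - p.mean()) / (p.std() + 1e-8)
            features.append((x * 2 ** level, y * 2 ** level,
                             level, angle, desc.ravel()))
        img = cv2.pyrDown(img)  # next coarser pyramid level
    return features
```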
Prototype Architecture

The prototype is arranged as a set of modules, as illustrated in Figure 1. Visual content descriptors are extracted from the image in the feature descriptor module and submitted to the medical image search engine module. The engine searches for an appropriate set of well-described images and returns a set of similar images, which can be used for further processing. The feature descriptor module interacts with the image processing library module, which provides a seamless interface to image processing libraries for dealing with DICOM images and other image formats. The query subsystem belongs to the search engine module and is responsible for retrieving similar images from the database according to the user's query image: it receives queries from the user interface, constructs the statements, and invokes the DBMS to execute them. The search is based on similarity comparison rather than exact match, and the retrieved results are ranked according to a similarity metric in the index and retrieval module, which manages and retrieves the data sources. The medical image database and the image features are stored in the DBMS, and the associated metadata are stored using traditional or XML data types. Supported queries include standard queries that employ the usual attributes, DICOM metadata-based queries, and content-based queries.
Figure 1 – Overview of the prototype architecture.

The query subsystem also leaves room for relevance feedback, which may be included in future versions: it can record information during users' interactions, generating user profiles that can be used to improve the similarity computation according to each user's perception. Feature extractors and distance functions can easily be added to the prototype, allowing various parameters to be set up for different image contexts. Finally, the user interface module implements the front end, communicating with the other modules.
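As an illustration of the query subsystem's content-based path, the sketch below ranks stored images by matched-key-point count. The SQLite schema (an images table with path and descriptors columns), the pickled descriptor arrays, and the brute-force scan are all hypothetical simplifications of the DBMS-backed design described above.

```python
import pickle
import sqlite3
import numpy as np

def count_matches(query_desc, stored_desc, ratio=0.7):
    """Count key points passing the nearest-neighbour distance-ratio test."""
    if len(stored_desc) < 2:
        return 0
    # Pairwise L2 distances between query and stored descriptors.
    d = np.linalg.norm(query_desc[:, None, :] - stored_desc[None, :, :],
                       axis=2)
    nearest = np.sort(d, axis=1)
    return int(np.sum(nearest[:, 0] < ratio * nearest[:, 1]))

def content_query(db_path, query_desc, top_k=6):
    """Rank stored images by similarity to the query (not exact match)."""
    conn = sqlite3.connect(db_path)
    scores = []
    for path, blob in conn.execute("SELECT path, descriptors FROM images"):
        stored = pickle.loads(blob)  # (n, k) array of local descriptors
        scores.append((path, count_matches(query_desc, stored)))
    conn.close()
    scores.sort(key=lambda s: s[1], reverse=True)  # similarity ranking
    return scores[:top_k]
```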
Experimental Results

Experiments were conducted using MOPS as local grey scale invariant features on chest CT images. An instance of the visual content descriptor is built, followed by the computation of a descriptor for each of the key points; this ensures that the output key points and descriptors are consistent with each other. A comparison is then made for each image pair, using the number of matched key points as the similarity metric. In order to improve computation speed, a sliding window technique is employed to explore the regions of each database image, and the correspondences between those regions and the query region are computed from the MOPS features. Database images are ranked in descending order of their maximum region similarity, and the top-ranked images are taken as the candidate retrieval results.
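A minimal sketch of this ranking step is given below, assuming NumPy. The window size, stride, and data layout are illustrative choices rather than values reported here, and count_matches is the ratio-test helper sketched in the previous section (any matched-key-point counter would do).

```python
import numpy as np

def best_region_score(query_desc, kps, descs, shape, win=128, stride=64):
    """Best window score; kps is (n, 2) x/y coordinates, descs is (n, k)."""
    h, w = shape
    best = 0
    for y0 in range(0, max(h - win, 1), stride):
        for x0 in range(0, max(w - win, 1), stride):
            # Select key points falling inside the current window.
            inside = ((kps[:, 0] >= x0) & (kps[:, 0] < x0 + win) &
                      (kps[:, 1] >= y0) & (kps[:, 1] < y0 + win))
            if inside.sum() >= 2:
                best = max(best, count_matches(query_desc, descs[inside]))
    return best

def rank_images(query_desc, database):
    """database: iterable of (image_id, kps, descs, shape) tuples.
    Returns images in descending order of maximum region similarity."""
    ranked = [(img_id, best_region_score(query_desc, kps, descs, shape))
              for img_id, kps, descs, shape in database]
    return sorted(ranked, key=lambda r: r[1], reverse=True)
```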
Datasets

Three different datasets were used in the validation approach. Chest CT image data is formatted as DICOM, and the grey scale range varies with the datasets. Images are compressed into 256 grey levels in order to speed up the feature extraction procedure. The datasets are described as follows:

• The Computed Tomography Emphysema Database (CTED) [21] contains 115 high-resolution images from 39 subjects, including 9 never-smokers, 10 healthy smokers, and 20 smokers with COPD (chronic obstructive pulmonary disease). Images were acquired in the upper, middle, and lower part of the lung of each patient. This paper uses only the lower part for evaluation, resulting in a set of 37 images;
• The NSCLC-Radiomics collection contains 51,195 CT images. Its lung1 dataset covers 422 non-small cell lung cancer (NSCLC) patients, with pretreatment CT scans and clinical outcome data available for each;
• The Cancer Genome Atlas-Lung Squamous Cell Carcinoma (TCGA-LUSC) data collection contains 29,136 images of 31 patients with lung squamous cell carcinoma, acquired in CT, PET, and NM modalities.

Apart from the first dataset, which is rather small, collections of image samples from the other two datasets were selected for experimental evaluation: 711 images were taken into account in the NSCLC-Radiomics trials, while 1,500 images (CT only) were considered in TCGA-LUSC.

Measuring Instruments

Precision and recall are the fundamental measures used in evaluating search strategies; both have a natural interpretation in terms of probability. Precision and recall were calculated for comparative analysis on the chest CT databases, based on the premise that the top matches returned by the prototype for a particular image should be instances of that same image, possibly from a different chest CT slice or marked and ranked by a different physician. Thus, ground truth was determined by objective, a priori knowledge about the datasets. In these experiments, each image in the database is used in turn as the query image. The retrieval performance of the proposed method is measured in terms of precision, recall, average retrieval precision (ARP), and average retrieval rate (ARR), as given in Equations (1)-(6) [22]:

$P = \dfrac{\#\text{ relevant images retrieved}}{\text{total } \#\text{ of images retrieved}} \times 100$ (1)

$GP = \dfrac{1}{N_1} \sum_{i=1}^{N_1} P_i$ (2)

$ARP = \dfrac{1}{\Gamma_1} \sum_{i=1}^{\Gamma_1} GP_i$ (3)

$R = \dfrac{\#\text{ relevant images retrieved}}{\text{total } \#\text{ of relevant images}} \times 100$ (4)

$GR = \dfrac{1}{N_1} \sum_{i=1}^{N_1} R_i$ (5)

$ARR = \dfrac{1}{\Gamma_1} \sum_{i=1}^{\Gamma_1} GR_i$ (6)

where $N_1$ is the number of relevant images per group, $\Gamma_1$ is the number of groups, and $GP$ and $GR$ denote group precision and group recall.
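As a sanity check, the measures above translate directly into code. The following minimal sketch (the function names are ours) computes the four reported measures from per-query counts grouped as in Equations (2), (3), (5), and (6).

```python
import numpy as np

def precision(n_relevant_retrieved, n_retrieved):
    return 100.0 * n_relevant_retrieved / n_retrieved        # Equation (1)

def recall(n_relevant_retrieved, n_relevant_total):
    return 100.0 * n_relevant_retrieved / n_relevant_total   # Equation (4)

def arp(precisions_per_group):
    """precisions_per_group: one list of per-query precisions per group."""
    gp = [np.mean(p) for p in precisions_per_group]          # Equation (2)
    return float(np.mean(gp))                                # Equation (3)

def arr(recalls_per_group):
    """recalls_per_group: one list of per-query recalls per group."""
    gr = [np.mean(r) for r in recalls_per_group]             # Equation (5)
    return float(np.mean(gr))                                # Equation (6)
```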
Case Studies

Figures 2, 3, and 4 display the retrieval results of a sample query over the CTED, NSCLC-Radiomics, and TCGA-LUSC datasets, respectively. The query image is shown at the top, and the corresponding retrieval results are presented in a 2x3 matrix, in descending order of correspondences with respect to MOPS features.
Figure 2 – Sample query results using the Computed Tomography Emphysema Database.
Figure 3 – Query results using the Cancer Imaging Archive database.
Figure 4 – Query results using the NSCLC-Radiomics dataset.

Figures 5, 6, and 7 show the relationship between precision and recall in a more standard form, namely precision plotted against recall for the CTED, NSCLC-Radiomics, and TCGA-LUSC datasets, respectively.
Results
Figure 5 – PxR plot using the CTED dataset.
Figure 6 – PxR plot using the NSCLC-Radiomics dataset.
Figure 7 – PxR plot using the TCGA-LUSC dataset.

Figures 8 and 9 illustrate the retrieval performance comparison of the proposed approach for all datasets in terms of ARP and ARR.

Figure 8 – ARP x top matches for all datasets.

Figure 9 – ARR x top matches for all datasets.
To determine the number of relevant images in each retrieved set, the median of the number of matching key points was used as a threshold value: if the number of matching key points for a retrieved image is greater than or equal to the threshold, the image is assumed to be relevant. The median was chosen because it is not influenced by extreme values, such as the number of matching points between two identical images. Figures 2, 3, and 4 show the query results when 21, 50, and 52 relevant images were considered, respectively. Smaller subsets (200 and 100 images) were retrieved to enhance performance in the last two datasets.

The PxR plots in Figures 5, 6, and 7 show that precision decreases as recall increases, which is the expected behavior: when few images are retrieved, there is less chance of including irrelevant images in the results, but only a small number of relevant images is returned. As the prototype retrieves more images, recall increases and the chance of retrieving irrelevant ones grows, which leads to decreasing precision. The plots also contain some peaks along the curves, which represent short sequences of images that were retrieved without including irrelevant data.

Figures 8 and 9 display ARP and ARR, respectively. ARP and ARR were calculated by increasing the number of top matches considered by 5 images in each iteration, and the results confirm that the system behaves as expected. The average precision (ARP) decreases as the number of top matches increases, because more images are taken into account and there are therefore more chances of having irrelevant images in the set. In contrast, ARR measures the images retrieved out of a known total of relevant images, which is why the average retrieval rate increases when more top matches are considered.
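The median-threshold relevance rule described above reduces to a few lines of code. In this sketch (our naming), match_counts holds the matched-key-point counts of a retrieved set.

```python
import numpy as np

def relevance_mask(match_counts):
    """Mark retrieved images as relevant when their matched-key-point
    count reaches the median of the retrieved set; the median resists
    extreme values such as the self-match of the query image."""
    counts = np.asarray(match_counts)
    return counts >= np.median(counts)

# Example: relevance_mask([120, 34, 30, 28, 5, 2])
# -> array([ True,  True,  True, False, False, False])  (median = 29)
```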
Discussion

The number of relevant images used for the data analysis was determined by the median, but the ideal scenario would have been to have ground-truth sets of relevant images created by a specialist, so that the efficiency of the algorithm could be measured more precisely. However, ground-truth sets may not be available in real-world applications, and statistical measurements tend to be the applicable solution for dealing with large sets of data. The datasets used in this paper contain images from the upper, middle, and lower part of the lung, and the algorithm was able to retrieve images from the same part as the query in its best results. This indicates that the grey scale invariant features provided by MOPS are an effective method for images with well-defined morphology, such as chest CT images, although they are also sensitive to different acquisition conditions.
Conclusion and Further Work

This paper proposed an algorithm to extract features from chest CT images and use them as a similarity measurement to perform CBMIR. The performance results were based on the median of matching points due to the absence of ground-truth sets for the case studies, but other measurements can be tested in future work to improve the precision with which the system determines how many relevant images exist in a set of retrieved items. The goal of this paper was to provide an automatic approach to image retrieval of chest CT images based on local grey scale invariant features. In order to achieve a convenient interface for integrating image retrieval capability into PACS environments, an adapter was developed. To assess the correctness of query results, an evaluation procedure was carried out in terms of PxR plots, average retrieval precision, and average retrieval rate. The proposed approach is still not optimal, and performance considerations must be taken into account in the near future. The prototype might need some refactoring to achieve improved performance when processing a whole batch of images. More datasets can be tested in the future to evaluate the algorithm and its accuracy in retrieving only images of interest. Clinic and hospital databases usually include images from several different modalities, and the system must be able to retrieve those that originated from the modality required by the query.
References

[1] Gao XW, Qian Y, Hui R. The State of the Art of Medical Imaging Technology: From Creation to Archive and Back. The Open Medical Informatics Journal 2011;5:73-85.
[2] Kumar A, Kim J, Cai W, Fulham M, and Feng D. Content-Based Medical Image Retrieval: A Survey of Applications to Multidimensional and Multimodality Data. Journal of Digital Imaging 2013;26(6):1025-1039.
[3] Ramos J, Kockelkorn T, van Ginneken B, Viergever MA, et al. Supervised Content Based Image Retrieval Using Radiology Reports. Image Analysis and Recognition, LNCS vol. 7325, 2012, 249-258.
[4] Chandratilleke M, Honeybul S. Modifying Clinicians' Use of PACS Imaging. J Digit Imaging 2013;26(6):1008-12.
[5] Fridell K, Edgren L, Lindsköld L, Aspelin P, and Lundberg N. The Impact of PACS on Radiologists' Work Practice. J Digit Imaging 2007;20(4):411-21.
[6] Who.int (2014). WHO | Cancer. [online] Available at: http://www.who.int/mediacentre/factsheets/fs297/en/ [Accessed 20 Dec. 2014].
[7] Hwang KH, Lee H, and Choi D. Medical Image Retrieval: Past and Present. Healthc Inform Res 2012 Mar;18(1):3-9.
[8] Greenspan H, Pinhas AT. Medical Image Categorization and Retrieval for PACS Using the GMM-KL Framework. IEEE Transactions on Information Technology in Biomedicine 2007;11(2).
[9] Lehmann TM, Güld MO, Deselaers T, Keysers D, Schubert H, et al. Automatic Categorization of Medical Images for Content-Based Retrieval and Data Mining. Computerized Medical Imaging and Graphics 2005;29(2).
[10] Avni U, Konen E, Sharon M, and Goldberger J. X-ray Categorization and Retrieval on the Organ and Pathology Level Using Patch-Based Visual Words. IEEE Transactions on Medical Imaging 2011;30(3).
[11] Wang JI, Li YP, Zhang Y, Wang C, Xie HL, et al. Bag-of-Features Based Medical Image Retrieval via Multiple Assignment and Visual Words Weighting. IEEE Transactions on Medical Imaging 2011;30(3).
[12] Tommasi T, Orabona F, and Caputo B. Discriminative Cue Integration for Medical Image Annotation. Pattern Recognition Letters 2008;29(15).
[13] Chapelle O, Schölkopf B, and Zien A. Semi-Supervised Learning. MIT Press, Cambridge, MA, 2006.
[14] Li YF, Kwok JT, and Zhou ZH. Semi-Supervised Learning Using Label Mean. In: Proceedings of the 26th International Conference on Machine Learning, June 14-18, 2009; Montreal, Canada.
[15] Wang F, and Zhang CS. Label Propagation Through Linear Neighborhoods. In: Proceedings of the International Conference on Machine Learning, June 25-29, 2006; Pennsylvania, USA.
[16] Armato SG III, Roberts RY, McNitt-Gray MF, Meyer CR, Reeves AP, et al. The Lung Image Database Consortium (LIDC): Ensuring the Integrity of Expert-Defined Truth. Academic Radiology 2007;14:1455-1463.
[17] Li F, Engelmann R, Metz CE, et al. Lung Cancers Missed on Chest Radiographs: Results Obtained With a Commercial Computer-Aided Detection Program. Radiology 2008;246(1):273-280.
[18] Wang LD, Shou ZX. A New Approach for Chest CT Image Retrieval. International Conference AICI 2009, Shanghai, China, 535-543.
[19] Lowe DG. Distinctive Image Features From Scale-Invariant Keypoints. International Journal of Computer Vision 2004;60:91-110.
[20] Brown M, Szeliski R, and Winder S. Multi-Image Matching Using Multi-Scale Oriented Patches. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2005;(1):510-517.
[21] Sørensen L, Shaker SB, and de Bruijne M. Quantitative Analysis of Pulmonary Emphysema Using Local Binary Patterns. IEEE Transactions on Medical Imaging 2010;29(2):559-569.
[22] Fierro-Radilla AN, Daniel KP, Nakano-Miyatake M, Meana HP, and Benois-Pineau J. An Effective Visual Descriptor Based on Color and Shape Features for Image Retrieval. MICAI (1) 2014:336-348.

Address for correspondence

Marcelo Arrais Porto
[email protected]
Animati Computação Aplicada
Av. Medianeira 1321 – Pavimento 2, Sala 105
97060-003 Santa Maria-RS – Brasil