IMTC 2004 – Instrumentation and Measurement Technology Conference Como, Italy, 18-20 May 2004

Intelligent Advisory System for Screening Mammography

Gábor Horváth, József Valyon, György Strausz, Béla Pataki, László Sragner, László Lasztovicza, Nóra Székely
Budapest University of Technology and Economics, Department of Measurement and Information Systems
H-1521 Budapest, Magyar Tudósok krt. 2.
Email: {horvath, valyon, strausz, pataki, sragner, laszto, szekelyn}@mit.bme.hu

Abstract – The paper presents the first results of a Computer Aided Diagnosis (CAD) system development project for the automatic detection of breast cancer. The primary aim of the R&D project is to develop an intelligent advisory system that analyzes mammographic images and helps radiologists in making diagnoses. One of these results is the set-up of an image library of 10,000 digitized and qualified mammograms. This work has been started, and the library will support research and education both in the medical sciences and in medical engineering. It is also important that, within the frame of the project, a prototype of a mammography workstation will be developed and tested. The research tasks are already in an advanced state: algorithms have been developed and tested for microcalcification detection using hierarchical neural nets, and for lesion characterization using hybrid methods and texture analysis. Current results on a moderate-size database are encouraging, and we can expect further advances from the use of the large image database.

Keywords – Biomedical systems, computer-aided diagnosis, mammography, hybrid intelligent system, neural networks

I. INTRODUCTION

Breast cancer is the most common form of cancer among women and the second major cause of cancer death. According to statistics, 8% of women will develop it in their lifetime. Thanks to recent advances in medicine, effective treatments exist, and the sooner the illness is detected, the more effective the treatment is. If detected early, the five-year survival rate exceeds 95%. Mammography is currently the most effective way of detecting breast cancer early [1]. Women aged 40-49 showed a reduction in mortality of about 17% 15 years after starting screening, and women aged 50-69 showed a reduction of 25-30% [2].

Screening mammography is an X-ray examination. In a mammographic session four X-ray images are taken: two views, typically craniocaudal (CC) and mediolateral (ML), of each of the two breasts. The images are used to detect signs of abnormalities and to judge their severity, that is, to differentiate benign and malignant cases. Most mammograms are normal, but an enormous number of images must be evaluated. In Hungary, for example, screening all women over 40 would mean around one million images per year. Diagnosing this many images takes a great deal of skilled radiologists' time, and the length and monotony of the process can lead to human errors. To reduce the miss rate of human diagnosis, double reading of mammograms is usually applied: two radiologists diagnose all cases independently of each other.

A mammographic advisory system can improve the efficiency of screening either by replacing the second radiologist or by drawing the radiologists' attention to suspicious areas of the images. Such a system could also support specialists by filtering out normal images. This could save time and could help to avoid, or at least reduce the number of, false diagnoses.

This paper deals with the development of such a system by a consortium of medical doctors, intelligent system researchers and software engineers. The main architecture of our CAD system is presented, and some results of the different steps are introduced. The importance of the breast cancer problem has been well known for decades, but the theoretical and technical background for supporting the task with an intelligent CAD system has been available for only a few years. Our project's main contribution to this area is twofold: we develop intelligent algorithms for automatic breast cancer detection, and we are building a large mammography image database in which each image is stored with two independent diagnoses.

First we introduce the system architecture we set up to support the project. Then we describe the main parts of the computer-aided diagnosis solution. The final paper will give detailed results of several algorithms. In this paper the results of microcalcification detection are presented, while an accompanying paper [3] gives some details about mass detection.

II. THE MAMMOGRAPHIC ADVISORY SYSTEM ARCHITECTURE

A CAD system based on a distributed architecture was implemented in the project. This infrastructure was primarily designed to help the development of automatic detection algorithms. The environment supports all parts of the project: scanning the X-ray films, an enhanced viewer that helps radiologists to label the images with diagnosis data, and a standard medical imaging (DICOM) interface for communicating with the automatic diagnosis module.

[Figure 1: block diagram. Labels recoverable from the diagram: Hospital Information System; hospital database; data queries; report loader module; PracticeBuilder schedule list generation; scheduled task upload (HL7 file); acquisition station client with scanner; input of images and attached information; PracticeBuilder client; diagnostic report input; diagnostic report generation; report upload; PracticeBuilder server and database at the hospital; high-speed network; PracticeBuilder mirror server and mirror database at the university; image and attached information retrieval; intelligent diagnostic module; automatic diagnostic; report analysis; generating DICOM structured reports; structured report upload; image diagnostic database.]

Fig. 1. The infrastructure of the project

In the next year the system will mainly be used by the team members of the project: medical doctors, hospital assistants and researchers. On the other hand, the large capacity, the high-quality viewer and the applied medical standards make this software environment a good candidate for the full-featured mammography workstation that hospitals need. The main role of the system is to support the development of an automatic mammography diagnostic system, but it was also an important goal to create a system that can work in the future as a standalone mammographic workstation.

Developing and testing diagnostic algorithms requires a large database of qualified cases. Two large databases are being used in the project. The first is a public database constructed at the University of South Florida [4]. This Digital Database for Screening Mammography (DDSM) contains 2620 cases of four images each; it is a large, varied set of well-diagnosed normal, benign and malignant cases. The second is being constructed by a team of Hungarian medical doctors, and its size will be similar to that of DDSM. The main reason for dealing with two large databases is to use different images for developing the system and for the final testing: DDSM is used in the development phase, while the Hungarian database will serve for testing. In testing it is especially important that the images and all other available information be similar to those of the everyday screening mammography examinations done in Hungarian mammography centers.

The structure of the two databases is different. In DDSM the ratio of cancerous cases to all cases is much higher than in the normal screened population, while the Hungarian database is built from the cases obtained during the everyday activity of a mammographic screening center. The latter database will be used primarily in the testing phase, but the database itself will have significant value. It is therefore necessary to provide a convenient interface for querying and filtering it.

Building the database needs significant human effort both from the assistants who scan and digitize the analogue films and from the radiologists who provide the diagnostic information. An important characteristic of the data store is that a single image is several megabytes in size, so the infrastructure must be able to handle several hundred gigabytes of data. This requires special consideration in designing the data archive and the communication services of the system.

As a result of these requirements, the infrastructure that was built (Figure 1) supports the following main tasks:
- digitization and upload of mammography images from films, attaching relevant patient and image data,
- a graphical interface for viewing and analyzing the images,
- marking, inputting and storing diagnostic information,
- services for searching, filtering and using selected images.

The system is used at two locations in parallel. Image upload and diagnosis are performed at the hospital participating in the project, while the automatic diagnostic algorithms are developed at our university. The development work is supported by a mirror server that is connected to the server in the hospital through a high-speed network.
Concurrent use by multiple users is served by a web client/server solution based on the PracticeBuilder application of ImageMedical Inc.

III. COMPUTER-AIDED DIAGNOSIS TASK IN MAMMOGRAPHY

The task of computer-aided diagnosis for mammography is a very hard and complex one. In several cases the breast tissue is rather dense, so that no details or only a few details can be observed in the mammographic images. These cases are hard to diagnose even for skilled radiologists. A further problem is that normal breast tissue produces many patterns in the images that are very similar to the patterns of malignant areas. Differentiating these patterns is rather hard and can be done only if both the local information gained from the pattern and the global information obtained from all four images of a patient are taken into consideration. A further difficulty may arise if the quality of the images is poor. As a consequence of all these difficulties, even the best human specialists cannot diagnose 100% of the images correctly.

Many reports and papers have been published about mammographic CAD systems (e.g. [4], [5], [6]). Some of them are quite promising; however, no final solution is known that solves all problems of mammographic diagnosis satisfactorily. The goal of our project is to develop a complex system that relieves radiologists of many monotonous tasks. The logical architecture of the system being developed is shown in Figure 2. The roles of the different boxes are explained as follows.

Current Images: In each mammographic session four pictures are taken. In the first processing steps each image is investigated separately. After some steps the localization and classification of the suspected abnormalities are improved using the two images (CC and ML) of the same breast. At the end the asymmetry of the left and right breasts is investigated using, e.g., texture analysis.

Archived Images: If the images of a previous mammographic examination of the same patient are available, the change of the suspected abnormalities is checked.

Anamnesis Data: Several personal parameters of the patient influence the probability of developing breast cancer, e.g. the family record (whether similar diseases occurred in the family), the profession, etc. Therefore the final diagnosis depends on these data as well.

Preprocessing: Several important tasks must be completed before the diagnostic evaluation of the images can start. Because some of the pictures are of poor quality, image enhancement methods should be applied [7]. The parenchymal patterns are patient dependent, and different methods are probably optimal for the diagnosis of different basic patterns. Therefore the basic parenchymal type is determined in the preprocessing step.

Detection: Several methods for determining the location of suspected abnormalities have been suggested in the literature, but none of them gives a solution for every type of abnormality in every tissue. Two types of abnormalities are of primary importance, microcalcifications and lesions, and the reliable detection of both must be solved. Several methods have been suggested especially for microcalcification detection, e.g. classical image processing algorithms, morphological operations, neural network based methods, texture analysis, heuristic methods, wavelet transform based methods, etc. Some of these procedures (e.g. neural nets) can be used for lesion detection as well. There are also special algorithms for suspicious mass localization, such as Iris filtering [8]. Because plenty of algorithms are available and no single algorithm solves even one of the detection problems perfectly, several methods will be applied in parallel in the final system, and their results will be integrated to obtain the final qualification of a case. An important feature of the system is that it will give an explanation of the detected lesions to help radiologists to form the final diagnosis.

Fig. 2. The logical architecture of computer-aided diagnosis

Justification: When suspicious locations are detected in a mammogram, several different algorithms are used to justify or reject the hypothesis. This step is needed because the detection algorithms tend to report 2-5 suspicious locations even in an absolutely normal mammogram. Several parallel methods are used in this step: heuristic methods, shape description and evaluation procedures for lesions, integration of the findings on the four images of the two breasts, comparison of the current images to older ones, etc.

Classification: In the classification step a diagnosis is established based on the images only. The diagnosis consists of the findings and a probability-like measure estimating the reliability of the diagnosis.

Decision: In this final step the anamnesis data are integrated with the information extracted from the images, and the best possible diagnosis is determined.

IV. HYBRID NEURAL APPROACH FOR MAMMOGRAPHIC IMAGE ANALYSES

As the large mammographic database described above is still under development, we mainly used images from the DDSM digital mammography database [4] in our experiments. In addition we used a specimen database developed during our project, which contains X-ray images of tissues removed during breast surgery. The advantage of these specimen pictures is that the findings are histologically proved. In the tests so far we used full images or regions of interest, depending on the application. The number of test images (or regions) used so far depends on the applied approach: in some cases only about 50 images were tested, while in other cases we have test results from more than 1000 images.

The methods developed for lesion characterization consist of two main stages. In the first stage a set of candidate regions of interest is chosen. The methods of this stage have adjustable parameters that are set to produce a very high sensitivity at the cost of low specificity (that is, a relatively high false positive rate). In the second stage these regions are examined for the real presence of lesions in order to reduce false positive detections. The studied methods implementing the first stage mainly use statistical approaches such as texture analysis and co-occurrence matrices, and work with a reasonably high sensitivity. The methods implementing the second stage try to extract possible lesions from the background and then characterize their shapes.

Several different algorithms were developed and tested for microcalcification detection. These methods use entirely different approaches: so far the best results were achieved using classical image processing algorithms and rather complex morphological operations, but a hierarchical model of image features and neural networks to identify individual microcalcifications was also applied [9, 10]. We applied the modified algorithm to regions of interest extracted from images in the DDSM database. The main goal of our examination was to determine the effects of the training samples and image features on the networks.

The variability of the images, and even the variability of the microcalcifications (MCs), is so high that no single procedure was enough to overcome all of the problems. Therefore a multiprocedure system was used to detect MCs. (Other detection procedures that are under development, e.g. wavelet analysis based methods, will be added later.) The basic architecture of the microcalcification detection system is shown in Fig. 3. First, image enhancement and some basic classification of the image (e.g. determination of the parenchymal tissue type) are performed. In the next step the global analysis of the image is done using texture analysis. In this step the regions of interest are found (regions where microcalcification is possibly present). Then an evaluation is made based on the results of multiple different detection procedures.
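The texture-analysis step that proposes regions of interest can be illustrated with a gray-level co-occurrence matrix, the statistic named above for the first stage. The following is a minimal sketch and not the project's implementation: the function names, the pixel offset, the number of gray levels and the two toy patches are all assumptions chosen for illustration, and the paper does not state which texture features were used.

```python
from collections import Counter

def cooccurrence_matrix(image, dx=1, dy=0, levels=4):
    """Normalized gray-level co-occurrence matrix as a nested list.

    image  : list of rows, each a list of integer gray levels in [0, levels)
    dx, dy : offset of the neighbor paired with each pixel (assumed values)
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("image too small for the chosen offset")
    return [[counts[(i, j)] / total for j in range(levels)]
            for i in range(levels)]

def contrast(glcm):
    """Large when neighboring pixels differ strongly in gray level."""
    return sum(p * (i - j) ** 2
               for i, row in enumerate(glcm) for j, p in enumerate(row))

def energy(glcm):
    """Sum of squared entries: large for uniform, repetitive texture."""
    return sum(p * p for row in glcm for p in row)

# Two hypothetical 4x4 patches quantized to 4 gray levels.
smooth = [[1, 1, 1, 1]] * 4
speckled = [[0, 3, 0, 3], [3, 0, 3, 0], [0, 3, 0, 3], [3, 0, 3, 0]]

smooth_glcm = cooccurrence_matrix(smooth)
speckled_glcm = cooccurrence_matrix(speckled)
```

In a first-stage detector of the kind described, features such as these would be computed on a sliding window and thresholded loosely, so that sensitivity stays high and the false positives are left to the second stage.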

[Figure 3: processing chain. Image; Preprocessing (image enhancement, parenchymal tissue, etc.); Segmentation (texture analysis); Regions of interest; Microcalcification detection procedure #1 (Heuristic); Microcalcification detection procedure #2 (Neural); Evaluation (comparison of results: number, position, size, shape of microcalcifications, clustered, etc.).]

Fig. 3. Microcalcification detection
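The evaluation block compares the outputs of the detection procedures by attributes such as position and clustering. The paper does not give the comparison rule, so the sketch below is only one plausible reading: a detection of the heuristic procedure is kept when the neural procedure reports a detection nearby, and a group of detections is flagged as clustered when enough of them lie close together. The function names, the distance limits and the coordinates are hypothetical.

```python
import math

def confirmed_detections(heuristic, neural, max_dist=5.0):
    """Heuristic detections that have a neural detection within max_dist pixels.

    heuristic, neural : lists of (x, y) positions from the two procedures
    max_dist          : assumed agreement radius in pixels
    """
    confirmed = []
    for hx, hy in heuristic:
        if any(math.hypot(hx - nx, hy - ny) <= max_dist for nx, ny in neural):
            confirmed.append((hx, hy))
    return confirmed

def is_clustered(points, radius=20.0, min_count=3):
    """True if some point has at least min_count points (itself included)
    within the given radius. Both limits are assumed values."""
    for px, py in points:
        near = sum(1 for qx, qy in points
                   if math.hypot(px - qx, py - qy) <= radius)
        if near >= min_count:
            return True
    return False

# Hypothetical outputs of procedure #1 (heuristic) and #2 (neural).
heuristic_hits = [(10, 10), (12, 14), (100, 100), (15, 9)]
neural_hits = [(11, 11), (13, 13), (16, 10), (200, 200)]

agreed = confirmed_detections(heuristic_hits, neural_hits)
clustered = is_clustered(agreed)
```

Requiring agreement lowers the false positive count at the price of sensitivity; a union of the two outputs would make the opposite trade.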

Neural networks are well suited for microcalcification detection. The problem is the variability of the size and form of the microcalcifications. To solve this problem a hierarchical neural network structure was suggested by Sajda et al. [10]. In their original processing a pyramidal structure is built using different neural nets for different resolutions of the same image. In our project this neural architecture was used in a neural ensemble context to improve the detection performance. Ensembles improve the performance if the networks are accurate but diverse [11]. Figure 4 shows the basic architecture using 3 levels of resolution. The neural nets are MLPs that are trained independently.

The results in the current phase of the project show that the ratio of true positive detections is larger than 90%. This is a quite promising result. However, the weakness of the current solution is its high false positive ratio. On average 2-3 microcalcification-suspicious regions are detected in every image, which is not too bad compared to the results of other mammographic CAD systems, but which should be reduced significantly. The reduction can be achieved by further analysis of the suspicious areas, where one of the most important steps will be the comparison of the two views and the detection of asymmetry between the images of the left and right breasts.
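The two figures quoted above, the true positive ratio and the number of suspicious regions per image, can be computed from per-image counts as in the following minimal sketch. The function name and the counts are made up for illustration and are not results from the project.

```python
def detection_metrics(per_image):
    """Return (sensitivity, false positives per image).

    per_image : list of (true_findings, detected_true, false_alarms) tuples,
                one tuple per image
    """
    total_true = sum(t for t, _, _ in per_image)
    total_hit = sum(h for _, h, _ in per_image)
    total_fp = sum(f for _, _, f in per_image)
    sensitivity = total_hit / total_true if total_true else float("nan")
    fp_per_image = total_fp / len(per_image) if per_image else float("nan")
    return sensitivity, fp_per_image

# Hypothetical counts for four images; the last image is a normal case.
counts = [(4, 4, 2), (3, 3, 3), (5, 4, 2), (0, 0, 3)]
sensitivity, fp_per_image = detection_metrics(counts)
```

With these invented counts, 11 of 12 true findings are detected and 10 false alarms fall on 4 images, which gives a sensitivity of about 0.92 and 2.5 false positives per image.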

[Figure 4: block diagram. Labels recoverable from the diagram: Processing; Multiresolution; Feature extraction (different for each block); Hierarchical neural block #1; Neural Block #2; Voting; Neural detection.]

Fig. 4. The hierarchical neural network structure of the neural detection scheme
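The multiresolution and voting blocks of Fig. 4 can be sketched as follows. This is an illustration under stated assumptions and not the authors' scheme: the pyramid is built by plain 2x2 averaging, the voting rule is a simple majority, and the scores stand in for the outputs of the independently trained MLPs, which are not reproduced here. All names and numbers are hypothetical.

```python
def downsample(image):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    if len(image) < 2 or len(image[0]) < 2:
        raise ValueError("image must be at least 2x2")
    return [[(image[r][c] + image[r][c + 1]
              + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
             for c in range(0, len(image[0]) - 1, 2)]
            for r in range(0, len(image) - 1, 2)]

def pyramid(image, levels=3):
    """Return the image at `levels` resolutions, finest first."""
    out = [image]
    for _ in range(levels - 1):
        out.append(downsample(out[-1]))
    return out

def vote(scores, threshold=0.5, min_votes=2):
    """Majority vote: accept a candidate if at least min_votes of the
    per-network scores reach the threshold (assumed rule)."""
    return sum(1 for s in scores if s >= threshold) >= min_votes

# A hypothetical 4x4 patch and its three resolution levels.
patch = [[0, 0, 4, 4], [0, 0, 4, 4], [8, 8, 12, 12], [8, 8, 12, 12]]
levels = pyramid(patch, levels=3)

# Stand-in scores of three networks for two candidate locations.
accepted = vote([0.9, 0.7, 0.2])
rejected = vote([0.9, 0.1, 0.2])
```

A majority vote only helps when the networks make different mistakes, which matches the requirement cited in the text that ensemble members be accurate but diverse.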

V. CONCLUSIONS

The paper describes the main goals of a mammographic project. These are: to develop an infrastructure for obtaining scanned images that have been diagnosed by human experts, to develop algorithms (using different approaches) for detecting abnormalities in mammographic images, and to develop a whole advisory system for screening mammography in which all available information is utilized. The first phase of the work is finished: the whole infrastructure has been developed, and it is now used for building the Hungarian database. The second and most important phase is to develop algorithms for detecting abnormalities. While some useful algorithms have been developed (their testing is going on), further algorithms are under development. As the problem is extremely difficult, it can be concluded that only hybrid solutions, in which entirely different approaches are combined, can promise results. The results achieved with these procedures are encouraging for building mammography workstations.

ACKNOWLEDGMENT

This work is sponsored by the Research and Development Secretariat of the Hungarian Ministry of Education under contract IKTA 102/2001.

REFERENCES

[1] L. Tabár, "Diagnosis and In-Depth Differential Diagnosis of Breast Diseases", Breast Imaging and Interventional Procedures, ESDIR, Turku, Finland, 1996.
[2] "PDQ Detection and Prevention - Health Professionals", http://www.cancernet.nci.nih.gov/clinpdq/screening/Screening.for.breast.cancer.Physician.html
[3] N. Székely, N. Tóth and B. Pataki, "A Hybrid System for Detecting Masses in Mammographic Images", Proc. of the IMTC04, Como, Italy, 2004.
[4] M. Heath, K. Bowyer, D. Kopans, R. Moore, K. Chang, S. Munishkumaran and P. Kegelmeyer, "Current Status of the Digital Database for Screening Mammography", in Digital Mammography, N. Karssemeijer, M. Thijssen, J. Hendriks and L. van Erning (eds.), Proc. of the 4th International Workshop on Digital Mammography, Nijmegen, The Netherlands, 1998. Kluwer Academic, pp. 457-460.
[5] Songyang Yu and Ling Guan, "A CAD System for the Automatic Detection of Clustered Microcalcifications in Digitized Mammogram Films", IEEE Trans. on Medical Imaging, vol. 19, no. 2, pp. 115-126, 2000.
[6] B. Verma and J. Zakos, "A Computer-Aided Diagnosis System for Digital Mammograms Based on Fuzzy-Neural Feature Extraction Techniques", IEEE Trans. on Information Technology in Biomedicine, vol. 5, no. 1, pp. 46-54, 2001.
[7] Anil K. Jain, Fundamentals of Digital Image Processing. Prentice-Hall International, 1989.
[8] W.T. Freeman and E.H. Adelson, "The Design and Use of Steerable Filters", IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 13, pp. 891-906, 1991.
[9] P. Sajda and C. Spence, "Training Neural Networks for Computer-Aided Diagnosis: Experience in the Intelligence Community", Proc. of the Pacific Medical Technology Symposium, 17-20 August 1998, Honolulu, Hawaii, pp. 338-394.
[10] P. Sajda and C. Spence, "Learning Contextual Relationships in Mammograms using a Hierarchical Pyramid Neural Network", IEEE Trans. on Medical Imaging, vol. 21, no. 3, pp. 239-250, 2002.
[11] P. Sollich and A. Krogh, "Learning with Ensembles: How over-fitting can be useful", in Advances in Neural Information Processing Systems 8, D. S. Touretzky, M. C. Mozer and M. E. Hasselmo (eds.), MIT Press, pp. 190-196, 1996.