Image Understanding for Automated Retinal Diagnosis M.H. Goldbaum, N.P. Katz, S. Chaudhuri*, M. Nelson**
Department of Ophthalmology and *Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA 92093
**Data Vector, 1775 Homet Road, Pasadena, CA 91106

Image understanding involves not only image enhancement and segmentation, but also object classification and scene analysis. In images of the retina, the domain of detectable objects is limited to a finite, workable number; therefore, the feasibility of implementing a successful image understanding system is high.
Interpretation of images of the ocular fundus by the STARE (STructured Analysis of the REtina) system requires image enhancement, object segmentation, [illegible], object identification, and scene analysis. We state how these steps are performed and linked, and we have had some success with the STARE system in each of the steps. We are currently able to segment the blood vessels, optic nerve, fovea, and bright lesions automatically. We describe the methods for these tasks and the development yet to complete the production of a database of objects that form a coded description of the image. For the final step, interpreting the image, we found a neural network to be able to learn to diagnose from the type of information in the coded description of the image.

This paper will focus on three parts of a retinal image understanding system: 1) the techniques used to automatically identify blood vessels, the optic disk, and a number of lesions, 2) the knowledge-description language which we developed to describe the various disease manifestations in the retinal image, and 3) the integration of neural networks into the classification and scene analysis stages of the system.
Image understanding has been described as a series of steps of information processing associated with computer vision: image enhancement, segmentation, object classification, and scene analysis (Figure 1). In general, image enhancement transformations allow certain features to be emphasized, corrected, or removed, providing the user with an output image which can be evaluated more easily for certain criteria. These routines are not designed to provide any high-level information about the presence of certain objects which may be of interest to the user. For example, although selected features may become much more apparent on the display screen after performing image processing, information is still stored as individual pixels, each of which may or may not have any relationship to its neighbors.
INTRODUCTION
A significant part of the ophthalmologist's job is to analyze the retina, optic nerve, pigment epithelium, and choroid in the ocular fundus. Interpreting fundus photography and fluorescein angiography has become a common part of ophthalmologic practice. In the STARE project (STructured Analysis of the REtina), we are developing an image-understanding system to diagnose diseases from fluorescein angiograms and color images of the retina automatically and to analyze sequential images for change [1]. This system requires a complex sequence of tasks to accomplish these goals. We will describe the steps and the current level of development for each step.

Image segmentation is the process by which a region of pixels that share a common property is separated from the rest of the image to produce symbolic information [4]. Object classification is the stage at which segmented objects are identified. Individual objects must be classified as to the real-world objects they represent. This presents an especially challenging task when different objects of interest to the user, any or all of which may be present in an image simultaneously, have similar features. In such instances, we use pattern recognition techniques, specifically statistical classification, based on knowledge about the features in each type of object.
The aim of our research is to extend the capabilities and productivity of the ophthalmologist and to provide decision support to physicians. STARE will allow automated analysis of large numbers of photographs or stored color images and fluorescein angiograms of the ocular fundus to alert the ophthalmologist to photographs and angiograms that need closer scrutiny (alert function). This process will extend the number of photographs that can be analyzed and help ophthalmologists not to overlook diagnoses that should be investigated (remind function). All objects of interest in an image will be measured with more precision than is routinely done now (quantification function). Pairs of images will also be analyzed for change over time (compare function).
Once classification is complete, a list of objects in a scene can be reported, providing their relative locations. If a computer is to accurately report a list of potential diagnoses associated with a medical image, all possible manifestations of the diseases of interest must be coded into the computer's knowledge base. The task of analyzing combinations of objects detected in an image of the retina in order to report a list of potential diagnoses forms the basis of our image understanding system.
Supported by NIH grant EY05996
0195-4210/89/0000/0756$01.00 C 1989 SCAMC, Inc.
STARE SYSTEM OVERVIEW
A fundus camera provides color images and fluorescein angiograms that can be directly digitized or captured as 35mm slides. Our hardware presently includes a 25 MHz Compaq-386. This computer is complemented with a 300 MB hard disk and a set of peripherals which allow us to digitize, display, archive, and analyze images efficiently and reliably. The peripherals include a JVC color video camera, a Matrox image-processing board with a frame grabber and a neighborhood coprocessor, a Hitachi high resolution color monitor, and an Optimem one gigabyte digital laser optical storage system.
Color photographs are digitized into separate RGB planes with spatial and intensity resolutions limited by the scanning apparatus and the memory capacity of the frame grabber (ours is 512 x 480 pixels x 8 bits per color). Fluorescein angiograms (FA) offer another view of the retina that provides a different set of objects. These images are captured as monochrome (gray scale) images. Thus, the first major task of our retinal image understanding system is to find and classify each of the objects in color images or in FAs.
SEGMENTATION
In all images, attempts are made to find objects in the following order: blood vessels, the optic nerve, the fovea, and all lesions (Table 1). The lesions are generally divided into objects brighter or darker than the background.
Table 1: Partial list of fundus oculi objects in order sought

Objects resembling blood vessels
    Blood vessels
    Thick edges
Optic nerve
Bright objects
    Drusen
    Cotton wool spots
    Large white regions (retinitis)
    Exudates
    Focal choroidal atrophy or scars
    Subretinal fibrosis
    Amelanotic tumors
Dark objects
    Hemorrhage
    Pigmented scar
    Hyperpigmentation of the retinal pigment epithelium
    Nevus
    Melanoma

Figure 1: Steps in image understanding from digitized image to interpreted image.

OVERVIEW OF IMAGES OF THE OCULAR FUNDUS
STARE has been successful at automatically segmenting color fundus images and reporting the locations of normal and abnormal objects in these images. The normal objects are blood vessels, the optic disk, and the fovea. Abnormal objects are any lesions or normally present objects with abnormal characteristics. For example, blood vessels may be thin, thick, or tortuous. The optic disk may show pallor in optic atrophy or a deep red color, enlargement, and blurred edges in papilledema. Yellow lesions include exudates, cotton-wool spots, and drusen. Examples of red lesions are retinal hemorrhages and microaneurysms. The segmentation process produces a list of isolated objects, reporting their location, size, orientation, and various features required for classification, including color, shape, texture, and edge sharpness. These features are used to identify the objects. A coded description of the image is built by reporting the location and dimensions of normal and abnormal objects.
In the ocular fundus, the objects are essentially two-dimensional and are illuminated and viewed from only one point, through the pupil. We know in advance the types of objects that are being sought; therefore, we can use some of the features of these objects to design filters matched for the segmentation process.
Features of retinal objects used for segmentation
The blood vessels usually appear as curvilinear red structures that branch or cross and become smaller the farther they are traced from the optic nerve. The vessel is almost always darker than the adjacent retina, although in some patients with pigmented fundi the background is darker. An inverted, Gaussian-shaped, zero-sum matched filter rotated through twelve discrete angles of 15° each is used to segment and identify piecewise-linear segments of blood vessels, as shown in Figure 2.
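The kernel described above can be sketched as follows. This is a minimal illustration, not the STARE implementation: the profile width `sigma`, segment `length`, and kernel `half_size` (all in pixels) are assumed values that the text does not give, and `best_orientation` is a hypothetical helper that scores a single image patch instead of convolving a whole image.

```python
import math

def vessel_kernel(theta_deg, sigma=2.0, length=9, half_size=8):
    """Inverted-Gaussian line kernel at one orientation, shifted to sum to zero.

    sigma, length, and half_size are illustrative assumptions, in pixels.
    """
    theta = math.radians(theta_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    size = 2 * half_size + 1
    kernel = [[0.0] * size for _ in range(size)]
    support = []
    for row in range(size):
        for col in range(size):
            x, y = col - half_size, row - half_size
            u = x * cos_t + y * sin_t    # offset across the vessel
            v = -x * sin_t + y * cos_t   # offset along the vessel
            if abs(u) <= 3 * sigma and abs(v) <= length / 2:
                # Vessels are darker than the retina, so the profile is inverted.
                kernel[row][col] = -math.exp(-u * u / (2 * sigma * sigma))
                support.append((row, col))
    # Subtract the mean over the support so the kernel sums to zero and
    # gives no response on a uniform background.
    mean = sum(kernel[r][c] for r, c in support) / len(support)
    for r, c in support:
        kernel[r][c] -= mean
    return kernel

# Twelve orientations, 15 degrees apart, covering 0 to 165 degrees.
KERNELS = [vessel_kernel(15 * k) for k in range(12)]

def best_orientation(patch, kernels=KERNELS):
    """Return (index, response) of the kernel that best matches a square patch."""
    size = len(patch)
    responses = [
        sum(k[r][c] * patch[r][c] for r in range(size) for c in range(size))
        for k in kernels
    ]
    best = max(range(len(kernels)), key=lambda i: responses[i])
    return best, responses[best]
```

Because each kernel sums to zero, a uniform patch produces no response, while a dark line on a brighter background produces a positive response that is largest for the kernel aligned with the line.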
Most retinal diseases can be classified by using a finite number of manifestations and lesion types. Once objects have been identified by classification, techniques for scene analysis can be developed to determine diagnoses based on the objects present in each image and their relationships.
Figure 3: Left: fundus photograph showing background diabetic retinopathy. Right: automatically segmented image, with labels for the optic disk, exudate, hemorrhage, fovea, and blood vessels.
Figure 2: Three-dimensional representation of the kernel used to detect blood vessels.
nerve, blood vessels, fovea, bright lesions, and dark lesions have been found without user intervention
The optic nerve stands out from the retina as a bright, pinkish, vertically oval disk about 1.5 mm in diameter with a sharp edge, large vessels originating from it, and a bright center. The large retinal vessels are mostly oriented vertically at the optic nerve. The algorithms used include a) a circularly symmetric, Gaussian-shaped, zero-sum convolution kernel for a bright spot 1.5 mm in diameter, b) a vertically oriented matched filter kernel for blood vessels, and c) the convergence of vessels as estimated by the density of straight lines representing a least-squares fit of long vessel segments. The center of the disk is chosen by selecting the location of the maximum likelihood estimate, using a weighted sum of these three features.
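The final selection step can be illustrated with a small sketch. It assumes the three feature maps have already been computed on a common grid; the equal default weights and the min-max normalization are assumptions made for illustration, since the text does not state how the features are scaled or weighted.

```python
def normalize(feature_map):
    """Rescale a 2-D feature map to the range [0, 1] (an assumed preprocessing step)."""
    flat = [v for row in feature_map for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0
    return [[(v - lo) / span for v in row] for row in feature_map]

def locate_disk(brightness, vertical_vessels, convergence, weights=(1.0, 1.0, 1.0)):
    """Return ((row, col), score) at the maximum of the weighted sum of three maps."""
    maps = [normalize(m) for m in (brightness, vertical_vessels, convergence)]
    best_score, best_pos = float("-inf"), None
    for r in range(len(maps[0])):
        for c in range(len(maps[0][0])):
            score = sum(w * m[r][c] for w, m in zip(weights, maps))
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```

Combining the maps means that a bright lesion, which scores highly on brightness alone, is outvoted by a location that is supported by all three features.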
Feature measurements for classification of objects
Features useful for classification include color, size, shape, texture, edge slope, and irregularity of the edge. Lesions vary widely in the vector of their features. As an example, drusen are round yellow lesions ranging from 15 µm to 400 µm in diameter. The edges are moderately sharp. They may be isolated or closely grouped. Individual drusen may be discrete, touching, or coalescent. Retinal hemorrhages are dark red, somewhat round or oval, and range from 15 to 700 µm in diameter or along the long axis. They may be isolated or close to other hemorrhages.
Classification of objects
The fovea (about 500 µm in diameter) is avascular and generally appears as a gradually darkening spot with a bright dot in the center, about 3.75 mm temporal to the center of the disk. The fovea is localized with filters matching a) the dark spot in the green plane and b) the avascular zone in the blood vessel image, both in the appropriate location with respect to the disk.
Objects that differ in appearance can be classified with a single feature. For object types that are similar in appearance, however, simple color or intensity thresholding is insufficient, and other features of the objects must be considered. We are building a collection of measurements of features of objects in training sets for use in statistical classification.
The bright lesions are detected with two techniques. One method (template matching) uses a circularly symmetric, Gaussian-shaped convolution kernel designed to detect small (500 µm) lesions. By resampling the image down by factors of 2, 4, and 8 and applying the same kernel to each scaled image, the four resulting images can be rescaled and combined to yield bright lesions of all sizes. In the second method (edge segmentation), the matched filter designed for blood vessels is also used to find thick edges. We threshold the filtered images to retain all pixels below a given threshold and then invert the gray-level map to reveal the regions between the edges in white and the detected edges in black. The isolated objects are counted with a fill routine which groups adjacent white pixels in the binary image.
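The fill routine in the last step can be sketched as a breadth-first flood fill. This is an illustrative version only: it assumes 4-connectivity for "adjacent" pixels, which the text does not specify, and `count_objects` is a hypothetical name.

```python
from collections import deque

def count_objects(binary):
    """Group adjacent white (truthy) pixels and return one pixel list per object.

    Adjacency is taken to be 4-connectivity, which is an assumption.
    """
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Start a new object and fill outward from this seed pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append(pixels)
    return objects
```

The length of the returned list is the object count, and the length of each pixel list gives a first size measurement for that object.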
Object database
Once all the objects of interest in an image have been identified, the shape, size, and location of each object are entered into a database which forms a description of the image. The database allows querying using any characteristic field. Statistics can easily be computed over all the processed images in the database (training set). This provides a convenient method of computing the probability of finding certain objects in the image from a patient with a given diagnosis.
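As a rough illustration of the statistic described above, the probability of finding an object type given a diagnosis can be estimated as the fraction of images with that diagnosis in which the object appears. The record layout (`diagnosis`, `objects`) is a hypothetical schema chosen for this sketch, not the structure of the STARE database.

```python
from collections import Counter, defaultdict

def object_probabilities(records):
    """Estimate P(object present | diagnosis) as a fraction of images.

    Each record is a dict with a 'diagnosis' string and an 'objects'
    collection; this schema is assumed for illustration.
    """
    images_per_dx = Counter(rec["diagnosis"] for rec in records)
    hits = defaultdict(Counter)
    for rec in records:
        # Count each object type at most once per image.
        for obj in set(rec["objects"]):
            hits[rec["diagnosis"]][obj] += 1
    return {
        dx: {obj: n / total for obj, n in hits[dx].items()}
        for dx, total in images_per_dx.items()
    }
```

For example, with two images of background diabetic retinopathy of which one contains exudates, the estimate for exudates given that diagnosis is 0.5.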
SCENE ANALYSIS
Dark lesions, which include hemorrhages, nevi, tumors, photocoagulation spots, and pigmented scars, require specialized algorithms. The signal to noise ratio between dark objects and the background retina is much lower than that of bright lesions. As a result, methods based on matched filters, absolute intensity, or color are not accurate. Therefore it has been necessary to develop different methods of detecting these dark, low-contrast objects. Since dark lesions occur as large clumps of dark pixels, relaxation is effective at grouping pixels belonging to dark lesions.
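The text does not give the relaxation update that was used. As one possible reading, the sketch below applies a two-label (dark lesion versus background) probabilistic relaxation in the style of Rosenfeld, Hummel, and Zucker: each pixel starts with a probability of belonging to a dark lesion, and its 8 neighbors support the label they currently favor. The initial probabilities, the neighborhood, and the compatibility values of +1 and -1 are all assumptions.

```python
def relax(prob, iterations=5, eps=0.01):
    """Iteratively reinforce dark-lesion probabilities that agree with their neighbors.

    prob holds initial per-pixel probabilities of the 'dark lesion' label.
    The update rule is an assumed two-label relaxation, not the authors' method.
    """
    rows, cols = len(prob), len(prob[0])
    # Keep probabilities away from exactly 0 and 1 so the update stays defined.
    p = [[min(max(v, eps), 1 - eps) for v in row] for row in prob]
    for _ in range(iterations):
        new = [row[:] for row in p]
        for r in range(rows):
            for c in range(cols):
                neighbors = [
                    p[rr][cc]
                    for rr in range(max(r - 1, 0), min(r + 2, rows))
                    for cc in range(max(c - 1, 0), min(c + 2, cols))
                    if (rr, cc) != (r, c)
                ]
                if not neighbors:
                    continue
                # Net neighbor support for 'dark', in the range [-1, 1].
                q = sum(2 * n - 1 for n in neighbors) / len(neighbors)
                num = p[r][c] * (1 + q)
                den = num + (1 - p[r][c]) * (1 - q)
                if den > 0:
                    new[r][c] = num / den
        p = new
    return p
```

In this scheme, a pixel inside a clump of dark pixels gains probability on each pass, while an isolated dark pixel surrounded by background loses it, which is the grouping behavior the paragraph describes.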
Knowledge-description language for the ocular fundus
As part of scene analysis, the system must use information about the presence and location of lesions to determine the diagnostic probabilities. The tasks then are 1) to represent the information in the fundus image in a way that allows the system to distinguish among the diagnostic possibilities, 2) to represent the knowledge that is used to identify the differential diagnosis set that most appropriately matches the manifestations in the image, and 3) to reason with the information in the images and the knowledge in the knowledge base to conclude the differential diagnosis.
Figure 3 shows the automated segmentation of a color image from a patient with diabetic retinopathy. The optic
We developed an image-description language to symbolize the information in ophthalmologic images (Figure 4). We wanted to express the location of a lesion with respect to the disk or fovea, the pattern of lesions of the same type, and the special relationship that may exist between two lesions of different types that occur together.
language to create a coded description of the image in a form acceptable for input into the neural network. For each manifestation, up to 11 units were used, depending on the manifestation, each corresponding to one of the regions or a combination of regions. An extra unit represented presence of a manifestation. The network had 158 input units, 5 hidden units, and 9 output units. Nine diagnoses were chosen, covering diseases involving blood vessels in the retina or choroid: background diabetic retinopathy, proliferative diabetic retinopathy, macroaneurysm, Coats' disease, central retinal artery occlusion, central retinal vein occlusion, branch retinal vein occlusion, macular degeneration, and normal (Figure 5). The appearance of the fundus in the selected diseases ranged from markedly different (e.g., central retinal artery occlusion and macular degeneration) to similar (e.g., branch retinal vein occlusion and proliferative diabetic retinopathy). The gold standard of the diagnosis for each case was determined independently by two retinal specialists.

Figure 4: Left: symbolic representation of manifestations in a patient with proliferative diabetic retinopathy. X = exudates, B = blot hemorrhage, gray = preretinal hemorrhage. Right: 11 regions that are significant in the physician's reasoning process. Per = periphery, Pap = papilla, Mac = macula, ASN = arcuate zone, superior nasal, PMIT = perimacular zone, inferior temporal.
For localization of a lesion, we divide the ocular fundus into 11 regions that we feel are significant in the physician's reasoning process (Figure 4, right). The macular region is 1500 µm in diameter. The papillary region includes a zone that extends 750 µm from the edge of the disk. The perimacular zone and the arcuate zone are split horizontally to account for the general division of the retinal circulation into superior and inferior halves. These zones are also separated vertically in case there is importance in knowing whether lesions are temporal or nasal. We join some regions where we feel that increasing the spatial resolution would not be helpful. The peripheral zone is not subdivided, since we felt there would be little knowledge gained by partitioning this region.

Table 2: Manifestations of disease used for classification by backpropagation

Manifestations: Dot hemorrhage/microaneurysm; Blot hemorrhage; Flame-shaped hemorrhage; Preretinal hemorrhage; Exudates; Ring exudates around microaneurysms; Cotton-wool spots; White retina; Telangiectasis; Neovascularization
    Units: 11 regions, 1 for presence
    Values: 0 absent, 0.5 low density, 1.0 high density

Manifestations: Drusen, fine; Drusen, large
    Units: 5 regions, 1 for presence
    Values: 0 absent, 0.5 low density, 1.0 high density

Manifestations: Subretinal neovascularization; SubRPE or subretinal hemorrhage
    Units: 7 regions, 1 for presence
    Values: 0 absent, 0.5 < half area, 1.0 > half area

Manifestations: Papillary atrophy; Papilledema
    Units: 1 region
    Values: 0 normal, 0.5 mild, 1.0 severe

Manifestations: Arteries, general constriction; Arteries, focal constriction; Arteries, general dilatation; Arteries, focal dilatation; Veins, general constriction; Veins, focal constriction; Veins, general dilatation; Veins, focal dilatation
    Units: 1 unit
    Values: 0 absent, 0.25 = 1 quadrant, 0.5 = 2 quadrants, 0.75 = 3 quadrants, 1.0 = 4 quadrants

Artificial intelligence techniques
One approach to determining the relationship between the manifestations and the diagnoses is serial symbolic processing in an expert system which selects the rule that best fits the current situation. For an expert system, the knowledge must already be known by some expert and must be inserted into the system to form the knowledge base.
Another approach is to use a neural network that will learn from a teaching set of representative images of known diagnoses. By learning from a teaching set, the network builds a knowledge base from which it can reason to generate a differential diagnosis set from new images. Different models of neural networks perform optimally for specific problems. We chose the backpropagation program from the programs of McClelland and Rumelhart, because it is an appropriate model for classification when the clustering is predetermined (in our case, into diagnoses). This model is particularly useful in pattern matching and classification.
For each of the nine diagnoses, there were 11 patients, totalling 99 patients. One patient from each disease was extracted to create a test set of nine cases. The remaining 10 patients per disease constituted the matching teaching set of 90 cases. Eleven different test sets were extracted, creating 11 different corresponding teaching sets. The patients in a test set were evaluated in a network filled with the weights generated by the matching teaching set. Cross validation was performed by accumulating the errors in diagnosing all 11 test sets.
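The partitioning described above can be sketched as follows. How patients were assigned to each of the 11 test sets is not stated, so this version simply takes the k-th patient of every diagnosis for fold k; `train` and `predict` are hypothetical callables standing in for teaching the network and diagnosing a case.

```python
def leave_one_per_disease_out(cases_by_disease):
    """Yield (teaching_set, test_set) pairs, one test case per diagnosis per fold.

    Taking the k-th case of each diagnosis for fold k is an assumption.
    """
    n_folds = min(len(cases) for cases in cases_by_disease.values())
    for k in range(n_folds):
        test = [(dx, cases[k]) for dx, cases in cases_by_disease.items()]
        teach = [
            (dx, case)
            for dx, cases in cases_by_disease.items()
            for i, case in enumerate(cases)
            if i != k
        ]
        yield teach, test

def cross_validate(cases_by_disease, train, predict):
    """Accumulate errors over all folds and return the overall accuracy."""
    errors = total = 0
    for teach, test in leave_one_per_disease_out(cases_by_disease):
        model = train(teach)  # weights generated by the matching teaching set
        for dx, case in test:
            errors += predict(model, case) != dx
            total += 1
    return 1 - errors / total
```

With 9 diagnoses and 11 patients each, this yields 11 folds of 90 teaching cases and 9 test cases, and every one of the 99 patients is tested exactly once against a network that never saw that patient.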
We have investigated the ability of a neural network to learn to discriminate between diagnoses, given an input of a set of lesions in color photographs of the ocular fundus. In order to simulate the output we expect from the segmentation and object-identification process, we manually identified and located the lesions in 99 images. The data in each image were reduced to symbols in a sketch (Figure 4, left). We selected 24 manifestations that would discriminate the diseases selected (Table 2). We used the image-description
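The unit counts in Table 2 are consistent with the 24 manifestations and the 158 input units reported for the network, which the sketch below checks while building an input vector. The ordering of manifestations, the ordering of regions within a manifestation, and the placement of the presence unit after the region units are assumptions made for illustration; `encode` is a hypothetical helper.

```python
# (manifestations, region units, presence units) for each group in Table 2.
GROUPS = [
    (["dot hemorrhage/microaneurysm", "blot hemorrhage", "flame-shaped hemorrhage",
      "preretinal hemorrhage", "exudates", "ring exudates around microaneurysms",
      "cotton-wool spots", "white retina", "telangiectasis", "neovascularization"], 11, 1),
    (["drusen, fine", "drusen, large"], 5, 1),
    (["subretinal neovascularization", "subRPE or subretinal hemorrhage"], 7, 1),
    (["papillary atrophy", "papilledema"], 1, 0),
    (["arteries, general constriction", "arteries, focal constriction",
      "arteries, general dilatation", "arteries, focal dilatation",
      "veins, general constriction", "veins, focal constriction",
      "veins, general dilatation", "veins, focal dilatation"], 1, 0),
]

def build_layout(groups=GROUPS):
    """Map each manifestation to (offset, region units, presence units)."""
    layout, offset = {}, 0
    for names, n_regions, n_presence in groups:
        for name in names:
            layout[name] = (offset, n_regions, n_presence)
            offset += n_regions + n_presence
    return layout, offset

LAYOUT, N_INPUTS = build_layout()

def encode(findings):
    """Build the input vector from {manifestation: [value per region unit]}."""
    vector = [0.0] * N_INPUTS
    for name, values in findings.items():
        offset, n_regions, n_presence = LAYOUT[name]
        if len(values) != n_regions:
            raise ValueError(f"{name} needs {n_regions} values")
        vector[offset:offset + n_regions] = values
        # Presence unit (assumed to follow the region units) is set when
        # the manifestation appears in any region.
        if n_presence and any(v > 0 for v in values):
            vector[offset + n_regions] = 1.0
    return vector
```

The arithmetic is 10 x 12 + 2 x 6 + 2 x 8 + 2 x 1 + 8 x 1 = 158 units, so `N_INPUTS` matches the input layer size given for the network.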
small lesions with low contrast. Occasionally, false positives appeared at islands surrounded by three vessels. Feature extraction should provide good discrimination between these islands and actual lesions. Hemorrhage segmentation was more difficult since the signal-to-noise ratio between these dark objects and their background is usually poor. Nonetheless, with the relaxation algorithm we were able to identify more than 80% of the individual blot hemorrhages in several images, with less than 5% false positives.
Figure 5: Organization of backpropagation network used to classify diseases from manually segmented images. PDR = proliferative diabetic retinopathy, MacroAn = macroaneurysm, Coats' = Coats' disease, CRAO = central retinal artery occlusion, CRVO = central retinal vein occlusion, BRVO = branch retinal vein occlusion, MacDg = macular degeneration.

Cross validation indicated that the taught networks were 83% accurate in diagnosing new cases from information in the images. We determined that accuracy could be improved by providing more experience (a larger, more complete teaching set) and by improving the image-description language.

1. N. Katz, M. Goldbaum, M. Nelson, S. Chaudhuri. "An image processing system for automatic retinal diagnosis." SPIE Three-Dimensional Imaging and Remote Sensing Imaging, vol. 902, pp. 131-137, 1988.
2. P. Cohen, E. Feigenbaum. The Handbook of Artificial Intelligence III. Menlo Park, Addison-Wesley, p. 127, 1982.
RESULTS
3. T. Kanade, R. Reddy. "Computer vision - the challenge of imperfect inputs." IEEE Spectrum, vol. 20, no. 11, p. 90, 1983.
4. R. Nevatia. "Image segmentation." In T. Y. Young, K. S. Fu (eds.), Handbook of Pattern Recognition and Image Processing. New York, Academic Press, p. 224, 1986.
Thirty images with lesions were processed and segmented. Classification was performed on blood vessels and the optic nerve. Classification algorithms for bright objects (e.g., yellow lesions) and dark objects (e.g., hemorrhages) are still being developed. The matched filter used for blood vessel segmentation performed equally well over all the images tested. All vessels larger than 35 µm were detected. Continuity and edge localization were preserved. Due to the shape of the filter, some false positives occurred along edges of bright lesions. Once the lesions were segmented, however, these edges could be removed.
5. S. Chaudhuri, S. Chatterjee, N. Katz, M. Nelson, M. Goldbaum. "Detection of blood vessels in retinal images using two-dimensional matched filters." IEEE Transactions on Medical Imaging, vol. 8, no. 3, pp. 1-5, September 1989.
6. S. Chaudhuri, S. Chatterjee, N. Katz, M. Goldbaum. "Automatic detection of the optic nerve in retinal images." Proceedings IEEE International Conference on Image Processing, Singapore, vol. 1, pp. 1-5, 1989.
7. A. Rosenfeld, A. C. Kak. Digital Picture Processing. New York, Academic Press, 1982.
8. J. L. McClelland, D. E. Rumelhart. Explorations in Parallel Distributed Processing: A Handbook of Models, Programs, and Exercises. Cambridge, MIT Press, 1988.
9. D. E. Rumelhart, J. L. McClelland (eds.). Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1: Foundations. Cambridge, MIT Press, 1986.
The optic disk was successfully localized in 28 of the images. In those images where disk detection failed, anomalies such as a large bright lesion close to the disk or optic disk atrophy caused the features used in the maximum likelihood estimator to fail to converge exactly at the disk. One way to improve this method would be to use a higher-level structural representation of the blood vessel network, which always converges at the disk.
Although lesion classification has not been performed, segmentation yielded 95% of the yellow lesions, including