Sparse Representation for Optic Disk Detection - SERC - Indian ...

40 downloads 368 Views 3MB Size Report
Sparse Representation for Optic Disk Detection. Neelam Sinha. International Institute of Information Technology. Bangalore, 560100, India.
Sparse Representation for Optic Disk Detection Neelam Sinha

R. Venkatesh Babu

International Institute of Information Technology Bangalore, 560100, India Email: [email protected]

Supercomputer Education and Research Center Indian Institute of Science Bangalore, 560012, India Email: [email protected]

Abstract—Automatic detection of optic disc (OD) is a crucial step in automated eye image processing. In this paper we present a novel method that utilizes the fact that ODs exhibit similar characteristics like circular shape and serves as the region where blood vessels and nerves converge. A dictionary with training patterns, built out of sub-images, with OD at the center, is utilized. For a given test image, the method examines all subimages of the pre-determined size, and expresses them as linear combinations of basis images in the dictionary, minimizing l1 norm of the solution. The underlying assumption is that all sub-images with OD at the center, lie on a single sub-space. The proposed method was evaluated on the publicly-available database DIARETDB1 and DRIVE, with disjoint dictionary and testing images. Of the 89 images, the OD-center was detected within 2% of normalized error in 88 images. On DRIVE images we obtain accuracy of 85%. Index Terms—Fundal image processing, optic disk detection, sparse representation, l1 -norm minimization

I. I NTRODUCTION The optic disc (OD) is among the most important landmarks in the eye. Automatic OD detection is a critical step in eye image processing, and enables subsequent processing for diagnostics. The salient features of ODs do not vary drastically across the population. The OD is characterized by its nearly circular shape and the fact that it serves as the convergence region of a network of blood vessels and nerves. OD detection is generally followed by extraction of exudates and lesions, and serves as the important initial step for screening processing of several conditions of pathology, such as diabetic retinopathy (DR). II. OD DETECTION METHODS The OD detection methods in literature typically utilize intensity levels, shape characteristics and the presence of fine vasculature, within the OD. In [1], the authors detect OD using basic image processing techniques that include thresholding, detection of object roundness and circle detection using Hough transform. Similar approach is reported in [2], where the OD is localized using morphological operations followed by the Hough transform. Some works have reported methods where the size and brightness of OD have been exploited. In [3], the OD center is approximated as that of the largest brightest connected component. A simple thresholding is used to obtain a binary image to obtain the bright regions. Another work that exploited the brightness of the OD pixels was reported in [4]. Here, the authors identified the candidate regions by clustering the brightest pixels in the intensity images.

This is followed by applying PCA to the candidate regions. The method reported in [5] comprised several pre-processing steps, such as binarization for mask generation followed by equalization. Blood vessel segmentation was carried out using matched filter frame-work in order to match the direction of the vessels at the OD vicinity. Some works have devised methods by customizing concepts of multi-scale approaches and filtering, for OD detection. The work reported in [6] proposed a method to automatically detect the OD in fundus images in two steps. The OD vessel candidate detection was followed by OD vessel candidate matching. In the first step multi-scale Gaussian filtering, scale production, and double thresholding were carried out to initially extract the directional map of the vessels. The obtained map was then thinned before another threshold was applied to remove pixels with low intensities. This result yielded the OD vessel candidates. This is followed by a Vessels Directional Matched Filter (VDMF) of various dimensions. VDMF was applied to the candidates to be matched, and the pixel with the smallest difference was designated the OD center. In [7], the authors reported an image processing algorithm for OD localization in low-resolution color fundus images. The authors proposed a combination of a Hausdorff-based template matching technique on edge maps, directed by a pyramidal decomposition for large scale object tracking. In this paper we present a method for automatic detection of OD without pre-processing, and is also free of assumptions of intensity levels, shape and contrast. The organization of the paper is as follows : The proposed method is presented in Section 3. Section 4 describes the data-set used, results obtained and their validation. In section 5, various aspects of the proposed work are discussed. Finally conclusion is presented in section 6. III. P ROPOSED METHOD The proposed template-based method is an attempt to detect the OD without requiring sensitive pre-processing steps such as equalization, enhancement, edge-detection and morphological operations. This method is an adaptation of the face detection technique proposed by Wright et al. reported in [8]. The proposed method exploits the fact that the circular outline of the OD is its most distinguishing feature. Although the Green channel provides maximum contrast, it is observed that the circular outline of the OD is prominently marked out in the Red channel image or the Gray intensity image, leading

to easier processing. The method hinges on the fact that the salient characteristics, such as the circular shape of the OD as well as the OD area serving as convergence region of a crisscross of nerves and the blood vessel networks, remain consistent across all sub-images that contain OD at their center. It is assumed that all sub-images containing OD at the center, lie in a single subspace, and that any given sub-image, that contains OD at its center, can be expressed as a linear combination of the positive samples among the training images. The pixel values of each of the sub-images are concatenated to form basis vectors {v1 , v2 , . . . vn } in the dictionary while similarsized sparse matrices with a single non-zero element for each of the pixel locations {r1 , r2 , . . . rm } are taken as negative training samples in the dictionary matrix, A. For a given test-image whose sub-image y is considered, l1 -minimization based frame-work is used for OD detection as out-lined below. A. l1 -minimization based OD detection Let us assume that the sub-images with the OD at their center constitute a sub-space, with its basis elements defined by the pixel values of the training sub-images {v1 , v2 , . . . vn }, while similar-sized sparse matrices with a single non-zero element for each of the pixel locations {r1 , r2 , . . . rm } form the negative training samples in the dictionary. Hence we have, A = {v1 , v2 , . . . vn , r1 , r2 , . . . rm }

(1)

We assume that a given sub-image y, can be expressed as a linear combination of the basis elements and that y = α1 v1 +α2 v2 +. . .+αn vn +β1 r1 +β2 r2 +. . .+βm rm (2) Re-writing as y = Ax

(3)

where x is the co-efficient vector given as, x = [α1 , α2 . . . αn , β1 , β2 . . . βm ]

(4)

It is clear that the coefficient vector x determines the nature of the sub-image in question. If the sub-image contains OD at its center, it would belong to the sub-space spanned by the positive training basis {v1 , v2 , . . . vn }, and hence the coefficient vector would resemble {α1 , α2 , . . . αn , 0, 0, . . . 0}

(5)

However, if a given test image does not contain OD at its center then it will not lie in the sub-space of interest. Hence when we attempt to express the image as a linear combination of the basis elements in dictionary A, the resulting coefficients will not be concentrated about the basis elements {v1 , v2 , . . . vn } corresponding to the positive training samples. Hence the coefficient vector will have values spread across the negative basis elements and would resemble {0, 0, . . . 0, β1 , β2 , . . . βm }

(6)

After processing an entire image, in order to accentuate the weightage accorded to the coefficients corresponding to the

positive basis elements, the classification-index (ck ) corresponding to the kth sub-image is determined as, Pn αi (7) ck = Pmi=1 |β j| j=1 is computed. The sub-image which contains OD at the center will result in the maximum value of the classification index ck . Hence the peak value of classification index will correspond to the location of the OD. Typically the system of equations encountered here, y = Ax

(8)

is under-determined, with no unique solution. In order to determine the optimal value of x, xopt , certain cost functions are minimized. In the proposed method l1 -norm has been minimized to obtain sparse solutions. (l1 ) : xopt = arg min ||x||1 s.t Ax = y

(9)

Conventionally, the solution is chosen as that which minimizes the l2 norm. However, minimizing l2 norm leads to solutions that are generally dense, with large nonzero entries corresponding to several basis vectors. Such a solution would not be of utility in this context, since we seek to distinguish between the OD-class and the rest. Hence we are required to seek the sparsest solution, which would mean solving the following optimization problem : (l0 ) : x0 = arg min ||x||0 s.t Ax = y

(10)

This solution would count the number of nonzero entries in the co-efficient vector. However, this problem is NP-hard [9], and there is no known procedure for finding the sparsest solution that is significantly more efficient than exhaustively searching through all possible solutions. Recent developments in sparse representation and compressed sensing [10] have led to the results that if the optimal solution xopt is sufficiently sparse then the l0 -norm minimization problem is equivalent to solving the l1 -norm minimization problem : (l1 ) : x1 = arg min ||x||1 s.t ||Ax − y||2 ≤ ǫ

(11)

As is well known, this convex optimization problem can be efficiently solved via cone programming [11]. The solution would contain coefficients that are sparse and their magnitudes decay rapidly, of the form : x = {0 . . . 0, 0, ap . . . aq , 0 . . . 0}

(12)

In the present context, the length of the coefficient vector is far larger than the number of the positive basis elements, and hence we should indeed look for the sparsest solution. Since we want the non-zero entries in the coefficient vector x to describe if the input sub-image contains the OD at the center or not, we look at the combined contributions from the coefficients corresponding to the positive training samples, captured in the classification index c.

B. Algorithm : Sparse Representation-based OD Classification 1) Input an array of n sub-images, of size say n1 × n2 with OD at the center. Concatenate the pixel values to form the basis vectors {v1 , v2 . . . vn }. For each pixel location insert a sparse matrix of size n1 × n2 with a single non-zero element. Concatenate these sparse matrix entries to form basis vectors {r1 , r2 . . . rm }, where m = n1 × n2 . Now form the dictionary matrix A = [v1 v2 . . . vn r1 . . . rm ]. 2) Normalize the columns of A to have unit l2 -norm. 3) For a given test image, repeat steps 4,5 for all subimages y of size n1 × n2 . 4) Solve the l1 -minimization problem, for a chosen value of ǫ : (l1 ) : x1 = arg min ||x||1 s.t ||Ax − y||2 ≤ ǫ

(13)

5) Compute the classification index of the sub-image Pn αi (14) c = Pmi=1 |β j| j=1 6) The peak value of classification index corresponds to the sub-image with OD at the center. IV. R ESULTS Publicly available datasets (DIARETDB1), available at [12] and DRIVE [13] were used to test the proposed method. In the reported experiments, the dictionary contained positive training samples from 30 instances of sub-images hand-marked from various images. We choose to down-sample the images to size 20 × 20, for ease of computation in the subsequent steps. The obtained results are validated by cross-checking the obtained results against the hand-marked OD center for each of the images. The plot in Fig. 1 shows the deviation computed as Euclidean distance, in images from DIARETDB1 database.

(a)

(b)

(c)

(d)

Fig. 2. Sample results of the proposed method on images of DIARETDB1 database. The ’+’ indicates the OD center as detected by the proposed method.

levels. Besides, contrast levels also vary within each image. Of particular interest is the image seen in 2(a), since the OD is hard to distinguish even to the human eye. Figure 3 shows the only image where the proposed method failed to detect the OD center correctly. Sample results on DRIVE database are illustrated in Fig. 4.

Fig. 3. One of the images on which the proposed method failed to detect the OD center correctly 350

Table I summarizes the performance of the method on each of two databases.

||Actual − Detected OD center||

300

250

TABLE I P ERFORMANCE OF THE PROPOSED METHOD

200

150

Database No. of images Success Accuracy (%)

100

50

DIARETDB1

DRIVE

89 88 98.8

40 34 85

0 0

10

20

30

40

50

60

70

80

90

Image number Fig. 1. Performance measured in terms of the deviation from the detected OD center compared to the actual OD center (DIARETDB1 database)

The Figs.2(a-d) show the representative results obtained using the proposed method on images in DIARETDB1 database. It is clearly seen that the images have varying illumination

V. D ISCUSSION Programs written in MATLAB were used for the experiments. These programs were interfaced with the implementation of the sparse-coding routines, based on [14], [15], available at the website [16]. A typical run on an image of size 1500 × 1100, required on an average 50 seconds on a

the deviation in the result obtained using the l2 -minimizer is greater than that using the l1 -minimizer.

(a)

(b)

Fig. 6. Illustration that detection with l1 -minimizer (+) is better than that with l2 -minimizer (x)

(c)

(d)

Fig. 4. Results of the proposed method on DRIVE database. The ’+’ indicates the OD center as detected by the proposed method.

||Actual − Detected OD center||

1000 l −minimizer

900

2

l −minimizer 1

800 700 600 500 400 300

The most computationally-intensive part of the proposed method is the solution of the linear system of equations Ax = y, where A is the dictionary matrix, of size 400 × 430; the number of pixels in each of the OD sub-images of size 20 × 20 is 400. The size of coefficient vector x is 430 × 1, while that of the test sub-image y is 400 × 1. However, if we had retained the original size of the image, the OD sub-image would have been of size 200 × 200, making the dictionary size 40000 × 40030. Hence down-sampling helps in drastically cutting down on computational complexity and time required. An appropriate value must be chosen for a given image size such that the salient features of the OD are not lost, although down-sampled.

200

VI. C ONCLUSION

100 10

20

30

40

50

60

70

80

Image Number Fig. 5. Deviation of the detected OD center (DIARETDB1 database) compared to the actual OD center using : (i) l1 minimizer (ii) l2 minimizer

Desktop with 3.06 GHz Intel Core i3 CPU and 4GB RAM. The proposed method exploits the fact that the salient features of the OD do not drastically vary from one image to another, and that they all lie on a single sub-space. Comparing results using the method reported in [1], it is clear that the results obtained using the two methods is comparable. However, the authors of [1] utilize 5 of the DRIVE images among the training images. In contrast, the proposed method outlined in this paper uses disjoint training and test data-sets. The plot in Fig. 5 shows the comparison in performance estimated as measure of deviation in estimation of OD center using l1 - minimizer versus l2 -minimizer. It is clear that the l1 minimizer outperforms the l2 -minimizer. An instance of where the performance with l1 -minimizer is more accurate compared to that with the l2 -minimizer is shown in Fig. 6. The OD center as detected by the l1 -minimizer is shown as a ’+’ mark while that detected by the l2 -minimizer is shown using ’x’. Clearly

In this paper, we have presented a novel method for location of the OD in fundus images. The method is template-based and works in l1 -minimization framework. A dictionary, generated with OD images as the positive training samples and similarsized sparse matrices, is used. All sub-images that contain the OD at the center are assumed to form a sub-space. Hence, when a sub-image with the OD at the center is expressed as a linear combination of the basis elements from the dictionary, minimizing the l1 -norm, the corresponding coefficients are sparse and far higher than the rest. The sub-image that contains the OD at the center is identified by locating the peak values of the coefficients. The proposed method was successfully applied on the publicly available datasets such as DIARETDB1 and DRIVE. R EFERENCES [1] Mira Park, Jesse S. Jin, and Suhuai Luo, “Locating the optic disc in retinal images,” in Proceedings of the International Conference on Computer Graphics, Imaging and Visualisation (CGIV’06), 2006, pp. 141–145. [2] W. Al-Nuaimy S. Sekhar and A. K. Nandi, “Automated localization of optik disk and fovea in retinal fundus images,” in Proceedings of the 16th European Signal Processing Conference (EUSIPCO), 2008. [3] T. Walter and J.C. Klein, “Segmentation of color fundus images of the human retina: Detection of the optic disc and the vascular tree using morphological techniques,” in Proceedings of the 2nd International Symposium on Medical Data Analysis, 2001, pp. 282–287.

[4] Huiqi Li and Opas Chutatape, “Automatic location of optic disk in retinal images,” in Proceedings of the xxth International Conference on Image Processing, 2001, pp. 837–840. [5] Aliaa Abdel-Haleim Abdel-Razik Youssif, Atef Zaki Ghalwash, and Amr Ahmed Sabry Abdel-Rahman Ghoneim, “Optic disc detection from normalized digital fundus images by means of a vessels direction matched filter,” IEEE Transactions on Medical Imaging, vol. 27, no. 1, pp. 11–18, ”Jan” 2008. [6] Bob Zhang and Fakhri Karray, “Optic disc detection by multi-scale gaussian filtering with scale production and a vessels directional matched filter,” Medical Biometrics Lecture Notes in Computer Science, vol. 6165, pp. 173–180, 2010. [7] Mario Beaulieu Marc Lalonde and Langis Gagnon, “Fast and robust optic disc detection using pyramidal decomposition and hausdorff-based template matching,” IEEE Transactions on Medical Imaging, vol. 20, no. 11, pp. 1193–1200, ”Nov” 2001. [8] John Wright, Allen Y. Yang, Arvind Ganesh, S. Shankar Sastry, and Yi Ma, “Robust face recognition via sparse representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, ”Feb” 2009. [9] E. Amaldi and V. Kann, “On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems,” Theoretical Computer Science, vol. 209, pp. 237–260, 1998. [10] D. L. Donoho, “For most large underdetermined systems of linear equations the minimal l1-norm solution is also the sparsest solution,” Communications in Pure and Applied Math, vol. 59, pp. 797–829, 2004. [11] S. Chen, D. Donoho, and M. Saunders, “Atomic decomposition by basis pursuit,” SIAM Review, vol. 43, no. 1, pp. 129–159, 2001. [12] WebLink, ,” DIARETDB1, http://www2.it.lut.fi/project/imageret/diaretdb1/. [13] WebLink, ,” DRIVE, http://www.isi.uu.nl/Research/Databases/DRIVE/. [14] J. Mairal, F. Bach, J. Ponce, and G. Sapiro, “Online learning for matrix factorization and sparse coding,” Journal of Machine Learning Research,, vol. 11, pp. 19–60, 2010. [15] J. Mairal, F. Bach, J. Ponce, and G. Sapiro, “Online dictionary learning for sparse coding.,” in International Conference on Machine Learning, 2009. [16] WebLink, ,” SPAMS, http://www.di.ens.fr/willow/SPAMS/downloads.html.

Suggest Documents