Proceedings of the 19th International Conference on Digital Signal Processing

20-23 August 2014

A New Efficient Dictionary and its Implementation on Retinal Images

Damber Thapa (1), Kaamran Raahemifar (3) and Vasudevan Lakshminarayanan (1, 2, 4)

(1) School of Optometry and Vision Science, University of Waterloo, Waterloo, Canada
(2) Physics and Electrical and Computer Engineering, University of Waterloo, Waterloo, Canada
(3) Electrical and Computer Engineering, Ryerson University, Toronto, Canada
(4) Department of Physics and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48104, USA

ABSTRACT

Sparse representation of signals and images using an overcomplete basis function (dictionary) has attracted a lot of attention in the literature recently. Atoms of a dictionary are either chosen from a predefined set of functions (e.g. sine, cosine or wavelets) or learned from a training set (KSVD). Recently, a nonlinear (NL) dictionary has been proposed by adding NL functions, such as polynomial, rational, logarithmic, exponential, and phase-shifted and higher-order cosine functions, to the conventional Discrete Cosine Transform (DCT) atoms. In this paper, we present a comprehensive performance comparison of various NL functions that are added to the DCT dictionary. The NL dictionary is also compared with other known dictionaries, such as the DCT, Haar and KSVD-based learned dictionaries, for sparse image reconstruction. In the second part, the NL dictionary is exploited for sparsity-based image denoising. Retinal images are used for the analysis.

978-1-4799-4612-9/14/$31.00 © 2014 IEEE
DSP 2014

Index Terms: Denoising, reconstruction, dictionary, sparsity, nonlinear, image processing, ophthalmology, retinal images, OCT

1. INTRODUCTION

Sparse representation is the approximation of an image or signal by a linear combination of only a small set of basis vectors called atoms [1]. The sparse representation of signals has attracted a lot of attention in the last few decades and has been used extensively in many applications, such as image reconstruction, noise reduction, pattern classification, data compression, super-resolution, feature extraction and, most recently, compressive sensing. The sparse representation of a signal $x \in \mathbb{R}^{N}$ using an over-complete dictionary $D \in \mathbb{R}^{N \times K}$, $N < K$, can be formulated as follows [2]:

$$\min_{\alpha} \|\alpha\|_p \quad \text{subject to} \quad x = D\alpha \qquad (1)$$

where $\alpha$ is the sparse coefficient vector and $\|\cdot\|_p$ is the vector norm. The problem becomes an $\ell_0$ pseudo-norm minimization when p = 0. The goal is to find a coefficient vector $\alpha$ such that both $\|D\alpha - x\|_2$ and $\|\alpha\|_0$ are minimized. This is an NP-hard problem [1]. Such problems are very complicated; however, an approximate solution is possible using iterative methods such as Orthogonal Matching Pursuit (OMP) [3]. Another approach is to transform the problem into an $\ell_1$ norm minimization (usually called Basis Pursuit) by replacing p = 0 in Eq. 1 with p = 1 and then solving it with linear programming methods [4].

The signal representation error depends strongly on the choice of the dictionary; therefore, the dictionary is chosen so that the error is minimized. There have been extensive studies on constructing basis functions that result in better image analysis. The dictionary D can be either chosen as a pre-specified set of functions (analytical-based dictionary) or learned from a training set to fit a given set of signals (learning-based dictionary) [1]. Choosing an analytical dictionary is simple in that the atoms are created using a stationary function, such as sine, cosine, or wavelet functions. The Discrete Cosine Transform (DCT) dictionary is created using a cosine function. Similarly, wavelet functions can be used to create overcomplete wavelet, Gabor, contourlet, and curvelet dictionaries. A learning-based method such as KSVD uses an analytical-based dictionary, or a dictionary created from image patches, as an initial dictionary, which is then modified in an iterative process to yield a dictionary that fits a given set of signals [1].

Recently, we have proposed a novel analytical-based dictionary called a nonlinear (NL) dictionary [5]. The NL dictionary is constructed by adding NL functions, such as polynomial, rational, logarithmic, exponential, and phase-shifted and higher-order cosine functions, to the conventional DCT atoms. In this paper we provide comprehensive performance comparisons of the NL dictionary with other known over-complete analytical-based dictionaries, such as DCT and Haar, as well as with a learning-based dictionary (KSVD), for retinal image reconstruction. The NL dictionary is also tested for removing noise from Optical Coherence Tomography (OCT) images. Besides PSNR, we calculated the Structural Similarity (SSIM) index, which provides a further means of comparing the performance of different dictionaries. The SSIM index measures the similarity between the original and reconstructed images. It accounts for luminance change, contrast change, and structural change between the two images.
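The greedy approximation mentioned in the introduction can be illustrated with a minimal OMP sketch. This is not the MATLAB code used in the paper; it is a plain-Python illustration that assumes unit-norm atoms stored as a list of columns, and the helper names (`dot`, `least_squares`, `omp`) and the tolerance `tol` are hypothetical. At each step it picks the atom most correlated with the residual, then refits all selected coefficients by least squares.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def least_squares(cols, x):
    """Least-squares coefficients for x over the columns in `cols`,
    obtained by solving the normal equations with Gaussian elimination."""
    m = len(cols)
    # Augmented Gram system [A^T A | A^T x].
    G = [[dot(cols[r], cols[c]) for c in range(m)] + [dot(cols[r], x)]
         for r in range(m)]
    for p in range(m):  # forward elimination with partial pivoting
        piv = max(range(p, m), key=lambda r: abs(G[r][p]))
        G[p], G[piv] = G[piv], G[p]
        for r in range(p + 1, m):
            f = G[r][p] / G[p][p]
            for c in range(p, m + 1):
                G[r][c] -= f * G[p][c]
    coef = [0.0] * m
    for r in range(m - 1, -1, -1):  # back substitution
        s = G[r][m] - sum(G[r][c] * coef[c] for c in range(r + 1, m))
        coef[r] = s / G[r][r]
    return coef


def omp(atoms, x, sparsity, tol=1e-12):
    """Return a coefficient vector alpha with at most `sparsity` nonzeros."""
    residual, support, coef = list(x), [], []
    for _ in range(sparsity):
        candidates = [j for j in range(len(atoms)) if j not in support]
        if not candidates:
            break
        # Atom most correlated with the current residual.
        best = max(candidates, key=lambda j: abs(dot(atoms[j], residual)))
        if abs(dot(atoms[best], residual)) < tol:
            break  # residual is already (numerically) explained
        support.append(best)
        cols = [atoms[j] for j in support]
        coef = least_squares(cols, x)  # refit on the whole support
        residual = [xn - sum(c * col[n] for c, col in zip(coef, cols))
                    for n, xn in enumerate(x)]
    alpha = [0.0] * len(atoms)
    for j, c in zip(support, coef):
        alpha[j] = c
    return alpha


# Toy over-complete dictionary: 4 unit-norm atoms in R^3.
s = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [s, s, 0.0]]
x = [3.0, 3.0, 1.0]
alpha = omp(atoms, x, sparsity=2)
```

In this toy case the diagonal atom and the third coordinate atom together reproduce x exactly, so two nonzero coefficients suffice. For image work, a linear-algebra library would normally replace the hand-written solver.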

Figure 1: Performance of each set of atoms of the NL dictionary. D1 = DCT; D2 = phase-added DCT; D3 = phase-added DCT plus polynomial and exponential functions; D4 = phase-added DCT plus polynomial, exponential and step functions.


2. NONLINEAR DICTIONARY

The following equation can be used to create DCT dictionary atoms:

$$d_i = \cos\left(\frac{2\pi}{N}\, i\, n\right), \quad n = 1, 2, \ldots, k \qquad (2)$$

where k represents the size of the discrete signal. Marsousi [6] has shown that adding extra phase components to the conventional DCT atoms improves image reconstruction. Therefore, a set of evenly spaced phases between 0 and 2π is incorporated into the conventional DCT to create the phase-added DCT. The phase-added DCT atoms are generated using Eq. 3, where $\varphi$ represents the phase term added to the DCT atom and k represents the size of the discrete signal.

$$d_i = \cos\left(\frac{2\pi}{N}\, i\, n + \varphi\right), \quad n = 1, 2, \ldots, k \qquad (3)$$

We further added other vector sets, namely sets of polynomial, rational, logarithmic, and exponential functions, to construct a novel NL dictionary. The first set of atoms included in the NL dictionary is the phase-added DCT atoms created by Eq. 3. The second set of atoms consists of the discrete polynomial atoms (Eq. 4), where a, b and c are predefined sets of coefficients. The coefficients are generated using a fixed spacing scheme such that $a_i = a_1 + \Delta i$, where i represents the number of atoms in each order of the polynomial.

$$P_i = (a_i\, x[n] + b_i)^{c_i}, \quad n = 1, 2, \ldots, k \qquad (4)$$

The third set of atoms is created by taking the log transform of the polynomial, as shown in Eq. 5.

$$L_i = \log(P_i) \quad \text{for } c_i > 0 \qquad (5)$$

The fourth set of atoms consists of exponential atoms. These atoms are created by placing all the root functions into the exponential transform, as shown below:

$$E_{a_i} = e^{d_i} \qquad (6)$$

$$E_{b_i} = e^{P_i} \qquad (7)$$

Figure 2: Performance comparison of Haar, NL and KSVD dictionaries on fundus and OCT images. PSNR is calculated between the reconstructed image and its original.

Finally, a set of step functions ($S_i$) is added to the NL dictionary. All the atoms created using DCT, polynomial, logarithmic, exponential and step functions are concatenated to create the NL dictionary (D), as shown in Eq. 8.

$$D = [\, d_i \;\; P_i \;\; L_i \;\; E_{a_i} \;\; E_{b_i} \;\; S_i \,] \qquad (8)$$

Once the atoms are arranged, they are scanned for discontinuities, the dictionary columns are normalized, and duplicate atoms are removed. This new dictionary can be used for various sparsity-based image processing techniques. Other functions can also be added to the NL dictionary; however, the addition of new atoms increases both the size and the complexity of the dictionary.
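The construction in Eqs. 2 to 8 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation. The sampling grid x[n] = n/k, the choice N = k, the parameter defaults, and the function names are all hypothetical. The "scan for discontinuities" step is interpreted here as dropping atoms with non-finite or near-zero norm. Duplicates are detected after rounding to 9 decimals.

```python
import math


def cosine_atom(i, k, N, phase=0.0):
    """Eq. 2 (phase = 0) or Eq. 3: cos(2*pi/N * i * n + phase), n = 1..k."""
    return [math.cos(2.0 * math.pi / N * i * n + phase)
            for n in range(1, k + 1)]


def polynomial_atom(a, b, c, k):
    """Eq. 4 on the assumed grid x[n] = n / k: (a * x[n] + b) ** c."""
    return [(a * n / k + b) ** c for n in range(1, k + 1)]


def step_atom(pos, k):
    """Step function that switches from 0 to 1 at sample index `pos`."""
    return [0.0] * pos + [1.0] * (k - pos)


def build_nl_dictionary(k=8, n_freq=4, phases=(0.0, math.pi / 2),
                        a_values=(1.0, 2.0), b=1.0, orders=(1, 2)):
    """Concatenate the atom families of Eq. 8, then normalize and deduplicate."""
    raw = []
    for i in range(n_freq):
        for phi in phases:
            d = cosine_atom(i, k, N=k, phase=phi)
            raw.append(d)                              # Eq. 3
            raw.append([math.exp(v) for v in d])       # Eq. 6
    for a in a_values:
        for c in orders:
            p = polynomial_atom(a, b, c, k)
            raw.append(p)                              # Eq. 4
            raw.append([math.log(v) for v in p])       # Eq. 5 (p > 0 here)
            raw.append([math.exp(v) for v in p])       # Eq. 7
    raw.extend(step_atom(pos, k) for pos in range(1, k))

    atoms, seen = [], set()
    for atom in raw:
        norm = math.sqrt(sum(v * v for v in atom))
        if not math.isfinite(norm) or norm < 1e-12:
            continue                                   # degenerate atom
        unit = [v / norm for v in atom]                # column normalization
        key = tuple(round(v, 9) for v in unit)
        if key not in seen:                            # duplicate removal
            seen.add(key)
            atoms.append(unit)
    return atoms


D = build_nl_dictionary()
```

With these defaults the zero-frequency cosine atom and its exponential both normalize to the same constant vector, so the duplicate-removal step visibly shrinks the dictionary.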


3. EXPERIMENTAL RESULTS

The neural layer of the eye, the retina, has many important features, namely the macula, fovea, optic disc and retinal blood vessels, which are visualized using fundus photography. Optical Coherence Tomography (OCT) [7] is another well-established method for taking cross-sectional images of the retina. Recent developments in technology allow retinal images to be obtained from remote areas and thereby allow diagnosis and treatment when a specialist is not present [8]. Since uncompressed retinal images are large, they are often compressed before being sent to the ophthalmologist. Sparse representation is one such technique with potential application in telemedicine.

The NL dictionary atoms were used for sparse coding of the retinal images. Fundus and OCT images were used for simulation. The fundus images were taken with a fundus camera (Non-Mydriatic Auto Fundus Camera, Nidek AFC-230, Japan) from the right eye of one of the authors (DT), who has no ocular pathology. The OCT image of a healthy participant was provided by Sankara Nethralaya Eye Hospital, Chennai, India. The images were 512×512 pixels. MATLAB (version R2013a) was used to code the reconstruction programs. The MATLAB source code for KSVD dictionary learning was downloaded from Ron Rubinstein's webpage: http://www.cs.technion.ac.il/~ronrubin/software.html.

The images were broken down into 8×8 patches and sparse coding was performed on each patch separately. The OMP algorithm was used as the optimization method for sparse coding. The total number of nonzero coefficients (sparsity constant) was fixed to 1, 5, 9, 13, 17, 21, 25 and 29 for eight different cases. Finally, the image was reconstructed from the sparse coefficients as x = Dα.

To test the effect of different sets of atoms, the sparse coding was performed separately with (a) DCT atoms, (b) phase-added DCT atoms, (c) phase-added DCT atoms plus polynomial and exponential atoms, and (d) phase-added DCT atoms plus polynomial, exponential and step functions. The results are shown in Figure 1 for three fundus images and one OCT image. The PSNR improves with the addition of each of these types of atoms. The improvements in PSNR with the addition of polynomial, exponential and step functions are noticeable for all fundus and OCT images. The DCT and NL dictionaries were the same size, so the improvement is not due to a larger dictionary.

The performance of the NL dictionary is also compared with the analytical-based Haar dictionary and the KSVD-based learned dictionary. Figure 2 plots PSNR and Figure 3 plots SSIM against the total number of nonzero coefficients. The PSNRs and SSIMs obtained from the NL dictionary are better than those of the Haar dictionary but slightly lower than those obtained from the learned dictionary when fewer nonzero sparse coefficients are used; however, the NL dictionary performs better than both the Haar and learned dictionaries when a large number of nonzero sparse coefficients is used. Figure 4 shows the fundus image reconstructed using only 9 nonzero sparse coefficients per block with the Haar, NL and KSVD-based learned dictionaries.

In the second part, we exploited our NL dictionary for reducing speckle noise in the OCT image. The sparsity-based denoising is obtained by minimizing the following cost function [9]:

$$\min_{\alpha} \|D\alpha - x\|_2^2 + \lambda \|\alpha\|_0 \qquad (9)$$


Figure 3: Performance comparison of Haar, NL and KSVD dictionaries on fundus and OCT images. SSIM is calculated between the reconstructed image and its original.


Figure 4: Fundus image reconstructed with 9 non-zero sparse coefficients using Haar, NL, and KSVD dictionaries.

Figure 5: Performance of NL and KSVD dictionaries for removing speckle noise from OCT image.

In Eq. 9, λ is a constant. Although there are many new methods for denoising natural images via sparse representation techniques, we employed the simple block averaging method proposed by Elad and Aharon [9], since the aim of this paper is to test the performance of the NL dictionary. In this method, the retinal images were divided into 8×8 overlapping patches. The sparse coding of each patch was carried out using the OMP algorithm. The standard deviation σ of the noise in the test OCT image was estimated using the MATLAB code published by Fang et al. [10], and the stopping criterion for the OMP algorithm was set to patchsize × c × σ, where c = 1.9 is a constant chosen empirically. The patches were reconstructed from the sparse coefficients. Finally, all overlapping parts of the patches were averaged using the following equation [9]:

$$\hat{x} = \left(\lambda I + \sum_k R_k^T R_k\right)^{-1} \left(\lambda x + \sum_k R_k^T D \alpha_k\right) \qquad (10)$$
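Because each $R_k$ simply selects the pixels of patch k, the matrix $\sum_k R_k^T R_k$ is diagonal and counts how many patches cover each pixel. Eq. 10 can therefore be evaluated pixel by pixel as a weighted average. The following plain-Python sketch illustrates this under stated assumptions: the function name `average_patches`, the dictionary-of-patches layout, and the toy 3×3 image are hypothetical and not taken from the paper.

```python
def average_patches(noisy, patches, patch_size, lam):
    """Per-pixel evaluation of Eq. 10.

    noisy      : 2-D list holding the noisy image x
    patches    : dict mapping the top-left corner (row, col) of patch k to
                 its denoised patch D * alpha_k (patch_size x patch_size list)
    patch_size : side length of the square patches
    lam        : the weight lambda on the noisy image (must be > 0 if some
                 pixel is not covered by any patch)
    """
    rows, cols = len(noisy), len(noisy[0])
    # Numerator starts as lambda * x, denominator as lambda * I.
    numer = [[lam * noisy[r][c] for c in range(cols)] for r in range(rows)]
    denom = [[lam for _ in range(cols)] for _ in range(rows)]
    for (r0, c0), patch in patches.items():
        for dr in range(patch_size):
            for dc in range(patch_size):
                numer[r0 + dr][c0 + dc] += patch[dr][dc]  # R_k^T D alpha_k
                denom[r0 + dr][c0 + dc] += 1.0            # R_k^T R_k
    return [[numer[r][c] / denom[r][c] for c in range(cols)]
            for r in range(rows)]


# Toy example: an all-zero 3x3 "noisy" image and four overlapping 2x2
# patches whose denoised values are all 1.
noisy = [[0.0] * 3 for _ in range(3)]
ones = [[1.0, 1.0], [1.0, 1.0]]
patches = {(r, c): ones for r in range(2) for c in range(2)}
denoised = average_patches(noisy, patches, patch_size=2, lam=1.0)
```

In the toy example the centre pixel is covered by four patches and a corner pixel by one, so the centre is pulled further toward the patch value than the corner is.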

In Eq. 10, $\hat{x}$ is the denoised image, $R_k$ is the matrix that extracts the k-th patch from the image, and I is the identity matrix. The mean-to-standard-deviation ratio (MSR) and the contrast-to-noise ratio (CNR) were calculated from the original and denoised images to estimate the quality of denoising. These two objective quality metrics are defined as follows:

$$\mathrm{MSR} = \frac{x_f}{\sigma_f} \qquad (11)$$

$$\mathrm{CNR} = \frac{x_f - x_b}{\sqrt{0.5\,(\sigma_f^2 + \sigma_b^2)}} \qquad (12)$$

where $x_f$ and $\sigma_f$ are the mean and standard deviation of the foreground region, and $x_b$ and $\sigma_b$ are the mean and standard deviation of the background region [10]. Figure 5 (top) shows a background region (blue rectangle) and 5 foreground regions (white rectangles). The performance of the NL dictionary was compared with that of the KSVD-based learned dictionary (Table 1). The CNR and MSR are calculated at 5
different foreground regions of the image, and the mean ± STD is reported. Fig. 5 shows the performance of the NL and learned dictionaries for removing speckle noise from the OCT image. The CNR and MSR obtained with the NL dictionary (4.59±1.83 and 3.62±1.36) are comparable to those of the KSVD-based learned dictionary (4.50±1.78 and 3.55±1.32).

Table 1: CNR and MSR of the original and denoised images. The first row is for the original image, while the remaining rows are for the images denoised using the NL and KSVD dictionaries.

Dictionary        CNR          MSR
Original image    3.56±1.20    2.94±0.95
NL                4.59±1.83    3.62±1.36
KSVD              4.50±1.78    3.55±1.32

4. CONCLUSION

In this paper we compared the NL dictionary with various kinds of over-complete dictionaries for image reconstruction in terms of PSNR and SSIM. We demonstrated the improvement in PSNR and SSIM indices obtained by adding various kinds of NL functions to a conventional DCT dictionary. The NL dictionary includes a diverse set of atoms; therefore, it is able to reconstruct both harmonic and non-harmonic signals. The NL dictionary provided PSNR and SSIM indices that were better than those of the Haar dictionary and comparable to those of the KSVD-based learned dictionary. The advantage of the NL dictionary over a learning-based dictionary is that it is analytical, so it does not require a lengthy learning process. The NL dictionary can also be used in sparsity-based image denoising, since it is capable of reducing speckle noise in the OCT image. The analytical-based dictionary can be used for sparse representation of images, especially when a shorter execution time is required. This may have applications in telemedicine.

5. REFERENCES

[1] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing over-complete dictionaries for sparse representation,” IEEE Transactions on Signal Processing, 54(11), 4311-4322, 2006.

[2] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, “Non-local sparse models for image restoration,” in IEEE 12th International Conference on Computer Vision, 2272-2279, 2009.

[3] J.A. Tropp, “Greed is good: Algorithmic results for sparse approximation,” IEEE Transactions on Information Theory, 50(10), 2231-2242, 2004.

[4] S.S. Chen, D.L. Donoho, and M.A. Saunders, “Atomic decomposition by basis pursuit,” SIAM Review, 43(1), 129-159, 2001.

[5] M. Tharmalingam and K. Raahemifar, “A nonlinear dictionary for image reconstruction,” ICASSP, 2013.

[6] M. Marsousi, “Variable length K-SVD: A new dictionary learning approach and multistage OMP method for sparse representation,” Master’s Thesis, Department of Electrical and Computer Engineering, Ryerson University, Canada, 2012.

[7] J.G. Fujimoto, C. Pitris, S.A. Boppart, and M.E. Brezinski, “Optical coherence tomography: an emerging technology for biomedical imaging and optical biopsy,” Neoplasia, 2(1-2), 9-25, 2000.

[8] K. Yogesan, F. Reinholz, and I.J. Constable, “Tele-Diabetic Retinopathy screening and image-based clinical decision support,” Chapter 11 in Automated image detection of retinal pathology, H.F. Jelinek and M.J. Cree, Eds., 339-348, Boca Raton: CRC, 2010.

[9] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Transactions on Image Processing, 15(12), 3736-3745, 2006.

[10] L. Fang, S. Li, Q. Nie, J.A. Izatt, C.A. Toth, and S. Farsiu, “Sparsity based denoising of spectral domain optical coherence tomography images,” Biomedical Optics Express, 3(5), 927-942, 2012.
