Locally adaptive multiscale Bayesian method for ...

The final published version of this paper is available at: http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=4421727&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D4421727 Please cite this article as follows: M. Forouzanfar, H. Abrishami-Moghaddam, and S. Ghadimi, “Locally adaptive multiscale Bayesian method for image denoising based on bivariate normal inverse Gaussian distributions” IEEE Int. Conf. wavelets analysis and pattern recognition (ICWAPR2007), Beijing, China, Nov. 2007, pp. 1696-701

LOCALLY ADAPTIVE MULTISCALE BAYESIAN METHOD FOR IMAGE DENOISING BASED ON BIVARIATE NORMAL INVERSE GAUSSIAN DISTRIBUTIONS MOHAMAD FOROUZANFAR, HAMID ABRISHAMI MOGHADDAM Department of Electrical Engineering, K.N. Toosi University of Technology, Tehran, Iran E-MAIL: [email protected], [email protected]

Abstract Recently, the use of wavelet transform has led to significant advances in image denoising applications. Among wavelet based denoising approaches, Bayesian techniques give more accurate estimates. Considering interscale dependencies, these estimates become closer to the original image. In this context, the choice of an appropriate model for wavelet coefficients is an important issue. The performance can also be improved by estimating model parameters in a local neighborhood. In this paper, we introduce a spatially adaptive MMSE-based Bayesian estimator using bivariate normal inverse Gaussian (NIG) distribution. The NIG distribution can model a wide range of processes, from heavy-tailed to less heavy-tailed processes. Exploiting this new statistical model in the dual-tree complex wavelet domain, we achieved state-of-the-art performance among related recent denoising approaches, both visually and in terms of peak signal-to-noise ratio (PSNR). Keywords: Image denoising; bivariate normal inverse Gaussian distribution; bivariate MMSE-based estimator; complex wavelet transform

1. Introduction

A fundamental part of any image processing system is the noise reduction component. Noise arises during the acquisition and transmission of images and can usually be modeled as additive Gaussian. The main aim of an image denoising algorithm is to achieve both noise reduction and feature preservation. In this context, wavelet-based methods are of particular interest. In the wavelet domain, the noise is uniformly spread throughout the coefficients, while most of the image information is concentrated in the few largest ones [1]. The first wavelet-based denoising methods were based on thresholding of the detail subband coefficients [2, 3]. Most wavelet thresholding methods suffer from the drawback that the chosen threshold may not match the specific distribution of the signal and noise components at different scales. To address this drawback, non-linear estimators based on Bayesian theory were developed [4]. In the Bayesian framework, it has been recognized that a denoising algorithm achieves both noise reduction and feature preservation more successfully when it employs a more accurate statistical description of the signal and noise components. The best models utilized so far in this framework are the generalized Gaussian (GG) [1], symmetric alpha-stable (SαS) [5], and symmetric NIG distributions [6, 7].

Sendur et al. [8] showed that the performance of a Bayesian denoising algorithm is significantly improved if interscale dependencies are effectively modeled and exploited. Within this framework, the GG and SαS models have been extended to their bivariate forms, and improved denoising results have been reported [9, 10, 11]. In this paper, we apply a bivariate spherical NIG model to the real and imaginary coefficients of the complex wavelet transform at adjacent scales. A bivariate Bayesian MMSE-based estimator is then designed to denoise the wavelet coefficients efficiently. In order to exploit spatial dependencies, the NIG parameters are estimated using a locally adaptive algorithm. Simulation results using standard noise-free images show that the proposed method performs best among several recent algorithms in terms of peak signal-to-noise ratio and visual quality.

This paper is organized as follows. In Section 2, multivariate normal inverse Gaussian (MNIG) distributions are introduced. Section 3 is dedicated to the development of the proposed method. Experimental results are given in Section 4, and the paper is concluded in Section 5.

2. MNIG distribution

The MNIG distribution is a recently introduced variance-mean mixture of a multivariate normal distribution with the inverse Gaussian as the mixing distribution. Recently, there has been increasing interest in such models for financial and signal processing applications, mainly because the resulting distributions can model a wide range of processes, from heavy-tailed to less heavy-tailed. The MNIG density of a d-dimensional random variable x is given by [12]

$$p_{\mathbf{x}}(\mathbf{x}) = \frac{\delta}{2^{(d-1)/2}} \left[\frac{\alpha}{\pi\, q(\mathbf{x})}\right]^{(d+1)/2} \exp\big(\rho(\mathbf{x})\big)\, K_{(d+1)/2}\big(\alpha\, q(\mathbf{x})\big) \tag{1}$$

with

$$\rho(\mathbf{x}) = \delta\sqrt{\alpha^2 - \boldsymbol{\beta}^T \Gamma \boldsymbol{\beta}} + \boldsymbol{\beta}^T(\mathbf{x} - \boldsymbol{\mu}) \quad \text{and} \quad q(\mathbf{x}) = \sqrt{\delta^2 + (\mathbf{x} - \boldsymbol{\mu})^T \Gamma^{-1} (\mathbf{x} - \boldsymbol{\mu})}.$$

where $K_{(d+1)/2}(\cdot)$ denotes the modified Bessel function of the second kind with index $(d+1)/2$. Notice that when d is even, $K_{(d+1)/2}$ can be written with a closed-form expression, which enables computationally more efficient implementations. The shape of the density is specified by two scalar parameters α, δ > 0, two d-dimensional vector parameters β and µ, and one d × d matrix parameter Γ. The parameter α controls the “steepness” of the density, which increases monotonically with increasing α. The parameter β determines the skewness of the density, which means that for β ≠ 0 the density is asymmetric. Further, δ is the scale parameter and µ is a translation parameter. The structure matrix Γ is assumed to be a symmetric positive-definite matrix with unit determinant. This matrix controls the degree of correlation between the components of x. This flexible parameterization makes it possible to model a large variety of statistically dependent heavy-tailed data. Notice that when β = 0 the MNIG belongs to the class of elliptical distributions. If in addition Γ = I, it belongs to the class of spherical distributions. There are also two limiting distributions arising from the MNIG. The multivariate Gaussian distribution is obtained when δ → ∞ and α → ∞ such that $\delta/\alpha = \sigma^2$. Another important special case is the multivariate Cauchy distribution, which occurs when β = 0, Γ = I, and α → 0.

3. Proposed Method

In this section, after formulating the problem of noise reduction in the wavelet domain, we develop the proposed bivariate MMSE-based estimator and introduce a spatially adaptive algorithm for estimating the NIG parameters.

3.1. Problem formulation

Let X be a clean natural image of size M × N which is corrupted by additive white Gaussian (AWG) noise Z with zero mean and variance $\sigma_z^2$. Then we can write:

$$Y(k, l) = X(k, l) + Z(k, l) \tag{2}$$

where Y represents the corresponding noisy image and k and l index the spatial location. The wavelet transform is a linear operation. So, after decomposing an image we get, in each of the detail subbands at each level of decomposition, sets of noisy wavelet coefficients written as the sum of the transformations of the signal and the noise:

$$y_i(s, t) = x_i(s, t) + z_i(s, t) \tag{3}$$

where y, x, and z represent the noisy wavelet coefficients, the signal coefficients, and the noise, respectively. The index i denotes the scale and takes the values 1, 2, …, J, where J is the total number of possible decomposition levels. Also, s and t represent the spatial location and vary from 0 to $2^{J-i} - 1$. Denoising in the orthogonal wavelet domain has been observed to produce noticeable artifacts, such as Gibbs-like ringing around edges and specks in smooth regions [13]. To mitigate these artifacts, many redundant translation-invariant wavelet algorithms have been developed [1, 13]. Here, we use the oriented two-dimensional (2-D) dual-tree complex wavelet transform (DTCWT) [14]. The 2-D DTCWT is four-times expansive, but it achieves important additional properties compared with the 2-D discrete wavelet transform (DWT): it is nearly shift invariant and directionally selective. The standard 2-D DWT offers feature selectivity in only 3 directions, with poor selectivity for diagonal features, whereas the 2-D DTCWT has 12 directional wavelets (6 for each of the real and imaginary trees) oriented at angles of ±15°, ±45°, and ±75°.
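As an illustration of why the additive model (3) holds in every subband, the following minimal sketch applies a hand-written one-level orthonormal Haar transform to a synthetic image and to its noise separately. The Haar transform is only a stand-in chosen for brevity (the method described here uses the DTCWT), and the names `haar2d_level1`, `X`, `Z`, and `sigma_z` are hypothetical.

```python
import numpy as np

def haar2d_level1(img):
    """One-level orthonormal 2-D Haar transform.

    Returns the approximation subband followed by three detail subbands.
    """
    s = np.sqrt(2.0)
    a = (img[0::2, :] + img[1::2, :]) / s   # sums of adjacent rows
    d = (img[0::2, :] - img[1::2, :]) / s   # differences of adjacent rows
    approx = (a[:, 0::2] + a[:, 1::2]) / s
    det1 = (a[:, 0::2] - a[:, 1::2]) / s
    det2 = (d[:, 0::2] + d[:, 1::2]) / s
    det3 = (d[:, 0::2] - d[:, 1::2]) / s
    return approx, det1, det2, det3

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 255.0, size=(64, 64))   # synthetic stand-in for a clean image
sigma_z = 20.0
Z = rng.normal(0.0, sigma_z, size=X.shape)   # additive white Gaussian noise, as in (2)
Y = X + Z

sub_y = haar2d_level1(Y)
sub_x = haar2d_level1(X)
sub_z = haar2d_level1(Z)

# Linearity: every noisy subband is the signal subband plus the noise subband.
max_err = max(float(np.max(np.abs(y - (x + z))))
              for y, x, z in zip(sub_y, sub_x, sub_z))

# An orthonormal transform leaves the noise standard deviation unchanged.
noise_std_detail = float(np.std(sub_z[3]))
```

Here `max_err` is at the level of floating-point rounding, and `noise_std_detail` stays close to `sigma_z`, which is what allows the noise level to be estimated directly from detail coefficients.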

3.2. Interscale wavelet model

Although the wavelet transform decorrelates signals well, strong interscale dependencies between wavelet coefficients may still exist [8]. The goal of this paper is to improve on the denoising results of related recent methods by exploiting these dependencies efficiently. In the wavelet domain, the noise level decreases rapidly along scales, while signal structures are strengthened as the scale increases [15], so we use coarser scale (parent) information to improve the finer scale (child) estimation. For every two adjacent levels, after extension to the same size and omitting s and t for simplicity, (3) is written in vector form as:

$$\mathbf{y} = \mathbf{x} + \mathbf{z} \tag{4}$$

where $\mathbf{y} = (y_i, y_{i+1})$, $\mathbf{x} = (x_i, x_{i+1})$, and $\mathbf{z} = (z_i, z_{i+1})$, and the subscripts i and i + 1 denote the child and parent coefficients at the same spatial location, respectively. At this stage, we can assume a zero-mean bivariate Gaussian model for the noise component z as

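$$p_{\mathbf{z}}(\mathbf{z}) = \frac{1}{2\pi\sigma_z^2} \exp\left(-\frac{z_i^2 + z_{i+1}^2}{2\sigma_z^2}\right) \tag{5}$$

where $\sigma_z^2$ is the noise variance. Sendur et al. [8, 16] found that a bivariate spherical distribution is a simple and appropriate model for the wavelet coefficients of natural images at adjacent scales. Here, we assume a bivariate spherical NIG distribution with zero mean for the signal component x. This means that µ = 0, β = 0, Γ = I, and d = 2. Therefore, (1) is simplified to:

$$p_{\mathbf{x}}(\mathbf{x}) = \frac{\delta}{\sqrt{2}} \left[\frac{\alpha}{\pi\, q(\mathbf{x})}\right]^{3/2} \exp(\delta\alpha)\, K_{3/2}\big(\alpha\, q(\mathbf{x})\big) \tag{6}$$

with

$$q(\mathbf{x}) = \sqrt{\delta^2 + x_i^2 + x_{i+1}^2}.$$

3.3. Locally adaptive parameter estimation

In [7], a locally adaptive algorithm for estimating the parameters of univariate NIG distributions was developed. Here, we extend that work to the bivariate case, where the parameters of the bivariate NIG distribution are estimated adaptively for each pair of adjacent scales. The principle is straightforward. If the data is spherically distributed with zero mean, then every one-dimensional projection is univariate symmetric NIG (SNIG) with equal α and δ parameters [12]. We can therefore apply any univariate parameter estimator, e.g. [7], to each one-dimensional projection. Since the data may not be completely spherical, we obtain an average estimate of α and δ as [12]:

$$\hat{\alpha} = \frac{1}{d} \sum_{j=1}^{d} \hat{\alpha}_j, \qquad \hat{\delta} = \frac{1}{d} \sum_{j=1}^{d} \hat{\delta}_j \tag{7}$$

where $\hat{\alpha}_j$ and $\hat{\delta}_j$ are the estimates from the jth projection. In order to obtain $\hat{\alpha}_j$ and $\hat{\delta}_j$, we use the locally adaptive algorithm proposed by Bhuiyan et al. [7]. To obtain the SNIG parameters for each coefficient y, we first obtain estimates of the second and fourth order moments, denoted respectively by $\hat{m}_{2x}$ and $\hat{m}_{4x}$, corresponding to the SNIG prior as:

$$\hat{m}_{2x} = \max\left(m_2 - \sigma_z^2,\, 0\right) \tag{8}$$

$$\hat{m}_{4x} = \max\left(m_4 - 6\sigma_z^2 \hat{m}_{2x} - 3\sigma_z^4,\, 0\right) \tag{9}$$

where $m_2$ and $m_4$ are the second and fourth order sample moments of the noisy coefficients. The values of $m_2$ and $m_4$ for each coefficient are obtained using a D × D square window $W(s, t)$ centered at that coefficient as:

$$m_2(s, t) = \frac{1}{D^2} \sum_{(k, l) \in W(s, t)} y^2(k, l) \tag{10}$$

$$m_4(s, t) = \frac{1}{D^2} \sum_{(k, l) \in W(s, t)} y^4(k, l) \tag{11}$$

Now, the corresponding second and fourth order cumulants, denoted by $\hat{\kappa}_2$ and $\hat{\kappa}_4$, respectively, are obtained as:

$$\hat{\kappa}_2 = \hat{m}_{2x} \tag{12}$$

and

$$\hat{\kappa}_4 = \max\left(\hat{m}_{4x} - 3\hat{m}_{2x}^2,\, 0\right). \tag{13}$$

Then, the parameters α and δ are estimated as:

$$\hat{\alpha} = \sqrt{\frac{3\hat{\kappa}_2}{\hat{\kappa}_4}}, \qquad \hat{\delta} = \hat{\alpha}\,\hat{\kappa}_2 \tag{14}$$

Also, the noise variance is obtained using the median absolute deviation (MAD) of the coefficients at the finest (first) level of decomposition [17], as:

$$\hat{\sigma}_z^2 = \left(\frac{\mathrm{MAD}(y_1)}{0.6745}\right)^2 \tag{15}$$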
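The moment-matching steps (8)-(15) can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: all function names are hypothetical, the reflect padding at subband borders and the `eps` guard against a zero fourth cumulant are added for robustness, MAD is read as the median of the absolute coefficient values, and the two projections being averaged are taken to be the child and parent axes.

```python
import numpy as np

def mad_sigma(y1):
    """Noise standard deviation from finest-level coefficients, cf. (15).

    Assumes the common convention MAD(y1) = median(|y1|).
    """
    return float(np.median(np.abs(y1)) / 0.6745)

def local_mean(a, D):
    """Average of `a` over a D x D window (D odd) centred at each position."""
    r = D // 2
    p = np.pad(a, r, mode="reflect")      # border handling is an assumption
    out = np.zeros(a.shape, dtype=float)
    for dk in range(D):
        for dl in range(D):
            out += p[dk:dk + a.shape[0], dl:dl + a.shape[1]]
    return out / (D * D)

def snig_from_cumulants(k2, k4, eps=1e-12):
    """SNIG parameters from the second and fourth cumulants, cf. (14).

    For a symmetric NIG, kappa2 = delta / alpha and kappa4 = 3 * delta / alpha**3.
    `eps` avoids division by zero and is not part of the source.
    """
    alpha = np.sqrt(3.0 * k2 / np.maximum(k4, eps))
    return alpha, alpha * k2

def snig_params_local(y, sigma_z, D=7):
    """Per-coefficient (alpha, delta) for one projection, cf. (8)-(14)."""
    s2 = sigma_z ** 2
    m2 = local_mean(y ** 2, D)                                   # (10)
    m4 = local_mean(y ** 4, D)                                   # (11)
    m2x = np.maximum(m2 - s2, 0.0)                               # (8)
    m4x = np.maximum(m4 - 6.0 * s2 * m2x - 3.0 * s2 ** 2, 0.0)   # (9)
    k4 = np.maximum(m4x - 3.0 * m2x ** 2, 0.0)                   # (13)
    return snig_from_cumulants(m2x, k4)                          # (12), (14)

def bivariate_params(y_child, y_parent, sigma_z, D=7):
    """Average the per-projection estimates, cf. (7)."""
    a_c, d_c = snig_params_local(y_child, sigma_z, D)
    a_p, d_p = snig_params_local(y_parent, sigma_z, D)
    return 0.5 * (a_c + a_p), 0.5 * (d_c + d_p)
```

As a sanity check on (14), a symmetric NIG with α = 2 and δ = 3 has cumulants κ2 = 1.5 and κ4 = 1.125, and `snig_from_cumulants(1.5, 1.125)` recovers those parameter values.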

3.4. Bivariate Bayesian MMSE-based estimator

In the Bayesian framework, a risk is minimized to obtain the optimal estimate. By minimizing a quadratic cost function over the conditional distribution of the signal component x, the bivariate MMSE-based estimator is given by the marginal conditional mean of $x_i$ given y [11]:

$$\hat{x}_i(\mathbf{y}) = E[x_i \mid \mathbf{y}] = \iint x_i\, p_{\mathbf{x}|\mathbf{y}}(\mathbf{x} \mid \mathbf{y})\, d\mathbf{x}. \tag{16}$$

Using Bayes’ theorem we get

$$\hat{x}_i(\mathbf{y}) = \frac{\iint x_i\, p_{\mathbf{z}}(\mathbf{y} - \mathbf{x})\, p_{\mathbf{x}}(\mathbf{x})\, d\mathbf{x}}{\iint p_{\mathbf{z}}(\mathbf{y} - \mathbf{x})\, p_{\mathbf{x}}(\mathbf{x})\, d\mathbf{x}}. \tag{17}$$

Since the processor, to our knowledge, does not have a closed-form expression, it has to be computed numerically. Fig. 1 depicts the numerically computed processor input-output surface. The shrinkage of a coefficient is conditioned on both its amplitude and the value of the corresponding coefficient at the next decomposition level. Exploiting these interscale dependencies, we achieve better denoising results.

[Fig. 1: numerically computed input-output surface of the processor. Axes: noisy child coefficients, noisy parent coefficients, estimate of child coefficients.]
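One way to evaluate the estimator (17) numerically is a plain Riemann sum on a square grid, sketched below. This is an illustrative sketch, not the authors' numerical scheme: the function names, the grid extent, and the number of grid points are assumptions. The prior uses the closed form $K_{3/2}(u) = \sqrt{\pi/(2u)}\, e^{-u}(1 + 1/u)$, which turns the bivariate spherical NIG density into $\delta(\alpha q + 1)\exp(\alpha(\delta - q)) / (2\pi q^3)$ with $q = \sqrt{\delta^2 + x_i^2 + x_{i+1}^2}$.

```python
import numpy as np

def snig2_pdf(x1, x2, alpha, delta):
    """Bivariate spherical NIG density (mu = 0, beta = 0, Gamma = I, d = 2)."""
    q = np.sqrt(delta ** 2 + x1 ** 2 + x2 ** 2)
    return delta * (alpha * q + 1.0) * np.exp(alpha * (delta - q)) / (2.0 * np.pi * q ** 3)

def gauss2_pdf(z1, z2, sigma_z):
    """Zero-mean isotropic bivariate Gaussian noise density."""
    return np.exp(-(z1 ** 2 + z2 ** 2) / (2.0 * sigma_z ** 2)) / (2.0 * np.pi * sigma_z ** 2)

def mmse_child(y_child, y_parent, alpha, delta, sigma_z, n=401):
    """Posterior mean of the child coefficient by grid integration.

    The grid spans the observation plus 8 noise standard deviations per side
    (an assumption); it must be fine enough to resolve both delta and sigma_z.
    """
    half = max(abs(y_child), abs(y_parent)) + 8.0 * sigma_z
    g = np.linspace(-half, half, n)
    x1, x2 = np.meshgrid(g, g, indexing="ij")   # x1: child axis, x2: parent axis
    w = gauss2_pdf(y_child - x1, y_parent - x2, sigma_z) * snig2_pdf(x1, x2, alpha, delta)
    # The grid cell area cancels between numerator and denominator.
    return float(np.sum(x1 * w) / np.sum(w))

# Same noisy child value, two different parent magnitudes (illustrative parameters).
small_parent = mmse_child(1.0, 0.0, alpha=1.0, delta=1.0, sigma_z=1.0)
large_parent = mmse_child(1.0, 5.0, alpha=1.0, delta=1.0, sigma_z=1.0)
```

With these illustrative parameters the child coefficient is shrunk less when the parent is large, which is the interscale behavior described above. Evaluating `mmse_child` over a grid of child and parent values gives an input-output surface of the kind Fig. 1 shows, and in practice such a surface would be tabulated once per parameter pair rather than recomputed per coefficient.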

3.5. Image reconstruction

After estimating the signal component of the noisy coefficients in the wavelet domain, we invert the multiscale decomposition to reconstruct an estimate $\hat{X}$ of the noise-free image.

4. Experimental results

In this section, we compare our interscale NIG-based processor with some recent denoising techniques: Donoho’s soft-thresholding technique [2], Achim’s WIN-SAR processor based on an alpha-stable prior model [5], and the bivariate MAP estimator of Sendur et al. [8]. We tested these denoising methods on standard 8-bit grayscale images, namely Lena and Boat (size 512 × 512) and Baboon and Cameraman (size 256 × 256). The images were corrupted by simulated additive white Gaussian noise at three different levels, σz ∈ {10, 20, 30}. We used the peak signal-to-noise ratio as an objective performance criterion. The PSNR is defined as:

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{255^2}{\mathrm{MSE}}\right) \tag{18}$$

where MSE is given by:

$$\mathrm{MSE} = \frac{1}{MN} \sum_{k=1}^{M} \sum_{l=1}^{N} \left(X(k, l) - \hat{X}(k, l)\right)^2 \tag{19}$$

and $\hat{X}$ denotes the denoised image.

The denoising process was performed over ten different noise realizations for each standard deviation, and the resulting PSNRs were averaged over these ten runs. The parameters of each method were set to the values given by their respective authors in the corresponding papers. Table 1 summarizes the results obtained. Fig. 2 illustrates a visual comparison between the different denoising algorithms applied to Lena for a noise standard deviation of 20. It can be observed that the proposed method performs considerably better than the other methods, both visually and in terms of PSNR. The soft-thresholding technique blurs the edges and fine details in the image, whereas, among the four methods, the proposed algorithm provides the denoised images with the best quality.
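The evaluation protocol, PSNR as in (18)-(19) averaged over several noise realizations, can be sketched as follows. The function names and the `denoise` callable are hypothetical placeholders for any of the compared methods, and the seed handling is an assumption made so that the sketch is reproducible.

```python
import numpy as np

def psnr(clean, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit images, cf. (18)-(19)."""
    diff = np.asarray(clean, dtype=float) - np.asarray(estimate, dtype=float)
    mse = np.mean(diff ** 2)
    return float(10.0 * np.log10(peak ** 2 / mse))

def average_psnr(clean, denoise, sigma_z, runs=10, seed=0):
    """Mean PSNR of `denoise` over `runs` independent noise realizations."""
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(runs):
        noisy = clean + rng.normal(0.0, sigma_z, size=clean.shape)
        values.append(psnr(clean, denoise(noisy)))
    return float(np.mean(values))

# Baseline: the identity "denoiser" simply reports the PSNR of the noisy input.
clean = np.full((64, 64), 128.0)
baseline = average_psnr(clean, lambda img: img, sigma_z=20.0)
```

The identity baseline is a useful reference point: for noise of standard deviation σz it should be close to $10\log_{10}(255^2/\sigma_z^2)$, and any denoiser worth reporting has to exceed it.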

5. Conclusions and perspectives

In this paper, a new multiscale bivariate Bayesian denoising technique was proposed. The main innovation was the use of bivariate spherical NIG distributions as the prior model for the wavelet coefficients of the signal component. A bivariate MMSE-based estimator was then developed to efficiently remove noise from the noisy coefficients. Locally adaptive parameter estimation improved the denoising results. Finally, the proposed method was compared with some state-of-the-art denoising algorithms. Both visual and quantitative comparisons, performed on standard natural images, demonstrated the efficacy of the proposed method.

We are currently exploring several ways to extend the work presented in this paper. An interesting direction is the further improvement of the statistical image model by considering a more general case of NIG distributions, such as the elliptical form. Also, estimating the NIG parameters using context modeling techniques may yield more reliable estimates than the locally adaptive method. These variations will be reported in a future paper.

[Fig. 2: panels (a)-(f).]

References

[1] S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, 1998.
[2] D.L. Donoho, “Denoising by soft-thresholding,” IEEE Trans. Information Theory, vol. 41, pp. 613-627, 1995.
[3] I.K. Fodor and C. Kamath, “Denoising through wavelet shrinkage: An empirical study,” Journal of Electronic Imaging, vol. 12, pp. 151-160, 2003.
[4] E.P. Simoncelli, “Bayesian denoising of visual images in the wavelet domain,” in Bayesian Inference in Wavelet Based Models, P. Muller and B. Vidakovic, Eds. New York: Springer-Verlag, June 1999, ch. 18, pp. 291-308.
[5] A. Achim, P. Tsakalides, and A. Bezerianos, “SAR image denoising via Bayesian wavelet shrinkage based on heavy-tailed modeling,” IEEE Trans. Geosci. Remote Sensing, vol. 41, pp. 1773-1784, Aug. 2003.
[6] S. Solbo and T. Eltoft, “Homomorphic wavelet-based statistical despeckling of SAR images,” IEEE Trans. Geosci. Remote Sensing, vol. 42, pp. 711-721, April 2004.
[7] M.I.H. Bhuiyan, M.O. Ahmad, and M.N.S. Swamy, “Wavelet-based despeckling of medical ultrasound images with the symmetric normal inverse Gaussian prior,” Proceedings of the ICASSP2007 Conference, Honolulu, USA, pp. 721-724, April 2007.
[8] L. Sendur and I.W. Selesnick, “Bivariate shrinkage functions for wavelet-based denoising exploiting interscale dependency,” IEEE Trans. Signal Proc., vol. 50, pp. 2744-2756, Nov. 2002.
[9] D. Cho and T.D. Bui, “Multivariate statistical modeling for image denoising using wavelet transforms,” Signal Processing: Image Communication, vol. 20, pp. 77-89, Oct. 2003.
[10] A. Achim and E. Kuruoglu, “Image denoising using bivariate α-stable distributions in the complex wavelet domain,” IEEE Signal Processing Letters, vol. 12, pp. 17-20, Jan. 2005.
[11] M. Forouzanfar, H.A. Moghaddam, and M. Dehghani, “Speckle reduction in medical ultrasound images using a new multiscale bivariate Bayesian MMSE-based method,” presented at SIU07, Eskisehir, Turkey, June 2007.
[12] T.A. Oigard, A. Hanssen, R.E. Hansen, and F. Godtliebsen, “EM-estimation and modeling of heavy-tailed processes with the multivariate normal inverse Gaussian distribution,” Signal Processing, vol. 85, pp. 1655-1673, March 2005.
[13] R.R. Coifman and D.L. Donoho, “Translation-invariant de-noising,” in Wavelets and Statistics, A. Antoniadis and G. Oppenheim, Eds. New York: Springer-Verlag, 1995, pp. 125-150.
[14] I.W. Selesnick, R.G. Baraniuk, and N.G. Kingsbury, “The dual-tree complex wavelet transform,” IEEE Signal Processing Magazine, pp. 123-151, Nov. 2005.
[15] L. Zhang, P. Bao, and X. Wu, “Multiscale LMMSE-based image denoising with optimal wavelet selection,” IEEE Trans. Video Tech., vol. 15, pp. 469-481, April 2005.
[16] L. Sendur and I.W. Selesnick, “Bivariate shrinkage with local variance estimation,” IEEE Signal Processing Letters, vol. 9, pp. 439-441, Dec. 2002.
[17] D.L. Donoho and I.M. Johnstone, “Ideal spatial adaptation by wavelet shrinkage,” Biometrika, vol. 81, pp. 425-455, Aug. 1994.