Image deconvolution using hidden Markov tree modeling of complex ...

IMAGE DECONVOLUTION USING HIDDEN MARKOV TREE MODELING OF COMPLEX WAVELET PACKETS*

André Jalobeanu (1), Nick Kingsbury (2), Josiane Zerubia (1)

(1) Ariana, joint research group CNRS/INRIA/UNSA, INRIA Sophia Antipolis, France - [email protected]

(2) Signal Processing Group, Department of Engineering, University of Cambridge, UK - [email protected]

* This research has been partially supported by an Alliance grant and the EU project MOUMIR.

ABSTRACT

In this paper, we propose to use a hidden Markov tree model of the complex wavelet packet transform to capture the inter-scale dependencies of natural images. First, the observed image, blurred and noisy, is deconvolved without regularization. Then its transform is denoised within a Bayesian framework using the proposed model, whose parameters are estimated by an EM technique. The total complexity of this new deblurring algorithm remains $O(N)$.

1. INTRODUCTION

The problem we deal with is the reconstruction of a satellite image from blurred and noisy data. The degradation model is represented by the equation:

$$ Y = HX + N, \quad \text{where } HX = h \star X \tag{1} $$

where $Y$ is the observed data, $X$ is the original image and $\star$ denotes convolution. $N$ is the additive noise, assumed to be Gaussian, white and stationary, with known variance $\sigma^2$. The Point Spread Function (PSF) $h$ is known.

In this paper, we propose to denoise the image in a wavelet basis [3], after a deconvolution without regularization. A similar approach has been used in [6], with real wavelet packets, and consists of cancelling the coefficients below a given threshold. Such a method is efficient if the basis provides a compact representation of the signal $X$, which is true of wavelet bases. In the case of deconvolution, the covariance of the deconvolved noise must also be nearly diagonal in this basis [6], which is why wavelet packets have to be used. In practice, this means that we also decompose some of the detail subspaces (the highest frequency subbands), which separates the signal from the noise more effectively.

The noise to be filtered is colored, and is strongly amplified in high frequency areas by the deconvolution. The packet transform nearly whitens this noise within each subband, so that the denoising can be done by filtering each coefficient separately. Thus, the deconvolution can be achieved by a rough inverse filtering in the Fourier domain, followed by a denoising algorithm. Compared to white noise removal problems, here the noise variance depends on the subband, and in some areas its power can exceed the signal power, which makes signal recovery very difficult.

We propose to use Hidden Markov Tree (HMT) modeling of the wavelet subbands [1]. This approach outperforms a number of high performance denoising techniques, but has not yet been used for deconvolution. In the classical algorithms, all the coefficients are thresholded separately because they are assumed to be independent within a subband and across the scales. HMT captures the inter-scale dependencies, which are quite strong in natural images because edges usually propagate across scales. It also offers an efficient model of the subband statistics. Moreover, efficient algorithms have been developed to estimate the parameters of the hidden Markov model [1], and they can easily be extended to wavelet packet trees [5] without a large increase in computational complexity. Finally, the computational cost of the denoising algorithm remains $O(N)$, including the direct and inverse wavelet transforms.

2. COMPLEX WAVELET PACKETS

Fast real wavelet packet transforms are neither shift invariant nor rotation invariant, which produces artefacts when the coefficients are filtered. That is why we use a complex wavelet transform [7], which is nearly shift invariant, provides very good directional properties and has been applied efficiently to image denoising. However, the corresponding basis is inadequate for our problem because the deconvolved noise prevents recovery of the signal in the highest frequency subbands. We have extended the original transform by applying the filters to the detail subbands, thus defining a complex wavelet packet (CWP) transform [4] that keeps the useful invariance properties of the original transform. The decomposition tree is illustrated in Fig. 1.
The CWP transform used in this work is able to separate up to 26 directions, enabling us to better represent oriented textures.

[Figure: decomposition tree with wavelet levels j=3 and j=2 followed by wavelet packets; legend: wavelet approximation, wavelet detail, wavelet packet detail]

Fig. 1. Wavelet packet tree

We use a quad-tree algorithm [7], noting that an approximate shift invariance can be obtained with a real biorthogonal transform by doubling the sampling rate at each scale. This is achieved by computing 4 parallel wavelet trees, which are differently subsampled. The shift invariance is perfect at level 1, and approximate beyond this level. To this end, the transform uses two pairs of biorthogonal filters, of odd and even length. At level $j = 1$, it is a non-decimated wavelet transform whose coefficients are re-ordered into 4 trees according to their parity. For $j > 1$, each tree is processed separately, with a combination of odd and even filters depending on the tree. The transform is computed by a fast filter bank technique of complexity $O(N)$. Finally, the coefficients of the 4 trees are combined by a linear transform to form complex numbers. Thresholding the coefficient magnitudes without modifying the phase opens the way to a fully shift invariant noise filtering method.
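As a minimal illustration of magnitude-only filtering, the sketch below shrinks the modulus of complex coefficients while leaving their phase untouched. The soft-threshold rule and the name `shrink_magnitude` are assumptions chosen for illustration only; the estimator actually used in this paper is the posterior mean of Section 4.

```python
import numpy as np

def shrink_magnitude(coeffs, threshold):
    """Soft-threshold the modulus of complex coefficients, keeping the phase."""
    mag = np.abs(coeffs)
    # Real gain in [0, 1): zero below the threshold, 1 - threshold/|z| above it.
    gain = np.maximum(mag - threshold, 0.0) / np.maximum(mag, 1e-12)
    # Multiplying by a real, non-negative gain cannot change the phase.
    return gain * coeffs

z = np.array([3 + 4j, 0.1 - 0.2j, -2j])
out = shrink_magnitude(z, threshold=1.0)
```

Because the gain is real and non-negative, the operation commutes with any global phase rotation of the coefficients, which is the property that makes magnitude filtering attractive for a nearly shift invariant transform.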

3. THE HMT PRIOR MODEL

The statistical properties of the CWP subbands of natural images (such as the unknown image to be recovered) lead us to design the following prior model, in order to capture the interactions between the scales. We use a multiscale complex Gaussian mixture [1], which has the following properties:

- the distribution of the complex subbands does not depend on the phase of the coefficients,
- it is heavy-tailed w.r.t. the magnitude,
- the coefficients are dependent across scales.

The high non-Gaussianity of each subband is captured by the $N_s$-component Gaussian mixture (where $N_s \le 3$). The mixture is obtained by combining several zero-mean Gaussian distributions with different variances. Each coefficient (indexed by $t$) is assumed to be governed by a hidden state $s_t$, which decides which variance $v^m$ to use in the conditional Gaussian density. We denote by $\Lambda$ the set of the $N_s$ possible states. Thus, one of the coefficients of a complex subband, denoted $\xi_t$, follows the conditional distribution:

$$ P(\xi_t \mid s_t = m) = \frac{1}{\pi v_t^m} \, e^{-|\xi_t|^2 / v_t^m} \tag{2} $$

and the subband distribution is well approximated by the resulting mixture:

$$ P(\xi_t) = \sum_{m \in \Lambda} P(s_t = m) \, \frac{1}{\pi v_t^m} \, e^{-|\xi_t|^2 / v_t^m} \tag{3} $$

where $P(s_t = m)$ represents the weight of the Gaussian density of variance $v_t^m$. Each subband is characterized by a set of variances $\{v^m\}$, the same for the whole subband.

To take into account the interactions between scales, a dependence graph is constructed over the hidden states. A state $s_t$ is assumed to depend only on the parent state $s_{p(t)}$ and the children states $s_i$, $i \in C(t)$. This induces a Markov structure on the states (a scale only depends on the neighbouring scales). Since the states refer to CWP coefficients, the dependence graph is organized as the CWP tree of Fig. 1. The graph representing the parent-child relationships and the link between hidden states $s_t$ and coefficients $\xi_t$ is shown in Fig. 2.

[Figure: hidden-state tree across the wavelet scales j=4, j=3, j=2 and the wavelet packets; links from the j=3 wavelet subband to the j=2 wavelet subband and to the wavelet packet subbands]

Fig. 2. Dependence graph of the 1-D HMT, and dependence between 2-D CWP coefficients

The interaction between scales is specified by the transition matrix $\varepsilon_t$. The transition probability from the parent state $r$ to the child state $m$ is:

$$ \varepsilon_t^{mr} = P(s_t = m \mid s_{p(t)} = r) \tag{4} $$

To make the computation more robust, we assume that the transition matrices are equal within a subband. We have modified the structure of the classical wavelet tree to adapt it to the wavelet packet tree. In a wavelet tree, the detail coefficients are linked across scales. This is true between scales $j = 2$ and $j = J$ of the CWP transform: each coefficient at scale $j$ has 4 neighbouring children at scale $j - 1$ in the same subband. The link between wavelets and packets is set up as follows: each wavelet coefficient at scale 2 has 4 wavelet packet children in 4 different subbands (see Fig. 2).
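To make the prior concrete, the following sketch draws hidden states and complex coefficients from a small quad-tree. It is an illustration under stated assumptions, not the authors' code: the function name `sample_hmt`, the use of two states, the numerical values, and the sharing of one variance set and one transition matrix across all scales are all hypothetical simplifications.

```python
import numpy as np

def sample_hmt(levels, variances, eps, rng):
    """Draw hidden states and complex coefficients on a quad-tree.

    levels    : number of scales (index 0 is the root, i.e. the coarsest scale)
    variances : sequence (Ns,), variance v^m of each state (shared by all scales)
    eps       : array (Ns, Ns), eps[m, r] = P(child state = m | parent state = r)
    """
    variances = np.asarray(variances, dtype=float)
    ns = variances.size
    states = [rng.integers(ns, size=1)]              # uniform root state
    for _ in range(1, levels):
        parents = np.repeat(states[-1], 4)           # 4 children per node
        # Inverse-CDF draw from the column of eps selected by each parent.
        cdf = np.cumsum(eps[:, parents], axis=0)     # shape (Ns, n_children)
        u = rng.random(parents.size)
        child = np.minimum((u > cdf).sum(axis=0), ns - 1)
        states.append(child)
    coeffs = []
    for s in states:
        v = variances[s]
        # Circular complex Gaussian: real and imaginary parts have variance v/2.
        noise = rng.standard_normal(s.size) + 1j * rng.standard_normal(s.size)
        coeffs.append(np.sqrt(v / 2.0) * noise)
    return states, coeffs

rng = np.random.default_rng(0)
eps = np.array([[0.7, 0.3],
                [0.3, 0.7]])                         # columns sum to 1
states, coeffs = sample_hmt(4, [0.1, 10.0], eps, rng)
```

With a strongly diagonal transition matrix, large-variance states tend to persist from parent to children, which mimics the propagation of edges across scales.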

4. THE BAYESIAN APPROACH

Let us denote by $\tilde X$ the deconvolved image without regularization. The inversion is performed in the Fourier domain: denoting by $\mathcal{F}$ the Fourier transform, $\mathcal{F}[Y]$ is divided by $\mathcal{F}[h] + \epsilon$, where the constant $\epsilon$ avoids dividing by zero. For each subband, indexed by $k$, the variables $x$ and $\xi$ denote one of the CWP coefficients of $\tilde X$ and $X$ respectively. Since the noise $N$ is white and Gaussian, equation (1) multiplied by $H^{-1}$ gives, in the CWP domain:

$$ x_t = \xi_t + n_t, \quad \text{where } n_t \sim \mathcal{N}_2(0, w_k) \tag{5} $$

where $\mathcal{N}_2$ denotes a bidimensional Gaussian density. We assume that the noise variance is constant in each subband $k$. We compute $w_k$ by using the inverse of the PSF $h$, denoted $\tilde h^{-1}$, computed in the same manner as the deconvolved image $\tilde X$. The energy of its CWP transform is computed, and normalized by the energy of the impulse response (CWP transform of a Dirac $\delta$). Then we have:

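$$ w_k = 2\sigma^2 \, \frac{\sum_{ij} \left| \mathrm{CWP}[\tilde h^{-1}]^k_{ij} \right|^2}{\sum_{ij} \left| \mathrm{CWP}[\delta]^k_{ij} \right|^2} \tag{6} $$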
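The effect of the unregularized inversion on the noise can be sketched numerically. The example below simulates the degradation model (1) with a periodic convolution, applies the rough inverse filter, and compares the predicted and measured variance of the deconvolved noise. It is only an illustration under stated assumptions: the Gaussian PSF, the values of `sigma` and `eps_reg`, and the use of per-frequency Fourier gains in place of the per-subband CWP energies of Eq. (6) are all choices made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
x = rng.random((n, n))                        # stand-in for the unknown image X

# Hypothetical PSF (assumption): periodic isotropic Gaussian with unit sum.
k = np.fft.fftfreq(n) * n                     # offsets 0, 1, ..., -2, -1
h = np.exp(-(k[:, None] ** 2 + k[None, :] ** 2) / 2.0)
h /= h.sum()
H = np.fft.fft2(h)                            # transfer function F[h]

sigma = 0.01                                  # noise standard deviation
noise = sigma * rng.standard_normal((n, n))
y = np.real(np.fft.ifft2(H * np.fft.fft2(x))) + noise   # Eq. (1), periodic

eps_reg = 1e-2                                # constant avoiding division by zero
x_tilde = np.real(np.fft.ifft2(np.fft.fft2(y) / (H + eps_reg)))

# Power gain applied to the white noise at each frequency by the inverse filter.
gain = 1.0 / np.abs(H + eps_reg) ** 2
predicted_var = sigma ** 2 * gain.mean()      # variance of the deconvolved noise

# Empirical check: push the noise alone through the same inverse filter.
deconv_noise = np.real(np.fft.ifft2(np.fft.fft2(noise) / (H + eps_reg)))
empirical_var = deconv_noise.var()
```

The gain is close to 1 at low frequencies and several orders of magnitude larger at high frequencies, which is why the noise variance has to be estimated separately for each subband.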



We estimate the unknown coefficients $\xi_t$ within a Bayesian framework. We have demonstrated in [4] that this approach provides slightly better results than the minimax risk calculus [3]. Instead of computing the MAP estimate, which is quite complex, we use the Posterior Mean (PM) estimate of $\xi_t$. Bayes rule is used to calculate the expression of the posterior probability:

$$ \hat\xi_t = E[\xi_t \mid x_t], \qquad P(\xi_t \mid x_t) \propto P(x_t \mid \xi_t) \, P(\xi_t) \tag{7} $$

where $P(x_t \mid \xi_t)$ is given by:

$$ P(x_t \mid \xi_t) = \frac{1}{\pi w_k} \, e^{-|x_t - \xi_t|^2 / w_k} \tag{8} $$

Using Eqs. (2), (8) and Bayes rule, the PM estimate used for denoising is given by:

$$ \hat\xi_t = \sum_{m \in \Lambda} P(s_t = m \mid x_t) \, \frac{v_t^m}{v_t^m + w_k} \, x_t \tag{9} $$

5. HMT PARAMETER ESTIMATION

To compute the probabilities $P(s_t = m \mid x_t)$ and the variances $v_t^m$ needed in Eq. (9), we use the classical Forward-Backward estimation algorithm, which is adapted to our specific CWP tree. The set of parameters of the HMT is $\theta = \{\varepsilon_t^{mr}, v_t^m\}_{t \in T,\, m \in \Lambda,\, r \in \Lambda}$. The prior parameters are estimated by an Expectation Maximization (EM) technique [2], which seeks to maximize their likelihood:

$$ \hat\theta = \{\hat\varepsilon_t^{mr}, \hat v_t^m\} = \arg\max_{\theta} P\left(\{\xi_t\}_{t \in T} \mid \theta\right) \tag{10} $$

The incomplete data is $\{\xi_t\}$ (training data) and the complete data is $\{\xi_t, s_t\}$ (training data augmented with the hidden states). In fact, since the coefficients $\xi_t$ are unknown, the estimation is performed on the observed coefficients $x_t$. The variances obtained this way have to be corrected, since they are modified by the noise which contaminates the observed data. A simple way to correct them is to subtract the noise variance:

$$ \tilde v_t^m = (\hat v_t^m - w_k)_+ \quad \text{where} \quad (u)_+ = \begin{cases} u & \text{if } u \ge 0 \\ 0 & \text{elsewhere} \end{cases} \tag{11} $$

The EM algorithm we use is derived from reference [1]. At iteration $i$ we have:

- E step: two passes on the tree (forward and backward) to compute $P(\{s_t\} \mid \{x_t\}, \theta^i)$;
- M step: $\theta^{i+1} = \arg\max_{\theta} E_{\{s_t\}}\left[\log P(\{x_t, s_t\} \mid \theta) \mid \{x_t\}, \theta^i\right]$.

We denote by $T_t$ the subtree containing the node $t$ and all its descendants. We then define the variables $\alpha$ and $\beta$:

$$ \alpha_t^m = P(s_t = m, \, T \setminus T_t) \tag{12} $$

$$ \beta_t^m = P(T_t \mid s_t = m) \tag{13} $$

so that the marginal posterior probabilities can be computed:

$$ \gamma_t^m = P(s_t = m \mid \{x_t\}) = \frac{\alpha_t^m \beta_t^m}{\sum_{r \in \Lambda} \alpha_t^r \beta_t^r} \tag{14} $$

The variances $v_t^m$ are initialized by estimating an independent mixture model from the observed data (i.e. $\varepsilon_t^{mr} = 1/N_s$). This is achieved by computing the histogram of each subband, and fitting a Gaussian mixture to it (any EM or similar algorithm can be used).

5.1. E step : Backward

The $\beta_t^m$ corresponding to the wavelet packet coefficients (leaves of the tree) are initialized by the Gaussian kernel $\beta_t^m = (\pi v_t^m)^{-1} e^{-|x_t|^2 / v_t^m}$. The tree is then traversed backward, each node $t$ being updated from its children $C(t)$:

$$ \beta_t^m = (\pi v_t^m)^{-1} e^{-|x_t|^2 / v_t^m} \prod_{i \in C(t)} \sum_{r \in \Lambda} \varepsilon_i^{rm} \beta_i^r \tag{15} $$

5.2. E step : Forward

The $\alpha_t^m$ corresponding to the roots of the tree (coarsest scale $J$ of the wavelet transform) are initialized by the mixing parameters $P(s_t = m)$ of the root nodes, which are assumed to be equal to $1/N_s$. The tree is traversed forward, each node $t$ being updated according to its parent node $p(t)$:

$$ \alpha_t^m = \sum_{r \in \Lambda} \frac{\varepsilon_t^{mr} \, \alpha_{p(t)}^r \, \beta_{p(t)}^r}{\sum_{n \in \Lambda} \varepsilon_t^{nr} \beta_t^n} \tag{16} $$
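The two passes of the E step and the PM estimate of Eq. (9) can be sketched on a plain quad-tree as follows. This is a simplified illustration rather than the authors' implementation: the storage layout (one array per scale, children of node `t` at indices `4t..4t+3`), the function names, and the numerical values are assumptions; the special wavelet-to-packet link of the CWP tree is treated as an ordinary parent-child link; and no rescaling is applied, so the sketch is only numerically safe on small trees.

```python
import numpy as np

def e_step(x, v, eps):
    """Backward and forward passes on a quad-tree, then state posteriors.

    x   : list of complex arrays; x[j] holds 4**j coefficients (x[0] is the root)
    v   : list of arrays (Ns,), mixture variances of the observed coefficients
    eps : list of arrays (Ns, Ns), eps[j][m, r] = P(s_t = m | s_p(t) = r)
          for nodes t of level j; eps[0] is unused
    Returns (gamma, alpha, beta), each a list of arrays of shape (4**j, Ns).
    """
    J, ns = len(x), len(v[0])
    # Complex Gaussian kernel of every node under every state.
    g = [np.exp(-np.abs(x[j])[:, None] ** 2 / v[j]) / (np.pi * v[j])
         for j in range(J)]
    beta, msg = [None] * J, [None] * J
    beta[-1] = g[-1]                                  # leaves
    for j in range(J - 1, 0, -1):                     # backward pass
        msg[j] = beta[j] @ eps[j]                     # sum over child states
        beta[j - 1] = g[j - 1] * msg[j].reshape(-1, 4, ns).prod(axis=1)
    alpha = [np.full((1, ns), 1.0 / ns)]              # uniform root prior
    for j in range(1, J):                             # forward pass
        parent = np.repeat(alpha[j - 1] * beta[j - 1], 4, axis=0)
        alpha.append((parent / msg[j]) @ eps[j].T)
    gamma = [a * b / (a * b).sum(axis=1, keepdims=True)
             for a, b in zip(alpha, beta)]
    return gamma, alpha, beta

def posterior_mean(x, gamma, v, w):
    """PM estimate: observed variances are noise-corrected, then used as gains."""
    out = []
    for xj, gj, vj, wj in zip(x, gamma, v, w):
        v_corr = np.maximum(vj - wj, 0.0)             # subtract noise variance
        out.append((gj @ (v_corr / (v_corr + wj))) * xj)
    return out

rng = np.random.default_rng(1)
J = 3
x = [rng.standard_normal(4 ** j) + 1j * rng.standard_normal(4 ** j)
     for j in range(J)]
v = [np.array([0.5, 4.0])] * J                        # hypothetical variances
eps = [None] + [np.array([[0.7, 0.3], [0.3, 0.7]])] * (J - 1)
gamma, alpha, beta = e_step(x, v, eps)
est = posterior_mean(x, gamma, v, w=[0.4] * J)
```

A useful sanity check on such an implementation is that $\sum_m \alpha_t^m \beta_t^m$ takes the same value at every node, since it equals the likelihood of the whole tree.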



5.3. M step

Once the probabilities $\gamma_t^m$ are known, the M step is straightforward. For each subband $k$ the variances and the transition probabilities are computed by averaging over the set of nodes $\Omega_k$ of the subband:

$$ v_t^m = \frac{\sum_{i \in \Omega_k} |x_i|^2 \, \gamma_i^m}{\sum_{i \in \Omega_k} \gamma_i^m}, \qquad \varepsilon_t^{mr} = \frac{1}{\sum_{i \in \Omega_k} \gamma_{p(i)}^r} \sum_{i \in \Omega_k} \frac{\beta_i^m \, \varepsilon_i^{mr} \, \alpha_{p(i)}^r \, \beta_{p(i)}^r}{\sum_{n \in \Lambda} \alpha_i^n \beta_i^n \; \sum_{n \in \Lambda} \varepsilon_i^{nr} \beta_i^n} \tag{17} $$

6. THE DECONVOLUTION ALGORITHM

The initial deconvolution is made in the Discrete Cosine Transform (DCT) space instead of the Fourier space to avoid artefacts near the borders of the images. The algorithm is as follows:

- DCT of the observation $Y$
- Divide by $\mathcal{F}[h] + \epsilon$
- Inverse DCT of the result, which gives $\tilde X$
- CWP transform of $\tilde X$
- Initialization of the variances $v_t^m$
- Initialization of the transitions $\varepsilon_t^{mr}$ ($\varepsilon_t^{mr} = 0.7$ if $m = r$; else $0.3/N_s$)
- Repeat until convergence:
  - Backward step: compute $\{\beta_t^m\}$
  - Forward step: compute $\{\alpha_t^m\}$
  - Maximization step: update the parameters $\theta$
- PM computation to filter the noise - see Eq. (9)
- Inverse CWP transform.

7. RESULTS

The images presented in this paper are degraded by an exponential transfer function, equivalent to the following PSF model: $h_{ij} = \left(1 + (i^2 + j^2)/\mu\right)^{-1}$. The observed image is blurred ($\mu = 1$) and corrupted by noise of variance $\sigma^2 = 2$; it is shown in Fig. 3. The image reconstructed using the proposed algorithm is shown in the same figure. The following table shows the SNR improvement between observed and restored images for different models using complex wavelet packets.

image                        Gauss   Indep.   HMT
cameraman, $\sigma^2 = 2$    5.22    6.26     6.75
cameraman, $\sigma^2 = 8$    3.40    4.42     4.85
aerial, $\sigma^2 = 2$       5.35    5.40     5.68
aerial, $\sigma^2 = 8$       3.24    3.75     3.90

The number of classes is $N_s = 3$. The proposed technique exhibits a higher SNR than the independent mixture model, which is itself better than the independent Gaussian model ($N_s = 1$). This illustrates the superiority of the method exploiting the inter-scale dependencies over the other ones. The two images mentioned in this table are the image of Fig. 3 and an aerial image of an industrial area, both containing edges which have a strong persistence across scales. More images, blurring functions and noise variances have been tested; see [5] for details.

Fig. 3. Observed and restored cameraman image

8. CONCLUSION

We have proposed a deconvolution algorithm using Hidden Markov Tree modeling of the complex wavelet packet transform. It exhibits better directional and shift invariance properties than methods using real wavelet packet transforms, and outperforms the non-iterative techniques which do not take into account the dependence between scales.

9. REFERENCES

[1] M. Crouse, R. Nowak, and R. Baraniuk. Wavelet-based statistical signal processing using Hidden Markov Models. IEEE Trans. on SP, 46:886–902, 1998.

[2] A. Dempster, N. Laird, and D. Rubin. Maximum Likelihood from incomplete data via the EM algorithm. Journal of Roy. Stat. Soc. B, 39:1–38, 1977.

[3] D. Donoho and I. Johnstone. Ideal spatial adaptation via wavelet shrinkage. Biometrika, 81:425–455, 1994.

[4] A. Jalobeanu, L. Blanc-Féraud, and J. Zerubia. Satellite image deconvolution using complex wavelet packets. INRIA Research Report 3955, www.inria.fr/RRRT/RR3955.html, June 2000.

[5] A. Jalobeanu, R. Nowak, N. Kingsbury, and J. Zerubia. Multiscale Markov modeling for image deconvolution. INRIA Research Report, to appear in July 2001.

[6] J. Kalifa and S. Mallat. Bayesian inference in wavelet based methods, chapter Minimax restoration and deconvolution. Springer, 1999.

[7] N. Kingsbury. The dual-tree complex wavelet transform: a new efficient tool for image restoration and enhancement. In Proc. of EUSIPCO, pages 319–322, Rhodes, Greece, 1998.