A Variational Approach to Maximum a Posteriori Estimation for Image Denoising

A. Ben Hamza and Hamid Krim
Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7914, USA
{abhamza,ahk}@eos.ncsu.edu
Abstract. Using first principles, we establish in this paper a connection between the maximum a posteriori (MAP) estimator and the variational formulation of optimizing a given functional subject to some noise constraints. A MAP estimator which uses a Markov or a maximum entropy random field model for a prior distribution can be viewed as a minimizer of a variational problem. Using notions from robust statistics, a variational filter called Huber gradient descent flow is proposed. It yields the solution to a Huber type functional subject to some noise constraints, and the resulting filter behaves like a total variation anisotropic diffusion for large gradient magnitudes and like an isotropic diffusion for small gradient magnitudes. Using some of the gained insight, we are also able to propose an information-theoretic gradient descent flow whose functional turns out to be a compromise between a neg-entropy variational integral and a total variation. Illustrating examples demonstrate a much improved performance of the proposed filters in the presence of Gaussian and heavy tailed noise.
1 Introduction
Linear filtering techniques abound in many image processing applications and their popularity mainly stems from their mathematical simplicity and their efficiency in the presence of additive Gaussian noise. A mean filter for example is the optimal filter for Gaussian noise in the sense of mean square error. Linear filters, however, tend to blur sharp edges, destroy lines and other fine image details, fail to effectively remove heavy tailed noise, and perform poorly in the presence of signal-dependent noise. This led to a search for nonlinear filtering alternatives. The research effort on nonlinear median-based filtering has resulted in remarkable results, and has highlighted some new promising research avenues [1]. On account of its simplicity, its edge preservation property and its robustness to impulsive noise, the standard median filter remains among the favorites for image processing applications [1]. The median filter, however, often tends to remove fine details in the image, such as thin lines and corners [1]. In recent
This work was supported by an AFOSR grant F49620-98-1-0190 and by ONR-MURI grant JHU-72798-S2 and by NCSU School of Engineering.
years, a variety of median-type filters such as stack filters, weighted median [1], and relaxed median [2] filters have been developed to overcome this drawback. In spite of their improved performance, these solutions lack the regularizing power of a prior on the underlying information of interest.

Among Bayesian image estimation methods, the MAP estimator using Markov or maximum entropy random field priors [3,4,5] has proven to be a powerful approach to image restoration. Among the limitations of MAP estimation are the difficulty of systematically and reliably choosing a prior distribution and its corresponding optimizing energy function, and, in some cases, the resulting computational complexity.

In recent years, variational methods and partial differential equation (PDE) based methods [6,7] have been introduced to explicitly account for intrinsic geometry in a variety of problems including image segmentation, mathematical morphology, and image denoising. The latter is the focus of the present paper. The problem of denoising has been addressed using a number of different techniques, including wavelets [9], order statistics based filters [1], PDE-based algorithms [6,10,11], and variational approaches [7]. A large number of PDE-based methods have in particular been proposed to tackle the problem of image denoising while preserving edges. Much of the appeal of PDE-based methods lies in the availability of a vast arsenal of mathematical tools which, at the very least, act as a key guide in achieving numerical accuracy as well as stability. Partial differential equations or gradient descent flows generally result from variational problems via the Euler-Lagrange principle [12]. One popular variational technique used in image denoising is the total variation based approach. It was developed in [10] to overcome the basic limitations of all smooth regularization algorithms, and a variety of numerical methods have also recently been developed for solving total variation minimization problems [10,13,14].

In this paper, we present a variational approach to MAP estimation. The key idea behind this approach is to use geometric insight in constructing regularizing functionals, avoiding a subjective choice of prior in MAP estimation. Using tools from robust statistics and information theory, we propose two gradient descent flows for image denoising to illustrate the resulting overall methodology.

In the next section we briefly recall the MAP estimation formulation. In Section 3, we formulate a variational approach to MAP estimation. Section 4 is devoted to a robust variational formulation. In Section 5, an entropic variational approach to MAP estimation is given, and an improved entropic gradient descent flow is proposed. Finally, in Section 6, we provide experimental results showing the much improved performance of the proposed gradient descent flows in image denoising.
2 Problem Formulation
Consider an additive noise model for an observed image

u₀ = u + η,   (1)
where u : Ω → ℝ is the original image, and Ω is a nonempty, bounded, open set in ℝ² (usually a rectangle). The noise process η is i.i.d., and u₀ is the observed image. The objective is to recover u, knowing u₀ and also some statistics of η. Throughout, x = (x₁, x₂) denotes a pixel location in Ω, |·| denotes the Euclidean norm, and ‖·‖ denotes the L²-norm. Many image denoising methods have been proposed to estimate u, and among these figure Bayesian estimation schemes [3], wavelet based methods [9], and PDE based techniques [6,10,11]. One commonly used Bayesian approach in image denoising is the maximum a posteriori (MAP) estimation method, which incorporates prior information. Denote by p(u) the prior distribution for the unknown image u. The MAP estimator is given by

û = arg max_u { log p(u₀|u) + log p(u) },   (2)
where p(u₀|u) denotes the conditional probability of u₀ given u. A general model for the prior distribution p(u) is a Markov random field (MRF), which is characterized by its Gibbs distribution given by [3]

p(u) = (1/Z) exp( −F(u)/λ ),

where Z is the partition function and λ is a constant known as the temperature in the terminology of physical systems. For large λ, the prior probability becomes flat; for small λ, it has sharp modes. F is called the energy function and has the form F(u) = Σ_{c∈C} V_c(u), where C denotes the set of cliques for the MRF, and V_c is a potential function defined on a clique. Markov random fields have been used extensively in computer vision, particularly for image restoration, and it has been established that Gibbs distributions and MRF's are equivalent [3]; in other words, if a problem can be defined in terms of local potentials, then there is a simple way of formulating it in terms of MRF's. If the noise process η is i.i.d. Gaussian, then we have

p(u₀|u) = K exp( −‖u − u₀‖²/(2σ²) ),

where K is a normalizing positive constant, and σ² is the noise variance. Thus, the MAP estimator in (2) yields

û = arg min_u { F(u) + (λ/2) ‖u − u₀‖² }.   (3)
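To make the connection concrete, below is a minimal numerical sketch (not the authors' code) of the discrete objective in Eq. (3), assuming a hypothetical quadratic clique potential over horizontal and vertical neighbor pairs; the potential choice and image size are illustrative only.

```python
import numpy as np

def energy_F(u):
    """Gibbs energy F(u): sum over neighbor cliques of V(u_i - u_j),
    with the illustrative quadratic potential V(t) = t**2 / 2."""
    dx = np.diff(u, axis=1)  # horizontal cliques
    dy = np.diff(u, axis=0)  # vertical cliques
    return 0.5 * (np.sum(dx**2) + np.sum(dy**2))

def map_objective(u, u0, lam):
    """The functional of Eq. (3): F(u) + (lambda/2) * ||u - u0||^2."""
    return energy_F(u) + 0.5 * lam * np.sum((u - u0) ** 2)

# usage: evaluate the objective at a synthetic noisy image
rng = np.random.default_rng(0)
u0 = np.zeros((32, 32)); u0[8:24, 8:24] = 1.0
u0 += 0.1 * rng.standard_normal(u0.shape)
print(map_objective(u0, u0, lam=1.0))
```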
Image estimation using MRF priors has proven to be a powerful approach to restoration and reconstruction of high-quality images. A major problem limiting its utility is, however, the lack of a practical and robust method for systematically selecting the prior distribution. The Gibbs prior parameter λ is also of particular importance since it controls the balance of influence of the Gibbs prior and that of the likelihood. If λ is too small, the prior will tend to have an over-smoothing effect on the solution. Conversely, if it is too large, the MAP estimator may be unstable, reducing to the maximum likelihood solution as λ goes to infinity. Another difficulty in using a MAP estimator is the non-uniqueness of the solution when the energy function F is not convex.
3 A Variational Approach to MAP Estimation
According to noise model (1), our goal is to estimate the original image u based on the observed image u₀ and on any knowledge of the noise statistics of η. This leads to solving the following noise-constrained optimization problem

min_u F(u)   s.t.   ‖u − u₀‖² = σ²,   (4)

where F is a given functional which is often a criterion of smoothness of the reconstructed image. Using Lagrange's theorem, the minimizer of (4) is given by

û = arg min_u { F(u) + (λ/2) ‖u − u₀‖² },   (5)

where λ is a nonnegative parameter chosen so that the constraint ‖u − u₀‖² = σ² is satisfied. In practice, the parameter λ is often estimated or chosen a priori. Equations (3) and (5) show a close connection between image recovery via MAP estimation and image recovery via optimization of variational integrals. Indeed, Eq. (3) may be written in integral form as Eq. (5). A critical issue is the choice of the variational integral F. The classical functionals (also called variational integrals) used in image denoising are the Dirichlet and the total variation integrals, defined respectively as

D(u) = (1/2) ∫_Ω |∇u|² dx   and   TV(u) = ∫_Ω |∇u| dx,   (6)

where ∇u stands for the gradient of the image u. A generalization of these functionals is the variational integral given by

F(u) = ∫_Ω F(|∇u|) dx,   (7)
where F : R+ → R is a given smooth function called variational integrand or Lagrangian [12].
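As a concrete illustration, here is a minimal discrete sketch of the two classical integrals in Eq. (6), using forward differences on a unit pixel grid (the discretization is an assumption, not specified by the paper):

```python
import numpy as np

def grad(u):
    """Forward-difference gradient with a replicated (Neumann-like) boundary."""
    ux = np.diff(u, axis=1, append=u[:, -1:])
    uy = np.diff(u, axis=0, append=u[-1:, :])
    return ux, uy

def dirichlet(u):
    """Discrete Dirichlet integral (1/2) * integral of |grad u|^2."""
    ux, uy = grad(u)
    return 0.5 * np.sum(ux**2 + uy**2)

def total_variation(u):
    """Discrete total variation, integral of |grad u|."""
    ux, uy = grad(u)
    return np.sum(np.sqrt(ux**2 + uy**2))
```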
The total variation method [10] basically consists in finding an estimate û of the original image u with the smallest total variation among all images satisfying the noise constraint ‖u − u₀‖² = σ², where σ is assumed known. Note that the regularizing parameter λ controls the balance between minimizing the data-fitting term and minimizing the regularizing term. The intuition behind the use of the total variation integral is that it incorporates the fact that discontinuities are present in the original image u (it measures the jumps of u, even if u is discontinuous). The total variation method has been used with success in image denoising, especially for denoising images with piecewise constant features while preserving the location of the edges exactly [15,16]. Using Eq. (7), we define the following functional

L(u) = F(u) + (λ/2) ‖u − u₀‖² = ∫_Ω [ F(|∇u|) + (λ/2) |u − u₀|² ] dx.   (8)

Thus, the optimization problem (5) becomes

û = arg min_{u∈X} L(u) = arg min_{u∈X} { F(u) + (λ/2) ‖u − u₀‖² },   (9)
where X is an appropriate image space of smooth functions like C¹(Ω), the space BV(Ω) of image functions of bounded variation¹, or the Sobolev space² H¹(Ω) = W^{1,2}(Ω).

3.1 Properties of the Optimization Problem
A problem is said to be well-posed in the sense of Hadamard if (i) a solution of the problem exists, (ii) the solution is unique, and (iii) the solution is stable, i.e. depends continuously on the problem data. It is ill-posed when it fails to satisfy at least one of these criteria. The following result provides conditions that guarantee the well-posedness of our minimization problem (9).

Theorem 1. Let the image space X be a reflexive Banach space, and let F be
(i) weakly lower semicontinuous, i.e. for any sequence (u_k) in X converging weakly to u, we have F(u) ≤ lim inf_{k→∞} F(u_k);
(ii) coercive, i.e. F(u) → ∞ as ‖u‖ → ∞.
Then the functional L is bounded from below and possesses a minimizer, i.e. there exists û ∈ X such that L(û) = inf_X L. Moreover, if F is convex and λ > 0, then the optimization problem (9) has a unique solution, and it is stable.

Proof. From (i) and (ii) and the weak lower semicontinuity of the L²-norm, the functional L is weakly lower semicontinuous and coercive.
¹ BV(Ω) = {u ∈ L¹(Ω) : TV(u) < ∞} is a Banach space with the norm ‖u‖_BV = ‖u‖_{L¹(Ω)} + TV(u).
² H¹(Ω) = {u ∈ L²(Ω) : ∇u ∈ L²(Ω)} is a Hilbert space with the norm ‖u‖²_{H¹(Ω)} = ‖u‖² + ‖∇u‖².
Let (u_n) be a minimizing sequence of L, i.e. L(u_n) → inf_X L. An immediate consequence of the coercivity of L is that (u_n) must be bounded. As X is reflexive, (u_n) converges weakly to some û in X, i.e. u_n ⇀ û. Thus L(û) ≤ lim inf_{n→∞} L(u_n) = inf_X L, which proves that L(û) = inf_X L. It is easy to check that convexity implies weak lower semicontinuity. Thus the solution of the optimization problem (9) exists, and it is unique because the L²-norm is strictly convex. The stability follows from the semicontinuity of L and the boundedness of (u_n).

3.2 Gradient Descent Flows
To solve the optimization problem (9), a variety of iterative methods may be applied, such as gradient descent [10] or fixed point methods [13,16]. The most important first-order necessary condition to be satisfied by any minimizer of the functional L given by Eq. (8) is that its first variation δL(u; v) vanishes at u in the direction of v, that is

δL(u; v) = (d/dε) L(u + εv) |_{ε=0} = 0,   (10)
and a solution u of (10) is called a weak extremal of L [12]. Using the fundamental lemma of the calculus of variations, relation (10) yields the Euler-Lagrange equation as a necessary condition to be satisfied by minimizers of L. This Euler-Lagrange equation is given by

−∇ · ( (F′(|∇u|)/|∇u|) ∇u ) + λ(u − u₀) = 0   in Ω,   (11)

with homogeneous Neumann boundary conditions. An image u satisfying (11) (or equivalently ∇L(u) = 0) is called an extremal of L. Note that |∇u| is not differentiable when ∇u = 0 (e.g. in flat regions of the image u). To overcome the resulting numerical difficulties, we use the slight modification |∇u| ≈ √(|∇u|² + ε), where ε is a sufficiently small positive constant. By further constraining λ, we may be in a position to sharpen the properties of the minimizer, as given in the following.

Proposition 1. Let λ = 0, and let S be a convex subset of an image space X. If the Lagrangian F is nonnegative, convex, and of class C¹ with F′(0) ≥ 0, then the global minimizer of L is a constant image.

Proof. Since F is convex and F′(0) ≥ 0, it follows that F(|∇u|) ≥ F(0). Thus a constant image is a minimizer of L. Since S is convex, it follows that this minimizer is global.
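Returning to the ε-regularization above, a minimal discrete sketch (the finite-difference scheme and the value of ε are assumptions; this helper is reused by the solver sketched below in Section 3.2):

```python
import numpy as np

# Regularized gradient magnitude sqrt(|grad u|^2 + eps), avoiding division
# by zero in g(z) = F'(z)/z on flat regions of the image.
def grad_mag(u, eps=1e-6):
    ux = np.diff(u, axis=1, append=u[:, -1:])  # forward differences,
    uy = np.diff(u, axis=0, append=u[-1:, :])  # replicated boundary
    return np.sqrt(ux**2 + uy**2 + eps)
```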
Using the Euler-Lagrange variational principle, the minimizer of (9) can be interpreted as the steady state solution of the following nonlinear elliptic PDE, called a gradient descent flow:

u_t = ∇ · ( g(|∇u|) ∇u ) − λ(u − u₀)   in Ω × ℝ⁺,   (12)
where g(z) = F′(z)/z for z > 0, with homogeneous Neumann boundary conditions. The following examples illustrate the close connection between minimization problems of variational integrals and boundary value problems for partial differential equations in the case of no noise constraint (i.e. setting λ = 0):

(a) Heat equation: u_t = Δu is the gradient descent flow for the Dirichlet variational integral

D(u) = (1/2) ∫_Ω |∇u|² dx.   (13)

(b) Perona-Malik PDE: It has been shown in [17] that the anisotropic diffusion PDE of Perona and Malik [6], given by u_t = ∇ · (g(|∇u|)∇u), is the gradient descent flow for the variational integral F_c(u) = ∫_Ω F_c(|∇u|) dx, with Lagrangians

F_c(z) = (c²/2) log(1 + z²/c²)   or   F_c(z) = (c²/2) (1 − exp(−z²/c²)),
where z ∈ ℝ⁺ and c is a positive tuning constant. The key idea behind the diffusion functions proposed by Perona and Malik is to encourage smoothing in flat regions and to diffuse less around the edges, so that small variations in the image, such as noise, are smoothed while edges are preserved. However, it has been noted that anisotropic diffusion is ill-posed, and it becomes well-posed only under certain conditions on the diffusion function or, equivalently, on the Lagrangian [17].

(c) Curvature flow: u_t = ∇ · ( ∇u/|∇u| ) is the gradient descent flow for the total variation integral

TV(u) = ∫_Ω |∇u| dx.   (14)
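The following is a minimal explicit-Euler sketch of the generic flow (12) under homogeneous Neumann boundary conditions (the discretization, step size, and iteration count are assumptions, not the authors' scheme):

```python
import numpy as np

def gradient_descent_flow(u0, g, lam=0.0, dt=0.1, n_iter=200, eps=1e-6):
    """Explicit-Euler sketch of u_t = div(g(|grad u|) grad u) - lam*(u - u0)."""
    u = u0.astype(float).copy()
    for _ in range(n_iter):
        # forward differences; replicating the boundary row/column gives
        # zero flux there, i.e. homogeneous Neumann boundary conditions
        ux = np.diff(u, axis=1, append=u[:, -1:])
        uy = np.diff(u, axis=0, append=u[-1:, :])
        mag = np.sqrt(ux**2 + uy**2 + eps)  # regularized |grad u|
        c = g(mag)                          # diffusivity g(|grad u|)
        fx, fy = c * ux, c * uy             # diffusive flux
        # divergence via backward differences (adjoint of the forward gradient)
        divx = fx.copy(); divx[:, 1:] -= fx[:, :-1]
        divy = fy.copy(); divy[1:, :] -= fy[:-1, :]
        u += dt * (divx + divy - lam * (u - u0))
    return u

# the three examples above; for strongly degenerate g (e.g. the TV flow),
# a larger eps or a smaller dt is needed for stability of the explicit scheme
heat = lambda z: np.ones_like(z)
tv = lambda z: 1.0 / z
perona_malik = lambda z, c=0.1: 1.0 / (1.0 + (z / c) ** 2)
```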
4 Robust Variational Approach
Robust estimation addresses the case where the distribution function is in fact not precisely known [18,9]. In this case, a reasonable approach would be to assume that the density is a member of some set, or some family of parametric
families, and to choose the best estimate for the least favorable member of that set. Huber [18] proposed the ε-contaminated normal set P_ε defined as

P_ε = { (1 − ε)Φ + εH : H ∈ S },

where Φ is the standard normal distribution, S is the set of all probability distributions symmetric with respect to the origin (i.e. such that H(−x) = 1 − H(x)), and ε ∈ [0, 1] is the known fraction of "contamination". Huber found that the least favorable distribution in P_ε, which maximizes the asymptotic variance (or, equivalently, minimizes the Fisher information), is Gaussian in the center and Laplacian in the tails; it switches from one to the other at a point whose value depends on the fraction of contamination ε, larger fractions corresponding to smaller switching points and vice versa. For the set P_ε of ε-contaminated normal distributions, the least favorable distribution has the density function f_H given by [18,9]

f_H(z) = ((1 − ε)/√(2π)) exp(−ρ_k(z)),   (15)

where ρ_k is the Huber M-estimator cost function given by

ρ_k(z) = { z²/2 if |z| ≤ k;  k|z| − k²/2 otherwise,   (16)

and k is a positive constant related to the fraction of contamination ε by the equation [18]

φ(k)/k − Φ(−k) = ε/(2(1 − ε)),   (17)

where Φ is the standard normal distribution function and φ is its probability density function. It is clear that ρ_k is a convex function, quadratic in the center and linear in the tails. Motivated by the robustness of the Huber M-filter in the probabilistic approach to image denoising [1], we define the Huber variational integral as

R_k(u) = ∫_Ω ρ_k(|∇u|) dx.   (18)
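As a sketch (not the authors' code), the cost (16) and the switching point k solving Eq. (17) for a given ε can be computed numerically; the root bracket below is an assumption:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def rho_k(z, k):
    """Huber cost (16): quadratic in the center, linear in the tails."""
    z = np.abs(z)
    return np.where(z <= k, 0.5 * z**2, k * z - 0.5 * k**2)

def huber_k(eps):
    """Solve phi(k)/k - Phi(-k) = eps/(2(1-eps)), i.e. Eq. (17), for k."""
    f = lambda k: norm.pdf(k) / k - norm.cdf(-k) - eps / (2.0 * (1.0 - eps))
    return brentq(f, 1e-6, 10.0)

print(huber_k(0.05))  # roughly 1.4: 5% contamination gives k near 1.4
```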
It is worth noting that the Huber variational integral is a hybrid of the Dirichlet variational integral (ρ_k(|∇u|) ∝ |∇u|²/2 as k → ∞) and the total variation integral (ρ_k(|∇u|) ∝ |∇u| as k → 0). One may check that the Huber variational integral R_k : H¹(Ω) → ℝ⁺ is well defined, convex, and coercive. It follows from Theorem 1 that the minimization problem

û = arg min_{u∈H¹(Ω)} ∫_Ω [ ρ_k(|∇u|) + (λ/2) |u − u₀|² ] dx   (19)

has a solution. This solution is unique when λ > 0.
Proposition 2. The optimization problem (19) is equivalent to

û = arg min_{(u,θ)∈H¹(Ω)×ℝ} ∫_Ω [ θ²/2 + k| |∇u| − θ | + (λ/2) |u − u₀|² ] dx.   (20)
Proof. For fixed z, define Ψ(θ) = θ²/2 + k|z − θ| on ℝ. It is clear that Ψ is convex on ℝ, and it attains its minimum at θ₀ = z if |z| ≤ k and at θ₀ = k sign(z) otherwise. Thus we have

Ψ(θ₀) = { kz − k²/2 if z > k;  z²/2 if |z| ≤ k;  −kz − k²/2 if z < −k.

It follows that ρ_k(z) = min_{θ∈ℝ} Ψ(θ). This concludes the proof.

Using the Euler-Lagrange variational principle, it follows that the Huber gradient descent flow is given by

u_t = ∇ · ( g_k(|∇u|) ∇u ) − λ(u − u₀)   in Ω × ℝ⁺,   (21)

where g_k is the Huber M-estimator weight function [18]

g_k(z) = ρ′_k(z)/z = { 1 if |z| ≤ k;  k/|z| otherwise,

and with homogeneous Neumann boundary conditions. For large k, the Huber gradient descent flow yields isotropic diffusion (the heat equation when λ = 0), and for small k it corresponds to the total variation gradient descent flow (curvature flow when λ = 0). It is worth noting that in the case of no noise constraint (i.e. setting λ = 0), the Huber gradient descent flow yields the robust anisotropic diffusion [19] obtained by replacing the diffusion functions proposed in [6] by robust M-estimator weight functions [18,1].

Chambolle and Lions [14] proposed the variational integral

Φ_ε(u) = ∫_Ω φ_ε(|∇u|) dx,

where the Lagrangian φ_ε is defined as

φ_ε(z) = { z²/(2ε) if |z| ≤ ε;  |z| − ε/2 if ε ≤ |z| ≤ 1/ε;  (ε/2)z² + (1/ε − ε)/2 if |z| ≥ 1/ε.   (22)

As mentioned in [14], the case |z| ≥ 1/ε may be dropped when dealing with discrete images, since image gradients then have bounded magnitudes. Hence, the variational integral Φ_ε becomes the Huber variational integral R_k.
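Plugging the Huber weight into the generic solver sketched in Section 3.2 (assuming `gradient_descent_flow` from that snippet is in scope) gives a sketch of the Huber flow (21):

```python
import numpy as np

def g_huber(z, k=1.345):
    """Huber M-estimator weight: 1 in the quadratic zone, k/|z| in the tails."""
    return np.where(np.abs(z) <= k, 1.0, k / np.abs(z))

# hypothetical usage, with u0 a noisy image array; k = 1.345 follows the
# experiments in Section 6:
# u_hat = gradient_descent_flow(u0, lambda z: g_huber(z, k=1.345), lam=0.1)
```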
5 Entropic Variational Approach
The maximum entropy criterion is an important principle in statistics for modelling the prior probability p(u) of an unknown image u, and it has been used with success in numerous image processing applications [4]. Suppose the available information is given by way of moments of some known functions m_r(u), r = 1, …, s. The maximum entropy principle suggests that a good choice of the prior probability p(u) is the one that has maximum entropy or, equivalently, minimum negentropy [20]:

min_p ∫ p(u) log p(u) du
s.t.  ∫ p(u) du = 1,   ∫ m_r(u) p(u) du = µ_r,  r = 1, …, s.   (23)

Using Lagrange's theorem, the solution of (23) is given by

p(u) = (1/Z) exp( −Σ_{r=1}^{s} λ_r m_r(u) ),   (24)

where the λ_r's are the Lagrange multipliers, and Z is the partition function. Thus, the maximum entropy distribution p(u) given by Eq. (24) can be used as a model for the prior distribution in MAP estimation.
5.1 Entropic Gradient Descent Flow
Motivated by the good performance of the maximum entropy method in the probabilistic approach to image denoising, we define the negentropy variational integral as

H(u) = ∫_Ω H(|∇u|) dx = ∫_Ω |∇u| log|∇u| dx,   (25)

where H(z) = z log z, z ≥ 0. Note that H(z) → 0 as z → 0⁺. It follows from the inequality z log z ≤ z²/2 that

|H(u)| ≤ ∫_Ω |∇u|² dx ≤ ‖u‖²_{H¹(Ω)} < ∞,  ∀u ∈ H¹(Ω).

Thus the negentropy variational integral H : H¹(Ω) → ℝ is well defined. Clearly, the Lagrangian H is strictly convex and coercive, i.e. H(z) → ∞ as |z| → ∞. The following result follows from Theorem 1.

Proposition 3. Let λ > 0. The minimization problem

û = arg min_{u∈H¹(Ω)} ∫_Ω [ |∇u| log|∇u| + (λ/2) |u − u₀|² ] dx

has a unique solution provided that |∇u| ≥ 1.
Using the Euler-Lagrange variational principle, it follows that the entropic gradient descent flow is given by

u_t = ∇ · ( ((1 + log|∇u|)/|∇u|) ∇u ) − λ(u − u₀)   in Ω × ℝ⁺,   (26)

with homogeneous Neumann boundary conditions.

Proposition 4. Let u be an image. The negentropy variational integral and the total variation satisfy the inequality

H(u) ≥ TV(u) − 1.

Proof. Since the negentropy H is a convex function, the Jensen inequality yields

∫_Ω H(|∇u|) dx ≥ H( ∫_Ω |∇u| dx ) = H(TV(u)) = TV(u) log TV(u),

and using the inequality z log z ≥ z − 1 for z ≥ 0, we conclude the proof.
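For completeness, a one-line sketch of the diffusivity appearing in (26), which can be passed to the generic solver from Section 3.2; note that it is negative for |∇u| < 1/e, consistent with the restriction in Proposition 3:

```python
import numpy as np

# Diffusivity of the entropic flow (26): g(z) = (1 + log z)/z.
# g(z) < 0 for z < 1/e, i.e. the flow sharpens rather than smooths there.
def g_entropic(z):
    return (1.0 + np.log(z)) / z
```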
Fig. 1. Visual comparison of some variational integrands: |∇u| − 1, |∇u| log|∇u|, |∇u|, and the Huber function, plotted against |∇u|.
5.2 Improved Entropic Gradient Descent Flow
Some variational integrands discussed in this paper are plotted in Fig. 1. From these plots, one may define a hybrid functional between the negentropy variational integral and the total variation as follows:

H̃(u) = { H(u) if |∇u| ≤ e;  TV(u) otherwise.

Note that the functional H̃ is not differentiable when the Euclidean norm of ∇u equals e (the Euler number, e = lim_{n→∞}(1 + 1/n)ⁿ ≈ 2.718). This difficulty is overcome if we replace H̃ with the functional H_TV defined as

H_TV(u) = ∫_Ω H_TV(|∇u|) dx = { H(u) if |∇u| ≤ e;  2 TV(u) − |Ω| e otherwise,   (27)

where H_TV : ℝ⁺ → ℝ is defined as

H_TV(z) = { z log z if z ≤ e;  2z − e otherwise,

and |Ω| denotes the Lebesgue measure of the image domain Ω. In the numerical implementation of our algorithms, we may assume without loss of generality that Ω = (0,1) × (0,1), so that |Ω| = 1. Note that H_TV : H¹(Ω) → ℝ is well defined, differentiable, weakly lower semicontinuous, and coercive. It follows from Theorem 1 that the minimization problem

û = arg min_{u∈H¹(Ω)} { H_TV(u) + (λ/2) ‖u − u₀‖² }   (28)

has a solution. Using the Euler-Lagrange variational principle, it follows that the improved entropic gradient descent flow is given by

u_t = ∇ · ( (H′_TV(|∇u|)/|∇u|) ∇u ) − λ(u − u₀)   in Ω × ℝ⁺,   (29)

with homogeneous Neumann boundary conditions.
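A sketch of the corresponding weight H′_TV(|∇u|)/|∇u| in (29); the two branches of H_TV match in value and slope at z = e, so the weight is continuous:

```python
import numpy as np

# Weight of the improved entropic flow (29): the entropic branch up to
# z = e, then the total-variation-like branch where H'_TV(z) = 2.
def g_improved_entropic(z):
    return np.where(z <= np.e, (1.0 + np.log(z)) / z, 2.0 / z)
```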
6 Simulation Results
This section presents simulation results where Huber, entropic, total variation and improved entropic gradient descent flows are applied to enhance images corrupted by Gaussian and Laplacian noise. The performance of a filter clearly depends on the filter type, the properties of signals/images, and the characteristics of the noise. The choice of criteria by which to measure the performance of a filter presents certain difficulties, and only gives a partial picture of reality. To assess the performance of the proposed
denoising methods, the mean square error (MSE) between the filtered and the original image is evaluated and used as a quantitative measure of performance of the proposed techniques. The regularization parameter (or Lagrange multiplier) λ for the proposed gradient descent flows is chosen proportional to the signal-to-noise ratio (SNR) in all the experiments.

In order to evaluate the performance of the proposed gradient descent flows in the presence of Gaussian noise, the image shown in Fig. 2(a) was corrupted by Gaussian white noise with SNR = 4.79 dB. Fig. 2 displays the results of filtering the noisy image shown in Fig. 2(b) by the Huber (with optimal k = 1.345), entropic, total variation, and improved entropic gradient descent flows. Qualitatively, we observe that the proposed techniques are able to suppress Gaussian noise while preserving important features in the image. The resulting MSE computations are also tabulated in Fig. 2.

Laplacian noise is somewhat heavier tailed than Gaussian noise. Moreover, the Laplace distribution is similar to Huber's least favorable distribution [9] (for the no process noise case), at least in the tails. To demonstrate the application of the proposed gradient descent flows to image denoising, qualitative and quantitative comparisons are performed to show the much improved performance of these techniques. Fig. 3(b) shows a noisy image contaminated by Laplacian white noise with SNR = 3.91 dB. The MSE results obtained by applying the proposed techniques to the noisy image are shown in Fig. 3 with the corresponding filtered images. Note that the improved entropic gradient descent flow outperforms the other flows in removing Laplacian noise. Comparison of these images clearly indicates that the improved entropic gradient descent flow preserves the image structures well while removing heavy tailed noise.
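For reference, a minimal sketch of the quantitative measures used here (the variance-ratio SNR definition below is a common convention and an assumption; the paper does not state its exact formula):

```python
import numpy as np

def mse(u_hat, u):
    """Mean square error between the filtered and the original image."""
    return np.mean((u_hat - u) ** 2)

def snr_db(u, noise):
    """SNR in dB as a variance ratio (an assumed convention)."""
    return 10.0 * np.log10(np.var(u) / np.var(noise))
```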
References

1. J. Astola and P. Kuosmanen, Fundamentals of Nonlinear Digital Filtering, CRC Press LLC, 1997.
2. A. Ben Hamza, P. Luque, J. Martinez, and R. Roman, "Removing noise and preserving details with relaxed median filters," Journal of Mathematical Imaging and Vision, vol. 11, no. 2, pp. 161-177, October 1999.
3. S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 6, no. 7, pp. 721-741, July 1984.
4. H. Stark, Ed., Image Recovery: Theory and Application, New York: Academic, 1987.
5. S.C. Zhu and D. Mumford, "Prior learning and Gibbs reaction-diffusion," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 11, pp. 1236-1244, November 1997.
6. P. Perona and J. Malik, "Scale space and edge detection using anisotropic diffusion," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 7, pp. 629-639, July 1990.
7. J.-M. Morel and S. Solimini, Variational Methods in Image Segmentation, Birkhäuser, Boston, 1995.
Gradient descent flows    MSE (SNR = 4.79)   MSE (SNR = 3.52)   MSE (SNR = 2.34)
Huber                     234.1499           233.7337           230.0263
Entropic                  205.0146           207.1040           205.3454
Total Variation           247.4875           263.0437           402.0660
Improved Entropic         121.2550           137.9356           166.4490
Fig. 2. Filtering results for Gaussian noise: (a) original image, (b) noisy image (SNR = 4.79 dB), (c) Huber flow, (d) entropic flow, (e) total variation flow, (f) improved entropic flow.
Gradient descent flows    MSE (SNR = 6.33)   MSE (SNR = 3.91)   MSE (SNR = 3.05)
Huber                     237.7012           244.4348           248.4833
Entropic                  200.5266           211.4027           217.3592
Total Variation           138.4717           176.1719           213.1221
Improved Entropic         104.4591           170.2140           208.8639
Fig. 3. Filtering results for Laplacian noise: (a) original image, (b) noisy image (SNR = 3.91 dB), (c) Huber flow, (d) entropic flow, (e) total variation flow, (f) improved entropic flow.
8. C. Samson, L. Blanc-Feraud, G. Aubert, and J. Zerubia, "A variational model for image classification and restoration," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 5, pp. 460-472, May 2000.
9. H. Krim and I.C. Schick, "Minimax description length for signal denoising and optimized representation," IEEE Trans. Information Theory, vol. 45, no. 3, pp. 898-908, April 1999.
10. L. Rudin, S. Osher, and E. Fatemi, "Nonlinear total variation based noise removal algorithms," Physica D, vol. 60, pp. 259-268, 1992.
11. I. Pollak, A.S. Willsky, and H. Krim, "Image segmentation and edge enhancement with stabilized inverse diffusion equations," IEEE Trans. Image Processing, vol. 9, no. 2, pp. 256-266, February 2000.
12. M. Giaquinta and S. Hildebrandt, Calculus of Variations I: The Lagrangian Formalism, Springer-Verlag, 1996.
13. C. Vogel and M. Oman, "Iterative methods for total variation denoising," SIAM J. Sci. Comput., vol. 17, pp. 227-238, 1996.
14. A. Chambolle and P.L. Lions, "Image recovery via total variation minimization and related problems," Numerische Mathematik, vol. 76, pp. 167-188, 1997.
15. T.F. Chan and C.K. Wong, "Total variation blind deconvolution," IEEE Trans. Image Processing, vol. 7, no. 3, pp. 370-375, March 1998.
16. K. Ito and K. Kunisch, "Restoration of edge-flat-grey scale images," Inverse Problems, vol. 16, no. 4, pp. 909-928, August 2000.
17. Y.L. You, W. Xu, A. Tannenbaum, and M. Kaveh, "Behavioral analysis of anisotropic diffusion in image processing," IEEE Trans. Image Processing, vol. 5, no. 11, pp. 1539-1553, November 1996.
18. P. Huber, Robust Statistics, John Wiley & Sons, New York, 1981.
19. M.J. Black, G. Sapiro, D.H. Marimont, and D. Heeger, "Robust anisotropic diffusion," IEEE Trans. Image Processing, vol. 7, no. 3, pp. 421-432, March 1998.
20. H. Krim, "On the distributions of optimized multiscale representations," Proc. ICASSP-97, vol. 5, pp. 3673-3676, 1997.