Total Variation Regularization for Poisson Vector-Valued Image Restoration with a Spatially Adaptive Regularization Parameter Selection

Paul Rodriguez
Department of Electrical Engineering
Pontificia Universidad Católica del Perú
Lima, Peru
Email:
[email protected]
Abstract: We propose a flexible and computationally efficient method to solve the non-homogeneous Poisson (NHP) model for grayscale and color images within the TV framework. The NHP model is relevant to image restoration in several applications, such as PET, CT, MRI, etc. The proposed algorithm uses a novel method to spatially adapt the regularization parameter; it also uses a quadratic approximation of the negative log-likelihood function to pose the original problem as a non-negative quadratic programming problem. The reconstruction quality of the proposed algorithm outperforms that of state-of-the-art algorithms for the restoration of grayscale images corrupted with Poisson noise. Moreover, it places no prohibitive restriction on the forward operator, and to the best of our knowledge, the proposed algorithm is the only one that explicitly includes the NHP model for color images and that spatially adapts its regularization parameter within the TV framework.
I. INTRODUCTION

Fundamentally, almost every device for image acquisition is a photon counter: Positron Emission Tomography (PET), computed tomography (CT), magnetic resonance imaging (MRI), radiography, CCD cameras, etc. When the count level is low, the resulting image is noisy; such noise is usually modeled via the non-homogeneous Poisson (NHP) model, with the resulting noise being non-additive and pixel-intensity dependent. Given $\breve{u}$, the noise-free image, we consider each $b_k$ (pixel of $b$, the observed image) to be an independent Poisson random variable such that $E\{b_k\} = \breve{u}_k \geq 0$ and $P(b_k \,|\, \breve{u}_k) = \frac{\breve{u}_k^{b_k} e^{-\breve{u}_k}}{b_k!}$.

There are several methods (e.g. the Richardson-Lucy algorithm and its regularized variants, wavelet methods, Bayesian methods, etc.) to restore (denoise/deconvolve) non-negative grayscale images corrupted with Poisson noise. Within the scope of this paper we focus on methods based on total variation (TV). The minimization of the TV regularized Poisson log-likelihood cost functional (see (1)) has been successfully employed to restore non-negative grayscale images corrupted with Poisson noise in a number of applications, such as PET images [1], confocal microscopy images [2], [3], and others [4], [5], [6], [7], [8]. All the above mentioned references seek
to minimize

$$T(u) = \sum_m (Au)_m - b_m \log((Au)_m) + \lambda \left\| \sqrt{(D_x u)^2 + (D_y u)^2} \right\|_1 \tag{1}$$

where $b$ is the acquired image data (corrupted with Poisson noise), $A$ is a forward linear operator, and $(Au)_m$ is the $m$-th element of $(Au)$.

Currently for color images, to the best of our knowledge, besides [9] there is no algorithm that explicitly includes the NHP model within the TV framework. Furthermore, the original NHP problem (1) features a single regularization parameter ($\lambda$) that is typically chosen heuristically. There are several methods to automatically choose and/or adapt the regularization parameter, either globally or spatially, for the $\ell_2$-TV problem (see [10], [11], [12] and references therein) and for the $\ell_1$-TV problem (see [13], [14], [15]); nevertheless, to the best of our knowledge, none of these methods has been applied to the NHP model within the TV framework.

In this paper we propose to consider a TV regularized Poisson log-likelihood cost functional for vector-valued images and to include a spatially dependent regularization parameter (similar ideas have been previously used in [10], [14], [15] and others, for $\ell_1$/$\ell_2$-TV):

$$T(u) = \sum_m \gamma_m (Au)_m - \gamma_m b_m \cdot \log((Au)_m) + \frac{1}{q} \left\| \sqrt{\sum_{n \in C} (D_x u_n)^2 + (D_y u_n)^2} \right\|_q^q \tag{2}$$
where $n \in C = \{1, 2, \ldots\}$ (or any given number of elements for vector-valued images), $q$ can be different from 1 ($0 < q \leq 2$) and $\gamma_m$ is the spatial regularization parameter, which will be adaptively updated (see Section III-B). We note that if $\gamma_m = \lambda^{-1}\ \forall m$, $q = 1$ and $C = \{1\}$ then (2) is equivalent to (1).

Our proposed algorithm uses a second order Taylor approximation of the data fidelity term $F(u) = \sum_m \gamma_m (Au)_m - \gamma_m b_m \cdot \log((Au)_m)$ to pose (2) as a Non-negative Quadratic Programming problem (NQP, see [16]), which can be solved with an approach similar to that of the IRN-NQP algorithm [17], [18]. The resulting algorithm has the following advantages:
• it does not involve the solution of a linear system, but rather multiplicative updates only,
• it has the ability to handle an arbitrary number of channels in (2), including scalar (grayscale) and vector-valued (color, $C = \{1, 2, 3\}$) images as special cases,
• it spatially adapts its regularization parameter based on the local spatial variability of the restored image; furthermore most of its parameters are automatically adapted to the particular input dataset, and if needed, the norm of the regularization term ($q$ in (2)) can be different from 1 ($0 < q \leq 2$).

II. TECHNICALITIES
We represent 2-dimensional images by 1-dimensional vectors: $u_n$ ($n \in C$) is a 1-dimensional (column) vector that represents a 2D grayscale image obtained via any ordering of the image pixels (although the most reasonable choices are row-major or column-major). For $C = \{1, 2, 3\}$ we have that $u = [(u_1)^T\ (u_2)^T\ (u_3)^T]^T$ is a 1D (column) vector that represents a 2D color image.

The gradient and Hessian of the data fidelity term in (2), $F(u) = \sum_m \gamma_m (Au)_m - \gamma_m b_m \cdot \log((Au)_m)$, can be easily computed:

$$\nabla F(u) = A^T \Gamma \left( 1 - \frac{b}{Au} \right) \tag{3}$$

$$\nabla^2 F(u) = A^T \mathrm{diag}\left( \frac{\Gamma b}{(Au)^2} \right) A, \tag{4}$$

where $\Gamma = \mathrm{diag}(\gamma_m)$ is a diagonal matrix that corresponds to the spatial regularization parameter. Then $Q_F^{(k)}(u)$, the quadratic approximation of $F(u)$ (data fidelity term) evaluated at $u^{(k)}$, can be written as

$$Q_F^{(k)}(u) = F(u^{(k)}) + g^{(k)T} v^{(k)} + \frac{1}{2} v^{(k)T} H^{(k)} v^{(k)} \tag{5}$$

where $v^{(k)} = u - u^{(k)}$, $g^{(k)} = \nabla F(u^{(k)}) = A^T \Gamma^{(k)} \left( 1 - \frac{b}{Au^{(k)}} \right)$ and $H^{(k)} = \nabla^2 F(u^{(k)}) = A^T \mathrm{diag}\left( \frac{\Gamma^{(k)} b}{(Au^{(k)})^2} \right) A$.

Note that (5) remains the same for color images ($n \in C = \{1, 2, 3\}$), and that scalar operations applied to a vector are considered to be applied element-wise, so that, for example, $u = v^2 \Rightarrow u[m] = (v[m])^2$ and $u = \frac{1}{v} \Rightarrow u[m] = \frac{1}{v[m]}$. Also, for the case of color images, the linear operator $A$ in (1) or (2) is assumed to be decoupled, i.e. $A$ is a block diagonal matrix with elements $A_n$; if $A$ is coupled (inter-channel blur) due to channel crosstalk, it is possible to reduce it to a block diagonal matrix via a similarity transformation [19].
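As a numerical sanity check of (3)-(5), the following NumPy sketch evaluates $F$, its gradient and its Hessian for a small dense operator and compares $F$ with its quadratic approximation near $u^{(k)}$. The function names, the dense-matrix representation of $A$ and the toy data are assumptions made for illustration only; they are not taken from the implementation used in this paper.

```python
import numpy as np

def fidelity(u, A, b, gamma):
    """F(u) = sum_m gamma_m (Au)_m - gamma_m b_m log((Au)_m); requires Au > 0."""
    Au = A @ u
    return float(np.sum(gamma * Au - gamma * b * np.log(Au)))

def fidelity_grad(u, A, b, gamma):
    """Gradient (3): A^T Gamma (1 - b / Au), with element-wise division."""
    Au = A @ u
    return A.T @ (gamma * (1.0 - b / Au))

def fidelity_hess(u, A, b, gamma):
    """Hessian (4): A^T diag(Gamma b / (Au)^2) A."""
    Au = A @ u
    w = gamma * b / Au**2
    return A.T @ (w[:, None] * A)

def quadratic_approx(u, uk, A, b, gamma):
    """Second order Taylor approximation (5) of F around uk."""
    v = u - uk
    g = fidelity_grad(uk, A, b, gamma)
    H = fidelity_hess(uk, A, b, gamma)
    return fidelity(uk, A, b, gamma) + g @ v + 0.5 * v @ (H @ v)

# Toy data (hypothetical): a strictly positive A keeps Au > 0 for u > 0.
rng = np.random.default_rng(0)
n = 6
A = 0.5 * np.eye(n) + 0.5 / n
u_k = rng.uniform(5.0, 20.0, n)
b = rng.poisson(A @ u_k).astype(float)
gamma = rng.uniform(0.5, 2.0, n)

u_near = u_k + 1e-2 * rng.standard_normal(n)
gap = abs(fidelity(u_near, A, b, gamma)
          - quadratic_approx(u_near, u_k, A, b, gamma))
print("|F - Q_F| near u_k:", gap)
```

For image-sized problems $A$ would normally be applied as an operator (e.g. a convolution) rather than stored as a dense matrix; the dense form is used here only to keep the check short.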
In the present work $\frac{1}{q} \left\| \sqrt{\sum_{n \in C} (D_x u_n)^2 + (D_y u_n)^2} \right\|_q^q$ is the generalization of TV regularization to color images ($n \in C = \{1, 2, 3\}$) with coupled channels (see [20, Section 9], also used in [21] among others), where we note that $\sqrt{\sum_{n \in C} (D_x u_n)^2 + (D_y u_n)^2}$ is the discretization of $|\nabla u|$ for coupled channels (see [21, eq. (3)]), and $D_x$ and $D_y$ represent the horizontal and vertical discrete derivative operators respectively.

III. THE SPATIALLY ADAPTIVE POISSON IRN-NQP ALGORITHM

In this section we succinctly describe previous works (within the TV framework) that also use a second order Taylor approximation of the data fidelity term $F(u)$ to solve (1). Then we describe our proposed method to adaptively update the regularization parameter based on the local spatial variability. We then summarize the derivation of the Spatially Adaptive Poisson IRN-NQP (Iteratively Reweighted Norm, non-negative quadratic programming) algorithm, where we also provide a brief description of the NQP problem [16], and finally list the Poisson IRN-NQP algorithm.

A. Previous Related Work

1) Second Order Taylor Approximations of the NHP model within the TV framework: Within the TV framework, there are several papers [5], [6], [7] that use a second order Taylor approximation of the data fidelity term $F(u)$ to make (1) a tractable problem, but they differ in the constrained optimization algorithm used to carry out the actual minimization and in whether the true Hessian of $F(u)$ or an approximation is used. We must stress that all the above mentioned algorithms involve the solution of a linear system. In [5] the minimization is carried out via a non-negatively constrained, projected quasi-Newton minimization algorithm, and the true Hessian $\nabla^2 F(u)$ is used. In [7] a constrained TV algorithm, described in [22], is used together with the approximation $\nabla^2 F(u) \approx \eta_k I$, $\eta_k > 0$. In [6] an Expectation-Maximization TV approach is used; only the denoising problem is addressed.
2) Adaptation of the Regularization Parameter for $\ell_1$/$\ell_2$-TV: Several methods have been proposed to adapt the regularization parameter for $\ell_1$/$\ell_2$-TV; they differ in how the amount of noise (either Gaussian or salt-and-pepper) is estimated and in the nature of the regularization parameter, which can be a scalar (global) value or a spatially dependent (diagonal matrix) set of regularization parameters. Succinctly, we mention that for the $\ell_2$-TV case two adaptive regularization schemes were presented in [10] to spatially update the regularization parameters. In [11] Stein's Unbiased Risk Estimate (SURE) was used to estimate the mean square error between the observed image and the denoised one; a global regularization parameter was used. In [12] an extension of the Unbiased Predictive Risk Estimator (UPRE) was used to estimate the global regularization parameter. For the $\ell_1$-TV case most of the methods focus on estimating the noisy pixels and then perform the denoising only for the corrupted pixels; this is the case for [13] and [15] (while the former uses a global parameter, the latter spatially adapts the valid regularization parameters). Finally we mention that in [14] an $\ell_1$/$\ell_2$-TV algorithm was proposed which spatially adapts the regularization parameters based on local statistics of the noise model.
B. Spatially Adaptive Selection of the Regularization Parameter

Ideally, non-texture regions (very low local variability) of the original image could (and should) be strongly denoised, whereas texture regions (high local variability) should be only moderately denoised in order to preserve the meaningful structures within the original image. In what follows we explore this idea to work out our method to spatially adapt the regularization parameter.

1) Automatic Variance Estimation: Recently in [23] a simple and interesting method was proposed to automatically estimate the noise parameter for additive and multiplicative models that corrupt digital images with a sufficiently large amount of low-variability areas. Here we note that the method proposed in [23] can also be used to estimate the locally constant regions in a non-texture image. In what follows we summarize the main results of [23] that serve our main purpose: to estimate the locally constant regions of the restored image that should be strongly denoised.

Given a non-texture image $\breve{u}$, we assume that the observed image is corrupted with uncorrelated additive noise: $b = \breve{u} + \eta$. Moreover we assume that the noise sequence $\eta_m$ (that corrupts the original image) has zero mean and is homoskedastic: it has a constant variance $\sigma_\eta^2$. In this scenario, the local variance of the observed image (for pixel $m$) can be written as

$$\sigma_{b_m}^2 = \sigma_{\breve{u}_m}^2 + \sigma_\eta^2. \tag{6}$$

Note that if there exists a region where $\sigma_{\breve{u}_m}^2 = 0$ (i.e. a locally constant region) then it is straightforward that $\sigma_{b_m}^2 = \sigma_\eta^2$. The local sample variance of the observed image $b$ can be easily computed: given a small neighborhood of pixels around pixel $m$

$$N_m = \{ b_{n(l)} \}, \quad n(l) \in K_\rho(m), \quad l \in [1, P], \tag{7}$$

where $P = (2\rho + 1)^2$ and $K_\rho(m)$ is the set of indexes of pixels that are no further away than $\rho$ pixels from location $m$, we define

$$\hat{\mu}_{b_m}(\rho) = \mathrm{mean}(N_m), \qquad \hat{\sigma}_{b_m}^2(\rho) = \mathrm{var}(N_m), \tag{8}$$

where mean(.) and var(.) represent the sample mean and variance respectively for each pixel in image $b$. Then

$$\hat{\sigma}_\eta^2(b, \rho) = \frac{P - 3}{P - 1}\, \mathrm{mode}(\hat{\sigma}_{b_m}^2(\rho)), \tag{9}$$

is the maximum likelihood unbiased estimator for $\sigma_\eta^2$ (see [23]).

2) Spatially Adaptive Regularization Based on Locally Constant Regions: In [23] it was noted that for a non-texture noise-free image the local variance estimator (see (8)) will have a unimodal histogram; this is also true for noisy images, although their histograms will be a shifted version of the noise-free one (see [23] for details). To illustrate the previous statement, in Figure 1 we show the Barbara image (noise-free and scaled between 0 and 1) and the histogram of its local variance estimator ($\rho = 2$, see (8)).

Fig. 1. (a) Barbara (grayscale 512×512), (b) histogram of the local variance of (a), computed via (8). $\hat{\sigma}_\eta^2(b, 2) = 1.8 \cdot 10^{-4}$ (see (9)).

In the present scenario (noise-free image), if we want to segment the locally constant regions we propose to choose a threshold value that is greater than $\hat{\sigma}_\eta^2(b, \rho)$, the maximum likelihood unbiased estimator of $\sigma_\eta^2$, but smaller than the unimodal threshold of the local variance estimator, which can be automatically computed via [24]. In Figure 2 we show the result of such a segmentation for two threshold values that satisfy the above constraints.
Fig. 2. Segmented Barbara based on its local variance: (a) with threshold $0.5 \cdot (\hat{\sigma}_\eta^2(b, \rho) + \tau)$, where $\hat{\sigma}_\eta^2(b, \rho) = 1.8 \cdot 10^{-4}$ and $\tau = 6.9 \cdot 10^{-3}$ (computed via [24]), and (b) with threshold $\tau$.
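The local statistics (8), the estimator (9) and the threshold-based segmentation of locally constant regions can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: the function names, the reflective border handling and the histogram-based estimate of the mode (with a hypothetical default of 64 bins) are choices made here, not details given in the text; the scaling factor of (9) is applied as written above, and the unimodal threshold of [24] is not implemented but expected to be supplied by the caller.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_stats(b, rho=2):
    """Local sample mean and variance (8) over (2*rho+1) x (2*rho+1) windows."""
    w = 2 * rho + 1
    padded = np.pad(np.asarray(b, dtype=float), rho, mode="reflect")  # assumed border rule
    windows = sliding_window_view(padded, (w, w))   # shape (H, W, w, w)
    mean = windows.mean(axis=(-2, -1))
    var = windows.var(axis=(-2, -1), ddof=1)        # sample variance
    return mean, var

def histogram_mode(x, bins=64):
    """Mode estimated as the center of the most populated histogram bin."""
    counts, edges = np.histogram(np.ravel(x), bins=bins)
    i = int(np.argmax(counts))
    return 0.5 * (edges[i] + edges[i + 1])

def noise_variance_estimate(b, rho=2, bins=64):
    """Estimator (9): scaled mode of the local variance map."""
    P = (2 * rho + 1) ** 2
    _, var = local_stats(b, rho)
    return (P - 3.0) / (P - 1.0) * histogram_mode(var, bins)

def locally_constant_mask(b, rho, threshold):
    """Pixels whose local variance does not exceed the given threshold."""
    _, var = local_stats(b, rho)
    return var <= threshold
```

A threshold between `noise_variance_estimate(b, rho)` and the unimodal threshold $\tau$, such as the midpoint used for Fig. 2(a), would then be passed to `locally_constant_mask`.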
In general, for the case of noisy images, we should use a denoised version of the observed image $b$ in order to segment its locally constant regions. In particular, since we are dealing with images corrupted with Poisson noise and the proposed algorithm is an iterative one (see Section III-D), the current solution $u^{(k)}$ (which will be improved in the next iteration) is an approximation of $\breve{u}$ that still has some residual noise, and therefore we propose to use the local estimator of $E\{u_m^{(k)}\}$ to segment the locally constant regions at iteration $k$.

Our aim is to spatially adapt the regularization parameters, and given that (almost) locally constant regions should be strongly denoised while regions with high variability should be only moderately denoised, we propose to update the regularization parameters with the following rule: let $\Gamma^{(k)}$ and $u^{(k)}$ be the (spatial) regularization parameter and the current solution (computed using $\Gamma^{(k)}$) respectively; then we first compute $\hat{\sigma}_\eta^2(u^{(k)}, \rho)$, $\mu^{(k)} = \{\hat{\mu}_{u_m^{(k)}}(\rho)\}$ and $\sigma^{(k)} = \{\hat{\sigma}_{\mu_m^{(k)}}(\rho)\}$, and proceed to identify the pixels of $\sigma^{(k)}$ that correspond to locally constant regions: $N = \{m : \sigma_m^{(k)} \leq \alpha \cdot \hat{\sigma}_\eta^2(u^{(k)}, \rho)\}$, where $\alpha \in \left[1, \frac{\tau^{(k)}}{\hat{\sigma}_\eta^2(u^{(k)}, \rho)}\right]$ and $\tau^{(k)}$ is the unimodal threshold (computed via [24]). Finally we update the spatial regularization parameter via

$$\gamma_m^{(k+1)} = \beta_m \cdot \gamma_m^{(k)}, \qquad \beta_m = \begin{cases} \phi_m \cdot (1 - \alpha_1) + \alpha_1 & m \in N \\ \psi_m \cdot \alpha_2 + 1 & m \notin N \end{cases} \tag{10}$$

where $\phi$ and $\psi$ are the normalizations (between zero and one) of the numerical values of $\sigma^{(k)}$ whose indices belong to and do not belong to $N$ respectively.
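The update rule (10) can be sketched as below. This is only an illustration: the function names, the default values of `alpha`, `alpha1` and `alpha2`, and the convention that a set with a single distinct value normalizes to zero are assumptions, since the text does not specify them.

```python
import numpy as np

def normalize01(x):
    """Map the values of x linearly onto [0, 1] (all zeros if x is constant)."""
    if x.size == 0:
        return x.astype(float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        return np.zeros_like(x, dtype=float)
    return (x - lo) / (hi - lo)

def update_gamma(gamma, sigma, noise_var, alpha=1.0, alpha1=0.5, alpha2=0.5):
    """One application of (10).

    gamma     : current spatial regularization parameter gamma^(k)
    sigma     : local variability map sigma^(k) (same shape as gamma)
    noise_var : estimate of the residual noise variance in u^(k)
    Returns gamma^(k+1) and the boolean mask of the set N.
    """
    sigma = np.asarray(sigma, dtype=float)
    in_N = sigma <= alpha * noise_var                 # locally constant pixels
    beta = np.empty_like(sigma)
    # m in N: beta in [alpha1, 1], so gamma does not increase
    beta[in_N] = normalize01(sigma[in_N]) * (1.0 - alpha1) + alpha1
    # m not in N: beta in [1, 1 + alpha2], so gamma does not decrease
    beta[~in_N] = normalize01(sigma[~in_N]) * alpha2 + 1.0
    return beta * np.asarray(gamma, dtype=float), in_N
```

Since $\gamma_m$ weights the data fidelity term in (2), lowering it inside $N$ strengthens the relative weight of the TV term there, which matches the stated goal of denoising locally constant regions more strongly.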
C. Non-negative Quadratic Programming (NQP)

Recently [16] an interesting and quite simple algorithm has been proposed to solve the NQP problem:

$$\min_v\ \frac{1}{2} v^T \Phi v + c^T v \quad \text{s.t.}\ 0 \leq v \leq v_{max}, \tag{11}$$

where the matrix $\Phi$ is assumed to be symmetric and positive definite, and $v_{max}$ is some positive constant. The multiplicative updates for the NQP are summarized as follows (see [16] for details on derivation and convergence):

$$\Phi_{nl}^{+} = \begin{cases} \Phi_{nl} & \text{if } \Phi_{nl} > 0 \\ 0 & \text{otherwise,} \end{cases} \qquad \Phi_{nl}^{-} = \begin{cases} |\Phi_{nl}| & \text{if } \Phi_{nl} < 0 \\ 0 & \text{otherwise} \end{cases}$$

$$v^{(k+1)} = \min\left\{ v^{(k)} \left[ \frac{-c + \sqrt{c^2 + \upsilon^{(k)} \nu^{(k)}}}{2 \upsilon^{(k)}} \right],\ v_{max} \right\} \tag{12}$$

where $\upsilon^{(k)} = \Phi^{+} v^{(k)}$, $\nu^{(k)} = \Phi^{-} v^{(k)}$ and all algebraic operations in (12) are to be carried out element-wise. The NQP is quite efficient and has been used to solve interesting problems such as statistical learning [16], among others.

D. Spatially Adaptive Poisson IRN-NQP Algorithm

Our aim is to express (2) as a quadratic functional in order to solve an NQP problem. First we note that after algebraic manipulation, $Q_F^{(k)}(u)$ (see (5)) can be written as

$$Q_F^{(k)}(u) = \frac{1}{2} u^T H^{(k)} u + c^{(k)T} u + \zeta_F, \tag{13}$$

where $\zeta_F$ is a constant with respect to $u$, and $c^{(k)} = g^{(k)} - H^{(k)} u^{(k)} = A^T \Gamma^{(k)} \left( 1 - 2 \frac{b}{Au^{(k)}} \right)$.

While not derived here, the $\ell_q$ norm of the regularization term in (2) can be represented by an equivalent weighted $\ell_2$ norm (see [17], [18] for details):

$$R^{(k)}(u) = \frac{\lambda}{2} \left\| W_R^{(k)\,1/2} D u \right\|_2^2 + \zeta_R = \frac{\lambda}{2} u^T D^T W_R^{(k)} D u + \zeta_R, \tag{14}$$

where $\zeta_R$ is a constant with respect to $u$, $I_N$ is an $N \times N$ identity matrix, $\otimes$ is the Kronecker product, $C = \{1\}$, $N = 1$ or $C = \{1, 2, 3\}$, $N = 3$ and

$$D = I_N \otimes [D_x^T\ D_y^T]^T, \qquad W_R^{(k)} = I_{2N} \otimes \Omega^{(k)}, \tag{15}$$

$$\Omega^{(k)} = \mathrm{diag}\left( \tau_{R,\epsilon_R}\left( \sum_{n \in C} (D_x u_n^{(k)})^2 + (D_y u_n^{(k)})^2 \right) \right). \tag{16}$$

Following a common strategy in IRLS type algorithms [25], the function

$$\tau_{R,\epsilon_R}(x) = \begin{cases} |x|^{(q-2)/2} & \text{if } |x| > \epsilon_R \\ \epsilon_R^{(q-2)/2} & \text{if } |x| \leq \epsilon_R, \end{cases} \tag{17}$$

is defined to avoid numerical problems when $q < 2$ and $\sum_{n \in C} (D_x u_n^{(k)})^2 + (D_y u_n^{(k)})^2$ has zero-valued components.

Combining (13) and (14) we can write the quadratic approximation of (2) as (the constant terms are left out)

$$T^{(k)}(u) = \frac{1}{2} u^T \left( H^{(k)} + D^T W_R^{(k)} D \right) u + c^{(k)T} u \tag{18}$$
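Multiplicative updates of the kind summarized in Section III-C can be sketched for a small dense problem of the form (11) as follows. This is a minimal illustration, not the implementation used in this paper: the function name, the fixed iteration count and the guard against a zero denominator are assumptions. The sketch also places a factor of 4 in front of the product $\upsilon\nu$ under the square root; this is an assumption of the sketch relative to (12) as printed, made because with that factor any strictly positive fixed point satisfies $\Phi v + c = 0$, the stationarity condition of (11).

```python
import numpy as np

def nqp_multiplicative(Phi, c, v0, vmax=np.inf, iters=200):
    """Multiplicative updates for min 0.5 v^T Phi v + c^T v, 0 <= v <= vmax.

    Phi is assumed symmetric positive definite and v0 strictly positive.
    """
    Phi = np.asarray(Phi, dtype=float)
    c = np.asarray(c, dtype=float)
    Phi_pos = np.where(Phi > 0, Phi, 0.0)      # positive part of Phi
    Phi_neg = np.where(Phi < 0, -Phi, 0.0)     # magnitude of the negative part
    tiny = np.finfo(float).tiny                # guard against division by zero
    v = np.array(v0, dtype=float)
    for _ in range(iters):
        ups = Phi_pos @ v
        nu = Phi_neg @ v
        factor = (-c + np.sqrt(c**2 + 4.0 * ups * nu)) / (2.0 * np.maximum(ups, tiny))
        v = np.minimum(v * factor, vmax)       # element-wise update and box constraint
    return v

# Example: the unconstrained minimizer [1, 1] is already non-negative.
Phi = np.array([[2.0, -1.0], [-1.0, 2.0]])
c = np.array([-1.0, -1.0])
print(nqp_multiplicative(Phi, c, v0=[0.5, 0.5]))
```

Because each update only multiplies the current iterate by a non-negative factor, a strictly positive starting point never leaves the feasible set, and no linear system is solved at any stage.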
Finally, we note that by using $\Phi^{(k)} = H^{(k)} + D^T W_R^{(k)} D$ and $c^{(k)} = A^T \Gamma^{(k)} \left( 1 - 2 \frac{b}{Au^{(k)}} \right)$ we can iteratively solve (2) via (12). It is easy to check that $\Phi^{(k)}$ is symmetric and positive definite. Furthermore, we note that the proposed algorithm does not involve any matrix inversion: the fraction in the term $c^{(k)} = A^T \Gamma^{(k)} \left( 1 - 2 \frac{b}{Au^{(k)}} \right)$ is a point-wise division (a check for zero must be performed to avoid numerical problems), as is the fraction in the term $H^{(k)} = A^T \mathrm{diag}\left( \frac{\Gamma^{(k)} b}{(Au^{(k)})^2} \right) A$; also, there is no prohibitive restriction on the forward operator $A$ (e.g. $A_{n,k} \geq 0$, see [8] and references therein).

Two other key aspects of the Poisson IRN-NQP algorithm are that it can auto-adapt the threshold value $\epsilon_R$ and that it includes the NQP tolerance ($\epsilon_{NQP}^{(k)}$), used to terminate the inner loop in Algorithm 1 (see [18] for details). Experimentally, $\alpha \in [0.5, 1]$, $\gamma \in [10^{-3}, 5 \cdot 10^{-1}]$, and $L = 35$ give a good compromise between computational and reconstruction performance.

IV. EXPERIMENTAL RESULTS

For the case of grayscale images, we compared the spatially adaptive Poisson IRN-NQP algorithm with [9] as well as with two state-of-the-art methods: PIDAL-FA [8] and DFS [26]. We use the mean absolute error (MAE $= \|\hat{u} - u\|_1 / n$) as the reconstruction quality metric to match the results presented in [8], [26]. We also provide the SNR and SSIM [27] metrics whenever appropriate. For vector-valued (color) images, we compared the proposed algorithm with [9].

All simulations have been carried out using Matlab-only code on a 1.73GHz Intel core i7 CPU (L2: 6144K, RAM: 6G). Results corresponding to the spatially adaptive Poisson IRN-NQP algorithm presented here may be reproduced using the NUMIPAD (v. 0.30) distribution [28], an implementation of IRN and related algorithms.

The Cameraman and Peppers images (Figs. 
3(a) and 3(b) respectively) were scaled to a maximum value $M \in \{5, 30, 100, 255\}$, then blurred by a 7×7 uniform filter and by a 7×7 out-of-focus kernel (2D pill-box filter) respectively, and finally Poisson noise was added; this matches the setup of experiment VI.C of [8] for the Cameraman.

In Table I we show the ten-trial average Cameraman MAE, SNR and SSIM for the spatially adaptive Poisson IRN-NQP algorithm and [9], the Cameraman MAE for the PIDAL-FA [8] and DFS [26] algorithms, and the ten-trial average Peppers MAE and SNR for the spatially adaptive Poisson IRN-NQP algorithm and [9]. The reconstruction quality of the proposed algorithm outperforms that of the DFS algorithm and is better than that of the PIDAL-FA algorithm. The spatially adaptive Poisson IRN-NQP algorithm performed 8 outer loops (with $L = 45$), requiring on average (over all cases) 15 s and 240 s for the Cameraman and Peppers deconvolution respectively.

Fig. 3. Input test images: (a) Cameraman (256 × 256), (b) Peppers (512 × 512).

Fig. 4. Blurred (7 × 7 uniform filter) Cameraman (a) with M = 30 (SNR = 4.6 dB), and deconvolved version (b) via the S-A Poisson IRN-NQP (SNR = 10.59 dB, MAE = 1.18).

V. CONCLUSIONS

The reconstruction quality of the proposed algorithm outperforms that of state-of-the-art algorithms [8], [26] for the deconvolution of grayscale images corrupted with Poisson noise. One of the main features of this recursive algorithm is that it spatially adapts the regularization parameter and that it is based on multiplicative updates only. To the best of our knowledge, the proposed algorithm is the only one that explicitly includes the NHP model for color images within the TV framework.

REFERENCES

[1] E. Jonsson, S. Huang, and T. Chan, “Total-variation regularization in positron emission tomography,” UCLA CAM Report 98-48, 1998.

Initialize $u^{(0)} = b$, $\Gamma^{(0)} = \mathrm{diag}(\gamma^{(0)})$
for $k = 0, 1, \ldots$
    $W_F^{(k)} = \mathrm{diag}\left( \frac{\Gamma^{(k)} b}{(Au^{(k)})^2} \right)$, $H^{(k)} = A^T W_F^{(k)} A$
    $\Omega_R^{(k)} = \mathrm{diag}\left( \tau_{R,\epsilon_R}\left( (D_x u^{(k)})^2 + (D_y u^{(k)})^2 \right) \right)$
    $W_R^{(k)} = \begin{pmatrix} \Omega_R^{(k)} & 0 \\ 0 & \Omega_R^{(k)} \end{pmatrix}$
    $\Phi^{(k)} = H^{(k)} + D^T W_R^{(k)} D$
    $c^{(k)} = A^T \Gamma^{(k)} \left( 1 - 2 \tau_{F,\epsilon_F}\left( \frac{b}{Au^{(k)}} \right) \right)$
    $u^{(k,0)} = u^{(k)}$
    $\epsilon_{NQP}^{(k)} = \gamma \cdot \left( \frac{\| \Phi^{(k)} u^{(k,0)} + c^{(k)} \|_2}{\| c^{(k)} \|_2} \right)^{\alpha}$ (NQP tolerance)
    for $l = 0, 1, \ldots, L$
        $\upsilon^{(k,l)} = \Phi^{+(k)} u^{(k,l)}$, $\nu^{(k,l)} = \Phi^{-(k)} u^{(k,l)}$
        $u^{(k,l+1)} = \min\left\{ u^{(k,l)} \left[ \frac{-c^{(k)} + \sqrt{c^{(k)2} + \upsilon^{(k,l)} \nu^{(k,l)}}}{2 \upsilon^{(k,l)}} \right],\ v_{max} \right\}$
        if $\frac{\| \Phi^{(k)} u^{(k,l+1)} + c^{(k)} \|_2}{\| c^{(k)} \|_2}$