IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. XX, NO. Y, MONTH 1996
A Computational Algorithm for Minimizing Total Variation in Image Restoration

Yuying Li and Fadil Santosa

Abstract -- A reliable and efficient computational algorithm for restoring blurred and noisy images is proposed. The restoration process is based on the minimal total variation principle introduced by Rudin et al. [1], [2], [3]. For discrete images, the proposed algorithm minimizes a piecewise linear $l_1$ function (a measure of total variation) subject to a single 2-norm inequality constraint (a measure of data fit). The algorithm starts by finding a feasible point for the inequality constraint using a (partial) conjugate gradient method. This corresponds to a deblurring process. Noise and other artifacts are removed by a subsequent total variation minimization process. The use of the linear $l_1$ objective function for the total variation measurement leads to a simpler computational algorithm. Both the steepest descent and an affine scaling Newton method are considered for solving this constrained piecewise linear $l_1$ minimization problem. The resulting algorithm, when viewed as an image restoration and enhancement process, has the feature that it can be used in an adaptive/interactive manner in situations when knowledge of the noise variance is either unavailable or unreliable. Numerical examples are presented to demonstrate the effectiveness of the proposed iterative image restoration and enhancement process.

Keywords -- image restoration and enhancement, deconvolution, minimal total variation, affine scaling algorithm, Newton method, projected gradient method

Yuying Li is with the Department of Computer Science and the Advanced Computing Research Institute, Cornell University, Ithaca, NY 14853. Her research has been partially supported by the Applied Mathematical Sciences Research Program (KC-04-02) of the Office of Energy Research of the U.S. Department of Energy under grant DE-FG0290ER25013.A000, in part by NSF, AFOSR, and ONR through grant DMS-8920550, and by the Cornell Theory Center, which receives major funding from the National Science Foundation and IBM Corporation, with additional support from New York State and members of its Corporate Research Institute.

Fadil Santosa is with the School of Mathematics, University of Minnesota, Minneapolis, MN 55455. His research has been partially supported by the Air Force Office of Scientific Research under grant F49620-93I-0500, the Department of Energy under grant DE-FG02-94ER25225, and the National Science Foundation under grant DMS-9210489.
I. Introduction
In this paper, we propose an iterative algorithm for image restoration and enhancement. For discrete images, this algorithm minimizes a piecewise linear $l_1$ function with a single inequality constraint. The restoration process has an attractive incremental feature which makes it suitable for an adaptive/interactive algorithm.

Let $u_{\rm true}(x,y)$ specify the grey levels of an unknown black-and-white image in $\Omega \stackrel{\rm def}{=} [0,a] \times [0,b]$. Assume that we are given an initial image $u_0(x,y)$, which represents the noisy and blurred version of the true image $u_{\rm true}(x,y)$, i.e.,

$$u_0(x,y) = (\mathcal{A}u_{\rm true})(x,y) + \varepsilon(x,y), \eqno(1)$$
where $(\mathcal{A}u)(x,y)$ denotes a blurring convolution operation and $\varepsilon(x,y)$ a random white noise. The objective is to restore (to the extent possible) the original image from the noisy and blurred image $u_0$. Unless stated otherwise, we assume in this paper that the variance (or a good estimate) of the noise is given.

Many image restoration methods have been proposed since the introduction of digital image processing in the 1960s. The monograph by Lagendijk and Biemond [4] describes many of these approaches. One basic approach is to formulate the restoration problem as a constrained least squares problem. This approach was introduced by Hunt [5], and subsequent improvements have been made [6], [7], [8], [9], [10]. Roughly speaking, the constrained least squares approach specifies an operator $\mathcal{C}$ and solves the optimization problem
$$\min_u \int_\Omega (\mathcal{C}u)^2 \, dx\, dy \eqno(2)$$
$$\mbox{subject to} \quad \int_\Omega \big((\mathcal{A}u)(x,y) - u_0(x,y)\big)^2 \, dx\, dy = \sigma^2.$$
The most common choice of $\mathcal{C}$ corresponds to the Laplacian. Furthermore, certain desired properties in the restored image can be achieved by choosing $\mathcal{C}$ and by weighting the $L_2$ norms appropriately. The reader is referred to [4] for further information.
Another, more recent, approach has been introduced by Rudin et al. [1], [3], [2]. They observe that a noise corrupted image is distinguished from a noiseless one by the size of the first order derivatives of the intensity function $u(x,y)$. In [1], [3], [2], the first order derivatives are measured by $\int_\Omega \sqrt{u_x^2 + u_y^2}\, dx\, dy$ and referred to as the Total Variation. Here the subscripts $x$ and $y$ denote the corresponding partial differentiation. Consequently, they propose to restore a noisy and blurred image by minimizing total variation subject to the same constraint as in (2):

$$\min_u \int_\Omega \sqrt{u_x^2 + u_y^2}\, dx\, dy \eqno(3)$$
$$\mbox{subject to} \quad \int_\Omega \big((\mathcal{A}u)(x,y) - u_0(x,y)\big)^2 \, dx\, dy = \sigma^2.$$
It is worth pointing out that the concept is related to those introduced in [10], [9]. A solution of (3) provides the image with the least total variation among all the images with the standard deviation $\sigma$. It is important to note that the functional being minimized is the integral of the 2-norm of the gradient $[u_x, u_y]$ (i.e., the $L_1$ norm of the gradient over the domain $\Omega$). This is a departure from the earlier constrained least squares approach, where the functional to be minimized is typically the square of the 2-norm of the second order derivatives. Numerous examples in [1], [2], [3] provide convincing evidence that minimizing total variation works effectively for many image examples. A comparison of this image enhancement method with others is given in [3]. Dobson and Santosa [11] investigated the theoretical limitations of such a method.

Although both formulations (2) and (3) are nonlinear minimization problems with a single equality constraint, they are solved by very different computational algorithms. Problem (3) is usually solved by a Lagrangian function approach with an a priori chosen approximate Lagrange multiplier. Many iterative methods, e.g., [8], [5], correspond to solving the linear system below:

$$(\mathcal{A}^*\mathcal{A} + \lambda\, \mathcal{C}^*\mathcal{C})\, u = \mathcal{A}^* u_0, \eqno(4)$$

where the operator $\mathcal{A}^*$ is the adjoint of $\mathcal{A}$ and $\lambda$ is some a priori chosen parameter. A difficulty of the restoration method based on solving (4) is the determination of this important but unknown parameter $\lambda$.

In [1], [2], [3], the constrained nondifferentiable minimization problem (3) is solved by finding the steady state of a nonlinear diffusion process. The Euler-Lagrange equation, which gives a necessary condition for a minimizer of (3), is

$$0 = \frac{\partial}{\partial x}\Big(\frac{u_x}{\sqrt{u_x^2 + u_y^2}}\Big) + \frac{\partial}{\partial y}\Big(\frac{u_y}{\sqrt{u_x^2 + u_y^2}}\Big) - \lambda\, \mathcal{A}^*(\mathcal{A}u - u_0). \eqno(5)$$

A homogeneous Neumann boundary condition on $u(x,y)$ is specified. To solve the boundary value problem corresponding to the Euler-Lagrange equation, the authors propose to compute the steady state of the nonlinear diffusion equation for $u(x,y,t)$,

$$\frac{\partial u}{\partial t} = \frac{\partial}{\partial x}\Big(\frac{u_x}{\sqrt{u_x^2 + u_y^2}}\Big) + \frac{\partial}{\partial y}\Big(\frac{u_y}{\sqrt{u_x^2 + u_y^2}}\Big) - \lambda\, \mathcal{A}^*(\mathcal{A}u - u_0), \eqno(6)$$

with $u(x,y,0) = u_0(x,y)$ and homogeneous Neumann boundary conditions. The parameter $\lambda$ is chosen so that the constraint $\int_\Omega (\mathcal{A}u - u_0)^2\, dx\, dy = \sigma^2$ is satisfied, if this is possible. Here, the variable $t$ can be thought of as artificial time. In computing the steady state solution, they implemented a finite difference scheme. If the constraint $\int_\Omega (\mathcal{A}u - u_0)^2\, dx\, dy = \sigma^2$ is satisfied at each iteration, then the computation procedures in [1], [3], [2] correspond to the classical Rosen's projected gradient method [12].

There are a few difficulties with this artificial time evolution method [1], [3], [2] for (3). Firstly, nondifferentiability occurs when $u_x = u_y = 0$. In [1], [3], [2], a small perturbation has been used to avoid nondifferentiability numerically, but this may represent a significant alteration to the original cost functional. Moreover, since the objective function in (3) is not differentiable and the nonlinear constraint curvature information is not included in the projected gradient direction, the method can be slow or can even fail to converge to a solution. Secondly, there is no guarantee that the constraint $\int_\Omega (\mathcal{A}u - u_0)^2\, dx\, dy = \sigma^2$ will be satisfied, particularly when the signal-to-noise ratio (SNR) is relatively high, leading to results which may not be consistent with the data. Thirdly, it is difficult to choose time steps for both efficiency and reliability.

Attempts have been made to overcome the difficulties mentioned above. An active-set type method is proposed in [13] to deal with nondifferentiability. This method applies only to the case when $\mathcal{A}$ is the identity, i.e., when only random noise is present. Vogel employs a fixed-point method [14] to solve a slightly modified version of the total variation cost function in (3) and incorporates the constraint as a penalty term. A difficulty here, again, is the choice of the size of the penalty parameter.

Our objective is to design a reliable and efficient computational algorithm for image restoration and enhancement, based on the total variation minimization principle.
However, instead of (3), we use a slightly different measure for the first order derivatives $[u_x, u_y]$. Specifically, we consider

$$\min_{u(x,y)} \int_\Omega |u_x| + |u_y| \, dx\, dy \eqno(7)$$
$$\mbox{subject to} \quad \int_\Omega \big((\mathcal{A}u)(x,y) - u_0(x,y)\big)^2 \, dx\, dy \le \sigma^2.$$

The functional above corresponds to that in (3) when we assume that the image is made up of piecewise constant functions over an array of square pixels which are aligned with the $x$ and $y$ axes.

We prefer the problem (7) to (3) mainly for two reasons. Firstly, a linear discretization of (7) leads to a piecewise linear objective function rather than a piecewise nonlinear objective function. In particular, 1-dimensional minimization of a piecewise linear function can be done in a simple and efficient manner, and this subsequently leads to a simple line search procedure in minimizing total variation. We are aware that the loss of rotational invariance in the functional may have some effect on image restoration. Secondly, it is known that a nonlinear equality constraint is difficult to follow. The feasible region for (7) strictly includes that of (3): this allows more flexible ways of reducing the total variation. (Our numerical experience indicates that the image of the least total variation is typically on the constraint surface, which implies that the original problem has been solved.)

Our computational algorithm consists of two stages. Starting from the initial image $u_0$, we apply a conjugate gradient process until the inequality constraint in (7) is satisfied. This corresponds to a deblurring process, with possible artifacts and noise remaining. Then a second stage of total variation $\int_\Omega |u_x| + |u_y| \, dx\, dy$ minimization is performed while maintaining satisfaction of the inequality constraint. Both the steepest descent and an affine scaling Newton method will be considered for this minimization process.

We describe in Section II our conjugate gradient process for achieving inequality constraint feasibility. This is referred to as the deblurring process. The deblurring process is immediately followed by total variation minimization, see Section III. We propose a descent algorithm for minimizing the total variation $\int_\Omega |u_x| + |u_y| \, dx\, dy$. At each iteration, the total variation of an image is decreased while simultaneously maintaining feasibility of the constraint. This is achieved by following a descent direction with a possible correction for feasibility. We illustrate in Section III-A that the steepest descent direction is economical but may fail to converge or may take a large number of iterations. An alternative affine scaling Newton direction is proposed in Section III-B. This direction is much more expensive to compute but can drastically reduce the overall number of minimization iterations and provide better restoration and enhancement. Based on the incremental nature of our algorithm, we propose in Section IV to use it in an adaptive/interactive manner, suited for the situation when knowledge about the variance of the random noise is unavailable or unreliable.

To illustrate our computational algorithm, we have conducted some experiments in Matlab [15] using a Sun Sparc 2. We generate data (blurred, noisy images) by convolving a known image with a given blurring function, and adding measured amounts of random noise. The amount of noise in the data is summarized by the signal-to-noise ratio (SNR):

$$\mbox{SNR} \stackrel{\rm def}{=} 10 \log_{10} \frac{\mbox{variance of the blurred image}}{\mbox{variance of the noise}} \ \mbox{(dB)}.$$

When assessing the quality of the restored images, we consider the signal-to-noise ratio improvement

$$\mbox{eSNR} \stackrel{\rm def}{=} 10 \log_{10} \frac{\|u_{\rm true} - u_0\|_2^2}{\|u - u_{\rm true}\|_2^2} \ \mbox{(dB)}.$$

When SNR is high, a larger signal-to-noise ratio improvement eSNR suggests a better restoration.

II. Achieving Feasibility: Deblurring

Let the vector $u \in \mathbb{R}^{mn}$ be a vector representation of an $m$-by-$n$ discrete image. Similarly, the vector $u_0 \in \mathbb{R}^{mn}$ denotes the initial discrete image. Assume that $A$ is an $mn$-by-$mn$ matrix and $Au$ is a discretized approximation to the blurring convolution $(\mathcal{A}u)(x,y)$. A discretized problem of (7) can be described as

$$\min_{u \in \mathbb{R}^{mn}} \phi(u) \stackrel{\rm def}{=} \|Bu\|_1 \quad \mbox{subject to} \quad \|Au - u_0\|_2 \le \sigma. \eqno(8)$$

Here a component of the matrix vector product $Bu$ denotes either the difference $U_{i+1,j} - U_{i,j}$ or $U_{i,j+1} - U_{i,j}$, where $U$ is a matrix representation of the image $u$. The discretized problem (8) is a piecewise linear and nonlinearly constrained minimization problem. Computationally, solving this problem presents a great challenge since the discretized problem is typically very large (e.g., $B$ is 130816-by-65536 for a 256-by-256 image).

Problem (8) is a piecewise linear minimization with a single quadratic constraint. In addition to the size of the problem, there are two difficulties in solving (8). The first is achieving and maintaining feasibility for the single inequality constraint. The second is nondifferentiability of the $l_1$ function $\phi(u)$.
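To make the discrete setting concrete, the following sketch (ours, not the authors' code; all names are illustrative) builds a sparse difference matrix $B$ for an $m$-by-$n$ image with simple one-sided differences and evaluates $\phi(u) = \|Bu\|_1$. The exact row count of $B$ depends on the boundary convention; the count quoted above (130816 rows for a 256-by-256 image) indicates the paper's convention differs slightly from the 130560 rows produced here.

    # A minimal sketch of the discrete total variation phi(u) = ||B u||_1.
    import numpy as np
    import scipy.sparse as sp

    def difference_matrix(m, n):
        """Sparse B whose rows are U[i+1,j]-U[i,j] and U[i,j+1]-U[i,j]."""
        Dm = sp.diags([-np.ones(m - 1), np.ones(m - 1)], [0, 1], shape=(m - 1, m))
        Dn = sp.diags([-np.ones(n - 1), np.ones(n - 1)], [0, 1], shape=(n - 1, n))
        Im, In = sp.identity(m), sp.identity(n)
        # column-major (Fortran) vectorization: u = U.flatten(order='F')
        vertical = sp.kron(In, Dm)      # differences down each column
        horizontal = sp.kron(Dn, Im)    # differences across each row
        return sp.vstack([vertical, horizontal]).tocsr()

    def total_variation(B, u):
        return np.abs(B @ u).sum()      # phi(u) = ||B u||_1

    m, n = 256, 256
    B = difference_matrix(m, n)         # n*(m-1) + m*(n-1) = 130560 rows here
    u = np.random.rand(m * n)
    print(B.shape, total_variation(B, u))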
Traditionally, the single nonlinear inequality constraint in (2) and (3) has been handled by a Lagrangian function approach with a fixed a priori estimate of the unknown Lagrange parameter. The quality of image restoration and enhancement depends on this crucial but unknown parameter. We take a different approach. Since there is a single quadratic constraint, we choose to achieve feasibility (and thus deblurring of the image) by following a minimization process. Specifically, we consider partially minimizing the convex quadratic function

$$\min_{u \in \mathbb{R}^{mn}} \|Au - u_0\|_2^2 \eqno(9)$$

until the feasibility $\|Au - u_0\|_2 \le \sigma$ is achieved. There are many possible methods for (9), and we choose the conjugate gradient method [16]. Starting from the available corrupted image $u_0$, each conjugate gradient step decreases $\|Au - u_0\|_2$. This typically corresponds to deblurring of the initial image $u_0$. However, since we do not want to satisfy $Au = u_0$, we terminate the conjugate gradient process immediately when $\|Au - u_0\|_2 \le \sigma$. The main cost at each conjugate gradient iteration is one matrix vector product $A^T A s$. As is well known, a conjugate gradient computation will benefit greatly from a good preconditioner for the matrix $A^T A$. We subsequently refer to the computation in FIG. 1 as Stage 1. If there is only white noise corruption in the initial image $u_0$, then Stage 1 is not necessary since the inequality constraint is already satisfied.

    k = 0; r_0 = -(A^T A u_0 - A^T u_0);
    while ||A u_k - u_0||_2 > sigma
        k = k + 1;
        if k = 1
            s_1 = r_0;
        else
            beta_k = r_{k-1}^T r_{k-1} / r_{k-2}^T r_{k-2};
            s_k = r_{k-1} + beta_k s_{k-1};
        end
        y_k = A^T A s_k;
        gamma_k = s_k^T y_k;
        alpha_k = r_{k-1}^T r_{k-1} / gamma_k;
        u_k = u_{k-1} + alpha_k s_k;
        r_k = r_{k-1} - alpha_k y_k;
    end
Fig. 1. Conjugate Gradient Method for Feasibility
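In Python, the Stage 1 process of FIG. 1 might be sketched as follows; this is a minimal illustration (not the authors' implementation) under the assumption that A supports matvec and rmatvec products, as in scipy.sparse.linalg.LinearOperator. The only departure from textbook conjugate gradients on the normal equations is the early exit once ||Au - u0||_2 <= sigma, which is what makes the minimization of (9) partial.

    import numpy as np

    def cg_feasibility(A, u0, sigma, max_iter=500):
        """Run CG on min ||Au - u0||_2^2, stopping once ||Au - u0||_2 <= sigma."""
        u = u0.copy()                      # start from the corrupted image
        r = A.rmatvec(u0 - A.matvec(u))    # r_0 = -(A^T A u_0 - A^T u_0)
        s = np.zeros_like(u)
        rho_old = 1.0
        for k in range(max_iter):
            if np.linalg.norm(A.matvec(u) - u0) <= sigma:
                break                      # feasibility reached: stop deblurring
            rho = r @ r
            s = r if k == 0 else r + (rho / rho_old) * s
            y = A.rmatvec(A.matvec(s))     # y_k = A^T A s_k, the main cost
            alpha = rho / (s @ y)
            u += alpha * s
            r -= alpha * y
            rho_old = rho
        return u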
When SNR is high, the conjugate gradient process in FIG. 1 may produce a satisfactory image. In general, however, when the conjugate gradient process terminates, the computed image $u_k$ still has significant noise or artifacts; reduction of the total variation $\phi(u) = \|Bu\|_1$, as described next, is then a denoising procedure. Image (C) in FIG. 3 demonstrates the restoration of the motion blurred and noisy image (B) after the Stage 1 computation with $\sigma = 0.95\|\varepsilon\|_2$, where $\varepsilon$ is the random noise. The motion blur is an average over 9 vertical pixels, and SNR = 40. Computationally, we have found that slightly underestimating the variance of the noise tends to yield a higher signal-to-noise ratio improvement.

III. Minimizing Total Variation: Denoising

After Stage 1 as described in FIG. 1, feasibility with respect to the inequality constraint is achieved. Hence we have $u_k$ such that $\|Au_k - u_0\|_2 \le \sigma$. Now we can concentrate on decreasing the total variation $\phi(u)$ while maintaining feasibility of the inequality constraint $\|Au - u_0\|_2 \le \sigma$. Specifically, we compute a new image $u_{k+1}$ such that $u_{k+1}$ has smaller total variation $\phi(u)$ and $\|Au_{k+1} - u_0\|_2 \le \sigma$ (the image will typically remain similarly unblurred but less noisy), i.e.,

$$\phi(u_{k+1}) < \phi(u_k), \qquad \|Au_{k+1} - u_0\|_2 \le \sigma. \eqno(10)$$

As in many minimization algorithms, the new image $u_{k+1}$ can be computed using a descent method with a line search. Let $\nabla\phi(u)$ denote the gradient of the total variation $\phi(u)$. A descent direction $s_k$ satisfies $\nabla\phi(u_k)^T s_k < 0$. The basic idea of a descent method is then to follow $s_k$ and decrease the total variation $\phi(u)$ as much as possible, possibly with corrections to maintain feasibility for the inequality constraint $\|Au - u_0\|_2 \le \sigma$.

Assume for now that we have a descent direction $s_k$. We observe that the minimizer $\alpha_k$ of $\min_{\alpha > 0} \phi(u_k + \alpha s_k)$ can be computed efficiently since the function $\phi(u)$ is piecewise linear. If $u_k + \alpha_k s_k$ satisfies the inequality constraint $\|Au - u_0\|_2 \le \sigma$, then we can simply let $u_{k+1} = u_k + \alpha_k s_k$. Otherwise, we project the points $u_k + \alpha s_k$ onto the constraint surface $\|Au - u_0\|_2 = \sigma$ to find an image with less total variation $\phi(u)$. Specifically, using a simple backtracking technique, we search along

$$d(\alpha) \stackrel{\rm def}{=} u_k + \alpha s_k + \gamma(\alpha)\, A^T(Au_k - u_0) \eqno(11)$$

for an image $u$ close to $u_k + \alpha_k s_k$ but with smaller $\phi(u)$. Here $\gamma(\alpha)$ denotes the correction stepsize such that

$$\|A\, d(\alpha) - u_0\|_2 = \sigma.$$
At iteration $k$, given any $\alpha$, $\gamma(\alpha)$ can be easily computed by solving the above quadratic equation. Each time a line search with correction is performed, two matrix vector products of the form $A^T A u$ need to be computed. More details about the line search can be found in the Appendix. Our proposed descent algorithm for image restoration and enhancement works in a simple fashion and is described in FIG. 2. Next, we address the issue of how to compute a good descent direction $s_k$.
A. Steepest Descent Directions

The simplest descent direction is the negative gradient. Assume that there is no zero component in the residual $Bu_k$. Then $\nabla\phi(u) = B^T \mathrm{sign}(Bu)$, and the steepest descent direction $s_k^{SD} \stackrel{\rm def}{=} -B^T \mathrm{sign}(Bu_k)$ can be computed easily. If the matrix $B$ is explicitly available, this amounts to a very sparse matrix vector product.

In FIG. 3, Image (D) is the further enhancement of Image (C) from Stage 1 via the steepest descent method: the signal-to-noise ratio improvement rises from 4.292 to 5.662. The size of the images in FIG. 3 is 372-by-346. The stopping tolerance tol for this computation equals $0.5 \times 10^{-3}$.

Although the steepest descent method often leads to fairly good improvement, it can fail to converge or can converge extremely slowly. In FIG. 4, we keep SNR = 40 and increase the motion blurring to an average over 21 samples. Observe that the quality of Image (D) is rather unsatisfactory even though we have attained an increase in the signal-to-noise ratio improvement. There are two reasons for this. First, at a nondifferentiable point $u_k$, the steepest descent direction may lead to very small progress since the minimizer of $\min_{\alpha \ge 0} \phi(u_k + \alpha s_k)$ may be extremely small. Secondly, if the current image is close to the constraint surface, the steepest descent direction can be a poor descent direction because it does not include any curvature information of the inequality constraint. We examine a more sophisticated alternative, the affine scaling Newton direction, next.
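With the sparse $B$ constructed earlier, the steepest descent direction of this subsection is a single sparse product. A minimal sketch (ours; note that np.sign returns 0 for zero components, which corresponds to the nondifferentiable case excluded above):

    import numpy as np

    def steepest_descent_direction(B, u):
        # s_k^{SD} = -B^T sign(B u_k): one sparse matrix-vector product
        return -(B.T @ np.sign(B @ u))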
B. Affine Scaling Newton Directions

Since a general unconstrained linear $l_1$ problem has an equivalent linear programming formulation, a projected gradient type method can be used to handle its nondifferentiability, e.g., [17], [18]. At a typical iteration, a projected gradient method handles nondifferentiability by following all the nondifferentiable hyperplanes exactly, if this is possible. At a vertex, all but one of the nondifferentiable hyperplanes are followed exactly, based on Lagrange multipliers. Since the size of an image restoration problem is typically large, the possibility of visiting a combinatorial number of vertices seems formidable.

Recently, interior point methods have become a popular alternative for overcoming difficulties due to linear constraints and for avoiding a large number of iterations, e.g., [19], [20]. In addition to important theoretical advantages, an interior point method can typically solve a large problem in a small number of iterations. Affine scaling methods are particularly attractive interior point methods because of their simplicity. In particular, interior point methods for unconstrained linear $l_1$ minimization have been considered [21], [22], [23].

We choose to adapt Coleman and Li's globally and quadratically convergent algorithm [23] to (8) for the following reasons. First, this algorithm works on an $l_1$ problem directly; hence adaptation to the total variation minimization problem (8) can be done in a natural fashion. Secondly, unlike other interior point methods, the algorithm in [23] seems less sensitive to the starting point; the computed image from Stage 1 can be readily used. Thirdly, it has appealing convergence properties: global and quadratic convergence. Finally, the algorithm in [23] computationally exhibits the typical affine scaling method behavior: a small number of iterations is required to solve a large problem.

The central idea of the algorithm proposed in [23] is to employ affine scaling to generate a good descent direction so that nondifferentiability does not immediately prohibit decrease of the objective function $\phi(u)$. To adapt this approach to the image restoration problem (8), we further incorporate the constraint $\|Au - u_0\|_2 \le \sigma$ when determining a good descent direction. Let $r$ denote the residual and $g$ the sign of the residual, i.e.,

$$r \stackrel{\rm def}{=} \left[\begin{array}{c} \|Au - u_0\|_2^2 - \sigma^2 \\ Bu \end{array}\right], \eqno(12)$$

$$g \stackrel{\rm def}{=} \left[\begin{array}{c} 0 \\ \mathrm{sign}(Bu) \end{array}\right]. \eqno(13)$$
Hence nondifferentiability of $\phi(u)$ occurs when a residual component $(Bu)_i$ is zero. Similar to the motivation of the algorithm in [23] for an unconstrained $l_1$ problem, we consider the nonlinear system below, which captures optimality of (8):

$$J\lambda = 0, \qquad \mathrm{diag}(r)(g - \lambda) = 0, \eqno(14)$$

where $J \stackrel{\rm def}{=} [A^T(Au - u_0),\ B^T]$ (for more details see [23]). Here $\lambda$ is the dual multiplier vector for (8). In addition,
    Stage 1  % Deblurring
        Starting from u_0, compute an image u_k satisfying
        ||A u_k - u_0||_2 <= sigma by applying the CG method in FIG. 1;

    Stage 2  % Denoising
        Step 1  Compute a descent direction s_k for phi(u);
        Step 2  Compute a stepsize alpha_k such that d(alpha_k), defined
                in (11), satisfies the feasibility ||A d(alpha_k) - u_0||_2 <= sigma
                and phi(d(alpha_k)) < phi(u_k). Let u_{k+1} = d(alpha_k);
                If (phi(u_k) - phi(u_{k+1})) / |phi(u_k)| <= tol, terminate;
                otherwise, k := k + 1 and go to Step 1;

Fig. 2. A Computational Algorithm for Image Restoration and Enhancement
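A schematic rendering of FIG. 2 in Python might look as follows; descent_direction and line_search_with_correction are hypothetical helpers standing in for Step 1 and Step 2 (with the Appendix line search), and cg_feasibility is the Stage 1 sketch given earlier. This is a sketch of the control flow only, not the authors' code.

    def restore(A, B, u0, sigma, tol=0.5e-3):
        u = cg_feasibility(A, u0, sigma)          # Stage 1: deblurring
        phi = lambda v: abs(B @ v).sum()          # phi(u) = ||B u||_1
        while True:                               # Stage 2: denoising
            s = descent_direction(A, B, u)        # steepest descent or
                                                  # affine scaling Newton
            u_new = line_search_with_correction(A, B, u, s, u0, sigma)
            if (phi(u) - phi(u_new)) / abs(phi(u)) <= tol:
                return u_new                      # relative decrease below tol
            u = u_new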
Fig. 3. Restoration of a Blurred (Motion with 9 Samples) and Noisy Image (Image (C) from Stage 1 only and Image (D) from both stages with the steepest descent direction, $\sigma = 0.95\|\varepsilon\|_2$)
Fig. 4. Restoration of a Blurred (Motion with 21 Samples) and Noisy Image (Image (C) from Stage 1 only and Image (D) from both stages with the steepest descent direction, $\sigma = 0.95\|\varepsilon\|_2$)
dual feasibility of (8) requires that $|\lambda_i| \le 1$ for any $r_i = 0$ with $i > 1$. A Newton step for (14) is

$$\left[\begin{array}{cc} \lambda_{k,1}\, A^T A & J_k \\ \mathrm{diag}(g_k - \lambda_k)\, J_k^T & -\mathrm{diag}(r_k) \end{array}\right] \left[\begin{array}{c} s_k \\ \lambda_{k+1} \end{array}\right] = -\left[\begin{array}{c} 0 \\ \mathrm{diag}(r_k)\, g_k \end{array}\right]. \eqno(15)$$

In order for the coefficient matrices to be sufficiently nonsingular at each iteration and for $s_k$ to be a good descent direction, globalization of the Newton steps (15) is necessary. Let the parameter $\theta_k$ denote, asymptotically, a measurement of optimality, e.g.,

$$\kappa_k^{cs} \stackrel{\rm def}{=} \|\mathrm{diag}(r_k)(g_k - \lambda_k)\|_1^{\frac{1}{2}}, \qquad \kappa_k^{df} \stackrel{\rm def}{=} \max\big(\max(-\lambda_{k,1}, 0),\ \max(\max(|\lambda_k| - |g_k|, 0))\big),$$
$$\theta_k \stackrel{\rm def}{=} \frac{\max(\kappa_k^{df}, \kappa_k^{cs})}{0.9 + \max(\kappa_k^{df}, \kappa_k^{cs})}, \eqno(16)$$

where $0$ is a vector of all zeros and the operation $\max$ is as defined in Matlab [15]. Similar to the algorithm in [23], we can globalize the Newton step (15) by computing a descent direction from

$$\left[\begin{array}{cc} \big((1-\theta_k)|\lambda_{k,1}| + \theta_k\big) A^T A & J_k \\ \mathrm{diag}(|g_k - (1-\theta_k)\lambda_k|)\, J_k^T & -\mathrm{diag}(|r_k|) \end{array}\right] \left[\begin{array}{c} s_k \\ \lambda_{k+1} \end{array}\right] = -\left[\begin{array}{c} 0 \\ \mathrm{diag}(|r_k|)\, g_k \end{array}\right]. \eqno(17)$$

The coefficient matrix of the above system converges to the coefficient matrix of (15) asymptotically (thus providing good approximations to the Newton steps). Let $D_k$ denote the diagonal scaling

$$D_k \stackrel{\rm def}{=} \left[\begin{array}{cc} \mathrm{diag}\big(|g_k - (1-\theta_k)\lambda_k| \,./\, |r_k|\big)^{\frac{1}{2}} & 0 \\ 0 & \big((1-\theta_k)|\lambda_{k,1}| + \theta_k\big)^{\frac{1}{2}} I \end{array}\right],$$

where $./$ denotes componentwise division of vectors as in
Matlab [15]. A simple algebraic manipulation shows that the direction $s_k$ can be computed from a weighted least squares solve

$$D_k \left[\begin{array}{c} J_k^T \\ A \end{array}\right] s_k \stackrel{\rm LS}{=} -D_k^{-1} \left[\begin{array}{c} g_k \\ 0 \end{array}\right], \eqno(18)$$
and $\lambda_{k+1} = \mathrm{diag}(|r_k|)^{-1}\, \mathrm{diag}(|g_k - (1-\theta_k)\lambda_k|)\, J_k^T s_k + g_k$. The coefficient matrix of the least squares problem (18) is typically large and sparse with a single dense row (the first row of $J_k^T$). For efficiency, the structure of this single dense row needs to be exploited in the sparse linear least squares solve, e.g., [24].

It can be easily verified that the solution to (17) satisfies the following property:

$$J_k^T s_k = -\mathrm{diag}(|r_k|)\, \mathrm{diag}(|g_k - (1-\theta_k)\lambda_k|)^{-1} (g_k - \lambda_{k+1}),$$

where $\lambda_{k+1}$ is the new approximation to the Lagrange multiplier vector. If the coefficient matrices of (14) are always nonsingular then, under some conditions, the multipliers $\{\lambda_k\}$ are bounded. Hence nondifferentiability in $\phi(u)$ and the inequality constraint $\|Au - u_0\|_2 \le \sigma$ will not prohibit a large stepsize.

To illustrate the potential of the proposed affine scaling Newton algorithm, we apply both the steepest descent and affine scaling Newton directions to a smaller 128-by-128 image, see FIG. 5. The stopping tolerance tol for this computation equals $0.5 \times 10^{-4}$. The affine scaling Newton step is computed using a sparse least squares solve exploiting the single dense row structure. Images (C) and (D) in FIG. 5 illustrate the restored images from the steepest descent and affine scaling Newton methods, respectively. The computed image (C) using the steepest descent algorithm achieves the total variation $\phi(u) = 1.40 \times 10^5$. The affine scaling Newton algorithm yields an image (D) with the total variation $\phi(u) = 1.23 \times 10^5$. Thus, the image from the affine scaling Newton method has a higher signal-to-noise ratio improvement and a smaller total variation. Further evidence of the superior behavior of the affine scaling Newton method can be found in FIG. 6. Moreover, increasing the number of iterations for the steepest descent method will not produce better images (the steepest descent method seems unable to converge to a solution). The better quality of the affine scaling Newton enhancement comes, however, with significantly more computation: the steepest descent algorithm took 1001 CPU seconds while the Newton method took 9397 CPU seconds.
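For small problems, one globalized Newton step (17) can be formed and solved densely, as in the following sketch. It follows the reconstruction of (12)-(17) given above and is meant only to expose the structure (the scaling details should be checked against [23]); a serious implementation would instead solve the sparse weighted least squares problem (18) while exploiting the dense row [24]. Here A and B are assumed to be dense numpy arrays, and lam and theta are the current multiplier estimate and globalization parameter.

    import numpy as np

    def newton_step(A, B, u, u0, sigma, lam, theta):
        res = A @ u - u0
        r = np.concatenate(([res @ res - sigma**2], B @ u))   # (12)
        g = np.concatenate(([0.0], np.sign(B @ u)))           # (13)
        J = np.hstack([(A.T @ res)[:, None], B.T])            # J = [A^T(Au-u0), B^T]
        mn, p = J.shape
        M = (1 - theta) * abs(lam[0]) + theta                 # scalar weight on A^T A
        W = np.diag(np.abs(g - (1 - theta) * lam))            # residual scaling
        top = np.hstack([M * (A.T @ A), J])
        bot = np.hstack([W @ J.T, -np.diag(np.abs(r))])
        rhs = -np.concatenate([np.zeros(mn), np.abs(r) * g])
        sol = np.linalg.solve(np.vstack([top, bot]), rhs)     # system (17)
        return sol[:mn], sol[mn:]                             # s_k, lambda_{k+1}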
TABLE I
Comparisons of Signal-to-Noise Ratio Improvements (dB)

    Blur   SNR   Newton   Steepest Descent
      9    40     8.601        7.485
     13    40     9.449        8.024
     21    40    10.114        6.926
      9    30     6.005        4.701
     13    30     6.722        5.287
     21    30     7.437        5.355
      9    20     5.716        4.827
     13    20     5.796        4.795
     21    20     5.430        4.660
Compared to the steepest descent direction, the computation of an affine scaling Newton direction is very expensive for a large image. Moreover, at least in our current implementation, explicit matrices $A$ and $B$ are assumed. Although the matrices $A$ and $B$ are typically very sparse, computing an affine scaling Newton direction via a least squares solve can be too costly. It is crucial to investigate alternative ways (perhaps an iterative method) of computing an affine scaling Newton direction (18) by exploiting the structure of the weighted least squares problem. On the other hand, increasing computing power may make such a sophisticated algorithm practical: the affine scaling Newton method may be more attractive for parallel computation since it takes a small number of iterations, with the major cost being a weighted least squares solve per iteration. Assuming the availability of a good parallel sparse least squares solver (which can be available in some supercomputing environments), the affine scaling Newton method may be a good candidate for efficient and reliable image restoration.

In addition, one may choose to combine the two directions to utilize the advantages of both the steepest descent and the affine scaling Newton directions: use the steepest descent direction while it continues to produce good reduction of the total variation $\phi(u)$, and switch to the affine scaling Newton direction when the steepest descent direction ceases to bring noticeable decrease.

We have run several other computations with the present 128-by-128 image with different SNR and motion blurring. The signal-to-noise ratio improvements obtained from the affine scaling Newton and the steepest descent methods are summarized in Table I.
Fig. 5. Restoration of a Blurred (Motion with 21 Samples) and Noisy Image (Image (C) from the steepest descent and Image (D) from the affine scaling Newton direction, $\sigma = 0.95\|\varepsilon\|_2$)
Fig. 6. Restoration of a Blurred (Motion with 13 Samples) and Noisy Image (Image (C) from the steepest descent and Image (D) from the affine scaling Newton direction, $\sigma = 0.95\|\varepsilon\|_2$)
IV. Adaptive/Interactive Image Restoration and Enhancement
The proposed image restoration and enhancement algorithm in FIG. 2 has an attractive feature: restoration is achieved in an incremental fashion. Stage 1 corresponds to deblurring of the initial image $u_0$. The total variation (hence the noise) is decreased at each iteration in Stage 2 while the images remain deblurred.

Thus far we have assumed that $u_0$, $Au$ and $\sigma = \|\varepsilon\|_2$ are available. In practice, however, the parameter $\sigma$ is often estimated. In FIG. 7, Image (D) is computed with $\sigma = 0.95\|\varepsilon\|_2$ while Image (C) is obtained with the estimate $\sigma = 0.9\|\varepsilon\|_2$. Image (D) has lower total variation ($\phi(u_k) = 1.13 \times 10^5$) than Image (C) ($\phi(u_k) = 1.68 \times 10^5$). The signal-to-noise ratio improvement of Image (D) is higher than that of Image (C). Therefore, Image (D) is clearly preferable to Image (C). Both image restorations are computed with the affine scaling Newton method (they take roughly the same CPU time). This example demonstrates that restoration can be poor if the parameter $\sigma$ is set too small.

Since our restoration and enhancement algorithm is incremental by nature, an adaptive/interactive restoration and enhancement algorithm can be devised as follows (a sketch is given below). Starting from a parameter $\sigma$ significantly smaller than the true noise variance, Stage 1 of the algorithm in FIG. 2 is applied. The noise in the image is then reduced by Stage 2 in FIG. 2. At the end of the Stage 2 computation, if the image still appears to have too much noise or artifacts, $\sigma$ can be slightly increased and the Stage 2 computation reapplied. This can be repeated until the noise level in the computed image becomes acceptable. An important observation is that, each time the parameter $\sigma$ is increased, Stage 2 starts from the current best image and further reduces the total variation. The computation does not start from scratch. Moreover, one needs to be careful about the amount of increase of the parameter $\sigma$. Perhaps a similar scheme which allows both increases and decreases of the parameter $\sigma$ could also be useful. The adaptive procedure can be automated if an appropriate criterion measuring the quality of the restoration can be formulated.
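The adaptive procedure just described might be organized as follows; this is a schematic sketch in which stage2_denoise and the accept predicate are hypothetical stand-ins for the Stage 2 loop of FIG. 2 and the user's visual judgment, and cg_feasibility is the earlier Stage 1 sketch.

    def adaptive_restore(A, B, u0, sigma_start, grow=1.05, accept=None):
        sigma = sigma_start                          # deliberately small estimate
        u = cg_feasibility(A, u0, sigma)             # Stage 1 once
        while True:
            u = stage2_denoise(A, B, u, u0, sigma)   # Stage 2 from current best
            if accept is None or accept(u):          # e.g. visual inspection
                return u
            sigma *= grow                            # slightly enlarge sigma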
V. Concluding Remarks

Based on the minimal total variation principle introduced by Rudin et al. in [1], [2], [3], we propose a computational algorithm for image restoration and enhancement. For a discrete image, this algorithm minimizes a piecewise linear $l_1$ total variation measure $\phi(u) = \|Bu\|_1$ subject to a single inequality constraint. Our computational algorithm consists of two natural stages: a deblurring stage and a denoising stage. In Stage 1, feasibility of the single inequality constraint is achieved using a conjugate gradient process. Subsequently, the total variation is decreased at each iteration while maintaining constraint feasibility. Reduction of the total variation is achieved by a descent algorithm with a line search.

We have considered two types of descent directions, the steepest descent and the affine scaling Newton directions. The steepest descent process is typically very sequential (it takes a large number of iterations) and can fail to converge to a solution. However, a steepest descent direction is cheap to compute. Moreover, it does reduce noise significantly. Nonetheless, for severely blurred or noisy images, restoration using the steepest descent method may be inadequate. Therefore, we have considered an alternative affine scaling Newton method to overcome the difficulties of the steepest descent method due to nondifferentiability and constraint curvature. The proposed affine scaling Newton algorithm is based on the affine scaling algorithm of Coleman and Li [23] for an unconstrained linear $l_1$ problem. The main cost of the affine scaling Newton method in our current implementation is solving a weighted least squares problem. Although our affine scaling Newton algorithm is a natural extension of the algorithm proposed in [23] for an unconstrained linear $l_1$ problem, a rigorous convergence analysis remains to be done.

For large images, the computation of either the steepest descent or the affine scaling Newton direction is extensive, particularly so for the latter. Therefore, image restoration and enhancement has great potential for gaining efficiency on a parallel computer. The affine scaling Newton method takes a small number of iterations for large problems and hence may be more suitable for parallel computation. In addition, it is imperative to exploit the structure of the weighted least squares problem which defines the affine scaling Newton direction. A good iterative method (including preconditioning) for computing an affine scaling Newton direction would be very useful and worthy of research. We plan to investigate this possibility in the near future.

The incremental nature of our image restoration and enhancement algorithm leads naturally to an adaptive/interactive procedure that can be used when the amount of noise is unknown or poorly estimated. Preliminary experience with our restoration and enhancement algorithm suggests that it is effective.
Fig. 7. Image (C) with $\sigma = 0.9\|\varepsilon\|_2$ and Image (D) with $\sigma = 0.95\|\varepsilon\|_2$
Additional a priori constraints can be easily incorporated in our formulation and computational methods.

Finally, we would like to comment on the potential of our computational method for other applications. The total variation approach can be applied to a variety of inverse problems for which resolution loss or blurring is inherent. Examples of such problems include tomography and electrical impedance imaging, e.g., [25]. The present computational approach can be adapted naturally to such inverse problems.

Acknowledgements
The authors express their gratitude to Tom Coleman and Gonzalo Arce for helpful comments on this work, and to Chunguang Sun for use of the Matlab sparse least squares solve routines. We are also grateful to the anonymous referees for their useful suggestions and for pointing out a few related works.

Appendix

Line Search for Decreasing Total Variation and Maintaining Blurring Constraint Feasibility

Given a descent direction $s_k$ for $\phi(u)$, we provide an example of a line search procedure for reducing the total variation and maintaining constraint feasibility. First, we temporarily ignore the quadratic constraint $\|Au - u_0\|_2 \le \sigma$ and compute the minimizer $\alpha_k$ of the total variation $\phi(u_k + \alpha s_k)$. This can be done in a straightforward manner since the function $\phi(u)$ is piecewise linear. Let $d \stackrel{\rm def}{=} B s_k$, $r \stackrel{\rm def}{=} B u_k$, and $g \stackrel{\rm def}{=} \mathrm{sign}(B u_k)$. We examine the breakpoints of the piecewise linear function $\phi(u)$ along $s_k$:

$$BR \stackrel{\rm def}{=} \Big\{ -\frac{r_{k_i}}{d_{k_i}} : -\frac{r_{k_i}}{d_{k_i}} > 0 \Big\}.$$

For simplicity, we assume that $BR$ has dimension $t$ and that the values are arranged in strictly increasing fashion; moreover, $BR(i) = -r_{k_i}/d_{k_i}$. Then we can move along the direction $s_k$ and check whether $s_k$ continues to be a descent direction after crossing each breakpoint $BR(i)$. If $\|A(u_k + \alpha_k s_k) - u_0\|_2 \le \sigma$, then $\phi(u_k + \alpha_k s_k) < \phi(u_k)$. This stepsize is further backtracked slightly to avoid exact nondifferentiability, and the line search is done. This is summarized below.
    % Compute the minimizer alpha_k of phi(u_k + alpha s_k)
    ratio := g' d; i := 1; alpha := 0; alpha_1 := 0;
    while ratio < 0 and i <= length(BR) do
        ratio := ratio - 2 g(i) d(i);
        alpha_1 := alpha; alpha := BR(i); i := i + 1;
    end
    alpha := alpha_1 + max(0.9, 1 - alpha) (alpha - alpha_1);
    u_new := u_k + alpha s;
    if ||A u_new - u_0||_2 <= sigma return end;

If $u_k + \alpha_k s_k$ is outside the constraint $\|Au - u_0\|_2 \le \sigma$, then the point $u_k + \alpha s_k$ is projected onto the constraint surface $\|Au - u_0\|_2 = \sigma$ (along its normal direction) in a backtracking fashion to compute an image with smaller total variation $\phi(u)$. Assume that gamma(u, s, alpha, t) returns the value $\gamma$ such that

$$\|A(u + \alpha s + \gamma t) - u_0\|_2^2 - \sigma^2 = 0.$$

The following is an example of such a backtracking procedure.

    % Project onto the constraint surface ||A u - u_0||_2 = sigma
    t := A' (A u - u_0);
    gamma := gamma(u, s, alpha, t);
    u_new := u + alpha s + gamma t;
    while phi(u_new) > phi(u) do
        alpha := 0.1 alpha;
        gamma := gamma(u, s, alpha, t);
        u_new := u + alpha s + gamma t;
    end
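The correction stepsize gamma(u, s, alpha, t) reduces to a scalar quadratic equation in $\gamma$. A minimal Python sketch (ours; choosing the smaller-magnitude root is one reasonable convention for the smallest correction back to the constraint surface):

    import numpy as np

    def gamma(A, u, s, alpha, t, u0, sigma):
        w = A @ (u + alpha * s) - u0
        v = A @ t
        # ||w + g*v||^2 = sigma^2  =>  (v.v) g^2 + 2 (w.v) g + (w.w - sigma^2) = 0
        a, b, c = v @ v, 2 * (w @ v), w @ w - sigma**2
        disc = b * b - 4 * a * c
        if disc < 0:
            raise ValueError("no intersection with the constraint surface")
        g1 = (-b - np.sqrt(disc)) / (2 * a)
        g2 = (-b + np.sqrt(disc)) / (2 * a)
        return g1 if abs(g1) < abs(g2) else g2   # smaller correction step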
Yuying Li received her B.S. in Mathematics from Sichuan University in the People's Republic of China, and her M.S. and Ph.D. in Computer Science from the University of Waterloo. She is now a senior research associate in the Computer Science Department at Cornell University. Her research interests include both continuous and nondifferentiable optimization. She has also been interested in application problems such as tomographic inversion and image restoration.
Fadil Santosa received his B.S. degree in Mechanical Engineering from the University of New Mexico, and his M.S. and Ph.D. degrees in Theoretical and Applied Mechanics from the University of Illinois. He has held positions at Cornell University and the University of Delaware. He is now Professor of Mathematics in the School of Mathematics at the University of Minnesota, where he is also serving as the Associate Director of the Minnesota Center for Industrial Mathematics. His research interests are inverse problems arising in nondestructive evaluation and geophysics, wave propagation, and image restoration.
References

[1] L. I. Rudin and S. Osher, "Feature-oriented image enhancement using shock filters", SIAM J. Numer. Anal., vol. 27, pp. 919-940, 1990.
[2] L. I. Rudin, S. Osher, and C. Fu, "Total variation based restoration of noisy blurred images", SIAM J. Numer. Anal., to appear.
[3] L. I. Rudin, S. Osher, and E. Fatemi, "Nonlinear total variation based noise removal algorithms", Physica D, vol. 60, pp. 259-268, 1992.
[4] R. L. Lagendijk and J. Biemond, Iterative Identification and Restoration of Images, Kluwer Academic Publishers, 1991.
[5] B. R. Hunt, "The application of constrained least squares estimation to image restoration by digital computer", IEEE Transactions on Computers, vol. C-22, pp. 805-812, 1973.
[6] A. K. Katsaggelos, J. Biemond, and D. E. Boekee, "Regularized iterative restoration with ringing reduction", IEEE Trans. Acoust., Speech and Signal Processing, vol. 36, no. 12, pp. 1874-1888, 1988.
[7] A. K. Katsaggelos, J. Biemond, R. W. Merserau, and R. M. Schafer, "A general formulation of constrained iterative image restoration", in Proceedings ICASSP-85, March 1985, pp. 700-703.
[8] A. K. Katsaggelos, J. Biemond, R. W. Schafer, and R. M. Merserau, "A regularized iterative image restoration algorithm", IEEE Transactions on Signal Processing, vol. 39, pp. 914-929, 1991.
[9] C. Bouman and K. Sauer, "A generalized Gaussian model for edge preserving MAP estimation", IEEE Trans. on Image Processing, vol. 2, no. 3, pp. 298-310, 1993.
[10] S. Alliney and S. Ruzisky, "An algorithm for the minimization of mixed l1 and l2 norms with applications to Bayesian estimation", IEEE Trans. on Signal Processing, vol. 42, no. 3, pp. 618-627, 1994.
[11] D. C. Dobson and F. Santosa, "Recovery of blocky images from noisy and blurred data", Tech. Rep. 94-7, University of Delaware Wave Center, 1994; submitted to SIAM J. Appl. Math.
[12] J. G. Rosen, "The gradient projection method for nonlinear programming, part II, nonlinear constraints", J. Soc. Ind. Appl. Math., vol. 9, pp. 514-532, 1961.
[13] K. Ito and K. Kunisch, "An active set strategy for image restoration based on the augmented Lagrangian formulation", Tech. Rep.
[14] C. R. Vogel and M. E. Oman, "Iterative methods for total variation denoising", Tech. Rep.
[15] The MathWorks Inc., Matlab Reference Guide, The MathWorks, Natick, Mass., 1992.
[16] G. H. Golub and C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press, 1989.
[17] I. Barrodale and F. D. K. Roberts, "An improved algorithm for discrete l1 linear approximation", SIAM J. Numer. Anal., vol. 10, pp. 839-848, 1973.
[18] R. H. Bartels and A. R. Conn, "An approach to nonlinear l1 data fitting", Tech. Rep. CS-81-17, Computer Science Department, University of Waterloo, 1981.
[19] N. Karmarkar, "A new polynomial-time algorithm for linear programming", Combinatorica, vol. 4, pp. 373-395, 1984.
[20] E. R. Barnes, "A variation on Karmarkar's algorithm for solving linear programming problems", Mathematical Programming, vol. 36, pp. 174-182, 1986.
[21] M. Meketon, "Least absolute value regression", Tech. Rep., AT&T Bell Laboratories, Holmdel, 1988.
[22] Y. Zhang, "A primal-dual interior point approach for computing the l1 and l∞ solutions of overdetermined linear systems", Tech. Rep., Department of Mathematics and Statistics, University of Maryland, 1991.
[23] T. F. Coleman and Y. Li, "A globally and quadratically convergent affine scaling method for linear l1 problems", Mathematical Programming, vol. 56, pp. 189-222, 1992.
[24] C. Sun, "Dealing with dense rows in the solution of sparse linear least squares problems", Tech. Rep. CTC95TR227, Advanced Computing Research Institute, Cornell University, 1995.
[25] D. C. Dobson and F. Santosa, "An image enhancement technique for electrical impedance tomography", Inverse Problems, vol. 10, pp. 317-334, 1994.