J. Inverse Ill-Posed Probl. 7 (2016), 1–24 DOI 10.1515/jip-2016-111
© de Gruyter 2016
A proximal iteratively regularized Gauss-Newton method for nonlinear inverse problems

Hongsun Fu*, Hongbo Liu*, Bo Han, Yu Yang and Yi Hu

Abstract. In this paper we discuss the construction, convergence analysis, and implementation of a proximal iteratively regularized Gauss-Newton method for the solution of nonlinear inverse problems with a specific regularization that linearly combines $L^2$-norm and $L^1$-norm penalties. This regularization combines two very powerful features: the advantages of the $L^1$-norm based penalty, which imposes less smoothing on the reconstructed parameter, and the general $L^2$-norm stabilizing term, which can lead to smaller errors in some cases. However, the nonlinearity and nonsmoothness of the problem make it challenging to find an efficient numerical solution. By using the proximal mapping, we derive a generalization of the iteratively regularized Gauss-Newton algorithm that handles such nonsmooth objective functions. A local convergence analysis is carried out in the presence of observation noise. Parameter identification in numerical simulations of partial differential equations demonstrates the efficiency of the proposed method.

Keywords. Nonlinear ill-posed problems, proximal regularized Gauss-Newton method, parameter identification.

2010 Mathematics Subject Classification. 34A55, 47J06, 65F22.
1 Introduction

In this paper we consider nonlinear inverse problems that can be formulated as the operator equation
\[
F(x) = y, \tag{1.1}
\]
where $F : D(F) \subset \mathcal{X} \to \mathcal{Y}$ is a nonlinear operator between the Hilbert spaces $\mathcal{X}$ and $\mathcal{Y}$ with domain $D(F)$ and range $R(F)$, and $\mathcal{X}$ and $\mathcal{Y}$ are function spaces on the bounded domains $\Omega \subset \mathbb{R}^n$ and $\omega \subset \mathbb{R}^m$, respectively. Here we assume that $\mathcal{X}$ is compactly embedded into $L^2(\Omega)$, and $\mathcal{Y}$ is either $L^2(\omega)$ or $H^1(\omega)$. This problem is motivated by the parameter identification problems for partial differential equations in [9].

This work is partially supported by the National Natural Science Foundation of China (Grant Nos. 41304092, 41474102, 61472058), the Program for New Century Excellent Talents in University (Grant No. NCET-11-0861), the Fundamental Research Funds for the Central Universities (Grant No. 3132014226), and the China Scholarship Council (Grant No. 201506570019).
Inverse problems are usually ill-posed in the sense that small perturbations in the observed data may result in large deviations in the solution [10]. In many applications, the perturbations come from measurement errors caused by noise. Thus, given the noisy data, it is important to find a stable approximation to the solution. In practice, the available data in (1.1) is an approximation $y^\delta$ to $y$ satisfying
\[
\|y^\delta - y\|_{\mathcal{Y}} \le \delta, \tag{1.2}
\]
where $\delta > 0$ is a given small noise level, and $\|\cdot\|_{\mathcal{Y}}$ is the norm induced by the inner product on $\mathcal{Y}$. Due to the ill-posedness, problem (1.1) has to be stabilized by regularization methods [33]. The most commonly used regularization method for nonlinear problems is Tikhonov regularization, i.e., to seek an $x_\alpha^\delta$ (as an approximation to the solution of (1.1)) that minimizes the quadratic functional
\[
\frac{1}{2}\|F(x) - y^\delta\|_2^2 + \frac{\alpha}{2}\|x - x_0\|_2^2,
\]
where $\|\cdot\|_2$ is the $L^2$-norm, $\alpha > 0$ is the regularization parameter, and $x_0$ is an initial guess that plays the role of a selection criterion. The second term in the functional determines the properties of the solution. The minimizer is easy to compute, usually by the Gauss-Newton method, owing to the convexity and differentiability of $\|\cdot\|_2^2$. The downside, however, is that the $L^2$-norm penalty $\frac{\alpha}{2}\|x - x_0\|_2^2$ can cause considerable over-smoothing of the solution. Thus $L^2$-norm constraints are not suitable if the solution is sparse or spatially inhomogeneous (containing discontinuities).

In many applications such as medical image restoration and parameter identification, regularization methods using $L^1$-norms or total variation (TV) have become popular alternatives to $L^2$-norm regularization. To be precise, for some regularization parameter $\beta > 0$, an approximation to the solution of (1.1) is obtained as a minimizer of
\[
J_\beta(x) = \frac{1}{2}\|F(x) - y^\delta\|_2^2 + \beta\|W(x - x_0)\|_1, \tag{1.3}
\]
where $\|\cdot\|_1$ is the $L^1$-norm and $W$ is a regularization matrix describing the prior distribution. Common choices for $W$ are the identity matrix or a matrix approximating the first or second order derivative operator (see [18, 20]). Specifically, if $W$ is a discretized representation of the gradient operator $D$, the $L^1$-norm regularization leads to total variation regularization, and if $W$ is the identity matrix $I$, it leads to sparsity regularization. The advantage of these methods is that they impose less smoothness on the solution, which is useful for discontinuity detection and sparsity recovery. Because of its central importance in inverse problems and signal processing, the efficient minimization of
the functional $J_\beta$ has received much attention, and a wide variety of numerical algorithms have been proposed in [8, 17, 21, 26, 30]. The downside of these methods is that they often require inverting potentially ill-conditioned operators, which leads to numerical problems. One possible remedy is to regularize (1.3), e.g. by Tikhonov regularization. In this paper, in order to address both issues, we combine the traditional Tikhonov regularization with the $L^1$-norm penalty and investigate a hybrid regularization scheme:
\[
J_{\alpha,\beta}(x) = \frac{1}{2}\|F(x) - y^\delta\|_2^2 + \frac{\alpha}{2}\|x - x_0\|_2^2 + \beta\|W(x - x_0)\|_1, \tag{1.4}
\]
with appropriately chosen regularization parameters $\alpha, \beta > 0$. The motivation for the hybrid regularization scheme originates in the results of Jin et al. [22], Mazzieri et al. [24], and Borsic and Adler [6]. This is a nonlinear and nonsmooth composite minimization problem. Our main goal is to resolve the computational obstacles posed by the non-differentiability of the $L^1$-norm. Motivated by the iteratively regularized Gauss-Newton method [2], we extend the regularized Gauss-Newton algorithm to this nonsmooth optimization problem through a simple proximal mapping and establish its local convergence (see Theorem 3.5).

The rest of this paper is organized as follows. In Section 2, we fix notation, discuss the well-posedness of the hybrid regularization (1.4), and give some preliminaries on proximity operators that are needed for our method. In Section 3, we formulate the proximal regularized Gauss-Newton method; its convergence is analyzed, and the implementation of our algorithm is presented in detail. In Section 4, we report numerical experiments that support the theoretical analysis. In Section 5, we conclude the paper with some remarks.
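To fix ideas before moving on, the discretized form of (1.4) is straightforward to evaluate. The following minimal NumPy sketch does so for a generic discrete forward map; the linear demo operator `A`, the routine name `hybrid_functional`, and the first-difference matrix are illustrative assumptions for this sketch, not part of the paper.

```python
import numpy as np

def hybrid_functional(forward, x, y_delta, x0, W, alpha, beta):
    """Discrete version of J_{alpha,beta} in (1.4):
    0.5*||F(x) - y^delta||_2^2 + 0.5*alpha*||x - x0||_2^2 + beta*||W(x - x0)||_1."""
    residual = forward(x) - y_delta              # data misfit F(x) - y^delta
    fidelity = 0.5 * np.dot(residual, residual)
    l2_pen = 0.5 * alpha * np.dot(x - x0, x - x0)
    l1_pen = beta * np.abs(W @ (x - x0)).sum()
    return fidelity + l2_pen + l1_pen

# Illustrative use with a linear forward map (an assumption for the demo):
n = 50
A = np.random.default_rng(0).standard_normal((n, n)) / np.sqrt(n)
W = np.eye(n) - np.eye(n, k=-1)                  # first-difference matrix (TV-like W = D)
x_true = np.ones(n); x_true[20:30] = 2.0         # piecewise-constant parameter
y_delta = A @ x_true + 0.01 * np.random.default_rng(1).standard_normal(n)
print(hybrid_functional(lambda x: A @ x, x_true, y_delta, np.zeros(n), W, 1e-3, 1e-3))
```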
2 Preliminaries

2.1 Well-posedness for hybrid regularization

We first address the well-posedness of the problem (1.4). For simplicity, we introduce the functional $R_\eta$ defined by
\[
R_\eta(x) = \frac{1}{2}\|x - x_0\|_2^2 + \eta\|W(x - x_0)\|_1, \quad \text{for some } \eta > 0.
\]
Throughout this paper we will need the following definition of $R_\eta$-minimizing solutions, as well as some additional assumptions.

Definition 2.1. An element $x^\dagger$ is said to be an $R_\eta$-minimizing solution to the inverse problem (1.1) if
\[
R_\eta(x^\dagger) = \inf\{R_\eta(x) : F(x) = y\} < \infty.
\]
Assumption 2.2. (A1) $F$ is weakly closed, continuous, and Fréchet differentiable at every $x \in D(F) \neq \emptyset$. The Fréchet derivative of $F$ at $x \in D(F)$ will be denoted by $F'(x)$, and $F'(x)^*$ will denote the adjoint of $F'(x)$. Moreover, $F'(x)$ satisfies the Lipschitz condition, i.e. there exists a positive constant $L$ such that
\[
\|F'(x_1) - F'(x_2)\|_2 \le L\|x_1 - x_2\|_2, \qquad \forall x_1, x_2 \in D(F). \tag{2.1}
\]
This condition implies an estimate for the Taylor remainder, namely
\[
\|F(x_1) - F(x_2) - F'(x_2)(x_1 - x_2)\|_2 \le \frac{L}{2}\|x_1 - x_2\|_2^2. \tag{2.2}
\]
(A2) Problem (1.1) has an $R_\eta$-minimizing solution $x^\dagger$ such that $F(x^\dagger) = y$. If $x^\dagger$ is not unique, it always refers to an $x_0$-minimum-norm solution, i.e., an element minimizing $\|x - x_0\|_{L^2}$ over the set of solutions to $F(x) = y$.

The proof of the next result is standard (cf., e.g., [15, 16]), and is thus omitted.

Theorem 2.3. Under Assumption 2.2, the problem (1.4) is well-posed and consistent, i.e.

(i) For every $y^\delta \in \mathcal{Y}$, there exists at least one minimizer $x_{\alpha,\beta}^\delta$ of the functional in problem (1.4).

(ii) For a sequence of data $\{y^n\}$ with $y^n \to y^\delta$ in $\mathcal{Y}$, the sequence of corresponding minimizers $\{x_{\alpha,\beta}^n\}$ contains a subsequence converging to $x_{\alpha,\beta}^\delta$.

(iii) Let the regularization parameters $\alpha(\delta)$ and $\beta(\delta)$ satisfy
\[
\alpha(\delta),\ \beta(\delta),\ \frac{\delta^2}{\alpha(\delta)},\ \frac{\delta^2}{\beta(\delta)} \to 0 \quad \text{as } \delta \to 0. \tag{2.3}
\]
Suppose there exists some constant $\eta \ge 0$ such that
\[
\lim_{\delta\to 0} \frac{\beta(\delta)}{\alpha(\delta)} = \eta. \tag{2.4}
\]
(For instance, for $\eta > 0$ the choices $\alpha(\delta) = \delta$ and $\beta(\delta) = \eta\delta$ satisfy (2.3) and (2.4).) Then the sequence of minimizers $\{x_{\alpha,\beta}^\delta\}_\delta$ converges to the $R_\eta$-minimizing solution $x^\dagger$.
2.2 Proximity operators

The proximity operator of a convex function, first introduced by Moreau [25], is a natural extension of the projection operator onto a convex set. It plays
a central role in the analysis and numerical solution of convex optimization problems, and has recently been used extensively in various inverse problems in signal recovery [5, 11, 28]. Below we briefly recall some basic facts. For a detailed description of the theory of proximity operators, we refer the readers to [11, 25].

For simplicity, given $x \in \mathcal{X}$, we use the shorthand notation $J(x) = \beta\|W(x - x_0)\|_1$. The effective domain of the functional $J : \mathcal{X} \to [0, +\infty]$ is
\[
\operatorname{dom} J = \{x \in \mathcal{X} \,|\, J(x) < +\infty\},
\]
and the set of its minimizers is denoted by $\operatorname{arg\,min} J$. The subdifferential of $J$ is the set-valued operator
\[
\partial J : \mathcal{X} \to 2^{\mathcal{X}}, \qquad x \mapsto \{\xi \in \mathcal{X} \,|\, J(z) - J(x) - \langle \xi, z - x\rangle_{L^2} \ge 0, \ \forall z \in \operatorname{dom} J\}.
\]
Note that $\partial J(x)$ may be empty. Let $D(\partial J) := \{x \in \operatorname{dom} J : \partial J(x) \neq \emptyset\}$. Then for all $x \in \mathcal{X}$,
\[
x \in \operatorname{arg\,min} J \iff 0 \in \partial J(x).
\]
The proximal mapping of a convex functional $J$ at $z$ is
\[
\operatorname{prox}_J(z) := \operatorname*{arg\,min}_{x\in\mathcal{X}} \Big\{ J(x) + \frac{1}{2}\|x - z\|_2^2 \Big\}.
\]
Recall that classic proximal Newton-type methods for composite optimization use the proximal mapping to handle the nonsmooth part of the objective function. To achieve a faster rate of convergence, we will use a scaled proximal mapping in our method.

Definition 2.4 (Scaled proximal mapping [23]). Let $H : \mathcal{X} \to \mathcal{X}$ be an operator that is continuous, positive, self-adjoint, and bounded from below; clearly $H$ is invertible. Then the scaled proximal mapping of a convex functional $J$ at $z$ is defined as
\[
\operatorname{prox}_J^H(z) := \operatorname*{arg\,min}_{x\in\mathcal{X}} \Big\{ J(x) + \frac{1}{2}\|x - z\|_H^2 \Big\}, \tag{2.5}
\]
where $\|\cdot\|_H$ is the norm induced by the inner product $\langle \cdot,\cdot\rangle_H$ on $\mathcal{X}$ with $\langle x, z\rangle_H = \langle x, Hz\rangle_{L^2}$. For each $z$, the value $\operatorname{prox}_J^H(z)$ is called a scaled proximal point.

Here are some important properties related to the scaled proximal point:
(1) The scaled proximal mapping $\operatorname{prox}_J^H$ is well defined, i.e., for each $z \in \operatorname{dom} J$, the value $\operatorname{prox}_J^H(z)$ exists and is unique. This is because the proximity functional is strongly convex if $H$ is positive definite.

(2) Let $\partial J(z)$ be the subdifferential of $J$ at $z$. By the first order optimality conditions for (2.5), we get
\[
p = \operatorname{prox}_J^H(z) \iff 0 \in \partial J(p) + H(p - z) \iff Hz \in (\partial J + H)(p), \tag{2.6}
\]
which gives
\[
\operatorname{prox}_J^H(z) = (H + \partial J)^{-1}(Hz).
\]

(3) The scaled proximal mapping is firmly nonexpansive in the induced norm $\|\cdot\|_H$. That is, if $u = \operatorname{prox}_J^H(z_1)$ and $v = \operatorname{prox}_J^H(z_2)$, then $\langle u - v, z_1 - z_2\rangle_H \ge \|u - v\|_H^2$, and the Cauchy-Schwarz inequality implies $\|u - v\|_H \le \|z_1 - z_2\|_H$ (see e.g. Lemma 2.4 in [11]).

(4) The proximity operator $\operatorname{prox}_J^H : \mathcal{X} \to \mathcal{X}$ is Lipschitz continuous with constant $\sqrt{\|H\|_2\|H^{-1}\|_2}$, namely
\[
\|\operatorname{prox}_J^H(z_1) - \operatorname{prox}_J^H(z_2)\|_2 \le \sqrt{\|H\|_2\|H^{-1}\|_2}\; \|z_1 - z_2\|_2 \tag{2.7}
\]
(see e.g. Lemma 2 in [27]).

(5) Let $H_1$ and $H_2$ be two continuous, positive, and self-adjoint operators on $\mathcal{X}$, both bounded from below. Then
\[
\|\operatorname{prox}_J^{H_1}(z) - \operatorname{prox}_J^{H_2}(z)\|_2 \le \|H_1^{-1}\|_2 \|(H_1 - H_2)(z - \operatorname{prox}_J^{H_2}(z))\|_2 \tag{2.8}
\]
(see e.g. Lemma 3 in [27]).

(6) Combining (2.7) and (2.8), we have
\[
\begin{aligned}
\|\operatorname{prox}_J^{H_1}(z_1) - \operatorname{prox}_J^{H_2}(z_2)\|_2
&\le \|\operatorname{prox}_J^{H_1}(z_1) - \operatorname{prox}_J^{H_1}(z_2)\|_2 + \|\operatorname{prox}_J^{H_1}(z_2) - \operatorname{prox}_J^{H_2}(z_2)\|_2 \\
&\le \big(\|H_1\|_2\|H_1^{-1}\|_2\big)^{\frac12}\|z_1 - z_2\|_2 + \|H_1^{-1}\|_2\|(H_1 - H_2)(z_2 - \operatorname{prox}_J^{H_2}(z_2))\|_2
\end{aligned} \tag{2.9}
\]
for every $z_1, z_2 \in \mathcal{X}$ and all $H_1, H_2$ that are continuous, positive, and self-adjoint operators on $\mathcal{X}$, both bounded from below.
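For the special case $H = I$ and $W = I$, the (unscaled) proximal mapping of $J(x) = \beta\|x - x_0\|_1$ reduces to componentwise soft-thresholding around $x_0$. The following sketch, a minimal illustration rather than the authors' code, computes it and numerically checks the optimality condition (2.6) with $H = I$.

```python
import numpy as np

def soft_threshold(v, t):
    """Componentwise soft-thresholding: prox of t*||.||_1 at v."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_J(z, x0, beta):
    """prox of J(x) = beta*||x - x0||_1 (the case W = I, H = I):
    shift to u = x - x0, shrink, shift back."""
    return x0 + soft_threshold(z - x0, beta)

# Numerical check of the optimality condition 0 in dJ(p) + (p - z):
rng = np.random.default_rng(0)
z, x0, beta = rng.standard_normal(6), rng.standard_normal(6), 0.3
p = prox_J(z, x0, beta)
g = z - p                                   # must lie in beta * d||.||_1(p - x0)
on = np.abs(p - x0) > 1e-12                 # active components: g = beta*sign(p - x0)
assert np.allclose(g[on], beta * np.sign(p - x0)[on])
assert np.all(np.abs(g[~on]) <= beta + 1e-12)   # inactive components: |g| <= beta
print("soft-thresholding satisfies the prox optimality condition")
```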
3 Proximal regularized Gauss-Newton method (PRGN)

In this section we derive an algorithm for the hybrid norm functional (1.4) that minimizes the composite objective via the proximal mapping of its nonsmooth part. It can be regarded as a generalization of the regularized Gauss-Newton algorithm and hence will be called the proximal regularized Gauss-Newton method (PRGN).

3.1 PRGN algorithm

In this subsection we describe the PRGN algorithm in detail. We start from the iteratively regularized Gauss-Newton method
\[
x_{n+1} = x_n - (F'(x_n)^* F'(x_n) + \alpha_n I)^{-1}\big[F'(x_n)^*(F(x_n) - y^\delta) + \alpha_n(x_n - x_0)\big], \tag{3.1}
\]
which was introduced by Bakushinskii [2] in 1992. Moreover, one can easily show that $x_{n+1}$ has a variational characterization: the point $x_{n+1}$ is the minimizer of the "linearized" functional
\[
F_n^l(x) := \frac{1}{2}\|F(x_n) - y^\delta + F'(x_n)(x - x_n)\|_2^2 + \frac{\alpha_n}{2}\|x - x_0\|_2^2.
\]
The iteratively regularized Gauss-Newton method has been successfully applied to a number of nonlinear inverse problems (see e.g. [3, 13, 15]), and it is extremely effective in practice. In this paper, we modify the iterative scheme (3.1) and propose the following regularized Gauss-Newton method with a scaled proximal mapping:
\[
x_{n+1} = \operatorname{prox}_J^{H(x_n)}\Big( x_n - H(x_n)^{-1}\big[F'(x_n)^*(F(x_n) - y^\delta) + \alpha(x_n - x_0)\big] \Big), \tag{3.2}
\]
where $\alpha > 0$ is a regularization parameter, and $\operatorname{prox}_J^{H(x_n)}$ is the proximity operator associated to $J$ and $H(x_n) = F'(x_n)^* F'(x_n) + \alpha I$, as defined in (2.5).

An a priori choice gives only an order of magnitude for $\alpha$ and is thus practically inconvenient to use. In contrast, the discrepancy principle [15] enables us to construct a concrete scheme for determining the regularization parameter $\alpha$. Specifically, one chooses $\alpha = \alpha(\delta)$ such that
\[
\|F(x_{\alpha,\beta}^\delta) - y^\delta\|_2 = \delta \tag{3.3}
\]
holds, where $x_{\alpha,\beta}^\delta$ denotes the regularized solution obtained by minimizing (1.4) with $\beta = \eta\alpha$.
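In practice the scalar equation (3.3) can be solved approximately by root finding over $\alpha$. The sketch below assumes the discrepancy is monotonically increasing in $\alpha$ (which typically holds) and uses bisection on $\log\alpha$; `solve_hybrid` is a hypothetical user-supplied routine, not from the paper, returning the minimizer of (1.4) for a given $\alpha$ with $\beta = \eta\alpha$.

```python
import numpy as np

def choose_alpha_discrepancy(solve_hybrid, forward, y_delta, delta,
                             log_lo=-12.0, log_hi=2.0, iters=30):
    """Bisection on log(alpha) for the discrepancy equation (3.3):
    ||F(x_alpha) - y^delta|| = delta. Assumes the discrepancy is
    monotonically increasing in alpha, which typically holds."""
    def discrepancy(log_alpha):
        x = solve_hybrid(10.0 ** log_alpha)     # minimizer of (1.4), beta = eta*alpha
        return np.linalg.norm(forward(x) - y_delta) - delta

    lo, hi = log_lo, log_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if discrepancy(mid) > 0.0:
            hi = mid        # residual too large: decrease alpha
        else:
            lo = mid        # residual below delta: increase alpha
    return 10.0 ** (0.5 * (lo + hi))
```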
Due to the inexact nature of $y^\delta$, the iteration (3.2) must be terminated early by an a posteriori stopping rule. We employ the discrepancy principle, i.e. the iteration stops after $n_* = n_*(\delta, y^\delta)$ steps with
\[
\|F(x_{n_*}) - y^\delta\|_2 \le \tau\delta < \|F(x_n) - y^\delta\|_2, \qquad 0 \le n < n_*, \tag{3.4}
\]
where $\tau > 3$ is a number chosen appropriately.

We conclude this subsection with two important properties of the proximal regularized Gauss-Newton method.

Proposition 3.1. Let $F : \mathcal{X} \to \mathcal{Y}$ be a Fréchet differentiable operator. Then the iteration (3.2) is equivalent to
\[
x_{n+1} = \operatorname*{arg\,min}_{x\in\mathcal{X}} \ \frac{1}{2}\|F(x_n) - y^\delta + F'(x_n)(x - x_n)\|_2^2 + \frac{\alpha}{2}\|x - x_0\|_2^2 + J(x). \tag{3.5}
\]
Proof. Since $H(x_n)$ is invertible, writing the first order necessary conditions satisfied by $x_{n+1}$, the following are equivalent:
\[
\begin{aligned}
& 0 \in F'(x_n)^*\big[F(x_n) - y^\delta + F'(x_n)(x_{n+1} - x_n)\big] + \alpha(x_{n+1} - x_0) + \partial J(x_{n+1}) \\
\iff\ & -F'(x_n)^*(F(x_n) - y^\delta) + F'(x_n)^* F'(x_n)x_n + \alpha x_0 \in (H(x_n) + \partial J)(x_{n+1}) \\
\iff\ & x_{n+1} = (H(x_n) + \partial J)^{-1}\big[H(x_n)x_n + \alpha(x_0 - x_n) - F'(x_n)^*(F(x_n) - y^\delta)\big] \\
\iff\ & x_{n+1} = \operatorname{prox}_J^{H(x_n)}\Big(x_n - H(x_n)^{-1}\big[F'(x_n)^*(F(x_n) - y^\delta) + \alpha(x_n - x_0)\big]\Big).
\end{aligned}
\]
Note that since $F(x_n) - y^\delta + F'(x_n)(x - x_n)$ in (3.5) is linear in $x$, the subproblem (3.5) can be solved by first order methods for the minimization of nonsmooth convex functions, such as thresholding algorithms or forward-backward methods (see [5, 11, 12]). $\Box$

Proposition 3.2. Let Assumption 2.2 hold. Let $\hat{x} \in D(F) \subset \mathcal{X}$ be a local minimizer of $J_{\alpha,\beta}$ in (1.4). Then $\hat{x}$ satisfies the fixed point equation
\[
\hat{x} = \operatorname{prox}_J^{H(\hat{x})}\Big(\hat{x} - H(\hat{x})^{-1}\big[F'(\hat{x})^*(F(\hat{x}) - y^\delta) + \alpha(\hat{x} - x_0)\big]\Big). \tag{3.6}
\]

Proof. Let $J'_{\alpha,\beta}(\hat{x}, v)$ be the directional derivative of $J_{\alpha,\beta}(x)$ at $\hat{x}$ in the direction $v \in \mathcal{X}$. Then the first order optimality condition for $\hat{x}$ implies
\[
J'_{\alpha,\beta}(\hat{x}, v) \ge 0, \qquad \forall v \in \mathcal{X}. \tag{3.7}
\]
As a consequence of the differentiability of $F$ and the convexity of $J$, inequality (3.7) can be rewritten as
\[
\langle -F'(\hat{x})^*(F(\hat{x}) - y^\delta) - \alpha(\hat{x} - x_0), v\rangle_{L^2} \le J'(\hat{x}, v), \qquad \forall v \in \mathcal{X}.
\]
Consequently, by Proposition 3.1.6 in [7], we also have
\[
-\big[F'(\hat{x})^*(F(\hat{x}) - y^\delta) + \alpha(\hat{x} - x_0)\big] \in \partial J(\hat{x}). \tag{3.8}
\]
To prove that $\hat{x}$ satisfies the fixed point equation (3.6), we add $H(\hat{x})\hat{x}$ to both sides of (3.8) and obtain
\[
H(\hat{x})\hat{x} - \big[F'(\hat{x})^*(F(\hat{x}) - y^\delta) + \alpha(\hat{x} - x_0)\big] \in (H(\hat{x}) + \partial J)(\hat{x}).
\]
Since $H(\hat{x})$ is invertible, the previous relation can also be written as
\[
H(\hat{x})\Big\{\hat{x} - H(\hat{x})^{-1}\big[F'(\hat{x})^*(F(\hat{x}) - y^\delta) + \alpha(\hat{x} - x_0)\big]\Big\} \in (H(\hat{x}) + \partial J)(\hat{x}).
\]
By (2.6) we obtain the desired assertion. $\Box$

We now introduce some notation for rewriting the conditions in Proposition 3.2 for a local minimizer of (1.4). Define $G$ and $\tilde{G}$ by
\[
G(x) = x - H(x)^{-1}\big[F'(x)^*(F(x) - y^\delta) + \alpha(x - x_0)\big] \tag{3.9}
\]
and
\[
\tilde{G}(x) = \operatorname{prox}_J^{H(x)}(G(x)). \tag{3.10}
\]
If $\hat{x} \in D(F) \subset \mathcal{X}$ is a local minimizer of (1.4), then the fixed point equation (3.6) can also be written as
\[
\hat{x} = \tilde{G}(\hat{x}), \tag{3.11}
\]
i.e., $\hat{x}$ is a fixed point of $\tilde{G}$.

3.2 Local convergence

Next we state and prove a local convergence theorem for the proximal regularized Gauss-Newton method defined in (3.2). For this purpose, the following two lemmata will be needed.

Lemma 3.3. Let $\{a_n^\delta\}_{n\in\mathbb{N}_0}$, $\delta \ge 0$, be a family of sequences satisfying
\[
0 \le a_n^\delta \le a \quad \text{and} \quad \limsup_{\delta\to 0,\, n\to\infty} a_n^\delta \le a_0
\]
for some $a, a_0 \ge 0$. For nonnegative constants $b$, $c$, and $\gamma_0$, let $\{\gamma_n^\delta\}$ be nonnegative numbers such that
\[
\gamma_0^\delta := \gamma_0, \qquad 0 \le \gamma_{n+1}^\delta \le a_n^\delta + b\gamma_n^\delta + c(\gamma_n^\delta)^2 \quad \text{for all } 0 \le n < n_*, \tag{3.12}
\]
where $n_* = n_*(\delta, y^\delta) \in \mathbb{N}_0$ for any $\delta > 0$, and $n_* \to \infty$ as $\delta \to 0$. For $p \in [0, a]$, let $\gamma(p)$ and $\bar{\gamma}(p)$ denote the roots of the equation $p + b\gamma + c\gamma^2 = \gamma$, i.e.
\[
\gamma(p) := \frac{2p}{1 - b + \sqrt{(1-b)^2 - 4pc}}, \qquad \bar{\gamma}(p) := \frac{1 - b + \sqrt{(1-b)^2 - 4pc}}{2c}.
\]
If $c > 0$, $b + 2\sqrt{ac} \le 1$, and $\gamma_0 \le \bar{\gamma}(a)$, then
\[
\gamma_n^\delta \le \max\{\gamma_0, \gamma(a)\}, \qquad 0 \le n \le n_*.
\]
If in addition $a_0 < a$, then
\[
\limsup_{\delta\to 0} \gamma_{n_*}^\delta \le \gamma(a_0), \qquad \limsup_{n\to\infty} \gamma_n^0 \le \gamma(a_0).
\]
Proof. See Lemma 4.11 in [14]. $\Box$

Lemma 3.4. Let $\mathcal{X}$ be a Hilbert space and let $x^\dagger$ be the $R_\eta$-minimizing solution. Assume $F(x)$ is Fréchet differentiable at $x \in D(F) \neq \emptyset$ with derivatives bounded in a neighborhood of $x^\dagger$, and that the data $y^\delta$ satisfies
\[
\|y - y^\delta\|_2 \le \delta < \|F(x_0) - y^\delta\|_2. \tag{3.13}
\]
Then the regularization parameter $\alpha = \alpha(\delta, y^\delta)$ obtained from the discrepancy principle (3.3) satisfies
\[
\lim_{\delta\to 0} \alpha(\delta, y^\delta) = 0 \quad \text{and} \quad \lim_{\delta\to 0} \frac{\delta^2}{\alpha(\delta, y^\delta)} = 0. \tag{3.14}
\]

Proof. The proof follows from Theorem 4.11 in [1]. $\Box$

In the following theorem, we use $B(x, r)$ to denote the open ball of radius $r > 0$ centered at $x$.

Theorem 3.5. Let Assumption 2.2 hold. For some $\eta > 0$, let $\hat{x} := x_{\alpha,\beta}^\delta$ be a local minimizer of (1.4) with $\beta = \eta\alpha$, and let
\[
\rho := \sup\{t \in [0, R) : B(\hat{x}, t) \subset D(F)\}
\]
with a constant $R > 0$. Let $x_0^\delta := x_0 \in B(\hat{x}, \rho) \setminus \{\hat{x}\}$ be such that
\[
\hat{x} - x_0 = F'(\hat{x})^* \nu \tag{3.15}
\]
for some $\nu \in \mathcal{X}$ with $\|\nu\|_2 \le \rho$. Let $y^\delta$ satisfy (3.13) and the regularization parameter $\alpha$ be determined by equation (3.3). Moreover, assume that the nonlinear operator $F$ is properly scaled, i.e.
\[
\|F'(x)\|_2 \le \sqrt{\alpha}, \qquad x \in B(\hat{x}, \rho). \tag{3.16}
\]
If $L\|\nu\|_2$ is sufficiently small, then the proximal regularized Gauss-Newton method
\[
x_{n+1}^\delta = \tilde{G}(x_n^\delta) = \operatorname{prox}_J^{H(x_n^\delta)}\big(G(x_n^\delta)\big), \qquad n = 0, 1, \cdots, n_* \tag{3.17}
\]
is well defined, the generated sequence $\{x_n^\delta\}$ is contained in $B(\hat{x}, \rho)$, and
\[
\lim_{\delta\to 0} \|x_{n_*}^\delta - x^\dagger\|_2 = 0.
\]
Proof. Since $\hat{x} \in D(F) \subset \mathcal{X}$ is a local minimizer of (1.4), it follows from (3.11) that
\[
G(\hat{x}) - \operatorname{prox}_J^{H(\hat{x})}(G(\hat{x})) = G(\hat{x}) - \hat{x} = -H(\hat{x})^{-1}\big[F'(\hat{x})^*(F(\hat{x}) - y^\delta) + \alpha(\hat{x} - x_0)\big]. \tag{3.18}
\]
Applying (2.9) with $H_1 = H(x_n^\delta)$, $H_2 = H(\hat{x})$, $z_1 = G(x_n^\delta)$, and $z_2 = G(\hat{x})$, and taking into account (3.18), we get
\[
\begin{aligned}
\|x_{n+1}^\delta - \hat{x}\|_2 &= \big\|\operatorname{prox}_J^{H(x_n^\delta)}\big(G(x_n^\delta)\big) - \operatorname{prox}_J^{H(\hat{x})}(G(\hat{x}))\big\|_2 \\
&\le \big(\|H(x_n^\delta)\|_2\|H(x_n^\delta)^{-1}\|_2\big)^{\frac12}\|G(x_n^\delta) - G(\hat{x})\|_2 \\
&\qquad + \|H(x_n^\delta)^{-1}\|_2\big\|\big(H(x_n^\delta) - H(\hat{x})\big)\big(G(\hat{x}) - \operatorname{prox}_J^{H(\hat{x})}(G(\hat{x}))\big)\big\|_2 \\
&= \big(\|H(x_n^\delta)\|_2\|H(x_n^\delta)^{-1}\|_2\big)^{\frac12}\|G(x_n^\delta) - G(\hat{x})\|_2 \\
&\qquad + \|H(x_n^\delta)^{-1}\|_2\big\|\big(H(x_n^\delta) - H(\hat{x})\big)H(\hat{x})^{-1}\big[F'(\hat{x})^*(F(\hat{x}) - y^\delta) + \alpha(\hat{x} - x_0)\big]\big\|_2.
\end{aligned} \tag{3.19}
\]
Applying (3.16), we obtain
\[
\|H(x_n^\delta)\|_2 = \|F'(x_n^\delta)^* F'(x_n^\delta)\|_2 + \alpha \le 2\alpha. \tag{3.20}
\]
It is clear that
\[
\|H(x_n^\delta)^{-1}\|_2 = \big\|\big[F'(x_n^\delta)^* F'(x_n^\delta) + \alpha I\big]^{-1}\big\|_2 \le \frac{1}{\alpha}. \tag{3.21}
\]
Then
\[
\begin{aligned}
\alpha\big[H(x_n^\delta)^{-1} - H(\hat{x})^{-1}\big]F'(\hat{x})^*\nu
&= \alpha H(x_n^\delta)^{-1}\big(H(\hat{x}) - H(x_n^\delta)\big)H(\hat{x})^{-1}F'(\hat{x})^*\nu \\
&= \alpha H(x_n^\delta)^{-1}F'(x_n^\delta)^*\big(F'(\hat{x}) - F'(x_n^\delta)\big)H(\hat{x})^{-1}F'(\hat{x})^*\nu \\
&\qquad + \alpha H(x_n^\delta)^{-1}\big(F'(\hat{x})^* - F'(x_n^\delta)^*\big)F'(\hat{x})H(\hat{x})^{-1}F'(\hat{x})^*\nu.
\end{aligned} \tag{3.22}
\]
Applying the Lipschitz condition (2.1) yields
\[
\big\|\alpha\big[H(x_n^\delta)^{-1} - H(\hat{x})^{-1}\big]F'(\hat{x})^*\nu\big\|_2 \le 2L\|\nu\|_2\|x_n^\delta - \hat{x}\|_2. \tag{3.23}
\]
Inequalities (3.23) and (2.2) imply
\[
\begin{aligned}
\|G(x_n^\delta) - G(\hat{x})\|_2
&= \big\|H(x_n^\delta)^{-1}F'(x_n^\delta)^*\big(F'(x_n^\delta)(x_n^\delta - \hat{x}) - F(x_n^\delta) + y^\delta\big) \\
&\qquad - \alpha\big[H(x_n^\delta)^{-1} - H(\hat{x})^{-1}\big](\hat{x} - x_0) \\
&\qquad - \big[H(x_n^\delta)^{-1}F'(x_n^\delta)^* - H(\hat{x})^{-1}F'(\hat{x})^*\big](F(\hat{x}) - y^\delta)\big\|_2 \\
&\le \|H(x_n^\delta)^{-1}F'(x_n^\delta)^*\|_2\,\|F(x_n^\delta) - F(\hat{x}) - F'(x_n^\delta)(x_n^\delta - \hat{x})\|_2 \\
&\qquad + \big\|\alpha\big[H(x_n^\delta)^{-1} - H(\hat{x})^{-1}\big]F'(\hat{x})^*\nu\big\|_2 \\
&\qquad + \big\|\big[H(x_n^\delta)^{-1}F'(x_n^\delta)^* - H(\hat{x})^{-1}F'(\hat{x})^*\big](F(\hat{x}) - y^\delta)\big\|_2 \\
&\le \frac{L}{4\sqrt{\alpha}}\|x_n^\delta - \hat{x}\|_2^2 + 2L\|\nu\|_2\|x_n^\delta - \hat{x}\|_2 + \frac{\delta}{\sqrt{\alpha}}.
\end{aligned} \tag{3.24}
\]
Combining (2.1), (2.2), (3.15) and (3.16) we get
\[
\begin{aligned}
\big\|\big(H(x_n^\delta) - H(\hat{x})\big)H(\hat{x})^{-1}\big[F'(\hat{x})^*(F(\hat{x}) - y^\delta) + \alpha(\hat{x} - x_0)\big]\big\|_2
&= \big\|\big(H(x_n^\delta) - H(\hat{x})\big)H(\hat{x})^{-1}F'(\hat{x})^*\big[(F(\hat{x}) - y^\delta) + \alpha\nu\big]\big\|_2 \\
&\le L(\delta + \alpha\|\nu\|_2)\|x_n^\delta - \hat{x}\|_2.
\end{aligned} \tag{3.25}
\]
Applying (3.20), (3.21), (3.24) and (3.25) in (3.19) straightforwardly leads to
\[
\|x_{n+1}^\delta - \hat{x}\|_2 \le \sqrt{2}\left(\frac{L}{4\sqrt{\alpha}}\|x_n^\delta - \hat{x}\|_2^2 + 2L\|\nu\|_2\|x_n^\delta - \hat{x}\|_2 + \frac{\delta}{\sqrt{\alpha}}\right) + L\left(\frac{\delta}{\alpha} + \|\nu\|_2\right)\|x_n^\delta - \hat{x}\|_2. \tag{3.26}
\]
Also, for $n < n_*$, according to the discrepancy principle (3.4), there holds
\[
\tau\delta < \|F(x_n^\delta) - y^\delta\|_2 \le \|F(x_n^\delta) - F(\hat{x})\|_2 + \|F(\hat{x}) - y^\delta\|_2 \le \sqrt{\alpha}\|x_n^\delta - \hat{x}\|_2 + \delta, \tag{3.27}
\]
and thus
\[
\frac{\delta}{\sqrt{\alpha}} \le \frac{\|x_n^\delta - \hat{x}\|_2}{\tau - 1}. \tag{3.28}
\]
Applying (3.28) to (3.26) gives
\[
\begin{aligned}
\|x_{n+1}^\delta - \hat{x}\|_2 \le\ & \sqrt{2}\left(\frac{L}{4\sqrt{\alpha}}\|x_n^\delta - \hat{x}\|_2^2 + 2L\|\nu\|_2\|x_n^\delta - \hat{x}\|_2 + \frac{\|x_n^\delta - \hat{x}\|_2}{\tau - 1}\right) \\
& + L\left(\frac{\|x_n^\delta - \hat{x}\|_2}{\sqrt{\alpha}(\tau - 1)} + \|\nu\|_2\right)\|x_n^\delta - \hat{x}\|_2, \qquad 0 \le n < n_*.
\end{aligned} \tag{3.29}
\]
Setting $\gamma_n^\delta := \|x_n^\delta - \hat{x}\|_2/\sqrt{\alpha}$, it follows from (3.29) that
\[
\gamma_{n+1}^\delta \le L\left(\frac{1}{2} + \frac{1}{\tau - 1}\right)(\gamma_n^\delta)^2 + \left(\frac{2}{\tau - 1} + 5L\|\nu\|_2\right)\gamma_n^\delta, \qquad 0 \le n < n_*. \tag{3.30}
\]
Since $L\|\nu\|_2$ is sufficiently small and $\tau > 3$, Lemma 3.3 shows that $\gamma_n^\delta$ is bounded and that
\[
\|x_{n+1}^\delta - \hat{x}\|_2 \le \sqrt{\alpha}\,\max\{\gamma_0^\delta, \gamma(0)\} < \rho.
\]
The assertion then follows from (3.3), (3.4), (3.16), (3.14) and Theorem 2.3. $\Box$

3.3 Implementation

In this subsection we discuss the implementation of the algorithm; the numerical experiments are discussed in the next section. We occasionally omit the superscript $\delta$ for simplicity. Note that (3.17) is a two-step algorithm, consisting of the classical regularized Gauss-Newton step followed by a "J-projection" in a variable metric. By definition of the proximity operator, given $H := H(x) = F'(x)^* F'(x) + \alpha I$, we have
\[
\operatorname{prox}_J^H(z) = \operatorname*{arg\,min}_{x\in\mathcal{X}}\Big\{J(x) + \frac{1}{2}\|x - z\|_H^2\Big\}, \tag{3.31}
\]
where
\[
\|x - z\|_H^2 = \langle x - z, H(x - z)\rangle_{L^2} = \big\langle x - z, \big(F'(x)^* F'(x) + \alpha I\big)(x - z)\big\rangle_{L^2} = \|F'(x)(x - z)\|_2^2 + \alpha\|x - z\|_2^2. \tag{3.32}
\]
Therefore
\[
\operatorname{prox}_J^H(z) = \operatorname*{arg\,min}_{x\in\mathcal{X}}\Big\{\frac{1}{2}\|F'(x)(x - z)\|_2^2 + \frac{\alpha}{2}\|x - z\|_2^2 + J(x)\Big\}. \tag{3.33}
\]
As mentioned earlier, we only consider the case $J(x) = \beta\|W(x - x_0)\|_1$. From (3.32) we have
\[
\operatorname{prox}_J^H(z) = \operatorname*{arg\,min}_{x\in\mathcal{X}}\Big\{\frac{1}{2}\|F'(x)(x - z)\|_2^2 + \frac{\alpha}{2}\|x - z\|_2^2 + \beta\|W(x - x_0)\|_1\Big\}. \tag{3.34}
\]
The first order optimality condition for a minimizer $x$ is given by
\[
0 \in \big(F'(x)^* F'(x) + \alpha I\big)(x - z) + \beta\,\partial(\|W(x - x_0)\|_1).
\]
Furthermore, by the definition of $H$, one has
\[
0 \in H(x - z) + \beta\,\partial(\|W(x - x_0)\|_1).
\]
Multiplying by $\mu$ and adding $x$ to both sides yields the fixed point relation
\[
x \in x + \mu H(x - z) + \mu\beta\,\partial(\|W(x - x_0)\|_1),
\]
which has to be satisfied by a minimizer $x$ for all $\mu \in \mathbb{R}$. Rearranging the terms in the above (formal) optimality condition, we get
\[
(x - x_0) - \mu H(x - z) \in (I + \mu\beta\,\partial\|W(\cdot)\|_1)(x - x_0).
\]
We can turn this into an iteration by requiring
\[
(x_k - x_0) - \mu_k H(x_k)(x_k - z) \in (I + \mu_k\beta\,\partial\|W(\cdot)\|_1)(x_{k+1} - x_0),
\]
where $\mu_k$ is the step size. It turns out that the mapping $S_{\mu\beta} := (I + \mu\beta\,\partial\|W(\cdot)\|_1)^{-1}$ is well defined (see e.g. [12]), which yields the iteration
\[
x_{k+1} = x_0 + S_{\mu_k\beta}\big((x_k - x_0) - \mu_k H(x_k)(x_k - z)\big), \qquad k = 0, 1, \cdots. \tag{3.35}
\]
The numerical results below are obtained by using the Barzilai-Borwein rule [4] for the choice of the step sizes $\mu_k$. Finally, the full algorithm is summarized as follows.

Algorithm 1 (Proximal regularized Gauss-Newton method for the hybrid regularization problem (1.4)).
(i) Choose $x_0 \in D(F)$ and determine a value $\alpha = \alpha(\delta, y^\delta)$.

(ii) Determine
\[
z_n = x_n^\delta - H(x_n^\delta)^{-1}\big[F'(x_n^\delta)^*(F(x_n^\delta) - y^\delta) + \alpha(x_n^\delta - x_0)\big].
\]

(iii) Choose $\eta > 0$ and determine
\[
x_{n+1}^\delta = \operatorname{prox}_J^H(z_n) = \operatorname*{arg\,min}_{x\in\mathcal{X}}\Big\{\frac{1}{2}\|F'(x)(x - z_n)\|_2^2 + \alpha\Big(\frac{1}{2}\|x - z_n\|_2^2 + \eta\|W(x - x_0)\|_1\Big)\Big\}
\]
via the thresholding iteration (3.35).

(iv) Check the stopping criterion (3.4). Return $x_{n+1}^\delta$ as the solution, or set $n \leftarrow n + 1$ and repeat from (ii).
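A compact NumPy rendering of Algorithm 1 may look as follows. It is a sketch under simplifying assumptions rather than the authors' implementation: $W = I$ (so $S_{\mu\beta}$ is componentwise soft-thresholding), the metric $H$ is frozen at the outer iterate during the inner loop, and `forward`/`jacobian` are hypothetical user-supplied routines for the discretized problem.

```python
import numpy as np

def soft_threshold(v, t):
    # S_t = (I + t*d||.||_1)^(-1): componentwise soft-thresholding (case W = I)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prgn(forward, jacobian, y_delta, x0, alpha, eta, delta,
         tau=3.1, max_outer=10, max_inner=20):
    """Sketch of Algorithm 1 (PRGN) for the hybrid functional (1.4)
    with W = I and beta = eta*alpha."""
    beta, x = eta * alpha, x0.copy()
    for _ in range(max_outer):
        r = forward(x) - y_delta
        if np.linalg.norm(r) <= tau * delta:       # stopping rule (3.4)
            break
        Jx = jacobian(x)
        H = Jx.T @ Jx + alpha * np.eye(x.size)     # H(x_n) = F'(x_n)^T F'(x_n) + alpha*I
        # Step (ii): classical regularized Gauss-Newton point z_n
        z = x - np.linalg.solve(H, Jx.T @ r + alpha * (x - x0))
        # Step (iii): inner thresholding iteration (3.35) for prox_J^H(z),
        # with a Barzilai-Borwein step size [4] after the first sweep
        u, mu = x.copy(), 1.0 / np.linalg.norm(H, 2)
        u_old, grad_old = None, None
        for _ in range(max_inner):
            grad = H @ (u - z)                     # gradient of 0.5*||u - z||_H^2
            if grad_old is not None:
                s, g = u - u_old, grad - grad_old
                if abs(np.dot(s, g)) > 1e-14:
                    mu = np.dot(s, s) / np.dot(s, g)
            u_old, grad_old = u.copy(), grad.copy()
            u = x0 + soft_threshold((u - x0) - mu * grad, mu * beta)
        x = u
    return x
```

In a concrete run, a discretized `forward` and `jacobian` are all that is needed, and $\alpha$ would be supplied by the discrepancy principle (3.3), e.g. via the bisection sketch in Section 3.1.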
4 Numerical application

In this section we consider some numerical examples of parameter identification in partial differential equations. The aim is to demonstrate the capabilities and the performance of our algorithm on a challenging ill-posed problem in the context of nonlinear parameter identification.

Example 4.1. Let $\Omega \subset \mathbb{R}^d$ ($d = 1, 2$) be a bounded domain with Lipschitz boundary $\partial\Omega$. We first consider identification of the diffusion parameter $\kappa$ in
\[
\begin{cases} -\nabla\cdot(\kappa\nabla u) = f & \text{in } \Omega, \\ u = g & \text{on } \partial\Omega, \end{cases} \tag{4.1}
\]
from a measurement $u^\delta \in H_0^1(\Omega)$ of the solution $u$, where $f \in H^{-1}(\Omega)$ and $g \in H^{1/2}(\partial\Omega)$. Here we assume that
\[
\frac{\|u^\delta - u\|_{H^1(\Omega)}}{\|u\|_{H^1(\Omega)}} \le \delta.
\]
Let $p > d$ and let $u(\kappa)$ be the solution of (4.1). The nonlinear operator $F : D(F) \subseteq W^{1,p}(\Omega) \to H_0^1(\Omega)$ is defined as the parameter-to-solution mapping $F(\kappa) = u(\kappa)$. In order to guarantee the well-posedness of the governing boundary value problem, and also for physical reasons, we define
\[
D(F) := \big\{\kappa \in W^{1,p}(\Omega) \,|\, \kappa \ge \gamma > 0 \text{ on } \Omega\big\}.
\]
This is a closed and convex subset of $W^{1,p}(\Omega)$. Since $W^{1,p}(\Omega)$ embeds into $L^\infty(\Omega)$, the operator $F$ is well defined. Therefore solving the inverse problem (4.1) reduces to solving an equation of the form (1.1). This is the inverse groundwater filtration problem corresponding to the steady state case studied in [31], where it was shown that $F$ is Fréchet differentiable with a Lipschitz continuous derivative. Further, it can be proved that the Fréchet derivative and its adjoint operator are given by
\[
F'(\kappa)h = A(\kappa)^{-1}\big(\nabla\cdot(h\nabla F(\kappa))\big), \qquad F'(\kappa)^*\omega = -\nabla F(\kappa)\cdot\nabla\big(A(\kappa)^{-1}\omega\big),
\]
with $A(\kappa) : H^2(\Omega) \cap H_0^1(\Omega) \to L^2(\Omega)$ defined by $A(\kappa)u = -\nabla\cdot(\kappa\nabla u)$.

Moreover, in [19] it was verified that $F'$ is locally bounded. Thus the results derived in the previous sections are applicable, provided $\|\kappa^\dagger - \kappa_0\|_2$ and $\|\nu\|_2$ in the source condition (3.15) are sufficiently small.

To test the proposed algorithm, we present numerical results for the model problem (4.1) for $d = 1, 2$. Since the model problem is only well-posed for $\kappa \ge \gamma > 0$ almost everywhere in $\Omega$, we selected the true solution and the initial guess in the interior of the domain $D(F)$. In all computations, we observed that the sequence generated by the iteration stays in $D(F)$. We implemented the proximal regularized Gauss-Newton method as given in Algorithm 1. The same stopping rule was used in every case: unless otherwise stated, the classical Gauss-Newton step (ii) was terminated either by the discrepancy principle (3.4) with $\tau = 3.1$, or after 10 iterations; the thresholding iteration in step (iii) was terminated after 20 iterations.

In the first numerical experiment we consider the one-dimensional problem on the interval $\Omega = (0, 1)$ with the sought solution given by
\[
\kappa^\dagger(x) = \begin{cases} 1.75, & 0.1 \le x \le 0.25, \\ 1.60, & 0.3 \le x \le 0.4, \\ \sin^4(2\pi x) + 1, & 0.6 \le x \le 1, \\ 1, & \text{elsewhere}. \end{cases}
\]
The boundary conditions were taken to be the homogeneous Dirichlet conditions $u(0) = u(1) = 0$. The inhomogeneous term $f(x)$ represents the heat source. In order to improve the accuracy and reduce the computation time, we conduct a pair of experiments. We first choose a concentrated heat source at $x = 1/3$,
i.e. we assume that $f^1(x) = \delta(x - 1/3)$ with $\delta$ being the Dirac distribution, and we measure the resulting steady-state temperature distribution $u^1(\hat{\kappa})$. We then move the heat source to $x = 2/3$, i.e. assume that $f^2(x) = \delta(x - 2/3)$, and measure the distribution $u^2(\hat{\kappa})$. In the numerical simulation, the solution of the forward problem (4.1) was computed by a standard Galerkin finite element discretization on a uniform grid with mesh size $h = 1/m$, $m = 50$. The data are modeled by
\[
d_i^e = u^e(x_i) + \delta_i^e, \qquad i = 1, 2, \cdots, m - 1, \quad e = 1, 2.
\]
Here $d_i^e$ represents the temperature observation for experiment $e$ taken at the node point $x_i = ih$, and $\delta_i^e$ represents the measurement error. A typical realization of the observed noisy data is displayed in Fig. 1a for $\delta = 0.01$ and Fig. 1b for $\delta = 0.05$.
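For reference, this data-generation step can be mimicked in a few lines with finite differences; the sketch below is an illustration under stated simplifications (finite differences instead of the Galerkin FEM, Dirac source lumped at the nearest node), not the authors' code.

```python
import numpy as np

def solve_diffusion_1d(kappa, source_pos, m=50):
    """Finite-difference solve of -(kappa u')' = delta(x - s) on (0,1) with
    u(0) = u(1) = 0; a simple stand-in for the paper's Galerkin FEM."""
    h = 1.0 / m
    x = np.linspace(0.0, 1.0, m + 1)
    k_half = 0.5 * (kappa(x[:-1]) + kappa(x[1:]))    # kappa at the m cell midpoints
    # Tridiagonal stiffness matrix acting on the m-1 interior nodal values
    A = (np.diag(k_half[:-1] + k_half[1:])
         - np.diag(k_half[1:-1], 1) - np.diag(k_half[1:-1], -1)) / h**2
    f = np.zeros(m - 1)
    j = min(max(int(round(source_pos * m)), 1), m - 1)
    f[j - 1] = 1.0 / h                               # Dirac source lumped at nearest node
    return x[1:-1], np.linalg.solve(A, f)

def kappa_true(x):
    """The sought diffusion parameter of the first experiment."""
    k = np.ones_like(x)
    k[(0.1 <= x) & (x <= 0.25)] = 1.75
    k[(0.3 <= x) & (x <= 0.4)] = 1.60
    sel = (0.6 <= x) & (x <= 1.0)
    k[sel] = np.sin(2.0 * np.pi * x[sel]) ** 4 + 1.0
    return k

rng = np.random.default_rng(0)
xi, u1 = solve_diffusion_1d(kappa_true, 1.0 / 3.0)   # source at x = 1/3
_, u2 = solve_diffusion_1d(kappa_true, 2.0 / 3.0)    # source at x = 2/3
delta = 0.01
d1 = u1 + delta * np.abs(u1).max() * rng.standard_normal(u1.size)
d2 = u2 + delta * np.abs(u2).max() * rng.standard_normal(u2.size)
```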
[Figure 1 shows two panels: (a) noisy observed data ($\delta = 0.01$); (b) noisy observed data ($\delta = 0.05$).]
Figure 1. Observed data used for distributed parameter identification in a onedimensional steady-state diffusion equation. The left curve represents the solution corresponding to a point source at x = 1/3. The right curve represents the solution corresponding to a point source at x = 2/3. Circles represent observed data.
In our computation, we take the initial guess $\kappa_0$ to be the arithmetic mean of $\kappa^\dagger$, and $\eta = 0.5$. Although the reconstruction obtained with standard $L^2$-norm regularization techniques tells something about the sought solution, it does not reveal features such as sparsity, discontinuities, and constancy, since the result is too oscillatory. In Fig. 2, we record the computational results with different choices of $W$. In Figs. 2a and 2c we take $W = I$, the identity matrix. The sparsity of the sought solution is clearly reconstructed; however, the result is still oscillatory, which is typical for this choice of $W$. In Figs. 2b and 2d, we take $W = D$, a discretized representation of the gradient operator. All jump discontinuities are identified correctly and the notorious oscillatory effect is effectively removed. In our experiments we have found that the proximal regularized Gauss-Newton method with $W = D$ is slightly more effective in dealing with noise.
[Figure 2 shows estimated versus true parameters in four panels: (a) $W = I$ ($\delta = 0.01$); (b) $W = D$ ($\delta = 0.01$); (c) $W = I$ ($\delta = 0.05$); (d) $W = D$ ($\delta = 0.05$).]
Figure 2. Results for the 1d inverse diffusion coefficient problem with different choices of W .
In the second numerical experiment we consider the two-dimensional problem on the unit square $\Omega = [0, 1] \times [0, 1]$, with slightly different boundary conditions: $u = 0$ on the left ($x = 0$) and right ($x = 1$) sides of the boundary, and no flux ($u_y = 0$) on the top and bottom sides. We take $f(x, y) = \delta(x - x_0, y - y_0)$, the Dirac distribution at the point source $(x_0, y_0) = (1/2, 1/2)$. The forward problem was discretized using cell-centered finite differences (CCFD) on a uniform $n_x \times n_y$ grid with $n_x = n_y = 64$. The true solution $\kappa^\dagger$ is given in Fig. 3a. In our computation, we take $\kappa_0 \equiv 1$. Since total variation methods are very effective for recovering "blocky", possibly discontinuous, images from noisy data, we take $W = D$ and use $u^\delta$ to reconstruct $\kappa^\dagger$ by our method. In Fig. 3, we plot the computational results for $\delta = 0.01$ and $\delta = 0.05$, respectively. The reconstruction results in Figs. 3c and 3d are satisfactory. Moreover, the results indicate that the method is robust with respect to $\eta$, since changing $\eta$ does not affect the reconstruction much.

[Figure 3 shows four panels: (a) true solution; (b) $\eta = 1$, $\delta = 0.01$; (c) $\eta = 0.1$, $\delta = 0.05$; (d) $\eta = 1$, $\delta = 0.05$.]
Figure 3. Results for the 2d inverse diffusion coefficient problem.
Example 4.2. A second nonlinear model problem consists of recovering the potential term in an elliptic equation. Let $\Omega \subset \mathbb{R}^d$ be a bounded domain with Lipschitz boundary $\partial\Omega$ and $f \in L^2(\Omega)$. We consider the equation
\[
\begin{cases} -\Delta u + cu = f & \text{in } \Omega, \\ \dfrac{\partial u}{\partial n} = 0 & \text{on } \partial\Omega. \end{cases} \tag{4.2}
\]
The inverse problem is to recover the potential $c$ from an observation $u^\delta$ of a true solution $u \in H^1(\Omega)$, i.e., the nonlinear operator $F$ maps $c \in \mathcal{X} = L^2(\Omega)$ to the
solution $u(c) \in \mathcal{Y} = H^1(\Omega)$ of (4.2). Such problems arise in heat transfer, e.g., in damping design [29] and in identifying the heat radiative coefficient [32]. It can be shown that for some $\gamma > 0$, $F$ is Fréchet differentiable with locally bounded derivative and is weakly sequentially closed on
\[
D(F) = \big\{c \in L^2(\Omega) : \|c - \bar{c}\|_{L^2(\Omega)} \le \gamma\big\} \quad \text{for a fixed } \bar{c} \in L^2(\Omega),\ \bar{c} \ge 0 \text{ a.e.},
\]
and that the Fréchet derivative is Lipschitz continuous in a neighborhood of the exact parameter $c^\dagger$, so that the convergence results of Theorem 3.5 are applicable.

Finally, we consider (4.2) with $\Omega = [-1, 1]^2$, $f(x, y) = 1$, and the true parameter $c^\dagger$ given in Fig. 4a. To obtain the exact $u$ and the noisy data $u^\delta$, the forward operator was discretized using the finite element method on a mesh with 7938 triangles. In all cases, we take $c_0 \equiv 1$, $W = D$, $\eta = 1$, and use $u^\delta$ to reconstruct $c^\dagger$ by our method. The reconstructions are displayed in Figs. 4b and 4c, respectively. Overall, the reconstructions accurately capture the shape as well as the magnitude of the potential $c^\dagger$, and thus represent a good approximation.

[Figure 4. Results for the 2d inverse potential problem: (a) true solution; (b) $\delta = 0.01$; (c) $\delta = 0.05$.]
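As a minimal illustration of the forward map $c \mapsto u(c)$ in (4.2), the sketch below assembles a one-dimensional finite-difference analogue with homogeneous Neumann conditions; the mesh, the piecewise-constant test potential, and all names here are assumptions for the demo, not the paper's 2D FEM setup.

```python
import numpy as np

def solve_potential_1d(c, f, m=100):
    """FD solve of -u'' + c*u = f on (-1, 1) with u'(-1) = u'(1) = 0,
    a 1D stand-in for the forward operator F(c) = u(c) of (4.2)."""
    h = 2.0 / m
    n = m + 1
    A = np.zeros((n, n))
    i = np.arange(1, n - 1)
    A[i, i] = 2.0 / h**2 + c[i]                      # interior rows of -d2/dx2 + c
    A[i, i - 1] = A[i, i + 1] = -1.0 / h**2
    # Neumann rows via a mirrored ghost node: u_{-1} = u_1 and u_{m+1} = u_{m-1}
    A[0, 0], A[0, 1] = 2.0 / h**2 + c[0], -2.0 / h**2
    A[-1, -1], A[-1, -2] = 2.0 / h**2 + c[-1], -2.0 / h**2
    return np.linalg.solve(A, f)

x = np.linspace(-1.0, 1.0, 101)
c_true = 1.0 + (np.abs(x) < 0.5)                     # illustrative "blocky" potential
u = solve_potential_1d(c_true, np.ones_like(x))
```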
5 Conclusions

In this paper, we considered a class of nonsmooth composite minimization problems whose objective function is the sum of a data-fidelity term and two regularization terms: a traditional $L^2$-norm regularization term and a nonsmooth $L^1$-norm regularization term. We derived a proximal regularized Gauss-Newton method for these problems, and its local convergence was proved and demonstrated numerically. As a practical application, our method was used successfully to solve parameter estimation problems for elliptic equations; several numerical tests illustrate that it is effective and robust in solving such problems.
Acknowledgments

The authors sincerely thank the editors and the anonymous reviewers for their very helpful and kind comments, which improved the presentation of this paper.
Bibliography

[1] Stephan W. Anzengruber and Ronny Ramlau, Morozov's discrepancy principle for Tikhonov-type functionals with nonlinear operators, Inverse Problems 26 (2010), 025001.
[2] Anatolii Borisovich Bakushinskii, The problem of the convergence of the iteratively regularized Gauss-Newton method, Computational Mathematics and Mathematical Physics 32 (1992), 1503–1509.

[3] Anatoly B. Bakushinsky and Mihail Yu. Kokurin, Iterative Methods for Approximate Solution of Inverse Problems, 577, Springer Science & Business Media, 2004.

[4] Jonathan Barzilai and Jonathan M. Borwein, Two-point step size gradient methods, IMA Journal of Numerical Analysis 8 (1988), 141–148.

[5] Amir Beck and Marc Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM Journal on Imaging Sciences 2 (2009), 183–202.

[6] Andrea Borsic and Alan D. Adler, A primal-dual interior-point framework for using the L1 or L2 norm on the data and regularization terms of inverse problems, Inverse Problems 28 (2012), 095011.

[7] Jonathan M. Borwein and Adrian S. Lewis, Convex Analysis and Nonlinear Optimization: Theory and Examples, Springer, 2000.

[8] Kristian Bredies, Dirk A. Lorenz and Peter Maass, A generalized conditional gradient method and its connection to an iterative shrinkage method, Computational Optimization and Applications 42 (2009), 173–193.

[9] Christian Clason and Bangti Jin, A semismooth Newton method for nonlinear parameter identification problems with impulsive noise, SIAM Journal on Imaging Sciences 5 (2012), 505–536.

[10] Thierry Colin, Angelo Iollo, Jean-Baptiste Lagaert and Olivier Saut, An inverse problem for the recovery of the vascularization of a tumor, Journal of Inverse and Ill-Posed Problems 22 (2014), 759–786.

[11] Patrick L. Combettes and Valérie R. Wajs, Signal recovery by proximal forward-backward splitting, Multiscale Modeling & Simulation 4 (2005), 1168–1200.

[12] Ingrid Daubechies, Michel Defrise and Christine De Mol, An iterative thresholding algorithm for linear inverse problems with a sparsity constraint, Communications on Pure and Applied Mathematics 57 (2004), 1413–1457.

[13] Adrian Doicu, Franz Schreier and Michael Hess, Iteratively regularized Gauss-Newton method for atmospheric remote sensing, Computer Physics Communications 148 (2002), 214–226.

[14] Adrian Doicu, Thomas Trautmann and Franz Schreier, Iterative regularization methods for nonlinear problems, in: Numerical Regularization for Atmospheric Inverse Problems, Springer, 2010, pp. 221–250.

[15] Heinz Werner Engl, Martin Hanke and Andreas Neubauer, Regularization of Inverse Problems, Mathematics and Its Applications 375, Kluwer Academic Publishers Group, Dordrecht, 1996.
[16] Heinz Werner Engl, Karl Kunisch and Andreas Neubauer, Convergence rates for Tikhonov regularisation of non-linear ill-posed problems, Inverse Problems 5 (1989), 523–540.

[17] Qibin Fan, Yuling Jiao, Xiliang Lu and Zhiyuan Sun, Lq-regularization for the inverse Robin problem, Journal of Inverse and Ill-Posed Problems 24 (2016), 3–12.

[18] Gene H. Golub, Per Christian Hansen and Dianne Prost O'Leary, Tikhonov regularization and total least squares, SIAM Journal on Matrix Analysis and Applications 21 (1999), 185–194.

[19] Martin Hanke, A regularizing Levenberg-Marquardt scheme, with applications to inverse groundwater filtration problems, Inverse Problems 13 (1997), 79–95.

[20] Per Christian Hansen and Dianne Prost O'Leary, The use of the L-curve in the regularization of discrete ill-posed problems, SIAM Journal on Scientific Computing 14 (1993), 1487–1503.

[21] Kazufumi Ito and Karl Kunisch, Semi-smooth Newton methods for state-constrained optimal control problems, Systems & Control Letters 50 (2003), 221–228.

[22] Bangti Jin, Dirk A. Lorenz and Stefan Schiffler, Elastic-net regularization: Error estimates and active set methods, Inverse Problems 25 (2009), 115022.

[23] Jason D. Lee, Yuekai Sun and Michael A. Saunders, Proximal Newton-type methods for minimizing composite functions, SIAM Journal on Optimization 24 (2014), 1420–1443.

[24] Gisela L. Mazzieri, Ruben D. Spies and Karina G. Temperini, Mixed spatially varying L2-BV regularization of inverse ill-posed problems, Journal of Inverse and Ill-Posed Problems 23 (2015), 571–585.

[25] Jean-Jacques Moreau, Fonctions convexes duales et points proximaux dans un espace hilbertien, Comptes Rendus de l'Académie des Sciences (Paris) Série A Math 255 (1962), 2897–2899.

[26] Ronny Ramlau and Gerd Teschke, A Tikhonov-based projection iteration for nonlinear ill-posed problems with sparsity constraints, Numerische Mathematik 104 (2006), 177–203.

[27] Saverio Salzo and Silvia Villa, Convergence analysis of a proximal Gauss-Newton method, Computational Optimization and Applications 53 (2012), 557–589.

[28] Henry Stark and Yongyi Yang, Vector Space Projections: A Numerical Approach to Signal and Image Processing, Neural Nets, and Optics, John Wiley & Sons, Inc., 1998.

[29] Srdjan Stojanovic, Optimal damping control and nonlinear elliptic systems, SIAM Journal on Control and Optimization 29 (1991), 594–608.

[30] Gerd Teschke and Claudia Borries, Accelerated projected steepest descent method for nonlinear inverse problems with sparsity constraints, Inverse Problems 26 (2010), 025007.
[31] Curtis R. Vogel, Sparse matrix computations arising in distributed parameter identification, SIAM Journal on Matrix Analysis and Applications 20 (1999), 1027–1037.

[32] Masahiro Yamamoto and Jun Zou, Simultaneous reconstruction of the initial temperature and heat radiative coefficient, Inverse Problems 17 (2001), 1181–1202.

[33] Ye Zhang, Dmitry V. Lukyanenko and Anatoly G. Yagola, An optimal regularization method for convolution equations on the sourcewise represented set, Journal of Inverse and Ill-Posed Problems 23 (2015), 465–475.
Received Oct. 18, 2015; revised Apr. 12, 2016; accepted Jul. 12, 2016.

Author information

Hongsun Fu*, Department of Mathematics, Dalian Maritime University, Dalian 116026, China. E-mail: [email protected]

Hongbo Liu*, School of Information, Dalian Maritime University, Dalian 116026, China. E-mail: [email protected]

Bo Han, Department of Mathematics, Harbin Institute of Technology, Harbin 150001, China. E-mail: [email protected]

Yu Yang, School of Information, Dalian Maritime University, Dalian 116026, China. E-mail: [email protected]

Yi Hu, Department of Mathematical Sciences, Georgia Southern University, Statesboro, GA 30460, USA. E-mail: [email protected]