Regularization of Linear Least Squares Problems by Total ... - CiteSeerX

Regularization of Linear Least Squares Problems by Total Bounded Variation G. Chavent K. Kunischy November 1996

1 Problem statement and functions of bounded variation We consider the problem of nding approximate solutions to (1.1)

Tu = z

where T is a bounded linear operator from L2( ) to a Hilbert space Y by using regularization techniques based on bounded variation functionals. Here

is a bounded domain in IRn with Lipschitzian boundary and z 2 Y . The operator T is not assumed to be injective and it may be compact. In this case approximate solutions that are stable with respect to z can be obtained by solving the regularized problem. min 1 jTu ? zj2Y + juj2L2 (1.2) 2 2 or (1.3) min 21 jTu ? zj2Y + 2 jruj2L2 ; n

CEREMADE, Universite Paris-Dauphine and INRIA, Rocquencourt Fachbereich Mathematik, Technische Universitat Berlin and Institut fur Mathematik, Universitat Graz y

1

for example. In (1.2) and (1.3) one refers to > 0 as the regularization parameter. Regularization by the square of a Hilbertian norm or seminorm as in (1.2) and (1.3) is well-studied. We refer to [B], [Gr], [L] and the references given there. While (1.2) and (1.3) have attractive analytical properties concerning the stability of their solution with respect to perturbations in the data z and their asymptotic behavior as ! 0 they can have serious practical disadvantages. The L2 -regularization term in (1.2) is not satisfactory to dampen undesired oscillations in the solution u which result from highly noisy data and/or compactness of T . Regularization with the square of the gradient as in (1.3) reduces these oscillations but it overregularizes the solution in the neighborhood of edges and corners. This is well-known in the context of image enhancement, see for example [ROF] and the references given there, and one can quickly convince oneself of this fact by a short MATLAB-code even with T = identity. Due to these shortcomings with (1.2) and (1.3) regularization based on total bounded variation functionals has recently been investigated in several publications [CKP], [ROF], [V], for example. In this case the regularized problems are given by (variations of) Z 1 2 min 2 jTu ? zjY + jruj; (1.4)

R where jruj denotes the bounded variation seminorm, which will be deof this section. For the moment we may think of the term Rscribed at the end 1 ; 1

jruj as the W ( ) seminorm. In the context of image enhancement (1.4) has proved to be extremely eective. Comparing (1.4) to (1.2) or (1.3) we immediately recognize that while (1.2) and (1.3) are quadratic, (1.4) is not. In fact, (1.4) is highly nonlinear and this results in interesting problems both analytically and numerically. In [CKP] basic facts concerning existence of solutions to (1.4), the optimality system, approximation of (1.4) by nite dimensional problems and algorithms that are speci cally adapted for the nonlinear structure of problem (1.4) were presented. In the present analysis we focus on the problem of stability of the solutions to (1.4) with respect to perturbations in the data z and on the asymptotic behavior as ! 0+. Let us now specify the problem to be investigated: 8 Z < min 1 j Tu ? z j2 + ju j2 + jru j Y (P) 2 2 L2

: over u 2 K \ X; 2

where 0; > 0; K is a closed, convex subset of L2 ( ); X = L2 ( ) \ BV ( ), and BV ( ) denotes the space of bounded variation functions to be de ned below. The space X endowed with the norm jujX = jujL2 + jujBV is a Banach space. We shall see that due to the assumption that is bounded X and BV ( ) are equivalent if n 2. The use of the quadratic regularization term serves two purposes. First, for > 0 it provides a coercive term for the subspace of constant functions which are in the kernel of the r-operator (and can be in the kernel of T ) and secondly it gives the possibility to distinguish the structure of stability results for quadratic regularization terms from that of the nonquadratic BV -term. Let us now summarize results from the theory of functions of bounded variation that are relevant to our work. We refer to [CKP], [Gi], [T] for details. As mentioned above is a bounded domain in IRn with Lipschitzian boundary. A function u 2 L1 ( ) is said to be of bounded variation if (1.5) sup

Z

u(x) div v(x)dx : v 2 C 1( )n; 0

jv(x)j1 1; x 2 < 1:

HereR j j1 denotes the l1-norm on IRn. The sup in (1.5) will be denoted by jruj. The space of all functions in L1 ( ) with bounded variation is abbreviated by BV ( ). Endowed with Z

jujBV = jujL1 + jruj;

(1.6)

BV ( ) is a Banach space. Equivalently BV ( ) can be introduced in terms of measures. Let M ( ) denote the space of real Borel measures on which is a Banach space when endowed with the norm Z

jj( ) = jj; 2 M ( ); jj = + + ? being the total variation measure associated to and jj( ) is the total variation of in , [R]. It is known that M ( ) is the dual space of C0( ), the space of continuous function vanishing on the boundary of

and Z Z jj = sup v(x)d(x) : v 2 C0( ); jvjC ( ) 1 :

3

Thus BV ( ) can equivalently be expressed as the space of functions u 2 L1 ( ) for which @x u 2 M ( ), for every 1 i n, with the partial derivatives @x understood in the distributional sense. If the vector valued measure space M n( ) is endowed with the l1 -norm, in the sense that i

i

Z

* j= j

then

n Z X

* i j; * 2 M n( ); j

i=1

n Z X

Z

jujBV = jujL1 + jruj = jujL1 +

i=1

j@x uj; for u 2 BV ( ): i

If instead of the l1-norm on IRn in (1.5) one uses the lp-norm, 1 p < 1, then this induces an equivalent norm on BV ( ).

Proposition 1.1 1 a) If fuj g1 j =1 BV ( ) and lim uj = u in L ( ) then Z

Z

jruj lim inf jruj j:

1 b) For every u 2 BV ( ) \ Lr ( ); r 2 [1; 1) there exists fuj g1 j =1 C ( ) such that

lim uj = u in L

Z

and lim j@x uj j =

r ( )

i

Z

j@x uj; 1 i n: i

c) For every bounded sequence fuj g1 j =1 BV ( ) there exists a subse1 quence fuj gk=1 and u 2 BV ( ) such that lim uj = u in Lp ( ); p 2 [1; n?n 1 ), if n 2, and lim uj = u in Lp; p 2 [1; 1), if n = 1. (Compactness). k

k

k

d) There exists a constant C = C (n) such that Z

?1

ju ? u~j ?1 dx

n

n

R

n

n

Z

C jruj; for all u 2 BV ( );

with u~ = j 1 j u dx. (Sobolev inequality). If n = 1, then the Ln= ?1 norm is understood to be the L1 -norm. 2 n

4

As a consequence of Proposition 1.1 d) we have the following Corollary 1.2 If n = 1 or 2 then BV ( ) and X are equivalent Banach spaces. 2 Sketch of proof of Proposition 1.1. The assertions of Proposition 1.1 are small modi cation of results given in [Gi]. The semicontinuity of the bounded variation functional asserted in a) is proved in ([Gi], Theorem 1.9). As for b) the proof of Theorem 1.17 in [Gi] can be generalized from r = 1 to arbitrary r 2 [1; 1). An application of Rellich's Theorem, see ([Gi], Theorem 1.19) implies c). Finally d) is well-known for u 2 C 1( ). For arbitrary u 2 BV ( ) there exists as Ra consequenceR of b) a sequence uj 2 C 1( ) such that uj ! u in L1 ( ) and jruj j ! jruj. Applying d) to the sequence fuj g we ?1 ( ) and hence uj converges weakly obtain that fuj g1 j =1 is bounded in L to u in L ?1 ( ) and lim u~j = u~. It follows that n

n

n

n

ju ? u~jL

Z

?1

n

n

Z

lim juj ? u~j j lim C jruj j = C jruj:

The subsequent sections are organized as follows. In Section 2 sucient conditions for existence of a unique solution to (P) and the optimality system are given. Section 3 is dedicated to the stability of the solutions with respect to the data z. The asymptotic behavior of the solutions to (P) as ! 0; ! 0 is analyzed in Section 4. 2

2 Existence and optimality conditions We shall use the following hypothesis. One of the following properties holds: (i) > 0, (H) (ii) n = 1 or n = 2, and const 2= ker T , (iii) K is bounded in L2 ( ).

Theorem 2.1 If (H) holds, then there exists a solution to (P). If moreover

> 0 or T is injective then the solution is unique. 5

2

Proof. Let fuj g1j=1 be a minimizing sequence. If (H) (i) or (iii) hold, then fuj g is bounded in L2 ( ). In the case of (H) (ii) boundedness of fuj g in L2 ( ) follows from Proposition 1.1 d) and the fact that > 0. Thus in either case (H) and the form of the cost functional imply that fuj g is bounded in X . By Proposition 1.1 c) there exists a subsequence of fuj g denoted by the same symbol and u 2 X , such that fuj g converges to u strongly in L1( ) and weakly in L2 ( ). It follows that u 2 K . Moreover, T maps weakly convergent sequence in L2 ( ) into weakly convergent sequences in Y . By

Proposition 1.1 a) we nd 1 jT u ? zj2 + j u j2 + Z jru j 2

2 Z 1 2 2 lim inf 2 j Tuj ? z j + 2 j uj jL2 + jruj j Z inf 12 j Tu ? z j2 + 2 j u j2L2 + jru j : u 2 K and thus u is a solution to (P). Concerning uniqueness we note that in the case > 0 the cost functional of (P) is strictly convex and hence u is unique. In the case of injectivity of T the cost functional of (P) is again strictly convex and uniqueness of u follows. 2 Remark 2.2 Let us discuss the possibility of relaxing the conditions of Theorem 2.1 to obtain uniqueness of the solution to (P ) by choosing an equivalent norm for BV ( ). a) Suppose that const 2= ker T holds and that in the de nition of BV ( ) in (1.5) the l1-norm is replaced by the lp-norm for some p 2 (1; 1). Then the functional Z (2.1) u ! jru j 1;1 ( ) is strictly convex, i.e. R jr(u + v )j = R jruj + restricted to W

R 1;1 ( ) implies that ru = rv or rv = ru jr v j for u; v 2 W

for some 0. Let us argue that the solutions that are elements of W 1;1( ) are unique under the assumptions made at the beginning of the remark. In fact, if u1 and u2 were two solutions to (P) in W 1;1( ), then Tu1 = Tu2, since otherwise 21 (u1 + u2) would render the cost in (P) smaller than u1 which is impossible. It follows that Z

Z

Z

jr(u1 + u2)j = jru1j + jru2j; 6

and hence ru1 = ru2 (or ru2 = ru1) for some 0. In view of the facts that Tu1 = Tu2 and that u1 and u2 are minimizers of (P) it follows that ru1 R= ru2. Due Rto Proposition 1.1 d) we further nd that u1 ? u2 = j 1 j u1 dx ? j 1 j u2 dx and speci cally u1 ? u2 is a constant. By assumption const 2= ker T and hence u1 = u2. b) Next we argue that the functional Z

u ! jru j

is not strictly convex on BV ( ) regardless of the choice of p 2 [1; 1] that is taken for the j v(x) jl 1 condition in (1.5). In fact, let

= (?1; 1) (?1; 1); l = (?1; 0] (?1; 1); r = (0; 1) R(?1; 1) and uR 1 = 12 on r ; u1 = 0 on R l ; u2 = ? 12 on l ; u2 = 0 on r : j rR u1 j =

j ru2 j = 1. Moreover j r(u1 + u2 ) j = 2 and hence u ! jru j is not strictly convex on BV ( ). (We thank Dr. Klaus Deckelnick, University of Sussex, for providing the example.) 2 We now specify the optimality system for (P). Theorem 2.3 Let u 2 K \ X . Then u is a solution of (P) if and only if there exists in the dual space of M n ( ) such that D E (2.2) (T (T u ? z) + u; u ? u)L2 ( ) ? div ; u ? u BV ;BV 0 for all u 2 K \ X Z Z D E (2.3) ; ? ru M ;M + jruj jj; for all 2 M n( ): p

n;

n

Here T denotes the adjoint of T 2 L(L2( ); Y ) and - div stands for the conjugate of r : BV ( ) ! M n ( ), i.e. D E D E - div ; u = ; ru for u 2 BV ( ): BV ;BV

M n; ;M n

2

The proof can be obtained with only minor modi cations from that of ([CKP], Theorem 2.2) and it is therefore omitted. Let us note that (2.3) is equivalent to D E Z ; ru = jruj (2.4)

7

and (2.5)

D

Z

E

; jj for all 2 M n ( ):

D

E

Note that (2.5) implies that the restriction to L1n of ! ; can be identi ed with an element in L1 n ( ) with jjL1 1. The Lagrange multiplier of Theorem 2.3 is not unique. In the following theorem we give a necessary and sucient optimality system that involves a unique Lagrange multiplier q 2 H 1( ). Let us recall some function spaces involving the divergence operator [GR]: n

n

o

H (div) = v 2 L2n ( ) : div v 2 L2 ( ) ; which is a Hilbert space for the norm

jvjH (div) = jvj2L2 + jdiv vj2L2 n

1=2

;

and the closed subspace of H (div):

H0(div) = D( )n; where D( )n denotes the space of in nitely dierentiable vector-valued functions with support in and the closure is taken in the norm of H (div). The trace operator (v) = v n j ? can be extended as continuous linear operator from H (div) to H ?1=2 (?). Here n denotes the outer unit normal to the Lipschitzian boundary ? of . The space H0(div) can be characterized as

H0(div) = fv 2 H (div) : (v) = 0g : Theorem 2.4 Let K = X Rand let u be a solution to (P). Then there exists a unique q 2 H 1q ( ) with q dx = 0 such that for ~ = rq we have ~ 2 H0 (div); j~jL2 meas ( ), n

(2.6)

(T T + )u ? T z = div ~ in L2 ( ) (? div ~; u ? u)L2 ( ) +

Z

Z

jru j jru j for all u 2 X:

Conversely, if there exists a pair (u; ~) 2 X H0 (div) such that (2.6), (2.7) hold, then u is a solution to (P). 2

(2.7)

8

Remark 2.5 >From the proof it will follow that ~ of Theorem 2.4 is ob-

tained from the Lagrange multiplier of Theorem 2.3 by restriction of the functional ! h; iM ;M to L2n( ) and subsequent projection of onto the orthogonal complement in L2n( ) of the divergence-free space. We recall the decomposition of L2n( ) as n;

n

L2n( ) = H H ?; where H = fv 2 H0 (div) : div v = 0g, and H ? denotes the orthogonal complement of H in L2n ( ). Accordingly the restriction to L2n ( ) of of Theorem 2.3 can be decomposed as 1 + 2 2 H H ? and 2 can be identi ed with the Lagrange multiplier ~ of Theorem 2.4. 2 Proof of Theorem 2.4. By Theorem 2.3 there exists 2 M n( ) , the dual of M n ( ), such that (2.2) and (2.3) hold. Since K = X we nd D

(2.8) ((T T + ) u ? T z; v)L2 = ? ; rv

E

M n; ;M n

for all v 2 X:

Let us consider the restriction of the functional ! h; iM ;M to L2n ( ).By 2 the Riesz representation theorem it can be identi ed with an q element of Ln ( ) which we denote by ^. Due to (2.5) we nd that j^ jL2 meas ( ). Taking the distributional derivative in ((T T + ) u ? T z; v)L2 = ? (^; rv)L2 for all v 2 H 1( ); n;

n

n

n

it follows that div ^ can be considered as bounded linear functional on L2( ). Hence ^ can be identi ed with an element of H (div) and (2.9) (T T + ) u ? T z = div ^ in L2( ): Combining (2.8) and (2.9) it follows that

div ^; v

(2.10)

D

= ? ; rv L2

E

M n; ;M n

for all v 2 X:

Let us argue that ^ 2 H0 (div). By Green's formula and (2.10) we have

div ^; v

+ ^; rv L2

L2

=

Z

?

v ^ n ds = 0 for every v 2 D( ): 9

Hence the trace of ^ satis es (^) = 0 and ^ 2 H0 (div). Let H = fv 2 H0(div) : div v = 0g. Since H is a closed subspace of L2n( ), we have (2.11) L2n = H H ?; with H ? the orthogonal complement of H in L2n ( ). It is well-known that H ? = frq : q 2 H 1( )g. Let us decompose ^ = 1 + 2 2 H H ?; Rwith div 1 = 0 and 2 = rq. The element q is unique when normalized by

q dx = 0. Then (2.6) follows from (2.9). To verify (2.7) let us observe that by (2.3) D

; r(u ? u)

E

M n; ;M n

Z

+

and by (2.10) (? div 2; u ? u)L2 +

Z

jr u j jr u j for all u 2 X

Z

Z

jr u j jr u j for all u 2 X:

Finally concerning the L2n -norm of 2 we have due to the orthogonal decomposition (2.11) q j2 jL2 j jL2 meas ( ): This concludes the rst part of the proof with ~ = 2. Conversely assume that (u; ~) 2 X H0(div) satis es (2.6), (2.7), and let u be an arbitrary element in X . Let F denote the cost functional in (P). Then by (2.6), (2.7) n

n

Z

Z

F (u) ? F (u) ((T T + ) u; u ? u)L2 + jr u j ? jr u j = (div ~; u ? u) ? (div ~; u ? u) 0 and u is a solution to (P). 2 Corollary 2.6 Under the Rassumptions of Theorem 2.4, q is the unique variational solution satisfying q dx = 0 to the Neumann problem ( ? q = (T T + )u + T z in

(2.12) @q @n = 0 on ?: 10

The compatibility condition is clearly satis ed since Z

((T T + ) u ? T z) dx = 0

by (2.8) with v = 1. If the boundary of is C 1;1 -smooth then q 2 H 2( ). 2

Corollary 2.7 Under the assumption of Theorem 2.4 Z ~ ?div ; u L2( ) = jru j (2.13)

and (2.14)

div ~; u

Z

jru j for all u 2 X: L2 ( )

2

Equality (2.13) follows from (2.7) with u = 0 and u = 2u. By (2.7) and (2.13) one obtains (2.14).

3 Stability with respect to data This section is dedicated to the analysis of the stability of the solution to (P) with respect to the data z. It will be convenient to denote (P) by (Pz ) and solutions to (Pz ) by uz . Let us rst give the stability result that can be attained from a-priori estimates. Theorem 3.1 Assume that (H) holds, let fzng be a sequence of observations in Y converging to z, and let fuz g denote a sequence of solutions to (Pz ). Then there exists a subsequence fuz g of fuz g and uz 2 L2 ( ) such that n

n

nk

n

Z

(3.1) lim uz = uz weakly in L2 ( ) and lim jruz j = nk

nk

Z

jruz j;

and every such sequence has the property that uz is a solution to (P). If 2 (H) (i) holds then lim uz = uz strongly in L2 ( ). Proof. For simplicity let us denote un = uz . We have for every u 2 K Z 1 2 2 (3.2) 2 jTun ? zn jY + 2 jun jL2 + jrun j 12 jTu ? zn j2Y

Z + j u j2L2 + jru j: 2

nk

n

11

If (H) (i) or (iii) hold then it follows directly that fung is bounded in X . In the case of (H) (ii) boundedness of fung in X follows from Proposition 1.1 d). Thus there exists a subsequence of fung denoted, for simplicity by the same symbol and uz 2 L2( ) such that (3.3) lim un = uz weakly in L2 ( ) and lim un = uz in L1 ( ): It follows that uz 2 K and Z Z lim Tun = Tuz weakly in Y and jruz j lim jrun j:

We can now pass to the limit in (3.2) to obtain 1 jTu ? z j2 + j u j2 + Z jru j 1 j Tu ? z j2 + j u j2 + Z jru j Y Y 2 z 2 z L2 z 2 2 L2 for every u 2 K and hence uz is a solution to (Pz ). Let us show next the second convergence statement in (3.1). We have 1 j Tu ? z j2 + ju j2 + lim Z jru j n 2 z 2 z

Z 1 2 2 lim 2 j Tun ? zn j + lim 2 j un j + lim jrun j Z 1 2 2 lim 2 jTun ? zn j + 2 j un j + jrun j Z 1 2 2 lim 2 jTuz ? zn j + 2 j uz j + jruz j Z = 12 j Tuz ? z j2 + 2 j uz j2 + jruz j;

R R R lim jrun j jruz j, which together with jruz j and hence R lim jrun j implies the second statement in (3.1). If (H) (i) holds then one shows similarly that lim j un jL2 = j uz j, which together with weak convergence implies strong convergence in L2 ( ) of un to uz . 2 The following lemma will be useful. Lemma 3.2 For every pair of solutions uz and uz with associated Lagrange multipliers z and z we have hz ? z; r(uz ? uz)iM ;M 0: n;

n

2

12

Proof. From (2.3) of Theorem 2.3 we deduce Z

Z

Z

Z

hz ; ruz ? ruz i + jruz j jruz j and

hz; ruz ? ruzi + jruz j jruz j:

Adding these two inequalities implies

hz ? z; ruz ? ruz i 0 which is the desired result. Theorem 3.3 Let (H) (i) hold. Then for every pair z; z 2 Y j u ? u j 2 1 k T k 2 j z ? z j : z

z L

L(L ;Y )

2

Y

Proof. From (2.2) of Theorem 2.3 we have h(T T + )uz ? T z ? div z ; uz ? uz i 0 h(T T + )uz ? T z ? div z; uz ? uzi 0 (the duality pairing h; i must be interpreted as in (2.2)) and hence h(T T + )(uz ? uz) ? T (z ? z) ? div (z ? z); uz ? uz i 0:

2

Lemma 3.2 implies that

h(T T + )(uz ? uz) ? T (z ? z); uz ? uzi hz ? z; r(uz ? uz )i 0; and hence

j uz ? uz j2 k T k j uz ? uz j | z ? z j; andtheresultfollows: Let Br denote the ball with center at the origin and radius r in Y .

13

2

Theorem 3.4 Let (H) (i) hold and let r > 0. Then Z

Z

j jruz j ? jruz j j r 2 + 1 k T k4 jz ? z j; for every z; z 2 Br . Here we assumed without loss of generality that k T k 1. 2 Proof. From Theorem 3.3 we conclude that j uz jL2 r k T k for every z 2 Br : (3.4)

Utilizing the fact that uz is a solution to (Pz ) we nd Z Z i h jr uz j ? jruz j 12 j Tuz ? z j2 ? j Tuz ? z j2 + j uz j2 ? j uz j2 : Similarly, since uz is a solution to (Pz) Z Z i h jruz j ? jruz j 21 j Tuz ? z j2 ? j Tuz ? z j2 + j uz j2 ? j uz j2 : Combining these two inequalities implies by Theorem 3.3 and (3.4) that Z

Z

j jruz j ? jruz j j 2 j uz ? uz j j uz + uz j

+ 21 j Tuz ? Tuz j (j Tuz + Tuz j + 2r)

r k T k2 j z ? z j+ k T k3 j uz ? uz j r + r

1 r = 2 + k T k4 j u ? z j;

and the desired estimate follows.

2

The last aspect that we address in this section is the stability of the dual variable = z with respect to perturbations in the data z. It will be assumed that (H) (i) is satis ed. This implies existence of a unique solution uz to (Pz ) for every z 2 Y . 14

Theorem 3.5 Assume that (H) (i) holds and that K = X . Let fzng be a

sequence in Y with limit z, and let (uz ; z ) be the solutions and Lagrange multipliers to (Pz ) according to Theorem 2.3. Then uz converges strongly in L2 ( ) to the solution uz of (Pz ) and lim R jruz j = R jruz j. Furthermore there exists a subsequence fz g of fz g converging weakly in H (div) and weak in L1( ) to some z 2 L1 n ( ) \ H0 (div) and for every such cluster point we have (3.5) (T T + )uz = T z + div z in L2 ( ); Z ? (div z ; uz )L2 = jruz j; and (3.6) n

n

n

n

n

nk

(3.7)

(z ; )L1;L1

n

Z

j j dx; for all 2 L1n( ):

2

Proof. The rst part of the claim follows from Theorem 3.1. For simplicity we shall write (un; n) in place of (uz ; z ). The optimality systems for (Pz ) are given by (3.8) (T T + ) un = T z + div n; Z hn; runiM ;M = jrun j; (3.9) n;

n

Z

j j; for all 2 M n ( ): Since L1n( ) M n ( ), the restrictions of ! hn; iM ;M to L1n( ) can be identi ed with elements in L1 n ( ) with the property that j n jL1 1 for all n, by (3.10). From (3.8) it follows that fng is bounded in H0(div) and thus there exists a subsequence of fng denoted by the same symbol 1 and z 2 L1 n ( ) \ H0 (div) such that j z jL1 1; n ! z weak in Ln ( ) and n ! z in H0 (div). Taking the limit in (3.8) implies (3.5) and (3.7) is equivalent to j z jL1 j. To verify (3.6) one argues as in the proof of (3.10)

hn; iM

n

n

n

;M

n;

n

n;

n

n

n

Theorem 2.4 | see (2.10) | that (3.9) implies n

(3.11)

Z

? (div n; un)L2 = jrun j:

Taking the limit in (3.11) one obtains (3.6).

2

It is simple to pass to the limit in the optimality system (2.6), (2.7) and to obtain the following result. 15

Theorem 3.6 Let the assumptions of Theorem 3.5 hold and let (uz ; ~n) de-

note the solutions to (Pz ) with the associated Lagrange multipliers according to Theorem 2.4. Then ~ z converges strongly in L2n( ) to ~ z = rqz and weakly in H0 (div) and (T T + )uz = T z + Z div ~z in LZ 2( ) ?div ~z ; u ? u~z 2 + jruz j jru j for all u 2 X: n

n

n

L

2

4 Asymptotic analysis In this section convergence and rate of convergence of the solutions of the regularized problems as the regularization parameter tends to zero is analyzed. We consider the case where = and for convenience we repeat the problem formulation: (P )

8 < :

Z min 21 j Tu ? z j2Y + 2 j u j2L2 + jru j

over u 2 K \ X;

>From Theorem 2.1 it is known that (P ) admits a unique solution u for every > 0 and z 2 Y . The unregularized least squares problem is given by 8 < min 1 j Tu ? z j2 Y (P0) 2 : over u 2 K \ X: Let us henceforth assume that z 2 Y is xed and that z has a projection z^ 2 Y on range T (K \ X ). Since K is assumed to be convex this projection is necessarily unique. It then follows that (P0) admits a solution and we de ne the set of solutions S = fu 2 K \ X : Tu = z^g: Let us introduce the notion of minimal solution to (P0) in S . For that purpose we consider 16

8 < :

(Q)

min

u 2 S:

1 j u j2 + Z jru j 2 L2

By Proposition 1.1 it is simple to argue the existence of a solution to (Q). Due to the strict convexity of the L2-part of the cost functional in (Q) this solution is unique. It will be denoted by u^z and it is called the minimal solution of (P0). Proposition 4.1 There exists z in the dual of M n ( ) such that (ûz ; u ? u^z )L2 ? hdiv z ;Ru ? u^z iBV ;BV 0 for all u 2 S ; hz ; ? ru^z iM ;M + jru^z j R j j for all 2 M n ( ): n;

n

Here, as in Section 2, ?div is the adjoint to r : BV ( ) ! M n ( ). 2 Let us give a rst result on the behavior of the solutions u as a function of .

Proposition 4.2 R R (i) 21 ju j2L2 + jru j 21 j u^z j2L2 + j u^z j; for every > 0, (ii) j T u^z ? z jY j Tu ? z jY ; for every > 0,

(iii) lim !0+ u = u^z strongly in Lp ( ); p < n?n 2 ; lim !0+ u = u^z weakly in L2 ( ) and lim !0+ Tu = T u^z strongly in Y . (iv) If z 2 range (T ), then j Tu ? z j = o

p

.

2

Proof. The second assertion follows from the fact that u^z 2 S . To verify (i) we use (ii) and the fact that 1 j Tu ? z j2 + j u j2 + Z jru j 2 2 L2

Z (4.1) 21 j T u^z ? z j2Y + 2 ju^z j2L2 + jru^z j: >From (i) it can be deduced that fj u jX g >0 is bounded. Hence there exists a subsequence of fu g >0 , denoted by the same symbol and v 2 X such that 17

u ! v strongly in Lp( ); p
0 with nlim !1 n (vn ? u) = v ;

the limit being taken in X , and the negative polar cone N (C; u) X at u 2 C is given by n o N (C; u) = T (C; u)? = v 2 X : hv; uiX ;X 0 for all u 2 T (C; u) : 18

Note that u^z ? div z de nes an element of X which we denote by the same symbol. Referring to Proposition 4.1 we have (4.5)

div z ? u^z 2 T (S ; u^z )? :

We shall denote the intersections of ker T and K with BV by kerBV T and KBV respectively. It is simple to argue that

T (S ; u^z )? T (KBV ; u^z ) \ kerBV T: De nition 4.3 The minimal solution u^z is called quali ed if T (S ; u^z ) = T (KBV ; u^z ) \ kerBV T: (4.6)

2

Clearly u^z is quali ed if kerBV T = f0g or if KBV = BV ( ). If 0 2 int (KBV ? kerBV T ) then u^Z is quali ed [AE]. If

KBV = fu 2 X : fi(u) bi ; i = 1; : : : ; M g ; with M xed, fi 2 X ; bi 2 IR, then one can proceed as in ([CK], Prop. 3) to show that every element of KBV is quali ed. Proposition 4.4 If u^z is quali ed then div z ? u^z 2 T (KBV ; u^z )? + (kerBV T )?

X

;

2

where the bar with index X denotes closure in X . Proof. Due to (4.5) and the quali cation condition X

div z ? u^z 2 (T (KBV ; u^z ) \ kerBV T )? = T (KBV ; u^z )? + (kerBV T )? ; the last identity being well-known for polar cones ([AE], pg. 175).

2

Let us note that (kerBV T )? cannot be replaced by Rg(T 0) since X is not re exive and hence Rg(T 0) may be a strict subset of (kerBV T )?. Here 19

T 0 : Y ! X denotes the conjugate of T jX 2 L(X; Y ). Let us also note that due to the assumption that T 2 L(L2 ( ); Y ) the elements in the range of T 0 can be identi ed with elements of L2. The condition, which insures strong convergence of u to u^z is given by L

?u^z 2 ? div z + T (KBV ; u^z )? + Rg T 2 :

(H1)

Here the bar with index L2 denotes the closure in L2 . Singe Rg T L it is assumed implicitly in (H1) that T (KBV ; u^z )? ? div z can be identi ed with a subset of L2 ( ). If T (KBV ; u^z )? equals f0g, it is thus required that div z can be identi ed with an element of L2 ( ). We shall discuss this requirement after Theorem 4.6. Under the additional assumption that (H2) ?u^z 2 ? div z + T (KBV ; u^z )? + Rg T ; 2 ( ),

rate of convergence can be proved.

Theorem 4.5 i) If (H1) holds and ! 0+ ; 2 = () ! 0 then the solutions u to (P ; ) converge strongly to u^z in L2 ( ). ii) If (H2) holds and ! 0+ ; , then

j u ? u^z jL2 = O and

q

;

j Tu ? z^ jY = O ( ) :

2

Proof. By the de nition of u 1 j Tu ?z j2 + ju j2 + Z jru j 1 j T u^ ?z j2 + j u^ j2 + Z jru^ j; z 2 Y 2 L2 2 z Y 2 z L2

20

and hence 1 j Tu ? z^ j2 + ju ? u^ j2 1 j T u^ ? z j2 + 1 jTu ? z^ j2 ? 1 j Tu ? z j2 Y 2 2 z 2 z Z 2 Z 2 + j u^z j2 + j u ? u^z j2 ? j u j2 + jru^z j ? jru j : 2 2 2

Using a2 + b2 ? j a + b j2 = ?2(a; b) twice yields 1 j Tu ? z^ j2 + j u ? u^ j2 Y 2 2 z Z Z (Tu ? z^; z ? z^)Y + (ûz ; u^z ? u ) + jru^z j ? jru j (Tu ? z^; z ? z) + (Tu ? z^; z ? z^) + Z (ûz ; u^z ? u )Z + jru^z j ? jru j (4.7)

(Tu ? z^; z ? z) + (ûz ; u^z ? u ) +

Z

Z

jru^z j ? jru j :

Let " > 0 be arbitrary and by (H1) choose w 2 Y; 2 T (KBV ; u^z )?; 2 L2 ( ) such that

u^z = div z ? + T w + ; j jL2 ": The L2 -inner product (ûz ; u^z ? u ) can be interpreted as duality pairing (ûz ; u^z ? u )X ;X and hence 1 jTu ? z^ j2 + j u ? u^ j2 2 2

(4.8)

Y

2

z L

(Tu ? z^; z ? z)Y + (T w; u^z ? u )X ;X + (z ; ru ? ru^z )M ;M + (; u^z ? u )L2 Z Z + (; u ? u^z )X ;X + jru^z j ? jru j

( + j w jY ) j Tu ? z^ j + " j u ? u^z jL2 : n;

n

The following argument is now standard [CK]. The last inequality can be expressed as (4.9) a2 + b2 2Aa + 2Bb; p p where a = j Tu ? z^ j; b = ju ? u^z j; A = ( + j w jY ) and B = ". 21

It is simple to argue that (4.9) implies a 2A + B and b A + 2B; and (i) follows. If (H2) holds then one can take = 0 and (4.8) implies j Tu ? z^ j2Y + ju ? u^z j2 2( + j w jY ) j Tu ? z^ j: If then the asymptotic behavior asserted in (ii) follows. 2 2 Since the regularization term involves an L - as well as a BV -part we aim in the nal result of this paper to improve the R convergence properties of Theorem 4.5. We cannot expect convergence of jru ? ru^z j. In fact, let us assume that = (0; 1), and let u be Heaviside functions with R jump at 1 + ; 2 (0; 1 ). Then u 2 BV (0; 1) and j u ? u jL2 ! 0; j jru j ? 2 R2 j j !R 0 where u is the Heaviside function with jump at 21 . On the

jru other hand jru ? ru j = 2 for all 2 (0; 21 ). Theorem 4.6 In addition to the assumptions of Theorem 4.5 (ii) assume that div z 2 X can be identi ed with an element of L2 ( ). Then Z

Z

j jru j ? jru^z j j K j u ? uz jL2

for a constant K independent of 2 (0; 1]. In particular Z

Z

j jru j ? jru^z j j = O

q

:

Proof. By Proposition 4.1 Z Z jru^z j ? jru j (div z ; u ? u^z ) j div z jL2 j u ? u^z j:

2

Similarly by Theorem 2.3 Z

Z

jru j ? jru^z j (div ; u^z ? u ) j div jL2 j u ? u^z j:

The proof will be completed by arguing that fdiv g 2 (0;1] is bounded in L2 ( ). We have by Theorem 2.3 (T T + ) u = T z + div ; 22

and therefore

jdiv jL2 j u jL2 + k T kL(L2;Y ) j Tu ? z jY j u jL2 + k T kL(L2;Y ) (j Tu ? z^ j + j z^ ? z j) j u jL2 + O( ) k T kL(L2;Y ) :

Remark 4.7 We consider several examples illustrating the condition div z 2 L2 ( ), where z satis es the second condition stated in Proposition 4.1, i.e.

hz ; ? ru^z iM

;M

n;

n

+

Z

Z

jr u ^z j j j for all 2 M n ( );

which can equivalently be expressed as (4.10)

hz ; ru^z iM

;M

n;

n

=

Z

jru^z j;

and Z hz ; iM ;M j j for all 2 M n ( ): (4.11) While these examples will be one-dimensional with = (0; 1), higher dimensional examples can be obtained by extending the functions by constant values in the remaining directions. Let us brie y discuss the requirements that are imposed on z by (4.10), (4.11) and z ; div z 2 L2 ( ) in the one-dimensional case. First of all this necessitates that z 2 H 1( ) C ( ). Satisfying (4.10) amounts, formally, to setting z = sign u^0z at points where u^z is smooth but nonconstant. Since z 2 C ( ), condition (4.11) necessitates by the de nition of M ( ) as the dual of C ( ) to choose z 2 C ( ) such that jz (x)j 1 for all x 2 . Summarizing we see that (at least for piecewise smooth u^z ) the existence of z satisfying the speci ed requirements depends on the possibility of extending the graph of f(x; sgn u^0z (x)) : x such that u^0z (x) 6= 0g to a graph with domain

and ju^0z (x)j 1 for all x 2 . - We now give three examples which illustrate the existence of z - or the lack thereof. (i) Let 8 3 on (0; 1=4) > > > < 4 on (1=4; 1=2) u^z = > 2 on (1=2; 3=4) > > : 1 ? x on (3=4; 1); n;

n

23

then z can be chosen as a C 1( )-function with j z jC 1 (so that (4.11) is satis ed) and such that it equals 1 in a neighborhood of 1/4 - 1 in a neighborhood of 1/2 - 1 in a neighborhood of 3/4 - 1 on (3/4, 1). Then z thus de ned satis es (4.10), (4.11) and div z 2 L2 ( ). (ii) If u^z is chosen as in (i) but with u^z = x ? 1 on (3=4; 1) instead of 1 ? x, then satisfying (4.10) requires that z equals 1 in a neighborhood of 1/4 - 1 in a neighborhood of 1/2 - 1 in a neighborhood of 3/4 1 on (3/4, 1), which can clearly not be satis ed by any function in C 1( ). So in this case there exists no regular z satisfying (4.10) and (4.11). (iii) For u^z given by 8 > > > < u^z = > > > :

sin x on h 0; 21 i 1 on 21 ; 1 ? 21 sin ((x ? 1) + 1) on 1 ? 21 ; 1 ;

with 2 (1; 1) one chooses z 2 C 1( ); j z jC 1 such that 8 < z = :

1 on 0; 21 ?1 on 1 ? 21 ; 1 :

Clearly z satis es (4.10), (4.11) and div z 2 L2 ( ). For = 1 no such z can be found.

2 24

References [AE] J.-P. Aubin and I. Ekeland: Applied Nonlinear Analysis, WileyInterscience, New York, 1984. [B] J. Baumeister: Stable Solutions of Inverse Problems, Vieweg, Braunschweig, 1987. [CK] G. Chavent and K. Kunisch: Convergence of Tikhonov regularization for constrained ill-posed inverse problems, Inverse Problems, 10 (1994), 63 { 76. [CKP] E. Casas, K. Kunisch, and C. Pola: Regularization by functions of bounded variation and applications to image enhancement, preprint. [Gi] E. Giusti: Minimal Surfaces and Functions of Bounded Variation, Birkhauser, Boston, 1984. [Gr] C. W. Groetsch: The Theory of Tikhonov Regularization for Fredholm Equations of the First Kind, Pitman, Boston, 1984. [GR] V. Girault and P. A. Raviart: Finite Elements, Methods for NavierStokes Equations, Springer, Berlin, 1984. [L] A. K. Louis: Inverse und schlechtgestellte Probleme, Teubner, Stuttgart, 1989. [R] W. Rudin: Real and Complex Analysis, McGraw Hill, London, 1970. [ROF] L. Rudin, S. Osher,and E. Fatemi: Nonlinear total variation based noise removal algorithm, Physica D, 60 (1992), 259 { 268. [T] R. Temam: Mathematical Problems in Plasticity, Gauthier-Villars, Kent, 1985. [V] C. Vogel and M. Oman: Iterative methods for total variation denoising, preprint.

25

Regularization of Linear Least Squares Problems by Total ... - CiteSeerX

Regularization of Linear Least Squares Problems by Total ... - CiteSeerX

Suggest Documents

Solving linear least squares problems by Gram-Schmidt ... - CiteSeerX

Weighted total least squares formulated by standard least squares

On the condition number of linear least squares problems ... - CiteSeerX

Truncated Total Least Squares: A New Regularization ... - IEEE Xplore

Least Squares Linear Discriminant Analysis - CiteSeerX

Convex Total Least Squares - arXiv

DUAL REGULARIZED TOTAL LEAST SQUARES

Regularized Constrained Total Least-Squares

Adaptive least squares matching as a non-linear least squares ...

Linear observation based total least squares - Taylor & Francis Online

An Algorithm for Solving Scaled Total Least Squares Problems

TOTAL LEAST SQUARES REGISTRATION of 3D SURFACES

ITERATIVE SOLUTION OF LEAST-SQUARES PROBLEMS APPLIED ...

Numerical investigations of linear least squares ...

Solution of large-scale sparse least squares problems ... - CiteSeerX

Total Least Squares and Chebyshev Norm

Total Least Squares and Chebyshev Norm

STRUCTURED TOTAL LEAST SQUARES FOR HANKEL MATRICES

on the regularization of the recursive least-squares ...

LEAST-SQUARES METHODS FOR LINEAR ... - Semantic Scholar

Mixed linear-nonlinear least squares regression

Flexible least squares for approximately linear

condition numbers for structured least squares problems

Research Article Least Squares Problems with