J. London Math. Soc. (2) 66 (2002) 240–256
© 2002 London Mathematical Society
DOI: 10.1112/S0024610702003332
ITERATIVE ALGORITHMS FOR NONLINEAR OPERATORS

HONG-KUN XU

Abstract

Iterative algorithms for nonexpansive mappings and maximal monotone operators are investigated. Strong convergence theorems are proved for nonexpansive mappings, including an improvement of a result of Lions. A modification of Rockafellar's proximal point algorithm is obtained and proved to be always strongly convergent. The ideas of these algorithms are applied to solve a quadratic minimization problem.
1. Introduction

Let $H$ be a Hilbert space and let $T$ be a maximal monotone operator in $H$. It is well known that many problems in nonlinear analysis and optimization can be formulated as the problem: find an $x$ such that
\[ 0 \in Tx. \tag{1.1} \]
Rockafellar [18] devised the proximal point algorithm which generates, starting with an arbitrary initial $x_0 \in H$, a sequence $(x_n)$ satisfying
\[ x_{n+1} := (I + c_n T)^{-1} x_n + e_n, \quad n \ge 0, \tag{1.2} \]
where $I$ is the identity operator of $H$, $c_n > 0$ is a real number, and $e_n$ is an error vector. Rockafellar proved the weak convergence of algorithm (1.2) if the sequence $(c_n)$ is bounded away from zero and if the sequence of the errors satisfies the condition $\sum_n \|e_n\| < \infty$. Güler [10] constructed an example showing that Rockafellar's proximal point algorithm does not converge strongly, in general. This gives rise to the natural question of how to modify Rockafellar's proximal point algorithm so that strong convergence is guaranteed. One such modification has recently been obtained in [19], where Solodov and Svaiter proposed a modified proximal point algorithm which converges strongly to a solution of (1.1). Their construction, at each iteration, consists of two steps. The first step, with $x_n$ having been constructed from the previous iteration, produces a point $y_n$ following the proximal point algorithm (1.2). In the second step, the $y_n$ obtained in step 1 is used and the parameter $c_n > 0$ and the error $e_n$ are appropriately chosen to construct some closed convex subset $C_n$ (actually $C_n$ is the intersection of a cone and a hyperplane). Then $x_{n+1}$ is defined as the projection of the initial $x_0$ to $C_n$. Compared with the original proximal point algorithm (1.2), Solodov and Svaiter's algorithm, though strongly convergent, does need more computing time due to the projection in the second step. Our concern is this: is it possible to have a modification of the proximal point algorithm which is strongly convergent and which does not much increase computing time? Algorithm (1.4) proposed below may be such a candidate.

Received 10 May 2001; revised 10 October 2001.
2000 Mathematics Subject Classification 47J25, 90C25, 47H09, 47H05.
Supported in part by the National Research Foundation.
Let us now recall that a self-mapping $S$ of a closed convex subset $C$ of $H$ is said to be nonexpansive if
\[ \|Sx - Sy\| \le \|x - y\|, \quad x, y \in C. \]
We also recall that the resolvent and the Yosida approximation of the maximal monotone operator $T$ are defined as
\[ J_r x := (I + rT)^{-1} x, \qquad A_r x := \frac{1}{r}(I - J_r)x \in T(J_r x), \qquad x \in H,\ r > 0. \]
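As a concrete illustration of these definitions, the following minimal Python sketch takes $H = \mathbb{R}$ and $T = \partial|\cdot|$, the subdifferential of the absolute value. This operator is chosen here purely for illustration and is not treated in the paper; for it, the resolvent $J_r$ is the soft-thresholding map. The function names are hypothetical.

```python
def resolvent_abs(x: float, r: float) -> float:
    """J_r x = (I + rT)^{-1} x for T = subdifferential of |.| on the real line.

    Assumption for illustration: this is the soft-thresholding map.
    """
    if x > r:
        return x - r
    if x < -r:
        return x + r
    return 0.0


def yosida_abs(x: float, r: float) -> float:
    """Yosida approximation A_r x = (x - J_r x) / r."""
    return (x - resolvent_abs(x, r)) / r


def in_subdifferential_abs(v: float, y: float, tol: float = 1e-9) -> bool:
    """Check v in T(y): T(y) = {sign(y)} for y != 0 and T(0) = [-1, 1]."""
    if y > 0:
        return abs(v - 1.0) <= tol
    if y < 0:
        return abs(v + 1.0) <= tol
    return -1.0 - tol <= v <= 1.0 + tol


if __name__ == "__main__":
    for x in (-2.5, -0.3, 0.0, 0.4, 2.5):
        for r in (0.1, 0.7, 3.0):
            # The inclusion A_r x in T(J_r x) from the display above.
            assert in_subdifferential_abs(yosida_abs(x, r), resolvent_abs(x, r))
```

In this example the nonexpansiveness of $J_r$ can also be seen directly, since soft-thresholding never increases the distance between two points.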
It is well known that $J_r$ is nonexpansive. There is a strongly convergent algorithm for nonexpansive mappings. Let $S : C \to C$ be a nonexpansive mapping with a fixed point. Halpern [11] proposed the following algorithm. Choose an initial $x_0 \in C$ arbitrarily and define $(x_n)$ by
\[ x_{n+1} := \alpha_n x_0 + (1 - \alpha_n) S x_n, \quad n \ge 0, \tag{1.3} \]
where $(\alpha_n)$ is a sequence of parameters in $[0, 1]$. It is proved [11, 14, 23] that, under a suitable choice of parameters $(\alpha_n)$, the sequence $(x_n)$ generated by algorithm (1.3) strongly converges to a fixed point of $S$ which is the projection of the initial $x_0$ to the set of fixed points of $S$. The point of algorithm (1.3) is that, although we may not have strong convergence of the sequence $(Sx_n)$, a suitable convex combination of $(Sx_n)$ with the point $x_0$ does converge strongly. This suggests that we consider a similar algorithm for maximal monotone operators. More precisely, we will consider the following algorithm:
\[ x_{n+1} := \alpha_n x_0 + (1 - \alpha_n)(I + c_n T)^{-1}(x_n) + e_n, \quad n \ge 0. \tag{1.4} \]
We shall show that algorithm (1.4) converges strongly provided that the sequences $(\alpha_n)$, $(c_n)$ of real numbers and the sequence $(e_n)$ of errors are chosen appropriately. Once we have calculated $y_n := (I + c_n T)^{-1}(x_n) + e_n$, the calculation of the mean $\alpha_n x_0 + (1 - \alpha_n) y_n$ is much easier than that of the projection of $x_0$ to the closed convex set $C_n$ mentioned previously. In this sense, algorithm (1.4) is computationally simpler than the algorithm of Solodov and Svaiter [19].

This paper is organized as follows. Section 2 gathers some tools which will be employed in the remaining sections. In Section 3 we prove strong convergence theorems for algorithm (1.3) for nonexpansive mappings. In particular, we improve a result of P. L. Lions [14]. Extensions to contraction semigroups are given in Section 4. In Section 5 we apply algorithm (1.4) to maximal monotone operators. Finally, in Section 6, the ideas of Section 3 are applied to solve a quadratic optimization problem.
2. Tools

Let $X$ be a Banach space and let $\varphi : [0, \infty) \to [0, \infty)$ be a strictly increasing continuous function such that $\varphi(0) = 0$ and $\varphi(t) \to \infty$ as $t \to \infty$. Such a $\varphi$ is called a gauge. Associated with a gauge $\varphi$ is the duality map $J_\varphi : X \to X^*$ defined by Browder [5] as
\[ J_\varphi(x) := \{x^* \in X^* : \langle x, x^* \rangle = \|x\|\varphi(\|x\|),\ \|x^*\| = \varphi(\|x\|)\}, \quad x \in X. \]
In the case of $\varphi(t) = t$, we write $J$ instead of $J_\varphi$, and call $J$ the normalized duality map. Notice the relation:
\[ J_\varphi(x) = \frac{\varphi(\|x\|)}{\|x\|} J(x), \quad x \ne 0. \]
Recall that a Banach space $X$ is said to be smooth if
\[ \lim_{\lambda \to 0} \frac{\|x + \lambda y\| - \|x\|}{\lambda} \]
exists for each $x, y \in S_X$. (Here $S_X := \{v \in X : \|v\| = 1\}$ is the unit sphere of $X$.) $X$ is uniformly smooth if $X$ is smooth and the limit is attained uniformly for $x, y \in S_X$, and $X$ is Fréchet differentiable if $X$ is smooth and the limit is attained uniformly in $y \in S_X$. It is known that $X$ is smooth if and only if each duality map $J_\varphi$ is single-valued, that $X$ is Fréchet differentiable if and only if each duality map $J_\varphi$ is norm-to-norm continuous in $X$, and that $X$ is uniformly smooth if and only if each duality map $J_\varphi$ is norm-to-norm uniformly continuous on bounded subsets of $X$. Following Browder [5], we say that a Banach space $X$ has a weakly continuous duality map if there exists a gauge $\varphi$ such that $J_\varphi$ is single-valued and weak-to-weak$^*$ sequentially continuous (that is, if $(x_n) \subset X$, $x_n \xrightarrow{w} x$, then $J_\varphi(x_n) \xrightarrow{w^*} J_\varphi(x)$).
It is known that $\ell^p$ ($1 < p < \infty$) has a weakly continuous duality map with gauge $\varphi(t) = t^{p-1}$. (See [8] for more details on duality maps.) Set
\[ \Phi(t) := \int_0^t \varphi(\tau)\,d\tau, \quad t \ge 0. \]
Then $\Phi$ is a convex function and
\[ J_\varphi(x) = \partial\Phi(\|x\|), \quad x \in X, \]
where $\partial$ denotes the subdifferential in the sense of convex analysis. We need the subdifferential inequality of the following form for a smooth Banach space $X$:
\[ \Phi(\|x + y\|) \le \Phi(\|x\|) + \langle y, J_\varphi(x + y) \rangle, \quad x, y \in X; \tag{2.1} \]
in particular,
\[ \|x + y\|^2 \le \|x\|^2 + 2\langle y, J(x + y) \rangle, \quad x, y \in X. \tag{2.2} \]
Recall also that a Banach space $X$ satisfies Opial's property [15] if
\[ \limsup_{n\to\infty} \|x_n - x\| < \limsup_{n\to\infty} \|x_n - y\| \quad \text{whenever } x_n \xrightarrow{w} x,\ y \ne x. \]
Let now $C$ be a nonempty closed convex subset of $X$ and $S : C \to C$ be a nonexpansive mapping. Throughout this paper, we always assume that $S$ has a fixed point, and use $F(S)$ to denote the set of fixed points of $S$; that is, $F(S) := \{x \in C : Sx = x\}$. For a fixed number $t \in (0, 1)$ and a point $u \in C$, by Banach's contraction principle, there exists a unique $z_t \in C$ satisfying the equation
\[ z_t = tu + (1 - t)Sz_t. \tag{2.3} \]
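To make the implicit definition (2.3) concrete, here is a minimal Python sketch that computes $z_t$ by the Banach fixed-point iteration $z \leftarrow tu + (1-t)Sz$, whose contraction constant is $1 - t$. The setting is an assumption made for illustration: $X = \mathbb{R}^2$ and $S$ is the orthogonal projection onto the first coordinate axis, so that $F(S)$ is that axis and (2.3) has the closed-form solution $z_t = (u_1, t u_2)$. The function names are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def implicit_point(S: Callable[[Vector], Vector], u: Vector, t: float,
                   tol: float = 1e-13, max_iter: int = 1_000_000) -> Vector:
    """Approximate the unique solution z_t of z = t*u + (1 - t)*S(z), 0 < t < 1."""
    z = list(u)
    for _ in range(max_iter):
        Sz = S(z)
        z_new = [t * ui + (1.0 - t) * si for ui, si in zip(u, Sz)]
        # Stop once successive iterates agree to within the tolerance.
        if max(abs(a - b) for a, b in zip(z_new, z)) <= tol:
            return z_new
        z = z_new
    return z


def project_onto_first_axis(z: Vector) -> Vector:
    """Illustrative nonexpansive map on R^2 whose fixed points are the first axis."""
    return [z[0], 0.0]


if __name__ == "__main__":
    u = [2.0, 3.0]
    for t in (0.5, 0.1, 0.01, 0.001):
        print(t, implicit_point(project_onto_first_axis, u, t))
```

As $t$ decreases, the printed points approach $(2, 0)$, which in this example is the point of $F(S)$ nearest to $u$.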
If $X$ is a Hilbert space, Browder [4] proved that $s\text{-}\lim_{t\to 0} z_t$ exists and is a fixed point of $S$. We need its counterpart in a Banach space, which was proved by Reich [17].

Lemma 2.1 [17]. Let $X$ be a uniformly smooth Banach space, $C$ be a closed convex subset of $X$ and $S$ be a nonexpansive mapping such that $F(S) \ne \emptyset$. Let $\{z_t\}$ be defined as in (2.3). Then the strong $\lim_{t\to 0} z_t$ exists and is a fixed point of $S$.

We need another result of Reich, but first we recall that a Banach space $X$ is uniformly convex if $\delta(\varepsilon) > 0$ for any $\varepsilon > 0$, where $\delta(\cdot)$ is the modulus of convexity of $X$ defined by
\[ \delta(\varepsilon) := \inf\{1 - \tfrac{1}{2}\|x + y\| : \|x\| \le 1,\ \|y\| \le 1,\ \|x - y\| \ge \varepsilon\}. \]

Lemma 2.2 [16]. Let $C$ be a closed convex subset of a uniformly convex Banach space with a Fréchet differentiable norm, and let $(S_n)$ be a sequence of nonexpansive self-mappings of $C$ with a nonempty common fixed point set $F$. If $x_1 \in C$ and $x_{n+1} = S_n x_n$ for $n \ge 1$, then $\lim_{n\to\infty} \langle x_n, J(f_1 - f_2) \rangle$ exists for all $f_1, f_2 \in F$. In particular, $\langle q_1 - q_2, J(f_1 - f_2) \rangle = 0$, where $f_1, f_2 \in F$ and $q_1, q_2$ are weak limit points of $(x_n)$.

We will often use Browder's demiclosedness principle for nonexpansive mappings.

Lemma 2.3 (demiclosedness principle [3]). Let $X$ be a uniformly convex Banach space, $C$ be a closed convex subset of $X$, and $S : C \to C$ be a nonexpansive mapping such that $F(S) \ne \emptyset$. Then $I - S$ is demiclosed; that is,
\[ (x_n) \subset C,\ x_n \xrightarrow{w} x,\ (I - S)x_n \xrightarrow{s} y \implies (I - S)x = y. \]

We need Bruck's result on the asymptotic behaviour of nonexpansive mappings. Put
\[ S_n(x) := \frac{1}{n}\sum_{k=0}^{n-1} S^k x, \quad n \ge 1,\ x \in C. \]
Lemma 2.4 [6]. If $X$ is a uniformly convex Banach space and if $\tilde{C}$ is a bounded subset of $C$, then
\[ \lim_{n\to\infty} \sup_{x \in \tilde{C}} \|S(S_n(x)) - S_n(x)\| = 0. \]
Throughout the rest of the paper, we shall use the notation: for a given sequence $(x_n)$, $\omega_w(x_n)$ denotes the weak $\omega$-limit set of $(x_n)$; that is,
\[ \omega_w(x_n) := \{x \in X : x_{n_j} \xrightarrow{w} x \text{ for some subsequence } (x_{n_j}) \text{ of } (x_n)\}. \]
We conclude this section with the following lemma.

Lemma 2.5. Let $(s_n)$ be a sequence of non-negative real numbers satisfying
\[ s_{n+1} \le (1 - \alpha_n)s_n + \alpha_n\beta_n + \gamma_n, \quad n \ge 0, \tag{2.4} \]
where $(\alpha_n)$, $(\beta_n)$, and $(\gamma_n)$ satisfy the conditions:
(i) $(\alpha_n) \subset [0, 1]$, $\sum_n \alpha_n = \infty$, or equivalently, $\prod_{n=1}^{\infty}(1 - \alpha_n) = 0$;
(ii) $\limsup_{n\to\infty} \beta_n \le 0$;
(iii) $\gamma_n \ge 0$ $(n \ge 0)$, $\sum_n \gamma_n < \infty$.
Then $\lim_n s_n = 0$.
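Proof. For any $\varepsilon > 0$, let $N$ be an integer big enough so that
\[ \beta_n < \varepsilon \quad (n \ge N), \qquad \sum_{n=N}^{\infty} \gamma_n < \varepsilon. \]
Using (2.4) and by induction, we obtain, for $n \ge N$,
\[ s_{n+1} \le \Big(\prod_{k=N}^{n}(1 - \alpha_k)\Big)s_N + \Big(1 - \prod_{k=N}^{n}(1 - \alpha_k)\Big)\varepsilon + \sum_{k=N}^{n}\gamma_k. \]
Since the product tends to zero by condition (i), and $N$ was chosen using conditions (ii)–(iii), this implies that $\limsup_n s_n \le 2\varepsilon$. This completes the proof. □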
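Lemma 2.5 can be checked numerically on a sample instance. The sketch below iterates (2.4) with equality, which is the worst case allowed by the inequality, for the illustrative choices $\alpha_n = 1/(n+2)$, $\beta_n = 1/\sqrt{n+1}$ and $\gamma_n = 1/(n+1)^2$. These sequences satisfy (i)–(iii) but are assumptions made for this example only, as is the function name `run_recursion`.

```python
import math


def alpha(n: int) -> float:
    """alpha_n in [0, 1] with divergent sum (condition (i))."""
    return 1.0 / (n + 2)


def beta(n: int) -> float:
    """beta_n -> 0, so limsup beta_n <= 0 (condition (ii))."""
    return 1.0 / math.sqrt(n + 1)


def gamma(n: int) -> float:
    """gamma_n >= 0 with finite sum (condition (iii))."""
    return 1.0 / (n + 1) ** 2


def run_recursion(s0: float, steps: int) -> float:
    """Iterate s_{n+1} = (1 - alpha_n) s_n + alpha_n beta_n + gamma_n from s_0."""
    s = s0
    for n in range(steps):
        s = (1.0 - alpha(n)) * s + alpha(n) * beta(n) + gamma(n)
    return s


if __name__ == "__main__":
    for steps in (10, 1_000, 100_000):
        print(steps, run_recursion(5.0, steps))
```

The printed values decrease towards zero, though slowly; the lemma gives no rate, and the decay here is governed by how fast $\beta_n$ tends to zero.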
3. Convergence analysis for nonexpansive mappings

Let $X$ be a Banach space, $C$ be a closed convex subset of $X$, and $S : C \to C$ be a nonexpansive mapping. Suppose that $F(S) \ne \emptyset$. For a sequence $(\alpha_n)$ of real numbers in $[0, 1]$ and an arbitrary point $u \in C$, we can, starting with another arbitrary initial $x_0 \in C$, define a sequence $(x_n)$ in $C$ recursively by the formula:
\[ x_{n+1} := \alpha_n u + (1 - \alpha_n)Sx_n, \quad n \ge 0. \tag{3.1} \]
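The scheme (3.1) can be sketched in a few lines of Python. The example below is an illustration under stated assumptions rather than part of the paper: $X = \mathbb{R}^2$, $S$ is the rotation by a quarter turn (an isometry whose only fixed point is the origin), and $\alpha_n = 1/(n+2)$. The function names are hypothetical. For this $S$ the plain iterates $S^n x_0$ stay on a circle and do not converge, while the iterates of (3.1) approach the fixed point.

```python
import math
from typing import Callable, List

Vector = List[float]


def rotate_quarter_turn(x: Vector) -> Vector:
    """Rotation by 90 degrees in R^2: nonexpansive, with F(S) = {0}."""
    return [-x[1], x[0]]


def halpern(S: Callable[[Vector], Vector], u: Vector, x0: Vector,
            alpha: Callable[[int], float], steps: int) -> Vector:
    """Run x_{n+1} = alpha_n * u + (1 - alpha_n) * S(x_n) for the given number of steps."""
    x = list(x0)
    for n in range(steps):
        a = alpha(n)
        Sx = S(x)
        x = [a * ui + (1.0 - a) * si for ui, si in zip(u, Sx)]
    return x


def norm(x: Vector) -> float:
    return math.sqrt(sum(v * v for v in x))


if __name__ == "__main__":
    u, x0 = [1.0, 0.0], [0.0, 2.0]
    for steps in (10, 100, 10_000):
        x = halpern(rotate_quarter_turn, u, x0, lambda n: 1.0 / (n + 2), steps)
        print(steps, norm(x))
```

The choice $\alpha_n = 1/(n+2)$ tends to zero and has a divergent sum, and $(\alpha_n - \alpha_{n-1})/\alpha_n = -1/(n+1) \to 0$, whereas the same difference divided by $\alpha_n^2$ tends to $-1$.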
Halpern [11] was the first to study the convergence of the scheme (3.1) in the framework of Hilbert spaces. Lions [14] improved his result and proved the strong convergence of $(x_n)$ (still in a Hilbert space) if the sequence $(\alpha_n)$ of parameters satisfies the following conditions:
(1) $\lim_{n\to\infty} \alpha_n = 0$;
(2) $\sum_{n=0}^{\infty} \alpha_n = \infty$, or equivalently, $\prod_{n=0}^{\infty}(1 - \alpha_n) = 0$;
(3) $\lim_{n\to\infty} (\alpha_n - \alpha_{n-1})/\alpha_n^2 = 0$.
Our first result of this section improves Lions' result in two ways. First we weaken condition (3) by removing the square from the denominator so that our choice of $(\alpha_n)$ includes the natural one of $(1/n)$. Secondly, we prove the strong convergence of scheme (3.1) in the framework of Banach spaces.

Theorem 3.1. Let $X$ be a uniformly smooth Banach space, $C$ be a closed convex subset of $X$, and $S : C \to C$ be a nonexpansive mapping with a fixed point. Let $u, x_0 \in C$ be given. Assume that $(\alpha_n) \subset [0, 1]$ satisfies the control conditions (1), (2) and
(3)$'$ $\lim_{n\to\infty} (\alpha_n - \alpha_{n-1})/\alpha_n = 0$.
Then the sequence $(x_n)$ generated by algorithm (3.1) converges strongly to a fixed point of $S$.

Proof. We divide the proof into several steps.

(1) $(x_n)$ is bounded, so is $(Sx_n)$. Indeed for $p \in F(S)$, we have
\[ \|x_{n+1} - p\| = \|\alpha_n(u - p) + (1 - \alpha_n)(Sx_n - p)\| \le \alpha_n\|u - p\| + (1 - \alpha_n)\|x_n - p\|. \]
Then by induction we get
\[ \|x_n - p\| \le \max\{\|u - p\|, \|x_0 - p\|\}, \quad n \ge 0. \]

(2) $\|x_n - Sx_n\| \to 0$.
It follows from algorithm (3.1) that
\[ \|x_{n+1} - Sx_n\| = \alpha_n\|u - Sx_n\| \to 0, \tag{3.2} \]
and
\begin{align*}
\|x_{n+1} - x_n\| &= \|(\alpha_n - \alpha_{n-1})(u - Sx_{n-1}) + (1 - \alpha_n)(Sx_n - Sx_{n-1})\| \\
&\le (1 - \alpha_n)\|x_n - x_{n-1}\| + M|\alpha_n - \alpha_{n-1}| \\
&= (1 - \alpha_n)\|x_n - x_{n-1}\| + \alpha_n\beta_n,
\end{align*}
where $M := \sup_{n \ge 1}\|u - Sx_{n-1}\| < \infty$ and $\beta_n := M|\alpha_n - \alpha_{n-1}|/\alpha_n \to 0$ by condition (3)$'$. Hence by Lemma 2.5 we conclude that $\lim_{n\to\infty}\|x_{n+1} - x_n\| = 0$. This together with (3.2) implies that $\|x_n - Sx_n\| \to 0$.

(3) $\limsup_n \langle u - z, J(x_n - z) \rangle \le 0$, where $z = s\text{-}\lim_{t\to 0} z_t$. By Equation (2.3) we can write
\[ z_t - x_n = t(u - x_n) + (1 - t)(Sz_t - x_n). \]
Apply the subdifferential inequality (2.2) to get
\begin{align*}
\|z_t - x_n\|^2 &\le (1 - t)^2\|Sz_t - x_n\|^2 + 2t\langle u - x_n, J(z_t - x_n) \rangle \\
&\le (1 - t)^2(\|Sz_t - Sx_n\| + \|Sx_n - x_n\|)^2 + 2t(\|z_t - x_n\|^2 + \langle u - z_t, J(z_t - x_n) \rangle) \\
&\le (1 + t^2)\|z_t - x_n\|^2 + \|Sx_n - x_n\|(2\|z_t - x_n\| + \|Sx_n - x_n\|) + 2t\langle u - z_t, J(z_t - x_n) \rangle.
\end{align*}
Hence
\[ \langle u - z_t, J(x_n - z_t) \rangle \le \frac{t}{2}\|z_t - x_n\|^2 + \frac{\|Sx_n - x_n\|}{2t}(2\|z_t - x_n\| + \|Sx_n - x_n\|). \]
Taking lim sup as $n \to \infty$ yields
\[ \limsup_{n\to\infty} \langle u - z_t, J(x_n - z_t) \rangle \le \frac{t}{2}\limsup_{n\to\infty}\|z_t - x_n\|^2. \]
Letting $t \to 0$, noting the fact that $z_t \to z$ in norm and the fact that the duality map $J$ is norm-to-norm uniformly continuous on bounded sets of $X$, we get
\[ \limsup_{n\to\infty} \langle u - z, J(x_n - z) \rangle \le 0. \tag{3.3} \]
(4) $x_n \xrightarrow{s} z$. From (3.1) we can write
\[ x_{n+1} - z = \alpha_n(u - z) + (1 - \alpha_n)(Sx_n - z). \]
Apply the subdifferential inequality (2.2) to get
\[ \|x_{n+1} - z\|^2 \le (1 - \alpha_n)^2\|Sx_n - z\|^2 + 2\alpha_n\langle u - z, J(x_{n+1} - z) \rangle. \]
Thus
\[ \|x_{n+1} - z\|^2 \le (1 - \alpha_n)\|x_n - z\|^2 + \alpha_n\beta_n, \tag{3.4} \]
where $\beta_n := 2\langle u - z, J(x_{n+1} - z) \rangle$ satisfies $\limsup_{n\to\infty}\beta_n \le 0$ by (3.3). Apply Lemma 2.5 to (3.4) to see that $\lim_{n\to\infty}\|x_n - z\| = 0$; that is, $x_n \xrightarrow{s} z$. □
Remark 3.1. Wittmann [23] also proved the strong convergence of the sequence $(x_n)$. His conditions on the parameters $(\alpha_n)$ are (1), (2), and
(3)$^*$ $\sum_n |\alpha_{n+1} - \alpha_n| < \infty$.
We note that (3) and (3)$^*$ are not comparable and that (3)$^*$ and (3)$'$ are not comparable either. For instance, the sequence $(\alpha_n)$ defined by
\[ \alpha_n = \begin{cases} \dfrac{1}{\sqrt{n}} & \text{if $n$ is odd,} \\[2mm] \dfrac{1}{\sqrt{n} - 1} & \text{if $n$ is even} \end{cases} \]
satisfies (3)$'$, but fails to satisfy (3)$^*$.

Remark 3.2. Halpern [11] observed that conditions (1) and (2) are necessary for the algorithm to strongly converge for all nonexpansive mappings $S : C \to C$. It is unclear if they are also sufficient. However, if we replace $Sx_n$ in scheme (3.1) with the mean $S_n x_n$, we do have strong convergence under conditions (1) and (2). Recall that $S_n(x) = (1/n)\sum_{k=0}^{n-1} S^k x$.

Theorem 3.2. Assume that $X$ is uniformly convex and uniformly smooth. For given $u, x_0 \in C$, let $(x_n)$ be generated by the algorithm:
\[ x_{n+1} := \alpha_n u + (1 - \alpha_n)S_n x_n, \quad n \ge 0. \tag{3.5} \]
Assume that
(i) $\lim_n \alpha_n = 0$;
(ii) $\sum_n \alpha_n = \infty$.
Then $(x_n)$ strongly converges to a point in $F(S)$.

Proof.
For $p \in F(S)$, by (3.5) we have
\[ \|x_{n+1} - p\| \le \alpha_n\|u - p\| + (1 - \alpha_n)\|S_n x_n - p\| \le \alpha_n\|u - p\| + (1 - \alpha_n)\|x_n - p\|, \]
so by induction we get
\[ \|x_n - p\| \le \max\{\|u - p\|, \|x_0 - p\|\}, \quad n \ge 0. \]
Thus $(x_n)$ is bounded, so is $(y_n)$, where $y_n := S_n x_n$. Now by Lemma 2.4 we have
\[ \lim_{n\to\infty}\|Sy_n - y_n\| = 0. \]
It follows from (2.3) and (2.2) that
\begin{align*}
\|z_t - y_n\|^2 &= \|t(u - y_n) + (1 - t)(Sz_t - y_n)\|^2 \\
&\le (1 - t)^2\|Sz_t - y_n\|^2 + 2t\langle u - y_n, J(z_t - y_n) \rangle \\
&\le (1 - t)^2(\|Sz_t - Sy_n\| + \|Sy_n - y_n\|)^2 + 2t(\|z_t - y_n\|^2 + \langle u - z_t, J(z_t - y_n) \rangle) \\
&\le (1 + t^2)\|z_t - y_n\|^2 + \|Sy_n - y_n\|(2\|z_t - y_n\| + \|Sy_n - y_n\|) + 2t\langle u - z_t, J(z_t - y_n) \rangle. \tag{3.6}
\end{align*}
Hence
\[ \langle u - z_t, J(y_n - z_t) \rangle \le \frac{t}{2}\|z_t - y_n\|^2 + \frac{\|Sy_n - y_n\|}{2t}(2\|z_t - y_n\| + \|Sy_n - y_n\|). \]
Taking lim sup as $n \to \infty$ and by (3.6), we get
\[ \limsup_{n\to\infty} \langle u - z_t, J(y_n - z_t) \rangle \le \frac{t}{2}\limsup_{n\to\infty}\|z_t - y_n\|^2. \]
Letting $t \to 0$, noting the fact that $z_t \to z \in F(S)$ in norm and the fact that the duality map $J$ is norm-to-norm uniformly continuous on bounded sets of $X$, we get
\[ \limsup_{n\to\infty} \langle u - z, J(y_n - z) \rangle \le 0. \tag{3.7} \]
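Since $(d/dt)\|x + ty\|^2 = 2\langle y, J(x + ty) \rangle$, we have
\[ \|x + y\|^2 = \|x\|^2 + 2\int_0^1 \langle y, J(x + ty) \rangle\,dt, \quad x, y \in X. \]
It follows that
\begin{align*}
\|x_{n+1} - z\|^2 &= \|(1 - \alpha_n)(y_n - z) + \alpha_n(u - z)\|^2 \\
&= (1 - \alpha_n)^2\|y_n - z\|^2 + 2\alpha_n\int_0^1 \langle u - z, J((1 - \alpha_n)(y_n - z) + t\alpha_n(u - z)) \rangle\,dt \\
&\le (1 - \alpha_n)\|x_n - z\|^2 + \alpha_n\beta_n, \tag{3.8}
\end{align*}
where
\[ \beta_n := 2\int_0^1 \langle u - z, J((1 - \alpha_n)(y_n - z) + t\alpha_n(u - z)) \rangle\,dt. \]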
Since
\[ (y_n - z) - [(1 - \alpha_n)(y_n - z) + t\alpha_n(u - z)] = \alpha_n[(y_n - z) - t(u - z)] \xrightarrow{s} 0 \quad (n \to \infty) \]
uniformly in $t \in [0, 1]$ and since $J$ is uniformly continuous on bounded sets of $X$, we infer by (3.7) that
\[ \limsup_{n\to\infty}\beta_n = 2\limsup_{n\to\infty} \langle u - z, J(y_n - z) \rangle \le 0. \]
Now apply Lemma 2.5 to (3.8) to get $\|x_n - z\| \to 0$. □
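The averaged scheme (3.5) analysed above can be sketched as follows. The setting is an illustrative assumption, not taken from the paper: $X = \mathbb{R}^3$ and $S$ rotates the first two coordinates by a quarter turn while fixing the third, so that $F(S)$ is the third coordinate axis. Since $S_n$ is only defined for $n \ge 1$, the sketch uses the mean $S_{n+1}$ at step $n$; this index shift, the choice $\alpha_n = 1/(n+2)$, and the function names are also assumptions.

```python
from typing import Callable, List

Vector = List[float]


def rotate_xy(x: Vector) -> Vector:
    """Quarter-turn rotation of the first two coordinates; the third axis is fixed."""
    return [-x[1], x[0], x[2]]


def cesaro_mean(S: Callable[[Vector], Vector], x: Vector, m: int) -> Vector:
    """S_m(x) = (1/m) * sum_{k=0}^{m-1} S^k x, for m >= 1."""
    total = [0.0] * len(x)
    power = list(x)  # holds S^k x, starting with k = 0
    for _ in range(m):
        total = [t + p for t, p in zip(total, power)]
        power = S(power)
    return [t / m for t in total]


def mean_iteration(S: Callable[[Vector], Vector], u: Vector, x0: Vector,
                   alpha: Callable[[int], float], steps: int) -> Vector:
    """Run x_{n+1} = alpha_n * u + (1 - alpha_n) * S_{n+1}(x_n)."""
    x = list(x0)
    for n in range(steps):
        a = alpha(n)
        y = cesaro_mean(S, x, n + 1)
        x = [a * ui + (1.0 - a) * yi for ui, yi in zip(u, y)]
    return x


if __name__ == "__main__":
    u, x0 = [1.0, 0.0, 3.0], [0.0, 2.0, -1.0]
    print(mean_iteration(rotate_xy, u, x0, lambda n: 1.0 / (n + 2), 600))
```

In this example the iterates approach $(0, 0, 3)$, the point of $F(S)$ nearest to $u$. Each step costs $n + 1$ evaluations of $S$, so the total work grows quadratically in the number of steps.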
If we consider the average of $x_n$ and $S_n x_n$ as $x_{n+1}$, we can only have weak convergence.

Theorem 3.3. Assume that $X$ is a uniformly convex Banach space which either has a Fréchet differentiable norm or satisfies Opial's property. With an initial $x_0 \in C$, we define $(x_n)$ by the algorithm:
\[ x_{n+1} := \alpha_n x_n + (1 - \alpha_n)S_n x_n, \quad n \ge 0. \tag{3.9} \]
Assume that
(i) $\lim_n \alpha_n = 0$;
(ii) $\sum_n \alpha_n = \infty$.
Then $(x_n)$ weakly converges to a fixed point of $S$.
Proof. Write $y_n := S_n x_n$.

(1) For $p \in F(S)$, $\{\|x_n - p\|\}$ is decreasing; hence $(x_n)$ and $(y_n)$ are bounded. This follows from the following inequalities (since $S_n$ is nonexpansive):
\[ \|x_{n+1} - p\| \le \alpha_n\|x_n - p\| + (1 - \alpha_n)\|S_n x_n - p\| \le \|x_n - p\|. \]

(2) $\omega_w(x_n) \subset F(S)$. This is a consequence of the demiclosedness principle (Lemma 2.3) and the fact that $\|Sx_n - x_n\| \to 0$, which is implied by the fact that $\|Sy_n - y_n\| \to 0$ (Lemma 2.4) and the fact that $\|x_{n+1} - y_n\| = \alpha_n\|x_n - y_n\| \to 0$.

(3) $\omega_w(x_n)$ is a singleton. We distinguish two cases.

Case (1): $X$ has a Fréchet differentiable norm. Then by Lemma 2.2 and step 2 above we see that
\[ q_1, q_2 \in \omega_w(x_n) \implies \langle q_1 - q_2, J(q_1 - q_2) \rangle = 0 \implies q_1 = q_2. \]

Case (2): $X$ satisfies Opial's condition. In this case we let $q_1, q_2 \in \omega_w(x_n)$ and take subsequences $(x_{n_i})$, $(x_{m_j})$ such that
\[ x_{n_i} \xrightarrow{w} q_1, \qquad x_{m_j} \xrightarrow{w} q_2. \]
If $q_1 \ne q_2$, then we get the contradiction by Opial's property:
\begin{align*}
\lim_{n\to\infty}\|x_n - q_1\| &= \lim_{i\to\infty}\|x_{n_i} - q_1\| \\
&< \lim_{i\to\infty}\|x_{n_i} - q_2\| = \lim_{j\to\infty}\|x_{m_j} - q_2\| \\
&< \lim_{j\to\infty}\|x_{m_j} - q_1\| \\
&= \lim_{n\to\infty}\|x_n - q_1\|.
\end{align*}
Thus in both cases we have verified that $\omega_w(x_n)$ contains at most one point. □
Remark 3.3. It is unclear how to locate the weak limit of $(x_n)$ obtained in Theorem 3.3. However we have the following result.

Theorem 3.4. Let $X$ be a uniformly convex Banach space and let $P$ be the (nearest point) projection of $X$ onto $F(S)$. Let $(x_n)$ be the sequence of iterations generated by algorithm (3.9). Then $s\text{-}\lim_{n\to\infty} Px_n$ exists. If, in addition, $X$ satisfies Opial's property, then $s\text{-}\lim_{n\to\infty} Px_n = w\text{-}\lim_{n\to\infty} x_n$.

Proof. (1) $\{\|Px_n - x_n\|\}$ is decreasing. Indeed we have
\[ \|x_{n+1} - Px_{n+1}\| \le \|x_{n+1} - Px_n\| \le \alpha_n\|x_n - Px_n\| + (1 - \alpha_n)\|S_n x_n - Px_n\| \le \|x_n - Px_n\|. \]

(2) Let $\tau := \lim_{n\to\infty}\|Px_n - x_n\|$. If $\tau = 0$, then noting that
\[ \|Px_n - Px_m\| \le \|Px_n - x_{n+m+1}\| + \|Px_m - x_{n+m+1}\| \le \|Px_n - x_n\| + \|Px_m - x_m\| \to 0 \quad (n, m \to \infty), \]
we immediately see that $(Px_n)$ is Cauchy and hence convergent. Assume $\tau > 0$. Put
\[ \sigma := \limsup_{n, m\to\infty}\|Px_n - Px_m\|. \]
If $\sigma > 0$, then we can take $\varepsilon > 0$ small enough so that
\[ (\tau + \varepsilon)\Big[1 - \delta\Big(\frac{\sigma}{\tau + \varepsilon}\Big)\Big] < \tau, \]
where $\delta$ is the modulus of convexity of $X$. Let $N \ge 1$ be an integer such that $\|Px_n - x_n\| < \tau + \varepsilon$ for all $n \ge N$. Then since $\|x_{n+m+1} - Px_n\| \le \|x_n - Px_n\| < \tau + \varepsilon$ and $\|x_{n+m+1} - Px_m\| \le \|x_m - Px_m\| < \tau + \varepsilon$ for $n, m \ge N$, it follows that for $n, m \ge N$,
\[ \tau \le \|x_{n+m+1} - Px_{n+m+1}\| \le \|x_{n+m+1} - \tfrac{1}{2}(Px_n + Px_m)\| \le (\tau + \varepsilon)\Big[1 - \delta\Big(\frac{\|Px_n - Px_m\|}{\tau + \varepsilon}\Big)\Big]. \]
Letting $n, m \to \infty$ in the last inequality, we arrive at the contradiction:
\[ \tau \le (\tau + \varepsilon)\Big[1 - \delta\Big(\frac{\sigma}{\tau + \varepsilon}\Big)\Big] < \tau. \]
We therefore must have $\sigma = 0$ and $(Px_n)$ is strongly convergent. Let $v$ be the limit of $(Px_n)$. It is easily seen that
\[ \lim\|x_n - v\| = \min\{\lim\|x_n - z\| : z \in F(S)\}. \]

(3) If, in addition, $X$ satisfies Opial's property, upon letting $q \in F(S)$ denote the weak limit of $(x_n)$, we see that $\lim_n\|x_n - v\| = \lim_n\|x_n - Px_n\| \le \lim_n\|x_n - q\|$, so Opial's property must imply that $v = q$. □

4. Extension to contraction semigroups

Let $X$ be a Banach space and $C$ be a nonempty closed convex subset of $X$. Let $\mathcal{S} = \{S(t) : t \ge 0\}$ be a family of self-mappings of $C$. Recall that $\mathcal{S}$ is said to be a contraction semigroup on $C$ if the following conditions hold:
(a) $S(0)x = x$, $x \in C$;
(b) $S(t_1 + t_2)x = S(t_1)S(t_2)x$, $t_1, t_2 \ge 0$, $x \in C$;
(c) for each $x \in C$, the function $S(t)x$ is continuous in $t \in [0, \infty)$;
(d) for each $t \ge 0$, $S(t) : C \to C$ is a nonexpansive mapping.
We shall use $F(\mathcal{S})$ to denote the set of common fixed points of $\mathcal{S}$; that is,
\[ F(\mathcal{S}) = \bigcap_{t \ge 0} F(S(t)). \]
Lemma 4.1 (cf. [20]). Let $X$ be a uniformly convex Banach space, $C$ be a closed convex subset of $X$, and $\mathcal{S}$ be a contraction semigroup on $C$. Assume that $F(\mathcal{S}) \ne \emptyset$. Then there is a family $(r_t : t > 0)$ of non-negative numbers such that
\[ \lim_{t\to\infty}\sup_{x \in C}\|S(r)\sigma_t(x) - \sigma_t(x)\| = 0, \quad r > 0, \]
where
\[ \sigma_t(x) := \frac{1}{t}\int_0^t S(r_t + \tau)x\,d\tau. \]
We now define the algorithm: take $u, x_0 \in C$ arbitrarily and define
\[ x_{n+1} := \alpha_n u + (1 - \alpha_n)\sigma_{t_n}(x_n), \quad n \ge 0. \tag{4.1} \]
Employing similar arguments to the discrete cases (Theorems 3.3 and 3.2) we can prove the following two results.
Theorem 4.1. Let $X$ be a uniformly convex Banach space either with a Fréchet differentiable norm or satisfying Opial's property, $C$ be a nonempty closed convex subset of $X$, and $\mathcal{S} = \{S(t) : t \ge 0\}$ be a contraction semigroup on $C$ such that $F(\mathcal{S}) \ne \emptyset$. Starting with an arbitrary initial $x_0 \in C$, we define $(x_n)$ in $C$ by the relaxed algorithm
\[ x_{n+1} := \alpha_n x_n + (1 - \alpha_n)\sigma_{t_n}(x_n), \quad n \ge 0, \tag{4.2} \]
where $(\alpha_n)$ and $(t_n)$ satisfy the conditions
(i) $\lim_n \alpha_n = 0$;
(ii) $\sum_n \alpha_n = \infty$;
(iii) $\lim_n t_n = \infty$.
Then $(x_n)$ converges weakly to a point in $F(\mathcal{S})$.

Theorem 4.2. Let $X$ be a uniformly convex Banach space, let $P$ be the projection of $X$ onto $F(\mathcal{S})$ and let $(x_n)$ be the sequence of iterations generated by algorithm (4.2). Then $s\text{-}\lim_{n\to\infty} Px_n$ exists. If, in addition, $X$ satisfies Opial's property, then $s\text{-}\lim_{n\to\infty} Px_n = w\text{-}\lim_{n\to\infty} x_n$.

5. Modified proximal point algorithms

Let $H$ be a Hilbert space and $T$ be a maximal monotone operator. Assume that the equation $0 \in Tx$ has a solution and let $S$ be the solution set:
\[ S = \{x \in H : 0 \in Tx\} = T^{-1}(0). \]
Then $S$ is a closed convex nonempty subset of $H$ and thus the projection $P_S$ from $H$ onto $S$ is well-defined. One of the fundamental problems in the theory of maximal monotone operators is to find a zero of $T$, that is, a point in $S$. Rockafellar's proximal point algorithm [18] provides us with a powerful numerical tool to find a point in $S$. Much effort has gone into the study of the proximal point algorithm and its modifications; see [1, 7, 10, 13, 19, 22] and the references therein. Starting with an arbitrary initial $x_0 \in H$, the proximal point algorithm generates a sequence $(x_n)$ in the following way:
\[ x_{n+1} = (I + c_n T)^{-1}x_n + e_n, \quad n \ge 0, \tag{5.1} \]
where $c_n > 0$ is a real number and $e_n$ is an error vector. It is proved in [18] that, under reasonable assumptions on the sequence $(c_n)$ and the errors $(e_n)$, the proximal point algorithm always converges weakly to a point in the solution set $S$. However strong convergence is not guaranteed (see [10]), so attempts have been made to modify the proximal point algorithm so that strong convergence is ensured. One such effort was made by Solodov and Svaiter [19]. Their algorithm generates a sequence $(x_n)$ satisfying
\[ x_{n+1} = P_{H_n \cap W_n}(x_0), \quad n \ge 0, \]
where
(a) $P_{H_n \cap W_n}$ is the projection of $H$ onto $H_n \cap W_n$, where
\[ H_n := \{z \in H : \langle z - y_n, v_n \rangle \le 0\}, \qquad W_n := \{z \in H : \langle z - x_n, x_0 - x_n \rangle \le 0\}; \]
(b) $(y_n, v_n) \in H \times H$ is an inexact solution of the inclusion:
\[ 0 \in T(x) + \mu_n(x - x_n) \tag{5.2} \]
with tolerance $\sigma$; that is,
\[ v_n \in T(y_n), \qquad v_n + \mu_n(y_n - x_n) = e_n, \qquad \|e_n\| \le \sigma\max\{\|v_n\|, \mu_n\|y_n - x_n\|\}. \]
It is proved in [19] that if the sequence of regularization parameters $(\mu_n)$ is bounded from above, then the sequence $(x_n)$ constructed above converges strongly to $P_S(x_0)$.

Compared with the original proximal point algorithm, Solodov and Svaiter's modified proximal point algorithm requires the calculation of a projection at each iteration, which may not always be easy. This gives rise to the natural question of whether there is an appropriately modified proximal point algorithm which guarantees strong convergence and which does not substantially increase the amount of calculation. We attempt to answer this question. Instead of computing an additional projection at each iteration, we take a convex combination of $x_0$ and the proximal step as the next iterate $x_{n+1}$. This adds little to the cost of calculating $x_{n+1}$, while strong convergence is retained. The algorithm is as follows.

Algorithm 5.1.
(1) $x_0 \in H$ is chosen arbitrarily.
(2) Choose a regularization parameter $c_n > 0$ with error $e_n \in H$ and relaxation parameter $\alpha_n \in [0, 1]$ and compute
\[ y_n := (I + c_n T)^{-1}(x_n) + e_n. \tag{5.3} \]
(3) Compute the $(n + 1)$th iterate:
\[ x_{n+1} := \alpha_n x_0 + (1 - \alpha_n)y_n. \tag{5.4} \]
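The two steps (5.3) and (5.4) translate directly into code once the resolvent is available. The following Python sketch is an illustration under stated assumptions: $H = \mathbb{R}^2$ and $T(x_1, x_2) = (x_1, 0)$, the gradient of $f(x) = x_1^2/2$, whose zero set is the second coordinate axis and whose resolvent is $(x_1, x_2) \mapsto (x_1/(1 + c), x_2)$. The parameter choices $\alpha_n = 1/(n+2)$, $c_n = n + 1$ and the errors $e_n$ with both components equal to $1/(n+1)^2$, as well as all function names, are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def resolvent_half_square(x: Vector, c: float) -> Vector:
    """(I + cT)^{-1} x for the illustrative operator T(x1, x2) = (x1, 0)."""
    return [x[0] / (1.0 + c), x[1]]


def alpha(n: int) -> float:
    """Relaxation parameters: alpha_n -> 0 with divergent sum."""
    return 1.0 / (n + 2)


def c(n: int) -> float:
    """Regularization parameters: c_n -> infinity."""
    return n + 1.0


def error(n: int) -> Vector:
    """Error vectors with summable norms."""
    e = 1.0 / (n + 1) ** 2
    return [e, e]


def modified_proximal_point(resolvent: Callable[[Vector, float], Vector],
                            x0: Vector, steps: int) -> Vector:
    """Run steps (5.3)-(5.4) with the parameter sequences defined above."""
    x = list(x0)
    for n in range(steps):
        # (5.3): inexact proximal step y_n = J_{c_n} x_n + e_n.
        y = [ji + ei for ji, ei in zip(resolvent(x, c(n)), error(n))]
        # (5.4): convex combination with the initial point x_0.
        a = alpha(n)
        x = [a * x0i + (1.0 - a) * yi for x0i, yi in zip(x0, y)]
    return x


if __name__ == "__main__":
    print(modified_proximal_point(resolvent_half_square, [4.0, -3.0], 10_000))
```

With $x_0 = (4, -3)$ the iterates approach $(0, -3)$, which is the projection of $x_0$ onto the zero set of this $T$. Compared with (5.1), the only extra work per iteration is the convex combination in (5.4).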
Theorem 5.1. Let $(x_n)$ be generated by Algorithm 5.1. Assume that
(i) $\alpha_n \to 0$;
(ii) $\sum_n \alpha_n = \infty$;
(iii) $c_n \to \infty$;
(iv) $\sum_n \|e_n\| < \infty$.
Then $(x_n)$ converges strongly to $P_S(x_0)$.

Proof. We divide the proof into three steps.

Step 1: $(x_n)$ is bounded. It suffices to show that, for any fixed $p \in S$,
\[ \|x_n - p\| \le \|x_0 - p\| + \sum_{k=0}^{n-1}\|e_k\|, \quad n \ge 0. \tag{5.5} \]
It is trivial that (5.5) holds true for $n = 0$. Assuming the validity of (5.5) for $n = m$, we now show the validity of (5.5) for $n = m + 1$. Observing that $p$ is a fixed point of the resolvent $J_r$, which is nonexpansive, we infer from (5.3) that
\[ \|y_n - p\| \le \|x_n - p\| + \|e_n\|. \]
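Hence by (5.4) and the induction hypothesis we have
\begin{align*}
\|x_{m+1} - p\| &\le \alpha_m\|x_0 - p\| + (1 - \alpha_m)\|y_m - p\| \\
&\le \alpha_m\|x_0 - p\| + (1 - \alpha_m)\|x_m - p\| + \|e_m\| \\
&\le \|x_0 - p\| + \sum_{k=0}^{m}\|e_k\|.
\end{align*}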
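Step 2: $\limsup_n \langle x_0 - q, x_n - q \rangle \le 0$, where $q = P_S(x_0)$. Take a subsequence $(x_{n_j})$ of $(x_n)$ so that
\[ \limsup_n \langle x_0 - q, x_n - q \rangle = \lim_j \langle x_0 - q, x_{n_j} - q \rangle. \tag{5.6} \]
We may also assume that $x_{n_j} \xrightarrow{w} x_\infty$. It thus follows from (5.6) that
\[ \limsup_n \langle x_0 - q, x_n - q \rangle = \langle x_0 - q, x_\infty - q \rangle, \tag{5.7} \]
so it remains to show that $x_\infty \in S$, since the right-hand side of (5.7) is then non-positive by the characterization of the projection $q = P_S(x_0)$. Noting that $y_n = J_{c_n}x_n + e_n$ and
\[ \|x_{n+1} - J_{c_n}x_n\| \le \|x_{n+1} - y_n\| + \|y_n - J_{c_n}x_n\| \le \alpha_n\|x_0 - y_n\| + \|e_n\| \to 0, \]
we have
\[ J_{c_{n_j - 1}}x_{n_j - 1} \xrightarrow{w} x_\infty. \]
Since
\[ A_{c_{n_j - 1}}x_{n_j - 1} = \frac{1}{c_{n_j - 1}}(x_{n_j - 1} - J_{c_{n_j - 1}}x_{n_j - 1}) \xrightarrow{s} 0, \qquad A_{c_{n_j - 1}}x_{n_j - 1} \in T(J_{c_{n_j - 1}}x_{n_j - 1}), \]
taking the limit as $j \to \infty$, we obtain by the maximality of $T$ (hence $T$ is demiclosed) that $0 \in T(x_\infty)$; that is, $x_\infty \in S$.

Step 3: $x_n \xrightarrow{s} q := P_S(x_0)$. Put
\[ \beta_n := 2\langle x_0 - q, x_{n+1} - q \rangle \quad \text{and} \quad \gamma_n := \|e_n\|(\|e_n\| + 2\|x_n - q\|) \to 0. \]
Since $\|y_n - q\|^2 \le (\|x_n - q\| + \|e_n\|)^2 \le \|x_n - q\|^2 + \gamma_n$, applying the subdifferential inequality (2.2) we obtain
\begin{align*}
\|x_{n+1} - q\|^2 &= \|(1 - \alpha_n)(y_n - q) + \alpha_n(x_0 - q)\|^2 \\
&\le (1 - \alpha_n)\|y_n - q\|^2 + 2\alpha_n\langle x_0 - q, x_{n+1} - q \rangle \\
&\le (1 - \alpha_n)\|x_n - q\|^2 + \alpha_n\beta_n + \gamma_n.
\end{align*}
Combining step 2, condition (iv) and Lemma 2.5 we get $x_n \xrightarrow{s} q$. □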
Algorithm 5.2 (relaxed proximal point algorithm).
(1) Select $x_0 \in H$ arbitrarily.
(2) Choose a regularization parameter $c_n > 0$ with error $e_n \in H$ and compute
\[ y_n := (I + c_n T)^{-1}(x_n) + e_n. \]
(3) Select a relaxation parameter $\alpha_n \in [0, 1]$ and compute the $(n + 1)$th iterate:
\[ x_{n+1} := \alpha_n x_n + (1 - \alpha_n)y_n. \]

Theorem 5.2. Let $(x_n)$ be the sequence generated by Algorithm 5.2. Assume that
(i) $(\alpha_n)$ is bounded away from 1, namely $0 \le \alpha_n \le 1 - \delta$ for some $\delta \in (0, 1)$;
(ii) $c_n \to \infty$;
(iii) $\sum_n \|e_n\| < \infty$.
Then $(x_n)$ converges weakly to a point in $S$.

Proof. Again we divide the proof into several steps.

Step 1: For $p \in S$, $\lim_n\|x_n - p\|$ exists; in particular, $(x_n)$ is bounded, so is $(J_{c_n}x_n)$ and hence
\[ A_{c_n}x_n = \frac{1}{c_n}(x_n - J_{c_n}x_n) \xrightarrow{s} 0. \]
Indeed we have
\begin{align*}
\|x_{n+1} - p\| &\le \alpha_n\|x_n - p\| + (1 - \alpha_n)\|y_n - p\| \\
&\le \alpha_n\|x_n - p\| + (1 - \alpha_n)(\|x_n - p\| + \|e_n\|) \\
&\le \|x_n - p\| + \|e_n\|.
\end{align*}
Since $\sum_n \|e_n\| < \infty$, it is easy to see (cf. [21]) that $\lim_n\|x_n - p\|$ exists.
Step 2: $\omega_w(x_n) \subset S$. Let $z \in \omega_w(x_n)$ and take a subsequence $(x_{n_j})$ of $(x_n)$ such that $x_{n_j} \xrightarrow{w} z$. Since
\[ \|x_{n+1} - J_{c_n}x_n\| \le \|x_{n+1} - y_n\| + \|y_n - J_{c_n}x_n\| \le \alpha_n\|x_n - y_n\| + \|e_n\| \to 0, \]
repeating the proof of the second part of step 2 of Theorem 5.1, we find that $z \in S$.

Step 3: $\lim_n\|x_{n+1} - y_n\| = 0$. Fix $p \in S$. We have
\begin{align*}
\|x_{n+1} - p\|^2 &= \|\alpha_n(x_n - p) + (1 - \alpha_n)(y_n - p)\|^2 \\
&= \alpha_n\|x_n - p\|^2 + (1 - \alpha_n)\|y_n - p\|^2 - \alpha_n(1 - \alpha_n)\|x_n - y_n\|^2 \\
&\le \alpha_n\|x_n - p\|^2 + (1 - \alpha_n)(\|x_n - p\| + \|e_n\|)^2 - \alpha_n(1 - \alpha_n)\|x_n - y_n\|^2.
\end{align*}
This implies that
\[ \alpha_n(1 - \alpha_n)\|x_n - y_n\|^2 \le \|x_n - p\|^2 - \|x_{n+1} - p\|^2 + M\|e_n\|, \tag{5.8} \]
where $M > 0$ is a constant such that $M \ge 2\|x_n - p\| + \|e_n\|$, $n \ge 0$. Noting that $1 - \alpha_n \ge \delta > 0$ and $\|x_{n+1} - y_n\|^2 \le \alpha_n\|x_n - y_n\|^2$, (5.8) above implies that $\lim_n\|x_{n+1} - y_n\| = 0$.
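Step 4: $x_n \xrightarrow{w} q$ for some $q \in S$. It suffices to show that $\omega_w(x_n)$ consists of one point. Let $q_1, q_2 \in \omega_w(x_n)$ and let
\[ x_{n_i} \xrightarrow{w} q_1, \qquad x_{m_j} \xrightarrow{w} q_2. \]
We deduce by step 1 that
\begin{align*}
\lim_{n\to\infty}\|x_n - q_2\|^2 &= \lim_{i\to\infty}\|x_{n_i} - q_2\|^2 \\
&= \lim_{i\to\infty}(\|x_{n_i} - q_1\|^2 + 2\langle x_{n_i} - q_1, q_1 - q_2 \rangle + \|q_1 - q_2\|^2) \\
&= \lim_{n\to\infty}\|x_n - q_1\|^2 + \|q_1 - q_2\|^2. \tag{5.9}
\end{align*}
Interchange $q_1$ and $q_2$ to obtain
\[ \lim_{n\to\infty}\|x_n - q_1\|^2 = \lim_{n\to\infty}\|x_n - q_2\|^2 + \|q_1 - q_2\|^2. \tag{5.10} \]
Adding up (5.9) and (5.10) we obtain $q_1 = q_2$. □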
Remark 5.1. It is unclear if the weak limit $q$ of $(x_n)$ in Theorem 5.2 equals $P_S(x_0)$, but we have $q = s\text{-}\lim_n P_S(x_n)$. Indeed, similar to the proof of Theorem 3.4 we see that $s\text{-}\lim_n P_S(x_n)$ exists. This strong limit must coincide with the weak limit of $(x_n)$ obtained in Theorem 5.2 since a Hilbert space satisfies Opial's property.

6. A quadratic minimization problem

Consider the quadratic minimization problem:
\[ \min_{x \in K}\ \frac{\mu}{2}\langle Ax, x \rangle + \frac{1}{2}\|x - u\|^2 - \langle x, b \rangle \tag{6.1} \]
where $K$ is a nonempty closed convex subset of a Hilbert space $H$, $u, b \in H$, $\mu \ge 0$ is a real number, and $A$ is a bounded linear operator on $H$ which is positive (that is, $\langle Ax, x \rangle \ge 0$ for $x \in H$). (We do not assume that $A$ is positive definite.) Such a quadratic minimization problem finds applications in image restoration (see [12]). It is known that the objective function in problem (6.1) is uniformly convex; hence (6.1) has a unique solution in $K$, denoted $x^*$. We now design an iterative algorithm which converges strongly to $x^*$. Starting with an arbitrary $x_0 \in H$, we define a sequence $(x_n)$ in $H$ by
\[ x_{n+1} := \alpha_n\tilde{u} + (I - \alpha_n\tilde{A})P_K x_n, \quad n \ge 0, \tag{6.2} \]
where $\tilde{u} := u + b$, $\tilde{A} := I + \mu A$, and $P_K$ is the projection of $H$ onto $K$. Recall that $P_K$ is nonexpansive and $F(P_K) = K$.

Theorem 6.1. Let $(\alpha_n)$ satisfy the conditions
(i) $\alpha_n \to 0$;
(ii) $\sum_n \alpha_n = \infty$;
(iii) $\lim_n \alpha_n/\alpha_{n+1} = 1$.
Then the sequence $(x_n)$ generated by algorithm (6.2) strongly converges to the unique solution $x^*$ of problem (6.1).

Proof.
(1) By (6.2) we can write
\[ x_{n+1} - x^* = \alpha_n(\tilde{u} - \tilde{A}x^*) + (I - \alpha_n\tilde{A})(P_K x_n - x^*). \tag{6.3} \]
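It is easy to see that, for $0 < \alpha < (1 + \mu\|A\|)^{-1}$, the operator $I - \alpha\tilde{A}$ is positive. Since $\alpha_n \to 0$ as $n \to \infty$, we may assume, with no loss of generality, that $\alpha_n < (1 + \mu\|A\|)^{-1}$ for all $n \ge 1$. Noting that $\langle Ax, x \rangle \ge 0$ for all $x \in H$, we deduce that
\begin{align*}
\|I - \alpha_n\tilde{A}\| &= \sup_{\|x\| = 1}\langle (I - \alpha_n\tilde{A})x, x \rangle \\
&= \sup_{\|x\| = 1}[1 - \alpha_n(1 + \mu\langle Ax, x \rangle)] \\
&\le 1 - \alpha_n.
\end{align*}
Then it follows from (6.3) that
\[ \|x_{n+1} - x^*\| \le (1 - \alpha_n)\|x_n - x^*\| + \alpha_n\|\tilde{u} - \tilde{A}x^*\|, \quad n \ge 0, \]
so by induction we get
\[ \|x_n - x^*\| \le \max\{\|x_0 - x^*\|, \|\tilde{u} - \tilde{A}x^*\|\}, \quad n \ge 0. \]
In particular, $(x_n)$ is bounded.

(2) It follows from algorithm (6.2) that
\[ \|x_{n+1} - P_K x_n\| = \alpha_n\|\tilde{u} - \tilde{A}P_K x_n\| \to 0 \tag{6.4} \]
and
\begin{align*}
\|x_{n+1} - x_n\| &= \|(\alpha_n - \alpha_{n-1})(\tilde{u} - \tilde{A}P_K x_{n-1}) + (I - \alpha_n\tilde{A})(P_K x_n - P_K x_{n-1})\| \\
&\le M|\alpha_n - \alpha_{n-1}| + (1 - \alpha_n)\|x_n - x_{n-1}\| \\
&= (1 - \alpha_n)\|x_n - x_{n-1}\| + \alpha_n\beta_n,
\end{align*}
where $M := \sup\{\|\tilde{u} - \tilde{A}P_K x_n\| : n \ge 0\} < \infty$ and $\beta_n := M|\alpha_n - \alpha_{n-1}|/\alpha_n \to 0$ by condition (iii). Thus by Lemma 2.5 we have $\|x_{n+1} - x_n\| \to 0$. This, together with (6.4), implies that
\[ \|x_n - P_K x_n\| \to 0. \tag{6.5} \]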
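(3) By (6.5) and the demiclosedness principle (Lemma 2.3), we get $\omega_w(x_n) \subset F(P_K) = K$. We also note that since $x^*$ solves the minimization problem (6.1), there holds the inequality:
\[ \langle \tilde{u} - \tilde{A}x^*, x - x^* \rangle \le 0, \quad x \in K. \tag{6.6} \]
Now pick a subsequence $(x_{n_j})$ of $(x_n)$ such that
\[ \limsup_{n\to\infty}\langle \tilde{u} - \tilde{A}x^*, x_n - x^* \rangle = \lim_{j\to\infty}\langle \tilde{u} - \tilde{A}x^*, x_{n_j} - x^* \rangle. \tag{6.7} \]
Passing to a further subsequence if necessary, we may assume that $x_{n_j} \xrightarrow{w} \tilde{x}$ for some $\tilde{x}$. We have $\tilde{x} \in K$. It follows from (6.7) and (6.6) that
\[ \limsup_{n\to\infty}\langle \tilde{u} - \tilde{A}x^*, x_n - x^* \rangle = \langle \tilde{u} - \tilde{A}x^*, \tilde{x} - x^* \rangle \le 0. \tag{6.8} \]

(4) Applying the subdifferential inequality (2.2) we have
\begin{align*}
\|x_{n+1} - x^*\|^2 &= \|(I - \alpha_n\tilde{A})(P_K x_n - x^*) + \alpha_n(\tilde{u} - \tilde{A}x^*)\|^2 \\
&\le (1 - \alpha_n)\|x_n - x^*\|^2 + 2\alpha_n\langle \tilde{u} - \tilde{A}x^*, x_{n+1} - x^* \rangle. \tag{6.9}
\end{align*}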
Hence applying Lemma 2.5 to (6.9) and combining with (6.8) we get $\|x_n - x^*\| \to 0$. □

Acknowledgements. The author is grateful to the referee for the comments and suggestions, which improved the presentation of this paper.

References

1. P. Alexandre, V. H. Nguyen and P. Tossings, 'The perturbed generalized proximal point algorithm', Math. Model. Numer. Anal. 32 (1998) 223–252.
2. J. B. Baillon and H. Brezis, 'Une remarque sur le comportement asymptotique des semigroupes non linéaires', Houston J. Math. 2 (1976) 5–7.
3. F. E. Browder, 'Nonexpansive nonlinear operators in a Banach space', Proc. Nat. Acad. Sci. USA 54 (1965) 1041–1044.
4. F. E. Browder, 'Convergence of approximants to fixed points of nonexpansive nonlinear mappings in Banach spaces', Arch. Rational Mech. Anal. 24 (1967) 82–90.
5. F. E. Browder, 'Convergence theorems for sequences of nonlinear operators in Banach spaces', Math. Z. 100 (1967) 201–225.
6. R. E. Bruck, 'On the convex approximation property and the asymptotic behavior of nonlinear contractions in Banach spaces', Israel J. Math. 38 (1981) 304–314.
7. J. V. Burke and M. Qian, 'A variable metric proximal point algorithm for monotone operators', SIAM J. Control Optim. 37 (1998) 353–375.
8. I. Cioranescu, Geometry of Banach spaces, duality mappings and nonlinear problems (Kluwer, Dordrecht, 1990).
9. K. Goebel and S. Reich, Uniform convexity, hyperbolic geometry, and nonexpansive mappings (Marcel Dekker, New York, 1984).
10. O. Güler, 'On the convergence of the proximal point algorithm for convex minimization', SIAM J. Control Optim. 29 (1991) 403–419.
11. B. Halpern, 'Fixed points of nonexpanding maps', Bull. Amer. Math. Soc. 73 (1967) 957–961.
12. K. Ito and K. Kunisch, 'An active set strategy based on the augmented Lagrangian formulation for image restoration', Math. Model. Numer. Anal. 33 (1999) 1–21.
13. A. Kaplan and R. Tichatschke, 'Proximal point methods and nonconvex optimization', J. Global Optim. 13 (1998) 389–406.
14. P. L. Lions, 'Approximation de points fixes de contractions', C. R. Acad. Sci. Paris Sér. A 284 (1977) 1357–1359.
15. Z. Opial, 'Weak convergence of the sequence of successive approximations for nonexpansive mappings', Bull. Amer. Math. Soc. 73 (1967) 591–597.
16. S. Reich, 'Weak convergence theorems for nonexpansive mappings in Banach spaces', J. Math. Anal. Appl. 67 (1979) 274–276.
17. S. Reich, 'Strong convergence theorems for resolvents of accretive operators in Banach spaces', J. Math. Anal. Appl. 75 (1980) 287–292.
18. R. T. Rockafellar, 'Monotone operators and the proximal point algorithm', SIAM J. Control Optim. 14 (1976) 877–898.
19. M. V. Solodov and B. F. Svaiter, 'Forcing strong convergence of proximal point iterations in a Hilbert space', Math. Program. Ser. A 87 (2000) 189–202.
20. K. K. Tan and H. K. Xu, 'An ergodic theorem for nonlinear semigroups of Lipschitzian mappings in Banach spaces', Nonlinear Anal. 19 (1992) 805–813.
21. K. K. Tan and H. K. Xu, 'Approximating fixed points of nonexpansive mappings by the Ishikawa iteration process', J. Math. Anal. Appl. 178 (1993) 301–308.
22. P. Tossings, 'The perturbed proximal point algorithm and some of its applications', Appl. Math. Optim. 29 (1994) 125–159.
23. R. Wittmann, 'Approximation of fixed points of nonexpansive mappings', Arch. Math. 58 (1992) 486–491.
Department of Mathematics University of Durban–Westville Private Bag X54001 Durban 4000 South Africa