J Optim Theory Appl (2012) 153:769–778 DOI 10.1007/s10957-011-9973-5

Some Remarks on the Proximal Point Algorithm

Hadi Khatibzadeh

Received: 12 July 2011 / Accepted: 17 November 2011 / Published online: 7 December 2011 © Springer Science+Business Media, LLC 2011

Abstract In this paper, we obtain some results on the boundedness and asymptotic behavior of the sequence generated by the proximal point algorithm without a summability assumption on the error sequence. We also study the rate of convergence to the minimum value of a proper, convex, and lower semicontinuous function. Finally, we consider the proximal point algorithm for solving equilibrium problems.

Keywords Proximal-point algorithm · Maximal monotone operators · Asymptotic behavior · Rate of convergence · Equilibrium problems · Monotone bifunctions

Communicated by Nicolas Hadjisavvas.

H. Khatibzadeh, Department of Mathematics, University of Zanjan, P.O. Box 45195-313, Zanjan, Iran. e-mail: [email protected]

1 Introduction

One of the most popular algorithms for approximating a zero of a maximal monotone operator is the proximal point algorithm (for short, PPA). This algorithm was introduced by Rockafellar [1]. It gives an approximation to solutions of a variational inequality for monotone operators, and when the monotone operator is the subdifferential of a proper, convex, and lower semicontinuous function, it gives an approximation to solutions of a minimization problem for the convex function. The weak and strong convergence of the sequence generated by PPA and of its weighted average to a zero of a maximal monotone operator have been studied by several authors [1–7]. In [1–5], the authors assumed the set of all zeros of the maximal monotone operator to be nonempty and studied convergence of the sequence generated by PPA under a summability assumption on the error sequence and appropriate assumptions on the parameter sequence. In [6], Djafari Rouhani and Khatibzadeh investigated the asymptotic behavior of the sequence generated by PPA without assuming the set of all zeros of the maximal monotone operator to be nonempty. They showed that the sequence generated by PPA is bounded iff the set of all zeros of the maximal monotone operator is nonempty, provided that the summability assumption on the error sequence and a suitable assumption on the parameter sequence hold.

In this paper, our motivation is the study of boundedness and asymptotic behavior of the sequence generated by PPA, without assuming summability of the error sequence. In Sect. 2, we give some definitions that we need in the sequel. In Sect. 3, we prove that under suitable conditions on the error and parameter sequences, the sequence generated by PPA is bounded, provided A is assumed to be coercive. In Sect. 4, we study the asymptotic behavior of the sequence generated by PPA under appropriate assumptions on the error and parameter sequences. Our results in Sects. 3 and 4 extend some results of [1–6] and some results in the literature. In Sect. 5, the rate of convergence to the minimum value of a convex, proper, and lower semicontinuous function is studied. This result extends a result of Güler [7]. Finally, Sect. 6 is devoted to the proximal point algorithm for solving equilibrium problems. The proximal point algorithm for equilibrium problems has been investigated in [8–11] in recent years. In Sect. 6 of this paper, by using our previous work with Hadjisavvas [12], we show that all results on the proximal point algorithm for maximal monotone operators are true for some equilibrium problems with maximal monotone bifunctions.

2 Preliminaries

Let $H$ be a real Hilbert space with inner product $(\cdot, \cdot)$ and norm $|\cdot|$. We denote weak convergence in $H$ by $\rightharpoonup$ and strong convergence by $\to$. Let $A$ be a nonempty subset of $H \times H$, to which we shall refer as a (nonlinear) possibly multivalued operator in $H$. $A$ is called monotone (resp. strongly monotone) iff $(y_2 - y_1, x_2 - x_1) \ge 0$ (resp. $(y_2 - y_1, x_2 - x_1) \ge \alpha |x_1 - x_2|^2$, for some $\alpha > 0$), for all $[x_i, y_i] \in A$, $i = 1, 2$. $A$ is maximal monotone iff $A$ is monotone and $R(I + A) = H$, where $I$ is the identity mapping of $H$. The monotone operator $A$ is called coercive iff there exists $z \in H$ such that

$$\lim_{|x| \to +\infty} \frac{(y, x - z)}{|x - z|} = +\infty, \quad \text{for all } [x, y] \in A.$$

Given any function $\varphi : H \to\, ]-\infty, +\infty]$ with domain $D(\varphi)$ (not necessarily convex), its subdifferential is defined as the multivalued operator $\partial\varphi$, where

$$\partial\varphi(x) = \{ w \in H \mid \varphi(x) - \varphi(y) \le (w, x - y), \ \forall y \in H \}.$$

$\varphi$ is called proper iff there exists $x \in H$ such that $\varphi(x) < +\infty$. It is a well-known result that if $\varphi$ is a proper, convex, and lower semicontinuous function, then $\partial\varphi$ is a maximal monotone operator. We refer the reader to the book by Morosanu [5] for background on monotone operators and subdifferentials of convex functions in Hilbert spaces.

Let $\{c_n\}$ be a sequence of positive real numbers, and $\{f_n\}$ be a sequence in $H$. The proximal point algorithm is defined by

$$x_n = (I + c_n A)^{-1}(x_{n-1} - f_n), \quad \forall n \ge 1, \qquad x_0 = x, \tag{1}$$

where $A : D(A) \subset H \to H$ is a maximal monotone operator.


In the sequel, $\omega_w(x_0)$ denotes the set of all weak cluster points of the sequence $\{x_n\}$ generated by (1). $Ax_n$ denotes the element $\frac{x_{n-1} - x_n - f_n}{c_n}$ in $H$, for $n \ge 1$, and

$$w_n = \Big( \sum_{k=1}^{n} c_k \Big)^{-1} \Big( \sum_{k=1}^{n} c_k x_k \Big).$$
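The iteration (1) and the derived quantities $Ax_n$ and $w_n$ can be sketched numerically as follows. This is only an illustration under assumptions that are not in the paper: $H$ is the real line, $A = \partial\varphi$ with $\varphi(x) = |x|$ (so the resolvent is soft-thresholding), $c_n = 1$, and $f_n = 1/n^2$. The function names are hypothetical.

```python
def soft_threshold(v, c):
    """Resolvent (I + c*A)^{-1}(v) for A = subdifferential of |x| on the real line."""
    if v > c:
        return v - c
    if v < -c:
        return v + c
    return 0.0


def ppa(x0, c, f, resolvent):
    """Run x_n = (I + c_n A)^{-1}(x_{n-1} - f_n); record x_n, Ax_n and w_n."""
    xs, Axs, ws = [x0], [], []
    c_sum, cx_sum = 0.0, 0.0
    for cn, fn in zip(c, f):
        x_prev = xs[-1]
        x_new = resolvent(x_prev - fn, cn)
        xs.append(x_new)
        # The element of A(x_n) singled out in the text: (x_{n-1} - x_n - f_n) / c_n
        Axs.append((x_prev - x_new - fn) / cn)
        c_sum += cn
        cx_sum += cn * x_new
        ws.append(cx_sum / c_sum)  # weighted average w_n
    return xs, Axs, ws


# Assumed data for the illustration (not from the paper).
N = 50
c = [1.0] * N
f = [1.0 / (n * n) for n in range(1, N + 1)]
xs, Axs, ws = ppa(5.0, c, f, soft_threshold)
print(xs[:6])
```

In this toy setting every recorded $Ax_n$ lies in $[-1, 1]$, which is the range of $\partial|\cdot|$, and the iterates reach the unique zero of $A$ after a few steps.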

3 Boundedness

In this section, we study the boundedness of $\{x_n\}$ generated by (1) when the monotone operator $A$ is coercive. The following theorem extends [2, Theorem 1].

Theorem 3.1 Let $A$ be a coercive maximal monotone operator. If the sequence $\{ |f_n| / c_n \}$ is bounded, then for each $x_0 \in H$, the sequence $\{x_n\}$ generated by (1) is bounded.

Proof Let $C > 0$ be such that for each $n \ge 1$, $\frac{|f_n|}{c_n} < C$. By coerciveness of $A$, there exist $K > 0$ and $y_0 \in H$ such that for all $[z, w] \in A$ with $|z| > K$, $\frac{(w, z - y_0)}{|z - y_0|} > C$. Suppose that there exists $n$ such that $|x_{n+1} - y_0| > K$. From (1), we get

$$x_n - x_{n+1} - f_{n+1} \in c_{n+1} A x_{n+1}. \tag{2}$$

Multiplying both sides of (2) by $\frac{x_{n+1} - y_0}{|x_{n+1} - y_0|}$, we get

$$c_{n+1} C + |x_{n+1} - y_0| \le |x_n - y_0| + |f_{n+1}|.$$

This implies that

$$|x_{n+1} - y_0| \le |x_n - y_0| + c_{n+1} \Big( \frac{|f_{n+1}|}{c_{n+1}} - C \Big) < |x_n - y_0|,$$

for each $n \ge 0$ such that $|x_{n+1} - y_0| > K$. It follows that for all $n \ge 0$,

$$|x_{n+1} - y_0| \le \max\{ |x_0 - y_0|, K \}.$$

Hence, the sequence $\{x_n\}$ is bounded. □
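A small numerical check of the bound in Theorem 3.1 may help. The sketch below assumes, purely for illustration, $H = \mathbb{R}$ and $A = I$ (coercive with $z = y_0 = 0$, so that $K = C$ works), with $c_n = n$ and $f_n = 0.5\,n$. Here $|f_n|/c_n = 0.5 < C = 0.6$ while $\sum |f_n|$ diverges; none of these choices come from the paper.

```python
def ppa_identity(x0, c, f):
    """PPA for A = I on the real line: x_n = (x_{n-1} - f_n) / (1 + c_n)."""
    xs = [x0]
    for cn, fn in zip(c, f):
        xs.append((xs[-1] - fn) / (1.0 + cn))
    return xs


# Assumed data: the errors grow linearly, but |f_n| / c_n stays bounded by C.
N = 2000
c = [float(n) for n in range(1, N + 1)]
f = [0.5 * n for n in range(1, N + 1)]
C = 0.6
x0 = 3.0

xs = ppa_identity(x0, c, f)
bound = max(abs(x0), C)  # max{|x_0 - y_0|, K} with y_0 = 0 and K = C
print(max(abs(x) for x in xs), "<=", bound)
```

The iterates stay inside the predicted ball even though the error sequence is unbounded, which is the situation the theorem is designed to cover.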



4 Asymptotic Behavior

In this section, we establish some results on the weak and strong convergence of $\{x_n\}$ generated by (1) to an element of $A^{-1}(0)$. We begin this section with a result which shows that the boundedness of $\{x_n\}$ guarantees the existence of a zero of $A$, under suitable assumptions on $\{c_n\}$ and $\{f_n\}$. A similar result with the condition $\sum_{n=1}^{\infty} |f_n| < +\infty$ has been proved in [6] by Djafari Rouhani and Khatibzadeh. First, we recall the following lemma, which is a direct application of the well-known Stolz–Cesàro theorem.

Lemma 4.1 Suppose that $\{a_n\}$ and $\{b_n\}$ are positive sequences such that $\sum_{n=1}^{+\infty} b_n = +\infty$ and $\lim_{n \to +\infty} \frac{a_n}{b_n} = 0$. Then

$$\lim_{m \to +\infty} \frac{\sum_{n=1}^{m} a_n}{\sum_{n=1}^{m} b_n} = 0.$$
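Lemma 4.1 can be illustrated with a quick computation. The sequences below, $a_n = 1/\sqrt{n}$ and $b_n = 1$, are an assumed example chosen only because they satisfy the hypotheses; the helper name is hypothetical.

```python
import math


def ratio_of_partial_sums(a, b, m):
    """Return (a_1 + ... + a_m) / (b_1 + ... + b_m)."""
    num = sum(a(n) for n in range(1, m + 1))
    den = sum(b(n) for n in range(1, m + 1))
    return num / den


def a(n):
    return 1.0 / math.sqrt(n)  # a_n / b_n -> 0


def b(n):
    return 1.0  # sum of b_n diverges


ratios = [ratio_of_partial_sums(a, b, m) for m in (100, 1000, 10000)]
print(ratios)
```

The printed ratios shrink as $m$ grows, roughly like $2/\sqrt{m}$ for this particular choice.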


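Theorem 4.1 Let $\{x_n\}$ be a bounded sequence generated by (1). If $\frac{|f_n|}{c_n} \to 0$ as $n \to +\infty$ and $\sum_{n=1}^{+\infty} c_n = +\infty$, then $A^{-1}(0) \ne \emptyset$. Moreover, every weak cluster point of $w_n$ belongs to $A^{-1}(0)$.

Proof Suppose that $y \in A(x)$. By the monotonicity of $A$, we have

$$
\begin{aligned}
(y, x - w_k) &= \Big( \sum_{n=1}^{k} c_n \Big)^{-1} \sum_{n=1}^{k} c_n (y, x - x_n) \\
&= \Big( \sum_{n=1}^{k} c_n \Big)^{-1} \sum_{n=1}^{k} c_n \big[ (y - Ax_n, x - x_n) + (Ax_n, x - x_n) \big] \\
&\ge \Big( \sum_{n=1}^{k} c_n \Big)^{-1} \sum_{n=1}^{k} (x_{n-1} - x_n - f_n, x - x_n) \\
&\ge \Big( \sum_{n=1}^{k} c_n \Big)^{-1} \sum_{n=1}^{k} \Big[ \frac{1}{2} |x_n - x|^2 - \frac{1}{2} |x_{n-1} - x|^2 - |f_n| |x_n - x| \Big] \\
&\ge -\frac{\big( \sum_{n=1}^{k} c_n \big)^{-1}}{2} |x_0 - x|^2 - \Big( \sum_{n=1}^{k} c_n \Big)^{-1} \sum_{n=1}^{k} |f_n| |x_n - x|.
\end{aligned}
$$

Since $\{w_k\}$ is bounded, there exists a subsequence $\{w_{k_j}\}$ of $\{w_k\}$ such that $w_{k_j} \rightharpoonup p \in H$ as $j \to +\infty$; substituting $k$ by $k_j$ in the above inequality and letting $j \to +\infty$, by Lemma 4.1 and our assumptions, we get $(y, x - p) \ge 0$. Then, by the maximality of $A$, we have $p \in A^{-1}(0)$ as desired. □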
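The role of the weighted averages $w_n$ in Theorem 4.1 can be seen in a standard planar example, sketched here under assumptions that are not part of the paper: $H = \mathbb{R}^2$, $A(x_1, x_2) = (-x_2, x_1)$ (a rotation by 90 degrees, which is maximal monotone with $A^{-1}(0) = \{0\}$), $f_n \equiv 0$, and $c_n = n^{-0.6}$, so that $\sum c_n = +\infty$ while $\sum c_n^2 < +\infty$. With $f_n \equiv 0$, relation (1) gives $c_n A x_n = x_{n-1} - x_n$, so $\sum_{n \le k} c_n A x_n$ telescopes and $|w_k| \le 2|x_0| / \sum_{n \le k} c_n$.

```python
import math


def resolvent_rotation(v, c):
    """(I + c*A)^{-1}(v) for A(x1, x2) = (-x2, x1) on the plane."""
    v1, v2 = v
    d = 1.0 + c * c
    return ((v1 + c * v2) / d, (v2 - c * v1) / d)


def run(x0, N, p=0.6):
    """Run PPA with f_n = 0 and c_n = n**(-p); return x_N, w_N and sum of c_n."""
    x = x0
    s = 0.0
    acc = (0.0, 0.0)
    for n in range(1, N + 1):
        c = n ** (-p)
        x = resolvent_rotation(x, c)
        s += c
        acc = (acc[0] + c * x[0], acc[1] + c * x[1])
    w = (acc[0] / s, acc[1] / s)
    return x, w, s


x_N, w_N, S_N = run((1.0, 0.0), 20000)
print("|x_N| =", math.hypot(*x_N))
print("|w_N| =", math.hypot(*w_N), " 2/S_N =", 2.0 / S_N)
```

In this run the iterates stay bounded away from the zero of $A$, while the weighted averages approach it at the rate given by the telescoping bound.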



In the following two theorems, we show that the set of all weak cluster points of the bounded sequence $\{x_n\}$ generated by (1) is a subset of $A^{-1}(0)$.

Theorem 4.2 Let $\{x_n\}$ be the bounded sequence generated by (1) and $\sum_{n=1}^{+\infty} c_n^2 = +\infty$. If $\sum_{n=1}^{\infty} \frac{|f_n|^2}{c_n^2} < +\infty$ and $\frac{|f_n|}{c_n^2} \to 0$ as $n \to +\infty$, then $\omega_w(x_0) \subset A^{-1}(0)$.

Proof By the monotonicity of $A$ and (1), we have

$$\Big( Ax_{n-1} - Ax_n, \ Ax_n + \frac{f_n}{c_n} \Big) \ge 0.$$

This implies that

$$
\begin{aligned}
|Ax_n|^2 &\le (Ax_{n-1}, Ax_n) + \Big( Ax_{n-1} - Ax_n, \frac{f_n}{c_n} \Big) \\
&\le \frac{1}{2} |Ax_{n-1}|^2 + \frac{1}{2} |Ax_n|^2 - \frac{1}{2} |Ax_n - Ax_{n-1}|^2 + \frac{1}{2} |Ax_{n-1} - Ax_n|^2 + \frac{1}{2} \frac{|f_n|^2}{c_n^2}.
\end{aligned}
$$

Then

$$|Ax_n|^2 \le |Ax_{n-1}|^2 + \frac{|f_n|^2}{c_n^2}. \tag{3}$$

Since $\sum_{n=1}^{\infty} \frac{|f_n|^2}{c_n^2} < +\infty$, there exists $\lim_{n \to +\infty} |Ax_n| = l$. Let $L = \sup_{n \ge 1} |Ax_n|$.

Theorem 4.1 shows $A^{-1}(0) \ne \emptyset$. Let $p \in A^{-1}(0)$. By the monotonicity of $A$ and (1), we get

$$(x_{n-1} - x_n - f_n, x_n - p) \ge 0.$$

This implies that

$$|x_n - x_{n-1}|^2 \le |x_{n-1} - p|^2 - |x_n - p|^2 + M |f_n|,$$

where $M := 2 \sup_{n \ge 1} |x_n - p|$. Summing up from $n = 1$ to $k$, by (1), we get

$$\sum_{n=1}^{k} c_n^2 \Big| Ax_n + \frac{f_n}{c_n} \Big|^2 \le |x_0 - p|^2 - |x_k - p|^2 + M \sum_{n=1}^{k} |f_n|.$$

It follows that

$$\sum_{n=1}^{k} c_n^2 \Big( |Ax_n|^2 + \frac{|f_n|^2}{c_n^2} - 2 |Ax_n| \frac{|f_n|}{c_n} \Big) \le |x_0 - p|^2 - |x_k - p|^2 + M \sum_{n=1}^{k} |f_n|.$$

Hence,

$$\sum_{n=1}^{k} c_n^2 |Ax_n|^2 \le 2L \sum_{n=1}^{k} c_n |f_n| + |x_0 - p|^2 - |x_k - p|^2 + M \sum_{n=1}^{k} |f_n|.$$

By (3), for $1 \le n \le k$ we get

$$|Ax_k|^2 \le |Ax_n|^2 + \sum_{i=n+1}^{k} \frac{|f_i|^2}{c_i^2}.$$

Then

$$|Ax_k|^2 \sum_{n=1}^{k} c_n^2 \le \sum_{n=1}^{k} c_n^2 \sum_{i=n+1}^{k} \frac{|f_i|^2}{c_i^2} + 2L \sum_{n=1}^{k} c_n |f_n| + |x_0 - p|^2 - |x_k - p|^2 + M \sum_{n=1}^{k} |f_n|.$$

By the assumptions and Lemma 4.1, $Ax_k \to 0$ as $k \to +\infty$. If $x_{n_k} \rightharpoonup q$ as $k \to +\infty$, then the demiclosedness of $A$ implies that $q \in A^{-1}(0)$. □

Theorem 4.3 Let $\{x_n\}$ be the bounded sequence generated by (1) and $A = \partial\varphi$, where $\varphi : H \to\, ]-\infty, +\infty]$ is a proper, convex, and lower semicontinuous function. If $\sum_{n=1}^{\infty} c_n = +\infty$, $\sum_{n=1}^{\infty} \frac{|f_n|^2}{c_n} < \infty$ and $\frac{|f_n|}{c_n} \to 0$ as $n \to +\infty$, then $\omega_w(x_0) \subset A^{-1}(0)$.


Proof By the subdifferential inequality and (1), we get

$$
\begin{aligned}
c_i \big( \varphi(x_i) - \varphi(x_{i-1}) \big) &\le c_i \big( \partial\varphi(x_i), x_i - x_{i-1} \big) = (x_{i-1} - x_i - f_i, x_i - x_{i-1}) \\
&\le -|x_i - x_{i-1}|^2 + \frac{1}{2} |x_i - x_{i-1}|^2 + \frac{1}{2} |f_i|^2 \le \frac{1}{2} |f_i|^2.
\end{aligned} \tag{4}
$$

Dividing both sides of (4) by $c_i$ and summing up from $i = n + 1$ to $k$, we obtain

$$\varphi(x_k) \le \varphi(x_n) + \frac{1}{2} \sum_{i=n+1}^{k} \frac{|f_i|^2}{c_i}. \tag{5}$$

Theorem 4.1 shows $A^{-1}(0) \ne \emptyset$. Let $p \in A^{-1}(0)$. By the subdifferential inequality and (1), we have

$$c_n \big( \varphi(x_n) - \varphi(p) \big) \le (x_{n-1} - x_n - f_n, x_n - p) \le \frac{1}{2} |x_{n-1} - p|^2 - \frac{1}{2} |x_n - p|^2 + M |f_n|, \tag{6}$$

where $M := \sup_{n \ge 1} |x_n - p|$. Summing up from $n = 1$ to $k$, we obtain

$$\sum_{n=1}^{k} c_n \big( \varphi(x_n) - \varphi(p) \big) \le \frac{1}{2} |x_0 - p|^2 - \frac{1}{2} |x_k - p|^2 + M \sum_{n=1}^{k} |f_n|.$$

By (5), we get

$$\big( \varphi(x_k) - \varphi(p) \big) \sum_{n=1}^{k} c_n \le \frac{1}{2} \sum_{n=1}^{k} c_n \sum_{i=n+1}^{k} \frac{|f_i|^2}{c_i} + \frac{1}{2} |x_0 - p|^2 - \frac{1}{2} |x_k - p|^2 + M \sum_{n=1}^{k} |f_n|.$$

By the assumptions and Lemma 4.1, we get $\varphi(x_k) - \varphi(p) \to 0$ as $k \to +\infty$. If $x_{k_j} \rightharpoonup q$ as $j \to +\infty$, then $\varphi(q) \le \liminf_{j \to +\infty} \varphi(x_{k_j}) = \varphi(p)$. Hence $q \in A^{-1}(0)$. □

Remark 4.1 Theorems 4.2 and 4.3 show that if $A^{-1}(0)$ is a singleton (which happens if, for example, $A$ is strictly monotone in Theorem 4.2 or $\varphi$ is strictly convex in Theorem 4.3), then $x_n \rightharpoonup p$, where $p$ is the unique element of $A^{-1}(0)$.

Remark 4.2 Although Theorems 4.2 and 4.3 do not imply the weak convergence of $x_n$ to $p \in A^{-1}(0)$ unless $A$ is strictly monotone or $\varphi$ is strictly convex, they weaken the requirements on the errors in the proximal point algorithm. They show that the error sequence $\{|f_n|\}$ can go to infinity, provided that a suitable assumption on $c_n$ holds.

The following theorem extends a result of Boikanyo and Morosanu (see [2]). First, we prove a lemma.


Lemma 4.2 Assume that $\{y_n\}$ is a positive real sequence satisfying the following inequality:

$$b_n y_n \le y_{n-1} - y_n + a_n, \tag{7}$$

where $\{b_n\}$ and $\{a_n\}$ are positive sequences. Then we have:

(i) If $\{ \frac{a_n}{b_n} \}$ is bounded, then the sequence $\{y_n\}$ is bounded.
(ii) If $\lim_{n \to +\infty} \frac{a_n}{b_n} = 0$, then there exists $\lim_{n \to +\infty} y_n$.
(iii) If $\lim_{n \to +\infty} \frac{a_n}{b_n} = 0$ and $\sum_{n=1}^{+\infty} b_n = +\infty$, then $y_n \to 0$ as $n \to +\infty$.

Proof (i) First we prove the boundedness of $\{y_n\}$. There exists $B$ such that $\frac{a_n}{b_n} \le B$ for all $n \ge 1$. If $y_n > B$, then

$$y_n \le y_{n-1} + b_n \Big( \frac{a_n}{b_n} - y_n \Big) < y_{n-1} + b_n \Big( \frac{a_n}{b_n} - B \Big) \le y_{n-1}.$$

It follows that $y_n \le \max\{y_0, B\}$.

(ii) For each $\varepsilon > 0$ there is $m_0 > 0$ such that for each $m \ge m_0$, $\frac{a_m}{b_m} < \varepsilon$. By the above argument, for all $k > m \ge m_0$, we have

$$y_k \le \max\Big\{ y_{k-1}, \frac{a_k}{b_k} \Big\} \le \cdots \le \max\Big\{ y_m, \frac{a_{m+1}}{b_{m+1}}, \ldots, \frac{a_k}{b_k} \Big\} \le \max\{ y_m, \varepsilon \} \le y_m + \varepsilon.$$

Taking $\limsup$ over $k$ and then $\liminf$ over $m$ gives $\limsup_{k} y_k \le \liminf_{m} y_m + \varepsilon$ for every $\varepsilon > 0$. Then there exists $\lim_{n \to +\infty} y_n$.

(iii) We divide both sides of (7) by $b_n$ and take $\liminf$ when $n \to +\infty$. It is sufficient to prove $\liminf_{n \to +\infty} \frac{1}{b_n} (y_{n-1} - y_n) \le 0$. Suppose to the contrary that

$$\liminf_{n \to +\infty} \frac{1}{b_n} (y_{n-1} - y_n) > \lambda > 0.$$

Then there exists $n_0$ such that for each $n \ge n_0$, $\frac{1}{b_n} (y_{n-1} - y_n) > \lambda$. Multiplying both sides of this inequality by $b_n$, summing up from $n = n_0$ to $m$ and letting $m \to \infty$, we get a contradiction. □

Theorem 4.4 Let $\{x_n\}$ be the sequence generated by (1) and $A$ be a maximal monotone and strongly monotone operator. If $\sum_{n=1}^{\infty} c_n = +\infty$ and $\frac{|f_n|}{c_n} \to 0$ as $n \to +\infty$, then $x_n \to p$ as $n \to +\infty$, where $p$ is the unique element of $A^{-1}(0)$.

Proof Theorem 3.1 implies the boundedness of $\{x_n\}$. Now, by Theorem 4.1, $A^{-1}(0)$ is nonempty. Suppose that $p$ is the single element of $A^{-1}(0)$. By the strong monotonicity of $A$ and (1), we get

$$(x_{n-1} - x_n - f_n, x_n - p) \ge \alpha c_n |x_n - p|^2.$$

It follows that

$$2 \alpha c_n |x_n - p|^2 \le |x_{n-1} - p|^2 - |x_n - p|^2 + 2M |f_n|,$$

where $M := \sup_{n \ge 1} |x_n - p|$. The theorem follows by the assumptions and Lemma 4.2. □
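Theorem 4.4 allows errors that are not merely non-summable but unbounded. The sketch below illustrates this under assumed data that do not appear in the paper: $H = \mathbb{R}$, $A = I$ (strongly monotone with $\alpha = 1$, unique zero $p = 0$), $c_n = n^2$ and $f_n = n$, so that $|f_n| \to +\infty$ while $|f_n|/c_n = 1/n \to 0$.

```python
def ppa_identity_growing_errors(x0, N):
    """PPA for A = I with c_n = n**2 and f_n = n; returns the list of |x_n - p|, p = 0."""
    x = x0
    dist = []
    for n in range(1, N + 1):
        c_n = float(n * n)  # parameter sequence, sum c_n = +infinity
        f_n = float(n)      # error sequence, unbounded
        x = (x - f_n) / (1.0 + c_n)  # x_n = (I + c_n A)^{-1}(x_{n-1} - f_n)
        dist.append(abs(x))
    return dist


dist = ppa_identity_growing_errors(10.0, 1000)
print(dist[9], dist[99], dist[-1])
```

In the notation of Lemma 4.2, this corresponds to $y_n = |x_n - p|^2$, $b_n = 2\alpha c_n$ and $a_n = 2M|f_n|$; the printed distances decay roughly like $1/n$.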


5 Rate of Convergence

Güler [7] computed the rate of convergence of $\varphi(x_n)$ to $\varphi(p)$ as $n \to +\infty$, where $\{x_n\}$ is generated by (1) with $A = \partial\varphi$ and $p$ is a minimum point of $\varphi$, provided that $x_n \to p$ as $n \to +\infty$. In this section, we give a simple proof of Güler's result without assuming $x_n \to p$. First we prove the following elementary lemma.

Lemma 5.1 Suppose that $\{a_n\}$ and $\{b_n\}$ are two positive real sequences such that $\{a_n\}$ is nonincreasing and convergent to zero, and $\sum_{n=1}^{+\infty} a_n b_n < +\infty$; then $\big( \sum_{k=1}^{n} b_k \big) a_n \to 0$ as $n \to +\infty$.

Proof By the assumptions on $\{a_n\}$ and $\{b_n\}$, for each $m > k$, we get

$$a_m \sum_{n=1}^{m} b_n \le a_m \sum_{n=1}^{k} b_n + \sum_{n=k+1}^{m} a_n b_n.$$

Taking $\limsup$ when $m \to +\infty$, we obtain

$$\limsup_{m \to +\infty} a_m \sum_{n=1}^{m} b_n \le \sum_{n=k+1}^{+\infty} a_n b_n.$$

The lemma is proved by letting $k \to +\infty$. □

Theorem 5.1 Suppose that $\{x_n\}$ is the bounded sequence generated by (1) with $f_n \equiv 0$ and $A = \partial\varphi$, where $\varphi$ is a proper, convex, and lower semicontinuous function. If $\sum_{n=1}^{+\infty} c_n = +\infty$, then $\varphi(x_n) - \varphi(p) = o\big( ( \sum_{i=1}^{n} c_i )^{-1} \big)$, where $p$ is a minimum point of $\varphi$.

Proof Summing up (6) from $n = 1$ to $k$, we get

$$\sum_{n=1}^{k} c_n \big( \varphi(x_n) - \varphi(p) \big) \le \frac{1}{2} |x_0 - p|^2.$$

By the proof of Theorem 4.3, $\varphi(x_n) - \varphi(p)$ is nonincreasing and convergent to 0. Now, the theorem follows by Lemma 5.1. □
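The $o\big((\sum c_i)^{-1}\big)$ rate can be observed numerically. The sketch below is an illustration under assumptions not taken from the paper: $H = \mathbb{R}$, $\varphi(x) = x^4/4$ (minimum point $p = 0$), $c_n = 1$, $f_n \equiv 0$ and $x_0 = 1$. The resolvent solves $x + c\,x^3 = v$, done here by bisection; the helper names are hypothetical.

```python
def prox_quartic(v, c, iters=200):
    """Solve x + c*x**3 = v, i.e. x = (I + c*phi')^{-1}(v) for phi(x) = x**4 / 4."""
    lo, hi = (0.0, v) if v >= 0 else (v, 0.0)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + c * mid ** 3 < v:
            lo = mid  # left-hand side is increasing in x, so the root is to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phi(x):
    return 0.25 * x ** 4


x = 1.0
S = 0.0          # running sum of c_n
weighted = 0.0   # running sum of c_n * (phi(x_n) - phi(p)), with phi(p) = 0
scaled_gap = {}
for n in range(1, 1001):
    c_n = 1.0
    x = prox_quartic(x, c_n)
    S += c_n
    weighted += c_n * phi(x)
    if n in (10, 100, 1000):
        scaled_gap[n] = S * phi(x)  # should tend to 0 by Theorem 5.1

print(scaled_gap, weighted)
```

The scaled gaps $\big(\sum_{i \le n} c_i\big)(\varphi(x_n) - \varphi(p))$ decrease toward zero, and the weighted sum stays below $\frac{1}{2}|x_0 - p|^2 = 0.5$, matching the inequality used in the proof.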

6 PPA and Equilibrium Problems

In this section, we consider the proximal point algorithm for solving equilibrium problems. Given a nonempty subset $C$ of a Hilbert space $H$, by the term "bifunction" we understand any function $F : C \times C \to \mathbb{R}$ such that $F(x, x) = 0$, for each $x \in C$. A bifunction $F$ is called monotone (strictly monotone) iff $F(x, y) + F(y, x) \le 0$ ($<$ […] $> 0$ (equivalently, for some $\lambda > 0$) and each $x \in H$ there exists $x_\lambda \in C$ such that

$$\lambda F(x_\lambda, y) + (x_\lambda - x, y - x_\lambda) \ge 0, \quad \forall y \in C. \tag{8}$$

This element $x_\lambda$ is uniquely defined. In [12] Hadjisavvas and Khatibzadeh proved some criteria for maximality of $A_F$ (or by definition, maximality of $F$). The following theorem is one of them.

Theorem 6.1 Let $C \subset H$ be a nonempty, closed, and convex set. If $F : C \times C \to \mathbb{R}$ is monotone, $F(\cdot, y)$ is upper hemicontinuous (i.e., upper semicontinuous on line segments) for all $y \in C$ and $F(x, \cdot)$ is convex and lower semicontinuous for all $x \in C$, then $F$ is maximal monotone.

Therefore, if $F$ satisfies the assumptions of Theorem 6.1, then by (8) there exists a sequence $\{x_n\}$ which satisfies

$$\lambda_n F(x_n, y) + (x_n - x_{n-1} + f_n, y - x_n) \ge 0, \quad \forall y \in C, \qquad x_0 = x \in C, \tag{9}$$

where $\{f_n\}$ is a sequence in $H$ and $\{\lambda_n\}$ is a positive real sequence. (9) is called the proximal point algorithm for the monotone bifunction $F$. By the definition of $A_F$, (9) is equivalent to (1) with $A_F$ instead of $A$ and $\lambda_n$ instead of $c_n$. Therefore, every convergence result for the sequence generated by (1) is true for the sequence generated by (9) without any independent proof.
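Scheme (9) can be made concrete in a one-dimensional toy problem. Everything in this sketch is an assumption chosen for illustration and is not taken from the paper: $C = [1, 2] \subset \mathbb{R}$, $F(x, y) = \varphi(y) - \varphi(x)$ with $\varphi(y) = y^2/2$, $\lambda_n = 0.1$ and $f_n = 0.1/n$. For this $F$, the point satisfying (9) minimizes $\lambda_n \varphi(y) + \frac{1}{2}|y - (x_{n-1} - f_n)|^2$ over $C$, which on an interval is the clipped unconstrained minimizer. The code checks inequality (9) on a grid of $y \in C$.

```python
def clip(v, lo, hi):
    return max(lo, min(hi, v))


def phi(y):
    return 0.5 * y * y


def F(x, y):
    """Assumed monotone bifunction: F(x, y) = phi(y) - phi(x), so F(x, x) = 0."""
    return phi(y) - phi(x)


LO, HI = 1.0, 2.0  # the set C = [LO, HI]


def ppa_step(x_prev, lam, f):
    """Solve (9) for this F: clip the minimizer of lam*phi(y) + 0.5*(y - (x_prev - f))**2."""
    return clip((x_prev - f) / (1.0 + lam), LO, HI)


def residual(x_new, x_prev, lam, f, y):
    """Left-hand side of (9); it should be nonnegative for every y in C."""
    return lam * F(x_new, y) + (x_new - x_prev + f) * (y - x_new)


grid = [LO + (HI - LO) * k / 200 for k in range(201)]
x = 2.0
xs = [x]
worst = float("inf")
for n in range(1, 31):
    lam, f = 0.1, 0.1 / n
    x_new = ppa_step(x, lam, f)
    worst = min(worst, min(residual(x_new, x, lam, f, y) for y in grid))
    x = x_new
    xs.append(x)

print(xs[:8], worst)
```

In this example the iterates move from $x_0 = 2$ to the equilibrium point $x^* = 1$ (the minimizer of $\varphi$ on $C$), and the smallest residual over the grid stays nonnegative up to rounding.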

7 Concluding Remarks

The asymptotic behavior of the sequence $\{x_n\}$ generated by the proximal point algorithm was investigated under more general assumptions on the parameter and error sequences. It has been shown that the error sequence $\{|f_n|\}$ can go to infinity. The rate of convergence of $\varphi(x_n)$ to the minimum value of $\varphi$ was studied. A result of Güler has been extended without assuming strong convergence of $\{x_n\}$. Finally, we showed that the convergence results for the sequence generated by the proximal point algorithm are applicable to equilibrium problems for monotone bifunctions.

Acknowledgements This research was in part supported by a grant from University of Zanjan (No. 9041). The author would like to thank the referee for valuable comments.

References

1. Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14, 877–898 (1976)
2. Boikanyo, O.A., Morosanu, G.: Modified Rockafellar's algorithms. Math. Sci. Res. J. 13, 101–122 (2009)
3. Brézis, H., Lions, P.L.: Produits infinis de résolvantes. Isr. J. Math. 29, 329–345 (1978)
4. Lions, P.L.: Une méthode itérative de résolution d'une inéquation variationnelle. Isr. J. Math. 31, 204–208 (1978)
5. Morosanu, G.: Nonlinear Evolution Equations and Applications. Mathematics and Its Applications, vol. 26. Reidel, Dordrecht (1988)
6. Djafari Rouhani, B., Khatibzadeh, H.: On the proximal point algorithm. J. Optim. Theory Appl. 137, 411–417 (2008)
7. Güler, O.: On the convergence of the proximal point algorithm for convex minimization. SIAM J. Control Optim. 29, 403–419 (1991)
8. Flam, S.D., Antipin, A.S.: Equilibrium programming using proximal-like algorithms. Math. Program. 78, 29–41 (1997)
9. Moudafi, A.: Proximal point algorithm extended to equilibrium problems. J. Nat. Geom. 15, 91–100 (1999)
10. Moudafi, A.: Second-order differential proximal method for equilibrium problems. J. Inequal. Pure Appl. Math. 4, 1–16 (2003)
11. Moudafi, A., Thera, M.: Proximal and dynamical approaches to equilibrium problems. In: Lecture Notes in Economics and Mathematical Systems, vol. 477, pp. 187–201. Springer, Berlin (1999)
12. Hadjisavvas, N., Khatibzadeh, H.: Maximal monotonicity of bifunctions. Optimization 59, 147–160 (2010)