The proximal point algorithm with genuine superlinear convergence for the monotone complementarity problem

Nobuo Yamashita and Masao Fukushima
Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Kyoto 606-8501, Japan

May 31, 1999

Abstract. In this paper, we consider a proximal point algorithm (PPA) for solving monotone nonlinear complementarity problems (NCP). PPA generates a sequence by solving subproblems that are regularizations of the original problem. It is known that PPA has global and superlinear convergence properties under appropriate criteria for approximate solutions of subproblems. However, it is not always easy to solve the subproblems or to check those criteria. In this paper, we adopt the generalized Newton method proposed by De Luca, Facchinei and Kanzow to solve the subproblems, and some NCP functions to check the criteria. We then show that the PPA converges globally provided that the solution set of the problem is nonempty. Moreover, without assuming local uniqueness of the solution, we show that the rate of convergence is superlinear in a genuine sense, provided that the limit point satisfies the strict complementarity condition.

Key words: Nonlinear complementarity problem, proximal point algorithm, genuine superlinear convergence.

AMS: 47H05, 90C33

1 Introduction

The nonlinear complementarity problem (NCP) [8] is to find a vector $x \in R^n$ such that

NCP$(F)$: $\quad F(x) \ge 0, \quad x \ge 0, \quad \langle x, F(x) \rangle = 0,$

where $F$ is a mapping from $R^n$ into $R^n$ and $\langle \cdot, \cdot \rangle$ denotes the inner product in $R^n$. Throughout this paper we assume that $F$ is continuously differentiable and monotone.

Until now, a variety of methods for solving NCP have been proposed and investigated. Among them, the proximal point algorithm (PPA), proposed by Martinet [7] and further studied by Rockafellar [10], is known for its theoretically nice convergence properties. PPA was originally designed to find a vector $x$ satisfying $0 \in T(x)$, where $T$ is a maximal monotone operator. Hence it is applicable to a wide class of problems such as convex programming problems, monotone variational inequality problems and monotone complementarity problems. In this paper, we

focus on PPA for solving monotone complementarity problems. PPA generates a sequence $\{x^k\}$ by solving subproblems that are regularizations of the original problem. For NCP$(F)$, given the current point $x^k$, PPA obtains the next point $x^{k+1}$ by approximately solving the subproblem

$$F^k(x) \ge 0, \quad x \ge 0, \quad \langle x, F^k(x) \rangle = 0, \qquad (1)$$

where $F^k : R^n \to R^n$ is defined by

$$F^k(x) := F(x) + c_k (x - x^k) \qquad (2)$$

and $c_k > 0$. The mapping $F^k$ is strongly monotone when $F$ is monotone. Hence subproblem (1) is expected to be more tractable than the original problem. With appropriate criteria for approximate solutions of subproblems (1), PPA has global and superlinear convergence properties under mild conditions [6, 10]. However, it is not always easy to check those criteria for general monotone operator problems. In this paper, we will show that, for monotone complementarity problems, some NCP functions turn out to be useful in constructing practical approximation criteria. Another implementation issue is how to solve the subproblems efficiently. In the PPA proposed in this paper, we will use the generalized Newton method proposed by De Luca, Facchinei and Kanzow [2] to solve subproblems (1). Since $F^k$ is strongly monotone, we can show that the approximation criteria for each subproblem are attained finitely. The PPA then converges globally provided that the solution set of NCP$(F)$ is nonempty. Moreover, without assuming local uniqueness of the solution, we can show that the rate of convergence is superlinear. From the practical viewpoint, it is important to estimate the computational cost of solving a subproblem at each iteration. We give conditions under which the approximation criteria for the subproblem are eventually fulfilled by a single Newton iteration.

The paper is organized as follows. In Section 2, we review some concepts and preliminary results that will be used in the subsequent analysis. In Section 3, we describe the proposed PPA for NCP$(F)$. In Section 4 we show its convergence properties.

Throughout we adopt the following notation. For $a \in R$, $(a)_+$ denotes $\max\{0, a\}$, and for $x \in R^n$, $[x]_+$ denotes the projection of $x$ onto $R^n_+$, the nonnegative orthant of $R^n$. For two vectors $x$ and $y$, $\min\{x, y\}$ denotes the vector whose $i$th element is $\min\{x_i, y_i\}$.

2 Preliminaries

In this section, we first review some mathematical concepts and basic properties of PPA that will be used in the subsequent analysis. We then discuss reformulations of NCP and related results concerning error bounds. Finally, we briefly mention the generalized Newton method for NCP proposed in [2], which will be used to solve subproblems in PPA.

2.1 Mathematical concepts

First we recall some definitions concerning the monotonicity of a mapping from $R^n$ into itself.

Definition 2.1. The mapping $F : R^n \to R^n$ is called a

(a) monotone function if

$$\langle x - y, F(x) - F(y) \rangle \ge 0 \quad \text{for all } x, y \in R^n; \qquad (3)$$

(b) strongly monotone function with modulus $\mu > 0$ if

$$\langle x - y, F(x) - F(y) \rangle \ge \mu \|x - y\|^2 \quad \text{for all } x, y \in R^n. \qquad (4)$$

From the definition, it is clear that a strongly monotone function is monotone. Moreover, if $F$ is a differentiable monotone function, then $\nabla F(x)$ is positive semidefinite for all $x \in R^n$.

Definition 2.2. Let $H : R^n \to R^n$ be locally Lipschitz continuous. Then the B-subdifferential of $H$ at $x$ is the set of $n \times n$ matrices defined by

$$\partial_B H(x) := \left\{ \lim_{x^i \to x, \; x^i \in D_H} \nabla H(x^i)^T \right\},$$

where $D_H \subseteq R^n$ is the set of points at which $H$ is differentiable.

Note that the Clarke subdifferential of $H$ is defined by

$$\partial H(x) := \mathrm{co}\, \partial_B H(x),$$

where co denotes the convex hull of a set [1].

Next we recall the notion of semismoothness, which lies in between differentiability and directional differentiability.

Definition 2.3. Let $H : R^n \to R^n$ be locally Lipschitz continuous. We say that $H$ is semismooth at $x$ if

$$\lim_{V \in \partial H(x + t d'), \; d' \to d, \; t \downarrow 0} V d' \qquad (5)$$

exists for all $d$. Moreover, we say that $H$ is strongly semismooth at $x$ if for any $d \to 0$ and for any $V \in \partial H(x + d)$,

$$V d - H'(x; d) = O(\|d\|^2),$$

where $H'(x; d)$ denotes the directional derivative of $H$ at $x$ along direction $d$.

Note that, when $H$ is semismooth at $x$, the limit (5) is equal to the directional derivative $H'(x; d)$.

2.2 Proximal point algorithm

NCP$(F)$ is equivalent to the problem of finding a point $x$ such that

$$0 \in T(x), \qquad (6)$$

where $T : R^n \to 2^{R^n}$ is the maximal monotone mapping defined by

$$T(x) := F(x) + N(x), \qquad (7)$$

with $N : R^n \to 2^{R^n}$ being the normal cone mapping for $R^n_+$ defined by

$$N(x) := \begin{cases} \{ y \in R^n \mid \langle x - z, y \rangle \ge 0, \; \forall z \ge 0 \} & \text{if } x \ge 0, \\ \emptyset & \text{otherwise.} \end{cases}$$

With an arbitrary initial point $x^0$, PPA generates a sequence $\{x^k\}$ converging to a solution of (6) by the iterative scheme

$$x^{k+1} \approx P_k(x^k),$$

where $P_k : R^n \to R^n$ is the mapping defined by $P_k := (I + \frac{1}{c_k} T)^{-1}$, $\{c_k\}$ is a positive sequence, and $x^{k+1} \approx P_k(x^k)$ means that $x^{k+1}$ is an approximation to $P_k(x^k)$. For NCP$(F)$, this procedure amounts to approximately solving the following subproblem NCP$(F^k)$: Find $x \in R^n$ such that

$$F^k(x) \ge 0, \quad x \ge 0, \quad \langle x, F^k(x) \rangle = 0, \qquad (8)$$

where $F^k$ is defined by (2). Note that when $c_k$ is small, the subproblem is close to the original one. On the other hand, when $c_k$ is large, a solution of the subproblem is expected to lie near $x^k$, and hence the subproblem is presumably easy to solve.

To ensure convergence of PPA, $x^{k+1}$ has to be located sufficiently near the solution $P_k(x^k)$ of subproblem (1). A number of criteria for the approximate solution of the subproblem have been proposed. Among others, Rockafellar [10] proposed the following two criteria.

Criterion 1. $\quad \|x^{k+1} - P_k(x^k)\| \le \varepsilon_k, \qquad \sum_{k=0}^{\infty} \varepsilon_k < \infty.$

Criterion 2. $\quad \|x^{k+1} - P_k(x^k)\| \le \delta_k \|x^{k+1} - x^k\|, \qquad \sum_{k=0}^{\infty} \delta_k < \infty.$

Note that Criterion 1 guarantees global convergence, while Criterion 2, which is rather restrictive, ensures superlinear convergence of PPA.

Theorem 2.1 ([10, Theorem 1]) Suppose that the sequence $\{x^k\}$ is generated by PPA with Criterion 1 and that $\{c_k\}$ is bounded. If NCP$(F)$ has at least one solution, then $\{x^k\}$ converges to a solution $x^*$ of NCP$(F)$. □

Note that it is not necessary to let $\{c_k\}$ converge to 0 for global convergence. Therefore, we may keep $F^k$ uniformly strongly monotone, so that subproblems (1) are numerically well-conditioned. On the other hand, if we let $\{c_k\}$ converge to 0, we can expect rapid convergence of PPA. Luque [6, Theorem 2] showed superlinear convergence without assuming local uniqueness of the solution of NCP$(F)$.

Theorem 2.2 ([6, Theorem 2]) Suppose that $\{x^k\}$ is generated by PPA with Criteria 1 and 2 and that $c_k \to 0$. If there exist positive constants $\delta$ and $C$ such that

$$\mathrm{dist}(x, X) \le C \|w\| \quad \text{whenever } x \in T^{-1}(w) \text{ and } \|w\| \le \delta,$$

where $\mathrm{dist}(x, X)$ denotes the distance from point $x$ to the solution set $X$ of NCP$(F)$, then the sequence $\{\mathrm{dist}(x^k, X)\}$ converges to 0 superlinearly. □

2.3 Reformulations of NCP

NCP can be reformulated as a system of equations in various ways. In this subsection, we review basic properties of two reformulations of NCP that will play a crucial role in solving subproblems of PPA. In the remainder of this section, we deal with the problem NCP$(\hat{F})$, where $\hat{F} : R^n \to R^n$ is a certain mapping.

First we consider the function $\phi_{FB} : R^2 \to R$ defined by

$$\phi_{FB}(a, b) := a + b - \sqrt{a^2 + b^2}. \qquad (9)$$

This function is called the Fischer-Burmeister function and has the following property:

$$\phi_{FB}(a, b) = 0 \iff a \ge 0, \; b \ge 0, \; ab = 0.$$

Any function with this property is often called an NCP function. Using the function $\phi_{FB}$, we define the mapping $H : R^n \to R^n$ by

$$H(x) := \begin{pmatrix} \phi_{FB}(x_1, \hat{F}_1(x)) \\ \vdots \\ \phi_{FB}(x_n, \hat{F}_n(x)) \end{pmatrix}. \qquad (10)$$

Then it is straightforward to see that NCP$(\hat{F})$ is equivalent to the system of equations

$$H(x) = 0. \qquad (11)$$

The mapping $H$ is not differentiable at a point $x$ such that $x_i = \hat{F}_i(x) = 0$ for some $i$. However, when $\hat{F}$ is continuously differentiable, $H$ is locally Lipschitz, and hence it has the B-subdifferential everywhere. Though it is not necessarily easy to calculate the B-subdifferential of a general locally Lipschitz mapping, De Luca et al. [2] show that, for the mapping $H$, an element $V$ of $\partial_B H(x)$ is expressed as

$$V = D_a + \nabla \hat{F}(x)^T D_b, \qquad (12)$$

where $D_a$, $D_b$ are diagonal matrices defined by

$$((D_a)_{ii}, (D_b)_{ii}) = \begin{cases} \left( 1 - \dfrac{x_i}{\sqrt{x_i^2 + \hat{F}_i(x)^2}}, \; 1 - \dfrac{\hat{F}_i(x)}{\sqrt{x_i^2 + \hat{F}_i(x)^2}} \right) & \text{if } (x_i, \hat{F}_i(x)) \ne (0, 0), \\ (1 - \xi, \; 1 - \zeta) & \text{otherwise,} \end{cases} \qquad (13)$$

where $(\xi, \zeta)$ is a vector satisfying $\xi^2 + \zeta^2 = 1$. De Luca et al. [2] also discuss how to calculate $(\xi, \zeta)$ when $(x_i, \hat{F}_i(x)) = (0, 0)$.

The next proposition will be useful in the analysis of the generalized Newton method for solving subproblems of PPA.

Proposition 2.1 Let $M$ be a positive definite matrix and $\mu$ be a positive constant such that

$$\langle v, Mv \rangle \ge \mu \|v\|^2, \quad \forall v \in R^n. \qquad (14)$$

Let $D_a = \mathrm{diag}(a_i)$ and $D_b = \mathrm{diag}(b_i)$ be diagonal matrices whose diagonal elements are nonnegative and satisfy $a_i + b_i \ge \bar{d}$ for all $i$, where $\bar{d}$ is a positive constant. Then we have

$$\inf_{\|v\| = 1} \|(D_a + M^T D_b) v\| \ge \mu \bar{B},$$

where $\bar{B} = \bar{d} / (n \max\{1, \|M\|\})$. Moreover, the following inequality holds:

$$\|(D_a + M^T D_b)^{-1}\| \le \frac{1}{\bar{B} \mu}.$$

Proof. Since any square matrix satisfies

$$\inf_{\|v\|=1} \|A v\| = \inf_{\|v\|=1} \|A^T v\|,$$

to prove the first part of the proposition it suffices to show that

$$\inf_{\|v\|=1} \|(D_a + D_b M) v\| \ge \mu \bar{B}.$$

Let $v$ be an arbitrary vector such that $\|v\| = 1$. Then, since $\langle v, Mv \rangle \ge \mu$ holds by (14), there exists an index $i$ such that

$$v_i (Mv)_i \ge \frac{\mu}{n}. \qquad (15)$$

Since $v_i (Mv)_i \le |v_i| \|M\|$, it follows from (15) that

$$|v_i| \ge \frac{\mu}{n \|M\|}. \qquad (16)$$

Moreover, (15) implies that $v_i$ has the same sign as $(Mv)_i$. Hence, by (15), (16) and $|v_i| \le 1$, we have

$$|((D_a + D_b M) v)_i| = a_i |v_i| + b_i |(Mv)_i| \ge \frac{\mu}{n \|M\|} a_i + \frac{\mu}{n |v_i|} b_i \ge \frac{\mu}{n \|M\|} a_i + \frac{\mu}{n} b_i \ge \frac{\mu}{n \max\{1, \|M\|\}} (a_i + b_i) \ge \mu \bar{B}.$$

Consequently, we have

$$\|(D_a + D_b M) v\| \ge \mu \bar{B}.$$

Next we show the last part of the proposition. Note that, under the given assumptions, $D_a + M^T D_b$ is nonsingular [2, Lemma 5.1]. Since

$$\|(D_a + M^T D_b)^{-1}\| = \frac{1}{\inf_{\|v\|=1} \|(D_a + M^T D_b) v\|},$$

it follows that

$$\|(D_a + M^T D_b)^{-1}\| \le \frac{1}{\bar{B} \mu}. \qquad \Box$$

As a direct consequence of this proposition, we have the following corollary.

Corollary 2.1 Suppose that $\hat{F}$ is strongly monotone with modulus $\mu$. Let $D_a$ and $D_b$ be defined by (13). Then we have

$$\|(D_a + \nabla \hat{F}(x)^T D_b)^{-1}\| \le \frac{1}{B_1 \mu},$$

where $B_1 = (2 - \sqrt{2}) / (n \max\{1, \|\nabla \hat{F}(x)\|\})$. □

Now we define the function $\Psi_{FB} : R^n \to R$ by

$$\Psi_{FB}(x) := \frac{1}{2} \|H(x)\|^2, \qquad (17)$$

where $H$ is given by (10). We note that $\Psi_{FB}$ attains its global minimum at a solution of NCP$(\hat{F})$, because NCP$(\hat{F})$ is equivalent to (11).
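As a quick numerical sanity check on these definitions (a sketch, not part of the paper; the function names are illustrative), one can verify that $\phi_{FB}$ vanishes exactly on the complementarity set and that $\Psi_{FB}$ vanishes at a solution:

```python
import numpy as np

def phi_fb(a, b):
    """Fischer-Burmeister function (9): phi_FB(a, b) = a + b - sqrt(a^2 + b^2)."""
    return a + b - np.sqrt(a * a + b * b)

def psi_fb(x, Fx):
    """Merit function (17): Psi_FB(x) = 0.5 * ||H(x)||^2 with H from (10)."""
    H = phi_fb(x, Fx)
    return 0.5 * np.dot(H, H)

# phi_FB vanishes exactly on {a >= 0, b >= 0, ab = 0}:
assert phi_fb(0.0, 3.0) == 0.0 and phi_fb(2.0, 0.0) == 0.0
assert phi_fb(1.0, 1.0) > 0.0       # feasible but not complementary
assert phi_fb(-1.0, 2.0) < 0.0      # infeasible

# For the affine map F_hat(x) = x + q the NCP solution is x_hat = max(-q, 0),
# and Psi_FB attains its global minimum 0 there:
q = np.array([1.0, -2.0])
x_hat = np.maximum(-q, 0.0)
assert psi_fb(x_hat, x_hat + q) == 0.0
```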

Lemma 2.1 The mapping $H : R^n \to R^n$ defined by (10) has the following properties:

(a) If $\hat{F}$ is differentiable, then $H$ is semismooth.

(b) If $\nabla \hat{F}$ is locally Lipschitz continuous, then $H$ is strongly semismooth.

(c) If $\nabla \hat{F}(x)$ is positive definite, then every $V \in \partial_B H(x)$ is nonsingular.

Proof. Items (a) and (c) are shown in [3]. Item (b) is shown in [11]. □

Lemma 2.2 The function $\Psi_{FB} : R^n \to R$ defined by (17) has the following properties:

(a) If $\hat{F}$ is differentiable, then $\Psi_{FB}$ is differentiable.

(b) If $\hat{F}$ is monotone, then any stationary point of $\Psi_{FB}$ is a solution of NCP$(\hat{F})$.

(c) If $\hat{F}$ is strongly monotone with modulus $\mu$ and Lipschitz continuous with constant $L$, then $\sqrt{\Psi_{FB}(x)}$ provides a global error bound for NCP$(\hat{F})$, that is,

$$\|x - \hat{x}\| \le \frac{B_2 (L + 1)}{\mu} \sqrt{\Psi_{FB}(x)} \quad \text{for all } x \in R^n,$$

where $\hat{x}$ is the unique solution of NCP$(\hat{F})$ and $B_2$ is a positive constant independent of $\hat{F}$.

(d) If $\hat{F}$ is affine and NCP$(\hat{F})$ has a solution, then $\sqrt{\Psi_{FB}(x)}$ provides a local error bound for NCP$(\hat{F})$, that is, there exist positive constants $B_3$ and $B_4$ such that

$$\mathrm{dist}(x, \hat{X}) \le B_3 \sqrt{\Psi_{FB}(x)} \quad \text{for all } x \in \{ y \in R^n \mid \Psi_{FB}(y) \le B_4 \},$$

where $\hat{X}$ denotes the solution set of NCP$(\hat{F})$.

Proof. Items (a) and (b) are shown in [5]. Items (c) and (d) are shown in [4]. □

In the PPA to be presented in the next section, we will also utilize the following function $\Psi : R^n \to R$, which has a more favorable error bound property than $\Psi_{FB}$:

$$\Psi(x) := \sum_{i=1}^{n} \psi(x_i, \hat{F}_i(x)),$$

where $\psi : R^2 \to R$ is defined by

$$\psi(a, b) := |ab| + |\min\{a, b\}|.$$

Note that $\psi$ is also an NCP function. It is clear that $\Psi(x) \ge 0$ for all $x$, and $\Psi(x) = 0$ if and only if $x$ is a solution of NCP$(\hat{F})$. The next lemma shows an interesting error bound result for the function $\Psi$, which will play an important role in Section 4. Note that this error bound is valid only on the set $R^n_+$.

Lemma 2.3 Suppose that $\hat{F}$ is strongly monotone with modulus $\mu$. Then we have

$$\|x - \hat{x}\| \le 2 \sqrt{\frac{\max\{1, \|x\|\} \Psi(x)}{\mu}} \quad \text{for all } x \in \left\{ y \in R^n_+ \,\Big|\, \Psi(y) \le \frac{\mu}{4} \right\},$$

where $\hat{x}$ is the unique solution of NCP$(\hat{F})$.

Proof. Let $x \in R^n_+$ be arbitrary. Since $\hat{F}$ is strongly monotone with modulus $\mu$, we have

$$\mu \|x - \hat{x}\|^2 \le \langle x - \hat{x}, \hat{F}(x) - \hat{F}(\hat{x}) \rangle = \langle x, \hat{F}(x) \rangle + \langle \hat{x}, -\hat{F}(x) \rangle + \langle \hat{F}(\hat{x}), -x \rangle \le \sum_{i=1}^{n} |x_i \hat{F}_i(x)| + \sum_{i=1}^{n} |\hat{x}_i| |(-\hat{F}_i(x))_+| + \sum_{i=1}^{n} |\hat{F}_i(\hat{x})| |(-x_i)_+| = \sum_{i=1}^{n} |x_i \hat{F}_i(x)| + \sum_{i=1}^{n} |\hat{x}_i| |(-\hat{F}_i(x))_+|,$$

where the last equality holds because $x \ge 0$. Since

$$(-b)_+ \le |\min\{a, b\}| \quad \text{for all } (a, b) \in R^2,$$

it follows that

$$\mu \|x - \hat{x}\|^2 \le \sum_{i=1}^{n} |x_i \hat{F}_i(x)| + \sum_{i=1}^{n} |\hat{x}_i| |(-\hat{F}_i(x))_+| \le \sum_{i=1}^{n} \left\{ |x_i \hat{F}_i(x)| + |\hat{x}_i| |\min\{x_i, \hat{F}_i(x)\}| \right\} \le \max\{1, \|\hat{x}\|_\infty\} \Psi(x) \le \max\{1, \|\hat{x}\|\} \Psi(x).$$

Hence we have

$$\|x - \hat{x}\| \le \sqrt{\frac{\max\{1, \|\hat{x}\|\} \Psi(x)}{\mu}} \le \sqrt{\frac{\max\{1, \|\hat{x} - x\| + \|x\|\} \Psi(x)}{\mu}}.$$

Therefore, if $\|\hat{x} - x\| + \|x\| \le 1$, then the desired inequality holds. If $\|\hat{x} - x\| + \|x\| > 1$, then

$$\left( 1 - \sqrt{\frac{\Psi(x)}{\mu}} \right) \|x - \hat{x}\| \le \sqrt{\frac{\|x\| \Psi(x)}{\mu}}.$$

Since $1 - \sqrt{\Psi(x)/\mu} \ge \frac{1}{2}$ whenever $\Psi(x) \le \frac{\mu}{4}$, we also have the desired inequality. □

We note that, unlike Lemma 2.2 (c), Lemma 2.3 does not assume the Lipschitz continuity of $\hat{F}$. Moreover, unlike Lemma 2.2 (d), the error bound result shown in Lemma 2.3 is explicitly represented in terms of the modulus of strong monotonicity of $\hat{F}$.

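As a small numerical illustration of Lemma 2.3 (a sketch, not from the paper), the bound can be checked for the affine mapping $\hat{F}(x) = x + q$, which is strongly monotone with modulus $\mu = 1$ and whose NCP solution is $\hat{x} = [-q]_+$:

```python
import numpy as np

# Illustration of the error bound of Lemma 2.3 for F_hat(x) = x + q,
# strongly monotone with modulus mu = 1; the NCP solution is x_hat = max(-q, 0).
def Psi(x, Fx):
    """Psi(x) = sum_i ( |x_i F_i(x)| + |min(x_i, F_i(x))| )."""
    return np.sum(np.abs(x * Fx) + np.abs(np.minimum(x, Fx)))

mu = 1.0
q = np.array([1.0, -1.0])
x_hat = np.maximum(-q, 0.0)        # solution: [0, 1]
x = np.array([0.01, 1.01])         # a nearby point in the nonnegative orthant
Fx = x + q

assert Psi(x, Fx) <= mu / 4        # x lies in the region where the bound applies
lhs = np.linalg.norm(x - x_hat)
rhs = 2 * np.sqrt(max(1.0, np.linalg.norm(x)) * Psi(x, Fx) / mu)
assert lhs <= rhs                  # the error bound of Lemma 2.3 holds
```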

2.4 Generalized Newton method

In this section, we review the generalized Newton method for solving NCP proposed by De Luca, Facchinei and Kanzow [2]. The PPA to be presented in the next section will use this method to solve subproblems.

Procedure 1. (Generalized Newton method for NCP$(\hat{F})$)

Step 1: Choose a constant $\sigma \in (0, \frac{1}{2})$. Let $x^0$ be an initial point and set $j := 0$.

Step 2: If $x^j$ satisfies a termination criterion, then stop.

Step 3: Choose $V_j \in \partial_B H(x^j)$ and get $d^j$ satisfying

$$V_j d^j = -H(x^j). \qquad (18)$$

Step 4: If $x^j + d^j$ satisfies the termination criterion, then stop. Otherwise, find the smallest nonnegative integer $i_j$ such that

$$\Psi_{FB}(x^j + 2^{-i_j} d^j) \le (1 - \sigma 2^{-i_j}) \Psi_{FB}(x^j).$$

Step 5: Set $x^{j+1} := x^j + 2^{-i_j} d^j$ and $j := j + 1$, and go to Step 2.

Note that Procedure 1 is a slight simplification of the algorithm in [2]. Within the framework of the present paper, however, there is essentially no difference between them, because we only consider the case where $\hat{F}$ is strongly monotone. For Procedure 1 with the termination criterion ignored, the following convergence result holds.

Proposition 2.2 [2] Suppose that $\hat{F}$ is differentiable and strongly monotone and that $\nabla \hat{F}$ is Lipschitz continuous around the unique solution $\hat{x}$ of NCP$(\hat{F})$. Then Procedure 1 globally converges to $\hat{x}$ and the rate of convergence is quadratic. □

Since the mappings $F^k$ involved in the subproblems generated by PPA are strongly monotone, Procedure 1 can be applied to these problems effectively.
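To make Procedure 1 concrete, the following Python sketch implements the generalized Newton iteration for a given mapping with known Jacobian. The routine and its parameter names are illustrative assumptions, as is the particular choice $(\xi, \zeta) = (1/\sqrt{2}, 1/\sqrt{2})$ at degenerate points, which is one admissible choice in (13):

```python
import numpy as np

def fb(a, b):
    """Fischer-Burmeister function phi_FB(a, b) = a + b - sqrt(a^2 + b^2)."""
    return a + b - np.sqrt(a * a + b * b)

def newton_ncp(F, JF, x0, tol=1e-10, sigma=0.25, max_iter=100):
    """Sketch of Procedure 1 (generalized Newton method) for NCP(F).
    F: R^n -> R^n; JF(x): its Jacobian matrix [dF_i/dx_j]."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        H = fb(x, Fx)                       # residual H(x) of (10)
        psi_x = 0.5 * np.dot(H, H)          # merit function Psi_FB, cf. (17)
        if np.sqrt(2.0 * psi_x) <= tol:     # termination criterion ||H(x)|| <= tol
            break
        r = np.sqrt(x**2 + Fx**2)
        deg = (r == 0.0)                    # degenerate indices (x_i, F_i(x)) = (0, 0)
        r = np.where(deg, 1.0, r)
        da = np.where(deg, 1.0 - 1.0 / np.sqrt(2.0), 1.0 - x / r)   # (D_a)_ii of (13)
        db = np.where(deg, 1.0 - 1.0 / np.sqrt(2.0), 1.0 - Fx / r)  # (D_b)_ii of (13)
        V = np.diag(da) + db[:, None] * JF(x)   # B-subdifferential element, cf. (12)
        d = np.linalg.solve(V, -H)              # Newton equation (18)
        t = 1.0
        for _ in range(30):                     # Armijo-type halving of Step 4
            if 0.5 * np.sum(fb(x + t * d, F(x + t * d))**2) <= (1.0 - sigma * t) * psi_x:
                break
            t *= 0.5
        x = x + t * d
    return x
```

Applied to a strongly monotone affine mapping such as $\hat{F}(x) = x + q$, the sketch recovers the unique solution rapidly, in line with the quadratic rate asserted by Proposition 2.2.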

3 Algorithm and its convergence properties

In this section we describe PPA for NCP$(F)$ and study its convergence properties.

Algorithm 1

Step 1: Choose parameters $\beta \in (0, 1)$, $c_0 \in (0, 1)$ and an initial point $x^0 \in R^n$. Set $k := 0$.

Step 2: If $x^k$ satisfies $\Psi_{FB}(x^k) = 0$, then stop.

Step 3: Let $F^k : R^n \to R^n$ be defined by (2), and apply Procedure 1 to obtain an approximate solution $\tilde{x}^{k+1}$ of NCP$(F^k)$ that satisfies the conditions

$$\Psi^k([\tilde{x}^{k+1}]_+) \le \frac{c_k^3}{4 \max\{1, \|[\tilde{x}^{k+1}]_+\|\}^2} \qquad (19)$$

and

$$\sqrt{\Psi_{FB}^k(\tilde{x}^{k+1})} \le c_k^4 \|x^k - \tilde{x}^{k+1}\|, \qquad (20)$$

where

$$\Psi^k(x) := \sum_{i=1}^{n} \psi(x_i, F_i^k(x))$$

and

$$\Psi_{FB}^k(x) := \sum_{i=1}^{n} \phi_{FB}(x_i, F_i^k(x))^2.$$

Step 4: Set $x^{k+1} := [\tilde{x}^{k+1}]_+$, $c_{k+1} := \beta c_k$, and $k := k + 1$. Go to Step 2.

Remark 3.1 The condition (19) in Step 3 corresponds to Criterion 1 of PPA, while the condition (20) corresponds to Criterion 2.

Remark 3.2 If $x^k$ is a solution of NCP$(F)$, then the algorithm stops at Step 2. Otherwise, since $F^k(x^k) = F(x^k)$, $x^k$ is not a solution of NCP$(F^k)$ at Step 3. Moreover, since $F^k$ is strongly monotone, Proposition 2.2 ensures that Procedure 1 finds $\tilde{x}^{k+1}$ satisfying (19) and (20) in finitely many iterations.
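The outer loop of Algorithm 1 can be sketched as follows (a simplified illustration, not the paper's implementation: a basic projection iteration stands in for Procedure 1 as the subproblem solver, and the subproblem is simply solved fairly accurately instead of checking criteria (19)-(20) explicitly):

```python
import numpy as np

def psi(a, b):
    """NCP function psi(a, b) = |ab| + |min(a, b)| used in criterion (19)."""
    return np.abs(a * b) + np.abs(np.minimum(a, b))

def solve_subproblem(Fk, x, alpha=0.2, iters=2000):
    """Approximately solve the strongly monotone NCP(F^k) by the projection
    iteration x <- [x - alpha * F^k(x)]_+ (a stand-in for Procedure 1)."""
    for _ in range(iters):
        x = np.maximum(x - alpha * Fk(x), 0.0)
    return x

def ppa_ncp(F, x0, beta=0.5, c0=0.5, tol=1e-8, max_outer=60):
    """Sketch of Algorithm 1: regularize, solve, project, shrink c_k."""
    x, c = np.asarray(x0, dtype=float), c0
    for _ in range(max_outer):
        if np.sum(psi(x, F(x))) <= tol:        # Psi(x) = 0 iff x solves NCP(F)
            return x
        xk = x.copy()
        Fk = lambda y: F(y) + c * (y - xk)     # regularized mapping (2)
        x = np.maximum(solve_subproblem(Fk, xk), 0.0)   # x^{k+1} := [x~^{k+1}]_+
        c *= beta                              # c_{k+1} := beta * c_k
    return x
```

For the strongly monotone affine mapping $F(x) = x + q$, the iterates shrink the distance to the solution by a factor roughly $c_k/(1 + c_k)$ per outer step, so the error decays superlinearly as $c_k \to 0$, mirroring the behavior established in Section 3.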

First we show that Algorithm 1 has a global convergence property.

Theorem 3.1 Suppose that NCP$(F)$ has at least one solution. Then the sequence $\{x^k\}$ generated by Algorithm 1 converges to a solution $x^*$ of NCP$(F)$.

Proof. It suffices to show that $\{x^k\}$ satisfies the assumption of Theorem 2.1, that is, $\{x^k\}$ satisfies Criterion 1. Since $x^{k+1} = [\tilde{x}^{k+1}]_+$ in Step 4 and $0 < c_k < 1$, we have, by (19) in Step 3,

$$\Psi^k(x^{k+1}) \le \frac{c_k^3}{4 \max\{1, \|x^{k+1}\|\}^2} \qquad (21)$$

$$\le \frac{c_k^3}{4}. \qquad (22)$$

Since $F^k$ is strongly monotone with modulus $c_k$, it then follows from Lemma 2.3 and (22) that

$$\|x^{k+1} - P_k(x^k)\| \le 2 \sqrt{\frac{\max\{1, \|x^{k+1}\|\} \Psi^k(x^{k+1})}{c_k}}, \qquad (23)$$

where $P_k(x^k)$ is the unique solution of NCP$(F^k)$. By (21) and (23), we have

$$\|x^{k+1} - P_k(x^k)\| \le c_k.$$

Since $\sum_{k=1}^{\infty} c_k < \infty$, it follows from Theorem 2.1 that $\{x^k\}$ converges to a solution of NCP$(F)$. □

Next we give conditions for Algorithm 1 to converge superlinearly. For this purpose, we first show that the inverse of the maximal monotone operator $T$ defined by (7) is locally Lipschitz near the solution set of NCP$(F)$ under the following assumption:

Assumption 1: $\|\min\{x, F(x)\}\|$ provides a local error bound for NCP$(F)$.

Note that when $F$ is affine, Assumption 1 holds by Lemma 2.2 (d). On the other hand, when $\nabla F(x)$ is positive definite at any solution $x$ of NCP$(F)$, Assumption 1 holds by Lemma 2.1 (c) and [9, Proposition 3].

Proposition 3.1 Let $T$ be the maximal monotone mapping defined by (7). If Assumption 1 holds and the solution set $X$ of NCP$(F)$ is nonempty, then there exist positive constants $C$ and $\delta$ such that

$$\mathrm{dist}(x, X) \le C \|w\| \quad \forall x \in T^{-1}(w), \; \forall w \text{ with } \|w\| \le \delta.$$

Proof. The mapping $T$ defined by (7) is expressed as

$$T(x) = T_1(x) \times \cdots \times T_n(x),$$

where $T_i(x) \subseteq R$ is given by

$$T_i(x) = \begin{cases} \{ F_i(x) \} & \text{if } x_i > 0, \\ \{ F_i(x) + v_i \mid v_i \in (-\infty, 0] \} & \text{if } x_i = 0, \\ \emptyset & \text{otherwise,} \end{cases}$$

for $i = 1, \dots, n$. Consider a pair $(x, w)$ such that $w \in T(x)$. If $x_i > 0$, we have

$$|w_i| = |F_i(x)| \ge |\min\{x_i, F_i(x)\}|.$$

If $x_i = 0$ and $F_i(x) > 0$, it is clear that

$$|w_i| \ge 0 = |\min\{x_i, F_i(x)\}|.$$

If $x_i = 0$ and $F_i(x) \le 0$, then there exists $v_i \le 0$ such that $w_i = F_i(x) + v_i$. Hence we have

$$|w_i| = |F_i(x) + v_i| \ge |F_i(x)| = |\min\{x_i, F_i(x)\}|.$$

Consequently we have

$$\|w\| \ge \|\min\{x, F(x)\}\|.$$

It then follows from Assumption 1 that the desired property holds. □

By using Proposition 3.1, we show that Algorithm 1 has a superlinear rate of convergence.

Theorem 3.2 Suppose that Assumption 1 holds. Let $\{x^k\}$ be generated by Algorithm 1. Then the sequence $\{\mathrm{dist}(x^k, X)\}$ converges to 0 superlinearly.

Proof. By Theorem 3.1, $\{x^k\}$ is bounded. Hence we may suppose that $F^k$ is uniformly Lipschitz continuous with modulus $L$ on a bounded set containing $\{x^k\}$. It then follows from Lemma 2.2 (c) that there exists a positive constant $B_2$ such that

$$\|\tilde{x}^{k+1} - P_k(x^k)\| \le \frac{B_2 (L + 1)}{c_k} \sqrt{\Psi_{FB}^k(\tilde{x}^{k+1})}.$$

Hence by (20) in Step 3, we have

$$\|\tilde{x}^{k+1} - P_k(x^k)\| \le B_2 (L + 1) c_k^3 \|x^k - \tilde{x}^{k+1}\|.$$

Since $\sum_{k=1}^{\infty} c_k^3 < \infty$, the last inequality implies that $\{\tilde{x}^k\}$ satisfies Criterion 2. Therefore, by Proposition 3.1 and [6, Theorem 2.1], there exists a constant $C > 0$ such that for sufficiently large $k$

$$\mathrm{dist}(\tilde{x}^{k+1}, X) \le \frac{C}{(C^2 + (1/c_k)^2)^{1/2}} \, \mathrm{dist}(x^k, X).$$

Noting that $\mathrm{dist}(x^{k+1}, X) \le \mathrm{dist}(\tilde{x}^{k+1}, X)$, we then have

$$\mathrm{dist}(x^{k+1}, X) \le \frac{C}{(C^2 + (1/c_k)^2)^{1/2}} \, \mathrm{dist}(x^k, X).$$

Since $c_k \to 0$, $\{\mathrm{dist}(x^k, X)\}$ converges to 0 superlinearly. □

Theorem 3.2 says that the sequence $\{x^k\}$ generated by Algorithm 1 converges to the solution set $X$ superlinearly under mild conditions. However, this does not necessarily mean that Algorithm 1 is practically efficient, because it says nothing about the computational cost of solving a subproblem at each iteration. So it is important to estimate the number of iterations Procedure 1 spends at each iteration of Algorithm 1. Moreover, it is particularly interesting to see under what conditions Procedure 1 requires just a single iteration. In the next section, we answer this question.

4 Genuine superlinear convergence

In this section we give conditions under which a single Newton step of Procedure 1 for NCP$(F^k)$ attains (19) and (20) in Step 3 of Algorithm 1, so that genuine superlinear convergence of Algorithm 1 is ensured. First we show that (19) is implied by (20) for sufficiently large $k$.

Lemma 4.1 When $k$ is sufficiently large, if

$$\sqrt{\Psi_{FB}^k(\tilde{x}^{k+1})} \le c_k^4 \|x^k - \tilde{x}^{k+1}\|$$

holds, then

$$\Psi^k([\tilde{x}^{k+1}]_+) \le \frac{c_k^3}{4 \max\{1, \|[\tilde{x}^{k+1}]_+\|\}^2}$$

also holds.

Proof. Since $\Psi^k$ is uniformly locally Lipschitz continuous, there exists $L > 0$ such that

$$\Psi^k([\tilde{x}^{k+1}]_+) \le L \|[\tilde{x}^{k+1}]_+ - P_k(x^k)\| \le L \|\tilde{x}^{k+1} - P_k(x^k)\| \qquad (24)$$

for all $k$ sufficiently large. Moreover, since $\sqrt{\Psi_{FB}^k(x)}$ provides a global error bound for NCP$(F^k)$ by Lemma 2.2 (c), there exists $\tau > 0$ such that

$$\|\tilde{x}^{k+1} - P_k(x^k)\| \le \frac{\tau}{c_k} \sqrt{\Psi_{FB}^k(\tilde{x}^{k+1})}. \qquad (25)$$

It then follows from (20), (24) and (25) that there exists a positive constant $\tau'$ such that

$$\Psi^k([\tilde{x}^{k+1}]_+) \le \tau' c_k^3 \|\tilde{x}^{k+1} - x^k\|.$$

Since $\{\|\tilde{x}^{k+1} - x^k\|\}$ converges to 0 and since $\{\|[\tilde{x}^{k+1}]_+\|\}$ is bounded, (19) holds for sufficiently large $k$. □

This lemma says that (20) implies (19) for all $k$ sufficiently large. Therefore, in the remainder of this section, we only consider conditions under which (20) is satisfied after a single Newton step for NCP$(F^k)$.

The following lemma indicates the relation between $\|x^k - P_k(x^k)\|$ and $\mathrm{dist}(x^k, X)$.

Lemma 4.2 For sufficiently large $k$, there exists a constant $B_5 > 0$ such that

$$\|x^k - P_k(x^k)\| \le \frac{B_5}{c_k} \, \mathrm{dist}(x^k, X).$$

Proof. Let $\bar{x}^k$ be the nearest point in $X$ from $x^k$. Since $\{x^k\}$ is bounded, so is $\{\bar{x}^k\}$. Thus the function $\sqrt{\Psi_{FB}(x)}$ is Lipschitz continuous on a bounded set containing $\{x^k\}$ and $\{\bar{x}^k\}$, and the mappings $F^k$ are uniformly Lipschitz continuous on the same set. Let $L_1 > 0$ be a Lipschitz constant of $\sqrt{\Psi_{FB}(x)}$ and $L_2 > 0$ a uniform Lipschitz constant of the $F^k$. Then, since $F^k(x^k) = F(x^k)$ and $\Psi_{FB}(\bar{x}^k) = 0$, we have

$$\sqrt{\Psi_{FB}^k(x^k)} = \sqrt{\Psi_{FB}(x^k)} - \sqrt{\Psi_{FB}(\bar{x}^k)} \le L_1 \|x^k - \bar{x}^k\| = L_1 \, \mathrm{dist}(x^k, X).$$

It follows from Lemma 2.2 (c) that

$$\|x^k - P_k(x^k)\| \le \frac{B_2 (L_2 + 1)}{c_k} \sqrt{\Psi_{FB}^k(x^k)}.$$

Combining the above inequalities and letting $B_5 = B_2 (L_2 + 1) L_1$ yields the desired inequality. □

Next we assume that strict complementarity is satisfied at the limit point of the generated sequence. The assumption ensures the twice differentiability of $H$.

Assumption 2:

(a) The limit point $x^*$ of the sequence $\{x^k\}$ generated by Algorithm 1 is nondegenerate, that is, $x_i^* + F_i(x^*) > 0$ holds for all $i$.

(b) $F$ is twice continuously differentiable.

For the sake of convenience, we define the mapping $H^k : R^n \to R^n$ by

$$H^k(x) := \begin{pmatrix} \phi_{FB}(x_1, F_1^k(x)) \\ \vdots \\ \phi_{FB}(x_n, F_n^k(x)) \end{pmatrix}.$$

Lemma 4.3 Suppose that Assumptions 1 and 2 hold. Then $H^k$ is twice continuously differentiable in a neighborhood of $x^k$ for sufficiently large $k$, and there exists a positive constant $B_6$ such that

$$\|\nabla H^k(x^k)^T (x^k - P_k(x^k)) - H^k(x^k) + H^k(P_k(x^k))\| \le B_6 \|x^k - P_k(x^k)\|^2.$$

Proof. Let $\bar{F} : R^{2n+1} \to R^n$ and $\bar{H} : R^{2n+1} \to R^n$ be defined by

$$\bar{F}(x, y, c) := F(x) + c (x - y),$$

$$\bar{H}(x, y, c) := \begin{pmatrix} \phi_{FB}(x_1, \bar{F}_1(x, y, c)) \\ \vdots \\ \phi_{FB}(x_n, \bar{F}_n(x, y, c)) \end{pmatrix},$$

respectively. Suppose that $x^*$ is the limit point of the sequence $\{x^k\}$. Then, by Assumption 2, $\bar{H}$ is twice continuously differentiable in a neighborhood $N$ of $(x^*, x^*, 0)$. Hence, there exists a positive constant $B_6$ such that

$$\left\| \left( \nabla_x \bar{H}(x, y, c)^T \;\; \nabla_y \bar{H}(x, y, c)^T \;\; \nabla_c \bar{H}(x, y, c)^T \right) \begin{pmatrix} x - x' \\ y - y' \\ c - c' \end{pmatrix} - \bar{H}(x, y, c) + \bar{H}(x', y', c') \right\| \le B_6 \left( \|x - x'\|^2 + \|y - y'\|^2 + \|c - c'\|^2 \right), \quad \forall (x, y, c), (x', y', c') \in N. \qquad (26)$$

We also note that $H^k$ is twice continuously differentiable near $x^k$ when $k$ is sufficiently large. Since $\nabla_x \bar{H}(x, x^k, c_k) = \nabla H^k(x)$ and since $(x^k, x^k, c_k), (P_k(x^k), x^k, c_k) \in N$ for sufficiently large $k$, substituting $(x, y, c) = (x^k, x^k, c_k)$ and $(x', y', c') = (P_k(x^k), x^k, c_k)$ into (26) yields the desired inequality. □

Now let us denote

$$x_N^k := x^k - V_k^{-1} H^k(x^k), \quad V_k \in \partial_B H^k(x^k). \qquad (27)$$

Note that $x_N^k$ is the point produced by a single Newton iteration of Procedure 1 for NCP$(F^k)$ with the initial point $x^k$. By using Corollary 2.1 and Lemma 4.3, we can show the following key lemma.

Lemma 4.4 Suppose that Assumptions 1 and 2 hold. Then there exists $B_7 > 0$ such that

$$\|x_N^k - P_k(x^k)\| \le \frac{B_7 \|x^k - P_k(x^k)\|^2}{c_k}$$

for sufficiently large $k$.

Proof. First note that, by Lemma 4.3, $\nabla H^k(x^k)$ exists, and hence $V_k = \nabla H^k(x^k)$, for all $k$ sufficiently large. By (12), $\nabla H^k(x^k)$ is expressed as $\nabla H^k(x^k) = D_a + \nabla F^k(x^k)^T D_b$. Moreover, $\{\|\nabla F^k(x^k)\|\}$ is bounded. Therefore, by Corollary 2.1 and Lemma 4.3, there exist $B_1 > 0$ and $B_6 > 0$ such that

$$\|x_N^k - P_k(x^k)\| = \|x^k - P_k(x^k) - \nabla H^k(x^k)^{-1} (H^k(x^k) - H^k(P_k(x^k)))\| \le \|\nabla H^k(x^k)^{-1}\| \, \|\nabla H^k(x^k)(x^k - P_k(x^k)) - H^k(x^k) + H^k(P_k(x^k))\| \le \frac{B_6 \|x^k - P_k(x^k)\|^2}{B_1 c_k}$$

for sufficiently large $k$. Consequently, letting $B_7 = B_6 / B_1$ proves the lemma. □

Now we are in a position to establish the main result of this section.

Theorem 4.1 Suppose that Assumptions 1 and 2 hold. Let $x_N^k$ be given by (27). Then for sufficiently large $k$, $x_N^k$ satisfies the condition (20) in Step 3 of Algorithm 1, that is,

$$\sqrt{\Psi_{FB}^k(x_N^k)} \le c_k^4 \|x_N^k - x^k\|.$$

Proof. Let $\alpha > 0$ be arbitrary. Since $\{\mathrm{dist}(x^k, X)\}$ converges to 0 superlinearly by Theorem 3.2, we have for sufficiently large $k$

$$\mathrm{dist}(x^k, X) \le \alpha c_k^6.$$

It follows from Lemma 4.2 that

$$\|x^k - P_k(x^k)\| \le \alpha B_5 c_k^5.$$

Then by Lemma 4.4, we have

$$\|x_N^k - P_k(x^k)\| \le \frac{B_7}{c_k} \|x^k - P_k(x^k)\|^2 \le \alpha B_5 B_7 c_k^4 \|x^k - P_k(x^k)\|.$$

By the triangle inequality, the last inequality yields

$$\|x_N^k - P_k(x^k)\| \le \frac{\alpha B_5 B_7 c_k^4}{1 - \alpha B_5 B_7 c_k^4} \|x_N^k - x^k\|. \qquad (28)$$

On the other hand, since $\sqrt{\Psi_{FB}^k(x)}$ is uniformly locally Lipschitz continuous, there exists $L_2 > 0$ such that

$$\sqrt{\Psi_{FB}^k(x_N^k)} \le L_2 \|x_N^k - P_k(x^k)\|.$$

Hence, by (28) it suffices to show

$$\frac{L_2 \alpha B_5 B_7}{1 - \alpha B_5 B_7 c_k^4} \le 1.$$

Since $\alpha$ is arbitrary, choosing $\alpha$ sufficiently small yields the last inequality. □

This theorem, along with Theorem 3.2, ensures that Algorithm 1 converges superlinearly in a genuine sense, provided that the limit of the generated sequence $\{x^k\}$ is nondegenerate.

References

[1] F.H. Clarke, Optimization and Nonsmooth Analysis, John Wiley & Sons, New York, 1983.

[2] T. De Luca, F. Facchinei and C. Kanzow, A semismooth equation approach to the solution of nonlinear complementarity problems, Mathematical Programming, 75 (1996), pp. 407-439.

[3] F. Facchinei and J. Soares, A new merit function for nonlinear complementarity problems and a related algorithm, SIAM Journal on Optimization, 7 (1997), pp. 225-247.

[4] A. Fischer, An NCP-function and its use for the solution of complementarity problems, in Recent Advances in Nonsmooth Optimization, D.-Z. Du, L. Qi and R.S. Womersley, eds., World Scientific Publishers, Singapore, 1995, pp. 88-105.

[5] C. Geiger and C. Kanzow, On the resolution of monotone complementarity problems, Computational Optimization and Applications, 5 (1996), pp. 155-173.

[6] F.J. Luque, Asymptotic convergence analysis of the proximal point algorithm, SIAM Journal on Control and Optimization, 22 (1984), pp. 277-293.

[7] B. Martinet, Perturbation des methodes d'optimisation, R.A.I.R.O., Analyse Numerique, 12 (1978), pp. 153-171.

[8] J.-S. Pang, Complementarity problems, in Handbook of Global Optimization, R. Horst and P. Pardalos, eds., Kluwer Academic Publishers, Boston, Massachusetts, 1994, pp. 271-338.

[9] J.-S. Pang and L. Qi, Nonsmooth equations: motivation and algorithms, SIAM Journal on Optimization, 3 (1993), pp. 443-465.

[10] R.T. Rockafellar, Monotone operators and the proximal point algorithm, SIAM Journal on Control and Optimization, 14 (1976), pp. 877-898.

[11] N. Yamashita and M. Fukushima, Modified Newton methods for solving a semismooth reformulation of monotone complementarity problems, Mathematical Programming, 76 (1997), pp. 469-491.
