JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 95, No. 1, pp. 177-188, OCTOBER 1997
Some Convergence Properties of Descent Methods

Z. Wei, L. Qi, and H. Jiang
Communicated by O. L. Mangasarian
Abstract. In this paper, we discuss the convergence properties of a class of descent algorithms for minimizing a continuously differentiable function f on R^n without assuming that the sequence {x_k} of iterates is bounded. Under mild conditions, we prove that the limit infimum of ||∇f(x_k)|| is zero and that false convergence does not occur when f is convex. Furthermore, we discuss the convergence rate of {||x_k||} and {f(x_k)} when {x_k} is unbounded and {f(x_k)} is bounded.

Key Words. Unconstrained differentiable minimization, descent methods, global convergence, rate of convergence.
1. Introduction

Consider the following unconstrained minimization problem:

    min f(x),   x ∈ R^n,                                      (1)

where f is assumed continuously differentiable on R^n. Descent algorithms for solving (1) usually generate a sequence {x_k} such that

    f(x_{k+1}) < f(x_k),

so that {f(x_k)} is monotonically decreasing and either f(x_k) → −∞ or {f(x_k)} converges. Under standard conditions on the search directions and step sizes, every accumulation point of {x_k} is a stationary point of f, so no difficulty arises if {x_k} has a convergent subsequence. Note, however, that the sequence {x_k} generated by descent algorithms for (1) does not necessarily have an accumulation point. Therefore, it is worth studying how descent algorithms behave when {x_k} does not have an accumulation point. The study of the minimizing sequence was pioneered by Auslender, Crouzeix, and their colleagues (Refs. 3-5). The relation between minimizing and stationary sequences of unconstrained and constrained optimization problems has been studied recently; see Refs. 6-8. Similar results for complementarity problems and variational inequalities appeared in Ref. 9.

Descent algorithms for solving (1) usually generate the next iterate at the kth step by taking a certain step t_k in some descent direction d_k, that is,

    x_{k+1} = x_k + t_k d_k.
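The generic update x_{k+1} = x_k + t_k d_k can be illustrated with a minimal Python sketch. This is not the authors' specific algorithm: the Armijo backtracking rule, the constants sigma and beta, and the quadratic test function below are illustrative assumptions only.

```python
import numpy as np

def armijo_descent(f, grad, x0, sigma=1e-4, beta=0.5, max_iter=200, tol=1e-8):
    """Sketch of a descent method x_{k+1} = x_k + t_k d_k with d_k = -grad f(x_k).

    The Armijo condition f(x + t d) <= f(x) + sigma * t * grad(x)^T d
    guarantees monotone decrease of {f(x_k)}.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        d = -g                      # steepest descent direction
        t = 1.0
        while f(x + t * d) > f(x) + sigma * t * g.dot(d):
            t *= beta               # backtrack until the Armijo condition holds
        x = x + t * d
    return x

# Convex quadratic f(x) = 0.5 ||x||^2; its unique minimizer is the origin.
f = lambda x: 0.5 * x.dot(x)
grad = lambda x: x
x_star = armijo_descent(f, grad, np.array([3.0, -4.0]))
```

On this well-conditioned quadratic the unit step already satisfies the Armijo condition, so the iterate reaches the minimizer quickly; the backtracking loop matters only for harder functions.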
if ∇f is Hölder continuous on R^n; i.e., there exist two positive scalars ν > 0 and M > 0 such that, for all x, y ∈ R^n,

    ||∇f(x) − ∇f(y)|| ≤ M ||x − y||^ν.
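As a quick sanity check of the Hölder condition above, the following sketch verifies ||∇f(x) − ∇f(y)|| ≤ M ||x − y||^ν numerically for a hypothetical choice f(x) = ||x||^2, whose gradient ∇f(x) = 2x satisfies the condition with M = 2 and ν = 1 (the Lipschitz case):

```python
import numpy as np

rng = np.random.default_rng(0)
grad = lambda x: 2.0 * x          # gradient of f(x) = ||x||^2
M, nu = 2.0, 1.0                  # Hölder constants valid for this f

# Spot-check the inequality at 100 random pairs of points in R^3.
holds = True
for _ in range(100):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    lhs = np.linalg.norm(grad(x) - grad(y))
    rhs = M * np.linalg.norm(x - y) ** nu
    holds = holds and (lhs <= rhs + 1e-12)
```

For this f the two sides agree exactly (up to rounding), since ||2x − 2y|| = 2||x − y||; a sampling check of this kind cannot prove the condition, but it can expose a wrong constant.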
Proof. Since for all k,
we have
which implies that {f(x_k)} is a monotonically decreasing sequence. If f(x_k) tends to −∞, then the proof is complete. Therefore, in the following discussion, we assume that {f(x_k)} is a bounded set.

(i) Suppose that (i) is not true. Then, there exists ε > 0 such that, for all k,
It follows from (4), (5), and (A1) that
The above inequality, (8), and the boundedness of {f(x_k)} imply that
By using (2) and (4), we obtain that, for any k,
Then, (10) implies
which yields that {x_k} is convergent, say to a point x*. From (8) and (10), we have
Without loss of generality, we may assume that there exists an index set K such that
by (4). Then, from (5), we deduce that, for k ∈ K,

Hence, for all k ∈ K,

Taking the limit for k ∈ K, we have
Assumption (A1), (11), and (4) imply that ||∇f(x*)|| = 0, which contradicts (8). This completes the proof of (i).

(ii) Suppose that there exist an infinite index set K and a scalar ε > 0 such that, for all k ∈ K,
Analogous to the proof of (i), it is easy to prove from (9) and (12) that
and
Therefore, for all k ∈ K,
Using (7) and the Taylor expansion formula, we have
The above two inequalities and (2) yield

Dividing the above inequality by t_k ||d_k||, and taking the limit as k → ∞, k ∈ K, we obtain, by (13),
which contradicts (12), by (4) and Assumption (A1).
□
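To illustrate the situation Theorem 2.1 addresses, where {x_k} is unbounded yet the gradient norms still tend to zero, consider a hypothetical one-dimensional example (not from the paper): f(x) = e^{-x} is convex and bounded below by its unattained infimum 0, so every minimizing sequence is unbounded; steepest descent with unit steps nevertheless drives |f'(x_k)| toward zero:

```python
import numpy as np

f = lambda x: np.exp(-x)          # inf f = 0, never attained
grad = lambda x: -np.exp(-x)

x = 0.0
grad_norms = []
for _ in range(50):
    g = grad(x)
    grad_norms.append(abs(g))
    x = x - 1.0 * g               # x_{k+1} = x_k + t_k d_k with t_k = 1, d_k = -g

# {x_k} increases without bound while |f'(x_k)| decreases toward 0,
# so lim inf ||grad f(x_k)|| = 0 holds without any accumulation point.
```

Here x_k grows roughly like log k and |f'(x_k)| decays roughly like 1/k, so {f(x_k)} stays bounded while {x_k} diverges, which is exactly the regime the rate results of this paper concern.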
Let

    f* := inf_{x ∈ R^n} f(x).
By slightly modifying the proof of Theorem 3.1 in Ref. 12, we can obtain the following result, which shows that our algorithm cannot exhibit the phenomenon of false convergence.

Theorem 2.2. Suppose that (A1) holds. If f is a convex function on R^n, then

    lim_{k→∞} f(x_k) = f*.

If G_k is the identity matrix for every k, our algorithm reduces to the steepest descent algorithm. In this case, the authors of Ref. 13 proved that {x_k} converges to a minimum point of f if f does have a minimum point. This is the content of (ii) and (iii) of the following Proposition 2.1, for which we give a simpler proof. Conclusion (i) of Proposition 2.1 is a special case of Theorem 2.2, but the proof given below is also simpler than that of Ref. 12.

Proposition 2.1. Suppose that G_k = I for all k. If f is a convex function on R^n, then:
(i) lim_{k→∞} f(x_k) = f*;

(ii) {x_k} is an unbounded set if and only if f has an empty set of minima;
(iii) if f has a nonempty set of minima, then {x_k} converges to a minimal point of f.

Proof. Note that, for all x and all k,

    f(x) ≥ f(x_k) + ∇f(x_k)^T (x − x_k),

by convexity of f. It follows from (2) and (4) that, for all x ∈ R^n and all k,
(i) We prove this conclusion by considering the following three cases (a), (b), (c).

(i)(a) f* = lim_{k→∞} f(x_k). In this case, the conclusion holds trivially.

(i)(b) {x_k} is bounded. From the fact that {f(x_k)} is a monotonically decreasing sequence, we have that
which, combined with (i) of Theorem 2.1, implies that there exist an index set K and a point x** ∈ R^n such that
The convexity of f implies that x** is a minimal point of f. Therefore, f* = f(x**) = lim_{k→∞} f(x_k).

(i)(c) We now assume that lim_{k→∞} f(x_k) > −∞ and {x_k} is unbounded. Suppose that there exist x̄ ∈ R^n, ε > 0, and k_1 such that, for all k > k_1,
Setting x = x̄ in (15), we have
Therefore, the fact that {f(x_k)} is bounded from below and the inequality
imply that
hence,

Then, (17) implies that {||x_k − x̄||^2} is a decreasing sequence for sufficiently large k. It follows that {||x_k||} is bounded, which contradicts our assumption.

(ii)[⇒] Assume that f has an optimal solution point x*. Setting x = x* in (15), and noting that f(x*) ≤ f(x_k), we obtain
By using (5) and t_k ∈ (0, 1], we have
Therefore, (2) and (4) yield
The inequality (18) implies that, for any k,
Hence, for any k,
which, combined with (19), implies that {x_k} is bounded. This is a contradiction.

(ii)[⇐]