Appl. Math. Inf. Sci. 6, No. 1S, 147S-154S (2012)
Applied Mathematics & Information Sciences: An International Journal
© 2012 NSP, Natural Sciences Publishing Cor.
A New Class of Conjugate Gradient Methods with Extended Nonmonotone Line Search

Hailin Liu^1 and Xiaoyong Li^2

^1 School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, Guangdong 510665, P. R. China
^2 Laboratoire Collisions Agrégats Réactivité, Université Paul Sabatier, 31062 Toulouse Cedex 09, France

* Corresponding author e-mail: [email protected]

Received: Apr 17, 2011; Revised: Jul 21, 2011; Accepted: Aug 4, 2011; Published online: 1 January 2012
Abstract: In this paper, we propose a new nonlinear conjugate gradient method for large-scale unconstrained optimization which possesses the following properties: (i) the sufficient descent condition $-g_k^T d_k \ge \frac{7}{8}\|g_k\|^2$ holds without any line search; (ii) with exact line search, the method reduces to a nonlinear version of the Liu-Storey conjugate gradient scheme; (iii) under some assumptions, global convergence of the method is proved with a new nonmonotone line search. Preliminary numerical results show that the method is very efficient.

Keywords: Conjugate gradient, Sufficient descent, Hybrid method, Unconstrained optimization.
1. Introduction

Consider
$$\min_{x\in\mathbb{R}^n} f(x), \tag{1.1}$$
where $f:\mathbb{R}^n \to \mathbb{R}$ is a smooth nonlinear function whose gradient $g$ is available. The nonlinear conjugate gradient method is well suited for solving large-scale problems; its iterative formula is given by
$$x_{k+1} = x_k + \alpha_k d_k, \tag{1.2}$$
$$d_k = \begin{cases} -g_k, & k = 1;\\ -g_k + \beta_k d_{k-1}, & k \ge 2, \end{cases} \tag{1.3}$$
where $g_k = \nabla f(x_k)$, $d_k$ is the search direction, $\alpha_k$ is a step size obtained by a one-dimensional line search, and $\beta_k$ is a scalar. Many formulas have been proposed to compute the scalar $\beta_k$, because in practice $\alpha_k$ is not the exact one-dimensional minimizer and $f$ is not a quadratic function. Among them, the well-known Hestenes-Stiefel (HS) [1], Fletcher-Reeves (FR) [2], Polak-Ribière-Polyak (PRP) [3], Conjugate Descent (CD) [4], Liu-Storey (LS) [5], Dai-Yuan (DY) [6] and Hager-Zhang (HZ) [7] formulas are given by
$$\beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \tag{1.4}$$
$$\beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \tag{1.5}$$
$$\beta_k^{PRP} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \tag{1.6}$$
$$\beta_k^{CD} = \frac{\|g_k\|^2}{-g_{k-1}^T d_{k-1}}, \tag{1.7}$$
$$\beta_k^{LS} = \frac{g_k^T y_{k-1}}{-g_{k-1}^T d_{k-1}}, \tag{1.8}$$
$$\beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T y_{k-1}}, \tag{1.9}$$
$$\beta_k^{HZ} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}} - 2\frac{\|y_{k-1}\|^2}{(d_{k-1}^T y_{k-1})^2}\, g_k^T d_{k-1}, \tag{1.10}$$
respectively, where $y_{k-1} = g_k - g_{k-1}$ and $\|\cdot\|$ denotes the Euclidean norm.

In the existing convergence analyses and implementations of the conjugate gradient method, the extended strong Wolfe line search, namely
$$f(x_k + \alpha_k d_k) - f(x_k) \le \delta\alpha_k g_k^T d_k \tag{1.11}$$
and
$$\sigma_1 g_k^T d_k \le g(x_k + \alpha_k d_k)^T d_k \le -\sigma_2 g_k^T d_k, \tag{1.12}$$
as well as the extended nonmonotone line search, namely (1.12) together with
$$f(x_k + \alpha_k d_k) - \max_{0\le j\le m(k)} f(x_{k-j}) \le \delta\alpha_k g_k^T d_k, \tag{1.13}$$
are widely used, where $m(k) = \min\{m(k-1)+1, M_0\}$, $m(0) = 1$, $0 < \delta \le \sigma_1 < 1$, $0 < \sigma_2 < 1$ and $M_0 \in \mathbb{N}$. In addition, the sufficient descent condition, namely
$$-g_k^T d_k \ge c\|g_k\|^2, \tag{1.14}$$
where $c$ is a positive constant, has often been used in the literature to analyze the global convergence of conjugate gradient methods with inexact line searches.

The convergence behavior of (1.4), (1.5), (1.6) and (1.7) under various line search conditions has been studied by many authors over the years. Al-Baali [8] proved the global convergence of the FR method with the strong Wolfe line search; Liu et al. [9] and Dai and Yuan [10] extended this result to $\sigma = \frac{1}{2}$. In practical computation, the PRP method is generally believed to be the most efficient conjugate gradient method. However, Powell [11] constructed a counterexample showing that the PRP method can cycle infinitely without approaching a solution, which implies that this method is not globally convergent for general functions. The PRP+ method ($\beta_k^{PRP+} = \max\{0, \beta_k^{PRP}\}$) with the Wolfe line search is globally convergent when the sufficient descent condition (1.14) holds, and the HS method behaves very similarly to the PRP method. The DY method with the Wolfe or the strong Wolfe line search is globally convergent without the descent condition, but the descent property of the DY method depends on the line search or on convexity of the objective function. In [12], the authors propose a nonmonotone Newton method and analyze its convergence. Lucidi and Roma [13] presented a nonmonotone FR algorithm in 1995. G. H. Liu, L. L. Jing, L. X. Han and D. Han [14] introduced a class of nonmonotone conjugate gradient methods which is proved to be globally convergent when applied to unconstrained optimization problems with convex objective functions. In this paper we present a new class of nonmonotone conjugate gradient methods which is globally convergent when applied to unconstrained optimization problems with general objective functions.

Motivated by the above ideas, we design our new conjugate gradient method as follows:
$$x_{k+1} = x_k + \alpha_k d_k, \tag{1.15}$$
$$d_k = \begin{cases} -g_k, & k = 1;\\ -g_k + \beta_k d_{k-1}, & k \ge 2, \end{cases} \tag{1.16}$$
where
$$\beta_k^N = \frac{g_k^T y_{k-1}}{-g_{k-1}^T d_{k-1}} - 2\frac{\|y_{k-1}\|^2}{(-g_{k-1}^T d_{k-1})^2}\, g_k^T d_{k-1}. \tag{1.17}$$
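To make (1.16)-(1.17) concrete, the following minimal sketch (Python with NumPy; not part of the original paper) computes $\beta_k^N$ and the search direction. The function and variable names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def beta_N(g, g_prev, d_prev):
    """beta_k^N of (1.17): an LS-type quotient plus an HZ-type correction term.

    g      -- current gradient g_k
    g_prev -- previous gradient g_{k-1}
    d_prev -- previous search direction d_{k-1}
    """
    y = g - g_prev                    # y_{k-1} = g_k - g_{k-1}
    denom = -(g_prev @ d_prev)        # -g_{k-1}^T d_{k-1}, positive by Theorem 2.2
    return (g @ y) / denom - 2.0 * (g @ d_prev) * (y @ y) / denom**2

def direction(g, g_prev=None, d_prev=None):
    """Search direction (1.16): steepest descent for k = 1, CG update for k >= 2."""
    if d_prev is None:
        return -g
    return -g + beta_N(g, g_prev, d_prev) * d_prev
```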
With exact line search $g_k^T d_{k-1} = 0$, so the N method reduces to the LS method. The step size $\alpha_k$ satisfies the following new nonmonotone line search conditions:
$$f(x_k + \alpha_k d_k) - \lambda\max_{0\le j\le m(k)} f(x_{k-j}) - (1-\lambda)\min_{0\le j\le m(k)} f(x_{k-j}) \le \delta\alpha_k g_k^T d_k \tag{1.18}$$
and
$$\sigma_1 g_k^T d_k \le g(x_k + \alpha_k d_k)^T d_k \le -\sigma_2 g_k^T d_k, \tag{1.19}$$
where $m(k) = \min\{m(k-1)+1, M_0\}$, $m(0) = 1$, $0 < \delta \le \sigma_1 < 1$, $0 < \sigma_2 < 1$, $0 \le \lambda \le 1$ and $M_0 \in \mathbb{N}$. The new nonmonotone line search can be viewed as a convex combination of the extended strong Wolfe line search and the extended nonmonotone line search: when $\lambda = 0$ it reduces to the extended strong Wolfe line search, and when $\lambda = 1$ it reduces to the extended nonmonotone line search.
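The acceptance test of the new line search is easy to state in code. Below is a hedged sketch that only checks whether a trial step $\alpha$ satisfies (1.18) and (1.19); a full line search would wrap it in a bracketing or backtracking loop. The window `hist` of stored values $f(x_{k-j})$, $0 \le j \le m(k)$, and all parameter defaults are assumptions of this sketch.

```python
def accepts(f, grad, x, d, alpha, hist,
            lam=0.5, delta=0.01, sigma1=0.1, sigma2=0.1):
    """Check conditions (1.18)-(1.19) for a trial step length alpha.

    hist -- the last m(k)+1 function values f(x_{k-j}), 0 <= j <= m(k)
    """
    gTd = grad(x) @ d                                 # g_k^T d_k, negative for descent
    ref = lam * max(hist) + (1.0 - lam) * min(hist)   # convex combination in (1.18)
    if f(x + alpha * d) - ref > delta * alpha * gTd:  # (1.18) violated
        return False
    g_new_Td = grad(x + alpha * d) @ d
    return sigma1 * gTd <= g_new_Td <= -sigma2 * gTd  # (1.19)
```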
This paper is organized as follows. In the next section we present the new algorithm (Algorithm 2.3) and establish the sufficient descent property (1.14) of Algorithm 2.3. In Section 3 the global convergence results of the N method are established. Finally, preliminary numerical results are reported.

2. New Algorithm

Throughout this paper we assume that $g_k \ne 0$ for all $k$, for otherwise a stationary point has been found. Furthermore, in order to establish the global convergence result for the new algorithm, we make the following basic assumption on the objective function.

Assumption 2.1. (i) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_1)\}$ is bounded, where $x_1$ is the starting point. (ii) $f(x)$ is strongly convex and Lipschitz continuous on $\Omega$.

If $f$ satisfies Assumption 2.1 (i) and (ii), then
$$\|g(x)\| \le \bar\gamma \quad \text{for all } x \in \Omega, \tag{2.1}$$
where $\bar\gamma$ is a positive constant. If $f$ satisfies Assumption 2.1 (ii), then in some neighborhood $N$ of $\Omega$, $f$ is differentiable and its gradient $g$ is Lipschitz continuous; namely, there exist constants $L > 0$ and $\mu > 0$ such that for any $x, x' \in N$,
$$\|g(x) - g(x')\| \le L\|x - x'\| \tag{2.2}$$
and
$$\mu\|x - x'\|^2 \le (g(x) - g(x'))^T(x - x'). \tag{2.3}$$

We now give the following theorem, which shows that the formula (1.17) possesses the sufficient descent property without any line search.
Theorem 2.2. Consider any method of the form (1.2) and (1.3), where $\beta_k = \beta_k^N$. Then for all $k \ge 1$,
$$-g_k^T d_k \ge \frac{7}{8}\|g_k\|^2. \tag{2.4}$$

Proof. Since $d_1 = -g_1$, we have $-g_1^T d_1 = \|g_1\|^2$, which satisfies (2.4). For $k \ge 2$, multiplying (1.3) by $g_k^T$ gives
$$-g_k^T d_k = \|g_k\|^2 - \beta_k g_k^T d_{k-1}. \tag{2.5}$$
When $\beta_k = \beta_k^N$ we obtain
$$-g_k^T d_k = \|g_k\|^2 - g_k^T d_{k-1}\left(\frac{g_k^T y_{k-1}}{-g_{k-1}^T d_{k-1}} - 2\frac{\|y_{k-1}\|^2}{(-g_{k-1}^T d_{k-1})^2}\, g_k^T d_{k-1}\right) = \frac{\|g_k\|^2(-g_{k-1}^T d_{k-1})^2 - (g_k^T d_{k-1})(g_k^T y_{k-1})(-g_{k-1}^T d_{k-1}) + 2(g_k^T d_{k-1})^2\|y_{k-1}\|^2}{(-g_{k-1}^T d_{k-1})^2}. \tag{2.6}$$
We apply the inequality
$$u^T v \le \frac{1}{2}(\|u\|^2 + \|v\|^2)$$
to the second term in the numerator of (2.6) with
$$u = \frac{1}{2}(-g_{k-1}^T d_{k-1})\,g_k, \qquad v = 2(g_k^T d_{k-1})\,y_{k-1},$$
which bounds the numerator of (2.6) below by $\|g_k\|^2(-g_{k-1}^T d_{k-1})^2 - \frac{1}{8}\|g_k\|^2(-g_{k-1}^T d_{k-1})^2 - 2(g_k^T d_{k-1})^2\|y_{k-1}\|^2 + 2(g_k^T d_{k-1})^2\|y_{k-1}\|^2 = \frac{7}{8}\|g_k\|^2(-g_{k-1}^T d_{k-1})^2$. Therefore (2.4) is true for all $k \in \mathbb{N}$. □

Now we can present the new descent conjugate gradient method as follows.

Algorithm 2.3.
Step 1: Given $x_1 \in \mathbb{R}^n$, $\varepsilon \ge 0$, $0 < \delta \le \sigma_1 < 1$, $0 < \sigma_2 < 1$, $0 \le \lambda \le 1$, $M_0 \in \mathbb{N}$; set $d_1 = -g_1$ and $k := 1$; if $\|g_1\| \le \varepsilon$, stop.
Step 2: Find $\alpha_k > 0$ satisfying (1.18) and (1.19).
Step 3: Let $x_{k+1} = x_k + \alpha_k d_k$ and $g_{k+1} = g(x_{k+1})$; if $\|g_{k+1}\| \le \varepsilon$, stop.
Step 4: Compute $\beta_{k+1} = \beta_{k+1}^N$ by formula (1.17) and generate $d_{k+1}$ by (1.16).
Step 5: Set $k := k + 1$ and go to Step 2.

Lemma 2.4. Suppose that Assumption 2.1 holds and $\alpha_k$ is obtained by the new nonmonotone line search (1.18) and (1.19). Then there exists $c_1 > 0$ such that
$$\|s_k\| \ge c_1\,\frac{-g_k^T d_k}{\|d_k\|}, \tag{2.7}$$
where $s_k = x_{k+1} - x_k$.

Proof. From Assumption 2.1, $g$ is Lipschitz continuous in some neighborhood $N$ of $\Omega$; namely, there exists a constant $L > 0$ such that for any $x, x' \in N$,
$$\|g(x) - g(x')\| \le L\|x - x'\|, \tag{2.8}$$
that is, $\|y_k\| \le L\|s_k\|$, so we have
$$y_k^T d_k \le L\|s_k\|\,\|d_k\|. \tag{2.9}$$
From (1.19) we can get
$$y_k^T d_k = d_k^T(g_{k+1} - g_k) \ge (1 - \sigma_1)(-g_k^T d_k). \tag{2.10}$$
Thus, from (2.9) and (2.10) we obtain
$$\|s_k\| \ge \frac{1 - \sigma_1}{L}\cdot\frac{-g_k^T d_k}{\|d_k\|}. \tag{2.11}$$
Let $c_1 = \frac{1-\sigma_1}{L}$; then (2.11) gives (2.7), and the proof is complete. □

Lemma 2.5. Suppose that Assumption 2.1 holds, $\alpha_k$ is given by (1.18) and (1.19), and $\beta_k = \beta_k^N$. Then there exists a positive constant $M = \frac{L^2}{\mu}$ such that
$$\frac{\|y_k\|^2}{y_k^T s_k} \le M. \tag{2.12}$$

Proof. By the strong convexity assumption (2.3), we have
$$y_k^T d_k = d_k^T(g_{k+1} - g_k) \ge \mu\alpha_k\|d_k\|^2, \tag{2.13}$$
and from the Lipschitz continuity (2.8) we can get
$$\|y_k\| = \|g(x_{k+1}) - g(x_k)\| \le L\|x_{k+1} - x_k\| = L\alpha_k\|d_k\|. \tag{2.14}$$
Utilizing (2.13) and (2.14), we get
$$\frac{\|y_k\|^2}{y_k^T s_k} = \frac{\|y_k\|^2}{\alpha_k y_k^T d_k} \le \frac{L^2\alpha_k^2\|d_k\|^2}{\alpha_k y_k^T d_k} \le \frac{L^2\alpha_k^2\|d_k\|^2}{\mu\alpha_k^2\|d_k\|^2} = \frac{L^2}{\mu} = M, \tag{2.15}$$
which completes the proof. □

Lemma 2.6. Assume that the following inequality holds for all $k$:
$$0 < m_1 \le \|g_k\| \le m_2, \tag{2.16}$$
and that $\alpha_k$ is given by (1.18) and (1.19). Then
(1) there exists a positive constant $b > 1$ such that
$$|\beta_k| \le b; \tag{2.17}$$
(2) for any $\epsilon > 0$ there exists a positive constant $\eta$ such that $|\beta_k| \le \epsilon$ whenever $\|y_{k-1}\| \le \eta$.

Proof. When $\beta_k = \beta_k^N$, from (2.4), (1.19) and (2.16) we can get
$$|\beta_k| \le \left|\frac{g_k^T y_{k-1}}{-g_{k-1}^T d_{k-1}}\right| + \left|2 g_k^T d_{k-1}\frac{\|y_{k-1}\|^2}{(-g_{k-1}^T d_{k-1})^2}\right| \le \left|\frac{g_k^T y_{k-1}}{-g_{k-1}^T d_{k-1}}\right| + 2\left|\sigma_2(-g_{k-1}^T d_{k-1})\frac{\|y_{k-1}\|^2}{(-g_{k-1}^T d_{k-1})^2}\right| \le \frac{2m_2^2}{-g_{k-1}^T d_{k-1}} + \frac{8\sigma_2 m_2^2}{-g_{k-1}^T d_{k-1}} \le \frac{(16 + 64\sigma_2)m_2^2}{7m_1^2}. \tag{2.18}$$
Let $b = \frac{(16+64\sigma_2)m_2^2}{7m_1^2}$; obviously $b > 1$ because $m_2 \ge m_1 > 0$.

For (2), let $\eta = \frac{7m_1^2\,\epsilon}{(8+32\sigma_2)m_2}$. When $\|y_{k-1}\| \le \eta$, we can get
$$|\beta_k| \le \left(\frac{\|g_k\|}{-g_{k-1}^T d_{k-1}} + 2\sigma_2(-g_{k-1}^T d_{k-1})\frac{\|y_{k-1}\|}{(-g_{k-1}^T d_{k-1})^2}\right)\|y_{k-1}\| \le \frac{(8+32\sigma_2)m_2}{7m_1^2}\,\|y_{k-1}\| \le \epsilon. \tag{2.19}$$
This completes the proof. □

Lemma 2.7. Suppose that Assumption 2.1 (i) holds and $\alpha_k$ is obtained by the new nonmonotone line search (1.18) and (1.19). Denote $\xi_k = \delta\alpha_k g_k^T d_k$. Then $\{f(x_{l(k)})\}$ is nonincreasing and
$$\lim_{k\to\infty}\xi_{l(k+1)-1} = 0, \tag{2.20}$$
where $l(k) = \max\{i \mid 0 \le k - i \le m(k),\ f(x_i) = \max_{0\le j\le m(k)} f(x_{k-j})\}$.

Proof. From (1.18) we can get
$$f(x_{k+1}) \le \lambda f(x_{l(k)}) + (1-\lambda)\min_{0\le j\le m(k)} f(x_{k-j}) + \xi_k \le f(x_{l(k)}) + \xi_k. \tag{2.21}$$
Because $\xi_k < 0$, we obtain
$$f(x_{k+1}) \le f(x_{l(k)}). \tag{2.22}$$
From (2.22) and $m(k) \le m(k-1) + 1$ for all $k$, we can get
$$f(x_{l(k)}) \le \max_{0\le j\le m(k-1)+1} f(x_{k-j}) \le \max\Big\{\max_{0\le j\le m(k-1)} f(x_{k-1-j}),\ f(x_k)\Big\} = \max\{f(x_{l(k-1)}), f(x_k)\} = f(x_{l(k-1)}), \quad k = 1, 2, \ldots \tag{2.23}$$
Therefore $\{f(x_{l(k)})\}$ is nonincreasing. Because $l(k+1) - 1 \ge k + 1 - m(k+1) - 1 \ge k - M_0$, we have
$$f(x_{l(l(k+1)-1)}) \le f(x_{l(k-M_0)}). \tag{2.24}$$
From (2.21), applied with $k$ replaced by $l(k+1)-1$, and (2.24) we can obtain
$$f(x_{l(k+1)}) \le f(x_{l(l(k+1)-1)}) + \xi_{l(k+1)-1} \le f(x_{l(k-M_0)}) + \xi_{l(k+1)-1}. \tag{2.25}$$
Hence we have
$$0 \le -\xi_{l(k+1)-1} \le f(x_{l(k-M_0)}) - f(x_{l(k+1)}). \tag{2.26}$$
Since $\{f(x_{l(k)})\}$ is nonincreasing and, by Assumption 2.1 (i), bounded below, the right-hand side of (2.26) tends to zero, so (2.20) holds. This completes the proof. □

Lemma 2.8. Suppose that $\alpha_k$ is given by (1.18) and (1.19), and $\beta_k = \beta_k^N$. Then $\{l(k)\}$ is increasing.

Proof. Assume, to the contrary, that
$$l(k+1) < l(k) \tag{2.27}$$
for some $k$. Then we have
$$k + 1 \ge l(k) > l(k+1) \ge k + 1 - m(k+1), \tag{2.28}$$
so $l(k)$ lies in the window defining $l(k+1)$, and hence
$$f(x_{l(k+1)}) \ge f(x_{l(k)}). \tag{2.29}$$
But from Lemma 2.7 we have
$$f(x_{l(k+1)}) \le f(x_{l(k)}), \tag{2.30}$$
so $f(x_{l(k+1)}) = f(x_{l(k)})$; hence, by the definition of $l(k+1)$ as the largest such index and (2.29), we have $l(k+1) \ge l(k)$, which is contradictory to (2.27). Hence $l(k) \le l(k+1)$; namely, $\{l(k)\}$ is increasing. □
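Putting the pieces together, here is a hedged end-to-end sketch of Algorithm 2.3 under the stated assumptions, reusing the hypothetical beta_N and accepts helpers sketched in Section 1. The crude step-halving in Step 2 stands in for a proper line search enforcing (1.18)-(1.19), so this is illustrative only.

```python
import numpy as np

def algorithm_2_3(f, grad, x1, eps=1e-6, lam=0.5, M0=100, max_iter=10000):
    """Sketch of Algorithm 2.3 with beta^N and window m(k) = min{m(k-1)+1, M0}."""
    x = np.asarray(x1, dtype=float)
    g = grad(x)
    d = -g                                    # Step 1: d_1 = -g_1
    hist = [f(x)]                             # nonmonotone reference values
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:          # stopping test
            break
        alpha = 1.0                           # Step 2: naive halving until (1.18)-(1.19)
        while alpha > 1e-16 and not accepts(f, grad, x, d, alpha, hist, lam):
            alpha *= 0.5
        x = x + alpha * d                     # Step 3
        g_new = grad(x)
        d = -g_new + beta_N(g_new, g, d) * d  # Step 4: update via (1.16)-(1.17)
        g = g_new
        hist.append(f(x))                     # Step 5: slide the memory window
        if len(hist) > M0 + 1:
            hist.pop(0)
    return x
```

For example, on the convex quadratic $f(x) = x^T x$ with gradient $2x$, the call algorithm_2_3(lambda x: x @ x, lambda x: 2 * x, np.ones(100)) should drive the gradient norm below eps within a few iterations.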
3. Convergence analysis

Theorem 3.1. Suppose that Assumption 2.1 holds and $\alpha_k$ is obtained by the new nonmonotone line search (1.18) and (1.19). Consider any iteration method of the form (1.15) and (1.16), where $\beta_k = \beta_k^N$. Then
$$\liminf_{k\to\infty}\|g_k\| = 0. \tag{3.1}$$

Proof. If (3.1) is not true, then there exists a constant $\gamma > 0$ such that $\gamma \le \|g_k\| \le \bar\gamma$ for all $k$, where $\bar\gamma$ is the constant from (2.1); hence Lemma 2.6 applies with $m_1 = \gamma$ and $m_2 = \bar\gamma$. From (1.16) and Lemma 2.6 we have
$$\|d_{l(k+1)-1}\| \le \|g_{l(k+1)-1}\| + |\beta_{l(k+1)-1}|\,\|d_{l(k+1)-2}\| \le \bar\gamma + b\|d_{l(k+1)-2}\| \le \cdots \le \bar\gamma\sum_{j=0}^{l(k+1)-l(k)-3} b^j + b^{l(k+1)-l(k)-2}\|d_{l(k)+1}\| \le \bar\gamma\sum_{j=0}^{l(k+1)-l(k)-2} b^j + b^{l(k+1)-l(k)-2}|\beta_{l(k)+1}|\,\|d_{l(k)}\|. \tag{3.2}$$
Because $l(k+1) - l(k) \le k + 1 - [k - m(k)] < M_0 + 2$, we can get
$$\|d_{l(k+1)-1}\| \le \bar\gamma\sum_{j=0}^{M_0} b^j + b^{M_0}|\beta_{l(k)+1}|\,\|d_{l(k)}\| = q_1 + q_2|\beta_{l(k)+1}|\,\|d_{l(k)}\|, \tag{3.3}$$
where $q_1 = \bar\gamma\sum_{j=0}^{M_0} b^j$ and $q_2 = b^{M_0}$.

From Lemma 2.5 and (1.19) we have
$$0 \le \|y_k\| \le (M y_k^T s_k)^{1/2} \le (M(1+\sigma_2)(-\alpha_k g_k^T d_k))^{1/2},$$
so (2.20) implies that the relevant $\|y_{l(k)}\|$ tend to zero. From Lemma 2.6 (2), taking $\epsilon = \frac{1}{q_2 b^2}$, there exists a nonnegative integer $k_0$ such that for all $k \ge k_0$,
$$|\beta_{l(k)+1}| < \frac{1}{q_2 b^2}.$$
Thus, for all $k \ge k_0$,
$$\|d_{l(k+1)-1}\| \le q_1 + \frac{1}{b^2}\|d_{l(k)}\| \le q_1 + \frac{\|g_{l(k)}\| + |\beta_{l(k)}|\,\|d_{l(k)-1}\|}{b^2} \le q_1 + \frac{\bar\gamma + b\|d_{l(k)-1}\|}{b^2} \le q_3 + \frac{\|d_{l(k)-1}\|}{b},$$
where $q_3 = q_1 + \frac{\bar\gamma}{b^2}$. This recursion leads to
$$\|d_{l(k+1)-1}\| \le q_3 + \frac{\|d_{l(k)-1}\|}{b} \le q_3 + \frac{q_3}{b} + \frac{\|d_{l(k-1)-1}\|}{b^2} \le \cdots \le q_3\sum_{j=0}^{k-k_0}\left(\frac{1}{b}\right)^j + \left(\frac{1}{b}\right)^{k-k_0+1}\|d_{l(k_0)-1}\| \le q_3\,\frac{b}{b-1} + \|d_{l(k_0)-1}\|. \tag{3.4}$$
Applying $l(k) \ge k - m(k) \ge k - M_0$ and Lemma 2.8, every $j \ge l(k_0+2) - 1$ satisfies $l(i) - 1 \le j \le l(i+1) - 1$ for some $i$ with $i - 1 \ge k_0$; thus we have
$$\|d_j\| \le \|g_j\| + |\beta_j|\,\|d_{j-1}\| \le \bar\gamma + b(\|g_{j-1}\| + |\beta_{j-1}|\,\|d_{j-2}\|) \le \cdots \le \bar\gamma\sum_{t=0}^{j-l(i)} b^t + b^{j-l(i)+1}\|d_{l(i)-1}\|. \tag{3.5}$$
Therefore, from
$$j - l(i) + 1 \le [l(i+1) - 1] - l(i) + 1 \le i + 1 - [i - m(i)] \le M_0 + 1$$
and (3.5), we have
$$\|d_j\| \le \bar\gamma\sum_{t=0}^{M_0} b^t + b^{M_0+1}\|d_{l(i)-1}\|. \tag{3.6}$$
From (3.4) and (3.6), we have
$$\|d_j\| \le \bar\gamma\sum_{t=0}^{M_0} b^t + b^{M_0+1}\left[q_3\,\frac{b}{b-1} + \|d_{l(k_0)-1}\|\right], \tag{3.7}$$
so $\{\|d_j\|\}$ is bounded. By using Lemma 2.4 and the sufficient descent property (2.4) we have
$$-\xi_{l(k+1)-1} = \delta\,\|s_{l(k+1)-1}\|\,\frac{-g_{l(k+1)-1}^T d_{l(k+1)-1}}{\|d_{l(k+1)-1}\|} \ge \delta c_1\,\frac{(-g_{l(k+1)-1}^T d_{l(k+1)-1})^2}{\|d_{l(k+1)-1}\|^2} \ge \frac{49\,\delta c_1\,\gamma^4}{64\,\|d_{l(k+1)-1}\|^2}, \tag{3.8}$$
but from (2.20) and (3.8) we can get
$$\lim_{k\to\infty}\frac{1}{\|d_{l(k+1)-1}\|} = 0,$$
which is contradictory to (3.7). Hence the theorem is valid. □

4. Numerical experiments

In this section we test the PRP, HZ and N conjugate gradient methods with the new nonmonotone line search. In Table 4-1, the capitalized labels N, HZ and PRP denote the runs with $\varepsilon = 10^{-6}$, $\delta = 0.01$, $\sigma_1 = \sigma_2 = 0.1$, $\lambda = 0$ and $M_0 = 100$, while the lowercase labels n, hz and prp denote the runs with the same parameters except $\lambda = \frac{1}{2}$. The test problems are from [15]. Table 4-1 shows the computational results, where the columns have the following meanings: Problem: the name of the test problem; Dim: the dimension of the problem; NI: the total number of iterations; NF: the number of function evaluations; NG: the number of gradient evaluations; Time: the total CPU time. A star * denotes that the result is the best one among the three methods.
[Table 4-1. Computational results: NI, NF and NG for the six runs n, hz, prp ($\lambda = \frac{1}{2}$) and N, HZ, PRP ($\lambda = 0$) on the test problems Var. dim. (Dim 1000, 5000, 10000), Penalty I (1000, 5000, 10000), Penalty II (20, 50, 100), Trig. metric (1000, 5000, 10000), Ext. Rosenbrock (1000, 5000, 10000), Ext. Powell sing. (1000, 5000, 10000), Chebyquad (200, 500, 2000), Brown & Dennis (4), Gulf research (3) and Beale (2), with a star * marking the best of the three methods in each group.]
In order to rank the iterative numerical methods, one can compute the total number of function and gradient evaluations by the formula
$$N_{total} = NF + 5\,NG. \tag{4.1}$$
We compare each evaluated method EM(j) (here the N, HZ, n, hz and prp runs) with the PRP method as follows: for each problem $i$, compute the total numbers of function and gradient evaluations required by the evaluated method and by the PRP method by formula (4.1), and denote them by $N_{total,i}(EM(j))$ and $N_{total,i}(PRP)$; then calculate the ratio
$$r_i(EM(j)) = \frac{N_{total,i}(EM(j))}{N_{total,i}(PRP)}. \tag{4.2}$$
If the EM($j_0$) method does not work for example $i_0$ but the PRP method does, we replace $r_{i_0}(EM(j_0))$ by a positive constant $\tau_1$ defined as
$$\tau_1 = \max\{r_i(EM(j_0)) : (i, j_0) \notin S_1\}, \tag{4.3}$$
where
$$S_1 = \{(i, j_0) : \text{method } j_0 \text{ does not work for example } i\}. \tag{4.4}$$
If the PRP method does not work for example $i_0$ but the EM($j_0$) method does, we replace $r_{i_0}(EM(j_0))$ by a positive constant $\tau_2$ defined as
$$\tau_2 = \min\{r_i(EM(j_0)) : (i, j_0) \notin S_1\}. \tag{4.5}$$
If neither the PRP method nor the EM($j_0$) method works for example $i_0$, we define $r_{i_0}(EM(j_0)) = 1$. The geometric mean of these ratios for the EM(j) method over all the test problems is defined by
$$r(EM(j)) = \left(\prod_{i\in S} r_i(EM(j))\right)^{1/|S|}, \tag{4.6}$$
where $S$ denotes the set of the test problems and $|S|$ the number of elements in $S$. According to the above rule, it is clear that $r(PRP) = 1$. The values of $r(HZ)$, $r(N)$, $r(hz)$, $r(n)$ and $r(prp)$ are listed in Table 4-2.
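The ranking rule (4.1)-(4.6) is mechanical to script. The sketch below is a hypothetical illustration, assuming the counts NF and NG are stored in dictionaries keyed by (method, problem), with None marking a run where the method does not work.

```python
from math import prod

def n_total(nf, ng):
    """Total evaluation count (4.1)."""
    return nf + 5 * ng

def r_geometric(nf, ng, method, problems, base="PRP"):
    """Geometric mean r(EM(j)) of (4.6) with the conventions (4.3)-(4.5)."""
    ratios, pending = {}, []
    for i in problems:
        works_m = nf[(method, i)] is not None
        works_b = nf[(base, i)] is not None
        if works_m and works_b:
            ratios[i] = (n_total(nf[(method, i)], ng[(method, i)])
                         / n_total(nf[(base, i)], ng[(base, i)]))   # (4.2)
        elif not works_m and not works_b:
            ratios[i] = 1.0                 # neither method works for example i
        else:
            pending.append((i, works_m))    # patch once tau_1, tau_2 are known
    observed = list(ratios.values())
    for i, works_m in pending:
        # method fails, PRP works -> tau_1 = max (4.3); the reverse -> tau_2 = min (4.5)
        ratios[i] = min(observed) if works_m else max(observed)
    return prod(ratios.values()) ** (1.0 / len(problems))           # (4.6)
```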
Table 4-2
HZ: 1.194   PRP: 1.0   N: 0.912   hz: 0.877   n: 0.782   prp: 0.854

From Table 4-2 we can see that the new method is more efficient than the HZ method and the PRP method.

Acknowledgments: This research was supported by the Natural Science Foundation of Guangdong Province, No. 9151008002000012.

References
[1] M. R. Hestenes and E. L. Stiefel, Methods of conjugate gradients for solving linear systems, J. Research Nat. Bur. Standards 49 (1952), 409-436.
[2] R. Fletcher and C. Reeves, Function minimization by conjugate gradients, Comput. J. 7 (1964), 149-154.
[3] B. T. Polyak, The conjugate gradient method in extreme problems, USSR Comput. Math. Math. Phys. 9 (1969), 94-112.
[4] R. Fletcher, Practical Methods of Optimization, Vol. 1: Unconstrained Optimization, John Wiley & Sons (1980).
[5] Y. Liu and C. Storey, Efficient generalized conjugate gradient algorithms, Part 1: Theory, J. Optim. Theory Appl. 69 (1991), 129-137.
[6] Y. H. Dai and Y. Yuan, A nonlinear conjugate gradient method with a strong global convergence property, SIAM J. Optim. 10 (1999), 177-182.
[7] W. W. Hager and H. Zhang, A new conjugate gradient method with guaranteed descent and an efficient line search, SIAM J. Optim. 16 (2005), 170-192.
[8] M. Al-Baali, Descent property and global convergence of the Fletcher-Reeves method with inexact line searches, IMA J. Numer. Anal. 5 (1985), 121-124.
[9] G. Liu, J. Han and H. Yin, Global convergence of the Fletcher-Reeves algorithm with inexact line search, Appl. Math. J. Chinese Univ. Ser. B 10 (1995), 75-82.
[10] Y. H. Dai and Y. Yuan, Convergence of the Fletcher-Reeves method under a generalized Wolfe search, J. Comput. Math. 2 (1996), 142-148.
[11] M. J. D. Powell, Nonconvex minimization calculations and the conjugate gradient method, in: Lecture Notes in Mathematics, Vol. 1066, Springer-Verlag, Berlin, 1984, 122-141.
[12] L. Grippo, F. Lampariello and S. Lucidi, A nonmonotone line search technique for Newton's method, SIAM J. Numer. Anal. 23 (1986), 707-716.
[13] S. Lucidi and M. Roma, Nonmonotone conjugate gradient methods for optimization, in: System Modelling and Optimization, Lecture Notes in Control and Information Sciences, Springer-Verlag, Berlin, 1995.
[14] G. H. Liu, L. L. Jing, L. X. Han and D. Han, A class of nonmonotone conjugate gradient methods for unconstrained optimization, J. Optim. Theory Appl. 101 (1999), 127-140.
[15] J. J. Moré, B. S. Garbow and K. E. Hillstrom, Testing unconstrained optimization software, ACM Trans. Math. Software 7 (1981), 17-41.
Hailin Liu obtained his M.Sc. degree in Applied Mathematics from the Central South University of Technology, China. He is a professor in the School of Computer Science at Guangdong Polytechnic Normal University in Guangzhou, China. His current research is divided into two areas: (i) optimization theory and its applications, and (ii) computer science and its applications.

Xiaoyong Li is the corresponding author of this paper. He obtained his M.Sc. degree in Applied Mathematics from Guangdong Polytechnic Normal University in 2011. He is a software engineer at an information security corporation in Zhuhai; his research interest is optimization theory and its applications.