A Modified Penalty Function Method for Inequality Constraints Minimization

Shu-Qin Liu∗
Jianming Shi†
Jichang Dong‡
Shouyang Wang§
January 2004
Abstract: In this paper we introduce a modified penalty function (MPF) method for solving the problem of minimizing a nonlinear objective function subject to inequality constraints. Basically, this method is a combination of modified penalty methods and Lagrangian methods. It treats the inequality constraints with a modified penalty function and avoids the nondifferentiability of max{x, 0}. The method alternately minimizes the MPF and updates the Lagrange multipliers. It converges linearly for a large enough penalty parameter under the standard second-order optimality conditions, and superlinear convergence can be attained by increasing the penalty parameter step by step after each multiplier update. Numerical experiments show that the proposed method is considerably faster than the method based on the classical penalty function.

Keywords: optimization, inequality constraints, convergence, modified penalty function (MPF).
1 Introduction
In this paper, we give a new method for solving minimization problems with inequality constraints, i.e.,

(IQ)    min f(x)   s.t.   C(x) ≤ 0,                                        (1.1)

where C(x) := (c_1(x), · · ·, c_m(x))^T. We assume that f(x) and c_i(x) (i = 1, · · ·, m) are continuously differentiable functions from R^n to R. Although the modified penalty function (MPF) method proposed in this paper takes after the modified barrier function method for inequality constraints of Polyak [19], the MPF eliminates major drawbacks of the classical penalty function (CPF). The MPF is well defined at a solution and maintains the smoothness of the objective and constraint functions in a neighborhood of that solution. When the penalty parameter is large enough, the MPF has a global minimizer for any vector of positive Lagrange multipliers under standard second-order optimality conditions.

∗ School of Economics and Management, Beihang University, Beijing 100083, China
† Department of Computer Science and Systems Engineering, Muroran Institute of Technology, Muroran, Japan. Email: [email protected]
‡ School of Management, Graduate School of Chinese Academy of Sciences, Beijing 100019, China
§ Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing 100080, China
The main contribution of this paper is to demonstrate how to construct the MPF, which uses a smoothing approximation of the plus function max{x, 0}, and to show that the convergence rate of the MPF method can be raised to superlinear, as for the augmented Lagrangian method. The proof techniques are similar to those used in, for instance, [11] and [19]. The CPF used for solving a nonlinear program with inequality constraints is only once differentiable, even if the objective and constraint functions have higher-order differentiability properties; in contrast, the MPF stated in section 4 remains continuously differentiable as many times as the objective and the constraint functions are.

This paper is organized as follows. In the next section some related properties of the function p(x, α) are stated. The basic assumptions and optimality conditions of the general nonlinear programming problem are introduced in section 3. The definition of the MPF and some of its properties are given in section 4. We describe the MPF method in section 5 and present a trust region algorithm for solving the subproblem in section 6. The convergence results for the MPF method are discussed in section 7. In section 8, we report some numerical experiments on the proposed algorithm. Some concluding remarks on the proposed method are given in section 9.
2 Properties of the function p(x, α)
Consider the discontinuous function σ(y) := 1 if y > 0 and 0 otherwise, and its smooth approximation s(y, α) := (1 + e^{−αy})^{−1} for α > 0. It is easy to see that

max{x, 0} = ∫_{−∞}^{x} σ(y) dy.

Substituting s(y, α) for σ(y) in the expression above, Chen and Mangasarian [6] use the function

p(x, α) := ∫_{−∞}^{x} s(y, α) dy = x + (1/α) log(1 + e^{−αx})

to approximate the plus function x⁺ := max{x, 0}. Some related properties of the function p(x, α), where α > 0, are presented as follows (they can be found in [6]).

1. p(x, α) is infinitely continuously differentiable in x, with dp(x, α)/dx = (1 + e^{−αx})^{−1} and d²p(x, α)/dx² = αe^{−αx}(1 + e^{−αx})^{−2};
2. p(x, α) is strictly convex and strictly increasing with respect to x on R;
3. p(x, α) > x⁺ for all x ∈ R;
4. max_{x∈R} {p(x, α) − x⁺} = p(0, α) = (log 2)/α;
5. lim_{|x|→∞} {p(x, α) − x⁺} = 0 for all α > 0;
6. lim_{α→∞} p(x, α) = x⁺ for all x ∈ R;
7. p(x, α) ∈ (0, ∞) for all x ∈ R, α > 0, so the inverse function p^{−1} is well defined on (0, ∞);
8. p(x, α) > p(x, β) for α < β, x ∈ R.

Because of these properties, we obtain that max{x, 0} ≅ p(x, α), i.e.,

max{x, 0} ≅ x + (1/α) log(1 + e^{−αx}).

The basic assumptions and optimality conditions for the classical Lagrangian method are described in the next section.
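For readers who wish to experiment with the smoothing, the following short Python sketch (not part of the original paper, whose experiments were written in MATLAB; all names here are ours) evaluates p(x, α) and its derivative and illustrates properties 3, 4 and 6 numerically.

```python
import numpy as np

def p(x, alpha):
    """Chen-Mangasarian smoothing of max(x, 0): p(x, a) = x + log(1 + exp(-a*x)) / a."""
    # np.logaddexp(0, -alpha*x) = log(1 + exp(-alpha*x)), computed without overflow.
    return x + np.logaddexp(0.0, -alpha * x) / alpha

def dp(x, alpha):
    """dp/dx = 1 / (1 + exp(-alpha*x)), the sigmoid approximation of the step function."""
    return 1.0 / (1.0 + np.exp(-alpha * x))

if __name__ == "__main__":
    xs = np.linspace(-2.0, 2.0, 9)
    for alpha in (1.0, 10.0, 100.0):
        gap = np.max(p(xs, alpha) - np.maximum(xs, 0.0))
        # Property 4: the largest gap is p(0, alpha) = log(2)/alpha.
        print(f"alpha={alpha:6.1f}  max gap={gap:.6f}  log(2)/alpha={np.log(2)/alpha:.6f}")
```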
3 Basic assumptions and optimality conditions
For problem (1.1), we assume that f(x) and c_i(x), i = 1, · · ·, m, are real-valued C² functions on R^n. The classical Lagrangian for this problem is

L(x, λ) := f(x) + λ^T C(x),

where λ ∈ R^m_+, R^m_+ denotes the set of nonnegative vectors of R^m, and R^m_{++} its interior. Let x* be a local minimizer of problem (1.1), and let I* = {i | c_i(x*) = 0} = {1, · · ·, r} be the set of active indices of the inequality constraints at the point x*. In this paper we assume that the standard second-order sufficient conditions hold at x*, namely:
C1. The gradients ∇c_i(x*), i = 1, · · ·, r, are linearly independent, so there exists a unique Lagrange multiplier vector λ* ∈ R^m_+ such that

∇_x L(x*, λ*) = ∇f(x*) + Σ_{i=1}^{m} λ*_i ∇c_i(x*) = 0.                      (3.1)

C2. At the point (x*, λ*), the Hessian of the Lagrangian L(x, λ) with respect to x is positive definite on the subspace tangent to the gradients of the active constraints at x*, i.e., d^T ∇²_xx L(x*, λ*) d > 0 for all d ∈ D ⊂ R^n, where

D := { d | d ≠ 0, d^T ∇c_i(x*) = 0, i = 1, · · ·, r }.                       (3.2)
C3. Strong complementary slackness holds for the inequality constraints, i.e.,

λ*_i c_i(x*) = 0   for i = 1, · · ·, m,
λ*_i > 0           for i = 1, · · ·, r,                                      (3.3)
c_i(x*) < 0        for i = r + 1, · · ·, m.

4 The modified penalty function
In this paper, the modified penalty function (MPF) F(x, λ, α): R^n × R^m_+ × R_{++} → R is defined by the formula

F(x, λ, α) := f(x) + Σ_{i=1}^{m} 2λ_i p(c_i(x), α) = f(x) + Σ_{i=1}^{m} 2λ_i [ c_i(x) + (1/α) log(1 + e^{−αc_i(x)}) ].      (4.1)
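A direct transcription of (4.1) and its gradient into Python could look as follows. This is an illustrative sketch only; the function and variable names are ours, not the paper's, and f, c, grad_f and jac_c are user-supplied callables.

```python
import numpy as np

def mpf_value(f, c, x, lam, alpha):
    """Modified penalty function (4.1): F(x, lam, alpha) = f(x) + sum_i 2*lam_i * p(c_i(x), alpha)."""
    cx = c(x)                                       # vector of constraint values c_i(x)
    p_cx = cx + np.logaddexp(0.0, -alpha * cx) / alpha
    return f(x) + 2.0 * np.dot(lam, p_cx)

def mpf_grad(grad_f, c, jac_c, x, lam, alpha):
    """Gradient of (4.1): grad f(x) + sum_i 2*lam_i*(1 + exp(-alpha*c_i(x)))**-1 * grad c_i(x)."""
    cx = c(x)
    w = 2.0 * lam / (1.0 + np.exp(-alpha * cx))     # weights 2*lam_i*s(c_i(x), alpha)
    return grad_f(x) + jac_c(x).T @ w
```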
Because of the properties of p(x, α) in section 2 and the conditions C1-C3 in section 3, we use F(x, λ, α) as an approximate penalty function. Using this MPF, we obtain many good properties, which will be presented and proved in the following sections. If the complementary slackness condition (3.3) holds at the point (x*, λ*), we have

(P1) F(x*, λ*, α) = f(x*) + Σ_{i=1}^{m} 2λ*_i (log 2)/α → f(x*) as α → ∞.

(P2) ∇_x F(x*, λ*, α) = ∇f(x*) + Σ_{i=1}^{m} 2λ*_i (1 + e^{−αc_i(x*)})^{−1} ∇c_i(x*) = ∇f(x*) + Σ_{i=1}^{m} λ*_i ∇c_i(x*) = ∇_x L(x*, λ*) = 0.

(P3) ∇²_xx F(x*, λ*, α) = ∇²_xx f(x*) + Σ_{i=1}^{m} 2λ*_i (1 + e^{−αc_i(x*)})^{−1} ∇²_xx c_i(x*) + Σ_{i=1}^{m} 2αλ*_i e^{−αc_i(x*)} (1 + e^{−αc_i(x*)})^{−2} ∇c_i(x*)∇c_i(x*)^T = ∇²_xx L(x*, λ*) + α∇C(x*)^T Λ* ∇C(x*).

Here ∇C(x) is the Jacobian matrix of the vector function C(x), and Λ* is the diagonal matrix with λ*_i/2 on the diagonal.

Lemma 4.1 Let A be a symmetric n × n matrix, let B be an r × n matrix, and let Λ = diag{λ_i}: R^r → R^r with λ = (λ_1, · · ·, λ_r) > 0, and suppose that Bd = 0 implies ⟨Ad, d⟩ ≥ μ⟨d, d⟩ with μ > 0. Then there exists a number α_0 > 0 such that for any 0 < β < μ and α ≥ α_0 we have

⟨(A + αB^T ΛB)x, x⟩ ≥ β⟨x, x⟩,   ∀ x ∈ R^n.

Proof: For any x ∈ R^n with Bx = 0 we have ⟨Ax, x⟩ ≥ μ⟨x, x⟩ by the hypothesis of the lemma, which implies ⟨(A + αB^T ΛB)x, x⟩ ≥ μ⟨x, x⟩ > β⟨x, x⟩ for every α > 0. If Bx ≠ 0, then ⟨B^T ΛBx, x⟩ = ⟨ΛBx, Bx⟩ > 0; since A is symmetric, ⟨Ax, x⟩ is bounded below on the unit sphere, so the term α⟨B^T ΛBx, x⟩ dominates for α large enough and there exists α_0 > 0 such that ⟨(A + αB^T ΛB)x, x⟩ ≥ μ⟨x, x⟩ for any α ≥ α_0. Therefore, for any 0 < β < μ and α ≥ α_0 we have ⟨(A + αB^T ΛB)x, x⟩ ≥ β⟨x, x⟩ for all x ∈ R^n.
Compared with the classical penalty function, the MPF is defined at x* and has the same order of smoothness as the functions f(x) and c_i(x), i = 1, · · ·, m. Moreover, from property (P3) we have

Theorem 4.2 Suppose that the second-order optimality conditions C1-C3 hold at x*. Then there exists a number α_0 > 0 such that for any α ≥ α_0 the matrix ∇²_xx F(x*, λ*, α) is positive definite, i.e., F(x, λ*, α) is strongly convex at x*.

This theorem follows from Lemma 4.1. In the following sections we present the MPF method for problem (1.1), discuss its properties, and give proofs of them. The method combines penalty and Lagrangian techniques.
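The effect described by Lemma 4.1 and Theorem 4.2 is easy to observe numerically. The sketch below uses arbitrarily chosen illustrative data (an indefinite symmetric A that is positive definite on the null space of B, which is our own example, not the paper's): the smallest eigenvalue of A + αBᵀΛB becomes positive once α is large enough.

```python
import numpy as np

# A is symmetric and indefinite, but positive definite on the null space of B
# (the analogue of the tangent subspace D in condition C2); Lambda > 0 on the diagonal.
A = np.array([[2.0, 0.0, 0.0],
              [0.0, -1.0, 0.0],
              [0.0, 0.0, 3.0]])
B = np.array([[0.0, 1.0, 0.0]])          # Bd = 0 forces d_2 = 0, where A is positive definite
Lam = np.diag([0.5])                     # analogue of Lambda* = diag(lambda_i*/2) > 0

for alpha in (0.1, 1.0, 10.0):
    M = A + alpha * B.T @ Lam @ B
    print(f"alpha={alpha:5.1f}  min eigenvalue={np.linalg.eigvalsh(M).min():+.3f}")
# For small alpha the smallest eigenvalue is negative; for alpha large enough it is
# positive, as guaranteed by Lemma 4.1.
```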
5 The modified penalty function method
If the second-order sufficient conditions C1-C3 hold at a solution x* of problem (1.1), then, by properties P1-P3 of the function p(x, α) and Theorem 4.2, to solve problem (1.1), i.e., to find the minimizer x*, we need only find the unconstrained minimizer of the smooth, strongly convex function F(x, λ*, α) in a neighborhood of x*, for a fixed but large enough α > 0. If the Lagrange multiplier vector λ ∈ R^m_{++} is close enough to λ*, then

x̂ := x̂(λ, α) := arg min { F(x, λ, α) | x ∈ R^n }

is a good approximation to x*. It turns out that the minimizer x̂ can be used to improve the approximation λ to the optimal Lagrange multipliers λ* when the fixed penalty parameter α > 0 is large enough. Therefore, we can solve problem (1.1) starting from any initial point (x, λ), for a large enough penalty parameter α, by alternately minimizing F(x, λ, α) and updating the Lagrange multipliers λ. Assuming that the unconstrained minimizer x̂ = x̂(λ, α) exists, we have

∇_x F(x̂, λ, α) = ∇f(x̂) + Σ_{i=1}^{m} 2λ_i (1 + e^{−αc_i(x̂)})^{−1} ∇c_i(x̂) = 0.                  (5.2)

Defining new Lagrange multipliers by the formula

λ̂_i := 2λ_i (1 + e^{−αc_i(x̂)})^{−1},   i = 1, · · ·, m,                                          (5.3)

we can rewrite (5.2) as

∇_x F(x̂, λ, α) = ∇f(x̂) + Σ_{i=1}^{m} λ̂_i ∇c_i(x̂) = ∇_x L(x̂, λ̂) = 0.                            (5.4)

Therefore, the minimizer x̂ of F(x, λ, α) is a stationary point of the classical Lagrangian L(x, λ̂), and hence x̂ is a minimizer of this Lagrangian in the convex case. By Theorem 7.4 below,

‖λ̂ − λ*‖ ≤ γ‖λ − λ*‖                                                                              (5.5)

and

‖x̂ − x*‖ ≤ γ‖λ − λ*‖,                                                                             (5.6)

where the norm ‖·‖ denotes the l_∞-norm ‖·‖_∞ and 0 < γ ≤ 1/2 is independent of α. Namely, by finding an unconstrained minimizer of the MPF F(x, λ, α) in x and then updating the Lagrange multipliers, we reduce the distance between (x, λ) and (x*, λ*), and the contraction can be strengthened by increasing the penalty parameter α > 0. In section 7 we prove that the MPF method with a fixed but large enough penalty parameter α_k > 0 converges linearly under the second-order optimality conditions, and that superlinear convergence can be obtained by increasing the penalty parameter α_k from step to step. The MPF method iterates as in the following algorithm.

Algorithm 5.1:
Step 0: given λ^0, α_0, ε > 0, set k := 0;
Step 1: solve min_{x∈R^n} F(x, λ^k, α_k) to obtain x^{k+1} := arg min { F(x, λ^k, α_k) | x ∈ R^n };
        compute α_{k+1} := nα_k, where n > 1 is a fixed scaling constant;
        compute λ^{k+1}_i := 2λ^k_i (1 + e^{−α_k c_i(x^{k+1})})^{−1} for i = 1, · · ·, m;
        set k := k + 1;
Step 2: if ‖∇_x L(x^k, λ^k)‖ < ε and ‖C^{(+)}(x^k)‖ < ε (where C^{(+)}(x) denotes the vector of constraint violations max{c_i(x), 0}), then stop; otherwise, go to Step 1.

To solve the subproblem min { F(x, λ^k, α_k) | x ∈ R^n } in Step 1, we may use any method of unconstrained minimization. In this paper we apply a trust region algorithm, which guarantees convergence more reliably. A small Python sketch of the outer loop of Algorithm 5.1 is given below for illustration.
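The following sketch of the outer loop is ours and is not the authors' MATLAB implementation; for brevity it uses scipy's general-purpose BFGS minimizer for the inner problem instead of the trust region method of Section 6, and the function names and defaults are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def mpf_solve(f, grad_f, c, jac_c, x0, lam0, alpha0=1.0, scale=10.0, eps=1e-4, max_outer=50):
    """Sketch of Algorithm 5.1: alternately minimize F(., lam_k, alpha_k) and update lam."""
    x, lam, alpha = np.asarray(x0, float), np.asarray(lam0, float), alpha0
    for _ in range(max_outer):
        def F(z):                          # modified penalty function (4.1)
            cz = c(z)
            return f(z) + 2.0 * lam @ (cz + np.logaddexp(0.0, -alpha * cz) / alpha)
        def gF(z):                         # its gradient
            cz = c(z)
            return grad_f(z) + jac_c(z).T @ (2.0 * lam / (1.0 + np.exp(-alpha * cz)))
        x = minimize(F, x, jac=gF, method="BFGS").x        # Step 1: inner minimization
        lam = 2.0 * lam / (1.0 + np.exp(-alpha * c(x)))    # multiplier update (5.3)
        alpha *= scale                                     # alpha_{k+1} := n * alpha_k
        grad_L = grad_f(x) + jac_c(x).T @ lam              # Step 2: stopping test
        if max(np.linalg.norm(grad_L, np.inf),
               np.linalg.norm(np.maximum(c(x), 0.0), np.inf)) < eps:
            break
    return x, lam
```

Note that the multiplier update uses the same α as the inner minimization, in accordance with (5.3), and α is then increased to pursue the superlinear rate discussed in Section 7.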
6 Trust region algorithm for unconstrained minimization
At the k-th iteration, we calculate a trial step by solving the subproblem

min φ_k(d) = g_k^T d + (1/2) d^T B_k d
s.t. ‖d‖ ≤ Δ_k,                                                              (6.7)

where g_k = ∇_x F(x^k, λ^k, α_k), B_k is the Hessian of F(x, λ^k, α_k) at x^k, and Δ_k is the trust region radius. In a trust region algorithm, after computing a trial step d_k, one must decide whether it is acceptable. In this paper we employ the MPF F(x, λ, α) of (4.1) as the merit function. If the value of the merit function F(x, λ, α) decreases enough, then we accept d_k and update x^k by x^{k+1} = x^k + d_k; otherwise we reject d_k. To evaluate the reduction, we define the actual reduction of the merit function by

Ared_k := F(x^k, λ^k, α_k) − F(x^k + d_k, λ^k, α_k),

where d_k is the solution of (6.7), and the predicted reduction by

Pred_k := φ_k(0) − φ_k(d_k).

Let r_k := Ared_k / Pred_k. If r_k > β_1, where β_1 is a prescribed parameter, then we accept the trial step; otherwise we reject it. The trust region algorithm for the unconstrained minimization problem is as follows.

Algorithm 6.1:
Step 0: set x^0 ∈ R^n, B_0 ∈ R^{n×n}, λ^0 > 0, α_0 > 0, Δ_0 > 0, ε > 0, h_1 > 1 > h_2 > 0, 0 < β_1 < β_2 < 1, k := 0;
Step 1: if ‖g_k‖ ≤ ε, then stop; otherwise solve (6.7) to obtain the step d_k;
Step 2: compute r_k; if r_k ≤ 0, then set x^{k+1} := x^k and Δ_{k+1} := h_2‖d_k‖; otherwise, if r_k < β_1, then set x^{k+1} := x^k + d_k and Δ_{k+1} := h_2‖d_k‖; otherwise go to Step 3;
Step 3: if β_1 ≤ r_k ≤ β_2, then accept the trial step and set x^{k+1} := x^k + d_k, Δ_{k+1} := Δ_k; otherwise go to Step 4;
Step 4: if r_k > β_2, then set x^{k+1} := x^k + d_k and Δ_{k+1} := h_1Δ_k. Set k := k + 1 and go to Step 1.

A Python sketch of these acceptance rules is given below for illustration. In the next section we prove that the MPF method with a fixed penalty parameter α_k converges linearly whenever the second-order optimality conditions are fulfilled and α_k > 0 is large enough. Since ill-conditioning of the Hessian ∇²_xx F(x, λ, α_k) becomes less likely as x^k → x*, it is useful in practice to increase α_k step by step to obtain superlinear convergence.
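A minimal version of Algorithm 6.1 can be written as follows. This is our own sketch: for brevity the subproblem (6.7) is solved only approximately with a Cauchy-point step rather than exactly, and the constants are illustrative.

```python
import numpy as np

def trust_region_min(F, grad_F, hess_F, x0, delta0=1.0, eps=1e-4,
                     h1=2.0, h2=0.5, beta1=0.25, beta2=0.75, max_iter=200):
    """Sketch of Algorithm 6.1 with a Cauchy-point approximation of subproblem (6.7)."""
    x, delta = np.asarray(x0, float), delta0
    for _ in range(max_iter):
        g, B = grad_F(x), hess_F(x)
        if np.linalg.norm(g) <= eps:                       # Step 1: stopping test
            break
        gBg = g @ B @ g
        tau = 1.0 if gBg <= 0 else min(1.0, np.linalg.norm(g) ** 3 / (delta * gBg))
        d = -tau * delta / np.linalg.norm(g) * g           # Cauchy point of (6.7)
        pred = -(g @ d + 0.5 * d @ B @ d)                  # predicted reduction Pred_k
        ared = F(x) - F(x + d)                             # actual reduction Ared_k
        r = ared / pred if pred > 0 else -1.0
        if r > 0:                                          # accept the trial step
            x = x + d
        if r < beta1:                                      # shrink the radius
            delta = h2 * np.linalg.norm(d)
        elif r > beta2:                                    # expand the radius
            delta = h1 * delta
    return x
```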
7 Convergence results
We give some assumptions on problem (1.1) before presenting the convergence results of the MPF method. We assume that

max_{1≤i≤m} max { |c_i(x)| | x ∈ X } > 0,                                                         (7.8)

where X is the feasible region of problem (1.1). Further, we also assume that the gradients ∇c_i(x*), i = 1, · · ·, r, are linearly independent. In addition, we define the sets

D_i(·) := D_i(λ*, α, δ, ε) := { λ_i | λ_i ≥ ε, |λ_i − λ*_i| ≤ δα, α ≥ α_0 }    if i = 1, · · ·, r,
                              { λ_i | 0 ≤ λ_i ≤ δα, α ≥ α_0 }                  if i = r + 1, · · ·, m,

where δ > 0 is a small enough number, 0 < ε < min { λ*_i | i = 1, · · ·, r } and α_0 is large enough. In the following we use the shorthand D(·) := D(λ*, α_0, δ, ε) := { (λ, α) | λ_i ∈ D_i(·), i = 1, · · ·, m, α ≥ α_0 }. In addition, let σ := max { c_i(x*) | r + 1 ≤ i ≤ m }, let I^r be the r × r identity matrix and O^{r×r} the r × r zero matrix, let M > 0 be large enough so that ‖λ‖ ≤ M, and let S(y, ε) := { x ∈ R^n | ‖x − y‖ ≤ ε }. The convergence results in this paper are based on the following condition.

Condition 7.1 f(x) ∈ C², c_i(x) ∈ C², i = 1, · · ·, m, and the conditions C1-C3 and (7.8) hold. Then there exist α_0 > 0 and a small enough number δ > 0 such that for any 0 < ε < min { λ*_i | 1 ≤ i ≤ r } and any (λ, α) ∈ D(λ*, α_0, δ, ε) with ‖λ‖ ≤ M the following lemma and theorems hold.

Lemma 7.2 Suppose that Condition 7.1 is satisfied. Then the matrix

∇Φ_∞ = ∇_{xλ̂_(r)} Φ_α(x*, λ*_(r), 0, ∞) = [ ∇²_xx L        ∇C_(r)^T
                                              Λ*_(r)∇C_(r)   O^{r×r} ]                            (7.9)

(with the notation introduced in the proof of Theorem 7.3 below) is nonsingular.

Proof: Suppose ∇Φ_∞ y = 0 for a vector y = (d, v) ∈ R^{n+r}. Then we must have

∇²_xx L d + ∇C_(r)^T v = 0                                                                        (7.10)

and

Λ*_(r) ∇C_(r) d = 0.                                                                              (7.11)

In view of λ*_i > 0, i = 1, · · ·, r, the expression (7.11) implies that ∇C_(r) d = 0. Multiplying (7.10) by d^T we obtain d^T ∇²_xx L d + v^T ∇C_(r) d = 0, and therefore ∇C_(r) d = 0 implies d^T ∇²_xx L d = 0. Because of C2, this is true if and only if d = 0. Then (7.10) reduces to ∇C_(r)^T v = 0, and since by C1 the gradients ∇c_i(x*), i = 1, · · ·, r, are linearly independent, v = 0. Hence ∇Φ_∞ y = 0 implies y = 0, i.e., the matrix ∇Φ_∞ is nonsingular.

Theorem 7.3 Suppose that Condition 7.1 is satisfied. Then F(x, λ, α) has a minimizer x̂ = x̂(λ, α) with respect to x within some open ball centered at x*; i.e., there exists a vector x̂ = x̂(λ, α) = arg min { F(x, λ, α) | x ∈ S(x*, ε) } such that ∇_x F(x̂, λ, α) = 0.
Let t_i := (λ_i − λ*_i)α^{−1}, i = 1, · · ·, m, t := (t_1, · · ·, t_m)^T, and S(0, δ) := { t = (t_1, · · ·, t_m) | |t_i| ≤ δ }. In the following proof we denote t_(r) := (t_1, · · ·, t_r)^T, λ̂_(r) := (λ̂_1, · · ·, λ̂_r)^T and λ̂_(m−r) := (λ̂_{r+1}, · · ·, λ̂_m)^T.

Proof: Note that λ*_i = 0, i = r + 1, · · ·, m, so by the definition (5.3) of λ̂ and of t_i,

λ̂_i(x, t, α) = 2(αt_i + λ*_i)(1 + e^{−αc_i(x)})^{−1}    if i = 1, · · ·, r,
               2αt_i (1 + e^{−αc_i(x)})^{−1}            if i = r + 1, · · ·, m.

Let

h(x, t, α) := Σ_{i=r+1}^{m} λ̂_i(x, t, α)∇c_i(x) = 2α Σ_{i=r+1}^{m} t_i (1 + e^{−αc_i(x)})^{−1} ∇c_i(x).

Then for any α > 0, x ∈ S(x*, ε) and t ∈ S(0, δ) the function h(x, t, α) is smooth enough and

h(x*, 0, α) = O ∈ R^n,   ∇_x h(x*, 0, α) = O^{n,n},   ∇_{λ̂_(r)} h(x*, 0, α) = O^{n,r}.

Consider the function Φ_α(x, λ̂_(r), t, α): R^{n+r+m+1} → R^{n+r} defined by

Φ_α(x, λ̂_(r), t, α) := ( ∇f(x) + Σ_{i=1}^{r} λ̂_i ∇c_i(x) + h(x, t, α)
                          2α^{−1}(αt_(r) + λ*_(r))(1 + e^{−αc_(r)(x)})^{−1} − α^{−1}λ̂_(r) ).

From C3 and h(x*, 0, α) = O we obtain

Φ_α(x*, λ*_(r), 0, α) = 0,   ∀ α > 0.

Denote ∇_{xλ̂_(r)}Φ_α := ∇_{xλ̂_(r)}Φ_α(x*, λ*_(r), 0, α), ∇²_xx L := ∇²_xx L(x*, λ*), ∇C_(r) := ∇C_(r)(x*), and Λ*_(r) := diag{ λ*_i/2 }_{i=1}^{r}: R^r → R^r, with λ*_i > 0, i = 1, · · ·, r. Because ∇_x h(x*, 0, α) = O^{n,n} and ∇_{λ̂_(r)} h(x*, 0, α) = O^{n,r}, we obtain

∇Φ_α = ∇_{xλ̂_(r)}Φ_α(x*, λ*_(r), 0, α) = [ ∇²_xx L        ∇C_(r)^T
                                             Λ*_(r)∇C_(r)   −α^{−1}I^r ].

Letting α → ∞ in this matrix, we have

∇Φ_∞ = ∇_{xλ̂_(r)}Φ_α(x*, λ*_(r), 0, ∞) = [ ∇²_xx L        ∇C_(r)^T
                                             Λ*_(r)∇C_(r)   O^{r×r} ].

From Lemma 7.2 we know that the matrix ∇Φ_∞ is nonsingular, so there exists α_0 > 0 such that for any α ≥ α_0 the matrix ∇Φ_α is nonsingular, and there exists a constant ρ > 0, independent of α ≥ α_0, such that ‖∇Φ_α^{−1}‖ ≤ ρ. Therefore, from the implicit function theorem applied to the system Φ_α(x, λ̂_(r), t, α) = 0 at the point (x*, λ*_(r), 0, α) (recall that c_i(x) ∈ C², i = 1, · · ·, m), there exist smooth vector functions x(t, α) and λ̂_(r)(t, α) such that x(0, α) = x*, λ̂_(r)(0, α) = λ*_(r), and, in the neighborhood S(0, δ) = { (s_1, · · ·, s_m) | |s_i| ≤ δ, i = 1, · · ·, m } and for α ≥ α_0,

Φ_α(x(t, α), λ̂_(r)(t, α), t, α) ≡ 0.                                                             (7.12)

Consequently, for x(·) ≡ x(t, α) we obtain from the first n equations of (7.12)

∇_x F(x(·), λ̂(·), ·) = ∇f(x(·)) + Σ_{i=1}^{m} λ̂_i(·)∇c_i(x(·)) = 0.                              (7.13)

Namely, x(·) is a stationary point of the MPF. To prove that x(·) is a local minimizer of this function, we will use the estimates (5.5) and (5.6), which will be proved in Theorem 7.4, and the strong convexity of F(x, λ, α), which will be proved in Theorem 7.5.

Theorem 7.4 Suppose Condition 7.1 is satisfied. Then for the pair of vectors x̂ and λ̂ = λ̂(λ, α) = 2λ(1 + e^{−αC(x̂)})^{−1}, with α large enough, the estimates

‖λ̂ − λ*‖ ≤ γ‖λ − λ*‖   and   ‖x̂ − x*‖ ≤ γ‖λ − λ*‖                                               (7.14)

hold, where 0 < γ ≤ 1/2 is independent of α.

Proof:
ˆ (r)(t, α) − λ∗ ≤ ε0 , max x(t, α) − x∗ , λ such that λ(r)(t, α), t, α) ≡ Φα(x(·), ˆλ(r)(·), ·) ≡ 0 Φα(x(t, α), ˆ
(7.15)
and λ(r)(t, α), t, α) ≤ 2ρ, ∀t ∈ S(0, δ) α ≥ α0 .
∇xλˆ Φα (x(t, α), ˆ (r)
Let σ=
max {ci (x∗ )} < 0
r+1≤i≤m
and
ˆ i (t, α) = 2αti 1 + e−αci (x(t,α)) λ
−1
, i = r + 1, · · · , m.
(7.16)
ˆ (m−r) (·) . If δ is small enough Note that λ∗i = 0, i = r + 1, · · ·, m. First we estimate the λ then for any t ∈ S(0, δ) and any α ≥ α0 we have x(t, α) − x(0, α) = x(·) − x∗ ≤ ε and ci (x(t, α)) ≤ σ2 . Therefore
ˆ i = 2(λi − λ∗ ) 1 + e−αci (x(·)) λ i Hence we obtain and
−1
, i = r + 1, · · · , m.
ˆ i ≤ 2e 12 σα λi , i = r + 1, · · ·, m, λ
1 σα ˆ (m−r) (·) − λ∗ ˆ (m−r) = λ 2
λ(m−r) (·) − λ∗(m−r) .
λ (m−r) ≤ 2e
So we proved that when i = r +1, · · ·, m, the first inequality of (7.14) is true. In the following we will give the proof of the rest of (7.14). Rewriting (7.13) we obtain ∇f (x(t, α)) +
r
ˆ i(t, α)∇ci(x(t, α)) + h(x(t, α), t, α) = 0, λ
i=1
9
(7.17)
ˆ i(t, α) = 2(αti + λ∗ ) 1 + e−αci (x(t,α)) λ i
−1
, i = 1, · · · , r.
(7.18)
Differentiating (7.17) with respect to t, we have ∇2xx f (x(·))∇ tx(·) +
r
ˆ i (·)∇2 ci (x(·))∇ tx(·) + ∇x C(r)(x(·)) ∇t λ ˆ (r)(·) + ∇t h(x(·), ·) ≡ 0, λ xx
i=1
i.e.,
ˆ (r)(·) ≡ −∇t h(x(·), ·), ∇2xx L(x(·), ˆλ(r)(·))∇ tx(·) + ∇x C(r) (x(·)) ∇tλ
i.e.,
λ(r)(·)) ∇x C(r) (x(·)) ∇2xx L(x(·), ˆ
∇tx(·) ˆ (r)(·) ∇t λ
where ∇2xx L(x(·), ˆλ(r)(·)) = ∇2xxf (x(·)) +
r
≡ −∇t h(x(·), ·),
(7.19)
ˆ i(·)∇2 ci (x(·)), λ xx
i=1
∇t x(·) = Jt (x(·)) = (∇txj (·)), j = 1, · · ·, n) ∈ Rn,m , ˆ (r),t(·)) = (∇tλ ˆ (r)(·) = Jt (λ ˆ i (·), i = 1, · · · , r) ∈ Rr,m , ∇t λ ∇t h(x(·), ·) =
m
ˆ i(x(·), ·)∇2 ci (x(·))∇ tx(·) λ xx
i=r+1
ˆ (m−r)(x(·), ·) ∈ Rn,m +∇x C(m−r) (x(·)) ∇t λ Let
Dr (·) = diag 1 + e−αci (x(·))
Then
r
i=1
Dr−1 (·) = diag 1 + e−αci (x(·))
∈ Rr,r .
−1 r
.
i=1 α−1 4
Differentiating (7.18) with respect to t and multiplying both sides to
we obtain
diag
α−1 ˆ 1 1 ∇t λ(r)(·) = − Dr−1 (·); Or,m−r . αti + λ∗i e−αci (x(·)) Dr−2 (·)∇xC(r) (x(·))∇ tx(·) − 2 4 2
Multiply both sides of the above system to Dr2 (·) we obtain
α−1 2 1 ˆ (r)(·) = − 1 D1 (·); Or,m−r , Dr (·)∇tλ diag αti + λ∗i e−αci (x(·)) (·)∇xC(r) (x(·))∇ tx(·) − 2 4 2 r
i.e.,
1 α−1 2 Dr (·) diag αti + λ∗i e−αci (x(·)) (·)∇xC(r) (x(·)) 2 4
Let
∇Φ(·) =
ˆ (r)(·) ∇2xx L x(·), λ
∇tx(·) ˆ (r)(·) ∇t λ
diag 12 αti + λ∗i e−αci (x(·)) (·)∇xC(r) (x(·))
10
1 = − Dr (·); Or,m−r (7.20) . 2
∇xC(r) (x(·)) α−1 2 4 Dr (·)
.
Then from (7.19) and (7.20) we obtain
∇t x(·) ˆ (r)(·) ∇t λ
= ∇Φ
−1
(·)
−∇t h(x(·), ·)
= ∇Φ−1 (·)R(·).
− 12 Dr (·); Or,m−r
(7.21)
In order to estimate the norm of the matrix ∇Φ−1 (·) and the matrix R(·), we will consider the matrices ∇t h(x(·), ·) and Dr (·) in more detail. Note that m ˆ i (x, t, α)∇ci(ˆ h(x, t, α) = x), λ i=r+1
Let Dm−r (·) =
ˆ i (x, t, α) = 2αti 1 + e−αci (x) λ
diag 1 + e−αci x(·)
m
i=r+1
−1
, i = r + 1, · · · , m.
, t(m−r) = (ti , i = i + 1, · · · , m), D(t(m−r) ) =
m [diagti ]m i=r+1 , D C(m−r) (x(·)) = [diagc i (x(·))]i=r+1 . Then
∇t h (x(·), ·) =
m
ˆ i(x(·), ·)∇2 ci (x(·))∇ tx(·) + ∇x C(m−r) (x(·)) ∇t λ ˆ (m−r) (x(·), ·), λ xx
i=r+1
ˆ m−r (·) = ∇t λ ˆ i(·), i = r + 1, · · · , m ∇t λ
−1 −2 = Om−r,r ; 2αDm−r (·) + 2α2 diag ti e−αci (x) Dm−r (·)∇xC(m−r) (x(·))∇ tx(·).
In the following, we consider the system (7.21) for t = 0 and α ≥ α0 . First of all note that x(0, α) = 0, ˆ (r)(0, α) = λ∗ = (λ∗ , · · · , λ∗) > 0, λ r 1 (r) ∗ ˆ λi (x(0, α), 0, α) = λi = 0, i = r + 1, · · · , m, ci (x(0, α)) = ci(x∗ ) ≤ σ < 0, i = r + 1, · · · , m,
and also
Dr (0, α) = 2I r , Dr2 = 4I r , D t(m−r)
= 0 = Om−r,m−r ,
t(m−r) σI m−r .
D C(m−r) (x∗ ) = [diagci (x∗ )]m i=r+1 ≤ Further, we obtain
−1 (x(0, α)) 2αDm−r
= 2α diag 1 +
e−αci (x)
ˆ (m−r) (0, α) = Om−r,r , 2αD −1 (x(0, α)) λ m−r
= O
m−r,r
−1 m
−αci (x)
, 2α diag 1 + e
≤ 2αeσα I m−r ,
i=r+1
−1 m
≤ [O m−r,r , 2αeσαI m−r ] ,
i=r+1
∇xλˆ Φ(0, α) = ∇Φα , (r)
ˆ (m−r)(0, α) = ∇C(m−r) (x∗ ) ∇C(m−r) (x∗ ) ∇λ ∇t h(x(0, α), 0, α) =
= Om−r,r , 2α diag 1 + e−αci (x)
−1 m
.
i=r+1
Then for the norm of the matrix ∇t h(x(0, α), 0, α) we have the estimate
∇th(x(0, α), 0, α) ≤ 2αeσα ∇C(m−r) (x∗ ) . 11
ˆ (m−r) (0, α) we have So for the matrix ∇t x(0, α) and ∇tλ
∇t x(0, α) ˆ (r)(0, α) ∇t λ
=
∇Φ−1 α (0, α)
−∇t h(x(0, α), 0, α) [I r ; Or,m−r ]
= ∇Φ−1 α R0 .
(7.22)
σα ∗ Considering the estimate Φ−1 α ≤ ρ and ∇t h(x(0, α), 0, α) ≤ 2αe ∇C(m−r) (x ) , from (7.22) we obtain
≤ ρ 2αeσα ∇C(m−r) (x∗ ) + I r
ˆ (r)(0, α) max ∇tx(0, α) , ∇tλ
= ρ 2αeσα ∇C(m−r) (x∗ ) + 1 . So for a small enough number δ > 0 and any t ∈ S(0, δ) and α ≥ α0 , the inequality ˆ ∇Φ−1 α (x(τ t, α), λ(r)(τ t, α))R(x(τ t, α), (τ t, α)) ≤ 2ρ 2αeσα ∇C(m−r) (x∗ ) + 1 ≤ 4ραLeσα
(7.23)
holds for any 0 ≤ τ ≤ 1 and any α ≥ α0 , where ∇C(m−r) (x∗ ) + 1 ≤ L. At the same time we have
x(t, α) − x∗ ˆ λ(r)(t, α) − λ∗
=
x(t, α) − x(0, α) ˆ (r)(0, α) ˆ λ(r)(t, α) − λ
1
=
0
∇Φ−1 x(τ t, α), ˆλ(r)(τ t, α) R(x(τ t, α), (τ t, α))[t]dτ α
Therefore, considering the estimate (7.22) and (7.23) we obtain
ˆ (r)(t, α) − λ∗ ≤ 4ραLeσα t = 4ραLeσα α−1 λ − λ∗ max x(t, α) − x∗ , λ
∗ ˆ ˆ Let x ˆ(λ, α) = x λ−λ α , α , λ(λ, α) = λ(r) σα 4ραLe }, we obtain
λ−λ∗ α ,α
ˆ ∗ max x(λ, α) − x∗ , λ (r) (λ, α) − λ
ˆ (m−r) ,λ
≤ Cα−1 λ − λ∗
λ−λ∗ α ,α
1
= max 2e 2 σα , 4ρLeσα λ − λ∗ ≤ γ λ − λ∗ . Because when α → ∞
1
2e 2 σα → 0, 4ρLeσα → 0,
there exists large enough α0 , 0 < γ ≤
1 2
is true for any α ≥ α0 .
Theorem 7.5 Suppose Conditions 7.1 is satisfied, then the function F (x, λ, α) is strongly convex in a neighborhood of x∗ . Proof: Now we will prove that F (x, λ, α) is strongly convex in a neighborhood of x ˆ. From the equalities (7.14) we know that x ˆ=x ˆ(λ, α) satisfy the necessary optimality condition for the ˆ α). So we prove the matrix ∇2 F (ˆ x, λ, α) is positive definite only. function F (x, λ, xx We know that ∇xF (x, λ, α) = ∇f (x) +
m
2λi 1 + e−αci (x)
i=1
12
−1
1
. Then for C = max 2αe 2 σα ,
∇ci (x)
and ∇2xx F (x, λ, α)
=
∇2xxf (x) +
m
+
m
2λi 1 + e−αci (x)
i=1 −αci (x)
2λie
−αci (x)
1+e
−1
−2
∇2xx ci (x) .
∇ci (x)∇ci(x)
i=1
ˆ α), we obtain that ˆ i = λ(λ, Therefore, by the virtue of λ r
x, λ, α) = ∇2xxf (ˆ x) + ∇2xxF (ˆ +α +α
r
ˆ 2 ci (ˆ x) + λ∇ xx
i=1 −αci (ˆ x)
2λie
i=1 m
m
i=r+1 −2
−αci (ˆ x)
1+e
2λi 1 + e−αci (ˆx)
2λie−αci (ˆx) 1 + e−αci (ˆx)
−1
∇2xx ci (ˆ x)
∇ci(ˆ x)∇ci(ˆ x)
−2
∇ci (ˆ x)∇ci(ˆ x) ,
i=r+1
ˆ(λ, α) near x∗ and holds for all (λ, α) ∈ D(λ∗ , α0 , δ, ε). From (7.14) for large enough α0 we have x ∗ ∗ ∗ ˆ ˆ ˆ → x and λ → λ∗ we obtain λ(λ, α) near λ uniformly in (λ, α) ∈ D(λ , α0 , δ, ε). So from x r
x) + ∇2xxf (ˆ α
r
ˆ 2 ci (ˆ x) → ∇2xx L(x∗ , λ∗), λ∇ xx
i=1 −αci (ˆ x)
2λi e
1 + e−αc(ˆx)
−2
∇ci(ˆ x)∇ci(ˆ x) → α∇C(r) (x∗ )Λ∗(r)∇C(r) (x∗ ) .
i=1
and
where Λ∗(r) = diag
λ∗i r 2 i=1 m
x) → ci (x∗ ) ≤ σ < 0, i = r + 1, · · · , m, ci (ˆ Therefore
2λi 1 + e−αci (ˆx)
i=r+1 m
α
−1
∇2xx ci(ˆ x) → On,n ,
2λi e−αci (ˆx) 1 + e−αci (ˆx)
−2
∇ci (ˆ x)∇ci (ˆ x) → On,n .
i=r+1
So for a sufficiently large number α0 we have x, λ, α) ∼ ∇2xx F (ˆ = ∇2xx L(x∗ , λ∗ ) + α∇C(r) (x∗ ) Λ∗(r)∇C(r)(x∗ ) = ∇2xx F (x∗ , λ∗, α), ∀(λ, α) ∈ D(λ∗ , α0 , δ, ε). x, λ, α) is positive definite. So the function From C2 and Lemma 4.1 we have the matrix ∇2xx F (ˆ F (x, λ, α) is strongly convex in a neighborhood of x ˆ. In view of (7.14) x ˆ → x∗ , we think that x, λ, α) = 0, the function F (x, λ, α) is strongly convex in a neighborhood of x∗ . Then from ∇x F (ˆ we know that x ˆ is a local minimum of F (x, λ, α). Theorem 7.6 Let f (x) ∈ C 2 , ci(x) ∈ C 2 , i = 1, · · · , m, and the conditions C1-C3 hold at a solution x∗ to problem (1.1) and assumption (7.8) holds. Then there exist a large enough number α0 > 0 and small enough number δ > 0 that for any 0 < ε < min{λ∗i | 1 ≤ i ≤ r} and ˆ in Theorem 7.3 is the global minimizer of F (x, λ, α). any (λ, α) ∈ D(λ∗ , α0 , δ, ε) the vector x 13
Proof: From Theorem 7.3 we know that x̂ is a minimizer of F(x, λ, α) in a neighborhood of x*, and C(x*) ≤ 0, λ ≥ 0. Note that

F(x̂, λ, α) ≤ F(x*, λ, α) = f(x*) + Σ_{i=1}^{m} 2λ_i [ c_i(x*) + (1/α) log(1 + e^{−αc_i(x*)}) ]
            = f(x*) + Σ_{i=1}^{r} 2λ_i [ c_i(x*) + (1/α) log(1 + e^{−αc_i(x*)}) ] = f(x*) + Σ_{i=1}^{r} 2λ_i (log 2)/α.                        (7.24)

Suppose that there exist a vector x̃ and a number μ̃ > 0 such that F(x̃, λ, α) ≤ F(x̂, λ, α) − μ̃. Then from (7.24) we have

F(x̃, λ, α) ≤ f(x*) + Σ_{i=1}^{r} 2λ_i (log 2)/α − μ̃,

where

F(x̃, λ, α) = f(x̃) + Σ_{i=1}^{m} 2λ_i [ c_i(x̃) + (1/α) log(1 + e^{−αc_i(x̃)}) ].

Let I_−(x̃) := { i : c_i(x̃) < 0 }. Then from the above inequality we obtain

f(x̃) ≤ f(x*) − Σ_{i=1}^{m} 2λ_i [ c_i(x̃) + (1/α) log(1 + e^{−αc_i(x̃)}) ] + Σ_{i=1}^{r} 2λ_i (log 2)/α − μ̃
     ≤ f(x*) − Σ_{i∈I_−(x̃)} (2λ_i/α) log(1 + e^{αc_i(x̃)}) + Σ_{i=1}^{r} 2λ_i (log 2)/α − μ̃.

So, from the assumption that the c_i(x) and the norm of λ are bounded, for large enough α_0 and any α ≥ α_0 we obtain

− Σ_{i∈I_−(x̃)} (2λ_i/α) log(1 + e^{αc_i(x̃)}) + Σ_{i=1}^{r} 2λ_i (log 2)/α → 0 as α → ∞.

Therefore, for large enough α_0 and any α ≥ α_0 we have

f(x̃) ≤ f(x*) − (1/2)μ̃.                                                                            (7.25)

On the other hand, we have f(x̃) ≥ min { f(x) | x ∈ R^n }. Further, we obtain

f(x̃) ≥ f(x*) − α^{−1} Σ_{i=1}^{m} λ*_i,

and hence for large enough α_0 and any α ≥ α_0 we get

f(x̃) ≥ f(x*) − (1/4)μ̃,

which contradicts (7.25). This completes the proof of the theorem.
8 Numerical examples
To illustrate the behavior of the method proposed in this paper, we wrote MATLAB codes (Version 6.3) for the following examples and ran them on a PC with Windows 2000 (1000 MHz, 128 MB main memory); the tolerance was 10^{−4}.

Example 1 ([5])

min f(x) = (x_1 − 2)² + (x_2 − 1)²
s.t. x_1² − x_2 ≤ 0,
     x_1 + x_2 − 2 ≤ 0.

The algorithm starts at the point x = (2, 2). After 1 iteration, the algorithm finds f(x*) = 1 at x* = (1, 1), with C^{(+)}(x*) = 0.

Example 2 ([14])

min f(x) = (x_1 + 1)³/3 + x_2
s.t. 1 − x_1 ≤ 0,
     −x_2 ≤ 0.

The initial point is x = (1, 1). After 5 iterations, the algorithm finds f(x*) = 8/3 at x* = (1, 0), with C^{(+)}(x*) = 2.7667e-013.

Example 3 ([14])

min f(x) = −x_1
s.t. (x_1 − 1)³ + x_2 − 2 ≤ 0,
     (x_1 − 1)³ − x_2 + 2 ≤ 0,
     −x_1 ≤ 0, −x_2 ≤ 0.

The algorithm starts at x = (1, 1). An optimal solution is x* = (1, 2), f(x*) = −1, and C^{(+)}(x*) = 4.9562e-009. The algorithm terminates at the 17th iteration.

Example 4 ([13])

min f(x) = x_1² − 4x_1 + x_2² − 9x_2 + 2x_3² − 10x_3 + 0.5x_2x_3
s.t. 4x_1 + 2x_2 + x_3 − 10 ≤ 0,
     2x_1 + 4x_2 + x_3 − 20 ≤ 0,
     −x_1 ≤ 0, −x_2 ≤ 0, −x_3 ≤ 0.

The algorithm starts at x = (1, 1, 1). An optimal solution is x* = (0.4101, 3.2307, 1.8974), f(x*) = −28.8201, and C^{(+)}(x*) = 0. The algorithm terminates at the 4th iteration.

Example 5 (Problem 394 in [21])

min f(x) = Σ_{i=1}^{20} i(x_i² + x_i⁴)
s.t. Σ_{i=1}^{20} x_i² − 1 = 0.

We change it into a minimization problem with inequality constraints,

min f(x) = Σ_{i=1}^{20} i(x_i² + x_i⁴)
s.t. Σ_{i=1}^{20} x_i² − 1 ≤ 0,
     −Σ_{i=1}^{20} x_i² + 1 ≤ 0.

The initial point is x = (2, 2, · · ·, 2). At the 4th iteration, we find f(x*) = 1.9167 at x* = (0.9129, 0.4082, 0.0000, · · ·, 0.0000) and C^{(+)}(x*) = 3.2946e-013.

Example 6 ([3])

min f(x) = Σ_{i=1}^{100} i x_i⁴
s.t. c_i(x) := Σ_{j=1}^{i+1} x_j − i/10 = 0,   i = 1, 2, · · ·, 20.

As in the previous example, we change it into a minimization problem with inequality constraints,

min f(x) = Σ_{i=1}^{100} i x_i⁴
s.t. Σ_{j=1}^{i+1} x_j − i/10 ≤ 0,   i = 1, 2, · · ·, 20,
     −Σ_{j=1}^{l+1} x_j + l/10 ≤ 0,   l = 1, 2, · · ·, 20.

The initial point is x = (0.25, 0.25, · · ·, 0.25). After 12 iterations, the algorithm finds f(x*) = 0.0228, x* = (0.0558, 0.0442, 0.1000, · · ·, 0.1000, 0.0000, · · ·, 0.0000) (with 15 components equal to 0.1000), and C^{(+)}(x*) = 6.7318e-012.
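As a usage illustration, Example 1 can be fed to the mpf_solve sketch given after Algorithm 5.1 (again our own Python code and naming, not the authors' MATLAB implementation; the reported solution x* = (1, 1), f(x*) = 1 is the one quoted above).

```python
import numpy as np

# Example 1: min (x1-2)^2 + (x2-1)^2  s.t.  x1^2 - x2 <= 0,  x1 + x2 - 2 <= 0.
f      = lambda x: (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2
grad_f = lambda x: np.array([2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0)])
c      = lambda x: np.array([x[0] ** 2 - x[1], x[0] + x[1] - 2.0])
jac_c  = lambda x: np.array([[2.0 * x[0], -1.0], [1.0, 1.0]])

x_star, lam_star = mpf_solve(f, grad_f, c, jac_c,
                             x0=[2.0, 2.0], lam0=[1.0, 1.0])
print(x_star)          # expected to approach (1, 1), where f = 1
```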
In order to show the advantages of our algorithm, we give a simple comparison of our results with those of the CPF method. We can conclude that, for almost all of the six examples, the number of iterations and the CPU time of our algorithm are smaller than those of the CPF method. Focusing on Example 6, which has 100 variables, we see that our algorithm substantially decreases the computational time and considerably reduces the number of iterations. Table 1 shows the details of the comparison between our method and the CPF method. Figure 1 depicts the trend of the CPU time as the dimension of the test problems grows.

Table 1: Iterations and CPU time (CPF stands for Classical Penalty Function)

Example No. | Iterations, our method | Iterations, CPF method | CPU time (s), our method | CPU time (s), CPF method
     1      |          13            |          37            |          0.1200          |          2.2130
     2      |           4            |           9            |          0.2510          |          0.3310
     3      |          20            |          16            |          0.3910          |          1.1620
     4      |           4            |          32            |          0.3110          |          0.3310
     5      |           4            |          11            |          2.3730          |          1.1420
     6      |          12            |          34            |         39.3570          |         75.9190
[Figure 1 appears here: a plot of CPU time (seconds) against the dimension of the test problem, comparing the approximate time of our method with the approximate time of the CPF method.]

Figure 1: CPU time versus dimension of test problem.
9 Conclusion
If we consider the classical Lagrangian function L(x, λ) for the problem instead of F(x, λ, α), Theorem 7.3 is generally invalid.

Example: Consider the problem, which can be found in [19],

min f(x) = x_1² − x_2²
s.t. c_1(x) = x_2 − 2 ≤ 0,
     c_2(x) = −x_2 ≤ 0.

We know that the minimum point is x* = (0, 2). The corresponding classical Lagrangian function is

L(x, λ) = x_1² − x_2² + λ_1(x_2 − 2) − λ_2x_2.

Then λ*_1 = 4, λ*_2 = 0, C_(r) = c_1(x),

∇²_xx L(x*, λ*) = [ 2   0
                    0  −2 ],   ∇C_(r)(x*) = ∇c_1(x*) = (0, 1)^T,

and

∇C_(r)(x*)^T d = 0 ⇒ d = (d_1, 0).

So

d^T ∇²_xx L(x*, λ*) d = 2d_1²   for all d satisfying ∇C_(r)(x*)^T d = 0,

namely, the second-order optimality conditions C2-C3 are fulfilled. But

inf { L(x, λ*) | x ∈ R² } = inf { x_1² − x_2² + 4x_2 − 8 | x ∈ R² } = −∞,

and moreover inf { L(x, λ) | x ∈ R² } = −∞ for any λ = (λ_1, λ_2) > 0. Now we consider the MPF method. The corresponding modified penalty function is

F(x, λ, α) = x_1² − x_2² + 2λ_1α^{−1} log(1 + e^{α(x_2−2)}) + 2λ_2α^{−1} log(1 + e^{−αx_2}).

Then

∇²_xx F(x*, λ*, α) = [ 2       0
                       0   −2 + 2α ].

So ∇²_xx F(x*, λ*, α) is positive definite and x* = (0, 2) = arg min { F(x, λ*, α) | x ∈ R² } for any α > 1.

In contrast to the CPF, the MPF has several good properties: it is defined at the solution, it keeps smoothness of the same order as the initial functions in a neighborhood of the feasible set, and its Hessian matrix is positive definite, i.e., the MPF is strongly convex in a neighborhood of the solution even in the case of non-convex programming problems.
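The contrast described in this example is easy to check numerically. The short sketch below (our own code, using the data of the example) evaluates the Hessian of the MPF at x* = (0, 2), λ* = (4, 0) and confirms that it is positive definite once α exceeds 1, while the classical Lagrangian L(x, λ*) = x_1² − x_2² + 4(x_2 − 2) is unbounded below in x_2.

```python
import numpy as np

def hess_F(x, lam, alpha):
    """Hessian of F(x, lam, alpha) for f(x) = x1^2 - x2^2, c1 = x2 - 2, c2 = -x2."""
    c = np.array([x[1] - 2.0, -x[1]])
    grads = np.array([[0.0, 1.0], [0.0, -1.0]])             # rows: grad c1, grad c2
    H = np.diag([2.0, -2.0])                                # Hessian of f (c1, c2 are linear)
    s = np.exp(-alpha * c)
    w = 2.0 * alpha * lam * s / (1.0 + s) ** 2               # weights of the rank-one terms
    return H + grads.T @ np.diag(w) @ grads

x_star, lam_star = np.array([0.0, 2.0]), np.array([4.0, 0.0])
for alpha in (0.5, 1.0, 2.0, 10.0):
    eigs = np.linalg.eigvalsh(hess_F(x_star, lam_star, alpha))
    print(f"alpha={alpha:5.1f}  eigenvalues of the MPF Hessian: {np.round(eigs, 3)}")
# The (2,2) entry equals -2 + 2*alpha, so the Hessian is positive definite for alpha > 1,
# whereas inf over R^2 of the classical Lagrangian x1^2 - x2^2 + 4*(x2 - 2) is -infinity.
```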
References

[1] A. Auslender, Penalty and barrier methods: a unified framework, SIAM Journal on Optimization, 10 (1999) 211-230.
[2] M.C. Bartholomew-Biggs, Recursive quadratic programming methods for nonlinear constraints, Nonlinear Optimization, 1981, 213-211, NATO Conf. Ser. II: Systems Sci., Academic Press, London, 1982.
[3] M.C. Bartholomew-Biggs and M.F.G. Hernandez, Using the KKT matrix in an augmented Lagrangian SQP method for sparse constrained optimization, Journal of Optimization Theory and Applications, 85 (1995) 201-220.
[4] A. Ben-Tal and M. Zibulevsky, Penalty/barrier multiplier methods for convex programming problems, SIAM Journal on Optimization, 7 (1997) 347-366.
[5] J. Bracken and G.P. McCormick, Selected Applications of Nonlinear Programming, John Wiley and Sons, New York, 1968.
[6] C. Chen and O.L. Mangasarian, Smoothing methods for convex inequalities and linear complementarity problems, Mathematical Programming, 71 (1995) 51-69.
[7] A.R. Conn, N.I.M. Gould and P.L. Toint, A globally convergent Lagrangian barrier algorithm for optimization with general inequality constraints and simple bounds, Mathematics of Computation, 66 (1997) 261-288.
[8] J.C. Dong, J. Shi, S.Y. Wang, Y. Xue and S.H. Liu, A trust-region algorithm for equality-constrained optimization via a reduced dimension approach, Journal of Computational and Applied Mathematics, 152 (2003) 99-118.
[9] A.L. Dontchev and W.W. Hager, Lipschitzian stability for state constrained nonlinear optimal control, SIAM Journal on Control and Optimization, 36 (1998) 698-718.
[10] M.M. El-Alem, Convergence to a second-order point of a trust region algorithm with a nonmonotonic penalty parameter for constrained optimization, Journal of Optimization Theory and Applications, 91 (1996) 61-79.
[11] D. Goldfarb, R. Polyak, K. Scheinberg and I. Yuzefovich, A modified barrier-augmented Lagrangian method for constrained minimization, Computational Optimization and Applications, 14 (1999) 55-74.
[12] N.I.M. Gould and P.L. Toint, A note on the convergence of barrier algorithms to second-order necessary points, Mathematical Programming, 85 (1999) 433-438.
[13] Y.Q. Hu, et al., Exercises of Operational Research, Tsinghua University Press, Beijing, (1998) 67.
[14] Y.Q. Hu, et al., Operational Research, Tsinghua University Press, Beijing, (1999) 188-193.
[15] K. Madsen and H.B. Nielsen, A finite smoothing algorithm for linear l1 estimation, SIAM Journal on Optimization, 3 (1993) 223-235.
[16] O.L. Mangasarian, A condition number for differentiable convex inequalities, Mathematics of Operations Research, 10 (1985) 175-179.
[17] G. Di Pillo and S. Lucidi, An augmented Lagrangian function with improved exactness properties, SIAM Journal on Optimization, 12 (2001) 376-406.
[18] M.C. Pinar and S.A. Zenios, On smoothing exact penalty functions for convex constrained optimization, SIAM Journal on Optimization, 4 (1994) 486-511.
[19] R. Polyak, Modified barrier functions (theory and methods), Mathematical Programming, 54 (1992) 177-222.
[20] A.M. Rubinov, S.Q. Yang and B.M. Glover, Extended Lagrange and penalty functions in optimization, Journal of Optimization Theory and Applications, 111 (2001) 381-405.
[21] K. Schittkowski, More Test Examples for Nonlinear Programming Codes, Lecture Notes in Economics and Mathematical Systems 282, Springer-Verlag, Berlin, 1987.