The Flattened Aggregate Constraint Homotopy Method for Nonlinear ...

Hindawi Publishing Corporation Abstract and Applied Analysis Volume 2014, Article ID 430932, 14 pages http://dx.doi.org/10.1155/2014/430932

Research Article The Flattened Aggregate Constraint Homotopy Method for Nonlinear Programming Problems with Many Nonlinear Constraints Zhengyong Zhou1 and Bo Yu2 1 2

School of Mathematics and Computer Sciences, Shanxi Normal University, Linfen, Shanxi 041004, China School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning 116024, China

Correspondence should be addressed to Zhengyong Zhou; [email protected] Received 8 February 2014; Accepted 27 April 2014; Published 29 May 2014 Academic Editor: Victor Kovtunenko Copyright © 2014 Z. Zhou and B. Yu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The aggregate constraint homotopy method uses a single smoothing constraint instead of 𝑚-constraints to reduce the dimension of its homotopy map, and hence it is expected to be more efficient than the combined homotopy interior point method when the number of constraints is very large. However, the gradient and Hessian of the aggregate constraint function are complicated combinations of gradients and Hessians of all constraint functions, and hence they are expensive to calculate when the number of constraint functions is very large. In order to improve the performance of the aggregate constraint homotopy method for solving nonlinear programming problems, with few variables and many nonlinear constraints, a flattened aggregate constraint homotopy method, that can save much computation of gradients and Hessians of constraint functions, is presented. Under some similar conditions for other homotopy methods, existence and convergence of a smooth homotopy path are proven. A numerical procedure is given to implement the proposed homotopy method, preliminary computational results show its performance, and it is also competitive with the state-of-the-art solver KNITRO for solving large-scale nonlinear optimization.

1. Introduction In this paper, we consider the following nonlinear programming problem: min

𝑓 (𝑥) ,

s.t. 𝑔 (𝑥) ≤ 0,

(1)

where 𝑥 ∈ 𝑅𝑛 is the variable, 𝑔(𝑥) = (𝑔1 (𝑥), . . . , 𝑔𝑚 (𝑥))𝑇 , 𝑓 : 𝑅𝑛 → 𝑅, and 𝑔𝑖 (𝑥) : 𝑅𝑛 → 𝑅, 𝑖 = 1, . . . , 𝑚, are three times continuously differentiable, and 𝑚 is very large, but 𝑛 is moderate. It has wide applications, and a typical situation is the discretized semi-infinite programming problem. From the mid-1980s, much attention has been paid to interior point methods for mathematical programming, and many results on theory, algorithms, and applications on linear programming, convex programming, complementarity problems, semidefinite programming, and linear cone programming were obtained (see monographs [1–6] and

references therein). For nonlinear programming, the typical algorithms used were the Newton-type methods to the perturbed first-order necessary conditions combined with line search or trust region methods with a proper merit function (e.g., [7–9]). The general conditions of global convergence for these methods required that the feasible set be bounded and that the Jacobian matrix be uniformly nonsingular. Another typical class of globally convergent methods for nonlinear programming was probability-one homotopy methods (e.g., [10–12]), whose global convergence can be established under some weaker conditions than the ones for Newton-type methods. The excellent feature is that, unlike line search or trust region methods, they do not depend on the descent of a merit function and so are insensitive to the local minimum of the merit function, in which any search direction is not a descent direction of the merit function. In [10, 11], Feng et al. proposed a homotopy method for nonlinear programming (1), which was called the combined homotopy interior point (abbreviated by CHIP) method;

2

Abstract and Applied Analysis

its global convergence was proven under the normal cone condition (see below for its definition) for the feasible set as well as some common conditions. On the basis of the CHIP method, some modified CHIP methods were presented in [13, 14]; the global convergence was established under the quasinormal cone and pseudocone condition for the feasible set, respectively. In [12], Watson described some probabilityone homotopy methods for the unconstrained and inequality constrained optimization, whose global convergence was established under some weaker assumptions. Recently, Yu and Shang proposed a constraint shifting combined homotopy method in [15, 16], in which not only the objective function but also the constraint functions were regularly deformed. The global convergence was proven under the condition that the initial feasible set, which approaches the feasible set of (1) as the homotopy parameter changes from 1 to 0, not necessarily the feasible set of (1), satisfies the normal cone condition. Let 𝑔max (𝑥) = max1≤𝑖≤𝑚 {𝑔𝑖 (𝑥)}; then (1) is equivalent to min s.t.

𝑓 (𝑥) , 𝑔max (𝑥) ≤ 0,

(2)

which has only one, but nonsmooth, constraint. In [17], the following aggregate function was introduced, which is a smooth approximation of 𝑔max (𝑥) with a smoothing parameter 𝑡 > 0 and induced from the max-entropy theory: 𝑚

𝑔̂ (𝑥, 𝑡) = 𝑡 ln ∑ exp ( 𝑖=1

𝑔𝑖 (𝑥) ). 𝑡

(3)

It is also known as exponential penalty function (see [18]). By using it for all constraint functions of problem (1), an aggregate constraint homotopy (abbreviated by ACH) method was presented by Yu et al. in [19], whose global convergence was obtained under the condition that the feasible set satisfies the weak normal cone condition. Although the ACH method has only one smoothing constraint, the gradient and Hessian of the aggregate constraint function 𝑚

∇𝑥 𝑔̂ (𝑥, 𝑡) = ∑𝑐𝑖 (𝑥, 𝑡) ∇𝑔𝑖 (𝑥) , 𝑖=1

𝑚 𝜕𝑔̂ (𝑥, 𝑡) 1 = (𝑔̂ (𝑥, 𝑡) − ∑𝑐𝑖 (𝑥, 𝑡) 𝑔𝑖 (𝑥)) , 𝜕𝑡 𝑡 𝑖=1 2 𝑔̂ (𝑥, 𝑡) ∇𝑥𝑥 𝑚

= ∑𝑐𝑖 (𝑥, 𝑡) ∇2 𝑔𝑖 (𝑥) 𝑖=1

1𝑚 𝑇 + ∑𝑐𝑖 (𝑥, 𝑡) ∇𝑔𝑖 (𝑥) (∇𝑔𝑖 (𝑥)) 𝑡 𝑖=1 𝑚 1𝑚 𝑇 − ∑𝑐𝑖 (𝑥, 𝑡) ∇𝑔𝑖 (𝑥) ∑𝑐𝑖 (𝑥, 𝑡) (∇𝑔𝑖 (𝑥)) , 𝑡 𝑖=1 𝑖=1

2 𝑔̂ (𝑥, 𝑡) ∇𝑥𝑡

=

𝑚 1 𝑚 ( 𝑐 𝑐𝑖 (𝑥, 𝑡) 𝑔𝑖 (𝑥) 𝑡) ∇𝑔 ∑ ∑ (𝑥, (𝑥) 𝑖 𝑡2 𝑖=1 𝑖 𝑖=1 𝑚

−∑𝑐𝑖 (𝑥, 𝑡) 𝑔𝑖 (𝑥) ∇𝑔𝑖 (𝑥)) , 𝑖=1

(4) where 𝑐𝑖 (𝑥, 𝑡) =

exp (𝑔𝑖 (𝑥) /𝑡) ∈ (0, 1) , exp (𝑔𝑖 (𝑥) /𝑡)

∑𝑚 𝑖=1

𝑚

∑𝑐𝑖 (𝑥, 𝑡) = 1, 𝑖=1

(5) are complicated combinations of gradients and Hessians of all constraint functions and, hence, are expensive to calculate when 𝑚 is very large. Throughout this paper, we assume that the nonlinear programming problem (1) possesses a very large number of nonlinear constraints, but a small number of variables, and the objective and constraint functions are not sparse. For such a problem, the number of constraint functions can be so large that the computation of gradients and Hessians of all constraint functions is very expensive and cannot be stored in memory and, hence, the general numerical methods for solving nonlinear programming are not efficient. Although active set methods only need to calculate gradients and Hessians of a part of constraint functions, require lower storage, and have faster numerical solution, the working set is difficult to estimate without knowing the internal structure of the problem. In this paper, we present a new homotopy method called the flattened aggregate constraint homotopy (abbreviated by FACH) method for nonlinear programming (1) by using a new smoothing technique, in which only a part of constraint functions is aggregated. Under the normal cone condition for the feasible set and some other general assumptions, we prove that the FACH method can determine a smooth homotopy path from a given interior point of the feasible set to a KKT point of (1), and preliminary numerical results demonstrate its efficiency. The rest of this paper is organized as follows. We conclude this section with some notations, definitions, and a lemma. The flattened aggregate constraint function with some properties is given in Section 2. The homotopy map, existence and convergence of a smooth homotopy path with proof are given in Section 3. A numerical procedure for tracking the smooth homotopy path, and numerical test results with some remarks are given in Section 4. Finally, we conclude the paper with some remarks in Section 5. When discussing scalars and scalar-valued functions, subscripts refer to iteration step so that superscripts can be used for exponentiation. In contrast, for vectors and vectorvalued functions, subscripts are used to indicate components, whereas superscripts are used to indicate the iteration step. The identity matrix is represented by 𝐼. Unless otherwise specified, ‖ ⋅ ‖ denotes the Euclidean norm. Ω = {𝑥 ∈ 𝑅𝑛 | 𝑔(𝑥) ≤ 0} is the feasible set of (1),


3

whereas Ω0 = {𝑥 ∈ 𝑅𝑛 | 𝑔(𝑥) < 0} is the interior of Ω and 𝜕Ω = Ω \ Ω0 is the boundary of Ω. The symbols 𝑅+𝑚 𝑚 and 𝑅++ denote the nonnegative and positive quadrants of 𝑚 𝑅 , respectively. The active index set is denoted by 𝐼(𝑥) = {𝑖 ∈ {1, . . . , 𝑚} | 𝑔𝑖 (𝑥) = 0} at 𝑥 ∈ Ω. For a function 𝐹 : 𝑅𝑛 → 𝑅𝑚 , ∇𝐹(𝑥) is the 𝑛 × 𝑚 matrix whose (𝑖, 𝑗)th element is 𝜕𝐹𝑗 (𝑥)/𝜕𝑥𝑖 ; 𝐹−1 (𝑌) = {𝑥 | 𝐹(𝑥) ∈ 𝑌} is the setvalued inverse for the set 𝑌 ⊆ 𝑅𝑚 . Definition 1 (see [10]). The set Ω satisfies the normal cone condition if and only if the normal cone of Ω at any 𝑥 ∈ 𝜕Ω does not meet Ω0 ; that is, 0

{𝑥 + ∑ 𝑦𝑖 ∇𝑔𝑖 (𝑥) | 𝑦𝑖 ≥ 0} ∩ Ω = 0.

(6)

constraints, which is weaker than the linear independence constraint qualification. The last assumption, the normal cone condition of the feasible set, is a generalization of the convex condition for the feasible set. Indeed, if Ω is a convex set, then it satisfies the normal cone condition. Some simple nonconvex sets, satisfying the normal cone condition, are shown in [10]. In this paper, we construct a flattened aggregate constraint function as follows: 𝑔̂(𝜃,𝑐1 ,𝑐2 ,𝛼) (𝑥, 𝑡) 𝑚

= 𝜃𝑡 ln (∑𝜑 (𝑔𝑖 (𝑥) , 𝑡) exp ( 𝑖=1

𝑖∈𝐼(𝑥)

Definition 2. If and only if there exists 𝜆∗ ∈ 𝑅+𝑚 such that (𝑥∗ , 𝜆∗ ) satisfies ∇𝑓 (𝑥∗ ) + ∇𝑔 (𝑥∗ ) 𝜆∗ = 0, 𝜆∗𝑖 𝑔𝑖

∗

𝜆∗𝑖

(𝑥 ) = 0,

∗

≥ 0, 𝑔𝑖 (𝑥 ) ≤ 0, 𝑖 = 1, . . . , 𝑚,

Definition 3. Let 𝑈 ⊂ 𝑅𝑛 be an open set and let 𝐹 : 𝑈 → 𝑅𝑚 be a differentiable map. We say 𝑦 ∈ 𝑅𝑚 is a regular value of 𝐹 if and only if Range [

𝜕𝐹 (𝑥) ] = 𝑅𝑚 , 𝜕𝑥

∀𝑥 ∈ 𝐹−1 (𝑦) .

where 𝜀 (𝑡) = 𝑐1 𝑡 + 𝑐2 ,

(7)

then 𝑥∗ is called a KKT point of (1) and 𝜆∗ is the corresponding Lagrangian multiplier.

(8)

Lemma 4 (parameterized Sard theorem [20]). Let 𝑈 ⊂ 𝑅𝑛 and 𝑉 ⊂ 𝑅𝑘 be two open sets and let 𝐹 : 𝑈 × 𝑉 → 𝑅𝑚 be a 𝐶𝑟 differentiable map with 𝑟 > max{0, 𝑛 − 𝑚}. If 0 ∈ 𝑅𝑚 is a regular value of 𝐹, then, for almost all 𝑎 ∈ 𝑉, 0 is a regular value of 𝐹𝑎 = 𝐹(⋅, 𝑎).

0, for 𝑧 ≤ −𝛼𝜀 (𝑡) with 𝛼 > 1, { { 𝜑 (𝑧, 𝑡) = {1, for 𝑧 ≥ −𝜀 (𝑡) , { 𝜑 𝑡) , otherwise, (𝑧, {

𝜑 (−𝛼𝜀 (𝑡) , 𝑡) = 0,

𝜑 (−𝜀 (𝑡) , 𝑡) = 1,

Assumption 5. Ω0 is nonempty and Ω is bounded.

𝜕2 𝜑 (−𝜀 (𝑡) , 𝑡) = 0, 𝜕𝑧2

∑ 𝑦𝑖 ∇𝑔𝑖 (𝑥) = 0,

𝑖∈𝐼(𝑥)

𝑦𝑖 ≥ 0, 𝑖 ∈ 𝐼 (𝑥) 󳨐⇒ 𝑦𝑖 = 0, 𝑖 ∈ 𝐼 (𝑥) . (9)

Assumption 7. Ω satisfies the normal cone condition (see Definition 1). The first assumption is the Slater’s condition and the boundedness of the feasible set, which are two basic conditions. The second assumption provides the regularity of

(12)

where, for any 𝑡 ∈ [0, 1], 𝜑(𝑧, 𝑡) satisfies

In this section, we suppose that the following assumptions hold.

Assumption 6. For any 𝑥 ∈ 𝜕Ω, {∇𝑔𝑖 (𝑥) | 𝑖 ∈ 𝐼(𝑥)} are positive independent; that is,

(11)

𝑐1 > 0, 𝑐2 > 0, 0 < 𝜃 ≤ 1 are some adjusting parameters, 𝜑(𝑧, 𝑡) : 𝑅×[0, 1] → [0, 1] is a 𝐶3 bivariate function satisfying 𝜑(𝑧, 𝑡) = 0 for 𝑧 ≤ −𝛼𝜀(𝑡) with 𝛼 > 1, and 𝜑(𝑧, 𝑡) = 1 for 𝑧 ≥ −𝜀(𝑡). For simplicity of discussion, we write 𝑔̂(𝜃,𝑐1 ,𝑐2 ,𝛼) (𝑥, 𝑡) ̂ 𝑡). By its definition, 𝜑(𝑧, 𝑡) can be expressed as as 𝑔(𝑥,

𝜕2 𝜑 (−𝛼𝜀 (𝑡) , 𝑡) = 0, 𝜕𝑧2

2. The Flattened Aggregate Constraint Function

𝑔𝑖 (𝑥) 𝜀 (𝑡) ) + exp (− )) , 𝜃𝑡 𝜃𝑡 (10)

𝜕𝜑 (−𝛼𝜀 (𝑡) , 𝑡) = 0, 𝜕𝑧 𝜕3 𝜑 (−𝛼𝜀 (𝑡) , 𝑡) = 0, 𝜕𝑧3 𝜕𝜑 (−𝜀 (𝑡) , 𝑡) = 0, 𝜕𝑧

(13)

𝜕3 𝜑 (−𝜀 (𝑡) , 𝑡) = 0. 𝜕𝑧3

There exist many kinds of functions 𝜑(𝑧, 𝑡), such as polynomial functions and spline functions. Since polynomial functions have simple mathematical expressions and many excellent properties, throughout this paper, 𝜑(𝑧, 𝑡) is chosen as 𝜑 (𝑧, 𝑡) =

−20(𝑧 + 𝜀 (𝑡))7 −70(𝑧 + 𝜀 (𝑡))6 + (𝛼𝜀 (𝑡) − 𝜀 (𝑡))7 (𝛼𝜀 (𝑡) − 𝜀 (𝑡))6 −84(𝑧 + 𝜀 (𝑡))5 −35(𝑧 + 𝜀 (𝑡))4 + + + 1. (𝛼𝜀 (𝑡) − 𝜀 (𝑡))5 (𝛼𝜀 (𝑡) − 𝜀 (𝑡))4

(14)

4


Let 𝐼𝜀 (𝑥, 𝑡) = {𝑖 | 𝑔𝑖 (𝑥) > −𝛼𝜀(𝑡)}; by the definition of ̂ 𝑡) in (10) can be rewritten as 𝜑(𝑧, 𝑡), 𝑔(𝑥,

1 ≥ ∑ 𝜆 𝑖 (𝑥, 𝑡)

𝑔 (𝑥) ) 𝑔̂ (𝑥, 𝑡) = 𝜃𝑡 ln ( ∑ 𝜑 (𝑔𝑖 (𝑥) , 𝑡) exp ( 𝑖 𝜃𝑡 𝑖∈𝐼 (𝑥,𝑡) 𝜀

+ exp (−

(2) By lim𝑥→𝑥,𝑡→𝑡 (−𝜀(𝑡) − 𝑔𝑖∗ (𝑥)) = −𝜀(𝑡) < 0, we have

𝑖∈𝐼𝜀 (𝑥,𝑡)

(15) 𝜀 (𝑡) )). 𝜃𝑡

=

𝑖∈𝐼𝜀 (𝑥,𝑡)

(16)

∑𝑗∈𝐼𝜀 (𝑥,𝑡) 𝜑 (𝑔𝑗 (𝑥) , 𝑡) exp (𝑔𝑗 (𝑥) /𝜃𝑡) + exp (−𝜀 (𝑡) /𝜃𝑡)

≥

𝜑 (𝑔𝑖∗ (𝑥) , 𝑡) exp (𝑔𝑖∗ (𝑥) /𝜃𝑡) 𝜑 (𝑔𝑖∗ (𝑥) , 𝑡) exp (𝑔𝑖∗ (𝑥) /𝜃𝑡) + exp (−𝜀 (𝑡) /𝜃𝑡)

≥

𝜑 (𝑔𝑖∗ (𝑥) , 𝑡) 1 + exp ((−𝜀 (𝑡) − 𝑔𝑖∗ (𝑥)) /𝜃𝑡)

̂ 𝑡) with respect to 𝑥 is The gradient of 𝑔(𝑥, ∇𝑥 𝑔̂ (𝑥, 𝑡) = ∑ (𝜆 𝑖 (𝑥, 𝑡) + 𝜉𝑖 (𝑥, 𝑡)) ∇𝑔𝑖 (𝑥) ,

∑𝑖∈𝐼𝜀 (𝑥,𝑡) 𝜑 (𝑔𝑖 (𝑥) , 𝑡) exp (𝑔𝑖 (𝑥) /𝜃𝑡)

󳨀→ 1. (20)

where 𝜆 𝑖 (𝑥, 𝑡) =

𝜑 (𝑔𝑖 (𝑥) , 𝑡) exp (𝑔𝑖 (𝑥) /𝜃𝑡)

, ∑𝑗∈𝐼𝜀 (𝑥,𝑡) 𝜑 (𝑔𝑗 (𝑥) , 𝑡) exp (𝑔𝑗 (𝑥) /𝜃𝑡) + exp (−𝜀 (𝑡) /𝜃𝑡) (17)

𝜉𝑖 (𝑥, 𝑡) =

Hence, ∑𝑖∈𝐼𝜀 (𝑥,𝑡) 𝜆 𝑖 (𝑥, 𝑡) → 1. Together with item 1, we have ∑𝑖∈𝐼(𝑥) 𝜆 𝑖 (𝑥, 𝑡) → 1 for 𝜃𝑡 ↓ 0, 𝑡 → 𝑡, and 𝑥 → 𝑥. (3) If 𝑔𝑖 (𝑥) ≥ −𝜀(𝑡), then 𝜕𝜑(𝑔𝑖 (𝑥), 𝑡)/𝜕𝑧 = 0 by its definition, and hence 𝜉𝑖 (𝑥, 𝑡) = 0;

(21)

else

𝜃𝑡𝜕𝜑 (𝑔𝑖 (𝑥) , 𝑡) /𝜕𝑧 exp (𝑔𝑖 (𝑥) /𝜃𝑡)

. ∑𝑗∈𝐼𝜀 (𝑥,𝑡) 𝜑 (𝑔𝑗 (𝑥) , 𝑡) exp (𝑔𝑗 (𝑥) /𝜃𝑡) + exp (−𝜀 (𝑡) /𝜃𝑡) (18)

󵄨 󵄨 󵄨󵄨 󵄨 𝜃𝑡 󵄨󵄨𝜕𝜑 (𝑔𝑖 (𝑥) , 𝑡) /𝜕𝑧󵄨󵄨󵄨 exp (𝑔𝑖 (𝑥) /𝜃𝑡) ≤ 𝜃𝑡𝑀, 󵄨󵄨𝜉𝑖 (𝑥, 𝑡)󵄨󵄨󵄨 ≤ 󵄨 exp (−𝜀 (𝑡) /𝜃𝑡) (22)

̂ 𝑡) only relate to a part Then the gradient and Hessian of 𝑔(𝑥, of constraint functions; that is, 𝑖 ∈ 𝐼𝜀 (𝑥, 𝑡). Proposition 8. For any 𝜃 > 0, 𝑡 ∈ (0, 1], 𝑥 ∈ Ω with 𝜃𝑡 ↓ 0, 𝑡 → 𝑡 ∈ [0, 1], 𝑥 → 𝑥 ∈ 𝜕Ω,

where 𝑀 = max(𝑧,𝑡)∈𝑅×[0,1] |𝜕𝜑(𝑧, 𝑡)/𝜕𝑧|. Then, we have 𝜉(𝑥, 𝑡) → 0 for 𝜃𝑡 ↓ 0, 𝑡 → 𝑡, and 𝑥 → 𝑥. Proposition 9. For any given 0 < 𝜃 ≤ 1,

(1) 𝜆 𝑖 (𝑥, 𝑡) → 0 for 𝑖 ∉ 𝐼(𝑥);

̂ 𝑡) ≤ max{𝑔max (𝑥), −𝜀(𝑡)} + 𝜃𝑡 ln(𝑚𝐼𝜀 + 1), (1) −𝜀(𝑡) ≤ 𝑔(𝑥,

(2) ∑𝑖∈𝐼(𝑥) 𝜆 𝑖 (𝑥, 𝑡) → 1;

̂ 𝑡) ≤ 𝑔max (𝑥) + 𝜃𝑡 ln(𝑚𝐼𝜀 + 1) (2) 𝑔max (𝑥) ≤ 𝑔(𝑥, when 𝑔max (𝑥) ≥ −𝜀(𝑡),

(3) 𝜉(𝑥, 𝑡) → 0. Proof. (1) Because 𝑥 ∈ 𝜕Ω, there exists 𝑖∗ ∈ 𝐼(𝑥) such that 𝑔𝑖∗ (𝑥) = 0. By the continuity of 𝜑(𝑧, 𝑡) and 𝑔(𝑥), we know that lim𝑥→𝑥,𝑡→𝑡 𝜑(𝑔𝑖∗ (𝑥), 𝑡) = 𝜑(𝑔𝑖∗ (𝑥), 𝑡) = 1 and lim𝑥→𝑥 (𝑔𝑖 (𝑥) − 𝑔𝑖∗ (𝑥)) = 𝑔𝑖 (𝑥) < 0 for 𝑖 ∉ 𝐼(𝑥); hence

̂ 𝑡) = −𝜀(𝑡) when 𝑔max (𝑥) ≤ −𝛼𝜀(𝑡) with 𝛼 > 1, (3) 𝑔(𝑥, where 𝑚𝐼𝜀 denotes the cardinality of 𝐼𝜀 (𝑥, 𝑡). Proof. (1) By 𝜑(𝑧, 𝑡) ≥ 0, we have

0 ≤ 𝜆 𝑖 (𝑥, 𝑡) exp (𝑔𝑖 (𝑥) /𝜃𝑡) ≤ 𝜑 (𝑔𝑖∗ (𝑥) , 𝑡) exp (𝑔𝑖∗ (𝑥) /𝜃𝑡) exp ((𝑔𝑖 (𝑥) − 𝑔𝑖∗ (𝑥)) /𝜃𝑡) = 𝜑 (𝑔𝑖∗ (𝑥) , 𝑡) 󳨀→ 0. Hence, item 1 is satisfied for 𝜃𝑡 ↓ 0, 𝑡 → 𝑡, and 𝑥 → 𝑥.

𝑔̂ (𝑥, 𝑡) = 𝜃𝑡 ln ( ∑ 𝜑 (𝑔𝑖 (𝑥) , 𝑡) exp ( 𝑖∈𝐼𝜀 (𝑥,𝑡)

(19)

+ exp (− ≥ 𝜃𝑡 ln (exp (− = −𝜀 (𝑡) .

𝜀 (𝑡) )) 𝜃𝑡

𝜀 (𝑡) )) 𝜃𝑡

𝑔𝑖 (𝑥) ) 𝜃𝑡

(23)


5

Then the left inequality can be obtained. By 𝜑(𝑧, 𝑡) ≤ 1, we have 𝑔̂ (𝑥, 𝑡) = 𝜃𝑡 ln ( ∑ 𝜑 (𝑔𝑖 (𝑥) , 𝑡) exp ( 𝑖∈𝐼𝜀 (𝑥,𝑡)

𝜃 ∈ (0, 1] satisfying 𝜃 ≤ − max{𝑔max (𝑥∗ ), −𝜀(1)}/(2 ln(𝑚+1)), ̂ by Proposition 9(1), for any 𝑥 ∈ Ω, 𝑔̂ (𝑥, 1) ≤ max {𝑔max (𝑥) , −𝜀 (1)} + 𝜃 ln (𝑚 + 1)

𝑔𝑖 (𝑥) ) 𝜃𝑡

≤ max {𝑔max (𝑥∗ ) , −𝜀 (1)} + 𝜃 ln (𝑚 + 1)

𝜀 (𝑡) + exp (− )) 𝜃𝑡

=

1 max {𝑔max (𝑥∗ ) , −𝜀 (1)} 2

(27)

< 0, 𝑔 (𝑥) 𝜀 (𝑡) ≤ 𝜃𝑡 ln ( ∑ exp ( 𝑖 ) + exp (− )) . 𝜃𝑡 𝜃𝑡 𝑖∈𝐼 (𝑥,𝑡) 𝜀

(24)

̂ ⊂ Ω𝜃 (1)0 . which means Ω (3) For any 𝑥 ∈ Ω, that is, 𝑔max (𝑥) ≤ 0, by the right inequality of Proposition 9(1), lim 𝑔̂ (𝑥, 𝑡) ≤ lim max {𝑔max (𝑥) , −𝜀 (𝑡)}

Then the right inequality can be obtained by Proposition 2.3 in [19]. (2) For any 𝑥, 𝑡 with 𝑔max (𝑥) ≥ −𝜀(𝑡), let 𝑔𝑖∗ (𝑥) = 𝑔max (𝑥); then 𝜑(𝑔𝑖∗ (𝑥), 𝑡) = 1 by its definition, and hence

𝑡↓0

𝑡↓0

(28)

≤ max {𝑔max (𝑥) , −𝑐2 } ≤ 0, where the second inequality comes from −𝜀(𝑡) = −𝑐1 𝑡 − 𝑐2 ≤ −𝑐2 < 0 for 𝑡 ∈ [0, 1], which means Ω ⊂ lim𝑡↓0 Ω𝜃 (𝑡). Then, together with item 1,

𝑔 (𝑥) 𝑔̂ (𝑥, 𝑡) = 𝜃𝑡 ln ( ∑ 𝜑 (𝑔𝑖 (𝑥) , 𝑡) exp ( 𝑖 ) 𝜃𝑡 𝑖∈𝐼 (𝑥,𝑡) 𝜀

+ exp (−

≥ 𝜃𝑡 ln (𝜑 (𝑔𝑖∗ (𝑥) , 𝑡) exp ( = 𝜃𝑡 ln (exp (

lim Ω𝜃 (𝑡) = Ω.

𝜀 (𝑡) )) 𝜃𝑡 𝑔𝑖∗ (𝑥) )) 𝜃𝑡

𝑡↓0

(25)

𝑔𝑖∗ (𝑥) )) 𝜃𝑡

Proposition 11. There exists a 𝜃 ∈ (0, 1] such that, for any ̂ 𝑡) ≠ 0. 𝑡 ∈ (0, 1], 𝑥 ∈ 𝜕Ω𝜃 (𝑡); then ∇𝑥 𝑔(𝑥, Proof. If not, for any 𝜃 ∈ (0, 1], there exist corresponding ̂ 𝜃 , 𝑡𝜃 ) = 0, which 𝑡𝜃 ∈ (0, 1] and 𝑥𝜃 ∈ 𝜕Ω𝜃 (𝑡𝜃 ) such that ∇𝑥 𝑔(𝑥 means that there must exist three sequences {𝜃𝑘 }∞ 𝑘=1 ⊂ (0, 1], 𝑘 ∞ 𝑘 {𝑡𝑘 }∞ ⊂ (0, 1], and {𝑥 } with 𝑥 ∈ 𝜕Ω (𝑡 𝜃𝑘 𝑘 ), such that 𝑘=1 𝑘=1 𝑘 𝜃𝑘 ↓ 0, 𝑡𝑘 → 𝑡, 𝑥 → 𝑥 ∈ 𝜕Ω, and

= 𝑔𝑖∗ (𝑥) = 𝑔max (𝑥) ; then the left inequality is true. The right inequality can be obtained by item 1. ̂ 𝑡). (3) It is trivial by the definition of 𝑔(𝑥, ̂ 𝑡) ≤ 0} and Ω𝜃 (𝑡)0 = Proposition 10. Let Ω𝜃 (𝑡) = {𝑥 | 𝑔(𝑥, ̂ 𝑡) < 0}; one has {𝑥 | 𝑔(𝑥, (1) Ω𝜃 (𝑡) ⊂ Ω for 𝑡 ∈ (0, 1]; ̂ ⊂ Ω0 , there exists (2) for any bounded and closed set Ω ̂ a 𝜃 ∈ (0, 1], such that Ω ⊂ Ω𝜃 (1)0 ; (3) Ω𝜃 (𝑡) → Ω as 𝑡 ↓ 0. Proof. (1) If 𝑥 ∉ Ω, which is equivalent to 𝑔max (𝑥) > 0, by Proposition 9(2), 0 < 𝑔max (𝑥) ≤ 𝑔̂ (𝑥, 𝑡) ,

(29)

(26)

which means 𝑥 ∉ Ω𝜃 (𝑡); then we have Ω𝜃 (𝑡) ⊂ Ω. ̂ is a (2) By the continuity of 𝑔(𝑥) and the fact that Ω ∗ ̂ bounded closed set, there exists a point 𝑥 ∈ Ω such that ̂ and 𝑔max (𝑥∗ ) < 0. For any 𝑔max (𝑥) reaches its maximum in Ω,

∇𝑥 𝑔̂ (𝑥𝑘 , 𝑡𝑘 ) =

∑ (𝜆 𝑖 (𝑥𝑘 , 𝑡𝑘 ) + 𝜉𝑖 (𝑥𝑘 , 𝑡𝑘 )) ∇𝑔𝑖 (𝑥𝑘 ) 󳨀→ 0 𝑖∈𝐼𝜀 (𝑥𝑘 ,𝑡𝑘 )

(30)

as 𝑘 → ∞. Then, by Proposition 8, we know that 𝜆(𝑥𝑘 , 𝑡𝑘 ) → 𝜆, 𝜉(𝑥𝑘 , 𝑡𝑘 ) → 0 as 𝑘 → ∞, and ∑

𝜆𝑖 = ∑ 𝜆𝑖 = 1,

𝑖∈lim𝑘󳨀→∞ 𝐼𝜀 (𝑥𝑘 ,𝑡𝑘 )

𝑖∈𝐼(𝑥)

(31)

where the first equality comes from that 𝜆𝑖 = 0 for 𝑖 ∉ 𝐼(𝑥). Therefore, we have ∑𝑖∈𝐼(𝑥) 𝜆𝑖 ∇𝑔𝑖 (𝑥) = 0 by taking limits on (30); this is a contradiction to Assumption 6. ̂ ⊂ Ω0 , there Proposition 12. For any bounded closed set Ω exists a 𝜃 ∈ (0, 1] such that ̂ = 0, {𝑥 + 𝜆∇𝑥 𝑔̂ (𝑥, 1) | 𝜆 > 0} ∩ Ω for any 𝑥 ∈ 𝜕Ω𝜃 (1).

(32)

6


̂ and four Proof. If not, there exist a bounded closed set Ω ∞ 𝑘 ∞ 𝑘 sequences {𝜃𝑘 }𝑘=1 ⊂ (0, 1], {𝑥 }𝑘=1 with 𝑥 ∈ 𝜕Ω𝜃𝑘 (1), 𝑘 ∞ 𝑘 ̂ {𝑥̂𝑘 }∞ 𝑘=1 ⊂ Ω, and {𝜆 }𝑘=1 > 0 such that 𝜃𝑘 ↓ 0, 𝑥 → 𝑥 ∈ 𝜕Ω, ̂ 𝜆 𝑖 (𝑥𝑘 , 1) → 𝜆∗ ≥ 0, 𝜉𝑖 (𝑥𝑘 , 1) → 0 as 𝑘 → ∞, 𝑥̂𝑘 → 𝑥̂ ∈ Ω, 𝑖 and 𝑥̂𝑘 = 𝑥𝑘 + 𝜆𝑘

∑ (𝜆 𝑖 (𝑥𝑘 , 1) + 𝜉𝑖 (𝑥𝑘 , 1)) ∇𝑔𝑖 (𝑥𝑘 ) . 𝑖∈𝐼𝜀 (𝑥𝑘 ,1) (33)

𝐻 (𝑥, 𝜆, 𝑡) (1 − 𝑡) (∇𝑓 (𝑥) + 𝜆∇𝑥 𝑔̂ (𝑥, 𝑡)) + 𝑡 (𝑥 − 𝑥0 ) ) =( 𝜆𝑔̂ (𝑥, 𝑡) − 𝑡𝜆0 𝑔̂ (𝑥0 , 1)

(39)

= 0,

By Proposition 8, we have ∑

3.1. The Flattened Aggregate Constraint Homotopy. Using ̂ 𝑡) in (10), the flattened aggregate constraint function 𝑔(𝑥, we construct the following flattened aggregate constraint homotopy:

𝜆∗𝑖 = ∑ 𝜆∗𝑖 = 1,

𝑖∈lim𝑘󳨀→∞ 𝐼𝜀 (𝑥𝑘 ,1)

𝑖∈𝐼(𝑥)

(34)

where the first equality comes from that 𝜆∗𝑖 = 0 for 𝑖 ∉ 𝐼(𝑥). Then, we have ∑ (𝜆 𝑖 (𝑥𝑘 , 1) + 𝜉𝑖 (𝑥𝑘 , 1)) ∇𝑔𝑖 (𝑥𝑘 ) 𝑖∈𝐼𝜀 (𝑥𝑘 ,1)

(35)

󳨀→ ∑ 𝜆∗𝑖 ∇𝑔𝑖 (𝑥) . 𝑖∈𝐼(𝑥)

By Assumption 6 and using (34), ∑ 𝜆∗𝑖 ∇𝑔𝑖 (𝑥) ≠ 0.

𝑖∈𝐼(𝑥)

(36)

Hence {𝜆𝑘 }∞ 𝑘=1 is a bounded sequence, and there must exist a subsequence converging to 𝜆 > 0; then we have by (33) 𝑥̂ = 𝑥 + 𝜆 ∑ 𝜆∗𝑖 ∇𝑔𝑖 (𝑥) , 𝑖∈𝐼(𝑥)

(37)

which contradicts Assumption 7.

3. The Flattened Aggregate Constraint Homotopy Method In [19], the following aggregate constraint homotopy was introduced: 𝐻 (𝑥, 𝜆, 𝑡) (1 − 𝑡) (∇𝑓 (𝑥) + 𝜆∇𝑥 𝑔̂𝜃 (𝑥, 𝑡)) + 𝑡 (𝑥 − 𝑥0 ) ) (38) =( 𝜆𝑔̂𝜃 (𝑥, 𝑡) − 𝑡𝜆0 𝑔̂𝜃 (𝑥0 , 1) = 0, where 𝑔̂𝜃 (𝑥, 𝑡) = 𝜃𝑡 ln ∑𝑚 𝑖=1 exp(𝑔𝑖 (𝑥)/𝜃𝑡) with 𝜃 ∈ (0, 1], 𝜆 is the Lagrangian multiplier of the aggregate constraint ̂ × 𝑅++ . function 𝑔̂𝜃 (𝑥, 𝑡), and (𝑥0 , 𝜆0 ) is the starting point in Ω Under the weak normal cone condition, which is similar to the normal cone condition, it was proved that the ACH determines a smooth interior path from a given interior point to a KKT point. Then, the predictor-corrector procedure can be applied to trace the homotopy path from (𝑥0 , 𝜆0 , 1) to (𝑥∗ , 𝜆∗ , 0), in which (𝑥∗ , 𝜆∗ ) is a KKT point of (1).

where 𝜆 is the Lagrangian multiplier of the flattened aggregate constraint and (𝑥0 , 𝜆0 ) is the starting point and can be ̂ × 𝑅++ . randomly chosen from Ω We give the main theorem on the existence and convergence of a smooth path from (𝑥0 , 𝜆0 , 1) to (𝑥∗ , 𝜆∗ , 0), in which (𝑥∗ , 𝜆∗ ) is a solution of the KKT system of (1), and hence the global convergence of our proposal, namely, the FACH method, can be proven. Theorem 13. Suppose that Assumptions 5–7 hold and 𝜃 ∈ ̂⊂ (0, 1] satisfies Propositions 10–12 for the bounded closed set Ω 0 0 0 0 0 Ω𝜃 (1) ⊂ Ω ; then for almost all starting points 𝑤 = (𝑥 , 𝜆 ) ∈ ̂ × 𝑅++ , the zero-points set 𝐻−1 (0) = {(𝑥, 𝜆, 𝑡) | 𝐻(𝑥, 𝜆, 𝑡) = Ω 0} defines a smooth path Γ𝑤0 , which starts at (𝑥0 , 𝜆0 , 1) and approaches the hyperplane 𝑡 = 0. Furthermore, let (𝑥∗ , 𝜆∗ , 0) be any limit point of Γ𝑤0 , and 𝑦∗ = 𝜆∗ 𝜆(𝑥∗ , 0) (𝜆(𝑥∗ , 0) is a limit of 𝜆(𝑥, 𝑡) as 𝑥 → 𝑥∗ , 𝑡 ↓ 0), then 𝑥∗ is a KKT point of (1) and 𝑦∗ is the corresponding Lagrangian multiplier. Proof. Consider 𝐻 as a map of the variable (𝑤0 , 𝑥, 𝜆, 𝑡), for ̂ × 𝑅++ × (0, 1], which means 𝑥0 ∈ Ω𝜃 (1)0 ; any (𝑤0 , 𝑡) ∈ Ω 0 ̂ , 1) < 0, because hence 𝑔(𝑥 𝜕𝐻 (𝑤0 , 𝑥, 𝜆, 𝑡) 𝜕𝑤0

−𝑡𝐼 0 𝑇 =( 0 ) 0 −𝑡𝜆 (∇𝑥 𝑔̂ (𝑥 , 1)) −𝑡𝑔̂ (𝑥0 , 1) (40)

is nonsingular and the Jacobi matrix of 𝐻(𝑤0 , 𝑥, 𝜆, 𝑡) is of full row rank. Using the parameterized Sard theorem, Lemma 4, ̂ × 𝑅++ , 0 is a regular value of 𝐻(𝑥, 𝜆, 𝑡). for almost all 𝑤0 ∈ Ω From 𝑥 − 𝑥0 ), 𝐻 (𝑥, 𝜆, 1) = ( 𝜆𝑔̂ (𝑥, 1) − 𝜆0 𝑔̂ (𝑥0 , 1)

(41)

we know that (𝑥0 , 𝜆0 ) is the unique and simple solution of 𝐻(𝑥, 𝜆, 1) = 0. Then, by the implicit function theorem, there exists a smooth curve Γ𝑤0 ⊂ Ω𝜃 (𝑡)0 ×𝑅++ ×(0, 1] starting from (𝑥0 , 𝜆0 , 1) and being transversal to the hyperplane 𝑡 = 1. Since Γ𝑤0 can be extended in Ω𝜃 (𝑡)0 × 𝑅++ × (0, 1) until it converges to the boundary of Ω𝜃 (𝑡)0 × 𝑅++ × (0, 1), there must exist an extreme point (𝑥, 𝜆, 𝑡) ∈ 𝜕(Ω𝜃 (𝑡) × 𝑅+ × [0, 1]). Let (𝑥, 𝜆, 𝑡) be any extreme point of Γ𝑤0 other than (𝑥0 , 𝜆0 , 1); then only the following five cases are possible: (1) (𝑥, 𝜆, 𝑡) ∈ Ω × 𝑅+ × {0}, 𝜆 < +∞;


7 Let (𝑥∗ , 𝜆∗ , 𝜆(𝑥∗ , 0)) be any accumulation point; 𝑦𝑖∗ = 𝜆∗ 𝜆 𝑖 (𝑥∗ , 0). By (39) and the fact that 𝜉(𝑥, 𝑡) → 0 as 𝑡 ↓ 0, we have

(2) (𝑥, 𝜆, 𝑡) ∈ Ω𝜃 (1) × 𝑅+ × {1}, 𝜆 < +∞; (3) (𝑥, 𝜆, 𝑡) ∈ 𝜕Ω𝜃 (𝑡) × 𝑅++ × (0, 1), 𝜆 < +∞; (4) (𝑥, 𝜆, 𝑡) ∈ Ω𝜃 (𝑡) × {0} × (0, 1);

𝑚

∇𝑓 (𝑥∗ ) + ∑𝑦𝑖∗ ∇𝑔𝑖 (𝑥∗ ) = 0.

(5) (𝑥, 𝜆, 𝑡) ∈ Ω𝜃 (𝑡) × {+∞} × [0, 1]. Case (2) means that 𝐻(𝑥, 𝜆, 1) = 0 has another solution (𝑥, 𝜆) except (𝑥0 , 𝜆0 ), or (𝑥0 , 𝜆0 ) is a double solution of 𝐻(𝑥, 𝜆, 1) = 0, which contradicts the fact that (𝑥0 , 𝜆0 ) is the unique and simple solution of 𝐻(𝑥, 𝜆, 1) = 0. Case (3) means ̂ 𝑡) = 0, 0 < 𝜆 < ∞, and 0 < 𝑡 < 1; hence 𝑔(𝑥, 𝜆𝑔̂ (𝑥, 𝑡) − 𝑡𝜆0 𝑔̂ (𝑥0 , 1) = −𝑡𝜆0 𝑔̂ (𝑥0 , 1) < 0,

(42)

which contradicts the last equation of (39). Case (4) means ̂ 𝑡) ≤ 0, 𝜆 = 0, and 0 < 𝑡 < 1; hence 𝑔(𝑥, 𝜆𝑔̂ (𝑥, 𝑡) − 𝑡𝜆0 𝑔̂ (𝑥0 , 1) = −𝑡𝜆0 𝑔̂ (𝑥0 , 1) < 0,

(43)

which contradicts the last equation of (39). Then, Cases (2), (3), and (4) are impossible. Because 𝐻(𝑥, 𝜆, 1) = 0 has only one and also unique solution (𝑥0 , 𝜆0 ), Case (2) is impossible. By the continuity and the last equation of (39), we know that Cases (3) and (4) are impossible. If Case (5) holds, there must exist a sequence 𝑘 → 𝑥, 𝜆𝑘 → ∞, {(𝑥𝑘 , 𝜆𝑘 , 𝑡𝑘 )}∞ 𝑘=1 on Γ𝑤0 such that 𝑥 and 𝑡𝑘 → 𝑡. By the last equation of (39), we have 𝑔̂ (𝑥𝑘 , 𝑡𝑘 ) =

𝑡𝑘 𝜆0 𝑔̂ (𝑥0 , 1) 𝜆𝑘

󳨀→ 0

(44)

as 𝑘 → ∞, which means 𝑥 ∈ 𝜕Ω𝜃 (𝑡). The following two cases may be possible. (1) If 𝑡 = 1, let 𝛼 be any accumulation point of {(1 − 𝑡𝑘 )𝜆𝑘 }, which is known to exist from (39) and ̂ 𝑘 , 𝑡𝑘 ) ≠ 0; then we have lim𝑘→∞ ∇𝑥 𝑔(𝑥 𝑥0 = 𝑥 + 𝛼∇𝑥 𝑔̂ (𝑥, 1) .

(45)

If 𝛼 = 0, then 𝑥 = 𝑥0 contradicts 𝑥0 ∈ Ω𝜃 (1)0 ; else contradicts Proposition 12.

(47)

𝑖=1

By the fact that lim𝑡↓0 Ω𝜃 (𝑡) = Ω, we have 𝑥∗ ∈ Ω. If 𝑥∗ ∈ Ω , we know that 0

lim 𝑔̂ (𝑥, 𝑡)

𝑥󳨀→𝑥∗ ,𝑡↓0

≤

lim

𝑥󳨀→𝑥∗ ,𝑡↓0

(max {𝑔max (𝑥) , −𝜀 (𝑡)} + 𝜃𝑡 ln (𝑚𝐼𝜀 + 1))

= max {𝑔max (𝑥∗ ) , 𝜀 (0)} = max {𝑔max (𝑥∗ ) , −𝛼𝑐2 } < 0, (48) where the first inequality comes from Proposition 9(1), the second equality comes from the continuity, and the third equality comes from the definition of 𝜀(𝑡) in (11); hence 𝜆∗ = 0 ̂ 𝑡) = by the last equation of (39); else 𝑥∗ ∈ 𝜕Ω, lim𝑥→𝑥∗ ,𝑡↓0 𝑔(𝑥, 𝑔max (𝑥∗ ) = 0 by Proposition 9(2) and 𝜆 𝑖 (𝑥∗ , 0) = 0 for 𝑖 ∉ 𝐼(𝑥∗ ) by Proposition 8. Thus we have 𝑦𝑖∗ 𝑔𝑖 (𝑥∗ ) = 0,

𝑖 = 1, . . . , 𝑚.

(49)

Summing up, (𝑥∗ , 𝑦∗ ) is a solution of the KKT system of (1), which means that 𝑥∗ is a KKT point of (1) and 𝑦∗ is the corresponding Lagrangian multiplier. 3.2. The Modified Flattened Aggregate Constraint Homotopy. From Proposition 8(3), we know that 𝜉(𝑥, 𝑡) → 0 as 𝑡 ↓ 0 for any 𝑥 ∈ Ω and, hence, we can use the following modified flattened aggregate constraint homotopy (MFACH) instead of the FACH: (1 − 𝑡) (∇𝑓 (𝑥) + 𝜆ℎ (𝑥, 𝑡)) + 𝑡 (𝑥 − 𝑥0 ) ) 𝐻 (𝑥, 𝜆, 𝑡) = ( 𝜆𝑔̂ (𝑥, 𝑡) − 𝑡𝜆0 𝑔̂ (𝑥0 , 1) = 0,

(2) If 𝑡 < 1, by (39), we have

(50)

(1 − 𝑡) lim 𝜆𝑘 ∇𝑥 𝑔̂ (𝑥𝑘 , 𝑡𝑘 ) = − (1 − 𝑡) ∇𝑓 (𝑥) − 𝑡 (𝑥 − 𝑥0 ) ,

where

𝑘󳨀→∞

(46) the right-hand is finite; however, by 𝑘 ̂ , 𝑡𝑘 ) ≠ 0, we know that the left-hand is lim𝑘→∞ ∇𝑥 𝑔(𝑥 infinite; this is a contradiction. As a conclusion, Case (1) is the only possible case. This implies that Γ𝑤0 must approach the hyperplane 𝑡 = 0. 𝑘 ∞ 𝑘 ∞ Because {𝜆(𝑥𝑘 , 𝑡𝑘 )}∞ 𝑘=1 (defined in (17)), {𝑥 }𝑘=1 , and {𝜆 }𝑘=1 are bounded sequences, we know that {(𝑥𝑘 , 𝜆𝑘 , 𝜆(𝑥𝑘 , 𝑡𝑘 ))}∞ 𝑘=1 has at least one accumulation point as 𝑘 → ∞.

ℎ (𝑥, 𝑡) = ∑ 𝜆 𝑖 (𝑥, 𝑡) ∇𝑔𝑖 (𝑥) . 𝑖∈𝐼𝜀 (𝑥,𝑡)

(51)

Remarks. (i) Since 𝜉(𝑥, 𝑡) is dropped in (51), the expressions of the homotopy map (50) and its Jacobian and hence the corresponding code become simpler. Moreover, the computation of H in (50) and its Jacobi matrix DH is a little cheaper than that for the FACH method. (ii) To guarantee that 𝐻(𝑥, 𝜆, 𝑡) in (39) be a 𝐶2 map, 𝜑(𝑧, 𝑡) must be 𝐶3 ; hence, if 𝜑(𝑧, 𝑡) is chosen as a polynomial, the

8


total degree should be seven; in contrast, for the homotopy map in (50), 𝜑(𝑧, 𝑡) is only needed to be 𝐶2 . Then 𝜑(𝑧, 𝑡) should satisfy 𝜑 (−𝛼𝜀 (𝑡) , 𝑡) = 0,

𝜕𝜑 (−𝛼𝜀 (𝑡) , 𝑡) = 0, 𝜕𝑧

𝜕2 𝜑 (−𝛼𝜀 (𝑡) , 𝑡) = 0, 𝜕𝑧2 𝜑 (−𝜀 (𝑡) , 𝑡) = 1,

𝜕𝜑 (−𝜀 (𝑡) , 𝑡) = 0, 𝜕𝑧

(52)

∇𝑓 (𝑥) + 𝜆ℎ (𝑥, 𝑡𝑐 ) ) = 0, 𝐹𝑡𝑐 (𝑥, 𝜆) = ( 𝜆𝑔̂ (𝑥, 𝑡𝑐 )

𝜕2 𝜑 (−𝜀 (𝑡) , 𝑡) =0 𝜕𝑧2

6(𝑧 + 𝜀 (𝑡))5 15(𝑧 + 𝜀 (𝑡))4 + (𝛼𝜀 (𝑡) − 𝜀 (𝑡))5 (𝛼𝜀 (𝑡) − 𝜀 (𝑡))4 10(𝑧 + 𝜀 (𝑡))3 + + 1. (𝛼𝜀 (𝑡) − 𝜀 (𝑡))3

(54)

where 𝑡𝑐 is a small positive constant. For other situations, the steplength will be decreased to make new predictor-corrector steps (see Algorithm 1).

and 𝜑(𝑧, 𝑡) can be chosen as 𝜑 (𝑧, 𝑡) =

the predictor direction. The corrector step terminates when 𝑑(𝑘,𝑖) and 𝐻(𝑤(𝑘,𝑖) , 𝑡(𝑘,𝑖) ) satisfy the tolerance criterions. At each predictor step and corrector step, the feasibility of the predictor point (𝑤𝑘 , 𝑡𝑘 ) and the corrector point (𝑤(𝑘,𝑖) , 𝑡(𝑘,𝑖) ) needs to be checked. If 𝑡𝑘 < 𝑡𝑐 in the predictor step or 𝑡𝑘 < 0 in the corrector step, a damping step is used to get a new point (𝑤(𝑘,0) , 0). Then, if 𝑤(𝑘,0) is feasible, the end game, a more efficient strategy than predictor-corrector steps when the homotopy parameter is close to 0, is invoked. Starting with 𝑤(𝑘,0) , Newton’s method is used to solve

(53)

(iii) The existence and convergence of the homotopy path defined by the MFACH can be proven in a similar way with that of the FACH.

4. The FACH-S-N Procedure and Numerical Results 4.1. The FACH-S-N Procedure. In this section, we give a numerical procedure, FACH-S-N procedure, to trace the flattened aggregate constraint homotopy path by secant predictor and Newton corrector steps. It consists of three main steps: the predictor step, the corrector step, and the end game. The predictor step is an approximate step along the homotopy path: it uses a predictor direction 𝑑𝑘 and a steplength ℎ𝑘 to get a predictor point. The first predictor direction uses the tangent direction, and others use the secant direction. The steplength is determined by several parameters. It is set to no more than 1, which can ensure that the predictor point is close to the homotopy path and hence stays in the convergence domain of Newton’s method in the corrector step. If the angle of the predictor direction and the previous one 𝛽𝑘 is greater than 𝜋/4, the steplength will be decreased to avoid using the opposite direction as the predictor direction. If the corrector criteria are satisfied with no more than four Newton iterations for three times in succession, the steplength will be increased or kept invariable. Otherwise, the steplength will be decreased. Once a predictor point (𝑤(𝑘,0) , 𝑡(𝑘,0) ) is calculated, one or more Newton iterations are used to bring the predictor point back to the homotopy path in the corrector step. The corrector points (𝑤(𝑘,𝑖) , 𝑡(𝑘,𝑖) ), 𝑖 = 1, . . . , 5, are calculated by (𝑤(𝑘,𝑖+1) , 𝑡(𝑘,𝑖+1) ) = (𝑤(𝑘,𝑖) , 𝑡(𝑘,𝑖) ) + 𝑑(𝑘,𝑖+1) , 𝑖 = 0, . . . , 4, where the step 𝑑(𝑘,𝑖+1) is the solution of an augmented system defined by the homotopy equation and the direction perpendicular to

4.2. The Numerical Experiment. Although there exist so many test problems, such as the CUTEr test set [21], we cannot find a large collection of test problems with moderate variables and very many complicated nonlinear constraints. In this paper, six test problems are chosen to test the algorithm. Problem 4.1 is chosen from the CUTEr test set and it is used to illustrate a special situation; others are derived from the discretized semi-infinite programming problems. We also give two artificial test problems 4.2 and 4.3 and use three problems 4.4–4.6 in [22]. The numbers of variables in problem 4.2 and the number of constrains in problems 4.2– 4.6 can be arbitrary. For each test problem, the gradients and Hessians of the objective and constraint functions are evaluated analytically. The FACH-S-N procedure was implemented in MATLAB. To illustrate its efficiency, we also implemented the ACH method in [19] using a similar procedure. In addition, we downloaded the state-of-the-art solver KNITRO, which provides three algorithms for solving large-scale nonlinear programming problems, and we used the interior-point direct method with default parameters to compare with the FACH and MFACH methods. For any iterate point 𝑥𝑘 , if 𝑔max (𝑥𝑘 ) < 10−6 , it was treated as a feasible point. The test results were obtained by running MATLAB R2008a on a desktop with Windows XP Professional operation system, Intel(R) Core(TM) i5-750 2.66 GHz processor, and 8 GB of memory. The default parameters were chosen as follows. (i) Parameters for the flattened aggregate constraint function: 𝜃 = 0.01, 𝑐1 = 0.05, 𝑐2 = 0.5 ∗ 10−5 , and 𝛼 = 2. (ii) Parameters in the end game section: 𝑡𝑐 = 10−6 and 𝑡end = 0.1. (iii) Step size parameters: ℎ0 = 0.1, 𝐵min = [0.5, 0.75], and 𝐵max = [3, 1.5]. (iv) Tracking tolerances: 𝐻tol = 10−5 and 𝐻final = 10−12 . (v) Initial Lagrangian multipliers: 1 for ACH, FACH, and MFACH methods.


9

̂ × 𝑅++ ; initial steplength ℎ1 , steplength Input Give 𝜃 ∈ (0, 1], 𝑐1 > 0, 𝑐2 > 0, 𝛼 > 1, 𝑡end and 𝑡𝑐 ; starting point 𝑤0 ∈ Ω contraction factor 𝐵min , steplength expansion factor 𝐵max ; tracking tolerances 𝐻tol and 𝐻final for correction. Set 𝑡0 = 1, 𝑘 = 1, IT = 0, 𝑁good = 2, 𝛽1 = 0. Step 1. The predictor step. Step 1.1. Compute the predictor direction. If 𝑘 = 1, compute 𝑑 by solving 𝐻󸀠 (𝑥0 , 𝜆0 , 1) ) 𝑑 = −𝑑0 , ( 𝑇 (𝑑0 ) where 𝑑0 = (0, . . . , 0, −1) ∈ 𝑅𝑛+2 , set the predictor direction 𝑑1 = 𝑑/ ‖𝑑‖; else, compute the predictor direction 𝑑𝑘 , the angle 𝛽𝑘 between 𝑑𝑘 and 𝑑𝑘−1 : 𝑇 𝑘−1 (𝑤𝑘−1 , 𝑡𝑘−1 ) − (𝑤𝑘−2 , 𝑡𝑘−2 ) 𝑑𝑘 = , 𝛽𝑘 = arccos ((𝑑𝑘 ) 𝑑 ) . 𝑘−1 𝑘−2 ‖(𝑤 , 𝑡𝑘−1 ) − (𝑤 , 𝑡𝑘−2 )‖ Step 1.2. Adjust the steplength. Adjust the steplength ℎ𝑘 as follows: if the predictor point is infeasible, set ℎ𝑘 = 𝐵min (1)ℎ𝑘 , 𝑁good = 0; else if the corrector step fails or 𝛽𝑘 > 𝜋/4, set ℎ𝑘 = 𝐵min (1)ℎ𝑘−1 , 𝑁good = 0; else if 𝑖 = 5, set ℎ𝑘 = 𝐵min (2)ℎ𝑘−1 , 𝑁good = 0; else if 𝑖 = 4, set ℎ𝑘 = ℎ𝑘−1 , 𝑁good = 𝑁good + 1; else if 𝑖 = 3, set 𝑁good = 𝑁good + 1, if 𝑁good > 2, set ℎ𝑘 = min{1, 𝐵max (2)ℎ𝑘−1 }, else, set ℎ𝑘 = ℎ𝑘−1 ; else, set 𝑁good = 𝑁good + 1, if 𝑁good > 2, set ℎ𝑘 = min{1, 𝐵max (1)ℎ𝑘−1 }, else, set ℎ𝑘 = ℎ𝑘−1 . If ℎ𝑘 < 10−10 , stop the algorithm with an error flag. Step 1.3. Compute the predictor point and check its feasibility. Compute the predictor point (𝑤(𝑘,0) , 𝑡(𝑘,0) ) = (𝑤𝑘−1 , 𝑡𝑘−1 ) + ℎ𝑘 𝑑𝑘 . 0 (𝑘,0) ∉ Ω𝜃 (𝑡(𝑘,0) ) × 𝑅++ or 𝑡(𝑘,0) ≥ 1, If 𝑤 the predictor point is infeasible, goto Step 1.2; else if 𝑡(𝑘,0) ≤ 𝑡end , adjust the steplength ℎ𝑘 , compute the point (𝑤(𝑘,0) , 0), if 𝑤(𝑘,0) ∈ Ω × 𝑅+ , goto Step 3, else, the predictor point is infeasible, goto Step 1.2; else, goto Step 2. Step 2. The corrector step. Set 𝑖 = 0. Repeat If 𝑖 = 5, set IT = IT + 5, 𝑤𝑘 = 𝑤𝑘−1 , 𝑡𝑘 = 𝑡𝑘−1 , 𝑑𝑘+1 = 𝑑𝑘 , 𝑘 = 𝑘 + 1, the corrector step fails, goto Step 1.2. Compute the Newton step 𝑑(𝑘,𝑖+1) by solving 𝐻󸀠 (𝑤(𝑘,𝑖) , 𝑡(𝑘,𝑖) ) (𝑘,𝑖+1) −𝐻 (𝑤(𝑘,𝑖) , 𝑡(𝑘,𝑖) ) )𝑑 =( ), ( 𝑘 𝑇 0 (𝑑 ) the corrector point (𝑤(𝑘,𝑖+1) , 𝑡(𝑘,𝑖+1) ) = (𝑤(𝑘,𝑖) , 𝑡(𝑘,𝑖) ) + 𝑑(𝑘,𝑖+1) ; set 𝑖 = 𝑖 + 1. If 𝑤(𝑘,𝑖) ∉ Ω𝜃 (𝑡(𝑘,𝑖) )0 × 𝑅++ or 𝑡(𝑘,𝑖) ≥ 1, set IT = IT + 𝑖, 𝑤𝑘 = 𝑤𝑘−1 , 𝑡𝑘 = 𝑡𝑘−1 , 𝑑𝑘+1 = 𝑑𝑘 , 𝑘 = 𝑘 + 1, the corrector step fails, goto Step 1.2; else if 𝑡(𝑘,𝑖) < 0, compute the point (𝑤(𝑘,0) , 0) by a damping Newton step, set IT = IT + 𝑖, if 𝑤(𝑘,0) ∈ Ω × 𝑅+ , goto Step 3, else, set 𝑤𝑘 = 𝑤𝑘−1 , 𝑡𝑘 = 𝑡𝑘−1 , 𝑑𝑘+1 = 𝑑𝑘 , 𝑘 = 𝑘 + 1, the corrector step fails, goto Step 1.2. (𝑘,𝑖) Until ‖𝐻(𝑤 , 𝑡(𝑘,𝑖) )‖inf ≤ 𝐻tol and ‖𝑑(𝑘,𝑖) ‖ ≤ 𝐻tol . If 𝑡𝑘 < 𝑡𝑐 , return with 𝑥∗ = 𝑥𝑘 , stop the algorithm; else, set 𝐻tol = min{𝐻tol , 𝑡𝑘 }, 𝑘 = 𝑘 + 1, goto Step 1. Step 3. The end game. Set 𝑙 = 0. Repeat If 𝑙 = 5 or ‖𝑑(𝑘,𝑙) ‖ > ‖𝑑(𝑘,𝑙−1) ‖ for 𝑙 > 1, or 𝑤(𝑘,𝑙) ∉ Ω × 𝑅+ , set 𝑡end = 0.3 ∗ 𝑡end , 𝑤𝑘 = 𝑤𝑘−1 , 𝑡𝑘 = 𝑡𝑘−1 , 𝑑𝑘+1 = 𝑑𝑘 , IT = IT + 𝑙, 𝑘 = 𝑘 + 1, the corrector step fails, goto Step 1.2. Compute the Newton step 𝑑(𝑘,𝑙+1) by solving 𝐹𝑡󸀠𝑐 (𝑤(𝑘,𝑙) )𝑑(𝑘,𝑙+1) = −𝐹𝑡𝑐 (𝑤(𝑘,𝑙) ), the corrector point 𝑤(𝑘,𝑙+1) = 𝑤(𝑘,𝑙) + 𝑑(𝑘,𝑙+1) ; Algorithm 1: Continued.

10


set 𝑙 = 𝑙 + 1. Until ‖𝐹𝑡𝑐 (𝑤(𝑘,𝑙) )‖inf ≤ 𝐻final and ‖𝑑(𝑘,𝑙) ‖ ≤ 𝐻final . Set IT = IT + 𝑙, return with 𝑥∗ = 𝑥(𝑘,𝑙) , stop the algorithm. Algorithm 1: The FACH-S-N procedure.

For each problem with different parameters, we list the value of objective function 𝑓(𝑥) and the maximal function of constraints 𝑔max (𝑥) at 𝑥∗ , the number of iterations 𝐼𝑇, the number of evaluations of gradients of individual constraint functions in the FACH and MFACH methods 𝑁𝑠 (in contrast, 𝑁𝑠 is 𝑚 ∗ 𝐼𝑇 in the ACH method), and the running time in seconds 𝑇𝑖𝑚𝑒. For problems that were not solved by the conservative setting, we also give the reason for failure. The notation “fail1 ” indicates that the steplength in predictor step is smaller than 10−10 ; it is generally due to poor conditioned Jacobian matrix. The notation “fail2 ” means out of memory. The notation “fail3 ” means no result in 5000 iterations or 3600 seconds.

Table 1: Test results for Example 14. Method ACH KNITRO FACH MFACH

2 ), ∑ 100 sin (−𝑥𝑖 + 1.5𝜋 + 𝑥𝑖−1

(55)

+

(𝑡𝑗󸀠 − 𝑥2 ) 𝑥42

− 1,

𝑡𝑖 =

𝑖 , (√𝑚 − 1)

𝑖 = 0, . . . , √𝑚 − 1,

𝑡𝑗󸀠 =

𝑗 , (√𝑚 − 1)

𝑗 = 0, . . . , √𝑚 − 1,

2≤𝑖≤1000

(57)

𝑥0 = (0, 0, 100, 100) ∈ 𝑅4 . Example 17. Consider

𝑥0 = (1, . . . , 1) ∈ 𝑅1000 . Remark 15. In this example, the starting point happens to be an unconstrained minimizer of the objective function and we found that the 𝑥-components of iterative points generated by the FACH and MFACH methods (not other homotopy methods) remain invariant. This is not an occasional phenomenon. In fact, if (𝑥0 , 0) ∈ Ω0 × 𝑅+𝑚 is a solution of the KKT system, that is, 𝑥0 is a stationary point of the objective function 𝑓(𝑥), when the parameters 𝛼, 𝑐1 , and 𝑐2 in FACH and MFACH methods satisfy 𝑔max (𝑥0 ) ≤ −𝛼(𝑐1 + 𝑐2 ), the 𝑥-components of points on the homotopy path Γ𝑤0 will remain invariant, which can be derived from the fact that (𝑥0 , 𝑡𝜆0 𝜀(1)/𝜀(𝑡), 𝑡) is the solution of (39) and (50) for 𝑡 ∈ (0, 1]. Moreover, since 𝐻󸀠 (𝑥0 , 𝜆0 , 1) = (

Time 85.5496 2.8733 8.3912 8.4150

2

2

(𝑡 − 𝑥 ) 𝑔𝑖,𝑗 (𝑥) = 𝑖 2 1 𝑥3

𝑖 = 1; 𝑖 = 2, . . . , 1000; 𝑖 = 1001; 𝑖 = 1002, . . . , 2000,

𝑁𝑠 — — 0 0

IT 35 0 4 4

𝑓 (𝑥) = 𝑥32 + 𝑥42 ,

𝑓 (𝑥) = sin (𝑥1 − 1 + 1.5𝜋)

𝑥1 − 𝜋 { { 2 { {𝑥𝑖−1 − 𝑥𝑖 − 𝜋 𝑔𝑖 (𝑥) = { { −𝑥 − 𝜋 { { 12 {−𝑥𝑖−1 + 𝑥𝑖 − 𝜋

𝑔max (𝑥∗ ) −2.1416 −2.1416 −2.1416 −2.1416

Example 16. Consider

Example 14 (see [21]). Consider

+

𝑓(𝑥∗ ) −99901.0000 −99901.0000 −99901.0000 −99901.0000

𝐼 0 0 ), 0 𝑔̂ (𝑥0 , 1) ∗

(56)

the 𝑥-component of 𝑑1 is 0. By a similar discussion, the 𝑥component of 𝑑(1,𝑖) is 0, and hence 𝑥1 = 𝑥(1,𝑖) = 𝑥0 . In this analogy, we know that 𝑥𝑘 = 𝑥(𝑘,𝑖) = 𝑥0 for any 𝑘 and 𝑖 in algorithm FACH-S-N.

𝑓 (𝑥) =

1 2 ∑ (𝑥 − 1) , 𝑛 1≤𝑘≤𝑛 𝑘

𝑔𝑖 (𝑥) = ∏ cos (𝑡𝑖 𝑥𝑘 ) + 𝑡𝑖 ∑ 𝑥𝑘3 , 𝑡𝑖 =

1≤𝑘≤𝑛

1≤𝑘≤𝑛

0.5 + 𝜋𝑖 , (𝑚 − 1)

𝑖 = 0, . . . , 𝑚 − 1,

(58)

𝑥0 = (−2, . . . , −2) ∈ 𝑅𝑛 . Example 18 (see [22]). Consider 𝑓 (𝑥) = 𝑥12 + 𝑥22 + 𝑥32 , 𝑔𝑖 (𝑥) = 𝑥1 + 𝑥2 exp (𝑥3 𝑡𝑖 ) + exp (2𝑡𝑖 ) − 2 sin (4𝑡𝑖 ) , 𝑡𝑖 =

𝑖 , (𝑚 − 1)

𝑖 = 0, . . . , 𝑚 − 1,

𝑥0 = (−200, −200, 200) ∈ 𝑅3 .

(59)


11 Table 2: Test results for Example 16.

𝑚

Method ACH KNITRO FACH MFACH ACH KNITRO FACH MFACH ACH KNITRO FACH MFACH

102

104

106

∗

𝑓(𝑥 ) 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

𝑔max (𝑥∗ ) −1.3863e − 08 −7.1233e − 10 −1.3863e − 08 −1.3863e − 08 −1.3863e − 08 −8.0005e − 10 −1.3863e − 08 −1.3863e − 08 −1.3863e − 08 −7.8830e − 10 −1.3863e − 08 −1.3863e − 08

IT 394 18 541 567 397 20 541 567 404 13 541 567

𝑁𝑠 — — 721 823 — — 721 823 — — 805 907

Time 0.2423 0.2288 0.3270 0.2692 1.9730 0.7227 0.4359 0.4676 249.3395 2170.4301 10.9486 11.2815

Table 3: (a) Test results for Example 17 with 𝑛 = 100. (b) Test results for Example 17 with 𝑛 = 500. (c) Test results for Example 17 with 𝑛 = 1000. (d) Test results for Example 17 with 𝑛 = 2000. (a)

𝑚 102

103

104


∗

𝑓(𝑥 ) 1.4918 0.2603 1.4915 1.4915 1.4918 0.2603 1.4915 1.4915 1.4918 0.2603 1.4915 1.4915

𝑔max (𝑥∗ ) −1.5282e − 03 −1.9708e − 06 −6.6613e − 16 6.1062e − 15 −1.5282e − 03 −2.7042e − 05 −6.6613e − 16 6.1062e − 15 −1.5282e − 03 −2.5434e − 04 −6.6614e − 16 6.1062e − 15

IT 243 32 156 144 243 19 156 144 243 18 156 144

𝑁𝑠 — — 96 85 — — 96 85 — — 96 86

Time 36.9898 4.1262 0.4067 0.3186 325.6083 25.1720 0.6852 0.5342 3218.5422 271.1651 5.1941 4.0120

IT 405 31 159 139 — — 159 139 — — 159 139

𝑁𝑠 — — 45 25 — — 45 25 — — 45 25

Time 1371.0198 95.1782 3.9951 2.9676 fail3 fail3 6.5173 4.6898 fail2 fail2 31.7107 25.1842

IT — 27 197 197

𝑁𝑠 — — 46 46

Time fail3 359.4507 20.8919 21.2641

(b)

𝑚 102

103

104


∗

𝑓(𝑥 ) 1.2524 0.1417 1.2510 1.2510 — — 1.2510 1.2510 — — 1.2510 1.2510

𝑔max (𝑥∗ ) −1.0693e − 02 −2.0927e − 05 1.2490e − 14 1.2490e − 14 — — 1.2490e − 14 1.2490e − 14 — — 1.2490e − 14 1.2490e − 14 (c)

𝑚 102

Method ACH KNITRO FACH MFACH

𝑓(𝑥∗ ) — 0.1099 1.1880 1.1880

𝑔max (𝑥∗ ) — −5.6905e − 05 2.3370e − 14 2.4258e − 14

12

Abstract and Applied Analysis (c) Continued.

𝑚

∗

103

104

𝑔max (𝑥∗ ) — — 2.3370e − 14 2.4258e − 14 — — 2.3370e − 14 2.4258e − 14

𝑓(𝑥 ) — — 1.1880 1.1880 — — 1.1880 1.1880

Method ACH KNITRO FACH MFACH ACH KNITRO FACH MFACH

IT — — 197 197 — — 197 197

𝑁𝑠 — — 46 46 — — 46 46

Time fail2 fail2 28.8674 27.5140 fail2 fail2 101.3199 99.1354

IT — — 260 258 — — 260 258 — — 260 258

𝑁𝑠 — — 47 46 — — 47 46 — — 47 46

Time fail3 fail3 126.6290 125.7973 fail2 fail2 146.6079 145.7293 fail2 fail2 368.5331 369.3701

IT 1605 2 815 820 1605 2 815 820 1606 2 960 931

𝑁𝑠 — — 288 293 — — 288 294 — — 6444 7094

(d)

𝑚 102

103

104

∗

𝑔max (𝑥∗ ) — — −6.2839e − 14 4.0967e − 14 — — −6.2839e − 14 4.0967e − 14 — — −6.2839e − 14 4.0967e − 14

𝑓(𝑥 ) — — 1.1406 1.1406 — — 1.1406 1.1406 — — 1.1406 1.1406


Table 4: Test results for Example 18. 𝑚

𝑓(𝑥∗ ) 5.3347 1.2000e + 05 5.3347 5.3347 5.3347 1.2000e + 05 5.3347 5.3347 5.3347 1.2000e + 05 5.3347 5.3347


102

104

106

𝑔max (𝑥∗ ) −4.4409e − 16 −3.9900e + 02 1.3323e − 15 1.3323e − 15 −4.4409e − 16 −3.9900e + 02 1.3323e − 15 1.3323e − 15 1.3323e − 15 −3.9900e + 02 −9.2480e − 08 −1.1278e − 07



𝑓 (𝑥) = (𝑥1 − 2𝑥2 + 5𝑥22 − 𝑥23 − 13) + (𝑥1 − 14𝑥2 +

𝑥22

+

𝑥23

𝑓 (𝑥) =

2

𝑖 , (𝑚 − 1)

𝑥0 = (0, −45) ∈ 𝑅2 .

2

− 29) ,

𝑖 = 0, . . . , 𝑚 − 1,

𝑥12 𝑥1 + + 𝑥22 , 3 2

𝑔𝑖 (𝑥) = (1 − 𝑥12 𝑡𝑖2 ) − 𝑥1 𝑡𝑖2 − 𝑥22 + 𝑥2 ,

2

𝑔𝑖 (𝑥) = 𝑥12 + 2𝑥2 𝑡𝑖2 + exp (𝑥1 + 𝑥2 ) − exp (𝑡𝑖 ) , 𝑡𝑖 =

Time 0.7546 0.0141 0.2977 0.3004 7.0581 0.3163 1.3127 1.3185 575.4663 3243.2201 42.7664 41.7933

(60)

𝑖 𝑡𝑖 = , (𝑚 − 1)

(61)

𝑖 = 0, . . . , 𝑚 − 1,

𝑥0 = (−1, 100) ∈ 𝑅2 . To explain the numerical efficiency, we make the following remarks by numerical results in Tables 1, 2, 3, 4, 5 and 6.


13 Table 5: Test results for Example 19.

𝑚 102

104

106

∗

Method ACH KNITRO FACH MFACH ACH KNITRO FACH MFACH

𝑓(𝑥 ) 97.1589 97.1592 97.1589 97.1589 97.1589 102.2095 97.1589 97.1589

𝑔max (𝑥∗ ) 0.0000 −7.5266 − 05 0.0000 0.0000 0.0000 −6.7465 − 01 0.0000 0.0000

IT 213 68 170 160 227 263 249 251

𝑁𝑠 — — 191 183 — — 10926 11321

Time 0.0964 0.5162 0.0795 0.0608 0.5310 15.4868 0.2846 0.2783

ACH KNITRO FACH MFACH

97.1589 188.9496 97.1589 97.1589

0.0000 −8.3841e − 01 0.0000 0.0000

229 17 334 253

— — 427520 486506

50.8763 2206.4660 9.5724 6.8910

Table 6: Test results for Example 20. Method

𝑓(𝑥∗ )

𝑔max (𝑥∗ )

IT

𝑁𝑠

Time

10


2.4305 2.4305 2.4305 2.4305

0.0000 −1.3822e − 10 2.2204e − 16 2.2204e − 16

342 21 348 341

— — 387 374

0.1749 0.0291 0.1368 0.1276

104


2.4305 2.4305 2.4305 2.4305

−6.6632e − 09 −1.6180e − 10 −6.6632e − 09 −6.6632e − 09

344 38 296 358

— — 14710 23559

0.9804 1.0329 0.2610 0.2802

106


2.4305 2.6182 — 2.4305

−4.9783e − 08 −1.2609e − 04 — −4.9021e − 08

344 61 — 516

— — — 2231308

94.3943 2372.2093 fail1 16.3756

𝑚 2

(i) If 𝑔max (𝑥) ≤ −𝛼𝜀(𝑡) for any (𝑥, 𝜆, 𝑡) ∈ Γ𝑤0 , the FACH and MFACH methods do not need to calculate the gradient and Hessian of any constraint functions. (ii) For problems whose gradients and Hessians of constraint functions are expensive to evaluate, the performance of the FACH and MFACH methods is much better than the ACH method based on similar numerical tracing procedures. (iii) Compared to the interior-point direct algorithm of the state-of-the-art solver KNITRO, the test results show that the FACH and MFACH methods perform worse when 𝑚 is small but much better when 𝑚 is large for most problems. In addition, we can see that their time cost increases more slowly than KNITRO solver as 𝑚 increases. (iv) The function 𝜑(𝑧, 𝑡) is important for the flattened aggregate constraint function. Theoretically, the parameters 𝑐1 , 𝑐2 , and 𝛼 in the FACH and MFACH methods can be chosen freely. However, they do matter in the practical efficiency of the FACH and MFACH methods and should be suitably chosen. If

these parameters are too large, then too many gradients and Hessians of individual constraint functions need to be evaluated and, hence, cause low efficiency. On the other hand, if they are too small, the Hessian of the flattened aggregated constraint function (10) may become ill-conditioned. In our numerical tests, we fixed 𝑐1 = 0.05, 𝑐2 = 0.5 ∗ 10−5 , and 𝛼 = 2. In addition, the function 𝜑(𝑧, 𝑡) can be defined in many ways, and preliminary numerical experiments show that the algorithms with different functions 𝜑(𝑧, 𝑡) have similar efficiencies. (v) In algorithm FACH-S-N, we gave only a simple implementation of the ACH, FACH, and MFACH methods. To improve implementation of the FACH and MFACH methods, a lot of work needs to be done on all processes of numerical path tracing, say, schemes of predictor and corrector, steplength updating, linear system solving, and end game. Other practical strategies in the literature for large-scale nonlinear programming problems (e.g., [9, 23–26]) are also very important for improving the efficiency.

14


5. Conclusions By introducing a flattened aggregate constraint function, a flattened aggregate constraint homotopy method is proposed for nonlinear programming problems with few variables and many nonlinear constraints, and its global convergence is proven. By greatly reducing the computation of gradients and Hessians of constraint functions in each iteration, the proposed method is very competitive for nonlinear programming problems with a large number of complicated constraint functions.

Conflict of Interests The authors declare that there is no conflict of interests regarding the publication of this paper.

[12]

[13]

[14]

[15]

[16]

Acknowledgment The research was supported by the National Natural Science Foundation of China (11171051, 91230103).

[17]

References

[18]

[1] A. Ben-Tal and A. Nemirovski, Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, Society for Industrial and Applied Mathematics, Philadelphia, Pa, USA, 2001. [2] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, UK, 2004. [3] C. B. Garcia and W. I. Zangwill, Pathways to Solutions, Fixed Points, and Equilibria, Prentice Hall, Englewood Cliffs, NJ, USA, 1981. [4] Y. Nesterov and A. Nemirovskii, Interior-Point Polynomial Algorithms in Convex Programming, vol. 13 of SIAM Studies in Applied Mathematics, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, Pa, USA, 1994. [5] T. Terlaky, Interior Point Methods of Mathematical Programming, vol. 5 of Applied Optimization, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1996. [6] Y. Y. Ye, Interior Point Algorithms, Wiley-Interscience Series in Discrete Mathematics and Optimization, John Wiley & Sons, New York, NY, USA, 1997. [7] A. Forsgren and P. E. Gill, “Primal-dual interior methods for nonconvex nonlinear programming,” SIAM Journal on Optimization, vol. 8, no. 4, pp. 1132–1152, 1998. [8] D. M. Gay, M. L. Overton, and M. H. Wright, “A primal-dual interior method for nonconvex nonlinear programming,” in Advances in Nonlinear Programming, vol. 14 of Applied Optimization, pp. 31–56, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1998. [9] R. J. Vanderbei and D. F. Shanno, “An interior-point algorithm for nonconvex nonlinear programming,” Computational Optimization and Applications, vol. 13, no. 1–3, pp. 231–252, 1999. [10] G. Feng, Z. Lin, and B. Yu, “Existence of an interior pathway to a Karush-Kuhn-Tucker point of a nonconvex programming problem,” Nonlinear Analysis: Theory, Methods & Applications, vol. 32, no. 6, pp. 761–768, 1998. [11] G. C. Feng and B. Yu, “Combined homotopy interior point method for nonlinear programming problems,” in Advances in

[19]

[20]

[21]

[22]

[23]

[24]

[25]

[26]

Numerical Mathematics, vol. 14 of Lecture Notes in Numerical and Applied Analysis, pp. 9–16, 1995. L. T. Watson, “Theory of globally convergent probability-one homotopies for nonlinear programming,” SIAM Journal on Optimization, vol. 11, no. 3, pp. 761–780, 2000/01. Q. H. Liu, B. Yu, and G. C. Feng, “An interior point pathfollowing method for nonconvex programming with quasi normal cone condition,” Advances in Mathematics, vol. 19, no. 4, pp. 281–282, 2000. B. Yu, Q. H. Liu, and G. C. Feng, “A combined homotopy interior point method for nonconvex programming with pseudo cone condition,” Northeastern Mathematical Journal, vol. 16, no. 4, pp. 383–386, 2000. Y. F. Shang, Constraint shifting combined homotopy method for nonlinear programming, equilibrium programming and variational inequalities [Ph.D. thesis], Jilin University, 2006. B. Yu and Y. F. Shang, “Boundary moving combined homotopy method for nonconvex nonlinear programming,” Journal of Mathematical Research and Exposition, vol. 26, no. 4, pp. 831– 834, 2006. X. S. Li, “An aggregate function method for nonlinear programming,” Science in China A: Mathematics, Physics, Astronomy, vol. 34, no. 12, pp. 1467–1473, 1991. B. W. Kort and D. P. Bertsekas, “A new penalty function algorithm for constrained minimization,” in Proceedings of the 1972 IEEE Conference on Decision and Control, New Orleans, La, USA, 1972. B. Yu, G. C. Feng, and S. L. Zhang, “The aggregate constraint homotopy method for nonconvex nonlinear programming,” Nonlinear Analysis: Theory, Methods & Applications, vol. 45, no. 7, pp. 839–847, 2001. E. L. Allgower and K. Georg, Introduction to Numerical Continuation Methods, vol. 45 of Classics in Applied Mathematics, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, Pa, USA, 2003. N. I. M. Gould, D. Orban, and P. L. Toint, “CUTEr (and sifdec), a constrained and unconstrained testing environment, revisited,” Tech. Rep. TR/PA/01/04, CERFACS, Toulouse, France, 2001. D. H. Li, L. Qi, J. Tam, and S. Y. Wu, “A smoothing Newton method for semi-infinite programming,” Journal of Global Optimization, vol. 30, no. 2-3, pp. 169–194, 2004. R. H. Byrd, M. E. Hribar, and J. Nocedal, “An interior point algorithm for large-scale nonlinear programming,” SIAM Journal on Optimization, vol. 9, no. 4, pp. 877–900, 1999. A. R. Conn, N. I. M. Gould, and Ph. L. Toint, LANCELOT: A Fortran Package for Large-Scale Nonlinear Optimization (Release A), vol. 17 of Springer Series in Computational Mathematics, Springer, New York, NY, USA, 1992. R. Fletcher and S. Leyffer, “Nonlinear programming without a penalty function,” Mathematical Programming A, vol. 91, no. 2, pp. 239–269, 2002. P. E. Gill, W. Murray, and M. A. Saunders, “SNOPT: an SQP algorithm for large-scale constrained optimization,” SIAM Journal on Optimization, vol. 12, no. 4, pp. 979–1006, 2002.

Advances in

Operations Research Hindawi Publishing Corporation http://www.hindawi.com

Volume 2014

Advances in

Decision Sciences Hindawi Publishing Corporation http://www.hindawi.com

Volume 2014

Journal of

Applied Mathematics

Algebra

Hindawi Publishing Corporation http://www.hindawi.com


Volume 2014

Journal of

Probability and Statistics Volume 2014

The Scientific World Journal Hindawi Publishing Corporation http://www.hindawi.com


Volume 2014

International Journal of

Differential Equations Hindawi Publishing Corporation http://www.hindawi.com

Volume 2014

Volume 2014

Submit your manuscripts at http://www.hindawi.com International Journal of

Advances in

Combinatorics Hindawi Publishing Corporation http://www.hindawi.com

Mathematical Physics Hindawi Publishing Corporation http://www.hindawi.com

Volume 2014

Journal of

Complex Analysis Hindawi Publishing Corporation http://www.hindawi.com

Volume 2014

International Journal of Mathematics and Mathematical Sciences

Mathematical Problems in Engineering

Journal of

Mathematics Hindawi Publishing Corporation http://www.hindawi.com

Volume 2014


Volume 2014

Volume 2014


Volume 2014

Discrete Mathematics

Journal of

Volume 2014


Discrete Dynamics in Nature and Society

Journal of

Function Spaces Hindawi Publishing Corporation http://www.hindawi.com


Volume 2014


Volume 2014


Volume 2014

International Journal of

Journal of

Stochastic Analysis

Optimization



Volume 2014

Volume 2014