European Journal of Operational Research 173 (2006) 444–464
Continuous Optimization

A sequential cutting plane algorithm for solving convex NLP problems

Claus Still a,*, Tapio Westerlund b

a Department of Mathematics, Åbo Akademi University, Fänriksgatan 3B, FIN-20500 Åbo, Finland
b Process Design Laboratory, Åbo Akademi University, Biskopsgatan 8, FIN-20500 Åbo, Finland

Received 25 September 2002; accepted 11 February 2005
Available online 14 April 2005

Abstract

In this paper we look at a new algorithm for solving convex nonlinear programming (NLP) problems. The algorithm is a cutting plane-based method in which the sizes of the subproblems remain fixed, thus avoiding the constantly growing subproblems of the classical Kelley's cutting plane algorithm. Initial numerical experiments indicate that the algorithm is considerably faster than Kelley's cutting plane algorithm and also competitive with existing nonlinear programming algorithms.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Convex programming; Nonlinear programming; Cutting plane algorithms; Sequential linear programming

1. Introduction

Cutting plane techniques have been successfully applied to solving mixed-integer nonlinear programming (MINLP) problems. The Extended Cutting Plane (α-ECP) algorithm was introduced in [17] for solving convex MINLP problems. Later, the algorithm was extended to pseudo-convex problems with pseudo-convex constraints and a convex objective function in [18,14], and was further improved to solve problems with pseudo-convex objective functions in [12].

* Corresponding author. Tel.: +358 40 7393709.
E-mail addresses: claus.still@abo.fi (C. Still), tapio.westerlund@abo.fi (T. Westerlund).

doi:10.1016/j.ejor.2005.02.045


There are, however, some well-known shortcomings in the original Kelley's cutting plane algorithm [9], the foundation of the above-mentioned algorithms. Convergence of the algorithm for nonlinear programming (NLP) problems is slow, and the linear programming (LP) subproblems grow in size as more and more linearizations are added in each iteration. One obvious way to improve the convergence properties of the Extended Cutting Plane algorithm is therefore to look for more efficient cutting plane methods for solving NLP problems.

In this paper, we describe one such algorithm, which we call the Sequential Cutting Plane (SCP) algorithm. Numerical experiments with the SCP algorithm indicate that it is considerably faster than the original Kelley's cutting plane algorithm when measured by the number of LP subproblems solved. This is a fair measure of the performance of these algorithms for problems where the objective and constraints are easy to evaluate. Typically, we would have explicit expressions defined for the objective and constraints in the optimization problems we are solving. In some cases the constraints may, however, be time-consuming to evaluate. For instance, the optimization problems may have constraints whose values and gradients require solutions of systems of partial differential equations [8], which may consume most of the time used in the optimization process. For such problems, the number of nonlinear function evaluations gives a better indication of the performance of the algorithm.

Additionally, the nature of the LP subproblems solved in Kelley's cutting plane algorithm and in the SCP algorithm is different. The LP subproblems grow in each iteration of Kelley's cutting plane algorithm as new linearizations are added. Eventually, the LP subproblems become very large and slow to solve. In the SCP algorithm the maximum size of the LP subproblems is limited, which further adds to the competitive advantage of the SCP algorithm over Kelley's cutting plane algorithm.

The positive performance results obtained in the numerical experiments indicate that the algorithm could be used as the basis for a new MINLP optimization algorithm. In another paper [15], the authors present an MINLP version of the algorithm with promising numerical results compared to existing MINLP algorithms.

2. Overview

We consider minimization of the standard nonlinear optimization problem

$$
\min\; f(x) \quad \text{s.t.} \quad g(x) \le 0, \; x \in \mathbb{R}^n \tag{NLP}
$$

and make the following assumptions throughout the paper:

Assumption 1. The functions f : R^n → R and g_j : R^n → R, j = 1, . . ., m, are continuously differentiable and convex.

Assumption 2. The constraints g(x) ≤ 0 include linear constraints defining a bounded region X.

Assumption 3. There exists an x ∈ R^n such that g_j(x) < 0 for all j = 1, . . ., m.

We attempt to solve this problem using an algorithm that solves a sequence of LP problems of limited size. The maximum size of the LP problems is limited by m + n − 1 rows, where m constraints are linearizations of the constraints of the original problem (NLP) and up to n − 1 constraints are additional equality constraints formed in order to reduce the set of feasible solutions and force the iterate closer to the feasible region of (NLP). Convergence of the algorithm is proved using a merit function that is reduced in each iteration of the algorithm.
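To make the setting concrete, the following is a small, hypothetical instance of (NLP) in Python satisfying Assumptions 1–3. The functions and constants are our own illustration and do not come from the paper; the later sketches in this section assume the same conventions (g returns the vector of constraint values, jac_g the corresponding Jacobian).

```python
import numpy as np

def f(x):                 # convex, continuously differentiable objective
    return (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2

def grad_f(x):
    return np.array([2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0)])

def g(x):                 # g(x) <= 0; one convex nonlinear constraint plus
    return np.array([     # linear constraints bounding X (Assumption 2)
        x[0] ** 2 + x[1] ** 2 - 2.0,
        -x[0] - 3.0,  x[0] - 3.0,
        -x[1] - 3.0,  x[1] - 3.0,
    ])

def jac_g(x):             # Jacobian of g, one row per constraint
    return np.array([
        [2.0 * x[0], 2.0 * x[1]],
        [-1.0, 0.0], [1.0, 0.0],
        [0.0, -1.0], [0.0, 1.0],
    ])

# Assumption 3 (a strictly interior point exists) holds, e.g. at the origin:
assert np.all(g(np.zeros(2)) < 0)
```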


3. Algorithm

In this section we take a closer look at the proposed algorithm. The algorithm proceeds by taking a number of NLP iterations until the optimal solution is found. In an NLP iteration, a sequence of LP subiterations is performed. In each LP subiteration a linear programming problem is solved and a line search is performed along the search direction obtained as the solution to the LP problem. Each linear programming problem is formed by constructing cutting planes at the current iterate. The LP subiterations are performed until some subiteration termination criterion is met. The obtained iterate is required to reduce a merit function sufficiently; if it does not, a new iterate that reduces the merit function sufficiently is generated. This process is continued until an optimal solution is found. A summary of the algorithm is given in Section 3.4.

3.1. LP subiterations

In an LP subiteration we solve an LP problem formed by constructing cutting planes at the current iterate x(i), where (i) denotes the LP subiteration within the NLP iteration. The LP problem we solve in step (i) is

$$
\begin{aligned}
\min\;& \nabla f(x^{(i)})^T d, \\
\text{s.t.}\;& \nabla g_j(x^{(i)})^T d \le -g_j(x^{(i)}), \quad j = 1, \ldots, m, \\
& (d^{(r)})^T H^{(i)} d = 0, \quad r = 1, \ldots, i-1, \; i > 1, \\
& d^L \le d \le d^U,
\end{aligned}
\tag{LP(i)}
$$

where H(i) is an estimate of the Hessian of the Lagrangian of (NLP) and d(r), r = 1, . . ., i − 1, are the search directions obtained previously within the NLP iteration. Moreover, d^L (< 0) and d^U (> 0) are lower and upper bounds on d, respectively. We call the solution to this problem d(i). For the first LP subiteration we initialize x(1) = x_k, where x_k is the iterate in NLP iteration k.

3.1.1. Line search

The solution d(i) to LP(i) provides a search direction. We use this search direction to minimize the augmented Lagrangian function

$$
\tilde{L}(x, \lambda) = f(x) + \sum_{j=1}^m \lambda_j g_j(x)^+ + \rho \sum_{j=1}^m (g_j(x)^+)^2,
$$

where λ is an estimate of the Lagrange multipliers for the problem (NLP), g_j(x)^+ = max(g_j(x), 0) and ρ (> 0) is a penalty parameter.

Let L̃(i)(x) = L̃(x, λ(i)), where λ(i) is the estimate of the Lagrange multipliers in LP subiteration (i). The line search then looks for the minimum

$$
\alpha^{(i)} = \arg\min_{0 \le \alpha \le 1} \tilde{L}^{(i)}(x^{(i)} + \alpha d^{(i)}),
$$

and we let x(i+1) = x(i) + α(i)d(i), where α(i) is the solution from the line search. We require α(i) ∈ [0, 1] in order to keep the iterate within the trust region or, in other words, to ensure that d^L ≤ α(i)d(i) ≤ d^U.

3.1.2. Subiteration termination criteria

After finding x(i+1) in the line search, we repeat the procedures described above in the next LP subiteration with x(i+1) as the starting point. In other words, we set i := i + 1, form a new LP problem LP(i) and use the search direction d(i), obtained as the optimal solution to LP(i), in the line search outlined in Section 3.1.1 to obtain the next iterate.


We continue this loop until a subiteration termination criterion is met. We terminate the LP subiterations if one of the following criteria is met:

• When i > n, where n = dim(x).
• When ‖d(i)‖ < δ, where the tolerance δ determines when the search direction d(i) is sufficiently close to zero.
• When LP(i) is infeasible.
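As a concrete sketch, the subproblem LP(i) can be assembled and solved with SciPy's linprog as below. The function boundary, argument names and the handling of the HiGHS dual sign convention are our own assumptions, not part of the paper (the authors used CPLEX from MATLAB). The duals of the cutting-plane rows double as the Lagrange multiplier estimates discussed in Section 3.6.4.

```python
import numpy as np
from scipy.optimize import linprog

def solve_lp_subproblem(grad_f_x, g_x, jac_g_x, H, prev_dirs, d_lo, d_hi):
    """Solve LP(i): min grad_f(x)^T d  s.t.  grad_g_j(x)^T d <= -g_j(x),
    (d^(r))^T H d = 0 for previously obtained directions d^(r),
    and d^L <= d <= d^U."""
    A_eq = np.array([dr @ H for dr in prev_dirs]) if prev_dirs else None
    b_eq = np.zeros(len(prev_dirs)) if prev_dirs else None
    res = linprog(grad_f_x, A_ub=jac_g_x, b_ub=-np.asarray(g_x),
                  A_eq=A_eq, b_eq=b_eq,
                  bounds=list(zip(d_lo, d_hi)), method="highs")
    if not res.success:
        return None, None                 # LP(i) infeasible
    # HiGHS reports marginals of <= rows as nonpositive, hence the sign flip
    # to obtain nonnegative multiplier estimates.
    return res.x, -res.ineqlin.marginals
```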

3.2. NLP iteration

In each NLP iteration, a sequence of LP subiterations is performed starting from the current NLP iterate x_k, where k is the number of the NLP iteration. The resulting iterate from the LP subiterations is required to reduce a merit function sufficiently. If it does not, a new iterate that reduces the merit function sufficiently must be generated. The new NLP iterate x_{k+1} can then be used as the starting point for the next NLP iteration. We continue the process until we find an optimal solution to the optimization problem (NLP).

The merit function used is

$$
M(x) = f(x) + \sum_{j=1}^m c_j g_j(x)^+,
$$

where c_j has been chosen to satisfy c_j > λ_j for every estimate λ of the Lagrange multipliers for (NLP). In order to accept x_{k+1}, we require that

$$
M(x_{k+1}) \le M(x_k + \alpha_k d_k) \tag{1}
$$

and that

$$
\sigma \le \frac{M(x_k + \alpha_k d_k) - M(x_k)}{\alpha_k D_{d_k} M(x_k)} \le 1 - \sigma, \qquad \sigma \in \left(0, \tfrac{1}{2}\right), \tag{2}
$$

where D_d M is the directional derivative of M in the direction d, i.e.

$$
D_d M(x) = \nabla f(x)^T d + \sum_{j=1}^m c_j D_d g_j(x)^+, \qquad
D_d g_j(x)^+ = \begin{cases} \nabla g_j(x)^T d, & \text{if } g_j(x) > 0, \\ \max(\nabla g_j(x)^T d, 0), & \text{if } g_j(x) = 0, \\ 0, & \text{otherwise.} \end{cases}
$$

Here d_k is the search direction obtained in the first LP subiteration of NLP iteration k and α_k the result from the corresponding line search, i.e. d_k = d(1) and α_k = α(1). We refer to this test as the modified Goldstein rule.

3.2.1. Finding acceptable iterates

The modified Goldstein rule described above is based on the Goldstein test for differentiable functions, see [10], where we have replaced the derivative with the directional derivative D_{d_k} M. In case the iterate x_{k+1}, obtained from the LP subiterations, does not cause a sufficient decrease in the merit function, the algorithm has to find a new iterate that does. A new iterate can be generated by restarting from x_k and minimizing the merit function M in the search direction d(1) obtained in the first LP subiteration as the solution to LP(1). In other words, we let

$$
\alpha_M = \arg\min_{0 \le \alpha \le 1} M(x_k + \alpha d^{(1)})
$$

and x_{k+1} = x_k + α_M d(1). We prove in Theorem 1 that the search direction obtained in the first LP subiteration is a descent direction for the merit function. Thus, using this strategy, the iterate will cause a sufficient decrease of the merit function.
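The merit function, its directional derivative and the modified Goldstein test translate almost directly into code. The following is a minimal sketch under the same conventions as the earlier blocks; the parameter value sigma = 0.1 is our own choice within the prescribed interval (0, 1/2).

```python
import numpy as np

def merit(x, f, g, c):
    """Merit function M(x) = f(x) + sum_j c_j * max(g_j(x), 0)."""
    return f(x) + c @ np.maximum(g(x), 0.0)

def merit_dir_deriv(x, d, grad_f, g, jac_g, c):
    """Directional derivative D_d M(x); the three cases follow the
    definition of D_d g_j(x)^+ in Section 3.2."""
    gv = g(x)
    Jd = jac_g(x) @ d
    contrib = np.where(gv > 0.0, Jd,
                       np.where(gv == 0.0, np.maximum(Jd, 0.0), 0.0))
    return grad_f(x) @ d + c @ contrib

def sufficient_decrease(x_k, d_k, alpha_k, x_next, f, g, grad_f, jac_g, c,
                        sigma=0.1):
    """Modified Goldstein rule: conditions (1) and (2) of Section 3.2."""
    m0 = merit(x_k, f, g, c)
    m_step = merit(x_k + alpha_k * d_k, f, g, c)
    dd = merit_dir_deriv(x_k, d_k, grad_f, g, jac_g, c)
    if dd >= 0.0:
        return False                     # d_k is not a descent direction
    ratio = (m_step - m0) / (alpha_k * dd)
    return merit(x_next, f, g, c) <= m_step and sigma <= ratio <= 1.0 - sigma
```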


3.3. Termination criteria

In order to test whether the current iterate is optimal, the algorithm checks the feasibility of the iterate, a complementary slackness condition, and that the gradient of the Lagrangian is zero. In other words, the iterate is considered optimal when all of the following criteria are met:

(1) g_j(x) ≤ 0, j = 1, . . ., m (feasibility).
(2) λ_j g_j(x) = 0, j = 1, . . ., m (complementary slackness).
(3) ∇f(x) + Σ_{j=1}^m λ_j ∇g_j(x) = 0 (gradient of the Lagrangian).

Note that these are the Karush–Kuhn–Tucker first-order optimality conditions for (NLP). The algorithm will therefore terminate at optimal solutions to the problem (NLP). An alternative requirement for optimality would be feasibility of the current NLP iterate together with an optimal value of zero for LP(1), i.e. ∇f(x)^T d(1) = 0 for the optimal solution d(1) to LP(1). We prove in Lemma 1 that these requirements are sufficient for x to be an optimal solution to (NLP).
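A sketch of this optimality test in code, with the tolerance chosen to match the 10^-3 used in the experiments of Section 5.2; the function name and interface are our own.

```python
import numpy as np

def kkt_satisfied(x, lam, g, grad_f, jac_g, tol=1e-3):
    """Optimality test of Section 3.3: feasibility, complementary
    slackness and stationarity of the Lagrangian."""
    gv = g(x)
    feasible = np.all(gv <= tol)                     # (1) feasibility
    comp_slack = np.all(np.abs(lam * gv) <= tol)     # (2) complementarity
    grad_lag = grad_f(x) + jac_g(x).T @ lam          # (3) stationarity
    return feasible and comp_slack and np.linalg.norm(grad_lag) <= tol
```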

3.4. SCP algorithm

We summarize the SCP algorithm in Algorithm 1.

3.5. Illustration of an NLP iteration

Next we illustrate an NLP iteration. The first LP subiteration is illustrated in Fig. 1. We assume a nonlinear optimization problem

$$
\min\; f(x) \quad \text{s.t.} \quad g_1(x) \le 0, \; x \in \mathbb{R}^2,
$$

where both f and g_1 are nonlinear functions on R^2. Assume we have already done k NLP iterations and the current iterate is x_k. We then start a new LP subiteration from the current NLP iterate x_k by initializing x(1) = x_k and H(1) = H_k. We then generate LP(1):

$$
\begin{aligned}
\min\;& \nabla f(x^{(1)})^T d, \\
\text{s.t.}\;& \nabla g_1(x^{(1)})^T d \le -g_1(x^{(1)}), \\
& d^L \le d \le d^U, \; d \in \mathbb{R}^2.
\end{aligned}
\tag{LP(1)}
$$

Algorithm 1 (Pseudo-code for the SCP algorithm).

1. Initialize: H_1 = I, k = 1, x_1 = initial starting point.
2. Do LP subiterations:
   2.1 Initialize: i = 1, x(1) = x_k, H(1) = H_k.
   2.2 Generate LP(i) and solve it to obtain the search direction d(i) and Lagrange multiplier estimates λ(i) (dual optimal solution to LP(i)).
   2.3 If LP(i) is infeasible: if it is the first LP problem solved (i = 1), increase the bounds on d (Section 3.6.2) and go to 2.2; otherwise exit the LP subiterations (go to 3).
   2.4 If it is the first LP problem (i = 1), check the termination criteria (Section 3.3) and exit the algorithm if an optimal solution is found.
   2.5 If ‖d(i)‖ < δ, exit the LP subiterations (go to 3).
   2.6 Perform a line search to minimize the augmented Lagrangian L̃(i)(x(i) + αd(i)). Let x(i+1) = x(i) + α(i)d(i).
   2.7 Update the Hessian of the Lagrangian using exact Hessian information (if available) or some approximation (Section 3.6.5). Call it H(i+1).
   2.8 Let i := i + 1. If i > n, exit the LP subiterations (go to 3); otherwise go to 2.2.
3. Store the current iterate and Hessian estimate from the LP subiterations, i.e. x_{k+1} = x(i) and H_{k+1} = H(i).
4. If the decrease in the merit function is not sufficient (Section 3.2), find a new iterate x_{k+1} that causes a sufficient decrease (Section 3.2.1).
5. If the maximum number of allowed NLP iterations is not exceeded, let k := k + 1 and go to 2. Otherwise stop with failure: the maximum number of NLP iterations was exceeded.

We solve this LP problem and get d(1) as the optimal solution. The corresponding optimal value of the dual variable of the linearization of g_1 in the LP problem is used as the Lagrange multiplier estimate λ(1). We minimize L̃(1) starting from x(1) in the direction d(1) and call the solution to the line search α(1). We then take the step x(2) = x(1) + α(1)d(1). Finally, we update the Hessian estimate using the BFGS update formula and obtain a new estimate H(2).

Fig. 1. Subiteration 1 of the NLP iteration.

In the next step (see Fig. 2) we form a similar LP problem as in step 1, with the exception that an equality constraint

$$
(d^{(1)})^T H^{(2)} d = 0
$$

is included in the new LP problem:

$$
\begin{aligned}
\min\;& \nabla f(x^{(2)})^T d, \\
\text{s.t.}\;& \nabla g_1(x^{(2)})^T d \le -g_1(x^{(2)}), \\
& (d^{(1)})^T H^{(2)} d = 0, \\
& d^L \le d \le d^U, \; d \in \mathbb{R}^2.
\end{aligned}
\tag{LP(2)}
$$

Here H(2) is the current estimate of the Hessian of the Lagrangian of (NLP).

Fig. 2. Subiteration 2 of the NLP iteration.


The equality constraint forces the solution towards the feasible region and towards the optimal solution f* of the original problem (NLP). The latter is due to the fact that a conjugate direction is chosen with respect to the estimate of the Hessian of the Lagrangian of (NLP). The solution to the problem is called d(2) and the solution to the line search α(2). After taking the step x(3) = x(2) + α(2)d(2), we have exceeded the number of allowed steps within the NLP iteration, and we stop at x(3). We then ensure that x(3) reduces the merit function M sufficiently and perform new NLP iterations as described above until the optimal solution is found.

3.6. Discussion

Below we discuss the various parts of the algorithm in more detail in order to show the role of each part in the algorithm.

3.6.1. LP problem

We start by looking at the components of the LP problem LP(i). The constraints in the LP problem are based on cutting planes formed using linear estimates of the nonlinear constraints of (NLP) at the current iterate x(i):

$$
\nabla g_j(x^{(i)})^T d \le -g_j(x^{(i)}), \quad j = 1, \ldots, m. \tag{3}
$$

The objective function is similarly a linearization of the objective function of (NLP) at the point x(i). To see that the constraints are indeed cutting planes, consider the cutting planes from Kelley's cutting plane algorithm at x(i),

$$
g_j(x^{(i)}) + \nabla g_j(x^{(i)})^T (x - x^{(i)}) \le 0, \quad j = 1, \ldots, m.
$$

By substituting d = x − x(i) and rearranging, we obtain (3). Thus, we are sequentially generating new cutting planes in each LP subiteration. The difference to Kelley's cutting plane algorithm is that the problem is formulated in terms of a search direction d from the current point x(i), rather than in terms of x directly.

Another difference to Kelley's cutting plane algorithm is that the old linearizations are not preserved from previous subiterations. Rather, for i > 1, the solution to the LP problem must be a conjugate search direction to the previously obtained solutions d(r), r = 1, . . ., i − 1, within the NLP iteration. Conjugacy is required with respect to the Hessian of the Lagrangian of (NLP). In other words,

$$
(d^{(r)})^T H^{(i)} d = 0, \quad r = 1, \ldots, i - 1, \tag{4}
$$

when i > 1. Here H(i) is an estimate of the Hessian of the Lagrangian. These constraints are easily incorporated into the LP subproblem as linear equality constraints.

Finally, the lower and upper bounds (d^L and d^U) on d, where d^L < 0 and d^U > 0, are used to avoid unbounded solutions of the LP problem. Note that the choice of d^L and d^U may affect the convergence speed of the algorithm. If they are chosen too small, the initial steps of the algorithm may be too short. On the other hand, if they are chosen too large, the convergence near the optimal solution may be affected. Current numerical experience shows that the convergence speed is not very sensitive to the choice of d^L and d^U as long as they do not restrict the initial steps of the algorithm. The tolerances used in the numerical experiments are listed in Section 5.2.

The conjugacy constraints (4) are very important for speeding up the convergence of the algorithm. Note that these constraints are not needed to prove convergence; rather, they are needed to improve the numerical properties of the algorithm. Consider again the example given in the previous section. Without the conjugacy constraint, the solutions to the LP problems would be at corner points of the feasible region. Thus, the search direction in step 2 of the numerical example would also be at a corner point, see Fig. 3. The conjugacy constraints help improve the search direction by utilizing the previously obtained search directions in the NLP iteration.

Fig. 3. Search direction in step 2 without conjugacy constraint.

3.6.2. Infeasible LP problems

If the first LP problem LP(1) solved in the NLP iteration is infeasible, the LP problem has to be modified to produce a feasible solution. By Assumption 3, problem (NLP) always has feasible solutions. Since the constraints in (NLP) are convex, linearizations of the constraints underestimate the original constraints. Therefore, for a given iterate x(1), we may ensure that we get a feasible solution to LP(1) by increasing the simple bounds on d sufficiently, for instance by iteratively doubling the bounds,

d^L := 2d^L and d^U := 2d^U,

and solving LP(1) again until we get a feasible solution. Boundedness of d^L and d^U is implied by the fact that the feasible region of (NLP) is bounded. In practice, by choosing large enough bounds in the beginning of the algorithm, we may avoid infeasible LP(1) problems entirely. Indeed, the situation never occurred for the test problems we solved with the algorithm. Note that when LP(i), i > 1, is infeasible we simply terminate the subiterations; in this case we do not need to update the bounds on d. In particular, the conjugacy constraints may be the reason that LP(i) is infeasible when i > 1.

3.6.3. Line search

In order to prove convergence of the algorithm to the optimal solution, the merit function M defined in Section 3.2 must be sufficiently reduced in each NLP iteration. The merit function could be used in the line searches of each LP subiteration as well, but in order to improve the numerical convergence properties of the algorithm, the function L̃ is used. The function L̃ is based on the Lagrangian and is augmented with a penalty term ρ Σ_{j=1}^m (g_j(x)^+)^2 to avoid getting too far from the feasible set in those cases where we have poor Lagrange multiplier estimates.
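The bound-expansion scheme of Section 3.6.2 and the augmented-Lagrangian line search of Sections 3.1.1 and 3.6.3 might look as follows in code. The callback interface and the iteration cap are our own assumptions; build_and_solve_lp stands for any routine that returns (d, lam) or (None, None) on infeasibility, such as the linprog sketch in Section 3.1.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def solve_lp1_with_expanding_bounds(build_and_solve_lp, d_lo, d_hi,
                                    max_doublings=20):
    """Section 3.6.2: if LP(1) is infeasible, double the simple bounds
    on d and retry until a feasible solution is obtained."""
    d_lo, d_hi = np.array(d_lo, float), np.array(d_hi, float)
    for _ in range(max_doublings):
        d, lam = build_and_solve_lp(d_lo, d_hi)
        if d is not None:
            return d, lam, (d_lo, d_hi)
        d_lo, d_hi = 2.0 * d_lo, 2.0 * d_hi
    raise RuntimeError("LP(1) still infeasible after expanding bounds")

def line_search_aug_lagrangian(x, d, f, g, lam, rho):
    """Sections 3.1.1/3.6.3: minimize the augmented Lagrangian along d
    with alpha restricted to [0, 1]."""
    def phi(alpha):
        gp = np.maximum(g(x + alpha * d), 0.0)
        return f(x + alpha * d) + lam @ gp + rho * gp @ gp
    return minimize_scalar(phi, bounds=(0.0, 1.0), method="bounded").x
```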

Fig. 4. Line searches based on M and L̃.

To understand why L̃ is used in the line searches rather than M, consider Fig. 4. When near the feasible set, using the merit function M in the line searches may cause the algorithm to take small steps, particularly when the parameter c is large. The function L̃ is based on an estimate of the Lagrangian made at the current iterate and will therefore allow the algorithm to take longer steps also when we are far from the optimal solution.

3.6.4. Lagrange multiplier estimates

Estimation of the Lagrange multipliers in the SCP algorithm is straightforward. From the solution to LP(i), the optimal values of the dual variables of the linearizations of g can be used as Lagrange multiplier estimates. The dual variables correspond to the Lagrange multipliers under the assumption that the constraints of (NLP) are linear or, equivalently, that the gradients of the constraints are fixed. Thus, the closer we are to the optimal solution, the better these estimates will be, as the changes in the gradients are small close to the linearization point.

3.6.5. Hessian estimates

If exact Hessian information for the constraints is not available, an estimate of the Hessian of the Lagrangian of (NLP) must be obtained. The Hessian is needed in order to calculate (d(r))^T H(i) for the conjugacy constraints in the LP subproblems. Note that the Hessians of the individual constraints need not be estimated; it is sufficient to estimate the Hessian of the Lagrangian of (NLP). In our implementation of the algorithm, we have used the BFGS update formula, defined by

$$
H^{(i+1)} = H^{(i)} - \frac{H^{(i)} s^{(i)} (s^{(i)})^T H^{(i)}}{(s^{(i)})^T H^{(i)} s^{(i)}} + \frac{y^{(i)} (y^{(i)})^T}{(y^{(i)})^T s^{(i)}}.
$$

See, for instance, [6] for more information. Here s(i) denotes the change in x,

$$
s^{(i)} = x^{(i+1)} - x^{(i)},
$$

and y(i) denotes the change in the gradient of the Lagrangian,

$$
y^{(i)} = \nabla L(x^{(i+1)}, \lambda^{(i)}) - \nabla L(x^{(i)}, \lambda^{(i)}), \quad \text{where} \quad L(x, \lambda) = f(x) + \sum_{j=1}^m \lambda_j g_j(x).
$$

Note that the Hessian of L(x, λ) instead of L̃(x, λ) is estimated, since L(x, λ) is continuously differentiable with respect to x.
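In code, the BFGS update of the Hessian estimate is a few lines. The skip rule for near-degenerate steps is our own safeguard, not discussed in the paper.

```python
import numpy as np

def bfgs_update(H, s, y):
    """BFGS update of the Hessian estimate (Section 3.6.5):
    H+ = H - (H s s^T H) / (s^T H s) + (y y^T) / (y^T s)."""
    Hs = H @ s
    sHs = s @ Hs
    ys = y @ s
    if abs(sHs) < 1e-12 or abs(ys) < 1e-12:
        return H                      # skip update on a degenerate step
    return H - np.outer(Hs, Hs) / sHs + np.outer(y, y) / ys

# Usage with the quantities defined above (lam held fixed at lam_i):
#   s = x_next - x_cur
#   y = (grad_f(x_next) + jac_g(x_next).T @ lam_i) \
#       - (grad_f(x_cur) + jac_g(x_cur).T @ lam_i)
#   H = bfgs_update(H, s, y)
```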


At the end of an NLP iteration, the current Hessian estimate is used to initialize the Hessian in the next NLP iteration. In other words, H(1) = H_k, where H_k is the Hessian estimate from the previous NLP iteration. At the beginning of the SCP algorithm we initialize H_1 = I. Approximation schemes other than the BFGS update formula could be used as well. Note that the algorithm does not require the approximation of the Hessian to be positive definite. Note also that we could calculate the Hessian of L(x, λ) directly if exact Hessian information is available for the objective and constraints.

3.6.6. Estimating the parameter c

Unfortunately, a value of c that is large enough is typically not known in advance. An implementation of the algorithm must therefore use some scheme for estimating c. In our implementation, the estimate of c is based on the Lagrange multiplier estimates obtained in the LP subiterations. We let c be the maximum of all previously obtained Lagrange multiplier estimates and update c whenever we get new Lagrange multiplier estimates that are larger than the current value of c. The initial estimate of c should be chosen small in order to guarantee good numerical performance of the algorithm. If c is chosen too large, the algorithm will not allow the iterate to move further away from the feasible region, although this can be beneficial in the initial stages of the optimization process, when the iterate is close to the feasible region but far away from the optimal solution. Consider again Fig. 4.
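Assembling the pieces of Sections 3.1–3.6, a compact, self-contained sketch of Algorithm 1 might look as follows. This is our own simplified rendering, not the authors' MATLAB implementation: the sufficient-decrease safeguard of Section 3.2, the bound expansion of Section 3.6.2 and the c-update of Section 3.6.6 are omitted for brevity, and fixed trust-region bounds of ±100 are assumed as in Section 5.2.

```python
import numpy as np
from scipy.optimize import linprog, minimize_scalar

def scp_solve(f, grad_f, g, jac_g, x0, max_nlp_iter=50, rho=0.1, tol=1e-6):
    """Simplified sketch of the SCP algorithm (Algorithm 1)."""
    n = x0.size
    x, H = x0.astype(float), np.eye(n)
    bounds = [(-100.0, 100.0)] * n            # trust region d^L, d^U
    for _ in range(max_nlp_iter):             # NLP iterations
        xi, Hi, prev = x.copy(), H.copy(), []
        for i in range(n):                    # at most n LP subiterations
            A_eq = np.array([dr @ Hi for dr in prev]) if prev else None
            b_eq = np.zeros(len(prev)) if prev else None
            res = linprog(grad_f(xi), A_ub=jac_g(xi), b_ub=-g(xi),
                          A_eq=A_eq, b_eq=b_eq, bounds=bounds,
                          method="highs")
            if not res.success:               # LP(i) infeasible: stop
                break
            d, lam = res.x, -res.ineqlin.marginals
            if i == 0 and np.all(g(xi) <= tol) and abs(grad_f(xi) @ d) < tol:
                return xi                     # optimal (Lemma 1)
            if np.linalg.norm(d) < 1e-8:
                break
            def phi(a):                       # augmented-Lagrangian search
                gp = np.maximum(g(xi + a * d), 0.0)
                return f(xi + a * d) + lam @ gp + rho * gp @ gp
            a = minimize_scalar(phi, bounds=(0.0, 1.0), method="bounded").x
            x_new = xi + a * d
            s = x_new - xi
            y = (grad_f(x_new) + jac_g(x_new).T @ lam
                 - grad_f(xi) - jac_g(xi).T @ lam)
            if abs(s @ Hi @ s) > 1e-12 and abs(y @ s) > 1e-12:   # BFGS
                Hi = (Hi - np.outer(Hi @ s, Hi @ s) / (s @ Hi @ s)
                      + np.outer(y, y) / (y @ s))
            prev.append(d)
            xi = x_new
        x, H = xi, Hi
    return x
```

With the sample problem defined after Section 2, a call such as scp_solve(f, grad_f, g, jac_g, np.zeros(2)) should drive the iterates towards the constrained minimizer on the boundary of the nonlinear constraint.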

4. Convergence proof

In this section we prove that the described algorithm converges to the optimal solution when applied to problems of the form (NLP) for which Assumptions 1–3 hold. To prove convergence, we show the following:

(1) In any NLP iteration, the direction d(1), obtained as a solution to LP(1), is either an optimal solution to (NLP) or a descent direction for the merit function M.
(2) Any limit point x̄ of the sequence {x_k} generated by the SCP algorithm is an optimal solution to (NLP).

The first item ensures that the algorithm will reduce the merit function in each iteration until an optimal solution is found. The second item shows that for cases where the algorithm does not terminate, a limit point will be an optimal solution to the optimization problem (NLP).

4.1. Descent direction

We begin by showing that the search direction we obtain as a solution to the LP problem LP(1) is a descent direction for the merit function M(x). We start with the following lemma.

Lemma 1. Assume that the current iterate is x(1) and that x(1) is feasible in (NLP). Assume further that LP(1) generated in the point x(1) has optimal value 0, i.e. that ∇f(x(1))^T d(1) = 0 for an optimal solution d(1) to LP(1). Then x(1) is an optimal solution to (NLP).

Proof. Assume there is an x̄ such that g_j(x̄) ≤ 0, j = 1, . . ., m, and f(x̄) < f(x(1)). Since f is convex, we have

$$
f(x^{(1)}) + \nabla f(x^{(1)})^T (\bar{x} - x^{(1)}) \le f(\bar{x}),
$$

and thus ∇f(x(1))^T (x̄ − x(1)) < 0 from our assumption that f(x̄) < f(x(1)). Similarly, since g_j is convex and g_j(x̄) ≤ 0 for all j = 1, . . ., m, we have

$$
g_j(x^{(1)}) + \nabla g_j(x^{(1)})^T (\bar{x} - x^{(1)}) \le g_j(\bar{x}) \le 0, \quad j = 1, \ldots, m,
$$

and consequently

$$
\nabla g_j(x^{(1)})^T (\bar{x} - x^{(1)}) \le -g_j(x^{(1)}), \quad j = 1, \ldots, m.
$$

Let d̄ = x̄ − x(1). Then d̄ is feasible in LP(1) and ∇f(x(1))^T d̄ < 0, contradicting the fact that d(1) was an optimal solution to LP(1) with ∇f(x(1))^T d(1) = 0. □

We are now ready to prove that the search directions are descent directions for the merit function M(x).

Theorem 1. Let x(1) be the current iterate and d(1) an optimal solution to LP(1), i.e. to the problem

$$
\begin{aligned}
\min\;& \nabla f(x^{(1)})^T d, \\
\text{s.t.}\;& \nabla g(x^{(1)})^T d \le -g(x^{(1)}), && (5a) \\
& d - d^U \le 0, && (5b) \\
& d^L - d \le 0. && (5c)
\end{aligned}
$$

Let further λ^g, λ^U and λ^L be the corresponding Lagrange multipliers of LP(1) for the constraints (5a), (5b) and (5c), respectively. If c_j > λ^g_j, j = 1, . . ., m, then either the vector d(1) is a descent direction for the merit function

$$
M(x) = f(x) + \sum_{j=1}^m c_j g_j(x)^+,
$$

or x(1) is an optimal solution to (NLP).

Proof. We consider three different cases:

(1) g_j(x(1)) > 0 for some j ∈ {1, . . ., m}.
(2) g_j(x(1)) ≤ 0, j = 1, . . ., m, and ∇f(x(1))^T d(1) < 0.
(3) g_j(x(1)) ≤ 0, j = 1, . . ., m, and ∇f(x(1))^T d(1) = 0.

First assume that g_j(x(1)) > 0 for some j ∈ {1, . . ., m}. For α → 0+ we have

$$
\begin{aligned}
M(x^{(1)} + \alpha d^{(1)}) &= f(x^{(1)} + \alpha d^{(1)}) + \sum_{j=1}^m c_j\, g_j(x^{(1)} + \alpha d^{(1)})^+ \\
&= f(x^{(1)}) + \alpha \nabla f(x^{(1)})^T d^{(1)} + \sum_{j=1}^m c_j \big[ g_j(x^{(1)}) + \alpha \nabla g_j(x^{(1)})^T d^{(1)} \big]^+ + o(\alpha) \\
&= f(x^{(1)}) + \alpha \nabla f(x^{(1)})^T d^{(1)} + \sum_{j=1}^m c_j\, g_j(x^{(1)})^+ + \alpha \sum_{j:\, g_j(x^{(1)}) > 0} c_j \nabla g_j(x^{(1)})^T d^{(1)} + o(\alpha) \\
&= M(x^{(1)}) + \alpha \nabla f(x^{(1)})^T d^{(1)} + \alpha \sum_{j:\, g_j(x^{(1)}) > 0} (c_j + \lambda^g_j - \lambda^g_j) \nabla g_j(x^{(1)})^T d^{(1)} + o(\alpha) \\
&= M(x^{(1)}) + \alpha \Big[ \nabla f(x^{(1)}) + \sum_{j:\, g_j(x^{(1)}) > 0} \lambda^g_j \nabla g_j(x^{(1)}) \Big]^T d^{(1)} + \alpha \sum_{j:\, g_j(x^{(1)}) > 0} (c_j - \lambda^g_j) \nabla g_j(x^{(1)})^T d^{(1)} + o(\alpha) \\
&= M(x^{(1)}) + \alpha \Big[ \nabla f(x^{(1)}) + \sum_{j=1}^m \lambda^g_j \nabla g_j(x^{(1)}) \Big]^T d^{(1)} + \alpha \sum_{j:\, g_j(x^{(1)}) > 0} (c_j - \lambda^g_j) \nabla g_j(x^{(1)})^T d^{(1)} \\
&\quad - \alpha \sum_{j:\, g_j(x^{(1)}) \le 0} \lambda^g_j \nabla g_j(x^{(1)})^T d^{(1)} + o(\alpha).
\end{aligned}
$$

For the Lagrange multipliers λ^g, λ^U and λ^L we have

$$
\nabla f(x^{(1)}) + \sum_{j=1}^m \lambda^g_j \nabla g_j(x^{(1)}) + \lambda^U - \lambda^L = 0,
$$

and thus

$$
M(x^{(1)} + \alpha d^{(1)}) = M(x^{(1)}) + \alpha (\lambda^L - \lambda^U)^T d^{(1)} + \alpha \sum_{j:\, g_j(x^{(1)}) > 0} (c_j - \lambda^g_j) \nabla g_j(x^{(1)})^T d^{(1)} - \alpha \sum_{j:\, g_j(x^{(1)}) \le 0} \lambda^g_j \nabla g_j(x^{(1)})^T d^{(1)} + o(\alpha). \tag{6}
$$

If some d_r(1), r ∈ {1, . . ., n}, is at its lower bound, then the Lagrange multiplier λ^U_r of the corresponding upper bound is zero. If, again, some d_r(1) is at its upper bound, then the Lagrange multiplier λ^L_r of the corresponding lower bound is zero. Note also that the lower bounds are less than zero and the upper bounds greater than zero. Thus, for r ∈ {1, . . ., n},

λ^L_r ≥ 0 and λ^U_r = 0 whenever d_r(1) = d^L_r (< 0), and
λ^U_r ≥ 0 and λ^L_r = 0 whenever d_r(1) = d^U_r (> 0).

Therefore, we have

$$
\sum_{r=1}^n (\lambda^L_r - \lambda^U_r)\, d^{(1)}_r \le 0. \tag{7}
$$

Also, we know that c_j − λ^g_j > 0, j = 1, . . ., m, by assumption, and that ∇g_j(x(1))^T d(1) < 0 for j : g_j(x(1)) > 0 (from (5a)), and so

$$
\sum_{j:\, g_j(x^{(1)}) > 0} (c_j - \lambda^g_j) \nabla g_j(x^{(1)})^T d^{(1)} < 0. \tag{8}
$$

Finally, for j : g_j(x(1)) ≤ 0 we know that ∇g_j(x(1))^T d(1) ≥ 0 whenever λ^g_j > 0, since ∇g_j(x(1))^T d(1) = −g_j(x(1)) when constraint j in (5a) is active and g_j(x(1)) ≤ 0. Thus,

$$
\sum_{j:\, g_j(x^{(1)}) \le 0} \lambda^g_j \nabla g_j(x^{(1)})^T d^{(1)} \ge 0. \tag{9}
$$

Using (7)–(9) in (6), we find that M(x(1) + αd(1)) < M(x(1)) for α sufficiently small, and thus d(1) is a descent direction.

Next we assume that g_j(x(1)) ≤ 0, j = 1, . . ., m, and ∇f(x(1))^T d(1) < 0. From the previous reasoning, we know that when α → 0+,

$$
M(x^{(1)} + \alpha d^{(1)}) = M(x^{(1)}) + \alpha \nabla f(x^{(1)})^T d^{(1)} + \alpha \sum_{j:\, g_j(x^{(1)}) > 0} c_j \nabla g_j(x^{(1)})^T d^{(1)} + o(\alpha) = M(x^{(1)}) + \alpha \nabla f(x^{(1)})^T d^{(1)} + o(\alpha),
$$

since g_j(x(1)) ≤ 0, j = 1, . . ., m. As ∇f(x(1))^T d(1) < 0, we find that M(x(1) + αd(1)) < M(x(1)) for α sufficiently small, and thus d(1) is a descent direction in this case as well.

Finally, we assume that g_j(x(1)) ≤ 0, j = 1, . . ., m, and ∇f(x(1))^T d(1) = 0. We may then use Lemma 1 to show that x(1) is an optimal solution to (NLP). □

4.2. Limit points

Next we look at the limit points x̄ of any infinite sequence {x_k} generated by the SCP algorithm. The algorithm will only terminate at optimal solutions satisfying the Karush–Kuhn–Tucker conditions. It now remains to look at situations where the algorithm does not terminate, i.e. to show that a limit point is an optimal solution to (NLP). To begin with, we need the following lemma.

Lemma 2. Assume the SCP algorithm generates an infinite sequence of points {x_k} and c_j > λ^g_j, j = 1, . . ., m, where λ^g is any estimate of the Lagrange multipliers for the constraints of (NLP) obtained during the optimization process. Then, for a limit point x̄, we have D_d̄ M(x̄) = 0. Here d̄ is the search direction for the first LP subiteration in an NLP iteration starting from x̄.

Proof. Let d_k be the search direction for the first LP subiteration in an NLP iteration starting from x_k, i.e. d_k = d(1). From Theorem 1 we know that all points x_k in the sequence {x_k} satisfy D_{d_k} M(x_k) ≤ 0. Assume there exists a limit point x̄ such that D_d̄ M(x̄) < 0. Since {M(x_k)} is monotonically decreasing and M is continuous, {M(x_k)} converges to M(x̄). Thus, M(x_k) − M(x_{k+1}) → 0 as k → ∞. From (1) and (2), the modified Goldstein rule, we have M(x_{k+1}) ≤ M(x_k + α_k d_k) < M(x_k), and thus M(x_k) − M(x_k + α_k d_k) → 0 as k → ∞. From (2) we have

$$
M(x_k) - M(x_k + \alpha_k d_k) \ge -\sigma \alpha_k D_{d_k} M(x_k)
$$

and thus α_k D_{d_k} M(x_k) → 0. Since we assumed D_d̄ M(x̄) < 0, we further have α_k → 0. We have

$$
M(x_k) - M(x_k + \alpha_k d_k) = -\alpha_k D_{d_k} M(x_k) - o(\alpha_k). \tag{10}
$$


Using (10) and the other inequality from the modified Goldstein rule we get

$$
-\alpha_k D_{d_k} M(x_k) - o(\alpha_k) \le -(1 - \sigma) \alpha_k D_{d_k} M(x_k).
$$

Dividing by −α_k D_{d_k} M(x_k) (> 0), we get

$$
1 + \frac{o(\alpha_k)}{\alpha_k D_{d_k} M(x_k)} \le 1 - \sigma,
$$

which is a contradiction since lim_{α_k → 0+} o(α_k)/α_k = 0. □

We are now ready to show that a limit point is an optimal solution to (NLP).

Theorem 2. Assume the SCP algorithm generates an infinite sequence of points {x_k}. Assume further that c_j > λ^g_j, j = 1, . . ., m, where λ^g is any estimate of the Lagrange multipliers for the constraints of (NLP) obtained during the optimization process. Then any limit point x̄ is an optimal solution to (NLP).

Proof. From Lemma 2 we know that for a limit point x̄ we have D_d̄ M(x̄) = 0, where d̄ is the solution to the first LP problem, LP(1), in an NLP iteration starting from x̄. The limit point x̄ must be feasible, since the algorithm would otherwise generate a descent direction (Theorem 1). Then, from Lemma 1, we know that the limit point x̄ must be an optimal solution to (NLP). □

Also note that Assumptions 1–3 guarantee that (NLP) has a finite optimal solution. From the theory of exact penalty functions we know that the optimal solutions of the unconstrained problem

$$
\min\; M(x), \quad x \in \mathbb{R}^n \tag{PEN}
$$

are the same as those of (NLP) if c is chosen large enough. The merit function is thus bounded from below. See, for instance, [1]. As the sequence {M(x_k)} is monotonically decreasing it has a limit and, therefore, any infinite sequence generated by the algorithm has a limit point.

5. Numerical experience

In order to gain numerical experience with the algorithm, it was implemented in MATLAB. The CPLEX solver from ILOG was used for solving the LP problems, and the MATLAB line search routine (fminbnd) was used for the line searches.

The test problems used for the numerical experiments were taken from the Schittkowski test collection [13] and from Robert Vanderbei's home page on the web [16]. Ten problems were chosen from the Schittkowski collection, with sizes ranging from two variables up to 15, and 10 larger problems were taken from [16]. Problem characteristics are listed in Table 1, where n is the number of variables, m the total number of constraints and mn the number of nonlinear constraints.

The performance of the SCP algorithm was compared to Kelley's cutting plane algorithm for test problems 1–10 and to some commercially available NLP solvers for all test problems 1–20. The α-ECP package [19] was used as the implementation of Kelley's cutting plane algorithm. Note that the α-ECP package is targeted at MINLP problems. However, when there are no integer variables in the optimization problem and all nonlinear functions are convex, the algorithm reduces to the classical Kelley's cutting plane algorithm. Therefore, the package can be used in a comparison with the SCP algorithm. Results from the comparison to Kelley's cutting plane algorithm are given in Table 3. Descriptions of the headers in the tables are given in Table 2.


Table 1
Problem characteristics for test problems 1–20

#    Problem        n     m    mn   Objective
1    S218           2     1     1   Linear
2    S227           2     2     2   Nonlinear
3    S264           4     3     3   Nonlinear
4    S284          15    10    10   Linear
5    S285          15    10    10   Linear
6    S323           2     2     1   Nonlinear
7    S354           4     1     0   Nonlinear
8    S383          14     1     0   Nonlinear
9    S384          15    10    10   Linear
10   S385          15    10    10   Linear
11   Antenna       49   166   156   Linear
12   Antenna_socp  49   166   156   Linear
13   Fir_convex    11   243    91   Linear
14   Fir_exp       12   244     1   Linear
15   Grasp         16    16     5   Linear
16   Grasp_vareps  17    16     5   Linear
17   Nb_L1_eps    122   795   793   Nonlinear
18   Nb_L2        122   840   838   Nonlinear
19   Trafequil    722   361     0   Nonlinear
20   Trafequilsf  856   309     0   Nonlinear

Table 2
Description of the headers used in the tables

#              — Order number of the problem
#Iter          — Number of iterations required to solve the problem (NLP iterations for the SCP algorithm)
#J             — Number of times the Jacobian was evaluated
#LP            — Number of LP problems solved
#QP            — Number of QP problems solved
#Rows          — Number of constraints in the largest LP problem solved
#Miniter       — Number of minor iterations for MINOS
filterSQP      — Results for the filterSQP algorithm
Func           — Number of function evaluations
Grad           — Number of gradient evaluations
Kelley(g > ε)  — Results for Kelley's cutting plane algorithm; linearizations added for all violated constraints
Kelley(max(g)) — Results for Kelley's cutting plane algorithm; linearization added for the most violated constraint
LANCELOT       — Results for the LANCELOT algorithm
m              — Number of constraints of the problem
mn             — Number of nonlinear constraints of the problem
MINOS          — Results for the MINOS algorithm
n              — Number of variables of the problem
Objective      — Type of the objective function (linear/nonlinear)
Problem        — Name of the optimization problem
SCP            — Results for the Sequential Cutting Plane algorithm
Sum            — Total number of function evaluations under the assumption that the gradients are evaluated numerically, for instance using forward differences (Sum = Func + Grad · n)


We also compared the SCP algorithm to other NLP solvers, namely filterSQP [5], LANCELOT [2] and MINOS [11]. The NEOS server [3,4] was used to solve the test problems with these solvers. Results are given in Table 5.

5.1. Considerations for the α-ECP package

In order to use the α-ECP package, some modifications had to be made to the optimization problems.

(1) The α-ECP package requires the problems to be constrained with simple bounds on all variables. Lower and upper bounds were introduced in the problems when they were missing and set at −10 and 10. Consequently, the starting point for problem S218 was changed from x = (9, 100) to x = (9, 10) to fit within the bounds.

(2) If nonlinearities were present in the objective, the problem was rewritten such that the objective was moved to the constraints. In other words, the problem

$$
\min\; f(x) \quad \text{s.t.} \quad g(x) \le 0, \; x \in \mathbb{R}^n,
$$

was rewritten as

$$
\min\; z \quad \text{s.t.} \quad f(x) - z \le 0, \; g(x) \le 0, \; x \in \mathbb{R}^n.
$$

(3) The α-ECP package could not solve problem S383 in its original form due to numerical difficulties. In order to solve the problem, the objective had to be divided by 10,000 and a lower bound of x^L = (0.001, . . ., 0.001) was used instead of x^L = (0, . . ., 0).

The SCP algorithm and the other NLP solvers we tested used the original problem formulations, except for the starting point for S218 and the objective scaling for S383. These were changed to be the same as for α-ECP.

5.2. Algorithm tolerances

A tolerance ε_g = 10^-3 was used in the SCP algorithm to determine whether the current iterate was feasible, i.e. the iterate was considered feasible if g_j(x) ≤ ε_g, j = 1, . . ., m. In addition, a tolerance ε_L = 10^-3 was used to determine whether the gradient of the Lagrangian was close enough to zero, i.e. to test ‖∇L(x, λ)‖ ≤ ε_L. Furthermore, we used ρ = 10^-1 as the penalty parameter and d^L = (−100, . . ., −100) and d^U = (100, . . ., 100) as the trust region bounds.

We used the same tolerance 10^-3 for determining feasibility and optimality in the other algorithms. Thus, a tolerance ε = 10^-3 was used in the α-ECP algorithm. For the algorithms on the NEOS server we used the option "eps = 1E-3" for filterSQP, the options "ctol = 1E-3, gtol = 1E-3" for LANCELOT and the options "feasibility_tolerance = 1E-3, optimality_tolerance = 1E-3" for MINOS.

The linearization strategy used in the α-ECP algorithm was either to add linearizations in each iteration for the most violated constraint or for all violated constraints g_j(x) > ε, j = 1, . . ., m. These strategies are denoted Kelley(max(g)) and Kelley(g > ε) in Table 3, respectively.


Table 3
Test results for test problems 1–10

           Kelley(max(g))        Kelley(g > ε)         SCP
Problem   #Iter   #LP  #Rows   #Iter   #LP  #Rows   #Iter  #LP  #Rows
S218         10    11     10      10    11     10       9   17      2
S227         21    22     21       9    10     27       2    2      2
S264         65    66     65      34    35    120       4   12      6
S284       1090  1091   1090     317   318   1990       2    2     10
S285        297   298    297      59    60    590       2    2     10
S323         12    13     12       8     9     16       3    6      3
S354        109   110    110     109   110    110       5   18      4
S383        217   218    218     217   218    218       3   27     13
S384        322   323    322      63    64    593       4   23     17
S385        337   338    337      69    70    614       4   27     18

5.3. Numerical results

This section contains the numerical results from solving the chosen optimization problems 1–10 with the SCP algorithm and Kelley's cutting plane algorithm. The results are listed in Table 3; a description of the headers in the table is given in Table 2.

As can be seen from Table 3, the SCP algorithm outperformed Kelley's cutting plane algorithm on all problems except S218 with regard to the number of LP problems solved (#LP). In problem S218, both versions of Kelley's cutting plane algorithm solved only 11 LP problems, compared to 17 for the SCP algorithm. It is worth noting that for these test problems the SCP algorithm generally required fewer solved LP problems and that these LP problems were considerably smaller than for Kelley's cutting plane algorithm.

We have assumed that the Sequential Cutting Plane algorithm described in this paper is used for solving optimization problems where evaluation of the objective and constraints is fast. That is indeed the case for all test problems listed here, as both the objective and the constraints have explicit algebraic expressions. If evaluation of the constraints and objective takes more time, then the number of LP problems solved is not the correct measure of the performance of the algorithm. Instead, the number of function and gradient evaluations plays a more significant role.

5.3.1. Function and gradient evaluations

We also compared the performance of Kelley's cutting plane algorithm and the Sequential Cutting Plane algorithm with respect to function and gradient evaluations. For Kelley's cutting plane algorithm we calculated the number of gradient evaluations directly from the size of the largest LP problem, as each row in the LP problem corresponds to one gradient evaluation. In each iteration, each constraint must be evaluated in order to determine whether it is violated; the objective is also evaluated in each iteration. Thus the number of function evaluations can be calculated as (number of nonlinear constraints + nonlinear objective) · (number of iterations). For the SCP algorithm, we added built-in counters for the number of times the constraints and objective functions and their gradients were evaluated. It must be noted that the algorithm has not been optimized to minimize the number of function evaluations; the number could therefore possibly be decreased slightly by modifying the algorithm.

The results are listed in Table 4. The total number of function evaluations was used when comparing the algorithms. It is calculated under the assumption that the gradients are evaluated numerically using, for instance, forward or backward differences. Thus, the total number of function evaluations (Sum) is Sum = Func + Grad · n, where Func is the number of function evaluations and Grad is the number of gradient evaluations.


Table 4
Function and gradient evaluations for test problems 1–10

           Kelley(max(g))           Kelley(g > ε)             SCP
Problem    Func   Grad     Sum     Func   Grad     Sum      Func  Grad     Sum
S218         10     10      30       10     10      30       572    17     606
S227         63     21     105       27     27      81       153     6     165
S264        260     65     520      136    120     616      1404    48    1596
S284     10,900   1090  27,250     3170   1990  33,020       510    20     810
S285       2970    297    7425      590    590    9440       510    20     810
S323         24     12      48       16     16      48       322    12     346
S354        109    110     549      109    110     549       435    19     511
S383        217    218    3269      217    218    3269       513    28     905
S384       3220    322    8050      630    593    9525      8060   240  11,660
S385       3370    337    8425      690    614    9900    10,100   280  14,300

The column Sum contains the total number of function evaluations under the assumption that the gradients are evaluated numerically by, for instance, forward differences.

As expected, the total number of function evaluations was smaller for Kelley's cutting plane algorithm in a majority of the cases. It was interesting to see, though, that the SCP algorithm did not require considerably more function evaluations, due to its rapid convergence on the test problems. If the line search had been optimized to take advantage of already calculated function values and gradients, the difference would have been even smaller. In fact, in several problems the SCP algorithm required significantly fewer function evaluations.

It can be expected that the SCP algorithm would perform better, with respect to function evaluations, for problems with a larger number of variables and constraints, since evaluating the constraint gradients is then more expensive than performing the line search.

Table 5
Comparison of the SCP algorithm with other NLP solvers

                      SCP              filterSQP          LANCELOT       MINOS
Problem        #Iter  #LP   #J    #Iter  #QP   #J      #Iter   #J    #Iter  #Miniter
S218               9   17   17        9    9   10         26   23       20        16
S227               2    2    2        5    5    6         10   11        6         7
S264               4   12   12        9   11    8         19   19       12        37
S284               2    2    2       11   19    7        630  544       30       188
S285               2    2    2        c    c    c        135  117       25       171
S323               3    6    6        5    5    6          8    9        7         8
S354               5   18   19        8    8    9          9   10        1        13
S383               3   27   28        5    6    7          9   12        1        45
S384               4   23   24       11   17    7        160  140       30       192
S385               4   27   28       11   19    7        133  116       25       165
Antenna            2   43   44        c    c    c          i    i        i         i
Antenna_socp       3   56   56      298  621  149          i    i        i         i
Fir_convex         1    1    2        2    2    3         40   38        8       102
Fir_exp            3    6    6        4    4    5         23   24        7        78
Grasp              7   45   42        c    c    c          i    i        i         i
Grasp_vareps      12   79   77        i    i    i          i    i       10       226
Nb_L1_eps          7  585  585        e    e    e          i    i        i         i
Nb_L2              4  420  420        4    5    6         37   38        i         i
Trafequil          5   41   41        5    6    7         11   13        1       352
Trafequilsf        4  112  112        4    5    6          8    9        1       347

Error codes — i: too many iterations; c: constraint feasibility not achieved; e: optimality not achieved.


5.4. Comparison with other solvers

We solved all test problems 1–20 with the solvers filterSQP, LANCELOT and MINOS. Since these solvers do not solve LP problems, we compared the number of times the different algorithms evaluated the Jacobian. For filterSQP we also included the number of QP problems solved, and for SCP the number of LP problems solved. The numbers of iterations performed are also included for informative purposes. For MINOS, we used the number of minor iterations instead of the number of Jacobian evaluations in the table: MINOS evaluates the Jacobian also in the line searches, and thus the actual number of Jacobian evaluations can be quite large. We felt the number of minor iterations was a more accurate measure in this case.

As can be seen from Table 5, the SCP algorithm performed very well in comparison to the other NLP solvers. The robustness on problems 11–20, especially, was exceptional compared to the other algorithms.

6. Conclusions

In this paper we have presented a novel algorithm for solving NLP problems based on cutting plane techniques. Initial numerical experience indicates that the algorithm needs to solve considerably fewer LP problems than Kelley's cutting plane algorithm. The algorithm also performed well in comparison with existing NLP solvers on the selected test problems.

Future work on the algorithm will include extending it to MINLP problems and more extensive testing of the NLP version on problems from the Schittkowski [13] or Hock and Schittkowski [7] test sets. In order to do so, the algorithm has to be extended to handle non-convex, equality-constrained problems as well. It would also be interesting to look more closely at the performance of the algorithm on large NLP problems.

Acknowledgments

Financial support from the Research Institute of the Foundation of Åbo Akademi is gratefully acknowledged. We would also like to thank the reviewers for their insightful comments that helped us improve the paper.

References

[1] D.P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods, Academic Press, New York, 1982.
[2] A.R. Conn, N.I.M. Gould, P.L. Toint, A globally convergent augmented Lagrangian algorithm for optimization with general constraints and simple bounds, SIAM Journal on Numerical Analysis 28 (1991) 545–572.
[3] J. Czyzyk, M.P. Mesnier, J.J. Moré, The NEOS server, IEEE Journal on Computational Science and Engineering 5 (1998) 68–75.
[4] E.D. Dolan, The NEOS Server 4.0 Administrative Guide, Technical Memorandum ANL/MCS-TM-250, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, 2001.
[5] R. Fletcher, S. Leyffer, User Manual for filterSQP, Numerical Analysis Report NA/181, Department of Mathematics, University of Dundee, Dundee, 1998.
[6] P.E. Gill, W. Murray, M.H. Wright, Practical Optimization, Academic Press, London, 1981.
[7] W. Hock, K. Schittkowski, Test Examples for Nonlinear Programming Codes, Lecture Notes in Economics and Mathematical Systems, vol. 187, Springer-Verlag, Berlin, 1981.
[8] S. Karlsson, Optimization of a Sequential-Simulated Moving-Bed Separation Process with Mathematical Programming Methods, Ph.D. Thesis, Process Design Laboratory, Åbo Akademi University, Åbo, 2001.
[9] J.E. Kelley, The cutting-plane method for solving convex programs, Journal of the SIAM 8 (4) (1960) 703–712.
[10] D.G. Luenberger, Linear and Nonlinear Programming, second ed., Addison-Wesley, Reading, MA, 1984.
[11] B.A. Murtagh, M.A. Saunders, MINOS 5.5 User's Guide, Technical Report SOL 83-20R, Stanford University, Stanford, CA, 1998.
[12] R. Pörn, T. Westerlund, A cutting plane method for minimizing pseudo-convex functions in the mixed integer case, Computers and Chemical Engineering 24 (2000) 2655–2665.
[13] K. Schittkowski, More Test Examples for Nonlinear Programming Codes, Lecture Notes in Economics and Mathematical Systems, vol. 282, Springer-Verlag, Berlin, 1987.
[14] C. Still, T. Westerlund, Extended cutting plane algorithm, in: C.A. Floudas, P.M. Pardalos (Eds.), Encyclopedia of Optimization, vol. 2, Kluwer Academic Publishers, Dordrecht, 2001, pp. 53–61.
[15] C. Still, T. Westerlund, Solving convex MINLP optimization problems using a sequential cutting plane algorithm, Computational Optimization and Applications, submitted for publication.
[16] R. Vanderbei, Nonlinear optimization models, web home page.
[17] T. Westerlund, F. Pettersson, An extended cutting plane method for solving convex MINLP problems, Computers and Chemical Engineering 19 (Suppl.) (1995) S131–S136.
[18] T. Westerlund, H. Skrifvars, I. Harjunkoski, R. Pörn, An extended cutting plane method for a class of non-convex MINLP problems, Computers and Chemical Engineering 22 (1998) 357–365.
[19] T. Westerlund, K. Lundqvist, Alpha-ECP Version 5.01: An Interactive MINLP-Solver Based on the Extended Cutting Plane Method, Report 01-178-A, Process Design Laboratory, Åbo Akademi University, Åbo, 2001.
