Mathematical Programming 14 (1978) 208-223. North-Holland Publishing Company
LINEARLY CONSTRAINED MINIMAX OPTIMIZATION*

Kaj MADSEN and Hans SCHJAER-JACOBSEN
Technical University of Denmark, Lyngby, Denmark

Received 3 March 1977
Revised manuscript received 17 August 1977
We present an algorithm for nonlinear minimax optimization subject to linear equality and inequality constraints which requires first order partial derivatives. The algorithm is based on successive linear approximations to the functions defining the problem. The resulting linear subproblems are solved in the minimax sense subject to the linear constraints. This ensures a feasible-point algorithm. Further, we introduce local bounds on the solutions of the linear subproblems, the bounds being adjusted automatically, depending on the quality of the linear approximations. It is proved that the algorithm will always converge to the set of stationary points of the problem, a stationary point being defined in terms of the generalized gradients of the minimax objective function. It is further proved that, under mild regularity conditions, the algorithm is identical to a quadratically convergent Newton iteration in its final stages. We demonstrate the performance of the algorithm by solving a number of numerical examples with up to 50 variables, 163 functions, and 25 constraints. We have also implemented a version of the algorithm which is particularly suited for the solution of restricted approximation problems.

Key words: Optimization, Linear Constraints, Minimax, Quadratic Convergence.

1. Introduction
Recent years have shown an increasing interest in the theoretical and practical aspects of nonlinear minimax optimization. The most comprehensive treatment of both theoretical and algorithmic aspects of constrained (and unconstrained) minimax optimization today is probably the book of Dem'yanov and Malozemov [3]. They suggest different approaches to the solution of the generalized nonlinear programming problem, including methods of successive approximations (with line searches), penalty function approaches, and transformation to nonlinear programming problems. Similar suggestions have more recently been made by other authors [1, 11]. In this paper we shall consider the generalized nonlinear programming problem of minimizing the objective function
F(x) = max_{1≤i≤m} f_i(x)   (1)
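As a concrete illustration (ours, not the paper's), the minimax objective (1) takes only a few lines of Python to evaluate; the example functions and test point below are arbitrary:

    import numpy as np

    def minimax_objective(funcs, x):
        # F(x) = max_i f_i(x) for a list of callables f_i
        return max(fi(x) for fi in funcs)

    # Illustrative use with two functions of one variable:
    F_val = minimax_objective([lambda x: x[0]**2 - 1.0,
                               lambda x: np.sin(x[0])],
                              np.array([0.5]))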
F(x_k) − F̄(x_k, h_k^1) ≥ ε,   (23)

say. Suppose t_k = ‖h_k‖. If t_k ≤ 1 then, since F̄(x_k, ·) is convex, F(x_k) = F̄(x_k, 0), and ‖h_k^1‖ ≤ 1,

F(x_k) − F̄(x_k, h_k) ≥ F(x_k) − F̄(x_k, t_k h_k^1) ≥ t_k{F(x_k) − F̄(x_k, h_k^1)} ≥ ε‖h_k‖,   (24)

and if t_k > 1, then (if δ_1 ≤ 1)

F(x_k) − F̄(x_k, h_k) ≥ ε ≥ εδ_1.   (25)
Thus (20) is proved. If ‖x_k − x‖ ≤ δ_1 and ‖h_k‖ ≥ δ_2, then F̄(x_k, h_k) > F̄(x_k, h_k^1). Since F̄(x_k, ·) is a convex function, the inequality ‖h_k‖ < Λ_k would imply that h_k was not the solution of (7). Thus ‖h_k‖ = Λ_k, and the lemma is proved with δ = min{δ_1, δ_2}.

If the sequence of points generated by our algorithm is convergent, then the limit point is a stationary point. This is a consequence of the following theorem, the proof of which is similar to one given in [4]; it can be found in [5].
Theorem 1. Let x_k, k = 1, 2, …, be the sequence generated by the algorithm. Let L be the set of stationary points of F and let d(x) = inf_{ξ∈L} ‖ξ − x‖. If the sequence {x_k} stays in a finite region, then

d(x_k) → 0   for k → ∞.   (26)
Finally we prove a theorem concerning the final rate of convergence of the algorithm.
Theorem 2. If the sequence of vectors {x_k} generated by the algorithm is convergent, then the final rate of convergence is quadratic, provided that (1) the second derivatives of the functions f_i exist and are bounded near the limit point, and (2) the Haar condition is satisfied at the limit point.

Proof. We first show that if we do not use the local bounds ‖h_k‖ ≤ Λ_k in the solution of the linear problem, then the iteration is identical to a quadratically convergent Newton iteration if we start close enough to the solution. Next we prove that the local bounds have no influence in the final stages of the iteration. We use the notation h_x for the solution at x of the linear problem with the local bounds left out, i.e.

F̄(x, h_x) = min_{h∈D} F̄(x, h).   (27)
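Both this subproblem and the bounded version (7) used by the algorithm are small linear programs once an auxiliary variable z is introduced for the linearized maximum. The following Python sketch is ours, not the paper's implementation: it assumes inequality constraints of the form Ax + b ≥ 0 and the max-norm for the local bound (take Lam very large to recover the unbounded problem (27)):

    import numpy as np
    from scipy.optimize import linprog

    def linearized_subproblem(f, g, A, b, x, Lam):
        # Solve   min_{h, z} z
        #         s.t. f_i(x) + g_i(x)^T h <= z    (linearized functions)
        #              A (x + h) + b >= 0          (linear constraints)
        #              |h_j| <= Lam                (local bound)
        # f: values f_i(x), shape (m,); g: gradients f_i'(x), shape (m, n).
        m, n = g.shape
        c = np.zeros(n + 1)
        c[-1] = 1.0                                  # objective: minimize z
        A_ub = np.hstack([g, -np.ones((m, 1))])      # g_i^T h - z <= -f_i(x)
        b_ub = -f
        if A is not None:                            # -A h <= A x + b
            A_ub = np.vstack([A_ub, np.hstack([-A, np.zeros((len(A), 1))])])
            b_ub = np.concatenate([b_ub, A @ x + b])
        bounds = [(-Lam, Lam)] * n + [(None, None)]  # z is free
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        return res.x[:n], res.x[-1]                  # step h and F_bar(x, h)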
It follows from Theorem 1 and Proposition 2 that the limit point, x* say, is a strict local minimum. Therefore the smoothness condition guarantees that the solution of the linear problem (27), (x + h_x), is close to x* when x is close to x*, say x + h_x ∈ V_1 whenever x ∈ V_1. Now let x ∈ V_1 and suppose that the linear functions and constraints which determine h_x are those with indices i ∈ I for the functions and j ∈ J for the constraints. Since these linear functions are equal at the minimax solution h_x, h = h_x and h_{n+1} = −F̄(x, h_x) is a solution of the equations

f_i(x) + f_i′(x)^T h + h_{n+1} = 0,   i ∈ I,
a_j^T (x + h) + b_j = 0,   j ∈ J,   (28)
and since any minimax solution is a stationary point, we obtain that there exist δ_i ≥ 0 and λ_j with Σ_{i∈I} δ_i = 1 and λ_j ≥ 0 for j > t such that

Σ_{i∈I} δ_i f_i′(x) = Σ_{j∈J} λ_j a_j.   (29)
Because of the continuity of f_i′ and the Haar condition at x* we can find a neighbourhood V_2 ⊆ V_1 of x* such that any set of n vectors from (29) is linearly independent when x ∈ V_2. Therefore the number of non-zero elements in (29) is at least (n + 1). Since the vectors are in R^n, we can reduce the number of non-zero elements to exactly (n + 1), say i ∈ I_1, j ∈ J_1, without violating the sign restrictions on λ_j and δ_i. Therefore (28) with I_1 and J_1 instead of I and J may be regarded as the linear system of a Newton iteration of the non-linear system (with the variables (h, h_{n+1}))

f_i(x + h) + h_{n+1} = 0,   i ∈ I_1,
a_j^T (x + h) + b_j = 0,   j ∈ J_1.   (30)
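A small numpy sketch (ours; the array names and shapes are assumptions) of one Newton step for (30), linearized at h = 0, which is exactly the linear system (28) restricted to I_1 and J_1:

    import numpy as np

    def newton_step(fI, gI, AJ, bJ, x):
        # Unknowns: (h, h_{n+1}), a vector of length n + 1.
        # fI, gI: values and gradients f_i'(x) of the functions with i in I_1;
        # AJ, bJ: rows a_j^T and offsets b_j of the constraints with j in J_1.
        n = x.size
        J = np.vstack([np.hstack([gI, np.ones((len(fI), 1))]),    # [f_i'^T, 1]
                       np.hstack([AJ, np.zeros((len(bJ), 1))])])  # [a_j^T, 0]
        rhs = -np.concatenate([fI, AJ @ x + bJ])
        step = np.linalg.solve(J, rhs)       # requires |I_1| + |J_1| = n + 1
        return step[:n], step[n]             # h and h_{n+1} = -F_bar(x, h)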
The Jacobian of this system is regular because of the linear independence in (29) and because (29) holds with I_1 and J_1 inserted. Furthermore, a continuity argument shows that if the relation

Σ_{i∈I_1} δ_i f_i′(y) = Σ_{j∈J_1} λ_j a_j,   δ_i > 0,   λ_j ≠ 0,   Σ δ_i = 1,   (31)

does not hold for y = x*, then it holds for no y close to x*. Therefore there exists a neighbourhood V_3 ⊆ V_2 of x* such that if I_1 and J_1 correspond to an x ∈ V_3, then (31) holds with y = x*. Thus the Jacobian of (30) is regular at x* when x ∈ V_3, and since the second derivatives are bounded near x*, it follows, [9, pp. 184–185], that

‖x + h_x − x*‖ ≤ K‖x − x*‖²,   K > 0,   (32)
for x ∈ V_4 ⊆ V_3, so we have proved that only the local bounds Λ_k can prevent the quadratic convergence of our algorithm.

In order to prove that the local bounds have no influence in the final part of the iteration, we notice that since x* is a strict local minimum of F, there exist positive numbers c and d such that

c‖x − x*‖ ≤ F(x) − F(x*) ≤ d‖x − x*‖
F(x_k) − F̄(x_k, t_k h_k^1) = F̄(x_k, 0) − F̄(x_k, t_k h_k^1) ≥ t_k{F̄(x_k, 0) − F̄(x_k, h_k^1)} ≥ t_k c_0‖h_k^1‖ = c_0‖h_k‖.   (36)
Therefore

(F(x_k) − F(x_k + h_k)) / (F(x_k) − F̄(x_k, h_k)) = (F(x_k) − F̄(x_k, h_k) + o(‖h_k‖)) / (F(x_k) − F̄(x_k, h_k)) ≥ 1 + o(1),   (37)
and hence, for k ≥ k_2 ≥ k_1, we have

x_{k+1} = x_k + h_k   and   Λ_{k+1} ≥ ‖h_k‖.   (38)
If ‖h_k‖ = Λ_k for all large values of k, we would have ‖h_{k+1}‖ ≥ ‖h_k‖ for all large k, contradicting the fact that h_k → 0. Therefore ‖h_k‖ < Λ_k, and hence h_k = h_{x_k}, for infinitely many values of k. Let k_3 ≥ k_2 and K_1 ≥ K have the property that K_1‖x_k − x*‖ ≤ 0.5, ‖x_k + h_{x_k} − x*‖ ≤ K_1‖h_{x_k}‖², and K_1‖h_{x_k}‖ ≤ 0.5 for k ≥ k_3. Then, for p ≥ k_3,

h_p = h_{x_p}  ⟹  h_{p+1} = h_{x_{p+1}},   (39)

because ‖h_{x_{p+1}}‖ < Λ_{p+1}. Since the equality on the left-hand side of (39) holds for infinitely many values of p, the quadratic convergence is a consequence of (32). This completes the proof of Theorem 2.
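The overall iteration analysed above can be sketched schematically in Python. This is an illustration only: it reuses linearized_subproblem from the earlier sketch, the acceptance and bound-adjustment constants (0.01, 0.75, 0.25, 2.5) are stand-ins for the paper's precise rules, and only the stopping test matches the criterion used in Section 4 below:

    import numpy as np

    def minimax_iterate(funcs, grads, A, b, x0, Lam0=1.0, tol=1e-10, kmax=100):
        # Outer loop: linearize, solve the bounded LP subproblem, then
        # adjust the local bound Lambda_k according to how well the
        # linear model predicted the actual decrease in F.
        x, Lam = np.asarray(x0, dtype=float), Lam0
        for _ in range(kmax):
            f = np.array([fi(x) for fi in funcs])
            g = np.vstack([gi(x) for gi in grads])
            h, z = linearized_subproblem(f, g, A, b, x, Lam)
            if np.linalg.norm(h) <= tol * np.linalg.norm(x):
                break                                # ||h_k|| <= 1e-10 ||x_k||
            predicted = f.max() - z                  # F(x_k) - F_bar(x_k, h_k)
            actual = f.max() - max(fi(x + h) for fi in funcs)
            if actual > 0.01 * predicted:            # accept the step
                x = x + h
            # Grow the bound after good linear models, shrink after bad ones.
            Lam = 2.5 * Lam if actual > 0.75 * predicted else 0.25 * Lam
        return x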
4. Numerical examples

The following examples have all been calculated in double precision on the IBM 370/165 computer. By x* we shall denote a solution found by the algorithm. As stopping criterion we use ‖h_k‖ ≤ 10⁻¹⁰‖x_k‖.
Example 1. Let us consider the following three functions

f_1(x_1, x_2) = x_1² + x_2² + x_1x_2 − 1,
f_2(x_1, x_2) = sin x_1,   (40)
f_3(x_1, x_2) = −cos x_2.
We want to solve the linearly constrained minimax problem: minimize

F(x) = max_{1≤i≤3} f_i(x),   (41)

subject to

x_1 + x_2 − 0.5 ≥ 0.   (42)
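Example 1 can be fed directly to the sketches above. The gradients below follow from (40); the starting point is an arbitrary feasible choice of ours, not one from the paper:

    import numpy as np

    funcs = [lambda x: x[0]**2 + x[1]**2 + x[0]*x[1] - 1.0,
             lambda x: np.sin(x[0]),
             lambda x: -np.cos(x[1])]
    grads = [lambda x: np.array([2.0*x[0] + x[1], 2.0*x[1] + x[0]]),
             lambda x: np.array([np.cos(x[0]), 0.0]),
             lambda x: np.array([0.0, np.sin(x[1])])]

    # Constraint (42), x1 + x2 - 0.5 >= 0, in the form A x + b >= 0:
    A = np.array([[1.0, 1.0]])
    b = np.array([-0.5])

    x_star = minimax_iterate(funcs, grads, A, b, x0=[1.0, 0.0])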