An Interior Point Potential Reduction Algorithm for the Linear Complementarity Problem

Masakazu Kojima (Department of Information Sciences, Tokyo Institute of Technology, Meguro, Tokyo, Japan)
Nimrod Megiddo (IBM Research, Almaden Research Center, San Jose, CA 95120-6099, and School of Mathematical Sciences, Tel Aviv University, Tel Aviv, Israel)
Yinyu Ye (Department of Management Sciences, University of Iowa, Iowa City, IA 52242)

(Revised, December 1989)
Keywords: linear complementarity, P-matrix, interior point, potential function, linear programming, quadratic programming

Abbreviated title: Interior-point algorithm for the LCP

Abstract. The linear complementarity problem (LCP) can be viewed as the problem of minimizing $x^T y$ subject to $y = Mx + q$ and $x, y \ge 0$. We are interested in finding a point with $x^T y < \varepsilon$ for a given $\varepsilon > 0$. The algorithm proceeds by iteratively reducing the potential function
$$ f(x, y) = \zeta \ln x^T y - \sum_j \ln x_j y_j , $$
where, for example, $\zeta = 2n$. The direction of movement in the original space can be viewed as follows. First, apply a linear scaling transformation to make the coordinates of the current point all equal to 1. Take a gradient step in the transformed space using the gradient of the transformed potential function, where the step size is either predetermined by the algorithm or decided by line search to minimize the value of the potential. Finally, map the point back to the original space. A bound on the worst-case performance of the algorithm depends on the parameter $\sigma = \sigma(M)$, which is defined as the minimum of the smallest eigenvalue of a matrix of the form
$$ (I + Y^{-1} M X)(I + X M^T Y^{-2} M X)^{-1}(I + X M^T Y^{-1}) , $$
where $X$ and $Y$ vary over the nonnegative diagonal matrices such that $e^T X Y e \ge \varepsilon$ and $X_{jj} Y_{jj} \le n^2$. If $M$ is a P-matrix, $\sigma$ is positive and the algorithm solves the problem in polynomial time in terms of the input size, $|\log \varepsilon|$, and $1/\sigma$. It is also shown that when $M$ is positive semi-definite, the choice $\zeta = 2n + \sqrt{2n}$ yields a polynomial-time algorithm. This covers the convex quadratic minimization problem.
1. Introduction

We consider the linear complementarity problem (LCP): given a rational matrix $M \in \mathbf{R}^{n \times n}$ and a rational vector $q \in \mathbf{R}^{n}$, find vectors $x, y \in \mathbf{R}^{n}$ such that
$$ (\mathrm{LCP}) \qquad y = Mx + q , \quad x, y \ge 0 , \quad x^T y = 0 . $$
Equivalently, we consider the LCP as a quadratic programming problem:
$$ \text{Minimize } x^T y \quad \text{subject to } y = Mx + q , \ x, y \ge 0 . $$
The problem of recognizing whether the LCP has a solution is NP-complete.¹ We are interested in the problem of finding a solution in special cases where existence is guaranteed. For example, if $M$ has positive principal minors (i.e., $M$ is a P-matrix [1]), a unique solution exists for every $q$. Also, the problem of computing a Nash-equilibrium point in a two-person game can be reduced to an LCP with $q = -e = (-1, \ldots, -1)^T$ and
$$ M = \begin{pmatrix} O & B^T \\ A & O \end{pmatrix} , $$
where $A$ and $B$ have positive entries. A solution always exists for such problems. It is shown in [5] that the P-matrix LCP and the equilibrium problems cannot be NP-hard unless NP = co-NP.
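As a concrete illustration of the bimatrix-game reduction above, the following sketch (ours, not the paper's; it assumes NumPy and uses small illustrative payoff matrices) assembles the block matrix $M$ and the vector $q = -e$.

```python
import numpy as np

# Hypothetical positive payoff matrices; both players have two pure strategies here.
A = np.array([[3.0, 1.0],
              [2.0, 4.0]])
B = np.array([[2.0, 3.0],
              [1.0, 2.0]])

p = A.shape[0]
Z = np.zeros((p, p))
M = np.block([[Z, B.T],
              [A, Z]])      # M = [[O, B^T], [A, O]]
q = -np.ones(2 * p)         # q = -e = (-1, ..., -1)^T
print(M)
print(q)
```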
An $\varepsilon$-complementary solution is a pair $x, y \ge 0$ such that $y = Mx + q$ and
$$ x^T y \le \varepsilon . $$
The question we are interested in here is the existence of an algorithm for finding an $\varepsilon$-complementary solution in time which is bounded by a polynomial in the input size $L$ (i.e., the length of the binary representation of $M$ and $q$) and $-\log \varepsilon$. It is shown in [4] that there exists $\varepsilon$, where $-\log \varepsilon$ is bounded by the input size, such that an exact solution can be easily computed from an $\varepsilon$-complementary one.

The analysis of this paper is related to the one presented by Karmarkar [3] for his linear programming algorithm. Gonzaga [2] proves for linear programming essentially the same result as the one presented in Section 7. Also, Ye [6] used potential functions in a similar way.

¹ This is because the problem of recognizing the existence of $(0,1)$-solutions to $Ax \ge b$ can be cast as an LCP: $y_j = 1 - x_j$, $u = Ax - b$, $x, y, u, v \ge 0$, $x^T y = v^T u = 0$.
2. An invariant potential function

We assume that the feasible domain intersects the positive orthant, i.e., there exist $x, y > 0$ such that $y = Mx + q$. For brevity, we say that such points are interior. We define a potential function $f(x, y)$ at interior points by
$$ f(x, y) = \zeta \ln x^T y - \sum_{j=1}^{n} \ln x_j y_j , $$
where $\zeta = \zeta(n) > n$.
Lemma 2.1. If $f(x, y) \le K$, then $x^T y < e^{K / (\zeta - n)}$.

Proof: The claim follows from
$$ f(x, y) = \zeta \ln x^T y - \ln \prod_{i=1}^{n} x_i y_i \ \ge\ \zeta \ln x^T y - \ln \left( \frac{x^T y}{n} \right)^{\!n} = (\zeta - n) \ln x^T y + n \ln n \ >\ (\zeta - n) \ln x^T y . $$
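For readers who wish to experiment numerically, here is a minimal sketch (ours, not the paper's; it assumes NumPy) that evaluates the potential function and illustrates the bound of Lemma 2.1 at an arbitrary positive point.

```python
import numpy as np

def potential(x, y, zeta):
    """f(x, y) = zeta * ln(x^T y) - sum_j ln(x_j y_j), defined for x, y > 0."""
    return zeta * np.log(x @ y) - np.sum(np.log(x * y))

n = 4
zeta = 2 * n                       # e.g. the choice zeta = 2n
rng = np.random.default_rng(0)
x = rng.uniform(0.1, 2.0, n)
y = rng.uniform(0.1, 2.0, n)

K = potential(x, y, zeta)
# Lemma 2.1: f(x, y) <= K implies x^T y < exp(K / (zeta - n)).
print(x @ y, "<", np.exp(K / (zeta - n)))
```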
We use the notation $p(L)$ for any polynomial in $L$. Starting from a point where the potential function value is $p(L)$, if the value is reduced during each step by at least $n^{-k}$ for some $k$, then it takes a polynomial number of steps (in terms of $n$, $L$, and $-\log \varepsilon$) to reach an $\varepsilon$-complementary point.

We now examine the behavior of the potential function under scaling of $x$ and $y$. Given any feasible interior point $(x^0, y^0)$, let $X$ and $Y$ denote diagonal matrices with the coordinates of $x^0$ and $y^0$, respectively, in their diagonals. Define a linear transformation of space by
$$ x' = X^{-1} x , \qquad y' = Y^{-1} y . $$
Denote
$$ M' = Y^{-1} M X , \qquad q' = Y^{-1} q , $$
and
$$ w_j = x^0_j y^0_j \quad (j = 1, \ldots, n) , \qquad w = (w_1, \ldots, w_n)^T . $$
Also denote $W = XY$. The transformed problem is
$$ \text{Minimize } x'^T W y' \quad \text{subject to } y' = M' x' + q' , \ x', y' \ge 0 . $$
Note that feasible solutions of the original problem are mapped into feasible solutions of the transformed problem:
$$ y' = Y^{-1} y = Y^{-1} (Mx + q) = M' x' + q' , $$
and, obviously, the point $(x^0, y^0)$ is mapped into $(e, e)$. Consider the transformed potential function
$$ f'(x', y') = \zeta \ln x'^T W y' - \sum_{j=1}^{n} \ln x'_j w_j y'_j . $$
It is easy to verify that $f'(x', y') = f(x, y)$, so a reduction in one of the functions results in the same reduction in the other one. Obviously, the matrix $M'$ is a P-matrix if and only if $M$ is a P-matrix.

Now, consider the special case where the current point $(x, y)$ is on the "central path", i.e., $XY = \mu I$ for some $\mu > 0$. In this case, some important structure of $M$ is preserved in the transformation to $M'$. For example, if $M$ arises from a linear programming problem then $M$ is skew-symmetric. In this case, $M'$ is also skew-symmetric and arises from a transformed linear programming problem. Obviously, if $M$ is symmetric then so is $M'$. Moreover, if $M$ is positive semi-definite then so is $M'$. This implies that if $M$ arises from a convex quadratic programming problem then so does $M'$.
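The invariance $f'(x', y') = f(x, y)$ is easy to check numerically; the identity holds for any positive pair $(x, y)$, feasible or not. The sketch below (our illustration, assuming NumPy) applies the scaling of this section and compares the two potential values.

```python
import numpy as np

def potential(x, y, zeta):
    return zeta * np.log(x @ y) - np.sum(np.log(x * y))

def transformed_potential(xp, yp, w, zeta):
    # f'(x', y') = zeta * ln(x'^T W y') - sum_j ln(x'_j w_j y'_j), with W = diag(w)
    return zeta * np.log(xp @ (w * yp)) - np.sum(np.log(xp * w * yp))

rng = np.random.default_rng(1)
n, zeta = 5, 10
x0 = rng.uniform(0.5, 1.5, n)      # current interior point; defines X, Y and w
y0 = rng.uniform(0.5, 1.5, n)
w = x0 * y0                        # w_j = x0_j y0_j, i.e. W = XY

x = rng.uniform(0.5, 1.5, n)       # any other positive point
y = rng.uniform(0.5, 1.5, n)
xp, yp = x / x0, y / y0            # x' = X^{-1} x,  y' = Y^{-1} y

print(potential(x, y, zeta))       # the two printed values agree
print(transformed_potential(xp, yp, w, zeta))
```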
3. Getting an initial point

To guarantee a polynomial running time, we have to start the algorithm at an interior point where the value of the potential function is bounded by a polynomial $p(L)$ in the size of the input of the problem. We note that any LCP can be reduced to an LCP whose feasible domain intersects the positive orthant, by adding a pair of complementary variables $x_0, y_0 \ge 0$:
$$ (\mathrm{LCP}') \qquad y = Mx + (e - Me - q)\, x_0 + q , \quad y_0 = x_0 , \quad x, y, x_0, y_0 \ge 0 , \quad x_j y_j = 0 \ \ (j = 0, 1, \ldots, n) , $$
where the choice $x = y = e$, $x_0 = y_0 = 1$ yields an interior feasible solution. Moreover, the value of the potential function at this point is $\zeta \ln (n+1)$. Note that this reformulation preserves the P-matrix property. We may view $\varepsilon$-complementary solutions of $(\mathrm{LCP}')$ as approximate solutions to $(\mathrm{LCP})$. It is important to note that given an $\varepsilon$-complementary solution with $\varepsilon \le 2^{-L}$, a simple procedure finds an exact solution (see [4]).

Another possible initialization is as follows. If the original problem has an interior point, then any polynomial-time algorithm for linear programming can be employed to find an interior point where the value of the potential function is bounded by a polynomial in the input size. This can be done by solving an auxiliary linear programming problem over the constraints $y = Mx + q$, $x, y \ge 0$.
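The first initialization is easy to carry out explicitly; the sketch below (ours, assuming NumPy) builds the augmented data of $(\mathrm{LCP}')$ and confirms that $x = e$, $x_0 = 1$ is an interior point of the enlarged problem.

```python
import numpy as np

def augment(M, q):
    """Embed (M, q) into an (n+1)-variable LCP whose feasible set meets the positive
    orthant, following the augmented system displayed above (added column
    c = e - M e - q, added row y0 = x0)."""
    n = M.shape[0]
    c = np.ones(n) - M @ np.ones(n) - q
    M_aug = np.block([[M, c[:, None]],
                      [np.zeros((1, n)), np.ones((1, 1))]])
    q_aug = np.concatenate([q, [0.0]])
    return M_aug, q_aug

rng = np.random.default_rng(2)
n = 3
M = rng.standard_normal((n, n))
q = rng.standard_normal(n)

M_aug, q_aug = augment(M, q)
x_start = np.ones(n + 1)               # x = e, x0 = 1
y_start = M_aug @ x_start + q_aug
print(y_start)                         # equals (e, 1): an interior feasible point
```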
4. The core of the algorithm

During a single iteration the algorithm takes a step in the direction corresponding to steepest descent of the potential function in the transformed space. More precisely, the algorithm transforms the current point $(x^k, y^k)$ into $(e, e)$ by linear scaling, moves to a point in the direction of steepest descent, and then transforms this point back to the original space and calls it $(x^{k+1}, y^{k+1})$. To simplify the presentation, we assume the current point is indeed $(e, e)$ and the potential function has the form
$$ f(x, y) = \zeta \ln x^T W y - \sum_{j=1}^{n} \ln x_j w_j y_j . $$
We also omit the primes from $M'$ and $q'$. The gradient of $f$ is given by
$$ \nabla_x f(x, y) = \frac{\zeta}{x^T W y} W y - X^{-1} e , \qquad \nabla_y f(x, y) = \frac{\zeta}{x^T W y} W x - Y^{-1} e , $$
where $X$ and $Y$ denote diagonal matrices containing the coordinates of $x$ and $y$, respectively, in their diagonals. Denote
$$ g = \frac{\zeta}{e^T w} w - e . $$
If $x = y = e$, then
$$ g_x = \nabla_x f(e, e) = \frac{\zeta}{e^T W e} W e - e = g , \qquad g_y = \nabla_y f(e, e) = \frac{\zeta}{e^T W e} W e - e = g . $$
Denote by $(\Delta x, \Delta y)$ the projection of $\nabla f(e, e)$ on the linear space defined by $\Delta y = M \Delta x$. Thus, $(\Delta x, \Delta y)$ minimizes
$$ \| \Delta x - g_x \|^2 + \| \Delta y - g_y \|^2 \quad \text{subject to} \quad \Delta y = M \Delta x . $$
It follows that
$$ \Delta x = (I + M^T M)^{-1} (I + M^T) g , \qquad \Delta y = M (I + M^T M)^{-1} (I + M^T) g . $$
Denote $d = ((\Delta x)^T, (\Delta y)^T)^T$ and $h = (g^T, g^T)^T$.
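To make one iteration concrete, here is a minimal sketch (ours, not the paper's; it assumes NumPy, and the fixed step length and random data are choices made only for illustration): scale the current point to $(e, e)$, form $g$, project to obtain $(\Delta x, \Delta y)$, and map the step back to the original space.

```python
import numpy as np

def step(M, x, y, zeta, t=0.05):
    """One potential-reduction step from an interior point (x, y); feasibility with
    respect to whatever q satisfies y = Mx + q is preserved by the step."""
    n = len(x)
    Mp = M * x[None, :] / y[:, None]        # M' = Y^{-1} M X
    w = x * y                               # w = XY e
    g = (zeta / w.sum()) * w - np.ones(n)
    # Projection of h = (g, g) onto the subspace {Delta y = M' Delta x}:
    dx = np.linalg.solve(np.eye(n) + Mp.T @ Mp, (np.eye(n) + Mp.T) @ g)
    dy = Mp @ dx
    # Move to (e - t dx, e - t dy) in the transformed space and map back.
    return x * (1.0 - t * dx), y * (1.0 - t * dy)

rng = np.random.default_rng(3)
n, zeta = 4, 8
M = np.eye(n) + 0.1 * rng.standard_normal((n, n))
x = np.ones(n)
y = np.ones(n)                              # (e, e) is interior for q = y - Mx

f = lambda x, y: zeta * np.log(x @ y) - np.sum(np.log(x * y))
x1, y1 = step(M, x, y, zeta)
print(f(x, y), "->", f(x1, y1))             # the potential should decrease
```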
Proposition 4.1. $h^T d = \|d\|^2$.

Proof: The vector $d$ is a projection of $h$ on a linear subspace, so $d = Ph$ where $P$ is a symmetric matrix such that $P^2 = P$. Thus, $\|d\|^2 = h^T P^2 h = h^T P h = h^T d$.

Fact 4.2. $e^T g = \zeta - n$, and hence $\|g\| \ge (\zeta - n)/\sqrt{n}$.

Fact 4.3. $\|g\| \le \sqrt{\zeta^2 - 2\zeta + n} < \zeta$.

Proof: By definition, since $w_j > 0$,
$$ -1 < g_j < \zeta - 1 . $$
The vector $g$ lies in the simplex defined by these bounds together with $e^T g = \zeta - n$. The maximum norm of a point in this simplex is attained at a vertex, so we have
$$ \|g\| \le \| (-1, \ldots, -1, \zeta - 1)^T \| = \sqrt{\zeta^2 - 2\zeta + n} < \zeta . $$

Corollary 4.4. $\max\{ \|\Delta x\|, \|\Delta y\| \} < \sqrt{2}\,\zeta$.

Proof: Since $d$ is the projection of $h$ on a linear subspace,
$$ \|d\| \le \|h\| = \sqrt{2}\,\|g\| < \sqrt{2}\,\zeta , $$
which implies the claim.

We now estimate the amount of reduction in the value of $f$ as we move from $x = y = e$ to a point of the form $x' = e - t \Delta x$, $y' = e - t \Delta y$, where $t > 0$. We would like to choose $t$ so as to achieve a reduction of at least $n^{-k}$ for some constant $k$.

Proposition 4.5.
$$ \ln (e^T W e) - \ln \big( (e - t \Delta x)^T W (e - t \Delta y) \big) \ \ge\ t \, \frac{e^T W (\Delta x + \Delta y)}{e^T W e} - \|d\|^2 t^2 . $$

Proof: First, we have
$$ \ln \big( (e - t \Delta x)^T W (e - t \Delta y) \big) = \ln \big( e^T W e - t\, e^T W (\Delta x + \Delta y) + t^2 (\Delta x)^T W \Delta y \big) \le \ln e^T W e - t \, \frac{e^T W (\Delta x + \Delta y) - t (\Delta x)^T W \Delta y}{e^T W e} , $$
and
$$ \frac{(\Delta x)^T W \Delta y}{e^T W e} \le \|\Delta x\| \, \|\Delta y\| \le \|d\|^2 . $$
Proposition 4.6. If $M^T g \ne -g$, then $h^T d > 0$.

With a suitable choice of the step size $t$ (of the order of $\|d\|/4$), the resulting reduction in the potential function exceeds $\|d\|^2 / (4\sqrt{2})$.
5. Suitable matrices

In view of Lemma 4.9, we need to get good lower bounds on $\|d\|$. Recall that $d \ne 0$ provided $M^T g \ne -g$. However, we are interested in conditions on $M$ which imply that $\|d\|$ is bounded away from zero, hopefully by $n^{-k}$ for some $k$. Moreover, since the matrix $M$ of the preceding section is actually $M' = Y^{-1} M X$, we need to look for conditions on $M$ which are invariant under scaling of columns and rows by positive multipliers.

Definition 5.1. For any matrix $M$ and any diagonal matrices (with positive diagonal entries) $X$ and $Y$, denote
$$ S = S(M, Y^{-1} X) = (I + Y^{-1} M X)(I + X M^T Y^{-2} M X)^{-1}(I + X M^T Y^{-1}) . $$
Let $\sigma(M, Y^{-1} X)$ denote the smallest eigenvalue of $S(M, Y^{-1} X)$, and let
$$ \sigma_0(M) = \inf \{ \sigma(M, Y^{-1} X) : X_{jj} Y_{jj} > 0 \} . $$
Also, let
$$ \sigma_1(M) = \inf \{ \sigma(M, Y^{-1} X) : 0 < X_{jj} Y_{jj} \le n^2 \} . $$
Obviously, $\sigma_1(M) \ge \sigma_0(M)$. The distinction between $\sigma_0$ and $\sigma_1$ is crucial in the context of P-matrices, as we show below.
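For given scalings the quantity $\sigma(M, Y^{-1} X)$ is easy to evaluate, since $S$ is symmetric positive semi-definite. The sketch below (ours, assuming NumPy) computes it and crudely samples the infimum over random scalings; this is only an illustration, not a certified bound.

```python
import numpy as np

def sigma(M, x_scale, y_scale):
    """Smallest eigenvalue of S(M, Y^{-1} X) for X = diag(x_scale), Y = diag(y_scale)."""
    n = M.shape[0]
    N = M * x_scale[None, :] / y_scale[:, None]        # N = Y^{-1} M X
    S = (np.eye(n) + N) @ np.linalg.inv(np.eye(n) + N.T @ N) @ (np.eye(n) + N.T)
    return np.linalg.eigvalsh(S).min()                 # S = (I+N)(I+N^T N)^{-1}(I+N)^T

# Example: a small P-matrix.
M = np.array([[1.0, 0.2],
              [-0.3, 1.0]])
rng = np.random.default_rng(4)
vals = [sigma(M, rng.uniform(0.1, 10.0, 2), rng.uniform(0.1, 10.0, 2))
        for _ in range(1000)]
print(min(vals))    # a sampled (upper) estimate of the infimum in sigma_0 / sigma_1
```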
Remark 5.2. For every $M$, $X$ and $Y$, $\sigma(M, Y^{-1} X) \ge 0$, and $\sigma(M, Y^{-1} X) = 0$ if and only if $I + Y^{-1} M X$ is singular. Thus $\sigma_0(M) \ge 0$. Note that if $M$ has nonnegative principal minors, then $I + R M C$ is nonsingular for any nonnegative diagonal matrices $R$ and $C$. On the other hand, if $M$ has a negative principal minor, there exist $X$ and $Y$ such that $I + Y^{-1} M X$ is singular. This can be seen as follows. Suppose $\{1, \ldots, k\}$ is a set of indices of columns and rows corresponding to a negative minor. Let $X$ and $Y$ be diagonal matrices with $X_{jj} = Y_{jj}^{-1} = t$ for $j \le k$ and $X_{jj} = Y_{jj}^{-1} = \min\{1, t^{-2}\}$ otherwise. The determinant of $I + Y^{-1} M X$ tends to 1 when $t$ tends to 0, and to $-\infty$ as $t$ tends to infinity. It follows that for some value of $t$ this determinant equals zero. Since $X_{jj} Y_{jj} = 1$, it follows that if $M$ has a negative principal minor, then $\sigma_1(M) = 0$.
Proposition 5.3. If $\zeta = 2n$, then throughout the execution of the algorithm,
$$ \|d\|^2 \ \ge\ n \, \sigma(M, Y^{-1} X) \ \ge\ n \, \sigma_1(M) . $$

Proof: It follows from the analysis above (see Fact 4.2) that
$$ \|d\|^2 = d^T h = (\Delta x + \Delta y)^T g = g^T (I + Y^{-1} M X)(I + X M^T Y^{-2} M X)^{-1}(I + X M^T Y^{-1}) g \ \ge\ \sigma(M, Y^{-1} X) \, \|g\|^2 \ \ge\ n \, \sigma(M, Y^{-1} X) . $$
To prove that actually $\sigma_1(M)$ can be used, recall that the value of the potential function does not increase throughout the execution of the algorithm. Thus, if the initial value is $K$, then $K$ remains an upper bound on the value of $f(x, y)$ and, by Lemma 2.1, $e^{K/(\zeta - n)}$ is an upper bound on $x^T y$. The essential thing to note is that for every $j$, the value of $x_j y_j$ is bounded throughout by the same quantity. Actually, we have demonstrated above that the initial value can be chosen as $K = \zeta \ln n$. Therefore, if $\zeta = 2n$, we obtain $x_j y_j \le e^{K/n} = n^2$ throughout. Thus,
$$ \|d\|^2 \ \ge\ n \inf \{ \sigma(M, Y^{-1} X) : 0 < X_{jj} Y_{jj} \le n^2 \} = n \, \sigma_1(M) . $$
6. The case of a P-matrix

Recall that a P-matrix is one that has positive principal minors. Remark 5.2 provides some motivation for considering the performance of the algorithm on problems with P-matrices. In this section we prove that the algorithm presented above solves the LCP with a P-matrix. The algorithm runs until the value of $x^T y$ decreases below some prescribed $\varepsilon > 0$. For problems with rational coefficients, an $\varepsilon > 0$ can be determined such that an exact solution can be computed from an $\varepsilon$-complementary one. The following proposition is the key to the validity of the algorithm:
Proposition 6.1. If $M$ is a P-matrix, then for every $\varepsilon > 0$ there exists $\delta > 0$ such that, for every pair $(x, y)$ produced by the algorithm, if $x^T y > \varepsilon$ then necessarily $x_j y_j > \delta$ for all $j$.

Proof: The proof follows from the fact that the potential function decreases monotonically throughout the execution of the algorithm. In particular, there exists a constant $K$ such that $f(x, y) < K$ for every iterate $(x, y)$. By Lemma 2.1, for every such iterate, $x^T y < e^{K/(\zeta - n)}$. However, if $M$ is a P-matrix, the set of pairs $(x, y)$, such that $y = Mx + q$ and $x^T y < t$, is bounded for every $t$. The latter can be seen as follows. The quantity
$$ \alpha(M) = \min_{x \ne 0} \max_i \frac{x_i (Mx)_i}{\|x\|^2} $$
is positive for any P-matrix. Suppose, on the contrary, that the set defined above is unbounded. In particular, the inequalities
$$ x_i (Mx)_i + x_i q_i < t \quad (i = 1, \ldots, n) $$
leave $x$ unbounded. However, when $x$ tends to infinity, these inequalities, written in the form
$$ \frac{x_i (Mx)_i}{\|x\|^2} + \frac{x_i q_i}{\|x\|^2} < \frac{t}{\|x\|^2} , $$
give in the limit a point $\bar{x} \ne 0$ such that $\max_i \bar{x}_i (M \bar{x})_i \le 0$, contradicting the fact that $\alpha(M) > 0$.
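The quantity $\alpha(M)$ used in the proof is easy to explore numerically. The sketch below (ours, assuming NumPy; the matrix is a small illustrative P-matrix) verifies the P-matrix property by checking all principal minors and computes a sampled estimate of $\alpha(M)$ (which, being a sampled minimum, is only an upper bound on the true value).

```python
import numpy as np
from itertools import combinations

def is_P_matrix(M):
    """True if every principal minor of M is positive (only practical for small n)."""
    n = M.shape[0]
    return all(np.linalg.det(M[np.ix_(idx, idx)]) > 0
               for k in range(1, n + 1)
               for idx in combinations(range(n), k))

def alpha_estimate(M, samples=20000, seed=5):
    """Sampled estimate of alpha(M) = min_{x != 0} max_i x_i (Mx)_i / ||x||^2."""
    rng = np.random.default_rng(seed)
    best = np.inf
    for _ in range(samples):
        x = rng.standard_normal(M.shape[0])
        best = min(best, np.max(x * (M @ x)) / (x @ x))
    return best

M = np.array([[2.0, -1.0,  0.0],
              [0.5,  1.0, -0.5],
              [0.0, -1.0,  1.5]])
print(is_P_matrix(M), alpha_estimate(M))   # expect True and a positive value
```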
7. The positive semi-definite case

In this section we show that the algorithm of this paper solves linear complementarity problems with positive semi-definite matrices in polynomial time. We derive the formulas in the transformed space, so the reader who wishes to use the original matrix $M$ has to replace $M$ in the formulas below by $Y^{-1} M X$. As in Section 4, consider the potential function
$$ f(x, y) = \zeta \ln x^T y - \sum_{j=1}^{n} \ln x_j y_j , $$
where $\zeta > n$. The vector $g$ can be written as
$$ g = \frac{\zeta}{x^T y} X Y e - e . $$
The optimality conditions in the problem of projecting $h = (g^T, g^T)^T$ into $\{ \Delta y = M \Delta x \}$ are:
$$ \Delta x - g = M^T u , \qquad \Delta y - g = -u , \qquad \Delta y = M \Delta x . $$
Thus we have
$$ u = (I + M M^T)^{-1} (I - M) g , \qquad \Delta x = g + M^T u , \qquad \Delta y = g - u . $$
Since the derivation is done in the transformed space, when we use the original matrix $M$ we have to write as follows:
$$ \Delta x = g + \frac{\zeta}{x^T y} X M^T u' , \qquad \Delta y = g - \frac{\zeta}{x^T y} Y u' , $$
where
$$ u' = \frac{x^T y}{\zeta} \, Y^{-1} (I + Y^{-1} M X^2 M^T Y^{-1})^{-1} (I - Y^{-1} M X) g . $$
Thus,
$$ \Delta x = \frac{\zeta}{x^T y} X (y + M^T u') - e , \qquad \Delta y = \frac{\zeta}{x^T y} Y (x - u') - e . $$
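These formulas translate directly into code. The sketch below (ours, assuming NumPy; the random data are only illustrative) computes $u'$ and the components $\Delta x$, $\Delta y$ in the original coordinates for a positive semi-definite $M$, and prints $\|d\|$, which Proposition 7.1 below asserts is at least 1 for this choice of $\zeta$.

```python
import numpy as np

def psd_direction(M, x, y, zeta):
    """(dx, dy) of Section 7, expressed with the original matrix M and a point (x, y) > 0."""
    n = len(x)
    X = np.diag(x)
    Yi = np.diag(1.0 / y)
    mu = (x @ y) / zeta
    N = Yi @ M @ X                                   # Y^{-1} M X
    g = (x * y) / mu - np.ones(n)                    # g = (zeta / x^T y) XY e - e
    inner = np.eye(n) + Yi @ M @ X @ X @ M.T @ Yi    # I + Y^{-1} M X^2 M^T Y^{-1}
    u = mu * (Yi @ np.linalg.solve(inner, (np.eye(n) - N) @ g))
    dx = (x * (y + M.T @ u)) / mu - np.ones(n)       # (zeta/x^T y) X (y + M^T u') - e
    dy = (y * (x - u)) / mu - np.ones(n)             # (zeta/x^T y) Y (x - u') - e
    return dx, dy

rng = np.random.default_rng(6)
n = 3
A = rng.standard_normal((n, n))
M = A @ A.T                                          # positive semi-definite
x = rng.uniform(0.5, 1.5, n)
y = rng.uniform(0.5, 1.5, n)
zeta = 2 * n + np.sqrt(2 * n)

dx, dy = psd_direction(M, x, y, zeta)
print(np.linalg.norm(np.concatenate([dx, dy])))      # >= 1 by Proposition 7.1
```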
We can now prove the following:

Proposition 7.1. If $M$ is positive semi-definite and $\zeta = 2n + \sqrt{2n}$, then $\|d\| \ge 1$.

Proof: Denote
$$ x' = x - u' , \qquad y' = y + M^T u' . $$
We have
$$ \Delta x = \frac{\zeta}{x^T y} X y' - e , \qquad \Delta y = \frac{\zeta}{x^T y} Y x' - e . $$
If either $x' \not\ge 0$ or $y' \not\ge 0$, then at least one coordinate of $d = (\Delta x, \Delta y)$ is less than $-1$, so $\|d\| \ge 1$. Thus, assume $x', y' \ge 0$. Also,
$$ (y' - y)^T (x' - x) = -(u')^T M u' \le 0 . $$
Thus,
$$ x^T y' + y^T x' \ \ge\ x^T y + x'^T y' \ \ge\ x^T y > 0 . $$
This implies
$$ \|d\|^2 = \frac{\zeta^2}{(x^T y)^2} \left( \|X y'\|^2 + \|Y x'\|^2 \right) - \frac{2\zeta}{x^T y} \left( x^T y' + y^T x' \right) + 2n $$
$$ \ \ge\ \frac{\zeta^2}{(x^T y)^2} \cdot \frac{(x^T y')^2 + (y^T x')^2}{n} - \frac{2\zeta}{x^T y} \left( x^T y' + y^T x' \right) + 2n $$
$$ \ \ge\ \frac{\zeta^2}{2n (x^T y)^2} \left( x^T y' + y^T x' \right)^2 - \frac{2\zeta}{x^T y} \left( x^T y' + y^T x' \right) + 2n $$
$$ = 2n \left( \frac{\zeta (x^T y' + y^T x')}{2n \, x^T y} - 1 \right)^{\!2} \ \ge\ 2n \left( 1 + \frac{1}{\sqrt{2n}} - 1 \right)^{\!2} = 1 . $$

Finally, we can state the following:

Theorem 7.2. If $M$ is positive semi-definite then, starting at any interior point where the potential value is $O(nL)$, the algorithm converges in $O(n^2 L)$ iterations.

Proof: If $M$ is positive semi-definite, it follows from Proposition 7.1 and Lemma 4.9 (with $\zeta = 2n + \sqrt{2n}$) that the potential function is reduced by at least $1/O(n)$ at each iteration. Thus, by Lemma 2.1, after $O(n^2 L)$ iterations we have $x^T y \le 2^{-L}$. Initializations that fit the above theorem and the next one were given elsewhere (see, for example, [4]).
Theorem 7.3. If $M$ is skew-symmetric (including the case of the linear programming problem) then, starting at any interior point where the potential value is $O(nL)$, the algorithm converges in $O(nL)$ iterations.

Proof: In the skew-symmetric case we obtain better estimates as follows. First, in Proposition 4.5, the quadratic term vanishes, so
$$ \ln (e^T W e) - \ln \big( (e - t \Delta x)^T W (e - t \Delta y) \big) \ \ge\ t \, \frac{e^T W (\Delta x + \Delta y)}{e^T W e} . $$
Hence, the inequality of Corollary 4.8 becomes
$$ \Delta f \ \le\ -\|d\|^2 t + 2 \|d\|^2 t^2 . $$
Following arguments similar to the ones used in the proof of Lemma 4.9, let $t_0 = 1/4$ denote the maximizer of the parabola $\|d\|^2 t - 2\|d\|^2 t^2$ appearing on the right-hand side. If $t_0$ does not exceed the largest admissible step size, then we have $\Delta f \le -\|d\|^2 / 8$. Otherwise, we get $\Delta f \le -\|d\|/4$, and the claim follows from Proposition 7.1.
References

[1] D. Gale and H. Nikaido, "The Jacobian matrix and global univalence of mappings", Mathematische Annalen 159 (1965) 81-93.
[2] C. C. Gonzaga, "Polynomial affine algorithms for linear programming", unpublished manuscript, Federal University of Rio de Janeiro, 1988.
[3] N. Karmarkar, "A new polynomial-time algorithm for linear programming", Combinatorica 4 (1984) 373-395.
[4] M. Kojima, S. Mizuno and A. Yoshise, "Polynomial time algorithm for a class of linear complementarity problems", Mathematical Programming 44 (1989) 1-26.
[5] N. Megiddo, "A Note on the Complexity of P-Matrix LCP and Computing an Equilibrium", Research Report RJ 6439, IBM Almaden Research Center, San Jose, CA 95120-6099, 1988.
[6] Y. Ye, "An O(n^3 L) potential reduction algorithm for linear programming", unpublished manuscript, The University of Iowa, 1989.