Interior Point Trajectories in Semidefinite Programming*

D. Goldfarb and K. Scheinberg†
Columbia University, Dept. of IEOR, New York, NY

November 1, 1996

* Research supported in part by NSF Grants DMS 91-06195, DMS 94-14438 and DMS 95-27124 and DOE Grant DE-FG02-92ER25126.
† This author was supported in part by an IBM Cooperative Fellowship.

Abstract

In this paper we study interior point trajectories in semidefinite programming (SDP), including the central path of an SDP. This work was inspired by the seminal work by Megiddo on linear programming trajectories [15]. Under an assumption of primal and dual strict feasibility, we show that the primal and dual central paths exist and converge to the analytic centers of the optimal faces of, respectively, the primal and the dual problems. We consider a class of trajectories that are similar to the central path, but can be constructed to pass through any given interior feasible point, and study their convergence. Finally, we study the first order derivatives of these trajectories and their convergence. We also consider higher order derivatives associated with these trajectories.

1 Introduction

The purpose of this paper is to study properties of the trajectories associated with interior point methods for semidefinite programming (SDP) problems. Since many aspects of semidefinite programming find close analogs in linear programming, several interior point methods designed for linear programming (LP) have been successfully extended to apply to semidefinite programming (e.g., see [2], [4], [10], [12], [17], [19], [20], [21], [25]). Many of these aspects have also been studied in the more general framework of self-scaled cones in [20], [21].

Many interior point methods can be viewed as iterative approximations to continuous path-following methods. Our aim is to provide a theoretical basis for such methods for SDP by describing the limiting behavior of the continuous central path and related trajectories for such problems. This work is an extension of the linear programming results in [15] to semidefinite programming. We characterize the optimal face of an SDP problem and prove that the central path converges to the analytic center of the optimal face. Unlike LP problems, an SDP problem does not always have a strictly complementary primal-dual pair of solutions (e.g., see [3], [12]). Thus the SDP central path cannot be guaranteed to converge to such a pair as it does in LP. However, we show that it converges to a "least nonstrictly complementary" pair, in the sense that the sum of the ranks of the primal and the dual solutions (viewed as matrices) is as large as possible. Another issue that makes SDP different from LP is the absence (at least as far as we know) of a suitable concept of a weighted central path.

Given that it is difficult in practice to obtain a point on the central path, it is important to have a class of trajectories that have properties similar to the properties of the central path and that pass through any given pair of interior primal and dual solutions. Such trajectories for linear programming are introduced in [6] and [1] and are called primal affine scaling (PAS) trajectories due to the fact that they correspond to continuous versions of primal affine scaling iterative algorithms. We study the SDP analogs of PAS trajectories and prove that the main convergence results of [1] hold.

We show that under the assumptions of primal and dual nondegeneracy and strict complementarity defined in [3], the first order derivatives of the central path are bounded in the limit. We also provide formulae for the limit of these derivatives and show that the factorization of only one matrix is required to compute these and all higher order derivatives of a solution on the central path.

The paper is organized as follows. In Section 2 we describe the central path for a primal-dual pair of SDP problems and introduce our basic assumptions and some notation. In Section 3 we characterize the optimal faces of the primal and the dual SDP problems, and prove our main convergence result for the primal-dual central path in Section 4. We extend the results of Section 4 to the shifted central path (an analog of the PAS trajectory) in Section 5. Finally, in Section 6 we analyze the limiting properties of the derivatives of the central path and show that computation of the derivatives requires factorizing a single matrix for all orders of the derivatives.

2 The Central Path

In this paper we consider the semidefinite programming problem, henceforth referred to as the primal problem,

    (P)    min   C • X
           s.t.  A_i • X = b_i,   i = 1, ..., m,
                 X ⪰ 0,  X ∈ S^{n×n},

where C ∈ S^{n×n}, A_i ∈ S^{n×n}, i = 1, ..., m, S^{n×n} denotes the space of real symmetric n×n matrices, and b ∈ R^m. The inner product on S^{n×n} is A • B = trace(AB) = Σ_{i,j} A_ij B_ij, and by X ⪰ 0 (X ≻ 0) we mean that X is positive semidefinite (positive definite). The problem dual to (P) is the semidefinite programming problem:

    (D)    max   Σ_{i=1}^m y_i b_i
           s.t.  Σ_{i=1}^m y_i A_i + Z = C,
                 Z ⪰ 0,  Z ∈ S^{n×n}.

Throughout the paper the following are assumed to hold:

Assumption 2.1 The matrices A_i, i = 1, ..., m, are linearly independent; i.e., Σ_{i=1}^m u_i A_i = 0 implies that u_i = 0, i = 1, ..., m.

Assumption 2.2 Both the primal and the dual problem have interior feasible solutions, i.e.,

    ∃ X_0 ∈ S^{n×n} :  X_0 ≻ 0  and  A_i • X_0 = b_i,  i = 1, ..., m,

and

    ∃ (y_0, Z_0) ∈ R^m × S^{n×n} :  Z_0 ≻ 0  and  Σ_{i=1}^m (y_0)_i A_i + Z_0 = C.
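Assumption 2.2 is easy to enforce when generating test data: pick X_0 ≻ 0 and define b from it, then pick (y_0, Z_0) with Z_0 ≻ 0 and define C from them. The following sketch does exactly that and solves the resulting pair (P), (D) with CVXPY (assumed installed together with a conic solver); it then prints the duality gap and the complementarity residual that are recalled in the next paragraph. All data and names here are illustrative, not part of the paper.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, m = 4, 3
sym = lambda M: (M + M.T) / 2

A_list = [sym(rng.standard_normal((n, n))) for _ in range(m)]

# data built so that Assumption 2.2 holds by construction
X0 = np.eye(n)
b = np.array([np.trace(Ai @ X0) for Ai in A_list])
y0 = rng.standard_normal(m)
C = sum(y0[i] * A_list[i] for i in range(m)) + np.eye(n)   # Z0 = I is strictly feasible

# primal problem (P)
X = cp.Variable((n, n), PSD=True)
primal = cp.Problem(cp.Minimize(cp.trace(C @ X)),
                    [cp.trace(A_list[i] @ X) == b[i] for i in range(m)])
primal.solve()

# dual problem (D)
y = cp.Variable(m)
Z = cp.Variable((n, n), PSD=True)
dual = cp.Problem(cp.Maximize(b @ y),
                  [sum(y[i] * A_list[i] for i in range(m)) + Z == C])
dual.solve()

print(primal.value - dual.value)            # duality gap, ~ 0 up to solver tolerance
print(np.linalg.norm(X.value @ Z.value))    # complementarity X Z ~ 0
```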

Under Assumption 2.2 both primal and dual problems have finite optimal solutions, X* and (y*, Z*), and the duality gap X* • Z* = 0 [19]. The optimal solutions also satisfy X*Z* = Z*X* = 0 [26].

The central path for the problem (P) is a trajectory of the solutions X(μ) ∈ S^{n×n} to the following parametric family of problems for values of the parameter μ > 0 ([13], [19], [25]):

    (P_μ)    min   C • X − μ ln det X
             s.t.  A_i • X = b_i,   i = 1, ..., m,
                   X ≻ 0,  X ∈ S^{n×n}.

From Assumption 2.2 and the strict convexity of the logarithmic barrier objective function for any μ > 0, problem (P_μ) has a unique solution that satisfies the Karush-Kuhn-Tucker optimality conditions for (P_μ):

    (CP_μ)    Z = μ X^{-1},
              A_i • X = b_i,   i = 1, ..., m,
              Σ_{i=1}^m y_i A_i + Z = C,
              X, Z ≻ 0,  X, Z ∈ S^{n×n}.

The central path for the dual problem can be defined in an analogous manner and is the trajectory (y(μ), Z(μ)) ∈ R^m × S^{n×n} whose points satisfy the same system (CP_μ) as the points X(μ) on the primal central path. Hence it makes sense to refer to the trajectory (X(μ), y(μ), Z(μ)), μ > 0, of solutions to (CP_μ) as the primal-dual central path. Under Assumption 2.2, not only does this path exist, but also it converges to an optimal primal-dual solution (e.g., see [13], [19], [26]).

To conclude this section we introduce some notation that we will use later in the paper. First, we note that the variables X and Z can be viewed both as symmetric matrices and as vectors (obtained from these matrices by stacking their columns one after the other), lying in an n(n+1)/2-dimensional subspace of R^{n²}. Whenever we refer to the matrix X as a vector, we denote it by vec(X). By the constraint matrix A we denote the m × n² matrix, the i-th row of which equals vec(A_i)^T. Note that C • X = vec(C)^T vec(X) is the usual inner product.
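As a concrete check of this vectorized notation, the short NumPy sketch below (random symmetric data, purely illustrative) forms the m × n² matrix A by column-stacking the A_i and verifies that the constraints A_i • X = b_i and the objective C • X can be written as A vec(X) = b and vec(C)^T vec(X).

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 3
sym = lambda M: (M + M.T) / 2

def vec(M):
    # stack the columns of M one after the other (column-major order)
    return M.flatten(order="F")

A_list = [sym(rng.standard_normal((n, n))) for _ in range(m)]
C = sym(rng.standard_normal((n, n)))
X = sym(rng.standard_normal((n, n)))

# constraint matrix A: its i-th row is vec(A_i)^T
A = np.vstack([vec(Ai) for Ai in A_list])             # shape (m, n**2)
b = np.array([np.trace(Ai @ X) for Ai in A_list])     # A_i . X = trace(A_i X)

assert np.allclose(A @ vec(X), b)                     # A vec(X) = b
assert np.isclose(np.trace(C @ X), vec(C) @ vec(X))   # C . X = vec(C)^T vec(X)
```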

The Kronecker product M ⊗ N of matrices M ∈ R^{n×n} and N ∈ R^{n×n} is defined as

    M ⊗ N = [ M_11 N  ...  M_1n N ]
            [   ...   ...    ...  ]
            [ M_n1 N  ...  M_nn N ].

There are two properties of the Kronecker product that we will need later:

    vec(MNP) = (P^T ⊗ M) vec(N)   and   (M ⊗ P)(N ⊗ S) = (MN) ⊗ (PS)   [7].
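Both identities are easy to verify numerically with the column-stacking vec used in this paper; a minimal NumPy check (random matrices, names illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

def vec(A):
    # column-stacking vec, matching the convention used in the text
    return A.flatten(order="F")

M, N, P, S = (rng.standard_normal((n, n)) for _ in range(4))

# vec(M N P) = (P^T kron M) vec(N)
assert np.allclose(vec(M @ N @ P), np.kron(P.T, M) @ vec(N))

# (M kron P)(N kron S) = (M N) kron (P S)
assert np.allclose(np.kron(M, P) @ np.kron(N, S), np.kron(M @ N, P @ S))
```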

If X is a positive semidefinite symmetric matrix, then X has a spectral factorization X = QΛQ^T, where Q is an orthogonal matrix of eigenvectors of X and Λ is a diagonal matrix with the eigenvalues of X on the diagonal. Throughout this paper the upper case letter Q will always denote a matrix with orthonormal columns, and Λ and Ω will always denote diagonal matrices of eigenvalues. Lastly, from properties of the trace we have

Property 2.3 Let A ∈ R^{n×n}, X ∈ R^{r×r} and P ∈ R^{n×r}. Then A • PXP^T = P^T A P • X.

Property 2.4 Let A ∈ R^{n×n}, A ⪰ 0 and B ∈ R^{n×n}, B ⪰ 0. Then A • B ≥ 0.

3 Optimal Faces of the Primal and Dual Problems

Properties of the faces of the cone of positive semidefinite matrices are studied in [5]. The facial structure of semidefinite programming problems (i.e., the intersection of the cone of positive semidefinite matrices with an affine subspace) is studied in general terms in [22], [23]. Here we derive a particular system which describes the optimal face of an SDP problem.

Let us introduce some more notation and recall some well-known facts. Let R(X) denote the range (column space) of X. If X is a positive semidefinite symmetric matrix, it can be factorized as

    X = Q Λ Q^T,   Λ ≻ 0,

where Λ is a diagonal matrix whose diagonal elements are the positive eigenvalues of X and Q is a matrix with orthonormal columns that are eigenvectors corresponding to these eigenvalues. Clearly, R(X) = span(Q), the subspace spanned by the columns of Q, and the dimension of this subspace (i.e., the number of positive eigenvalues of X) equals the rank of X.

Let X and Z be an optimal primal-dual pair of solutions. It is well known that they can be represented as X = Q_P Λ Q_P^T, Z = Q_D Ω Q_D^T, where Λ and Ω are diagonal matrices with the positive eigenvalues of X and Z, respectively, on their diagonals and Q_P^T Q_D = 0. Hence, R(X) ⊆ Ker(Z) and R(Z) ⊆ Ker(X). Let O_P denote the primal optimal face, i.e., the set of primal optimal solutions, and let O_D denote the dual optimal face. Note that both O_P and O_D are convex subsets of affine subspaces of S^{n×n}. By ri O_P (ri O_D) we denote the relative interior of O_P (O_D). Then the following lemma holds [5]:

Lemma 3.1 For any X ∈ O_P ((y, Z) ∈ O_D) and any X̃ ∈ ri O_P ((ỹ, Z̃) ∈ ri O_D), R(X) ⊆ R(X̃) (R(Z) ⊆ R(Z̃)).

This lemma shows that any X̃ ∈ ri O_P is an optimal solution of maximum rank. Moreover, if both X and X̃ are in ri O_P, it follows from Lemma 3.1 that R(X) = R(X̃). Let us denote this subspace by R_P. Analogously, let R_D be the subspace spanned by the eigenvectors corresponding to the positive eigenvalues of Z for any dual solution (y, Z) in the relative interior of O_D. Let dim R_P = r and dim R_D = s. From the complementarity of any primal-dual pair of optimal solutions, R_P ⊥ R_D. Hence, r + s ≤ n. If r + s < n we say that the primal-dual pair of problems does not satisfy strict complementarity. Note that this can never happen in linear programming. If we define R_N = [R_P ⊕ R_D]^⊥ (R_N = ∅ if r + s = n), then R_P ⊥ R_N ⊥ R_D and R_P ⊕ R_N ⊕ R_D = R^n; i.e., we have a partition of R^n into three mutually orthogonal subspaces.

Let Q_P be any n × r matrix whose columns form an orthonormal basis for R_P. Then any solution X ∈ O_P can be written as X = Q_P U Q_P^T, U ⪰ 0, so the optimal face of (P) is given by the set of the solutions to the following system:

    A_i • Q_P U Q_P^T = b_i,   i = 1, ..., m,
    U ⪰ 0,  U ∈ S^{r×r}.                                              (1)

Indeed, for any U feasible for (1), Q_P U Q_P^T is feasible for (P), and since Q_P U Q_P^T • Z = 0 for any Z ∈ O_D (from R_P ⊥ R_D), Q_P U Q_P^T and Z satisfy complementary slackness. Therefore, Q_P U Q_P^T is an optimal solution to (P).
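Numerically, an orthonormal basis Q_P of R_P can be read off from the spectral factorization of any maximum-rank optimal solution by keeping the eigenvectors whose eigenvalues exceed a tolerance. A minimal NumPy sketch (the tolerance, the rank-r test matrix and all names are our own illustrative choices; in practice the optimal solution would come from an SDP solver):

```python
import numpy as np

def range_basis(Xbar, tol=1e-8):
    """Orthonormal basis Q_P of R(Xbar) for a symmetric positive semidefinite Xbar."""
    w, Q = np.linalg.eigh(Xbar)
    keep = w > tol * max(w.max(), 1.0)
    return Q[:, keep], np.diag(w[keep])        # Q_P and the positive eigenvalues

rng = np.random.default_rng(3)
n, r = 5, 2
G = rng.standard_normal((n, r))
Xbar = G @ G.T                                 # PSD matrix of rank r
Q_P, Lam_P = range_basis(Xbar)
U = Q_P.T @ Xbar @ Q_P
assert np.allclose(Q_P @ U @ Q_P.T, Xbar)      # Xbar = Q_P U Q_P^T with U > 0
```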

Similarly, let the columns of Q_D form an orthonormal basis for R_D. Then any optimal dual solution can be written as Z = Q_D V Q_D^T, V ⪰ 0, and the optimal face of (D) is given by the set of solutions to the system:

    Σ_{i=1}^m y_i A_i + Q_D V Q_D^T = C,
    V ⪰ 0,  V ∈ S^{s×s}.                                              (2)

Notice that the definitions of the primal and dual optimal faces are invariant with respect to the choices of Q_P and Q_D as long as their columns form orthonormal bases for the subspaces R_P and R_D, respectively. The following lemma shows that under Assumptions 2.1 and 2.2 both primal and dual optimal faces are bounded. Thus their analytic centers are well defined, which is important for the results of the next section.

Lemma 3.2 Let Assumptions 2.1 and 2.2 hold. Then the optimal sets of the primal and the dual problems are bounded.

Proof. Suppose that the set of optimal dual solutions is unbounded. That is, there exists a nonzero direction (u, V) satisfying:

    Σ_{i=1}^m u_i b_i = 0,
    Σ_{i=1}^m u_i A_i + V = 0,
    V ⪰ 0.

Multiplying the second equation by an interior feasible primal solution X_0, which exists by Assumption 2.2, we obtain

    0 = Σ_{i=1}^m u_i A_i • X_0 + V • X_0 = Σ_{i=1}^m u_i b_i + V • X_0 = V • X_0.

Therefore V = 0, since V • X_0 = trace(V X_0) = trace(V^{1/2} X_0 V^{1/2}) and X_0 ≻ 0. It then follows that Σ_i u_i A_i = 0, which by Assumption 2.1 implies that u = 0. The boundedness of the set of optimal primal solutions can be proved in a similar manner. □

4 Convergence of the Central Path

We prove in this section that the primal central path converges to the analytic center of the optimal face O_P. First we show that the limit X̄ of the central path is in the relative interior of the optimal face. Then we show that X̄ is, in fact, the analytic center of the optimal face. We then extend these results to the dual central path. In [27] it is shown that in the case of convex homogeneous self-dual cones, which includes the case of the cone of positive semidefinite matrices, the central path converges to a strictly complementary solution provided that one exists. In [14], under the assumption of strict complementarity, it is shown that the primal-dual central path of an SDP problem converges to the analytic center of the optimal face. We obtain the same results without assuming strict complementarity.

Let X̄ be the limit of the primal central path as μ → 0.

Lemma 4.1 There exists a spectral factorization X̄ = Q̄ Λ̄ Q̄^T and a sequence {μ_k} such that X(μ_k) → X̄, Q(μ_k) → Q̄ and Λ(μ_k) → Λ̄, where Q(μ_k) Λ(μ_k) Q^T(μ_k) is a spectral factorization of X(μ_k).

Proof. The proof follows trivially from the compactness of the set of the orthogonal matrices. Notice that the limit Λ̄ is uniquely defined by X̄, and the limit Q̄, generally speaking, depends on the sequence {μ_k}. □

We know that X̄ and (ȳ, Z̄) are optimal solutions to the primal and dual problems, respectively. We first want to prove that each is in the relative interior of the optimal face for its respective problem.

Lemma 4.2 X̄ ((ȳ, Z̄)) belongs to the relative interior of the primal (dual) optimal face.

Proof. Let (X(μ), y(μ), Z(μ)) be a point on the central path. Let X̃ ∈ ri O_P and (ỹ, Z̃) ∈ ri O_D. It is trivial to verify that

    [X̃ − X(μ)] • [Z̃ − Z(μ)] = 0,

and since X̃ • Z̃ = 0 and X(μ)Z(μ) = μI, we obtain

    X̃ • X(μ)^{-1} + Z̃ • Z(μ)^{-1} = n.                              (3)

Now both terms on the left side of (3) are nonnegative by Property 2.4; hence

    X̃ • X(μ)^{-1} ≤ n.                                               (4)

Consider the sequence {μ_k} as defined in Lemma 4.1, such that X(μ_k) → X̄ and the spectral factorizations Q(μ_k) Λ(μ_k) Q^T(μ_k) of X(μ_k) converge. X̄ = Q̄ Λ̄ Q̄^T and X̃ = Q̃_P Λ̃_P Q̃_P^T, Λ̃_P ≻ 0, Λ̃_P ∈ S^{r×r}. (The columns of Q̃_P are eigenvectors of X̃ that span R_P.) Let us order the columns of Q̄ and partition Q̄ into two parts [Q̄_P, Q̄_ND] so that Q̄_P has r columns and Q̄_P^T Q̃_P is nonsingular. This is always possible since Q̄^T Q̃_P has full column rank. Let us order the columns of Q(μ_k) and the columns and rows of Λ̄ and Λ(μ_k) and partition them according to the column order and partitioning of Q̄. Then X̄ = Q̄_P Λ̄_P Q̄_P^T + Q̄_ND Λ̄_ND Q̄_ND^T and X^{-1}(μ_k) = Q_P(μ_k) Λ_P^{-1}(μ_k) Q_P^T(μ_k) + Q_ND(μ_k) Λ_ND^{-1}(μ_k) Q_ND^T(μ_k), and from (4)

    Q̃_P Λ̃_P Q̃_P^T • Q_P(μ_k) Λ_P^{-1}(μ_k) Q_P^T(μ_k) + Q̃_P Λ̃_P Q̃_P^T • Q_ND(μ_k) Λ_ND^{-1}(μ_k) Q_ND^T(μ_k) ≤ n.

Since both terms in this sum are nonnegative by Property 2.4, it follows from Property 2.3 that

    Q_P^T(μ_k) Q̃_P Λ̃_P Q̃_P^T Q_P(μ_k) • Λ_P^{-1}(μ_k) = Q̃_P Λ̃_P Q̃_P^T • Q_P(μ_k) Λ_P^{-1}(μ_k) Q_P^T(μ_k) ≤ n.     (5)

Now, Λ_P(μ_k) → Λ̄_P and Q_P^T(μ_k) Q̃_P Λ̃_P Q̃_P^T Q_P(μ_k) → Q̄_P^T Q̃_P Λ̃_P Q̃_P^T Q̄_P ≡ Ū_P ∈ S^{r×r}. Since Λ̃_P ≻ 0 and Q̄_P^T Q̃_P is nonsingular, Ū_P ≻ 0. It then follows from (5) that

    Σ_{i=1}^r λ̄_i^{-1} (Ū_P)_{ii} ≤ n,

where (Ū_P)_{ii} > 0 (because Ū_P ≻ 0). Since the sum of the ratios (Ū_P)_{ii}/λ̄_i is finite, it follows that λ̄_i > 0, i = 1, ..., r. Therefore X̄ has rank r, proving that X̄ ∈ ri O_P. Similarly, it can be shown that (ȳ, Z̄) ∈ ri O_D. □

From Lemma 3.2 it follows that the analytic center of the optimal face O_P is well defined. We now show that X̄ is, in fact, this analytic center.

Theorem 4.3 Let X̄ be the limit of the primal central path as μ → 0. Then X̄ = Q_P Ū Q_P^T, where Ū depends on the choice of the orthonormal basis Q_P for R_P and is the unique solution to the problem

    max   ln det U
    s.t.  Q_P^T A_i Q_P • U = b_i,   i = 1, ..., m,                   (6)
          U ≻ 0,  U ∈ S^{r×r},

i.e., X̄ is the analytic center of the primal optimal face.

Proof. Problem (6) can be rewritten in an equivalent form:

    max   ln det U
    s.t.  Q_P^T A_i Q_P • U = b_i,   i = 1, ..., m,
          Q_P^T C Q_P • U = c(0),                                     (7)
          U ≻ 0,  U ∈ S^{r×r},

where c(0) is the optimal objective function value. The solution of this problem is unique and satisfies the following system of optimality conditions:

    U^{-1} − Σ_i y_i Q_P^T A_i Q_P = t Q_P^T C Q_P,
    Q_P^T A_i Q_P • U = b_i,   i = 1, ..., m,                         (8)
    Q_P^T C Q_P • U = c(0).

For any fixed μ > 0, let C • X(μ) = c(μ), where X(μ) is the solution of the system (CP_μ). Then X(μ) is a solution to the following problem:

    max   ln det X
    s.t.  A_i • X = b_i,   i = 1, ..., m,
          C • X = c(μ),                                               (9)
          X ≻ 0,  X ∈ S^{n×n}.

Using the notation of Lemma 4.2, X̄ = Q̄ Λ̄ Q̄^T = Q̄_P Λ̄_P Q̄_P^T + Q̄_ND Λ̄_ND Q̄_ND^T, where Q̄ = [Q̄_P Q̄_ND] is a matrix of eigenvectors of X̄, Λ̄_P is a diagonal matrix of positive eigenvalues of X̄ (r of them) and Λ̄_ND = 0. From Lemma 4.2 it follows that Q̄_P spans R_P. As in Lemma 4.2 consider the convergent sequence X(μ_k) = Q(μ_k) Λ(μ_k) Q^T(μ_k) = Q_P(μ_k) Λ_P(μ_k) Q_P^T(μ_k) + Q_ND(μ_k) Λ_ND(μ_k) Q_ND^T(μ_k), such that Q(μ_k) converges to Q̄ and Λ(μ_k) converges to Λ̄ as μ_k → 0. Since X(μ_k) = Q(μ_k) Λ(μ_k) Q^T(μ_k) and Λ(μ_k) is diagonal, requiring X in (9) to be of the form

    X = Q(μ_k) [ U  0 ; 0  V ] Q^T(μ_k) = Q_P(μ_k) U Q_P^T(μ_k) + Q_ND(μ_k) V Q_ND^T(μ_k),   U ∈ S^{r×r},  V ∈ S^{(n−r)×(n−r)},

where U ≻ 0 and V is equal to Λ_ND(μ_k), does not affect the solution of problem (9). We restrict V and not U because, as we have already shown, the sequence of the solutions X = X(μ) to (9) converges to an optimal solution where V = 0 and U is a positive definite matrix. Also, ln det X = ln det U + ln det V. Therefore from (9) and Property 2.3 we obtain the following maximization problem:

    max   ln det U
    s.t.  Q_P^T(μ_k) A_i Q_P(μ_k) • U = b_i − Q_ND^T(μ_k) A_i Q_ND(μ_k) • Λ_ND(μ_k),   i = 1, ..., m,
          Q_P^T(μ_k) C Q_P(μ_k) • U = c(μ_k) − Q_ND^T(μ_k) C Q_ND(μ_k) • Λ_ND(μ_k),     (10)
          U ≻ 0,  U ∈ S^{r×r}.

The unique optimal solution U = Λ_P(μ_k) of (10) satisfies the following system:

    U^{-1} − Σ_i y_i(μ_k) Q_P^T(μ_k) A_i Q_P(μ_k) = t(μ_k) Q_P^T(μ_k) C Q_P(μ_k),
    Q_P^T(μ_k) A_i Q_P(μ_k) • U = b_i − Q_ND^T(μ_k) A_i Q_ND(μ_k) • Λ_ND(μ_k),   i = 1, ..., m,     (11)
    Q_P^T(μ_k) C Q_P(μ_k) • U = c(μ_k) − Q_ND^T(μ_k) C Q_ND(μ_k) • Λ_ND(μ_k),
    U ≻ 0.

As μ_k → 0, Q(μ_k) → Q̄ and the system (11) converges to (8) with Q_P = Q̄_P. Since Λ_P(μ_k) is the solution to (11), the limit Λ̄_P ≻ 0 has to satisfy (8). This proves that X̄ is the analytic center of the primal optimal face. □

As in LP, problems (P) and (D) can be written in a "symmetric" form. Specifically, we can use the "conic" formulation given in Chapter 3 of [19] (see also [25]). Let L be the subspace of S^{n×n} spanned by A_i, i = 1, ..., m, and let D ∈ S^{n×n} be such that A_i • D = b_i, i = 1, ..., m. Then problems (P) and (D) can be formulated as

    (P')    min   C • X
            s.t.  X ∈ {D + L^⊥},
                  X ⪰ 0,  X ∈ S^{n×n},

and

    (D')    min   D • Z
            s.t.  Z ∈ {C + L},
                  Z ⪰ 0,  Z ∈ S^{n×n}.

Consequently, all the results in this section extend to the dual problem, and in particular, in terms of formulation (D) of the dual, we have the following:

Theorem 4.4 Let (ȳ, Z̄) be a limit point of the dual central path. Then Z̄ = Q_D W̄ Q_D^T, where W̄ depends on the choice of the orthonormal basis Q_D for R_D and is the unique solution to the problem

    max   ln det W
    s.t.  Σ_{i=1}^m y_i A_i + Q_D W Q_D^T = C,                        (12)
          W ≻ 0,  W ∈ S^{s×s},

i.e., (ȳ, Z̄) is the analytic center of the dual optimal face.
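Problems (6) and (12) are determinant-maximization problems over an affine section of the positive definite cone, so, once Q_P (or Q_D) is known, they can be solved with off-the-shelf conic modeling tools. The sketch below is a minimal illustration of (6) only, assuming CVXPY and its log_det atom are available; the function name and arguments are ours, and Q_P is assumed to have been computed as in Section 3.

```python
import cvxpy as cp

def primal_analytic_center(A_list, b, Q_P):
    """Solve problem (6): max ln det U  s.t.  (Q_P^T A_i Q_P) . U = b_i,  U positive definite."""
    r = Q_P.shape[1]
    U = cp.Variable((r, r), PSD=True)
    cons = [cp.trace((Q_P.T @ Ai @ Q_P) @ U) == bi for Ai, bi in zip(A_list, b)]
    cp.Problem(cp.Maximize(cp.log_det(U)), cons).solve()
    return Q_P @ U.value @ Q_P.T      # the analytic center Xbar = Q_P U Q_P^T
```

The dual analytic center of (12) is obtained analogously with Q_D, and the "shifted" analytic center of problem (13) in the next section differs only by the extra linear term Q_P^T T Q_P • U added to the objective.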

5 Shifted Central Paths

In this section we present a class of primal affine scaling trajectories analogous to those introduced by Bayer and Lagarias [6] and later studied by Adler and Monteiro [1]. Affine scaling vector fields associated with semidefinite programs are studied in [8] and [9]. Here we analyze the limiting behavior of affine scaling trajectories in SDP.

As far as we know there is no suitable concept of a scaled or weighted central path, defined as a trajectory of solutions of a class of minimization problems, passing through any given pair of primal and dual interior solutions. Therefore, we do not consider weighted trajectories as in [15]. However, we can consider "shifted" central paths, or primal affine scaling (PAS) trajectories, as they are called in [1], or A-trajectories, as they are called in [6]. We study the properties and convergence of these trajectories, using the same techniques that we used for the central path. In [1] it is shown that the tangent to a PAS trajectory at any given point has the same direction as the primal affine scaling step. The same is true in semidefinite programming [9].

Consider the family of problems dependent on a parameter μ > 0:

    (SP_μ)    min   C • X − μ (T • X + ln det X)
              s.t.  A_i • X = b_i,   i = 1, ..., m,
                    X ≻ 0,

where T ∈ S^{n×n} is some arbitrary fixed symmetric matrix. If problem (SP_μ) has a solution for some μ > 0, then that solution is unique and satisfies the Karush-Kuhn-Tucker necessary conditions

    (SCP_μ)    Z = μ X^{-1} + μ T,
               A_i • X = b_i,   i = 1, ..., m,
               Σ_{i=1}^m y_i A_i + Z = C,
               X, Z ≻ 0.

The trajectory of dual solutions (y(μ), Z(μ)) defined by the system (SCP_μ), parametrized by μ, is generally different from the dual shifted central path associated with T. Thus, when referring to the shifted central path, we mean the primal and dual trajectories defined by (SCP_μ).

For any given μ > 0 and T ∈ S^{n×n}, if there exists a (y_0, Z_0) ∈ R^m × S^{n×n} such that Σ_i (y_0)_i A_i + Z_0 = C − μT and Z_0 ≻ 0, then (SCP_μ) has a unique solution. Using the notation of [1], let Y(T, μ) = {y : Σ_i y_i A_i ≺ C − μT} and I(T) = {μ > 0 : Y(T, μ) ≠ ∅}. By Assumption 2.2, Y(T, 0) ≠ ∅ for any T, and it is easy to see that I(T) is an open interval, which is nonempty as long as C − μT is not spanned by the matrices A_i (i = 1, ..., m). Thus I(T) = (0, d_T) for some d_T > 0.

Lemma 5.1 If the feasible set of problem (P) is bounded, then I(T) = (0, ∞) for any T ∈ S^{n×n}; i.e., (SCP_μ) has a unique solution for all 0 < μ < ∞.

This lemma is proved in [24], and a similar, but more general, result is proved as Theorem 2.4 in [9].

We want to choose T so that the shifted central path passes through an arbitrary given primal-dual pair of interior solutions. Specifically, given (X_0, y_0, Z_0) and μ_0 > 0, if we set

    T = −X_0^{-1} + (1/μ_0) Z_0,

then it is easy to verify that μ_0 ∈ I(T) and, consequently, that (0, μ_0] ⊆ I(T). In other words, this choice guarantees the existence of a trajectory passing through the primal-dual point (X_0, y_0, Z_0) for all 0 < μ ≤ μ_0. It is shown in [1] (Proposition 2.4) that in linear programming the choice of the initial dual solution (y_0, Z_0) does not affect the PAS trajectory. The same is true in the case of the shifted central path (i.e., PAS trajectory) for a semidefinite program. The proofs are identical.
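This choice of T is easy to sanity-check numerically: with T = −X_0^{-1} + Z_0/μ_0, the given interior point satisfies the first equation of (SCP_μ) at μ = μ_0, and C − μ_0 T − Σ_i (y_0)_i A_i = μ_0 X_0^{-1} ≻ 0, so μ_0 ∈ I(T). A minimal NumPy sketch (randomly generated strictly feasible data; all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, mu0 = 4, 3, 0.5
sym = lambda M: (M + M.T) / 2

A_list = [sym(rng.standard_normal((n, n))) for _ in range(m)]
G = rng.standard_normal((n, n))
X0 = G @ G.T + np.eye(n)                 # interior primal point, X0 > 0
b = np.array([np.trace(Ai @ X0) for Ai in A_list])
y0 = rng.standard_normal(m)
Z0 = np.eye(n)                           # interior dual slack, Z0 > 0
C = sum(y0[i] * A_list[i] for i in range(m)) + Z0

# shift that makes the trajectory pass through (X0, y0, Z0) at mu = mu0
T = -np.linalg.inv(X0) + Z0 / mu0

# first equation of (SCP_mu) at mu = mu0:  Z = mu X^{-1} + mu T
assert np.allclose(Z0, mu0 * np.linalg.inv(X0) + mu0 * T)

# and C - mu0*T - sum_i (y0)_i A_i = mu0 X0^{-1} > 0, so mu0 lies in I(T)
w = np.linalg.eigvalsh(C - mu0 * T - sum(y0[i] * A_list[i] for i in range(m)))
assert w.min() > 0
```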

We are ready to discuss the limiting behavior of the shifted central path. Let (X̄, ȳ, Z̄) be a limit point of the solution (X(μ), y(μ), Z(μ)) to (SCP_μ) as μ → 0. In [9] it is shown that X̄ is an optimal solution to (P) and (ȳ, Z̄) is an optimal solution to (D). As in the case of the central path, we can show that X̄ is in the relative interior of the primal optimal face and (ȳ, Z̄) is in the relative interior of the dual optimal face. The proof is analogous to the proof of Lemma 4.2, with the exception that X(μ) • Z(μ) ≤ μN for some large number N. More importantly, it is trivial to extend the proof of Theorem 4.3 to give

Theorem 5.2 X̄ = Q_P Ū Q_P^T, where Ū is the unique solution to the problem

    max   ln det U + Q_P^T T Q_P • U
    s.t.  Q_P^T A_i Q_P • U = b_i,   i = 1, ..., m,                   (13)
          U ≻ 0,

i.e., X̄ is the "shifted" analytic center of the primal optimal face.

Just as in the case of LP [1], the dual solutions of the shifted central path converge to the analytic center (not shifted) of the dual optimal face.

Theorem 5.3 Let (ȳ, Z̄) be a limit dual point of the shifted central path. Then Z̄ = Q_D W̄ Q_D^T, where W̄ is the unique solution to the problem

    max   ln det W
    s.t.  Σ_i y_i A_i + Q_D W Q_D^T = C,                              (14)
          W ≻ 0,  W ∈ S^{s×s},

i.e., (ȳ, Z̄) is the analytic center of the dual optimal face.

Proof. First we observe that the dual solution (y(μ), Z(μ)) associated with the system of optimality conditions (SCP_μ) is the unique optimal solution to the system

    max   y^T b + μ ln det(Z − μT)
    s.t.  Σ_{i=1}^m y_i A_i + Z = C,                                  (15)
          Z ≻ μT.

Given a sequence Z(μ_k) → Z̄, we know that Z(μ_k) − μ_k T → Z̄. Let (y(μ_k), Z(μ_k)) be a sequence of dual solutions on the shifted central path which converges to (ȳ, Z̄) and for which the sequence of spectral factorizations Z(μ_k) − μ_k T = Q(μ_k) Ω(μ_k) Q(μ_k)^T converges. By a similar argument to that used in the proof of Theorem 4.3, y = y(μ_k), W = Ω_D(μ_k) is a solution to the following problem:

    max   ln det W
    s.t.  Σ_{i=1}^m y_i A_i + Q_D(μ_k) W Q_D^T(μ_k) = C − μ_k T − Q_P(μ_k) Ω_P(μ_k) Q_P^T(μ_k),     (16)
          y^T b = b(μ_k),
          W ≻ 0,  W ∈ S^{s×s},

where b(μ_k) = Σ_{i=1}^m b_i y_i(μ_k). When μ_k → 0, the solution to the above problem converges to the solution to the problem

    max   ln det W
    s.t.  Σ_i y_i A_i + Q̄_D W Q̄_D^T = C,
          y^T b = b(0),
          W ≻ 0,  W ∈ S^{s×s},

which is equivalent to Problem (14) defining the analytic center of the dual optimal face. Thus (ȳ, W̄) is the unique solution of Problem (14). □

Thus, the choice of the initial point (X_0, y_0, Z_0) affects the limit of the trajectory of the primal solutions (obviously, only if the optimal face is of dimension greater than zero), but does not affect the limit of the dual trajectory.

Remark. Notice that the dual problem (15) is in fact a shifted barrier problem for the original dual (D).

We now consider the tangent to a shifted central path at an arbitrary point on it. Our results apply to the central path as a special case (T = 0). Let (X, y, Z) (we omit the argument μ for simplicity) be on the shifted central path corresponding to a given shift T. Differentiating the system (SCP_μ) with respect to μ for any μ > 0 yields

    A_i • Ẋ = 0,   i = 1, ..., m,
    Σ_{i=1}^m ẏ_i A_i − μ X^{-1} Ẋ X^{-1} = −X^{-1} − T.             (17)

In [9] it is shown that this system of differential equations is generated by the generalized primal affine scaling vector field. In our terms, this is equivalent to the fact that we can rewrite the above system as

    Ā ξ = 0,
    Ā^T y^E + ξ = C̄.                                                 (18)

Here ξ = μ² vec(X^{-1/2} Ẋ X^{-1/2}), y^E = y − μ ẏ, Ā = A H^{1/2}, where H = X ⊗ X and H^{1/2} = X^{1/2} ⊗ X^{1/2} (i.e., the rows of Ā equal vec(X^{1/2} A_i X^{1/2})^T), and C̄ = vec(X^{1/2} C X^{1/2}). Thus, ξ can be viewed as an orthogonal projection of a scaled objective vector onto the kernel of a scaled constraint matrix, where the scaling depends only on the primal solution. The direction of the tangent is vec(Ẋ) = (1/μ²) H^{1/2} ξ, and the directions ẏ and Ż can be calculated from the dual estimates y^E and Z^E = Z − μ Ż, which, in turn, can be computed from the projection operator.

Remark. We would like to study the limiting behavior of the dual estimates y^E and Z^E = Z − μ Ż that are computed at every step of an algorithm that uses an affine scaling direction. In the next section we show that under assumptions of strict complementarity and primal and dual nondegeneracy the limit of (y^E, Z^E) equals the limit of (y(μ), Z(μ)) as μ → 0.
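In computations, the projection form of (18) gives the tangent directly: form Ā with rows vec(X^{1/2} A_i X^{1/2})^T, project C̄ onto Ker(Ā) to obtain ξ and the dual estimate y^E, and rescale to recover Ẋ. A minimal dense NumPy sketch of this computation (the function name and the least-squares projection are our illustrative choices):

```python
import numpy as np

def affine_scaling_tangent(A_list, C, X, mu):
    """Tangent direction Xdot and dual estimate yE from the projection form (18)."""
    n = X.shape[0]
    vec = lambda M: M.flatten(order="F")
    # symmetric square root X^{1/2}
    w, Q = np.linalg.eigh(X)
    Xh = Q @ np.diag(np.sqrt(w)) @ Q.T
    # scaled constraint matrix Abar (rows vec(X^{1/2} A_i X^{1/2})^T) and scaled objective Cbar
    Abar = np.vstack([vec(Xh @ Ai @ Xh) for Ai in A_list])
    Cbar = vec(Xh @ C @ Xh)
    # orthogonal projection of Cbar onto Ker(Abar):  xi = Cbar - Abar^T yE,  Abar xi = 0
    yE = np.linalg.lstsq(Abar.T, Cbar, rcond=None)[0]     # dual estimate y - mu*ydot
    xi = Cbar - Abar.T @ yE
    # xi = mu^2 vec(X^{-1/2} Xdot X^{-1/2}), so Xdot = mu^{-2} X^{1/2} mat(xi) X^{1/2}
    Xdot = Xh @ xi.reshape(n, n, order="F") @ Xh / mu**2
    return Xdot, yE
```

Note that, as observed above, only the current primal point X (and μ) enters this computation; neither the shift T nor the current dual point is needed.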

6 Derivatives of the Central and Shifted Central Paths

We begin this section by showing that, as in the case of linear programming [15], the computation of derivatives of any order of solutions on the central path or a shifted central path involves inverting a single matrix. In contrast with LP, the Schur complement of this matrix, which we must factorize, is fully dense, even if the constraint matrices are sparse. Consequently, this factorization step is very expensive in SDP and it is desirable to use as much information (e.g., higher order derivatives) as possible from it, as in the interior point LP methods proposed in [18] and [16].

Let us consider solutions X and (y, Z) on the shifted central path corresponding to a shift T. (X, y and Z depend on μ, but we omit the argument.) As shown in the previous section (see (17)), the derivatives Ẋ and ẏ in vector form satisfy

    [ μ (X^{-1} ⊗ X^{-1})   −A^T ] [ vec(Ẋ) ]   [ vec(X^{-1} + T) ]
    [ A                      0   ] [ ẏ      ] = [ 0               ].        (19)

For the second derivatives we have

    [ μ (X^{-1} ⊗ X^{-1})   −A^T ] [ vec(Ẍ) ]   [ 2 vec(μ X^{-1} Ẋ X^{-1} Ẋ X^{-1} − X^{-1} Ẋ X^{-1}) ]
    [ A                      0   ] [ ÿ      ] = [ 0                                                   ],

and it can be easily shown by induction that the k-th derivatives X^{(k)} and y^{(k)} on the central path must satisfy a system of equations of the form

    [ μ (X^{-1} ⊗ X^{-1})   −A^T ] [ vec(X^{(k)}) ]   [ vec(R) ]
    [ A                      0   ] [ y^{(k)}      ] = [ 0      ],

where R is a function of (T, X, Ẋ, Ẍ, ..., X^{(k−1)}).
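Since the coefficient matrix is the same for every order, one factorization can be reused for all derivatives. A minimal dense NumPy/SciPy sketch of this idea for the first two orders (the function name and the use of a plain LU factorization are our illustrative choices; an efficient implementation would work with the m × m Schur complement instead):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def central_path_derivatives(A_list, X, T, mu):
    """First and second derivatives (Xdot, ydot), (Xddot, yddot) from (19):
    the same coefficient matrix, factorized once, is reused for every order."""
    n, m = X.shape[0], len(A_list)
    vec = lambda M: M.flatten(order="F")
    mat = lambda v: v.reshape(n, n, order="F")
    Xinv = np.linalg.inv(X)
    A = np.vstack([vec(Ai) for Ai in A_list])             # m x n^2, rows vec(A_i)^T
    K = np.block([[mu * np.kron(Xinv, Xinv), -A.T],
                  [A, np.zeros((m, m))]])
    lu = lu_factor(K)                                     # factorize once

    # first derivatives: right-hand side vec(X^{-1} + T)
    s1 = lu_solve(lu, np.concatenate([vec(Xinv + T), np.zeros(m)]))
    Xdot, ydot = mat(s1[:n * n]), s1[n * n:]

    # second derivatives: rhs 2 vec(mu X^{-1} Xdot X^{-1} Xdot X^{-1} - X^{-1} Xdot X^{-1})
    R2 = 2 * (mu * Xinv @ Xdot @ Xinv @ Xdot @ Xinv - Xinv @ Xdot @ Xinv)
    s2 = lu_solve(lu, np.concatenate([vec(R2), np.zeros(m)]))
    Xddot, yddot = mat(s2[:n * n]), s2[n * n:]
    return (Xdot, ydot), (Xddot, yddot)
```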

We now turn to the limiting properties of the first order derivatives of the central path. As earlier, let (X̄, ȳ, Z̄) be the limit of the central path. We need the following assumption:

Assumption 6.1
i) The primal and dual solutions X̄ and (ȳ, Z̄) are strictly complementary; i.e., r + s = n.
ii) The primal solution X̄ is nondegenerate; i.e., the matrices

    B_i = [ Q̄_P^T A_i Q̄_P   Q̄_P^T A_i Q̄_D ]
          [ Q̄_D^T A_i Q̄_P   0             ],   i = 1, ..., m,

are linearly independent in S^{n×n}.
iii) The dual solution (ȳ, Z̄) is nondegenerate; i.e., the matrices

    B_i = [ Q̄_P^T A_i Q̄_P ],   i = 1, ..., m,

span S^{r×r}.

For these and other equivalent definitions of primal and dual nondegeneracy see Alizadeh, Haeberly, and Overton [3]. They also prove that under Assumption 6.1 the optimal primal-dual solution is unique. Hence, if Assumption 6.1 holds, it makes sense to say that Problems (P) and (D) are nondegenerate and strictly complementary. For the remainder of this section we shall assume that Assumption 6.1 holds.

To show that the first order derivatives of the central path converge to finite limits as μ → 0, we consider the following system which, as shown in [4], is equivalent to (CP_μ):

    (1/2)(XZ + ZX) = μ I,
    A_i • X = b_i,   i = 1, ..., m,                                   (20)
    Σ_{i=1}^m y_i A_i + Z = C,
    X, Z ⪰ 0,  X, Z ∈ S^{n×n}.

Viewing all symmetric matrices in the above system as vectors in R^{n(n+1)/2} and differentiating, we obtain the following system:

    [ Z ⊛ I   0     X ⊛ I ] [ svec(Ẋ) ]   [ svec(I) ]
    [ A       0     0     ] [ ẏ       ] = [ 0       ],                (21)
    [ 0       A^T   I     ] [ svec(Ż) ]   [ 0       ]

where svec maps a matrix in S^{n×n} into a vector in R^{n(n+1)/2} and ⊛ denotes the "symmetric" Kronecker product. See [4] for definitions and properties of svec and ⊛. Alizadeh, Haeberly, and Overton in [4] prove that under Assumption 6.1 the coefficient matrix of (21) is nonsingular at the limit μ = 0. Therefore, Ẋ, ẏ and Ż are bounded and converge as μ → 0. Geometrically this means that the central path approaches the boundary of the feasible region at a strictly positive angle; i.e., the angle between the tangent to the central path and the tangent to the boundary at the optimal solution is strictly positive. From X → X̄ and the boundedness of the derivative of X as μ → 0, we conclude that lim_{μ→0} Ẋ(μ) = Ẋ(0), where Ẋ(0) is the right derivative of X(μ) at μ = 0.

Similar arguments prove that the derivatives of the primal and dual solutions on the shifted central path are bounded and converge as μ → 0. We obtain a system similar to (21), with the same coefficient matrix and a right hand side which depends on the shift T and on X and is uniformly bounded as μ → 0.
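The svec and ⊛ operations are not spelled out here (they are taken from [4]); the sketch below uses what we believe are the standard definitions from [4] — svec scales off-diagonal entries by √2, and (P ⊛ Q) svec(K) = (1/2) svec(PKQ^T + QKP^T) — and checks, on random symmetric matrices, the identity behind the first block row of (21), namely (Z ⊛ I) svec(Ẋ) + (X ⊛ I) svec(Ż) = (1/2) svec(ẊZ + ZẊ + XŻ + ŻX). Both definitions should be treated as assumptions imported from [4], not as results of this paper.

```python
import numpy as np

def svec(S):
    """Stack the lower triangle of a symmetric matrix, off-diagonals scaled by sqrt(2)."""
    n = S.shape[0]
    scale = np.sqrt(2) * np.ones((n, n)) - (np.sqrt(2) - 1) * np.eye(n)
    return (scale * S)[np.tril_indices(n)]

def sym_kron(P, Q):
    """Matrix of the operator K -> 0.5*(P K Q^T + Q K P^T) in svec coordinates."""
    n = P.shape[0]
    cols = []
    for i, j in zip(*np.tril_indices(n)):
        B = np.zeros((n, n))
        B[i, j] = B[j, i] = 1.0 if i == j else 1.0 / np.sqrt(2)   # orthonormal basis of S^{nxn}
        cols.append(svec(0.5 * (P @ B @ Q.T + Q @ B @ P.T)))
    return np.column_stack(cols)

rng = np.random.default_rng(5)
n = 4
sym = lambda M: (M + M.T) / 2
X, Z, Xdot, Zdot = (sym(rng.standard_normal((n, n))) for _ in range(4))
I = np.eye(n)

lhs = sym_kron(Z, I) @ svec(Xdot) + sym_kron(X, I) @ svec(Zdot)
rhs = svec(0.5 * (Xdot @ Z + Z @ Xdot + X @ Zdot + Zdot @ X))
assert np.allclose(lhs, rhs)
```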

Let us consider the derivatives of the eigenvalues of X and Z on the central path. Since X = X(μ) is symmetric and is a smooth function of a single parameter, its eigenvalues can be ordered so that each eigenvalue λ_i = λ_i(μ) of X is in C^1 [11]. If λ_i is an eigenvalue of multiplicity k, then λ̇_i = (λ̇_{i1}, ..., λ̇_{ik})^T is the vector of eigenvalues of the matrix

    Q_i^T Ẋ Q_i,                                                      (22)

where Q_i is a matrix of eigenvectors of X corresponding to λ_i. If X → X̄ and Ẋ converges as μ → 0, then Q_i does not generally converge; however, the subspace spanned by the columns of Q_i converges as μ → 0, and therefore any accumulation point of the matrices (22) has the same eigenvalues. Thus the limit of λ̇_i exists. The same can be shown for the derivatives ω̇_i of the eigenvalues ω_i of the dual slack matrix Z.

Let us consider the eigenvalues of X and Z that converge to zero. The multiplicity of the zero eigenvalue of X̄ equals s. The multiplicity of the zero eigenvalue of Z̄ is r. We know that for X and Z on the central path,

    XZ = μ I,

and therefore

    ẊZ + XŻ = I.

Since X → X̄ = Q̄ Λ̄ Q̄^T, Z → Z̄ = Q̄ Ω̄ Q̄^T, and Ẋ and Ż converge as μ → 0,

    Q̄^T Ẋ Q̄ Q̄^T Z̄ Q̄ + Q̄^T X̄ Q̄ Q̄^T Ż Q̄ = Q̄^T Ẋ Q̄ Ω̄ + Λ̄ Q̄^T Ż Q̄ = I

in the limit. Considering the above system in more detail, we obtain

    [ Q̄_P^T Ẋ Q̄_P   Q̄_P^T Ẋ Q̄_D ] [ 0    0   ]   [ Λ̄_P  0 ] [ Q̄_P^T Ż Q̄_P   Q̄_P^T Ż Q̄_D ]
    [ Q̄_D^T Ẋ Q̄_P   Q̄_D^T Ẋ Q̄_D ] [ 0   Ω̄_D ] + [ 0    0 ] [ Q̄_D^T Ż Q̄_P   Q̄_D^T Ż Q̄_D ] = I,

where Λ̄_P ≻ 0 and Ω̄_D ≻ 0. From the diagonal blocks of this matrix equation, it then follows that

    Q̄_D^T Ẋ Q̄_D = Ω̄_D^{-1}   and   Q̄_P^T Ż Q̄_P = Λ̄_P^{-1}.

Therefore, we can conclude that there are orderings of the eigenvalues of X and Z for which Λ̇_D → Ω̄_D^{-1} and Ω̇_P → Λ̄_P^{-1}. This generalizes the result on the limits of the derivatives of nonbasic variables (i.e., variables converging to zero) in linear programming. Complete information on Ẋ and Ż can be obtained by solving the system (21).

We can now conclude that the dual estimates y^E = y − μ ẏ and Z^E = Z − μ Ż that appear in (18) converge to the same limits as y and Z, since ẏ and Ż are bounded as μ → 0.

Remark. To prove the convergence of the derivatives of the central path as μ → 0 we assumed primal and dual nondegeneracy and strict complementarity. These assumptions imply the uniqueness of the primal and dual solutions [3]. However, we conjecture that it is sufficient to only assume strict complementarity. Note that strict complementarity is necessary for boundedness of the derivatives, since if there is an index i such that both the primal and dual eigenvalues λ_i(μ) → λ̄_i = 0 and ω_i(μ) → ω̄_i = 0, then from λ_i(μ) ω_i(μ) = μ it follows that λ̇_i(μ) ω_i(μ) + λ_i(μ) ω̇_i(μ) = 1, and hence λ̇_i(μ) and ω̇_i(μ) cannot both be finite in the limit as μ → 0. In [14] it is shown that, assuming strict complementarity, ||X(μ) − X̄|| = O(μ) and ||Z(μ) − Z̄|| = O(μ), which implies that the derivatives of the central path are bounded as μ → 0.

Acknowledgement. The authors are grateful to Yurii Nesterov for bringing the question of the convergence of the central path in SDP to their attention, to Michael Overton for many stimulating discussions on SDP, and to Jean-Pierre Haeberly and two anonymous referees for helpful comments on earlier versions of the paper.

References

[1] I. Adler and R. D. C. Monteiro, "Limiting behavior of the affine scaling continuous trajectories for linear programming problems", Mathematical Programming, 50 (1991), pp. 29-51.

[2] F. Alizadeh, "Interior point methods in semidefinite programming with applications to combinatorial optimization", SIAM J. Optimization, 5(1) (1995), pp. 13-51.

[3] F. Alizadeh, J.-P. A. Haeberly, and M. L. Overton, "Complementarity and nondegeneracy in semidefinite programming", submitted to Mathematical Programming, 1995.

[4] F. Alizadeh, J.-P. A. Haeberly, and M. L. Overton, "Primal-dual interior-point methods for semidefinite programming: convergence rates, stability and numerical results", submitted to Mathematical Programming, 1996.

[5] G. Ph. Barker and D. Carlson, "Cones of diagonally dominant matrices", Pacific Journal of Mathematics, 57 (1975), No. 1, pp. 15-32.

[6] D. A. Bayer and J. C. Lagarias, "The nonlinear geometry of linear programming. I. Affine and projective scaling trajectories", Transactions of the AMS, 314 (1989), pp. 499-526.

[7] A. Graham, Kronecker Products and Matrix Calculus: with Applications, Ellis Horwood, London, 1981.

[8] L. Faybusovich, "On a matrix generalization of affine-scaling vector fields", SIAM Journal on Matrix Analysis and Applications, 16 (1995), No. 3, pp. 886-897.

[9] L. Faybusovich, "A Hamiltonian structure of generalized affine scaling vector fields", Journal of Nonlinear Science, 5 (1995), pp. 11-28.

[10] C. Helmberg, F. Rendl, R. Vanderbei and H. Wolkowicz, "An interior-point method for semidefinite programming", Manuscript, Program in Statistics and Operations Research, Princeton Univ., 1994, to appear in SIAM Journal on Optimization.

[11] T. Kato, A Short Introduction to Perturbation Theory for Linear Operators, Springer-Verlag, New York, 1982.

[12] M. Kojima, M. Shida and S. Shindoh, "Local convergence of predictor-corrector infeasible-interior-point algorithms for SDPs and SDLCPs", Research Report on Mathematical and Computer Sciences, Series B (1995), Dept. of Mathematical and Computer Science, Tokyo Institute of Technology, Tokyo, Japan.

[13] M. Kojima, S. Shindoh, and S. Hara, "Interior-point methods for monotone semidefinite linear complementarity problems in symmetric matrices", Research Report, Tokyo Institute of Technology, Tokyo, Japan. To appear in SIAM J. Optimization.

[14] Z.-Q. Luo, J. Sturm, and S. Zhang, "Superlinear convergence of a symmetric primal-dual path following algorithm for semidefinite programming", manuscript (1996), Department of Electrical and Computer Engineering, McMaster University, Ontario, Canada, and Econometric Institute, Erasmus University, Rotterdam, The Netherlands.

[15] N. Megiddo, "Pathways to the optimal set in linear programming", in Progress in Mathematical Programming: Interior-Point Algorithms and Related Methods, N. Megiddo, ed., Springer, Berlin, 1989, pp. 131-158.

[16] S. Mehrotra, "Higher order methods and their performance", Technical Report 90-16R1 (1991), Dept. of IE and MS, Northwestern Univ., Evanston, IL.

[17] R. D. C. Monteiro, "Primal-dual path following algorithms for semidefinite programming", working paper.

[18] R. D. C. Monteiro, I. Adler, and M. G. C. Resende, "A polynomial-time primal-dual affine scaling algorithm for linear and convex quadratic programming and its power series extension", Mathematics of OR, 15, pp. 191-214.

[19] Yu. E. Nesterov and A. S. Nemirovskii, Interior Point Methods in Convex Programming: Theory and Applications, Society for Industrial and Applied Mathematics, Philadelphia, 1994.

[20] Yu. E. Nesterov and M. J. Todd, "Self-scaled cones and interior-point methods in nonlinear programming", CORE Discussion Paper 9462 (1994), CORE, Louvain-la-Neuve, Belgium.

[21] Yu. E. Nesterov and M. J. Todd, "Primal-dual interior point algorithms for self-scaled cones", Technical Report 1125 (1995), School of Operations Research and Industrial Engineering, Cornell University, Ithaca, New York.

[22] G. Pataki, "On the facial structure of cone-LP's and semi-definite programs", Research Report #MSRR-595 (1994), Graduate School of Industrial Administration, Carnegie Mellon Univ., Pittsburgh, PA.

[23] M. Ramana, L. Tuncel and H. Wolkowicz, "Strong duality for semidefinite programming", Technical Report CORR 95-12 (1995), Dept. of Comb. and Opt., Univ. of Waterloo, Waterloo, Ontario, Canada.

[24] K. Scheinberg, "Issues in interior point methods in semidefinite and linear programming", PhD Thesis, Columbia University, New York, 1996.

[25] L. Vandenberghe and S. Boyd, "A primal-dual potential reduction method for problems involving matrix inequalities", Mathematical Programming, 69 (1995), No. 1, pp. 205-236.

[26] L. Vandenberghe and S. Boyd, "Positive-definite programming", Mathematical Programming: State of the Art, J. R. Birge and K. G. Murty, eds., (1994), pp. 276-308, Univ. of Michigan, Ann Arbor, MI.

[27] Y. Ye, "Convergence behavior on central paths for convex homogeneous self-dual cones", a note.

[28] Y. Zhang, "On extending primal-dual interior-point algorithms from linear programming to semidefinite programming", Technical Report TR95-20 (1995), Department of Math. and Stat., UMBC, Baltimore, Maryland.
