Math Meth Oper Res (2012) 75:165–183
DOI 10.1007/s00186-012-0379-4
ORIGINAL PAPER

A new second-order corrector interior-point algorithm for semidefinite programming

Changhe Liu · Hongwei Liu
Received: 26 April 2011 / Accepted: 6 January 2012 / Published online: 20 January 2012 © Springer-Verlag 2012
Abstract In this paper, we propose a second-order corrector interior-point algorithm for semidefinite programming (SDP). This algorithm is based on the wide neighborhood. The complexity bound is $O(\sqrt{n}L)$ for the Nesterov-Todd direction, which coincides with the best known complexity results for SDP. To the best of our knowledge, this is the first wide neighborhood second-order corrector algorithm with the same complexity as small neighborhood interior-point methods for SDP. Some numerical results are provided as well.

Keywords Semidefinite programming · Interior-point methods · Second-order methods · Polynomial complexity

1 Introduction

The semidefinite programming (SDP) problem is a generalization of the linear programming (LP) problem. It has received considerable attention and has been one of the most active research areas in mathematical programming. SDP has been applied in many areas, such as combinatorial optimization (see e.g., Alizadeh 1991, 1995) and system and control theory (see e.g., Boyd et al. 1994). Several authors have discussed
C. Liu · H. Liu
Department of Mathematics, Xidian University, Xi'an, People's Republic of China
H. Liu
e-mail: [email protected]

C. Liu (B)
Department of Applied Mathematics, Henan University of Science and Technology, Luoyang, People's Republic of China
e-mail: [email protected]
generalizations of interior-point methods (IPMs) for LP to the context of SDP. The first IPMs for SDP were independently developed by Alizadeh (1991) and Nesterov and Nemirovsky (1994). Alizadeh (1991) extended Ye's projective potential reduction algorithm (Ye 1990) from LP to SDP. Nesterov and Nemirovsky (1994) presented a deep and unified theory of IPMs for solving the more general conic optimization problems using the notion of self-concordant barriers. For further information on IPMs for solving SDP problems we refer, e.g., to Alizadeh et al. (1998), Helmberg et al. (1996), Kojima et al. (1997), Monteiro (1997, 1998, 2003), Monteiro and Zhang (1998), Nesterov and Todd (1997, 1998).

Most IPMs are primal-dual path-following methods, in which the iterate is confined to a neighborhood of the central path. Two popular neighborhoods used in IPMs are the so-called small neighborhood and the negative infinity neighborhood (wide neighborhood). In theory, the worst-case iteration bound proved for wide neighborhood methods is worse than that proved for small neighborhood methods. However, wide neighborhood IPMs perform better in practice than small neighborhood IPMs. To bridge this gap, Peng et al. (2002) proposed a new paradigm based on the class of so-called self-regular functions, under which wider neighborhood IPMs can come arbitrarily close to the best known iteration bound of small neighborhood IPMs. Later, Ai (2004) and Ai and Zhang (2005) proposed a new class of wider neighborhoods. Using these new neighborhoods, they presented two path-following algorithms attaining the $O(\sqrt{n}L)$ iteration-complexity bound, which is the best complexity bound known for IPMs. Recently, Li and Terlaky (2010) extended the Ai-Zhang technique of Ai and Zhang (2005) to SDP. Feng and Fang (2010) extended the algorithm for LP in Ai (2004) to SDP.

In LP, the most computationally successful IPMs have been primal-dual methods using Mehrotra's predictor-corrector (MPC) steps (Mehrotra 1992). In spite of the extensive use of this method in IPM-based optimization packages, it has not been matched by equally strong theoretical guarantees of good performance. In fact, it is acknowledged among practitioners that there are examples on which the MPC algorithm fails to converge (see Nocedal and Wright 1999, p. 407). Cartis (2009) also provided an example showing that MPC algorithms may fail to converge to an optimal solution. Simple safeguards can be incorporated into the method to force it into the convergence framework of existing methods (see Salahi and Mahdavi-Amiri 2006; Salahi et al. 2007). Most recently, based on Ai and Zhang (2005), Liu et al. (2010) proposed a primal-dual second-order corrector IPM for the LP problem. The main difference between the methods of Ai and Zhang (2005) and Liu et al. (2010) is that, at each iteration, the latter computes a corrector direction in addition to the Ai-Zhang direction. The computational results reported in Liu et al. (2010) show that this improves performance.

Inspired by the works of Liu et al. (2010), Li and Terlaky (2010) and Feng and Fang (2010), we present a second-order corrector algorithm for SDP. We obtain the same complexity bound as Li and Terlaky (2010) and Feng and Fang (2010). The numerical results show that the new algorithm has superior practical performance.

This paper is organized as follows. In Sect. 2, we first introduce the SDP problem and discuss the symmetrization scheme. Then we propose a primal-dual second-order corrector algorithm. In Sect. 3, we state and prove some technical results and
then, based on these results, establish the iteration-complexity bound of the proposed algorithm. Results of numerical experiments are given in Sect. 4. Finally, some conclusions and final remarks are given in Sect. 5.

The following notation is used throughout the paper. $R^n$ denotes the $n$-dimensional Euclidean space. The Euclidean norm of $x \in R^n$ is denoted by $\|x\|$. The set of all $m \times n$ matrices with real entries is denoted by $R^{m\times n}$. $A^T$ denotes the transpose of $A \in R^{m\times n}$. The set of all symmetric $n \times n$ matrices is denoted by $S^n$. For $M \in S^n$, we write $M \succ 0$ ($M \succeq 0$) if $M$ is positive definite (positive semidefinite). $S^n_{++}$ ($S^n_+$) denotes the set of all matrices in $S^n$ which are positive definite (positive semidefinite). For a matrix $M$ with all real eigenvalues, we denote its eigenvalues by $\lambda_i(M)$, $i = 1, 2, \ldots, n$, and its smallest and largest eigenvalues by $\lambda_{\min}(M)$ and $\lambda_{\max}(M)$, respectively. The spectral radius of $M$ is denoted by $\rho(M) := \max\{|\lambda_i(M)| : i = 1, 2, \ldots, n\}$. Given $G$ and $H$ in $R^{m\times n}$, the inner product between them in the vector space $R^{mn}$ is defined as $G \bullet H := \mathrm{Tr}(G^T H)$, the trace of the matrix $G^T H$. For $M \in R^{n\times n}$, $\|M\|_F$ and $\|M\|$ denote the Frobenius norm and the matrix 2-norm, respectively. For $A \in R^{m\times n}$, $\mathrm{vec}\,A$ denotes the $mn$-vector obtained by stacking the columns of $A$ one by one from the first to the last column. The Kronecker product of two matrices $A$ and $B$ is denoted by $A \otimes B$ (see Horn and Johnson 1991 for the details of the Kronecker product). Finally, for $X \in R^{n\times n}$, we let $\mathrm{diag}(X)$ denote the vector in $R^n$ consisting of the diagonal elements of $X$. Analogously, for a vector $x \in R^n$, we let $\mathrm{Diag}(x)$ denote the diagonal matrix in $R^{n\times n}$ whose diagonal elements are those of $x$.

2 SDP problem and preliminary discussions

In this paper, we consider the SDP problem

(P) $\quad \min\; C \bullet X, \quad \text{s.t.}\; A_i \bullet X = b_i,\; i = 1, 2, \ldots, m, \quad X \succeq 0,$  (1)
where $C, X \in S^n$, $b \in R^m$, and $A_i \in S^n$, $i = 1, 2, \ldots, m$, are linearly independent. The corresponding dual problem is

(D) $\quad \max\; b^T y, \quad \text{s.t.}\; \sum_{i=1}^m y_i A_i + S = C, \quad S \succeq 0,$  (2)
where $y \in R^m$ and $S \in S^n$. The set of primal-dual feasible interior points is denoted by

$\mathcal{F}^0 := \left\{ (X, y, S) \in S^n_{++} \times R^m \times S^n_{++} :\; A_i \bullet X = b_i,\; i = 1, 2, \ldots, m,\; \sum_{i=1}^m y_i A_i + S = C \right\}.$

It is well known (see de Klerk 2002, p. 33) that under the assumptions that $\mathcal{F}^0$ is nonempty and the matrices $A_i$ ($i = 1, 2, \ldots, m$) are linearly independent, the set of primal and dual optimal solutions consists of all the solutions to the following optimality system
$A_i \bullet X = b_i,\; i = 1, 2, \ldots, m, \quad X \succeq 0,$
$\sum_{i=1}^m y_i A_i + S = C, \quad S \succeq 0,$  (3)
$X S = 0.$
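To make the primal-dual pair (1)-(2) concrete, the following sketch (our illustration, not part of the paper) sets up a small random instance of (P) in Python with the modeling package cvxpy. The data construction ($A_i$ symmetric, $b_i = \mathrm{Tr}(A_i)$, $C = \sum_i A_i + I$, so that $X = I$ is strictly feasible) anticipates the random SDP test problems of Sect. 4; the instance data are hypothetical.

import numpy as np
import cvxpy as cp

# Small random instance of (1): A_i symmetric, b_i = Tr(A_i),
# C = sum_i A_i + I, so that X = I is a strictly feasible primal point
# (the construction used for the random SDPs in Sect. 4).
rng = np.random.default_rng(0)
n, m = 4, 3
A = [(R + R.T) / 2 for R in rng.standard_normal((m, n, n))]
b = np.array([np.trace(Ai) for Ai in A])
C = sum(A) + np.eye(n)

X = cp.Variable((n, n), PSD=True)            # X >= 0 (positive semidefinite)
prob = cp.Problem(cp.Minimize(cp.trace(C @ X)),
                  [cp.trace(A[i] @ X) == b[i] for i in range(m)])
prob.solve()                                 # needs an SDP-capable solver, e.g. SCS
print(prob.value, np.trace(C))               # optimal value vs. objective at X = I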
The central path consists of points $(X^\mu, y^\mu, S^\mu)$ satisfying the perturbed system

$A_i \bullet X = b_i,\; i = 1, 2, \ldots, m, \quad X \succeq 0,$
$\sum_{i=1}^m y_i A_i + S = C, \quad S \succeq 0,$  (4)
$X S = \mu I,$
where $\mu \in R$, $\mu > 0$. Nesterov and Nemirovsky (1994) proved that there is a unique solution $(X^\mu, y^\mu, S^\mu)$ to the central path equations (4) for any barrier parameter $\mu > 0$, under the assumption that $\mathcal{F}^0$ is nonempty and the $A_i$ ($i = 1, 2, \ldots, m$) are linearly independent. Moreover, the limit of $(X^\mu, y^\mu, S^\mu)$ as $\mu$ goes to 0 is a primal-dual optimal solution of the corresponding SDP problem.

Since for $X, S \in S^n$ the product $XS$ is generally not in $S^n$, the left-hand side of (4) is a map from $S^n \times R^m \times S^n$ to $R^{n\times n} \times R^m \times S^n$. Thus, the system (4) is not a square system when $X$ and $S$ are restricted to $S^n$, which is needed for applying Newton-like methods. A remedy is to make the system (4) square by modifying the left-hand side to a map from $S^n \times R^m \times S^n$ to itself. To this end, we use the so-called similar symmetrization operator $H_P : R^{n\times n} \to S^n$ introduced by Zhang (1998), defined as

$H_P(M) = \frac{1}{2}\left[ P M P^{-1} + (P M P^{-1})^T \right], \quad \forall M \in R^{n\times n},$
where $P \in R^{n\times n}$ is some nonsingular matrix. Zhang (1998) also observed that if $P$ is invertible and $M$ is similar to a (symmetric) positive definite matrix, then

$H_P(M) = \mu I \iff M = \mu I.$

Thus, for any given nonsingular matrix $P$, system (4) is equivalent to

$A_i \bullet X = b_i,\; i = 1, 2, \ldots, m, \quad X \succeq 0,$
$\sum_{i=1}^m y_i A_i + S = C, \quad S \succeq 0,$  (5)
$H_P(XS) = \mu I.$
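As an aside, the operator $H_P$ is a one-line computation. The sketch below (ours, not the authors' code) implements it and checks two properties that are used later: $H_P(M)$ is symmetric, and $\mathrm{Tr}(H_P(M)) = \mathrm{Tr}(M)$, since both the similarity transformation and the symmetrization preserve the trace; the latter fact is what makes the trace manipulations leading to (24) work.

import numpy as np

def H(M, P):
    """Zhang's symmetrization H_P(M) = (P M P^{-1} + (P M P^{-1})^T) / 2."""
    PMPinv = P @ M @ np.linalg.inv(P)
    return (PMPinv + PMPinv.T) / 2

# Sanity checks on random data (illustration only).
rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n))
P = rng.standard_normal((n, n)) + n * np.eye(n)   # generically nonsingular

HM = H(M, P)
assert np.allclose(HM, HM.T)                      # H_P(M) is symmetric
assert np.isclose(np.trace(HM), np.trace(M))      # similarity + symmetrization keep the trace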
Then a Newton-like method applied to system (5) leads to the following linear system:
$A_i \bullet \Delta X = 0,\; i = 1, 2, \ldots, m,$
$\sum_{i=1}^m \Delta y_i A_i + \Delta S = 0,$  (6)
$H_P(X \Delta S + \Delta X S) = \tau\mu I - H_P(XS),$

where $(\Delta X, \Delta y, \Delta S) \in S^n \times R^m \times S^n$ is the search direction, $\tau \in (0, 1)$ is the centering parameter and $\mu = X \bullet S/n$ is the normalized duality gap corresponding to $(X, y, S)$. The search directions obtained from the above system form the Monteiro-Zhang (MZ) family. Todd et al. (1998) proved that system (6) has a unique solution for any $(X, y, S) \in S^n_{++} \times R^m \times S^n_{++}$ and for any scaling matrix $P$ satisfying $P X S P^{-1} \in S^n$. As was shown in Nesterov and Todd (1997, 1998) and Todd et al. (1998), the choice $P = W^{1/2}$ in (6), where

$W = S^{1/2}(S^{1/2} X S^{1/2})^{-1/2} S^{1/2} = X^{-1/2}(X^{1/2} S X^{1/2})^{1/2} X^{-1/2},$  (7)
leads to the Nesterov-Todd (NT) direction. For the NT scaling $P = W^{1/2}$ we have $P X P = P^{-1} S P^{-1}$ and hence $P X S P^{-1} \in S^n$. In this article, we derive the NT direction from the system (6) and, based on it, state our second-order corrector algorithm.

We note that the third equation of system (6) can also be written as

$H(\hat{X}\Delta\hat{S} + \Delta\hat{X}\hat{S}) = \tau\mu I - H(\hat{X}\hat{S}),$  (8)

where $H := H_I$ is the plain symmetrization operator and

$\hat{X} = P X P, \quad \Delta\hat{X} = P \Delta X P, \quad \hat{S} = P^{-1} S P^{-1}, \quad \Delta\hat{S} = P^{-1} \Delta S P^{-1}.$  (9)

Moreover, in terms of the Kronecker product, Eq. (8) can be expressed as

$\hat{E}\,\mathrm{vec}\,\Delta\hat{X} + \hat{F}\,\mathrm{vec}\,\Delta\hat{S} = \mathrm{vec}(\tau\mu I - H(\hat{X}\hat{S})),$  (10)

where

$\hat{E} = \frac{1}{2}(\hat{S} \otimes I + I \otimes \hat{S}), \quad \hat{F} = \frac{1}{2}(\hat{X} \otimes I + I \otimes \hat{X}).$  (11)

We observe that for the NT scaling we have

$\hat{X} = \hat{S}, \quad \hat{E} = \hat{F}.$  (12)
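The NT scaling (7) and the identities (12) are easy to check numerically. The sketch below (ours; hypothetical random data) computes $W$ by the second formula in (7) using SciPy matrix square roots, forms $P = W^{1/2}$ and the hatted quantities of (9), and verifies $\hat{X} = \hat{S}$ and $\hat{E} = \hat{F}$.

import numpy as np
from scipy.linalg import sqrtm

def nt_scaling(X, S):
    """NT scaling of (7): W = X^{-1/2} (X^{1/2} S X^{1/2})^{1/2} X^{-1/2}, P = W^{1/2}."""
    Xh = np.real(sqrtm(X))
    Xh_inv = np.linalg.inv(Xh)
    W = Xh_inv @ np.real(sqrtm(Xh @ S @ Xh)) @ Xh_inv
    return W, np.real(sqrtm(W))

# Illustration on a random positive definite pair.
rng = np.random.default_rng(2)
n = 4
R1, R2 = rng.standard_normal((2, n, n))
X = R1 @ R1.T + np.eye(n)
S = R2 @ R2.T + np.eye(n)

W, P = nt_scaling(X, S)
Pinv = np.linalg.inv(P)
Xhat, Shat = P @ X @ P, Pinv @ S @ Pinv
assert np.allclose(Xhat, Shat)                   # (12): X-hat = S-hat for NT scaling

I = np.eye(n)
E = (np.kron(Shat, I) + np.kron(I, Shat)) / 2    # E-hat of (11)
F = (np.kron(Xhat, I) + np.kron(I, Xhat)) / 2    # F-hat of (11)
assert np.allclose(E, F)                         # (12): E-hat = F-hat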
In what follows we describe a second-order corrector algorithm. For convenience, we first introduce some notation. Let $M \in S^n$ have the eigenvalue decomposition $M = Q D Q^T = \sum_{i=1}^n \lambda_i q_i q_i^T$, where $D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $M$, and $Q$ is an orthonormal matrix. Then the positive part and the negative part of $M$ are defined as

$M^+ = \sum_{\lambda_i > 0} \lambda_i q_i q_i^T \quad\text{and}\quad M^- = \sum_{\lambda_i < 0} \lambda_i q_i q_i^T.$

Algorithm 1
Input: an accuracy parameter $\varepsilon > 0$, neighborhood parameters $0 < \tau_2 < \tau_1 < 1$, a centering parameter $0 \le \tau \le 1$ and an initial point $(X^0, y^0, S^0) \in \mathcal{N}(\tau_1, \tau_2)$. Set $k := 0$.
Step 1 If $X^k \bullet S^k \le \varepsilon$, then stop.
Step 2 Compute the NT scaling $P^k = (W^k)^{1/2}$ by (7).
Step 3 Compute the directions $(\Delta X_-^k, \Delta y_-^k, \Delta S_-^k)$ by (13) and $(\Delta X_+^k, \Delta y_+^k, \Delta S_+^k)$ by (14).
Step 4 Compute the directions $(\Delta X_-^{c,k}, \Delta y_-^{c,k}, \Delta S_-^{c,k})$ by system (15). Find a step length vector $\alpha^k = (\alpha_1^k, \alpha_2^k)$ giving a sufficient reduction of the duality gap and assuring $(X(\alpha^k), y(\alpha^k), S(\alpha^k)) \in \mathcal{N}(\tau_1, \tau_2)$.
Step 5 Let $(X^{k+1}, y^{k+1}, S^{k+1}) := (X(\alpha^k), y(\alpha^k), S(\alpha^k))$. Set $k := k + 1$ and go to Step 1.

3 Complexity analysis of the algorithm

In this section, we prove that Algorithm 1 has an iteration-complexity bound of $O(\sqrt{n}\log(X^0 \bullet S^0/\varepsilon))$ for the NT direction. Throughout this section we choose to set $\tau = \tau_1$. First, we give some technical lemmas that will be used frequently in the analysis. From now on, we use $\Lambda$ to denote the diagonal matrix $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, where $\lambda_i$, $i = 1, 2, \ldots, n$, are the eigenvalues of $\hat{X}\hat{S}$ in increasing order $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_n$. We also use $U$ to denote the orthogonal matrix such that $U^T \hat{X}\hat{S} U = \Lambda$. We emphasize that the matrices $\hat{X}\hat{S}$, $\hat{S}\hat{X}$, $XS$, $SX$, $X^{1/2} S X^{1/2}$ and $S^{1/2} X S^{1/2}$ have the same eigenvalues, since they are all similar to each other.

The third equations of systems (13), (14) and (15) can be rewritten as

$H(\hat{X}\Delta\hat{S}_- + \Delta\hat{X}_-\hat{S}) = [\tau\mu I - \hat{X}\hat{S}]^-,$  (17)
$H(\hat{X}\Delta\hat{S}_+ + \Delta\hat{X}_+\hat{S}) = [\tau\mu I - \hat{X}\hat{S}]^+,$  (18)
and

$H(\hat{X}\Delta\hat{S}_-^c + \Delta\hat{X}_-^c\hat{S}) = -H(\Delta\hat{X}_-\Delta\hat{S}_-),$  (19)

respectively, where

$\Delta\hat{X}_- = P\Delta X_- P, \quad \Delta\hat{y}_- = \Delta y_-, \quad \Delta\hat{S}_- = P^{-1}\Delta S_- P^{-1},$
$\Delta\hat{X}_+ = P\Delta X_+ P, \quad \Delta\hat{y}_+ = \Delta y_+, \quad \Delta\hat{S}_+ = P^{-1}\Delta S_+ P^{-1},$
$\Delta\hat{X}_-^c = P\Delta X_-^c P, \quad \Delta\hat{y}_-^c = \Delta y_-^c, \quad \Delta\hat{S}_-^c = P^{-1}\Delta S_-^c P^{-1}.$

Denote

$(\Delta\hat{X}(\alpha), \Delta\hat{y}(\alpha), \Delta\hat{S}(\alpha)) := \alpha_1(\Delta\hat{X}_-, \Delta\hat{y}_-, \Delta\hat{S}_-) + \alpha_2(\Delta\hat{X}_+, \Delta\hat{y}_+, \Delta\hat{S}_+) + \alpha_1^2(\Delta\hat{X}_-^c, \Delta\hat{y}_-^c, \Delta\hat{S}_-^c).$  (20)

It is easy to see that if the current iterate is feasible, then $\Delta\hat{X}_- \bullet \Delta\hat{S}_- = 0$, $\Delta\hat{X}_+ \bullet \Delta\hat{S}_+ = 0$, $\Delta\hat{X}_-^c \bullet \Delta\hat{S}_-^c = 0$ and $\Delta\hat{X}(\alpha) \bullet \Delta\hat{S}(\alpha) = 0$.

In terms of the Kronecker product, Eqs. (17), (18) and (19) become

$\hat{E}\,\mathrm{vec}\,\Delta\hat{X}_- + \hat{F}\,\mathrm{vec}\,\Delta\hat{S}_- = \mathrm{vec}[\tau\mu I - \hat{X}\hat{S}]^-,$  (21)
$\hat{E}\,\mathrm{vec}\,\Delta\hat{X}_+ + \hat{F}\,\mathrm{vec}\,\Delta\hat{S}_+ = \mathrm{vec}[\tau\mu I - \hat{X}\hat{S}]^+,$  (22)

and

$\hat{E}\,\mathrm{vec}\,\Delta\hat{X}_-^c + \hat{F}\,\mathrm{vec}\,\Delta\hat{S}_-^c = -\mathrm{vec}\,H(\Delta\hat{X}_-\Delta\hat{S}_-).$  (23)
Direct calculation yields

$X(\alpha) \bullet S(\alpha) = \mathrm{Tr}(X(\alpha)S(\alpha)) = \mathrm{Tr}(H_P(X(\alpha)S(\alpha)))$
$= \mathrm{Tr}(H_P(XS)) + \alpha_1 \mathrm{Tr}([\tau_1\mu I - H_P(XS)]^-) + \alpha_2 \mathrm{Tr}([\tau_1\mu I - H_P(XS)]^+)$
$= \mathrm{Tr}(\hat{X}\hat{S}) + \alpha_1 \mathrm{Tr}([\tau_1\mu I - \hat{X}\hat{S}]^-) + \alpha_2 \mathrm{Tr}([\tau_1\mu I - \hat{X}\hat{S}]^+).$

Hence

$\mu(\alpha) := X(\alpha) \bullet S(\alpha)/n = \mu + \alpha_1 \mathrm{Tr}([\tau_1\mu I - \hat{X}\hat{S}]^-)/n + \alpha_2 \mathrm{Tr}([\tau_1\mu I - \hat{X}\hat{S}]^+)/n.$  (24)
We note the following simple but useful results (see Feng and Fang 2010, Lemma 3.1).

Lemma 3 Let $(X, y, S) \in \mathcal{F}^0$. Then we have
(i) $\mathrm{Tr}([\tau_1\mu I - X^{1/2} S X^{1/2}]^-) \le -(1 - \tau_1)\, X \bullet S$;
(ii) $\mathrm{Tr}([\tau_1\mu I - X^{1/2} S X^{1/2}]^+) \le \sqrt{n}\,\|[\tau_1\mu I - X^{1/2} S X^{1/2}]^+\|_F$.
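The positive and negative parts $M^+$ and $M^-$ introduced above are the basic computational primitive of the method; both come from a single symmetric eigendecomposition. The sketch below (ours) implements them and checks the two bounds of Lemma 3; note that the bounds depend only on $X, S \succ 0$ and $\mu = X \bullet S/n$, so random positive definite data suffice for the check.

import numpy as np
from scipy.linalg import sqrtm

def pos_neg_parts(M):
    """Positive/negative parts of a symmetric M:
    M+ = sum_{lam_i > 0} lam_i q_i q_i^T,  M- = sum_{lam_i < 0} lam_i q_i q_i^T."""
    lam, Q = np.linalg.eigh(M)
    Mp = (Q * np.maximum(lam, 0)) @ Q.T   # Q diag(max(lam,0)) Q^T
    Mn = (Q * np.minimum(lam, 0)) @ Q.T   # Q diag(min(lam,0)) Q^T
    return Mp, Mn

# Check Lemma 3 on random positive definite data (illustration only).
rng = np.random.default_rng(3)
n, tau1 = 5, 0.2
R1, R2 = rng.standard_normal((2, n, n))
X, S = R1 @ R1.T + np.eye(n), R2 @ R2.T + np.eye(n)
mu = np.trace(X @ S) / n

Xh = np.real(sqrtm(X))
M = tau1 * mu * np.eye(n) - Xh @ S @ Xh
Mp, Mn = pos_neg_parts(M)
assert np.trace(Mn) <= -(1 - tau1) * np.trace(X @ S) + 1e-9           # Lemma 3(i)
assert np.trace(Mp) <= np.sqrt(n) * np.linalg.norm(Mp, 'fro') + 1e-9  # Lemma 3(ii)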
The difficulty of analyzing second-order algorithms arises from the second-order term in the corrector step, which is one of the main differences between this algorithm and other interior-point algorithms. To overcome this difficulty, the following lemma is an important technical result that relates this term to the matrix $\Lambda$.

Lemma 4 Let $(X, y, S) \in \mathcal{F}^0$. Then we have

$H(\Delta\hat{X}_-\Delta\hat{S}_-) = \frac{1}{4}\left( U([\tau_1\mu I - \Lambda]^-)^2 \Lambda^{-1} U^T - (\Delta\hat{X}_- - \Delta\hat{S}_-)^2 \right).$

Proof By $\hat{X} = \hat{S} =: V$, Eq. (17) becomes

$\frac{1}{2}\left( (\Delta\hat{X}_- + \Delta\hat{S}_-)V + V(\Delta\hat{X}_- + \Delta\hat{S}_-) \right) = [\tau\mu I - V^2]^-.$  (25)
Since $V \succ 0$, Eq. (25) (a Lyapunov equation) has a unique symmetric solution (see de Klerk 2002, Theorem E.2). Further, by $V^2 = \hat{X}\hat{S} = U\Lambda U^T$ and $U^T U = I$ one has $[\tau\mu I - V^2]^- = U[\tau\mu I - \Lambda]^- U^T$ and $V^{-1} = U\Lambda^{-1/2} U^T$, and therefore

$\Delta\hat{X}_- + \Delta\hat{S}_- = [\tau\mu I - V^2]^- V^{-1} = U[\tau\mu I - \Lambda]^- \Lambda^{-1/2} U^T$  (26)

gives the unique symmetric solution to Eq. (25). On the other hand, it is trivial to verify that

$H(\Delta\hat{X}_-\Delta\hat{S}_-) = \frac{1}{4}\left( (\Delta\hat{X}_- + \Delta\hat{S}_-)^2 - (\Delta\hat{X}_- - \Delta\hat{S}_-)^2 \right).$  (27)

Hence

$H(\Delta\hat{X}_-\Delta\hat{S}_-) = \frac{1}{4}\left( U([\tau_1\mu I - \Lambda]^-)^2 \Lambda^{-1} U^T - (\Delta\hat{X}_- - \Delta\hat{S}_-)^2 \right),$

which gives the required result.

The following lemma gives an upper bound for the second-order term.

Lemma 5 Let $(X, y, S) \in \mathcal{F}^0$. Then we have $\|H(\Delta\hat{X}_-\Delta\hat{S}_-)\|_F \le 2^{-3/2} n\mu$.
Proof Since $\Delta\hat{X}_- \bullet \Delta\hat{S}_- = 0$, the matrices $D := \Delta\hat{X}_- + \Delta\hat{S}_-$ and $Q := \Delta\hat{X}_- - \Delta\hat{S}_-$ have the same Frobenius norm. Consequently, by (27) we have

$\|H(\Delta\hat{X}_-\Delta\hat{S}_-)\|_F^2 = \|D^2 - Q^2\|_F^2/16 = \mathrm{Tr}(D^4 + Q^4 - D^2 Q^2 - Q^2 D^2)/16$
$\le (\|D^2\|_F^2 + \|Q^2\|_F^2)/16 \le (\|D\|_F^4 + \|Q\|_F^4)/16 = \|D\|_F^4/8.$
By (26), we have

$\|H(\Delta\hat{X}_-\Delta\hat{S}_-)\|_F \le 2^{-3/2}\|D\|_F^2 = 2^{-3/2}\,\|[\tau\mu I - \Lambda]^-\Lambda^{-1/2}\|_F^2 \le 2^{-3/2}\|\Lambda^{1/2}\|_F^2 = 2^{-3/2} n\mu,$
which completes the proof.
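The closed-form solution (26) is easy to validate numerically: with $V = U\Lambda^{1/2}U^T$, the matrix $D = U[\tau\mu I - \Lambda]^-\Lambda^{-1/2}U^T$ commutes with $V$ and satisfies the Lyapunov equation (25) exactly. A small self-contained check (ours, on synthetic eigendata):

import numpy as np

# Validate the closed-form solution (26) of the Lyapunov equation (25)
# on synthetic data: V symmetric positive definite with V^2 = U Lam U^T.
rng = np.random.default_rng(4)
n, tau, mu = 5, 0.2, 1.0
Q0 = np.linalg.qr(rng.standard_normal((n, n)))[0]   # random orthogonal U
lam = rng.uniform(0.1, 3.0, n)                      # eigenvalues of V^2
V = (Q0 * np.sqrt(lam)) @ Q0.T                      # V = U Lam^{1/2} U^T

neg = np.minimum(tau * mu - lam, 0.0)               # eigenvalues of [tau mu I - V^2]^-
D = (Q0 * (neg / np.sqrt(lam))) @ Q0.T              # (26): U [tau mu I - Lam]^- Lam^{-1/2} U^T
RHS = (Q0 * neg) @ Q0.T                             # [tau mu I - V^2]^-
assert np.allclose((D @ V + V @ D) / 2, RHS)        # D solves (25)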
From now on we use the notation $\beta := (\tau_1 - \tau_2)/\tau_1$; then $\beta \in (0, 1)$, $\tau_1 - \tau_2 = \beta\tau_1$ and $\tau_2 = (1 - \beta)\tau_1$. Further, let us denote

$\mathcal{N}(\tau_1, \beta) := \left\{ (X, y, S) \in \mathcal{F}^0 :\; \|[\tau_1\mu I - X^{1/2} S X^{1/2}]^+\|_F \le \beta\tau_1\mu \right\}.$

When the parameters $\tau_1$ and $\beta$ are chosen appropriately and all the iterates reside in the neighborhood $\mathcal{N}(\tau_1, \beta)$, we claim that the duality gap decreases at the rate of $O(1/\sqrt{n})$.

Lemma 6 Suppose that the current iterate $(X, y, S) \in \mathcal{N}(\tau_1, \beta)$. If $\tau_1 \le 1/5$, $\beta \le 1/2$ and $\alpha_1 = 0.5\alpha_2\sqrt{\beta\tau_1/n}$, then we have

$\mu(\alpha) \le \left( 1 - \alpha_2\,\frac{(4 - \sqrt{10})\sqrt{\beta\tau_1}}{10\sqrt{n}} \right)\mu.$

Proof Because $(X, y, S) \in \mathcal{N}(\tau_1, \beta)$, by (24) and Lemma 3 it follows that

$\mu(\alpha) = \mu + \alpha_1 \mathrm{Tr}([\tau_1\mu I - X^{1/2} S X^{1/2}]^-)/n + \alpha_2 \mathrm{Tr}([\tau_1\mu I - X^{1/2} S X^{1/2}]^+)/n$
$\le \mu - \alpha_1(1 - \tau_1)\mu + \alpha_2\beta\tau_1\mu/\sqrt{n}$
$= \mu - \alpha_2\sqrt{\beta\tau_1/n}\left( 0.5(1 - \tau_1) - \sqrt{\beta\tau_1} \right)\mu$
$\le \left( 1 - \alpha_2\,\frac{(4 - \sqrt{10})\sqrt{\beta\tau_1}}{10\sqrt{n}} \right)\mu.$

Here the second equality holds due to $\alpha_1 = 0.5\alpha_2\sqrt{\beta\tau_1/n}$ and the last inequality follows from $\tau_1 \le 1/5$ and $\beta \le 1/2$.

Subsequently, we show how to ensure that all the iterates stay in the neighborhood $\mathcal{N}(\tau_1, \beta)$.

Lemma 7 Let $(X, y, S) \in \mathcal{N}(\tau_1, \beta)$, and let $\hat{E}$ and $\hat{F}$ be defined by (11). Then we have

$\|\hat{E}^{-1}\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^-\|^2 \le n\mu$

and

$\|\hat{E}^{-1}\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^+\|^2 \le \beta^2\tau_1\mu/(1 - \beta).$
Proof By the proof of Lemma 4.2 in Monteiro and Zhang (1998), we have

$\rho((\hat{F}\hat{E})^{-1}) = 1/\lambda_1$

and

$\|(\hat{F}\hat{E})^{-1/2}\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^-\|^2 = \sum_{i=1}^n ([\tau_1\mu - \lambda_i]^-)^2/\lambda_i.$

Hence, by $\hat{E} = \hat{F}$ it follows that

$\rho(\hat{E}^{-1}) = 1/\sqrt{\lambda_1}$  (28)

and

$\|\hat{E}^{-1}\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^-\|^2 = \sum_{i=1}^n ([\lambda_i - \tau_1\mu]^+)^2/\lambda_i \le \sum_{i=1}^n \lambda_i = n\mu.$

Further, since $(X, y, S) \in \mathcal{N}(\tau_1, \beta)$, we have $\lambda_i \ge (1 - \beta)\tau_1\mu$. Therefore

$\|\hat{E}^{-1}\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^+\|^2 \le \|\hat{E}^{-1}\|^2\,\|\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^+\|^2 = \rho^2(\hat{E}^{-1})\,\|[\tau_1\mu I - \hat{X}\hat{S}]^+\|_F^2$
$= \|[\tau_1\mu I - X^{1/2} S X^{1/2}]^+\|_F^2/\lambda_1 \le \beta^2\tau_1\mu/(1 - \beta).$
This completes the proof.
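The spectral fact (28) can also be seen directly from (11): with $\hat{X} = \hat{S} = V$, the eigenvalues of $\hat{E}$ are $(\sigma_i + \sigma_j)/2$ for the eigenvalues $\sigma_i$ of $V$, so the smallest is $\sigma_{\min} = \sqrt{\lambda_1}$. A short numerical confirmation (our sketch, synthetic eigendata):

import numpy as np

rng = np.random.default_rng(5)
n = 4
Q0 = np.linalg.qr(rng.standard_normal((n, n)))[0]
lam = np.sort(rng.uniform(0.2, 2.0, n))             # eigenvalues of X-hat S-hat
V = (Q0 * np.sqrt(lam)) @ Q0.T                      # X-hat = S-hat = V with V^2 = U Lam U^T

I = np.eye(n)
E = (np.kron(V, I) + np.kron(I, V)) / 2             # E-hat of (11)
rho_Einv = 1.0 / np.min(np.linalg.eigvalsh(E))      # spectral radius of E-hat^{-1}
assert np.isclose(rho_Einv, 1.0 / np.sqrt(lam[0]))  # (28)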
The following identity, introduced by Zhang (1998), is useful for deriving bounds on the direction $(\Delta X(\alpha), \Delta y(\alpha), \Delta S(\alpha))$.

Lemma 8 Let $u, v, r \in R^n$ and $E, F \in R^{n\times n}$ satisfy $Eu + Fv = r$. If $F E^T \in S^n_{++}$, then

$\|(F E^T)^{-1/2} E u\|^2 + \|(F E^T)^{-1/2} F v\|^2 + 2 u^T v = \|(F E^T)^{-1/2} r\|^2.$

We now apply Lemma 7 and Lemma 8, together with $\hat{E} = \hat{F}$, to conclude the following result.

Lemma 9 If $(X, y, S) \in \mathcal{N}(\tau_1, \beta)$, $\beta \le 1/2$ and $\alpha_1 = 0.5\alpha_2\sqrt{\beta\tau_1/n}$, then

$\|\mathrm{vec}\,\Delta\hat{X}(\alpha)\|^2 + \|\mathrm{vec}\,\Delta\hat{S}(\alpha)\|^2 \le 25\alpha_2^2\beta\tau_1\mu/16.$

Proof From (20), (21), (22) and (23), we have

$\hat{E}\,\mathrm{vec}\,\Delta\hat{X}(\alpha) + \hat{F}\,\mathrm{vec}\,\Delta\hat{S}(\alpha) = \alpha_1\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^- + \alpha_2\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^+ - \alpha_1^2\mathrm{vec}\,H(\Delta\hat{X}_-\Delta\hat{S}_-).$
Applying Lemma 8 to this equality, we obtain

$\|\mathrm{vec}\,\Delta\hat{X}(\alpha)\|^2 + \|\mathrm{vec}\,\Delta\hat{S}(\alpha)\|^2 + 2\,\Delta\hat{X}(\alpha) \bullet \Delta\hat{S}(\alpha)$
$= \|\alpha_1\hat{E}^{-1}\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^- + \alpha_2\hat{E}^{-1}\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^+ - \alpha_1^2\hat{E}^{-1}\mathrm{vec}\,H(\Delta\hat{X}_-\Delta\hat{S}_-)\|^2$
$\le \left( \|\alpha_1\hat{E}^{-1}\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^- + \alpha_2\hat{E}^{-1}\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^+\| + \alpha_1^2\|\hat{E}^{-1}\mathrm{vec}\,H(\Delta\hat{X}_-\Delta\hat{S}_-)\| \right)^2.$

Using (28) and Lemma 5, one has

$\|\hat{E}^{-1}\mathrm{vec}\,H(\Delta\hat{X}_-\Delta\hat{S}_-)\| \le \rho(\hat{E}^{-1})\,\|\mathrm{vec}\,H(\Delta\hat{X}_-\Delta\hat{S}_-)\| = \|H(\Delta\hat{X}_-\Delta\hat{S}_-)\|_F/\sqrt{\lambda_1} \le 2^{-3/2} n\mu/\sqrt{(1 - \beta)\tau_1\mu}.$

On the other hand, by Lemma 7 we have

$\|\alpha_1\hat{E}^{-1}\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^- + \alpha_2\hat{E}^{-1}\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^+\|^2$
$= \alpha_1^2\|\hat{E}^{-1}\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^-\|^2 + \alpha_2^2\|\hat{E}^{-1}\mathrm{vec}[\tau_1\mu I - \hat{X}\hat{S}]^+\|^2 \le \alpha_1^2 n\mu + \alpha_2^2\beta^2\tau_1\mu/(1 - \beta).$

Therefore, by $\beta \le 1/2$, $\alpha_1 = 0.5\alpha_2\sqrt{\beta\tau_1/n}$ and $\Delta\hat{X}(\alpha) \bullet \Delta\hat{S}(\alpha) = 0$ (using also $\beta^2/(1 - \beta) \le \beta$ for $\beta \le 1/2$), we have

$\|\mathrm{vec}\,\Delta\hat{X}(\alpha)\|^2 + \|\mathrm{vec}\,\Delta\hat{S}(\alpha)\|^2 \le \left( \sqrt{\alpha_1^2 n\mu + \alpha_2^2\beta\tau_1\mu} + 2^{-3/2}\alpha_1^2 n\mu/\sqrt{(1 - \beta)\tau_1\mu} \right)^2$
$\le (\sqrt{5}/2 + \sqrt{2}/16)^2\,\alpha_2^2\beta\tau_1\mu \le 25\alpha_2^2\beta\tau_1\mu/16,$

which gives the required result.
Lemma 10 Let $(X, y, S) \in \mathcal{N}(\tau_1, \beta)$, $\tau_1 \le 1/5$ and $\beta \le 1/2$. If $\alpha_1 = 0.5\alpha_2\sqrt{\beta\tau_1/n}$, then we have

$\|H_P(\Delta X(\alpha)\Delta S(\alpha))\|_F \le \alpha_2^2\beta\tau_1\mu(\alpha).$

Proof Using Lemma 9, we have

$\|H_P(\Delta X(\alpha)\Delta S(\alpha))\|_F = \|H(\Delta\hat{X}(\alpha)\Delta\hat{S}(\alpha))\|_F \le \|\Delta\hat{X}(\alpha)\Delta\hat{S}(\alpha)\|_F$
$\le \|\Delta\hat{X}(\alpha)\|_F\,\|\Delta\hat{S}(\alpha)\|_F = \|\mathrm{vec}\,\Delta\hat{X}(\alpha)\|\,\|\mathrm{vec}\,\Delta\hat{S}(\alpha)\|$
$\le \frac{1}{2}\left( \|\mathrm{vec}\,\Delta\hat{X}(\alpha)\|^2 + \|\mathrm{vec}\,\Delta\hat{S}(\alpha)\|^2 \right) \le 25\alpha_2^2\beta\tau_1\mu/32.$
On the other hand, since $\tau_1 \le 1/5$ and $\beta \le 1/2$, from (24) we have

$\mu(\alpha) \ge \mu + \alpha_1\mathrm{Tr}(-\hat{X}\hat{S})/n = (1 - \alpha_1)\mu \ge (1 - \sqrt{10}/20)\mu \ge 25\mu/32.$

The required result follows.

For convenience, let us denote

$\mathcal{X}(\alpha) := \hat{X}\hat{S} + \alpha_1[\tau_1\mu I - \hat{X}\hat{S}]^- + \alpha_2[\tau_1\mu I - \hat{X}\hat{S}]^+ - \alpha_1^2 H(\Delta\hat{X}_-\Delta\hat{S}_-).$  (29)
We have

$H_P(X(\alpha)S(\alpha)) = H_P(XS) + H_P(\Delta X(\alpha)S + X\Delta S(\alpha)) + H_P(\Delta X(\alpha)\Delta S(\alpha))$
$= H_P(XS) + \alpha_1[\tau_1\mu I - H_P(XS)]^- + \alpha_2[\tau_1\mu I - H_P(XS)]^+ - \alpha_1^2 H_P(\Delta X_-\Delta S_-) + H_P(\Delta X(\alpha)\Delta S(\alpha))$
$= \hat{X}\hat{S} + \alpha_1[\tau_1\mu I - \hat{X}\hat{S}]^- + \alpha_2[\tau_1\mu I - \hat{X}\hat{S}]^+ - \alpha_1^2 H(\Delta\hat{X}_-\Delta\hat{S}_-) + H_P(\Delta X(\alpha)\Delta S(\alpha))$
$= \mathcal{X}(\alpha) + H_P(\Delta X(\alpha)\Delta S(\alpha)).$  (30)

Using Lemma 4, we have

$U^T\mathcal{X}(\alpha)U = \Lambda + \alpha_1[\tau_1\mu I - \Lambda]^- + \alpha_2[\tau_1\mu I - \Lambda]^+ - \alpha_1^2 U^T H(\Delta\hat{X}_-\Delta\hat{S}_-)U$
$= \Lambda + \alpha_1[\tau_1\mu I - \Lambda]^- + \alpha_2[\tau_1\mu I - \Lambda]^+ - \frac{\alpha_1^2}{4}([\tau_1\mu I - \Lambda]^-)^2\Lambda^{-1} + \frac{\alpha_1^2}{4}U^T(\Delta\hat{X}_- - \Delta\hat{S}_-)^2 U.$

Hence, by the Weyl theorem (see Horn and Johnson 1990, Theorem 4.3.1), we have

$\lambda_i(U^T\mathcal{X}(\alpha)U) \ge \lambda_i + \alpha_1[\tau_1\mu - \lambda_i]^- + \alpha_2[\tau_1\mu - \lambda_i]^+ - \frac{\alpha_1^2}{4}\,\frac{([\tau_1\mu - \lambda_i]^-)^2}{\lambda_i}$
$\begin{cases} = (1 - \alpha_2)\lambda_i + \alpha_2\tau_1\mu, & \text{if } \lambda_i \le \tau_1\mu, \\ \ge (1 - \alpha_1 - \alpha_1^2/4)\lambda_i + (\alpha_1 + \alpha_1^2/4)\tau_1\mu, & \text{if } \lambda_i > \tau_1\mu. \end{cases}$  (31)

Lemma 11 Let $(X, y, S) \in \mathcal{N}(\tau_1, \beta)$, $\tau_1 \le 1/5$ and $\beta \le 1/2$. If $\alpha_1 = 0.5\alpha_2\sqrt{\beta\tau_1/n}$, then we have

$\|[\tau_1\mu(\alpha)I - \mathcal{X}(\alpha)]^+\|_F \le (1 - \alpha_2)\beta\tau_1\mu(\alpha).$

Proof Since $\tau_1 \le 1/5$, $\beta \le 1/2$ and $\alpha_1 = 0.5\alpha_2\sqrt{\beta\tau_1/n}$, we have $1 - \alpha_1 - \alpha_1^2/4 > 0$.
From (31), if $\lambda_i > \tau_1\mu$ we have

$\tau_1\mu(\alpha) - \lambda_i(U^T\mathcal{X}(\alpha)U) \le \tau_1\mu(\alpha) - (1 - \alpha_1 - \alpha_1^2/4)\tau_1\mu - (\alpha_1 + \alpha_1^2/4)\tau_1\mu = \tau_1(\mu(\alpha) - \mu) \le 0,$

and if $\lambda_i \le \tau_1\mu$ we have

$\tau_1\mu(\alpha) - \lambda_i(U^T\mathcal{X}(\alpha)U) \le \tau_1\mu(\alpha) - \frac{\mu(\alpha)}{\mu}\left( (1 - \alpha_2)\lambda_i + \alpha_2\tau_1\mu \right) = \frac{\mu(\alpha)}{\mu}(1 - \alpha_2)(\tau_1\mu - \lambda_i).$

Therefore,

$\|[\tau_1\mu(\alpha)I - \mathcal{X}(\alpha)]^+\|_F = \|[\tau_1\mu(\alpha)I - U^T\mathcal{X}(\alpha)U]^+\|_F \le \frac{\mu(\alpha)}{\mu}(1 - \alpha_2)\,\|[\tau_1\mu I - \Lambda]^+\|_F$
$= \frac{\mu(\alpha)}{\mu}(1 - \alpha_2)\,\|[\tau_1\mu I - X^{1/2} S X^{1/2}]^+\|_F \le (1 - \alpha_2)\beta\tau_1\mu(\alpha),$

which is the required result.
The next lemma gives a sufficient condition which guarantees that all the iterates stay in the neighborhood $\mathcal{N}(\tau_1, \beta)$.

Lemma 12 Let $(X, y, S) \in \mathcal{N}(\tau_1, \beta)$, $\tau_1 \le 1/5$ and $\beta \le 1/2$. If $\alpha_1 = 0.5\alpha_2\sqrt{\beta\tau_1/n}$, then $(X(\alpha), y(\alpha), S(\alpha)) \in \mathcal{N}(\tau_1, \beta)$.

Proof For all $\delta \in [0, 1]$, let $(\bar{X}(\delta), \bar{S}(\delta)) := (X + \delta\Delta X(\alpha),\; S + \delta\Delta S(\alpha))$. From (29), we have

$H_P(\bar{X}(\delta)\bar{S}(\delta)) = H_P(XS) + \delta H_P(\Delta X(\alpha)S + X\Delta S(\alpha)) + \delta^2 H_P(\Delta X(\alpha)\Delta S(\alpha))$
$= H_P(XS) + \delta\left( \alpha_1[\tau_1\mu I - H_P(XS)]^- + \alpha_2[\tau_1\mu I - H_P(XS)]^+ - \alpha_1^2 H_P(\Delta X_-\Delta S_-) \right) + \delta^2 H_P(\Delta X(\alpha)\Delta S(\alpha))$
$= (1 - \delta)\hat{X}\hat{S} + \delta\left( \hat{X}\hat{S} + \alpha_1[\tau_1\mu I - \hat{X}\hat{S}]^- + \alpha_2[\tau_1\mu I - \hat{X}\hat{S}]^+ - \alpha_1^2 H(\Delta\hat{X}_-\Delta\hat{S}_-) \right) + \delta^2 H_P(\Delta X(\alpha)\Delta S(\alpha))$
$= (1 - \delta)\hat{X}\hat{S} + \delta\mathcal{X}(\alpha) + \delta^2 H_P(\Delta X(\alpha)\Delta S(\alpha)).$

Using (31) and the fact that $\lambda_{\min}(\cdot)$ is a homogeneous concave function on the space of symmetric matrices, one has

$\lambda_{\min}(H_P(\bar{X}(\delta)\bar{S}(\delta))) \ge (1 - \delta)\lambda_{\min}(\hat{X}\hat{S}) + \delta\lambda_{\min}(\mathcal{X}(\alpha)) + \delta^2\lambda_{\min}(H_P(\Delta X(\alpha)\Delta S(\alpha)))$
$\ge (1 - \delta)(1 - \beta)\tau_1\mu + \delta\lambda_{\min}(\mathcal{X}(\alpha)) - \delta^2\|H_P(\Delta X(\alpha)\Delta S(\alpha))\|_F.$

Since $1 - \alpha_1 - \alpha_1^2/4 > 0$, from (31) we have

$\lambda_{\min}(\mathcal{X}(\alpha)) \ge \min\left( (1 - \alpha_2)(1 - \beta)\tau_1\mu + \alpha_2\tau_1\mu,\; \tau_1\mu \right) = \min\left( (1 - \beta)\tau_1\mu + \alpha_2\beta\tau_1\mu,\; \tau_1\mu \right) = (1 - \beta)\tau_1\mu + \alpha_2\beta\tau_1\mu.$

Hence,

$\lambda_{\min}(H_P(\bar{X}(\delta)\bar{S}(\delta))) \ge (1 - \delta)(1 - \beta)\tau_1\mu + \delta(1 - \beta)\tau_1\mu + \delta\alpha_2\beta\tau_1\mu - \delta^2\alpha_2^2\beta\tau_1\mu(\alpha) \ge (1 - \beta)\tau_1\mu > 0.$

This reveals that $\bar{X}(\delta)\bar{S}(\delta)$ is nonsingular for all $\delta \in [0, 1]$, which further implies that each of the factors $\bar{X}(\delta)$ and $\bar{S}(\delta)$ is nonsingular as well. By continuity, it follows that $X(\alpha) = \bar{X}(1) \succ 0$ and $S(\alpha) = \bar{S}(1) \succ 0$ since $X, S \succ 0$.

Using Lemma 1, Lemma 2 and Eq. (30), we have

$\|[\tau_1\mu(\alpha)I - X^{1/2}(\alpha)S(\alpha)X^{1/2}(\alpha)]^+\|_F \le \|[\tau_1\mu(\alpha)I - H_P(X(\alpha)S(\alpha))]^+\|_F$
$= \|[\tau_1\mu(\alpha)I - \mathcal{X}(\alpha) - H_P(\Delta X(\alpha)\Delta S(\alpha))]^+\|_F$
$\le \|[\tau_1\mu(\alpha)I - \mathcal{X}(\alpha)]^+\|_F + \|[-H_P(\Delta X(\alpha)\Delta S(\alpha))]^+\|_F.$

Further, from Lemma 10 and Lemma 11, one has

$\|[\tau_1\mu(\alpha)I - X^{1/2}(\alpha)S(\alpha)X^{1/2}(\alpha)]^+\|_F \le (1 - \alpha_2)\beta\tau_1\mu(\alpha) + \alpha_2^2\beta\tau_1\mu(\alpha) \le \beta\tau_1\mu(\alpha),$

which implies $(X(\alpha), y(\alpha), S(\alpha)) \in \mathcal{N}(\tau_1, \beta)$.
Now we are in a position to present our main complexity result.

Theorem 1 Suppose that $\tau = \tau_1 \le 1/5$ and $\beta \le 1/2$ are fixed for all iterations. Then Algorithm 1 terminates in $O(\sqrt{n}\log(X^0 \bullet S^0/\varepsilon))$ iterations with a solution satisfying $X \bullet S \le \varepsilon$.

Proof By Lemma 12, at each iteration, if we let $\bar{\alpha} = (0.5\sqrt{\beta\tau_1/n},\, 1)$, then we have $(X(\bar{\alpha}), y(\bar{\alpha}), S(\bar{\alpha})) \in \mathcal{N}(\tau_1, \beta)$.
Furthermore, from Lemma 6, we conclude

$\mu(\bar{\alpha}) \le \left( 1 - \frac{(4 - \sqrt{10})\sqrt{\beta\tau_1}}{10\sqrt{n}} \right)\mu.$

This completes the proof by Lemma I.36 of Roos et al. (2006).
We point out that, by the expression (24), the duality gap is monotone with respect to $\alpha_1$ for any fixed $\alpha_2 \in [0, 1]$. So the step length can be computed approximately in the following way. First, set $\alpha_2 \in (0, 1]$. Second, find the greatest $\alpha_1 \in [0, 1]$ such that $(X(\alpha), y(\alpha), S(\alpha)) \in \mathcal{N}(\tau_1, \beta)$. Lemma 12 guarantees that $\alpha_1 \ge 0.5\alpha_2\sqrt{\beta\tau_1/n}$, and the $O(\sqrt{n}L)$ iteration bound still holds.

4 Numerical results

In this section we report the results of some numerical experiments. Numerical results were obtained by using MATLAB R2009a on an Intel Core 2 PC (2.66 GHz) with 2 GB RAM. We compare the proposed second-order corrector algorithm with the path-following algorithm in Li and Terlaky (2010) and the Mehrotra-type predictor-corrector algorithm in Todd et al. (1998) on the following classes of SDP problems. These three algorithms will be referred to as Algorithm SOC, Algorithm PF and Algorithm MPC, respectively.

For all the problems below, the starting points $(X^0, y^0, S^0)$ are feasible interior points, so we can let the parameter $0 < \tau_1 \le 0.05$ be small enough to keep $(X^0, y^0, S^0) \in \mathcal{N}_\infty^-(1 - \tau_1)$, which implies $(X^0, y^0, S^0) \in \mathcal{N}(\tau_1, \beta)$. For Algorithm SOC and Algorithm PF, we set $\tau = \tau_1$, $\beta = 0.5$, and we compute the step length in the following way. First, set $\alpha_2 = 1$. Second, use bisection on the closed interval $[0.5\sqrt{\beta\tau_1/n},\, 1]$ to find the greatest $\alpha_1$ such that $(X(\alpha), y(\alpha), S(\alpha)) \in \mathcal{N}(\tau_1, \beta)$; a schematic version of this search is sketched after the problem list below. For Algorithm MPC, as in Todd et al. (1998), we use a fixed step-length parameter of 0.98 in the predictor and corrector steps. We use the strategy proposed by Todd et al. (1998) for computing the NT scaling matrix and the NT direction. In this case, it is possible to simultaneously scale $X$ and $S$ to diagonal matrices; then it is straightforward to determine the positive and negative parts of $\tau\mu I - H_P(XS)$.

(1) Random SDP: The test problem is generated as follows. After one inputs two positive integers $m$ and $n$, MATLAB generates $m$ matrices $R_i \in R^{n\times n}$, $i = 1, \ldots, m$, randomly. Then we take $A_i = (R_i^T + R_i)/2$, $b_i = \mathrm{Tr}(A_i)$ and $C = \sum_{i=1}^m A_i + I$ to obtain an SDP and its initial feasible point $(I, e, I)$.

(2) Norm minimization problem:

$\min_{x \in R^m}\; \left\| A_0 + \sum_{k=1}^m x_k A_k \right\|,$
where the $A_k \in R^{N\times N}$, $k = 0, \ldots, m$. It is well known that this problem can be expressed as an SDP involving $m + 2$ symmetric matrices of dimension $n \times n$,
where $n = 2N$ (see Vandenberghe and Boyd 1996 for details). For this problem, we choose the following feasible starting point:

$X^0 = I/n, \quad y^0 = t_0 e_1, \quad S^0 = t_0 I + C,$

where $e_1$ is the first column of the identity matrix $I$ of dimension $n \times n$, $t_0 = 1.1\|C\|_1$, where $\|C\|_1$ denotes the matrix 1-norm of $C$, and

$C = \begin{pmatrix} 0 & A_0 \\ A_0^T & 0 \end{pmatrix}.$
(3) Max-Cut problem:

$\min\; L \bullet X, \quad \text{s.t.}\; \mathrm{diag}(X) = e/4,\; X \succeq 0,$

where $L = A - \mathrm{Diag}(Ae)$, $e$ is the vector with all components equal to 1 and $A$ is the weighted adjacency matrix of a graph (see Helmberg et al. 1996). In our experiments, we only consider unweighted graphs where each edge is present independently with probability one half. We choose the following feasible starting point:

$X^0 = I/4, \quad y^0 = -1.1\,\mathrm{abs}(L)e, \quad S^0 = L - \mathrm{Diag}(y^0).$
(4) ETP (educational testing problem):

$\max\; e^T d, \quad \text{s.t.}\; A - \mathrm{Diag}(d) \succeq 0,\; d \ge 0,$

where $A \in S^N_{++}$. This problem can readily be expressed as a dual form of SDP, involving symmetric matrices of dimension $n \times n$, where $n = 2N$. We choose the following feasible starting point:

$X^0 = \begin{pmatrix} 3I & 0 \\ 0 & 2I \end{pmatrix}, \quad y^0 = 0.4\lambda_{\min}(A)e, \quad S^0 = \begin{pmatrix} A - \mathrm{Diag}(y^0) & 0 \\ 0 & \mathrm{Diag}(y^0) \end{pmatrix}.$
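As promised above, here is a schematic version of the step-length search used for Algorithm SOC and Algorithm PF: fix $\alpha_2 = 1$ and bisect for the largest $\alpha_1$ that keeps the trial point in $\mathcal{N}(\tau_1, \beta)$. This is our sketch, not the authors' MATLAB code; `trial_point` and `in_neighborhood` are hypothetical placeholders for routines that assemble $(X(\alpha), y(\alpha), S(\alpha))$ from the three directions and test membership in the neighborhood.

def largest_alpha1(trial_point, in_neighborhood, alpha1_min, tol=1e-4):
    """Largest alpha1 in [alpha1_min, 1] whose trial point stays in N(tau1, beta).
    alpha2 is held fixed (here at 1); Lemma 12 guarantees that
    alpha1_min = 0.5*sqrt(beta*tau1/n) is always acceptable."""
    lo, hi = alpha1_min, 1.0
    if in_neighborhood(trial_point(hi)):   # accept the full step if possible
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_neighborhood(trial_point(mid)):
            lo = mid                       # mid acceptable: raise the lower bound
        else:
            hi = mid                       # mid leaves the neighborhood: lower the upper bound
    return lo

Because the duality gap is monotone in $\alpha_1$ (see the remark after Theorem 1), returning the last acceptable lower bound preserves both the neighborhood condition and the guaranteed decrease of the duality gap.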
In our experiments, for each class of problems mentioned above, we solved ten instances with random data. That is, the given matrices are random, with entries chosen from the normal distribution with zero mean and unit variance. For the ETP problem, $A$ is the product of such a random matrix and its transpose, so that the resulting matrix is symmetric positive definite. For each set of ten instances, we give the average number of iterations and the average CPU time for each algorithm to reduce the duality gap by a factor of $10^{10}$. We also stop the iteration if the number of iterations exceeds 50. The results are shown in Table 1.

From the results in Table 1 we find that, on average, the number of iterations and the CPU time required by Algorithm SOC are about 55.33% and 55.86% less, respectively, than those required by Algorithm PF. Although our implementation is very coarse, the proposed algorithm is comparable to the Mehrotra-type predictor-corrector algorithm in Todd et al. (1998) as a whole.
Table 1 Computational performance of the algorithms

Problem             m    n    Algorithm SOC      Algorithm PF        Algorithm MPC
                              Iter.   Time       Iter.    Time       Iter.   Time
Random SDP          50   50   11.4    6.9675     26.2     16.272     11.0    5.9604
Norm minimization   25   50   11.2    6.7529     23.6     14.388     11.1    5.9258
Max-Cut             50   50   11.1    6.7852     26.0     16.237     11.7    6.3496
ETP                 25   50   18.7    11.337     41.5^a   25.244^a   19.0    10.112

^a Algorithm PF fails on eight of the instances because the number of iterations exceeds 50. The numbers reported here do not include these unsuccessful instances
5 Conclusions

This paper provides a new second-order corrector interior-point algorithm for SDP problems. Based on the NT direction, this algorithm has an $O(\sqrt{n}L)$ iteration-complexity bound, which is the best complexity result for IPMs so far. Our preliminary numerical experiments show that the new algorithm may also perform well in practice. However, there are still some unsettled implementation issues. For example, an adaptive update of the centering parameter and efficient strategies for choosing step lengths deserve more work for real applications of the algorithm.

Acknowledgments We would like to thank the anonymous referees for their valuable comments and suggestions, which helped to improve this paper. This work is supported partly by the National Natural Science Foundation of China under Grants No. 61072144 and No. 61179040.
References

Ai W (2004) Neighborhood-following algorithms for linear programming. Sci China Ser A 47:812–820
Ai W, Zhang S (2005) An $O(\sqrt{n}L)$ iteration primal-dual path-following method, based on wide neighborhoods and large updates, for monotone LCP. SIAM J Optim 16:400–417
Alizadeh F (1991) Combinatorial optimization with interior-point methods and semi-definite matrices. PhD Thesis, Computer Science Department, University of Minnesota, Minneapolis, MN
Alizadeh F (1995) Interior point methods in semidefinite programming with applications to combinatorial optimization. SIAM J Optim 5:13–51
Alizadeh F, Haeberly JA, Overton ML (1998) Primal-dual interior-point methods for semidefinite programming: convergence rates, stability and numerical results. SIAM J Optim 8:746–768
Boyd S, Ghaoui LE, Feron E, Balakrishnan V (1994) Linear matrix inequalities in system and control theory. SIAM, Philadelphia, PA
Cartis C (2009) Some disadvantages of a Mehrotra-type primal-dual corrector interior point algorithm for linear programming. Appl Numer Math 55:1110–1119
de Klerk E (2002) Aspects of semidefinite programming: interior point algorithms and selected applications. Kluwer, Dordrecht
Feng Z, Fang L (2010) A wide neighbourhood interior-point method with $O(\sqrt{n}L)$ iteration-complexity bound for semidefinite programming. Optimization 59:1235–1246
Helmberg C, Rendl F, Vanderbei RJ, Wolkowicz H (1996) An interior-point method for semidefinite programming. SIAM J Optim 6:342–361
Horn RA, Johnson CR (1990) Matrix analysis. Cambridge University Press, New York, NY
Horn RA, Johnson CR (1991) Topics in matrix analysis. Cambridge University Press, New York, NY
Kojima M, Shindoh S, Hara S (1997) Interior point methods for the monotone semidefinite linear complementarity problem in symmetric matrices. SIAM J Optim 7:86–125
Li Y, Terlaky T (2010) A new class of large neighborhood path-following interior point algorithms for semidefinite optimization with $O(\sqrt{n}\log\frac{\mathrm{Tr}(X^0 S^0)}{\varepsilon})$ iteration complexity. SIAM J Optim 20:2853–2875
Liu C, Liu H, Cong W (2010) An $O(\sqrt{n}L)$ iteration primal-dual second-order corrector algorithm for linear programming. Optim Lett. doi:10.1007/s11590-010-0242-6
Mehrotra S (1992) On the implementation of a primal-dual interior point method. SIAM J Optim 2:575–601
Monteiro RDC (1997) Primal-dual path-following algorithms for semidefinite programming. SIAM J Optim 7:663–678
Monteiro RDC (1998) Polynomial convergence of primal-dual algorithms for semidefinite programming based on the Monteiro and Zhang family of directions. SIAM J Optim 8:797–812
Monteiro RDC (2003) First- and second-order methods for semidefinite programming. Math Program Ser B 97:209–244
Monteiro RDC, Zhang Y (1998) A unified analysis for a class of long-step primal-dual path-following interior-point algorithms for semidefinite programming. Math Program 81:281–299
Nesterov YE, Nemirovsky AS (1994) Interior point methods in convex programming: theory and applications. SIAM, Philadelphia, PA
Nesterov YE, Todd MJ (1997) Self-scaled barriers and interior-point methods for convex programming. Math Oper Res 22:1–42
Nesterov YE, Todd MJ (1998) Primal-dual interior-point methods for self-scaled cones. SIAM J Optim 8:324–364
Nocedal J, Wright SJ (1999) Numerical optimization. Springer, New York, NY
Peng JM, Roos C, Terlaky T (2002) Self-regular functions and new search directions for linear and semidefinite optimization. Math Program Ser A 93:129–171
Roos C, Terlaky T, Vial JP (2006) Interior point methods for linear optimization, 2nd edn. Springer, Boston, MA
Salahi M, Mahdavi-Amiri N (2006) Polynomial time second order Mehrotra-type predictor-corrector algorithms. Appl Math Comput 183:646–658
Salahi M, Peng J, Terlaky T (2007) On Mehrotra-type predictor-corrector algorithms. SIAM J Optim 18:1377–1397
Todd MJ, Toh KC, Tütüncü RH (1998) On the Nesterov-Todd direction in semidefinite programming. SIAM J Optim 8:769–796
Vandenberghe L, Boyd S (1996) Semidefinite programming. SIAM Rev 38:49–95
Ye Y (1990) A class of projective transformations for linear programming. SIAM J Comput 19:457–466
Zhang Y (1998) On extending some primal-dual interior-point algorithms from linear programming to semidefinite programming. SIAM J Optim 8:365–386