On the Low Rank Solutions for Linear Matrix Inequalities

Wenbao Ai ∗, Yongwei Huang †, Shuzhong Zhang ‡

September 2006
Abstract
In this paper we present a polynomial-time procedure to find a low rank solution for a system of Linear Matrix Inequalities (LMI). The existence of such a low rank solution was shown in Au-Yeung and Poon [1] and Barvinok [3]. In Au-Yeung and Poon's approach, an earlier unpublished manuscript of Bohnenblust [6] played an essential role. Both proofs in [1] and [3] are nonconstructive in nature. The aim of this paper is to offer a constructive and polynomial-time procedure to find such a low rank solution approximately. Extensions of our new results and their relations to some of the known results in the literature are discussed.
Keywords: Rank reduction, linear matrix inequality, joint numerical range.

Mathematics Subject Classification: 90C22, 15A03, 15A60.
∗ School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, People's Republic of China. Email: [email protected]
† Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong. Email: [email protected]
‡ Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong. Email: [email protected]. Research supported by Hong Kong RGC Earmarked Grants CUHK4185/05E and CUHK418406.
1 Introduction
Finding a low rank matrix solution for a system of Linear Matrix Inequalities (LMI), or a Semidefinite Program (SDP), is of great importance in theory as well as in practice. As one example, SDP is often used as a relaxation for quadratic optimization, where a rank-1 SDP solution is automatically optimal for the quadratic model. Thus, finding a rank-1 matrix solution for an SDP problem is equivalent to solving a quadratic program. Ye and Zhang [19] showed a number of examples where such rank-1 solutions can be found. In Sensor Network Localization problems (see Biswas and Ye [5]), one is required to find rank-2 (planar) or rank-3 (spatial) solutions for an SDP relaxation problem. Similarly, in graph realization (see Biswas et al. [4]), the problem is to find a solution to an LMI with rank no more than a given value. Unexpectedly, the famous kissing number problem for identical spheres (originally due to Newton and Gregory) is related as well. More information on the kissing number problem can be found, e.g., in Pfender and Ziegler [15].

Barvinok [2] presented an upper bound on the lowest rank among all matrix solutions of a feasible LMI system. The same bound was independently rediscovered by Pataki in [14]. Later it was shown by Barvinok in [3] that the bound is essentially tight. Along a different line, Sturm and Zhang [18] proposed a matrix rank-1 decomposition scheme, leading to an algorithmic approach to finding rank-1 solutions for SDP, provided that the number of constraints in the SDP problem is small. The matrix decomposition scheme of Sturm and Zhang was extended to the complex matrix case by Huang and Zhang [11]. Moreover, Huang and Zhang [11] also proposed a polynomial-time procedure, in the complex case, for finding a low rank solution with a bound on the rank similar to that in Barvinok [2] and Pataki [14].
Although the bound in Barvinok [2] cannot be improved in general, it does permit an unexpected strengthening under an additional mild condition. The first reference in which such an enhanced bound appeared was Au-Yeung and Poon [1] (1979). The cornerstone of Au-Yeung and Poon's proof is a result established by Bohnenblust [6] in an unpublished note. Barvinok [3] (2001) gave an alternative proof. Unfortunately, neither proof is constructive. As a matter of fact, Barvinok posed in [3] the open problem of finding such a low rank solution constructively and efficiently. The main goal of the current paper is to solve this open problem by presenting a polynomial-time method that actually gets hold of a low rank solution as promised in [1] and [3], albeit in an approximate sense. That we speak only of approximation in this context is inevitable, since finding any feasible solution (even without rank constraints) for an LMI system is itself basically a Semidefinite Program, which can only be solved approximately. The aim of this paper is to present an algorithm that finds the low rank solutions in time polynomial in the problem dimension and $\log(1/\epsilon)$, where $\epsilon$ is the error in satisfying the equality constraints. As a consequence, the existence of an exact low rank solution follows by taking the limit.

The organization of the paper is as follows. In Section 2 we present an algorithmic approach and the new results, and in Section 3 we discuss extensions of our results and their connections with other known results in the literature.
2 A polynomial-time rank reduction procedure
Let $F$ be either $\mathbb{R}$ (real) or $\mathbb{C}$ (complex), and let $\mathcal{S}^n_F$ be either $\mathcal{S}^n$ (the space of real $n \times n$ symmetric matrices) or $\mathcal{H}^n$ (the space of complex $n \times n$ Hermitian matrices). Furthermore, let $d_F(n) := n(n+1)/2$ if $F = \mathbb{R}$ […] $0 \mid I_r + t\Delta \succeq 0\}$. (If $\Delta \succeq 0$ then let $\Delta := -\Delta$.) This rank reduction procedure terminates only when (1) does not have any solution that is linearly independent of $I_r$, implying that $d_F(r) \le m + 1$. Therefore, if $m \le d_F(p+1) - 2$ (the condition of Theorem 2.2), then $d_F(r) \le m + 1 \le d_F(p+1) - 1$, and consequently $\operatorname{rank}(X) = r < p + 1$ due to the strict monotonicity of $d_F(\cdot)$. Thus, $\operatorname{rank}(X) \le p$. This proves Theorem 2.2.

Under the conditions of Theorem 2.4 ($m \le d_F(p+1) - 1$ and $1 \le p \le n - 2$), to complete the proof we continue by discussing what to do when the rank reduction procedure described above terminates. In fact, the only remaining case is $d_F(r) = m + 1 = d_F(p+1)$, since the case $m + 1 < d_F(p+1)$ was already considered in Theorem 2.2. Then we have $\operatorname{rank}(X) = p + 1 \le n - 1$. In that case, let $X = U \Lambda U^*$, where $U = [u_1, u_2, \ldots, u_p, u_{p+1}] \in F^{n \times (p+1)}$, the $u_j$'s are orthonormal vectors, $j = 1, 2, \ldots, p+1$, and $\Lambda \in \mathcal{S}^{p+1}_{F++}$ is a diagonal positive definite matrix. Since the rank reduction procedure described above has terminated, it follows that

$$(U^* A_i U) \bullet \Delta = 0, \quad i = 1, \ldots, m, \tag{2}$$
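has solutions that form a one-dimensional subspace; in particular, they are all multiples of $\Lambda$. Since $p + 1 \le n - 1$, there must exist $u_{p+2} \in F^n$ with $\|u_{p+2}\| = 1$ and $u_j^* u_{p+2} = 0$, $j = 1, 2, \ldots, p, p+1$. The range space of (2) is of dimension $m$; therefore there exists $\bar{\Delta} \in \mathcal{S}^{p+1}_F$ such that the following equations are satisfied:

$$-u_{p+2}^* A_i u_{p+2} = (U^* A_i U) \bullet \bar{\Delta}, \quad i = 1, \ldots, m.$$

Letting

$$\tilde{\Delta} := \begin{bmatrix} \bar{\Delta} & 0_{p+1} \\ 0_{p+1}^T & 1 \end{bmatrix}, \qquad \tilde{\Delta}_0 := \begin{bmatrix} \Lambda & 0_{p+1} \\ 0_{p+1}^T & 0 \end{bmatrix},$$

and $\tilde{U} := [U, u_{p+2}] \in F^{n \times (p+2)}$, it follows that $A_i \bullet (\tilde{U} \tilde{\Delta} \tilde{U}^*) = 0$ and $A_i \bullet (\tilde{U} \tilde{\Delta}_0 \tilde{U}^*) = 0$ for all $i = 1, \ldots, m$, and so $A_i \bullet \tilde{U} (\tilde{\Delta} + \tau \tilde{\Delta}_0) \tilde{U}^* = 0$ for all $i = 1, \ldots, m$, where $\tau \in \mathbb{R}$ is any given constant. Choose $\tau := -\lambda_{\max}(\Lambda^{-1/2} \bar{\Delta} \Lambda^{-1/2})$. Then $\bar{\Delta} + \tau \Lambda \preceq 0$ and $\operatorname{rank}(\bar{\Delta} + \tau \Lambda) \le p$. Let

$$\bar{\Delta} + \tau \Lambda = Q \begin{bmatrix} \beta_1 & & & \\ & \ddots & & \\ & & \beta_p & \\ & & & 0 \end{bmatrix} Q^*,$$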
where $\beta_j \le 0$, $j = 1, \ldots, p$, and $Q$ is a unitary matrix of order $p + 1$. If the $\beta_j$'s are all zero, then $\tilde{U} (\tilde{\Delta} + \tau \tilde{\Delta}_0) \tilde{U}^*$ is a rank-one solution in $\mathcal{L} \cap \mathcal{S}^n_{F+}$, and the theorem is proven. Otherwise, to simplify, let us recycle the notation and define $U := U Q$, $[u_1, \ldots, u_{p+1}] := U$, and $\tilde{U} := [U, u_{p+2}]$. Clearly, all the $u_j$'s, $j = 1, \ldots, p+1, p+2$, remain unit vectors that are orthogonal to each other. Let

$$U_t := [u_1, u_2, \ldots, u_p, (1 - t) u_{p+1} + t u_{p+2}] \in F^{n \times (p+1)}.$$

Clearly, $\operatorname{rank}(U_t) = p + 1$ for any $t \in [0, 1]$, with $U_0 = U$. For a given $t \in [0, 1]$, consider the following system of linear equations:

$$(U_t^* A_i U_t) \bullet \Delta = 0, \quad i = 1, \ldots, m. \tag{3}$$
Since $m = d_F(p+1) - 1$, for any fixed $t \in [0, 1]$ the above linear equations must have at least one nontrivial solution, which can be found in polynomial time by Gaussian elimination. Take such a matrix solution, normalize it, and denote it by $\Delta_t$, with $\|\Delta_t\|_F = 1$. In particular, $\Delta_0 = \Lambda / \|\Lambda\|_F$, and

$$\Delta_1 = \frac{1}{\sqrt{\sum_{j=1}^{p} \beta_j^2 + 1}} \begin{bmatrix} \beta_1 & & & \\ & \ddots & & \\ & & \beta_p & \\ & & & 1 \end{bmatrix}.$$
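A nontrivial solution of a homogeneous system such as (3) can be computed numerically in several ways. The following is a minimal sketch for the real symmetric case only; it uses a singular value decomposition in place of the Gaussian elimination mentioned in the text, and the helper names `sym_basis` and `null_direction` are illustrative assumptions, not part of the paper.

```python
import numpy as np

def sym_basis(r):
    """Basis of the real symmetric r x r matrices (dimension r(r+1)/2)."""
    basis = []
    for a in range(r):
        for b in range(a, r):
            E = np.zeros((r, r))
            E[a, b] = E[b, a] = 1.0
            basis.append(E)
    return basis

def null_direction(U, A_list):
    """Return a symmetric Delta with ||Delta||_F = 1 and
    (U^T A_i U) . Delta = 0 for every A_i in A_list.
    Assumes len(A_list) < r(r+1)/2, so a nontrivial solution exists."""
    r = U.shape[1]
    basis = sym_basis(r)
    # Row i holds the inner products of U^T A_i U with each basis matrix.
    M = np.array([[np.sum((U.T @ A @ U) * E) for E in basis] for A in A_list])
    # With full_matrices=True the last row of Vt lies in the null space of M.
    _, _, Vt = np.linalg.svd(M)
    Delta = sum(c * E for c, E in zip(Vt[-1], basis))
    return Delta / np.linalg.norm(Delta, "fro")
```

The complex Hermitian case would need a basis of Hermitian matrices over the reals and is not covered by this sketch.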
We now apply bisection on the interval $[0, 1]$, according to the positive (semi)definiteness of the normalized solution $\Delta_t$ of (3). By resetting $\Delta_t := -\Delta_t$ if $\Delta_t$ is negative semidefinite, we may assume that $\Delta_t$ is either positive semidefinite or indefinite. Let $p_k$ and $i_k$ be two sequences defined as follows, where 'p' stands for the positive definiteness of $\Delta_{p_k}$, 'i' stands for the indefiniteness of $\Delta_{i_k}$, and $k$ is the iteration counter. Initially, $p_0 = 0$ and $i_0 = 1$. Consider now $k = 1$. Take $t = (p_{k-1} + i_{k-1})/2$. If $\Delta_t$ is positive semidefinite with $\operatorname{rank}(\Delta_t) \le p$, stop: we have found a nontrivial solution with rank less than $p + 1$ in the set $\mathcal{L} \cap \mathcal{S}^n_{F+}$. If $\Delta_t$ is positive definite, then let $p_k := t$ and $i_k := i_{k-1}$. Otherwise, if $\Delta_t$ is indefinite, then let $p_k := p_{k-1}$ and $i_k := t$. After this iteration, increase the iteration counter to $k := k + 1$, and continue the procedure.
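The bisection rule just described can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `delta_of_t` is a hypothetical callable returning a real symmetric solution of (3) for a given $t$, and the tolerance `tol` used to decide definiteness in floating point is also an assumption.

```python
import numpy as np

def classify(Delta, tol=1e-10):
    """Flip the sign of a negative semidefinite Delta, then label it
    'pd', 'psd_singular' (PSD but rank deficient) or 'indefinite'."""
    w = np.linalg.eigvalsh(Delta)          # eigenvalues in ascending order
    if w[-1] <= tol:                       # negative semidefinite: Delta := -Delta
        Delta, w = -Delta, -w[::-1]
    if w[0] > tol:
        return Delta, "pd"
    if w[0] >= -tol:
        return Delta, "psd_singular"
    return Delta, "indefinite"

def bisect(delta_of_t, iters):
    """Run the bisection on [0, 1]; return the final pair (p_k, i_k)."""
    p_k, i_k = 0.0, 1.0
    for _ in range(iters):
        t = (p_k + i_k) / 2
        _, kind = classify(delta_of_t(t))
        if kind == "psd_singular":         # low rank PSD solution found: stop
            return t, t
        if kind == "pd":
            p_k = t
        else:
            i_k = t
    return p_k, i_k
```

For instance, with the toy family `delta_of_t = lambda t: np.diag([1.0, 0.3 - t])`, the interval returned after ten iterations has length $2^{-10}$ and brackets $t = 0.3$, where definiteness is lost.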
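Clearly, $i_k - p_k = 2^{-k}$, $k = 0, 1, 2, \ldots$. At each iteration $k$, we shall also keep track of a solution, to be denoted by $X_k$, in the following way. First, let $D_k := (\Delta_{p_k} + \tau_k \Delta_{i_k}) / \|\Delta_{p_k} + \tau_k \Delta_{i_k}\|_F$, where

$$\tau_k = \begin{cases} \operatorname{argmax} \{\tau > 0 \mid \Delta_{p_k} + \tau \Delta_{i_k} \succeq 0\}, & \text{if } \Delta_{p_k} \bullet \Delta_{i_k} \ge 0; \\ \operatorname{argmax} \{\tau > 0 \mid \Delta_{p_k} - \tau \Delta_{i_k} \succeq 0\}, & \text{if } \Delta_{p_k} \bullet \Delta_{i_k} < 0, \end{cases}$$

and finally let

$$X_k := U_{p_k} D_k U_{p_k}^*. \tag{4}$$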
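The step length $\tau_k$ has a closed form when $\Delta_{p_k}$ is positive definite: writing $\Delta_{p_k} = L L^T$, the matrix $\Delta_{p_k} + \tau \Delta_{i_k}$ is positive semidefinite exactly when $I + \tau L^{-1} \Delta_{i_k} L^{-T}$ is, so the largest admissible $\tau$ is $-1/\lambda_{\min}(L^{-1} \Delta_{i_k} L^{-T})$. The sketch below illustrates this for real symmetric inputs; the function names `max_step` and `normalized_combination` are assumptions made for illustration.

```python
import numpy as np

def max_step(P, N):
    """Largest tau > 0 with P + tau * N PSD, for P positive definite and
    N having at least one negative direction."""
    L = np.linalg.cholesky(P)              # P = L L^T
    Linv = np.linalg.inv(L)
    M = Linv @ N @ Linv.T                  # P + tau*N = L (I + tau*M) L^T
    lam_min = np.linalg.eigvalsh((M + M.T) / 2)[0]
    if lam_min >= 0:
        raise ValueError("N has no negative direction; the step is unbounded")
    return -1.0 / lam_min

def normalized_combination(P, N):
    """Return (D, tau) with D = (P + tau*N) / ||P + tau*N||_F, after
    renaming -N to N if needed so that P . N >= 0."""
    if np.sum(P * N) < 0:
        N = -N
    tau = max_step(P, N)
    S = P + tau * N
    return S / np.linalg.norm(S, "fro"), tau
```

With `P = np.eye(2)` and `N = np.diag([1.0, -2.0]) / np.sqrt(5.0)`, the sign of `N` is flipped, the step is $\sqrt{5}$, and the normalized result is $\operatorname{diag}(0, 1)$, which has lost one rank as intended.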
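Since $\Delta_{p_k}$ is positive definite and $\Delta_{i_k}$ is indefinite in our algorithmic procedure, $\tau_k$ is always finite, and $\Delta_{p_k} + \tau_k \Delta_{i_k}$ can never be zero either. Moreover, by this construction, we have $\operatorname{rank}(X_k) \le p$. By renaming $-\Delta_{i_k}$ to $\Delta_{i_k}$ if necessary, we assume from now on that $\Delta_{p_k} \bullet \Delta_{i_k} \ge 0$ always holds. Since $\|(1 - t) u_{p+1} + t u_{p+2}\|^2 = (1 - t)^2 + t^2 \in [0.5, 1]$ for all $t \in [0, 1]$, and

$$U_{p_k}^* U_{p_k} = \begin{bmatrix} I_p & 0_p \\ 0_p^T & \|(1 - p_k) u_{p+1} + p_k u_{p+2}\|^2 \end{bmatrix},$$

we have $0.5\, I_{p+1} \preceq U_{p_k}^* U_{p_k} \preceq I_{p+1}$, and $\sqrt{p} \le \|U_{p_k}^* U_{p_k}\|_F \le \sqrt{p + 1}$. Thus,

$$\|X_k\|_F^2 = \operatorname{tr}(U_{p_k} D_k U_{p_k}^* U_{p_k} D_k U_{p_k}^*) \ge \frac{1}{2} \operatorname{tr}(D_k^2 U_{p_k}^* U_{p_k}) \ge \frac{1}{4} \|D_k\|_F^2 = 0.25,$$

and also $\|X_k\|_F \le 1$, for every iteration counter $k$. The matrix solution $X_k$ satisfies $0.25 \le \|X_k\|_F \le 1$ and $\operatorname{rank}(X_k) \le p$ for all $k$. However, $X_k$ may no longer lie exactly in the subspace $\mathcal{L}$. In the remaining part of the analysis, we shall bound this error. Before proceeding, we first note the following estimate:

$$\|\Delta_{p_k} + \tau_k \Delta_{i_k}\|_F^2 \ge \|\Delta_{p_k}\|_F^2 + \tau_k^2 \|\Delta_{i_k}\|_F^2 + 2 \tau_k \Delta_{p_k} \bullet \Delta_{i_k} \ge 1 + \tau_k^2. \tag{5}$$

Let $\hat{U} := [\underbrace{0_n, \ldots, 0_n}_{p}, u_{p+1} - u_{p+2}] \in F^{n \times (p+1)}$. We have $\|\hat{U}\|_F = \sqrt{2}$, and

$$U_{p_k} = U_{i_k} + (i_k - p_k) \hat{U} = U_{i_k} + \hat{U} / 2^k. \tag{6}$$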
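For any $1 \le j \le m$ and $k \ge 1$, we have

$$
\begin{aligned}
|A_j \bullet X_k| &= |A_j \bullet (U_{p_k} D_k U_{p_k}^*)| \\
&= \frac{1}{\|\Delta_{p_k} + \tau_k \Delta_{i_k}\|_F} \, |A_j \bullet (U_{p_k} (\Delta_{p_k} + \tau_k \Delta_{i_k}) U_{p_k}^*)| \\
&= \frac{\tau_k}{\|\Delta_{p_k} + \tau_k \Delta_{i_k}\|_F} \, |A_j \bullet (U_{p_k} \Delta_{i_k} U_{p_k}^*)| \\
&\le \frac{\tau_k}{\sqrt{1 + \tau_k^2}} \, |A_j \bullet (U_{p_k} \Delta_{i_k} U_{p_k}^*)| && \text{[here we use (5)]} \\
&\le |A_j \bullet (U_{p_k} \Delta_{i_k} U_{p_k}^*)| \\
&= \left| A_j \bullet \left( (U_{i_k} + \hat{U}/2^k) \Delta_{i_k} (U_{i_k} + \hat{U}/2^k)^* \right) \right| && \text{[here we use (6)]} \\
&= \left| A_j \bullet \left( (U_{i_k} \Delta_{i_k} \hat{U}^* + \hat{U} \Delta_{i_k} U_{i_k}^*)/2^k + \hat{U} \Delta_{i_k} \hat{U}^*/4^k \right) \right| \\
&\le \|A_j\|_F \left( 2 \|U_{i_k} \Delta_{i_k}\|_F \cdot \|\hat{U}\|_F / 2^k + \|\hat{U}\|_F^2 \cdot \|\Delta_{i_k}\|_F / 4^k \right) \\
&\le \frac{4 \|A_j\|_F}{2^k}.
\end{aligned}
$$

Therefore, for any given precision $\epsilon > 0$, if $k \ge \log_2 \left( \frac{4 \max_{1 \le j \le m} \|A_j\|_F}{\epsilon} \right)$, then $|A_j \bullet X_k| \le \epsilon$ for all $1 \le j \le m$. Using the well-known error bound of Hoffman for polyhedral systems (which include linear subspaces as a special case; see [9] or [13]), it follows that $\operatorname{dist}(X_k, \mathcal{L}) \le O(\epsilon)$. Using a general error bound result for LMI systems (see Theorem 7.4.2 in Luo and Sturm [12]), it follows that $\lim_{k \to \infty} \operatorname{dist}(X_k, \mathcal{L} \cap \mathcal{S}^n_{F+}) = 0$.
□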
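As a small worked example of the iteration bound, the sketch below computes the number of bisection steps that the estimate $|A_j \bullet X_k| \le 4 \|A_j\|_F / 2^k$ requires for a target precision. It assumes the logarithm is taken to base 2, which is what the factor $2^{-k}$ calls for; the function name `iterations_needed` is hypothetical.

```python
import math

def iterations_needed(frobenius_norms, eps):
    """Smallest k >= 1 with 4 * max_j ||A_j||_F / 2**k <= eps."""
    bound = 4.0 * max(frobenius_norms) / eps
    return max(1, math.ceil(math.log2(bound)))

# Two constraint matrices with Frobenius norms 1.0 and 2.5, precision 1e-6.
k = iterations_needed([1.0, 2.5], 1e-6)
```

Here $4 \cdot 2.5 / 10^{-6} = 10^7$ and $\log_2 10^7 \approx 23.25$, so 24 bisection steps suffice.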
The following example is essentially due to Bohnenblust [6], which can be used to show that the bound in Theorem 2.4 is tight without any additional conditions.
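Example 2.5 Let $1 \le p \le n - 2$ and consider

$$\left\{ X = \begin{bmatrix} \lambda I_{p+1} & X_{12} \\ X_{21} & X_{22} \end{bmatrix} \in \mathcal{S}^n_F \;\middle|\; \lambda \in \mathbb{R} \right\}$$

[…] $0$. Consider the subspace

$$\mathcal{L} = \left\{ X \in \mathcal{S}^n_F \;\middle|\; \left( A_i - \frac{b_i}{b_m} A_m \right) \bullet X = 0, \ i = 1, 2, \ldots, m - 1 \right\}.$$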
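Of course, $\operatorname{codim}(\mathcal{L}) = m - 1 \le d_F(p+1) - 1$. Using Theorem 2.4, we can get $X(\epsilon) \succeq 0$ in polynomial time, such that: (a) $\operatorname{rank}(X(\epsilon)) \le p$; (b) $0.25 \le \|X(\epsilon)\|_F \le 1$; (c) $\operatorname{dist}(X(\epsilon), \mathcal{L}) \le \epsilon$.

We shall see that in this case $\liminf_{\epsilon \to 0} A_m \bullet X(\epsilon) > 0$. We prove this by contradiction. If there is a subsequence $\epsilon_k$ such that $\lim_{\epsilon_k \to 0} A_m \bullet X(\epsilon_k) = 0$ and $\lim_{\epsilon_k \to 0} X(\epsilon_k) = \hat{X}$, then $A_i \bullet \hat{X} = 0$, $i = 1, \ldots, m$, and $\hat{X} \succeq 0$, contradicting the fact that $\mathcal{A} \cap \mathcal{S}^n_{F+}$ is bounded. Similarly, if there is a subsequence $\epsilon_k$ with $\lim_{\epsilon_k \to 0} A_m \bullet X(\epsilon_k) < 0$ and $\lim_{\epsilon_k \to 0} X(\epsilon_k) = \hat{X}$, then $\hat{X} := -\frac{b_m}{A_m \bullet \hat{X}} \hat{X}$ satisfies $\hat{X} \succeq 0$ and $A_i \bullet \hat{X} = -b_i$, $i = 1, \ldots, m$. Taking any $X \in \mathcal{A} \cap \mathcal{S}^n_{F+}$ and letting $\Delta := X + \hat{X}$, we have $A_i \bullet \Delta = 0$, $i = 1, \ldots, m$. This again contradicts the fact that $\mathcal{A} \cap \mathcal{S}^n_{F+}$ is bounded.

Now that $\liminf_{\epsilon \to 0} A_m \bullet X(\epsilon) > 0$, let us define $X(\epsilon) := \frac{b_m}{A_m \bullet X(\epsilon)} X(\epsilon)$, which is the desired matrix solution, satisfying: 1. $X(\epsilon) \in \mathcal{S}^n_{F+}$; 2. $\limsup_{\epsilon \to 0} \|X(\epsilon)\|_F < \infty$; 3. $\operatorname{rank}(X(\epsilon)) \le p$; 4. $\operatorname{dist}(X(\epsilon), \mathcal{A}) = O(\epsilon)$. Using the error bound result for LMI systems (Theorem 7.4.2 in Luo and Sturm [12]), we have $\lim_{\epsilon \to 0} \operatorname{dist}(X(\epsilon), \mathcal{A} \cap \mathcal{S}^n_{F+}) = 0$. If, additionally, $\mathcal{A} \cap \mathcal{S}^n_{F++} \ne \emptyset$, then by the error bound result in Zhang [20] (Theorem 2.5), we have $\operatorname{dist}(X(\epsilon), \mathcal{A} \cap \mathcal{S}^n_{F+}) = O(\epsilon)$.
□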
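Similar to Example 2.5, the bound in Theorem 2.6 cannot be improved in general.

Example 2.7 Let $1 \le p \le n - 2$ and consider

$$\mathcal{A} = \left\{ X = \begin{bmatrix} I_{p+1} & X_{12} \\ X_{21} & X_{22} \end{bmatrix} \in \mathcal{S}^n_F \;\middle|\; \operatorname{tr} X_{22} = 1 \right\}.$$

One can compute that $\operatorname{codim}(\mathcal{A}) = d_F(p+1) + 1$ in this case. It is clear that $\mathcal{A}$ does not contain any nontrivial positive semidefinite matrix with rank less than $p + 1$, while it does contain a nontrivial positive semidefinite matrix with rank equal to $p + 1$:

$$\begin{bmatrix} I_{p+1} & \frac{1}{\sqrt{p+1}} 1_{p+1} & 0_{(p+1) \times (n-p-2)} \\ \frac{1}{\sqrt{p+1}} 1_{p+1}^T & 1 & 0 \\ 0_{(n-p-2) \times (p+1)} & 0 & 0_{(n-p-2) \times (n-p-2)} \end{bmatrix}.$$

Similarly, consider

$$\mathcal{A} = \left\{ X = \begin{bmatrix} I_{p+1} & X_{12} \\ X_{21} & X_{22} \end{bmatrix} \in \mathcal{S}^n_F \right\}.$$

Then $\operatorname{codim}(\mathcal{A}) = d_F(p+1)$. However, it does not contain any positive semidefinite matrix with rank less than $p + 1$. This shows that the boundedness assumption on $\mathcal{A} \cap \mathcal{S}^n_{F+}$ in Theorem 2.6 cannot be removed in general.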
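The properties claimed for the explicit matrix in Example 2.7 are easy to confirm numerically. The following sketch builds that matrix in the real case and leaves the checks to the caller; the function name `example_2_7_matrix` and the choice $n = 6$, $p = 2$ are illustrative assumptions.

```python
import numpy as np

def example_2_7_matrix(n, p):
    """Rank-(p+1) positive semidefinite matrix from Example 2.7 (real case)."""
    X = np.zeros((n, n))
    X[:p + 1, :p + 1] = np.eye(p + 1)          # leading block I_{p+1}
    v = np.ones(p + 1) / np.sqrt(p + 1)        # (1/sqrt(p+1)) * all-ones vector
    X[:p + 1, p + 1] = v
    X[p + 1, :p + 1] = v
    X[p + 1, p + 1] = 1.0                      # makes tr(X22) = 1
    return X

n, p = 6, 2
X = example_2_7_matrix(n, p)
rank = np.linalg.matrix_rank(X, tol=1e-10)
min_eig = np.linalg.eigvalsh(X)[0]
trace_X22 = np.trace(X[p + 1:, p + 1:])
```

The Schur complement of the leading block is $1 - \frac{1}{p+1} 1_{p+1}^T 1_{p+1} = 0$, which is why the rank stays at $p + 1$ rather than $p + 2$.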
Using the same technique, the following analogous results in the context of Linear Matrix Inequalities (replacing the linear subspace by a polyhedral cone) can be shown similarly.

Corollary 2.8 Let $\mathcal{A} = \{X \in \mathcal{S}^n_F \mid A_i \bullet X \trianglelefteq_i 0, \ i = 1, \ldots, m\}$, where $\trianglelefteq_i \in \{\le, \ge, =\}$, $i = 1, \ldots, m$. Suppose that $\dim(\mathcal{A} \cap \mathcal{S}^n_{F+}) \ge 1$, $m \le d_F(p+1) - 1$, and $1 \le p \le n - 2$. Then, one can find in polynomial time (in $n$ and $\log(1/\epsilon)$) a matrix $X$ such that $\|X\|_F = 1$, $X \in \mathcal{S}^n_{F+}$, $\operatorname{rank}(X) \le p$, and $\operatorname{dist}(X, \mathcal{A}) \le \epsilon$. In particular, this implies that there is a nontrivial $X \in \mathcal{A} \cap \mathcal{S}^n_{F+}$ with $\operatorname{rank}(X) \le p$.

Proof. Follow exactly the same steps as in the proof of Theorem 2.4, except that the direction-finding equations are now expanded sequentially:

$$(U^* A_i U) \bullet \Delta = 0, \quad \text{for } i \text{ with } A_i \bullet X = 0, \tag{7}$$

where $X = U U^*$. By letting $X(t) := U (I + t \Delta) U^*$ and increasing $t$, we either arrive at a reduction of the rank of $X$, or add a new index to the direction-finding equations (7). In the worst case, all the equations are added, and then the situation is the same as in Theorem 2.4.
□
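The basic move $X(t) = U (I + t \Delta) U^*$ used in this proof can be sketched as follows for the real symmetric case. The sketch covers only the equality-constrained step, in which $t$ is increased until $I + t \Delta$ becomes singular; it does not track inequalities that become active along the way. The function name `reduce_rank_once` and the tolerance `tol` are assumptions made for illustration, and $\Delta$ is assumed to be nonzero.

```python
import numpy as np

def reduce_rank_once(U, Delta, tol=1e-10):
    """Given X = U U^T and a nonzero symmetric Delta, return U_new such that
    U_new U_new^T = U (I + t*Delta) U^T, with t > 0 the largest step keeping
    I + t*Delta PSD. The new factor has fewer columns than U."""
    w = np.linalg.eigvalsh(Delta)              # ascending eigenvalues
    if w[0] >= 0:                              # Delta is PSD: use -Delta instead
        Delta, w = -Delta, -w[::-1]
    t = -1.0 / w[0]                            # I + t*Delta first becomes singular here
    mu, V = np.linalg.eigh(np.eye(len(w)) + t * Delta)
    keep = mu > tol                            # drop the (numerically) zero eigenvalues
    return U @ (V[:, keep] * np.sqrt(mu[keep]))
```

Whenever $(U^T A_i U) \bullet \Delta = 0$, the value $A_i \bullet X$ is unchanged by the step, because $A_i \bullet U (I + t \Delta) U^T = A_i \bullet X + t \, (U^T A_i U) \bullet \Delta$.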
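Corollary 2.9 Let $\mathcal{A} = \{X \in \mathcal{S}^n_F \mid A_i \bullet X \trianglelefteq_i b_i, \ i = 1, \ldots, m\}$, where $\trianglelefteq_i \in \{\le, \ge, =\}$, $i = 1, \ldots, m$. Suppose that $\mathcal{A} \cap \mathcal{S}^n_{F+}$ is nonempty and bounded. Moreover, suppose that $m \le d_F(p+1)$ and $1 \le p \le n - 2$. Then, one can find in polynomial time (in $n$ and $\log(1/\epsilon)$) a matrix $X$ such that $\|X\|_F = 1$, $X \in \mathcal{S}^n_{F+}$, $\operatorname{rank}(X) \le p$, and $\operatorname{dist}(X, \mathcal{A}) \le \epsilon$. In particular, this implies that there is a nontrivial $X \in \mathcal{A} \cap \mathcal{S}^n_{F+}$ with $\operatorname{rank}(X) \le p$.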
3 Connections and extensions
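Consider a given Euclidean space $\mathcal{V}$. For a convex cone $\mathcal{K} \subseteq \mathcal{V}$, its dual cone is denoted by $\mathcal{K}^* := \{y \in \mathcal{V} \mid x^T y \ge 0 \text{ for all } x \in \mathcal{K}\}$. We first note the following useful fact regarding convex cones and their dual objects.

Lemma 3.1 Let $\mathcal{U}, \mathcal{K} \subseteq \mathcal{V}$ be two nonempty closed convex cones. Suppose that $\mathcal{K}$ is pointed, i.e., $\mathcal{K} \cap (-\mathcal{K}) = \{0\}$. Then, $\dim(\mathcal{U} \cap \mathcal{K}) = 0$ if and only if $(-\mathcal{U}^*) \cap \operatorname{int}(\mathcal{K}^*) \ne \emptyset$.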
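Proof. 'If' part. Let $0 \ne y \in (-\mathcal{U}^*) \cap \operatorname{int}(\mathcal{K}^*)$. Then, for any $0 \ne x \in \mathcal{K}$ it follows that $x^T y > 0$. Thus, if there is any $0 \ne x \in \mathcal{U} \cap \mathcal{K}$, then since $y \in -\mathcal{U}^*$ it follows that $x^T y \le 0$, yielding a contradiction. Thus, we must have $\mathcal{U} \cap \mathcal{K} = \{0\}$.

'Only if' part. If $\mathcal{U} \cap \mathcal{K} = \{0\}$, then by the duality relation we have

$$\mathcal{V} = \{0\}^* = (\mathcal{U} \cap \mathcal{K})^* = \operatorname{cl}(\mathcal{U}^* + \mathcal{K}^*). \tag{8}$$

Suppose by contradiction that $(-\mathcal{U}^*) \cap \operatorname{int}(\mathcal{K}^*) = \emptyset$. Since $\mathcal{K}$ is pointed, $\operatorname{int}(\mathcal{K}^*)$ is nonempty ($-\mathcal{U}^*$ is obviously nonempty), and so we can apply the separation theorem to conclude that there exists $0 \ne d \in \mathcal{V}$ such that $d^T(-y) \le 0$ for all $-y \in -\mathcal{U}^*$ (i.e., $d^T y \ge 0$ for all $y \in \mathcal{U}^*$), and $d^T z \ge 0$ for all $z \in \operatorname{int}(\mathcal{K}^*)$. Now, by (8), since $-d \in \mathcal{V}$, there should exist sequences $y_i \in \mathcal{U}^*$ and $z_i \in \mathcal{K}^*$, $i = 1, 2, \ldots$, such that

$$-d = \lim_{i \to \infty} (y_i + z_i).$$

This leads to the following contradiction:

$$0 > -\|d\|^2 = \lim_{i \to \infty} d^T (y_i + z_i) \ge 0.$$

The lemma is thus proven.
□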
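Now let us take $\mathcal{V} = \mathcal{S}^n_F$, $\mathcal{K} = \mathcal{S}^n_{F+}$, $\mathcal{U} = \mathcal{L} = \{X \in \mathcal{S}^n_F \mid A_i \bullet X = 0, \ i = 1, \ldots, m\}$. It follows that $\mathcal{K}^* = \mathcal{K} = \mathcal{S}^n_{F+}$, and $-\mathcal{U}^* = \mathcal{U}^* = \mathcal{L}^\perp = \{Z \in \mathcal{S}^n_F \mid Z = \sum_{i=1}^{m} y_i A_i$, $y \in$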