MATHEMATICS OF OPERATIONS RESEARCH Vol. 34, No. 4, November 2009, pp. 1008–1022 issn 0364-765X eissn 1526-5471 09 3404 1008
doi 10.1287/moor.1090.0419 © 2009 INFORMS
A Low-Dimensional Semidefinite Relaxation for the Quadratic Assignment Problem

Yichuan Ding
Department of Management Science and Engineering, School of Engineering, Stanford University, Stanford, California 94305, [email protected]

Henry Wolkowicz
Department of Combinatorics and Optimization, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada, [email protected], http://orion.uwaterloo.ca/~hwolkowi

The quadratic assignment problem (QAP) is arguably one of the hardest NP-hard discrete optimization problems. Problems of dimension greater than 25 are still considered to be large scale. Current successful solution techniques use branch-and-bound methods, which rely on obtaining strong and inexpensive bounds. In this paper, we introduce a new semidefinite programming (SDP) relaxation for generating bounds for the QAP in the trace formulation. We apply majorization to obtain a relaxation of the orthogonal similarity set of the quadratic part of the objective function. This exploits the matrix structure of QAP and results in a relaxation with much smaller dimension than other current SDP relaxations. We compare the resulting bounds with several other computationally inexpensive bounds, such as the convex quadratic programming relaxation (QPB). We find that our method provides stronger bounds on average and is adaptable for branch-and-bound methods.

Key words: quadratic assignment problem; semidefinite programming relaxations; interior point methods; large-scale problems
MSC2000 subject classification: Primary: 90C26, 90C22; secondary: 90C09, 65K10
OR/MS subject classification: Primary: programming; secondary: quadratic
History: Received October 27, 2006; revised April 22, 2008, June 2, 2009, and August 10, 2009. Published online in Articles in Advance October 20, 2009.
1. Introduction. In this paper, we introduce a new efficient bound for the quadratic assignment problem (QAP). We use the Koopmans-Beckmann trace formulation

(QAP)    $\mu^*_{QAP} = \min_{X \in \Pi} \operatorname{trace}(AXBX^T + CX^T),$

where $A$, $B$, and $C$ are $n \times n$ real matrices and $\Pi$ denotes the set of $n \times n$ permutation matrices. Throughout this paper, we assume the symmetric case, i.e., that both $A$ and $B$ are symmetric matrices.

The QAP is considered to be one of the hardest NP-hard problems to solve in practice. Many important combinatorial optimization problems can be formulated as a QAP; examples include the traveling salesman problem, VLSI design, keyboard design, and the graph-partitioning problem. The QAP is well described by the problem of allocating a set of $n$ facilities to a set of $n$ locations while minimizing the quadratic objective arising from the distance between the locations in combination with the flow between the facilities. Recent surveys include Pardalos and Wolkowicz [29], Wolkowicz [34], Zhao et al. [36], Pardalos et al. [30], Karisch et al. [20], Hadley et al. [13, 14, 15], and Rendl and Wolkowicz [33].

Solving QAP to optimality usually requires a branch-and-bound (B&B) method. Essential for these methods are strong, inexpensive bounds at each node of the branching tree. In this paper, we study a new bound obtained from a semidefinite programming (SDP) relaxation. This relaxation uses only $O(n^2)$ variables and $O(n^2)$ constraints but yields a bound provably better than the so-called projected eigenvalue bound (PB) (Hadley et al. [14]), and is competitive with the recently introduced quadratic programming bound (QPB) (Anstreicher and Brixius [2]).

1.1. Outline. In §1.2, we continue with preliminary results and notation. In §1.3, we review some of the known bounds in the literature. Our main results appear in §2, where we compare relaxations that use a vector lifting of the matrix $X$ into the space of $n^2 \times n^2$ matrices with a matrix lifting that remains in $\mathcal{S}^n$, the space of $n \times n$ symmetric matrices. We then parameterize and characterize the orthogonal similarity set of $B$, $\mathcal{O}(B)$, using majorization results on the eigenvalues of $B$ (see Theorem 2.1). This results in three SDP relaxations, MSDR1 to MSDR3 (see §2.2). We conclude with numerical tests in §3.
Ding and Wolkowicz: Low-Dimensional SDP Relaxation for QAP
1009
Mathematics of Operations Research 34(4), pp. 1008–1022, © 2009 INFORMS
1.2. Notation and preliminaries. For two real $m \times n$ matrices $A, B \in \mathcal{M}^{m,n}$, $\langle A, B\rangle = \operatorname{trace} A^TB$ is the trace inner product. $\mathcal{M}^{n,n} = \mathcal{M}^n$ denotes the set of $n \times n$ square real matrices, and $\mathcal{S}^n$ denotes the space of $n \times n$ symmetric matrices. $\mathcal{S}^n_+$ (resp. $\mathcal{S}^n_{++}$) denotes the cone of positive semidefinite (resp. positive definite) matrices in $\mathcal{S}^n$. We let $A \succeq B$ (resp. $A \succ B$) denote the Löwner partial order $A - B \in \mathcal{S}^n_+$ (resp. $A - B \in \mathcal{S}^n_{++}$). The linear transformation $\operatorname{diag} M$ denotes the vector formed from the diagonal of the matrix $M$; the adjoint linear transformation is $\operatorname{diag}^* v = \operatorname{Diag} v$, i.e., the diagonal matrix formed from the vector $v$. We use $A \otimes B$ to denote the Kronecker product of $A$ and $B$, and $x = \operatorname{vec}(X)$ to denote the vector in $\mathbb{R}^{n^2}$ obtained from the columns of $X$. Then (see, e.g., Horn and Johnson [19]),

$\operatorname{trace} AXBX^T = \langle AXB, X\rangle = \langle \operatorname{vec}(AXB), x\rangle = x^T(B \otimes A)x.$    (1)
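Identity (1) is easy to check numerically. The following sketch (ours, not from the paper) verifies it for random symmetric data with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)); A = (A + A.T) / 2  # symmetric, as assumed in the paper
B = rng.standard_normal((n, n)); B = (B + B.T) / 2
X = rng.standard_normal((n, n))

x = X.flatten(order="F")           # x = vec(X), columnwise stacking

lhs = np.trace(A @ X @ B @ X.T)    # trace(A X B X^T)
rhs = x @ np.kron(B, A) @ x        # x^T (B kron A) x

assert np.isclose(lhs, rhs)
```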
We let $\mathcal{N}$ denote the cone of nonnegative (elementwise) matrices, $\mathcal{N} = \{X \in \mathcal{M}^n : X \geq 0\}$. $\mathcal{E}$ denotes the set of matrices with row and column sums one, $\mathcal{E} = \{X \in \mathcal{M}^n : Xe = X^Te = e\}$, where $e$ is the vector of ones. $E = ee^T$ is the matrix of ones, and $\mathcal{D}$ denotes the set of doubly stochastic matrices, $\mathcal{D} = \mathcal{E} \cap \mathcal{N}$. The minimal product of two vectors is

$\langle x, y\rangle_- = \min_{\pi} \sum_{i=1}^n x_i y_{\pi(i)},$

where the minimum is over all permutations $\pi$ of the indices $1, 2, \ldots, n$. Similarly, we define the maximal product of $x, y$: $\langle x, y\rangle_+ = \max_{\pi} \sum_{i=1}^n x_i y_{\pi(i)}$. We denote the vector of eigenvalues of a matrix $A$ by $\lambda(A)$.

Definition 1.1. Let $x, y \in \mathbb{R}^n$. By abuse of notation, we denote that $x$ majorizes $y$, or $y$ is majorized by $x$, with $x \succeq y$ or $y \preceq x$. Let the components of both vectors be sorted in nonincreasing order, i.e., $x_1 \geq x_2 \geq \cdots \geq x_n$, $y_1 \geq y_2 \geq \cdots \geq y_n$. Following, e.g., Marshall and Olkin [23], $x \succeq y$ if and only if

$\sum_{i=1}^p x_i \geq \sum_{i=1}^p y_i, \quad p = 1, 2, \ldots, n-1, \qquad \sum_{i=1}^n x_i = \sum_{i=1}^n y_i.$

In Marshall and Olkin [23], it is shown that $x \succeq y$ if and only if there exists $S \in \mathcal{D}$ with $Sx = y$. Note that, for fixed $y$, the constraint $x \succeq y$ is not a convex constraint; however, $x \preceq y$ is a convex constraint, and it has an equivalent LP formulation (e.g., Hardy et al. [18]).

1.3. Known relaxations for QAP. One of the earliest and least expensive relaxations for QAP is the Gilmore-Lawler bound (GLB), which is based on a linear programming (LP) formulation (see, e.g., Gilmore [12], Drezner [9]). Related dual-based LP bounds, such as the Karisch-Çela-Clausen-Espersen bound (KCCEB), are discussed in Karisch et al. [21], Pardalos et al. [31], Drezner [9], and Hahn and Grant [16]. These formulations are currently able to handle problems of moderate size $n$ (approximately 20) (Gilmore [12], Lawler [22]). Formulations based on nonlinear optimization include eigenvalue and parametric eigenvalue bounds (EB) (Finke et al. [11], Rendl and Wolkowicz [33]), projected eigenvalue bounds (PB) (Hadley et al. [14], Falkner et al. [10]), convex quadratic programming (QPB) bounds (Anstreicher and Brixius [2]), and SDP bounds (Rendl and Sotirov [32], Zhao et al. [36]). For recent numerical results that use these bounds, see, e.g., Anstreicher and Brixius [2], Rendl and Sotirov [32]. A summary and comparison of many of these bounds is given in Anstreicher [1].

Note that $\Pi = \mathcal{O} \cap \mathcal{E} \cap \mathcal{N}$, i.e., the addition of the orthogonal constraints changes the doubly stochastic matrices into permutation matrices. This illustrates the power of nonlinear quadratic constraints for QAP. Using the quadratic constraints, we can see that SDP arises naturally from Lagrangian relaxation (see, e.g., Nesterov et al. [27]). Alternatively, one can lift the problem using the positive semidefinite matrix $\binom{1}{\operatorname{vec}(X)}\binom{1}{\operatorname{vec}(X)}^T$ into the symmetric matrix space $\mathcal{S}^{n^2+1}$. One then obtains deep cuts for the convex hull of the lifted permutation matrices.
However, this vector-lifting SDP relaxation requires $O(n^4)$ variables and, hence, is expensive to use. Problems with $n > 25$ become impractical for branch-and-bound methods. It has been proved in Anstreicher and Wolkowicz [3] that strong (Lagrangian) duality holds for the following quadratic program with orthogonal constraints:

$\mu^*_{EB} = \min_{XX^T = X^TX = I} \operatorname{trace} AXBX^T.$
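By the classical result of Finke et al. [11], the optimal value of this orthogonally constrained program is the minimal product of the spectra, $\langle \lambda(A), \lambda(B)\rangle_-$; the sketch below (ours, not from the paper) checks numerically that this value lower-bounds the QAP objective at every permutation:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
B = rng.standard_normal((n, n)); B = (B + B.T) / 2

lam_A = np.sort(np.linalg.eigvalsh(A))   # ascending
lam_B = np.sort(np.linalg.eigvalsh(B))
EB = lam_A @ lam_B[::-1]                 # <lambda(A), lambda(B)>_-

# EB lower-bounds trace(A X B X^T) for every permutation X
for _ in range(200):
    P = np.eye(n)[rng.permutation(n)]
    assert np.trace(A @ P @ B @ P.T) >= EB - 1e-8
```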
The optimal value $\mu^*_{EB}$ yields the so-called eigenvalue bound, denoted EB. The Lagrangian dual is

$\mu^*_{EB} = \max_{S, T \in \mathcal{S}^n}\; \min_{x \in \mathbb{R}^{n^2}} \operatorname{trace} S + \operatorname{trace} T + x^T(B \otimes A - I \otimes S - T \otimes I)x.$    (2)

The inner minimization problem results in the hidden semidefinite constraint

$B \otimes A - I \otimes S - T \otimes I \succeq 0.$

Under this constraint, the inner minimization is attained at $x = 0$. As a result of strong duality, the equivalent dual program

$\mu^*_{EB} = \max_{S, T \in \mathcal{S}^n} \{\operatorname{trace} S + \operatorname{trace} T : B \otimes A - I \otimes S - T \otimes I \succeq 0\}$    (3)
has the same value as the primal program, i.e., both yield the eigenvalue bound EB. One can then add the constant row and column sum linear constraints $Xe = X^Te = e$ to obtain the projected eigenvalue bound PB in Hadley et al. [14]. In Anstreicher and Brixius [2], the authors strengthen PB to get a (parametric) convex quadratic programming bound (QPB). This new bound is inexpensive to compute and, under some mild assumptions, is strictly stronger than PB. QPB is a highly competitive bound if we take into account the trade-off between the quality of the bound and the cost of computing it. The use of QPB along with the Condor high-throughput computing system has resulted in the first solutions of several large QAP problems from the QAPLIB library (Burkard et al. [7], Anstreicher and Brixius [2], Anstreicher et al. [4]).

In this paper, we propose a new relaxation for QAP with complexity comparable to that of QPB. Moreover, our numerical tests show that this new relaxation usually obtains better bounds than QPB when applied to problem instances from the QAPLIB library.

2. SDP relaxation and quadratic matrix programming.

2.1. Vector-lifting SDP relaxation (VSDR). Consider the following quadratically constrained quadratic program:

(QCQP)    $\mu^*_{QCQP} = \min \; x^TQ_0x + c_0^Tx + \beta_0$
          s.t. $x^TQ_jx + c_j^Tx + \beta_j \leq 0, \quad j = 1, \ldots, m,$
          $x \in \mathbb{R}^n,$

where, for all $j$, we have $Q_j \in \mathcal{S}^n$, $c_j \in \mathbb{R}^n$, and $\beta_j \in \mathbb{R}$. To find approximate solutions to QCQP, one can homogenize the quadratic functions to get the equivalent quadratic forms $q_j(x, x_0) = x^TQ_jx + c_j^Txx_0 + \beta_jx_0^2$, along with the additional constraint $x_0^2 = 1$. The homogenized forms can be linearized using the vector $\binom{x_0}{x} \in \mathbb{R}^{n+1}$, i.e.,

$q_j(x, x_0) = \begin{pmatrix} x_0 & x^T \end{pmatrix}\begin{pmatrix} \beta_j & \tfrac{1}{2}c_j^T \\ \tfrac{1}{2}c_j & Q_j \end{pmatrix}\begin{pmatrix} x_0 \\ x \end{pmatrix} = \operatorname{trace}\begin{pmatrix} \beta_j & \tfrac{1}{2}c_j^T \\ \tfrac{1}{2}c_j & Q_j \end{pmatrix}\begin{pmatrix} 1 & x^T \\ x & Y \end{pmatrix},$    (4)

where $Y$ represents $xx^T$ and the constraint $Y = xx^T$ is relaxed to $xx^T \preceq Y$. Equivalently, we can use the Schur complement and get the lifted linear constraint

$Z = \begin{pmatrix} 1 & x^T \\ x & Y \end{pmatrix} \succeq 0,$    (5)

i.e., we can identify $y = x$. The objective function is now linear:

$\operatorname{trace}\begin{pmatrix} \beta_0 & \tfrac{1}{2}c_0^T \\ \tfrac{1}{2}c_0 & Q_0 \end{pmatrix}Z,$
and the constraints in QCQP are relaxed to linear inequality constraints:

$\operatorname{trace}\begin{pmatrix} \beta_j & \tfrac{1}{2}c_j^T \\ \tfrac{1}{2}c_j & Q_j \end{pmatrix}Z \leq 0, \quad j = 1, \ldots, m.$
In this paper, we call this a vector-lifting semidefinite relaxation (VSDR); note that the unknown variable $Z \in \mathcal{S}^{n+1}$.

2.2. Matrix-lifting SDP relaxation (MSDR). Consider QCQP with matrix variables:

(MQCQP)    $\mu^*_{MQCQP} = \min \; \operatorname{trace}(X^TQ_0X + C_0X^T) + \beta_0$
           s.t. $\operatorname{trace}(X^TQ_jX + C_jX^T) + \beta_j \leq 0, \quad j = 1, \ldots, m,$
           $X \in \mathcal{M}^{n,r}.$

Let $x = \operatorname{vec}(X)$, $c = \operatorname{vec}(C)$, let $\delta_{ij}$ denote the Kronecker delta, and let $E_{ij} = e_ie_j^T \in \mathcal{M}^n$ be the zero matrix except for a one in the $(i,j)$ position. Note that if $r = n$, then the orthogonality constraint $XX^T = I$ is equivalent to $x^T(I \otimes E_{ij})x = \delta_{ij}$, $\forall\, i, j$; similarly, $X^TX = I$ is equivalent to $x^T(E_{ij} \otimes I)x = \delta_{ij}$, $\forall\, i, j$. Using both of the redundant constraints $XX^T = I$ and $X^TX = I$ strengthens the SDP relaxation (see Anstreicher and Wolkowicz [3]). We can now rewrite QAP using the Kronecker product and see that it is a special case of MQCQP with linear and quadratic equality constraints and with nonnegativity constraints (recall that $\Pi = \mathcal{O} \cap \mathcal{E} \cap \mathcal{N}$):

$\mu^*_{QAP} = \min \; x^T(B \otimes A)x + c^Tx$
s.t. $x^T(I \otimes E_{ij})x = \delta_{ij} \quad \forall\, i, j,$
     $x^T(E_{ij} \otimes I)x = \delta_{ij} \quad \forall\, i, j,$    (6)
     $Xe = X^Te = e, \quad x \geq 0.$
Note that in the case of QAP we have $r = n$, and $x = \operatorname{vec}(X)$ from (6) is in $\mathbb{R}^{n^2}$. Relaxing the quadratic objective function and the quadratic orthogonality constraints results in the linearized/lifted constraint (5), and we end up with $Z = \begin{pmatrix} 1 & x^T \\ x & Y \end{pmatrix} \in \mathcal{S}^{n^2+1}$, a prohibitively large matrix. However, we can use a different approach and exploit the structure of the problem. We can replace the constraint $Y = xx^T$ with the constraint $Y = XX^T$ and then relax it to $Y \succeq XX^T$. This is equivalent to the linear semidefinite constraint $\begin{pmatrix} I & X^T \\ X & Y \end{pmatrix} \succeq 0$. The size of this constraint is significantly smaller. We call this a matrix-lifting semidefinite relaxation and denote it MSDR. The relaxation for MQCQP with $X \in \mathcal{M}^{n,r}$ is

(MSDR)    $\mu^*_{MSDR} = \min \; \operatorname{trace}(Q_0Y + C_0X^T) + \beta_0$
          s.t. $\operatorname{trace}(Q_jY + C_jX^T) + \beta_j \leq 0, \quad j = 1, \ldots, m,$
          $\begin{pmatrix} I & X^T \\ X & Y \end{pmatrix} \succeq 0, \quad X \in \mathcal{M}^{n,r}, \quad Y \in \mathcal{S}^n.$

If $r \leq n$ and the Slater constraint qualification holds, then MSDR solves MQCQP, $\mu^*_{MQCQP} = \mu^*_{MSDR}$ (see Beck [5], Beck and Teboulle [6]). However, the bound from MSDR is not tight in general.
To apply this to the QAP formulation in (6), we first reformulate it as an MQCQP by removing $B$ from the objective using the constraint $R = XB$:

$\mu^*_{QAP} = \min \; \operatorname{trace}\begin{pmatrix} X \\ R \end{pmatrix}^T\begin{pmatrix} 0 & \tfrac{1}{2}A \\ \tfrac{1}{2}A & 0 \end{pmatrix}\begin{pmatrix} X \\ R \end{pmatrix} + \operatorname{trace} CX^T$
s.t. $R = XB,$    (7)
     $XX^T - I = X^TX - I = 0,$
     $Xe = X^Te = e,$
     $X \in \mathcal{M}^n, \quad X \geq 0.$

To linearize the objective function, we use

$\operatorname{trace}\begin{pmatrix} X \\ R \end{pmatrix}^T\begin{pmatrix} 0 & \tfrac{1}{2}A \\ \tfrac{1}{2}A & 0 \end{pmatrix}\begin{pmatrix} X \\ R \end{pmatrix} = \operatorname{trace}\begin{pmatrix} 0 & \tfrac{1}{2}A \\ \tfrac{1}{2}A & 0 \end{pmatrix}\begin{pmatrix} XX^T & XR^T \\ RX^T & RR^T \end{pmatrix}$    (8)

and the lifting

$\begin{pmatrix} XX^T & XR^T \\ RX^T & RR^T \end{pmatrix} = \begin{pmatrix} I & Y \\ Y & Z \end{pmatrix}.$

This defines the symmetric matrices $Y, Z \in \mathcal{S}^n$, where we see $Y = RX^T = XBX^T \in \mathcal{S}^n$. We can then relax this to get the convex quadratic constraint

$G(X, R, Y, Z) = \begin{pmatrix} I & Y \\ Y & Z \end{pmatrix} - \begin{pmatrix} XX^T & XR^T \\ RX^T & RR^T \end{pmatrix} \succeq 0.$    (9)

A Schur complement argument shows that the convex quadratic constraint (9) is equivalent to the linear conic constraint¹

$\begin{pmatrix} I & X^T & R^T \\ X & I & Y \\ R & Y & Z \end{pmatrix} \succeq 0.$    (10)

The above discussion yields the MSDR relaxation for QAP:

(MSDR0)    $\mu^*_{QAP} \geq \min \; \operatorname{trace} AY + \operatorname{trace} CX^T$
           s.t. $R = XB, \quad Xe = X^Te = e,$
           $\begin{pmatrix} I & X^T & R^T \\ X & I & Y \\ R & Y & Z \end{pmatrix} \succeq 0, \quad X \geq 0,$    (11)
           $X, R \in \mathcal{M}^n, \quad Y, Z \in \mathcal{S}^n,$

where $Y$ represents or approximates $RX^T = XBX^T$ and $Z$ represents or approximates $RR^T = XB^2X^T$. Because $X$ is a permutation matrix, we conclude that the diagonal of $Y$ is the $X$ permutation of the diagonal of $B$ (and, similarly, for the diagonals of $Z$ and $B^2$):

$\operatorname{diag} Y = X\operatorname{diag} B, \qquad \operatorname{diag} Z = X\operatorname{diag} B^2.$    (12)
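The block identity (8), the roles of $Y$ and $Z$, and the diagonal constraints (12) can be sanity-checked numerically; the sketch below (ours, not from the paper) does so at a random permutation matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
B = rng.standard_normal((n, n)); B = (B + B.T) / 2

X = np.eye(n)[rng.permutation(n)]    # a permutation matrix
R = X @ B                            # R = XB
Y = X @ B @ X.T                      # Y = R X^T = X B X^T
Z = X @ B @ B @ X.T                  # Z = R R^T = X B^2 X^T

# blocks of [X; R][X; R]^T
assert np.allclose(X @ X.T, np.eye(n))   # XX^T = I
assert np.allclose(R @ X.T, Y)           # RX^T = Y
assert np.allclose(R @ R.T, Z)           # RR^T = Z

# the lifted objective trace(A Y) equals the original trace(A X B X^T)
assert np.allclose(np.trace(A @ Y), np.trace(A @ X @ B @ X.T))

# diagonal constraints (12)
assert np.allclose(np.diag(Y), X @ np.diag(B))
assert np.allclose(np.diag(Z), X @ np.diag(B @ B))
```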
¹Note that the linearized conic constraint is not onto, which suggests that it is more ill-conditioned than the convex quadratic constraint. Empirical tests in Ding et al. [8] confirm this.
Also, given that $Xe = X^Te = e$ and $Y = XBX^T$, $Z = XB^2X^T$ for all $X, Y, Z$ feasible for the original QAP, we conclude that

$Ye = XBe, \qquad Ze = XB^2e.$

We may add these additional constraints to the above MSDR; they essentially replace the orthogonality constraints. We get the first version of our SDP relaxation:

(MSDR1)    $\mu^*_{MSDR_1} = \min \; \operatorname{trace} AY + \operatorname{trace} CX^T$
           s.t. $Xe = X^Te = e,$
           $\operatorname{diag} Y = X\operatorname{diag} B, \quad \operatorname{diag} Z = X\operatorname{diag} B^2,$
           $Ye = XBe, \quad Ze = XB^2e,$
           $\begin{pmatrix} I & X^T & (XB)^T \\ X & I & Y \\ XB & Y & Z \end{pmatrix} \succeq 0, \quad X \geq 0,$
           $X \in \mathcal{M}^n, \quad Y, Z \in \mathcal{S}^n.$

Proposition 2.1. Let $B$ be nonsingular. In addition, suppose that $(X, Y, Z)$ solves MSDR1 and satisfies $Z = XB^2X^T$. Then, $X$ is optimal for QAP.

Proof. Via the Schur complement, we know that the semidefinite constraint in MSDR1 is equivalent to

$\begin{pmatrix} I - XX^T & Y - XBX^T \\ Y - XBX^T & Z - XB^2X^T \end{pmatrix} \succeq 0.$    (13)

Therefore, $XX^T \preceq I$ and $X^TX \preceq I$. Moreover, $X$ satisfies $Xe = X^Te = e$, $X \geq 0$. Now, multiplying both sides of $\operatorname{diag} Z = X\operatorname{diag} B^2$ from the left by $e^T$ yields $\operatorname{trace} Z = \operatorname{trace} B^2$. Because $Z = XB^2X^T$, we conclude that $\operatorname{trace} XB^2X^T = \operatorname{trace} B^2$, i.e., $\operatorname{trace} B^2(I - X^TX) = 0$. Because $B$ is nonsingular, we have $B^2 \succ 0$; therefore, $I - X^TX \succeq 0$ implies that $I = X^TX$. Thus, the optimizer $X$ is orthogonal and doubly stochastic ($X \in \mathcal{O} \cap \mathcal{D}$); hence, $X$ is a permutation matrix. Moreover, (13) and $Z - XB^2X^T = 0$ imply that the off-diagonal block $Y - XBX^T = 0$. Thus, we conclude that the bound $\mu^*_{MSDR_1}$ from MSDR1 is tight. □

Remark 2.1. The assumption that $B$ is nonsingular is made without loss of generality because we can shift $B$ by a small positive multiple of the identity matrix, say $\epsilon I$, while simultaneously subtracting $\epsilon\operatorname{trace} A$, i.e.,

$\operatorname{trace}(AXBX^T + CX^T) = \operatorname{trace}(AX(B + \epsilon I)X^T - \epsilon AXX^T + CX^T) = \operatorname{trace}(AX(B + \epsilon I)X^T + CX^T) - \epsilon\operatorname{trace} A.$

2.2.1. The orthogonal similarity set of B. In this section, we include additional constraints in order to strengthen MSDR1. Using the majorization order given in Definition 1.1, we now characterize the convex hull of the orthogonal similarity set of $B$, denoted $\operatorname{conv}\mathcal{O}(B)$.

Theorem 2.1.
Let

$S_1 = \operatorname{conv}\mathcal{O}(B) = \operatorname{conv}\{Y \in \mathcal{S}^n : Y = XBX^T, \; X \in \mathcal{O}\},$
$S_2 = \{Y \in \mathcal{S}^n : \operatorname{trace}\bar{A}Y \geq \langle \lambda(\bar{A}), \lambda(B)\rangle_- \;\; \forall\, \bar{A} \in \mathcal{S}^n\},$
$S_3 = \{Y \in \mathcal{S}^n : \operatorname{diag}(X^TYX) \preceq \lambda(B) \;\; \forall\, X \in \mathcal{O}\},$    (14)
$S_4 = \{Y \in \mathcal{S}^n : \lambda(Y) \preceq \lambda(B)\}.$

Then, $S_1$ is the convex hull of the orthogonal similarity set of $B$, and $S_1 = S_2 = S_3 = S_4$.
Proof. (i) $S_1 \subseteq S_2$: Let $Y \in S_1$, $\bar{A} \in \mathcal{S}^n$. Then,

$\langle \bar{A}, Y\rangle \geq \min_{Y \in \operatorname{conv}\mathcal{O}(B)} \langle \bar{A}, Y\rangle = \min_{X \in \mathcal{O}} \operatorname{trace}\bar{A}XBX^T = \langle \lambda(\bar{A}), \lambda(B)\rangle_-$

by the well-known minimal inner-product result (e.g., Rendl and Wolkowicz [33], Finke et al. [11]).

(ii) $S_2 \subseteq S_3$: Let $U \in \mathcal{O}$, $p \in \{1, 2, \ldots, n-1\}$, and let $\sigma_p$ denote the index set corresponding to the $p$ smallest entries of $\operatorname{diag}(U^TYU)$. Define the support vector $\phi^p \in \mathbb{R}^n$ of $\sigma_p$ by

$\phi^p_i = \begin{cases} 1 & \text{if } i \in \sigma_p, \\ 0 & \text{otherwise.} \end{cases}$

Then, for $A_p = U\operatorname{Diag}(\phi^p)U^T$, we get

$\langle \phi^p, \operatorname{diag}(U^TYU)\rangle = \langle \operatorname{Diag}(\phi^p), U^TYU\rangle = \langle U\operatorname{Diag}(\phi^p)U^T, Y\rangle = \langle A_p, Y\rangle \geq \langle \phi^p, \lambda(B)\rangle_-$

by the definition of $S_2$ (note that $\lambda(A_p)$ is a permutation of $\phi^p$). Because choosing $\bar{A} = \pm I$ implies $\operatorname{trace} Y = \operatorname{trace} B$, the inclusion follows.

(iii) $S_3 \subseteq S_4$: Let $Y \in S_3$ and let $Y = V\operatorname{Diag}(\lambda(Y))V^T$, $V \in \mathcal{O}$, be its spectral decomposition. Because $U \in \mathcal{O}$ implies that $\operatorname{diag}(U^TYU) \preceq \lambda(B)$, we may take $U = V$ and deduce

$\lambda(Y) = \operatorname{diag}(V^TYV) \preceq \lambda(B).$

(iv) $S_4 \subseteq S_1$: To obtain a contradiction, suppose $\lambda(\hat{Y}) \preceq \lambda(B)$ but $\hat{Y} \notin \operatorname{conv}\mathcal{O}(B)$. Because $\mathcal{O}$ is a compact set, we conclude that the continuous image $\mathcal{O}(B) = \{Y : Y = XBX^T, \; X \in \mathcal{O}\}$ is compact; hence, its convex hull $\operatorname{conv}\mathcal{O}(B)$ is compact as well. Therefore, a standard hyperplane separation argument implies that there exists $\bar{A} \in \mathcal{S}^n$ such that

$\langle \bar{A}, \hat{Y}\rangle < \min_{Y \in \operatorname{conv}\mathcal{O}(B)} \langle \bar{A}, Y\rangle = \min_{Y \in \mathcal{O}(B)} \langle \bar{A}, Y\rangle = \langle \lambda(\bar{A}), \lambda(B)\rangle_-.$

As a result,

$\langle \lambda(\bar{A}), \lambda(\hat{Y})\rangle_- \leq \langle \bar{A}, \hat{Y}\rangle < \langle \lambda(\bar{A}), \lambda(B)\rangle_-.$

Without loss of generality, suppose that the eigenvalues $\lambda(\cdot)$ are in nondecreasing order. Then, the above minimum product inequality can be written as

$\sum_{i=1}^n \lambda_i(\bar{A})\lambda_{n-i+1}(\hat{Y}) < \sum_{i=1}^n \lambda_i(\bar{A})\lambda_{n-i+1}(B),$

which implies

$0 > \sum_{i=1}^n \lambda_i(\bar{A})\big(\lambda_{n-i+1}(\hat{Y}) - \lambda_{n-i+1}(B)\big).$

Because $\lambda_i(\bar{A}) = \sum_{j=1}^{i-1}\big(\lambda_{j+1}(\bar{A}) - \lambda_j(\bar{A})\big) + \lambda_1(\bar{A})$, we can rewrite the above inequality as

$0 > \sum_{i=1}^n \Big(\sum_{j=1}^{i-1}\big(\lambda_{j+1}(\bar{A}) - \lambda_j(\bar{A})\big) + \lambda_1(\bar{A})\Big)\big(\lambda_{n-i+1}(\hat{Y}) - \lambda_{n-i+1}(B)\big)$
$\quad = \sum_{j=1}^{n-1}\big(\lambda_{j+1}(\bar{A}) - \lambda_j(\bar{A})\big)\sum_{i=j+1}^n\big(\lambda_{n-i+1}(\hat{Y}) - \lambda_{n-i+1}(B)\big) + \lambda_1(\bar{A})\sum_{i=1}^n\big(\lambda_i(\hat{Y}) - \lambda_i(B)\big).$

Notice that $\lambda(\hat{Y}) \preceq \lambda(B)$ implies $e^T\lambda(\hat{Y}) = e^T\lambda(B)$, so $\lambda_1(\bar{A})\sum_{i=1}^n\big(\lambda_i(\hat{Y}) - \lambda_i(B)\big) = 0$. Thus, we have the following inequality:

$0 > \sum_{j=1}^{n-1}\big(\lambda_{j+1}(\bar{A}) - \lambda_j(\bar{A})\big)\sum_{i=j+1}^n\big(\lambda_{n-i+1}(\hat{Y}) - \lambda_{n-i+1}(B)\big).$    (15)
However, by assumption $\lambda_{j+1}(\bar{A}) \geq \lambda_j(\bar{A})$, and by the definition of $\lambda(\hat{Y})$ majorized by $\lambda(B)$,

$\sum_{i=j+1}^n \lambda_{n-i+1}(\hat{Y}) = \sum_{t=1}^{n-j} \lambda_t(\hat{Y}) \geq \sum_{t=1}^{n-j} \lambda_t(B) = \sum_{i=j+1}^n \lambda_{n-i+1}(B),$

which contradicts (15). □

Remark 2.2. Based on our Theorem 2.1,² Xia [35] recognized that the equivalent sets $S_1$ to $S_4$ in (14) admit a semidefinite formulation, i.e.,

$S_1 = S_5 = \Big\{Y \in \mathcal{S}^n : Y = \sum_{i=1}^n \lambda_i(B)Y_i, \; \sum_{i=1}^n Y_i = I_n, \; \operatorname{trace} Y_i = 1, \; Y_i \succeq 0, \; i = 1, \ldots, n\Big\}.$

Xia [35] then proposed an orthogonal bound, denoted OB2, from the optimal value of the SDP

$\mu^*_{OB2} = \min_{X \geq 0, \; Xe = X^Te = e, \; Y \in S_5} \operatorname{trace}(AY + CX^T).$
Note that this orthogonal bound OB2 can be applied to the projected version PQAP (given in §2.2.3), and it is then provably stronger than the convex quadratic programming bound QPB. We failed to recognize this point in our initial work. Instead, motivated by Theorem 2.1, we now propose an inexpensive bound that is stronger than QPB for most of the problem instances we tested.

2.2.2. Strengthened MSDR bound. Suppose that $A = U_A\operatorname{Diag}(\lambda(A))U_A^T$ denotes the orthogonal diagonalization of $A$ with the vector of eigenvalues $\lambda(A)$ in nonincreasing order. We assume that the vector of eigenvalues $\lambda(B)$ is in nondecreasing order. Let

$\phi^p = (\underbrace{1, 1, \ldots, 1}_{p}, 0, 0, \ldots, 0)^T, \quad p = 1, 2, \ldots, n-1.$

We add the following cuts to MSDR1:

$\langle \phi^p, \operatorname{diag}(U_A^TYU_A)\rangle \geq \langle \phi^p, \lambda(B)\rangle, \quad p = 1, 2, \ldots, n-1.$    (16)

These are valid cuts because $\langle \phi^p, \operatorname{diag}(U_A^TYU_A)\rangle \geq \langle \phi^p, \operatorname{diag}(U_A^TYU_A)\rangle_- \geq \langle \phi^p, \lambda(B)\rangle_-$ for $Y \in S_1$, by part (ii) of the proof of Theorem 2.1, and $\langle \phi^p, \lambda(B)\rangle_- = \langle \phi^p, \lambda(B)\rangle$ because $\lambda(B)$ is in nondecreasing order. Hence, we get the following relaxation:

(MSDR2)    $\mu^*_{MSDR_2} = \min \; \langle A, Y\rangle + \langle C, X\rangle$
           s.t. $Xe = X^Te = e,$
           $\operatorname{diag} Y = X\operatorname{diag} B, \quad \operatorname{diag} Z = X\operatorname{diag} B^2,$
           $Ye = XBe, \quad Ze = XB^2e,$
           $\langle \phi^p, \operatorname{diag}(U_A^TYU_A)\rangle \geq \langle \phi^p, \lambda(B)\rangle, \quad p = 1, 2, \ldots, n-1,$    (17)
           $\begin{pmatrix} I & X^T & (XB)^T \\ X & I & Y \\ XB & Y & Z \end{pmatrix} \succeq 0, \quad X \geq 0,$
           $X \in \mathcal{M}^n, \quad Y, Z \in \mathcal{S}^n.$

The cuts (16) approximate the majorization constraint $\operatorname{diag}(U_A^TYU_A) \preceq \lambda(B)$ and yield a comparison between the bounds MSDR2 and EB.

²Xia [35] references our Theorem 2.1 from an earlier version of our paper.
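The validity of the cuts (16) at feasible points of the original QAP can be checked numerically; the sketch below (ours, not from the paper) verifies them for $Y = XBX^T$ at random permutations:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
B = rng.standard_normal((n, n)); B = (B + B.T) / 2

# U_A: eigenvectors of A ordered so that lambda(A) is nonincreasing
w, U = np.linalg.eigh(A)                 # eigh returns ascending order
U_A = U[:, ::-1]
lam_B = np.sort(np.linalg.eigvalsh(B))   # nondecreasing

for _ in range(100):
    X = np.eye(n)[rng.permutation(n)]
    Y = X @ B @ X.T                      # feasible Y for the original QAP
    d = np.diag(U_A.T @ Y @ U_A)
    for p in range(1, n):
        # cut (16): first p entries sum to at least the p smallest eigenvalues of B
        assert d[:p].sum() >= lam_B[:p].sum() - 1e-8
```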
Lemma 2.1. The bound from MSDR2 is stronger than the eigenvalue bound EB, i.e.,

$\mu^*_{MSDR_2} \geq \langle \lambda(A), \lambda(B)\rangle_- + \min_{Xe = X^Te = e, \; X \geq 0} \langle C, X\rangle.$

Proof. It is enough to show that the first terms on both sides of the inequality satisfy

$\langle A, Y\rangle \geq \langle \lambda(A), \lambda(B)\rangle_-$

for any $Y$ feasible in MSDR2. Note that

$\langle A, Y\rangle = \langle U_A\operatorname{Diag}(\lambda(A))U_A^T, Y\rangle = \langle \lambda(A), \operatorname{diag}(U_A^TYU_A)\rangle.$

Because $\lambda(A)$ is a nonincreasing vector and $\lambda(B)$ is nondecreasing, we have $\langle \lambda(B), \lambda(A)\rangle = \langle \lambda(B), \lambda(A)\rangle_-$. Also,

$\lambda(A) = \sum_{p=1}^{n-1}\big(\lambda_p(A) - \lambda_{p+1}(A)\big)\phi^p + \lambda_n(A)e.$

Therefore, because $\operatorname{diag} Y = X\operatorname{diag} B$ and $e^TX = e^T$, we have

$\langle A, Y\rangle = \sum_{p=1}^{n-1}\big(\lambda_p(A) - \lambda_{p+1}(A)\big)\langle \phi^p, \operatorname{diag}(U_A^TYU_A)\rangle + \lambda_n(A)\langle e, \lambda(B)\rangle.$

Because $\langle \phi^p, \operatorname{diag}(U_A^TYU_A)\rangle \geq \langle \phi^p, \lambda(B)\rangle$ holds for any feasible $Y$, we have

$\langle A, Y\rangle \geq \sum_{p=1}^{n-1}\big(\lambda_p(A) - \lambda_{p+1}(A)\big)\langle \phi^p, \lambda(B)\rangle + \lambda_n(A)\langle e, \lambda(B)\rangle$
$\quad = \sum_{i=1}^n \Big(\sum_{p=i}^{n-1}\big(\lambda_p(A) - \lambda_{p+1}(A)\big) + \lambda_n(A)\Big)\lambda_i(B)$
$\quad = \sum_{i=1}^n \lambda_i(B)\lambda_i(A)$
$\quad = \langle \lambda(B), \lambda(A)\rangle_-. \quad \square$
2.2.3. Projected bound. The row and column sum equality constraints of QAP, $\mathcal{E} = \{X \in \mathcal{M}^n : Xe = X^Te = e\}$, can be eliminated using a nullspace method. (In the following proposition, $\mathcal{O}$ refers to the orthogonal matrices of appropriate dimension.)

Proposition 2.2 (Hadley et al. [14]). Let $V \in \mathcal{M}^{n,n-1}$ have full column rank and satisfy $V^Te = 0$. Then, $X \in \mathcal{E} \cap \mathcal{O}$ if and only if

$X = \frac{1}{n}E + V\hat{X}V^T \quad \text{for some } \hat{X} \in \mathcal{O}.$

After substituting for $X$ and using $\hat{A} = V^TAV$, $\hat{B} = V^TBV$, the QAP can now be reformulated as the projected version

(PQAP)    $\min \; \operatorname{trace}\big(\hat{A}\hat{X}\hat{B}\hat{X}^T\big) + \frac{1}{n}\operatorname{trace}\big(AEBV\hat{X}^TV^T\big) + \frac{1}{n}\operatorname{trace}\big(AV\hat{X}V^TBE\big) + \frac{1}{n^2}\operatorname{trace}(AEBE) + \operatorname{trace} CX^T$
          s.t. $\hat{X}\hat{X}^T = I = \hat{X}^T\hat{X},$
          $X = \frac{1}{n}E + V\hat{X}V^T \geq 0.$

We now define $\hat{Y} = \hat{X}\hat{B}\hat{X}^T$ and $\hat{Z} = \hat{X}\hat{B}\hat{B}\hat{X}^T$, and we replace $X$ with $\frac{1}{n}E + V\hat{X}V^T$. Then, the two terms $XBX^T$ and $XBVV^TBX^T$ admit the representations

$XBX^T = V\hat{X}\hat{B}\hat{X}^TV^T + \frac{1}{n}EBV\hat{X}^TV^T + \frac{1}{n}V\hat{X}V^TBE + \frac{1}{n^2}EBE$

and

$XBVV^TBX^T = V\hat{Z}V^T + \frac{1}{n}EBVV^TBV\hat{X}^TV^T + \frac{1}{n}V\hat{X}V^TBVV^TBE + \frac{1}{n^2}EBVV^TBE,$

respectively. In MSDR2, we use $Y$ to represent/approximate $XBX^T$ and use $Z$ to represent/approximate $XBBX^T$. However, $XBBX^T$ cannot be represented in terms of $\hat{X}$ and $\hat{Y}$. Therefore, in the projected version we let $Z$ represent $XBVV^TBX^T$ instead of $XBBX^T$, and we replace the corresponding diagonal constraint with $\operatorname{diag} Z = X\operatorname{diag}(BVV^TB)$. Based on these definitions, PQAP has the following quadratic matrix programming formulation:

$\min \; \operatorname{trace} AY(\hat{X}, \hat{Y}) + \operatorname{trace} CX(\hat{X})^T$
s.t. $\operatorname{diag} Y(\hat{X}, \hat{Y}) = X(\hat{X})\operatorname{diag} B,$
     $\operatorname{diag} Z(\hat{X}, \hat{Z}) = X(\hat{X})\operatorname{diag}(BVV^TB),$
     $X(\hat{X}) = \frac{1}{n}E + V\hat{X}V^T,$
     $Y(\hat{X}, \hat{Y}) = V\hat{Y}V^T + \frac{1}{n}EBV\hat{X}^TV^T + \frac{1}{n}V\hat{X}V^TBE + \frac{1}{n^2}EBE,$
     $Z(\hat{X}, \hat{Z}) = V\hat{Z}V^T + \frac{1}{n}EBVV^TBV\hat{X}^TV^T + \frac{1}{n}V\hat{X}V^TBVV^TBE + \frac{1}{n^2}EBVV^TBE,$
     $\hat{R} = \hat{X}\hat{B},$
     $\begin{pmatrix} \hat{X} \\ \hat{R} \end{pmatrix}\begin{pmatrix} \hat{X} \\ \hat{R} \end{pmatrix}^T = \begin{pmatrix} \hat{X}\hat{X}^T & \hat{X}\hat{R}^T \\ \hat{R}\hat{X}^T & \hat{R}\hat{R}^T \end{pmatrix} = \begin{pmatrix} I & \hat{Y} \\ \hat{Y} & \hat{Z} \end{pmatrix},$
     $X(\hat{X}) \geq 0, \quad \hat{X}, \hat{R} \in \mathcal{M}^{n-1}, \quad \hat{Y}, \hat{Z} \in \mathcal{S}^{n-1}.$
We can now relax the quadratic constraint

$\begin{pmatrix} \hat{X} \\ \hat{R} \end{pmatrix}\begin{pmatrix} \hat{X} \\ \hat{R} \end{pmatrix}^T = \begin{pmatrix} I & \hat{Y} \\ \hat{Y} & \hat{Z} \end{pmatrix}$

with the convex constraint

$\begin{pmatrix} I & \hat{X}^T & \hat{R}^T \\ \hat{X} & I & \hat{Y} \\ \hat{R} & \hat{Y} & \hat{Z} \end{pmatrix} \succeq 0.$    (18)
As in MSDR2 , we now add the following cuts for Y ∈ conv X: )p diagUAT YUA ≥ )p B
p = 1 2 n − 2
T ≤ 2 A ≤ · · · ≤ n−1 A. )p follows is the spectral decomposition of A and 1 A where A = UA DiagAU A the definition in §2.2.1, i.e., )p ∈ Rn−1 )p = 0 0 0 1 1. Our final projected relaxation MSDR3 is
MSDR3
Y + C XX ∗MSDR3 = min A Y X Y = XX diagB s.t. diagY X Z = XX diagBV V T B diagZX )p diagUAT YUA ≥ )p B ≥ 0 XX T I X X I B Y X
T BT X Y
0
Z
∈ n−1 Y Z ∈ S n−1 X
p = 1 2 n − 2
where

$X(\hat{X}) = \frac{1}{n}E + V\hat{X}V^T,$
$Y(\hat{X}, \hat{Y}) = V\hat{Y}V^T + \frac{1}{n}EBV\hat{X}^TV^T + \frac{1}{n}V\hat{X}V^TBE + \frac{1}{n^2}EBE,$
$Z(\hat{X}, \hat{Z}) = V\hat{Z}V^T + \frac{1}{n}EBVV^TBV\hat{X}^TV^T + \frac{1}{n}V\hat{X}V^TBVV^TBE + \frac{1}{n^2}EBVV^TBE.$

Note that the constraints $Ye = XBe$ and $Ze = XB^2e$ are no longer needed in MSDR3. In MSDR3, all the constraints act on the lower-dimensional space obtained after the projection. The strategy of adding cuts after the projection has been used successfully in the projected eigenvalue bound PB and the quadratic programming bound QPB. For this reason, we propose MSDR3 rather than MSDR2.

Lemma 2.2. Let $\mu^*_{PB}$ denote the projected eigenvalue bound. Then, $\mu^*_{MSDR_3} \geq \mu^*_{PB}$.

Proof. Because MSDR3 has the constraints

$\langle \hat{\phi}^p, \operatorname{diag}(U_{\hat{A}}^T\hat{Y}U_{\hat{A}})\rangle \geq \langle \hat{\phi}^p, \lambda(\hat{B})\rangle, \quad p = 1, 2, \ldots, n-2,$

we need only prove that $\operatorname{trace}\hat{A}\hat{Y} \geq \langle \lambda(\hat{A}), \lambda(\hat{B})\rangle_-$ for any feasible $\hat{Y}$. This proof is the same as the proof of $\operatorname{trace} AY \geq \langle \lambda(A), \lambda(B)\rangle_-$ in Lemma 2.1. □

Remark 2.3. Every feasible solution to the original QAP satisfies $Y = XBX^T$, $X \in \Pi$. This implies that $Y$ can be obtained from a permutation of the entries of $B$. Moreover, the diagonal entries of $B$ remain on the diagonal after a permutation. Denote the off-diagonal entries of $B$ by $\operatorname{OffDiag}(B)$. We see that, for each $i, j = 1, 2, \ldots, n$, $i \neq j$, the following cuts are valid for any feasible $Y$:

$\min\{\operatorname{OffDiag}(B)\} \leq Y_{ij} \leq \max\{\operatorname{OffDiag}(B)\}.$    (19)
It is easy to verify that if the elements of $\operatorname{OffDiag}(B)$ are all equal, then QAP can be solved exactly by MSDR1, MSDR2, or MSDR3 using the constraints in (19). If $B$ is diagonally dominant, then for any permutation $X$, $Y = XBX^T$ is diagonally dominant as well. This property generates another series of cuts. Analogous results can be used to add cuts for $Z = XB^2X^T$.

3. Numerical results.

3.1. QAPLIB problems. In Table 1, we present a comparison of MSDR3 with several other bounds applied to instances from QAPLIB (Burkard et al. [7]). The first column (OPT) denotes the exact optimal value. The following columns contain the Gilmore-Lawler bound (GLB) (Gilmore [12]); the dual linear programming bound (KCCEB) (Karisch et al. [21], Hahn and Grant [16, 17]); the projected eigenvalue bound (PB) (Hadley et al. [14]); the convex quadratic programming bound (QPB) (Anstreicher and Brixius [2]); and the vector-lifting semidefinite relaxation bounds (SDR1, SDR2, and SDR3) (Zhao et al. [36]), computed by the bundle method (Rendl and Sotirov [32]). The last column is our MSDR3 bound. All output values are rounded up to the nearest integer.

For solving QAP, minimizing $\operatorname{trace} AXBX^T$ and $\operatorname{trace} BXAX^T$ are equivalent. However, in terms of the relaxation MSDR3, exchanging the roles of $A$ and $B$ results in two different formulations and bounds. In our tests, we use the maximum of the two formulations for MSDR3. When considering branching, we stay with the better formulation throughout to avoid doubling the computational work.

From Table 1, we see that the relative performance of the various bounds can vary on different instances. The average performance of the bounds can be ranked as follows:

PB < QPB < MSDR3 ≈ SDR1 < SDR2 < SDR3.

In Table 2, we present the number of variables and constraints used in each of the relaxations. Our bound MSDR3 uses only $O(n^2)$ variables and only $O(n^2)$ constraints. If we solve MSDR3 with an interior point method,
Table 1. Comparison of bounds for QAPLIB instances.

Problem    OPT        GLB        KCCEB      PB         QPB        SDR1       SDR2       SDR3       MSDR3
Esc16a     68         38         41         47         55         47         49         59         50
Esc16b     292        220        274        250        250        250        275        288        276
Esc16c     160        83         91         95         95         95         111        142        123
Esc16d     16         3          4          -19        -19        -19        -13        8          1
Esc16e     28         12         12         6          6          6          11         23         14
Esc16g     26         12         12         9          9          9          10         20         13
Esc16h     996        625        704        708        708        708        905        970        906
Esc16i     14         0          0          -25        -25        -25        -22        9          0
Esc16j     8          1          2          -6         -6         -6         -5         7          0
Had12      1652       1536       1619       1573       1592       1604       1639       1643       1595
Had14      2724       2492       2661       2609       2630       2651       2707       2715       2634
Had16      3720       3358       3553       3560       3594       3612       3675       3699       3587
Had18      5358       4776       5078       5104       5141       5174       5282       5317       5153
Had20      6922       6166       6567       6625       6674       6713       6843       6885       6681
Kra30a     88900      68360      75566      63717      68257      69736      68526      77647      72480
Kra30b     91420      69065      76235      63818      68400      70324      71429      81156      73155
Nug12      578        493        521        472        482        486        528        557        502
Nug14      1014       852        N/a        871        891        903        958        992        918
Nug15      1150       963        1033       973        994        1009       1069       1122       1016
Nug16a     1610       1314       1419       1403       1441       1461       1526       1570       1460
Nug16b     1240       1022       1082       1046       1070       1082       1136       1188       1082
Nug17      1732       1388       1498       1487       1523       1548       1619       1669       1549
Nug18      1930       1554       1656       1663       1700       1723       1798       1852       1726
Nug20      2570       2057       2173       2196       2252       2281       2380       2451       2291
Nug21      2438       1833       2008       1979       2046       2090       2244       2323       2099
Nug22      3596       2483       2834       2966       3049       3140       3372       3440       3137
Nug24      3488       2676       2857       2960       3025       3068       3217       3310       3061
Nug25      3744       2869       3064       3190       3268       3305       3438       3535       3300
Nug27      5234       3701       N/a        4493       N/a        N/a        4887       4965       4621
Nug30      6124       4539       4785       5266       5362       5413       5651       5803       5446
Rou12      235528     202272     223543     200024     205461     208685     219018     223680     207445
Rou15      354210     298548     323589     296705     303487     306833     320567     333287     303456
Rou20      725522     599948     641425     597045     607362     615549     641577     663833     609102
Scr12      31410      27858      29538      4727       8223       11117      23844      29321      18803
Scr15      51140      44737      48547      10355      12401      17046      41881      48836      39399
Scr20      110030     86766      94489      16113      23480      28535      82106      94998      50548
Tai12a     224416     195918     220804     193124     199378     203595     215241     222784     202134
Tai15a     388214     327501     351938     325019     330205     333437     349179     364761     331956
Tai17a     491812     412722     441501     408910     415576     419619     440333     451317     418356
Tai20a     703482     580674     616644     575831     584938     591994     617630     637300     587266
Tai25a     1167256    962417     1005978    956657     981870     974004     908248     1041337    970788
Tai30a     1818146    1504688    1565313    1500407    1517829    1529135    1573580    1652186    1521368
Tho30      149936     90578      99855      119254     124286     125972     134368     136059     122778
the complexity of computing the Newton direction in each iteration is $O(n^6)$, and the number of iterations of an interior point method is bounded by $O(n\ln(1/\epsilon))$ (Monteiro and Todd [26]). Therefore, the complexity of computing MSDR3 with an interior point method is $O(n^7\ln(1/\epsilon))$, where $\epsilon$ is the desired accuracy. Note that the computational complexity for the most expensive SDP formulation, SDR3, is $O(n^{14}\ln(1/\epsilon))$. Thus, MSDR3 is significantly less expensive than SDR3. Although QPB is less expensive than MSDR3 in practice, the complexity as a function of $n$ is the same.
Table 2. Complexity of relaxations.

Method        GLB       KCCEB     PB        QPB       SDR1      SDR2      SDR3      MSDR3
Variables     O(n^4)    O(n^2)    O(n^2)    O(n^2)    O(n^4)    O(n^4)    O(n^4)    O(n^2)
Constraints   O(n^2)    O(n^2)    O(n^2)    O(n^2)    O(n^2)    O(n^3)    O(n^4)    O(n^2)
Table 3. CPU time and iterations for computing MSDR3 on the Nugent problems.

Instance               Nug12   Nug15   Nug18   Nug20   Nug25   Nug27   Nug30
CPU time (s)           151     576     2039    5349    32364   52113   122060
Number of iterations   18      19      22      26      27      25      29
Table 3 lists the CPU time (in seconds) for MSDR3 on several of the Nugent instances (Nugent et al. [28]). (We used a SUN SPARC 10 and the SeDuMi³ SDP package. For a rough comparison, note that the results in Anstreicher et al. [4] were obtained on a C3000 computer and took 3.2 CPU seconds for the Nug20 instance and 9 CPU seconds for the Nug25 instance for the QPB bound.)

3.2. MSDR3 in a branch-and-bound framework. When solving general discrete optimization problems using B&B methods, one rarely has advance knowledge that helps in branching decisions. We now see that MSDR3 helps in choosing a row and/or column for branching in our B&B approach for solving QAP. If $X$ is a permutation matrix, then the entries of $\operatorname{diag} Z = X\operatorname{diag}(BVV^TB)$ are a permutation of the diagonal entries of $BVV^TB$. In fact, the converse is true under a mild assumption.

Proposition 3.1. Assume the $n$ entries of $\operatorname{diag}(BVV^TB)$ are all distinct. If $(X^*, Y^*, Z^*)$ is an optimal solution to MSDR3 that satisfies $\operatorname{diag} Z^* = P\operatorname{diag}(BVV^TB)$ for some $P \in \Pi$, then $(X^*, Y^*, Z^*)$ solves QAP exactly.

Proof. Without loss of generality, assume the entries of $b = \operatorname{diag}(BVV^TB)$ are strictly increasing, i.e., $b_1 < b_2 < \cdots < b_n$. By the feasibility of $X^*, Z^*$, we have $\operatorname{diag} Z^* = X^*b$. Also, we know $\operatorname{diag} Z^* = Pb$ for some $P \in \Pi$. Therefore, $X^*b = Pb$ holds as well. Now, assume $P_{i1} = 1$. Then, $\sum_{j=1}^n X^*_{ij}b_j = b_1$. Because $\sum_{j=1}^n X^*_{ij} = 1$ and $X^*_{ij} \geq 0$, $j = 1, 2, \ldots, n$, we conclude that $b_1$ is a convex combination of $b_1, b_2, \ldots, b_n$. However, $b_1$ is the strict minimum of $\{b_1, b_2, \ldots, b_n\}$; this implies that $X^*_{i1} = 1$. The conclusion $P = X^*$ follows by finite induction after we delete column one and row $i$ of $X^*$. □

As a consequence of Proposition 3.1, we may view the original QAP problem as determining an optimal assignment of the entries of $\operatorname{diag}(BVV^TB)$ to $\operatorname{diag} Z$, where each entry of $\operatorname{diag}(BVV^TB)$ requires a branch-and-bound process to determine its assigned position.
For entries with a large difference from the mean of diag(BVV^T B), the assignments are particularly important, because a change in their assigned positions usually leads to significant differences in the corresponding objective value. Therefore, in order to fathom more nodes early, our B&B strategy first processes those entries with large differences from the mean of diag(BVV^T B).

Branch-and-Bound Strategy 3.1. Let b = diag(BVV^T B). Branch on the ith column of X, where i corresponds to the element b_i that has the largest deviation from the mean of the elements of b. (If this strategy results in several elements close in value, then we randomly pick one of them.)

For example, Nug12 yields

diag(BVV^T B)^T = (23, 14, 14, 23, 17.67, 8.67, 8.67, 17.67, 23, 14, 14, 23).
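Branch-and-Bound Strategy 3.1 can be sketched as follows. This is a minimal illustration assuming NumPy; the function name `branching_column` is ours, and the tolerance for detecting ties is an assumed implementation detail:

```python
import numpy as np

def branching_column(b, rng=None, tol=1e-9):
    """Strategy 3.1: branch on the column whose entry of
    b = diag(B V V^T B) deviates most from the mean of b;
    break ties among near-maximal deviations randomly."""
    if rng is None:
        rng = np.random.default_rng()
    dev = np.abs(b - b.mean())
    candidates = np.flatnonzero(dev >= dev.max() - tol)
    return int(rng.choice(candidates))

# The Nug12 values quoted in the text; entries 6 and 7 tie at 8.67,
# the largest deviation from the mean 16.72.
b = np.array([23, 14, 14, 23, 17.67, 8.67, 8.67, 17.67, 23, 14, 14, 23])
i = branching_column(b)   # 5 or 6 in 0-based indexing
```

The selected column then defines the children of the current node: one child per assignment X(k, i) = 1, as in the first level of the tree reported in Table 4.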
Therefore, the sixth or seventh entry has value 8.67; this has the largest difference from the mean value 16.72. Table 4 presents the MSDR3 bounds in the first level of the branching tree for Nug12. The first and second columns of Table 4 present the results for branching on elements from the sixth column of X first. The other columns provide a comparison with branching on other columns first. On average, branching on the sixth column of X first generates tighter bounds and should lead to descendant nodes in the branch-and-bound tree that are fathomed earlier.

4. Conclusion. We have presented new bounds for QAP that are based on a matrix-lifting (rather than a vector-lifting) semidefinite relaxation. By exploiting the special doubly stochastic and orthogonality structure of the constraints, we obtained a series of cuts that further strengthen the relaxation. The resulting relaxation MSDR3 is provably stronger than the projected eigenvalue bound PB and is comparable with the SDR1 bound and the quadratic programming bound QPB in our empirical tests. Moreover, because the bound uses a matrix lifting, it requires only O(n^2) variables and O(n^2) constraints. Hence, its complexity is comparable with that of QPB.

³ Information is available at http://sedumi.ie.lehigh.edu.
Table 4. Results for the first-level branching for Nug12.

Node           Bound    Node           Bound    Node           Bound
X(1,6) = 1     523      X(1,1) = 1     508      X(1,2) = 1     512
X(2,6) = 1     528      X(2,1) = 1     509      X(2,2) = 1     513
X(3,6) = 1     520      X(3,1) = 1     507      X(3,2) = 1     508
X(4,6) = 1     517      X(4,1) = 1     515      X(4,2) = 1     510
X(5,6) = 1     537      X(5,1) = 1     512      X(5,2) = 1     519
X(6,6) = 1     529      X(6,1) = 1     517      X(6,2) = 1     513
X(7,6) = 1     507      X(7,1) = 1     516      X(7,2) = 1     507
X(8,6) = 1     519      X(8,1) = 1     524      X(8,2) = 1     513
X(9,6) = 1     522      X(9,1) = 1     524      X(9,2) = 1     514
X(10,6) = 1    527      X(10,1) = 1    514      X(10,2) = 1    513
X(11,6) = 1    506      X(11,1) = 1    527      X(11,2) = 1    510
X(12,6) = 1    504      X(12,1) = 1    510      X(12,2) = 1    516
Mean           519.9    Mean           515.3    Mean           512.3
Subsequent work has shown that our MSDR3 relaxation and bound are particularly efficient for matrices with special structure, for example, when B is a Hamming distance matrix of a hypercube or a Manhattan distance matrix of a rectangular grid (see, e.g., Mittelmann and Peng [24]). Additional new relaxations based on our work have been proposed (see, e.g., the bound OB2 in Xia [35]). Another recent application is decoding in multiple antenna systems (see Mobasher and Khandani [25]).

Acknowledgments. The authors are indebted to the associate editor and two anonymous referees for helping make numerous significant improvements to the paper.

References

[1] Anstreicher, K. M. 2003. Recent advances in the solution of quadratic assignment problems. Math. Programming Ser. B 97(1–2) 27–42.
[2] Anstreicher, K. M., N. W. Brixius. 2001. A new bound for the quadratic assignment problem based on convex quadratic programming. Math. Programming Ser. A 89(3) 341–357.
[3] Anstreicher, K. M., H. Wolkowicz. 2000. On Lagrangian relaxation of quadratic matrix constraints. SIAM J. Matrix Anal. Appl. 22(1) 41–55.
[4] Anstreicher, K. M., N. W. Brixius, J.-P. Goux, J. Linderoth. 2002. Solving large quadratic assignment problems on computational grids. Math. Programming Ser. A 91(3) 563–588.
[5] Beck, A. 2006. Quadratic matrix programming. SIAM J. Optim. 17(4) 1224–1238.
[6] Beck, A., M. Teboulle. 2000. Global optimality conditions for quadratic optimization problems with binary constraints. SIAM J. Optim. 11(1) 179–188.
[7] Burkard, R. E., S. Karisch, F. Rendl. 1991. QAPLIB—A quadratic assignment problem library. Eur. J. Oper. Res. 55 115–119. (http://www.opt.math.tu-graz.ac.at/qaplib/.)
[8] Ding, Y., N. Krislock, J. Qian, H. Wolkowicz. 2008. Sensor network localization, Euclidean distance matrix completions, and graph realization. Internat. Conf. Mobile Comput. Networking. ACM, New York, 129–134.
[9] Drezner, Z. 1995. Lower bounds based on linear programming for the quadratic assignment problem. Comput. Optim. Appl. 4(2) 159–165.
[10] Falkner, J., F. Rendl, H. Wolkowicz. 1994. A computational study of graph partitioning. Math. Programming Ser. A 66(2) 211–239.
[11] Finke, G., R. E. Burkard, F. Rendl. 1987. Quadratic assignment problems. Ann. Discrete Math. 31 61–82.
[12] Gilmore, P. C. 1962. Optimal and suboptimal algorithms for the quadratic assignment problem. SIAM J. Appl. Math. 10 305–313.
[13] Hadley, S. W., F. Rendl, H. Wolkowicz. 1990. Bounds for the quadratic assignment problems using continuous optimization. Integer Programming and Combinatorial Optimization. University of Waterloo Press, Waterloo, ON, Canada, 237–248.
[14] Hadley, S. W., F. Rendl, H. Wolkowicz. 1992. A new lower bound via projection for the quadratic assignment problem. Math. Oper. Res. 17(3) 727–739.
[15] Hadley, S. W., F. Rendl, H. Wolkowicz. 1992. Symmetrization of nonsymmetric quadratic assignment problems and the Hoffman-Wielandt inequality. Linear Algebra Appl. 167 53–64.
[16] Hahn, P., T. Grant. 1998. Lower bounds for the quadratic assignment problem based upon a dual formulation. Oper. Res. 46(6) 912–922.
[17] Hahn, P., T. Grant, N. Hall. 1998. A branch-and-bound algorithm for the quadratic assignment problem based on the Hungarian method. Eur. J. Oper. Res. 108(3) 629–640.
[18] Hardy, G. H., J. E. Littlewood, G. Pólya. 1934. Inequalities. Cambridge University Press, London and New York.
[19] Horn, R. A., C. R. Johnson. 1994. Topics in Matrix Analysis. Cambridge University Press, Cambridge, UK. (Corrected reprint of the 1991 original.)
[20] Karisch, S. E., F. Rendl, H. Wolkowicz. 1994. Trust regions and relaxations for the quadratic assignment problem. P. Pardalos, H. Wolkowicz, eds. Quadratic Assignment and Related Problems. American Mathematical Society, Providence, RI, 199–219.
[21] Karisch, S. E., E. Çela, J. Clausen, T. Espersen. 1999. A dual framework for lower bounds of the quadratic assignment problem based on linearization. Computing 63(4) 351–403.
[22] Lawler, E. 1963. The quadratic assignment problem. Management Sci. 9 586–599.
[23] Marshall, A. W., I. Olkin. 1979. Inequalities: Theory of Majorization and Its Applications. Academic Press, New York.
[24] Mittelmann, H., J. Peng. 2007. Estimating bounds for quadratic assignment problems associated with Hamming and Manhattan distance matrices based on semidefinite programming. Technical report, University of Illinois at Urbana-Champaign, Urbana, IL.
[25] Mobasher, A., A. K. Khandani. 2007. Matrix-lifting semi-definite programming for decoding in multiple antenna systems. Proc. 10th Canadian Workshop Inform. Theory (CWIT'07), Edmonton, Alberta, Canada, 136–139.
[26] Monteiro, R. D. C., M. J. Todd. 2000. Path-following methods. Handbook of Semidefinite Programming. Kluwer Academic Publishers, Boston, 267–306.
[27] Nesterov, Y. E., H. Wolkowicz, Y. Ye. 2000. Semidefinite programming relaxations of nonconvex quadratic optimization. Handbook of Semidefinite Programming, Vol. 27. International Series in Operations Research and Management Science. Kluwer Academic Publishers, Boston, 361–419.
[28] Nugent, C. E., T. E. Vollman, J. Ruml. 1968. An experimental comparison of techniques for the assignment of facilities to locations. Oper. Res. 16 150–173.
[29] Pardalos, P. M., H. Wolkowicz, eds. 1994. Quadratic Assignment and Related Problems. American Mathematical Society, Providence, RI.
[30] Pardalos, P., F. Rendl, H. Wolkowicz. 1994. The quadratic assignment problem: A survey and recent developments. P. Pardalos, H. Wolkowicz, eds. Quadratic Assignment and Related Problems. American Mathematical Society, Providence, RI, 1–42.
[31] Pardalos, P. M., K. G. Ramakrishnan, M. G. C. Resende, Y. Li. 1997. Implementation of a variance reduction-based lower bound in a branch-and-bound algorithm for the quadratic assignment problem. SIAM J. Optim. 7(1) 280–294.
[32] Rendl, F., R. Sotirov. 2007. Bounds for the quadratic assignment problem using the bundle method. Math. Programming Ser. B 109(2–3) 505–524.
[33] Rendl, F., H. Wolkowicz. 1992. Applications of parametric programming and eigenvalue maximization to the quadratic assignment problem. Math. Programming Ser. A 53(1) 63–78.
[34] Wolkowicz, H. 2000. Semidefinite programming approaches to the quadratic assignment problem. Nonlinear Assignment Problems, Vol. 7. Combinatorial Optimization. Kluwer Academic Publishers, Dordrecht, The Netherlands, 143–174.
[35] Xia, Y. 2007. New semidefinite relaxations for the quadratic assignment problem. Personal communication.
[36] Zhao, Q., S. E. Karisch, F. Rendl, H. Wolkowicz. 1998. Semidefinite programming relaxations for the quadratic assignment problem. J. Combin. Optim. 2(1) 71–109.