Semidefinite programming relaxations for graph coloring and maximal clique problems†

Igor Dukanovic‡ and Franz Rendl§

October 27, 2005
Abstract

The semidefinite programming formulation of the Lovász theta number not only gives one of the best polynomial-time computable bounds sandwiched between the clique number ω(G) and the chromatic number χ(G) of a graph, but also leads to heuristics for graph coloring and for extracting large cliques. This semidefinite programming formulation can be tightened toward either χ(G) or ω(G) by adding several types of cutting planes. We explore several such strengthenings and show that some of them can be computed with the same effort as the theta number. We also investigate computational simplifications for graphs with rich automorphism groups.
Keywords: Lovász theta number, chromatic number, clique number, cutting planes.
1 Introduction
The Lovász theta number Θ(G), introduced in [21], is sandwiched between the clique number and the chromatic number of a graph: ω(G) ≤ Θ(G) ≤ χ(G). It can be computed to arbitrary fixed precision in polynomial time by interior point methods, as it can be formulated as a semidefinite program (SDP) [21]. The positive semidefinite matrix variable in this SDP can be used either to extract large cliques [14] or to approximately color the given graph [18]. There are graphs with arbitrarily large gaps Θ(G) − ω(G) and/or χ(G) − Θ(G). Well-known examples are the triangle-free (ω(G) = 2) Mycielski graphs with arbitrarily large chromatic numbers. There has been considerable scientific effort to strengthen the original bound of Lovász toward the clique number (or, equivalently, the stability number of the complement) [25, 32, 22]. To better approximate the chromatic number, Szegedy strengthened the Lovász theta number by adding nonnegativity constraints [33]. Meurdesoif further tightened this bound by the inclusion of triangle inequalities [27]. None of these tighter models can be solved routinely; see for instance [27]. We refer to [20] for a recent survey on relaxations of max-clique and coloring.

It is the purpose of the present paper to explore efficient algorithms to compute Θ and its various strengthenings toward ω and χ. The starting points are two independent models for Θ. The standard model (15) allows us to compute Θ very efficiently on dense graphs. The second model (16) is tailored for sparse graphs. It turns out that the strengthenings of Θ toward χ can be computed with an effort comparable to computing Θ in the dense model. Likewise, the strengthenings of Θ toward ω can be computed efficiently in the sparse model. Finally, we will exploit symmetries of a given graph to
† Partial support by the EU project Algorithmic Discrete Optimization (ADONET), MRTN-CT-2003-504438, is gratefully acknowledged.
‡ Univerza v Mariboru, Ekonomsko-poslovna fakulteta, Razlagova 14, 2000 Maribor, Slovenia, [email protected]
§ Universität Klagenfurt, Institut für Mathematik, Universitätsstraße 65-67, 9020 Klagenfurt, Austria, [email protected]
simplify the computations. The key observation is that the variables corresponding to vertices in the same orbit can be assumed to have equal value. The same is true for the orbits of edges and non-edges. Similar ideas have been investigated by Gatermann and Parrilo [13], and more recently by De Klerk, Pasechnik and Schrijver [8]. We provide computational results on graphs with nontrivial automorphism groups, showing that Θ and its variations can be computed efficiently for graphs with up to 1000 vertices and several million edges.
2 Notation
The vector of all ones is denoted by e. The matrix J_n = ee^T (or simply J) is the n × n matrix of all ones. We denote by e_i the ith column of the identity matrix I of appropriate dimension, and we define E_ij = e_i e_j^T + e_j e_i^T for i ≠ j. To avoid the misleading notation E_ii = 2 e_i e_i^T we additionally define E_i = e_i e_i^T. The pointwise product of vectors x and y is the vector x ◦ y. The trace inner product ⟨X, Y⟩ = tr XY is a well-known inner product on the space S_n of real symmetric matrices. It is easy to check that the adjoint of the operator diag, which extracts the diagonal of a matrix, is the operator Diag, which maps a vector into a diagonal matrix. A ⪰ B (A ≻ B) denotes that A − B is a positive semidefinite (respectively positive definite) matrix.

A graph with vertex set V = {1, ..., n} and edge set E will be denoted by G = (V, E). We assume G to be simple (loopless, without multiple edges) and undirected. The complement graph Ḡ = (V, Ē) is the graph on the same vertex set with edges ij ∈ Ē exactly if ij = ji ∉ E, i.e. Ē = {ij : ij ∉ E}. We denote the number of edges |E| by m and the number of non-edges |Ē| by m̄. The chromatic number χ(G) is the smallest number of colors needed to (properly) color a graph. The clique number ω(G) is the size of the largest clique (set of pairwise adjacent vertices) in the graph.
3 Graph Coloring
An s-coloring of a graph G = (V, E) is traditionally defined as a mapping c : V → {1, ..., s} such that ij ∈ E ⇒ c(i) ≠ c(j). This s-coloring c defines an s-coloring relation R by iRj ⟺ c(i) = c(j). The coloring relation R partitions the vertex set V into the color classes c^{-1}(1), ..., c^{-1}(s), so it is an equivalence relation. In matrix terms, we can express an s-coloring of a graph by a matrix M ∈ {0,1}^{n×n} such that m_ij = 1 ⟺ c(i) = c(j). Clearly, for any such matrix M there exists a permutation matrix P such that

    P^T M P = J_{n_1} ⊕ J_{n_2} ⊕ ... ⊕ J_{n_s}.    (1)
In other words, P^T M P is the direct sum of s blocks of all ones, with block sizes n_i ≥ 1 and Σ_i n_i = n. We call M the matrix representation of the coloring c. Notice also that diag P^T M P = diag M = e, and m_ij = 0 whenever ij is an edge of the underlying graph.

Remark 1 Let c be a coloring and M be its matrix representation. The color classes c^{-1}(i) are given by C_i = {v : c(v) = i}. Their characteristic vectors χ_{C_i} ∈ {0,1}^n, defined by (χ_{C_i})_v = 1 if v ∈ C_i and (χ_{C_i})_v = 0 otherwise, can be used to express M as

    M = Σ_{i=1}^{s} χ_{C_i} χ_{C_i}^T.    (2)
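To make the representation (2) concrete, here is a small numpy sketch (the 5-vertex graph and its 3-coloring are hypothetical, chosen only for illustration) that assembles M from the characteristic vectors of the color classes and checks diag M = e, rank M = s and M ⪰ 0:

```python
import numpy as np

# A hypothetical 3-coloring of 5 vertices: classes {0,1}, {2,3}, {4}.
classes = [[0, 1], [2, 3], [4]]
n, s = 5, len(classes)

# Assemble M via (2): the sum of chi_C chi_C^T over the color classes.
M = np.zeros((n, n))
for C in classes:
    chi = np.zeros(n)
    chi[C] = 1.0
    M += np.outer(chi, chi)

print(np.allclose(np.diag(M), 1.0),          # diag M = e
      np.linalg.matrix_rank(M) == s,         # rank M = s
      np.linalg.eigvalsh(M).min() >= -1e-9)  # M is positive semidefinite
```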
Note in particular that the rank of M is s. We first recall the following well-known characterizations of matrices of the form given in (1).

Lemma 2 Let M be a symmetric 0-1 matrix. Then the following statements are equivalent.

a) There exists a permutation matrix P such that P^T M P is the direct sum of s blocks of all ones.
b) diag(M) = e, rank(M) = s, M ⪰ 0.
c) diag(M) = e, rank(M) = s, and M satisfies the triangle inequalities:

    m_ij + m_jk − m_ik ≤ 1    ∀(i, j, k).    (3)
Proof. Since J ⪰ 0, we obviously have a) ⇒ b). To see that b) ⇒ c), we note that a violated triangle inequality would have m_ij = m_jk = 1, m_ik = 0. The corresponding submatrix

    ( 1 1 0 )
    ( 1 1 1 )
    ( 0 1 1 )

is indefinite, contradicting M ⪰ 0. Finally, we note that the triangle inequalities imply that M represents a transitive relation, so the rows and columns of M can be permuted to form a direct sum of all-ones matrices. Since the rank of M is s, there must be s such blocks.

The previous characterizations of matrix representations of colorings involve a rank constraint on M. The following result shows that we can trade the rank constraint for a semidefiniteness constraint.

Theorem 3 Let M be a symmetric 0-1 matrix. Then M is of the form (1) if and only if

d) diag(M) = e and (tM − J ⪰ 0 ⟺ t ≥ s).
Proof. Suppose M is of the form (1). Then any two rows in tM − J corresponding to two vertices of the same color are equal. Therefore the rows of any nonsingular submatrix of tM − J correspond to (at most) s vertices of different colors, and thus the corresponding submatrix is tI_s − J_s, or a submatrix of it. But tI_s − J_s ⪰ 0 ⟺ t ≥ s. So a) above implies d). We observe in particular that sM − J ⪰ 0 in this case, and that tM − J is not positive semidefinite for t < s.

To see the other direction, we observe that the diagonal elements of sM − J ⪰ 0 are nonnegative. So s ≥ 1, and M ⪰ (1/s)J ⪰ 0. Denote r := rank(M). Since statement b) in Lemma 2 implies a), there exists a permutation matrix P such that P^T M P is a direct sum of r blocks of all ones. Now the previously established implication a) ⇒ d) shows r = min{t : tM − J ⪰ 0}. But the latter equals s, by assumption.

As a consequence we can write the chromatic number χ(G) of a graph G as the optimal value of the following SDP in bivalent variables.

Corollary 4 [27, 18]

    χ(G) = min{t : Y − J ⪰ 0, diag Y = te, Y = Y^T, y_ij = 0 ∀ij ∈ E, y_ij ∈ {0, t}}    (4)
Proof. Minimizing the rank over matrices representing proper colorings gives, in view of the previous theorem, χ(G) = min{t : tM − J ⪰ 0, diag(M) = e, m_ij = 0 ∀ij ∈ E, m_ij ∈ {0, 1}}. Substituting Y = tM gives the bivalent linear SDP (4).
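Condition d) of Theorem 3 can also be observed numerically. The following sketch (again with a hypothetical 3-coloring of 5 vertices) checks the triangle inequalities (3) and bisects for the smallest t with tM − J positive semidefinite, which recovers the number of colors s = 3:

```python
import numpy as np

# Matrix representation of a hypothetical 3-coloring of 5 vertices.
classes = [[0, 1], [2, 3], [4]]
n, s = 5, 3
M = np.zeros((n, n))
for C in classes:
    chi = np.zeros(n)
    chi[C] = 1.0
    M += np.outer(chi, chi)
J = np.ones((n, n))

def is_psd(A, tol=1e-9):
    return np.linalg.eigvalsh(A).min() >= -tol

# The triangle inequalities (3) hold for an equivalence relation ...
assert all(M[i, j] + M[j, k] - M[i, k] <= 1 + 1e-9
           for i in range(n) for j in range(n) for k in range(n))

# ... and condition d): tM - J is psd exactly for t >= s.
assert is_psd(s * M - J) and not is_psd((s - 0.01) * M - J)

# Bisection for the smallest feasible t recovers the number of colors.
lo, hi = 1.0, float(n)
for _ in range(60):
    mid = (lo + hi) / 2
    if is_psd(mid * M - J):
        hi = mid
    else:
        lo = mid
print(round(hi, 6))  # 3.0
```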
4 A hierarchy of relaxations toward χ(G)
Since solving (4) is well known to be NP-hard, several relaxations have been introduced. The initial relaxation of Lovász [21] simply drops the condition that the elements of Y are 0 or t:

    Θ(G) = min{t : Y − J ⪰ 0, diag Y = te, y_ij = 0 ∀ij ∈ E}.    (5)

(Regarding the notation for the Lovász number we also refer to Remark 8.) Szegedy [33] maintains Y ≥ 0 and obtains the stronger relaxation Θ+(G):

    Θ+(G) = min{t : Y − J ⪰ 0, diag Y = te, y_ij = 0 ∀ij ∈ E, y_ij ≥ 0 ∀ij ∈ Ē}.    (6)

Finally, Meurdesoif [27] includes the triangle inequalities (3) as well. We denote the resulting bound by Θ+∆:

    Θ+∆(G) = min{t : Y − J ⪰ 0, diag Y = te, y_ij = 0 ∀ij ∈ E,
                  y_ij ≥ 0 ∀ij ∈ Ē, y_ij + y_jk − y_ik ≤ t ∀ij, jk ∈ Ē}.    (7)

We clearly have the chain

    Θ(G) ≤ Θ+(G) ≤ Θ+∆(G) ≤ χ(G).    (8)
Remark 5 It is known that the gaps χ(G) − Θ(G) [18] and χ(G) − Θ+(G) [7] can be arbitrarily large. We are not aware of any theoretical investigations of the gap χ(G) − Θ+∆(G).

The representation of M given in (2) shows that the relaxations above could be further tightened by constraining Y to be in the cone C of completely positive matrices, with C = {Σ_i y_i y_i^T : y_i ≥ 0}. However, just checking whether a matrix is completely positive is co-NP-complete [28]. Polynomial relaxations based on copositive programming are investigated in [29, 3], but practical implementations still remain a challenge. Some preliminary results on vertex transitive graphs can be found in [12].

All relaxations of χ(G) given in (8) can be computed in polynomial time to some prescribed precision. Several methods have been proposed in the literature to compute Θ(G), ranging from interior-point methods to methods based on eigenvalue optimization. We refer to the website of Hans Mittelmann¹ for a summary of state-of-the-art software to compute Θ(G). Computing the other relaxations Θ+ and Θ+∆ is significantly more involved. We are not aware of any computational study based on these relaxations, aside from the preliminary work reported in [27]. We will show in the following that these bounds can be computed quite efficiently, with an effort comparable to computing Θ, for reasonably dense graphs.
5 The clique number and semidefinite relaxations
In this section we briefly recall how (5) can alternatively be obtained as a semidefinite relaxation of the max-clique problem [21]. Suppose S is a clique in G. Let χ_S be its characteristic vector. Then clearly (χ_S)_i (χ_S)_j = 0 ∀ij ∈ Ē, because the vertices i and j are not in the same clique if ij ∈ Ē. Using χ_S we form the following matrix:

    X = (1 / (χ_S^T χ_S)) χ_S χ_S^T.    (9)

We get |S| = ⟨X, J⟩. Moreover, X satisfies the following conditions: X ⪰ 0, tr(X) = 1, x_ij = 0 ∀ij ∈ Ē, rank(X) = 1. We leave it to the reader to verify that in fact

    ω(G) = max{⟨X, J⟩ : X ⪰ 0, tr(X) = 1, x_ij = 0 ∀ij ∈ Ē, rank(X) = 1}.

Dropping the rank constraint on X, we obtain the problem (5) in dual form:
¹ http://plato.la.asu.edu
    Θ(G) = max{⟨X, J⟩ : X ⪰ 0, tr X = 1, x_ij = 0 ∀ij ∈ Ē}.    (10)
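For a concrete instance of (9) and (10), consider the 5-cycle C5, for which Θ(C5) = √5 is the classical value from [21]. The numpy sketch below first builds the rank-one matrix (9) of a clique and then a circulant feasible point of (10) attaining √5; the particular closed form X = I/5 + A/(5φ) is our own choice of certificate, with A the adjacency matrix of C5 and φ the golden ratio:

```python
import numpy as np

n = 5
phi = (1 + np.sqrt(5)) / 2  # golden ratio; note 1/phi = phi - 1

# Edges of C5 and of its complement (the pentagram).
E  = {(min(i, (i + 1) % n), max(i, (i + 1) % n)) for i in range(n)}
Eb = {(i, j) for i in range(n) for j in range(i + 1, n) if (i, j) not in E}

# The clique S = {0, 1} yields the rank-one feasible point (9) of value |S| = 2.
chi = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
X1 = np.outer(chi, chi) / (chi @ chi)
assert np.isclose(np.trace(X1), 1.0) and np.isclose(X1.sum(), 2.0)

# A better feasible point for (10), supported on the diagonal and edges of C5.
A = np.zeros((n, n))
for i, j in E:
    A[i, j] = A[j, i] = 1.0
X = np.eye(n) / n + A / (n * phi)

assert np.isclose(np.trace(X), 1.0)
assert all(np.isclose(X[i, j], 0.0) for i, j in Eb)  # zero on non-edges
assert np.linalg.eigvalsh(X).min() >= -1e-9          # X is psd
print(round(X.sum(), 6))  # 2.236068, i.e. sqrt(5)
```

Since every matrix of the form (9) is feasible for (10), the value of any clique (here 2) is dominated by the SDP value, illustrating ω(G) ≤ Θ(G).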
Therefore ω(G) ≤ Θ(G). By adding nonnegativity constraints on X, (10) can be strengthened to Schrijver's number, see [25, 32]:

    Θ−(G) = max{⟨X, J⟩ : tr X = 1, X ⪰ 0, x_ij = 0 ∀ij ∈ Ē, x_ij ≥ 0 ∀ij ∈ E}.    (11)
Further strengthenings, see [22], lead to

    Θ−∆(G) = max{⟨X, J⟩ : tr X = 1, X ⪰ 0, x_ij = 0 ∀ij ∈ Ē, x_ij ≥ 0 ∀ij ∈ E,
                  x_ij ≤ x_ii, x_ik + x_jk ≤ x_ij + x_kk ∀i, j, k ∈ V}.    (12)
Now notice that ω(G) ≤ Θ−∆(G), since matrices of the form (9) are also feasible for (12). Since the feasible sets of (12), (11) and (10) form a chain, we get Θ−∆(G) ≤ Θ−(G) ≤ Θ(G). Thus we finally have the following inequalities:

    ω(G) ≤ Θ−∆(G) ≤ Θ−(G) ≤ Θ(G) ≤ Θ+(G) ≤ Θ+∆(G) ≤ χ(G).

Remark 6 It should be observed that model (12) is slightly weaker than N_+(FRAC(G)) introduced in [22]. Burer and Vandenbussche [4] have recently proposed a computational scheme for (approximately) optimizing over the latter. They provide computational results on graphs with up to 200 nodes.

Remark 7 It is NP-hard to decide whether χ(G) − ω(G) > 0 even when χ(G) is known [5]. As a consequence, unless P=NP, there is no polynomial-time computable upper bound p(G) on ω(G) that provably improves on Θ(G), i.e. such that for any graph G with ω(G) ≠ Θ(G) we have ω(G) ≤ p(G) < Θ(G), see [5]. Practical experience on random graphs, reported in Table 3, also supports the intuition that it is hard to considerably strengthen the Lovász theta bound.

Remark 8 The notation for the Lovász number Θ(G) comes in various flavours. In the context of the stability number, (5) is sometimes denoted by θ(G) or θ(Ḡ). Schrijver's number is sometimes denoted by θ′ or θ2, while Θ+ was originally denoted by θ1/2, see [33].

In the next proposition we use the technique of transforming an optimal solution of SDP (5) into a feasible solution of SDP (10) (and respectively (6) into (11), and (7) into (12)) to generalize the well-known inequalities Θ(G)Θ(Ḡ) ≥ n [21] and Θ+(G)Θ−(Ḡ) ≥ n [33] to the pair Θ+∆(G) and Θ−∆(Ḡ).

Proposition 9 Let G be a simple graph with n vertices. Then

    Θ(G)Θ(Ḡ) ≥ n,    Θ+(G)Θ−(Ḡ) ≥ n,    Θ+∆(G)Θ−∆(Ḡ) ≥ n.
Proof. Let the pair t* = Θ(G) and Y* be optimal for (5) for the graph G. Then

    X := (1 / (t* n)) Y*    (13)

is feasible for (10) for the complement graph Ḡ. So

    Θ(Ḡ) ≥ ⟨X, J⟩ = e^T X e = (1/(t* n)) e^T Y* e = (1/(t* n)) (e^T (Y* − J) e + e^T J e) ≥ (1/(t* n)) (0 + n²) = n/t*,

since Y* − J ⪰ 0. Now t* = Θ(G) ≥ 1 establishes Θ(G)Θ(Ḡ) ≥ n.

If Y* is optimal for (6), then (13) transforms the additional constraints y_ij ≥ 0 ∀ij ∈ Ē into x_ij ≥ 0 ∀ij ∈ Ē, thus confirming Θ+(G)Θ−(Ḡ) ≥ n. If Y* is optimal for (7), then y_ij + y_jk − y_ik ≤ t = y_kk, which holds for any i, j, k ∈ V, transforms into x_ij + x_jk ≤ x_ik + x_kk, while the inequalities x_ij ≤ x_ii = (1/(t* n)) y_ii = 1/n are satisfied since X ⪰ 0.
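The transformation (13) can be traced explicitly on the 5-cycle, where an optimal solution of (5) is available in closed form (Θ(C5) = √5; the circulant solution below is our own derivation from the eigenvalues, stated without proof). The sketch applies (13) and checks that the product of the two bounds equals n = 5:

```python
import numpy as np

n = 5
t_star = np.sqrt(5.0)        # = Theta(C5), the classical value
c = (5 - np.sqrt(5)) / 2     # off-diagonal weight of the circulant optimum

E  = {(min(i, (i + 1) % n), max(i, (i + 1) % n)) for i in range(n)}
Eb = {(i, j) for i in range(n) for j in range(i + 1, n) if (i, j) not in E}

# Optimal Y* for (5) on C5: Y* = t* I + c B, with B the adjacency matrix of
# the complement (the pentagram).
B = np.zeros((n, n))
for i, j in Eb:
    B[i, j] = B[j, i] = 1.0
Y = t_star * np.eye(n) + c * B
J = np.ones((n, n))
assert np.linalg.eigvalsh(Y - J).min() >= -1e-9      # Y - J is psd
assert all(np.isclose(Y[i, j], 0.0) for i, j in E)   # zero on edges of C5

# Transformation (13): X := Y*/(t* n) is feasible for (10) on the complement.
X = Y / (t_star * n)
assert np.isclose(np.trace(X), 1.0)
assert np.linalg.eigvalsh(X).min() >= -1e-9

# Its value <X, J> is n/t*, so Theta(C5) * Theta(complement of C5) = 5 = n.
print(round(X.sum() * t_star, 6))  # 5.0
```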
density        0.10  0.25  0.50  0.75  0.90
sparse model      1     7    42   130   238
dense model     223   118    34     5     1
Table 1: Comparison of the computation times (in seconds) for Θ(G) on five random graphs with 100 vertices and different densities in the dense (15) and the sparse model (16).
6 Sparse and dense graphs
In this section we concentrate on computationally efficient formulations of our SDP relaxations by exploiting the density (or sparsity) of the given graph G. We say that G is dense if m > (1/2)·(n choose 2), and sparse otherwise. Note that this definition differs from the usual notion, where a graph is dense if m = Ω(n²) and sparse if m = o(n²). The following linear operator A_G : S_n → R^E will be useful:

    A_G(X)_ij = ⟨X, E_ij⟩ = x_ij + x_ji = 2 x_ij    ∀ij ∈ E.    (14)

Its adjoint operator A_G^T : R^E → S_n is given by A_G^T(y) = Σ_{ij∈E(G)} y_ij E_ij. Using this map we can write any symmetric matrix Z as

    Z = Σ_i y_i E_i + A_G^T(u) + A_Ḡ^T(v),    i.e.    z_ij = y_i (i = j),  z_ij = u_ij (ij ∈ E),  z_ij = v_ij (ij ∈ Ē).
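A direct implementation of the operator (14) and its adjoint takes only a few lines of numpy; the edge set below is a small made-up example, used only to verify the adjoint identity ⟨A_G(X), y⟩ = ⟨X, A_G^T(y)⟩:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
E = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5), (1, 4)]  # made-up edges

def A_G(X):
    # A_G(X)_ij = <X, E_ij> = 2 x_ij for every edge ij, as in (14)
    return np.array([2.0 * X[i, j] for i, j in E])

def A_G_T(y):
    # adjoint: A_G^T(y) = sum over edges of y_ij E_ij
    Z = np.zeros((n, n))
    for (i, j), yij in zip(E, y):
        Z[i, j] += yij
        Z[j, i] += yij
    return Z

X = rng.standard_normal((n, n)); X = (X + X.T) / 2
y = rng.standard_normal(len(E))
# adjoint identity <A_G(X), y> = <X, A_G^T(y)>
print(np.isclose(A_G(X) @ y, np.trace(X @ A_G_T(y))))  # True
```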
Note that any matrix Y feasible for (5) is of the form Y = tI + A_Ḡ^T(y), where y ∈ R^Ē. Thus we get the following primal-dual pair of SDPs:

    Θ(G) = min{t : tI + A_Ḡ^T(y) − J ⪰ 0}
         = max{⟨J, X⟩ : tr(X) = 1, A_Ḡ(X) = 0, X ⪰ 0}.    (15)

There are m̄ + 1 equations (in the dual). We call this the dense model for Θ(G). If we write Z = Y − J, then we get the following formulation:

    Θ(G) = min{t : te − diag(Z) = e, −A_G(Z) = 2e, Z ⪰ 0}
         = max{e^T x + 2 e^T ξ : e^T x = 1, Diag(x) + A_G^T(ξ) ⪰ 0}.    (16)
This model has n + m equations (in the primal), and we call it the sparse model. The computational effort to solve an SDP by interior point methods is determined by two factors: the size of the matrix variable, in our case n, and the number of parameters necessary to determine a feasible solution. Notice that x ∈ R^n, ξ ∈ R^m determine any feasible solution of (16), while the m̄ + 1 values t ∈ R and y ∈ R^Ē determine solutions of (15). Therefore the sparse model is computationally more efficient for sparse graphs (m small), while the dense model is better for dense graphs (m̄ small).

To give some impression of the practical impact, we provide in Table 1 the computation times for Θ(G) with the two models. The codes were run on a 3000 MHz PC with 1 GB RAM under Linux. A primal-dual predictor-corrector interior-point method was used. The algorithm was stopped once the duality gap was below 10^-4. The same machine setting was used for all the computational results reported in this paper. By using the appropriate model, we can compute Θ(G) for graphs with density above 0.75 or below 0.25 in a few seconds. For graphs with density of about 0.5, the dense model is slightly faster.

In Table 2 we provide a more detailed picture of the effect of using the appropriate model, depending on the density. We consider random graphs of densities p ∈ {0.1, 0.25, 0.5, 0.75, 0.9}. The timings refer to one graph per instance. The blank entries in Tables 2-3 correspond to problems which are out of reach for standard interior point methods.
density  quantity        n=100   n=150   n=200   n=250   n=300   n=350   n=400
0.10     |E|             487     1137    2047    3149    4531    6098    7949
         time Θ(G)       1       7       30      101     309     671     1583
         time Θ+∆(G,N)   3       17      88      260     701     1686    3683
0.25     |E|             1240    2802    5099    7926
         time Θ(G)       7       66      371     1476
         time Θ+∆(G,N)   19      136     767     2888
0.50     |E|             2531    5665    10026
         time Θ(G)       34      418     2735
         time Θ+∆(G)     53      631     4421
0.75     |E|             3734    8451    15049   23541
         time Θ(G)       5       53      277     1164
         time Θ+∆(G)     7       76      435     1759
0.90     |E|             4460    10079   17942   28127   40436   55099   71863
         time Θ(G)       1       5       21      72      221     540     1357
         time Θ+∆(G)     1       6       29      102     316     723     2046
Table 2: Times in seconds for computing Θ(G), Θ+∆(G) and Θ+∆(G, N) on random graphs. Graphs with density 0.

Now Y := (n²/λ) X ⪰ ee^T = J, since X ⪰ 0 and e^T((n²/λ)X − ee^T)e = 0. So Y is feasible for (19) for the graph Ḡ with t = y_ii = n/λ = n/Θ(G). Therefore Θ(Ḡ) ≤ t = n/Θ(G).
Likewise Θ+(Ḡ) ≤ n/Θ−(G), since (25) additionally transforms x_[ij] ≥ 0 of (23) into y_[ij] ≥ 0 of (22). And finally Θ+∆(Ḡ) ≤ n/Θ−∆(G), since (25) transforms the triangle inequalities x_[ij] + x_[jk] ≤ x_[ik] + 1/n into y_[ij] + y_[jk] − y_[ik] ≤ 1/λ = t. Now Proposition 9 establishes equalities.
Remark 15 Theorem 14 does not only generalize the well-known equality Θ(G)Θ(Ḡ) = n for vertex transitive graphs, see [19]. Its proof also implies that solving the SDP relaxation on a vertex transitive graph G produces not only the bound on Ḡ but a corresponding matrix variable as well. The fact that these pairs of problems are equivalent will be exploited in Sections 8.4 and 8.5.
8 Computational results
We have discussed the sparse (16) and the dense (15) models for computing Θ(G). In Tables 1 and 2 we focused on the computational efficiency of these methods. Now we take a closer look at the quality of the various relaxations in approximating both χ(G) and ω(G).
8.1 Comparing Θ(G) and Θ+∆(G) on random and DIMACS graphs
We have seen in Table 2 that the effort to compute Θ(G) and Θ+∆(G) is comparable for dense graphs (m > n²/4). In Table 3 we compare these bounds on randomly generated graphs of given densities. It turns out that the inclusion of the additional sign constraints (6) and the triangle inequalities (7) does not change the value of the relaxation in a significant way. In fact, in all these instances we have Θ+∆(G)/Θ(G) < 1.01. This is consistent with the very preliminary experience reported by Meurdesoif [27]. The situation is not much different if we look at some of the dense instances from the DIMACS collection, see Tables 4 and 5. To get some impression also for sparse graphs, we consider also including a limited number of the additional constraints, in the columns labeled Θ+∆(G, N), as described in Section 6.2. Even though the difference between the relaxations is a little bigger than for completely random graphs, the increment of the lower bound toward χ(G) is still only marginal.
density  bound        n=100    n=150    n=200    n=250    n=300    n=350    n=400
0.10     Θ−∆(G)       4.0000   5.0000   4.4428   4.8014   5.1217   5.3839   5.7118
         Θ(G)         4.0000   5.0000   4.4471   4.8047   5.1275   5.3893   5.7166
         Θ+∆(G,N)     4.0000   5.0000   4.4682   4.8256   5.1505   5.4139   5.7403
         Θ+∆(G)       4.0000
0.25     Θ−∆(G)       5.8056   6.8409   7.7852   8.5052
         Θ(G)         5.8228   6.8636   7.8104   8.5328
         Θ+∆(G,N)     5.8571   6.9056   7.8568   8.5792
         Θ+∆(G)       5.8685   6.9194
0.50     Θ−∆(G)       10.7476  12.8249
         Θ(G)         10.8246  12.9034  14.6807
         Θ+∆(G)       10.8976  12.9998  14.7830
0.75     Θ−∆(G)       19.2699
         Θ(G)         19.4945  24.3363  28.6875  31.8306
         Θ+∆(G)       19.5539  24.4414  28.7988  31.9376
0.90     Θ−∆(G)       32.3485
         Θ(G)         33.1647  41.9934  50.1470  58.0653  63.0662  69.4530  73.9517
         Θ+∆(G)       33.2152  42.0462  50.2200  58.1541  63.1388  69.5523  74.0238
Table 3: The bounds on ω(G) and χ(G) on random graphs
8.2 Comparing Θ and Θ−∆
A similar situation occurs if one looks at the improvements of Θ(G) toward the clique number. We collect some representative results in Table 6. The improvement on the random graphs labeled dsjc is small. The situation is more encouraging once we look at the highly structured generalized Hamming graphs labeled H. The vertex set of a Hamming graph H(a, b, c) consists of the numbers 0, ..., b^a − 1 written in base b. The Hamming distance between two numbers is the number of digits in which they differ. Two vertices u, v are connected if the Hamming distance d(u, v) = c; respectively d(u, v) ≤ c in H−(a, b, c), and d(u, v) ≥ c in H+(a, b, c). These graphs are important in coding theory.
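These generalized Hamming graphs are easy to generate directly from the definition. In the Python sketch below (the function name and interface are ours), the vertex and edge counts obtained for H(6, 2, 4) and H−(9, 2, 6) match the corresponding entries in Table 9:

```python
from itertools import product

def hamming_graph(a, b, c, mode="eq"):
    """Generalized Hamming graph on the b**a words of length a over base b:
    u ~ v iff d(u,v) == c ('eq', H), <= c ('le', H-), or >= c ('ge', H+)."""
    V = list(product(range(b), repeat=a))
    test = {"eq": lambda d: d == c,
            "le": lambda d: d <= c,
            "ge": lambda d: d >= c}[mode]
    E = [(u, v) for i, u in enumerate(V) for v in V[i + 1:]
         if test(sum(x != y for x, y in zip(u, v)))]
    return V, E

V, E = hamming_graph(6, 2, 4)          # H(6,2,4) of Table 9
Vm, Em = hamming_graph(9, 2, 6, "le")  # H-(9,2,6) of Table 9
print(len(V), len(E), len(Em))  # 64 480 119040
```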
8.3 Graphs with a medium amount of symmetry
The graphs in Table 7 also have applications in coding theory.² They possess a medium amount of symmetry, enabling us to use the ideas developed in Section 7 to reduce the computational effort. While the number of edges can be very large, only the number of orbits, given in the columns labeled orb., affects the computation times. The orbits of vertices and edges were found by NAUTY [26]. Applying this approach we have observed a decrease in stability (due to a larger condition number of the reduced system matrix), which was compensated by the significantly smaller computational effort. So in Table 7 we report only three decimals. Again, the relaxation Θ−(G) improves upon Θ(G). This improvement is typically somewhat stronger than on completely random graphs, but relative to ω(G) it is still marginal.
8.4 Vertex transitive graphs
We have just seen that on unstructured random graphs the variations of Θ(G) do not lead to significantly improved bounds. The situation is quite different once we consider highly structured graphs. In Table 8 we give results on powers of cycles, which are vertex transitive graphs. The number of orbits (denoted by or. in Tables 8 and 9) in vertex transitive graphs cannot be larger than the number of vertices, denoted by n. Therefore the additional reduction techniques described in [13] and [8] can be applied to
² “Challenge Problems: Independent Sets in Graphs” (http://www.research.att.com/~njas/doc/graphs.html)
name            n    m      |Ē|   dens.  Θ(G)     Θ+(G)    Θ+∆(G)   Θ+∆(G,N)  χ(G)
myciel5         47   236    845   0.22   2.6387   2.6387   3.0933   2.7583    6
myciel6         95   755    3710  0.17   2.7342   2.7342   3.2538   2.8340    7
1-insertions 4  67   232    1979  0.11   2.2333   2.2333   2.5230   2.2634    5
4-insertions 3  79   156    2925  0.05   2.0480   2.0480   2.1818   2.0701    4
1-FullIns 4     93   593    3685  0.14   3.1244   3.1244   3.4869   3.2657    5
2-FullIns 3     52   201    1125  0.15   4.0282   4.0282   4.2408   4.2408    5
3-FullIns 3     80   346    2814  0.11   5.0158   5.0158   5.1935   5.1198    6
dsjc125.5       125  3891   3859  0.50   11.7844  11.8674  11.8674  11.8555   ≤ 18
dsjc125.9       125  6961   789   0.90   37.7678  37.8028  37.8031            44
dsjc250.9       250  27897  3228  0.90   55.1527  55.2155  55.2156            ≤ 73
Table 4: Computational results for some small DIMACS Graph Coloring Problems.
further reduce the computational effort. We recall that the vertex set of the Johnson graph J(n, m, t) is given by the (n choose m) subsets of cardinality m of a universal set of size n. Two vertices are connected if the corresponding sets have exactly t elements in common. In Table 9 we report the results on Hamming and Johnson graphs. Since these vertex transitive graphs are equivalent to corresponding association schemes, any resulting SDP is a linear program [32] with the number of variables equal to the number of orbits. So all instances in Table 9 can be computed easily. Without considering the symmetries of H−(12, 2, 7), just storing the system matrix and its Cholesky decomposition would require more than 10^5 Gb of RAM.

The clique sizes in Tables 8 and 9, in the column labeled cliq., are obtained by an iterated greedy heuristic. Note that this gives only a rough estimate, as it is known [34] that ω((C7)^4) ≥ 108 while the heuristic finds only a clique with 88 vertices. The number of colors in the column labeled col. is obtained by the heuristic TABU [17]. This heuristic could not handle the two largest graphs in Table 9 with more than 3000 vertices, so we have no good estimate of χ for these two graphs. It is remarkable that the inclusion of the sign constraints nearly doubles the bound on some of the H− graphs. Meanwhile, the additional inclusion of the triangle inequalities does not improve Θ+ on any of the considered association schemes in Table 9. Since association schemes are vertex transitive graphs, by Remark 15 these results can also be interpreted as upper bounds Θ− and Θ−∆ on the clique numbers of the complement graphs.

name            n    m     dens.  Θ(G)    Θ+∆(G,N)  χ(G)
myciel7         191  2360  0.13   2.8146  2.9206    8
1-insertions 5  202  1227  0.06   2.2765  2.3084    6
2-insertions 4  149  541   0.05   2.1334  2.1802    5
3-insertions 4  281  1046  0.03   2.0868  2.1229    5
4-insertions 4  475  1795  0.02   2.0612  2.0893    5
1-FullIns 5     282  3247  0.08   3.1811  3.2935    6
2-FullIns 4     212  1621  0.07   4.0559  4.3021    6
4-FullIns 3     114  541   0.08   6.0100  6.1612    7
5-FullIns 3     154  792   0.07   7.0068  7.0811    8
dsjc125.1       125  736   0.10   4.1061  4.2084    5
dsjc250.1       250  3218  0.10   4.9063  4.9317    ≤ 9
Table 5: Computational results for some medium-sized DIMACS Graph Coloring Problems.
name       n    m     ≤ ω(G)  Θ−∆(G)   Θ−(G)    Θ(G)
H+(5,2,3)  32   240   4       4.0000   4.0000   5.3333
H+(6,2,4)  64   1312  4       4.0000   4.0000   5.3333
H+(7,2,5)  128  1856  2       3.0000   3.0000   3.5556
H+(8,2,6)  256  4736  2       3.0000   3.0000   3.5556
dsjc125.5  125  3859  10      11.4019  11.4021  11.4730
dsjc125.5  125  3891  10      11.7105  11.7106  11.7844
dsjc125.1  125  736   4       4.0671   4.1057   4.1062
Table 6: Computational results for the Clique Number on some DIMACS Stable Set Problems.

name    n     m        orb.  ω(G)    Θ−(G)    Θ(G)
DC128   128   15040    98    16      16.678   16.841
ET256   256   62208    358   50      54.465   55.114
ET512   512   254080   909   100     103.549  104.424
ET1024  1024  1029376  3238  171     182.071  184.226
ZC128   128   14144    33    18      20.000   20.667
ZC256   256   59904    47    36      37.333   38.000
ZC512   512   248320   61    62      68.000   68.750
ZC1024  1024  1015296  81    ≥ 112   128.000  128.667
Table 7: Computational results for graphs with medium amount of symmetry.
8.5 Extracting colorings and cliques
Besides computing a bound on χ(G) or ω(G), the SDP approach also produces matrix variables (Z and X) which can be utilized for finding a suboptimal coloring [18] or extracting a large clique, see [2]. The former idea can be generalized to approximately solving the max-k-cut problem, see [9]. The clique of size ω(J(12, 5, 3)) = 7 reported in Table 9 was found by an exact branch and bound algorithm applying the strengthened bound Θ−∆ to the non-isomorphic subgraphs. But the search for stronger bounds is also motivated by the hope that these matrix variables will better approximate an optimal coloring, clique or max-k-cut, respectively. In fact, we have found 8-colorings of H(6,2,2) and H(6,2,4) by utilizing the strengthened bound Θ+∆ in a recursive variant of the Karger-Motwani-Sudan graph coloring heuristic, while applying the Lovász theta number instead produced a 10-coloring both times. This heuristic [11] recursively applies the idea, suggested by (2), (4) and (5) (or (7)) and formalized in [18], that a large off-diagonal entry y_ij implies a high probability that there is an optimal coloring in which c(i) = c(j). Likewise, equations (9) and (10) imply that a large diagonal element x_ii suggests a high probability that vertex i belongs to a large clique.

name    n     m        or.  cliq.  Θ−∆(G)    Θ(G)      Θ+(G)     Θ+∆(G)    col.
C97     97    97       47   2      2.0001    2.0005    2.0005    2.0208    3
(C5)^4  625   170000   4    21     25.0000   25.0000   25.0000   27.9508   62
(C9)^3  729   255879   3    67     82.8870   82.8870   82.8870   85.5466   119
(C7)^4  2401  2785160  4    88     121.1521  121.1521  121.1521  127.9508  233

Table 8: Computational results on vertex transitive graphs.
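The product construction behind the cycle powers of Table 8 is not spelled out here, but the listed edge counts are consistent with taking the complement of the k-fold strong product of the cycle; under that reading, cliques of the listed graphs correspond to stable sets of strong powers, matching the bound ω((C7)^4) ≥ 108 from [34]. The following sketch (the interpretation and the function name are our assumptions) reproduces the counts for (C5)^4 and (C9)^3:

```python
from itertools import product

def cycle_power_complement(n, k):
    """Complement of the k-fold strong product of the cycle C_n
    (our reading of the '(C_n)^k' graphs of Table 8)."""
    V = list(product(range(n), repeat=k))
    def strong_adj(u, v):
        # strong product: distinct, and equal or adjacent in every coordinate
        return u != v and all((x - y) % n in (0, 1, n - 1) for x, y in zip(u, v))
    m = sum(1 for i, u in enumerate(V) for v in V[i + 1:] if not strong_adj(u, v))
    return len(V), m

print(cycle_power_complement(5, 4))  # (625, 170000)
print(cycle_power_complement(9, 3))  # (729, 255879)
```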
name        n     m        orb.  cliq.  Θ−(G)     Θ(G)      Θ+(G)      col.
H(6,2,4)    64    480      1     4      4.0000    4.0000    5.3333     8
H(6,2,2)    64    480      1     6      6.0000    6.0000    8.0000     8
H(9,2,6)    512   21504    8     4      4.0000    4.0000    6.4000     11
H(10,2,8)   1024  23040    9     2      2.6667    2.6667    3.2000     7
H(11,2,8)   2048  168960   10    2      3.2000    3.2000    4.9382     12
H−(9,2,6)   512   119040   3     130    160.0000  160.0000  192.0000   256
H−(9,2,5)   512   97536    4     74     80.0000   80.0000   128.0000   128
H−(11,2,6)  2048  1520640  5     232    265.8461  265.8461  512.0000   512
H−(12,2,7)  4096  6760448  5     462    531.6923  531.6923  1024.0000
H(6,3,3)    729   58320    5     9      9.0000    9.0000    11.5714    22
H(7,3,3)    2187  306180   6     9      9.0000    9.0000    11.5714    32
J(10,5,2)   252   12600    4     6      6.0000    6.0000    8.2500     12
J(12,5,3)   792   83160    4     7      15.0000   15.0000   22.0000    39
J(12,7,3)   792   69300    4     3      3.6923    3.6923    6.6000     12
J(14,7,3)   3432  2102100  6     8      8.0000    8.0000    11.8182
Table 9: Computational results on association schemes.
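The Johnson graphs of Table 9 can likewise be generated directly from the definition; the following sketch (function name ours) reproduces the vertex and edge counts of J(10, 5, 2):

```python
from itertools import combinations

def johnson_graph(n, m, t):
    """Johnson graph J(n,m,t): the m-subsets of an n-set, two subsets
    adjacent iff their intersection has exactly t elements."""
    V = list(combinations(range(n), m))
    E = [(u, v) for i, u in enumerate(V) for v in V[i + 1:]
         if len(set(u) & set(v)) == t]
    return V, E

V, E = johnson_graph(10, 5, 2)  # J(10,5,2) of Table 9
print(len(V), len(E))  # 252 12600
```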
9 Conclusions
Considering all these computational results, we draw the following conclusions.

• The Lovász number can be computed efficiently on very sparse and very dense graphs.
• On random graphs the strengthened bounds do not improve the original Lovász number considerably.
• Exploiting symmetry can simplify computations significantly.
• On some vertex transitive graphs the non-negativity constraints yield a substantial improvement over the original relaxation.
• The additional inclusion of the triangle constraints often does not improve upon the simpler relaxation Θ+.

Our numerical results have been obtained by computer programs downloadable from www.math.uni-klu.ac.at/or. Further research is necessary to investigate the computation of these bounds by faster first order methods like [6, 16, 23, 35].

Acknowledgement: We thank the anonymous referees for many helpful suggestions to improve an earlier version of this paper.
References

[1] S. BENSON, Y. YE, and X. ZHANG. Solving large-scale sparse semidefinite programs for combinatorial optimization. SIAM Journal on Optimization, 10:443–461, 2000.
[2] S. BENSON and Y. YE. Approximating maximum stable set and minimum graph coloring problems with the positive semidefinite relaxation. In M. Ferris and J. Pang, editors, Applications and Algorithms of Complementarity, pages 1–18. Kluwer Academic Publishers, 2000.
[3] I. M. BOMZE and E. de KLERK. Solving standard quadratic optimization problems via linear, semidefinite and copositive programming. Journal of Global Optimization, 24:163–185, 2002.
[4] S. BURER and D. VANDENBUSSCHE. Solving lift-and-project relaxations of binary integer programs. To appear in SIAM Journal on Optimization.
[5] S. BUSYGIN and D. V. PASECHNIK. On χ(G) − α(G) > 0 gap recognition and α(G)-upper bounds. Electronic Colloquium on Computational Complexity, Report No. 52, pages 1–5, 2003.
[6] S. BURER and R. D. C. MONTEIRO. A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming, 95:329–357, 2003.
[7] M. CHARIKAR. On semidefinite programming relaxations for graph coloring and vertex cover. In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 616–620, 2002.
[8] E. de KLERK, D. V. PASECHNIK, and A. SCHRIJVER. Reduction of symmetric semidefinite programs using the regular ∗-representation. To appear in Mathematical Programming.
[9] E. de KLERK, D. V. PASECHNIK, and J. P. WARNERS. On approximate graph colouring and max-k-cut algorithms based on the θ-function. Journal of Combinatorial Optimization, 8:267–294, 2004.
[10] I. DUKANOVIC. Semidefinite Programming Applied to the Graph Coloring Problem. PhD thesis, University of Klagenfurt, Austria, 2005, forthcoming.
[11] I. DUKANOVIC and F. RENDL. A semidefinite programming based heuristic for graph coloring. Submitted to Annals of Operations Research.
[12] I. DUKANOVIC and F. RENDL. Copositive programming motivated bounds on the clique and the chromatic number of a graph. Working paper, University of Klagenfurt, 2005.
[13] K. GATERMANN and P. A. PARRILO. Symmetry groups, semidefinite programs, and sums of squares. Journal of Pure and Applied Algebra, 192:95–128, 2004.
[14] M. GRÖTSCHEL, L. LOVÁSZ, and A. SCHRIJVER. Geometric Algorithms and Combinatorial Optimization. Springer, New York, 1988.
[15] G. GRUBER and F. RENDL. Computational experience with stable set relaxations. SIAM Journal on Optimization, 13:1014–1028, 2003.
[16] C. HELMBERG and F. RENDL. A spectral bundle method for semidefinite programming. SIAM Journal on Optimization, 10:673–696, 2000.
[17] A. HERTZ and D. de WERRA. Using tabu search for graph coloring. Computing, 39:345–351, 1987.
[18] D. KARGER, R. MOTWANI, and M. SUDAN. Approximate graph coloring by semidefinite programming. Journal of the Association for Computing Machinery, 45:246–265, 1998.
[19] D. E. KNUTH. The sandwich theorem. Electronic Journal of Combinatorics, 1:1–48, 1994.
[20] M. LAURENT and F. RENDL. Semidefinite programming and integer programming. In K. Aardal et al., editors, Handbook in OR & MS, pages 393–514. Elsevier B. V., forthcoming.
[21] L. LOVÁSZ. On the Shannon capacity of a graph. IEEE Transactions on Information Theory, 25:1–7, 1979.
[22] L. LOVÁSZ and A. SCHRIJVER. Cones of matrices and set-functions and 0-1 optimization. SIAM Journal on Optimization, 1:166–190, 1991.
[23] J. MALICK, J. POVH, F. RENDL, and A. WIEGELE. Boundary point method. Working paper, 2005.
[24] F. MARGOT. Pruning by isomorphism in branch-and-cut. Lecture Notes in Computer Science, 2081:304–317, 2001.
[25] R. J. McELIECE, E. R. RODEMICH, and H. C. RUMSEY, Jr. The Lovász bound and some generalizations. Journal of Combinatorics and System Sciences, 3:134–152, 1978.
[26] B. D. McKAY. Practical graph isomorphism. Congressus Numerantium, 30:45–87, 1981.
[27] P. MEURDESOIF. Strengthening the Lovász θ(Ḡ) bound for graph coloring. Mathematical Programming, 102:577–588, 2005.
[28] K. G. MURTY. Some NP-complete problems in quadratic and nonlinear programming. Mathematical Programming, 39:117–129, 1987.
[29] P. A. PARRILO. Structured Semidefinite Programs and Semialgebraic Geometry Methods in Robustness and Optimization. PhD thesis, California Institute of Technology, 2000.
[30] P. A. PARRILO and B. STURMFELS. Minimizing polynomial functions. In Algorithmic and Quantitative Real Algebraic Geometry, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 60:83–99. AMS, 2003.
[31] A. SCHRIJVER. New code upper bounds from the Terwilliger algebra. IEEE Transactions on Information Theory, 51:2859–2866, 2004.
[32] A. SCHRIJVER. A comparison of the Delsarte and Lovász bounds. IEEE Transactions on Information Theory, IT-25:425–429, 1979.
[33] M. SZEGEDY. A note on the theta number of Lovász and the generalized Delsarte bound. In 35th Annual Symposium on Foundations of Computer Science, pages 36–39, 1994.
[34] A. VESEL and J. ŽEROVNIK. Improved lower bound on the Shannon capacity of C7. Information Processing Letters, 81(5):277–282, 2002.
[35] K. C. TOH and M. KOJIMA. Solving some large scale semidefinite programs via the conjugate residual method. SIAM Journal on Optimization, 12:669–691, 2002.