Breaking Symmetries in Graph Coloring Problems with Degree ...

3 downloads 466 Views 407KB Size Report
Nov 25, 2014 - 1 Department of Computer Science, Ben-Gurion University of the ... wakowski [24] proved in 1997 that R(4, 3, 3) ≤ 32, and one year later ...
Breaking Symmetries in Graph Coloring Problems with Degree Matrices: the Ramsey Number R(4,3,3)=30? Michael Codish1 , Michael Frank1 , Avraham Itzhakov1 , and Alice Miller2

arXiv:1409.5189v2 [cs.AI] 25 Nov 2014

1

Department of Computer Science, Ben-Gurion University of the Negev, Israel 2 School of Computing Science, University of Glasgow, Scotland

Abstract. This paper introduces a general methodology that applies to solve graph edge-coloring problems and demonstrates its use to compute the Ramsey number R(4, 3, 3). The number R(4, 3, 3) is often presented as the unknown Ramsey number with the best chances of being found “soon”. Yet, its precise value has remained unknown for more than 50 years. The proposed technique is based on two well-studied concepts, abstraction and symmetry. First, we introduce an abstraction on graph colorings, degree matrices, that specify the degree of each vertex in each color. We compute, using a SAT solver, an overapproximation of the set of degree matrices of all solutions of the graph coloring problem. Then, for each degree matrix in the over-approximation, we compute, again using a SAT solver, the set of all solutions with matching degrees. Breaking symmetries, on degree matrices in the first step and with respect to graph isomorphism in the second, is cardinal to the success of the approach. We illustrate the approach via two applications: proving that R(4, 3, 3) = 30 and computing the previously unknown number of (3, 3, 3; 13) Ramsey colorings (78,892).

1

Introduction

This paper introduces a general methodology that applies to solve graph edge-coloring problems and demonstrates its use to prove that the Ramsey number R(4, 3, 3) is equal to 30. Ramsey numbers arise in a problem of graph theory that involves assigning k colors to the edges of a complete graph. In particular, R(4, 3, 3) is the smallest number n such that any coloring of the edges of the complete graph Kn in three colors will either contain a K4 sub-graph in the first color, a K3 sub-graph in the second color, or a K3 sub-graph in the third color. The precise value of this number has been sought for more that 50 years. Kalbfleisch [19] proved in 1966 that R(4, 3, 3) ≥ 30, Piwakowski [24] proved in 1997 that R(4, 3, 3) ≤ 32, and one year later Piwakowski and Radziszowski [25] proved that R(4, 3, 3) ≤ 31. We also compute the number of Ramsey (3, 3, 3; 13) colorings. A (3, 3, 3; n) coloring is an assignment of one of three colors to each of the edges of the complete graph Kn such that there is no monochromatic K3 sub-graph. It is known that R(3, 3, 3) = 17, so there are no (3, 3, 3; 17) colorings. The number of (3, 3, 3; n) colorings for 14 ≤ n ≤ 16 are known. The number of (3, 3, 3; 13) colorings was previously unknown [27]. Solving hard search problems on graphs, and graph coloring problems in particular, relies heavily on breaking symmetries in the search space. When searching for a graph, ?

Supported by the Israel Science Foundation, grant 182/13. Computational resources provided by an IBM Shared University Award (Israel).

the actual names of the vertices do not matter, and restricting the search modulo graph isomorphism is highly beneficial. When searching for a graph coloring, on top of graph isomorphism, solutions are typically closed under permutations of the colors: the actual names of the colors do not matter and the term often used is “weak isomorphism” [25] (the equivalence relation is weaker because both node names and edge colors do not matter). When the problem is to compute the set of all solutions modulo (weak) isomorphism, or to count them, the task is even more challenging. Often one first attempts to compute all of the solutions, and to then apply one of the available graph isomorphism tools, such as nauty [21] to select representatives of their equivalence classes modulo (weak) isomorphism. However, typically the number of solutions is so large that this approach is doomed to fail even though the number of equivalence classes itself is much smaller. The problem is that tools such as nauty can only be applied after, and not during, search. A solution is to restrict the search space so that it still includes at least one representative of each equivalence class, and as few as possible other graphs. Or in other words, to apply symmetry breaking (modulo equivalence) during the search for a solution to the graph coloring problem. We present a general, two step, methodology towards solving hard graph coloring problems. We apply it to compute the previously unknown set of all Ramsey colorings (3, 3, 3; 13), modulo weak isomorphism, of the complete graph K13 . More significantly, we apply it to prove that the Ramsey number R(4, 3, 3) is equal to 30. In the first step, we describe the solutions of a graph coloring problem in terms of an abstraction, which we call degree matrices, that focuses on the degrees of vertices in each color. We compute, using a SAT solver, an over-approximation, i.e. a super-set M, of the set of degree matrices of all solutions of the graph coloring problem. Already in this step we obtain new results for Ramsey coloring problems which are presented as Lemmas 2 and 3. In the second step, for each degree matrix M in the over-approximation M, we show how to compute the set of all solutions of the graph coloring problem which have a degree matrix matching M , or alternatively that there are none such. Solving a graph coloring problem, using an over-approximation, M, of the degree matrices in its solutions, has three key advantages: (1) It breaks the coloring problem into a set of independent sub-problems, one sub-problem for each M ∈ M. These subproblems can then be solved in parallel. (2) It directs the search: solving the (graph coloring) problem given some properties (the degree matrix) of its solutions helps guide the solver through the search space. (3) It facilitates symmetry breaking: degree matrices are isomorphic under renamings of colors and of graph vertices. We take advantage of this during the first step to break many of the symmetries in the coloring problem. Also, in the second step, seeking solutions given a degree matrix boosts symmetry breaking. Throughout the paper we express graph coloring problems in terms of constraints via a “mathematical language”. Our implementation uses the BEE, finite-domain constraint compiler [23], which solves constraints by encoding them to CNF and applying an underlying SAT solver. The solver can be applied to find a single (first) solution to a constraint, or to find all solutions for a constraint modulo a specified set of (integer and/or Boolean) variables. Of course, correctness of our results assumes the lack of bugs in the tools we have used including the constraint solver and the underlying SAT solver. To this end we have performed our computations using four different un-

derlying SAT solvers: MiniSAT [13,14], CryptoMiniSAT [29], Glucose [3,4], and Lingeling [5,6]. BEE configures directly with MiniSAT, CryptoMiniSAT, and Glucose. For the experiments with Lingeling we first apply BEE to generate a CNF (dimacs) file and subsequently invoke the SAT solver. Lingeling, together with Druplig [7], provides a proof certificate for unsat instances (and we have taken advantage of this option). All computations were performed on a cluster with a total of 168 Intel E8400 cores clocked at 2 GHz each, able to run 336 parallel threads. Each SAT instance is run on a single thread. The notion of a “degree matrix” arises in the literature with several different meanings. Degree matrices with the same meaning as we use in in this paper are considered also in [8]. Gent and Smith [16], building on the work of Puget [26], study symmetries in graph coloring problems and recognize the importance of breaking symmetries during search. Meseguer and Torras [22] present a framework for exploiting symmetries to heuristically guide a depth first search, and show promising results for (3, 3, 3; n) Ramsey colorings with 14 ≤ n ≤ 17. Al-Jaam [1] proposes a hybrid meta-heuristic algorithm for Ramsey coloring problems, combining tabu search and simulated annealing. While all of these approaches report promising results, to the best of our knowledge, none of them have been successfully applied to solve open instances or improve the known bounds on classical Ramsey numbers. Our approach focuses on symmetries due to weak-isomorphism for graph coloring and models symmetry breaking in terms of constraints introduced as part of the problem formulation. This idea, advocated by Crawford et al. [12], has previously been explored in [10] (for graph isomorphism), and in [26] (for graph coloring). Graph coloring has many applications in computer science and mathematics, such as scheduling, register allocation and synchronization, path coloring and sensor networks. Specifically, many finite domain CSP problems have a natural representation as graph coloring problems. Our main contribution is a general methodology that applies to solve graph edge coloring problems. The application to compute an unknown Ramsey number is impressive, but the importance here is in that it shows the utility of the methodology.

2

Preliminaries

An (r1 , . . . , rk ; n) Ramsey coloring is an assignment of one of k colors to each edge in the complete graph Kn such that it does not contain a monochromatic complete sub-graph Kri in color i for 1 ≤ i ≤ k. The set of all such colorings is denoted R(r1 , . . . , rk ; n). The Ramsey number R(r1 , . . . , rk ) is the least n > 0 such that no (r1 , . . . , rk ; n) coloring exists. In the multicolor case (k > 2), prior to this paper, the only known value of a nontrivial Ramsey number is R(3, 3, 3) = 17. The value of R(4, 3, 3) was previously known to be equal either to 30 or to 31. The numbers of (3, 3, 3; n) colorings are known for 14 ≤ n ≤ 16 but prior to this paper the number of colorings for n = 13 was unknown. More information on recent results concerning Ramsey numbers can be found in the electronic dynamic survey by Radziszowski [27]. In this paper, graphs are always simple, i.e. undirected and with no self loops. Colors are associated with graph edges. For a natural number n denote [n] = {1, 2, . . . , n}. A graph coloring, in k colors, is a pair (G, κ) consisting of a simple graph G = ([n], E)

and a mapping κ : E → [k]. When κ is clear from the context we refer to G as the graph coloring. of G induced by the color c ∈ [k] is the graph Gc = The sub-graph ([n], e ∈ E κ(e) = c ). We typically represent G as an n × n adjacency matrix, A, defined such that ( κ(i, j) if (i, j) ∈ E Ai,j = 0 otherwise If A is the adjacency matrix representing the graph G, then we denote the Boolean adjacency matrix corresponding to Gc as A[c]. We denote the ith row of a matrix A by Ai . The color-c degree of a node x in G is denoted degGc (x) and is equal to the degree of x in the induced sub-graph Gc . When clear from the context  we write deg c (x). Let G = ([n], E) and π be a permutation on [n]. Then π(G) = (V, (π(x), π(y)) (x, y) ∈ E ). Permutations act on adjacency matrices in the natural way: If A is the adjacency matrix of a graph G, then π(A) is the adjacency matrix of π(G) obtained by simultaneously permuting with π both rows and columns of A.

ϕn,k adj (A) =

^

1 ≤ Aq,r ≤ k ∧ Aq,r = Ar,q ∧ Aq,q = 0



(1)

1≤q 48 hr. hardest:55.4 sec (94,248 instances, 0 solutions)

Table 3. Computing R(4, 3, 3; 30) step by step.

step 1: Prove, using a SAT solver, a bound on the degrees in the first color: Equation (8) together with a constraint that states that, for the first color, the minimal degree is less than 2 or the maximal degree is greater than 5, is found unsatisfiable indicating that degrees in the first color are between 2 and 5. step 2: Compute, using Prolog, all graphical sequences [15,9] on 13 vertices with degrees between 2 and 5. There are 280 such sequences. step 3: State Lemma 2: For each sequence S from the 280, consider the constraints of Equation (8), this time fixing the first column of M to S. This results in 280 instances, 95 are found satisfiable. The slowest instance required 51 hours of computation using CryptoMiniSAT as the underlying SAT solver. For MiniSAT and Glucose we indicate the number of timeouts (after 60 hours). The 95 satisfiable instances were finished within the timeout in both MiniSAT and Glucose. step 4: Extend, using Prolog, the 95 degree sequences from Lemma 2 to all possible corresponding lex sorted degree matrices. The first column in each degree matrix is one of the sequences in the lemma, each row sums to 12, and the elements of each column form a graphical sequence (if sorted). This results in a set M of 84,184 degree matrices. step 5: Compute, using a SAT solver, for each degree matrix M ∈ M, all corresponding solutions of Equation (8). This involves 84,184 instances and generates in total 827,964 (3, 3, 3; 13) colorings. In Table 2 we detail the total solving time for these instances and the solving times for the hardest instance for each SAT solver. The largest number of graphs generated by a single instance is 10,311. step 6: The 827,964 graph colorings are reduced modulo weak isomorphism using nauty3 [21]. This process results in a set with 78,892 graphs. 3

Note that nauty does not handle edge colored graphs and weak isomorphism directly. We applied an approach called k-layering described at https: //computationalcombinatorics.wordpress.com/2012/09/20/ canonical-labelings-with-nauty.

Experiment 2: computing R(3, 3, 4; 30) = 30 Theorem 1 states that if there is no (4, 3, 3; 30) Ramsey coloring then R(4, 3, 3) = 30 and otherwise R(4, 3, 3) = 31. We show that there is no (4, 3, 3; 30) Ramsey coloring hence proving that R(4, 3, 3) = 30. We describe the step by step computation summarizing the process in Table 3 where we report on the use of four different SAT solvers: MiniSAT [13,14], CryptoMiniSAT [29], Glucose [3,4], and Lingeling [5,6]. There are three main differences in this process in comparison to the steps described for the computation of R(3, 3, 3; 13): (1) We cannot enumerate degree sequences in the first step because there are 30 vertices; (2) BEE is not configured to iterate for all solutions using Lingeling, so we introduce the “verify” steps, 3 and 4 below; and (3) All instances were found to be unsatisfiable implying that there are no colorings. So, we do not need to apply nauty in the last step. We comment that the CNF instances in all of the instances are not large, consisting of 95,000 – 100,000 clauses each. step 1: State Lemma 3: Compute, using SAT, all graphical sequences [15,9] for the first color on 30 vertices with degrees satisfying the constraints of Theorem 1 and a relaxation of Constraint (8) as described in the “second scenario” of Section 4. There are 28 such sequences. step 2: Verify Lemma 3: add to the constraint from the previous step a constraint which specifies that the degree sequence in the first color differs from the 28 in the statement of the lemma. The fact that this constraint is unsatisfiable constitutes a proof of the lemma. The purpose of this step is to facilitate the use of Lingeling. step 3: Extend, using SAT, the 28 degree sequences from Lemma 3 to all possible corresponding lex sorted degree matrices. Here we ignore the coloring constraints, assume the constraints of Theorem 1, assume the first column is one of the sequences in the lemma, require that the elements of each column form a graphical sequence (if sorted) and that each row sums to 29. This results in a set M of 94,248 degree matrices. step 4: Verify, using SAT, that there are no other degree matrices besides the 94,248. This is performed in 28 instances which are all unsat. The purpose of this step is to facilitate the use of Lingeling. step 5: Compute, using a SAT solver, for each degree matrix M ∈ M, all corresponding solutions of Equation (8) using the full (not relaxed) constraint. This involves 94,248 instances, all of which are unsatisfiable. In Table 2 we detail the total solving time for these instances and the solving times for the hardest instance for each SAT solver. For MiniSAT and Glucose we indicate the number of instances that were not resolved after 48 hours.

7

Discussion

Proving extremal bounds and existence cases for combinatorial problems by computer is a common approach, but the question of whether a proof by computer constitutes a proper proof is a controversial one. Most famously the issue caused much heated

debate after publication of the computer proof of the Four Color Theorem [2]. It is straightforward to justify an existence proof (i.e. a sat result), as it is easy to verify that the witness produced satisfies the desired properties. Justifying an unsat result requires a more philosophical argument, which is beyond the scope of this paper. If nothing else, as in all computer proofs of the kind presented in this paper, we are certainly required to add the proviso that our unsat results are based on the assumption of a lack of bugs in the entire tool chain (constraint solver, sat solver, c-compiler etc.) used to obtain them. Some sat solvers have the ability to reduce a problem down to a minimal set of clauses which lead to the empty clause (a minimum unsatisfiable core, [20]). A formal proof of unsatisfiabilty can then potentially be derived from this core. This is itself a hard problem (possibly unsolvable, due to the size of the core), and not one that we address in this paper. Initial proofs by computer can lead to more elegant proofs that rely less (if at all) on computer search. For example the initial skepticism in the case of the Four Color Theorem was eventually allayed after the initial models used were refined in order to reduce the computer search to fewer subcases [28], finally leading to formalization of the proof using the Coq proof assistant [17]. Without the initial computer proofs, the final (universally accepted) formal proof may have never existed. In this paper, we have made the effort to test our results using several different SAT solvers. We have also provided detailed explanations of how we have proven two intermediate lemmas (Lemmas 2 and 3) using computer search. These lemmas are essential – they have allowed us to both parallelize and direct the search when solving our two major Ramsey number problems. They may one day be formally proven – perhaps eventually leading to a non-computer proof of our main results.

8

Conclusion

We have proposed a methodology for solving hard graph coloring problems by combining ideas related to abstraction and to breaking symmetry during search. Our main vehicle is the use of degree matrices, both for abstraction and for symmetry breaking. It is worth noting that degree matrices are to graph edge-colorings as degree sequences are to graphs. Our approach is general and applies directly to any graph edge coloring problem. We report on the successful application of this approach to solve two open problems in the area of finite Ramsey theory. The fact that our technique leads to a solution for these specific problems is a significant indication of the power of the method. These problems have hitherto proven to be untractable, despite a concerted effort over the past fifty years to solve them, using a variety of techniques — including symmetry breaking. Our proposed combination of abstraction and symmetry is likely to influence the search for a solution to many other hard combinatorial search problems in the future.

References 1. J. M. Al-Jaam. Experiments of intelligent algorithms on Ramsey graphs. Int. Arab J. Inf. Technol., 4(2):161–167, 2007. 2. K. Appel and W. Haken. Every map is four colourable. Bulletin of the American Mathematical Society, 82:711–712, 1976. 3. G. Audemard and L. Simon. Glucose SAT Solver. http://www.labri.fr/perso/ lsimon/glucose/. 4. G. Audemard and L. Simon. Predicting learnt clauses quality in modern SAT solvers. In C. Boutilier, editor, IJCAI 2009, Proceedings of the 21st International Joint Conference on Artificial Intelligence, Pasadena, California, USA, July 11-17, 2009, pages 399–404, 2009. 5. A. Biere. Lingeling SAT Solver. http://fmv.jku.at/lingeling/. 6. A. Biere. Lingeling essentials, A tutorial on design and implementation aspects of the the SAT solver lingeling. In D. L. Berre, editor, POS-14. Fifth Pragmatics of SAT workshop, a workshop of the SAT 2014 conference, part of FLoC 2014 during the Vienna Summer of Logic, July 13, 2014, Vienna, Austria, volume 27 of EPiC Series, page 88. EasyChair, 2014. 7. A. Biere. Yet another local search solver and lingeling and friends entering the sat competition 2014. Technical report, Department of Computer Science Series of Publications B, pages 39-40, University of Helsinki, 2014. Proceedings of SAT Competition 2014, A. Balint, A. Belov, M. Heule, M. J¨arvisalo (editors). 8. J. R. Carroll and G. Isaak. Degree matrices realized by edge-colored forests. Available from www.lehigh.edu/˜gi02/ecforest.pdf (unpublished), 2009. 9. S. A. Choudum. A simple proof of the Erd˝os-Gallai theorem on graph sequences. Bull. Austral. Math. Soc., 33(1):67–70, 1986. 10. M. Codish, A. Miller, P. Prosser, and P. J. Stuckey. Breaking symmetries in graph representation. In F. Rossi, editor, IJCAI 2013, Proceedings of the 23rd International Joint Conference on Artificial Intelligence, Beijing, China, August 3-9, 2013. IJCAI/AAAI, 2013. 11. M. Codish, A. Miller, P. Prosser, and P. J. Stuckey. Constraints for symmetry breaking in graph representation. Full version of [10] (in preparation)., 2014. 12. J. M. Crawford, M. L. Ginsberg, E. M. Luks, and A. Roy. Symmetry-breaking predicates for search problems. In L. C. Aiello, J. Doyle, and S. C. Shapiro, editors, KR, pages 148–159. Morgan Kaufmann, 1996. 13. N. E´en and N. S¨orensson. MiniSAT SAT Solver. http://minisat.se/Main.html. 14. N. E´en and N. S¨orensson. An extensible sat-solver. In E. Giunchiglia and A. Tacchella, editors, Theory and Applications of Satisfiability Testing, 6th International Conference, SAT 2003. Santa Margherita Ligure, Italy, May 5-8, 2003 Selected Revised Papers, volume 2919 of Lecture Notes in Computer Science, pages 502–518. Springer, 2003. 15. P. Erd¨os and T. Gallai. Graphs with prescribed degrees of vertices (in Hungarian). Mat. Lapok, pages 264–274, 1960. Available from http://www.renyi.hu/˜p_erdos/ 1961-05.pdf. 16. I. P. Gent and B. M. Smith. Symmetry breaking in constraint programming. In W. Horn, editor, ECAI 2000, Proceedings of the 14th European Conference on Artificial Intelligence, Berlin, Germany, August 20-25, 2000, pages 599–603. IOS Press, 2000. 17. G. Gonthier. Formal proofthe four-color theorem. Noticies of the AMS, 55(11):1382–1293, 2008. 18. J. Kalbfleisch and R. Stanton. On the maximal triangle-free edge-chromatic graphs in three colors. Journal of Combinatorial Theory, 5:9–20, 1968. 19. J. G. Kalbfleisch. Chromatic Graphs and Ramsey’s Theorem. PhD thesis, University of Waterloo, January 1966.

20. I. Lynce and J. Marques-Silva. On computing minimum unsatisfiable cores. In SAT 2004, volume 3542 of Lecture Notes in Computer Science, pages 305–310. Springer, 2005. 21. B. McKay. nauty user’s guide (version 1.5). Technical Report TR-CS-90-02, Australian National University, Computer Science Department, 1990. 22. P. Meseguer and C. Torras. Exploiting symmetries within constraint satisfaction search. Artif. Intell., 129(1-2):133–163, 2001. 23. A. Metodi, M. Codish, and P. J. Stuckey. Boolean equi-propagation for concise and efficient SAT encodings of combinatorial problems. J. Artif. Intell. Res. (JAIR), 46:303–341, 2013. 24. K. Piwakowski. On Ramsey number r(4, 3, 3) and triangle-free edge-chromatic graphs in three colors. Discrete Mathematics, 164(1-3):243–249, 1997. 25. K. Piwakowski and S. P. Radziszowski. 30 ≤ R(3, 3, 4) ≤ 31. Journal of Combinatorial Mathematics and Combinatorial Computing, 27:135–141, 1998. 26. J. Puget. On the satisfiability of symmetrical constrained satisfaction problems. In H. J. Komorowski and Z. W. Ras, editors, Methodologies for Intelligent Systems, 7th International Symposium, ISMIS ’93, Trondheim, Norway, June 15-18, 1993, Proceedings, volume 689 of Lecture Notes in Computer Science, pages 350–361. Springer, 1993. 27. S. P. Radziszowski. Small Ramsey numbers. Electronic Journal of Combinatorics, 1994. Revision #14: January, 2014. 28. N. Robertson, Sanders, P. Seymour, and R. Thomas. The four-colour theorem. Journal of Combinatorial Theory, Series B, 70:2–44, 1997. 29. M. Soos. CryptoMiniSAT, v2.5.1. http://www.msoos.org/cryptominisat2, 2010.