A New Exact Algorithm for the Two-Sided Crossing Minimization Problem*

Lanbo Zheng1,2 and Christoph Buchheim3

1 School of Information Technologies, University of Sydney, Australia.
2 IMAGEN program, National ICT Australia. [email protected]
3 Computer Science Department, University of Cologne, Germany. [email protected]

Abstract. The Two-Sided Crossing Minimization (TSCM) problem calls for minimizing the number of edge crossings of a bipartite graph where the two sets of vertices are drawn on two parallel layers and edges are drawn as straight lines. This well-known problem has important applications in VLSI design and automatic graph drawing. In this paper, we present a new branch-and-cut algorithm for the TSCM problem by modeling it directly as a binary quadratic programming problem. We show that a large number of effective cutting planes can be derived based on a reformulation of the TSCM problem. We compare our algorithm with a previous exact algorithm by testing both implementations on the same set of instances. Experimental evaluation demonstrates the effectiveness of our approach.

1 Introduction

Real world information is often modeled by abstract mathematical structures so that relationships between objects are easily visualized and detected. Directed graphs are widely used to display information with hierarchical structures which frequently appear in computer science, economics and social sciences. Sugiyama, Tagawa, and Toda [14] presented a comprehensive approach to draw directed graphs. First, vertices are partitioned and constrained to a set of equally spaced horizontal lines, called layers, and edges are straight lines connecting vertices from adjacent layers. They then select a permutation of the vertices in each layer to reduce the number of crossings between the edges. The second step is very important as it is generally accepted that drawings with fewer crossings are easier to read and understand. This problem has attracted many studies in graph drawing and is usually solved by considering two neighboring layers at a time. The resulting problem is generally called the two-layer crossing minimization (TLCM) problem. Another motivation comes from a layout problem in VLSI design [12]. A recent study shows that solutions of the two-layer crossing minimization problem can be used to solve the rank aggregation problem that has applications in meta-search and spam reduction on the Web [2].

* This work was partially supported by the Marie Curie Research Training Network 504438 (ADONET) funded by the European Commission.

Given a bipartite graph G = (V1 ∪ V2, E), a two-layer drawing consists of placing the vertices from V1 on distinct positions on a straight line L1 and placing the vertices from V2 on distinct positions on a parallel line L2. Each edge is drawn using a straight line segment connecting the positions of the end vertices of the edge. Clearly, the number of edge crossings in a drawing only depends on the permutations of the vertices on L1 and L2. The two-layer crossing minimization problem asks to find a permutation π1 of vertices on L1 and a permutation π2 of vertices on L2 so that the number of edge crossings is minimized. This problem was first introduced by Harary and Schwenk [7] and has two different versions. The first one is called two-sided crossing minimization (TSCM), where vertices of the two vertex sets can be permuted freely. For multi-graphs, Garey and Johnson proved the NP-hardness of this problem by transforming the Optimal Linear Arrangement problem to it [5]. The one-sided crossing minimization (OSCM) problem is more restricted; here the permutation of one vertex set is given. However, this problem is also NP-hard [4], even for forests of 4-stars [10].

The one-sided crossing minimization problem has been studied intensively in the literature. Several heuristic algorithms deliver good solutions, theoretically or experimentally. The barycenter heuristic [14] is an O(√n)-approximation algorithm, while the median heuristic [4] guarantees 3-approximate solutions. Yamaguchi and Sugimoto [15] gave a 2-approximation algorithm for instances where the maximum degree of vertices on the free side is not larger than 4. A new approximation algorithm presented by Nagamochi [11] has an approximation ratio of 1.4664.

Jünger and Mutzel [8] used integer and linear programming methods to solve the TLCM problem exactly for the first time. For the one-sided version, they reduced it to a linear arrangement problem and used the branch-and-cut algorithm published in [6] to solve it. For the two-sided version, an optimal solution was found by enumerating all permutations of one part of the vertices for a given graph. A good starting solution and a good theoretical lower bound were used to make the enumeration tree small. They did extensive experiments to compare the exact algorithm with various existing heuristic algorithms. They found that if one layer is fixed, then the branch-and-cut algorithm is very effective and there is no need to use heuristics in practice. But for the TSCM problem, in the worst case, the algorithm enumerates an exponential number of solutions. For some instances whose optimal solutions could not be computed, we found that the gaps between the optima and the results obtained by iterated heuristic algorithms are not negligible. See Fig. 1 for an example.

(a) A drawing with 19 crossings
(b) A drawing with 50 crossings
Fig. 1. A 20+20 graph with 40 edges. Drawing (a) has a minimum number of crossings, drawing (b) is the best drawing found by the iterated barycenter heuristic.

In this paper, we directly model the TSCM problem as a binary quadratic programming (BQP) problem. Because all variables are binary, this model can be easily transformed into an integer linear programming (ILP) model so that general optimization methods can be applied. In particular, branch-and-cut is one of the most successful methods in solving ILP problems. The performance of a branch-and-cut algorithm often depends on the number and quality of cutting planes generated within the algorithm. Unfortunately, we do not know many classes of cutting planes for the TSCM problem from the literature. Our approach is based on a reformulation of the TSCM problem such that all valid inequalities for a maximum cut problem become valid. The maximum cut problem has been well studied and many classes of cutting planes are known. We conjecture that these cutting planes could be helpful to solve our problem. We compared our approach with a previous exact algorithm by testing it on the same instances. The experimental evaluation supports our conjecture.

This paper is organized as follows. In Sect. 2, the problem under consideration is formalized and necessary notation is introduced. In Sect. 3, we describe how to reformulate the TSCM problem and present a corresponding branch-and-cut algorithm. Experimental results are analyzed in Sect. 4, and in Sect. 5, we summarize and conclude our work.

2 Preliminaries

For a bipartite graph $G = (V_1 \cup V_2, E)$, let $V = V_1 \cup V_2$, $n_1 = |V_1|$, $n_2 = |V_2|$, $m = |E|$ and let $N(i) = \{j \in V \mid \{i, j\} \in E\}$ denote the set of neighbors of $i \in V$ in $G$. For $k \in \{1, 2\}$, a vertex ordering (or vertex permutation) $\pi_k$ of $V_k$ is a bijection $\pi_k : V_k \to \{1, 2, \dots, n_k\}$. For a pair of vertices $i, j \in V_k$, we write $i < j$ instead of $\pi_k(i) < \pi_k(j)$. Any solution of TLCM is completely specified by a permutation $\pi_1$ of $V_1$ and a permutation $\pi_2$ of $V_2$. The formulation given in [8] can be applied directly to our problem: let $\delta^k_{ij} = 1$ if $\pi_k(i) < \pi_k(j)$ and 0 otherwise. Then $\pi_k$ is characterized by the binary vector $\delta^k \in \{0, 1\}^{\binom{n_k}{2}}$.

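Given $\pi_1$ and $\pi_2$, the induced number of crossings is:

$$
\begin{aligned}
C(\pi_1, \pi_2) = C(\delta^1, \delta^2)
&= \sum_{i=1}^{n_2-1} \sum_{j=i+1}^{n_2} \sum_{s \in N(i)} \sum_{t \in N(j)} \left(\delta^1_{st} \cdot \delta^2_{ji} + \delta^1_{ts} \cdot \delta^2_{ij}\right) && (1) \\
&= \sum_{s=1}^{n_1-1} \sum_{t=s+1}^{n_1} \sum_{i \in N(s)} \sum_{j \in N(t)} \left(\delta^1_{st} \cdot \delta^2_{ji} + \delta^1_{ts} \cdot \delta^2_{ij}\right) && (2)
\end{aligned}
$$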
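As an illustration of formula (1), the following Python sketch evaluates the crossing count from two vertex orderings and compares it with a direct pairwise test on the edges. The function names and the input representation (an edge list of pairs (s, i) with s in V1 and i in V2, and orderings given as dictionaries) are assumptions made for this example only.

```python
from itertools import combinations


def crossings_formula(edges, pi1, pi2):
    """Evaluate formula (1): sum over pairs i < j in V2 and neighbors s, t in V1."""
    nbrs = {}
    for s, i in edges:                       # edge joins s in V1 with i in V2
        nbrs.setdefault(i, []).append(s)

    def d1(s, t):                            # delta^1_{st}
        return 1 if pi1[s] < pi1[t] else 0

    def d2(i, j):                            # delta^2_{ij}
        return 1 if pi2[i] < pi2[j] else 0

    total = 0
    for i, j in combinations(sorted(nbrs), 2):
        for s in nbrs[i]:
            for t in nbrs[j]:
                total += d1(s, t) * d2(j, i) + d1(t, s) * d2(i, j)
    return total


def crossings_direct(edges, pi1, pi2):
    """Reference count: two edges cross iff their end vertices appear in opposite order."""
    count = 0
    for (s, i), (t, j) in combinations(edges, 2):
        if (pi1[s] - pi1[t]) * (pi2[i] - pi2[j]) < 0:
            count += 1
    return count


# Two edges forming an "X": vertex 0 on top joins vertex 1 below, and vice versa.
print(crossings_formula([(0, 1), (1, 0)], {0: 1, 1: 2}, {0: 1, 1: 2}))  # prints 1
```

Both functions depend only on the relative order of the vertices on each layer, which matches the observation that the crossing number is determined by the two permutations.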

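In the one-sided crossing minimization problem, the permutation $\pi_1$ of $V_1$ is fixed, thus $\delta^1$ is a constant vector. Hence the objective functions (1) and (2) are linear in $\delta^2$ in this case. In contrast, in the two-sided crossing minimization problem, both $\delta^1$ and $\delta^2$ are vectors of binary variables, so that (1) and (2) become quadratic functions.

In the following, for simplicity, we write $\sum_{st}$ for $\sum_{s \in N(i)} \sum_{t \in N(j)}$. Using $\delta^k_{ji} = 1 - \delta^k_{ij}$ and $\delta^k_{ii} = 0$, we can reformulate our objective function as:

$$
C(\pi_1, \pi_2) = C(\delta^1, \delta^2) = \sum_{i=1}^{n_2-1} \sum_{j=i+1}^{n_2} \left( \sum_{st:\, s<t} \left(\delta^1_{st} + \delta^2_{ij} - 2\,\delta^1_{st}\delta^2_{ij}\right) + \sum_{st:\, s>t} \left(1 - \delta^1_{ts} - \delta^2_{ij} + 2\,\delta^1_{ts}\delta^2_{ij}\right) \right) \qquad (3)
$$

Here the second inner sum collects the pairs with $s > t$, for which $\delta^1_{st}$ is replaced by $1 - \delta^1_{ts}$ so that only variables with ordered indices appear.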
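The substitution behind this reformulation can be checked by enumeration. The sketch below uses hypothetical helper names and assumes the two cases that follow from inserting $\delta_{ji} = 1 - \delta_{ij}$ into the summand of (1): for $s < t$ the summand becomes $x + y - 2xy$, and for $s > t$ it becomes $1 - x - y + 2xy$, where $x$ is the stored first-layer variable and $y = \delta^2_{ij}$.

```python
from itertools import product


def term_original(d_st, d_ts, d_ij, d_ji):
    """Summand of formula (1) for one pair of edges (s, i) and (t, j)."""
    return d_st * d_ji + d_ts * d_ij


def term_s_before_t(x, y):
    """Reformulated summand when s < t; x = delta^1_{st}, y = delta^2_{ij}."""
    return x + y - 2 * x * y


def term_t_before_s(x, y):
    """Reformulated summand when s > t; x = delta^1_{ts}, y = delta^2_{ij}."""
    return 1 - x - y + 2 * x * y


def reformulation_matches():
    """Compare both forms of the summand on all binary assignments."""
    for x, y in product((0, 1), repeat=2):
        # Case s < t: the stored variable is delta_{st} = x, so delta_{ts} = 1 - x.
        if term_original(x, 1 - x, y, 1 - y) != term_s_before_t(x, y):
            return False
        # Case s > t: the stored variable is delta_{ts} = x, so delta_{st} = 1 - x.
        if term_original(1 - x, x, y, 1 - y) != term_t_before_s(x, y):
            return False
    return True


print(reformulation_matches())  # prints True
```

The two reformulated summands add up to 1 for every choice of x and y, which is the cancellation that occurs when both orientations of a vertex pair contribute to the same sum.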

Note that there are different ways to formulate the objective function. We use (3) because it has an advantage when solving dense graphs, see Sect. 4 for details.

As the next step, all quadratic terms $\delta^1_{st} \cdot \delta^2_{ij}$ in the objective function (3) can be linearized by introducing new variables

$$
\beta_{stij} = \delta^1_{st} \cdot \delta^2_{ij} \qquad (4)
$$

that are zero-one valued and satisfy the following sets of constraints

$$
\begin{aligned}
\delta^1_{st} + \delta^2_{ij} - \beta_{stij} &\le 1 && (5) \\
-\delta^1_{st} + \beta_{stij} &\le 0 && (6) \\
-\delta^2_{ij} + \beta_{stij} &\le 0 && (7) \\
\delta^1_{st},\ \delta^2_{ij},\ \beta_{stij} &\in \{0, 1\} && (8)
\end{aligned}
$$

This standard linearization technique is well-known from the literature. Combining the above results with the linear inequalities for the linear ordering problem in [8], we obtain the following integer linear programming model for the TSCM problem:

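$$
\begin{aligned}
\text{Minimize} \quad & \sum_{i=1}^{n_2-1} \sum_{j=i+1}^{n_2} \left( \sum_{st:\, s<t} \left(\delta^1_{st} + \delta^2_{ij} - 2\beta_{stij}\right) + \sum_{st:\, s>t} \left(1 - \delta^1_{ts} - \delta^2_{ij} + 2\beta_{tsij}\right) \right) \\
\text{Subject to} \quad & 0 \le \delta^2_{ij} + \delta^2_{jk} - \delta^2_{ik} \le 1 && \text{for } 1 \le i < j < k \le n_2 \\
& 0 \le \delta^1_{st} + \delta^1_{tl} - \delta^1_{sl} \le 1 && \text{for } 1 \le s < t < l \le n_1 \\
& \delta^1_{st},\ \delta^2_{ij},\ \beta_{stij} \text{ satisfy (6), (7), (8)} && \text{for } \operatorname{coef}(\beta_{stij}) < 0,\ 1 \le i < j \le n_2,\ s \in N(i),\ t \in N(j),\ s < t \\
& \delta^1_{st},\ \delta^2_{ij},\ \beta_{stij} \text{ satisfy (5), (8)} && \text{for } \operatorname{coef}(\beta_{stij}) > 0,\ 1 \le i < j \le n_2,\ s \in N(i),\ t \in N(j),\ s < t
\end{aligned}
$$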

Here $\operatorname{coef}(\beta_{stij})$ is the coefficient of variable $\beta_{stij}$ in the objective function. If the variable $\beta_{stij}$ has a negative coefficient, Constraint (5) is not necessary because it tightens the linear relaxation on a side that is not relevant for our optimization objective. Similarly, Constraints (6) and (7) are unnecessary for variables with positive coefficients. The 3-dicycle inequalities in the first two lines of the above formulation are necessary to ensure that the vectors $\delta^1$ and $\delta^2$ indeed correspond to permutations of $V_1$ and $V_2$, i.e., that integer solutions of our ILP correspond to solutions of the TSCM problem.
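The remark on which linearization constraints are needed can be illustrated with a one-variable minimization. The sketch below is a simplified model under stated assumptions, not the solver setup used in the experiments: the δ values are fixed to binary values, β ranges over [0, 1], and the helper names and the encoding of constraints as the integers 5, 6 and 7 are hypothetical.

```python
from itertools import product


def best_beta(coef, d1, d2, constraints):
    """Minimize coef * beta over beta in [0, 1] subject to the chosen constraints."""
    lo, hi = 0.0, 1.0
    if 5 in constraints:                 # (5): d1 + d2 - beta <= 1
        lo = max(lo, d1 + d2 - 1)
    if 6 in constraints:                 # (6): beta <= d1
        hi = min(hi, d1)
    if 7 in constraints:                 # (7): beta <= d2
        hi = min(hi, d2)
    # A positive coefficient pushes beta to its lower bound, a negative one to its upper bound.
    return lo if coef > 0 else hi


def linearization_is_exact(coef, constraints):
    """True if the minimizing beta equals d1 * d2 for every binary d1, d2."""
    return all(best_beta(coef, d1, d2, constraints) == d1 * d2
               for d1, d2 in product((0, 1), repeat=2))


print(linearization_is_exact(+2, {5}))     # prints True
print(linearization_is_exact(-2, {6, 7}))  # prints True
print(linearization_is_exact(-2, {5}))     # prints False
```

In this toy setting, a positive coefficient only needs the lower bound from (5), and a negative coefficient only needs the upper bounds from (6) and (7), which is the reason the model can drop the other constraints.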

3 Our Algorithm

3.1 Branch-and-cut

The exact algorithm used by Jünger and Mutzel [8] to solve the TSCM problem becomes impractical as graphs grow larger and the theoretical lower bound is no longer effective in bounding the enumeration tree. The basic idea of branch-and-bound is the pruning of branches in this tree: at some node of the tree, the ordering of a certain set of vertices is already fixed. From this information, one can derive a lower bound on the number of crossings subject to these fixed vertices. If the lower bound is no better than the value of a feasible solution that has already been found, e.g., by some heuristic, the considered subtree cannot contain a better solution, so it does not have to be explored.

Through the integer programming model we formulated in the previous section, we can solve the TSCM problem directly with an LP-based branch-and-cut algorithm. The basic structure of this approach is an enumeration tree rooted at an LP relaxation of the original problem, i.e., the integrality constraints are relaxed. LPs are solved very quickly in practice. If the LP solution is integer, we can stop. Otherwise, we try to add cutting planes that are valid for all integer solutions of the ILP but not necessarily for (fractional) solutions of the LP. If such cutting planes are found, they are added to the LP and the process is reiterated. We resort to the branching part only if no more cutting planes are found. High-quality cutting planes that cause big changes to the objective function can be crucial to make the enumeration tree small. However, finding them is usually a sophisticated problem.

In the following, we describe our approach to resolve the above issue. We show that the TSCM problem is, in fact, a cut problem with additional constraints. We then describe how to generate a set of cutting planes which may help to improve the performance of our algorithm.

3.2 Generating Cutting Planes

If the 3-dicycle inequalities are relaxed, the TSCM problem is an unconstrained binary quadratic programming (UBQP) problem. We denote it as TSCM*. Every polytope corresponding to a UBQP problem is isomorphic to a cut polytope by [13], thus TSCM* can be reduced to a maximum cut problem. The corresponding graph $H = (A \cup B \cup \{r\}, E_1 \cup E_2)$ is defined as follows. Every vertex $a_{st} \in A$ ($b_{ij} \in B$) corresponds to a variable $\delta^1_{st}$ ($\delta^2_{ij}$) for $1 \le s < t \le n_1$ ($1 \le i < j \le n_2$) and every edge $e = (a_{st}, b_{ij}) \in E_1$ to a variable $\beta_{stij}$. Edges in $E_2$ join vertex $r$ with all vertices in $A$ and $B$. Now for some cut in $H$ defined by a (possibly empty) set $S \subseteq A \cup B \cup \{r\}$, let $\gamma_{v,w}$ be 1 if precisely one of the vertices $v$ and $w$ belongs to $S$, and 0 otherwise. Then the connection between the original UBQP problem and the maximum cut problem on $H$ is given by the equations

$$
\begin{aligned}
\delta^1_{st} &= \gamma_{r,a_{st}} && \text{for all vertices } a_{st} \in A \\
\delta^2_{ij} &= \gamma_{r,b_{ij}} && \text{for all vertices } b_{ij} \in B \\
\beta_{stij} &= \tfrac{1}{2}\left(\gamma_{r,a_{st}} + \gamma_{r,b_{ij}} - \gamma_{a_{st},b_{ij}}\right) && \text{for all edges } (a_{st}, b_{ij}) \text{ in } E_1
\end{aligned}
$$

Thus, the TSCM problem can be considered as a maximum cut problem with the following additional constraints:

$$
\begin{aligned}
0 \le \gamma_{v,a_{st}} + \gamma_{v,a_{tl}} - \gamma_{v,a_{sl}} \le 1 && \text{for } 1 \le s < t < l \le n_1 \\
0 \le \gamma_{v,b_{ij}} + \gamma_{v,b_{jk}} - \gamma_{v,b_{ik}} \le 1 && \text{for } 1 \le i < j < k \le n_2
\end{aligned}
$$

which are transformed from the 3-dicycle inequalities. Then it is not hard to see that cutting planes for maximum cut problems are all valid for the transformed TSCM problem.

The cut polytope has been investigated intensively in the literature and many classes of cutting planes are known, see [3] for a survey. In our algorithm, we concentrate on odd-cycle inequalities because in general a large number of such inequalities can be found and separated in polynomial time [1]. The validity of odd-cycle inequalities is based on the observation that the intersection of a cut and a cycle in a graph always contains an even number of edges. This leads to the following formulation of the odd-cycle inequalities:

$$
\gamma(F) - \gamma(C \setminus F) \le |F| - 1 \qquad \text{for } F \subseteq C,\ |F| \text{ odd, and } C \text{ a cycle in } H
$$
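The correspondence between the product variable and the cut variables, as well as the validity of the odd-cycle inequalities on a 3-cycle, can be verified by enumerating all cuts of a single triangle (r, a, b). The following sketch is only an illustration; the vertex names and function names are assumptions, with a standing for some vertex $a_{st}$ and b for some vertex $b_{ij}$.

```python
from itertools import product


def gamma(S, v, w):
    """Cut indicator: 1 if exactly one of v and w belongs to S, else 0."""
    return int((v in S) != (w in S))


def check_triangle(names=("r", "a", "b")):
    """Check the beta equation and the four odd-cycle inequalities on every cut."""
    r, a, b = names
    for flags in product((False, True), repeat=3):
        S = {v for v, inside in zip(names, flags) if inside}
        g_ra, g_rb, g_ab = gamma(S, r, a), gamma(S, r, b), gamma(S, a, b)
        # beta = delta1 * delta2 with delta1 = g_ra and delta2 = g_rb,
        # written as 2 * beta = g_ra + g_rb - g_ab to stay in integers.
        if 2 * (g_ra * g_rb) != g_ra + g_rb - g_ab:
            return False
        # Odd-cycle inequality with |F| = 3.
        if g_ra + g_rb + g_ab > 2:
            return False
        # Odd-cycle inequalities with |F| = 1.
        if (g_ra - g_rb - g_ab > 0 or g_rb - g_ra - g_ab > 0
                or g_ab - g_ra - g_rb > 0):
            return False
    return True


print(check_triangle())  # prints True
```

Every cut meets the triangle in zero or two edges, so all four inequalities hold for integral cut vectors; they can only be violated by fractional LP solutions.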

However, due to the 3-dicycle inequalities, it is hard to determine which odd-cycle inequalities are useful for improving the performance of our branch-and-cut algorithm. Moreover, the exact separation routine in [1] requires $O(n^3)$ running time, where $n$ is the number of vertices of $H$. For practical purposes this is rather slow. In our algorithm, we augment $H$ to $H' = (A \cup B \cup \{r\}, E_1 \cup E_2 \cup E_3)$ by adding new edges $e_1 = (a_{st}, a_{tl})$, $e_2 = (a_{tl}, a_{sl})$ and $e_3 = (a_{st}, a_{sl})$ for $1 \le s < t < l \le n_1$ and assigning them weights of zero. A similar process is also applied to vertices in $B$. We scan through all triangle sets $(r, a_{st}, b_{ij})$, $(a_{st}, a_{tl}, b_{ij})$ and $(b_{ij}, b_{jk}, a_{st})$ and check each of the four associated cycle inequalities for violation. This can be done in $O(n^2)$ time.

The main procedure of our branch-and-cut algorithm is performed as follows:

1. Solve an initial LP without the 3-dicycle inequalities.
2. If the solution is infeasible, try to find violated cutting planes. The separation is performed in the order: 3-dicycle inequalities, odd-cycle inequalities associated with triangle sets $(v, a_{st}, b_{ij})$ and odd-cycle inequalities associated with triangle sets $(a_{st}, a_{tl}, b_{ij})$ and $(b_{ij}, b_{jk}, a_{st})$.
3. Revise the LP and re-solve it.
4. Repeat steps 2 and 3. If no cutting planes are generated, branch.

Besides the triangle inequalities described above, we also tried to generate other odd-cycle inequalities by using the algorithm in [1]. However, this approach is time-consuming and does not noticeably improve the performance of our algorithm, so we do not include it in our experimental evaluation. Moreover, in our experiments we found that our cutting plane algorithm performs very well with the set of cutting planes described above. For some instances, we do not have to go to the branching step at all, see Sect. 4.
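The two kinds of checks used in the separation step can be sketched as follows. This is a minimal illustration, assuming that the fractional LP values are available as plain numbers; the function names, the dictionary layout `delta[(i, j)]` for i < j, and the tolerance `eps` are hypothetical and not taken from the original implementation.

```python
from itertools import combinations


def violated_3dicycles(delta, n, eps=1e-9):
    """Return triples (i, j, k), i < j < k, violating 0 <= d_ij + d_jk - d_ik <= 1."""
    bad = []
    for i, j, k in combinations(range(n), 3):
        value = delta[i, j] + delta[j, k] - delta[i, k]
        if value < -eps or value > 1 + eps:
            bad.append((i, j, k))
    return bad


def violated_triangle_inequalities(g_uv, g_vw, g_uw, eps=1e-9):
    """Return the names of the odd-cycle inequalities violated on triangle (u, v, w)."""
    lhs_minus_rhs = {
        "uv+vw+uw<=2": g_uv + g_vw + g_uw - 2,   # |F| = 3
        "uv-vw-uw<=0": g_uv - g_vw - g_uw,       # |F| = 1, F = {uv}
        "vw-uv-uw<=0": g_vw - g_uv - g_uw,       # |F| = 1, F = {vw}
        "uw-uv-vw<=0": g_uw - g_uv - g_vw,       # |F| = 1, F = {uw}
    }
    return [name for name, excess in lhs_minus_rhs.items() if excess > eps]


# A cyclic "ordering" 0 < 1, 1 < 2, 2 < 0 violates the 3-dicycle inequality.
print(violated_3dicycles({(0, 1): 1, (1, 2): 1, (0, 2): 0}, 3))  # prints [(0, 1, 2)]
# A point with a single cut edge on a triangle violates one odd-cycle inequality.
print(violated_triangle_inequalities(1.0, 0.0, 0.0))  # prints ['uv-vw-uw<=0']
```

Each triangle requires a constant number of comparisons, so the cost of a separation round is proportional to the number of triangles scanned.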

4 Experimental Results

In order to evaluate the practical performance of the approach presented in the previous section, we performed a computational evaluation. In this section, we report the results and compare them to the results obtained with the branch-and-bound algorithm in [8]. We tested the performance of

– B&C 1: Our branch-and-cut algorithm using odd-cycle inequalities,
– B&C 2: The CPLEX MIP-solver with default options,
– JM: The exact algorithm used by Jünger and Mutzel [8].

These algorithms have been implemented in C++ using the Mixed Integer Optimizer of CPLEX 9.0. For a better comparison, our experiments for all three algorithms were carried out on the same machine, a standard desktop computer with a 2.99 GHz processor and 1.00 GB of RAM running Windows XP. We reimplemented the algorithm of [8] using the ILP solver of CPLEX to solve a subproblem on the enumeration tree. In the remainder of this section, all running times are given in seconds. The test suite of our experiments is the same as used in [8]. It is generated by the program random_bigraph of the Stanford GraphBase by Knuth [9].

The main results from our experiments are reported in Table 1. We give the results for “10+10-graphs”, i.e., bipartite graphs with 10 nodes on each layer, with increasing edge densities up to 90%. We compare our results with those of the exact algorithm presented in [8]. The notation used in Table 1 is as follows:

– ni: number of nodes on layer i for i = 1, 2
– m: number of edges
– time: the running time of each algorithm
– value: the minimum number of crossings computed by each algorithm
– nodes: the number of generated nodes in the branch-and-cut tree
– cuts: the number of user cuts generated by our branch-and-cut algorithm
– Gom.: the number of Gomory cuts generated by CPLEX with default options
– cliq.: the number of clique cuts generated by CPLEX with default options

Table 1. Results for “10+10-graphs” with increasing density

          B&C 1                        B&C 2                                  JM [8]
ni   m    nodes  cuts   time    value  nodes    Gom.  cliq.   time     value  time    value
10   10   2      120    0.50    1      0        0     0       0.03     1      0.20    1
10   20   2      670    2.12    11     2        4     3       0.30     11     0.53    11
10   30   0      1787   2.86    52     159      4     19      1.45     52     3.92    52
10   40   0      9516   8.20    142    3610     2     48      26.16    142    8.52    142
10   50   0      14526  12.52   276    8938     1     112     101.03   276    19.41   276
10   60   0      19765  21.23   459    14154    2     236     220.46   459    39.65   459
10   70   9      37857  245.88  717    [38044]  [1]   [427]   [1000]   [717]  103.11  717
10   80   0      25553  24.05   1037   19448    3     813     642.87   1037   216.26  1037
10   90   0      5468   9.67    1387   3354     3     1334    158.96   1387   234.62  1387

Notice that in our approach we did not use any initial heuristics, in order to give clearer and independent runtime figures. Nevertheless, as is obvious from Table 1, our approach is much faster than the previous algorithm. This is particularly true for dense graphs, e.g., graphs with more than 80 edges (bold figures indicate that optimal solutions have been found earlier than by the previous algorithm). Compared to B&C 2, there is a much smaller number of subproblems to be solved in our branch-and-cut algorithm. It is encouraging to see that many instances have been solved to optimality in the cutting plane phase.

For sparse graphs, the branch-and-cut algorithm with default options of CPLEX performs better. It has Gomory cuts and clique cuts that could be helpful in reducing the effort required to complete the enumeration. However, for dense graphs, e.g., the graph with 70 edges, the instances are too large for the default branch-and-cut algorithm. We set a general time limit of 1000 seconds. Whenever this limit was reached, we report the best result and the numbers of nodes and cuts generated so far; the figures are then put into brackets. Notice that both branch-and-cut algorithms perform worse than the previous algorithm when the test graph has 70 edges.

In Sect. 2 we mentioned that the objective function used in our ILP model has an advantage. This is now clearly visible in Table 1: some very dense graphs are solved even faster than graphs with fewer edges. The reason is that, in our formulation, variables generated from a subgraph K2,2 are implicitly substituted by 1, as the two corresponding terms in (3) then sum up to 1. This is possible since every K2,2 induces exactly one crossing in any solution. This helps to reduce the LP size and allows very dense instances to be solved quickly.
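The claim that every K2,2 induces exactly one crossing can be confirmed by enumerating all orderings of its four vertices. The sketch below is an illustration with a hypothetical function name; positions are given as tuples indexed by vertex, and only the relative order on each layer matters.

```python
from itertools import combinations, permutations


def k22_crossings(pi1, pi2):
    """Count crossings of K_{2,2} with V1 = {0, 1} and V2 = {0, 1}."""
    edges = [(s, i) for s in (0, 1) for i in (0, 1)]
    count = 0
    for (s, i), (t, j) in combinations(edges, 2):
        # Two edges cross iff their end vertices appear in opposite order.
        if (pi1[s] - pi1[t]) * (pi2[i] - pi2[j]) < 0:
            count += 1
    return count


all_counts = {k22_crossings(p1, p2)
              for p1 in permutations((0, 1))
              for p2 in permutations((0, 1))}
print(all_counts)  # prints {1}
```

Since the count is 1 for every ordering, the contribution of such a subgraph is a constant and needs no variables in the LP.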

5 Conclusion and Future Work

We have studied the two-sided crossing minimization problem by modeling it directly as a binary quadratic programming problem. We have described a strategy to generate effective cutting planes for the TSCM problem. This is based on reformulating it as a cut problem with additional constraints. We have shown that these cutting planes can considerably improve the performance of our branch-and-cut algorithm. Our computational results show that our algorithm runs significantly faster than earlier exact algorithms even without using any heuristic algorithm for computing starting solutions, in particular for dense graphs.

Moreover, our approach for solving the TSCM problem reaches beyond its original application. In graph drawing, many combinatorial optimization problems can be modeled as binary quadratic programming problems in a natural way. This is particularly true for various types of crossing minimization problems.

Our encouraging computational results were obtained for small-scale graphs. In the future, we plan to test the performance of our algorithm on larger graphs. Beyond the sets of inequalities used in this paper, it would be interesting to identify classes of facet-defining inequalities for the TSCM problem. We are currently investigating this direction.

References

1. F. Barahona and A. R. Mahjoub. On the cut polytope. Mathematical Programming, 36:157–173, 1986.
2. T. Biedl, F. J. Brandenburg, and X. Deng. Crossings and permutations. In P. Healy and N. S. Nikolov, editors, Proceedings of Graph Drawing 2005, pages 1–12, Limerick, Ireland, 2005.
3. E. Boros and P. L. Hammer. The max-cut problems and quadratic 0-1 optimization; polyhedral aspects, relaxations and bounds. Annals of Operations Research, 33:151–180, 1991.
4. P. Eades and N. C. Wormald. Edge crossings in drawing bipartite graphs. Algorithmica, 11:379–403, 1994.
5. M. R. Garey and D. S. Johnson. Crossing number is NP-complete. SIAM Journal on Algebraic and Discrete Methods, 4:312–316, 1983.
6. M. Grötschel, M. Jünger, and G. Reinelt. A cutting plane algorithm for the linear ordering problem. Operations Research, 32:1195–1220, 1984.
7. F. Harary and A. J. Schwenk. Trees with hamiltonian square. Mathematika, 18:138–140, 1971.
8. M. Jünger and P. Mutzel. 2-layer straight line crossing minimization: performance of exact and heuristic algorithms. Journal of Graph Algorithms and Applications, pages 1–25, 1997.
9. D. Knuth. The Stanford GraphBase: A platform for combinatorial computing. 1993.
10. X. Muñoz, W. Unger, and I. Vrt'o. One sided crossing minimization is NP-hard for sparse graphs. In P. Mutzel, M. Jünger, and S. Leipert, editors, Proceedings of Graph Drawing 2001, pages 115–123, Vienna, Austria, 2001.
11. H. Nagamochi. An improved approximation to the one-sided bilayer drawing. In G. Liotta, editor, Proceedings of Graph Drawing 2003, pages 406–418, Perugia, Italy, 2003.
12. C. Sechen. VLSI Placement and Global Routing using Simulated Annealing. Kluwer Academic Publishers, Boston, 1988.
13. C. De Simone. The cut polytope and the boolean quadric polytope. Discrete Mathematics, 79:71–75, 1990.
14. K. Sugiyama, S. Tagawa, and M. Toda. Methods for visual understanding of hierarchical systems. IEEE Transactions on Systems, Man, and Cybernetics, SMC-11(2):109–125, 1981.
15. A. Yamaguchi and A. Sugimoto. An approximation algorithm for the two-layered graph drawing problem. In 5th Annual International Conference on Computing and Combinatorics, pages 81–91, 1999.
