Towards a 4/3 Approximation for the Metric Traveling Salesman Problem
SWATI GUPTA
Department of Computer Science And Engineering Indian Institute of Technology, Delhi
12th May 2011
Towards a 4/3 Approximation for the Metric Traveling Salesman Problem A thesis submitted in partial fulfilment of the requirements for dual degree (Bachelor and Master of Technology) in Computer Science and Engineering by SWATI GUPTA 2006CS50451 under the guidance of Prof. NAVEEN GARG
Department of Computer Science And Engineering Indian Institute of Technology, Delhi 12th May 2011
Certificate This is to certify that Swati Gupta has worked under my supervision for her Master of Technology thesis entitled Towards a 4/3 Approximation for the Metric Traveling Salesman Problem during the course of her dual degree (Bachelor and Master of Technology) at Indian Institute of Technology, Delhi. This work is a bonafide record of research work carried out by her under my supervision. The contents of this thesis, in full or in parts, have not been submitted to any other Institute or University for the award of any degree or diploma. All sources and references have been acknowledged. I hereby recommend this submission to the institute in partial fulfilment of the requirements for dual degree (Bachelor and Master of Technology) in Computer Science and Engineering.
Professor Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi
Acknowledgments
"Unsolved problems give us an occasion to fight the ignorances of human mind" - Prof. Ram Murty of Queen's University. Introducing me to the wonderful Traveling Salesman Problem has been one of the best things Prof. Naveen Garg, my advisor, has done for me and now it has become nothing short of a passion for me. I remember telling Prof. Naveen Garg that this problem was very hard and that I was scared of pursuing it as my thesis project. But it is only with his constant encouragement, belief and support that I have been able to write this thesis today. I will always value our discussions on this project, for some of his statements have been defining ones for me. His clarity in thought and ideas has helped me not only progress through the project but also be a less confused person.
Secondly, I would like to thank Prof. Amitabha Tripathi for instilling in me a fondness for Graph Theory. I took this class in my seventh semester - scared almost by his reputation of failing half the students. But after a semester of rigorous problem solving I was sure that I wanted to work on a combination of Graph Theory and Algorithms in the coming years. He guided my proof writing and taught me how to write beautiful documents using LaTeX. He also showed us a documentary on Fermat's Last Theorem, which made me realise that it might just be awesome to work on a problem for seven years.
I would also like to thank Nishita Aggarwal, who enthusiastically worked with me during the later part of the thesis. She listened to me, asked innocent doubts that were sometimes loopholes in my proofs and helped in removing them. Discussions with her helped me clear my head, structure my thinking, and focus on one idea at a time. She also helped me in drawing some pictures used in this thesis.
It was not only this problem that kept me busy but also the interesting course work: Approximation Algorithms, for which I am grateful to Prof. Naveen Garg and Prof. Amit Kumar, and Advanced Algorithms, where I learnt the love of teaching from Prof. S. N. Maheshwari. I would like to thank my department - Prof. S. Arun Kumar for his invaluable anecdotes and for helping me build strong foundations in Theoretical Computer Science, Prof. Sanjeeva Prasad for his impeccable style in teaching Programming Languages and Logic for Computer Science, and Prof. Shyam Kumar Gupta for his trust and confidence in me, his honest criticism, and for being a source of the mysteries of Mathematics.
I would also like to thank IMPECS and Microsoft Research for conducting Workshops on Algorithms at IIT Delhi and Bangalore. I also thank my Algorithm Reading Group friends - Syamantak, Manoj, Ankur, Divya, Nishita, Deepak, Ankit, Sahil, Arpit, Arindam, Prof. Ragesh Jaiswal, Prof. Amit Kumar, Prof. Naveen Garg and Prof. Sandeep Sen - for giving talks on exciting topics and expanding the pool of ideas I was exposed to.
Besides, I would like to express my love and many thanks to my friends Vishal Narula - for patiently listening to me blabber about TSP, Divya - for letting me know when Prof. Naveen Garg could be found in his office and when their project meetings were scheduled, Pragun - for supporting me in my theoretical work even though he could not digest a lot of Mathematics, my mother - for making my project famous as the problems of salesmen in my house, my father and my family - who constantly thought my project was related to travel agencies, my grandfather - who, being an engineer himself, asked me a million times about the practical uses of TSP, my sister - who had to be content with the small graphs I drew for her and, above all, for the infinite fun I had in explaining all of it to them.
Lastly, I would like to thank Aman Gupta, for letting me use the layout of his thesis and structure of his certificates, and to Prof. David Williamson, for setting standards of how a master’s thesis should be written.
Swati Gupta Computer Science and Engineering Indian Institute of Technology, Delhi
Abstract
In this project, we consider a restriction of the Traveling Salesman Problem (TSP), which is formally stated as: 'Given the costs associated with traveling between any pair of n cities, find the tour of minimum cost which visits each city exactly once'. Held and Karp [4] formulated a lower bound for TSP using 1-trees in 1970. The value of this lower bound is equal to the value of the Subtour LP, which is conjectured to have an integrality gap of 3/4. Motivated by obtaining a 4/3 approximation for the traveling salesman problem using the Held-Karp bound, we consider the special case when distances satisfy the graph metric on an underlying unweighted graph G. When G is 2-vertex-connected and has a Hamiltonian path, we show how to obtain a spanning Eulerian trail of length at most (4/3)n. When G is 3-regular and 3-edge-connected, the Held-Karp bound is n and we show a novel approach to finding a 4/3 approximation for TSP on G. During the course of the project, we also looked at properties of graphs that are LP-oblivious and at the structure of half-integer vertices. We give comparisons of our work to recent unpublished results, and detail further directions for research.
Contents
List of Figures
1 Introduction
2 Literature Review
3 Tours using Hamiltonian paths
   3.1 Definitions
   3.2 Algorithm
4 Cubic 3-Edge-Connected Graphs
   4.1 Preliminaries
   4.2 Important Lemmas
   4.3 Algorithm
5 Future Work
List of Figures
2.1 An SEP Feasible Graph, that is not 1-edge-tough
2.2 Counter-Example
3.1 Deepest-edges
3.2 Deepest edges will never cross each other
3.3 Modified Graph
3.4 Intervals defined on the modified graph
3.5 Patterns formed for the modified graph
3.6 Connecting vertices in the intervals in 2 ways
3.7 Intervals inside d-adjacent deepest-edges
3.8 Removing an edge from connecting intervals disconnects the graph
3.9 Doubling an edge in the connecting interval
3.10 Final walk formed using P2
4.1 On Expanding a super-vertex with degree 2
4.2 On Expanding a super-vertex with degree 4
4.3 Expanding a super-vertex with degree 4
4.4 When v1,v4 are in the same component
Chapter 1
Introduction
The Traveling Salesman Problem (TSP) is one of the oldest and most extensively studied problems in the field of algorithms. It is formally stated as: Given the costs associated with traveling between any pair of n cities, find the tour of minimum cost which visits each city exactly once. This is equivalent to finding the minimum cost Hamiltonian cycle in a complete weighted graph. Mathematically, suppose we have costs c_ij for 1 ≤ i, j ≤ n, associated with going between each pair of cities; then we want to find a cyclic permutation σ_n such that

$$\sum_{i=1}^{n} c_{i\sigma_n(i)} = \min_{\tau_n \text{ cyclic}} \sum_{i=1}^{n} c_{i\tau_n(i)}.$$

TSP is an NP-hard problem that is also NP-hard to approximate. The proof can be found in the book Approximation Algorithms [28]. Some restrictions that make the problem approximable are metric TSP, asymmetric
TSP with triangle inequality and symmetric (1,2)-TSP. For symmetric TSP, the distances satisfy c_ij = c_ji for all i, j ∈ [1,n], while for asymmetric TSP there is no such condition. Edge costs in (1,2)-TSP are restricted to 1 and 2 only. The best known tour-constructing algorithms for these three cases give approximation factors of 3/2 (for metric TSP, refer to [3]), O(log n / log log n) (for asymmetric TSP with triangle inequality, refer to [21]) and 8/7 (for symmetric (1,2)-TSP, refer to [22]). Another interesting restriction of TSP is Graphical TSP, where the metric used is the graph metric. Given an undirected, unweighted graph G = (V,E), the cost between two vertices i and j is the length of the shortest path between i and j in G. A solution on this metric corresponds to a walk in the underlying graph G. Thus, the traveling salesman problem becomes the problem of finding the shortest closed walk in G which visits all vertices of G. The best known algorithm for graphical TSP was the 3/2 approximation algorithm given by Christofides in 1978. Two recent unpublished results give a (1.5 − ε)-approximation (Gharan et al [25]) and a 1.461-approximation (Svensson et al [23]) for graphical TSP. The goal for Metric TSP is to reach a 4/3 approximation, in support of the long-standing conjecture that a 4/3 approximation must exist. Working with Graphical TSP, which is a special case of Metric TSP, simplifies the problem and gives the added advantage of exploiting the structure of the graphs. The basis of the 4/3 conjecture comes from the Held-Karp heuristic, where Held and Karp used 1-trees as a relaxation of optimal tours. One way of obtaining a lower bound for an optimization problem is to solve a relaxed problem optimally. In this case, it translates to constructing a tour which
satisfies a subset of the properties of any TSP tour. The relaxed Subtour Linear Program for TSP is given as:

$$
\begin{aligned}
\text{minimize} \quad & \sum_{1 \le i < j \le n} c_{ij}\, x_{ij} \\
\text{subject to} \quad & x(\delta(v)) = 2 && \forall\, v \in V \\
& x(\delta(S)) \ge 2 && \forall\, \emptyset \subset S \subset V \\
& 0 \le x_e \le 1 && \forall\, e \in E
\end{aligned}
$$

It was proven in [4] that the value of the Held-Karp heuristic is equal to the optimum value of the Subtour LP. The Held-Karp heuristic typically generates solutions of cost above 99% of the optimal. However, there is a known example involving a subcubic graph where the solution obtained is (3/4)·OPT. Since no example worse than this is known, it is conjectured that the integrality gap of the Subtour LP is 3/4 and that there exists a 4/3 approximation algorithm for the TSP. The objective of this project is to study instances of TSP for which the Held-Karp heuristic gives the optimal solution, and to develop algorithms on these instances in support of the conjecture. We have restricted ourselves to the graph metric. We consider the simple question of whether a Hamiltonian path in a 2-vertex-connected graph can be converted into a spanning Eulerian trail. For this, we give an algorithm (referred to as the path algorithm) to convert a given Hamiltonian path into a spanning Eulerian closed trail using at most
(4/3)n edges. Next, cubic 3-edge-connected graphs are LP-oblivious, that is, the linear programming formulation gives the trivial bound of n for such graphs. An assignment which achieves this minimum value of n is obtained by simply assigning x_e = 2/3 to each of the edges. Gamarnik et al [18] gave an algorithm for these graphs with an approximation factor of 3/2 − 5/389. Recently, Boyd et al [2] also gave a 4/3 approximation for cubic graphs and a 7/5 approximation for subcubic graphs. In this thesis, we also give a 4/3 approximation algorithm for cubic 3-edge-connected graphs, referred to as the cubic algorithm.
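To illustrate the x_e = 2/3 assignment, the following minimal Python sketch brute-forces the Subtour LP constraints on K4, the smallest cubic 3-edge-connected graph. The function names, the choice of K4 and the exhaustive enumeration of cuts are assumptions made for this example only; the enumeration is exponential in the number of vertices and is meant for tiny graphs.

```python
from fractions import Fraction
from itertools import combinations

def cut_value(x, S):
    """Total x-weight of edges with exactly one endpoint in S, i.e. x(delta(S))."""
    return sum(w for (u, v), w in x.items() if (u in S) != (v in S))

def is_subtour_feasible(vertices, x):
    """Brute-force check of the Subtour LP constraints (illustrative only)."""
    # Bounds 0 <= x_e <= 1
    if any(w < 0 or w > 1 for w in x.values()):
        return False
    # Degree constraints x(delta(v)) = 2
    if any(cut_value(x, {v}) != 2 for v in vertices):
        return False
    # Cut constraints x(delta(S)) >= 2 for every nonempty proper subset S
    for size in range(1, len(vertices)):
        for S in combinations(vertices, size):
            if cut_value(x, set(S)) < 2:
                return False
    return True

# K4: every vertex has degree 3 and the graph is 3-edge-connected
vertices = [0, 1, 2, 3]
edges = list(combinations(vertices, 2))
x = {e: Fraction(2, 3) for e in edges}

feasible = is_subtour_feasible(vertices, x)
lp_value = sum(x.values())  # objective value with unit edge costs
```

Under these assumptions the assignment is feasible and its objective value with unit edge costs is 6 · 2/3 = 4 = n, matching the trivial bound.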
In the subsequent chapters, we discuss some related work, the path algorithm, the cubic algorithm, and future directions of research.
Chapter 2
Literature Review
In the book Combinatorial Optimisation [27], William Cook et al have beautifully explained the history and developments of the travelling salesman problem. People have studied the TSP in many different ways. They studied the convex polytope of tours and found inequalities which are facet-inducing for these polytopes, as in [19, 20]. There are classes of polytopes with different properties and hence different inclusion relations between them. These inequalities were formed as generalisations of those which hold for any tour (giving rise to relaxed tours), for example the subtour elimination constraints and comb inequalities (which are disjoint and both facet-inducing); then came clique tree inequalities, which generalise subtour and comb inequalities. It is quite interesting that a graph which satisfies the subtour elimination constraints need not satisfy a mixture of subtour and degree constraints (which give the comb inequality). Examples can be found in the book or in the papers [19], [20]. There were also studies quantifying the advantage of different classes of facet-inducing inequalities, refer to [24]. Apart from this, Alexander Schrijver's book on the History of Combinatorial Optimization [29] gives an interesting historical account of TSP that dates back to Kirkman and Hamilton.
The main idea behind N. Christofides' 3/2 approximation algorithm (1978) for the metric TSP (refer to [3]) is that after ensuring connectivity, one can add a cheapest possible set of edges to get a closed walk. For this, a minimum spanning tree is first found, and then the vertices of odd degree are matched using a matching of cost no more than (1/2)·OPT. This ensures that the degree of every vertex is even, thus giving an Eulerian graph. Hence, there exists a closed walk of cost no more than (3/2)·OPT. A generalisation of this technique is T-joins. For T ⊆ V, a T-join of G = (V,E) is a set of edges J such that |J ∩ δ(v)| ≡ |T ∩ {v}| (mod 2) for all v ∈ V. A T-join can always be reduced to a set of edge-disjoint paths. It is still not clear, though, how T-joins can be used to improve the best known approximation factor.
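The parity condition defining a T-join can be checked directly. The sketch below uses hypothetical helper names (`degrees`, `odd_degree_vertices`, `is_t_join`, none of which come from the thesis) and a toy instance: T is taken to be the odd-degree vertices of a spanning tree, as in Christofides' algorithm, and adding a T-join then makes every degree even.

```python
from collections import Counter

def degrees(edge_list):
    """Degree of every vertex in the multigraph given by edge_list."""
    deg = Counter()
    for u, v in edge_list:
        deg[u] += 1
        deg[v] += 1
    return deg

def odd_degree_vertices(edge_list, vertices):
    """Set of vertices with odd degree in edge_list."""
    deg = degrees(edge_list)
    return {v for v in vertices if deg[v] % 2 == 1}

def is_t_join(J, T, vertices):
    """True if |J ∩ δ(v)| is odd exactly for the vertices v in T."""
    deg = degrees(J)
    return all((deg[v] % 2 == 1) == (v in T) for v in vertices)

# Toy instance: a star spanning tree on 4 vertices (assumed to live in K4)
vertices = [0, 1, 2, 3]
tree = [(0, 1), (0, 2), (0, 3)]
T = odd_degree_vertices(tree, vertices)   # all four vertices have odd degree
matching = [(0, 1), (2, 3)]               # a perfect matching on T

matching_is_t_join = is_t_join(matching, T, vertices)
still_odd = odd_degree_vertices(tree + matching, vertices)
```

In this toy instance the matching is a T-join, and the union of tree and matching has no odd-degree vertex, so it is Eulerian.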
An important approach to approximating a problem is to use lower bounds. What is the minimum size of a subgraph of G in which every vertex has degree at least 2 (the D2 bound)? Or what is the smallest 2-edge-connected spanning subgraph of G? These questions give lower bounds on the length of any closed tour that visits all the vertices of the graph. In their Bachelor's thesis, Kushal et al [9] explain these lower bounds in detail. They give relations between the toughness of a graph, the D2 bound and the ear decomposition bound, and introduce a new bound, the Durability bound. Another important lower bound is the one developed by Held and Karp in [4]. The Held-Karp heuristic uses 1-trees, which are minimum spanning trees with an extra edge incident on vertex 1 that creates a cycle. Using Lagrangian multipliers, they push the degree of every node as close to 2 as possible. More analysis of the Held-Karp heuristic can be found in David Williamson's master's thesis (refer to [18]). It was in this thesis that the 4/3 conjecture was made.
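A minimal sketch of the 1-tree bound follows. It assumes the standard formulation (a minimum spanning tree on all vertices except one special vertex, plus the two cheapest edges at that vertex), with vertex penalties `pi` playing the role of the Lagrangian multipliers. The function names, the 0-based special vertex and the small instance are illustrative assumptions; the iterative search over `pi` is not shown.

```python
def mst_weight(cost, nodes):
    """Weight of a minimum spanning tree on `nodes` (Prim's algorithm, O(n^2))."""
    nodes = list(nodes)
    if len(nodes) <= 1:
        return 0
    # best[v] = cheapest known edge connecting v to the tree grown so far
    best = {v: cost[nodes[0]][v] for v in nodes[1:]}
    total = 0
    while best:
        v = min(best, key=best.get)
        total += best.pop(v)
        for u in best:
            best[u] = min(best[u], cost[v][u])
    return total

def one_tree_bound(cost, pi=None, special=0):
    """Lower bound on the optimal tour from a minimum 1-tree under penalties pi."""
    n = len(cost)
    pi = pi if pi is not None else [0] * n
    # Penalised costs c_ij + pi_i + pi_j; every tour pays exactly 2 * sum(pi) extra
    mod = [[cost[i][j] + pi[i] + pi[j] for j in range(n)] for i in range(n)]
    others = [v for v in range(n) if v != special]
    two_cheapest = sorted(mod[special][v] for v in others)[:2]
    return mst_weight(mod, others) + sum(two_cheapest) - 2 * sum(pi)

# Graph metric of the 4-cycle 0-1-2-3-0: adjacent pairs cost 1, opposite pairs cost 2
cost = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]
bound = one_tree_bound(cost)
```

On this instance the bound with zero penalties is 4, which equals the length of the optimal tour around the cycle.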
We tried to characterize graphs which have the Held-Karp bound equal to n. SEP feasible graphs are those for which some assignment of positive real values to the edges of the graph satisfies the constraints of the Subtour LP, refer to [17]. Such graphs are necessarily 2-vertex-connected, 1-tough and 1-block-tough. Hamiltonian graphs are also SEP feasible, and hence these become necessary conditions for Hamiltonicity. As an aside, a graph that is t-block-tough is also t-tough. Hence, block-toughness is a stronger condition than toughness, but still not a sufficient condition for Hamiltonicity.
Further, we explored edge-toughness, introduced by Katona et al in [7] and [12]. A necessary condition for a graph G to be Hamiltonian is that the size of any cut set S of vertices is at least the number of components of G\S. For edge-toughness, Katona et al generalize the cut set to a set of vertices and edges. They present it as a tool to prove non-Hamiltonicity. Another concept known at that time was non-path-toughness, which has no easy way of being proved for a given graph. Non-path-toughness takes a set of vertices X and counts the minimum number of disjoint paths required to connect the vertices of X. They prove that a t-edge-tough graph is also t-tough, that a Hamiltonian graph is always 1-edge-tough, and that 2t-toughness implies t-edge-toughness. Katona et al also prove that there exist (2t − ε)-tough graphs which are not t-edge-tough, and that every 1-edge-tough graph has a 2-factor.
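For small graphs, ordinary 1-toughness can be tested by brute force: a connected graph G is 1-tough if, for every nonempty proper vertex subset S, the number of components of G\S is at most |S|. The sketch below is an illustration under that definition; the function names and the two test graphs are assumptions, and the enumeration is exponential in the number of vertices.

```python
from itertools import combinations

def count_components(vertices, edges):
    """Number of connected components of the subgraph induced on `vertices`."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        if u in parent and v in parent:    # ignore edges touching removed vertices
            parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

def is_one_tough(vertices, edges):
    """Check c(G \\ S) <= |S| for every nonempty proper subset S (G assumed connected)."""
    vertices = list(vertices)
    for size in range(1, len(vertices)):
        for S in combinations(vertices, size):
            rest = [v for v in vertices if v not in S]
            if count_components(rest, edges) > size:
                return False
    return True

# The 5-cycle is Hamiltonian, hence 1-tough
c5_vertices = list(range(5))
c5_edges = [(i, (i + 1) % 5) for i in range(5)]

# K_{2,3} is 2-vertex-connected, but removing the two 'a' vertices leaves 3 components
k23_vertices = ["a0", "a1", "b0", "b1", "b2"]
k23_edges = [(a, b) for a in ("a0", "a1") for b in ("b0", "b1", "b2")]

c5_tough = is_one_tough(c5_vertices, c5_edges)
k23_tough = is_one_tough(k23_vertices, k23_edges)
```

The K_{2,3} example shows that 2-vertex-connectivity alone does not imply 1-toughness, so the toughness condition rules out graphs that connectivity does not.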
Motivated by some constructions of graphs with Held-Karp bound equal to n, we looked at half-integral solutions to the Subtour LP. Robert Carr et al in [6] give a 4/3 tour-constructing algorithm for half-integer triangle-vertices. Their algorithm covers only a very small portion of the solutions for which the Held-Karp bound is n. In this paper, however, they introduce an interesting notion of patterns, which we use later in our path algorithm. The following section presents some small observations made during the course of the background reading.
Observations
1. A necessary condition for |T-join| ≤ n/2 is that the graph should be 1-factorable.
2. The Held-Karp bound for k-regular, k-edge-connected graphs is n. This can be seen by simply assigning x_e = 2/k to each edge.
3. Though SEP feasible graphs are 2-vertex-connected, 1-tough and 1-block-tough, they need not be 1-edge-tough or path-tough. A simple example of an SEP feasible graph which is neither 1-edge-tough nor path-tough is shown in figure [??]. Consider the set of vertices A = {v1, v2, v3} and the set of edges Y = {a, b, c, d, e, f}. Let X = {}. Then (X,Y) acts as an A-separator. Also, considering the same (X,Y), we get the inequality for edge-toughness. This proves that the graph is neither 1-edge-tough nor path-tough. It is SEP feasible as we can simply assign 1/2 to the edges a, b, c, d, e, f and 1 to the others. This hints at the idea that the Held-Karp formulation does not really penalise disjoint paths, but preserves vertex-related properties (toughness) and connectivity.

Figure 2.1: An SEP Feasible Graph, that is not 1-edge-tough
4. The construction of graphs in [8] used for disproving the 2-tough conjecture has Held-Karp bound equal to n. Consider any G(L, u, v, l, 2l + 3); refer to [8] for definitions. Charge every instance of L with two 1/2-triangles at the vertices u and v and 1-edges otherwise, as shown in Figure 2.2. Next, since each occurrence of u and v is connected, join these through 1-edges to form a path of the L graphs. Since K_l again has a Hamiltonian path between any two vertices, raise the edges of the path to 1 to get a satisfying x_e assignment. An example with G(L,u,v,2,5) is shown in Figure 2.2.

Figure 2.2: Counter-Example
We started with a simple question: given a Hamiltonian path in a 2-vertex-connected graph, can we find a walk of length at most (4/3)n covering all the vertices? We found an interesting partition of edges which could be used to solve this problem. This algorithm is referred to as the path algorithm in this thesis, and is detailed in Chapter 3.
Gamarnik et al [5] give a (3/2 − 5/389)-approximation algorithm for cubic 3-edge-connected graphs, as a step towards supporting the 4/3 conjecture. Bill Jackson et al in [13] prove that every 3-edge-connected graph has a spanning even subgraph in which each component has size at least 5. To prove this result, they first prove a stronger statement: for a 3-edge-connected graph G with a vertex u of degree 3 and two edges e1 and e2 incident on u, there exists a spanning even subgraph X with {e1, e2} ⊂ E(X) and σ(X) ≥ 5, where σ(X) is the size of the smallest component of X. We convert their proof into an algorithm for finding a spanning even subgraph of a given graph G that satisfies these properties. Using this, we give a 4/3 approximation for cubic 3-edge-connected graphs. This algorithm, called the cubic algorithm in this thesis, is detailed in Chapter 4.
Chapter 3
Tours using Hamiltonian paths
Problem Statement: To find a tour of length at most (4/3)n, given a Hamiltonian path in an undirected Hamiltonian graph.
3.1 Definitions
Let P be the Hamiltonian path and let the vertices on the path be labelled {v_1, ..., v_n}, where n is the number of vertices in the graph. We follow the convention that for every edge (v_s, v_t), s < t. Any edge (v_s, v_t) is said to be l-incident at v_s and h-incident at v_t. From the set of edges {(v_s, v_k) : k ∈ I} incident on a vertex v_s, its deepest-edge is the edge (v_s, v_t) such that t ≥ k for all k ∈ I. The deepest-edge in an interval of vertices is the edge that is l-incident on a vertex in the interval and h-incident on the highest-indexed vertex among all such edges. Two deepest-edges (v_s, v_t) and (v_k, v_l) are said to be d-adjacent if s < k < t < l (or k < s < l < t).
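These definitions translate almost directly into code. The sketch below is illustrative only: it assumes vertices are identified by their index on the path, edges are stored as pairs (s, t) with s < t (path edges included), and ties between equally deep edges are broken arbitrarily. The function names and the 6-vertex example graph are hypothetical.

```python
def deepest_edge_from(edges, s):
    """Deepest-edge of v_s: the edge (s, t) l-incident at v_s with the largest t."""
    higher = [t for (a, t) in edges if a == s]
    return (s, max(higher)) if higher else None

def deepest_edge_in_interval(edges, lo, hi):
    """Edge l-incident on some v_s with lo <= s <= hi that reaches the highest index."""
    candidates = [(a, t) for (a, t) in edges if lo <= a <= hi]
    return max(candidates, key=lambda e: e[1]) if candidates else None

def d_adjacent(e, f):
    """Two edges are d-adjacent if their endpoints interleave along the path."""
    (s, t), (k, l) = e, f
    return s < k < t < l or k < s < l < t

# Hamiltonian path v_1 ... v_6 plus three extra edges
path_edges = [(i, i + 1) for i in range(1, 6)]
edges = path_edges + [(1, 3), (2, 5), (4, 6)]

from_v2 = deepest_edge_from(edges, 2)                 # (2, 5)
in_first_two = deepest_edge_in_interval(edges, 1, 2)  # (2, 5)
crossing = d_adjacent((1, 3), (2, 5))                 # True: 1 < 2 < 3 < 5
disjoint = d_adjacent((1, 3), (4, 6))                 # False: the edges do not interleave
```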
3.2 Algorithm
1. Building a set of deep edges
   (a) Include the deepest-edge from the vertex v_1 in S.
   (b) If the last added deepest-edge is (v_k, v_l), include in S the deepest-edge in the interval [v_1, v_{l−1}].
   (c) Repeat Step 1b till an edge h-incident on v_n is included.
   (d) Label the edges in the order of addition as {e_1, ..., e_K}.
Figure 3.1: Deepest-edges
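Step 1 can be sketched as a short greedy loop. This is a minimal illustration under stated assumptions, not the thesis implementation: vertices are the integers 1..n in path order, edges are pairs (s, t) with s < t and include the path edges, ties between equally deep edges go to the lowest-indexed start vertex, and the name `build_deep_edges` is hypothetical.

```python
def build_deep_edges(n, edges):
    """Greedy selection of deepest-edges e_1, ..., e_K along the path v_1..v_n."""
    # reach[s] = highest index t such that (s, t) is an edge
    reach = {}
    for s, t in edges:
        reach[s] = max(reach.get(s, s), t)

    S = [(1, reach[1])]                    # (a) deepest-edge from v_1
    while S[-1][1] < n:                    # (c) stop once v_n is reached
        l = S[-1][1]
        # (b) deepest-edge in the interval [v_1, v_{l-1}]
        s = max(range(1, l), key=lambda v: reach.get(v, v))
        if reach.get(s, s) <= l:
            # No edge crosses v_l, so v_l would be a cut vertex
            raise ValueError(f"no edge crosses v_{l}")
        S.append((s, reach[s]))
    return S                               # (d) list order gives the labels e_1..e_K

# Hamiltonian path v_1 ... v_6 plus three extra edges (a 2-vertex-connected graph)
path_edges = [(i, i + 1) for i in range(1, 6)]
deep = build_deep_edges(6, path_edges + [(1, 3), (2, 5), (4, 6)])
```

On this example the loop selects (1, 3), (2, 5) and (4, 6) in that order; each new edge reaches strictly beyond the previous one, so the loop always makes progress or raises.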
Claim 1: Step 1 will terminate. Suppose there is a vertex v_i such that the deepest-edge in the interval [v_1, v_{i−1}] does not cross the vertex v_i. Then v_i is a cut-vertex, which violates the Hamiltonicity of the given graph.
Claim 2: For i > 2, the deepest-edge e_i is l-incident on [v_t, v_{l−1}], where e_{i−2} = (v_s, v_t) and e_{i−1} = (v_k, v_l). If this were not so, then either the fact that it is deepest in the interval [v_1, v_{l−1}] would be violated, or the selection of an earlier edge would be contradicted. That is, a situation like Figure 3.2 can never arise. Only the 'deeper' of these two edges (which is e_3 in this case) would have been picked in Step 1, instead of e_2.

Figure 3.2: Deepest edges will never cross each other

2. Splitting vertices: Split every vertex v_k on which two deepest-edges are incident into 2 vertices v_k1 and v_k2 such that the deepest-edge which is h-incident on v_k becomes h-incident on v_k1 and the other becomes incident on v_k2. The example in Figure 3.1 thus becomes as shown in Figure 3.3. Note that edges which were d-adjacent in the previous graph remain d-adjacent in the new graph as well. Also, two deepest-edges which are incident on the same vertex are never d-adjacent.
Figure 3.3: Modified Graph
3. Defining Intervals: Now, define the intervals formed by the endpoints of the deepest-edges in the set S as {x_i}. For K deepest-edges, the number of intervals formed will be exactly 2K−1. The intervals for the modified graph in Figure 3.3 are shown in Figure 3.4.
Figure 3.4: Intervals defined on the modified graph
4. Forming Patterns: Define three patterns P_i for i ∈ {1,2,3} using the intervals {x_j : j ≡ i (mod 3)} as follows:
   P_1 : x_1 e_2 x_4 e_3 x_7 e_5
   P_2 : e_1 x_2 e_2 x_5 e_4 x_8 e_5
   P_3 : e_1 x_3 e_3 x_6 e_4 x_9
These three patterns for the modified graph are shown in Figure 3.5. Note that every pattern takes pairs of d-adjacent deepest-edges separated by some intervals x_i, which we will henceforth refer to as the connecting intervals.

Figure 3.5: Patterns formed for the modified graph

5. Selecting a pattern: Evaluate the cost of each pattern P_i by adding 1 for each deepest-edge used in the pattern and adding the number of vertices in each interval x_j used in it. The patterns formed above have costs of 7, 4 and 5 respectively. In general, the total cost of the patterns, C, is

$$C = \sum_i x_i + 2K,$$

since each interval is used exactly once and each deepest-edge is used twice. But

$$\sum_i x_i = n - 2K,$$

as the vertices in the intervals and the endpoints of the deepest-edges add up to the total number of vertices. Hence, the total cost of the patterns is n. Hence, at least one of the patterns must have a cost