
Multicommodity Flows and Approximation Algorithms

by

Naveen Garg
Department of Computer Science and Engineering

A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy

to the

Indian Institute of Technology, Delhi April 1994

Certificate

This is to certify that the thesis titled "Multicommodity Flows and Approximation Algorithms" being submitted by Naveen Garg to the Indian Institute of Technology, Delhi, for the award of the Degree of Doctor of Philosophy, is a record of bona fide research carried out by him under my supervision. The results obtained in this thesis have not been submitted to any other University or Institute for the award of any other degree or diploma.

Vijay V. Vazirani Professor, Department of Computer Science and Engineering Indian Institute of Technology New Delhi 110016

Abstract

This thesis is about multicommodity flows and their use in designing approximation algorithms for problems involving cuts in graphs. In a ground-breaking work, Leighton and Rao [34] showed an approximate max-flow min-cut theorem for uniform multicommodity flow and used this to obtain an approximation algorithm for the flux of a graph. We consider the multicommodity flow problem in which the objective is to maximize the sum of the flows routed and prove the following approximate max-flow min-multicut theorem:

\[ \frac{\text{min-multicut}}{O(\log k)} \;\le\; \text{max-flow} \;\le\; \text{min-multicut} \]

where $k$ is the number of commodities. Our proof is based on a rounding technique from [34]. Further, we show that this theorem is tight.

For a multicommodity flow instance with specified demands, the ratio of the maximum concurrent flow to the sparsest cut was shown to be bounded by $O(\log^2 k)$ [30, 57, 17, 47]. We use ideas from our proof of the approximate max-flow min-multicut theorem and a geometric scaling technique from [1] to provide an alternate proof of this bound.

For the special case when the graph is a tree, we can, using a Primal-Dual approach, show a 2-approximate maximum integral flow minimum multicut theorem. We also relate integral flow and multicut in trees to matching and vertex cover problems in graphs.

Another special case for which we can obtain better approximate max-flow min-cut theorems is that of multiway cuts. Here we show that the dual linear program always has a half-integral optimum. This yields a 2-approximate max-flow min-multiway cut theorem for edge and node weighted graphs.

The above approximate min-max relations also yield approximation algorithms for the corresponding cut problems. These results for multicommodity flows were obtained in joint work with Vazirani and Yannakakis [17, 18, 19].

We also provide a combinatorial approach to characterizing the vertices of the s-t cut polyhedron and in the process show that not all s-t cuts are vertices of this polyhedron. However, there is a small set of inequalities such that the corresponding polyhedron has exactly the s-t cuts as its vertices. This result was obtained in joint work with Vazirani [16].

Acknowledgements

I am truly indebted to Vijay Vazirani, my thesis advisor, without whom this thesis would never even have started. Vijay initiated me into research while I was still an undergraduate, taught me how to go about it and encouraged me throughout these 3 years. His approach to algorithms, with the emphasis on identifying the combinatorial structure in the problem, the technique used and the search for simple and elegant solutions, has left a deep impact on me. The courses that he taught on approximation algorithms have been a valuable part of my learning. I am thankful to him for being so generous with his time and for being so immensely patient with me.

My trips to Berkeley during the summer months of 92 and 93 provided me with an opportunity for meeting and working with other researchers in the field. I wish to thank my host, Umesh Vazirani, for arranging these. Umesh has a highly intuitive approach to things and in the many discussions that I have had with him, he provided me with some of this intuition and with different ways of looking at some of the results in this thesis. I am grateful to him for that and for the GO board that he brought for me. I also wish to thank Dorit Hochbaum for supporting my stay at Berkeley, for collaborating in my research and for sharing with me her immense knowledge of approximation algorithms. The days spent at Berkeley were great fun and I thank Mor and Sandeep for being such wonderful hosts.

I am very fortunate to have been here with a great set of people who were always ready to share their ideas with me. Thanks to Suren Reddy for all those milk-coffee discussions, Santy for the numerous kala jamuns that he has treated me to and Manica for being a good sounding board for ideas. I also wish to thank Huzur Saran for numerous fruitful discussions and for carefully checking all my ideas.

Many are those who do not have anything directly to do with this thesis, or my research, but who have made my stay here very lively with numerous interesting discussions. These include Satish, Sushil, Sharat, Vijay, PP, Rajneesh, Nidhi, Sushma and others at the hostel and Naseer, Moorthy, Neelima, Neena and Rekha at the department. Things would have been very different without them and I thank all of them for the great times.

The one quarter that I have always received constant encouragement and support from is my family: my parents and my sisters, Nidhi and Nalini. This thesis is dedicated to them.

Credits

The motivation for most of the work in this thesis has been the results of Leighton and Rao [34] on uniform multicommodity flow and Klein, Agrawal, Ravi and Rao [30] for general multicommodity flows. The results in Chapter 2 on the s-t cut polyhedron were obtained jointly with Vijay Vazirani and appear in [16]. The geometric scaling technique was developed jointly with Manica Aggarwal and was first applied to a problem in network design [1]. All the other results in this thesis were obtained jointly with Vijay Vazirani and Mihalis Yannakakis. These include our results on multicut and integral multicommodity flows on trees [18] (Chapter 4), node multiway cuts [19] (Chapter 3), maximum multicommodity flow in general graphs [17] (Chapter 5) and the $O(\log k \log D)$ bound on the min-cut max-flow ratio for general multicommodity flow [17]. I am grateful to all my co-authors for allowing me to include results, obtained in joint work with them, into this thesis.

Contents

1 Introduction
    1.1 Multicommodity Flows
    1.2 Edge Disjoint Paths and Integral Flow
    1.3 Approximation Algorithms
    1.4 Applications
    1.5 Techniques Used
    1.6 Organization of the Thesis

2 The s-t Cut Polyhedron
    2.1 Introduction
    2.2 Preliminaries
    2.3 Characterizing Vertices
    2.4 Other Formulations of the Dual
    2.5 The s-t Cut Polyhedron
    2.6 Discussion

3 Node Multiway Cuts
    3.1 Introduction
    3.2 Preliminaries
    3.3 Half-integrality of the Optimum
    3.4 Approximate Max-flow Min-multiway Cut Theorem
    3.5 Approximating the Multiway Cut

4 Multicut in Trees
    4.1 Introduction
    4.2 Preliminaries
    4.3 Finding the Maximum Integral Flow
        4.3.1 Unit Height Trees
        4.3.2 Trees with Unit Capacity Edges
        4.3.3 Trees with Edge Capacities
    4.4 Approximating Integral Flow and Multicut
    4.5 Integrality Gap for Grid Graphs
    4.6 The Tree-representable Set Cover Problem

5 Multicut in General Graphs
    5.1 Introduction
    5.2 Preliminaries
    5.3 Two Crucial Lemmas
        5.3.1 Growing a Region
        5.3.2 Growing Disjoint Regions
    5.4 Approximate Max-flow Min-multicut Theorem
    5.5 Approximating the Minimum Multicut
    5.6 Applications

6 The Min-Cut Max-Flow Ratio
    6.1 Introduction
    6.2 Preliminaries
    6.3 Min-Cut Max-Flow Ratio
        6.3.1 Uniform Path Lengths
        6.3.2 Well-spaced Functions
        6.3.3 Ill-spaced Functions
    6.4 Splitting into Well-spaced Functions
        6.4.1 Making a Function Monotone
        6.4.2 Finding Well-spaced Functions
    6.5 Approximating the Sparsest Cut

7 Discussion and Open Problems

Bibliography

CHAPTER 1

Introduction

1.1 Multicommodity Flows

The classical maximum flow problem plays a central role in combinatorial optimization. The origins of flow research can be traced to the 1956 paper of Ford and Fulkerson in which they published a 'direct labelling' method for finding the maximum flow. The method implies the famous max-flow min-cut theorem: the maximum amount of flow is equal to the minimum capacity of any cut separating the source from the sink. The power of this theorem lies in that it relates two fundamental graph-theoretic quantities via the potent mechanism of a min-max relation.

It was also Ford and Fulkerson who considered the problem of finding multicommodity flows, which involves simultaneously shipping several different commodities from their respective sources to their sinks in a single network so that the total amount of flow through any edge is bounded by a given capacity.

The importance of the max-flow min-cut theorem to flow theory and the theory of cuts in graphs has led researchers to seek its generalization to the case of multicommodity flow. The objective now is to maximize the sum of the flows of the commodities subject to capacity and flow conservation requirements. The notion of a multicut generalizes that of a cut, and is defined as a set of edges whose removal disconnects each source from its corresponding sink. Clearly, the maximum multicommodity flow is bounded above by the minimum multicut; the question is whether equality holds.

In 1963, T.C. Hu [24] extended Ford and Fulkerson's labelling algorithm to 2-commodity flow problems and obtained a max-biflow min-cut theorem. However, for more than two commodities one can construct very simple examples to show that equality does not hold in general. Consider a tree of height one with three leaves. Each pair of leaf nodes forms the source-sink pair for a commodity. If all edges have unit capacities the maximum flow is $3/2$, whereas the minimum multicut is 2. In this situation the best one can hope for is an approximate max-flow min-multicut theorem. We show one such theorem by proving that the maximum flow is at least $\Omega(1/\log k)$ times the minimum multicut, and that there is an instance where this lower bound is achieved.

Cunningham [10] showed that this 'gap' between the maximum flow and the minimum multicut can be tightened to $2 - \frac{2}{k}$ for the special case when each pair of nodes, from amongst a set of $k$ specified nodes, is the source-sink pair of a commodity. The multicut now separates these $k$ nodes and is called a multiway cut. The tree example mentioned above, when extended to a tree with $k$ leaves, shows that the $2 - \frac{2}{k}$ gap between the max-flow and the min-multiway cut is tight. We extend this result to the case of node multiway cuts (weights on the nodes), and show that the minimum node multiway cut has weight no more than $2 - \frac{2}{k}$ times the maximum multicommodity flow. This version of multicommodity flow has throughput constraints on the nodes instead of capacity constraints on the edges. Once again there are examples to show that this $2 - \frac{2}{k}$ ratio between the minimum node multiway cut and the maximum flow is tight.

Given a multicommodity flow problem, with demands associated with the commodities (the amount of the commodity that we wish to ship), one often needs to know if there is a feasible flow, i.e. a flow that satisfies the demands and obeys the capacity constraints. For a feasible flow to exist it is necessary that the capacity of any cut be at least the sum of the demands that are separated by the cut. The max-flow min-cut theorem for single commodity flow implies that this cut condition is also sufficient. In contrast, a multicommodity flow problem can be infeasible even if the cut condition is satisfied. The simplest such example is where the network is $K_{2,3}$ with unit-capacity edges and unit demand between every pair of nodes not connected by an edge.

Instances for which the cut condition is also sufficient have been extensively studied. Seymour [54] considered the supply-demand graph, the graph obtained from the original network by adding a source-sink edge for each commodity; he showed that if this graph excludes $K_5$ as a minor then the cut condition is sufficient for flow routability. Okamura and Seymour [42] showed that in a planar graph, if all sources and sinks lie on the boundary of a single face then the cut condition is sufficient.

A natural question to ask is how large a 'safety margin' is needed, i.e. by what factor should the capacity of a cut exceed the total demand separated by the cut, in order to ensure existence of a feasible flow. In a pioneering work, Leighton and Rao [34] showed that for uniform multicommodity flow (a special case of multicommodity flow where there is a demand of 1 unit between every pair of nodes) it is sufficient that the capacity of every cut exceed the demand across it by a factor $O(\log n)$ and that this factor is the best possible. For the general undirected multicommodity flow problem Klein, Agrawal, Ravi and Rao [30] showed that it is sufficient to have a 'safety margin' of $O(\log C \log D)$, where $C$ and $D$ are the sums of the capacities and the demands respectively. Tragoudas [57] improved this margin to $O(\log n \log D)$. In [17] we show that this ratio is $O(\log k \log D)$, where $k$ is the number of commodities. Subsequently, Plotkin and Tardos [47] gave a method for scaling the demands that allows them to replace the $\log D$ in the above bound by $\log k$. In this thesis we use a geometric scaling technique from [1] and provide another proof of this $O(\log^2 k)$ ratio.
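The two small examples of Section 1.1 can be checked by brute force. The sketch below is only an illustration and not part of the thesis: the function names `star_gap` and `k23_cut_condition` are hypothetical. For the star with $k$ leaves it routes $1/(k-1)$ units per pair, which meets the capacity upper bound $k/2$, and enumerates edge subsets to find the minimum multicut. For $K_{2,3}$ it verifies the cut condition on every cut and then applies a counting argument, assumed here as the reason for infeasibility: every demand pair is at distance 2, so routing all demands would need more capacity than the network has.

```python
from fractions import Fraction
from itertools import combinations


def star_gap(k):
    """Star with k unit-capacity leaf edges; every pair of leaves is a commodity."""
    leaves = range(k)
    pairs = list(combinations(leaves, 2))
    # Route 1/(k-1) units per pair; the path for (i, j) uses leaf edges i and j.
    per_pair = Fraction(1, k - 1)
    load = [sum(per_pair for p in pairs if e in p) for e in leaves]
    assert all(x <= 1 for x in load)  # capacities are respected
    flow = per_pair * len(pairs)
    # Every path uses two edges and the total capacity is k, so k/2 is optimal.
    assert flow == Fraction(k, 2)
    # Brute-force the smallest set of leaf edges separating every pair.
    for size in range(k + 1):
        for removed in combinations(leaves, size):
            if all(i in removed or j in removed for i, j in pairs):
                return flow, size


def k23_cut_condition():
    """K_{2,3}, unit capacities, unit demand between non-adjacent pairs."""
    A, B = ["a1", "a2"], ["b1", "b2", "b3"]
    nodes = A + B
    edges = [(a, b) for a in A for b in B]
    demands = list(combinations(A, 2)) + list(combinations(B, 2))

    def crossing(pairs, S):
        return sum(1 for u, v in pairs if (u in S) != (v in S))

    cut_ok = all(
        crossing(edges, S) >= crossing(demands, S)
        for r in range(1, len(nodes))
        for S in map(set, combinations(nodes, r))
    )
    # Each demand pair is at distance 2, so routing one unit for every pair
    # consumes at least 2 * len(demands) units of edge capacity in total.
    capacity_suffices = 2 * len(demands) <= len(edges)
    return cut_ok, capacity_suffices


print(star_gap(3))
print(k23_cut_condition())
```

For the star, the ratio of the second value to the first is $(k-1)/(k/2) = 2 - \frac{2}{k}$, the gap quoted in the text.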

1.2 Edge Disjoint Paths and Integral Flow

Another classical problem in graph theory is finding edge disjoint paths. A fundamental characterization was found in 1927 by K. Menger: the maximum number of pairwise disjoint paths between a given source and sink in a graph is equal to the minimum size of a cut separating source and sink. Menger's result forms the basis for research on paths in graphs, continuing until the present day. Later it was observed that Menger's theorem and Ford and Fulkerson's theorem can be derived from each other by a simple construction, providing a link between paths and flows.

If all capacities are integral the maximum flow for the case of a single commodity is integral. However, integer and half-integer multicommodity flow problems are NP-complete and so is the edge-disjoint paths problem. These problems are tractable for the special case of graphs where the cut condition is sufficient for feasible flow. For all such graphs, if capacities and demands are integral then flow can be routed half-integrally and if, in addition, the nodes satisfy a certain 'evenness' property, one can obtain an integral flow in polynomial time.

The maximization version of the edge-disjoint paths problem, i.e. finding the maximum number of edge-disjoint paths connecting the corresponding sources and sinks, has a nice characterization (a min-max theorem) when the demand graph formed by introducing an edge for each source-sink pair is a complete graph [38, 7, 41]. For the case when the original graph is a tree we show that the maximum number of edge-disjoint paths can be found in polynomial time if the capacities are unity. For arbitrary capacities the problem is NP-hard and we give an approximate min-max relation: the maximum integral flow is at least as large as half the minimum multicut. We also show that the ratio between the minimum multicut and the maximum integral flow can be as large as $k/2$ for general graphs.

1.3 Approximation Algorithms

We now get to the main idea of this thesis: designing good approximation algorithms. Since most real world combinatorial optimization problems are NP-hard and it is unlikely that they can be solved in polynomial time, researchers in the past have attempted to find polynomial time algorithms that guarantee a solution with value that is within a multiplicative factor of $\alpha$ of the value of the optimal solution. The factor $\alpha$ is variously referred to as the performance guarantee or the approximation bound of the algorithm and an algorithm with performance guarantee $\alpha$ is called an $\alpha$-approximation algorithm. The notion of approximation algorithms can be extended to include all such algorithms which trade the quality of the solution obtained with the amount of resources used (space, time, number of processors etc.).

Inherent to any approximation algorithm for a combinatorial optimization problem is the notion of a lower bound. To guarantee an approximation factor of $\alpha$ our analysis of the algorithm needs to compare the optimum with the quality of the solution found by the algorithm. Since finding the value of the optimum is NP-hard, we resort to comparing the value of the solution obtained with a lower bound on the value of the optimum. A lower bound is said to be $\alpha$-good if for all instances the value of the lower bound is at least $\frac{1}{\alpha}$ times the value of the optimum. If for some instance the lower bound is $\frac{1}{\alpha}$ times the optimum then any approximation algorithm using this lower bound alone cannot guarantee an approximation factor better than $\alpha$. Thus, finding good lower bounding techniques is central to the task of designing good approximation algorithms.

Finding the minimum multicut is NP-hard. However, the maximum multicommodity flow is a lower bound on the weight of the minimum multicut and from our approximate max-flow min-multicut theorem it follows that this lower bound is $O(\log k)$-good. Our proof of the approximate max-flow min-multicut theorem also leads to an approximation algorithm for the minimum multicut that has a performance guarantee of $O(\log k)$. The problem of finding the minimum multicut in trees is as hard as the vertex cover problem in general graphs. Once again, using maximum integral flow as a lower bound we obtain a 2-approximation algorithm for this problem.

Dahlhaus, Johnson, Papadimitriou, Seymour and Yannakakis [11] showed that the minimum multiway cut problem is NP-hard for any $k \ge 3$ and they give a $\left(2 - \frac{2}{k}\right)$-approximation algorithm for this problem. Our techniques also yield an algorithm with the same performance guarantee. However, one very useful feature of our approach is that it can be extended to find node multiway cuts that have weight no more than $2 - \frac{2}{k}$ times the minimum (this is not true for the algorithm in [11]).

The optimization version of the feasibility problem for multicommodity flow is to find the maximum $f$ such that an $f$ fraction of each demand can be routed concurrently. The fraction $f$ is called the maximum concurrent flow; the problem is feasible if $f \ge 1$. Define the sparsest cut to be the cut that minimizes the ratio of the total capacity of the cut to the sum of the demands separated by the cut. It is easy to see that $f$ cannot exceed the sparsest cut ratio and the sequence of papers [30, 57, 17, 47] prove that $f$ is at least as large as the sparsest cut ratio times $\Omega\left(\frac{1}{\log^2 k}\right)$. This approximate min-max relation is the basis of the claim made earlier that an $O(\log^2 k)$ safety margin is sufficient for flow routability. The proof of this theorem also yields an $O(\log^2 k)$-approximation algorithm for the sparsest cut in the graph (computing the sparsest cut is NP-hard). We provide an alternate proof of this fact.

It was in this context that the first approximate min-max theorem for multicommodity flow was given by Leighton and Rao [34]. As mentioned before, they considered the special case of uniform multicommodity flows and showed that the maximum concurrent flow is at least $\Omega\left(\frac{1}{\log n}\right)$ times the sparsest cut ratio. They also obtain an $O(\log n)$-approximation algorithm for the sparsest cut in this setting. The sparsest cut ratio here closely approximates another interesting graph-theoretic entity, the edge expansion or flux of the graph (see footnote 1).

Footnote 1: The flux of the graph is the minimum over all cuts of the ratio of the capacity of the cut to the number of nodes in the smaller of the two sides of the cut.
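The definition of the sparsest cut given in Section 1.3 can be made concrete with an exhaustive search over cuts. This is a minimal sketch for tiny instances only, assuming integer capacities and demands stored on unordered node pairs; the function name `sparsest_cut` and the data layout are hypothetical, and the search is exponential in the number of nodes (the problem is NP-hard, as noted above).

```python
from fractions import Fraction
from itertools import combinations


def sparsest_cut(nodes, capacity, demand):
    """Return (ratio, S) minimizing capacity(S) / demand(S) over all cuts S.

    capacity and demand map node pairs (u, v) to non-negative integers.
    """
    nodes = list(nodes)
    first, rest = nodes[0], nodes[1:]
    best = None
    # S always contains `first` and never all nodes, so each cut is seen once.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            S = {first, *extra}
            cap = sum(c for (u, v), c in capacity.items() if (u in S) != (v in S))
            dem = sum(d for (u, v), d in demand.items() if (u in S) != (v in S))
            if dem == 0:
                continue  # this cut separates no demand
            ratio = Fraction(cap, dem)
            if best is None or ratio < best[0]:
                best = (ratio, S)
    return best


# A 4-cycle with unit capacities and uniform demands (one unit per node pair).
cycle_nodes = [0, 1, 2, 3]
cycle_capacity = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1}
cycle_demand = {pair: 1 for pair in combinations(cycle_nodes, 2)}
print(sparsest_cut(cycle_nodes, cycle_capacity, cycle_demand))
```

On this instance the best cut splits the cycle into two adjacent pairs: two edges are cut and four demands are separated, giving ratio $1/2$. By the easy direction stated in the text, the maximum concurrent flow here cannot exceed $1/2$.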

1.4 Applications

Discrete mathematics and combinatorial optimization have found in VLSI a field that is an example par excellence for applications. An important tool in designing large layouts is the family of graph separator theorems (see footnote 2), which decompose large graphs into smaller graphs of roughly equal size. Placement and routing problems typically take a divide-and-conquer approach; they split the graph, recursively lay out the two pieces and insert the edges in between these two pieces. It has been observed in practice that both the time taken by the program and the quality of the layout depend greatly on the number of wires that are reinserted in the final step and the ratio of the sizes of the two pieces. Bhatt and Leighton [5] show formally that a provably good algorithm for bisection (the minimum edge cut that splits the graph into two halves) can be used to devise provably good algorithms for a variety of VLSI related problems. Unfortunately, the bisection problem is NP-complete. Leighton and Rao [34] show how an $\alpha$-approximation algorithm for the sparsest cut can be used to obtain a separator of weight at most $O(\alpha \log n)$ times the weight of the best bisection. These ideas combined with the algorithms in [5] produce polylogarithmic approximations for a host of VLSI related problems.

Given a layout as a set of edge-disjoint paths, the problem of wiring a layout is to assign segments of paths to different layers so that all segments in a layer are node disjoint. The objective now is to minimize the number of transitions from one layer of the chip to another (via minimization). Klein et al. [30] show that the via minimization problem for a 2-layer layout can be solved by finding the minimum multicut in an appropriate graph.

Footnote 2: A separator is the smallest edge (node) cut that splits the graph into two pieces such that the size of one is no more than twice the other. Separators and their applications have been extensively studied [36, 37].

1.5 Techniques Used

In this thesis the basic paradigm for solving problems is to use the bounds obtained from linear programming and duality considerations to design approximation algorithms with provably good guarantees. For many NP-hard optimization problems involving cuts in graphs, there is a natural way of defining a multicommodity flow which provides good lower bounds on the value of the minimum cut. Approximate max-flow min-cut theorems for this setting then testify to the goodness of these lower bounds. In some cases, for example multiway cuts and multicuts, the multicommodity flow lower bound turns out to be the same as the lower bound obtained by considering the LP relaxation.

How do we use these lower bounds to design approximation algorithms and prove guarantees on the quality of the solution obtained? One approach is to formulate the problem as an integer program and find an optimal solution to its linear programming relaxation (the linear program obtained by ignoring the integrality constraints). This optimal solution is, in general, a fractional solution and needs to be rounded to obtain a good integral solution. For the node multiway cut problem we exploit the structure of the problem to argue that there is an optimal solution to this linear program that is half-integral. We then round the $\frac{1}{2}$'s to 1's to obtain a multiway cut of weight at most $2 - \frac{2}{k}$ times the minimum fractional multiway cut.

For the case of multicuts we employ a more sophisticated rounding technique, the origins of which can be traced to a graph clustering technique first proposed by Awerbuch [4]. This was later applied by Leighton and Rao, and Klein et al., in the context of multicommodity flow. This thesis refines this technique further. It dispenses with the discretization of edge lengths and instead uses the idea of a packing of cuts to obtain simpler proofs and better bounds on the constants. We also use an initialization technique that leads to smaller clusters and a better bound on the capacity of the multicut obtained.

A novel way of using the linear programming lower bound to obtain an approximation algorithm with good performance guarantee is the Primal-Dual approach of [2, 21, 59]. The primal-dual method has been used extensively in the past for designing polynomial time algorithms for some of the most basic combinatorial optimization problems including matching, flows and shortest paths in graphs.

The basic idea behind the primal-dual method is to use the complementary slackness conditions to guide the search for the optimal primal and dual solutions. Thus the method begins with a feasible solution to the dual linear program and an infeasible solution to the primal such that the complementary slackness conditions hold. The goal at each step then is to 'increase' the feasibility of the primal solution and the optimality of the dual solution while maintaining complementary slackness. The combinatorial structure of the problem dictates the improvements that are to be made at each step. The procedure ends when the primal is feasible. Since all the complementary slackness conditions hold for a pair of primal and dual feasible solutions, these solutions are the optimal ones.

In designing approximation algorithms for NP-hard combinatorial optimization problems using this approach we relax the complementary slackness conditions sufficiently so as to ensure that we can find improved integral solutions to the primal program at each step. Once a feasible solution to the primal program is found, the procedure stops. The extent to which the complementary slackness conditions were relaxed determines the ratio of the values of the dual and primal solutions and also the approximation factor achieved. Many approximation algorithms, most notably those of Chvátal [9] and Bar-Yehuda and Even for the set cover problem, fit into this framework. Another problem area where this technique has led to significant advances is that of network design [2, 21, 59, 14, 20]. In this thesis we use the Primal-Dual approach for our algorithm for approximating the minimum multicut and the maximum integral flow in trees.

Another technique that we use in this thesis is a geometric scaling technique from [1]. In [1] we use this technique to scale the connectivity requirements in a network design problem; this gives us an approximation guarantee that is independent of the requirements. In this thesis we use this technique in our algorithm for approximating the sparsest cut.
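To make the relaxed primal-dual scheme concrete, here is a minimal sketch on weighted vertex cover, in the style of the Bar-Yehuda and Even algorithm. It is an illustration only and is not the thesis's algorithm for multicut in trees; the function name and data layout are hypothetical. Each uncovered edge raises its dual variable until an endpoint becomes tight, and tight vertices enter the cover. Chosen vertices are exactly paid for by the duals, while an edge with a positive dual may be covered by one or two chosen endpoints, which is the relaxation that gives the factor 2.

```python
def primal_dual_vertex_cover(weights, edges):
    """Return (cover, y): a vertex cover and a feasible dual edge packing.

    weights maps each vertex to a positive number; edges is a list of pairs.
    The cover weighs at most twice sum(y.values()), hence at most twice optimal.
    """
    slack = dict(weights)  # weight of each vertex not yet paid for by the duals
    y = {}                 # dual variable of each edge that was raised
    cover = set()
    for u, v in edges:
        if u in cover or v in cover:
            continue  # primal constraint for this edge is already satisfied
        delta = min(slack[u], slack[v])  # raise y_e until an endpoint is tight
        y[(u, v)] = delta
        slack[u] -= delta
        slack[v] -= delta
        if slack[u] == 0:
            cover.add(u)
        if slack[v] == 0:
            cover.add(v)
    return cover, y


vc_weights = {"a": 1, "b": 2, "c": 1}
vc_edges = [("a", "b"), ("b", "c")]
vc_cover, vc_dual = primal_dual_vertex_cover(vc_weights, vc_edges)
print(sorted(vc_cover), vc_dual)
```

The dual value $\sum_e y_e$ is a lower bound on the optimum by weak duality, so comparing the cover against it plays the role of the $\alpha$-good lower bound of Section 1.3 with $\alpha = 2$.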

1.6 Organization of the Thesis

The rest of this thesis is organised as follows. Chapter 2 deals with the s-t cut polyhedron and provides a combinatorial approach to characterizing the vertices of this polyhedron. It also shows a small set of inequalities such that the corresponding polyhedron has exactly the s-t cuts as vertices.

Chapter 3 looks at node multiway cuts in graphs. We first prove that the linear programming relaxation of an integer program for the node multiway cut problem has an optimal solution that is half-integral and then we show how to round this solution to obtain a node multiway cut of weight at most $2 - \frac{2}{k}$ times the optimum.

Chapter 4 contains the results for multicut and integral flow on trees. We relate the maximum integral flow problem on trees to the problem of finding a maximum matching in the demand graph and to other variants of matching. This chapter also describes a polynomial time algorithm for finding the maximum integral flow in uncapacitated trees and the primal-dual algorithm for approximating maximum integral flow and minimum multicut in trees with edge capacities.

Chapter 5 talks about multicut in general graphs; in particular we prove the $O(\log k)$-approximate max-flow min-multicut theorem here and show the instance for which this bound on the min-multicut is achieved. We also give an $O(\log k)$-approximation algorithm for the minimum multicut and discuss its applications.

Chapter 6 concerns sparsest cuts and the feasibility of multicommodity flow. It borrows results from Chapter 5 to prove the $O(\log^2 k)$ ratio between the maximum concurrent flow and the sparsest cut ratio and gives an $O(\log^2 k)$-approximation algorithm for computing the sparsest cut in a graph. Chapter 7 concludes the thesis with some open problems.

CHAPTER 2

The s-t Cut Polyhedron

2.1 Introduction

One of the cornerstone problems in the theory of cuts in graphs is that of finding the minimum s-t cut. An s-t cut is a set of edges whose removal destroys all paths from node s to node t. The capacity of the minimum s-t cut is equal to the maximum flow from s to t. It is this famous max-flow min-cut theorem that is the basis for all algorithms computing the minimum s-t cut.

The maximum s-t flow problem can be formulated as a linear program. It is well-known that the integral solutions to the dual LP correspond to s-t cuts [44]. Further, the constraint matrix of the dual LP is totally unimodular so that every vertex of the dual polyhedron is integral and hence corresponds to an s-t cut. Since the optimum value of the dual LP is attained at a vertex (an s-t cut), by the Duality theorem of linear programming we have that the maximum s-t flow is equal to the capacity of an s-t cut. Hence this must be the minimum s-t cut and so max-flow equals min-cut.

Does every s-t cut appear as a vertex of the dual polyhedron? In this chapter we show that, surprisingly enough, the answer is 'No'. There seems to be no way of formulating an LP for maximum s-t flow such that the polyhedron corresponding to the dual set of inequalities has as its vertices all s-t cuts. This raises another question: can one write a polynomial number of inequalities (polynomial in the number of nodes in the graph), such that the corresponding polyhedron has all s-t cuts as its vertices, and moreover, every vertex is of this type? We answer this question in the affirmative; the fact has possible algorithmic applications that we discuss in Section 2.6.

Polyhedra arising from basic combinatorial objects, such as matchings, matroids and cuts, have been extensively studied in the past (see [39, 53]). Despite much investigation, we have not come across the above stated results on s-t cuts. This is somewhat surprising, considering that the characterization of the vertices of this polyhedron as s-t cuts is a textbook example of the use of total unimodularity.

2.2 Preliminaries

Consider the maximum s-t flow problem over a directed graph $G = (V, E)$. Let $c : E \to \mathbb{R}^+$ be the capacity function on the arcs, and let $s, t \in V$ be the source and sink respectively. By introducing an arc $(t, s)$ of infinite capacity we can view flow from $s$ to $t$ as a circulation, and the objective now is to maximize the flow in the arc $(t, s)$. Let $f_{ij}$ denote the flow in arc $(i, j)$. The problem can be formulated as a linear program as follows:

\[
\begin{array}{lrcll}
\text{maximize} & f_{ts} \\
\text{subject to} & \sum_{(j,i) \in E} f_{ji} - \sum_{(i,j) \in E} f_{ij} & \le & 0 & i \in V \\
 & f_{ij} & \le & c_{ij} & (i,j) \in E - (t,s) \\
 & f_{ij} & \ge & 0 & (i,j) \in E
\end{array}
\]

The first set of inequalities says that the total flow into node $i$ is at most the total flow out of it. Note that if these inequalities hold for each node $i \in V$, then in fact they must all hold with equality, thereby implying flow conservation at each node. This is so because a deficit in the flow balance at one node implies a surplus at some other. The second set of inequalities are capacity constraints on the arcs.
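The max-flow min-cut relationship that this LP and its dual encode can be checked numerically on a small instance. The sketch below is an illustration only: it uses the standard Edmonds-Karp augmenting-path method rather than an LP solver, and the four-node instance and the function name `max_flow_min_cut` are assumptions made for the example, not part of the thesis.

```python
from collections import deque

def max_flow_min_cut(n, arcs, s, t):
    """Edmonds-Karp on nodes 0..n-1; arcs is a list of (i, j, capacity)."""
    cap = [[0] * n for _ in range(n)]          # residual capacities
    for i, j, c in arcs:
        cap[i][j] += c
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            break                               # no augmenting path is left
        # Find the bottleneck capacity along the path, then augment.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck
    # Nodes reached by the last (failed) search form the s-side of a minimum cut.
    s_side = {v for v in range(n) if parent[v] != -1}
    cut = [(i, j, c) for i, j, c in arcs if i in s_side and j not in s_side]
    return flow, s_side, cut

# Hypothetical instance with s = 0 and t = 3.
arcs = [(0, 1, 3), (0, 2, 2), (1, 2, 1), (1, 3, 2), (2, 3, 3)]
flow, s_side, cut = max_flow_min_cut(4, arcs, 0, 3)
cut_capacity = sum(c for _, _, c in cut)
```

On this instance the flow value and the capacity of the returned cut coincide, as the max-flow min-cut theorem requires.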


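The dual of this LP is

\[
\begin{array}{lrcll}
\text{minimize} & \sum_{(i,j) \in E} d_{ij} c_{ij} \\
\text{subject to} & d_{ij} & \ge & p_i - p_j & (i,j) \in E - (t,s) \\
 & p_s - p_t & \ge & 1 \\
 & p_i & \ge & 0 & i \in V \\
 & d_{ij} & \ge & 0 & (i,j) \in E - (t,s)
\end{array}
\tag{2.1}
\]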

The variables $d_{ij}$ and $p_i$ can be viewed as distance labels on the arcs and potentials on the nodes respectively. The first inequality then says that the potential drop across any arc cannot exceed the label on that arc. Also, the total potential drop between the source and the sink should be at least 1. The polyhedron corresponding to the set of inequalities (2.1) will be referred to as the dual polyhedron. In the next two sections we characterize the vertices and edges of this polyhedron. Before we do that we need the following definitions.

Definition 2.2.1 A set of nodes, $S$, in a graph $G$ is connected if the undirected subgraph induced over $S$ is connected.

Definition 2.2.2 An s-t cut in the graph $G$ is specified by a partition of the node set, $V$, into sets $U$ and $W$ such that $s \in U$ and $t \in W$, and consists of all arcs $(u, w)$ such that $u \in U$ and $w \in W$. This cut is represented as $(U, W)$.

2.3 Characterizing Vertices

Theorem 2.3.1 The vertices of the dual polyhedron correspond exactly to s-t cuts in which the s-side is connected.

Proof: A vertex of the dual polyhedron is defined by the intersection of $m + n$ linearly independent inequalities. Any choice of $m + n$ linearly independent inequalities must contain at least one of $d_{ij} = 0$, $d_{ij} = p_i - p_j$ for each arc $(i, j) \in E - (t, s)$. This is so because each variable $d_{ij}$ occurs in just the two inequalities $d_{ij} \ge 0$ and $d_{ij} \ge p_i - p_j$. If neither of these is chosen then we would be picking $m + n$ inequalities over $m + n - 1$ variables, and clearly the inequalities picked would not be linearly independent. Let $G'$ be the subgraph of the arcs $(i, j)$ for which both $d_{ij} = 0$ and $d_{ij} = p_i - p_j$ are chosen.

Claim 2.3.1 $G'$ is acyclic. (Throughout this chapter 'acyclic' refers to the absence of cycles in the undirected graph.)

The equations $d_{ij} = 0$ and $d_{ij} = p_i - p_j$ imply $p_i = p_j$. Let $G'$ contain the cycle $(i_1, i_2, \ldots, i_k, i_1)$. Now $p_{i_1} = p_{i_2} = \cdots = p_{i_k}$, and therefore $d_{i_1 i_k} = p_{i_1} - p_{i_k}$ implies $d_{i_1 i_k} = 0$. So the set of inequalities is not independent. Hence $G'$ is acyclic.

Let $r$ be the number of connected components in $G'$. We have so far chosen $m + n - r$ inequalities, and these are of the kind $d_{ij} = 0$, $d_{ij} = p_i - p_j$. The remaining $r$ inequalities need to be picked from amongst $p_i = 0$, $p_s - p_t = 1$. Our proof of Claim 2.3.1 shows that all nodes in a connected component of $G'$ are at the same potential. Picking $p_i = 0$ for two different nodes in the same component of $G'$ would therefore lead to dependence, and hence any choice of linearly independent inequalities would have $p_i = 0$ for at most one node from each component of $G'$. The fact that the nodes in a component of $G'$ are equipotential also implies that $s$ and $t$ do not belong to the same component of $G'$, since then $p_s = p_t$ and this violates $p_s - p_t \ge 1$. For a similar reason we cannot choose $p_i = 0$ for any node in the component containing $s$, since then $p_s = 0$ and $p_s - p_t \ge 1$ is violated. Thus, in order to choose $r$ inequalities from $p_i = 0$, $p_s - p_t \ge 1$, we need to pick $p_i = 0$ for a node in each component not containing $s$, and we also need to choose $p_s - p_t = 1$.

Since $s$ and $t$ do not belong to the same component of $G'$, we would have picked $p_i = 0$ for some node in the component containing $t$, and hence $p_t = 0$. This in turn implies that $p_s = 1$ (since $p_s - p_t = 1$). Hence all nodes in the component of $G'$ containing $s$ are at potential 1 while the remaining nodes are at potential zero. The potential of the nodes defines a cut, with equipotential nodes lying on the same side of the cut. Clearly the nodes with potential 1 (the s-side of the cut) are connected in $G$.

We remark that the arcs $(i, j)$ that are directed from the s-side to the t-side (forward arcs) have $p_i = 1$, $p_j = 0$. Corresponding to these arcs we should have chosen $d_{ij} = p_i - p_j$, since choosing $d_{ij} = 0$ violates $d_{ij} \ge p_i - p_j$. Arcs that are directed from the t-side to the s-side (back arcs) have $p_i = 0$ and $p_j = 1$. For these arcs we should have picked $d_{ij} = 0$, as choosing $d_{ij} = p_i - p_j$ violates $d_{ij} \ge 0$. For all other arcs not in $G'$, any one of $d_{ij} = 0$, $d_{ij} = p_i - p_j$ could have been chosen.

Thus all choices of $m + n$ linearly independent inequalities yield s-t cuts which have their s-side connected. Hence all vertices of the dual polyhedron are of this kind.

Moreover, every s-t cut which has its s-side connected is a vertex of the polyhedron. Given such a cut we need to exhibit $m + n$ linearly independent inequalities that define the cut. We choose the inequalities in the following manner:

1. Pick a spanning tree, $T$, on the s-side of the cut. For each arc $(i, j) \in T$ choose both $d_{ij} = 0$ and $d_{ij} = p_i - p_j$.
2. For all nodes on the t-side pick $p_i = 0$. Also pick $p_s - p_t = 1$.
3. For all forward arcs choose $d_{ij} = p_i - p_j$. For the remaining arcs choose $d_{ij} = 0$.

It is easy to check that these inequalities are linearly independent and define the cut.
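To see how restrictive the condition of Theorem 2.3.1 is, one can enumerate all s-t cuts of a small graph and keep those whose s-side is connected in the sense of Definition 2.2.1. The following is a minimal brute-force sketch; the four-node graph and the helper names `is_connected` and `st_cuts` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def is_connected(nodes, arcs):
    """True if the undirected subgraph induced on `nodes` is connected."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for i, j in arcs:
            # Treat every arc as an undirected edge.
            for a, b in ((i, j), (j, i)):
                if a == u and b in nodes and b not in seen:
                    seen.add(b)
                    stack.append(b)
    return seen == nodes

def st_cuts(nodes, s, t):
    """Yield the s-side U of every s-t cut (U, W)."""
    others = [v for v in nodes if v not in (s, t)]
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            yield {s, *extra}

# Hypothetical graph: node b is adjacent only to t.
nodes = ["s", "a", "b", "t"]
arcs = [("s", "a"), ("a", "t"), ("b", "t")]
all_cuts = list(st_cuts(nodes, "s", "t"))
vertex_cuts = [U for U in all_cuts if is_connected(U, arcs)]
```

Here only two of the four s-t cuts have a connected s-side, so by Theorem 2.3.1 only those two appear as vertices of the dual polyhedron (2.1).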

2.4 Other Formulations of the Dual

It is surprising that the vertices of the dual polyhedron do not capture all s-t cuts and that the cuts are asymmetric in their s and t sides. It turns out that the asymmetry of the cuts is due to the way the linear program for maximum s-t flow is formulated. In particular, if the primal and the dual programs were

\[
\begin{array}{lrcll}
\text{maximize} & f_{ts} \\
\text{subject to} & \sum_{(i,j) \in E} f_{ij} - \sum_{(j,i) \in E} f_{ji} & \le & 0 & i \in V \\
 & f_{ij} & \le & c_{ij} & (i,j) \in E - (t,s) \\
 & f_{ij} & \ge & 0 & (i,j) \in E
\end{array}
\]

\[
\begin{array}{lrcll}
\text{minimize} & \sum_{(i,j) \in E} d_{ij} c_{ij} \\
\text{subject to} & d_{ij} & \ge & p_j - p_i & (i,j) \in E - (t,s) \\
 & p_t - p_s & \ge & 1 \\
 & p_i & \ge & 0 & i \in V \\
 & d_{ij} & \ge & 0 & (i,j) \in E - (t,s)
\end{array}
\tag{2.2}
\]

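then the vertices of the dual polyhedron correspond to s-t cuts which have their t-side connected.

If, instead of formulating the flow as a circulation and maximizing the flow in the arc $(t, s)$, we maximize the net flow going out of the source, then the primal and the dual are

\[
\begin{array}{lrcll}
\text{maximize} & \sum_{(s,i) \in E} f_{si} - \sum_{(i,s) \in E} f_{is} \\
\text{subject to} & \sum_{(j,i) \in E} f_{ji} - \sum_{(i,j) \in E} f_{ij} & = & 0 & i \in V - \{s, t\} \\
 & f_{ij} & \le & c_{ij} & (i,j) \in E \\
 & f_{ij} & \ge & 0 & (i,j) \in E
\end{array}
\]

\[
\begin{array}{lrcll}
\text{minimize} & \sum_{(i,j) \in E} d_{ij} c_{ij} \\
\text{subject to} & d_{ij} & \ge & p_i - p_j & (i,j) \in E;\ i \ne s,\ j \ne t \\
 & d_{si} & \ge & 1 - p_i & (s,i) \in E;\ i \ne t \\
 & d_{is} & \ge & p_i - 1 & (i,s) \in E;\ i \ne t \\
 & d_{ti} & \ge & -p_i & (t,i) \in E;\ i \ne s \\
 & d_{it} & \ge & p_i & (i,t) \in E;\ i \ne s \\
 & d_{ij} & \ge & 0 & (i,j) \in E \\
 & d_{st} & \ge & 1 \\
 & d_{ts} & \ge & -1
\end{array}
\]

The dual program is equivalent to

\[
\begin{array}{lrcll}
\text{minimize} & \sum_{(i,j) \in E} d_{ij} c_{ij} \\
\text{subject to} & d_{ij} & \ge & p_i - p_j & (i,j) \in E \\
 & d_{ij} & \ge & 0 & (i,j) \in E \\
 & p_s & = & 1 \\
 & p_t & = & 0
\end{array}
\tag{2.3}
\]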

In this formulation, the dual polyhedron has vertices that correspond to s-t cuts with both the s-side and the t-side connected. If we maximize the net flow going into the sink, then the dual polyhedron has vertices that again correspond to s-t cuts with both sides connected.

2.5 The s-t Cut Polyhedron

There seems to be no way of formulating the maximum s-t flow LP so that the dual polyhedron has vertices corresponding to all s-t cuts. However, a slight modification to the dual program yields such a polyhedron.

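Theorem 2.5.1 The following set of inequalities yields a polyhedron whose vertices correspond exactly to s-t cuts.

\[
\begin{array}{rcll}
d_{ij} & \ge & p_i - p_j & (i,j) \in E \\
p_s - p_t & \ge & 1 \\
p_i & \le & 1 & i \in V \\
p_i & \ge & 0 & i \in V \\
d_{ij} & \ge & 0 & (i,j) \in E
\end{array}
\tag{2.4}
\]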

Proof: We proceed in the same manner as we did for characterizing the vertices earlier. The nodes in a component of $G'$ are equipotential. Hence we can choose either $p_i = 0$ or $p_i = 1$ for only one node in each component. The inequality $p_s - p_t \ge 1$ is satisfied only when $p_s = 1$ and $p_t = 0$. Hence, all nodes in the component containing $t$ are at potential 0 while nodes in the component containing $s$ are at potential 1. For all other components there is no restriction on choosing $p_i = 0$ or $p_i = 1$ for a node in that component. Thus, the cut formed by putting equipotential nodes on the same side of the cut is an s-t cut that does not necessarily have either of its sides connected. Hence, all vertices of the polyhedron are s-t cuts.

Moreover, all s-t cuts are vertices of the polyhedron. To prove this, given any s-t cut we should be able to exhibit $m + n$ linearly independent inequalities that define the cut. These inequalities are obtained as follows:

1. Pick $p_i = 1$ for all nodes on the s-side of the cut and $p_i = 0$ for nodes on the t-side.
2. For all forward arcs pick $d_{ij} = p_i - p_j$.
3. Pick $d_{ij} = 0$ for the remaining arcs.

Clearly these inequalities are linearly independent and define the cut.
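The linear independence claimed at the end of this proof can be checked mechanically on a small example. The sketch below builds, for a given s-t cut, the point with $p_i = 1$ on the s-side, $p_i = 0$ on the t-side and $d_{ij} = 1$ exactly on the forward arcs, collects the inequalities of (2.4) that hold with equality there, and computes their rank with exact rational arithmetic. A rank of $m + n$ confirms that the point is a vertex. The graph and the names `rank`, `row` and `tight_rank` are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Rank of a list of rational row vectors, by Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def row(size, entries):
    """Coefficient vector of length `size` with the given nonzero entries."""
    r = [0] * size
    for k, value in entries.items():
        r[k] = value
    return r

def tight_rank(nodes, arcs, s, t, U):
    """Rank of the inequalities of (2.4) that are tight at the cut with s-side U."""
    n, m = len(nodes), len(arcs)
    idx = {v: k for k, v in enumerate(nodes)}
    p = {v: 1 if v in U else 0 for v in nodes}
    # p_v = 0 or p_v = 1 is tight for every node; p_s - p_t = 1 is tight too.
    rows = [row(n + m, {idx[v]: 1}) for v in nodes]
    rows.append(row(n + m, {idx[s]: 1, idx[t]: -1}))
    for k, (i, j) in enumerate(arcs):
        d = 1 if (i in U and j not in U) else 0      # forward arcs get d_ij = 1
        if d == p[i] - p[j]:                          # d_ij = p_i - p_j is tight
            rows.append(row(n + m, {n + k: 1, idx[i]: -1, idx[j]: 1}))
        if d == 0:                                    # d_ij = 0 is tight
            rows.append(row(n + m, {n + k: 1}))
    return rank(rows)

# Hypothetical graph with n = 4 nodes and m = 3 arcs.
nodes = ["s", "a", "b", "t"]
arcs = [("s", "a"), ("a", "t"), ("b", "t")]
ranks = [tight_rank(nodes, arcs, "s", "t", {"s", *extra})
         for r in range(3) for extra in combinations(["a", "b"], r)]
```

All four s-t cuts of this graph, including the two whose s-side is disconnected, reach rank $m + n = 7$, in line with Theorem 2.5.1.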

2.6 Discussion

We have thus obtained a characterization, as a set of linear inequalities, of the polyhedron corresponding to the four different kinds of s-t cuts that one obtains by specifying connectivity requirements on the two sides. This allows us to find an s-t cut (of a specified kind) minimizing a given linear function of the variables corresponding to the arcs and the nodes. Note that the polyhedra considered are unbounded, and hence we cannot optimize linear functions that attain their optimum at infinity. For example, we cannot minimize $-\sum_{(i,j) \in E} d_{ij}$ over the polyhedron which has as its vertices all s-t cuts. However, this is to be expected, as the optimum in this case corresponds to the maximum s-t cut, computing which is NP-hard [15].

One naturally wonders if the polyhedron could be 'closed' by throwing in the remaining set of inequalities that, along with the inequalities in Theorem 2.5.1, would define the convex hull of all s-t cuts. However, any computationally tractable description of these inequalities seems unlikely, as this would imply that NP = co-NP [28].

CHAPTER 3

Node Multiway Cuts

3.1 Introduction

Given an undirected graph $G = (V, E)$ with weights on the edges and a set of $k$ terminals, $s_1, s_2, \ldots, s_k$, a multiway cut is a set of edges whose removal disconnects every pair of terminals. This generalizes the fundamental notion of an s-t cut. Whereas the problem of computing a minimum s-t cut can be solved in polynomial time using a maximum flow algorithm, the problem of computing the minimum weight multiway cut was shown to be NP-hard and max SNP-hard even for fixed $k \ge 3$ by Dahlhaus, Johnson, Papadimitriou, Seymour and Yannakakis [11]. For trees, the minimum multiway cut can be found in polynomial time using dynamic programming [61]. The problem is also polynomial time solvable in planar graphs for fixed $k$; for general $k$ it is NP-hard [11].





Dahlhaus et al. [11] also give a $(2 - \frac{2}{k})$-approximation algorithm for the minimum multiway cut problem using the idea of isolating cuts. Cunningham [10] addresses the linear programming aspects of the multiway cut problem and describes lower bounds based on these. One such lower bound is the maximum multicommodity flow obtained by allowing flow between every pair of terminals; the objective is to maximize the sum of the flows. Cunningham shows that the minimum multiway cut is at most $2 - \frac{2}{k}$ times the maximum flow.

In this chapter we consider node multiway cuts; the problem of computing a minimum weight node multiway cut is known to be NP-hard and max SNP-hard [10]. It turns out that the approximation algorithm in [11] for edge multiway cuts does not extend to the node multiway cut problem. Let us give a reason for this. Define an isolating cut for terminal $s_i$ to be a cut that separates $s_i$ from the rest of the terminals. A minimum isolating cut for $s_i$ can be computed in polynomial time by identifying (merging) the remaining terminals into a single node and finding a minimum cut separating it from $s_i$. The algorithm in [11] finds such cuts for each terminal, discards the heaviest cut, and picks the union of the remaining ones. The approximation factor is proven by observing that on doubling each edge in the optimum multiway cut, we can partition these edges into $k$ isolating cuts, one for each terminal. Each of these cuts must be at least as heavy as the corresponding isolating cut found by the algorithm. Isolating cuts can be defined appropriately and found efficiently for the node setting as well. However, the second part of the argument crucially rests on the fact that each edge has two end points, and hence contributes to two isolating cuts. This does not carry over to the node case.

Interestingly enough, this problem does have an approximation algorithm with the same factor of $2 - \frac{2}{k}$; this is the main result in this chapter. We show that the LP-relaxation of an integer program for the minimum node multiway cut problem always has a half-integral optimal solution; this property is the basis of our algorithm. The dual of this linear programming relaxation is a multicommodity flow problem in which we have a commodity for every pair of terminals and the objective is to maximize the sum of the flows routed subject to flow conservation and throughput constraints on the non-terminal nodes.
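The isolating-cut procedure of [11] for edge multiway cuts, as described above, can be sketched in a few lines. This is an illustrative reading of the description in the text rather than the authors' code: minimum isolating cuts are computed with a standard augmenting-path max-flow routine, and the node label "SINK", the function names and the three-terminal star instance are all assumptions made for the example.

```python
from collections import deque

def min_isolating_cut(edges, terminals, s):
    """Minimum cut separating terminal s from all other terminals.

    edges maps frozenset({u, v}) to a weight (undirected graph). The other
    terminals are merged into one node, labelled "SINK" (a hypothetical name).
    """
    def merge(v):
        return "SINK" if v in terminals and v != s else v

    cap = {s: {}, "SINK": {}}
    for e, w in edges.items():
        u, v = (merge(x) for x in e)
        if u == v:
            continue                       # edge between two merged terminals
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0) + w   # an undirected edge gives capacity
        cap[v][u] = cap[v].get(u, 0) + w   # in both directions
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and "SINK" not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if "SINK" not in parent:
            break
        path, v = [], "SINK"
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= b
            cap[v][u] += b
    side = set(parent)                     # nodes still reachable from s
    cut = {e for e in edges if len({merge(x) in side for x in e}) == 2}
    return sum(edges[e] for e in cut), cut

def isolating_cut_heuristic(edges, terminals):
    """Union of all minimum isolating cuts except the heaviest one."""
    cuts = sorted((min_isolating_cut(edges, terminals, s) for s in terminals),
                  key=lambda wc: wc[0])
    union = set().union(*(c for _, c in cuts[:-1]))
    return sum(edges[e] for e in union), union

# Hypothetical instance: three terminals joined to a common center c.
edges = {frozenset({"s1", "c"}): 1, frozenset({"s2", "c"}): 1, frozenset({"s3", "c"}): 1}
weight, chosen = isolating_cut_heuristic(edges, ["s1", "s2", "s3"])
```

On this star the heuristic returns two of the three edges, which is also the optimum; in general it is only guaranteed to be within the factor $2 - \frac{2}{k}$ stated above.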
The maximum multicommodity flow is a lower bound on the weight of the minimum node multiway cut; however, equality of maximum flow and minimum multiway cut does not hold in general. The best that one can do in such a situation is to establish an approximate max-flow min-multiway cut theorem. We prove such a theorem (Theorem 3.4.1) and show by means of an example that the bounds claimed by the theorem are tight.

Closely related to the notion of a multiway cut is that of a k-cut. A k-cut is a set of edges whose deletion disconnects the graph into $k$ components. Goldschmidt and Hochbaum [22] show a polynomial time algorithm for computing the minimum k-cut when $k$ is fixed.






For arbitrary $k$ the problem is NP-hard, and Saran and Vazirani [52] give a $(2 - \frac{2}{k})$-approximation algorithm for it. However, approximating the node version of the k-cut problem is NP-hard, since deciding whether the graph has a node k-cut involves checking whether the graph has an independent set of cardinality $k$.

3.2 Preliminaries

Let $G = (V, E)$ be an undirected graph with node weights $w : (V - T) \to \mathbb{R}^+$, where $T$ is the set of $k$ terminals, $s_1, s_2, \ldots, s_k$. A node multiway cut is a set of non-terminal nodes whose deletion disconnects every pair of terminals. The graph has a node multiway cut only if the terminals form an independent set.

With every pair of terminals $s_i, s_j$ we associate a commodity and designate one of $s_i, s_j$ as the source and the other as the sink for this commodity. Thus the total number of commodities is $\frac{k(k-1)}{2}$. The multicommodity flow problem is to maximize the sum of the flows of the commodities subject to flow conservation (a commodity is conserved at each node except the source and sink for that commodity) and throughput constraints (the sum of the flows of all commodities through a node cannot exceed the weight of the node). The throughput and conservation constraints can be formulated as linear inequalities, and the objective function, which is the sum of flows of all commodities, is also linear. Thus, it is straightforward to formulate this multicommodity flow problem as a linear program, and an optimal solution to it can be found in polynomial time.

Another way of writing a linear program for this problem, which has exponentially many variables but is interesting from the viewpoint of multiway cuts, is as follows. Let $p_i$ be a path between two distinct terminals and let $f_i$ be a variable denoting the flow along this path. We need to ensure that the sum of the flows along paths that go through a node does not exceed the weight of the node, i.e.

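\[
\sum_{i : v \in p_i} f_i \le w_v, \qquad v \in V - T
\]

Thus the linear program is

\[
\begin{array}{lrcll}
\text{maximize} & \sum_i f_i \\
\text{subject to} & \sum_{i : v \in p_i} f_i & \le & w_v & v \in V - T \\
 & f_i & \ge & 0
\end{array}
\]

and its dual is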

\[
\begin{array}{lrcll}
\text{minimize} & \sum_{v \in V - T} d_v w_v \\
\text{subject to} & \sum_{v \in p_i} d_v & \ge & 1 & \text{for every path } p_i \\
 & d_v & \ge & 0 & v \in V - T
\end{array}
\]

The dual can be viewed as an assignment of non-negative distance labels, $d_v$, to the non-terminal nodes such that the distance between any two terminals is at least 1; the objective is to minimize $\sum_{v \in V - T} d_v w_v$. Note that we never need to assign any node a distance label more than 1. Hence, any integral solution to this LP is a 0,1 assignment of distance labels to the nodes such that each path, $p_i$, contains at least one node with $d_v = 1$. Thus, any integral solution to the dual LP corresponds to a node multiway cut in $G$ whose weight is equal to the value of the solution. The converse is also true, i.e. every node multiway cut in $G$ yields an integral solution to the dual LP of value equal to the weight of the multiway cut. This correspondence between the node multiway cuts in $G$ and integral solutions to the dual linear program implies that the optimal value of this linear program, which by the Duality theorem of linear programming is equal to the maximum multicommodity flow, is a lower bound on the value of the minimum node multiway cut. (This also follows directly from the fact that the multiway cut acts as a bottleneck to flow, and hence the value of the flow cannot exceed the capacity of any multiway cut.)

Is maximum multicommodity flow equal to the minimum multiway cut? The example in Figure 3.1 shows that this is not the case. The graph in this example has, besides the $k$ terminals, $k$ nodes of unit weight and a center node of weight $k$. The maximum multicommodity flow routes a total flow of $\frac{k}{2}$, while at least $k - 1$ non-terminal nodes are needed to form a multiway cut, and hence the weight of the minimum node multiway cut is $k - 1$. The optimal solution to the dual LP for this example has distance labels of $\frac{1}{2}$ on each unit-weight non-terminal node (and 0 on the center node); each pair of terminals is a unit distance apart and the objective function has value $\frac{k}{2}$, which is equal to the maximum multicommodity flow.

Figure 3.1: An example to show that maximum flow is not equal to the minimum node multiway cut.

In the case of s-t cuts the dual of the maximum s-t flow LP has an optimal solution that is integral, and hence the max-flow min-cut theorem follows as a simple consequence of the Duality theorem of linear programming. Since in the case of multiway cuts the dual LP might not have an optimal solution that is integral, all that the Duality theorem can say is that the maximum multicommodity flow is equal to the minimum fractional multiway cut. For multiway cuts, the example of Figure 3.1 is really the worst that things get, in the sense that the dual linear program always has an optimal solution that is half-integral.
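The numbers quoted for the example of Figure 3.1 can be reproduced by brute force for a small $k$. The sketch below assumes that each terminal $s_i$ is attached to its own unit-weight node $u_i$ and that every $u_i$ is attached to the center $c$ (the node names are ours). It also assumes one particular feasible flow, namely $\frac{1}{k-1}$ units between every pair of terminals; the thesis does not specify which flow attains the maximum.

```python
from fractions import Fraction
from itertools import combinations

def figure_3_1(k):
    """Terminal s_i -- unit node u_i -- center c, for i = 1..k."""
    adj = {"c": set()}
    weight = {"c": k}
    for i in range(1, k + 1):
        s, u = f"s{i}", f"u{i}"
        adj[s] = {u}
        adj[u] = {s, "c"}
        adj["c"].add(u)
        weight[u] = 1
    return adj, weight, [f"s{i}" for i in range(1, k + 1)]

def separates(adj, terminals, removed):
    """True if deleting `removed` leaves no path between two terminals."""
    for s in terminals:
        seen, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen and v not in removed:
                    if v in terminals:
                        return False
                    seen.add(v)
                    stack.append(v)
    return True

def min_node_multiway_cut(adj, weight, terminals):
    """Brute force over all subsets of non-terminal nodes."""
    nodes = list(weight)
    best = None
    for r in range(len(nodes) + 1):
        for cut in combinations(nodes, r):
            if separates(adj, terminals, set(cut)):
                w = sum(weight[v] for v in cut)
                best = w if best is None else min(best, w)
    return best

k = 4
adj, weight, terminals = figure_3_1(k)
min_cut = min_node_multiway_cut(adj, weight, terminals)

# Assumed flow: 1/(k-1) units between every pair of terminals.
pairs = list(combinations(range(1, k + 1), 2))
per_pair = Fraction(1, k - 1)
load_u = {i: sum(per_pair for p in pairs if i in p) for i in range(1, k + 1)}
load_c = per_pair * len(pairs)
flow_value = per_pair * len(pairs)

# Dual solution from the text: label 1/2 on every unit node, 0 on the center.
dual_value = sum(Fraction(1, 2) * weight[f"u{i}"] for i in range(1, k + 1))
```

The flow respects every throughput constraint and its value equals the dual objective, so both are optimal at $\frac{k}{2}$, while the brute-force minimum cut is $k - 1$.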

3.3 Half-integrality of the Optimum

Consider the optimal solutions to the primal and dual linear programs. Let $d : (V - T) \to \mathbb{R}^+$ be the assignment of distance labels corresponding to this. By the Duality theorem, $\sum_{v \in V - T} d_v w_v$ is equal to the maximum multicommodity flow. We define the length of a path to be the sum of the distance labels of the nodes along the path. The distance between two nodes is the length of the shortest path between them. Note that the terminals have no distance labels, but for the purpose of computing distances we assign them a label 0. A flow path is a path between two distinct terminals such that there is a non-zero flow along this path. The two complementary slackness conditions are:

1. A node that has a non-zero distance label must be saturated.
2. Any flow path must have length exactly one.

Any two terminals are at least a unit distance apart under the distance assignment $d$. We use this fact and the two complementary slackness conditions to show that there is an optimal solution to the dual LP that is half-integral.

With each terminal $s_i$ we associate a region, $S_i$, which is the set of nodes at a zero distance from $s_i$. The boundary of this region is the set of nodes adjacent to this region, and is denoted by $\Gamma(S_i)$. Since any two terminals are at least a unit distance apart, the regions $S_1, S_2, \ldots, S_k$ are disjoint. Further, the boundary of a region does not have a node in common with any region.

The boundaries of the regions are not necessarily disjoint. If $v$ is a node common to the boundary of two regions, $S_i, S_j$, then there is a path between terminals $s_i$ and $s_j$ on which $v$ is the only node with a non-zero distance label; hence $d_v = 1$. Let $M = \bigcup_{i=1}^{k} \Gamma(S_i)$ be the set of nodes in the boundaries and let $M^1$ be the nodes in $M$ that belong to the boundaries of at least two regions. Then by the previous argument, $\forall v \in M^1 : d_v = 1$; the superscript in $M^1$ denotes the fact that all nodes in this set have a distance label 1. We next show that there is an optimal solution to the dual LP in which the remaining nodes of $M$, i.e. $M - M^1$, have a distance label $\frac{1}{2}$ (Theorem 3.3.2). We denote this set of nodes by $M^{1/2} = M - M^1$; every node in this set belongs to the boundary of exactly one region.

Lemma 3.3.1 Any flow path from $s_i$ to $s_j$ either uses exactly one node from $M^1$ or it uses exactly two nodes from $M^{1/2}$.

Proof: Let $v_i, v_j$ be the first and last nodes from $M$ on this flow path. Clearly, $v_i \in \Gamma(S_i)$ and $v_j \in \Gamma(S_j)$. We have two cases:

1. $v_i = v_j$. Then $v_i = v_j \in M^1$. Further, this is the only node of $M$ on this path, or else the length of this flow path would be strictly greater than 1, contradicting the second complementary slackness condition.

2. $v_i \ne v_j$. We first show that $v_i, v_j \in M^{1/2}$. Suppose $v_i \in M^1$. Then $d_{v_i} = 1$, and since $d_{v_j} > 0$ the length of this flow path is strictly greater than 1, contradicting the second complementary slackness condition. For the same reason $v_j \notin M^1$, and so $v_i, v_j \in M - M^1 = M^{1/2}$.

We now show that $v_i, v_j$ are the only nodes from $M$ on this flow path. Let $v_k \in \Gamma(S_k)$ be another node from $M$ on this path. Since $v_k \in \Gamma(S_k)$, there is a path from terminal $s_k$ to node $v_k$, all nodes on which, with the exception of $v_k$, have a distance label 0. This, coupled with the facts that the length of the flow path $s_i - v_i - v_k - v_j - s_j$ is exactly one and $d_{v_i}, d_{v_j} > 0$, implies that the paths $s_i - v_i - v_k - s_k$ and $s_k - v_k - v_j - s_j$ have length strictly less than 1. We now have a pair of distinct terminals ($s_i, s_k$ or $s_j, s_k$) the distance between which is less than one; a contradiction.

Theorem 3.3.2 For any assignment of non-negative weights to the nodes of $G$ there exists an optimal solution to the dual LP that is half-integral. Moreover, given any optimal solution to the dual LP we can obtain, from it, a half-integral optimal solution in linear time.

Proof: We start with an arbitrary optimal solution to the dual LP, and construct a feasible solution that is half-integral and has value equal to the maximum flow. Hence by the Duality theorem, this half-integral solution is an optimal solution to the dual LP.

Corresponding to the optimal solution to the dual LP, compute the sets $S_i, \Gamma(S_i)$ for $1 \le i \le k$, and the sets $M, M^1, M^{1/2}$ as defined above. Since all nodes of $M$ have a non-zero distance label, they are saturated (the first complementary slackness condition). This together with Lemma 3.3.1 implies that

\[
\text{maximum flow} = W_{M^1} + \frac{1}{2} W_{M^{1/2}}
\tag{3.1}
\]

where $W_S$ denotes the sum of the weights of the nodes in set $S$, i.e. $W_S = \sum_{v \in S} w_v$.

Now consider the following assignment of distance labels to the nodes:

\[
d_v =
\begin{cases}
1 & \text{if } v \in M^1 \\
\frac{1}{2} & \text{if } v \in M^{1/2} \\
0 & \text{otherwise}
\end{cases}
\]

Claim 3.3.1 $d$ is a feasible solution to the dual LP.

Proof: Any path from $s_i$ to $s_j$ must use nodes from both $\Gamma(S_i)$ and $\Gamma(S_j)$. Let $v_i \in \Gamma(S_i)$ and $v_j \in \Gamma(S_j)$ be two nodes on this path. If $v_i = v_j$ then $v_i \in M^1$ and $d_{v_i} = 1$. Else, if $v_i \ne v_j$, then $d_{v_i}, d_{v_j} \ge \frac{1}{2}$. In either case the length of this path is at least one. Hence $d$ is dual feasible.

For this distance label assignment the value of the objective function is

\[
\sum_{v \in V - T} d_v w_v = \sum_{v \in M^1} w_v + \frac{1}{2} \sum_{v \in M^{1/2}} w_v
\]

But this is equal to the maximum flow (equation 3.1). Hence, $d$ is an optimal solution to the dual LP.

3.4 Approximate Max-flow Min-multiway Cut Theorem

Theorem 3.4.1 (Approximate max-flow min-multiway cut theorem)

\[
\text{maximum flow} \le \text{minimum node multiway cut} \le \left(2 - \frac{2}{k}\right) \cdot \text{maximum flow}
\]

Proof: Consider a half-integral optimal solution to the dual LP. Since any path between two terminals contains at least one node from $M$, the set of nodes, $M$, forms a multiway cut. Further, since each node of $M$ has a distance label that is at least $\frac{1}{2}$, the sum of the weights of these nodes is at most $2 \sum_{v \in M} d_v w_v = 2 \sum_{v \in V - T} d_v w_v$, which is twice the maximum flow.

We can prove a better bound on the weight of the multiway cut by discarding a part of the boundary of one of the regions. Recall that every node in $M^{1/2}$ belongs to the boundary of only one region. Let $\Gamma^{1/2}(S_i) = M^{1/2} \cap \Gamma(S_i)$ be the set of nodes that are unique to the boundary of the region $S_i$. The sets $\Gamma^{1/2}(S_1), \Gamma^{1/2}(S_2), \ldots, \Gamma^{1/2}(S_k)$ form a partition of the set $M^{1/2}$, and if $\Gamma^{1/2}(S_j)$ is the heaviest set in the partition then

\[
W_{\Gamma^{1/2}(S_j)} \ge \frac{1}{k} W_{M^{1/2}}
\]

Claim 3.4.1 $M - \Gamma^{1/2}(S_j)$ is a node multiway cut in $G$.

Proof: Clearly, any path between two terminals different from $s_j$ would use at least one node from $M - \Gamma^{1/2}(S_j)$. Any path between terminals $s_i$ and $s_j$ uses a node of $\Gamma(S_i)$. Since $\Gamma(S_i)$ is disjoint from $\Gamma^{1/2}(S_j)$, such a path uses a node of $M - \Gamma^{1/2}(S_j)$. Hence $M - \Gamma^{1/2}(S_j)$ is a node multiway cut in $G$.


The weight of this node multiway cut can be bounded as follows.

\[
\begin{aligned}
W_M - W_{\Gamma^{1/2}(S_j)} &= W_{M^1} + W_{M^{1/2}} - W_{\Gamma^{1/2}(S_j)} \\
&\le W_{M^1} + \left(1 - \frac{1}{k}\right) W_{M^{1/2}} \\
&= W_{M^1} + \frac{1}{2}\left(2 - \frac{2}{k}\right) W_{M^{1/2}} \\
&\le \left(2 - \frac{2}{k}\right)\left(W_{M^1} + \frac{1}{2} W_{M^{1/2}}\right)
\end{aligned}
\]

Since $d$ is an optimal solution to the dual LP, maximum flow equals

\[
\sum_{v \in V - T} d_v w_v = W_{M^1} + \frac{1}{2} W_{M^{1/2}}
\]

and hence the weight of the multiway cut is at most $2 - \frac{2}{k}$ times the maximum flow.

The graph in Figure 3.1 also shows that this approximate max-flow min-multiway cut theorem is tight. The maximum multicommodity flow for this instance is $\frac{k}{2}$ while the minimum node multiway cut has weight $k - 1$. This gives a ratio of $2 - \frac{2}{k}$ between the minimum node multiway cut and the maximum flow.

3.5 Approximating the Multiway Cut 



The ideas in Theorem 3.4.1 can be used to obtain a $(2 - \frac{2}{k})$-approximation algorithm for the minimum node multiway cut problem. We do not know of any other algorithm for this problem that achieves an approximation guarantee better than the trivial $k$.

Theorem 3.5.1 One can, in polynomial time, find a node multiway cut of weight at most $2 - \frac{2}{k}$ times the weight of the minimum node multiway cut.

Proof: We begin by finding an optimal solution to the dual LP. We then find the regions corresponding to the terminals and the boundaries of these regions. By choosing an appropriate set of nodes from these boundaries, as in Theorem 3.4.1, we get a multiway cut of weight at most $2 - \frac{2}{k}$ times the maximum flow. Since the weight of the minimum node multiway cut is at least the value of the maximum flow, the node multiway cut found is of weight within $2 - \frac{2}{k}$ times that of the minimum node multiway cut.


A slight modification to the example in Figure 3.1 shows that the analysis of our algorithm is tight. The nodes with unit weight are now assigned a weight 2, while the center node has weight $k + \epsilon$. The center node is also the minimum node multiway cut. The optimal solution to the dual LP assigns a distance label of $\frac{1}{2}$ to each node of weight 2 and a zero label to the center node. Hence, the multiway cut found by our algorithm has weight $2k - 2$, which is about $2 - \frac{2}{k}$ times the weight of the optimum.


Algorithm Node multiway cut;
1. Compute an optimal solution to the dual LP
2. For each terminal $s_i$, compute $S_i$, $\Gamma(S_i)$ and $\Gamma^{1/2}(S_i)$
3. Let $j$ be such that $\Gamma^{1/2}(S_j)$ has maximum weight
4. Output $\bigcup_{i=1}^{k} \Gamma(S_i) - \Gamma^{1/2}(S_j)$
end.

Figure 3.2: Algorithm for approximating the minimum node multiway cut
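Steps 2 to 4 of the algorithm are purely combinatorial and can be sketched as follows, assuming that step 1 has already produced an optimal dual solution `d` by some LP solver. The function name, the adjacency-dictionary representation and the three-terminal test instance are hypothetical; labels are compared with zero exactly, so with floating-point solver output a small tolerance would be needed instead.

```python
from collections import deque
from fractions import Fraction

def node_multiway_cut_from_dual(adj, terminals, weight, d):
    """Steps 2-4 of the algorithm, given an optimal dual solution d.

    adj maps each node to its set of neighbours; weight and d are
    dictionaries over the non-terminal nodes.
    """
    terminal_set = set(terminals)
    boundary = {}
    for s in terminals:
        region, gamma = {s}, set()
        queue = deque([s])
        while queue:                       # grow S_i through zero-label nodes
            u = queue.popleft()
            for v in adj[u]:
                if v in region or v in terminal_set:
                    continue
                if d[v] == 0:
                    region.add(v)
                    queue.append(v)
                else:
                    gamma.add(v)           # positive label: v lies in Gamma(S_i)
        boundary[s] = gamma
    M = set().union(*boundary.values())
    # Nodes on exactly one boundary form M^{1/2}; split them by region.
    count = {v: sum(v in g for g in boundary.values()) for v in M}
    gamma_half = {s: {v for v in boundary[s] if count[v] == 1} for s in terminals}
    heaviest = max(terminals, key=lambda s: sum(weight[v] for v in gamma_half[s]))
    return M - gamma_half[heaviest]

# The instance of Figure 3.1 with k = 3 (node names are ours).
adj = {
    "s1": {"u1"}, "s2": {"u2"}, "s3": {"u3"},
    "u1": {"s1", "c"}, "u2": {"s2", "c"}, "u3": {"s3", "c"},
    "c": {"u1", "u2", "u3"},
}
weight = {"u1": 1, "u2": 1, "u3": 1, "c": 3}
half = Fraction(1, 2)
d = {"u1": half, "u2": half, "u3": half, "c": 0}
cut = node_multiway_cut_from_dual(adj, ["s1", "s2", "s3"], weight, d)
cut_weight = sum(weight[v] for v in cut)
```

On this instance every boundary set has the same weight, so the first terminal's boundary is the one discarded and the output has weight $k - 1 = 2$.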

CHAPTER 4

Multicut in Trees

4.1 Introduction

In 1969 T.C. Hu [25] posed the following problem: given a list of vertex pairs, $(s_i, t_i)$, $1 \le i \le k$, find a minimum weight set of edges separating each pair of vertices in the list. We call such a set of edges a multicut. For $k = 1$, the problem coincides with the ordinary s-t minimum cut problem. The problem is also polynomial time solvable when $k = 2$, by using two applications of a minimum s-t cut algorithm [61]. Further, the multiway cut problem can be encoded as a multicut problem by including in the list of vertex pairs all distinct pairs of terminals. Hence computing the minimum multicut is NP-hard and max SNP-hard for any fixed $k \ge 3$ [11].

In this chapter we address the special case of finding a minimum multicut when the input graph is a tree (the problem is dealt with in its full generality in a later chapter). The minimum multiway cut in trees can be found in polynomial time using a straightforward dynamic programming approach [8]. However, for multicuts, NP-hardness sets in much earlier; we show that computing the minimum multicut is NP-hard and max SNP-hard even on uncapacitated trees of height one.

We approach this intractability of the minimum multicut problem by considering a multicommodity flow problem; the formulation we deal with associates a commodity with each vertex pair and requires maximizing the sum of the flows routed subject to capacity and flow conservation requirements. Clearly, the maximum multicommodity flow is bounded by the minimum multicut; the question is whether equality holds. Consider a tree of height one with three leaves. Each pair of leaf vertices forms the source-sink pair of a commodity. All edges have unit capacity. The maximum flow in the tree is $\frac{3}{2}$ whereas the minimum multicut has weight 2.

In this situation the best that one can hope for is an approximate max-flow min-multicut theorem. The relative simplicity of this setting (input graph a tree) allows us to consider a stronger version of flow, namely integral flow; a commodity can now be routed only in integral units. Our main result is an approximate maximum-integral-flow minimum-multicut theorem.
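The three-leaf example is small enough to verify exhaustively. In the sketch below the edge names `e1`, `e2`, `e3` (edge $e_i$ joins leaf $i$ to the root) are our own labels. The fractional flow of half a unit per pair is one feasible choice; its value $\frac{3}{2}$ is optimal because every path uses two edges and the total capacity available is three.

```python
from fractions import Fraction
from itertools import combinations

edges = ["e1", "e2", "e3"]                 # edge e_i joins leaf i to the root
pairs = [(1, 2), (1, 3), (2, 3)]           # one commodity per pair of leaves
path = {p: {f"e{p[0]}", f"e{p[1]}"} for p in pairs}

# Minimum multicut by brute force: smallest edge set meeting every path.
min_multicut = min(
    len(cut)
    for r in range(len(edges) + 1)
    for cut in combinations(edges, r)
    if all(path[p] & set(cut) for p in pairs)
)

# Maximum integral flow: largest set of pairwise edge-disjoint paths,
# since every edge has unit capacity.
max_integral_flow = max(
    len(chosen)
    for r in range(len(pairs) + 1)
    for chosen in combinations(pairs, r)
    if all(not (path[a] & path[b]) for a, b in combinations(chosen, 2))
)

# Fractional flow: half a unit on each path loads every edge with exactly 1.
half = Fraction(1, 2)
load = {e: sum(half for p in pairs if e in path[p]) for e in edges}
fractional_flow = half * len(pairs)
```

The three quantities come out as 2, 1 and $\frac{3}{2}$: the multicut is exactly twice the integral flow on this instance, and the fractional flow lies strictly between the two.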

Theorem 4.1.1 (Approximate max-integral-flow min-multicut theorem) For trees,

\[
\text{maximum integral flow} \le \text{minimum multicut} \le 2 \cdot \text{maximum integral flow}
\]

Our proof of this theorem is an efficient algorithm for computing a multicut and an integral flow such that the weight of the multicut is at most twice the value of the flow. This also gives us a 2-approximation algorithm for the minimum multicut problem in trees and a $\frac{1}{2}$-approximation algorithm for the maximum integral flow problem in trees (we show that this problem is NP-hard and max SNP-hard).

A particularly interesting aspect of the algorithm is the methodology used in designing it: it is based on a primal-dual approach. The primal-dual method has been used extensively in the past for solving problems in P. The recent work of Goemans and Williamson [21] and Williamson, Goemans, Mihail and Vazirani [59] demonstrates, conclusively, the effectiveness of this method in the context of approximation algorithms for NP-hard optimization problems.

The problem of finding a maximum integral flow is the same as that of finding a maximum cardinality set of edge-disjoint paths between the specified source-sink pairs, and has been extensively studied [13, 53]. We do not know of approximation algorithms for any other NP-hard cases besides ours. We show by means of an example that even for grid graphs, the ratio of the maximum (fractional) flow to the maximum integral flow (and also of the minimum multicut to the maximum integral flow) is O(k), thus indicating that these upper bounds cannot be of any help in obtaining good approximation algorithms for this problem.

The seemingly simplistic setting of integral flow in trees captures a rich collection of problems. The demand graph $H$ corresponding to a multicommodity flow instance is the graph obtained by putting an edge for each source-sink pair. When restricted to trees of height one and unit edge capacities, integral flow is essentially a matching in $H$. If the trees are of height one and edge capacities are integral, then an integral flow corresponds to a b-matching in $H$. On the other hand, if the edge capacities are unity and the trees are of arbitrary height, integral flow corresponds to a generalization of matching, which we call the cross-free-cut matching. This problem inherits many nice combinatorial properties of matching. We also give a polynomial time algorithm for finding a maximum cross-free-cut matching, and hence also for the maximum integral flow in trees when edge capacities are unity.

Finally, we also show that the multicut problem in trees is equivalent to the set cover problem for a special class of set systems, which we call tree-representable set systems. Interestingly enough, the problem of recognizing this class in polynomial time has been extensively studied in a different context [58] (it is the same as testing if a given binary matroid is graphic), and efficient algorithms have been discovered [6]. Hence, we also get a 2-approximation algorithm for the tree-representable set cover problem.

4.2 Preliminaries

Given a tree $T = (V, E)$, a capacity function $c : E \to Z^+$, and k pairs of vertices $(s_i, t_i)$, $1 \le i \le k$, we associate a commodity i with the pair $(s_i, t_i)$ and designate $s_i$ as the source and $t_i$ as the sink for this commodity. A multicommodity flow is a way of simultaneously routing commodities from their sources to the respective sinks while ensuring that the flow of each commodity is conserved at each vertex (except the source and sink vertex for that commodity) and that the sum of the flows of all commodities through an edge does not exceed the capacity of the edge. A multicommodity flow in which the sum of the flows over all the commodities is maximized is called a maximum (multicommodity) flow. A flow is integral if each commodity has an integral flow through each edge. The maximum integral flow problem is to find an integral multicommodity flow that is maximum.

A multicut is defined as a set of edges whose removal disconnects each source-sink pair. The capacity (weight) of a multicut is the sum of the capacities of the edges in it. The minimum multicut problem is to find a multicut of minimum weight. The minimum multicut problem for trees can be solved in polynomial time for fixed k [61]. This is because the multicut contains at most k edges; one can in time $O(n^k)$ enumerate all subsets of edges of cardinality at most k and pick one that is a multicut and has the minimum weight. However, for arbitrary k the problem is NP-hard.
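The enumeration argument for fixed k can be sketched as follows. This is an illustrative brute force under assumed conventions, not code from the thesis: the tree is given as a hypothetical `parent` map (the root maps to `None`), and each edge is named by its lower endpoint.

```python
from itertools import combinations

def tree_path_edges(parent, s, t):
    """Edges (named by their child endpoint) on the unique s-t path."""
    ancestors_of_s = set()
    u = s
    while u is not None:
        ancestors_of_s.add(u)
        u = parent[u]
    edges = []
    u = t
    while u not in ancestors_of_s:   # climb from t to the least common ancestor
        edges.append(u)
        u = parent[u]
    lca = u
    u = s
    while u != lca:                  # climb from s to the least common ancestor
        edges.append(u)
        u = parent[u]
    return set(edges)

def min_multicut_bruteforce(parent, cap, pairs):
    """Try every edge subset of size at most k = len(pairs); O(n^k) subsets."""
    paths = [tree_path_edges(parent, s, t) for s, t in pairs]
    best = None
    for r in range(len(pairs) + 1):
        for cut in combinations(list(cap), r):
            chosen = set(cut)
            if all(p & chosen for p in paths):       # every pair is disconnected
                weight = sum(cap[e] for e in cut)
                if best is None or weight < best[0]:
                    best = (weight, chosen)
    return best
```

For example, on a star with unit-capacity leaves 1, 2, 3 and all three pairs of leaves as commodities, the search returns a multicut of weight 2.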

Theorem 4.2.1 The minimum multicut problem is NP-hard and max SNP-hard even for trees of height one.

Proof: Let T be a unit height tree with root v and leaves $v_1, v_2, \ldots, v_d$. Let the edge $e_i = (v, v_i)$ have capacity $c_i$. If the edges $e_{i_1}, e_{i_2}, \ldots, e_{i_p}$ form a multicut in T, then the vertices $v_{i_1}, v_{i_2}, \ldots, v_{i_p}$ form a vertex cover in the demand graph, H. Thus, finding the minimum multicut in T is equivalent to finding a minimum weight vertex cover in H, where the weight of the vertex $v_i$ is given by $c_i$. Since the minimum vertex cover problem is NP-hard and max SNP-hard, the theorem follows.

Let $p_i$ denote the unique path from $s_i$ to $t_i$ in the tree, and let $f_i$ be a variable for the flow along this path. Since the flow along any path is non-negative, $f_i \ge 0$. Further, the total flow through an edge cannot exceed the capacity of the edge, i.e.

$$\sum_{i : e \in p_i} f_i \le c_e, \qquad e \in E.$$

The flow would be maximum when $\sum_{i=1}^{k} f_i$ is maximized. Hence the linear program for maximum multicommodity flow is

$$\begin{array}{lll}
\text{maximize} & \sum_{i=1}^{k} f_i & \\
\text{subject to} & \sum_{i : e \in p_i} f_i \le c_e & e \in E \\
 & f_i \ge 0 & 1 \le i \le k
\end{array}$$

The additional constraint, $f_i \in Z^+$, yields a program for the maximum integral flow problem.

[Figure 4.1: Example to show that the minimum (fractional) multicut is not half-integral]

The dual of this linear program is

$$\begin{array}{lll}
\text{minimize} & \sum_{e \in E} d_e c_e & \\
\text{subject to} & \sum_{e \in p_i} d_e \ge 1 & 1 \le i \le k \\
 & d_e \ge 0 & e \in E
\end{array}$$

and can be viewed as an assignment of non-negative distance labels, $d_e$, to edges $e \in E$, so as to minimize $\sum_{e \in E} d_e c_e$, subject to the constraint that each pair, $(s_i, t_i)$, be at least a unit distance apart.

The optimal integral solution to the dual program is a 0/1 assignment of distance labels to the edges such that for every commodity, the path in the tree corresponding to the commodity contains an edge with distance label 1. Thus the edges with $d_e = 1$ form a multicut of weight equal to $\sum_{e \in E} d_e c_e$. Conversely, the minimum multicut corresponds to an integral solution to the dual program of value equal to the weight of the multicut. Thus the optimal integral solution to the dual LP is the minimum multicut and hence the value of the optimal (fractional) solution, which by the Duality theorem is equal to the maximum multicommodity flow, is a lower bound on the weight of the minimum multicut.

Unlike the case of multiway cuts the optimal (fractional) solution is not half-integral. All edges of the tree in Figure 4.1 have unit capacities. The figure shows a multicommodity flow of value $2\frac{1}{3}$ and a fractional multicut of the same weight. Therefore this is the pair of optimal solutions to the primal and dual linear programs.


4.3 Finding the Maximum Integral Flow

The problem of finding the maximum integral flow is the same as that of finding a maximum cardinality set of edge-disjoint paths between the specified source-sink pairs. For the case when the demand graph is a complete graph, there exists a min-max theorem relating the maximum number of edge-disjoint paths to the weight of the minimum multiway cut [38, 7, 41]. In this section we relate the problem of finding the maximum integral flow in a tree to other combinatorial optimization problems and establish its complexity.

4.3.1 Unit Height Trees

Let T be a unit height tree with root v and leaves $v_1, v_2, \ldots, v_d$. Let the edge $e_i = (v, v_i)$ have capacity $c_i$.

Proposition 4.3.1 For trees of height one and unit edge capacities, finding the maximum integral flow is equivalent to finding a maximum matching in the demand graph H.

Routing a unit flow from $v_i$ to $v_j$ corresponds to picking edge $(v_i, v_j)$ in H. Since all edges of T have unit capacity, an integral flow in T is a matching in H of the same size. The converse is also true; a matching in H of size f corresponds to an integral flow of f units in T. Thus computing the maximum integral flow in T is equivalent to finding a maximum matching in H.

A well-studied generalization of matching is the b-matching. Given a graph $G = (V, E)$ and a function $b : V \to Z^+$, a b-matching is a set (with multiplicities) of edges, $E' \subseteq E$, such that each vertex, $v \in V$, has at most $b(v)$ edges incident at it.

Proposition 4.3.2 When the edge capacities are positive integers, finding the maximum integral flow is equivalent to finding a maximum b-matching in the demand graph H.

If edge $e_i = (v, v_i)$ has capacity $c_i$ then corresponding to an integral flow in T we would be picking at most $c_i$ edges incident to vertex $v_i$ in H. Thus, computing a maximum integral flow in T now corresponds to finding a maximum b-matching in H, where $b(v_i) = c_i$.
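The correspondence in Propositions 4.3.1 and 4.3.2 can be illustrated on small instances. The sketch below is a brute force over integral flows, not a matching algorithm; the function name and the input format (a map from leaf to the capacity of its edge, plus a list of leaf pairs) are assumptions made for illustration. With unit capacities the result is the size of a maximum matching in H; with general capacities it is the size of a maximum b-matching with $b(v_i) = c_i$.

```python
from itertools import product

def max_integral_flow_star(cap, pairs):
    """Maximum integral flow on a height-one tree, by exhaustive search.

    cap   : dict leaf -> capacity of the edge (root, leaf)
    pairs : list of (leaf, leaf) source-sink pairs, i.e. edges of H
    """
    best = 0
    # A commodity can never carry more than the smaller of its two edge capacities.
    ranges = [range(min(cap[u], cap[v]) + 1) for u, v in pairs]
    for flows in product(*ranges):
        load = {leaf: 0 for leaf in cap}
        for (u, v), x in zip(pairs, flows):
            load[u] += x   # x units cross edge (root, u)
            load[v] += x   # and edge (root, v)
        if all(load[leaf] <= cap[leaf] for leaf in cap):
            best = max(best, sum(flows))
    return best
```

When H is a triangle, unit capacities give a flow of 1 (a maximum matching in a triangle has one edge), while capacity 2 on every edge gives a flow of 3.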


4.3.2 Trees with Unit Capacity Edges

Following standard terminology, two cuts $(S, \bar{S})$ and $(T, \bar{T})$ are said to be crossing iff $S \cap T$, $S \cap \bar{T}$, $\bar{S} \cap T$ and $\bar{S} \cap \bar{T}$ are all non-empty. A family of cuts is non-crossing if no two cuts in the family are crossing. Given a graph $G = (V, E)$ and a family, $\mathcal{F}$, of non-crossing cuts, define a cross-free-cut matching as a set of edges, $E' \subseteq E$, such that $E'$ contains at most one edge from each cut in $\mathcal{F}$. If $\mathcal{F}$ is the set of all singleton cuts, $(v, V - v)$, $v \in V$, then a cross-free-cut matching is simply a matching in G (note that this family of cuts is non-crossing). Thus a cross-free-cut matching generalizes the notion of a matching. The maximum cross-free-cut matching problem is to find a cross-free-cut matching of maximum cardinality. We now extend Proposition 4.3.1 to trees of arbitrary height.
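The definitions of crossing cuts and of a cross-free-cut matching translate directly into set operations. The following is a small sketch with hypothetical function names; a cut $(S, \bar{S})$ is represented by the vertex set S alone, with V supplied separately.

```python
def crossing(S, T, V):
    """True iff the cuts (S, V-S) and (T, V-T) cross."""
    S, T, V = set(S), set(T), set(V)
    # The four regions S∩T, S∩T̄, S̄∩T and S̄∩T̄ must all be non-empty.
    quadrants = [S & T, S - T, T - S, V - (S | T)]
    return all(quadrants)

def is_non_crossing(family, V):
    """True iff no two cuts of the family cross."""
    return not any(crossing(S, T, V)
                   for a, S in enumerate(family) for T in family[a + 1:])

def is_cross_free_cut_matching(edges, family):
    """True iff the chosen edges use each cut of the family at most once."""
    def in_cut(edge, S):
        u, v = edge
        return (u in S) != (v in S)   # exactly one endpoint lies in S
    return all(sum(in_cut(e, S) for e in edges) <= 1 for S in family)
```

With the family of singleton cuts the last predicate reduces to the usual test for a matching, as noted in the text.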

Proposition 4.3.3 The maximum integral flow problem on trees with unit capacity edges is equivalent to a maximum cross-free-cut matching problem in the demand graph H.

Theorem 4.3.1 There is a polynomial time algorithm for finding a maximum integral flow on trees with unit capacity edges, and hence for the maximum cross-free-cut matching problem.

Proof: Since all edge capacities are unity, and we want an integral flow, at most one commodity can flow through an edge. As shown in Proposition 4.3.1, for a tree of height one, this is simply a maximum matching problem. Our algorithm starts by rooting the tree at an arbitrary vertex. It then does two passes over the tree, level by level: an upward pass followed by a downward pass.

Consider a tree of height 2. At a vertex v at level 1, we could solve a maximum matching problem to route flow in the subtree rooted at v. However, it may also be advantageous to send a commodity along the edge from v to the root, r. This will be strictly advantageous only if we can still route the previous amount of flow in the subtree rooted at v. We determine the commodities for which we get a strict advantage, by solving a maximum matching problem for each commodity (on the downward pass, we will pick one of these commodities for the (v, r) edge; once this is done, the rest of the routing in the subtree rooted at v can be accomplished). Vertex v can now be considered the source/sink of these commodities. This is done for all vertices at level 1. Then a height one problem is solved at the root. In solving this, we pick the commodity that is routed on the (v, r) edge. As remarked earlier, the rest of the routing in the subtree rooted at v can now be fixed.

This is the essential idea of the algorithm for arbitrary height trees as well. In the upward pass, we consider vertices level by level. At a vertex v we solve, for each choice of commodity being routed on the (v, parent(v)) edge, a height one problem. The commodities giving strict advantage are thought of as originating at v itself. In the downward pass, we start at the root, fixing commodities. The vertex parent(v) decides which commodity gets routed on (v, parent(v)). Once this is done vertex v fixes the commodities routed on the edges to its children.

4.3.3 Trees with Edge Capacities

We generalize the notion of b-matchings by allowing constraints for any family of non-crossing cuts (not just singleton cuts) and call this a cross-free-cut b-matching. Thus, given a graph $G = (V, E)$, a family, $\mathcal{F}$, of non-crossing cuts and a function $b : \mathcal{F} \to Z^+$, a cross-free-cut b-matching is a set of edges that contains at most $b((S, \bar{S}))$ edges from the cut $(S, \bar{S}) \in \mathcal{F}$. The maximum cross-free-cut b-matching problem is to find a cross-free-cut b-matching of maximum cardinality. It is easy to see that finding a maximum integral flow in a tree with edge capacities is equivalent to finding a maximum cross-free-cut b-matching in the demand graph.

The maximum integral flow problem for trees with arbitrary edge capacities is NP-hard, and so is the maximum cross-free-cut b-matching problem. It is intriguing that generalizing the maximum matching problem to a family of non-crossing cuts results in a polynomial time solvable problem (the maximum cross-free-cut matching problem), whereas the same generalization of the maximum b-matching problem results in an NP-hard problem.

Theorem 4.3.2 The maximum integral flow problem is NP-hard and max SNP-hard for trees with edge capacities 1 and 2.

We reduce the NP-complete 3D-matching problem to a decision version of the maximum integral flow problem. Given three disjoint sets X, Y, Z, $|X| = |Y| = |Z| = n$, and a set of triples $S = \{(x_i, y_j, z_k) \mid x_i \in X, y_j \in Y, z_k \in Z\}$, the 3D-matching problem is to check if there exists a set of n triples that cover each element of $X \cup Y \cup Z$.

Given an instance of the 3D-matching problem, we construct a tree, T, of height 3. The vertices at level 1 correspond to the elements of $X \cup Y \cup Z$. A vertex corresponding to the element $x_i \in X$ has $p_i$ children, where $p_i$ is the number of occurrences of $x_i$ in S. We label these vertices $x_{i,l}$, $1 \le l \le p_i$. Each of the vertices $x_{i,l}$ has 2 children labelled $x_{i,l,a}$ and $x_{i,l,b}$. Thus there are $|S|$ vertices at the 2nd level and $2|S|$ vertices at the 3rd level of the tree. Edges $(r, x_i)$, $1 \le i \le n$, and $(x_i, x_{i,l})$, $1 \le l \le p_i$, $1 \le i \le n$, have a capacity 2. All other edges have unit capacity. Figure 4.2 shows the construction of the tree.

[Figure 4.2: The tree for the NP-hardness proof of maximum integral flow]

The occurrences of $x_i$ in S are numbered arbitrarily from 1 to $p_i$, and the l-th occurrence corresponds to the vertex $x_{i,l}$. If $(x_i, y_j, z_k) \in S$ is the l-th occurrence of $x_i$, we add three source-sink pairs, $(x_{i,l,a}, x_{i,l,b})$, $(x_{i,l,a}, y_j)$ and $(x_{i,l,b}, z_k)$. Thus this instance of the multicommodity flow problem has $3|S|$ commodities in all. Theorem 4.3.2 now follows from:

Lemma 4.3.3 I is a true instance of the 3D-matching problem iff T has an integral flow of $n + |S|$ units.

Proof: Let I be a true instance of the 3D-matching problem. Then there exists a set, $S'$, of n triples covering all elements in $X \cup Y \cup Z$. If $(x_i, y_j, z_k) \in S'$ corresponds to the l-th occurrence of $x_i$, then we route one unit of the commodities corresponding to the source-sink pairs $(x_{i,l,a}, y_j)$ and $(x_{i,l,b}, z_k)$. Also, for all m such that $1 \le m \le p_i$, $m \ne l$, we route a unit flow for the source-sink pair $(x_{i,m,a}, x_{i,m,b})$. Thus the total flow routed is $2n + \sum_{i=1}^{n} (p_i - 1) = n + |S|$.

For the other direction, note that the maximum integral flow over the commodities whose source/sink are contained in the subtree rooted at $x_i$ is $p_i + 1$. Moreover, this flow can be achieved only by routing one unit for the source-sink pairs $(x_{i,l,a}, y_j)$ and $(x_{i,l,b}, z_k)$, for some l, and one unit for each of the remaining $p_i - 1$ pairs $(x_{i,m,a}, x_{i,m,b})$, $1 \le m \le p_i$, $m \ne l$. Therefore an integral flow of $n + |S|$ units has to route $p_i + 1$ units of flow over commodities whose source/sink are contained in the subtree rooted at $x_i$. If flow is routed for the source-sink pairs $(x_{i,l,a}, y_j)$ and $(x_{i,l,b}, z_k)$ then $(x_i, y_j, z_k)$ is a triple in S. Because the capacities of edges $(r, y_j)$ and $(r, z_k)$ are unity, each $y_j$, $z_k$ is included in at most one of these triples. Since there are n such triples (one for each $x_i$) this set of triples forms a 3D-matching.

[Figure 4.3: Relation between integral flow and multicut; the figure marks F, the maximum integral flow, the maximum multicommodity flow, the minimum multicut, and M]
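The flow count in the forward direction of the proof of Lemma 4.3.3 can be expanded step by step. The only fact used beyond the construction is that every triple of S contains exactly one element of X, so that $\sum_i p_i = |S|$:

```latex
\begin{align*}
2n + \sum_{i=1}^{n} (p_i - 1)
  &= 2n + \sum_{i=1}^{n} p_i - n
     && \text{(two units for each of the $n$ triples of $S'$)} \\
  &= 2n + |S| - n
     && \text{(since $\textstyle\sum_{i} p_i = |S|$)} \\
  &= n + |S|.
\end{align*}
```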

4.4 Approximating Integral Flow and Multicut

In this section we present an algorithm that finds a multicut, M, and an integral flow, F, such that the multicut is of weight at most twice the integral flow, i.e. $M \le 2F$. Since maximum multicommodity flow (and hence maximum integral flow) is a lower bound on the weight of the minimum multicut we have

$$M \le 2F \le 2 \cdot \text{maximum integral flow} \le 2 \cdot \text{weight of minimum multicut}$$

and

$$F \ge \tfrac{1}{2} M \ge \tfrac{1}{2} \cdot \text{weight of minimum multicut} \ge \tfrac{1}{2} \cdot \text{maximum integral flow}.$$

This situation is illustrated in Figure 4.3.


Our algorithm follows a primal-dual approach, the elements of which have been enunciated in [59]. This approach, when applied to approximation algorithms, consists of starting with arbitrary solutions to the primal and dual linear programs, and making alternate improvements to each, until `good' integral solutions to both are found. The improvements are guided by the complementary slackness conditions. The two complementary slackness conditions for our setting are

1. $f_i > 0 \Rightarrow \sum_{e \in p_i} d_e = 1$, i.e. if the commodity i has a non-zero flow then the sum of the distance labels along path $p_i$ is exactly 1.
2. $d_e > 0 \Rightarrow \sum_{i : e \in p_i} f_i = c_e$, i.e. an edge with a positive distance label is saturated.

Enforcing both these complementary slackness conditions would give us optimal solutions to the primal and dual linear programs. Since we are looking for good integral solutions to these programs and the optimal solutions are in general not integral, we cannot enforce both of these complementary slackness conditions. We enforce the second complementary slackness condition and relax the first to

$$f_i > 0 \;\Rightarrow\; 1 \le \sum_{e \in p_i} d_e \le 2 \qquad (4.1)$$

This implies that we pick only saturated edges in the multicut ($d_e > 0 \Rightarrow \sum_{i : e \in p_i} f_i = c_e$) and that for any commodity that is routed, the flow path contains at most two edges of the multicut. It is easy to see that ensuring these two conditions would imply that the capacity of the multicut is at most twice the value of the flow.

We now describe an algorithm for finding a multicut and an integral flow that meet these two requirements. We begin by rooting the tree at an arbitrary vertex, say r. The level of a vertex is its distance from the root. A commodity is contained in the subtree rooted at v if the path corresponding to it lies completely within this subtree. A commodity is contained in level i if it is contained in a subtree rooted at some vertex in level i. An edge $e_1$ is an ancestor of an edge $e_2$ if $e_1$ lies on the path from $e_2$ to the root. The algorithm makes two passes over the tree.
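The factor of two claimed above follows from a short calculation, sketched here under the two stated conditions (every edge of the multicut is saturated, and every flow path contains at most two edges of the multicut). Here M denotes the multicut, $c(M)$ its capacity, and $|M \cap p_i|$ the number of its edges on $p_i$; this notation is introduced only for the sketch.

```latex
\begin{align*}
c(M) = \sum_{e \in M} c_e
  &= \sum_{e \in M} \; \sum_{i : e \in p_i} f_i
     && \text{(each $e \in M$ is saturated)} \\
  &= \sum_{i=1}^{k} f_i \, |M \cap p_i|
     && \text{(exchange the order of summation)} \\
  &\le 2 \sum_{i=1}^{k} f_i
     && \text{($|M \cap p_i| \le 2$ whenever $f_i > 0$)}
\end{align*}
```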

Pass 1. In this pass we move up the tree, one level at a time, routing flow as we go along and picking some edges (a subset of these edges will be retained as the multicut). If v is a vertex in the current level, check if there exists a commodity contained in the subtree rooted at v. If yes, send as much flow of this commodity as is possible. Repeat this procedure till no more flow can be routed for commodities contained in this subtree.

We also need to pick a set of edges to include into the multicut. Let Q be the set of edges saturated in this step and let I be the set of commodities such that for $i \in I$, the path $p_i$ did not contain a saturated edge before this step but contains one now. Note that if a commodity in the set I is contained in the subtree rooted at v then the path corresponding to it must use the vertex v. This is because we are moving up the tree and so would have considered all paths in this subtree that did not contain v at an earlier step in the procedure and saturated some edge along each of these paths. For this same reason, all paths along which flow is routed in this step use the vertex v. Thus if there are two edges in Q such that one is an ancestor of the other, then one of these edges is redundant as far as disconnecting the source-sink pair of commodities in I is concerned. We retain the edge that is the ancestor and denote this subset of Q as frontier(v) (the frontier of vertex v).

Claim 4.4.1 The union of all frontiers is a multicut.

Proof: As a first step in proving this claim observe that the set of saturated edges is a multicut. If such is not the case then there exists a commodity i such that no edge along $p_i$ is saturated. We could hence have routed additional flow of commodity i, a contradiction. From our definition of frontiers it follows that the union of all frontiers disconnects exactly the source-sink pairs that are disconnected by the set of saturated edges. Hence the union of all frontiers is a multicut.

We augment the flow in this manner, level by level. A path along which some flow is sent will be called a flow path. Since all capacities are integral the flow along each flow path is also integral. In the second pass we move down the tree dropping redundant edges from the multicut; this is done to ensure that no more than two edges of a flow path are included in the multicut (condition 4.1).

Pass 2. We move down the tree one level at a time and build the multicut. When considering vertex v we include an edge $e \in$ frontier(v) in the multicut only if no edge along the path from e to v is already included in the multicut. Let M be the set of edges picked.

Claim 4.4.2 M is a multicut.

Proof: By our previous claim, the frontiers of all vertices put together form a multicut. Note that the edges of frontier(v) are required to disconnect a source-sink pair only if the path corresponding to this pair uses the vertex v (the set of commodities, I, in the preceding discussion). If e is an edge of frontier(v) not included in M then there is another edge $e' \in M$, that lies on the path from e to v and hence disconnects all source-sink pairs that e was required to disconnect.

Algorithm multicut-integral-flow(r);
1.   {Pass 1}
     for current_level = max_level downto 0 do
       for all v ∈ current_level do
1.1.     for all commodities contained in subtree rooted at v do
           Route as much flow of commodity as is possible.
           update F
1.2.     Compute frontier(v)
2.   {Pass 2}
2.1. M ← ∅   {Initializing multicut}
2.2. for current_level = 0 to max_level do
       for all v ∈ current_level do
         for all e ∈ frontier(v) do
           if ∄ e' ∈ M such that e' is on the path from e to v then
             M ← M ∪ {e}
3.   return (M, F)
end.

Figure 4.4: Finding a multicut, M, and an integral flow, F, such that $M \le 2F$

Figures 4.5 through 4.10 show the algorithm at work on the tree in Figure 4.5. In the first pass over the tree (Figures 4.5 to 4.7) no flow is routed when considering vertices at level 2.


At the first level, one unit of commodity 1 and 2 units of commodity 2 are routed while at the root level one unit of commodity 3 is routed. The broken edges are the ones included in the frontiers of different vertices (Figure 4.8). In the second pass over this tree (Figures 4.9 and 4.10) the broken edges of the frontiers are made solid as they are included in the multicut. Note that the edge incident to $s_1$ is not included in the multicut although it was included in the frontier set of one of the vertices.

We now show that the multicut picked, M, contains at most two edges of any flow path. We begin by making the following two claims.

Claim 4.4.3 Let $s_i$-$t_i$ be a flow path, and v the least common ancestor of $s_i$, $t_i$. If $e \in$ frontier(u) is an edge on this path then u is an ancestor of v.

Proof: For contradiction, assume that v is an ancestor of u. Hence, u is considered before v in the first pass. Since $e \in$ frontier(u), e is saturated while considering vertex u and flow is sent along the path $s_i$-$t_i$ later (while processing vertex v). However, this is not possible as edge e lies on the path $s_i$-$t_i$.

Claim 4.4.4 If u is an ancestor of v then no edge of frontier(v) is an ancestor of an edge of frontier(u).

Proof: Suppose $e_1 \in$ frontier(v) is an ancestor of $e_2 \in$ frontier(u). Since u is an ancestor of v, vertex v is considered before u in the first pass. As $e_1 \in$ frontier(v), $e_1$ is saturated while considering vertex v and $e_2 \in$ frontier(u) is saturated later (while processing vertex u). However, this is not possible as edge $e_1$ is an ancestor of $e_2$ and lies on the path from $e_2$ to u.

Lemma 4.4.1 Let $s_i$-$t_i$ be a flow path and v the least common ancestor of $s_i$, $t_i$. Then M contains at most one edge from the path $s_i$-v ($t_i$-v).

Proof: Let M contain two edges $e_1$, $e_2$ from the path $s_i$-v. Further, let $e_1 \in$ frontier($v_1$) be an ancestor of $e_2 \in$ frontier($v_2$). By the above two claims it follows that $v_1$ is an ancestor of $v_2$ which is an ancestor of v. Thus, $e_1$ occurs on the path from $e_2$ to $v_2$ and so while picking edges from frontier($v_2$) (Pass 2) we would not have included $e_2$ in M.

[Figure 4.5: Pass 1, 2nd level: no flow routed]

[Figure 4.6: Pass 1, 1st level: 1 unit of commodity 1 and 2 units of commodity 2 routed]

[Figure 4.7: Pass 1, 0th level: 1 unit of commodity 3 routed]

[Figure 4.8: End of Pass 1: The broken edges are in the frontiers of vertices]

[Figure 4.9: Pass 2, 0th level: Bold edges included in multicut]

[Figure 4.10: Pass 2, 1st level: Bold edges included in multicut]

Hence, the multicut, M, includes at most two edges of any flow path. Since all the edges in M are saturated, the two conditions that we set out to enforce are met. Hence the capacity of the multicut is at most twice the value of the flow.
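The two-pass procedure of Figure 4.4 can be sketched in Python as follows. This is one possible reading of the algorithm, not the author's implementation. The input format is an assumption: a hypothetical `parent` map (the root maps to `None`), capacities keyed by the lower endpoint of each edge, and a list of source-sink pairs. Commodities are grouped by the least common ancestor of their endpoints, which is the vertex at which the text says their flow is routed.

```python
from collections import defaultdict

def multicut_integral_flow(parent, cap, pairs):
    """Return (M, flow): a multicut M (set of edges, each named by its child
    endpoint) and integral flow values with capacity(M) <= 2 * sum(flow)."""
    def depth(v):
        d = 0
        while parent[v] is not None:
            v, d = parent[v], d + 1
        return d

    def path_and_lca(s, t):
        edges, ds, dt = [], depth(s), depth(t)
        while s != t:                       # always lift the deeper endpoint
            if ds >= dt:
                edges.append(s); s = parent[s]; ds -= 1
            else:
                edges.append(t); t = parent[t]; dt -= 1
        return edges, s

    def above(e, top):
        """Edges strictly between edge e and the vertex `top` (an ancestor of e)."""
        u = parent[e]
        while u != top:
            yield u
            u = parent[u]

    info = [path_and_lca(s, t) for s, t in pairs]
    by_lca = defaultdict(list)
    for i, (_, lca) in enumerate(info):
        by_lca[lca].append(i)

    residual, flow, frontier = dict(cap), [0] * len(pairs), {}
    order = sorted(parent, key=depth, reverse=True)   # deepest vertices first

    # Pass 1: move up the tree, routing flow greedily and recording frontiers.
    for v in order:
        newly_saturated = set()
        for i in by_lca[v]:
            edges = info[i][0]
            amount = min((residual[e] for e in edges), default=0)
            if amount > 0:
                flow[i] += amount
                for e in edges:
                    residual[e] -= amount
                    if residual[e] == 0:
                        newly_saturated.add(e)
        # Keep only edges with no newly saturated ancestor below v.
        frontier[v] = [e for e in newly_saturated
                       if not any(a in newly_saturated for a in above(e, v))]

    # Pass 2: move down the tree, dropping edges already separated from v by M.
    M = set()
    for v in reversed(order):
        for e in frontier[v]:
            if not any(a in M for a in above(e, v)):
                M.add(e)
    return M, flow
```

On the three-leaf star with unit capacities the sketch routes one unit of flow and returns a multicut of weight 2, matching the factor-two guarantee.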

4.5 Integrality Gap for Grid Graphs

The approximation algorithm for maximum integral flow on trees uses, implicitly, the fact that the ratio between the maximum fractional and integral flows for trees is at most 2. This, however, is not true for general graphs. In fact, even for grid graphs this gap is quite large.

Proposition 4.5.1 The gap between the maximum integral flow and the maximum fractional flow for grid graphs can be as high as $k/2$, where k is the number of commodities.

Proof: The graph G is a union of k paths, $p_1, p_2, \ldots, p_k$; the end points of the path $p_i$ form the source-sink pair for commodity i. Every pair of paths, $p_i$, $p_j$, $i \ne j$, intersect in a unique edge. The graph G can be embedded on a $k \times k$ grid. If the origin $(0, 0)$ is the left bottom corner of the grid then $s_i = (0, k - i + 1)$ and $t_i = (i, 0)$. The path $p_i$ is then the path from $(0, k - i + 1)$ to $(i, k - i + 1)$ to $(i, 0)$. Any two paths intersect at a unique vertex. To ensure that they intersect in a unique edge we replace each intersection $v = (i, j)$ by two vertices $v_a$, $v_b$. The edges incident at vertices $v_a$, $v_b$ are $(v_a, (i - 1, j))$, $(v_a, (i, j + 1))$, $(v_b, (i + 1, j))$, $(v_b, (i, j - 1))$ and $(v_a, v_b)$. Figure 4.11 shows this embedding.

The maximum integral flow for this instance is one unit, as by routing any one commodity we block the paths of all other commodities. However, the maximum fractional flow is $k/2$ units; half a unit of each commodity can be routed simultaneously. This yields a gap of $k/2$ between the maximum fractional flow and the maximum integral flow.

In the above example at least k edges are needed to disconnect every source-sink pair. Thus the gap between the maximum integral flow and the minimum multicut is k and so we cannot use integral flow as a lower bound on the weight of the multicut when approximating the minimum multicut in general graphs. However, in this example the maximum flow has value $k/2$; we use this as a lower bound while approximating the multicut in the next chapter.
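The counting in this proof can be checked on an abstract version of the instance. The sketch below keeps only the combinatorial structure used in the argument and ignores the grid embedding: path i is modelled as the set of unit-capacity edges it shares with the other paths, one per pair, and commodity i is assumed to be routed only along $p_i$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def gap_instance(k):
    """Path i as the set of unit-capacity edges it shares with the other paths."""
    return [{frozenset((i, j)) for j in range(k) if j != i} for i in range(k)]

def max_integral_flow(paths):
    """Largest set of pairwise edge-disjoint paths, by brute force."""
    for r in range(len(paths), 0, -1):
        for chosen in combinations(paths, r):
            if all(not (p & q) for p, q in combinations(chosen, 2)):
                return r
    return 0

def half_flow_is_feasible(paths):
    """Check that half a unit on every path respects every unit capacity."""
    edges = set().union(*paths)
    return all(sum(Fraction(1, 2) for p in paths if e in p) <= 1 for e in edges)

k = 4
paths = gap_instance(k)
integral = max_integral_flow(paths)          # only one path can be routed
fractional = Fraction(k, 2) if half_flow_is_feasible(paths) else None
```

Every shared edge lies on exactly two paths, so half a unit on each path loads it to exactly its capacity.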

[Figure 4.11: The grid graph with the $k/2$ gap between maximum (fractional) flow and maximum integral flow]

4.6 The Tree-representable Set Cover Problem

The minimum multicut problem for trees can be viewed as a weighted set cover problem. The elements, P, of the set system, (P, E), are the $s_i$-$t_i$ paths in the tree, $1 \le i \le k$. The sets, E, correspond to edges of the tree. A set includes all $s_i$-$t_i$ paths (elements) that use this edge; the set has weight equal to the capacity of the edge. Finding the minimum weight multicut for the tree is the same as finding the minimum weight set cover in this set system.

Note however, that not all set cover problems can be viewed as minimum multicut problems on trees. A set system (P, E) is called a tree-representable set system if there exists a tree T, whose set of edges is E, such that every path $p \in P$ is a path in T. The problem of deciding whether a given set system is tree-representable is well-studied and efficient algorithms are known [6]. Thus, given a set cover problem, one can, in almost linear time, check if it corresponds to the minimum multicut problem on some tree, T, and if so, find T and the $\{s_i, t_i\}$ pairs.

Theorem 4.6.1 There exists a polynomial time 2-approximation algorithm for the minimum weight set cover problem for set systems that are tree-representable.

CHAPTER 5

Multicut in General Graphs

5.1 Introduction

Much of flow theory, and the theory of cuts in graphs, is built around a single theorem: the celebrated max-flow min-cut theorem of Ford and Fulkerson [40], and Elias, Feinstein and Shannon [12]. The power of this theorem lies in that it relates two fundamental graph-theoretic entities via the potent mechanism of a min-max relation. The importance of this theorem has led researchers to seek its generalization to the case of multicommodity flow. In this setting, each commodity has its own source and sink, and the object is to maximize the sum of the flows subject to capacity and flow conservation requirements. The notion of a multicut generalizes that of a cut, and is defined as a set of edges whose removal disconnects each source from its corresponding sink. Clearly, maximum multicommodity flow is bounded by minimum multicut; equality can be established for some special cases, for example when there are only two commodities [24]. As shown in the previous chapter, maximum multicommodity flow is not equal to the minimum multicut in general.

Why does the theorem hold for a single commodity, and why does the generalization fail? To seek an explanation, let us consider the LP formulation of the maximum multicommodity flow problem. As shown in Section 5.2 the dual of this is the linear programming relaxation of the minimum multicut problem, i.e. the optimal integral solution to the dual is the minimum multicut. In general, the vertices of the dual polyhedron are not integral. However, for the case of a single commodity, they are integral and the max-flow min-cut theorem is simply a consequence of the Duality theorem of linear programming. For the multicommodity case, the Duality theorem only shows that maximum flow is equal to the minimum fractional (i.e. relaxed) multicut.

In this chapter, we address the maximum multicommodity flow problem for a general graph and prove the following approximate max-flow min-multicut theorem:

$$\frac{\text{minimum multicut}}{O(\log k)} \;\le\; \text{maximum flow} \;\le\; \text{minimum multicut}$$

where k is the number of commodities. We also show that our theorem is tight modulo constant factors, and we give a polynomial time algorithm for finding a multicut within $O(\log k)$ of the optimum fractional, and therefore also integral, multicut.

Dahlhaus et al. [11] observed that a 2-approximation algorithm for the multiway cut problem can be used to approximate the general multicut problem to within a factor 2 with a running time that is exponential in k; thus the running time is polynomial only for fixed k. Klein et al. [30] used their approximation algorithm for the sparsest cut to give an $O(\log^3 n)$ approximation algorithm for the multicut.

Our improvement for multicut gives us an $O(\log n)$ approximation algorithm for the problem of deleting the minimum number of clauses to make a 2CNF formula satisfiable. The previous bound, due to Klein et al. [30], for this problem was $O(\log^3 n)$. This in turn yields an $O(\log n)$ approximation for the minimum edge deletion graph bipartization problem (deleting the minimum number of edges to make a graph bipartite) and for the 2-layer via minimization problem (minimizing the number of vias required in routing a 2-layered layout).

5.2 Preliminaries

Let $G = (V, E)$ be an undirected graph, $c : E \to \mathbb{R}^+$ a capacity function on the edges and let $(s_1, t_1), (s_2, t_2), \ldots, (s_k, t_k)$ be the $k$ specified pairs of vertices. We associate a commodity, $i$, with the pair $(s_i, t_i)$ and designate $s_i$ as the source and $t_i$ as the sink for that commodity.$^1$ Let $H$ be the demand graph corresponding to this multicommodity flow instance, i.e. the graph on $V$ with an edge $(s_i, t_i)$ for each commodity $i$. The assumption that each commodity has a single source and a single sink can be made without loss of generality. The more general case where a commodity $i$ may have a set $S_i$ of sources and a set $T_i$ of sinks can be easily reduced to this one by adding a new source $s_i$ with edges to the vertices in $S_i$ and a new sink $t_i$ with edges to $T_i$.

Let $E'$ be any set of edges. We denote by $C_{E'}$ the total capacity of the edges in $E'$, thus $C_{E'} = \sum_{e \in E'} c_e$. The cut associated with a set of vertices, $S$, is the set of edges with exactly one end point in $S$; this cut is denoted by $\nabla(S)$.

Let $p_i$ be a path between some source-sink pair and let $f_i$ be a variable for the flow along this path.$^2$ We require that the flow along any path be non-negative, i.e. $f_i \ge 0$. The capacities of the edges impose constraints on the flow; the total flow through an edge cannot exceed the capacity of the edge.

$$\sum_{i : e \in p_i} f_i \le c_e, \qquad e \in E$$

The flow would be maximum when $\sum_i f_i$ is maximized. Hence the linear program for maximum multicommodity flow is

$$\begin{array}{lll}
\text{maximize} & \sum_i f_i & \\
\text{subject to} & \sum_{i : e \in p_i} f_i \le c_e & e \in E \\
 & f_i \ge 0 &
\end{array}$$

and its dual is

$$\begin{array}{lll}
\text{minimize} & \sum_{e \in E} d_e c_e & \\
\text{subject to} & \sum_{e \in p_i} d_e \ge 1 & \text{for each path } p_i \\
 & d_e \ge 0 & e \in E
\end{array}$$

The dual program can be viewed as an assignment of non-negative distance labels, $d_e$, to edges $e \in E$, such that each source-sink pair is at least a unit distance apart. The distance label assignment is chosen so as to minimize $\sum_{e \in E} d_e c_e$.

$^1$ Since the graph is undirected, it is immaterial which vertex in the pair is designated as the source (sink).

$^2$ The only flow along this path would be of the commodity corresponding to the source-sink pair connected by $p_i$. The variable $f_i$ then denotes the flow of this commodity along $p_i$.
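To make the primal-dual pair concrete, the following sketch checks a feasible flow and a feasible distance labelling on a toy instance. The instance (a star with centre `o` and three unit-capacity leaves, every pair of leaves being a commodity) and all function names are assumptions introduced here for illustration; they do not come from the thesis. Both objective values come out equal, so by duality each solution is optimal for its program.

```python
from fractions import Fraction

# Hypothetical toy instance: star with centre "o" and leaves a, b, c.
capacity = {("o", "a"): Fraction(1), ("o", "b"): Fraction(1), ("o", "c"): Fraction(1)}
pairs = [("a", "b"), ("b", "c"), ("a", "c")]

def path_edges(s, t):
    """In a star the only simple path between two leaves passes through the centre."""
    return [("o", s), ("o", t)]

paths = {pair: path_edges(*pair) for pair in pairs}

def primal_value(flow):
    """Check f >= 0 and the edge capacity constraints, then return sum of f_i."""
    load = {e: Fraction(0) for e in capacity}
    for pair, f in flow.items():
        assert f >= 0
        for e in paths[pair]:
            load[e] += f
    assert all(load[e] <= capacity[e] for e in capacity)
    return sum(flow.values())

def dual_value(d):
    """Check d >= 0 and that every path has length >= 1, then return sum of d_e c_e."""
    assert all(x >= 0 for x in d.values())
    assert all(sum(d[e] for e in paths[pair]) >= 1 for pair in pairs)
    return sum(d[e] * capacity[e] for e in capacity)

flow = {pair: Fraction(1, 2) for pair in pairs}      # half a unit per commodity
labels = {e: Fraction(1, 2) for e in capacity}       # every edge has length 1/2

print(primal_value(flow), dual_value(labels))
```

On this instance the fractional optimum is 3/2, whereas any multicut must remove two of the three edges, so the integral dual optimum is 2.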


The optimal integral solution to the dual program is a 0/1 assignment of distance labels to the edges such that any path connecting a source-sink pair contains an edge with distance label 1. Thus the edges with $d_e = 1$ form a multicut of weight equal to $\sum_{e \in E} d_e c_e$. Conversely, the minimum multicut corresponds to an integral solution to the dual program of value equal to the weight of the multicut. Thus the optimal integral solution to the dual LP is the minimum multicut and hence the value of the optimal (fractional) solution, which by the Duality theorem is equal to the maximum multicommodity flow, is a lower bound on the weight of the minimum multicut.
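The correspondence between 0/1 dual solutions and multicuts can be checked by brute force on small instances. The sketch below is an illustration only: it enumerates edge subsets (exponential time, so suitable just for toy graphs), and the helper names and the star instance are assumptions made here, not part of the thesis.

```python
from itertools import combinations

def separates(edges, removed, pairs):
    """True if deleting the edges in `removed` disconnects every source-sink pair."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for u, v in edges:
        if (u, v) not in removed:
            parent[find(u)] = find(v)
    return all(find(s) != find(t) for s, t in pairs)

def min_multicut(capacity, pairs):
    """Return (weight, edge set) of a minimum-weight multicut by exhaustive search."""
    edges = list(capacity)
    best = None
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            if separates(edges, set(subset), pairs):
                weight = sum(capacity[e] for e in subset)
                if best is None or weight < best[0]:
                    best = (weight, set(subset))
    return best

# Hypothetical instance: unit-capacity star, every pair of leaves must be separated.
capacity = {("o", "a"): 1, ("o", "b"): 1, ("o", "c"): 1}
pairs = [("a", "b"), ("b", "c"), ("a", "c")]
weight, cut = min_multicut(capacity, pairs)
print(weight, sorted(cut))
```

Setting $d_e = 1$ on the returned edges and $d_e = 0$ elsewhere gives an integral dual solution of the same value.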

5.3 Two Crucial Lemmas

In this section we shall prove two lemmas that will be central to our multicut algorithm. We shall prove these in sufficient generality so that they can be applied to the other versions of the multicommodity flow problem as well. Let $d : E \to \mathbb{R}^+$ be an assignment of distance labels to the edges such that for each commodity, $i$, the length of the shortest path between the source and sink corresponding to the commodity is at least 1. The distance between two vertices $u, v$ is the length of the shortest path between the vertices under the distance assignment $d$; we denote this by $dist_d(u, v)$. Thus $dist_d(s_i, t_i) \ge 1$, for all $i$, $1 \le i \le k$.

5.3.1 Growing a Region

A region is just a set of vertices. Initially the region is just a single vertex, $r$; we call this the root vertex. The region is `grown' by including more vertices. The vertices are included in the order of their distances from the root with the nearest vertex coming in first. This procedure for growing regions is similar to a graph clustering technique first proposed by Awerbuch [4] (for graphs without capacities or lengths on the edges), and to that used by Leighton and Rao [34] and Klein et al. [30] (for graphs with capacities and lengths) in the context of multicommodity flows. One can also view each region as growing out radially, with respect to the edge lengths, $d_e$.

To be able to view this growth as a continuous process we associate a variable $y_S$ with each set $S \subseteq V$. At any point in the algorithm we identify a set, $A$, as the active set and raise its variable $y_A$. Initially, the active set is $\{r\}$. Define the weight of the set $A$ as

$$wt(A) = \sum_{S \subseteq A} y_S \, C_{\nabla(S)} + W_r$$

where $W_r$ is a constant. Since $wt(\{r\}) = W_r$, it can be viewed as a weight assigned to the root vertex, $r$. The reason behind defining the weight of a set in this manner shall be clear later. Define the radius of $A$ as

$$rad(A) = \sum_{S \subseteq A} y_S$$

[Figure 5.1: Growing a region]

If while raising $y_A$ we find that $\sum_{S : e \in \nabla(S)} y_S = d_e$ for some edge $e = (u, v) \in \nabla(A)$, $u \in A$, we include the vertex $v$ in the region. It is easy to see that a vertex is included in the region when its distance from the root is equal to the radius of the region. We now make the set $A \cup \{v\}$ active, and start increasing the variable corresponding to it. We keep growing the region in this manner, one vertex at a time, until the cut associated with the active set, $\nabla(A)$, has capacity no more than $\epsilon \cdot wt(A)$, where $\epsilon$ is a constant to be chosen later. Thus the termination condition for the region growing process is

$$C_{\nabla(A)} \le \epsilon \cdot wt(A) \qquad (5.1)$$

Let $R$ denote the active set for which condition 5.1 is satisfied.

Lemma 5.3.1 $rad(R) \le \frac{1}{\epsilon} \ln\left(1 + \frac{\sum_{e \in E} d_e c_e}{W_r}\right)$.

Proof: Note that a region is grown only while the capacity of the cut associated with the region, $C_{\nabla(A)}$, is at least $\epsilon \cdot wt(A)$ (that is, while condition 5.1 is not met). From our definition of the weight of a region it follows that in increasing the radius of the region by an amount $\frac{1}{\epsilon}$ we shall be picking an additional weight of at least $\frac{C_{\nabla(A)}}{\epsilon} \ge wt(A)$. Thus, a $\frac{1}{\epsilon}$ (constant) increase in the radius at least doubles the weight. The weight of the region therefore grows exponentially with the radius, and this is the key fact about this region growing procedure.

We now provide a formal proof of this lemma. Let $A$ be the active set at some intermediate point and let $R = rad(A)$ and $W = wt(A)$ be the radius and weight of this set. Increasing the variable $y_A$ increases both the radius and weight of the set $A$; let these increases be $\Delta R$ and $\Delta W$ respectively (Figure 5.1). By our definition of the weight of a set the increase in weight is equal to $C_{\nabla(A)} \cdot \Delta R$. Since condition 5.1 was not met at this point, $C_{\nabla(A)} \ge \epsilon \cdot W$, and so

$$\Delta W = C_{\nabla(A)} \cdot \Delta R \ge \epsilon \cdot \Delta R \cdot W$$

Therefore,

$$\frac{\Delta W}{W} \ge \epsilon \cdot \Delta R$$

and for an infinitesimal increase we have

$$\frac{dW}{W} \ge \epsilon \cdot dR \qquad (5.2)$$

The weight of any set is at least $W_r$. To upper-bound the weight of a set observe that this region growing procedure ensures that for all edges $e \in E$, $\sum_{S : e \in \nabla(S)} y_S \le d_e$ and so

$$\begin{aligned}
wt(A) &= \sum_{S \subseteq A} y_S \, C_{\nabla(S)} + W_r \\
&= \sum_{S \subseteq A} \Big( y_S \sum_{e \in \nabla(S)} c_e \Big) + W_r \\
&= \sum_{e \in E} \Big( c_e \sum_{S : e \in \nabla(S),\, S \subseteq A} y_S \Big) + W_r \\
&\le \sum_{e \in E} d_e c_e + W_r
\end{aligned}$$

Hence the weight of a set is at least $W_r$ and at most $W_r + \sum_{e \in E} d_e c_e$. Integrating equation 5.2 with these limits on the weight we have

$$\int_{W_r}^{W_r + \sum_{e \in E} d_e c_e} \frac{dW}{W} \ge \epsilon \int_0^{rad(R)} dR$$

$$\ln\left(1 + \frac{\sum_{e \in E} d_e c_e}{W_r}\right) \ge \epsilon \cdot rad(R)$$

and so the radius of the region is at most $\frac{1}{\epsilon} \ln\left(1 + \frac{\sum_{e \in E} d_e c_e}{W_r}\right)$.
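The exponential growth used in the informal argument can also be read off directly from inequality 5.2. A short derivation, writing $W(\rho)$ for the weight of the active set when its radius is $\rho$ (this function notation is introduced here only for illustration; $W$ is continuous and piecewise linear in $\rho$, so the inequality is applied piece by piece):

```latex
\frac{d}{d\rho}\,\ln W(\rho) \;\ge\; \epsilon
\quad\Longrightarrow\quad
W(\rho) \;\ge\; W(0)\,e^{\epsilon\rho} \;=\; W_r\,e^{\epsilon\rho},
\qquad\text{and hence}\qquad
W_r\,e^{\epsilon\cdot rad(R)} \;\le\; wt(R) \;\le\; W_r + \sum_{e\in E} d_e c_e .
```

Taking logarithms on both ends of the last chain gives the bound of Lemma 5.3.1 again.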


The weight on the root is thus an initial condition that ensures quicker termination of the growth process. Figure 5.2 is a formal description of this region growing process. The procedure grow_region takes as input a graph, $G$, a distance label assignment, $d$, a choice for the root vertex and the constants, $\epsilon$ and $W_r$, and returns a region. Note that although we described the growth as a continuous process, it can be implemented to run in time $O(m + n \log n)$ along the lines of the implementation of Dijkstra's shortest path algorithm using Fibonacci heaps.

procedure grow_region($G, d, r, \epsilon, W_r$);
1. $A \leftarrow \{r\}$, $W \leftarrow W_r$, $R \leftarrow 0$
   {Initialization}
2. while $C_{\nabla(A)} > \epsilon \cdot W$ do
   {Termination condition not met}
   2.1. $\Delta W \leftarrow \frac{1}{\epsilon} C_{\nabla(A)} - W$, $\Delta R \leftarrow \frac{\Delta W}{C_{\nabla(A)}}$
        {$\Delta W$ is increase in weight required to satisfy end condition, $\Delta R$ is corresponding increase in radius}
   2.2. Pick $u$, a vertex not in $A$ and nearest to the root, $r$
   2.3. if $\Delta R \ge dist_d(r, u) - R$ then
        {$u$ included in region}
        2.3.1. $\Delta R \leftarrow dist_d(r, u) - R$, $\Delta W \leftarrow C_{\nabla(A)} \cdot \Delta R$
               {Increase in radius and weight of region}
        2.3.2. $A \leftarrow A \cup \{u\}$, $R \leftarrow R + \Delta R$, $W \leftarrow W + \Delta W$
               {Adding vertex to region and updating radius and weight}
        else
        {Termination condition can be met without including new vertex}
        2.3.3. $W \leftarrow W + \Delta W$, $R \leftarrow R + \Delta R$
               {Update radius and weight to value at the point where termination condition would be met}
   end.
3. return ($A$)

Figure 5.2: The procedure for growing a region
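The discrete steps of Figure 5.2 translate fairly directly into code. The following is a minimal sketch under stated assumptions, not the thesis's implementation: the adjacency-list format, the function signature and the returned triple are choices made here for illustration, the graph is assumed to have no self-loops, vertex names must be mutually comparable (they break ties in the heap), and a binary heap is used, giving $O(m \log n)$ rather than the Fibonacci-heap bound.

```python
import heapq
import math

def grow_region(adj, root, eps, w_root):
    """Grow a region around `root` until C(nabla(A)) <= eps * wt(A).

    adj[u] is a list of (v, capacity, length) triples of an undirected graph.
    Returns (region, radius, weight) at the moment condition 5.1 first holds.
    """
    dist = {root: 0.0}              # tentative distances from the root
    heap = [(0.0, root)]
    region, cut = set(), 0.0        # cut = capacity of nabla(region)
    radius, weight = 0.0, float(w_root)

    def add(u):
        nonlocal cut
        region.add(u)
        for v, cap, length in adj[u]:
            if v in region:
                cut -= cap          # edge no longer crosses the boundary
            else:
                cut += cap
                if dist[u] + length < dist.get(v, math.inf):
                    dist[v] = dist[u] + length
                    heapq.heappush(heap, (dist[v], v))

    add(heapq.heappop(heap)[1])
    while cut > eps * weight:                        # condition 5.1 not met
        while heap and (heap[0][1] in region or heap[0][0] > dist[heap[0][1]]):
            heapq.heappop(heap)                      # drop stale heap entries
        next_dist = heap[0][0] if heap else math.inf
        d_weight = cut / eps - weight                # weight still missing
        d_radius = d_weight / cut                    # radius that supplies it
        if d_radius >= next_dist - radius:           # a vertex is reached first
            weight += cut * (next_dist - radius)
            radius = next_dist
            add(heapq.heappop(heap)[1])
        else:                                        # stop between two vertices
            radius += d_radius
            weight = cut / eps
            break
    return region, radius, weight

# Hypothetical path s - a - b - t with (capacity, length) chosen so dist(s, t) = 1.
adj = {
    "s": [("a", 4, 0.25)],
    "a": [("s", 4, 0.25), ("b", 1, 0.5)],
    "b": [("a", 1, 0.5), ("t", 4, 0.25)],
    "t": [("b", 4, 0.25)],
}
region, radius, weight = grow_region(adj, "s", eps=2.0, w_root=1.0)
print(sorted(region), radius, weight)
```

In this example the region stops as soon as it has absorbed `a`, because the only boundary edge left is the low-capacity edge `(a, b)`.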


5.3.2 Growing Disjoint Regions

Having grown a region we remove all vertices contained in that region (and the edges incident at them) and grow another region in the residual graph. Let $R_1, R_2, \ldots, R_i, \ldots$ denote the regions formed. We denote by $G_i = (V_i, E_i)$ the graph obtained by deleting vertices contained in $\bigcup_{j=1}^{i-1} R_j$, where $G_1 = G$. Note that $E_1 - E_2, E_2 - E_3, \ldots, E_i - E_{i+1}, \ldots$ is a sub-partition of the edge set, $E$, where an edge lies in the $i$th set of this sub-partition only if it is incident to a vertex in $R_i$. Let $e \in E_i - E_{i+1}$; we define the extent to which edge $e$ is included in the regions $R_1, R_2, \ldots, R_i, \ldots$ as

$$\tilde{d}_e = \sum_{S \subseteq R_i,\; e \in \nabla(S)} y_S$$

An edge not in any set of the partition is not included and hence $\tilde{d}_e = 0$ for such an edge. By our region growing process it follows that for all edges $e \in E$, $\tilde{d}_e \le d_e$. Let

$$M = \nabla(R_1) \cup \nabla(R_2) \cup \ldots \cup \nabla(R_i) \cup \ldots$$

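Lemma 5.3.2 $C_M \le \epsilon \cdot \left( \sum_{e \in E} \tilde{d}_e c_e + \sum_i W_r \right)$.

Proof: We stop growing the region $R_i$ only when it satisfies the termination condition (equation 5.1). Therefore, $C_{\nabla(R_i) \cap E_i} \le \epsilon \cdot wt(R_i)$ and so

$$C_M = \sum_i C_{\nabla(R_i) \cap E_i} \le \epsilon \cdot \sum_i wt(R_i) \qquad (5.3)$$

The weight of a region can be rewritten as

$$\begin{aligned}
wt(R_i) &= \sum_{S \subseteq R_i} y_S \, C_{\nabla(S) \cap E_i} + W_r \\
&= \sum_{S \subseteq R_i} \Big( y_S \sum_{e \in \nabla(S) \cap E_i} c_e \Big) + W_r \\
&= \sum_{S \subseteq R_i} \; \sum_{e \in \nabla(S) \cap E_i} y_S \, c_e + W_r \\
&= \sum_{e \in E_i - E_{i+1}} \Big( c_e \sum_{S \subseteq R_i,\; e \in \nabla(S)} y_S \Big) + W_r \\
&= \sum_{e \in E_i - E_{i+1}} c_e \tilde{d}_e + W_r
\end{aligned}$$

and so

$$\begin{aligned}
\sum_i wt(R_i) &= \sum_i \; \sum_{e \in E_i - E_{i+1}} \tilde{d}_e c_e + \sum_i W_r \\
&= \sum_{e \in E} \tilde{d}_e c_e + \sum_i W_r \qquad (5.4)
\end{aligned}$$

Substituting equation 5.4 into equation 5.3 proves the lemma.

Since $R_i$ is a region grown in the graph $G_i$, we have

$$rad(R_i) \le \frac{1}{\epsilon} \ln\left(1 + \frac{\sum_{e \in E_i} d_e c_e}{W_r}\right)$$

Since $E_i \subseteq E$ and $d, c$ are non-negative functions, $\sum_{e \in E_i} d_e c_e \le \sum_{e \in E} d_e c_e$ and so

$$rad(R_i) \le \frac{1}{\epsilon} \ln\left(1 + \frac{\sum_{e \in E} d_e c_e}{W_r}\right)$$

Hence Lemma 5.3.1 bounding the radius of a region holds for the regions $R_1, R_2, \ldots, R_i, \ldots$.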

5.4 Approximate Max-flow Min-multicut Theorem

In this section we prove an approximate maximum-multicommodity-flow minimum-multicut theorem and show by means of an example that the theorem is tight modulo constant factors.

Theorem 5.4.1 (Approximate max-flow min-multicut theorem)

$$\text{maximum flow} \;\le\; \text{minimum multicut} \;\le\; 4 \ln(q + 1) \cdot \text{maximum flow}$$

where $q$ is the cardinality of the minimum vertex cover, $Q$, in the demand graph, $H$.

Proof: We prove this theorem by exhibiting a multicut of weight at most $4 \ln(q + 1)$ times the maximum flow. The multicut can be viewed as a partition of the vertex set such that no set in the partition contains both the source and the sink vertex of any commodity. The sets in the partition are formed by growing regions as in the previous section; we choose the constants $\epsilon$ and $W_r$ such that for no commodity are both the source and the sink of that commodity included in the same region. This, for example, would be the case if the diameter of the region is less than 1. We always choose a vertex of $Q$ as the root for growing a region and we continue growing regions (as in Section 5.3) while the residual graph $G_i$ contains a source-sink pair. If each region grown is such that it does not contain both the source and the sink vertex of a commodity then the number of regions grown is at most $|Q| = q$ and the union of the cuts associated with the regions,

$$M = \nabla(R_1) \cup \nabla(R_2) \cup \ldots \cup \nabla(R_i) \cup \ldots$$

is a multicut.

Choosing $W_r$ as $\frac{\sum_{e \in E} d_e c_e}{q}$ and $\epsilon$ as $2 \ln(q + 1)$ ensures that the radius of a region is at most $\frac{1}{2}$ (Lemma 5.3.1). Hence if the source and sink of a commodity are in the same region, the distance between them is strictly less than 1, a contradiction.

The capacity of the multicut is given by (Lemma 5.3.2)

$$C_M \le \epsilon \cdot \Big( \sum_{e \in E} \tilde{d}_e c_e + \sum_i W_r \Big)$$

Since $W_r = \frac{\sum_{e \in E} d_e c_e}{q}$ and the number of regions grown is at most $q$,

$$\sum_i W_r \le q \cdot W_r = \sum_{e \in E} d_e c_e$$

Further, for all edges $e \in E$, $\tilde{d}_e \le d_e$ and hence

$$\sum_{e \in E} \tilde{d}_e c_e \le \sum_{e \in E} d_e c_e$$

Therefore,

$$C_M \le 2 \epsilon \sum_{e \in E} d_e c_e = 4 \ln(q + 1) \sum_{e \in E} d_e c_e$$

If the distance label assignment, $d$, is an optimal solution to the dual LP, then by the Duality theorem, $\sum_{e \in E} d_e c_e$ is equal to the maximum multicommodity flow. Hence the multicut is of capacity at most $4 \ln(q + 1)$ times the maximum flow.

This theorem also shows that the ratio of the optimal integral solution to the optimal fractional solution of the dual program is at most $O(\log k)$. The following example shows that an $O(\log k)$ bound on the ratio of the minimum multicut to the maximum flow is tight.
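The arithmetic behind the choice of constants is easy to check numerically. The helper below is a small sketch introduced here for illustration (its name and return layout are assumptions): it plugs $W_r = \sum_e d_e c_e / q$ and $\epsilon = 2\ln(q+1)$ into the bounds of Lemma 5.3.1 and Lemma 5.3.2, assuming the worst case of $q$ regions and $\tilde{d}_e = d_e$.

```python
import math

def region_parameters(total_dual_weight, q):
    """Return (eps, w_root, radius_bound, cut_bound) for the choices made in the proof."""
    eps = 2 * math.log(q + 1)
    w_root = total_dual_weight / q
    # Lemma 5.3.1: rad <= (1/eps) * ln(1 + sum(d_e c_e) / W_r)
    radius_bound = math.log(1 + total_dual_weight / w_root) / eps
    # Lemma 5.3.2 with at most q regions and d~_e <= d_e
    cut_bound = eps * (total_dual_weight + q * w_root)
    return eps, w_root, radius_bound, cut_bound

eps, w_root, radius_bound, cut_bound = region_parameters(6.0, 3)
print(radius_bound, cut_bound / 6.0, 4 * math.log(4))
```

The radius bound evaluates to one half regardless of the total dual weight, and the cut bound divided by the dual weight equals $4\ln(q+1)$.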


Theorem 5.4.2 For all $k, n$ with $k \le n$, there exists an $n$ vertex graph, $G$, and $\Theta(k^2)$ pairs of vertices in $G$ such that the ratio of the minimum multicut separating these pairs to the maximum flow between these pairs is $\Omega(\log k)$.

Proof: Let $G = (V, E)$ be a $k$-vertex, bounded degree expander graph (each vertex has degree at most $d$, for an appropriate constant $d$). Every vertex has at most $\frac{k}{2}$ vertices within distance $\log_d(\frac{k}{2})$. Thus, $G$ has $\Theta(k^2)$ pairs of vertices that are a distance $\log_d(\frac{k}{2})$ or more apart. Let these be the pairs for the multicut instance.

All edges of the graph have unit capacity, and hence the total capacity of the edges is $O(k)$. Since each flow path is $\Omega(\log k)$ long, each unit of flow exhausts a total capacity of $\Omega(\log k)$ and hence the maximum flow is $O\left(\frac{k}{\log k}\right)$.

The optimum multicut can be viewed as a partition of the vertex set of $G$. Since $G$ is an expander, any set, $S$, in the partition has $\Omega(|S|)$ edges running across it, provided $|S| \le \frac{k}{2}$. Any subgraph with more than $\frac{k}{2}$ vertices has pairs that are $\log_d(\frac{k}{2})$ apart, and hence contains a pair of vertices that are to be separated. Thus, no set in the partition induced by the multicut has more than $\frac{k}{2}$ vertices, and so each set has $\Omega(|S|)$ edges running across it. Hence the number of edges in the multicut is $\Omega(k)$. This yields a ratio of $\Omega(\log k)$ between the weight of the minimum multicut and the maximum flow.

The demand graph corresponding to this instance has $k$ vertices and $\Theta(k^2)$ edges (since the number of source-sink pairs is $\Theta(k^2)$) and hence the minimum vertex cover has cardinality $\Theta(k)$.

Note that splitting an edge by adding vertices on the edge does not change the maximum flow or the minimum multicut. We modify $G$ into an $n$-vertex graph by adding an appropriate number of vertices on the edges of $G$.

5.5 Approximating the Minimum Multicut

The proof of the approximate max-flow min-multicut theorem can be modified to yield an algorithm for approximating the minimum multicut.

Theorem 5.5.1 Consider an instance of the multicut problem specified by a graph $G = (V, E)$, a capacity function $c : E \to \mathbb{R}^+$ and $k$ pairs of vertices. One can, in polynomial time, find a multicut separating the specified pairs of vertices and having weight at most $4 \ln(k + 1)$ times the weight of the minimum multicut.

Proof: If the vertex cover, $Q$, of the demand graph is given, then our proof of the approximate max-flow min-multicut theorem also constructs a multicut of weight at most $4 \ln(q + 1)$ times the maximum flow. Since computing the minimum vertex cover is NP-hard, we use an approximation to the minimum vertex cover; our vertex cover is the set of vertices that are sources for some commodity. Since there are $k$ commodities in all, the size of this vertex cover is at most $k$.

We need to modify our choice of $W_r$ and $\epsilon$; we set these to $\frac{\sum_{e \in E} d_e c_e}{k}$ and $2 \ln(k + 1)$ respectively. Using similar arguments as in the proof of the approximate max-flow min-multicut theorem one can show that the capacity of the multicut, $M$, is

$$C_M \le 4 \ln(k + 1) \sum_{e \in E} d_e c_e$$

Once again, if the distance label assignment, $d$, is the optimal solution to the dual LP then the capacity of $M$ is at most $4 \ln(k + 1)$ times the weight of the minimum fractional multicut and hence is at most $4 \ln(k + 1)$ times the weight of the minimum multicut.

Algorithm multicut($d$);
1. $M \leftarrow \emptyset$, $Q \leftarrow \bigcup_{i=1}^{k} \{s_i\}$, $W_r \leftarrow \frac{\sum_{e \in E} d_e c_e}{k}$
   {Initializing multicut $M$. $Q$ is the set of root vertices, $W_r$ is the weight of a root}
2. while $G$ contains a vertex of $Q$ do
   2.1. Pick vertex $r$ from $G$ such that $r \in Q$
   2.2. $R \leftarrow$ grow_region($G, d, r, 2 \ln(k + 1), W_r$)
        {Grow region with root as $r$ and $\epsilon = 2 \ln(k + 1)$}
   2.3. $M \leftarrow M \cup \nabla(R)$, $G \leftarrow G - R$
        {Include edges of $\nabla(R)$ in multicut, remove vertices of $R$ from $G$}
   end.
   {All source-sink pairs separated}
3. return ($M$)

Figure 5.3: An approximation algorithm for the multicut

[Figure 5.4: A multicut obtained using procedure multicut (drawing with labels $S_1, \ldots, S_6$ and $t_1, \ldots, t_6$)]

Figure 5.3 describes formally the approximation algorithm for the minimum multicut. It takes as input a distance label assignment, $d : E \to \mathbb{R}^+$, and returns a multicut of weight at most $4 \ln(k + 1) \sum_{e \in E} d_e c_e$. Figure 5.4 shows a multicut obtained by this procedure.
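The outer loop of Figure 5.3 can be sketched as follows. To keep the example self-contained, the region-growing step is passed in as a callable; the signature `grow(residual, root, eps, w_root)` returning a vertex set, the adjacency format and the trivial `singleton_region` stand-in are all assumptions made here for illustration. With a real region-growing routine in its place the loop is unchanged; with the stand-in it only demonstrates the bookkeeping (collecting $\nabla(R)$ in the residual graph and deleting $R$). A simple graph without parallel edges is assumed.

```python
import math

def multicut(adj, sources, grow, eps, w_root):
    """adj[u] = list of (v, capacity, length). Returns the multicut as a set of frozenset edges."""
    alive = set(adj)                 # vertex set of the residual graph G
    cut = set()                      # the multicut M
    for r in sources:                # Q = set of source vertices
        if r not in alive:
            continue                 # r already lies in an earlier region
        residual = {u: [(v, c, l) for v, c, l in adj[u] if v in alive] for u in alive}
        region = set(grow(residual, r, eps, w_root))
        for u in region:
            for v, _, _ in residual[u]:
                if v not in region:
                    cut.add(frozenset((u, v)))   # edge of nabla(R) in the residual graph
        alive -= region              # G <- G - R
    return cut

def singleton_region(residual, root, eps, w_root):
    """Hypothetical stand-in for region growing: the region is just the root."""
    return {root}

# Hypothetical instance: unit-capacity star, every edge of length 1/2.
star = {
    "o": [("a", 1, 0.5), ("b", 1, 0.5), ("c", 1, 0.5)],
    "a": [("o", 1, 0.5)],
    "b": [("o", 1, 0.5)],
    "c": [("o", 1, 0.5)],
}
pairs = [("a", "b"), ("b", "c"), ("a", "c")]
k = len(pairs)
total = sum(c * l for u in star for _, c, l in star[u]) / 2   # each edge is listed twice
M = multicut(star, [s for s, _ in pairs], singleton_region, 2 * math.log(k + 1), total / k)
print(sorted(sorted(e) for e in M))
```

Iterating over the list of sources and skipping those already removed is equivalent to the `while G contains a vertex of Q` loop of the figure.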

5.6 Applications

Several graph problems can be viewed as edge deletion problems; we wish to find a minimum weight set of edges whose removal yields a graph with a desired structure $\pi$ [60]. Klein et al. [30] propose a method for approximating such a problem when the property $\pi$ can be specified as a 2CNF formula so that deleting edges from the graph corresponds to deleting clauses in the formula. In particular, they show how to model the minimum edge deletion graph bipartization problem and the 2-layer via minimization problem. We thus wish to address the question of approximability of the following problem.

A 2CNF formula, $F$, is a weighted set of clauses of the form $P \equiv Q$ where $P$, $Q$ are literals. Find a minimum weight set of clauses the deletion of which makes the formula satisfiable.

Construct a graph $G(F)$ whose vertex set is the set of literals in $F$. For each clause of the kind $P \equiv Q$ include two edges $(P, Q)$ and $(\bar{P}, \bar{Q})$ of capacity equal to the weight of the clause $P \equiv Q$.

Lemma 5.6.1 A 2CNF formula, $F$, is satisfiable iff no connected component of the graph $G(F)$ contains both a literal and its complement.


Proof: An edge $(P, Q)$ in $G(F)$ implies that the literals $P$ and $Q$ take the same truth value. Thus, if a literal and its complement occur in the same connected component then the 2CNF formula is not satisfiable. For the converse, note that if two literals $P, Q$ are in the same connected component then their complementary literals $\bar{P}, \bar{Q}$ are also in one connected component. Thus, the components can be paired, so that in each pair one component contains a set of literals and the other contains the complementary literals. We can now obtain a satisfying assignment by setting, for each pair of components, the literals of one component to true (and the other's to false).

Thus we wish to find a minimum weight set of edges, $M$, whose removal separates the pairs of complementary literals in $G$. Let $W$ be the minimum weight set of clauses whose deletion makes $F$ satisfiable.
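The criterion of Lemma 5.6.1 is straightforward to test with a union-find structure over the literals. The sketch below is illustrative only; the encoding of literals as non-zero integers (with `-x` the complement of `x`) and the function name are assumptions introduced here.

```python
def satisfiable(clauses):
    """clauses: iterable of (P, Q) meaning P ≡ Q; literals are non-zero ints, -x complements x."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    for p, q in clauses:
        union(p, q)       # edge (P, Q) of G(F)
        union(-p, -q)     # edge between the complementary literals
    # Satisfiable iff no component contains a literal together with its complement.
    return all(find(x) != find(-x) for x in list(parent))

print(satisfiable([(1, 2), (2, 3)]))     # x1 ≡ x2, x2 ≡ x3
print(satisfiable([(1, 2), (2, -1)]))    # x1 ≡ x2, x2 ≡ not x1
```

The first formula is satisfied by setting all three variables equal; the second forces $x_1$ to equal its own complement.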

Lemma 5.6.2 $wt(W) \le wt(M) \le 2 \cdot wt(W)$.

Proof: The minimum multicut, $M$, in $G(F)$ corresponds to a set of clauses (of weight at most $wt(M)$) whose deletion makes the formula satisfiable. Hence, $wt(W) \le wt(M)$. Each clause of $F$ corresponds to two edges in $G(F)$. Thus the set $W$ corresponds to a multicut in $G(F)$ of weight at most $2 \cdot wt(W)$. Therefore, $wt(M) \le 2 \cdot wt(W)$.

Finding the set of edges, $M$, is exactly the multicut problem on the graph $G(F)$ with every pair of complementary literals forming a source-sink pair. Thus, the number of pairs, $k$, is equal to the number of variables in the formula, $n$, and hence by Theorem 5.5.1 we can approximate $M$ to within a factor $O(\log n)$. Using Lemma 5.6.2 we get

Theorem 5.6.3 Given a 2CNF formula, one can in polynomial time find a set of clauses of weight at most $O(\log n)$ times the minimum weight set of clauses whose deletion makes the formula satisfiable.

Corollary 5.6.4 The edge-deletion graph bipartization problem and the 2-layer via minimization problem can both be approximated within a factor O(log n) in polynomial time.


We leave open the question whether these problems can be approximated within some constant factor. We know they are both max SNP-hard and hence do not have a polynomial time approximation scheme.

Our approximation algorithm for edge multicuts can be adapted to the node case. A node multicut is a set of nodes (excluding the source/sink nodes) such that their deletion disconnects each source from the corresponding sink. The problem of finding a minimum weight node multicut is NP-hard, even for $k = 3$, since the node multiway cut problem for $k = 3$ can be viewed in this setting by letting each pair of terminals be a pair of the multicut problem. The multicommodity flow problem is similar to that for node multiway cuts: we have $k$ commodities, with $s_i, t_i$ as the source-sink pair for the $i$th commodity. The objective is to maximize the sum of the commodities routed subject to throughput and conservation constraints.

Theorem 5.6.5 The minimum node multicut problem can be approximated to within an $O(\log k)$ factor in polynomial time.

Corollary 5.6.6 The (weighted) node deletion graph bipartization problem can be approximated within a factor of O(log n) in polynomial time.

CHAPTER 6

The Min-Cut Max-Flow Ratio

6.1 Introduction

Given a multicommodity flow problem, with a demand associated with each commodity, which is the amount of the commodity that we wish to ship, one often needs to know if there is a feasible flow, i.e. a flow that satisfies the demands and obeys the capacity constraints. For a feasible flow to exist it is necessary that the capacity of any cut be at least the sum of the demands of the commodities whose source and sink are separated by the cut. The max-flow min-cut theorem for single commodity flow implies that this cut condition is also sufficient. In contrast, a multicommodity flow problem can be infeasible even if the cut condition is satisfied. The simplest such example is where the network is $K_{2,3}$ with unit-capacity edges and unit demand between every pair of nodes not connected by an edge.

Instances for which the cut condition is also sufficient have been extensively studied. Seymour [54] considered the supply-demand graph, the graph obtained from the original network by adding a source-sink edge for each commodity; he showed that if this graph excludes $K_5$ as a minor then the cut condition is sufficient for flow routability. Okamura and Seymour [42] showed that in a planar graph, if all sources and sinks lie on the boundary of a single face then the cut condition is sufficient.


The optimization version of the multicommodity flow problem, called the concurrent flow problem, first formulated by Shahrokhi and Matula [55], is to maximize the throughput, which is defined as the value, $f$, such that there exists a multicommodity flow shipping a fraction $f$ of the demand of each commodity. An upper bound on the throughput can be obtained by considering cuts in the network. For any cut, the throughput times the sum of the demands of the commodities whose source and sink lie on different sides of the cut cannot exceed the capacity of the cut. Thus the throughput is at most the minimum, taken over all cuts, of the ratio of the capacity of the cut to the demand across the cut; the cut achieving this minimum ratio, $\alpha$, is called the sparsest cut. The maximum throughput is also referred to as the maximum concurrent flow or the maximum flow (this is different from the definition of maximum flow in the previous chapters) and the sparsest cut is commonly called the minimum cut.

The max-flow min-cut theorem for single commodity flow states that the maximum flow and minimum cut as defined above are equal. However, equality of maximum flow and minimum cut does not hold for multicommodity flow instances in general. In this situation, the best one can hope for is an approximate max-flow min-cut theorem. In ground-breaking work, Leighton and Rao [34] gave the first such theorem. They considered the uniform multicommodity flow, a special case of multicommodity flow where there is a demand of one unit between every pair of nodes, and proved the following approximate max-flow min-cut theorem:

$$\frac{\alpha}{O(\log n)} \le f \le \alpha,$$

where $n$ is the number of vertices in the graph. Alternatively, this means that for the case of uniform demands, it is sufficient that the capacity of every cut exceed the demand across the cut by an $O(\log n)$ factor. Further, this approximate max-flow min-cut theorem is tight; Leighton and Rao show an example where the ratio between the minimum cut and the maximum flow is $\Theta(\log n)$.

In proving this approximate max-flow min-cut theorem Leighton and Rao also obtain an $O(\log n)$-approximation algorithm for the sparsest cut in the graph. This algorithm is a key subroutine in algorithms for approximating a variety of NP-hard graph problems; these include finding small-area VLSI layouts [5], graph chordalization and register sufficiency [30].


Klein, Agrawal, Ravi and Rao [30] were the first to prove an approximate max-flow min-cut theorem for general multicommodity flow. They proved that

$$\frac{\alpha}{O(\log C \log D)} \le f \le \alpha,$$

where $C$ is the sum of the capacities of all edges and $D$ is the sum of all demands. The min-cut max-flow ratio was later improved to $O(\log n \log D)$ by Tragoudas [57]. In [17] we show that this ratio is $O(\log k \log D)$, where $k$ is the number of commodities. Subsequently, Plotkin and Tardos [47] gave a method for scaling demands that allows them to prove that, up to a small constant factor, the worst min-cut max-flow ratio is obtained for instances with integer demands where $D$, the sum of the demands, is bounded by a small degree polynomial in $k$. Thus they replace the $\log D$ factor in the above bounds for the min-cut max-flow ratio by $\log k$. In this chapter we provide a different proof for this tighter bound on the min-cut max-flow ratio.

Theorem 6.1.1 ([30, 57, 17, 47])

$$\frac{\alpha}{O(\log^2 k)} \le f \le \alpha$$

Thus, for a multicommodity flow to be feasible it is sufficient to have a safety margin of $O(\log^2 k)$, where $k$ is the number of commodities.

The proof of this theorem also yields a polynomial time algorithm for approximating the sparsest cut within a factor of $O(\log^2 k)$. Computation of such cuts is a basic step for a variety of approximation algorithms for NP-hard optimization problems. Kahale [26] showed that the problem of approximating the sparsest cut by the maximum throughput can be reduced to the problem of approximating the minimum multicut by the maximum multicommodity flow. Our proof of Theorem 6.1.1 has a similar flavor to Kahale's reduction. We also use a scaling technique from [1] to arrive at the claimed bound.

6.2 Preliminaries

An instance of the multicommodity flow problem now consists of an undirected graph $G = (V, E)$, a capacity function, $c : E \to \mathbb{R}^+$, on the edges and $k$ commodities, numbered 1 through $k$, where for commodity $i$, besides the source and sink pair, $s_i, t_i$, for that commodity we are also given a non-negative demand, $dem(i)$.

We follow the terminology of Chapter 5 and define the cut associated with a set $S$, denoted by $\nabla(S)$, as the set of edges with exactly one end point in $S$. We denote by $C_{E'}$ the total capacity of the edges in the set $E'$; thus $C_{E'} = \sum_{e \in E'} c_e$. Let $D_{E'}$ denote the sum of the demands of the commodities whose source and sink vertices are disconnected by deleting the edges in $E'$. Thus $C_{\nabla(S)}, D_{\nabla(S)}$ denote the capacity and the demand across the cut $\nabla(S)$. Using this notation the cut condition can be formulated as

Cut condition: $\forall S \subseteq V$, $C_{\nabla(S)} \ge D_{\nabla(S)}$

The sparsest cut is the cut that minimizes the ratio $\frac{C_{\nabla(S)}}{D_{\nabla(S)}}$ and the sparsest cut ratio, $\alpha$, is given by

$$\alpha = \min_{S \subseteq V} \frac{C_{\nabla(S)}}{D_{\nabla(S)}}$$
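For very small networks the sparsest cut ratio can be computed by enumerating all vertex subsets. The sketch below does this for the $K_{2,3}$ instance mentioned in the introduction (unit capacities, unit demand between every non-adjacent pair). The function name and data layout are assumptions introduced here, the enumeration is exponential in the number of vertices, and integer capacities and demands are assumed so that exact fractions can be used.

```python
from fractions import Fraction
from itertools import combinations

def sparsest_cut(vertices, capacity, demand):
    """Return (ratio, S) minimizing C(nabla(S)) / D(nabla(S)) over cuts with positive demand."""
    vs = list(vertices)
    best = None
    for size in range(1, len(vs)):
        for subset in combinations(vs, size):
            side = set(subset)
            cap = sum(c for (u, v), c in capacity.items() if (u in side) != (v in side))
            dem = sum(x for (s, t), x in demand.items() if (s in side) != (t in side))
            if dem > 0:
                ratio = Fraction(cap, dem)
                if best is None or ratio < best[0]:
                    best = (ratio, side)
    return best

# K_{2,3} with parts {a, b} and {x, y, z}; unit capacities on the six edges.
vertices = ["a", "b", "x", "y", "z"]
capacity = {(u, v): 1 for u in "ab" for v in "xyz"}
# Unit demand between every pair of non-adjacent vertices.
demand = {("a", "b"): 1, ("x", "y"): 1, ("x", "z"): 1, ("y", "z"): 1}

ratio, side = sparsest_cut(vertices, capacity, demand)
print(ratio, sorted(side))
```

A ratio of 1 confirms that this instance satisfies the cut condition, even though (as noted in Section 6.1) the demands cannot all be routed.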

Let $\mathcal{P}_i$ denote the collection of paths from $s_i$ to $t_i$ in $G$ and let $f_i(P)$ denote the flow of commodity $i$ along a path $P$. The amount of flow of commodity $i$ through an edge $e$ is given by $\sum_{P : e \in P} f_i(P)$ and the sum of the flows of all commodities through $e$ is $\sum_{i=1}^{k} \sum_{P : e \in P} f_i(P)$. The constraint on flow imposed by the capacity of edge $e$ is given by

$$\sum_{i=1}^{k} \sum_{P : e \in P} f_i(P) \le c_e$$

The total flow of commodity $i$ is equal to $\sum_{P \in \mathcal{P}_i} f_i(P)$. The requirement that an $f$ fraction of the demand of commodity $i$ be routed is reflected in the inequality

$$\sum_{P \in \mathcal{P}_i} f_i(P) \ge f \cdot dem(i)$$

We also require that the flow variables $f_i(P)$ be non-negative and the objective is to maximize the throughput $f$, subject to these constraints.

A multicommodity flow has capacity utilization, $u$, if it is feasible in the network with capacities given by $u \cdot c_e$. The problem of maximizing the throughput is equivalent to finding a flow of minimum capacity utilization. At optimality $f = \frac{1}{u}$. The multicommodity flow is feasible if $u \le 1$ ($f \ge 1$). The linear program for the maximum concurrent flow problem can be rewritten into a linear program for minimizing the capacity utilization.

$$\begin{array}{lll}
\text{minimize} & u & \\
\text{subject to} & \sum_{i=1}^{k} \sum_{P : e \in P} f_i(P) \le u \cdot c_e & e \in E \\
 & \sum_{P \in \mathcal{P}_i} f_i(P) \ge dem(i) & 1 \le i \le k \\
 & f_i(P) \ge 0 & 1 \le i \le k,\; P \in \mathcal{P}_i
\end{array}$$

The dual of this linear program calls for an assignment of non-negative distance labels, $d_e$, to edges $e \in E$, subject to the constraint that $\sum_{e \in E} d_e c_e \le 1$; the objective is to maximize $\sum_{i=1}^{k} dist_d(s_i, t_i) \cdot dem(i)$, where $dist_d(u, v)$ is the length of the shortest path between vertices $u$ and $v$ under this assignment of distance labels. At optimality, we have

$$\sum_{i=1}^{k} dist_d(s_i, t_i) \cdot dem(i) = u = \frac{1}{f}$$

An upper bound on $\sum_{i=1}^{k} dist_d(s_i, t_i) \cdot dem(i)$ for legal distance label assignments $d$ (those satisfying $\sum_{e \in E} d_e c_e \le 1$ and $d_e \ge 0$) translates to a lower bound on the value of the maximum throughput $f$.

Let us try to give a meaning to this dual program. Let $d$ be a 0/1 assignment of distance labels to the edges and let $E'$ be the set of edges with $d_e = 1$. Then $\sum_{e \in E} d_e c_e$ is the total capacity of the edges in $E'$ and $dist_d(s_i, t_i)$ is the minimum number of edges of $E'$ in any path from $s_i$ to $t_i$. Thus if we route a unit flow of commodity $i$ we shall be exhausting at least $dist_d(s_i, t_i)$ units of capacity of the edge set $E'$. Hence, for the multicommodity flow to be feasible the edges in $E'$ should have a total capacity of at least $\sum_{i=1}^{k} dist_d(s_i, t_i) \cdot dem(i)$, i.e.

$$\sum_{i=1}^{k} dist_d(s_i, t_i) \cdot dem(i) \le \sum_{e \in E} d_e c_e$$

Alternatively, the capacity utilization of any flow is at least

$$\frac{\sum_{i=1}^{k} dist_d(s_i, t_i) \cdot dem(i)}{\sum_{e \in E} d_e c_e}$$

We can now dispense with the assumption that the distance function, $d$, is 0/1. Note that we can make the distance function integral by scaling it with a sufficiently large number. An edge $e$ with an integral distance label, $d_e > 0$, can be viewed as a path of length $d_e$, each edge along which has capacity $c_e$ and unit distance. A feasible multicommodity flow in the original graph is also feasible in this graph obtained by replacing edges by paths and vice-versa. Hence our earlier statement about the capacity utilization of any flow continues to hold for all non-negative distance functions and hence

$$u \ge \sum_{i=1}^{k} dist_d(s_i, t_i) \cdot dem(i) \qquad \forall d : d_e \ge 0,\; \sum_{e \in E} d_e c_e \le 1$$

which is the statement of the weak duality theorem of linear programming for this setting. The strong duality theorem says that at optimality

$$\frac{1}{f} = u = \sum_{i=1}^{k} dist_d(s_i, t_i) \cdot dem(i) \qquad \exists d : d_e \ge 0,\; \sum_{e \in E} d_e c_e \le 1$$
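Weak duality gives a quick certificate that an instance is infeasible: any legal distance labelling yields a lower bound on the capacity utilization. The sketch below evaluates that bound for $K_{2,3}$ with unit capacities, unit demands between non-adjacent vertices and the uniform labelling $d_e = 1/6$. The function names and the instance encoding are assumptions made here for illustration; shortest paths are computed with Floyd-Warshall, which is adequate for a five-vertex graph.

```python
from fractions import Fraction

def all_pairs_dist(vertices, length):
    """Floyd-Warshall on an undirected graph; `length` maps an edge (u, v) to d_e."""
    inf = float("inf")
    dist = {(u, v): (0 if u == v else inf) for u in vertices for v in vertices}
    for (u, v), d in length.items():
        dist[u, v] = min(dist[u, v], d)
        dist[v, u] = min(dist[v, u], d)
    for m in vertices:
        for u in vertices:
            for v in vertices:
                if dist[u, m] + dist[m, v] < dist[u, v]:
                    dist[u, v] = dist[u, m] + dist[m, v]
    return dist

def utilization_lower_bound(vertices, capacity, demand, length):
    """Return sum_i dist_d(s_i, t_i) * dem(i) after checking that d is a legal labelling."""
    assert all(d >= 0 for d in length.values())
    assert sum(length[e] * capacity[e] for e in capacity) <= 1
    dist = all_pairs_dist(vertices, length)
    return sum(dist[s, t] * x for (s, t), x in demand.items())

vertices = ["a", "b", "x", "y", "z"]
capacity = {(u, v): 1 for u in "ab" for v in "xyz"}
demand = {("a", "b"): 1, ("x", "y"): 1, ("x", "z"): 1, ("y", "z"): 1}
length = {e: Fraction(1, 6) for e in capacity}      # sum of d_e c_e is exactly 1

bound = utilization_lower_bound(vertices, capacity, demand, length)
print(bound, 1 / bound)
```

Every demand pair is two edges apart, so the bound is $u \ge 4/3$ and the throughput satisfies $f \le 3/4$: the demands cannot all be routed even though every cut has enough capacity.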

In the next section we prove an $O(\log^2 k) / \alpha$ upper bound on $\sum_{i=1}^{k} dist_d(s_i, t_i) \cdot dem(i)$; this implies an $\alpha / O(\log^2 k)$ lower bound on the maximum throughput $f$. We shall need the following lemma for proving this.

Lemma 6.2.1 The ratio of the capacity of a multicut to the demand across it is at least $\alpha$.

Proof: Let $M$ be a multicut and $U_1, U_2, \ldots, U_l$ be the partition of the vertex set corresponding to $M$. Since each edge of $M$ runs across two sets of the partition, the sum of the capacities of the cuts corresponding to these sets is equal to twice the capacity of the multicut, i.e. $\sum_{i=1}^{l} C_{\nabla(U_i)} = 2 C_M$. Since $\alpha$ is the sparsest cut ratio, $\frac{C_{\nabla(U_i)}}{D_{\nabla(U_i)}} \ge \alpha$ and hence $D_{\nabla(U_i)} \le \frac{C_{\nabla(U_i)}}{\alpha}$. Therefore,

$$\sum_{i=1}^{l} D_{\nabla(U_i)} \le \sum_{i=1}^{l} \frac{C_{\nabla(U_i)}}{\alpha} = \frac{2 C_M}{\alpha}$$

Since each commodity whose source and sink lie in different sets of the partition has its demand counted twice in the sum $\sum_{i=1}^{l} D_{\nabla(U_i)}$, we have

$$D_M = \frac{\sum_{i=1}^{l} D_{\nabla(U_i)}}{2} \le \frac{C_M}{\alpha}$$

6.3 Min-Cut Max-Flow ratio

Let $d : E \to \mathbb{R}^+$ be an assignment of distance labels to the edges such that $\sum_{e \in E} d_e c_e \le 1$. Let $p$ be a function such that $p(i)$ is the length of the shortest path between the source and the sink of commodity $i$ under the distance assignment $d$, i.e. $p(i) = dist_d(s_i, t_i)$. Let $\delta_1 < \delta_2 < \ldots < \delta_l$ be the distinct values that the function $p$ takes; thus $p : \{1, \ldots, k\} \to \{\delta_1, \delta_2, \ldots, \delta_l\}$. We now need to place an upper bound on the value of $\sum_{i=1}^{k} p(i) \cdot dem(i)$.

6.3.1 Uniform Path Lengths

If the distance label assignment is such that the lengths of the shortest paths between the source and the corresponding sink of all commodities are equal, then the following lemma gives an upper bound on $\sum_{i=1}^{k} p(i) \cdot \mathrm{dem}(i)$.

Lemma 6.3.1 If $p(1) = p(2) = \ldots = p(k)$ then $\sum_{i=1}^{k} p(i) \cdot \mathrm{dem}(i) \le \frac{4 \ln(k+1)}{\alpha}$.

Proof: Let $p(1) = p(2) = \ldots = p(k) = \lambda$. Our proof of this lemma is almost identical to the proof of Theorem 5.5.1. Here too we find a multicut, $M$, that separates each source-sink pair. The procedure for finding the multicut is similar to procedure multicut. The only difference is in the choice of $\varepsilon$ for growing the regions. Since the source and sink vertices corresponding to a commodity are a distance $\lambda$ apart, we can grow regions of diameter at most $\lambda$. By Lemma 5.3.1, choosing $\varepsilon = \frac{2 \ln(k+1)}{\lambda}$ and $W_r = \frac{\sum_{e \in E} d_e c_e}{k}$ ensures that the radius of any region is at most $\frac{\lambda}{2}$ and hence the diameter is at most $\lambda$.

The capacity of the multicut, $M$, is bounded by (Lemma 5.3.2)

$$C_M \le \varepsilon \left( \sum_{e \in E} \tilde{d}_e c_e + \sum_i W_r \right) \qquad (6.1)$$

Since we would only be picking the source vertices as the roots for growing regions (procedure multicut), the number of regions grown is at most $k$, the number of commodities. Thus,

$$\sum_i W_r \le k \cdot W_r = \sum_{e \in E} d_e c_e$$

Also, for all edges $e$, $\tilde{d}_e \le d_e$ and so

$$\sum_{e \in E} \tilde{d}_e c_e \le \sum_{e \in E} d_e c_e$$

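Substituting these bounds on $\sum_{e \in E} \tilde{d}_e c_e$ and $\sum_i W_r$ into equation 6.1 we have

$$C_M \le \varepsilon \left( \sum_{e \in E} \tilde{d}_e c_e + \sum_i W_r \right) \le 2 \varepsilon \sum_{e \in E} d_e c_e \le 2 \varepsilon = \frac{4 \ln(k+1)}{\lambda}$$

where the third inequality uses the fact that $\sum_{e \in E} d_e c_e \le 1$. Since this multicut separates each source-sink pair, the total demand across the multicut, $D_M$, is equal to $\sum_{i=1}^{k} \mathrm{dem}(i)$. Since the ratio of the capacity of a multicut to the demand across it is at least $\alpha$ (Lemma 6.2.1) we have

$$\sum_{i=1}^{k} \mathrm{dem}(i) = D_M \le \frac{C_M}{\alpha} \le \frac{4 \ln(k+1)}{\lambda \alpha}$$

and so

$$\sum_{i=1}^{k} p(i) \cdot \mathrm{dem}(i) = \lambda \sum_{i=1}^{k} \mathrm{dem}(i) \le \frac{4 \ln(k+1)}{\alpha}$$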

6.3.2 Well-spaced Functions

Definition 6.3.1 A function $f : \{1 \ldots k\} \to \{\lambda_1, \lambda_2, \ldots, \lambda_l\}$ is said to be well-spaced if $\lambda_{i+1} \ge k \lambda_i$, $1 \le i \le l - 1$.

Thus if the function $p$ is well-spaced, then for any two commodities for which the lengths of the shortest paths between the source and sink are different, the length of one is at least $k$ times the length of the other. The usefulness of a well-spaced function is evident from the following lemma.
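As a quick illustration of the definition, the check below tests whether a list of path lengths, one per commodity, is well-spaced. The function name `is_well_spaced` and the sample values are hypothetical; the number of commodities $k$ is taken to be the length of the list.

```python
def is_well_spaced(path_lengths):
    """True if consecutive distinct values grow by a factor of at least k,
    where k = len(path_lengths) is the number of commodities."""
    k = len(path_lengths)
    distinct = sorted(set(path_lengths))
    return all(nxt >= k * cur for cur, nxt in zip(distinct, distinct[1:]))

# Hypothetical path lengths p(1), p(2), p(3) for k = 3 commodities.
spaced = is_well_spaced([2, 6, 18])      # 6 >= 3*2 and 18 >= 3*6
not_spaced = is_well_spaced([2, 5, 18])  # 5 < 3*2
```

A function taking a single value, as in Lemma 6.3.1, is trivially well-spaced under this check.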

Lemma 6.3.2 If $p : \{1 \ldots k\} \to \{\lambda_1, \lambda_2, \ldots, \lambda_l\}$ is well-spaced then $\sum_{i=1}^{k} p(i) \cdot \mathrm{dem}(i) \le \frac{8 \ln(k+1)}{\alpha}$.

Proof: We build the sum $\sum_{i=1}^{k} p(i) \cdot \mathrm{dem}(i)$ in phases. In the $j$th phase we consider only such commodities for which the length of the shortest path between the source and sink is $\lambda_j$. Let $D_j$ be the total demand of this set of commodities. As in the proof of Lemma 6.3.1 we find a multicut that separates the source and sink vertices of all these commodities. However, we now grow regions of diameter at most $\frac{\lambda_j}{2}$; choosing $W_r$ as $\frac{\sum_{e \in E} d_e c_e}{k}$ and $\varepsilon$ as $\frac{4 \ln(k+1)}{\lambda_j}$ would ensure this (Lemma 5.3.1).

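Let $M$ be the multicut found. From Lemma 5.3.2 we have that the capacity of this multicut, $C_M$, is at most

$$C_M \le \varepsilon \left( \sum_{e \in E} \tilde{d}_e c_e + \sum_i W_r \right)$$

where $\tilde{d}_e$ is the extent to which edge $e$ is included in the regions grown in this phase and $\sum_i W_r$ is the sum of the weights of the roots of the regions grown in this phase. Since this multicut separates each source-sink pair under consideration, the total demand across the multicut, $D_M$, is equal to $D_j$. By Lemma 6.2.1, $D_M \le \frac{C_M}{\alpha}$, and hence

$$D_j = D_M \le \frac{C_M}{\alpha} \le \frac{\varepsilon}{\alpha} \left( \sum_{e \in E} \tilde{d}_e c_e + \sum_i W_r \right) = \frac{4 \ln(k+1)}{\alpha \lambda_j} \left( \sum_{e \in E} \tilde{d}_e c_e + \sum_i W_r \right)$$

and therefore

$$\lambda_j D_j \le \frac{4 \ln(k+1)}{\alpha} \left( \sum_{e \in E} \tilde{d}_e c_e + \sum_i W_r \right)$$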

At the end of this $j$th phase we reduce the distance label of the edge $e$ by $\tilde{d}_e$ to obtain a new distance label assignment. Since we have reduced distances on the edges, the length of the shortest path between the source and sink vertices of commodities that have not been considered in phases 1 through $j$ might be reduced. However, since we could have grown at most $k$ regions, each of diameter no more than $\frac{\lambda_j}{2}$, the maximum possible reduction in the length of any shortest path is $\frac{k \lambda_j}{2}$. Since the length of the shortest path between the source and sink vertices of commodities not considered so far is at least $\lambda_{j+1} \ge k \lambda_j$, the length of the shortest path under this new distance label assignment is at least $\lambda_{j+1} - \frac{k \lambda_j}{2} \ge \frac{\lambda_{j+1}}{2}$. Hence the source and sink vertices corresponding to these commodities are still sufficiently far apart (at least $\frac{\lambda_{j+1}}{2}$). Therefore, while finding a multicut in the $(j+1)$th phase we can work with regions of diameter $\frac{\lambda_{j+1}}{2}$ without including both the source and sink of some commodity in the same region.
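The inequality used in this step follows directly from well-spacedness. Spelled out, with $\lambda_j$ denoting the $j$th distinct path length (the symbol is a notational assumption of this sketch):

```latex
\lambda_{j+1} \ge k\lambda_j
\;\Longrightarrow\;
\frac{k\lambda_j}{2} \le \frac{\lambda_{j+1}}{2}
\;\Longrightarrow\;
\lambda_{j+1} - \frac{k\lambda_j}{2}
\;\ge\; \lambda_{j+1} - \frac{\lambda_{j+1}}{2}
\;=\; \frac{\lambda_{j+1}}{2}.
```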

Since we always pick the source vertex of some commodity as the root for growing a region, the total number of regions grown in all the $l$ phases is at most the number of commodities. Thus,

$$\sum_{j=1}^{l} \sum_i W_r \le k \cdot W_r = \sum_{e \in E} d_e c_e$$

Further, since at the end of each phase we decrease the distance label on an edge by the extent to which it is included in the regions grown in that phase, we have

$$\sum_{j=1}^{l} \tilde{d}_e \le d_e$$

and so

$$\sum_{j=1}^{l} \sum_{e \in E} \tilde{d}_e c_e = \sum_{e \in E} c_e \left( \sum_{j=1}^{l} \tilde{d}_e \right) \le \sum_{e \in E} d_e c_e$$

We can now bound the quantity of interest, viz. $\sum_{i=1}^{k} p(i) \cdot \mathrm{dem}(i)$, as follows

$$\sum_{i=1}^{k} p(i) \cdot \mathrm{dem}(i) = \sum_{j=1}^{l} \lambda_j D_j \le \frac{4 \ln(k+1)}{\alpha} \sum_{j=1}^{l} \left( \sum_{e \in E} \tilde{d}_e c_e + \sum_i W_r \right) \le \frac{8 \ln(k+1)}{\alpha} \sum_{e \in E} d_e c_e \le \frac{8 \ln(k+1)}{\alpha}$$

where the last inequality uses the fact that $\sum_{e \in E} d_e c_e \le 1$.

Note that this lemma continues to hold even when $p(i)$ is not really the length of the shortest path between the source and sink of commodity $i$, but some number less than that, i.e. $p(i) \le \mathrm{dist}_d(s_i, t_i)$. We use this observation critically in dealing with functions that are not well-spaced.

6.3.3 Ill-spaced Functions

What do we do when the function $p : \{1 \ldots k\} \to \{\lambda_1, \lambda_2, \ldots, \lambda_l\}$ is not well-spaced? We find some well-spaced functions, $p_1, p_2, \ldots, p_i, \ldots$ such that the length of the shortest path between the source and the sink of a commodity is at least the value that any function $p_i$ takes for that commodity, i.e. $p_i(j) \le p(j)$. We now apply Lemma 6.3.2 to each of these well-spaced length functions to obtain bounds on $\sum_{j=1}^{k} p_i(j) \cdot \mathrm{dem}(j)$ and use these to derive the claimed bound on $\sum_{j=1}^{k} p(j) \cdot \mathrm{dem}(j)$.

Theorem 6.3.3 Let $f$ be a function over the natural numbers and let $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_l$ be the distinct numbers in the range of $f$, i.e. $f : \mathbb{N} \to \{\lambda_1, \lambda_2, \ldots, \lambda_l\}$. Given $k$, we can split $f$ into $k - 1$ well-spaced functions $f_1, f_2, \ldots, f_{k-1}$ such that for all $j$

1. $f_1(j) \ge f_2(j) \ge \ldots \ge f_{k-1}(j) \ge 0$.

2. $f(j) \le \sum_{i=1}^{k-1} f_i(j) \le \frac{k}{k-1} \cdot f(j)$

Corollary 6.3.4 For all $j$, $f_i(j) \le \frac{k}{k-1} \cdot \frac{1}{i} \cdot f(j)$

We use Theorem 6.3.3 to split the function $p$ into well-spaced functions $f_1, f_2, \ldots, f_{k-1}$.¹ Define the $i$th length function $p_i$ to be

$$p_i(j) = i \cdot \frac{k-1}{k} \cdot f_i(j)$$

Since $i \cdot \frac{k-1}{k}$ is a constant and $f_i$ is well-spaced, $p_i$ is also well-spaced. Further, from Corollary 6.3.4 we have that for all $i$, $p_i(j) \le p(j)$, and hence we can apply Lemma 6.3.2 to these functions. Therefore,

$$\sum_{j=1}^{k} p_i(j) \cdot \mathrm{dem}(j) \le \frac{8 \ln(k+1)}{\alpha}$$

which, by our definition of $p_i(j)$, implies

$$\sum_{j=1}^{k} f_i(j) \cdot \mathrm{dem}(j) \le 8 \cdot \frac{k}{k-1} \cdot \frac{\ln(k+1)}{i \, \alpha} \qquad (6.2)$$
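Two steps are used implicitly here: how Corollary 6.3.4 follows from Theorem 6.3.3, and why the rescaled function $p_i$ never exceeds $p$. A sketch of both, using only the two properties stated in the theorem and taking $f = p$; the summation index $m$ is introduced for this sketch only:

```latex
% Corollary 6.3.4: by property 1, f_1(j) >= ... >= f_i(j) and all f_m(j) >= 0, so
i \cdot f_i(j) \;\le\; \sum_{m=1}^{i} f_m(j) \;\le\; \sum_{m=1}^{k-1} f_m(j)
\;\le\; \frac{k}{k-1}\, f(j)
\quad\Longrightarrow\quad
f_i(j) \;\le\; \frac{k}{k-1} \cdot \frac{1}{i} \cdot f(j).

% Rescaling: with p_i(j) = i \cdot \frac{k-1}{k} \cdot f_i(j) and f = p,
p_i(j) \;\le\; i \cdot \frac{k-1}{k} \cdot \frac{k}{k-1} \cdot \frac{1}{i} \cdot f(j)
\;=\; f(j) \;=\; p(j).
```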

¹ For Theorem 6.3.3 we require that the path-lengths as given by the function $p$ be natural numbers. This can be ensured by scaling all distance labels, $d_e$, by a suitably large number.

Summing equation 6.2 for functions $f_1, f_2, \ldots, f_{k-1}$ we have

$$\sum_{i=1}^{k-1} \sum_{j=1}^{k} f_i(j) \cdot \mathrm{dem}(j) \le 8 \cdot \frac{k}{k-1} \cdot \frac{\ln(k+1)}{\alpha} \sum_{i=1}^{k-1} \frac{1}{i} \qquad (6.3)$$

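Interchanging the order of the summation on the left hand side we have

$$\sum_{j=1}^{k} \sum_{i=1}^{k-1} f_i(j) \cdot \mathrm{dem}(j) = \sum_{j=1}^{k} \mathrm{dem}(j) \left( \sum_{i=1}^{k-1} f_i(j) \right) \ge \sum_{j=1}^{k} \mathrm{dem}(j) \cdot f(j) = \sum_{j=1}^{k} \mathrm{dem}(j) \cdot p(j) \qquad (6.4)$$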

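Combining equations 6.3 and 6.4 we have

$$\sum_{j=1}^{k} p(j) \cdot \mathrm{dem}(j) \le 8 \cdot \frac{k}{k-1} \cdot \frac{\ln(k+1)}{\alpha} \cdot H(k-1) = O\!\left( \frac{\log^2 k}{\alpha} \right)$$

where $H(k-1) = 1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{k-1}$ is the $(k-1)$th harmonic sum and is bounded by $\ln(k-1) + 1$.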
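The harmonic-sum bound and the size of the resulting constant can be checked numerically. The sketch below evaluates $8 \cdot \frac{k}{k-1} \cdot \ln(k+1) \cdot H(k-1)$, i.e. the final bound with the sparsest cut ratio factored out. The function names are hypothetical and the code is only a sanity check, not part of the proof.

```python
import math

def harmonic(n):
    """H(n) = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / i for i in range(1, n + 1))

def bound_constant(k):
    """8 * k/(k-1) * ln(k+1) * H(k-1), defined for k >= 2 commodities."""
    return 8.0 * k / (k - 1) * math.log(k + 1) * harmonic(k - 1)

# Check H(k-1) <= ln(k-1) + 1 over a range of k (small tolerance for rounding).
harmonic_ok = all(
    harmonic(k - 1) <= math.log(k - 1) + 1 + 1e-12 for k in range(2, 500)
)
```

Since both $\ln(k+1)$ and $H(k-1)$ grow logarithmically in $k$ and $\frac{k}{k-1} \le 2$, `bound_constant(k)` grows as $O(\log^2 k)$.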

6.4 Splitting into Well-spaced Functions

In this section we prove Theorem 6.3.3. The process of splitting $f$ into $f_1, f_2, \ldots, f_{k-1}$ is done in two stages. We first make $f$ `monotone' and then `split' this monotone function.

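6.4.1 Making a Function Monotone

Definition 6.4.1 Let $\lambda = \sum_i a_i k^i$ be a natural number. It is well known that the coefficients $a_i$, with $0 \le a_i \le k - 1$, are unique. $\lambda$ is monotone if $a_0 \ge a_1 \ge a_2 \ge \ldots \ge a_i \ge \ldots$. The function $f : \mathbb{N} \to \{\lambda_1, \lambda_2, \ldots, \lambda_l\}$ is monotone if each $\lambda_i$ is monotone.

Lemma 6.4.1 Given $\lambda$ we can construct another number $\hat{\lambda}$ which is monotone and $\lambda \le \hat{\lambda} \le \frac{k}{k-1} \lambda$.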
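One way to realise such a construction is to repeatedly find the least position whose base-$k$ digit is smaller than the next higher digit and raise all lower-order digits up to that higher digit. The sketch below implements this under one reading of "raise": digits that are already at least as large are left unchanged. That reading, the function names and the sample value are assumptions made for illustration.

```python
def digits_base_k(n, k):
    """Base-k digits of n, least significant first: a_0, a_1, ..."""
    digits = []
    while n > 0:
        n, r = divmod(n, k)
        digits.append(r)
    return digits or [0]

def is_monotone(n, k):
    """True if the base-k digits satisfy a_0 >= a_1 >= a_2 >= ..."""
    a = digits_base_k(n, k)
    return all(a[i] >= a[i + 1] for i in range(len(a) - 1))

def make_monotone(n, k):
    """Raise low-order digits until no digit is smaller than the next higher one."""
    a = digits_base_k(n, k)
    while True:
        # least i with a_i < a_{i+1}
        i = next((idx for idx in range(len(a) - 1) if a[idx] < a[idx + 1]), None)
        if i is None:
            break
        for t in range(i + 1):
            # assumption: digits already >= a_{i+1} are not lowered
            a[t] = max(a[t], a[i + 1])
    return sum(d * k**pos for pos, d in enumerate(a))

# Hypothetical example in base 10: digits of 315 are a_0=5, a_1=1, a_2=3.
example = make_monotone(315, 10)  # digits become 5, 3, 3, i.e. 335
```

For this example $315 \le 335 \le \frac{10}{9} \cdot 315 = 350$, in line with the bound stated in the lemma.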

Proof: Let $\lambda = \sum_i a_i k^i$. Find the least $i$ such that $a_i < a_{i+1}$ and increase all of $a_0, a_1, \ldots, a_i$ to $a_{i+1}$. Continue in this manner till there is no $i$ such that $a_i < a_{i+1}$. Let $\hat{a}_i$ denote the increased value of $a_i$ and let $\hat{\lambda} = \sum_i \hat{a}_i k^i$. Clearly, $\hat{\lambda} \ge \lambda$. The coefficients $\hat{a}_0, \hat{a}_1, \hat{a}_2, \ldots, \hat{a}_i, \ldots$ form a staircase (Figure 6.1). If

$$\hat{a}_0 = \ldots = \hat{a}_{j_1} > \hat{a}_{j_1 + 1} = \ldots = \hat{a}_{j_2} > \hat{a}_{j_2 + 1} = \ldots = \hat{a}_{j_3} > \ldots$$

then it follows from the above procedure that $\hat{a}_{j_1} = a_{j_1}$, $\hat{a}_{j_2} = a_{j_2}$, $\hat{a}_{j_3} = a_{j_3}$, $\ldots$ Therefore,

$$\hat{\lambda} = \sum_i \cdots = \sum_{i=0}^{j_1} \cdots = a_{j_1} \cdots$$


