Constant factor Approximation Algorithms for Uniform Hard ...

arXiv:1606.08022v2 [cs.DS] 11 Oct 2016

Constant factor Approximation Algorithms for Uniform Hard Capacitated Facility Location Problems: Standard LP is not too bad

October 12, 2016

Sapna Grover (1), Neelima Gupta (2), Aditya Pancholi (3)
1. Shivaji College, University of Delhi, India. [email protected]
2. University of Delhi, India. [email protected]
3. University of Delhi, India. [email protected]

Abstract

In this paper, we study the uniform hard capacitated knapsack median problem, a common generalization of the k-median problem and the facility location problem. The natural LP of the problem is known to have an unbounded integrality gap. We give the first constant factor approximation for the capacitated knapsack median problem, violating the budget only by an additive $f_{max}$, where $f_{max}$ is the maximum cost of a facility opened by the optimal solution, and the capacities by a factor of 5. To the best of our knowledge, no constant factor approximation is known for the problem. We also give a constant factor ($O(1/\epsilon)$) approximation for the capacitated facility location problem with uniform capacities by rounding the solution of the natural LP, violating the capacities a little, by a factor of $(1 + \epsilon)$. The LP is known to have an unbounded integrality gap without violating the capacities, even when the capacities are uniform. Thus, we achieve the best possible from the natural LP for the problem. The result shows that the standard LP is not too bad for the capacitated facility location problem.

1 Introduction

The facility location problem (FLP) is one of the most widely studied problems in computer science and operational research. The problem is well known to be NP-hard. Li [20] gave a 1.488 factor algorithm, almost closing the gap between the best known ratio and the best possible ratio of 1.463 shown by Guha et al. [14] for the metric uncapacitated case. The best known approximation ratio of $2.675 + \epsilon$ for (uncapacitated) k-median was given by Byrka et al. [7].

On the other hand, for the capacitated version of the problem (CapFLP), the standard LP is known to have an unbounded integrality gap, even when the capacities are uniform, for the most basic variant of the problem, namely, the metric facility location problem.

Integrality gap example for Capacitated FLP: Consider two facilities with $f_1 = 0$ and $f_2 = 1$, each with $M$ units of capacity, and $M + 1$ clients, each with unit demand. The LP solution will set $y_1 = 1$, $y_2 = 1/M$ at a cost of $1/M$. The IP solution will set $y_1 = y_2 = 1$ and will incur a cost of 1. Note that the example breaks when the capacities are allowed to be violated a little. In particular, for a fixed $\epsilon > 0$, if we allow the capacities to be violated by a factor of $(1 + \epsilon)$, then the gap is bounded by $O(1/\epsilon)$.

The local search technique has been found particularly useful for dealing with capacities. The approach provides a factor of 3 for uniform capacities [2] and a factor of 5 for the non-uniform case [4]. The few LP-based algorithms known for the problem are due to [3, 5, 18]. Levi, Shmoys and Swamy [18] gave a 5-factor algorithm for a restricted version of CapFLP in which all facility costs are the same. In a recent work, An, Singh and Svensson [3] gave a constant factor approximation by strengthening the natural LP. A recent result on the capacitated k-Facility Location problem by Byrka et al. [5] gives a result for CapFLP as a particular case. They give an $O(1/\epsilon^2)$ factor approximation violating the capacities by a factor of $(2 + \epsilon)$.

Capacitated k-median is much less understood. Obtaining a constant factor approximation algorithm for the problem is open. Existing solutions giving a constant factor approximation violate at least one of the two (cardinality and capacity) constraints. The natural LP is known to have an unbounded integrality gap when any one of the two constraints is allowed to be violated by a factor less than 2.
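The two-facility integrality gap example for capacitated FLP given earlier in this section can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, connection costs are assumed to be zero (as the example implicitly does), and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def lp_cost(M: int) -> Fraction:
    """Cost of the fractional solution y1 = 1, y2 = 1/M (f1 = 0, f2 = 1)."""
    return Fraction(0) * 1 + Fraction(1) * Fraction(1, M)


def integral_cost(M: int, eps: Fraction = Fraction(0)) -> int:
    """Cheapest integral opening cost when capacities may be scaled by (1 + eps).

    Facility 1 is free; facility 2 (cost 1) is needed only if the M + 1 unit
    demands do not fit into the relaxed capacity (1 + eps) * M of facility 1.
    """
    return 0 if (1 + eps) * M >= M + 1 else 1


if __name__ == "__main__":
    for M in (4, 10, 100):
        gap = integral_cost(M) / lp_cost(M)           # equals M: unbounded in M
        relaxed = integral_cost(M, Fraction(1, 10))   # eps = 1/10
        print(M, gap, relaxed)
```

With no violation the ratio grows as $M$; with violation $(1+\epsilon)$ the second facility is needed only when $M < 1/\epsilon$, which is where the $O(1/\epsilon)$ bound on the gap for this instance comes from.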
For the case of uniform capacities, several results [5, 9, 11, 15, 19] have been obtained that violate either the capacities or the cardinality by a factor of 2 or more. In the case of non-uniform capacities, a $(7 + \epsilon)$ algorithm was given by Aardal et al. [1], violating the cardinality constraint by a factor of 2, as a special case of Capacitated k-FLP when the facility costs are all zero. Li [21] broke the barrier of 2 in cardinality and gave an $\exp(O(1/\epsilon^2))$ approximation using at most $(1 + \epsilon)k$ centers (facilities) for uniform capacities. Li gave a sophisticated algorithm using a novel linear program called the rectangle LP. The result was extended to non-uniform capacities in [22] by the same author using a new LP called the configuration LP. The approximation ratio was also improved from $\exp(O(1/\epsilon^2))$ to $O(1/\epsilon^2 \log(1/\epsilon))$. Though the algorithm violates the cardinality only by $1 + \epsilon$, it introduces a softness bounded by a factor of 2. The running time of the algorithm is $n^{O(1/\epsilon)}$. Very recently, Byrka et al. [8] broke the barrier of 2 in capacities and gave an $O(1/\epsilon^2)$ approximation violating capacities by a factor of $(1 + \epsilon)$ for uniform capacities. The algorithm uses randomized rounding to round a fractional solution to the configuration LP. For non-uniform capacities, a similar result has been obtained by Demirci et al. in [13]. The paper presents an $O(1/\epsilon^5)$ approximation algorithm with capacity violation by a factor of at most $(1 + \epsilon)$. The running time of the algorithm is $n^{O(1/\epsilon)}$.

In this paper, we study the uniform capacitated variants of some facility location problems. In particular, we study the capacitated knapsack median problem and the capacitated facility location problem. Knapsack median is a common generalization of the k-median and facility location problems. For knapsack median, the natural LP is known to have an unbounded integrality gap [10] even for the uncapacitated variant of the problem. Krishnaswamy et al. [16] showed that the integrality gap holds even on adding the covering inequalities to strengthen the LP, and gave a 16 factor approximation that violates the budget constraint by an additive factor of $f_{max}$, the maximum opening cost of a facility in the optimal solution. Kumar [17] gave the first constant factor approximation without violating the budget constraint. Kumar strengthened the natural LP by obtaining a bound on the maximum distance a client can travel. Charikar and Li [12] reduced the large constant obtained by Kumar to 34, which was further improved to 32 by Swamy in [23]. Byrka et al. [6] extended the work of Swamy by applying sparsification as a pre-processing step to obtain a factor of 17.46.

To the best of our knowledge, no constant factor algorithm is known for the capacitated variant of the problem, even when capacity, budget, or both are allowed to be violated by a constant factor. In particular, we present the following result:

Theorem 1. There is a polynomial time algorithm that approximates the hard uniform capacitated knapsack median problem within a constant factor, violating the capacity by a factor of 5 and the budget only by an additive factor of $f_{max}$, where $f_{max}$ is the maximum cost of a facility in the optimal solution.

We also present a constant factor ($O(1/\epsilon)$) approximation for the capacitated facility location problem with uniform capacities, violating the capacities a little, by a factor of $(1 + \epsilon)$. Though constant factor results are known for the problem without violating the capacities [2, 3], the result is interesting as it is obtained by rounding the solution to the natural LP. In particular, we give the following result:

Theorem 2. There is a polynomial time algorithm that approximates the hard uniform capacitated facility location problem within a constant factor ($O(1/\epsilon)$), violating the capacity by a factor of at most $(1 + \epsilon)$ for a fixed $0 < \epsilon < 1/2$.

The result shows that the natural LP is not too bad for the capacitated facility location problem.

High Level Idea of our approach: We combine the ideas from [5] and [16].
First, facilities and demands are partitioned into clusters so that they can be handled independently, as is done in [5]. Each cluster has a single client $j$, called the cluster center, carrying the total demand $d_j$ of the cluster. We define the notion of sparse and dense clusters based on the total demand served by the facilities in a cluster instead of the total openings. We say that a cluster is sparse if the total demand is less than the capacity and dense otherwise.

In a sparse cluster centered at $j$, a facility is opened (if any) within a bounded distance (twice the per unit connection cost paid by $j$ in the LP optimal). If no facility is opened in the cluster of $j$, another cluster, centered at a client we denote by $\lambda(j)$, is identified within a bounded distance from $j$, and a facility opened in the cluster of $\lambda(j)$ serves the demand of $j$. $\lambda(j)$ could be the center of a sparse cluster or of a dense cluster. One potential candidate for $\lambda(j)$ is the closest other cluster center (denoted by $\eta(j)$). However, this may lead to an unbounded violation of capacity at the facility opened in the cluster of $\eta(j)$, as one cluster center may be nearest to many such $j$'s. In other words, if we construct a forest of rooted trees with $(j, \eta(j))$ edges, then a node may have an unbounded in-degree. These trees are then converted into binary trees by a standard procedure. Let $(j, \sigma(j))$ be an edge in a tree. Thus, $\sigma(j)$ becomes a potential candidate for $\lambda(j)$. If we could guarantee to open a facility in $\sigma(j)$, we could assign the demand of $j$ to the opened facility, violating the capacity by a small constant factor. However, we cannot guarantee that. We group the clusters in pairs and guarantee to open one facility in each such group. We also form a forest of groups. If $G_j(j, j_c)$ represents a group where $(j_c, j)$ is a tree edge (and hence $j = \sigma(j_c)$) and we open a facility in the cluster of $j$, we can assign the demand of both $j$ and $j_c$ to it at a loss of a factor of 2 in capacity with bounded cost. If the facility is opened in the cluster of $j_c$, we cannot assign the demand of $j$ to it, as the cost of going from $j$ to $j_c$ cannot be bounded. Since one facility is guaranteed to be opened in each group and the edge costs are monotonically decreasing as we go up the tree, the demand of $j$ can be served by a facility opened in the parent group within a bounded distance. Since the tree of cluster centers is a binary tree, this leads to a capacity violation of at most a factor of 5.

For a dense cluster, we may need to open more than one facility. Since we cannot guarantee the existence of more than one facility within this radius (twice the per unit connection cost paid by $j$ in the LP optimal), we may have to look for them outside this radius in the cluster. However, we are always able to serve the demand of a dense cluster within the cluster itself.

We define the notion of a pseudo-integral solution. A solution is said to be pseudo-integral if at most two facilities are fractionally opened. To identify which facility or facilities to open in a group and in dense clusters, we define a new LP and obtain an optimal solution to it which is pseudo-integral, using the properties of extreme point solutions. By opening these fractional facilities suitably, we obtain an integrally open solution violating the budget by an additional $f_{max}$. These ideas are similar to those in [16]. Min-cost flow is then used to obtain integral assignments.

2 Capacitated Knapsack Median Problem

The knapsack median problem is a generalization of the k-median problem wherein facilities have opening costs and there is a budget on the total cost of open facilities. We are given a set of clients $C$, a set of facilities $F$ and a real valued distance function $c(i, j)$ on $F \cup C$ in a metric space. Each client $j$ has a unit demand associated with it, each facility $i$ has an opening cost $f_i$, and we have a budget $B$. The goal is to open a set of facilities and assign clients to them so as to minimize the total connection cost such that the total facility cost of the opened facilities is at most $B$. When $B = k$ and $f_i = 1$, the problem reduces to the k-median problem. In capacitated knapsack median, each facility $i$ also has an associated capacity $u_i$ which limits the maximum amount of demand it can serve. We deal with the case when $u_i = u \ \forall i$ and denote it by unifKnM. The Integer Program (IP) for instance $(C, F, c, f, u, B)$ of unifKnM is given as follows:

Minimize $\sum_{i \in F} \sum_{j \in C} c(i, j)\, x_{ij}$

subject to

$\sum_{i \in F} x_{ij} = 1 \quad \forall j \in C$  (1)

$x_{ij} \le y_i \quad \forall i \in F,\ j \in C$  (2)

$\sum_{i \in F} f_i y_i \le B$  (3)

$\sum_{j \in C} x_{ij} \le u\, y_i \quad \forall i \in F$  (4)

$y_i, x_{ij} \in \{0, 1\}$  (5)

where $y_i = 1$ if and only if facility $i$ is open and $x_{ij} = 1$ if client $j$ is assigned to facility $i$. The first constraint makes sure that every client is served, and the second constraint ensures that a client is served only by an open facility. Constraint (3) says that the total cost of opened facilities is within the given budget, and constraint (4) ensures that the total assignment on an open facility does not exceed its capacity. The LP-relaxation of the problem is obtained by allowing the variables $y_i$ and $x_{ij}$ to be fractional. Call it LP-unifKnM.

We first define some notation which will be used throughout the paper. For an LP solution $\sigma = \langle x, y \rangle$, $j \in C$ and a subset $T$ of facilities, let

- $Cost_s(j, \sigma, T) = \sum_{i \in T} x_{ij}\, c(i, j)$ denote the service/connection cost paid by client $j$ for getting served by facilities in $T$,
- $Cost_s(\sigma, T) = \sum_{j \in C} Cost_s(j, \sigma, T)$ denote the total service cost paid by all the clients for getting served by facilities in $T$,
- $Cost_f(y, T) = \sum_{i \in T} f_i y_i$ denote the total facility opening cost of facilities in $T$,
- $Size(y, T) = \sum_{i \in T} y_i$ denote the total extent up to which facilities are opened in $T$, and
- $A_\sigma(j, T) = \sum_{i \in T} x_{ij}$ denote the total assignment of $j$ on facilities in $T$, under $\sigma$.

To begin with, we guess the facility with maximum opening cost, $f_{max}$, in the optimal solution. We remove any facility $i$ with $f_i > f_{max}$ before applying the algorithm. The algorithm is run for all possible choices for $f_{max}$ and the solution with minimum cost is selected. This can be done in polynomial time because there are only $|F|$ choices for $f_{max}$.

Our algorithm consists of three main steps. The first step consists of clustering. In the second step, we obtain the pseudo-integral solution, which is rounded in the third step to obtain an integrally open solution. Min-cost flow is finally used to obtain the integral assignments.
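The guessing of $f_{max}$ described above amounts to an outer loop over candidate facility costs. The following is a minimal sketch of that loop only; `solve` is a hypothetical callable standing in for the three-step algorithm of this section, and its signature (restricted facility costs and the guess in, a `(cost, solution)` pair or `None` out) is an assumption made for illustration.

```python
from typing import Callable, Dict, Hashable, Optional, Tuple

Facility = Hashable
Result = Tuple[float, object]  # (connection cost, solution description)


def best_over_fmax_guesses(
    costs: Dict[Facility, float],
    solve: Callable[[Dict[Facility, float], float], Optional[Result]],
) -> Optional[Result]:
    """Try every distinct opening cost as the guess for f_max; keep the cheapest result."""
    best: Optional[Result] = None
    for guess in sorted(set(costs.values())):  # at most |F| guesses
        # Discard facilities that are costlier than the guessed f_max.
        allowed = {i: f for i, f in costs.items() if f <= guess}
        result = solve(allowed, guess)  # hypothetical solver for the restricted instance
        if result is not None and (best is None or result[0] < best[0]):
            best = result
    return best
```

Since the loop body runs once per distinct opening cost, the overall running time is at most $|F|$ times that of the inner algorithm.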

2.1 Clustering

Let $\sigma^* = \langle x^*, y^* \rangle$ denote the optimal solution of LP-unifKnM. Let $\hat{C}_j$ denote the average connection cost of a client $j$ in $\sigma^*$, i.e. $\hat{C}_j = \sum_{i \in F} c(i, j)\, x^*_{ij}$. Let $ball(j)$ be the set of facilities within a distance of $2\hat{C}_j$ of $j$, i.e. $ball(j) = \{i \in F : c(i, j) \le 2\hat{C}_j\}$. Then, $Size(y^*, ball(j)) \ge \frac{1}{2}$ by the standard averaging argument. Let $R_j = 2\hat{C}_j$ denote the radius of $ball(j)$.

Consider the clients in non-decreasing order of their radii. Let $S = C$, $C' = \emptyset$. Let $j$ be the client with the smallest radius $R_j$ in $S$. Add $j$ to $C'$. Delete $j$ from $S$. For every client $j' \in S$ with $c(j, j') \le 4\hat{C}_{j'}$, remove $j'$ from $S$ and let $ctr(j') = j$. Repeat until $S$ is empty.

Lemma 1. [5] The following holds:
1. (Separation Property) For any $j \ne j' \in C'$, it holds that $c(j, j') > 4\max\{\hat{C}_j, \hat{C}_{j'}\}$.
2. For any $j' \in C \setminus C'$, there is a $j \in C'$ with $ctr(j') = j$ and $c(j, j') \le 4\hat{C}_{j'}$.

Proof.
1. Without loss of generality, assume that $j$ was added to $C'$ before $j'$; then $\hat{C}_j \le \hat{C}_{j'}$. Hence, $c(j, j') > 4\hat{C}_{j'}$, for otherwise $j'$ would have been removed from $S$ when $j$ was considered.
2. If $j' \notin C'$, then there must have been a $j \in C'$ such that $j'$ was removed when $j$ was added to $C'$. Thus, $ctr(j') = j$ and $c(j, j') \le 4\hat{C}_{j'}$.

Lemma 2. Let $j' \in C \setminus C'$ and $j \in C'$ be such that $c(j, j') < R_j$; then $R_j \le 2R_{j'}$.

Proof. Suppose, if possible, $R_j > 2R_{j'}$. Let $ctr(j') = k$ ($k$ could be $j$ itself). Then, $c(j', k) \le 2R_{j'}$. And, $c(k, j) \le c(k, j') + c(j', j) < 2R_{j'} + R_j < 2R_j$, which is a contradiction to claim 1 of Lemma 1.

For each $j \in C'$, define the cluster $N_j$ as the set of facilities to which $j$ is nearest amongst all the clients in $C'$; that is, $N_j = \{i \in F \mid \forall j' \in C' : j \ne j' \Rightarrow c(i, j) < c(i, j')\}$, assuming that the distances are distinct. $j$ is called the center of the cluster. Note that $ball(j) \subseteq N_j$ and the sets $N_j$ partition $F$.

Lemma 3. [5] Let $j \in C'$ and $i \in N_j$. Then,
1. For any client $j' \in C'$, $c(j, j') \le 2c(i, j')$.
2. For any client $j' \in C \setminus C'$, $c(j, j') \le 2c(i, j') + 2R_{j'}$.

Proof. If $j' \in C'$, then using the triangle inequality and the fact that $i$ belongs to the cluster of $j$ and not to that of $j'$, we get $c(j, j') \le c(i, j) + c(i, j') \le 2c(i, j')$. If $j' \notin C'$, there exists $k \in C'$ such that $ctr(j') = k$ and $c(j', k) \le 4\hat{C}_{j'}$. Also, $c(i, j) \le c(i, k)$ because $i \in N_j$ and not $N_k$. By the triangle inequality, $c(i, k) \le c(i, j') + c(j', k) \le c(i, j') + 4\hat{C}_{j'} = c(i, j') + 2R_{j'}$. Therefore, $c(j, j') \le c(i, j) + c(i, j') \le 2c(i, j') + 2R_{j'}$.

Let $l_i$ denote the total demand of clients in $C$ serviced by facility $i$ in $\sigma^*$, i.e. $l_i = \sum_{j' \in C} x^*_{ij'}$. Let $d_j = \sum_{i \in N_j} l_i$. Let $C_S$ be the set of clients in $C'$ for which $d_j < u$ and $C_D$ be the set of remaining clients in $C'$. We call the clusters centered at $j \in C_S$ sparse and those centered at $j \in C_D$ dense. Define $N_{C_D} = \cup_{j \in C_D} N_j$ and $N_{C_S} = \cup_{j \in C_S} N_j$.
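The greedy selection of cluster centers described in this section can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name and the input representation (a dictionary of average connection costs $\hat{C}_j$ and a distance callable) are assumptions, and for convenience a center is recorded as its own `ctr`, which the text does not define.

```python
from typing import Callable, Dict, Hashable, List, Tuple

Client = Hashable


def cluster_clients(
    avg_cost: Dict[Client, float],
    dist: Callable[[Client, Client], float],
) -> Tuple[List[Client], Dict[Client, Client]]:
    """Pick cluster centers C' greedily by non-decreasing radius R_j = 2 * avg_cost[j]."""
    remaining = sorted(avg_cost, key=lambda j: avg_cost[j])
    centers: List[Client] = []
    ctr: Dict[Client, Client] = {}
    while remaining:
        j = remaining.pop(0)          # client with the smallest radius left in S
        centers.append(j)
        ctr[j] = j                    # convention used only in this sketch
        still_in_s = []
        for jp in remaining:
            if dist(j, jp) <= 4 * avg_cost[jp]:
                ctr[jp] = j           # j' is absorbed by center j
            else:
                still_in_s.append(jp)
        remaining = still_in_s
    return centers, ctr


if __name__ == "__main__":
    pos = {"a": 0.0, "b": 1.0, "c": 10.0, "d": 10.5}
    avg = {"a": 0.5, "b": 1.0, "c": 0.2, "d": 2.0}
    centers, ctr = cluster_clients(avg, lambda p, q: abs(pos[p] - pos[q]))
    print(centers, ctr)
```

On the toy instance in the `__main__` block, the two selected centers are at distance 10, which exceeds $4\max\{\hat{C}_j, \hat{C}_{j'}\} = 2$, as the separation property of Lemma 1 requires.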

2.2 Constructing forest of cluster centers

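In this section, we construct the forest of rooted trees of the cluster centers. But before that, we present some results which will motivate us to define our trees. Lemma 4 bounds the cost of distributing the demand $d_j$ of a cluster centered at $j$ to the facilities that served $j$ in $\sigma^*$, proportionately. This, together with Lemma 5, bounds the cost of transferring a part of $d_j$ to $\eta(j)$.

Lemma 4.
$$\sum_{j \in C'} d_j \sum_{i \in F} c(i, j)\, x^*_{ij} \le 3 \sum_{j \in C} \sum_{i \in F} c(i, j)\, x^*_{ij}$$

Proof.
$$\sum_{j \in C'} d_j \sum_{i \in F} c(i, j)\, x^*_{ij} = \sum_{j \in C'} d_j \hat{C}_j \ \text{(by the definition of } \hat{C}_j\text{)} = \sum_{j \in C'} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, \hat{C}_j$$

Let $j \in C'$. If $R_j \ge c(j, j')$ then we know $j' \notin C'$. Thus, using Lemma 2, we have $R_j \le 2R_{j'}$ and
$$\begin{aligned}
\sum_{j \in C'} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, \hat{C}_j
&= \frac{1}{2} \sum_{j \in C'} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, R_j
\le \frac{1}{2} \sum_{j \in C'} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, 2R_{j'}
= \sum_{j \in C'} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, R_{j'} \\
&= 2 \sum_{j \in C'} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, \hat{C}_{j'}
= 2 \sum_{j' \in C} \Big( \sum_{j \in C'} A_{\sigma^*}(j', N_j) \Big) \hat{C}_{j'}
= 2 \sum_{j' \in C} \hat{C}_{j'}
= 2 \sum_{j \in C} \sum_{i \in F} c(i, j)\, x^*_{ij}
\end{aligned}$$

Otherwise (i.e. if $R_j < c(j, j')$), using Lemma 3 we have $c(j, j') \le 2c(i, j') + 2R_{j'}$ and
$$\begin{aligned}
\sum_{j \in C'} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, \hat{C}_j
&= \frac{1}{2} \sum_{j \in C'} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, R_j
< \frac{1}{2} \sum_{j \in C'} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, c(j, j') \\
&\le \frac{1}{2} \sum_{j \in C'} \sum_{j' \in C} \sum_{i \in N_j} x^*_{ij'} \big( 2c(i, j') + 2R_{j'} \big) \\
&= \sum_{j \in C'} \sum_{j' \in C} \sum_{i \in N_j} x^*_{ij'}\, c(i, j') + \sum_{j \in C'} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, R_{j'} \\
&= \sum_{j' \in C} \sum_{j \in C'} \sum_{i \in N_j} x^*_{ij'}\, c(i, j') + \sum_{j' \in C} R_{j'} \sum_{j \in C'} A_{\sigma^*}(j', N_j) \\
&= \sum_{j' \in C} \sum_{i \in F} c(i, j')\, x^*_{ij'} + 2 \sum_{j' \in C} \hat{C}_{j'} \\
&= \sum_{j' \in C} \sum_{i \in F} c(i, j')\, x^*_{ij'} + 2 \sum_{j' \in C} \sum_{i \in F} c(i, j')\, x^*_{ij'} \\
&= 3 \sum_{j' \in C} \sum_{i \in F} c(i, j')\, x^*_{ij'}
\end{aligned}$$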

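Hence, the claim follows.

Next, we define $\eta(j)$ formally. For each $j \in C'$, define $\eta(j)$ as follows: if $j \in C_S$, then $\eta(j) = j' \ne j$ for some $j' \in C'$ such that $c(j, j') \le c(j, j'') \ \forall j'' \in C'$, and if $j \in C_D$, then $\eta(j) = j$.

Lemma 5. For the solution $\sigma^* = \langle x^*, y^* \rangle$:
$$\sum_{j \in C_S} d_j \Big[ \sum_{i \in N_j} c(i, j)\, x^*_{ij} + c(j, \eta(j)) \Big( 1 - \sum_{i \in N_j} x^*_{ij} \Big) \Big] \le 2 \sum_{j \in C'} d_j \sum_{i \in F} c(i, j)\, x^*_{ij}$$

Proof. Let us first consider the second term of the LHS.
$$\begin{aligned}
\sum_{j \in C_S} d_j\, c(j, \eta(j)) \Big( 1 - \sum_{i \in N_j} x^*_{ij} \Big)
&= \sum_{j \in C_S} d_j \sum_{i \notin N_j} c(j, \eta(j))\, x^*_{ij} \\
&\le \sum_{j \in C_S} d_j \sum_{j' \in C' : j' \ne j} \ \sum_{i \in N_{j'}} c(j, j')\, x^*_{ij} \quad \text{(by the definition of } \eta(j)\text{)} \\
&\le \sum_{j \in C_S} d_j \sum_{j' \in C' : j' \ne j} \ \sum_{i \in N_{j'}} 2c(i, j)\, x^*_{ij} \quad \text{(by Lemma 3-(1))} \\
&= 2 \sum_{j \in C_S} d_j \sum_{i \notin N_j} c(i, j)\, x^*_{ij}
\end{aligned}$$

Thus,
$$\begin{aligned}
\sum_{j \in C_S} d_j \Big[ \sum_{i \in N_j} c(i, j)\, x^*_{ij} + c(j, \eta(j)) \Big( 1 - \sum_{i \in N_j} x^*_{ij} \Big) \Big]
&\le \sum_{j \in C_S} d_j \sum_{i \in N_j} c(i, j)\, x^*_{ij} + 2 \sum_{j \in C_S} d_j \sum_{i \notin N_j} c(i, j)\, x^*_{ij} \\
&\le 2 \sum_{j \in C_S} d_j \sum_{i \in F} c(i, j)\, x^*_{ij}
\le 2 \sum_{j \in C'} d_j \sum_{i \in F} c(i, j)\, x^*_{ij}
\end{aligned}$$

Combining Lemmas 4 and 5, we get
$$\sum_{j \in C_S} d_j \Big[ \sum_{i \in N_j} c(i, j)\, x^*_{ij} + c(j, \eta(j)) \Big( 1 - \sum_{i \in N_j} x^*_{ij} \Big) \Big] \le 6 \sum_{j \in C} \sum_{i \in F} c(i, j)\, x^*_{ij}$$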

Next, we construct a forest $\mathcal{F}$ of client centers in $C'$. The forest consists of a collection of rooted trees with client centers as nodes and $(j, \eta(j))$ as directed edges. Note that, for every $j \in C_D$, we have an outgoing edge that points to itself. That is, any $j \in C_D$ will always be at the root. However, a tree may have a root which does not belong to $C_D$ if, for some $j, j' \in C_S$, we have $\eta(j) = j'$ and $\eta(j') = j$. In this case, we have a cycle of length 2 at the root, and we arbitrarily remove one of the cycle edges.

Note that the in-degree of nodes in the forest can be unbounded. We modify the forest to bound the in-degree of nodes. Let $T$ be a tree in $\mathcal{F}$. For every node $j \in T$, sort all the children of $j$ from left to right by non-decreasing distance from $j$. Remove all incoming edges of $j$ from $T$ except the shortest one. Add a directed edge from each child of $j$ to its left sibling (if it exists). Let $T'$ denote the new tree obtained, $\mathcal{F}'$ be the resulting forest and $\sigma(j)$ be the parent of $j$ in $\mathcal{F}'$. If a node $j$ is a root, then $\sigma(j)$ is $j$ itself.

Lemma 6.
$$\sum_{j \in C_S} d_j \Big[ \sum_{i \in N_j} c(i, j)\, x^*_{ij} + c(j, \sigma(j)) \Big( 1 - \sum_{i \in N_j} x^*_{ij} \Big) \Big] \le 12\, LP_{opt}$$

Proof. We will show that for a node $j$ in $\mathcal{F}'$, $c(j, \sigma(j)) \le 2c(j, \eta(j))$. The claim then follows using Lemmas 4 and 5. While converting $\mathcal{F}$ into $\mathcal{F}'$, some edges were changed. For a node $j$ whose edge did not change, $\sigma(j) = \eta(j)$ and hence the result holds trivially. Consider a $j$ such that $\sigma(j)$ was the left sibling of $j$ in $T$. Using the triangle inequality we have $c(j, \sigma(j)) \le c(\sigma(j), \eta(j)) + c(j, \eta(j))$, and by the construction of the binary trees $c(\sigma(j), \eta(j)) \le c(j, \eta(j))$, so we can write $c(j, \sigma(j)) \le 2c(j, \eta(j))$.

If we could guarantee to open a facility in the cluster of $j$ or $\sigma(j)$ for every $j$ in $C'$, then we would be done. Unfortunately, we cannot do that. However, we pair up the cluster centers in $C_S$ in groups and guarantee to open at least one facility in each group. This is explained in the next section.
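The in-degree reduction described above (keep only the shortest incoming edge of each node and chain the remaining children to their left siblings) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, the forest is given as a parent map `eta` in which every root maps to itself, and any 2-cycle at a root is assumed to have been broken already.

```python
from typing import Callable, Dict, Hashable, List

Node = Hashable


def bound_in_degree(
    eta: Dict[Node, Node],
    dist: Callable[[Node, Node], float],
) -> Dict[Node, Node]:
    """Return sigma, the parent map after rewiring; roots keep sigma[j] == j."""
    children: Dict[Node, List[Node]] = {}
    for j, parent in eta.items():
        if parent != j:
            children.setdefault(parent, []).append(j)

    sigma = {j: j for j in eta}
    for parent, kids in children.items():
        kids.sort(key=lambda k: dist(k, parent))  # left to right, nearest child first
        sigma[kids[0]] = parent                   # only the shortest incoming edge is kept
        for left, right in zip(kids, kids[1:]):
            sigma[right] = left                   # every other child points to its left sibling
    return sigma


if __name__ == "__main__":
    pos = {"r": 0.0, "a": 1.0, "b": 2.0, "c": 4.0, "d": 1.5}
    eta = {"r": "r", "a": "r", "b": "r", "c": "r", "d": "a"}
    print(bound_in_degree(eta, lambda p, q: abs(pos[p] - pos[q])))
```

After rewiring, a node receives at most one edge from its nearest original child and at most one from its right sibling, so every in-degree is at most 2, and on the toy instance each new edge is no longer than twice the original one, in line with the bound $c(j, \sigma(j)) \le 2c(j, \eta(j))$ used in Lemma 6.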

2.3 Obtaining a pseudo-integral solution

In this section, we define a new LP which provides us with a solution having (at most) two fractionally opened facilities. In order to do so, we group nodes in $\mathcal{F}'$ in sizes of 2 as follows:

Let $T$ be a tree in $\mathcal{F}'$. Nodes in $T$ are grouped by a top-down greedy procedure starting from the root node $r$. If $r \in C_D$, we do nothing. That is, dense clusters are not grouped. Let $j$ be the topmost node in $C_S$ not yet grouped. Let $j_c$ be the child of $j$ with minimum $c(j_c, j)$. Form a new group consisting of $\{j, j_c\}$. We use the notation $G_j(j, j_c)$ to denote the group consisting of $j$ and $j_c$ with $j$ as the parent of $j_c$ in $T$. Repeat the above procedure until groups of size 2 cannot be formed. Let $L$ denote the set of all the leaf nodes that did not become a part of any group. Let $\tilde{C} = C' \setminus L$ and $\tilde{C}_S = C_S \cap \tilde{C}$.

We now write a new LP, called $LP_{new}$. Let $T(j) = ball(j)$ if $j \in C_S$ and $T(j) = N_j$ if $j \in C_D$.

Minimize
$$\sum_{j \in C_S} d_j \Big[ \sum_{i \in N_j} c(i, j)\, w_i + c(j, \sigma(j)) \Big( 1 - \sum_{i \in N_j} w_i \Big) \Big] + u \sum_{j \in C_D} \sum_{i \in N_j} c(i, j)\, w_i$$

subject to

$\sum_{i \in T(j)} w_i \le 1 \quad \forall j \in C_S$  (6)

$\sum_{i \in T(j)} w_i \ge \lfloor d_j / u \rfloor \quad \forall j \in C_D$  (7)

$\sum_{i \in T(j_1)} w_i + \sum_{i \in T(j_2)} w_i \ge 1 \quad \forall \text{ Groups } (G_{j_1}, G_{j_2}) : j_1, j_2 \in \tilde{C}_S$  (8)

$\sum_{j \in \tilde{C}_S \cup C_D} \sum_{i \in T(j)} f_i w_i \le B$  (9)

$0 \le w_i \le 1$  (10)

Lemma 7. A feasible solution $w'$ to $LP_{new}$ can be obtained such that
$$\sum_{j \in \tilde{C}_S} d_j \Big[ \sum_{i \in N_j} c(i, j)\, w'_i + c(j, \sigma(j)) \Big( 1 - \sum_{i \in N_j} w'_i \Big) \Big] + u \sum_{j \in C_D} \sum_{i \in T(j)} c(i, j)\, w'_i \le 17\, LP_{opt}.$$

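Proof. We define a feasible solution to the above LP as follows. Let $j \in C_D$, $i \in T(j)$. Set $w'_i = \frac{\sum_{j' \in C} x^*_{ij'}}{u} = \frac{l_i}{u} \le y^*_i$. For $j \in C_S$, $i \in T(j)$, we set $w'_i = \min\{x^*_{ij}, y^*_i\} = x^*_{ij} \le y^*_i$. We will next show that the solution is feasible.

Let $j \in C_S$; then $\sum_{i \in T(j)} w'_i = \sum_{i \in ball(j)} x^*_{ij} \le 1$.

Next, let $j \in C_D$; then
$$\sum_{i \in T(j)} w'_i = \sum_{i \in N_j} w'_i = \sum_{i \in N_j} \frac{\sum_{j' \in C} x^*_{ij'}}{u} = \sum_{i \in N_j} \frac{l_i}{u} = \frac{d_j}{u} \ge \Big\lfloor \frac{d_j}{u} \Big\rfloor.$$
Note that $\frac{d_j}{u} \ge 1$ as $d_j \ge u$.

For $j_1, j_2 \in \tilde{C}_S$, we have
$$\sum_{i \in T(j_1)} w'_i + \sum_{i \in T(j_2)} w'_i = \sum_{i \in ball(j_1)} x^*_{ij_1} + \sum_{i \in ball(j_2)} x^*_{ij_2} \ge \frac{1}{2} + \frac{1}{2} \ge 1.$$

Also,
$$\sum_{j \in \tilde{C}_S \cup C_D} \sum_{i \in T(j)} f_i w'_i \le \sum_{j \in C'} \sum_{i \in N_j} f_i w'_i \le \sum_{j \in C'} \sum_{i \in N_j} f_i y^*_i \le B.$$

Next, consider the objective function. For $j \in C_D$, we have
$$u \sum_{i \in T(j)} c(i, j)\, w'_i = u \sum_{i \in N_j} c(i, j) \Big( \frac{\sum_{j' \in C} x^*_{ij'}}{u} \Big) = \sum_{i \in N_j} \sum_{j' \in C} c(i, j)\, x^*_{ij'} \le \sum_{i \in N_j} \sum_{j' \in C} \big( c(i, j') + 4\hat{C}_{j'} \big)\, x^*_{ij'}.$$

Summing over all $j \in C_D$ we have
$$\begin{aligned}
\sum_{j \in C_D} \sum_{i \in N_j} \sum_{j' \in C} x^*_{ij'} \big[ c(i, j') + 4\hat{C}_{j'} \big]
&= \sum_{j \in C_D} \sum_{i \in N_j} \sum_{j' \in C} x^*_{ij'}\, c(i, j') + 4 \sum_{j \in C_D} \sum_{i \in N_j} \sum_{j' \in C} x^*_{ij'}\, \hat{C}_{j'} \\
&= \sum_{i \in N_{C_D}} \sum_{j' \in C} x^*_{ij'}\, c(i, j') + 4 \sum_{i \in N_{C_D}} \sum_{j' \in C} x^*_{ij'}\, \hat{C}_{j'} \\
&= \sum_{i \in N_{C_D}} \sum_{j' \in C} x^*_{ij'}\, c(i, j') + 4 \sum_{j' \in C} \hat{C}_{j'} \sum_{i \in N_{C_D}} x^*_{ij'} \\
&\le \sum_{i \in N_{C_D}} \sum_{j' \in C} x^*_{ij'}\, c(i, j') + 4 \sum_{j' \in C} \hat{C}_{j'} \quad \text{because } \sum_{i \in N_{C_D}} x^*_{ij'} \le 1 \\
&\le \sum_{i \in F} \sum_{j' \in C} x^*_{ij'}\, c(i, j') + 4 \sum_{j' \in C} \hat{C}_{j'} \quad \text{because } |N_{C_D}| \le |F| \\
&= LP_{opt} + 4\, LP_{opt} = 5\, LP_{opt}
\end{aligned}$$

Also,
$$\sum_{j \in C_S} d_j \Big[ \sum_{i \in N_j} c(i, j)\, w'_i + c(j, \sigma(j)) \Big( 1 - \sum_{i \in N_j} w'_i \Big) \Big] \le 12\, LP_{opt}$$
by Lemma 6.

Thus, the solution $w'$ is feasible and
$$\sum_{j \in \tilde{C}_S} d_j \Big[ \sum_{i \in N_j} c(i, j)\, w'_i + c(j, \sigma(j)) \Big( 1 - \sum_{i \in N_j} w'_i \Big) \Big] + u \sum_{j \in C_D} \sum_{i \in T(j)} c(i, j)\, w'_i \le 17\, LP_{opt}.$$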

We now present an iterative algorithm that outputs a set A of facilities having at most two fractionally opened facilities.

Iterative Algorithm: Obtaining a pseudo-integral solution

1. Initialize $A = \emptyset$, $\tilde{F} = F$.
2. While $\tilde{F} \ne \emptyset$, do
   (a) Compute an extreme point solution $\tilde{w}$ to the above LP.
   (b) Let $\tilde{F}_0 = \{i \in \tilde{F} : \tilde{w}_i = 0\}$. Remove all facilities in $\tilde{F}_0$ from $\tilde{F}$, i.e. set $\tilde{F} = \tilde{F} \setminus \tilde{F}_0$.
   (c) Let $\tilde{F}_1 = \{i \in \tilde{F} : \tilde{w}_i = 1\}$. Remove all facilities in $\tilde{F}_1$ from $\tilde{F}$ and add them to $A$, i.e. set $\tilde{F} = \tilde{F} \setminus \tilde{F}_1$ and $A = A \cup \tilde{F}_1$. Also, set $B = B - \sum_{i \in \tilde{F}_1} f_i \tilde{w}_i$.
   (d) If there does not exist any $\tilde{w}_i$ that is 0 or 1, then break.
3. Return $A$.

Lemma 8. The solution $\tilde{w}$ given by the Iterative Algorithm satisfies the following:
1. $\tilde{w}$ has at most two fractional facilities.
2. $\sum_{j \in C'} \sum_{i \in T(j)} f_i \tilde{w}_i \le B$.
3. The cost function has value $\le 17\, LP_{opt}$.

Proof. Consider the iteration when the algorithm reaches step (2d). Let the linearly independent tight constraints corresponding to (6) and (7) be denoted as $\mathcal{X}$ and the ones corresponding to (8), i.e. all groups, be denoted as $\mathcal{Y}$. Because of the laminar nature of the constraints, all sets in $\mathcal{X} \cup \mathcal{Y}$ are disjoint. Also, each set in $\mathcal{X} \cup \mathcal{Y}$ has at least two fractional variables because $\tilde{w}_i \in (0, 1) \ \forall i$ and constraints (6), (7) and (8) are tight. This implies that the number of variables is at least $2|\mathcal{X}| + 2|\mathcal{Y}|$, whereas the number of linearly independent tight constraints is at most $|\mathcal{X}| + |\mathcal{Y}| + 1$. Since the number of variables and the number of linearly independent tight constraints are equal at an extreme point, we get $|\mathcal{X}| + |\mathcal{Y}| \le 1$ and that each set in $\mathcal{X} \cup \mathcal{Y}$ has at most two variables. Hence, each set in $\mathcal{X} \cup \mathcal{Y}$, if any, has exactly two fractional variables.

Claim (2), $\sum_{j \in C'} \sum_{i \in T(j)} f_i \tilde{w}_i = \sum_{j \in \tilde{C}} \sum_{i \in T(j)} f_i \tilde{w}_i \le B$, follows as in step (2a) we compute a feasible solution of $LP_{new}$, reduce the budget by the cost of opened facilities for subsequent iterations and maintain feasibility in every iteration. Claim (3) follows as we compute an extreme point solution in step (2a) in every iteration and the cost never increases in subsequent iterations.
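The counting step in the proof of claim (1) of Lemma 8 can be written out explicitly. The block below is only a restatement of that argument; the symbol $n$ for the number of remaining (fractional) variables is introduced here for illustration and does not appear in the original text.

```latex
% n = number of remaining variables, all strictly fractional at step (2d).
% Each tight set in X \cup Y is disjoint from the others and contains
% at least two fractional variables; the only other constraint that can
% be tight is the budget constraint (9).
\begin{align*}
2|\mathcal{X}| + 2|\mathcal{Y}| \;\le\; n
  \;=\; \#\{\text{linearly independent tight constraints}\}
  \;\le\; |\mathcal{X}| + |\mathcal{Y}| + 1
\quad\Longrightarrow\quad
|\mathcal{X}| + |\mathcal{Y}| \;\le\; 1,
\qquad
n \;\le\; 2 .
\end{align*}
```

So at most two facilities remain fractional when the loop stops, which is exactly the pseudo-integrality claimed.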

2.4 Obtaining an integrally open solution

In this section, we round the at most two fractionally opened facilities obtained in the previous section, violating the budget by an additional cost of $f_{max}$.

Lemma 9. Given an optimal pseudo-integral solution $\tilde{w}$, an integrally open solution $\hat{w}$ can be obtained such that $\sum_{j \in C'} \sum_{i \in T(j)} f_i \hat{w}_i \le B + f_{max}$.

Proof. If there is no fractionally opened facility, then set $\hat{w}_i = \tilde{w}_i \ \forall i \in F$. If there is only one fractionally opened facility $i'$, then set $\hat{w}_{i'} = 1$ and $\hat{w}_i = \tilde{w}_i \ \forall i \in F \setminus \{i'\}$. Else, let $\{i_1, i_2\}$ denote the fractionally opened facilities in solution $\tilde{w}$. Set $\hat{w}_i = \tilde{w}_i \ \forall i \in F \setminus \{i_1, i_2\}$. For $\{i_1, i_2\}$, do the following:

1. If both the facilities belong to the same cluster $j$, i.e. $\{i_1, i_2\} \subseteq T(j)$, then if $c(i_1, j) \le c(i_2, j)$, set $\hat{w}_{i_1} = 1$, $\hat{w}_{i_2} = 0$, $A = A \cup \{i_1\}$; else set $\hat{w}_{i_1} = 0$, $\hat{w}_{i_2} = 1$, $A = A \cup \{i_2\}$. Break.
2. If the facilities belong to two different clusters of a group $G_j(j, j_c)$, i.e. $\{i_1, i_2\} \subseteq T(j) \cup T(j_c)$, then set $\hat{w}_{i_1} = 1$, $\hat{w}_{i_2} = 1$, $A = A \cup \{i_1, i_2\}$ and break.

$$\begin{aligned}
f_{i_1} \hat{w}_{i_1} + f_{i_2} \hat{w}_{i_2}
&\le f_{i_1} \tilde{w}_{i_1} + f_{i_1} (1 - \tilde{w}_{i_1}) + f_{i_2} \tilde{w}_{i_2} + f_{i_2} (1 - \tilde{w}_{i_2}) \\
&\le f_{i_1} \tilde{w}_{i_1} + f_{i_2} \tilde{w}_{i_2} + f_{max} (1 - \tilde{w}_{i_1} + 1 - \tilde{w}_{i_2}) \\
&= f_{i_1} \tilde{w}_{i_1} + f_{i_2} \tilde{w}_{i_2} + f_{max} \big( 2 - (\tilde{w}_{i_1} + \tilde{w}_{i_2}) \big) \\
&\le f_{i_1} \tilde{w}_{i_1} + f_{i_2} \tilde{w}_{i_2} + f_{max},
\end{aligned}$$
where the last inequality follows because $\tilde{w}_{i_1} + \tilde{w}_{i_2} = 1$. Thus, the budget loss in either case is no more than $f_{max}$ and hence,
$$\sum_{j \in C'} \sum_{i \in T(j)} f_i \hat{w}_i \le \sum_{j \in C'} \sum_{i \in T(j)} f_i \tilde{w}_i + f_{max} \le B + f_{max}.$$

Next, we present a feasible assignment of bounded cost. We first do so for dense clusters. Consider $j \in C_D$. Note that $\sum_{i \in N_j} \tilde{w}_i = \lfloor d_j / u \rfloor$, or else we could reduce $\tilde{w}_i$ and obtain a better solution. Also, as for $j \in C_D$ we have $\sum_{i \in N_j} \hat{w}_i = \sum_{i \in N_j} \tilde{w}_i$, it follows that $\sum_{i \in N_j} \hat{w}_i = \lfloor d_j / u \rfloor$. Define $res(j) = d_j - u \sum_{i \in N_j} \hat{w}_i$. Note that $res(j) = d_j - u \lfloor d_j / u \rfloor < u$.
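The case analysis in the proof of Lemma 9 can be sketched in code. The sketch below is illustrative only: the function name and the dictionary-based inputs (`cluster_of` mapping a facility to the center $j$ with the facility in $T(j)$, and `dist_to_center` giving $c(i, j)$ for that center) are assumptions introduced here, and entries of these two maps are only needed for the fractional facilities.

```python
from typing import Dict, Hashable

Facility = Hashable


def round_pseudo_integral(
    w: Dict[Facility, float],
    cluster_of: Dict[Facility, Hashable],
    dist_to_center: Dict[Facility, float],
) -> Dict[Facility, int]:
    """Round a solution with at most two fractional entries to a 0/1 opening vector."""
    w_hat = {i: int(v) for i, v in w.items() if v == 0 or v == 1}
    fractional = [i for i, v in w.items() if 0 < v < 1]
    if len(fractional) > 2:
        raise ValueError("input is not pseudo-integral")

    if len(fractional) == 1:
        w_hat[fractional[0]] = 1                      # open the single fractional facility
    elif len(fractional) == 2:
        i1, i2 = fractional
        if cluster_of[i1] == cluster_of[i2]:
            # Same cluster: open only the facility nearer to the cluster center.
            nearer = i1 if dist_to_center[i1] <= dist_to_center[i2] else i2
            w_hat[i1] = int(nearer == i1)
            w_hat[i2] = int(nearer == i2)
        else:
            # Two clusters of one group: open both facilities.
            w_hat[i1] = w_hat[i2] = 1
    return w_hat
```

In each branch the extra opening cost over the fractional solution is at most $f_{max}$, matching the chain of inequalities in the proof, provided the two fractional values sum to 1 as the proof assumes.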

Lemma 10. Given an integrally open solution $\hat{w}$, it is possible to obtain an assignment $\hat{g}_i$ by distributing the demand $d_j$ for each $j \in C_D$ such that:
1. $\hat{g}_i \le 2u\, \hat{w}_i \ \forall i \in N_j$.
2. $\sum_{i \in N_j} \hat{g}_i = d_j$.
3. $\sum_{i \in N_j} \hat{g}_i\, c(i, j) \le 2u \sum_{i \in N_j} \hat{w}_i\, c(i, j)$.

Proof. For all $j \in C_D$, $i \in N_j$, set
$$\hat{g}_i = \Big( u + \frac{res(j)}{\sum_{i' \in N_j} \hat{w}_{i'}} \Big) \hat{w}_i < u\, \hat{w}_i + u \cdot 1 = 2u\, \hat{w}_i,$$
where the last inequality follows because $\frac{\hat{w}_i}{\sum_{i' \in N_j} \hat{w}_{i'}} \le 1$ and $res(j) < u$. Also,
$$\sum_{i \in N_j} \hat{g}_i = \sum_{i \in N_j} \Big( u + \frac{res(j)}{\sum_{i' \in N_j} \hat{w}_{i'}} \Big) \hat{w}_i = u \sum_{i \in N_j} \hat{w}_i + \frac{res(j)}{\sum_{i' \in N_j} \hat{w}_{i'}} \sum_{i \in N_j} \hat{w}_i = u \sum_{i \in N_j} \hat{w}_i + res(j) = d_j.$$
Thus, claim (2) follows. Claim (3) follows from claim (1) trivially.

Next, we define the assignments of demands for sparse clusters. Let $j \in C_S$ and let $\lambda(j)$ denote the center of the cluster that serves the demand of $j$ (we will identify $\lambda(j)$ shortly). To be able to do so, we define a structure on the groups defined in the previous section. The structure is nothing but a forest. A tree in the forest consists of groups as nodes and there

is an edge from a group $G_u(u, u_c)$ to a group $G_v(v, v_c)$ if there is a directed edge from $u$ to $v_c$ or $v$ in $\mathcal{F}'$. $G_v$ is then called the parent group of $G_u$ and $G_u$ a child group of $G_v$.

Lemma 11. Given an integrally open solution $\hat{w}$, assignments $\hat{h}$ can be obtained such that:
1. If $j \in C_S$ and $\lambda(j) \in \tilde{C}_S$, then $d_j\, c(j, i^*(j)) \le \frac{5}{2} d_j\, c(j, \sigma(j))$, where $i^*(j)$ is the facility, in the cluster of $\lambda(j)$, that serves the demand of $j$.
2. If $j \in C_S$ and $\lambda(j) \in C_D$, then $d_j\, c(j, \lambda(j)) \le d_j\, c(j, \sigma(j))$.
3. $\hat{h}_i \le u\, \hat{w}_i \ \forall i \in N_{C_D}$ and hence $\hat{g}_i + \hat{h}_i \le 3u\, \hat{w}_i$.
4. $\hat{h}_i \le 5u\, \hat{w}_i \ \forall i \in N_{C_S}$.

Proof. We define the assignments $\hat{h}$ for the rounded solution $\hat{w}$ as follows. Set $\hat{h}_i = 0 \ \forall i \in N_{C_S}$ for which $\hat{w}_i = 0$. For all $j \in \tilde{C}_S$ such that $\exists\, i \in ball(j)$ with $\hat{w}_i = 1$, set $\hat{h}_i = d_j$, $i^*(j) = i$ and $\lambda(j) = j$. For all $j \in C_S$ such that there is no $i \in ball(j)$ with $\hat{w}_i = 1$, there are two possibilities:

Case I: $\sigma(j) \in \tilde{C}_S$.
If $j$ and $\sigma(j)$ belong to the same group, $G_{\sigma(j)}$, then $\exists\, i \in ball(\sigma(j))$ such that $\hat{w}_i = 1$ (since no facility is opened in $j$, there must be an open facility in $\sigma(j)$). Assign the demand of $j$ to $i$, i.e. set $\hat{h}_i = \hat{h}_i + d_j$, $i^*(j) = i$ and $\lambda(j) = \sigma(j)$. By the triangle inequality, we have $c(j, i^*(j)) \le c(j, \lambda(j)) + c(\lambda(j), i^*(j))$. Further, $c(\lambda(j), i^*(j)) \le R_{\lambda(j)} = 2\hat{C}_{\lambda(j)} < \frac{1}{2} c(j, \lambda(j))$ (by the separation property, Lemma 1). This implies $c(j, i^*(j)) \le \frac{3}{2} c(j, \sigma(j))$.
Else (i.e. $j$ and $\sigma(j)$ belong to two different groups), consider two groups $G_r$ and $G_s$ such that $j \in G_r$ and $\sigma(j) \in G_s$. Note that $G_r$ must be a child group of $G_s$. Let $\hat{j}$ be the other cluster member in the group $G_s$. Then $\exists\, i \in ball(\sigma(j)) \cup ball(\hat{j})$ such that $\hat{w}_i = 1$. If more than one facility is open, choose one arbitrarily. Assign the demand of $j$ to $i$, i.e. set $\hat{h}_i = \hat{h}_i + d_j$, $i^*(j) = i$ and $\lambda(j) = \sigma(j)$ if $i \in ball(\sigma(j))$, else $\lambda(j) = \hat{j}$. Consider the case when $\lambda(j) = \hat{j}$. By the triangle inequality, $c(j, \lambda(j)) \le c(j, \sigma(j)) + c(\sigma(j), \lambda(j)) \le 2c(j, \sigma(j))$ because $\hat{j}$ is closer to $\sigma(j)$ than $j$, i.e. $c(\sigma(j), \hat{j}) \le c(j, \sigma(j))$. Also, $c(\lambda(j), i^*(j)) \le R_{\lambda(j)} = 2\hat{C}_{\lambda(j)} \le \frac{1}{2} c(\sigma(j), \lambda(j)) \le \frac{1}{2} c(j, \sigma(j))$. Hence, $c(j, i^*(j)) \le \frac{5}{2} c(j, \sigma(j))$. When $\lambda(j) = \sigma(j)$ then, similar to the if part, it can be shown that $c(j, i^*(j)) \le \frac{3}{2} c(j, \sigma(j))$. Hence claim (1) is proved.

Case II: $\sigma(j) \in C_D$.
Set $\lambda(j) = \sigma(j)$; claim (2) holds trivially. We next define the assignments so that claim (3) holds. The demand of $j$ is served uniformly by the integrally opened facilities in $N_{\sigma(j)}$, i.e. $\forall i \in N_{\sigma(j)}$ set $\hat{h}_i = \frac{d_j}{\sum_{i' \in N_{\sigma(j)}} \hat{w}_{i'}} \hat{w}_i$. Observe that $\hat{h}_i \le d_j$ because $\hat{w}_i \le \sum_{i' \in N_{\sigma(j)}} \hat{w}_{i'} \ \forall i \in N_{\sigma(j)}$. Since $d_j < u$, we have $\hat{h}_i \le u = u\, \hat{w}_i$. Also, $\hat{g}_i + \hat{h}_i \le 2u\, \hat{w}_i + d_j \le 3u\, \hat{w}_i$. Thus, claim (3) follows.

Proof of claim (4) is as follows: Consider Figure 1. The dashed ovals represent groups and solid circles correspond to clusters of the centers within the groups (each group has two cluster centers $j^i$ and $j_c^i$). Dotted directional lines show the relationship between $j$ and $\sigma(j)$ for each $j$. Exactly one facility is open in $ball(j^k) \cup ball(j_c^k)$ for any group $G_{j^k}$. Shaded clusters are the ones where some facility has been opened, while the rest have to depend on facilities in shaded clusters.


Case I (Figure 1(a)): When $i \in ball(j_c^1)$ is open, $i$ serves its own demand $d_{j_c^1}$ and the demands $d_{j^2}$ (via $j^1$), $d_{j^3}$ and $d_{j^4}$. Each demand is no more than $u$. Thus, the capacity violation at $i$ is by a factor of at most 4 ($\hat h_i = d_{j_c^1} + d_{j^2} + d_{j^3} + d_{j^4} \le 4u$).

Case II (Figure 1(b)): When $i \in ball(j^1)$ is open, $i$ serves its own demand $d_{j^1}$ and the demands $d_{j_c^1}$, $d_{j^2}$, $d_{j^3}$ and $d_{j^4}$. Each demand is no more than $u$. Thus, the capacity violation at $i$ is by a factor of at most 5 ($\hat h_i = d_{j^1} + d_{j_c^1} + d_{j^2} + d_{j^3} + d_{j^4} \le 5u$).

Figure 1: Capacity violation by factor 5 in the sparse world
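The uniform split behind claim (3) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `split_demand_uniformly`, the dictionary layout and the numbers are hypothetical and not taken from the paper. Here `w_hat` is the 0/1 opening vector on $N_{\sigma(j)}$ and `d_j < u` is the sparse demand being shifted.

```python
from fractions import Fraction

def split_demand_uniformly(d_j, w_hat):
    """Split demand d_j equally over the integrally open facilities.

    w_hat maps facility id -> 0 or 1 (hypothetical layout).
    Returns h_hat with h_hat[i] = d_j * w_hat[i] / sum(w_hat).
    """
    opened = sum(w_hat.values())
    if opened == 0:
        raise ValueError("at least one facility must be open")
    return {i: Fraction(d_j) * w / opened for i, w in w_hat.items()}

u = 10                               # uniform capacity (example value)
d_j = 7                              # sparse demand, d_j < u
w_hat = {"a": 1, "b": 0, "c": 1}     # two of three facilities are open
h_hat = split_demand_uniformly(d_j, w_hat)

# Claim (3): every facility receives at most u * w_hat[i].
assert all(h_hat[i] <= u * w_hat[i] for i in w_hat)
# The whole demand is served.
assert sum(h_hat.values()) == d_j
```

Exact fractions are used so that the two checks are not affected by floating-point rounding.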

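Claims (3) and (4) show that the assignments can be done violating the capacities by a factor of at most 5.

Lemma 12. An integrally open solution $\bar\sigma = \langle \bar x, \bar y \rangle$ to the knapsack-median problem instance $(C, F, c, f, u, B)$ can be obtained such that $Cost(\bar\sigma, F) \le O(1)\, LP_{opt}$.

Proof. Set $\bar y_i = \hat w_i$ $\forall i \in F$. For $j' \in C$, set
\[ \bar x_{ij'} = \frac{\hat g_i}{d_j} A_{\sigma^*}(j', N_j) + \frac{\hat h_i}{d_{\hat j}} A_{\sigma^*}(j', N_{\hat j}) \quad \forall j \in C_D,\ i \in N_j,\ \hat j \in C_S \text{ s.t. } \lambda(\hat j) = \sigma(\hat j) = j, \]
and
\[ \bar x_{ij'} = \sum_{\hat j \in \tilde C_S : \lambda(\hat j) = j} A_{\sigma^*}(j', N_{\hat j}) \quad \forall j \in C_S,\ i \in N_j . \]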

From Lemma (2), $c(j, j') \le 2c(i, j') + 2R_{j'}$ for all $j \in C'$, $i \in N_j$. Multiplying both sides by $A_{\sigma^*}(j', N_j)$ and summing over all $j \in C'$, we get
\[ \sum_{j \in C'} c(j, j')\, A_{\sigma^*}(j', N_j) \le \sum_{j \in C'} \sum_{i \in N_j} 2c(i, j')\, x^*_{ij'} + \sum_{j \in C'} 2R_{j'}\, A_{\sigma^*}(j', N_j) \le 2\hat C_{j'} + 2R_{j'} = 6\hat C_{j'} . \]
Summing over all $j' \in C$, we have
\[ \sum_{j' \in C} \sum_{j \in C'} c(j, j')\, A_{\sigma^*}(j', N_j) \le 6\, LP_{opt} \qquad (1) \]

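From Lemma (11),
\[ \sum_{j \in C_S : \lambda(j) \in \tilde C_S} d_j\, c(j, i^*(j)) \le \frac{5}{2} \sum_{j \in C_S} d_j\, c(j, \sigma(j)) \qquad (2) \]
\[ \sum_{j \in C_S : \lambda(j) \in C_D} d_j\, c(j, \lambda(j)) \le \sum_{j \in C_S} d_j\, c(j, \sigma(j)) \qquad (3) \]
\[ \sum_{j \in C_D} \sum_{i \in N_j} (\hat g_i + \hat h_i)\, c(i, j) \le 3u \sum_{j \in C_D} \sum_{i \in N_j} \hat w_i\, c(i, j) \qquad (4) \]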

Adding (2), (3) and (4) and applying lemma (8), we get that the cost of moving the demand dj from j to appropriate facility is bounded by 51LPopt . Thus, the total connection cost is bounded by 57LPopt . For detailed proof, see appendix 5.1.

3 Capacitated Facility Location Problem

In the Capacitated Facility Location Problem, we are given a set of clients $C$, a set of facilities $F$ and a real-valued distance function $c(i, j)$ on $F \cup C$ in a metric space. Each facility $i$ has a facility opening cost $f_i$ and a capacity $u_i$, while each client has unit demand. The goal is to open a set of facilities and assign clients to them so as to minimize the total facility opening cost and connection cost. We deal with the case when $u_i = u$ $\forall i$ and denote it by unif-FLP. In this section, we present a constant factor approximation for the problem. The natural LP-relaxation for an instance $(C, F, c, f, u)$ of unif-FLP is given as follows (call it LP-unif-FLP):

\[
\begin{aligned}
\text{Minimize} \quad & \sum_{i \in F} f_i y_i + \sum_{i \in F} \sum_{j \in C} c(i, j)\, x_{ij} \\
\text{subject to} \quad & \sum_{i \in F} x_{ij} = 1 && \forall j \in C && (11) \\
& \sum_{j \in C} x_{ij} \le u\, y_i && \forall i \in F && (12) \\
& x_{ij} \le y_i && \forall i \in F,\ j \in C && (13) \\
& x_{ij} \ge 0,\ y_i \ge 0
\end{aligned}
\]

Let $Cost(\sigma, T) = Cost_f(\sigma, T) + Cost_s(\sigma, T)$ denote the total opening cost and the cost paid by all clients for getting service from a given subset of facilities $T$ under solution $\sigma$. Let $\sigma^* = \langle x^*, y^* \rangle$ denote the optimal solution of LP-unif-FLP. Clusters are formed the same way as in knapsack median. Sparse clusters are easy to handle in this case, as there is no budget on the opened facilities. For a sparse cluster centered at $j$ ($d_j < u$), we open the cheapest facility $i^*$ in $ball(j)$, close all other facilities in the cluster and shift their demands to $i^*$. Let $\hat\sigma = \langle \hat x, \hat y \rangle$ be the solution so obtained.

Lemma 13. The solution $\hat\sigma = \langle \hat x, \hat y \rangle$ satisfies the following:
1. it is integrally open and feasible

2. $Cost(\hat\sigma, N_{C_S}) \le O(1)\, LP_{opt}$.

Proof. For each $j \in C_S$, claim (1) follows as $i^*$ is fully open and $d_j < u$. Next, we prove claim (2). Let $j \in C_S$, $i \in N_j$ and $j' \in C$ be such that $x_{ij'} > 0$. Then, by Lemma (3), $c(j, j') \le 2c(i, j') + 2R_{j'}$. Also, since $i^* \in ball(j)$, we have $c(i^*, j) \le R_j$. Thus, $c(i^*, j') \le c(j, j') + c(i^*, j) \le c(j, j') + R_j$. If $R_j \le c(j, j')$, then $c(i^*, j') \le 2c(j, j') \le 4c(i, j') + 4R_{j'}$; else $c(i^*, j') \le 2R_j \le 4R_{j'} \le 4c(i, j') + 4R_{j'}$, where the second inequality in the else part follows by Lemma (2). Thus,
\[ c(i^*, j')\, \hat x_{i^* j'} = c(i^*, j') \sum_{i \in N_j} x^*_{ij'} = \sum_{i \in N_j} x^*_{ij'}\, c(i^*, j') \le \sum_{i \in N_j} x^*_{ij'} \left( 4c(i, j') + 4R_{j'} \right) . \]
Thus,
\[ Cost_s(j', \hat\sigma, N_{C_S}) \le 4\, Cost_s(j', \sigma^*, N_{C_S}) + 8\hat C_{j'}\, A_{\sigma^*}(j', N_{C_S}) \qquad (14) \]
Also,
\[ Cost_f(\hat\sigma, N_{C_S}) \le 2\, Cost_f(\sigma^*, N_{C_S}) \qquad (15) \]
Claim (15) follows as $i^*$ is cheapest and $Size(y^*, ball(j)) \ge 1/2$. Adding claim (14) over all $j' \in C$, we get
\[ Cost_s(\hat\sigma, N_{C_S}) \le 4\, Cost_s(\sigma^*, N_{C_S}) + 8 \sum_{j' \in C} \hat C_{j'}\, A_{\sigma^*}(j', N_{C_S}) . \qquad (16) \]

Adding (15) and (16), we get claim (2).

To deal with dense clusters, we define the notion of cluster instances, where we have a budget on the total cost the cluster instance is allowed to incur. This includes the cost of opening the facilities as well as the connection cost. Let $j \in C_D$ and $i \in N_j$. For $j' \in C$, let $ctr(j') = k$; then $c(j', k) \le 4\hat C_{j'}$ and $c(i, j) \le c(i, k) \le c(i, j') + c(j', k) \le c(i, j') + 4\hat C_{j'}$, where the first inequality follows as $i$ belongs to $N_j$ and not $N_k$. Let
\[ b^c_j = \sum_{i \in N_j} \sum_{j' \in C} x^*_{ij'} \left[ c(i, j') + 4\hat C_{j'} \right] \]
be the budget on connection cost for a cluster centered at $j$. Further, let $b^f_j = \sum_{i \in N_j} f_i y^*_i$. Define a cluster instance $S_j(j, N_j, d_j, b^c_j, b^f_j)$ as follows:

\[
\begin{aligned}
& u \sum_{i \in N_j} z_i \ge d_j && (17) \\
& \sum_{i \in N_j} \left( f_i + u\, c(i, j) \right) z_i \le b^f_j + b^c_j && (18) \\
& 0 \le z_i \le 1 && (19)
\end{aligned}
\]

where $z$ denotes a feasible solution to the cluster instance. Note that any feasible solution $z$ must have $Size(z, N_j) \ge 1$, as $d_j \ge u$ for dense clusters. We call a solution to a cluster instance an almost integral solution if it has at most one fractionally opened facility. The following lemma holds.

Lemma 14. Let $j \in C_D$. The cluster instance $S_j(j, N_j, d_j, b^c_j, b^f_j)$ admits a feasible solution $z'$.

Proof. Let $i \in N_j$. Set $z'_i = \frac{\sum_{j' \in C} x^*_{ij'}}{u} = \frac{l_i}{u} \le y^*_i$. We will show that $z'$ so defined is a feasible solution to the cluster instance. First, $u \sum_{i \in N_j} z'_i = \sum_{i \in N_j} l_i = d_j$. Also,
\[ u \sum_{i \in N_j} c(i, j)\, z'_i = u \sum_{i \in N_j} c(i, j) \left( \frac{\sum_{j' \in C} x^*_{ij'}}{u} \right) = \sum_{i \in N_j} \sum_{j' \in C} c(i, j)\, x^*_{ij'} \le \sum_{i \in N_j} \sum_{j' \in C} \left( c(i, j') + 4\hat C_{j'} \right) x^*_{ij'} = b^c_j . \]
Also, $\sum_{i \in N_j} f_i z'_i \le \sum_{i \in N_j} f_i y^*_i = b^f_j$.
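The construction in the proof of Lemma 14 can be checked numerically on a toy cluster. The sketch below is an illustration under assumptions: the function names, the data layout and all numbers (including the connection budget `b_c`, which is simply chosen large enough here) are hypothetical and not taken from the paper.

```python
def cluster_solution_from_lp(x_star, u):
    """z'_i = (sum over clients j' of x*_{ij'}) / u for each facility i.

    x_star maps facility id -> {client id -> fractional assignment}.
    """
    return {i: sum(row.values()) / u for i, row in x_star.items()}

def is_feasible(z, d_j, u, f, c_to_center, budget, tol=1e-9):
    """Check constraints (17)-(19) of the cluster instance."""
    covers = u * sum(z.values()) >= d_j - tol                    # (17)
    cost = sum((f[i] + u * c_to_center[i]) * z[i] for i in z)
    within_budget = cost <= budget + tol                          # (18)
    bounded = all(-tol <= zi <= 1 + tol for zi in z.values())     # (19)
    return covers and within_budget and bounded

# Toy dense cluster with two facilities and capacity u = 2.
u = 2
x_star = {"i1": {"p": 1.0, "q": 1.0}, "i2": {"r": 0.5, "s": 0.5}}
y_star = {"i1": 1.0, "i2": 0.5}
f = {"i1": 4.0, "i2": 6.0}             # opening costs
c_to_center = {"i1": 1.0, "i2": 2.0}   # c(i, j) for the cluster center j

d_j = sum(sum(row.values()) for row in x_star.values())   # demand moved to j
b_f = sum(f[i] * y_star[i] for i in f)                     # facility budget
b_c = 5.0                                                  # assumed connection budget
budget = b_f + b_c

z_prime = cluster_solution_from_lp(x_star, u)
assert is_feasible(z_prime, d_j, u, f, c_to_center, budget)
```

In this toy instance the left-hand side of (18) evaluates to 11 against a budget of 12, so the check passes with some slack.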

Let $CI\text{-}Cost_f(z, N_{C_D}) = \sum_{i \in N_{C_D}} f_i z_i$ denote the total facility opening cost paid by all $i \in N_{C_D}$ under a solution $z$. Further, let $l_i$ be the assignment of the demand of $j$ to facility $i$ in $N_j$ with opening $z_i$; then $CI\text{-}Cost_s(z, N_{C_D}) = \sum_{j \in C_D} \sum_{i \in N_j} c(i, j)\, l_i$ denotes the total service cost paid by all $j \in C_D$ for getting served by facilities in $N_{C_D}$ under solution $z$. Then, $Cost_{CI}(z, N_{C_D}) = CI\text{-}Cost_f(z, N_{C_D}) + CI\text{-}Cost_s(z, N_{C_D})$.

Lemma 15. For a feasible solution $z'$ to a cluster instance, we can construct an almost integral solution $\tilde z$.

Proof. Arrange the fractionally opened facilities in $z'$ in non-decreasing order of $f_i + c(i, j)\, u$ and greedily transfer the total opening $Size(z', N_j)$ to them. Let $\tilde z$ denote the new openings. Let $\tilde l_i = \tilde z_i u$. Note that $\sum_{i \in N_j} \tilde l_i = u \sum_{i \in N_j} \tilde z_i = u \sum_{i \in N_j} z'_i = d_j$. Clearly, $\sum_{i \in N_j} (f_i + c(i, j)\, u)\, \tilde z_i \le \sum_{i \in N_j} (f_i + c(i, j)\, u)\, z'_i \le b^f_j + b^c_j$.

Lemma 16. Let $d_j \ge u$ and $0 < \epsilon < 1/2$ be fixed. Given an almost integral solution $\tilde z$ and assignment $\tilde l$ as obtained in Lemma (15), an integrally open solution $\hat z$ and assignment $\hat l$ (possibly fractional) can be obtained such that
1. $\hat l_i \le (1 + \epsilon)\, \hat z_i u$ $\forall i \in N_j$, and $\sum_{i \in N_j} \hat l_i = \sum_{i \in N_j} \tilde l_i = d_j$
2. $Cost_{CI}(\hat z, N_j) \le (1/\epsilon)\, Cost_{CI}(\tilde z, N_j) \le (1/\epsilon)(b^f_j + b^c_j)$

Proof. We will construct a solution $\hat z$ and assignment $\hat l$ such that they satisfy the following along with claims (1) and (2):
\[ \sum_{i \in N_j} f_i \hat z_i \le \frac{1}{\epsilon} \sum_{i \in N_j} f_i \tilde z_i \quad \forall j \in C_D, \qquad (20) \]
\[ \sum_{i \in N_j} c(i, j)\, \hat l_i \le (1 + \epsilon) \sum_{i \in N_j} c(i, j)\, \tilde l_i \quad \forall j \in C_D \qquad (21) \]
Adding (20) and (21), we get claim (2). We now proceed to prove claims (1), (20) and (21). If there is no fractionally open facility, we do nothing, i.e., set $\hat z = \tilde z$. Else, there is exactly one fractional facility, say $i_1$, and at least one integral facility, say $i_2$, as $Size(\tilde z, N_j) \ge 1$. There are two possibilities w.r.t. $i_1$:
1. $\tilde z_{i_1} < \epsilon$
2. $\tilde z_{i_1} \ge \epsilon$

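In the first case, close $i_1$ and shift its demand to $i_2$ at a loss of factor $(1 + \epsilon)$ in its capacity. Note that $\tilde l_{i_1} < \epsilon u$ while $\tilde l_{i_2} = u$. Thus, set $\hat z_{i_1} = 0$, $\hat l_{i_1} = 0$ and $\hat l_{i_2} = \tilde l_{i_1} + \tilde l_{i_2} < (1 + \epsilon)\, u = (1 + \epsilon)\, u\, \hat z_{i_2}$. Also, $\hat l_{i_2} \le (1 + \epsilon)\, \tilde l_{i_2}$. Then, $c(i_2, j)\, \hat l_{i_2} \le (1 + \epsilon)\, c(i_2, j)\, \tilde l_{i_2}$. There is no loss in facility cost in this case.

In the second case, simply open $i_1$, at a loss of $1/\epsilon$ in facility cost. Set $\hat z_{i_1} = 1$. Then, $f_{i_1} \hat z_{i_1} \le (1/\epsilon)\, f_{i_1} \tilde z_{i_1}$. There is no loss in connection cost in this case.

$\sum_{i \in N_j} \hat l_i = \sum_{i \in N_j} \tilde l_i = d_j$ holds clearly.

Lemma 17. Let $0 < \epsilon < 1/2$ be fixed. An integrally open solution $\bar\sigma = \langle \bar x, \bar y \rangle$ to $(C, F, c, f, u)$ can be obtained such that
1. $\sum_{j \in C} \bar x_{ij} \le (1 + \epsilon)\, \bar y_i u$ $\forall i \in F$,
2. $Cost(\bar\sigma, F) \le O(1/\epsilon)\, LP_{opt}$

Proof. Set $\bar y_i = \hat y_i$ $\forall i \in N_{C_S}$, $\bar y_i = \hat z_i$ $\forall i \in N_{C_D}$ and $\bar x_{ij'} = \hat x_{ij'}$ $\forall i \in N_{C_S}$, $j' \in C$. For $j \in C_D$, $j' \in C$, $i \in N_j$, let
\[ \bar x_{ij'} = \frac{\hat l_i}{d_j} \sum_{i' \in N_j} x^*_{i'j'} = \frac{\hat l_i}{d_j}\, A_{\sigma^*}(j', N_j) . \]
Feasibility w.r.t. constraints (11) and (13): $\bar x_{ij'} \le \frac{\hat l_i}{d_j} \le 1 = \hat z_i = \bar y_i$, and $A_{\bar\sigma}(j', N_j) = \sum_{i \in N_j} \bar x_{ij'} = \sum_{i \in N_j} \frac{\hat l_i}{d_j}\, A_{\sigma^*}(j', N_j) = A_{\sigma^*}(j', N_j)$. Then, the following holds:
a. $\sum_{j' \in C} \bar x_{ij'} \le (1 + \epsilon)\, \bar y_i u$ $\forall i \in N_{C_D}$,
b. $Cost(\bar\sigma, N_{C_D}) \le O(1/\epsilon)\, LP_{opt}$

Claim (a) follows from Lemma (16) as follows. Let $j \in C_D$, $i \in N_j$, $j' \in C$. Then,
\[ \sum_{j' \in C} \bar x_{ij'} = \sum_{j' \in C} \frac{\hat l_i}{d_j}\, A_{\sigma^*}(j', N_j) = \hat l_i \le (1 + \epsilon)\, u\, \hat z_i = (1 + \epsilon)\, u\, \bar y_i . \]
For claim (b), from Lemma (12), we have
\[ \sum_{j' \in C} \sum_{j \in C'} c(j, j')\, A_{\sigma^*}(j', N_j) \le 6\, LP_{opt} \qquad (1) \]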

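From the proof of Lemma (7), we also have
\[ \sum_{j \in C_D} \sum_{i \in N_j} \sum_{j' \in C} \left( c(i, j') + 4\hat C_{j'} \right) x^*_{ij'} \le 5\, LP_{opt} . \]
Summing claim (2) of Lemma (16) over all $j \in C_D$, we get
\[ Cost_{CI}(\hat z, N_{C_D}) \le \frac{1}{\epsilon} \sum_{j \in C_D} (b^f_j + b^c_j) = \frac{1}{\epsilon} \left[ \sum_{i \in N_{C_D}} f_i y^*_i + \sum_{j \in C_D} \sum_{i \in N_j} \sum_{j' \in C} \left( c(i, j') + 4\hat C_{j'} \right) x^*_{ij'} \right] \le \frac{1}{\epsilon} \left[ \sum_{i \in N_{C_D}} f_i y^*_i + 5\, LP_{opt} \right] = O(1/\epsilon)\, LP_{opt} \qquad (2) \]
Adding (1) and (2), we get claim (b). For a detailed proof of claim (b), see appendix 5.2.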
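The two rounding steps used for dense clusters (the greedy transfer of Lemma 15 followed by the case analysis of Lemma 16) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names and example numbers are hypothetical, the greedy step is applied to every facility in the dictionary, and the fully open facility $i_2$ is simply taken to be the first one found.

```python
def make_almost_integral(z, weight):
    """Greedy step in the spirit of Lemma 15.

    z: facility -> opening in [0, 1]; weight: facility -> f_i + u * c(i, j).
    Pours the total opening into facilities in non-decreasing weight order,
    so at most one facility ends up fractional.
    """
    remaining = sum(z.values())
    z_tilde = {i: 0.0 for i in z}
    for i in sorted(z, key=lambda i: weight[i]):
        z_tilde[i] = min(1.0, remaining)
        remaining -= z_tilde[i]
    return z_tilde

def round_cluster(z_tilde, u, eps):
    """Rounding step in the spirit of Lemma 16.

    Returns (z_hat, l_hat): integral openings and the demand each facility
    serves. Assumes the total opening is at least 1 (dense cluster).
    """
    l_hat = {i: z * u for i, z in z_tilde.items()}
    z_hat = dict(z_tilde)
    fractional = [i for i, z in z_tilde.items() if 0 < z < 1]
    if not fractional:
        return z_hat, l_hat
    i1 = fractional[0]
    if z_tilde[i1] < eps:
        # Case 1: close i1 and shift its demand to a fully open facility i2.
        i2 = next(i for i, z in z_tilde.items() if z == 1)
        l_hat[i2] += l_hat[i1]
        l_hat[i1] = 0.0
        z_hat[i1] = 0.0
    else:
        # Case 2: open i1 fully; its opening cost grows by at most 1/eps.
        z_hat[i1] = 1.0
    return z_hat, l_hat

u, eps = 10, 0.25
z_prime = {"a": 0.5, "b": 0.75, "c": 0.875}   # total opening 2.125
weight = {"a": 3.0, "b": 1.0, "c": 2.0}       # f_i + u * c(i, j), example values

z_tilde = make_almost_integral(z_prime, weight)
z_hat, l_hat = round_cluster(z_tilde, u, eps)

# Capacity is violated by a factor of at most (1 + eps), and no demand is lost.
assert all(l_hat[i] <= (1 + eps) * u * z_hat[i] for i in z_hat)
assert sum(l_hat.values()) == u * sum(z_prime.values())
```

With these numbers the fractional facility `a` ends up with opening 0.125, which is below `eps`, so the first case applies and its demand moves to `b`. Calling `round_cluster(z_tilde, u, 0.1)` instead exercises the second case and opens `a` fully.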

4 Conclusion

We presented the first constant factor approximation algorithm for the (uniform) capacitated knapsack median problem, violating the budget by an additional cost of $f_{max}$ in the facility opening cost and losing a factor of 5 in capacity. We also gave a constant factor ($O(1/\epsilon)$) approximation for uniform capacitated facility location at a loss of factor $(1 + \epsilon)$ in capacity. The result shows that the natural LP is not too bad.

The results presented here easily extend to the capacitated k-facility location problem as well. In particular, we obtain a factor 57 approximation (which may be improved further) for CapkFLP, violating the capacities by a factor of 5 and using $k + 1$ facilities. This improves upon the factor 191 for $l = 2$ of Byrka et al. [5]. With a little modification, we should be able to obtain a factor $O(1/\epsilon)$ algorithm violating the capacities by $(2 + \epsilon)$. This would improve the factor from $O(1/\epsilon^2)$ of Byrka et al. [5] to $O(1/\epsilon)$. It would be interesting to extend our results to non-uniform capacities. The conflicting requirements of facility costs and capacities make the problem challenging.

References

[1] Karen Aardal, Pieter L. van den Berg, Dion Gijswijt, and Shanfei Li. Approximation algorithms for hard capacitated k-facility location problems. European Journal of Operational Research, 242(2):358–368, 2015.
[2] Ankit Aggarwal, Anand Louis, Manisha Bansal, Naveen Garg, Neelima Gupta, Shubham Gupta, and Surabhi Jain. A 3-approximation algorithm for the facility location problem with uniform capacities. Journal of Mathematical Programming, 141(1-2):527–547, 2013.
[3] Hyung-Chan An, Mohit Singh, and Ola Svensson. LP-based algorithms for capacitated facility location. In Foundations of Computer Science (FOCS), 2014 IEEE 55th Annual Symposium on, pages 256–265, Oct 2014.
[4] Manisha Bansal, Naveen Garg, and Neelima Gupta. A 5-approximation for capacitated facility location. In Algorithms - ESA 2012 - 20th Annual European Symposium, Ljubljana, Slovenia, September 10-12, 2012, Proceedings, pages 133–144, 2012.
[5] Jaroslaw Byrka, Krzysztof Fleszar, Bartosz Rybicki, and Joachim Spoerhase. Bi-factor approximation algorithms for hard capacitated k-median problems. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, San Diego, CA, USA, January 4-6, 2015, pages 722–736, 2015.
[6] Jaroslaw Byrka, Thomas Pensyl, Bartosz Rybicki, Joachim Spoerhase, Aravind Srinivasan, and Khoa Trinh. An improved approximation algorithm for knapsack median using sparsification. In Algorithms - ESA 2015 - 23rd Annual European Symposium, Patras, Greece, September 14-16, 2015, Proceedings, pages 275–287, 2015.
[7] Jaroslaw Byrka, Thomas Pensyl, Bartosz Rybicki, Aravind Srinivasan, and Khoa Trinh. An improved approximation for k-median, and positive correlation in budgeted optimization. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, San Diego, CA, USA, January 4-6, 2015, pages 737–756, 2015.
[8] Jaroslaw Byrka, Bartosz Rybicki, and Sumedha Uniyal. An approximation algorithm for uniform capacitated k-median problem with (1 + ε) capacity violation. In Integer Programming and Combinatorial Optimization - 18th International Conference, IPCO 2016, Liège, Belgium, June 1-3, 2016, Proceedings, pages 262–274, 2016.
[9] Moses Charikar and Sudipto Guha. Improved combinatorial algorithms for the facility location and k-median problems. In Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science (FOCS), New York, NY, USA, pages 378–388, 1999.
[10] Moses Charikar and Sudipto Guha. Improved combinatorial algorithms for facility location problems. SIAM Journal on Computing, 34(4):803–824, 2005.
[11] Moses Charikar, Sudipto Guha, Éva Tardos, and David B. Shmoys. A constant-factor approximation algorithm for the k-median problem (extended abstract). In Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, May 1-4, 1999, Atlanta, Georgia, USA, pages 1–10, 1999.
[12] Moses Charikar and Shi Li. A dependent LP-rounding approach for the k-median problem. In Automata, Languages, and Programming - 39th International Colloquium, ICALP 2012, Warwick, UK, July 9-13, 2012, Proceedings, Part I, pages 194–205, 2012.
[13] H. Gökalp Demirci and Shi Li. Constant approximation for capacitated k-median with (1+epsilon)-capacity violation. In 43rd International Colloquium on Automata, Languages, and Programming, ICALP 2016, July 11-15, 2016, Rome, Italy, pages 73:1–73:14, 2016.
[14] Sudipto Guha and Samir Khuller. Greedy strikes back: Improved facility location algorithms. Journal of Algorithms, 31(1):228–248, 1999.
[15] Madhukar R. Korupolu, C. Greg Plaxton, and Rajmohan Rajaraman. Analysis of a local search heuristic for facility location problems. Journal of Algorithms, 37(1):146–188, 2000.
[16] Ravishankar Krishnaswamy, Amit Kumar, Viswanath Nagarajan, Yogish Sabharwal, and Barna Saha. The matroid median problem. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2011, San Francisco, California, USA, January 23-25, 2011, pages 1117–1130, 2011.
[17] Amit Kumar. Constant factor approximation algorithm for the knapsack median problem. In Proceedings of the Twenty-third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '12, pages 824–832, 2012.
[18] Retsef Levi, David B. Shmoys, and Chaitanya Swamy. LP-based approximation algorithms for capacitated facility location. Journal of Mathematical Programming, 131(1-2):365–379, 2012.
[19] Shanfei Li. An Improved Approximation Algorithm for the Hard Uniform Capacitated k-median Problem. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014), Germany, pages 325–338.
[20] Shi Li. A 1.488 approximation algorithm for the uncapacitated facility location problem. Journal of Information and Computation, 222:45–58, 2013.
[21] Shi Li. On uniform capacitated k-median beyond the natural LP relaxation. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, San Diego, CA, USA, January 4-6, 2015, pages 696–707, 2015.
[22] Shi Li. Approximating capacitated k-median with (1 + ε)k open facilities. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 786–796, 2016.
[23] Chaitanya Swamy. Improved approximation algorithms for matroid and knapsack median problems and applications. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2014, September 4-6, 2014, Barcelona, Spain, pages 403–418, 2014.

5 Appendix

5.1 Detailed proof for Lemma 12

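For $j \in C_D$, $i \in N_j$, $\hat j \in \tilde C_S$ s.t. $\lambda(\hat j) = \sigma(\hat j) = j$, we have

\begin{align*}
\sum_{j' \in C} c(i, j')\, \bar x_{ij'}
&= \sum_{j' \in C} c(i, j') \left[ \frac{\hat g_i}{d_j} A_{\sigma^*}(j', N_j) + \frac{\hat h_i}{d_{\hat j}} A_{\sigma^*}(j', N_{\hat j}) \right] \\
&= \frac{\hat g_i}{d_j} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, c(i, j') + \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j})\, c(i, j') \\
&\le \frac{\hat g_i}{d_j} \sum_{j' \in C} A_{\sigma^*}(j', N_j) \left( c(j, j') + c(i, j) \right) + \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j}) \left( c(\hat j, j') + c(j, \hat j) + c(j, i) \right) \\
&\qquad \text{(by triangle inequality)} \\
&= \frac{\hat g_i}{d_j} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, c(j, j') + \frac{\hat g_i}{d_j} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, c(i, j) + \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j})\, c(\hat j, j') \\
&\qquad + \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j})\, c(j, \hat j) + \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j})\, c(j, i) \\
&\le \frac{\hat g_i}{d_j} \sum_{j' \in C} A_{\sigma^*}(j', N_j) \left( 2c(i', j') + 4\hat C_{j'} \right) + \frac{\hat g_i}{d_j} \sum_{j' \in C} A_{\sigma^*}(j', N_j)\, c(i, j) + \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j}) \left( 2c(i', j') + 4\hat C_{j'} \right) \\
&\qquad + \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j})\, c(j, \hat j) + \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j})\, c(j, i) \qquad \text{(by Lemma (3))} \\
&= 2\, \frac{\hat g_i}{d_j} \sum_{j' \in C} \sum_{i' \in N_j} c(i', j')\, x^*_{i'j'} + 4\, \frac{\hat g_i}{d_j} \sum_{j' \in C} \hat C_{j'}\, A_{\sigma^*}(j', N_j) + \frac{\hat g_i}{d_j}\, c(i, j) \sum_{j' \in C} A_{\sigma^*}(j', N_j) \\
&\qquad + 2\, \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} \sum_{i' \in N_{\hat j}} x^*_{i'j'}\, c(i', j') + 4\, \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j})\, \hat C_{j'} + \hat h_i\, c(j, \hat j) + \hat h_i\, c(j, i) \\
&\qquad \text{(by definition of } A_{\sigma^*}(j', N_j) \text{ and } A_{\sigma^*}(j', N_{\hat j}) \text{, and } \textstyle\sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j}) = d_{\hat j} \text{)} \\
&= 2\, \frac{\hat g_i}{d_j} \sum_{j' \in C} Cost_s(j', \sigma^*, N_j) + 4\, \frac{\hat g_i}{d_j} \sum_{j' \in C} \hat C_{j'}\, A_{\sigma^*}(j', N_j) + \hat g_i\, c(i, j) \\
&\qquad + 2\, \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} Cost_s(j', \sigma^*, N_{\hat j}) + 4\, \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j})\, \hat C_{j'} + \hat h_i\, c(j, \hat j) + \hat h_i\, c(j, i) \\
&\qquad \text{(because } \textstyle\sum_{j' \in C} A_{\sigma^*}(j', N_j) = d_j \text{)}
\end{align*}

Summing over all $i \in N_j$, we get

\begin{align*}
\sum_{i \in N_j} \sum_{j' \in C} c(i, j')\, \bar x_{ij'}
&\le 2 \sum_{i \in N_j} \frac{\hat g_i}{d_j} \sum_{j' \in C} Cost_s(j', \sigma^*, N_j) + 4 \sum_{i \in N_j} \frac{\hat g_i}{d_j} \sum_{j' \in C} \hat C_{j'}\, A_{\sigma^*}(j', N_j) + \sum_{i \in N_j} \hat g_i\, c(i, j) \\
&\qquad + 2 \sum_{i \in N_j} \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} Cost_s(j', \sigma^*, N_{\hat j}) + 4 \sum_{i \in N_j} \frac{\hat h_i}{d_{\hat j}} \sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j})\, \hat C_{j'} + \sum_{i \in N_j} \hat h_i\, c(j, \hat j) + \sum_{i \in N_j} \hat h_i\, c(j, i) \\
&= 2 \sum_{j' \in C} Cost_s(j', \sigma^*, N_j) + 4 \sum_{j' \in C} \hat C_{j'}\, A_{\sigma^*}(j', N_j) + 2 \sum_{i \in N_j} u\, c(i, j) \\
&\qquad + 2 \sum_{j' \in C} Cost_s(j', \sigma^*, N_{\hat j}) + 4 \sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j})\, \hat C_{j'} + d_{\hat j}\, c(j, \hat j) + \sum_{i \in N_j} d_{\hat j}\, c(j, i) \\
&\qquad \text{(because } \textstyle\sum_{i \in N_j} \hat g_i = d_j,\ \sum_{i \in N_j} \hat h_i = d_{\hat j},\ \hat g_i \le 2u \text{ and } \hat h_i < d_{\hat j} \text{)} \\
&\le 2 \sum_{j' \in C} Cost_s(j', \sigma^*, N_j) + 4 \sum_{j' \in C} \hat C_{j'}\, A_{\sigma^*}(j', N_j) + 2 \sum_{i \in N_j} u\, c(i, j)\, \hat w_i \\
&\qquad + 2 \sum_{j' \in C} Cost_s(j', \sigma^*, N_{\hat j}) + 4 \sum_{j' \in C} A_{\sigma^*}(j', N_{\hat j})\, \hat C_{j'} + d_{\hat j}\, c(j, \hat j) + \sum_{i \in N_j} u\, c(j, i)\, \hat w_i \\
&\qquad \text{(because } d_{\hat j} < u \text{ and } \hat w_i = 1 \text{ or } 0 \text{)}
\end{align*}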
