A 2-Approximation Algorithm for the Soft-Capacitated Facility Location Problem
Mohammad Mahdian*
Yinyu Ye†
Jiawei Zhang‡
Abstract
This paper is divided into two parts. In the first part, we present a 2-approximation algorithm for the soft-capacitated facility location problem. This achieves the integrality gap of the natural LP relaxation of the problem. The algorithm is based on an improved analysis of an algorithm for the linear facility location problem, and a bifactor approximate reduction from this problem to the soft-capacitated facility location problem. We define the concept of bifactor approximate reductions and use it to improve the approximation factor of several other variants of the facility location problem. In the second part of the paper, we present an alternative analysis of the authors' 1.52-approximation algorithm for the uncapacitated facility location problem, using a single factor-revealing LP. This answers an open question of [18]. Furthermore, this analysis, combined with a recent result of Thorup [25], shows that our algorithm can be implemented in quasi-linear time, achieving the best known approximation factor in the best possible running time.
1
Introduction
Variants of the facility location problem (FLP) have been studied extensively in the operations research and management science literature and have received considerable attention in the area of approximation algorithms [21]. In the metric uncapacitated facility location problem (UFLP), which is the most basic facility location problem, we are given a set $F$ of facilities, a set $C$ of cities (a.k.a. clients), a cost $f_i$ for opening facility $i \in F$, and a connection cost $c_{ij}$ for connecting client $j$ to facility $i$. The objective is to open a subset of the facilities in $F$ and connect each city to an open facility so that the total cost is minimized. We assume that the connection costs are metric, meaning that they are symmetric and satisfy the triangle inequality. Since the first constant-factor approximation algorithm due to Shmoys, Tardos and Aardal [22], a large number of approximation algorithms have been proposed for the UFLP [23, 12, 13, 17, 24, 14, 1, 3, 4, 5, 8, 14, 16], and the current best known approximation factor is 1.52, given by Mahdian, Ye and Zhang [18]. Guha and Khuller [8] proved that it is impossible to get an approximation guarantee of 1.463 for the UFLP, unless $NP \subseteq DTIME[n^{O(\log\log n)}]$.
* Laboratory for Computer Science, MIT, Cambridge, MA 02139, USA. E-mail: [email protected].
† Department of Management Science and Engineering, School of Engineering, Stanford University, Stanford, CA 94305, USA. E-mail: [email protected]. Research supported in part by NSF grant DMI-0231600.
‡ Department of Management Science and Engineering, School of Engineering, Stanford University, Stanford, CA 94305, USA. E-mail: [email protected]. Research supported in part by NSF grant DMI-0231600.
The growing interest in the UFLP stems not only from its applications in a large number of settings [7], but also from the fact that the UFLP is one of the most basic models among discrete location problems. The insights gained in dealing with the UFLP may also apply to more complicated location models, and in many cases the latter can be reduced directly to the UFLP.

In this paper, we give a 2-approximation algorithm for the soft-capacitated facility location problem (SCFLP) by reducing it to the UFLP. The SCFLP is similar to the UFLP, except that there is a capacity $u_i$ associated with each facility $i$, which means that if we want this facility to serve $x$ cities, we have to open it $\lceil x/u_i \rceil$ times at a cost of $f_i \lceil x/u_i \rceil$. This problem is also known as the facility location problem with integer decision variables in the operations research literature (see [2]). Chudak and Shmoys [6] gave a 3-approximation algorithm for the SCFLP with uniform capacities (i.e., $u_i = u$ for all $i \in F$) using LP rounding. For non-uniform capacities, Jain and Vazirani [14] showed how to reduce this problem to the UFLP, and by solving the UFLP through a primal-dual algorithm, they obtained a 4-approximation. A local search algorithm proposed by Arya et al. [1] had an approximation ratio of 3.72. Following the approach of Jain and Vazirani [14], Jain, Mahdian, and Saberi [13] showed that the SCFLP can be approximated within a factor of 3. This result was further improved by the authors [18] to a 2.89-approximation, which was the best previously known algorithm for this problem. We improve this factor to 2, achieving the integrality gap of the natural LP relaxation of the problem.

The main idea of our algorithm is to consider algorithms and reductions that have separate (not necessarily equal) approximation factors for facility and connection costs. We will define the concept of bifactor approximate reduction in this paper, and show how it can be used to get an approximation factor of 2 for the SCFLP.
We will also generalize our algorithm to a common generalization of the SCFLP and the concave-cost FLP. The idea of using bifactor approximation algorithms and reductions can be used to improve the approximation factor of several other problems in a straightforward manner. We show two examples of this in Appendix C.

In the second part of this paper, we present an alternative analysis for the 1.52-approximation algorithm for the UFLP [18] using a single factor-revealing LP. This answers an open question of [18]. Furthermore, this analysis shows that the second phase of the 1.52 algorithm can be implemented in quasi-linear time. This, together with a recent result of Thorup [25], proves that our algorithm can be implemented in quasi-linear time, achieving the best known approximation factor in essentially the best possible running time.

The rest of this paper is organized as follows. In Section 2 we present the necessary definitions and notation. In Section 3 we present a lemma on the approximability of the linear-cost facility location problem. In Section 4 we define the concept of bifactor approximate reductions between facility location problems, and present an algorithm for the SCFLP and for a common generalization of the SCFLP and the concave-cost FLP, using the lemma proved in Section 3 and a bifactor reduction from the SCFLP to the linear-cost FLP. Then, in Section 5, we present a new analysis of the 1.52 algorithm for the UFLP and show how it leads to an implementation in quasi-linear time.
2
Definitions and notations
In this paper, we will define reductions between various facility location problems. Many such problems can be considered as special cases of the generalized facility location problem, as defined below. This problem was first defined and studied in [10].

Definition 1 In the metric generalized facility location problem, we are given a set $C$ of $n_c$ cities, a set $F$ of $n_f$ facilities, a connection cost $c_{ij}$ between city $j$ and facility $i$ for every $i \in F$, $j \in C$, and a facility cost function $f_i : \{0, \ldots, n_c\} \to \mathbb{R}^+$ for every $i \in F$. Connection costs are symmetric and obey the triangle inequality. The value of $f_i(k)$ equals the cost of opening facility $i$ if it is used to serve $k$ cities. A solution to the problem is a function $\phi : C \to F$ assigning each city to a facility. The facility cost $F_\phi$ of the solution $\phi$ is defined as $\sum_{i \in F} f_i(|\{j : \phi(j) = i\}|)$, i.e., the total cost for opening facilities. The connection cost (a.k.a. service cost) $C_\phi$ of $\phi$ is $\sum_{j \in C} c_{\phi(j),j}$, i.e., the total cost of connecting each city to its assigned facility. The objective is to find a solution $\phi$ that minimizes the sum $F_\phi + C_\phi$.

Now we can define the uncapacitated and soft-capacitated facility location problems as special cases of the generalized FLP:

Definition 2 The metric uncapacitated facility location problem (UFLP) is a special case of the generalized FLP in which all facility cost functions are of the following form: for each $i \in F$, $f_i(k) = 0$ if $k = 0$, and $f_i(k) = f_i$ if $k > 0$, where $f_i$ is a constant (which is called the facility cost of $i$).

Definition 3 The metric soft-capacitated facility location problem (SCFLP) is a special case of the generalized FLP in which all facility cost functions are of the form $f_i(k) = f_i \lceil k/u_i \rceil$, where $f_i$ and $u_i$ are constants for every $i \in F$. The constant $u_i$ is called the capacity of facility $i$.

The 1.52-approximation algorithm of Mahdian, Ye, and Zhang [18] is built upon an earlier approximation algorithm of Jain, Mahdian, and Saberi [13]. We denote these two algorithms by the MYZ and the JMS algorithms, respectively. The analyses of both algorithms share the feature that they allow the approximation factor for the facility cost to be different from the approximation factor for the connection cost, and give a way to compute the tradeoff between these two factors. The following definition captures this notion.
Definition 4 An algorithm is called a $(\gamma_f, \gamma_c)$-approximation algorithm for the generalized FLP if, for every instance $I$ of the generalized FLP and for every solution SOL for $I$ with facility cost $F_{SOL}$ and connection cost $C_{SOL}$, the cost of the solution found by the algorithm is at most $\gamma_f F_{SOL} + \gamma_c C_{SOL}$.

Recall the following theorem of Jain et al. [13] on the approximation factor of the JMS algorithm.

Theorem A [13]. Let $\gamma_f \ge 1$ be fixed and $\gamma_c := \sup_k \{z_k\}$, where $z_k$ is the solution of the following optimization program (which we call the factor-revealing LP).

\[
\begin{aligned}
\text{maximize} \quad & \frac{\sum_{i=1}^{k} \alpha_i - \gamma_f f}{\sum_{i=1}^{k} d_i} && (1)\\
\text{subject to} \quad & \forall\, 1 \le i < k: \ \alpha_i \le \alpha_{i+1} && (2)\\
& \forall\, 1 \le j < i < k: \ r_{j,i} \ge r_{j,i+1} && (3)\\
& \forall\, 1 \le j < i \le k: \ \alpha_i \le r_{j,i} + d_i + d_j && (4)\\
& \forall\, 1 \le i \le k: \ \sum_{j=1}^{i-1} \max(r_{j,i} - d_j, 0) + \sum_{j=i}^{k} \max(\alpha_i - d_j, 0) \le f && (5)\\
& \forall\, 1 \le j \le i \le k: \ \alpha_j, d_j, f, r_{j,i} \ge 0 && (6)
\end{aligned}
\]

Then the JMS algorithm is a $(\gamma_f, \gamma_c)$-approximation algorithm for the UFLP.

We will use the above theorem in this paper to give an alternative proof of the following theorem about the performance of the MYZ algorithm.

Theorem B [18]. Let $(\gamma_f, \gamma_c)$ be a pair obtained from the above factor-revealing LP. Then for every $\delta \ge 1$, there is a $(\gamma_f + \ln(\delta) + \epsilon,\ 1 + \frac{\gamma_c - 1}{\delta})$-approximation algorithm for the UFLP.

3
The linear-cost facility location problem
The linear-cost facility location problem is a special case of the generalized FLP in which the facility costs are of the form

\[
f_i(k) = \begin{cases} 0 & k = 0 \\ a_i k + b_i & k > 0 \end{cases}
\]

where $a_i$ and $b_i$ are nonnegative values for each $i \in F$. The values $a_i$ and $b_i$ are called the marginal (a.k.a. incremental) and setup cost of facility $i$, respectively.

We denote an instance of the linear-cost FLP with marginal costs $(a_i)$, setup costs $(b_i)$, and connection costs $(c_{ij})$ by $LFLP(a, b, c)$. Clearly, the regular UFLP is a special case of the linear-cost FLP with $a_i = 0$, i.e., $LFLP(0, b, c)$. Furthermore, it is straightforward to see that $LFLP(a, b, c)$ is equivalent to an instance of the regular UFLP in which the marginal costs are added to the connection costs. More precisely, let $\bar{c}_{ij} = c_{ij} + a_i$ for $i \in F$ and $j \in C$, and consider an instance of the UFLP with facility costs $(b_i)$ and connection costs $(\bar{c}_{ij})$. We denote this instance by $UFLP(b, c + a)$. It is easy to see that $LFLP(a, b, c)$ is equivalent to $UFLP(b, c + a)$: every assignment of cities to facilities has the same total cost in both instances. Thus, the linear-cost FLP can be solved using any algorithm for the UFLP, and the overall approximation ratio will be the same. However, for applications in the next section, we need bifactor approximation factors of the algorithm (as defined in Definition 4). Because the marginal costs move from the facility cost to the connection cost, it is not necessarily true that applying a $(\gamma_f, \gamma_c)$-approximation algorithm for the UFLP on the instance $UFLP(b, a + c)$ will give a $(\gamma_f, \gamma_c)$-approximate solution for $LFLP(a, b, c)$. However, we will show that the JMS algorithm has this property. The following lemma, whose proof is given in Appendix A, generalizes Theorem A for the linear-cost FLP.

Lemma 1 Let $(\gamma_f, \gamma_c)$ be a pair obtained from the factor-revealing LP in Theorem A. Then applying the JMS algorithm on the instance $UFLP(b, a + c)$ will give a $(\gamma_f, \gamma_c)$-approximate solution for $LFLP(a, b, c)$.

The above lemma and Theorem 9 in [13] give us the following corollary, which will be used in the next section.

Corollary 2 There is a $(1, 2)$-approximation algorithm for the linear-cost facility location problem.

It is worth mentioning that the MYZ algorithm can also be generalized for the linear-cost FLP. The only trick is to scale up both $a$ and $b$ in the first phase by a factor of $\delta$, and scale them both down in the second phase. The rest of the proof is almost the same as the proof of Lemma 1.
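The equivalence between $LFLP(a, b, c)$ and $UFLP(b, c + a)$ can be checked on a small example. The following is a minimal sketch; the toy instance and the helper names `lflp_cost` and `uflp_cost` are hypothetical and chosen only for illustration. It evaluates one assignment in both instances: the totals agree, but the split between facility and connection cost differs, which is why bifactor guarantees do not transfer automatically.

```python
from collections import Counter

def lflp_cost(assign, a, b, c):
    """(facility, connection) cost of an assignment in LFLP(a, b, c).

    A facility i serving k > 0 cities costs a[i] * k + b[i].
    """
    load = Counter(assign.values())
    facility = sum(a[i] * k + b[i] for i, k in load.items())
    connection = sum(c[i][j] for j, i in assign.items())
    return facility, connection

def uflp_cost(assign, b, c_bar):
    """(facility, connection) cost of the same assignment in UFLP(b, c_bar)."""
    facility = sum(b[i] for i in set(assign.values()))
    connection = sum(c_bar[i][j] for j, i in assign.items())
    return facility, connection

# Hypothetical toy instance: 2 facilities, 3 cities.
a = [1, 2]                      # marginal costs
b = [5, 3]                      # setup costs
c = [[1, 2, 3], [2, 1, 1]]      # c[i][j]: facility i, city j
c_bar = [[c[i][j] + a[i] for j in range(3)] for i in range(2)]
assign = {0: 0, 1: 1, 2: 1}     # city -> facility

lf, lc = lflp_cost(assign, a, b, c)
uf, uc = uflp_cost(assign, b, c_bar)
print((lf, lc), (uf, uc))       # same total, different split
```

In this instance the marginal costs account for 5 units that are charged to the facilities in the linear-cost instance and to the connections in the UFLP instance.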
4
The soft-capacitated facility location problem
In this section we will show how the soft-capacitated facility location problem can be reduced to the linear-cost FLP. In Section 4.1 we define the concept of reduction between facility location problems. We will use this concept in Sections 4.2 and 4.3 to obtain approximation algorithms for the SCFLP and for a common generalization of the SCFLP and the concave-cost FLP.
4.1 Reduction between facility location problems

Definition. A reduction from a facility location problem $A$ to another facility location problem $B$ is an efficient procedure $R$ that maps every instance $I$ of $A$ to an instance $R(I)$ of $B$. This procedure is called a $(\sigma_f, \sigma_c)$-reduction if the following conditions hold.

1. For any instance $I$ of $A$ and any feasible solution for $I$ with facility cost $F_A^*$ and connection cost $C_A^*$, there is a corresponding solution for the instance $R(I)$ with facility cost $F_B^* \le \sigma_f F_A^*$ and connection cost $C_B^* \le \sigma_c C_A^*$.

2. For any feasible solution for the instance $R(I)$, there is a corresponding feasible solution for $I$ whose total cost is at most as much as the total cost of the original solution for $R(I)$. In other words, the FLP instance $R(I)$ is an over-estimate of the FLP instance $I$.

Theorem 3 If there is a $(\sigma_f, \sigma_c)$-reduction from a facility location problem $A$ to another facility location problem $B$, and a $(\gamma_f, \gamma_c)$-approximation algorithm for $B$, then there is a $(\gamma_f \sigma_f, \gamma_c \sigma_c)$-approximation algorithm for $A$.
Proof. On an instance $I$ of the problem $A$, we compute $R(I)$, run the $(\gamma_f, \gamma_c)$-approximation algorithm for $B$ on $R(I)$, and output the corresponding solution for $I$. In order to see why this is a $(\gamma_f \sigma_f, \gamma_c \sigma_c)$-approximation algorithm for $A$, let SOL denote an arbitrary solution for $I$, ALG denote the solution that the above algorithm finds, and $F_P^*$ and $C_P^*$ ($F_P^{ALG}$ and $C_P^{ALG}$, respectively) denote the facility and connection costs of SOL (ALG, respectively) when viewed as a solution for the problem $P$ ($P = A, B$). By the definition of $(\sigma_f, \sigma_c)$-reductions and $(\gamma_f, \gamma_c)$-approximation algorithms we have

\[
F_A^{ALG} + C_A^{ALG} \le F_B^{ALG} + C_B^{ALG} \le \gamma_f F_B^* + \gamma_c C_B^* \le \gamma_f \sigma_f F_A^* + \gamma_c \sigma_c C_A^*,
\]

which completes the proof of the theorem. We will see examples of reductions in the rest of this paper.
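Theorem 3 says that bifactor guarantees compose by coordinate-wise multiplication. As a small sketch (the function name `compose` is hypothetical), the pairs used later in Section 4.2, a $(2, 1)$-reduction and a $(1, 2)$-approximation algorithm, combine as follows:

```python
def compose(reduction, approximation):
    """Bifactor guarantee obtained from Theorem 3.

    reduction:     (sigma_f, sigma_c) of a reduction from A to B
    approximation: (gamma_f, gamma_c) of an algorithm for B
    """
    sigma_f, sigma_c = reduction
    gamma_f, gamma_c = approximation
    return (gamma_f * sigma_f, gamma_c * sigma_c)

factors = compose((2, 1), (1, 2))
print(factors)        # bifactor guarantee for problem A
print(max(factors))   # single approximation factor implied by it
```

A $(\gamma_f, \gamma_c)$-approximation is in particular a $\max(\gamma_f, \gamma_c)$-approximation, which is how a single factor is read off the pair.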
4.2 The soft-capacitated facility location problem

In this subsection, we give a 2-approximation algorithm for the soft-capacitated FLP by reducing it to the linear-cost FLP.

Theorem 4 There is a 2-approximation algorithm for the soft-capacitated facility location problem.

Proof. We use the following reduction: Construct an instance of the linear-cost FLP, where we have the same sets of facilities and clients. The connection costs remain the same. However, the facility cost of the $i$th facility is $\left(1 + \frac{k-1}{u_i}\right) f_i$ if $k \ge 1$ and $0$ if $k = 0$. Note that, for every $k \ge 1$,

\[
\left\lceil \frac{k}{u_i} \right\rceil \le 1 + \frac{k-1}{u_i} \le 2 \left\lceil \frac{k}{u_i} \right\rceil .
\]

Therefore, it is easy to see that this reduction is a $(2, 1)$-reduction. By Corollary 2, there is a $(1, 2)$-approximation algorithm for the linear-cost FLP, which together with Theorem 3 completes the proof.
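The reduction in this proof can be sanity-checked numerically. The sketch below uses exact rational arithmetic; the function names are hypothetical. It also writes the new facility cost in the linear-cost form $a k + b$ of Section 3, with $a = f/u$ and $b = f(1 - 1/u)$, which follows by expanding $(1 + \frac{k-1}{u}) f$.

```python
from fractions import Fraction

def scflp_cost(k, f, u):
    """Soft-capacitated facility cost f * ceil(k / u), using integer ceiling."""
    return 0 if k == 0 else f * -(-k // u)

def reduced_cost(k, f, u):
    """Facility cost in the constructed linear-cost instance."""
    return 0 if k == 0 else f * (1 + Fraction(k - 1, u))

def linear_coefficients(f, u):
    """Marginal cost a and setup cost b with a*k + b == reduced_cost(k) for k >= 1."""
    return Fraction(f, u), f * (1 - Fraction(1, u))

def sandwich_holds(max_u, max_k, f=1):
    """Check ceil(k/u)*f <= (1 + (k-1)/u)*f <= 2*ceil(k/u)*f on a grid of values."""
    return all(
        scflp_cost(k, f, u) <= reduced_cost(k, f, u) <= 2 * scflp_cost(k, f, u)
        for u in range(1, max_u + 1)
        for k in range(0, max_k + 1)
    )

print(sandwich_holds(max_u=12, max_k=60))
```

The left inequality gives condition 2 of the definition of a reduction (the new instance over-estimates the old one), and the right inequality gives $\sigma_f = 2$; connection costs are unchanged, so $\sigma_c = 1$.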
Furthermore, we now illustrate that the following natural linear programming formulation of the SCFLP has an integrality gap of 2. This means that we cannot obtain a better approximation ratio using this LP relaxation as the lower bound.

\[
\begin{aligned}
\text{minimize} \quad & \sum_{i \in F} f_i y_i + \sum_{i \in F} \sum_{j \in C} c_{ij} x_{ij} && (7)\\
\text{subject to} \quad & \forall\, i \in F,\ j \in C: \ x_{ij} \le y_i \\
& \forall\, i \in F: \ \sum_{j \in C} x_{ij} \le u_i y_i \\
& \forall\, j \in C: \ \sum_{i \in F} x_{ij} = 1 \\
& \forall\, i \in F,\ j \in C: \ x_{ij} \in \{0, 1\} && (8)\\
& \forall\, i \in F: \ y_i \text{ is a nonnegative integer} && (9)
\end{aligned}
\]

In a natural linear programming relaxation, we replace the constraints (8) and (9) by $x_{ij} \ge 0$ and $y_i \ge 0$. Here we see that even if we only relax constraint (9), the integrality gap is 2. Consider an instance of the SCFLP that consists of only one potential facility $i$ and $k \ge 2$ clients. Assume that the capacity of facility $i$ is $k - 1$, the facility cost is 1, and all connection costs are 0. It is clear that the optimal integral solution has cost 2. However, after relaxing constraint (9), the optimal fractional solution has cost $1 + \frac{1}{k-1}$. Therefore, the integrality gap between the integer program and its relaxation is $\frac{2(k-1)}{k}$, which tends to 2 as $k$ tends to infinity.
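The gap instance is simple enough to evaluate directly. The following sketch (with a hypothetical function name) computes both optima for the one-facility instance: the integral optimum needs $\lceil k/(k-1) \rceil$ copies of the facility, while the relaxation only needs $y_i = k/(k-1)$ to satisfy the capacity constraint.

```python
from fractions import Fraction

def integrality_gap(k):
    """Gap for one facility with capacity k - 1, cost 1, and k >= 2 clients at distance 0."""
    capacity = k - 1
    integral = -(-k // capacity)         # ceil(k / capacity) copies, each of cost 1
    fractional = Fraction(k, capacity)   # smallest y with k <= capacity * y
    return integral / fractional

for k in (2, 10, 1000):
    print(k, integrality_gap(k))
```

The ratio is below 2 for every finite $k$ and approaches 2 only in the limit.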
4.3 The concave soft-capacitated facility location problem

In this subsection, we consider a common generalization of the soft-capacitated facility location problem and the concave-cost facility location problem. This problem, which we refer to as the concave soft-capacitated FLP, is the same as the soft-capacitated FLP except that if $r \ge 0$ copies of facility $i$ are open, then the facility cost is $g(r) a_i$, where $g(r)$ is a given concave function of $r$. In other words, the concave soft-capacitated FLP is a special case of the generalized FLP in which the facility cost functions are of the form $f_i(x) = a_i\, g(\lceil x/u_i \rceil)$ for constants $a_i, u_i$ and a concave function $g$. It is also a special case of the so-called staircase cost facility location problem [11]. On the other hand, it is a common generalization of the soft-capacitated FLP (when $g(r) = r$) and the concave-cost FLP (when $u_i = 1$ for all $i$). The concave-cost FLP is a special case of the generalized FLP in which facility cost functions are required to be concave (see [10]). The main result of this subsection is the following.

Theorem 5 The concave soft-capacitated FLP is $\left(\frac{g(2)}{g(1)}, 1\right)$-reducible to the linear-cost FLP.

The proof of the above theorem is given in Appendix B. We first show that the concave soft-capacitated FLP is $\left(\frac{g(2)}{g(1)}, 1\right)$-reducible to the concave-cost FLP. Then we show that the latter is equivalent to the linear-cost FLP. Therefore, by Theorem 5, a good approximation algorithm for the linear-cost FLP would imply a good approximation for the concave soft-capacitated FLP.
5
The uncapacitated facility location problem
In this section, we present a new analysis of the 1.52-approximation algorithm of Mahdian, Ye, and Zhang [18] for the UFLP. The analysis of the MYZ algorithm in [18] is based on combining a result of Jain et al. [13] (which is proved using factor-revealing LPs) with an analysis of a greedy augmentation procedure of Charikar et al. [3]. Here, we analyze the MYZ algorithm using a single factor-revealing LP. This gives us a new perspective on the MYZ algorithm. As a corollary, we use a recent result of Thorup [25], that the JMS algorithm can be implemented in quasi-linear time, to improve the running time of the MYZ algorithm.

We begin by sketching the MYZ algorithm. The algorithm consists of two phases. In the first phase, we scale up the facility costs in the instance by a factor of $\delta$ (which will be fixed later), and then run the JMS algorithm (see [13] for a description) on the modified instance. In addition to finding a solution for the scaled instance, the JMS algorithm outputs the share of each city of the total cost of the solution. Let $\alpha_j$ denote the share of city $j$ of the total cost (therefore the total cost of the solution is $\sum_j \alpha_j$). The main step in the analysis of the JMS algorithm is to prove that for any collection $S$ of one facility $f_S$ with opening cost $\delta f$ ($f$ in the original instance) and $k$ cities with connection costs $d_1, \ldots, d_k$ to $f_S$ and shares $\alpha_1, \ldots, \alpha_k$ of the total cost, the values $\delta f$, $d_j$'s, $\alpha_j$'s and $r_{j,i}$'s (whose definition is omitted here, since we don't need it) satisfy the inequalities (2)-(6) in Theorem A, except that the inequality (5) is replaced by

\[
\forall\, 1 \le i \le k: \ \sum_{j=1}^{i-1} \max(r_{j,i} - d_j, 0) + \sum_{j=i}^{k} \max(\alpha_i - d_j, 0) \le \delta f \qquad (10)
\]
In the second phase of the MYZ algorithm we reduce the scaling factor $\delta$ continuously, until it gets to 1. If at any point during this process a facility could be opened without increasing the total cost (i.e., if the opening cost of the facility equals the total amount that cities can save by switching their "service provider" to that facility), then we open the facility and connect each city to its closest open facility. The second phase of the MYZ algorithm is equivalent to a greedy augmentation procedure of [8, 3], and, in fact, a lemma from [3] is used in [18] in order to analyze the second phase. Here we analyze this phase differently.

First, we modify the second phase as follows: Instead of decreasing the scaling factor continuously from $\delta$ to 1, we decrease it discretely in $L$ steps, where $L$ is a constant. Let $\delta_i$ denote the value of the scaling factor in the $i$'th step. Therefore, $\delta = \delta_1 > \delta_2 > \ldots > \delta_L = 1$. We will fix the values of the $\delta_i$'s later. After decreasing the scaling factor from $\delta_{i-1}$ to $\delta_i$, we consider facilities in an arbitrary order, and open those that can be opened without increasing the total cost. We denote this modified algorithm by MYZ$_L$. Clearly, if $L$ is sufficiently large (depending on the instance), the algorithm MYZ$_L$ computes the same solution as the MYZ algorithm.

In order to analyze the above algorithm, we need to add extra variables and inequalities to the inequalities (2), (3), (4), (10) and (6). Let $r_{j,k+i}$ denote the connection cost that city $j$ in $S$ pays after we change the scaling factor to $\delta_i$ and process all facilities as described above (thus, $r_{j,k+1}$ is the connection cost of city $j$ after the first phase). Therefore, by the description of the algorithm, we have

\[
\forall\, 1 \le i \le L: \ \sum_{j=1}^{k} \max(r_{j,k+i} - d_j, 0) \le \delta_i f, \qquad (11)
\]

since otherwise we could open $f_S$ and decrease the total cost.
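The discretized second phase described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the first phase (the JMS algorithm on the scaled instance) is not reproduced, so the set of facilities it opens is taken as an input, and the function name, argument layout and toy instance are hypothetical.

```python
def second_phase(open_set, f, c, deltas):
    """Discretized greedy augmentation, a sketch of the second phase of MYZ_L.

    open_set: non-empty set of facilities opened by the first phase (assumed given)
    f[i]:     opening cost of facility i in the original instance
    c[i][j]:  connection cost between facility i and city j
    deltas:   scaling factors delta_1 > delta_2 > ... > delta_L = 1
    """
    open_set = set(open_set)
    n = len(c[0])
    # Each city currently pays the distance to its closest open facility.
    conn = [min(c[i][j] for i in open_set) for j in range(n)]
    for delta in deltas[1:]:            # the factor drops from delta_{i-1} to delta_i
        for i in range(len(f)):         # facilities in an arbitrary (here: index) order
            if i in open_set:
                continue
            saving = sum(max(conn[j] - c[i][j], 0) for j in range(n))
            if saving >= delta * f[i]:  # opening i does not increase the total cost
                open_set.add(i)
                conn = [min(conn[j], c[i][j]) for j in range(n)]
    return open_set, conn

# Hypothetical toy instance: facility 0 is open after the first phase.
c = [[1, 5, 5], [5, 1, 1]]
print(second_phase({0}, [4, 3], c, [2, 1.5, 1]))
```

After every round, no closed facility satisfies the opening test at the current scaling factor, which is the property that inequality (11) expresses for the collection $S$.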
Now, we compute the share of city $j$ of the total cost of the solution that the MYZ$_L$ algorithm finds. In the first phase of the algorithm, the share of city $j$ of the total cost is $\alpha_j$. Of this amount, $r_{j,k+1}$ is spent on the connection cost, and $\alpha_j - r_{j,k+1}$ is spent on the facility costs. However, since the facility costs are scaled up by a factor of $\delta$ in the first phase, the share of city $j$ of facility costs in the original instance is equal to $(\alpha_j - r_{j,k+1})/\delta$. After we reduce the scaling factor from $\delta_i$ to $\delta_{i+1}$ ($i = 1, \ldots, L-1$), the connection cost of city $j$ is reduced from $r_{j,k+i}$ to $r_{j,k+i+1}$. Therefore, in this step, the share of city $j$ of the facility costs is $r_{j,k+i} - r_{j,k+i+1}$ with respect to the scaled instance, or $(r_{j,k+i} - r_{j,k+i+1})/\delta_{i+1}$ with respect to the original instance. Thus, at the end of the algorithm, the total share of city $j$ of facility costs is

\[
\frac{\alpha_j - r_{j,k+1}}{\delta} + \sum_{i=1}^{L-1} \frac{r_{j,k+i} - r_{j,k+i+1}}{\delta_{i+1}} .
\]

We also know that the final amount that city $j$ pays for the connection cost is $r_{j,k+L}$. Therefore, using $\delta_1 = \delta$ and $\delta_L = 1$, the share of city $j$ of the total cost of the solution is:

\[
\frac{\alpha_j - r_{j,k+1}}{\delta} + \sum_{i=1}^{L-1} \frac{r_{j,k+i} - r_{j,k+i+1}}{\delta_{i+1}} + r_{j,k+L} = \frac{\alpha_j}{\delta} + \sum_{i=1}^{L-1} \left( \frac{1}{\delta_{i+1}} - \frac{1}{\delta_i} \right) r_{j,k+i} . \qquad (12)
\]
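The rearrangement in (12) is a telescoping sum that relies on $\delta_1 = \delta$ and $\delta_L = 1$. The following sketch checks it with exact rational arithmetic on arbitrary sample values; the function names and the numbers are hypothetical.

```python
from fractions import Fraction

def share_stepwise(alpha, r, deltas):
    """Left-hand side of (12): facility shares per step plus the final connection cost.

    r[i - 1] stands for r_{j,k+i} and deltas[i - 1] for delta_i, i = 1..L.
    """
    L = len(deltas)
    total = (alpha - r[0]) / deltas[0]
    total += sum((r[i] - r[i + 1]) / deltas[i + 1] for i in range(L - 1))
    return total + r[L - 1]

def share_closed_form(alpha, r, deltas):
    """Right-hand side of (12)."""
    L = len(deltas)
    return alpha / deltas[0] + sum(
        (1 / deltas[i + 1] - 1 / deltas[i]) * r[i] for i in range(L - 1)
    )

deltas = [Fraction(2), Fraction(3, 2), Fraction(5, 4), Fraction(1)]  # delta_L must be 1
r = [7, 5, 4, 3]   # connection cost of one city after each step
alpha = 10         # share of that city in the first phase
print(share_stepwise(alpha, r, deltas), share_closed_form(alpha, r, deltas))
```

Note that $r_{j,k+L}$ drops out of the right-hand side because its coefficient is $1 - 1/\delta_L = 0$.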
Identity (12), together with a dual fitting argument similar to [13], implies the following.

Theorem 6 Let $(\xi_f, \xi_c)$ be such that $\xi_f \ge 1$ and $\xi_c$ is an upper bound on the solution of the following maximization program for every $k$.

\[
\begin{aligned}
\text{maximize} \quad & \frac{\sum_{j=1}^{k} \left( \frac{\alpha_j}{\delta} + \sum_{i=1}^{L-1} \left( \frac{1}{\delta_{i+1}} - \frac{1}{\delta_i} \right) r_{j,k+i} \right) - \xi_f f}{\sum_{i=1}^{k} d_i} && (13)\\
\text{subject to} \quad & (2), (3), (4), (10), (11), (6)
\end{aligned}
\]

Then, the MYZ$_L$ algorithm is a $(\xi_f, \xi_c)$-approximation algorithm for the UFLP.

In the following theorem, we analyze the factor-revealing LP (13) and rederive the main result of [18]. In order to do this, we need to set the values of the $\delta_i$'s. Here, for simplicity of computations, we set $\delta_i$ to $\delta^{\frac{L-i}{L-1}}$; however, it is easy to observe that any choice of $\delta_i$'s such that the limit of $\max_i |\delta_{i+1} - \delta_i|$ as $L$ tends to infinity is zero will also work. The proof is given in Appendix D.

Theorem 7 Let $(\gamma_f, \gamma_c)$ be a pair given by the maximization program in Theorem A, and $\delta \ge 1$ be an arbitrary number. Then for every $\epsilon > 0$, if $L$ is a sufficiently large constant, the MYZ$_L$ algorithm is a $(\gamma_f + \ln(\delta) + \epsilon,\ 1 + \frac{\gamma_c - 1}{\delta})$-approximation algorithm for the UFLP.
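To make the schedule and the resulting tradeoff concrete, here is a minimal sketch with hypothetical function names. The inputs $\gamma_f = 1.11$, $\gamma_c = 1.78$ and $\delta = 1.504$ are assumptions used only for illustration; they are not stated in this section, and the $\epsilon$ term of Theorem 7 is ignored.

```python
import math

def delta_schedule(delta, L):
    """Scaling factors delta_i = delta ** ((L - i) / (L - 1)) for i = 1..L (L >= 2)."""
    return [delta ** ((L - i) / (L - 1)) for i in range(1, L + 1)]

def myz_factors(gamma_f, gamma_c, delta):
    """Bifactor pair of Theorem 7 without the epsilon term."""
    return gamma_f + math.log(delta), 1 + (gamma_c - 1) / delta

schedule = delta_schedule(1.504, 5)
largest_step = max(x - y for x, y in zip(schedule, schedule[1:]))
print(schedule, largest_step)

# Assumed inputs, for illustration only.
facility_factor, connection_factor = myz_factors(1.11, 1.78, 1.504)
print(facility_factor, connection_factor)
```

Increasing $\delta$ raises the facility factor logarithmically and lowers the connection factor, so a single approximation factor is obtained by choosing $\delta$ where the two coordinates roughly balance.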
The above analysis enables us to prove the following result.

Corollary 8 For every $\epsilon > 0$, there is a quasi-linear time $(1.52 + \epsilon)$-approximation algorithm for the UFLP, both in the distance oracle model (where the connection costs are given by a matrix) and in the sparse graph model (where the connection costs are distances in a given graph).

Proof Sketch. We use the MYZ$_L$ algorithm for a large constant $L$. Thorup [25] shows that for every $\epsilon > 0$, the JMS algorithm can be implemented in quasi-linear time (in both distance oracle and graph models) with an approximation factor of $1.61 + \epsilon$. It is straightforward to see that his argument actually implies the stronger conclusion that the quasi-linear algorithm is a $(\gamma_f + \epsilon, \gamma_c + \epsilon)$-approximation, where $(\gamma_f, \gamma_c)$ are given by Theorem A. This shows that the first phase of the MYZ$_L$ algorithm can be implemented in quasi-linear time. The second phase consists of constantly many rounds. Therefore, we only need to show that each of these rounds can be implemented in quasi-linear time. This is easy to see in the distance oracle model. In the graph model, we can use the exact same argument as the one used by Thorup in the proof of Lemma 5.1 of [25].

Acknowledgements. We would like to thank Asaf Levin for pointing out that our analysis of the 2-approximation algorithm for the soft-capacitated facility location problem is tight.
References
[1] V. Arya, N. Garg, R. Khandekar, A. Meyerson, K. Munagala, and V. Pandit. Local search heuristics for k-median and facility location problems. In Proceedings of 33rd ACM Symposium on Theory of Computing, 2001.
[2] P. Bauer and R. Enders. A capacitated facility location problem with integer decision variables. In International Symposium on Mathematical Programming (ISMP), 1997.
[3] M. Charikar and S. Guha. Improved combinatorial algorithms for facility location and k-median problems. In Proceedings of the 40th Annual IEEE Symposium on Foundations of Computer Science, pages 378–388, October 1999.
[4] F.A. Chudak. Improved approximation algorithms for uncapacitated facility location. In R.E. Bixby, E.A. Boyd, and R.Z. Ríos-Mercado, editors, Integer Programming and Combinatorial Optimization, volume 1412 of Lecture Notes in Computer Science, pages 180–194. Springer, Berlin, 1998.
[5] F.A. Chudak and D. Shmoys. Improved approximation algorithms for the uncapacitated facility location problem. Unpublished manuscript, 1998.
[6] F.A. Chudak and D. Shmoys. Improved approximation algorithms for the capacitated facility location problem. In Proc. 10th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 875–876, 1999.
[7] G. Cornuejols, G.L. Nemhauser, and L.A. Wolsey. The uncapacitated facility location problem. In P. Mirchandani and R. Francis, editors, Discrete Location Theory, pages 119–171. John Wiley and Sons Inc., 1990.
[8] S. Guha and S. Khuller. Greedy strikes back: Improved facility location algorithms. Journal of Algorithms, 31:228–248, 1999.
[9] S. Guha, A. Meyerson, and K. Munagala. Hierarchical placement and network design problems. In Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science, 2000.
[10] M. Hajiaghayi, M. Mahdian, and V.S. Mirrokni. The facility location problem with general cost functions. Unpublished manuscript, 2002.
[11] K. Holmberg. Solving the staircase cost facility location problem with decomposition and piecewise linearization. European Journal of Operational Research, 74:41–61, 1994.
[12] K. Jain, M. Mahdian, E. Markakis, A. Saberi, and V.V. Vazirani. Approximation algorithms for facility location via dual fitting with factor-revealing LP. To appear in Journal of the ACM, 2002.
[13] K. Jain, M. Mahdian, and A. Saberi. A new greedy approach for facility location problems. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing, 2002.
[14] K. Jain and V.V. Vazirani. Approximation algorithms for metric facility location and k-median problems using the primal-dual schema and Lagrangian relaxation. Journal of the ACM, 48:274–296, 2001.
[15] D. Karger and M. Minkoff. Building Steiner trees with incomplete global knowledge. In Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science, 2000.
[16] M.R. Korupolu, C.G. Plaxton, and R. Rajaraman. Analysis of a local search heuristic for facility location problems. In Proceedings of the 9th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1–10, January 1998.
[17] M. Mahdian, E. Markakis, A. Saberi, and V.V. Vazirani. A greedy facility location algorithm analyzed using dual fitting. In Proceedings of 5th International Workshop on Randomization and Approximation Techniques in Computer Science, volume 2129 of Lecture Notes in Computer Science, pages 127–137. Springer-Verlag, 2001.
[18] M. Mahdian, Y. Ye, and J. Zhang. Improved approximation algorithms for metric facility location problems. In Proceedings of 5th International Workshop on Approximation Algorithms for Combinatorial Optimization (APPROX 2002), 2002.
[19] R. Ravi and A. Sinha. Integrated logistics: Approximation algorithms combining facility location and network design. In Proceedings of the 9th Conference on Integer Programming and Combinatorial Optimization, 2002.
[20] G. Robins and A. Zelikovsky. Improved Steiner tree approximation in graphs. In Proceedings of the 10th ACM-SIAM Symposium on Discrete Algorithms, pages 770–779, 1999.
[21] D.B. Shmoys. Approximation algorithms for facility location problems. In K. Jansen and S. Khuller, editors, Approximation Algorithms for Combinatorial Optimization, volume 1913 of Lecture Notes in Computer Science, pages 27–33. Springer, Berlin, 2000.
[22] D.B. Shmoys, E. Tardos, and K.I. Aardal. Approximation algorithms for facility location problems. In Proceedings of the 29th Annual ACM Symposium on Theory of Computing, pages 265–274, 1997.
[23] M. Sviridenko. An 1.582-approximation algorithm for the metric uncapacitated facility location problem. In Proceedings of the 9th Conference on Integer Programming and Combinatorial Optimization, 2002.
[24] M. Thorup. Quick k-median, k-center, and facility location for sparse graphs. In Automata, Languages and Programming, 28th International Colloquium, Crete, Greece, volume 2076 of Lecture Notes in Computer Science, pages 249–260, 2001.
[25] M. Thorup. Quick and good facility location. In Proceedings of the 14th ACM-SIAM Symposium on Discrete Algorithms, 2003.
A
Proof of Lemma 1
Proof. Let SOL be an arbitrary solution for $LFLP(a, b, c)$, which can also be viewed as a solution for $UFLP(b, \bar{c})$ for $\bar{c} = c + a$. Consider a facility $f$ that is open in SOL, and the set of clients connected to it in SOL. Let $k$ denote the number of these clients, $f(k) = ak + b$ (for $k > 0$) be the facility cost function of $f$, and $\bar{d}_j$ denote the connection cost between client $j$ and the facility $f$ in the instance $UFLP(b, a + c)$. Therefore, $d_j = \bar{d}_j - a$ is the corresponding connection cost in the original instance $LFLP(a, b, c)$. Recall [13] the definition of $\alpha_j$ and $r_{j,i}$ in the factor-revealing LP of Theorem A. It is proved in [13] that $\alpha_i \le r_{j,i} + \bar{d}_j + \bar{d}_i$. We strengthen this inequality as follows.

Claim 9 $\alpha_i \le r_{j,i} + d_j + d_i$

Proof. It is true if $\alpha_i = \alpha_j$, since it happens only if $r_{j,i} = \alpha_j$. Otherwise, consider clients $i$ and $j$ ($< i$) at time $t = \alpha_i - \epsilon$. Let $s$ be the facility $j$ is assigned to at time $t$. By the triangle inequality, we have

\[
\bar{c}_{si} = c_{si} + a_s \le c_{sj} + d_i + d_j + a_s = \bar{c}_{sj} + d_i + d_j \le r_{j,i} + d_i + d_j .
\]

On the other hand, $\alpha_i \le \bar{c}_{si}$, since otherwise $i$ could have connected to facility $s$ at a time earlier than $t$.

It is also known [13] that

\[
\sum_{j=1}^{i-1} \max(r_{j,i} - \bar{d}_j, 0) + \sum_{j=i}^{k} \max(\alpha_i - \bar{d}_j, 0) \le b .
\]

Notice that $\max(y - x, 0) \ge \max(y, 0) - x$ if $x \ge 0$. Applying this with $x = a$ to each of the $k$ terms, we have

\[
\sum_{j=1}^{i-1} \max(r_{j,i} - d_j, 0) + \sum_{j=i}^{k} \max(\alpha_i - d_j, 0) \le b + ka . \qquad (14)
\]

Claim 9 and Inequality (14) show that the values $\alpha_j$, $r_{j,i}$, $d_j$, $a$, and $b$ constitute a feasible solution of the following optimization program.

\[
\begin{aligned}
\text{maximize} \quad & \frac{\sum_{i=1}^{k} \alpha_i - \gamma_f (ak + b)}{\sum_{i=1}^{k} d_i} && (15)\\
\text{subject to} \quad & \forall\, 1 \le i < k: \ \alpha_i \le \alpha_{i+1} \\
& \forall\, 1 \le j < i < k: \ r_{j,i} \ge r_{j,i+1} \\
& \forall\, 1 \le j < i \le k: \ \alpha_i \le r_{j,i} + d_i + d_j \\
& \forall\, 1 \le i \le k: \ \sum_{j=1}^{i-1} \max(r_{j,i} - d_j, 0) + \sum_{j=i}^{k} \max(\alpha_i - d_j, 0) \le b + ka \\
& \forall\, 1 \le j \le i \le k: \ \alpha_j, d_j, a, b, r_{j,i} \ge 0
\end{aligned}
\]

However, it is clear that the above optimization program and the factor-revealing LP in Theorem A are equivalent (take $f = ak + b$). This completes the proof of this lemma.
B
Proof of Theorem 5
The theorem is established by the following lemmas which show the reductions between the concave soft-capacitated FLP, the concave-cost FLP and the linear-cost FLP.
Lemma 10 The concave soft-capacitated FLP is $(g(2)/g(1), 1)$-reducible to the concave-cost FLP.
Proof. Given an instance $I$ of the concave soft-capacitated FLP, where the facility cost function of the facility $i$ is $f_i(k) = g(\lceil k/u_i \rceil) a_i$, we construct an instance $R(I)$ of the concave-cost FLP as follows: We have the same sets of facilities and clients and the same connection costs as in $I$. The facility cost function of the $i$th facility is given by
\[
f'_i(k) =
\begin{cases}
\left( g(r) + (g(r+1) - g(r)) \left( \dfrac{k-1}{u_i} - r + 1 \right) \right) a_i & \text{if } k > 0,\ r := \lceil k/u_i \rceil, \\
0 & \text{if } k = 0.
\end{cases}
\]
Concavity of $g$ implies that the above function is also concave, and therefore $R(I)$ is an instance of concave-cost FLP. Also, it is easy to see from the above definition that
\[ g(\lceil k/u_i \rceil) a_i \le f'_i(k) \le g(\lceil k/u_i \rceil + 1) a_i. \]
By the concavity of the function $g$, we have $\frac{g(r+1)}{g(r)} \le \frac{g(2)}{g(1)}$ for every $r \ge 1$. Therefore, for every facility $i$ and number $k$,
\[ f_i(k) \le f'_i(k) \le \frac{g(2)}{g(1)} f_i(k). \]
This completes the proof of the lemma. Next, we show a simple (1, 1)-reduction from the concave-cost FLP to the linear-cost FLP. This, together with the above lemma, reduces the concave soft-capacitated facility location problem to the linear-cost FLP.
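The construction in the proof of Lemma 10 can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and the choice g = sqrt with u_i = 3 and a_i = 2 is only an illustrative concave instance. It evaluates the original cost f_i and the interpolated cost f'_i and verifies the sandwich bound f_i(k) <= f'_i(k) <= (g(2)/g(1)) f_i(k).

```python
import math


def ceil_div(k: int, u: int) -> int:
    """Integer ceiling of k / u, avoiding floating-point rounding."""
    return -(-k // u)


def soft_capacitated_cost(g, u: int, a: float, k: int) -> float:
    """Original cost f_i(k) = g(ceil(k / u_i)) * a_i, with f_i(0) = 0."""
    if k == 0:
        return 0.0
    return g(ceil_div(k, u)) * a


def interpolated_cost(g, u: int, a: float, k: int) -> float:
    """Reduced cost f'_i(k) from the proof of Lemma 10.

    For k > 0 and r = ceil(k / u_i), this interpolates linearly between
    g(r) * a_i and g(r + 1) * a_i.
    """
    if k == 0:
        return 0.0
    r = ceil_div(k, u)
    return (g(r) + (g(r + 1) - g(r)) * ((k - 1) / u - r + 1)) * a


if __name__ == "__main__":
    g = math.sqrt      # illustrative concave g (an assumption, not from the paper)
    u, a = 3, 2.0      # illustrative capacity and setup cost
    ratio = g(2) / g(1)
    for k in range(1, 13):
        f = soft_capacitated_cost(g, u, a, k)
        fp = interpolated_cost(g, u, a, k)
        assert f - 1e-9 <= fp <= ratio * f + 1e-9
        print(k, round(f, 4), round(fp, 4))
```

At k = (r - 1) u_i + 1 the interpolation term vanishes, so f'_i agrees with f_i exactly at the first client of each capacity block.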
Lemma 11 There is a $(1, 1)$-reduction from the concave-cost FLP to the linear-cost FLP.
Proof. Given an instance $I$ of concave-cost FLP, we construct an instance $R(I)$ of linear-cost FLP as follows: Corresponding to each facility $i$ in $I$ with facility cost function $f_i(k)$, we put $n$ copies of this facility in $R(I)$ (where $n$ is the number of clients), and let the facility cost function of the $l$'th copy be
\[
f_i^{(l)}(k) =
\begin{cases}
f_i(l) + (f_i(l) - f_i(l-1))(k - l) & \text{if } k > 0, \\
0 & \text{if } k = 0.
\end{cases}
\]
In other words, the facility cost function is the line that passes through the points $(l-1, f_i(l-1))$ and $(l, f_i(l))$. The set of clients, and the connection costs between clients and facilities are unchanged. We prove that this reduction is a $(1, 1)$-reduction.
For any feasible solution SOL for $I$, we can construct a feasible solution SOL' for $R(I)$ as follows: If a facility $i$ is open and $k$ clients are connected to it in SOL, we open the $k$'th copy of the corresponding facility in $R(I)$, and connect the clients to it. Since $f_i^{(k)}(k) = f_i(k)$, the facility and connection costs of SOL' are the same as those of SOL.
Conversely, consider an arbitrary feasible solution SOL for $R(I)$. We construct a solution SOL' for $I$ as follows. For any facility $i$, if at least one of the copies of $i$ is open in SOL, we open $i$ and connect all clients that were served by a copy of $i$ in SOL to it. We show that this does not increase the total cost of the solution: Assume the $l_1$'th, $l_2$'th, ..., and $l_s$'th copies of $i$ were open in SOL, serving $k_1, k_2, \ldots$, and $k_s$ clients, respectively. By concavity of $f_i$, and the fact that $f_i^{(l)}(k) \ge f_i^{(k)}(k) = f_i(k)$ for every $l$, we have
\[ f_i(k_1 + \cdots + k_s) \le f_i(k_1) + \cdots + f_i(k_s) \le f_i^{(l_1)}(k_1) + \cdots + f_i^{(l_s)}(k_s). \]
This shows that the facility cost of SOL' is at most the facility cost of SOL.
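The key fact used in the proof of Lemma 11, namely that every linear copy overestimates the concave cost and that the k'th copy is exact at k, can be illustrated with a short sketch. The helper names below are hypothetical, and f = sqrt is only an assumed example of a concave cost function with f(0) = 0.

```python
import math


def linear_copy_cost(f, l: int, k: int) -> float:
    """Cost f^{(l)}(k) of the l'th linear copy when it serves k clients.

    For k > 0 this is the line through (l - 1, f(l - 1)) and (l, f(l)).
    """
    if k == 0:
        return 0.0
    return f(l) + (f(l) - f(l - 1)) * (k - l)


def cheapest_copy_cost(f, n: int, k: int) -> float:
    """Smallest cost over all n copies for serving k clients."""
    return min(linear_copy_cost(f, l, k) for l in range(1, n + 1))


if __name__ == "__main__":
    f = math.sqrt  # illustrative concave cost with f(0) = 0 (an assumption)
    n = 8          # number of clients, hence number of copies per facility
    for k in range(1, n + 1):
        # The k'th copy reproduces f(k); no copy is cheaper than f(k).
        assert linear_copy_cost(f, k, k) == f(k)
        assert cheapest_copy_cost(f, n, k) >= f(k) - 1e-9
        print(k, round(f(k), 4), round(cheapest_copy_cost(f, n, k), 4))
```

Because each copy is a line touching a concave function, the lower envelope of the n copies coincides with f on the integers 1, ..., n, which is what makes the reduction lossless in both directions.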
C
Other examples
The main trick we used in Section 4 was to differentiate between the parts of the approximation ratio that correspond to the facility costs and connection costs. This trick can be used to improve the approximation factor of other problems that are solved by reductions to a facility location problem. In this appendix, we present two such examples.
C.1
The capacitated cable facility location problem with non-uniform demands
In the capacitated cable facility location problem (CCFLP), each client has a demand that needs to be routed to an open facility. This connection can be done by installing cables of capacity $u$ (i.e., capable of routing $u$ units of demand in total) on the edges of the graph. More than one cable can be installed on each edge, at a cost equal to the number of cables times the cost of the edge. The objective is to open some facilities and install some cables to minimize the total cost of facilities and cables. This problem was introduced by Ravi and Sinha [19]. If all the clients have uniform demands, Ravi and Sinha [19] show that the CCFLP can be approximated by a factor of $\rho_{UFLP} + \rho_{ST}$, where $\rho_{UFLP}$ and $\rho_{ST}$ denote the approximation ratios for the UFLP and the Steiner Tree problem. For non-uniform demands, they show that the CCFLP can be approximated by a factor of $2\rho_{UFLP} + \rho_{ST} \le 4.59$ (using $\rho_{UFLP} \le 1.52$ [18] and $\rho_{ST} \le 1.55$ [20]). As noted by Ravi and Sinha [19], the coefficient 2 in front of $\rho_{UFLP}$ in the above expression is only because of doubling connection costs (and not facility costs). Therefore, we can get the following result.
Observation 12 The capacitated cable facility location problem with non-uniform demands can be approximated within a factor of 3.97.
Proof. It is implicitly proved in [19] that an $(R_f, R_c)$-approximation algorithm for the UFLP and a $\rho_{ST}$-approximation algorithm for the Steiner tree problem imply a $(\max(R_f, 2R_c) + \rho_{ST})$-approximation algorithm for the CCFLP with non-uniform demands. Using $\rho_{ST} \le 1.55$ [20] and Theorem B, we get an algorithm with the approximation ratio at most $\max\left(\gamma_f + \ln \delta,\ 2\left(1 + \frac{\gamma_c - 1}{\delta}\right)\right) + 1.55$ for the CCFLP. This expression is minimized at $(\gamma_f, \gamma_c) = (1.11, 1.78)$ and $\delta = 3.7$, yielding a factor of 3.97.
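The arithmetic behind Observation 12 can be reproduced with a few lines. This is only a sanity-check sketch: the function name is hypothetical, and the default Steiner tree ratio 1.55 is the value quoted from [20].

```python
import math


def ccflp_ratio(gamma_f: float, gamma_c: float, delta: float,
                rho_st: float = 1.55) -> float:
    """Evaluate max(gamma_f + ln(delta), 2 * (1 + (gamma_c - 1) / delta)) + rho_st."""
    facility_part = gamma_f + math.log(delta)
    connection_part = 2.0 * (1.0 + (gamma_c - 1.0) / delta)
    return max(facility_part, connection_part) + rho_st


if __name__ == "__main__":
    # Parameters stated in the proof of Observation 12.
    print(round(ccflp_ratio(1.11, 1.78, 3.7), 2))
    # Previous bound 2 * rho_UFLP + rho_ST with rho_UFLP = 1.52.
    print(round(2 * 1.52 + 1.55, 2))
```

At the stated parameters the two arguments of the maximum are nearly equal (about 2.418 and 2.422), which is what one would expect at a balanced choice of delta.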
C.2
The load-balanced facility location
In the load-balanced facility location problem (LBFLP), each open facility $i$ is required to serve at least $L_i$ clients. This problem was introduced independently by Guha, Meyerson and Munagala [9] and by Karger and Minkoff [15]. They reduce the LBFLP to the UFLP, and show that if the UFLP can be approximated within a factor of $\rho$, there is a bicriteria $\left(\frac{1+a}{1-a}\rho, a\right)$-approximation for the LBFLP, i.e., the algorithm produces a solution in which each facility $i$ serves at least $aL_i$ clients for some $a < 1$, and the total cost is at most $\frac{1+a}{1-a}\rho$ times the optimal cost of a solution satisfying the lower bounds. Their result can be used to obtain constant factor approximation algorithms for the connected facility location problem and some network design problems, in particular the access network design problem. This, together with $\rho \le 1.52$, leads to bicriteria approximation ratios $(4.56, 0.5)$ and $(3.04, 1/3)$ for the LBFLP. This result is based on a reduction that transforms an LBFLP instance of total cost $C + F$ into an instance of the UFLP of total cost at most $\left(1 + \frac{2a}{1-a}\right)(C + F)$, such that any solution for the reduced UFLP instance gives a solution to the original LBFLP instance violating the lower bounds by at most a factor of $a$. A close look at this result [15, 9] shows that for the same reduction the reduced UFLP instance has a solution with facility cost at most $F + \frac{2a}{1-a} C$ and connection cost at most $C$. Therefore, we have the following.
Observation 13 If the UFLP can be approximated within a factor of $(\gamma_f, \gamma_c)$, then there is a bicriteria $\left(\frac{2a}{1-a}\gamma_f + \gamma_c, a\right)$-approximation for the load-balanced facility location problem. In particular, using the pairs $(\gamma_f, \gamma_c) = (1, 2)$ and $(\gamma_f, \gamma_c) = (1.11, 1.78)$, we get bicriteria approximation algorithms for the LBFLP with factors $(4, 1/2)$ and $(2.89, 1/3)$.
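The factors quoted for the LBFLP follow directly from the two formulas above. The sketch below, with hypothetical function names, recomputes the single-factor bounds and the bifactor bounds of Observation 13 for a = 1/2 and a = 1/3.

```python
def lbflp_single_factor(rho: float, a: float) -> float:
    """Cost factor (1 + a) / (1 - a) * rho from a rho-approximation for the UFLP."""
    return (1.0 + a) / (1.0 - a) * rho


def lbflp_bifactor(gamma_f: float, gamma_c: float, a: float) -> float:
    """Cost factor 2a / (1 - a) * gamma_f + gamma_c from Observation 13."""
    return 2.0 * a / (1.0 - a) * gamma_f + gamma_c


if __name__ == "__main__":
    for a in (0.5, 1.0 / 3.0):
        print("a =", round(a, 3),
              "single:", round(lbflp_single_factor(1.52, a), 2))
    print("bifactor, a = 1/2:", round(lbflp_bifactor(1.0, 2.0, 0.5), 2))
    print("bifactor, a = 1/3:", round(lbflp_bifactor(1.11, 1.78, 1.0 / 3.0), 2))
```

The improvement comes from the fact that the factor 2a/(1 - a) multiplies only the facility ratio, so a pair with small facility ratio is preferable when a is large.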
D
Proof of Theorem 2
Proof. Since inequalities of the factor-revealing LP (13) are a superset of the inequalities of the factor-revealing LP (1), by Theorem A and the definition of $(\gamma_f, \gamma_c)$, we have
\[ \sum_{j=1}^{k} \alpha_j \le \gamma_f \delta f + \gamma_c \sum_{j=1}^{k} d_j. \tag{16} \]
By inequality (11), for every $i = 1, \ldots, L$, we have
\[ \sum_{j=1}^{k} r_{j,k+i} \le \sum_{j=1}^{k} \max(r_{j,k+i} - d_j, 0) + \sum_{j=1}^{k} d_j \le \delta_i f + \sum_{j=1}^{k} d_j. \tag{17} \]
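Therefore,
\begin{align*}
\sum_{j=1}^{k} \left( \frac{\alpha_j}{\delta} + \sum_{i=1}^{L-1} \left( \frac{1}{\delta_{i+1}} - \frac{1}{\delta_i} \right) r_{j,k+i} \right)
&= \frac{1}{\delta} \left( \sum_{j=1}^{k} \alpha_j \right) + \sum_{i=1}^{L-1} \left( \frac{1}{\delta_{i+1}} - \frac{1}{\delta_i} \right) \left( \sum_{j=1}^{k} r_{j,k+i} \right) \\
&\le \frac{1}{\delta} \left( \gamma_f \delta f + \gamma_c \sum_{j=1}^{k} d_j \right) + \sum_{i=1}^{L-1} \left( \frac{1}{\delta_{i+1}} - \frac{1}{\delta_i} \right) \left( \delta_i f + \sum_{j=1}^{k} d_j \right) \\
&= \gamma_f f + \frac{\gamma_c}{\delta} \sum_{j=1}^{k} d_j + \sum_{i=1}^{L-1} \left( \frac{\delta_i}{\delta_{i+1}} - 1 \right) f + \left( \frac{1}{\delta_L} - \frac{1}{\delta_1} \right) \sum_{j=1}^{k} d_j \\
&= \left( \gamma_f + (L-1)\left( \delta^{1/(L-1)} - 1 \right) \right) f + \left( 1 + \frac{\gamma_c - 1}{\delta} \right) \sum_{j=1}^{k} d_j.
\end{align*}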
This, together with Theorem 6, shows that the MYZL is a $\left( \gamma_f + (L-1)(\delta^{1/(L-1)} - 1),\ 1 + \frac{\gamma_c - 1}{\delta} \right)$-approximation algorithm for the UFLP. The fact that the limit of $(L-1)(\delta^{1/(L-1)} - 1)$ as $L$ tends to infinity is $\ln(\delta)$ completes the proof.
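The limit used in the last step can be observed numerically. The sketch below uses a hypothetical function name and the illustrative value delta = 3.7. It evaluates (L - 1)(delta^{1/(L-1)} - 1) for growing L and compares it with ln(delta).

```python
import math


def facility_overhead(delta: float, L: int) -> float:
    """Evaluate (L - 1) * (delta ** (1 / (L - 1)) - 1) for L >= 2."""
    if L < 2:
        raise ValueError("L must be at least 2")
    return (L - 1) * (delta ** (1.0 / (L - 1)) - 1.0)


if __name__ == "__main__":
    delta = 3.7  # illustrative value (an assumption for this sketch)
    for L in (2, 5, 50, 1000, 10**6 + 1):
        print(L, facility_overhead(delta, L))
    print("ln(delta) =", math.log(delta))
```

Writing n = L - 1 and x = ln(delta), the expression equals n(e^{x/n} - 1), which is at least x and decreases to x as n grows. The additional facility factor therefore approaches ln(delta) from above as the number of phases increases.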