Maximizing Submodular Set Functions Subject to Multiple Linear Constraints∗

Ariel Kulik†    Hadas Shachnai‡    Tami Tamir§

Abstract

The concept of submodularity plays a vital role in combinatorial optimization. In particular, many important optimization problems can be cast as submodular maximization problems, including maximum coverage, maximum facility location and max cut in directed/undirected graphs. In this paper we present the first known approximation algorithms for the problem of maximizing a non-decreasing submodular set function subject to multiple linear constraints. Given a d-dimensional budget vector $\bar{L}$, for some d ≥ 1, and an oracle for a non-decreasing submodular set function f over a universe U, where each element e ∈ U is associated with a d-dimensional cost vector, we seek a subset of elements S ⊆ U whose total cost is at most $\bar{L}$, such that f(S) is maximized.

We develop a framework for maximizing submodular functions subject to d linear constraints that yields a (1 − ε)(1 − e^{-1})-approximation to the optimum for any ε > 0, where d > 1 is some constant. Our study is motivated by a variant of the classical maximum coverage problem that we call maximum coverage with multiple packing constraints. We use our framework to obtain the same approximation ratio for this problem. To the best of our knowledge, this is the first time the theoretical bound of 1 − e^{-1} is (almost) matched for both of these problems.
1 Introduction

A function f, defined over a collection of subsets of a universe U, is called submodular if, for any S, T ⊆ U, f(S) + f(T) ≥ f(S ∪ T) + f(S ∩ T). Alternatively, f is submodular if it satisfies the property of decreasing marginal value, namely, for any A ⊆ B ⊆ U and e ∈ U \ B, f(B ∪ {e}) − f(B) ≤ f(A ∪ {e}) − f(A). The function f is non-decreasing if, for any subsets T and S such that T ⊆ S, f(T) ≤ f(S). The concept of submodularity plays a vital role in combinatorial theorems and algorithms, and its importance in discrete optimization has been well studied (see, e.g., [11] and the references therein, and the surveys in [8, 24]). Submodularity can be viewed as a discrete analog of convexity.
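As a small illustration (not part of the original analysis), the following Python sketch defines a coverage function, a canonical non-decreasing submodular function, and checks the decreasing-marginal-value property on a toy instance; the instance and helper names are ours.

```python
from itertools import combinations

# coverage function f(S) = |union of the subsets indexed by S|
subsets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}

def f(S):
    covered = set()
    for i in S:
        covered |= subsets[i]
    return len(covered)

def marginal(S, e):
    return f(S | {e}) - f(S)

universe = set(subsets)
for a_size in range(len(universe) + 1):
    for A in map(set, combinations(universe, a_size)):
        for b_size in range(a_size, len(universe) + 1):
            for B in map(set, combinations(universe, b_size)):
                if A <= B:
                    for e in universe - B:
                        # decreasing marginal value: f_B(e) <= f_A(e)
                        assert marginal(B, e) <= marginal(A, e)
print("decreasing marginal value holds on this instance")
```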
∗A preliminary version of this paper appeared in the Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms, New York, January 2009.
†Computer Science Department, Technion, Haifa 32000, Israel. E-mail: [email protected]
‡Computer Science Department, Technion, Haifa 32000, Israel. E-mail: [email protected]. Work supported by the Technion V.P.R. Fund and by Smoler Research Fund.
§School of Computer Science, The Interdisciplinary Center, Herzliya, Israel. E-mail: [email protected]
Many practically important optimization problems, including maximum coverage, maximum facility location, and max cut in directed/undirected graphs, can be cast as submodular optimization problems (see, e.g., [8]). This paper presents the first known approximation algorithms for the problem of maximizing a non-decreasing submodular set function subject to multiple linear constraints. Given a d-dimensional budget vector $\bar{L}$, for some d ≥ 1, and an oracle for a non-decreasing submodular set function f over a universe U, where each element i ∈ U is associated with a d-dimensional cost vector $\bar{c}_i$, we seek a subset of elements S ⊆ U whose total cost is at most $\bar{L}$, such that f(S) is maximized.

There has been extensive work on maximizing submodular monotone functions subject to a matroid constraint.¹ For the special case of a uniform matroid, i.e., the problem {max f(S) : |S| ≤ k} for some k > 1, Nemhauser et al. showed in [17] that a greedy algorithm yields a ratio of 1 − e^{-1} to the optimum. Later works presented greedy algorithms that achieve this ratio for other special matroids or for certain submodular monotone functions (see, e.g., [1, 14, 22, 6]). For a general matroid constraint, Calinescu et al. showed in [4] that a scheme based on solving a continuous relaxation of the problem followed by pipage rounding (a technique introduced by Ageev and Sviridenko [1]) achieves the ratio of 1 − e^{-1} for maximizing submodular monotone functions that can be expressed as a sum of weighted rank functions of matroids. Recently, this result was extended by Vondrák [24] to general monotone submodular functions. The bound of 1 − e^{-1} is the best possible for all of the above problems; this follows from a result of Feige [7], which holds already for the maximum coverage problem.

The techniques introduced in these previous works are powerful and elegant, but do not seem to lead to efficient approximation algorithms for maximizing a submodular function subject to d linear constraints, already for d = 2. While the greedy algorithm is undefined for d > 1, a major difficulty in rounding the solution of the continuous problem (as in [4, 24]) is to preserve the approximation ratio while satisfying the constraints. A noteworthy contribution of our framework is in finding a way to get around this difficulty (see Section 1.1).

The problem of maximizing non-monotone submodular functions has been studied as well. Feige et al. [8] considered the (unconstrained) maximization of a general non-monotone submodular function. The paper gives several (randomized and deterministic) approximation algorithms, as well as hardness results, also for the special case where the function is symmetric. Subsequent to our work, Lee et al. [15] studied the problem of maximizing general submodular functions under linear and matroid constraints and presented algorithms that achieve an approximation ratio of 1/5 − ε for the problem with d linear constraints and a ratio of 1/(d + 2 + 1/d + ε) for d matroid constraints, for any fixed integer d ≥ 1.

Our study is motivated by the following variant of the classical maximum coverage problem, which we call maximum coverage with multiple packing constraints (MCMP). Given is a collection of subsets {S_1, ..., S_m} over a ground set of elements A = {a_1, ..., a_n}. Each element a_j is associated with a d-dimensional size vector $\bar{s}_j = (s_{j,1}, ..., s_{j,d})$ and a non-negative value w_j. Also, given is a d-dimensional bin whose capacity is $\bar{B} = (B_1, ..., B_d)$, and a budget k > 1. The goal is to select k subsets in {S_1, ..., S_m} and to determine which of the elements in these subsets are covered, such that the overall size of the covered elements is at most $\bar{B}$, and their total value is maximized. In the special case where d = 1, we call the problem maximum coverage with packing constraint (MCP). MCP is known to be APX-hard, even if all elements have the same (unit) size and the same (unit) profit, and each element belongs to at most four subsets [20]. Since MCP includes as a special case the maximum coverage problem, the best
¹A (weighted) matroid is a system of ‘independent subsets’ of a universe, which satisfies certain hereditary and exchange properties [18].
approximation ratio one can expect is 1 − e^{-1} [7].²

We consider also the following generalization of MCMP. Given is a set of n items. Each item a_j is associated with a d-dimensional size vector $\bar{s}_j = (s_{j,1}, ..., s_{j,d})$, a weight w_j ≥ 0 and a set C_j ⊆ {1, ..., m} of colors. Also, given are N d-dimensional bins, where the capacity of each bin i is $\bar{B}_i = (B_{i,1}, ..., B_{i,d})$, 1 ≤ i ≤ N. Each bin i has k_i ≥ 1 compartments, where each compartment can accommodate items of the same color. Thus, in a feasible packing, bin i contains items of at most k_i distinct colors whose total size in the r-th dimension is at most B_{i,r}, 1 ≤ r ≤ d, and each packed item a_j is assigned one of the colors in C_j. The problem of packing with color sets (PCS) is to find a feasible packing of a subset of the items in the bins whose total weight is maximized. PCS shows up in many applications, including network services, video-on-demand systems and e-commerce (see the Appendix).

We note that PCS can be viewed as a restricted version of the separable assignment problem (SAP) studied in [9]. An instance of SAP consists of a set of n items A = {a_1, ..., a_n} and a set of N bins. The profit from packing item j in bin i is w_{i,j} ≥ 0, for 1 ≤ i ≤ N, 1 ≤ j ≤ n; also, each bin i has an arbitrary packing constraint given by a family I_i of subsets of A that can be packed in the bin.³ For all 1 ≤ i ≤ N, any subset of a set in I_i is also in I_i. The goal is to pack items into the bins so as to maximize the total profit. Consider the special case of SAP where any item a_j has the same profit for any bin i to which it is assigned, i.e., w_{1,j} = · · · = w_{N,j} = w_j. We call this problem restricted SAP (RSAP). Then an instance of the PCS problem is an instance of RSAP in which I_i is the family of possible solutions for MCMP on bin i, 1 ≤ i ≤ N.

Fleischer et al. presented in [9] approximation algorithms for SAP which use as a subroutine algorithms for a single bin. In particular, given an r-approximation algorithm for finding a high value packing of a single bin, for some 0 < r ≤ 1, the paper [9] gives (i) a polynomial-time LP-based ((1 − e^{-1})r)-approximation algorithm, and (ii) a local search (r/(r + 1) − ε)-approximation algorithm, for any ε > 0. Both algorithms can be used for solving PCS, by taking as the subroutine our approximation algorithm for MCMP (see Section 4). Finally, we note that PCS generalizes the problem of class-constrained multiple knapsack (CCMK), in which d = 1 and |C_j| = 1 for all 1 ≤ j ≤ n [19, 20]. The paper [20] presents a polynomial time approximation scheme (PTAS) for any instance of CCMK in which m, the number of colors, is fixed.
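To make the MCMP definition concrete, here is a tiny brute-force illustration with our own toy data (d = 2): choose k subsets, then a feasible set of covered elements of maximum total value.

```python
from itertools import combinations

# illustrative MCMP instance, d = 2
sets = {"S1": {"a", "b"}, "S2": {"b", "c"}, "S3": {"c", "d"}}
size = {"a": (2, 1), "b": (1, 2), "c": (2, 2), "d": (1, 1)}
value = {"a": 3, "b": 4, "c": 5, "d": 2}
B, k = (3, 3), 2

def feasible(E):
    return all(sum(size[e][r] for e in E) <= B[r] for r in range(len(B)))

best = 0
for chosen in combinations(sets, k):                 # pick k subsets
    covered = set().union(*(sets[s] for s in chosen))
    for m in range(len(covered) + 1):                # pick covered elements
        for E in combinations(covered, m):
            if feasible(E):
                best = max(best, sum(value[e] for e in E))
print("optimal MCMP value on this toy instance:", best)
```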
1.1 Our Results

In Section 2 we develop a framework for maximizing submodular functions subject to d linear constraints, which yields a (1 − ε)(1 − e^{-1})-approximation to the optimum for any ε > 0, where d > 1 is some constant. This extends a result of [22] (within factor 1 − ε). A key component in our framework is obtaining an approximate solution for a continuous relaxation of the problem. This can be done using an algorithm recently presented by Vondrák [24]. For some specific submodular functions, other techniques can be used to obtain fractional solutions with the same properties (see, e.g., [1, 4]).

In Section 3 we show that MCP can be approximated within factor 1 − e^{-1}, by applying known results for maximizing submodular functions. Here we use the fact that the fractional version of MCP defines a non-decreasing submodular set function; this is not true already for d = 2. For MCMP we show in Section 4 that our framework yields an approximation ratio of (1 − ε)(1 − e^{-1}) when d > 1 is a constant.
²For other known results for the maximum coverage problem, see, e.g., [14, 21, 1].
³It is assumed that the families I_i are implicitly given.
For RSAP we show (in Section 5.1) that when all bins are identical (i.e., have the same packing constraint) a simple greedy algorithm attains an approximation ratio of 1 − e^{-r}, where r is the approximation ratio of the packing algorithm for a single bin. For RSAP with arbitrary bins, we show that the same greedy algorithm attains an approximation ratio of r/(r + 1). This extends the results of [5] for the greedy algorithm, when applied to the multiple knapsack problem. The approximation ratio for identical bins improves the ratios obtained by the algorithms of [9]. For arbitrary bins, we obtain an approximation ratio that is similar to the approximation ratios of the algorithms in [9]; however, our algorithm is much simpler. Applying the greedy algorithm with our approximation algorithm for MCMP for the single bin problem yields an approximation ratio of 0.468 − ε for PCS on identical bins, and a ratio of 0.388 − ε for arbitrary bins. In Section 5.2 we extend the result of [20] by showing that PCS admits a PTAS when d = 1 and the number of colors is fixed.⁴

Technical Contribution: The heart of our framework for maximizing submodular functions is a rounding step that preserves multiple linear constraints. Here we use a non-trivial combination of randomized rounding with two enumeration phases: one on the most profitable elements in some optimal solution, and the other on the ‘big’ elements (see Section 2). This enables us to show that the rounded solution can be converted to a feasible one with high expected profit. We use our framework to obtain an approximation algorithm for MCMP. An interesting feature of this algorithm is the non-standard use of our framework for analyzing the algorithm rather than for constructing the solution. More specifically, we use the framework to show that the expected value obtained from a simpler random process is high, while the value itself (and a solution which attains this value) can be found by solving a linear program.
2 Maximizing Submodular Functions

In this section we describe our framework for maximizing a non-decreasing submodular set function subject to multiple linear constraints. For short, we call this problem MLC.
2.1 Preliminaries

Given a universe U, we call a subset of elements S ⊆ U feasible if the total cost of the elements in S is bounded by $\bar{L}$; we refer to f(S) as the value of S.

An essential component in our framework is the distinction between elements by their costs. We say that an element i ∈ U is big in dimension r if $c_{i,r} \ge \varepsilon^4 L_r$; element i is big if, for some 1 ≤ r ≤ d, i is big in dimension r. An element is small in dimension r if it is not big in dimension r, and small if it is not big. Note that the number of big elements in a feasible solution is at most d · ε^{-4}.

Our framework applies some preliminary steps, after which it solves a residual problem. Given an instance of MLC, we consider two types of residual problems. For a subset T ⊆ U, define another instance of MLC in which the objective function is f_T(S) = f(S ∪ T) − f(T) (it is easy to verify that f_T is a non-decreasing submodular set function); the cost of each element remains as in the original instance, the budget is $\bar{L} - \bar{c}(T)$, where $\bar{c}(T) = \sum_{i \in T} \bar{c}_i$, and the universe (which is a subset of the original universe) depends on the type of residual problem.⁴

⁴Note that this is the best possible, since this subclass of instances includes as a special case multiple knapsack, which is strongly NP-hard [16].
• Value residual problem: the universe consists of all elements i ∈ U \ T such that $f_T(\{i\}) \le f(T)/|T|$.

• Cost residual problem: the universe consists of all small elements in the original problem.

These two types of problems allow us to convert the original problem to a problem with some desired properties, namely, either all elements are of bounded value, or all elements are of bounded cost in each dimension.

Extension by Expectation: Given a non-decreasing submodular function $f : 2^U \rightarrow \mathbb{R}_+$, we define $F : [0,1]^U \rightarrow \mathbb{R}_+$ to be the following continuous extension of f. For any $\bar{y} \in [0,1]^U$, let R ⊆ U be a random variable such that i ∈ R with probability y_i. Then define

$$F(\bar{y}) = E[f(R)] = \sum_{R \subseteq U} f(R)\left(\prod_{i \in R} y_i \prod_{i \notin R} (1 - y_i)\right).$$

(For the submodular function f_T, the continuous extension is denoted by F_T.) This extension of a submodular function has been previously studied (see, e.g., [1, 4, 24]).
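Since F(ȳ) is defined as an expectation, it can be estimated by sampling. The following Python sketch (ours, for illustration) estimates F(ȳ) by Monte Carlo sampling for the toy coverage function used earlier.

```python
import random

subsets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}

def f(S):
    covered = set()
    for i in S:
        covered |= subsets[i]
    return len(covered)

def estimate_F(y, samples=20000, seed=0):
    # Monte Carlo estimate of F(y) = E[f(R)], where each i enters R
    # independently with probability y[i]
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        R = {i for i, p in y.items() if rng.random() < p}
        total += f(R)
    return total / samples

y = {1: 0.5, 2: 0.3, 3: 0.9}
print(round(estimate_F(y), 2))
```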
We consider the following continuous relaxation of MLC. Define the polytope of the instance

$$P = \left\{\bar{y} \in [0,1]^U \;\middle|\; \sum_{i \in U} y_i \bar{c}_i \le \bar{L}\right\},$$

and the problem is to find $\bar{y} \in P$ for which $F(\bar{y})$ is maximized. Similar to the discrete case, $\bar{y} \in [0,1]^U$ is feasible if $\bar{y} \in P$. For some specific submodular functions, linear programming can be used to obtain $\bar{y} \in P$ such that $F(\bar{y}) \ge (1 - e^{-1})O$, where O is the optimal solution for MLC (see, e.g., [1, 4]). Recently, Vondrák [24] gave an algorithm that finds $\bar{y} \in P'$ such that $F(\bar{y}) \ge (1 - e^{-1} - o(1))O_f$, where $O_f = \max_{\bar{z} \in P'} F(\bar{z}) \ge O$, and P′ is a matroid polytope (the o(1) factor can be eliminated). While the algorithm of [24] is presented in the context of matroid polytopes, it can be easily extended to a general convex polytope P with $\bar{0} \in P$, as long as $\bar{y} = \arg\max_{\bar{y} \in P} \sum_{i \in U} y_i w_i$ can be efficiently found for any vector $\bar{w} \in \mathbb{R}^U_+$. In our case, this can be done efficiently using linear programming. The algorithm of [24] can be used in our framework for obtaining a fractional solution for the continuous relaxation of a given instance.

Overview: Our algorithm consists of two main phases, to which we refer as profit enumeration and the randomized procedure. The randomized procedure returns a feasible solution for its input instance, whose expected value is at least (1 − Θ(ε))(1 − e^{-1}) times the optimal solution, minus Θ(M_I), where M_I is the maximal value of a single element in this instance. Hence, to guarantee a constant approximation ratio (in expectation), the profit enumeration phase guesses (by enumeration) a constant number of elements of highest value in some optimal solution; then the algorithm proceeds to the randomized procedure, taking the value residual problem with respect to the guessed subset. Since the maximal value of a single element in the value residual problem is bounded, we obtain the desired approximation ratio.

The randomized procedure uses randomized rounding in order to attain an integral solution from a fractional solution returned by the algorithm of [24]. However, simple randomized rounding may not guarantee a feasible solution, as some of the linear constraints may be
violated. This is handled by the following steps. First, the algorithm enumerates over the big elements in an optimal solution; this enables us to bound the variance of the cost in each dimension, so the event of discarding an infeasible solution occurs with small probability. Second, we apply a fixing procedure, in which a nearly feasible solution is converted to a feasible solution, with small harm to the objective function.
2.2 Profit Enumeration

In Section 2.3 we present algorithm MLC_RR_{ε,d}(I). Given an instance I of MLC and some ε > 0, MLC_RR_{ε,d}(I) returns a feasible solution for I whose expected value is at least (1 − Θ(ε))(1 − e^{-1})O − dε^{-3} M_I, where

$$M_I = \max_{i \in U} f(\{i\}) \qquad (1)$$
is the maximal value of any element in I, and O is the value of the optimal solution. We use this algorithm as a procedure in the following.

Approximation Algorithm for MLC (A_MLC)

1. Initialize D = ∅.
2. For any R ⊆ U such that $|R| \le \lceil e \cdot d \cdot \varepsilon^{-3} \rceil$:
   (a) S ← MLC_RR_{ε,d}(I_R), where I_R is the value residual problem with respect to R.
   (b) If f(R ∪ S) > f(D) then set D = R ∪ S.
3. Return D.

Theorem 1 Algorithm A_MLC runs in polynomial time and returns a feasible solution for the input instance I, with expected approximation ratio of (1 − Θ(ε))(1 − e^{-1}).

The above theorem implies that, for any ε̂ > 0, with a proper choice of ε, A_MLC is a polynomial time (1 − ε̂)(1 − e^{-1})-approximation algorithm for MLC.

Proof: Let O = {i_1, ..., i_k} be an optimal solution for I (we use O to denote both an optimal sub-collection of elements and the optimal value). Let $h = \lceil e \cdot d \cdot \varepsilon^{-3} \rceil$ and K_ℓ = {i_1, ..., i_ℓ} (for any ℓ ≥ 1), and assume that the elements are ordered by their residual profits, i.e., $i_\ell = \arg\max_{i \in O \setminus K_{\ell-1}} f_{K_{\ell-1}}(\{i\})$. Clearly, if there are fewer than h elements in O, then these elements are considered in some iteration of A_MLC, and the algorithm finds an optimal solution; otherwise, consider the iteration in which R = K_h. For any j > h, by submodularity and the ordering of the elements, $f_{K_h}(\{i_j\}) \le f_{K_{h-1}}(\{i_j\}) \le f_{K_{h-1}}(\{i_h\}) \le \frac{f(K_h)}{|K_h|}$. Hence, the elements i_{h+1}, ..., i_k belong to the value residual problem with respect to R = K_h, and the optimal solution of the residual problem is f_R(O \ K_h) = f_R(O). For some α ∈ [0, 1], let f(R) = α · O. Then the optimal solution for the residual problem is (1 − α)O. Hence, by Theorem 2 (see Section 2.3), the expected profit of MLC_RR_{ε,d}(I_R) is at least (1 − cε)(1 − e^{-1})(1 − α) · O − dε^{-3} M_{I_R}, where c > 0 is some constant, and M_{I_R} is defined as in (1). By the definition of the residual problem, we get that $M_{I_R} \le \frac{f(R)}{|R|} = \frac{\alpha O}{h} \le \frac{\varepsilon^3}{e\,d}\,\alpha O$; thus, the expected profit from the solution is at least

$$\alpha O + (1 - c\varepsilon)(1 - e^{-1})(1 - \alpha)O - d\varepsilon^{-3} \cdot \frac{\alpha O}{h} \ge (1 - \Theta(\varepsilon))(1 - e^{-1})O.$$
The expected profit of the returned solution is at least the expected profit in any iteration of the algorithm. This yields the desired approximation ratio. For the running time of the algorithm we note that, for fixed values of d ≥ 1 and ε > 0, the number of iterations of the loop is polynomial in the number of elements, and each iteration takes a polynomial number of steps.
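For concreteness, here is a compact Python sketch (ours) of the enumeration wrapper A_MLC, assuming a black-box routine mlc_rr that solves the value residual problem with respect to R, and an oracle f; all names and data representations are ours.

```python
from itertools import combinations
from math import ceil, e

def a_mlc(universe, f, mlc_rr, eps, d):
    """Profit-enumeration wrapper: guess a small set R of high-value elements,
    solve the value residual problem with mlc_rr, and keep the best combined set."""
    h = ceil(e * d * eps ** -3)            # size of the enumerated prefix
    best = frozenset()
    for size in range(h + 1):
        for R in map(frozenset, combinations(universe, size)):
            # the residual instance (objective f_R, reduced budget and universe)
            # is assumed to be built inside mlc_rr
            S = mlc_rr(R)
            if f(R | S) > f(best):
                best = R | S
    return best
```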
2.3 The Randomized Procedure

For the randomized procedure, we use the following algorithm, which is parametrized by ε and d and accepts an input I:

Rounding Procedure for MLC (MLC_RR_{ε,d}(I))

1. Enumerate over all possible sub-collections of big elements which yield feasible solutions; denote the chosen sub-collection by T, and let T_r ⊆ T be the sub-collection of elements in T which are big in the r-th dimension, r = 1, ..., d. Denote by I_T the cost residual problem with respect to T.
   (a) Find $\bar{x}$ in the polytope of I_T such that $F_T(\bar{x})$ is at least (1 − e^{-1} − ε) times the optimal solution of I_T.
   (b) Add any small element i ∈ I_T to the solution with probability (1 − ε)x_i; add any element i ∈ T to the solution with probability (1 − ε). Denote the selected elements by D.
   (c) For any 1 ≤ r ≤ d, let $L^g_r = \sum_{i \in T_r} c_{i,r}$, and $\tilde{L}_r = L_r - L^g_r$.
   (d) If for some 1 ≤ r ≤ d one of the following holds:
       • $\tilde{L}_r > \varepsilon L_r$ and $\sum_{i \in D} c_{i,r} > L_r$
       • $\tilde{L}_r \le \varepsilon L_r$ and $\sum_{i \in D \setminus T_r} c_{i,r} > \varepsilon L_r + \tilde{L}_r$
       then set D = ∅; else
   (e) For any dimension 1 ≤ r ≤ d such that $\tilde{L}_r \le \varepsilon L_r$, remove from D elements in T_r until $\sum_{i \in D} c_{i,r} \le L_r$.
   (f) If f(D) is larger than the value of the current best solution, then set D to be the current best solution.

2. Return the best solution.

We now analyze algorithm MLC_RR_{ε,d}. For an instance I of MLC, let O be an optimal solution (O is used both as the selected set of elements and as the value of the solution).

Theorem 2 Given an input I, algorithm MLC_RR_{ε,d}(I) returns a feasible subset of elements S such that E[f(S)] ≥ (1 − Θ(ε))(1 − e^{-1})O − dε^{-3} M_I, where M_I is defined in (1).

Throughout the analysis we consider the iteration in which T contains exactly all the big elements in O. To prove Theorem 2, we use the next technical lemmas. First, define W = f(D) when D is considered after stage (1b); then

Lemma 3 E[W] ≥ (1 − Θ(ε))(1 − e^{-1})O.
Proof: Let D_1 be the collection of small elements in D, and D_2 = D \ D_1 the collection of big elements in D. In Step (1a) we get that $F_T(\bar{x}) \ge (1 - e^{-1} - \varepsilon) f_T(O)$ (the optimal solution for I_T is at least f_T(O), by the selection of O \ T). Hence, by the properties of F (see [24]), we have that

$$E[f_T(D_1)] = F_T((1 - \varepsilon)\bar{x}) \ge (1 - \varepsilon)(1 - e^{-1} - \varepsilon) f_T(O),$$

and

$$E[f(D_2)] = F((1 - \varepsilon)\mathbf{1}_T) \ge (1 - \varepsilon) F(\mathbf{1}_T) = (1 - \varepsilon) f(T)$$

(here $\mathbf{1}_T \in \{0,1\}^U$ is the vector with y_i = 1 iff i ∈ T). It follows that

$$E[W] = E[f(D)] = E[f(D_2) + f_{D_2}(D_1)] \ge E[f(D_2)] + E[f_T(D_1)] \ge (1 - \varepsilon)f(T) + (1 - \varepsilon)(1 - e^{-1} - \varepsilon)f_T(O) \ge (1 - \Theta(\varepsilon))(1 - e^{-1})O.$$

Lemma 3 implies that, after the randomized rounding of stage (1b), the integral solution D has a high value. In the next lemma we show that the modifications applied to D in stages (1d) and (1e) may cause only small harm to the expected value of the solution. We say that a solution is nearly feasible in dimension r if it does not satisfy any of the conditions in (1d), and nearly feasible if it is nearly feasible in each dimension. Let F (F_r) be an indicator for the near-feasibility of D (near-feasibility of D in dimension r) after stage (1b).

Lemma 4 Pr(F = 0) ≤ dε.

Proof: For some 1 ≤ r ≤ d, let Z_{r,1} be the cost of D ∩ T_r in dimension r, and let Z_{r,2} be the cost of D \ T_r in dimension r. Clearly, $Z_{r,1} \le L^g_r \ (\le L_r)$. Let X_i be an indicator random variable for the selection of element i (note that the X_i's are independent). Let Small(r) be the collection of all the elements which are not big in dimension r, i.e., for any i ∈ Small(r), $c_{i,r} < \varepsilon^4 L_r$. Then $Z_{r,2} = \sum_{i \in Small(r)} X_i c_{i,r}$. It follows that $E[Z_{r,2}] = \sum_{i \in Small(r)} c_{i,r} E[X_i] \le (1 - \varepsilon)\tilde{L}_r$, and

$$Var[Z_{r,2}] \le \sum_{i \in Small(r)} E[X_i] c_{i,r} \cdot \varepsilon^4 L_r \le \varepsilon^4 L_r \tilde{L}_r.$$

Recall that by the Chebyshev-Cantelli bound, for any t > 0,

$$Pr(Z_{r,2} - E[Z_{r,2}] \ge t) \le \frac{Var[Z_{r,2}]}{Var[Z_{r,2}] + t^2}.$$

Thus, if $\tilde{L}_r > \varepsilon L_r$, using the Chebyshev-Cantelli inequality we have

$$Pr(F_r = 0) \le Pr(Z_{r,2} - E[Z_{r,2}] > \varepsilon \tilde{L}_r) \le \frac{\varepsilon^4 L_r \tilde{L}_r}{\varepsilon^2 \tilde{L}_r^2} \le \varepsilon;$$

else $\tilde{L}_r \le \varepsilon L_r$, and similarly

$$Pr(F_r = 0) \le Pr(Z_{r,2} - E[Z_{r,2}] > \varepsilon L_r) \le \frac{\varepsilon^4 L_r \tilde{L}_r}{\varepsilon^2 L_r^2} \le \varepsilon^3.$$

By the union bound, we get that Pr(F = 0) ≤ dε.
For any dimension r, let $R_r = \frac{\sum_{i \in D} c_{i,r}}{L_r}$, and define $R = \max_r R_r$, where D is considered after stage (1b). R denotes the maximal relative deviation of the cost from the r-th entry in the budget vector, where the maximum is taken over 1 ≤ r ≤ d.
Lemma 5 For any ℓ > 1, $Pr(R > \ell) \le \frac{d\varepsilon^4}{(\ell - 1)^2}$.

Proof: For any dimension r,

$$Pr(R_r > \ell) \le Pr(Z_{r,2} > \ell \cdot L_r - L^g_r) \le Pr(Z_{r,2} - E[Z_{r,2}] > (\ell - 1)L_r) \le \frac{\varepsilon^4 L_r \tilde{L}_r}{(\ell - 1)^2 L_r^2} \le \frac{\varepsilon^4}{(\ell - 1)^2},$$

and by the union bound, we get that

$$Pr(R > \ell) \le \frac{d\varepsilon^4}{(\ell - 1)^2}.$$
Lemma 6 For any integer ℓ > 1, if R ≤ ℓ then f(D) ≤ 2dℓ · O.

Proof: The set D can be partitioned into 2dℓ sets D_1, ..., D_{2dℓ} such that each of these sets is a feasible solution. Hence, f(D_i) ≤ O, and so f(D) ≤ f(D_1) + ... + f(D_{2dℓ}) ≤ 2dℓ · f(O). We omit the details.

Let W′ = f(D) when D is considered after stage (1d).

Lemma 7 E[W′] ≥ (1 − Θ(ε))(1 − e^{-1})O.

Proof: By Lemmas 4 and 5, it holds that

$$E[W] = E[W \mid F = 1]\cdot Pr(F = 1) + E[W \mid F = 0 \wedge R < 2]\cdot Pr(F = 0 \wedge R < 2) + \sum_{\ell=1}^{\infty} E\big[W \mid F = 0 \wedge (2^{\ell} \le R \le 2^{\ell+1})\big]\cdot Pr\big(F = 0 \wedge (2^{\ell} \le R \le 2^{\ell+1})\big)$$
$$\le E[W \mid F = 1]\cdot Pr(F = 1) + 4d^2\varepsilon \cdot O + d^2\varepsilon^4 \cdot O \cdot \sum_{\ell=1}^{\infty} \frac{2^{\ell+2}}{(2^{\ell} - 1)^2}.$$

Since the last summation is a constant, using Lemma 3 we have that E[W | F = 1] · Pr(F = 1) ≥ (1 − cε)(1 − e^{-1})O, where c is some constant. Also, since W′ = W if F = 1 and W′ = 0 otherwise, we have that

$$E[W'] = E[W \mid F = 1]\cdot Pr(F = 1) \ge (1 - c\varepsilon)(1 - e^{-1})O.$$
Lemma 8 Let P = f(D) when D is considered after stage (1f). Then D is a feasible solution, and E[P] ≥ (1 − Θ(ε))(1 − e^{-1})O − dε^{-3} · M_I.

Proof: In stage (1e), for each dimension 1 ≤ r ≤ d, if $\tilde{L}_r > \varepsilon L_r$ then no elements are removed from the solution (and, clearly, the solution is feasible in this dimension). If $\tilde{L}_r \le \varepsilon L_r$ then, if all big elements in the r-th dimension are removed, the solution becomes feasible in this dimension, since

$$\sum_{i \in D \setminus T_r} c_{i,r} \le \tilde{L}_r + \varepsilon L_r \le 2\varepsilon L_r \le L_r$$

(for ε < 1/2). This implies that it is possible to convert the solution to a feasible solution in the r-th dimension by removing only elements which are big in this dimension. At most ε^{-3} elements need to be removed due to each dimension r (since $c_{i,r} \ge \varepsilon^4 L_r$ when i is big in the r-th dimension, and the total cost of D in dimension r is at most (1 + ε)L_r before the removal). Hence, in stage (1e) at most dε^{-3} elements are removed. Then the expected value of the solution after this stage satisfies E[P] ≥ E[W′] − dε^{-3} M_I (since the profit is a non-decreasing submodular function) and, by Lemma 7, E[P] ≥ (1 − Θ(ε))(1 − e^{-1})O − dε^{-3} · M_I.

Proof of Theorem 2. Since any non-feasible solution is converted to a feasible one, the algorithm returns a feasible solution. By Lemma 8, the expected value of the returned solution is at least (1 − Θ(ε))(1 − e^{-1})O − dε^{-3} · M_I. For the running time of the algorithm, we note that each iteration of the loop runs in polynomial time; the number of iterations of the main loop is also polynomial, as the number of elements in T is bounded by dε^{-4}, which is a constant for fixed values of d ≥ 1, ε > 0.
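To make the above concrete, here is a compact Python sketch (ours) of one iteration of the rounding procedure for a fixed guess T: the randomized rounding of stage (1b), the near-feasibility test of stage (1d), and the fixing step of stage (1e). The fractional solution x over the small elements is assumed to be given.

```python
import random

def round_and_fix(x, costs, L, T, big_in_dim, eps, rng=None):
    """One rounding iteration of MLC_RR for a fixed guess T of big elements.

    x          -- dict: small element -> fractional value
    costs      -- dict: element -> list of d costs
    L          -- list of d budgets
    T          -- guessed set of big elements
    big_in_dim -- dict: dimension r -> set of elements of T big in dimension r
    """
    rng = rng or random.Random(0)
    d = len(L)
    # stage (1b): keep each small element with prob (1-eps)*x_i, each i in T with prob 1-eps
    D = {i for i, xi in x.items() if rng.random() < (1 - eps) * xi}
    D |= {i for i in T if rng.random() < (1 - eps)}

    Lg = [sum(costs[i][r] for i in big_in_dim.get(r, ())) for r in range(d)]
    Ltil = [L[r] - Lg[r] for r in range(d)]

    # stage (1d): discard solutions that are not nearly feasible
    for r in range(d):
        total = sum(costs[i][r] for i in D)
        small_part = sum(costs[i][r] for i in D if i not in big_in_dim.get(r, ()))
        if Ltil[r] > eps * L[r] and total > L[r]:
            return set()
        if Ltil[r] <= eps * L[r] and small_part > eps * L[r] + Ltil[r]:
            return set()

    # stage (1e): fix dimensions with small residual budget by dropping big elements
    for r in range(d):
        if Ltil[r] <= eps * L[r]:
            while sum(costs[i][r] for i in D) > L[r]:
                removable = D & big_in_dim.get(r, set())
                if not removable:
                    break  # cannot happen for a nearly feasible D (see Lemma 8)
                D.discard(removable.pop())
    return D
```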
3 Approximation Algorithm for MCP

The MCMP problem in a single dimension can be formulated as follows. Given is a ground set A = {a_1, ..., a_n}, where each element a_j has a weight w_j ≥ 0 and a size s_j ≥ 0. Also, given are a bin of capacity B, a collection of subsets S = {S_1, ..., S_m}, and an integer k > 1. Let $s(E) = \sum_{a_j \in E} s_j$ and $w(E) = \sum_{a_j \in E} w_j$ for all E ⊆ A. The goal is to select a sub-collection of sets S′, such that |S′| ≤ k, and a set of elements $E \subseteq \bigcup_{S_i \in S'} S_i$, such that s(E) ≤ B and w(E) is maximized.

Let O be an optimal solution for MCP (we use O also as the weight of the solution). Our approximation algorithm for MCP combines an enumeration phase, which involves guessing the ℓ = 3 elements with highest weights in O and sets that cover them, with maximization of a submodular function. Given a correct guess of the ℓ elements with highest profits, we use two submodular functions: the function f is used for finding a collection of subsets for our approximate solution, and the function g is used for selecting the covered elements in these subsets. More specifically, we arbitrarily associate each element a_j in O with a set S_i in O which contains a_j; we then consider them as a pair (a_j, S_i). The first stage of our algorithm is to guess T, a collection of ℓ pairs (a_j, S_i), such that a_j ∈ S_i. The elements in T are the ℓ elements with highest weights in O. Let T_E be the collection of elements in T, and let T_S be the collection of
sets in T. Also, let k′ = k − |T_S| and $w_T = \min_{a_j \in T_E} w_j$. We denote by O′ the weight of the solution O excluding the elements in T_E; then, w(T_E) = O − O′. We use our guess of T to define a non-decreasing submodular set function over S \ T_S. Let B′ = B − s(T_E). We first define the function $g : 2^A \rightarrow \mathbb{R}$:

$$g(E) = \max \sum_{j=1}^{n} x_j w_j$$

subject to:

$$0 \le x_j \le 1 \quad \forall\, a_j \in E, \qquad x_j = 0 \quad \forall\, a_j \notin E, \qquad \sum_{j=1}^{n} x_j s_j \le B'. \qquad (2)$$
Note that while g is formulated as a linear program, given a collection of elements E, the value of g(E) can be easily evaluated by a simple greedy algorithm, and the vector $\bar{x}$ for which the value of g(E) is attained has a single fractional entry (see the sketch at the end of this section).

Lemma 9 The function g is a non-decreasing submodular set function.

Proof: Let V, W ⊆ A. We show below that

$$g(V) + g(W) \ge g(V \cup W) + g(V \cap W). \qquad (3)$$
Consider the vectors $\bar{z} = (z_1, ..., z_n)$ and $\bar{y} = (y_1, ..., y_n)$, which maximize g(V ∪ W) and g(V ∩ W), respectively. Define $\bar{z}^1 = (z^1_1, ..., z^1_n)$, in which $z^1_j = z_j$ for any a_j ∈ W, and $z^1_j = 0$ otherwise, and let $\bar{z}^2 = \bar{z} - \bar{z}^1$. In the following we define two vectors, $\bar{x}^V$ and $\bar{x}^W$, as feasible solutions for the above program, where the collections of elements are E = W and E = V, respectively.

1. Set $\bar{v} = 0$ and $\bar{x}^W = \bar{z}^1$.
2. For any a_j ∈ V ∩ W do:
   (a) $r_j = \min\left\{1 - z^1_j,\; \frac{B' - \bar{x}^W \cdot \bar{s}}{s_j},\; y_j\right\}$, where $\bar{s} = (s_1, ..., s_n)$.
   (b) $x^W_j = x^W_j + r_j$; $v_j = y_j - r_j$.
3. $\bar{x}^V = \bar{z}^2 + \bar{v}$.

It is easy to see that at the end of the process, $\bar{x}^W + \bar{x}^V = \bar{z} + \bar{y}$. Also, clearly, $\bar{x}^W$ is a feasible solution for the linear inequalities in (2) with respect to the set W. We now show that $\bar{x}^V$ is a feasible solution for this system with respect to the set V. For any $a_j \notin V$, it holds that $x^V_j = 0$. For any a_j ∈ V, if $a_j \notin W$ then $x^V_j = z^2_j$ and $0 \le x^V_j \le 1$; otherwise, a_j ∈ W, and $x^V_j = v_j = y_j - r_j$. By the definition of r_j, it holds that $0 \le r_j \le y_j$, thus $0 \le x^V_j \le y_j \le 1$. It remains to show that $\bar{x}^V \cdot \bar{s} \le B'$. To do so we consider two scenarios:

• In case $\bar{x}^W \cdot \bar{s} = B'$, we have that $\bar{x}^V \cdot \bar{s} + B' = (\bar{x}^W + \bar{x}^V) \cdot \bar{s} = (\bar{z} + \bar{y}) \cdot \bar{s} \le 2B'$, and $\bar{x}^V \cdot \bar{s} \le B'$ as desired.

• Otherwise $\bar{x}^W \cdot \bar{s} < B'$; this means that whenever the calculation of r_j is performed, either $r_j = y_j$ or $r_j = 1 - z^1_j$. We get that $v_j = 0$ or $v_j = y_j - (1 - z^1_j) \le z^1_j$. Hence, $\bar{v} \le \bar{z}^1$. It follows that $\bar{x}^V \cdot \bar{s} = (\bar{z}^2 + \bar{v}) \cdot \bar{s} \le (\bar{z}^2 + \bar{z}^1) \cdot \bar{s} = \bar{z} \cdot \bar{s} \le B'$.

Therefore, $\bar{x}^V$ is a feasible solution for the system (2) with respect to the set V. Let $\bar{w} = (w_1, ..., w_n)$. From the above, we conclude that

$$g(W) + g(V) \ge \bar{x}^W \cdot \bar{w} + \bar{x}^V \cdot \bar{w} = (\bar{x}^W + \bar{x}^V) \cdot \bar{w} = (\bar{z} + \bar{y}) \cdot \bar{w} = g(W \cup V) + g(W \cap V),$$

thus showing (3). It is also easy to see that g is a non-decreasing set function. This completes the proof.

For any S′ ⊆ S \ T_S, we define the set C(S′) to be the subset of elements whose weights are at most w_T:

$$C(S') = \Big\{a_j \;\Big|\; a_j \in \bigcup_{S_i \in S' \cup T_S} S_i,\; a_j \notin T_E,\; w_j \le w_T\Big\}.$$

We use g and C(S′) to define $f : 2^{S \setminus T_S} \rightarrow \mathbb{R}$ by f(S′) = g(C(S′)). Consider the problem

$$\max f(S') \quad \text{subject to: } |S'| \le k'. \qquad (4)$$
By taking S′ to be all the sets in O excluding T_S, we get that |S′| ≤ k′ and f(S′) ≥ O′. This gives a lower bound on the value of the problem. To find a collection of subsets S′ such that f(S′) is an approximation for the problem (4), we use the following property of f.

Lemma 10 The function f is a non-decreasing submodular set function.

Proof: Let W, V ⊆ S \ T_S; then, since g is submodular and non-decreasing, we have:

$$f(W) + f(V) = g(C(W)) + g(C(V)) \ge g(C(W) \cup C(V)) + g(C(W) \cap C(V)) \ge g(C(W \cup V)) + g(C(W \cap V)) = f(W \cup V) + f(W \cap V).$$

Also, since W ⊆ V implies that C(W) ⊆ C(V), and g is non-decreasing, f is non-decreasing as well.

This means that we can find a (1 − e^{-1})-approximation for the problem (4) by using a greedy algorithm [17]. Let S′ be the collection of subsets obtained by this algorithm. Since it is a (1 − e^{-1})-approximation for (4), we have that g(C(S′)) = f(S′) ≥ (1 − e^{-1})O′. Let $\bar{x}$ be the vector that maximizes g(C(S′)), which has at most one fractional entry. (As mentioned above, such $\bar{x}$ can be found using a simple greedy algorithm.) Consider the collection of elements C = {a_j | x_j = 1}. Since there is at most one fractional entry in $\bar{x}$, by the definition of C(S′) we have that w(C) ≥ g(C(S′)) − w_T. Now consider the collection of sets S′ ∪ T_S, along with the elements C ∪ T_E. This is a feasible solution for MCP whose total weight is at least

$$w(T_E) + g(C(S')) - w_T \ge \left(1 - \frac{1}{\ell}\right) w(T_E) + (1 - e^{-1})O' = \left(1 - \frac{1}{\ell}\right)(O - O') + (1 - e^{-1})O' \ge (1 - e^{-1})O.$$

The last inequality follows from the fact that ℓ = 3, and therefore 1 − 1/ℓ ≥ 1 − e^{-1}. We now summarize the steps of our approximation algorithm.
Approximation Algorithm for MCP

1. Enumerate over all possible sets T of pairs (a_j, S_i), such that a_j ∈ S_i and |T| ≤ ℓ:
   (a) Find a (1 − e^{-1})-approximation S′ for the problem (4), using the greedy algorithm [17].
   (b) Let $\bar{x}$ be the vector that maximizes g(C(S′)), such that $\bar{x}$ has at most one fractional entry. Define C = {a_j | x_j = 1}.
   (c) Consider the collection of sets S′ ∪ T_S, along with the elements C ∪ T_E. If the weight of this solution is higher than that of the best solution found so far, select it as the best solution.
2. Return the best solution found.

By the above discussion, we have

Theorem 11 The approximation algorithm for MCP achieves a ratio of 1 − e^{-1} to the optimal solution and has a polynomial running time.

The result in this section can be easily extended to solve a generalization of MCP in which each set has a cost c_i and there is a budget L for the sets, by using an algorithm of [22] to find a (1 − e^{-1})-approximation for f(S′). In contrast, there is no immediate extension of the above result to MCMP. A main obstacle is the fact that when attempting to define a function g (and accordingly f) that involves more than a single linear constraint, the resulting function is not submodular.
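For completeness, here is the simple greedy (fractional knapsack) evaluation of g(E) referred to above; it is a small sketch in our own notation, returning both the value and the vector x̄, which has at most one fractional entry.

```python
def evaluate_g(E, weights, sizes, budget):
    """Fractional-knapsack evaluation of g(E): take elements of E in order of
    decreasing weight/size density until the budget B' is exhausted; the last
    element taken may be fractional."""
    x = {j: 0.0 for j in E}
    remaining = budget
    value = 0.0
    order = sorted(E, key=lambda j: weights[j] / sizes[j] if sizes[j] > 0 else float("inf"),
                   reverse=True)
    for j in order:
        if sizes[j] == 0:
            x[j] = 1.0                 # zero-size elements are always taken in full
            value += weights[j]
            continue
        if remaining <= 0:
            break
        take = min(1.0, remaining / sizes[j])
        x[j] = take
        value += take * weights[j]
        remaining -= take * sizes[j]
    return value, x

# tiny usage example (data is illustrative)
val, x = evaluate_g({0, 1, 2}, weights=[6.0, 4.0, 3.0], sizes=[3.0, 2.0, 2.0], budget=4.0)
print(val, x)
```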
4 A Randomized Approximation Algorithm for MCMP

The problem of maximum coverage with multiple packing constraints is the following variant of the maximum coverage problem. Given is a collection of sets S = {S_1, ..., S_m} over a ground set A = {a_1, ..., a_n}, where each element a_j has a weight w_j ≥ 0 and a d-dimensional size vector $\bar{s}_j = (s_{j,1}, ..., s_{j,d})$, such that s_{j,r} ≥ 0 for all 1 ≤ r ≤ d. Also, given is an integer k > 1, and a bin whose capacity is given by the d-dimensional vector $\bar{B} = (B_1, ..., B_d)$. A collection of elements E is feasible if, for any 1 ≤ r ≤ d, $\sum_{a_j \in E} s_{j,r} \le B_r$; the weight of E is $w(E) = \sum_{a_j \in E} w_j$. The goal is to select a sub-collection of sets S′ ⊆ S of size at most k and a feasible collection of elements E ⊆ A, such that each element in E is an element of some S_i ∈ S′ and w(E) is maximized.

An important observation when attempting to solve this problem is that, given the selected sub-collection of sets S′, choosing the set of elements E (which are covered by S′) yields an instance of the classic multidimensional knapsack problem (MKP). It is well known that MKP admits a PTAS [10], and that the existence of an FPTAS for the problem would imply that P = NP (see, e.g., [13]). Our algorithm makes use of the two main building blocks of the PTAS for MKP as presented in [13], namely, an exhaustive enumeration phase, combined with certain properties of the linear programming relaxation of the problem.

Let O be an optimal solution for the given instance of MCMP. We arbitrarily associate each element a_j selected in O with some selected subset S_i in O such that a_j ∈ S_i. For the use of our algorithm, we guess a collection T of ℓ pairs (a_j, S_i) of an element a_j and a set S_i with which it is associated, such that the collection of elements in T consists of the ℓ elements with highest weights in O. Let T_E be the collection of elements in T, and let T_S be the collection of sets in T. Also, let $w_T = \min_{a_j \in T_E} w_j$.
After guessing the pairs in T and taking them to be the initial solution for the problem, we use the following notation, which reflects the problem that now needs to be solved. Define the capacity vector $\bar{B}' = (B'_1, ..., B'_d)$, where $B'_r = B_r - \sum_{a_j \in T_E} s_{j,r}$ for 1 ≤ r ≤ d. We reduce the collection of elements to

$$A' = \{a_j \in A \setminus T_E \mid w_j \le w_T \text{ and } \bar{s}_j \le \bar{B}'\}$$

(A′ consists of all the elements whose weight is not greater than the smallest weight of an element in T, and which fit into the new capacity vector). Define the subsets to be $S'_i = S_i \cap A'$. Also, let $O' = O - \sum_{a_j \in T_E} w_j$ be the total weight in the optimal solution from elements not in T.

We define the size of a subset S′_i to be the total size of the elements associated with S′_i (i.e., the elements associated with S_i, excluding elements in T_E), and denote it by $\bar{\hat{s}}_i = (\hat{s}_{i,1}, ..., \hat{s}_{i,d})$. We say that a subset S′_i is big in dimension r if $\hat{s}_{i,r} > \varepsilon^4 B'_r$, and small in dimension r otherwise. Since there are up to ε^{-4} subsets that are big in dimension r in O, we can guess which sets are big in each dimension in the solution O. Let G_r be the collection of big sets in dimension r in our guess. Also, let x_i ∈ {0, 1} be an indicator for the selection of S′_i, 1 ≤ i ≤ m; y_{i,j} ∈ {0, 1} indicates whether a_j ∈ A′ is associated with S′_i, 1 ≤ i ≤ m. Using the guess of T and our guess of the big sets, we define the following linear programming relaxation for the problem.

$$\text{(LP)} \quad \text{maximize} \quad \sum_{i=1}^{m} \sum_{j \,:\, a_j \in S'_i} y_{i,j} w_j$$

subject to:

$$\forall i, j \text{ s.t. } a_j \in S'_i: \quad 0 \le y_{i,j} \le x_i$$
$$\forall a_j \in A': \quad \sum_{i \,:\, a_j \in S'_i} y_{i,j} \le 1$$
$$\sum_{i=1}^{m} x_i \le k$$
$$\forall\, 1 \le r \le d: \quad \sum_{i=1}^{m} \sum_{j \,:\, a_j \in S'_i} y_{i,j} s_{j,r} \le B'_r$$
$$\forall\, S_i \in T_S: \quad x_i = 1$$
$$\forall\, 1 \le r \le d \text{ and } i \in G_r: \quad x_i = 1, \quad \sum_{a_j \in S'_i} y_{i,j} s_{j,r} \ge \varepsilon^4 B'_r$$
$$\forall\, 1 \le r \le d \text{ and } i \notin G_r: \quad 0 \le x_i \le 1, \quad \sum_{a_j \in S'_i} y_{i,j} s_{j,r} \le \varepsilon^4 B'_r x_i$$
It is important to note that, given a correct guess of T and G_r for every 1 ≤ r ≤ d, the value of the optimal solution of LP is at least O′. The guessed sets G_r are involved in the last two constraints of the above program. All sets in G_r were added to the solution (x_i = 1), and are ‘forced’ to be big. In contrast, sets which are not in G_r have to satisfy the constraint

$$\sum_{a_j \in S'_i} y_{i,j} s_{j,r} \le \varepsilon^4 B'_r x_i. \qquad (5)$$

Therefore, these sets remain small even after scaling the values of y_{i,j} by $x_i^{-1}$. The solution of LP is used to randomly determine the sets selected for our solution. For
any set S_i, 1 ≤ i ≤ m, if S_i ∈ T_S then S_i is added to the solution; otherwise, S_i is added to the solution with probability (1 − ε)x_i. If the resulting solution D contains more than k subsets, then the algorithm returns an empty solution; otherwise, we define $C = \bigcup_{S_i \in D} S'_i$ and solve the following program.

$$\text{(LP')} \quad \text{maximize} \quad \sum_{a_j \in C} y_j w_j$$

subject to:

$$\forall\, 1 \le r \le d: \quad \sum_{a_j \in C} y_j s_{j,r} \le B'_r$$
$$\forall\, a_j \in C: \quad 0 \le y_j \le 1$$
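The two algorithmic steps just described (the randomized choice of sets, and the solution of LP′) are straightforward to implement. The following Python sketch is ours and uses scipy.optimize.linprog as a generic LP solver; variable names follow the text, and the fractional LP solution x is assumed to be given.

```python
import random
import numpy as np
from scipy.optimize import linprog

def select_sets(x, T_S, eps, rng=None):
    # keep every guessed set, and every other set S_i independently with prob (1-eps)*x_i
    rng = rng or random.Random(0)
    return {i for i in x if i in T_S or rng.random() < (1 - eps) * x[i]}

def solve_lp_prime(C, weights, sizes, B_prime):
    """Solve LP': maximize sum_j y_j w_j over the elements of C subject to the d
    capacity constraints and 0 <= y_j <= 1 (a basic optimal solution has at
    most d fractional entries)."""
    elems = sorted(C)
    c = -np.array([weights[j] for j in elems])            # linprog minimizes
    A_ub = np.array([[sizes[j][r] for j in elems] for r in range(len(B_prime))])
    res = linprog(c, A_ub=A_ub, b_ub=np.array(B_prime),
                  bounds=[(0, 1)] * len(elems), method="highs")
    return dict(zip(elems, res.x))
```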
Any basic solution for the above linear program has at most d fractional entries. Let A be the collection of elements a_j ∈ C for which y_j = 1; then, clearly, A ∪ T_E, along with the collection of subsets D, forms a feasible solution for the problem. We now summarize the steps of our algorithm, which gets as input the parameters ℓ ≥ 1 and ε > 0.

Approximation Algorithm for MCMP

1. If k ≤ ε^{-3} + ℓ, enumerate over the subsets in the optimal solution, and run the PTAS for MKP for selecting the elements (this guarantees an approximation ratio of 1 − ε).
2. For each collection T of at most ℓ pairs (a_j, S_i), where a_j ∈ S_i, and any G_r ⊆ {1, ..., m} of size at most ε^{-4}, do the following:
   (a) Solve LP with respect to T and G_r. Let $\bar{x} = (x_1, ..., x_m)$ be the (partial) solution.
   (b) Initially, set D = ∅. For any S_i ∈ S, if S_i ∈ T_S add S_i to D; otherwise, add S_i to D with probability (1 − ε)x_i.
   (c) If D contains more than k sets, continue to the next iteration of the loop in Step 2.
   (d) Let $\bar{y}$ be the solution of LP′ for the elements in D. Set A to be all the elements a_j such that y_j = 1.
   (e) If the total weight of the elements in A ∪ T_E is greater than the weight of the current best solution, choose A ∪ T_E with D as the current best solution.
3. Return the best solution found.

It is easy to verify that the running time of the algorithm is polynomial (for fixed ℓ and ε), since the number of iterations of the main loop (in Step 2) is polynomial. Also, clearly, the solution returned by the algorithm is always feasible. It remains to show that the algorithm achieves the desired approximation ratio of

$$\alpha_f = 1 - \left(1 - \frac{1}{f}\right)^f > 1 - e^{-1}, \qquad (6)$$
where f is the maximal number of subsets in which a single element appears. For the case where k ≤ ε^{-3} + ℓ, the claim is trivial. Hence, we assume below that k > ε^{-3} + ℓ.

To show the approximation ratio of α_f, we refer to the iteration in which we use the correct guess for T and G_r. We define a slightly more complex randomized rounding process. Let X_1, ..., X_m be a set of indicator random variables defined as follows. For any i such that $S_i \notin T_S$, X_i = 1 iff S_i ∈ D, and for any i ∈ T_S, X_i = 1 with probability 1 − ε. Thus, for any 1 ≤ i ≤ m, X_i = 1 with probability (1 − ε)x_i. Also, we note that the X_i's are independent random variables. Let {y_{i,j}} be the solution obtained for LP in line (2a) of the algorithm. The values of the X_i's are used to determine the value of the random variable Y_j, for any a_j ∈ A′, namely,

$$Y_j = \min\left\{1,\; \sum_{i \,:\, a_j \in S_i,\, x_i \ne 0} X_i \frac{y_{i,j}}{x_i}\right\}.$$
Our goal is to show that (a slight modification of) Y_1, ..., Y_n forms a solution for LP′ with high expected value. Define $y_j = \sum_{i \,:\, a_j \in S_i} y_{i,j}$; then the following holds.

Lemma 12 For any a_j ∈ A′,

$$E[Y_j] \ge (1 - \varepsilon)\alpha_f \cdot y_j,$$

where α_f is defined in (6).

We use in the proof the next claim.

Claim 13 For any x ∈ [0, 1] and f ∈ N,

$$1 - \left(1 - \frac{x}{f}\right)^f \ge x \cdot \alpha_f.$$
Proof: Let $h(x) = 1 - \left(1 - \frac{x}{f}\right)^f - x\,\alpha_f$; then h(0) = h(1) = 0. Also, $h''(x) = -\frac{f-1}{f}\left(1 - \frac{x}{f}\right)^{f-2} \le 0$ for x ∈ [0, 1]. Hence h(x) ≥ 0 for x ∈ [0, 1], and the claim holds.

Proof of Lemma 12: For the case where y_j = 0 the claim is trivial, thus we assume below that y_j ≠ 0. Let S[j] = {1 ≤ i ≤ m | a_j ∈ S_i} and f_j = |S[j]|. Let δ ∈ (0, 1) be such that, for any i ∈ S[j] with x_i ≠ 0, the value $\frac{y_{i,j}}{x_i}$ is an integral multiple of δ, and 1 is also an integral multiple of δ (assuming all values are rational, such a δ exists). Let H = δ^{-1}, and let Z_1, ..., Z_H be a set of indicator random variables used as follows. Whenever X_i is selected in our random process (i.e., X_i = 1), randomly select $\frac{y_{i,j}}{x_i} \cdot \delta^{-1}$ indicators among Z_1, ..., Z_H with uniform distribution. For 1 ≤ h ≤ H, Z_h = 1 if it was selected by some X_i, i ∈ S[j] (we say that Z_h is selected in this case); otherwise Z_h = 0. In this process, the probability of a specific indicator Z_h to be selected by a specific X_i is zero when x_i = 0, and $(1 - \varepsilon)x_i \cdot \frac{y_{i,j}}{x_i} = (1 - \varepsilon)y_{i,j}$ otherwise. Hence, we get that, for all 1 ≤ h ≤ H,

$$E[Z_h] = Pr(Z_h = 1) = 1 - \prod_{i \in S[j]} (1 - (1 - \varepsilon)y_{i,j}) \ge 1 - \left(\frac{1}{f_j}\sum_{i \in S[j]} (1 - (1 - \varepsilon)y_{i,j})\right)^{f_j} = 1 - \left(1 - \frac{(1 - \varepsilon)y_j}{f_j}\right)^{f_j} \ge (1 - \varepsilon)y_j \alpha_f. \qquad (7)$$

The first inequality follows from the arithmetic-geometric means inequality, and the last inequality follows from Claim 13.

Let $Y'_j = \delta \cdot \sum_{h=1}^{H} Z_h$ be δ times the number of selected indicators. An important property of Y′_j is that Y′_j ≤ Y_j. Indeed, if Y_j = 1 then Y′_j ≤ 1, since there are only H = δ^{-1} indicators, and if Y_j < 1, then no more than $Y_j \delta^{-1}$ indicators are selected; therefore, E[Y′_j] ≤ E[Y_j]. By (7), we have that

$$E[Y'_j] = E\left[\sum_{h=1}^{H} Z_h \delta\right] = \delta \sum_{h=1}^{H} E[Z_h] \ge \delta \cdot H \cdot (1 - \varepsilon)y_j \alpha_f = (1 - \varepsilon)y_j \cdot \alpha_f.$$

Hence, E[Y_j] ≥ (1 − ε) · α_f · y_j, as desired.
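Claim 13 and the value of α_f are easy to check numerically; the following small Python snippet (ours) verifies the inequality on a grid of values.

```python
def alpha(f):
    return 1 - (1 - 1 / f) ** f

# Claim 13: 1 - (1 - x/f)^f >= x * alpha_f for all x in [0, 1]
for f in range(1, 8):
    for k in range(101):
        x = k / 100
        assert 1 - (1 - x / f) ** f >= x * alpha(f) - 1e-12
print("alpha_f for f = 1..4:", [round(alpha(f), 3) for f in range(1, 5)])
```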
We use another random variable, $Y = \sum_{a_j \in A'} Y_j w_j$; Y can be viewed as the value of {Y_j} when used as a solution for LP′. Let OPT be the value of the optimal solution for LP, given a correct guess of T and G_r. Then, by Lemma 12, we have that

$$E[Y] = E\left[\sum_{a_j \in A'} Y_j w_j\right] \ge (1 - \varepsilon) \cdot \alpha_f \cdot OPT \ge (1 - \varepsilon) \cdot \alpha_f \cdot O'. \qquad (8)$$

Define the size of the set S′_i to be $\bar{\hat{s}}_i = (\hat{s}_{i,1}, ..., \hat{s}_{i,d})$, where $\hat{s}_{i,r} = \sum_{a_j \in S'_i} \frac{y_{i,j}}{x_i} s_{j,r}$ if x_i ≠ 0, and $\hat{s}_{i,r} = 0$ otherwise. For any dimension 1 ≤ r ≤ d, let $B^g_r = \sum_{i \in G_r} \hat{s}_{i,r}$, and

$$\tilde{B}_r = B'_r - B^g_r. \qquad (9)$$
Also, we use the notation

$$Z_{r,1} = \sum_{i \in G_r} X_i \hat{s}_{i,r}, \qquad Z_{r,2} = \sum_{i \notin G_r} X_i \cdot \hat{s}_{i,r}, \qquad (10)$$
and $Z_r = Z_{r,1} + Z_{r,2}$: Z_{r,1} is the total size of the selected big subsets in dimension r, Z_{r,2} is the total size of the not-big subsets in dimension r, and Z_r is the total size of all subsets in dimension r. We say that the solution (output by the above randomized rounding process) is nearly feasible in dimension r, 1 ≤ r ≤ d, if one of the following holds:

(i) $\tilde{B}_r > \varepsilon B'_r$ and $Z_r \le B'_r$

(ii) $\tilde{B}_r \le \varepsilon B'_r$ and $Z_{r,2} \le \varepsilon B'_r + \tilde{B}_r$

We define a set of d random variables, F_1, ..., F_d, to indicate the near-feasibility of the solution: for any 1 ≤ r ≤ d, F_r = 1 if the solution is nearly feasible in dimension r, and F_r = 0 otherwise. Next, we show that the probability that the solution is not nearly feasible, in any dimension, is small.

Lemma 14 For any 1 ≤ r ≤ d, Pr(F_r = 0) ≤ ε.

Proof: By the definition of Z_{r,2}, we have that

$$E[Z_{r,2}] = \sum_{i \notin G_r} E[X_i]\hat{s}_{i,r} \le (1 - \varepsilon)\left(B'_r - \sum_{i \in G_r} \hat{s}_{i,r}\right) = (1 - \varepsilon)\tilde{B}_r. \qquad (11)$$

Also,

$$Var[Z_{r,2}] = Var\left[\sum_{i \notin G_r} X_i \cdot \hat{s}_{i,r}\right] = \sum_{i \notin G_r} \hat{s}_{i,r}^2 \, Var[X_i] \le \sum_{i \notin G_r} \varepsilon^4 B'_r \hat{s}_{i,r}(1 - \varepsilon)x_i = \varepsilon^4 B'_r (1 - \varepsilon) \sum_{i \notin G_r} \hat{s}_{i,r} x_i \le \varepsilon^4 B'_r (1 - \varepsilon)\tilde{B}_r. \qquad (12)$$
The second equality holds since the X_i's are independent random variables. The first inequality follows from (5), and the second inequality follows from (11). In bounding Pr(F_r = 0) we distinguish between two cases.

(i) Suppose that $\tilde{B}_r > \varepsilon B'_r$. Then

$$Pr(F_r = 0) = Pr(Z_r > B'_r) \le Pr(Z_{r,2} > \tilde{B}_r) \le Pr(Z_{r,2} - E[Z_{r,2}] > \varepsilon \tilde{B}_r) \le \frac{\varepsilon^4 B'_r (1 - \varepsilon)\tilde{B}_r}{\varepsilon^2 \tilde{B}_r^2} \le \varepsilon^2 \cdot \frac{B'_r}{\tilde{B}_r} \le \varepsilon.$$

The first inequality follows from the fact that $Z_{r,1} \le B^g_r$; the second follows from (11); the third is due to Chebyshev-Cantelli,⁶ taking $t = \varepsilon \tilde{B}_r$; and the last inequality holds since $\tilde{B}_r \ge \varepsilon B'_r$.

(ii) For the case where $\tilde{B}_r \le \varepsilon B'_r$, we have that

$$Pr(F_r = 0) = Pr(Z_{r,2} > \tilde{B}_r + \varepsilon B'_r) \le Pr(Z_{r,2} - E[Z_{r,2}] > \varepsilon B'_r) \le \frac{\varepsilon^4 B'_r (1 - \varepsilon)\tilde{B}_r}{\varepsilon^2 B'^2_r} \le \frac{\varepsilon^5 B'^2_r}{\varepsilon^2 B'^2_r} = \varepsilon^3.$$

The second inequality follows from the Chebyshev-Cantelli bound with $t = \varepsilon B'_r$, using (12), and the last inequality holds since $\tilde{B}_r < \varepsilon B'_r$.

Thus, we get that for any 1 ≤ r ≤ d, Pr(F_r = 0) ≤ ε.
The above lemma bounds the probability of a small deviation of Z_r from B'_r. Larger deviations can be easily bounded. Let $R_r = Z_r / B'_r$.

Lemma 15 For any dimension 1 ≤ r ≤ d and t > 1,

$$Pr(R_r > t) \le \frac{\varepsilon^4}{(t - 1)^2}.$$

Proof: By definition, we have that

$$Pr(R_r > t) = Pr(Z_{r,1} + Z_{r,2} > t \cdot B'_r) \le Pr(Z_{r,2} > t \cdot B'_r - B^g_r) \le Pr(Z_{r,2} - E[Z_{r,2}] > t \cdot B'_r - B^g_r - \tilde{B}_r) = Pr(Z_{r,2} - E[Z_{r,2}] > (t - 1)B'_r) \le \frac{Var[Z_{r,2}]}{Var[Z_{r,2}] + (t - 1)^2 B'^2_r} \le \frac{\varepsilon^4 B'_r \tilde{B}_r}{(t - 1)^2 B'^2_r} \le \frac{\varepsilon^4}{(t - 1)^2}.$$

The first inequality follows from the fact that $Z_{r,1} \le B^g_r$, the second inequality follows from
⁶Recall that by the Chebyshev-Cantelli bound, for any t > 0, $Pr(Z_{r,2} - E[Z_{r,2}] \ge t) \le \frac{Var[Z_{r,2}]}{Var[Z_{r,2}] + t^2}$.
(11). The third inequality holds due to Chebyshev-Cantelli, and the last inequality follows from the fact that $\tilde{B}_r \le B'_r$.

Next, we bound the probability that more than k sets are selected for the solution. W.l.o.g., we assume that k > ε^{-3} + ℓ.

Lemma 16 For any t > 1,

$$Pr(|D| > t \cdot k) \le \frac{\varepsilon}{t^2}.$$

Proof: Let h be the number of subsets in T. Let D′ be the set of all subsets in D which are not in T. Also define k′ = k − h. Since h ≤ ℓ, we have that k′ > ε^{-3}. Clearly,

$$Pr(|D| > t \cdot k) = Pr(|D'| > t \cdot k - h) \le Pr(|D'| > t \cdot k').$$

Now, $E[|D'|] \le (1 - \varepsilon)k'$ and $Var[|D'|] \le (1 - \varepsilon)k'$. Hence, by Chebyshev-Cantelli,

$$Pr(|D'| > t \cdot k') \le Pr\big(|D'| - E[|D'|] > t \cdot k' - (1 - \varepsilon)k'\big) \le Pr\big(|D'| - E[|D'|] > t \cdot \varepsilon \cdot k'\big) \le \frac{(1 - \varepsilon)k'}{(t \cdot \varepsilon \cdot k')^2} \le \frac{\varepsilon}{t^2}.$$
This completes the proof.
Let $R = \max\{\max_{1 \le r \le d} R_r,\; |D|/k\}$ be the maximal relative deviation of the solution, either from the capacity constraints B'_1, ..., B'_d, or from the cardinality constraint k. The next result follows from Lemmas 15 and 16, by applying the union bound.

Corollary 17 For any t > 1,

$$Pr(R > t) \le \frac{d\varepsilon^4}{(t - 1)^2} + \frac{\varepsilon}{t^2}.$$
Let F be a random variable such that F = 1 if |D| ≤ k and the solution is nearly feasible in dimension r for every 1 ≤ r ≤ d, and F = 0 otherwise. The next result follows from Lemmas 14 and 15, using the union bound.

Corollary 18 Pr(F = 0) ≤ (d + 1) · ε.

Now, we bound the value of Y as a function of R.

Lemma 19 For any integer t > 1, if R ≤ t then Y ≤ t · c_d · O′, where c_d is a constant for fixed d.

Proof: W.l.o.g., assume that B'_r = 1 for all 1 ≤ r ≤ d (this holds with proper scaling of the element sizes). For all a_j ∈ A′, let $|\bar{s}_j|_1 = s_{j,1} + \ldots + s_{j,d}$. Since R ≤ t, we have that |D| ≤ t · k. Divide D into (up to) t sets of sizes not greater than k, D_1, ..., D_t. For each set D_h, 1 ≤ h ≤ t, and a_j ∈ A′, define $Y_{h,j} = \min\left\{\sum_{i \in D_h,\, x_i \ne 0} \frac{y_{i,j}}{x_i},\; 1\right\}$, and let $z_h = \sum_{a_j \in A'} Y_{h,j} |\bar{s}_j|_1$. We note that

$$z_h \le \sum_{S_i \in D_h} |\bar{\hat{s}}_i|_1, \qquad (13)$$
where $\bar{\hat{s}}_i$ is the total size of the elements in S′_i, summed over the d dimensions. We distinguish between two cases:

(i) If z_h > 1 then the set D_h, along with the vector $\{Y_{h,j}/z_h\}$, is a solution for LP′ with value $\frac{1}{z_h}\sum_{a_j \in A'} Y_{h,j} w_j$. Recall that in any basic solution for LP′ at most d values are fractional; thus the value of a solution for this program is at most (d + 1)O′. It follows that

$$\frac{1}{z_h}\sum_{a_j \in A'} Y_{h,j} w_j \le (d + 1)O' \qquad (14)$$

for all 1 ≤ h ≤ t such that z_h > 1.

(ii) If z_h ≤ 1 then the set D_h, along with the vector {Y_{h,j}}, is a solution for LP′, and by the same argument we have that

$$\sum_{a_j \in A'} Y_{h,j} w_j \le (d + 1)O' \qquad (15)$$

for all 1 ≤ h ≤ t such that z_h ≤ 1.

Combining (14) and (15) we get that

$$\sum_{a_j \in A'} Y_{h,j} w_j \le (z_h + 1)(d + 1)O'$$
for all 1 ≤ h ≤ t. Hence,

$$Y = \sum_{h=1}^{t} \sum_{a_j \in A'} Y_{h,j} w_j \le \sum_{h=1}^{t} (z_h + 1)(d + 1)O' \le (d + 1)^2 \cdot t \cdot O'.$$

The last inequality follows from (13) and the fact that $\sum_{h=1}^{t} \sum_{S_i \in D_h} |\bar{\hat{s}}_i|_1 \le d \cdot t$.
Combining the results of the above lemmas, we obtain the following.

Lemma 20 For some constant c''_d, $E[Y \mid F = 1]\cdot Pr(F = 1) \ge (1 - c''_d \varepsilon) \cdot \alpha_f \cdot O'$.

Proof: Using conditional probabilities we have that

$$E[Y] = E[Y \mid F = 1]\cdot Pr(F = 1) + E[Y \mid F = 0]\cdot Pr(F = 0)$$
$$\le E[Y \mid F = 1]\cdot Pr(F = 1) + E[Y \mid F = 0 \wedge R \le 2]\cdot Pr(F = 0) + \sum_{t=1}^{\infty} E[Y \mid 2^t < R \le 2^{t+1}]\cdot Pr(2^t < R \le 2^{t+1})$$
$$\le E[Y \mid F = 1]\cdot Pr(F = 1) + E[Y \mid F = 0 \wedge R \le 2]\cdot Pr(F = 0) + \sum_{t=1}^{\infty} E[Y \mid R \le 2^{t+1}]\cdot Pr(2^t < R)$$
$$\le E[Y \mid F = 1]\cdot Pr(F = 1) + 2 c_d O' \varepsilon + \sum_{t=1}^{\infty} c_d O' \, 2^{t+1}\left(\frac{d\varepsilon^4}{(2^t - 1)^2} + \frac{\varepsilon}{(2^t)^2}\right)$$
$$\le E[Y \mid F = 1]\cdot Pr(F = 1) + \varepsilon O'\left(2 c_d + \sum_{t=1}^{\infty} 2^{t+1} c_d\left(\frac{d}{(2^t - 1)^2} + \frac{1}{(2^t)^2}\right)\right)$$
$$= E[Y \mid F = 1]\cdot Pr(F = 1) + \varepsilon O' c'_d,$$
Let W = (1 − ε)Y when F = 1, and W = 0 otherwise (recall that F = 1 if the solution is nearly feasible in any dimension, and |D| ≤ k). We show that W is a lower bound for the value of the solution for LP0 in line (2d). (In case this line is not reached by the algorithm, we take as solution for LP0 the vector y¯ = (0, 0, . . . 0) and consider its value as zero.) If F = 0 then W = 0, and the claim holds. If F = 1 then the following hold: (a) line (2d) is reached, since |D| ≤ k, and (b) the vector {(1 − ε)Y j } is a feasible solution for LP0 of value (1 − ε)Y . By Lemma 20, we get that E[W ] = (1 − ε)E[Y |F = 1]P r(F = 1) ≥ (1 − ε) · (1 − c 00d ε) · αf · O 0 . Next, we lower bound the expected value of the solution output by the algorithm. Let Q be the weight of the solution output in line (2e). In case this line is not reached by the algorithm, we take Q = 0. Lemma 21 Assuming that ` ≥ d/ε, the expected value of Q satisfies E[Q] ≥ (1 − ε) · (1 − c00d ε) · αf · O 0 . Proof: By the above discussion, the value of the solution of LP 0 in line (2d) is at least W . Recall that the total weight of T is O − O 0 , and let wmax be the maximal weight of any element aj ∈ A0 then, by the definition of A0 , wmax ≤ (O − O 0 )/`. Since in any basic solution for LP0 there at most d fractional values, we get that Q ≥ O − O 0 + W − dwmax . Thus, Q ≥ (O − O 0 )(1 − d` ) + W , and d E[Q] ≥ (O − O 0 )(1 − ) + E[W ] ` d ≥ (O − O 0 )(1 − ) + (1 − ε) · (1 − c00d ε) · αf · O 0 ` ≥ (1 − ε) · (1 − c00d ε) · αf · O Since the expected value of the solution returned by the algorithm is at least the expected value of the solution in any iteration, we summarize in the next result. Theorem 22 For any fixed d and εˆ > 0, by properly setting the values of ε and `, the algorithm achieves approximation ratio of (1 − εˆ)α f and runs in polynomial time. We note that when f = 1 we get an instance of d-dimensional class-constrained knapsack (d-CCK). For the case where d = 1 an FPTAS was given in [20]. It is also known that d-CCK is strongly NP-hard. By Theorem 22, we get the following extension for the result of [20].
21
Corollary 23 There is a randomized polynomial time approximation scheme for d-CCK, for any fixed d ≥ 1.
5 Approximation Algorithms for RSAP

5.1 A Greedy Algorithm

Consider a greedy algorithm for RSAP which packs the bins sequentially, optimizing separately over each bin. Formally, given N > 1 bins, in iteration ℓ, 1 ≤ ℓ ≤ N, use an r-approximation algorithm for packing a subset of the not-yet-packed items of maximum profit into the ℓ-th bin. We distinguish between instances in which the bins are identical or arbitrary.

Lemma 24 Let r ≤ 1 be the approximation ratio for a single bin; then the approximation ratio of the greedy algorithm for RSAP on N identical bins is 1 − e^{-r}.

Proof: Let OPT be the value of an optimal solution. Let Y_ℓ be the subset of elements packed by Greedy in the ℓ-th bin. For a set of elements Z, let P(Z) denote the total profit of Z, and let $\Delta_\ell = OPT - \sum_{i=1}^{\ell} P(Y_i)$. The elements of OPT that are not packed in the first i − 1 bins by Greedy are available when the i-th bin is packed. Therefore, by the pigeonhole principle, and the fact that we are using an r-approximation algorithm for a single bin, $P(Y_i) \ge \frac{r\Delta_{i-1}}{N}$. By definition, $\Delta_\ell = \Delta_{\ell-1} - P(Y_\ell)$. Thus,

$$\Delta_\ell \le \Delta_{\ell-1} - \frac{r}{N}\Delta_{\ell-1} = \Delta_{\ell-1}\left(1 - \frac{r}{N}\right).$$

We get that

$$\Delta_N \le \left(1 - \frac{r}{N}\right)^N \cdot OPT \le e^{-r} \cdot OPT.$$

The profit of Greedy equals $\sum_{i=1}^{N} P(Y_i) = OPT - \Delta_N \ge (1 - e^{-r})OPT$.
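A minimal sketch (ours) of this greedy scheme, assuming a black-box single-bin r-approximation pack_bin(available_items, bin) that returns the packed subset:

```python
def greedy_rsap(items, bins, pack_bin):
    """Pack bins sequentially; each bin is filled by a single-bin
    r-approximation applied to the items not packed so far."""
    remaining = set(items)
    packing = []
    for b in bins:
        packed = set(pack_bin(remaining, b))
        packing.append(packed)
        remaining -= packed
    return packing
```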
For PCS we can use for each bin our algorithm for MCMP, for which r = (1 − ε)(1 − e^{-1}). Thus, we have

Theorem 25 The greedy algorithm achieves a ratio of $1 - e^{-(1-\varepsilon)(1-e^{-1})} \approx 0.468 - O(\varepsilon)$ for PCS with identical bins.

We note that the best approximation ratio attained by the algorithms of [9] for PCS is (1 − e^{-1})² − ε ≈ 0.4 − ε. More generally, when all bins are identical, the above ratio of 1 − e^{-r} for RSAP is always greater than the approximation ratios of (1 − e^{-1})r and r/(r + 1) − ε attained by the algorithms of [9].

For instances with arbitrary bins, the greedy algorithm fills the bins sequentially (in any order), using for each bin an r-approximation algorithm.

Lemma 26 The greedy algorithm achieves a ratio of r/(r + 1) for RSAP with arbitrary bins.

Proof: Let r ∈ (0, 1] be the approximation ratio for a single bin. Let X_j denote the set of items that some fixed optimal solution assigns to the j-th bin and that is not packed by Greedy at all. Also, let Y_j denote the items that Greedy packs in the j-th bin. Then P(Y_j) ≥ r·P(X_j), since X_j was available to be packed when Greedy considered bin j. Thus, we get that $\sum_{j=1}^{N} P(Y_j) \ge \sum_{j=1}^{N} r P(X_j)$. If $\sum_{j=1}^{N} P(X_j) \ge OPT/(r + 1)$ then Greedy has packed a total of at least $\frac{r}{r+1} OPT$; otherwise, by the definition of X_j, Greedy must have packed the other $(1 - \frac{1}{r+1})$-fraction of the profit, i.e., at least $\frac{r}{r+1} OPT$.

As before, for PCS we apply Greedy with the algorithm for MCMP for each bin. Thus, we have

Theorem 27 The greedy algorithm achieves a ratio of $\frac{(1-\varepsilon)(1-e^{-1})}{1+(1-\varepsilon)(1-e^{-1})} \approx 0.387 - O(\varepsilon)$ for PCS with arbitrary bins.

The above approximation ratio is slightly lower than the ratio of ≈ 0.4 achieved by the algorithms in [9], but the algorithm is much simpler.
5.2
A PTAS for PCS with Fixed Number of Colors
In this section we consider instances of PCS in a single dimension (i.e., d = 1), where the number of colors m > 1 is some fixed constant. Assume first that all bins have the same capacity B and the same number of compartments k, for some B, k > 1. The term ‘guess’ is used to describe a selection that is performed in polynomial time, via enumeration or by using binary search. The scheme proceeds in the following steps.

Preprocessing. Guess OPT(I), the optimal profit from packing the instance, within factor 1 − ε. This can be done in O(log(1/ε)) steps, by first running a constant-factor approximation algorithm for PCS (as given in Section 5.1). Having guessed a value P satisfying (1 − ε)·OPT(I) ≤ P ≤ OPT(I), discard all the items j ∈ I with w_j < εP/n; then, divide all the profits by εP/n and round each down to the nearest power of (1 + ε). The resulting instance, I′, has h ≤ ⌈log_{1+ε}(n/ε)⌉ = O(ln n / ε) distinct profits. We further partition the items into groups, such that the items in each group have the same set of potential colors and the same value (to within a factor of 1 + ε). Since the number of color sets is at most 2^m, which is a constant, we get that the number of groups is H = O(2^m · ln n / ε) = O(ln n / ε). Denote the groups by I_1, . . . , I_H.

Guessing items. Guess the contribution of I_h, 1 ≤ h ≤ H, to the total profit. More specifically, for any 1 ≤ h ≤ H, we guess the value k_h, 1 ≤ k_h ≤ H/ε²; k_h is the contribution of I_h to the overall profit in some optimal packing, as a multiple of ε²P/H. For a subset of items F whose packing gives the profit P, denote by F_h ⊆ I_h the subset of items of F in the group I_h; then, we guess the value of k_h for which

k_h · ε²P/H ≤ P(F_h) ≤ (k_h + 1) · ε²P/H,

where P(F_h) = Σ_{j∈F_h} w_j is the total profit of F_h. We seek a vector (k_1, . . . , k_H) satisfying Σ_{h=1}^{H} k_h ≤ H/ε². The next lemma shows that (k_1, . . . , k_H) can be guessed in polynomial time.
Lemma 28 The number of vectors (k_1, . . . , k_H) is bounded by n^{O((2^m/ε)·ln(1/ε))}.

Proof: Note that the total number of vectors (k_1, . . . , k_H) is the number of H-tuples of non-negative integers whose coordinates sum to H/ε², given by C(H + H/ε² − 1, H − 1), where H ≤ 2^m · ln n / ε. Since, for any n ≥ ℓ ≥ 0, C(n, ℓ) ≤ (en/ℓ)^ℓ, we get that

C(H + H/ε² − 1, H − 1) ≤ (3e/ε²)^H = e^{H·ln(3e/ε²)} = n^{O((2^m/ε)·ln(1/ε))}.
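As a small illustration of the counting step in the proof (with toy parameters chosen here purely for illustration), the stars-and-bars formula can be checked by brute-force enumeration:

```python
from itertools import product
from math import comb

H, T = 3, 4   # toy stand-ins for H and H/eps^2

# number of H-tuples of non-negative integers summing to exactly T
brute = sum(1 for v in product(range(T + 1), repeat=H) if sum(v) == T)
formula = comb(T + H - 1, H - 1)
print(brute, formula)   # both print 15
```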
As shown in [5, 20], the above preprocessing step, as well as the way we guess the contribution of each group I_h, 1 ≤ h ≤ H, may cause a loss of at most εP in the overall profit of the solution.
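The following is a minimal sketch of the preprocessing and grouping step described above, assuming the items are given as (profit, color-set) pairs and that a value P with (1 − ε)·OPT(I) ≤ P ≤ OPT(I) has already been guessed; all names are illustrative.

```python
import math
from collections import defaultdict

def preprocess_and_group(items, P, eps):
    """items: list of (w_j, frozenset of potential colors).

    Discards items with w_j < eps*P/n, scales the remaining profits by
    n/(eps*P), rounds each down to a power of (1+eps), and groups items
    sharing the same (color set, rounded profit class).  Sketch only.
    """
    n = len(items)
    threshold = eps * P / n
    groups = defaultdict(list)
    for w, colors in items:
        if w < threshold:
            continue                               # discarded: total loss <= eps*P
        scaled = w / threshold                     # scaled profit, at least 1
        profit_class = math.floor(math.log(scaled, 1 + eps))
        groups[(colors, profit_class)].append((w, colors))
    return groups                                  # the groups I_1, ..., I_H
```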
Given the contribution of each group I_h in some optimal packing, we select the smallest items in I_h that achieve this profit. The colors assigned to these elements are not yet known. Next, we reduce the number of distinct item sizes in I_1, . . . , I_H to O(1/ε²). This can be done using the shifting technique (see, e.g., [23]). Generally, the items are sorted in non-decreasing order of size; then, the ordered list is partitioned into at most 1/ε² subsets, each containing Q = ⌈nε²⌉ items. The size of each item is rounded up to the size of the largest item in its subset. This yields an instance in which the number of distinct item sizes is n/Q ≤ 1/ε². For each color set, we distinguish between the items selected for packing in this set by values and sizes. Thus, the items selected for packing are partitioned into subsets, such that the items in each subset have the same size, value, and set of colors. Since the number of item sizes is O(1/ε²), the packed items are partitioned into R = O(H/ε²) subsets. Denote these subsets by U_1, . . . , U_R. Finally, we guess how many bins are assigned each subset of colors, out of the (at most 2^m) possible color subsets.

Reduction to CCMK. For each subset of items U_r, 1 ≤ r ≤ R, guess its contribution to the overall profit of a given color i, 1 ≤ i ≤ m. This can be done in polynomial time, by guessing an integer vector (x_1, . . . , x_R) such that the contribution of U_r is in [x_r · ε²P/R, (x_r + 1) · ε²P/R].
The number of such vectors is bounded by n^{O((2^m/ε²)·ln(1/ε))}. (The proof is similar to the proof of Lemma 28.) This gives the number of elements that need to be packed in color i, out of a subset of elements having the same value, size, and color set (that contains i). Since each of the packed items is assigned a single color, we now get an instance of CCMK. Hence, the scheme proceeds using the packing step of the PTAS in [20]. As shown in [20], when each bin ℓ has an arbitrary capacity B_ℓ and any number of compartments 1 ≤ k_ℓ ≤ m, we define O(log n) bin types and guess, for each subset of colors, the number of bins of each size that are assigned this subset. This can be done in polynomial time. Thus, our scheme can be applied also to instances of PCS with arbitrary bins.

Acknowledgments. We thank Seffi Naor for many helpful discussions. We also thank Chandra Chekuri and Uri Feige for insightful comments and suggestions.
References

[1] A. Ageev and M. Sviridenko. Pipage rounding: A new method of constructing algorithms with proven performance guarantee. J. Combinatorial Optimization, 8(3):307–328, 2004.
[2] D. Amzallag, M. Livschitz, J. Naor, and D. Raz. Cell planning of 4G cellular networks: Algorithmic techniques and results. In Proceedings of the 6th IEE International Conference on 3G & Beyond (3G '2005), pages 501–506, 2005.
[3] N. Andelman and Y. Mansour. Auctions with budget constraints. In Proc. of SWAT, pages 26–38, 2004.
[4] G. Calinescu, C. Chekuri, M. Pál, and J. Vondrák. Maximizing a submodular set function subject to a matroid constraint. In IPCO, pages 182–196, 2007.
[5] C. Chekuri and S. Khanna. A PTAS for the multiple knapsack problem. SIAM Journal on Computing, 35(3):713–728, 2006.
[6] C. Chekuri and A. Kumar. Maximum coverage problem with group budget constraints and applications. In APPROX-RANDOM, pages 72–83, 2004.
[7] U. Feige. A threshold of ln n for approximating set cover. Journal of the ACM, 45(4):634–652, 1998.
[8] U. Feige, V. S. Mirrokni, and J. Vondrák. Maximizing non-monotone submodular functions. In FOCS, 2007.
[9] L. Fleischer, M. X. Goemans, V. S. Mirrokni, and M. Sviridenko. Tight approximation algorithms for maximum general assignment problems. In SODA '06: Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 611–620, New York, NY, USA, 2006. ACM.
[10] A. M. Frieze and M. Clarke. Approximation algorithms for the m-dimensional 0-1 knapsack problem: worst-case and probabilistic analyses. European J. of Operational Research, 15(1):100–109, 1984.
[11] T. Fujito. Approximation algorithms for submodular set cover with applications. IEICE Trans. Inf. and Systems, E83-D(3), 2000.
[12] R. Garg, V. Kumar, and V. Pandit. Approximation algorithms for budget-constrained auctions. In Proc. of APPROX, 2001.
[13] H. Kellerer, U. Pferschy, and D. Pisinger. Knapsack Problems. Springer, 2004.
[14] S. Khuller, A. Moss, and J. Naor. The budgeted maximum coverage problem. Inf. Process. Letters, 70(1):39–45, 1999.
[15] J. Lee, V. S. Mirrokni, V. Nagarajan, and M. Sviridenko. Non-monotone submodular maximization under matroid and knapsack constraints. In STOC, 2009.
[16] S. Martello and P. Toth. Algorithms for knapsack problems. Annals of Discrete Math., 31:213–258, 1987.
[17] G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of approximations for maximizing submodular set functions. Mathematical Programming, 14:265–294, 1978.
[18] A. Schrijver. Combinatorial Optimization: Polyhedra and Efficiency. Springer-Verlag, Berlin Heidelberg, 2003.
[19] H. Shachnai and T. Tamir. On two class-constrained versions of the multiple knapsack problem. Algorithmica, 29(3):442–467, 2001.
[20] H. Shachnai and T. Tamir. Polynomial time approximation schemes for class-constrained packing problems. J. of Scheduling, 4(6):313–338, 2001.
[21] A. Srinivasan. Distributions on level-sets with applications to approximation algorithms. In 44th Symp. on Foundations of Computer Science (FOCS), pages 588–597, 2001.
[22] M. Sviridenko. A note on maximizing a submodular set function subject to a knapsack constraint. Operations Research Letters, 32:41–43, 2004.
[23] V. Vazirani. Approximation Algorithms. Springer-Verlag, 2001.
[24] J. Vondrák. Optimal approximation for the submodular welfare problem in the value oracle model. In STOC, pages 67–74, 2008.
A Applications of Maximum Coverage with Packing Constraints
The problem of maximum coverage with (multiple) packing constraints captures many real-life scenarios. We describe a few of them below.

Data Placement in Video on Demand (VoD) Systems: A VoD system services n clients. Each client specifies a list of preferred movies, out of which the system can select to show any movie. For each of the m movies offered to the clients, let b_i denote the size (storage requirement) of movie i, and let S_i denote the set of clients that include movie i in their list. Each client j also specifies the amount w_j she is willing to pay for the service and the bandwidth s_j required in order to receive the movie stream. The system has a limited storage capacity, L, and a limited bandwidth, B. In other words, it can store movies of total size at most L and transmit movie streams of total bandwidth at most B. The goal is to select a subset of movies to be stored, such that the total revenue from servicing clients is maximized. Thus, we get an instance of MCMP. In a more general setting, when the system consists of a set of servers, where each server ℓ has storage capacity L_ℓ and available bandwidth B_ℓ, we get an instance of PCS.

Production Planning and Worker Assignment: In many manufacturing systems, some tasks can be performed by several machines/workers. The input consists of a set of tasks, each associated with a profit w_j that is gained if the task is completed, and the amounts s_{j,1}, . . . , s_{j,d} of d resource requirements (for example, amounts of physical materials), for some d ≥ 1. Also given is a set of workers, with a payment (salary) b_i associated with hiring worker i, a budget L for the project, and a vector B̄ of the amounts of available resources. Each worker i is qualified to perform a subset of the tasks. The goal is to hire a team of workers to perform a most profitable subset of the tasks. The workers need to be qualified to handle their assigned tasks, their total salary should not exceed L, and at most B̄ of the resources may be used. This yields an instance of MCMP.

Antenna Positioning in Wireless Communication: A provider of wireless communication needs to position antennas in the network. For each spot i, the cost of setting up an antenna at i is b_i. For each client j, the revenue from servicing j is w_j; the expected volume of communication generated by client j is s_j. Given a budget L and a bound B on the total volume of communication, the goal is to determine the number of antennas to be used and their locations in the network, such that the total revenue is maximized. When the antennas need to be positioned in a wide-area network consisting of N > 1 local-area networks, such that network ℓ, 1 ≤ ℓ ≤ N, is assigned the budget L_ℓ and can tolerate communication volume of at most B_ℓ, we get an instance of PCS.

Other applications of our problems include budget-constrained procurement auctions [12, 3] and frequency assignment in cellular networks (see, e.g., [2]).
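To make the VoD mapping above concrete, the following minimal sketch shows one way such an instance could be represented in code; the type and field names are illustrative assumptions, not part of the formal model.

```python
from dataclasses import dataclass
from typing import List, Set

@dataclass
class VoDInstance:
    covers: List[Set[int]]     # S_i: clients that listed movie i
    storage: List[float]       # b_i: storage requirement of movie i
    pay: List[float]           # w_j: payment offered by client j
    bandwidth: List[float]     # s_j: bandwidth required by client j
    L: float                   # total storage capacity
    B: float                   # total bandwidth

# Selecting movies of total size at most L covers some clients; a covered
# client j contributes w_j to the revenue and s_j to the bandwidth budget B.
example = VoDInstance(
    covers=[{0, 1}, {1, 2}],
    storage=[4.0, 3.0],
    pay=[5.0, 2.0, 7.0],
    bandwidth=[1.0, 1.0, 2.0],
    L=6.0,
    B=3.0,
)
```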