Improved Approximations for Non-monotone Submodular Maximization with Knapsack Constraints

Ariel Kulik∗

Hadas Shachnai†

Abstract

Submodular maximization generalizes many fundamental problems in discrete optimization, including Max-Cut in directed/undirected graphs, maximum coverage, maximum facility location and marketing over social networks. In this paper we consider the problem of maximizing any submodular function subject to d knapsack constraints, where d is a fixed constant. We establish a strong relation between the discrete problem and its continuous relaxation, obtained through extension by expectation of the submodular function. Formally, we show that, for any non-negative submodular function, an α-approximation algorithm for the continuous relaxation implies a randomized (α − ε)-approximation algorithm for the discrete problem. We use this relation to improve the best known approximation ratio for the problem to 1/4 − ε, for any ε > 0. We further show that the probabilistic domain defined by a continuous solution can be reduced to yield a polynomial size domain, given an oracle for the extension by expectation. This leads to a deterministic version of our technique. Our approach has the potential for wider applicability, which we demonstrate on the examples of the Generalized Assignment Problem and Maximum Coverage Problems with additional knapsack constraints.



∗Computer Science Department, Technion, Haifa 32000, Israel. E-mail: [email protected]
†Computer Science Department, Technion, Haifa 32000, Israel. E-mail: [email protected]. Work partially supported by the Technion V.P.R. Fund, by Smoler Research Fund, and by the Ministry of Trade and Industry MAGNET program through the NEGEV Consortium (www.negev-initiative.org).

1 Introduction

A real-valued function f, whose domain is all the subsets of a universe U, is called submodular if, for any S, T ⊆ U, f(S) + f(T) ≥ f(S ∪ T) + f(S ∩ T). The concept of submodularity, which can be viewed as a discrete analog of convexity, plays a central role in combinatorial theorems and algorithms (see, e.g., [10] and the references therein, and the comprehensive surveys in [8, 20, 15]).

Submodular maximization generalizes many fundamental problems in discrete optimization, including Max-Cut in directed/undirected graphs, maximum coverage, maximum facility location and marketing over social networks (see, e.g., [11]). In many settings, including set covering or matroid optimization, the underlying submodular functions are monotone, meaning that f(S) ≤ f(T) whenever S ⊆ T. Here, we are more interested in the general case where f(S) is not necessarily monotone. A classic example of such a submodular function is f(S) = Σ_{e∈δ(S)} w(e), where δ(S) is the cut in a graph (or hypergraph) induced by a set of vertices S, and w(e) is the weight of an edge e.

In this paper we consider the following problem of maximizing a non-negative submodular set function subject to d knapsack constraints (SUB). Given a d-dimensional budget vector L̄, for some d ≥ 1, and an oracle for a non-negative submodular set function f over a universe U, where each element i ∈ U is associated with a d-dimensional cost vector c̄(i), we seek a subset of elements S ⊆ U whose total cost is at most L̄, such that f(S) is maximized.

There has been extensive work on maximizing submodular monotone functions subject to a matroid constraint.¹ For the special case of a uniform matroid, i.e., the problem {max f(S) : |S| ≤ k}, for some k > 1, Nemhauser et al. showed in [17] that a greedy algorithm yields a ratio of 1 − e^{−1} to the optimum. Later works presented greedy algorithms that achieve this ratio for other special matroids or for certain submodular monotone functions (see, e.g., [1, 12, 19, 4]). For a general matroid constraint, Calinescu et al. showed in [3] that a scheme based on solving a continuous relaxation of the problem followed by pipage rounding (a technique introduced by Ageev and Sviridenko [1]) achieves the ratio of 1 − e^{−1} for maximizing submodular monotone functions that can be expressed as a sum of weighted rank functions of matroids. Subsequently, this result was extended by Vondrák [20] to general monotone submodular functions. In [14] we developed a framework for maximizing monotone submodular functions subject to d knapsack constraints that yields a (1 − e^{−1} − ε)-approximation to the optimum for any fixed ε > 0, where d ≥ 1 is some constant. The bound of 1 − e^{−1} is the best possible for all of the above problems. This follows from the lower bound of Nemhauser and Wolsey [16] in the oracle model, and the later result of Feige [6] for the specific case of maximum coverage, under the assumption that P ≠ NP. Recently, Demaine and Zadimoghaddam [5] considered bi-criteria approximations for monotone submodular set function optimization.

The problem of maximizing a non-monotone submodular function has been studied as well. Feige et al. [8] considered the (unconstrained) maximization of a general non-monotone submodular function. The paper gives several (randomized and deterministic) approximation algorithms, as well as hardness results, also for the special case where the function is symmetric. Lee et al. [15] studied the problem of maximizing a general submodular function under linear and matroid constraints. They proposed algorithms that achieve an approximation ratio of 1/5 − ε for the problem with d linear constraints, and a ratio of 1/(d + 2 + 1/d + ε) for d matroid constraints, for any fixed integer d ≥ 1.

¹A (weighted) matroid is a system of ‘independent subsets’ of a universe, which satisfies certain hereditary and exchange properties [18].


Several fundamental algorithms for submodular maximization (see, e.g., [1, 3, 20, 15]) use a continuous extension of the submodular function, to which we refer as extension by expectation. Given a submodular function f : 2^U → R, we define F : [0, 1]^U → R as follows. For any ȳ ∈ [0, 1]^U, let R ⊆ U be a random variable such that i ∈ R with probability y_i (we say that R ∼ ȳ); then define F(ȳ) = E[f(R)]. The general framework of these algorithms is to first obtain a fractional solution for the continuous extension, followed by rounding, which yields a solution for the discrete problem.

Using the definition of F, we define the continuous relaxation of our problem, called continuous SUB. Let P = {ȳ ∈ [0, 1]^U | Σ_{i∈U} y_i·c̄(i) ≤ L̄} be the polytope of the instance; then the problem is to find ȳ ∈ P for which F(ȳ) is maximized. For α ∈ (0, 1], an algorithm A yields an α-approximation for the continuous problem with respect to a submodular function f if, for any assignment of non-negative costs to the elements, and for any non-negative budget, A finds a feasible solution for the continuous problem of value at least αO, where O is the value of the optimal discrete solution for the given costs and budget. For some specific families of submodular functions, linear programming can be used to derive such approximation algorithms (see, e.g., [1, 3]). For monotone submodular functions, Vondrák presented in [20] a (1 − e^{−1} − o(1))-approximation algorithm for the continuous problem, which was later used as a main building block in the (1 − e^{−1} − ε)-approximation algorithm of [14]. Subsequently, Lee et al. [15] considered the problem of maximizing any submodular function with multiple knapsack constraints and developed a (1/4 − o(1))-approximation algorithm for the continuous problem; however, noting that the rounding method of [14], which proved useful for monotone functions, cannot be applied in the non-monotone case, a (1/5 − ε)-approximation was obtained for the discrete problem by using simple randomized rounding.

Our results In this paper we establish a strong relation between the problem of maximizing any submodular function subject to d knapsack constraints and its continuous relaxation. Formally, we show (in Theorem 2.5) that for any non-negative submodular function, an α-approximation algorithm for the continuous relaxation implies a randomized (α − ε)-approximation algorithm for the discrete problem. We use this relation to obtain an approximation ratio of 1/4 − ε for SUB, for any ε > 0, thus improving the best known result for the problem, due to Lee et al. [15]. An important consequence of the above relation is that, for any class of submodular functions, a future improvement of the approximation ratio for the continuous problem to a factor of α immediately implies an approximation ratio of (α − ε) for the original instance.

The technique that we use (as given in Section 2) generalizes (and simplifies) the framework developed in [14]. In particular, our results hold for any non-negative submodular function. Moreover, the relation between the approximation ratio for the continuous relaxation and the ratio for the discrete version for specific submodular functions (as given in Theorem 2.5) cannot be deduced using the framework of [14].² Our technique applies random sampling on the solution space, using a distribution defined by the fractional solution for the problem.
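As a concrete illustration of the extension by expectation, the following sketch estimates F(ȳ) = E[f(R)] by sampling R ∼ ȳ. It is only an illustration of the definition, not part of the algorithms analyzed in this paper; the oracle `f`, the dictionary representation of ȳ, and the sample count are assumptions made for the example.

```python
import random

def estimate_F(f, y, num_samples=1000):
    """Monte Carlo estimate of the extension by expectation F(y) = E[f(R)],
    where R ~ y contains each element i independently with probability y[i].

    f: oracle for the set function, mapping a frozenset of elements to a number.
    y: dict mapping each element of the universe to a probability in [0, 1].
    """
    total = 0.0
    for _ in range(num_samples):
        R = frozenset(i for i, p in y.items() if random.random() < p)
        total += f(R)
    return total / num_samples
```

For instance, taking f to be the weighted cut function of a small graph gives a direct estimate of the relaxation value at any fractional point ȳ.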
In Section 2.5 we show how to convert a feasible solution for the continuous problem to another feasible solution with up to O(log |U|) fractional entries, given an oracle for the extension by expectation. This facilitates the use of exhaustive search instead of sampling, which leads to a deterministic version of our technique. Specifically, we obtain a deterministic (1/4 − ε)-approximation for general instances and a (1 − e^{−1} − ε)-approximation for instances where the submodular function is monotone.

²For example, for maximum coverage with multiple knapsack constraints, a (1 − (1 − 1/r)^r − ε)-approximation follows from our technique for any instance in which the maximum frequency of an element is r ≥ 1, while the framework of [14] yields the ratio 1 − e^{−1} − ε.


Our approach has the potential for wider applicability, which we demonstrate on the examples of the Generalized Assignment Problem and Maximum Coverage Problems with additional knapsack constraints (given in Section 3 and Appendix D, respectively). Due to space constraints, some of the proofs are relegated to Appendix C.

2 Maximizing Submodular Functions

2.1 Preliminaries

Notation An essential component in our framework is the distinction between elements by their costs. We say that an element i is small if c̄(i) ≤ ε³·L̄; otherwise, the element is big. Given a universe U, we call a subset of elements S ⊆ U feasible if the total cost of the elements in S is bounded by L̄. We say that S is ε-nearly feasible (or nearly feasible, if ε is known from the context) if the total cost of the elements in S is bounded by (1 + ε)L̄. We refer to f(S) as the value of S. Similar to the discrete case, ȳ ∈ [0, 1]^U is feasible if ȳ ∈ P. For any subset T ⊆ U, we define f_T : 2^U → R+ by f_T(S) = f(S ∪ T) − f(T). It is easy to verify that if f is a submodular set function then f_T is also a submodular set function. Finally, for any set S ⊆ U, we define c̄(S) = Σ_{i∈S} c̄(i) and c_r(S) = Σ_{i∈S} c_r(i). For a fractional solution ȳ ∈ [0, 1]^U, we define c_r(ȳ) = Σ_{i∈U} c_r(i)·y_i and c̄(ȳ) = Σ_{i∈U} c̄(i)·y_i.

Overview Our algorithm consists of two main phases, to which we refer as the rounding procedure and profit enumeration. The rounding procedure yields an (α − O(ε))-approximation for instances in which there are no big elements, using an α-approximate solution for the continuous problem. It relies heavily on Theorem 2.1, which gives conditions on the probabilistic domain of solutions that guarantee that the expected profit of the resulting nearly feasible solution is high. This solution is then converted to a feasible one by using a fixing procedure. We first present a randomized version and later show how to derandomize the rounding procedure. The profit enumeration phase uses enumeration over the most profitable elements in an optimal solution to reduce a general instance to another instance with no big elements, on which we apply the rounding procedure. Finally, we combine the above results with an algorithm for the continuous problem (e.g., the algorithm of [20] or [15]) to obtain an approximation algorithm for SUB.
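To make the notation above concrete, here is a minimal sketch of the feasibility and small/big checks, under the assumption that costs are stored as d-dimensional tuples indexed by element; the names are illustrative only.

```python
def total_cost(S, cost, d):
    """Componentwise sum of the d-dimensional cost vectors of the elements in S."""
    return tuple(sum(cost[i][r] for i in S) for r in range(d))

def is_feasible(S, cost, L):
    return all(cr <= Lr for cr, Lr in zip(total_cost(S, cost, len(L)), L))

def is_nearly_feasible(S, cost, L, eps):
    return all(cr <= (1 + eps) * Lr for cr, Lr in zip(total_cost(S, cost, len(L)), L))

def is_small(i, cost, L, eps):
    """An element i is small if its cost is at most eps^3 * L in every dimension."""
    return all(cr <= eps ** 3 * Lr for cr, Lr in zip(cost[i], L))
```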

2.2 A Probabilistic Theorem

We first prove a general probabilistic theorem that refers to a slight generalization of our problem (called generalized SUB). In addition to the standard input for the problem, there is also a collection of subsets M ⊆ 2^U, such that if T ∈ M and S ⊆ T then S ∈ M. The goal is to find a subset S ∈ M such that c̄(S) ≤ L̄ and f(S) is maximized.

Theorem 2.1 For a given input of generalized SUB, let χ be a distribution over M and D a random variable D ∼ χ, such that

1. E[f(D)] ≥ O/5, where O is the value of an optimal solution for the given instance.

2. For 1 ≤ r ≤ d, E[c_r(D)] ≤ L_r.

3. For 1 ≤ r ≤ d, c_r(D) = Σ_{k=1}^{m} c_r(D_k), where D_k ∼ χ_k and D_1, ..., D_m are independent random variables.

4. For any 1 ≤ k ≤ m and 1 ≤ r ≤ d, it holds that either c_r(D_k) ≤ ε³L_r or c_r(D_k) is fixed.

Let D′ = D if D is ε-nearly feasible, and D′ = ∅ otherwise. Then D′ is always ε-nearly feasible, D′ ∈ M, and E[f(D′)] ≥ (1 − O(ε))E[f(D)].

Proof: We use in the proof the next claims, whose proofs are given in the appendix. Define an indicator random variable F such that F = 1 if D is ε-nearly feasible, and F = 0 otherwise.

Claim 2.1 Pr[F = 0] ≤ dε.

For any dimension 1 ≤ r ≤ d, let R_r = c_r(D)/L_r, and define R = max_r R_r; then R denotes the maximal relative deviation of the cost from the r-th entry in the budget vector, where the maximum is taken over 1 ≤ r ≤ d.

Claim 2.2 For any ℓ > 1, Pr[R > ℓ] < dε³/(ℓ − 1)².

Claim 2.3 For any integer ℓ > 1, if R ≤ ℓ then f(D) ≤ 2dℓ·O.

Combining the above results we have

Claim 2.4 E[f(D′)] ≥ (1 − O(ε))E[f(D)].

By definition, D′ is always ε-nearly feasible, and D′ ∈ M. This completes the proof. □

2.3 Rounding Instances with No Big Elements

A main advantage of inputs with no big elements is that any nearly feasible solution can be easily converted to a feasible one with only a slight decrease in the total value.

Lemma 2.2 If S ⊆ U is an ε-nearly feasible solution with no big elements, then S can be converted in polynomial time to a feasible solution S′ ⊆ S, such that f(S′) ≥ (1 − O(ε))f(S).

Proof: In fixing the solution S we handle each dimension separately. For any dimension 1 ≤ r ≤ d, if c_r(S) ≤ L_r then no modification is needed; otherwise, c_r(S) > L_r. Since all elements in S are small, we can partition S into ℓ disjoint subsets S_1, S_2, ..., S_ℓ such that εL_r ≤ c_r(S_j) < (ε + ε³)L_r for any 1 ≤ j ≤ ℓ, where ℓ = Ω(ε^{−1}). Since f is submodular, by Lemma A.3 we have f(S) ≥ Σ_{j=1}^{ℓ} f_{S\S_j}(S_j). Hence, there is some 1 ≤ j ≤ ℓ such that f_{S\S_j}(S_j) ≤ f(S)/ℓ = f(S)·O(ε) (note that f_{S\S_j}(S_j) can have a negative value). Now, c_r(S \ S_j) ≤ L_r, and f(S \ S_j) ≥ (1 − O(ε))f(S). We repeat this step in each dimension to obtain a feasible set S′ with f(S′) ≥ (1 − O(ε))f(S). □

Combined with Theorem 2.1, we have the following rounding algorithm.

Randomized rounding algorithm for SUB with no big elements
Input: A SUB instance and a feasible solution x̄ for the continuous problem with F(x̄) ≥ O/5.
1. Define a random set D ∼ x̄. Set D′ = D if D is ε-nearly feasible, and D′ = ∅ otherwise.
2. Convert D′ to a feasible set D″ as in Lemma 2.2 and return D″.

Clearly, the algorithm returns a feasible solution for the problem. By Theorem 2.1, E[f(D′)] ≥ (1 − O(ε))F(x̄). By Lemma 2.2, E[f(D″)] ≥ (1 − O(ε))F(x̄). Hence, we have

Lemma 2.3 For any instance of SUB with no big elements, any feasible solution x̄ for the continuous problem with F(x̄) ≥ O/5 can be converted in polynomial time to a feasible solution for SUB with expected profit at least (1 − O(ε))·F(x̄).
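The rounding and fixing steps can be sketched as follows. The greedy grouping below is a simplified stand-in for the partition argument in the proof of Lemma 2.2, so this is an illustration under that assumption rather than a verbatim implementation; all names are placeholders.

```python
import random

def round_no_big_elements(f, x, cost, L, eps):
    """Sample D ~ x, keep it only if it is eps-nearly feasible, then drop groups of
    small elements dimension by dimension until the budget L is met (cf. Lemma 2.2)."""
    D = {i for i, p in x.items() if random.random() < p}
    d = len(L)
    # Keep D only if it is eps-nearly feasible; otherwise return the empty set.
    for r in range(d):
        if sum(cost[i][r] for i in D) > (1 + eps) * L[r]:
            return set()
    S = set(D)
    for r in range(d):
        while sum(cost[i][r] for i in S) > L[r]:
            # Greedily collect a group of total r-cost about eps * L[r] whose
            # elements have the smallest marginal contribution to f, and drop it.
            group, acc = [], 0.0
            for i in sorted(S, key=lambda i: f(S) - f(S - {i})):
                group.append(i)
                acc += cost[i][r]
                if acc >= eps * L[r]:
                    break
            S -= set(group)
    return S
```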

2.4 Approximation Algorithm for SUB

Given an instance of SUB and a subset T ⊆ U, define another instance of SUB, to which we refer as the residual problem with respect to T, with f remaining the objective function. Let L̄′ = L̄ − c̄(T) be the new budget, and let the universe U′ consist of all elements i ∈ U \ T such that c̄(i) ≤ ε³L̄′, and all elements in T (formally, U′ = T ∪ {i ∈ U \ T | c̄(i) ≤ ε³L̄′}). The new cost of element i is c̄′(i) = c̄(i) for any i ∈ U′ \ T, and c̄′(i) = 0 for any i ∈ T. It follows that there are no big elements in the residual problem. Let S be a feasible solution for the residual problem with respect to T. Then c̄(S) ≤ c̄′(S) + c̄(T) ≤ L̄′ + c̄(T) = L̄, which means that any feasible solution for the residual problem is also feasible for the original problem. Consider the following algorithm.

Approximation algorithm for SUB
Input: A SUB instance and an α-approximation algorithm A for the continuous problem with respect to the function f.
1. For any T ⊆ U such that |T| ≤ h = ⌈d·ε^{−4}⌉:
(a) Use A to obtain an α-approximate solution x̄ for the continuous residual problem with respect to T.
(b) Use the rounding algorithm of Section 2.3 to convert x̄ to a feasible solution S for the residual problem (note that the residual problem has no big elements).
2. Return the best solution found.

Lemma 2.4 The above approximation algorithm returns an (α − O(ε))-approximate solution for SUB and uses a polynomial number of invocations of algorithm A.

Proof: By Lemma 2.3, in each iteration the algorithm finds a feasible solution S for the residual problem. Hence, the algorithm always returns a feasible solution for the given SUB instance. Let O = {i_1, ..., i_k} be an optimal solution for the input I. For ℓ ≥ 1, let K_ℓ = {i_1, ..., i_ℓ}, and assume that the elements are ordered by their residual profits, i.e., i_ℓ = argmax_{i∈O\K_{ℓ−1}} f_{K_{ℓ−1}}({i}). Consider the iteration in which T = K_h, and define O′ = O ∩ U′. The set O′ is clearly a feasible solution for the residual problem with respect to T. We show a lower bound for f(O′). The set R = O \ O′ consists of elements in O \ T that are big with respect to the residual instance. The total cost of the elements in R is bounded by L̄′ (since O is a feasible solution), and thus |R| ≤ ε^{−3}·d. Since T = K_h, for any j ∈ O \ T it holds that f_T({j}) ≤ f(T)/|T|, and we get f_T(R) ≤ Σ_{j∈R} f_T({j}) ≤ ε^{−3}·d·f(T)/|T| = εf(T) ≤ εO. Thus, f_{O′}(R) ≤ f_T(R) ≤ εO. Since f(O) = f(O′) + f_{O′}(R) ≤ f(O′) + εf(O), we have that f(O′) ≥ (1 − ε)f(O). Thus, in this iteration we get a solution x̄ for the residual problem with F(x̄) ≥ α(1 − ε)f(O), and the solution S obtained after the rounding has expected value at least (1 − O(ε))αf(O). □
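A sketch of the enumeration phase wrapping the rounding procedure. Here `solve_continuous_residual` stands in for the α-approximation algorithm A for the continuous problem and `round_no_big` for the rounding algorithm of Section 2.3; both, as well as the data layout, are assumptions made for the sketch.

```python
import math
from itertools import combinations

def approximate_SUB(U, f, cost, L, eps, solve_continuous_residual, round_no_big):
    """Enumerate candidate sets T of at most h elements, build the residual
    instance (big elements removed, costs of T zeroed), solve its continuous
    relaxation and round it; return the best set found.  U is a set of elements."""
    d = len(L)
    h = math.ceil(d * eps ** -4)
    best, best_val = set(), float("-inf")
    for size in range(h + 1):
        for T in combinations(U, size):
            T = set(T)
            L_res = tuple(Lr - sum(cost[i][r] for i in T) for r, Lr in enumerate(L))
            if any(lr < 0 for lr in L_res):
                continue
            U_res = T | {i for i in U - T
                         if all(cost[i][r] <= eps ** 3 * L_res[r] for r in range(d))}
            cost_res = {i: ((0,) * d if i in T else cost[i]) for i in U_res}
            x = solve_continuous_residual(U_res, f, cost_res, L_res)
            S = round_no_big(f, x, cost_res, L_res, eps)
            if f(S) > best_val:
                best, best_val = S, f(S)
    return best
```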

We summarize in the next result.

Theorem 2.5 Let f be a submodular function, and suppose there is a polynomial time α-approximation algorithm for the continuous problem with respect to f. Then there is a polynomial time randomized (α − ε)-approximation algorithm for SUB with respect to f, for any ε > 0.

Since there is a (1/4 − o(1))-approximation algorithm for the continuous problem on general instances [15], we have

Theorem 2.6 There is a polynomial time randomized (1/4 − ε)-approximation algorithm for SUB, for any ε > 0.

2.5 Derandomization

In this section we show how the algorithm of Section 2.3 can be derandomized, assuming we have an oracle for F, the extension by expectation of f. The main idea is to reduce the number of fractional entries in the fractional solution x̄, so that the number of values a random set D ∼ x̄ can take is polynomial in the input size (for a fixed value of ε). Then, we go over all the possible values, and we are guaranteed to obtain a solution of high value.

A key ingredient in our derandomization is the pipage rounding technique of Ageev and Sviridenko [1]. We give below a brief overview of the technique. For any element i ∈ U, define the unit vector ī ∈ {0, 1}^U, in which ī_j = 0 for any j ≠ i, and ī_i = 1. Given a fractional solution x̄ for the problem and two elements i, j such that x_i and x_j are both fractional, consider the vector function x̄_{i,j}(δ) = x̄ + δī − δj̄. Let δ⁺_{x̄,i,j} and δ⁻_{x̄,i,j} (for short, δ⁺ and δ⁻) be the maximal and minimal values of δ for which x̄_{i,j}(δ) ∈ [0, 1]^U. In both x̄_{i,j}(δ⁺) and x̄_{i,j}(δ⁻), the entry of either i or j is integral. Define F^{x̄}_{i,j}(δ) = F(x̄_{i,j}(δ)) over the domain [δ⁻, δ⁺]. The function F^{x̄}_{i,j} is convex (see [2] for a detailed proof); thus x̄′ = argmax_{{x̄_{i,j}(δ⁺), x̄_{i,j}(δ⁻)}} F(x̄) has fewer fractional entries than x̄, and F(x̄′) ≥ F(x̄). By an appropriate selection of i, j, such that x̄′ maintains feasibility (in some sense), we can repeat the above step to gradually decrease the number of fractional entries. We use the technique to prove the next result.

Lemma 2.7 Let x̄ ∈ [0, 1]^U be a solution having k or fewer fractional entries (i.e., |{i | 0 < x_i < 1}| ≤ k), and c̄(x̄) ≤ L̄ for some L̄. Then x̄ can be converted, in time polynomial in k, to a vector x̄′ with at most k′ = (8 ln(2k)/ε)^d fractional entries, such that c̄(x̄′) ≤ (1 + ε)L̄ and F(x̄′) ≥ F(x̄).

Proof Sketch: Let U′ = {i | 0 < x_i < 1} be the set of all fractional entries. We define a new cost function c̄′ over the elements in U:

c′_r(i) = c_r(i) if i ∉ U′;  c′_r(i) = 0 if c_r(i) ≤ εL_r/(2k);  c′_r(i) = (εL_r/(2k))·(1 + ε/2)^j if (εL_r/(2k))·(1 + ε/2)^j ≤ c_r(i) < (εL_r/(2k))·(1 + ε/2)^{j+1}.

The number of distinct values c′_r(i) can take for i ∈ U′ is bounded by 8 ln(2k)/ε. Hence, the number of distinct values c̄′(i) can take for i ∈ U′ is bounded by k′ = (8 ln(2k)/ε)^d. Now, we can use pipage rounding, i.e., choose in each step two elements i, j such that c̄′(i) = c̄′(j). We repeat this step until there are no two elements with fractional entries and the same cost. This gives a new vector x̄′ with up to k′ fractional entries, satisfying F(x̄′) ≥ F(x̄). Since we use this step only for elements with the same cost, we are guaranteed to have c̄′(x̄) = c̄′(x̄′), and by the definition of the new cost we get that c̄(x̄′) ≤ (1 + ε)L̄.³ □

Using the above lemma, we can reduce the number of fractional entries in x̄ to a number that is poly-logarithmic in k. However, the number of values D ∼ x̄ can take remains super-polynomial. To reduce further the number of fractional entries, we apply the above step twice; that is, we convert x̄ with at most |U| fractional entries to x̄′ with at most k′ = (8 ln(2|U|)/ε)^d fractional entries. We can then apply the conversion again, to obtain x̄″ with at most k″ = O(log |U|) fractional entries.

Lemma 2.8 Let x̄ ∈ [0, 1]^U be such that c̄(x̄) ≤ L̄ for some L̄, and let ε > 0 be a constant. Then x̄ can be converted, in time polynomial in |U|, to a vector x̄′ with at most k″ = O(log |U|) fractional entries, such that c̄(x̄′) ≤ (1 + ε)²L̄ and F(x̄′) ≥ F(x̄).
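A sketch of the cost-bucketing step from the proof of Lemma 2.7. It only rounds the costs of the fractional entries into the geometric buckets; the subsequent pipage steps on equal-cost pairs need the oracle for F and are left abstract. Names and data layout are assumptions.

```python
import math

def bucket_costs(x, cost, L, eps):
    """Round the cost of every fractional entry of x to 0 or to
    (eps*L_r/(2k)) * (1 + eps/2)^j, as in the proof of Lemma 2.7; entries that
    are already integral keep their original cost."""
    frac = {i for i in x if 0 < x[i] < 1}
    k = max(len(frac), 1)
    d = len(L)
    new_cost = {}
    for i in cost:
        if i not in frac:
            new_cost[i] = cost[i]
            continue
        rounded = []
        for r in range(d):
            base = eps * L[r] / (2 * k)
            c = cost[i][r]
            if c <= base:
                rounded.append(0.0)
            else:
                j = math.floor(math.log(c / base, 1 + eps / 2))
                rounded.append(base * (1 + eps / 2) ** j)
        new_cost[i] = tuple(rounded)
    return new_cost
```

After this rounding, fractional entries that share a rounded cost vector can be paired and resolved by pipage steps, which preserve c̄′ and never decrease F.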
Consider the following rounding algorithm.

Deterministic rounding algorithm for SUB with no big elements
Input: A SUB instance and a feasible solution x̄ for the continuous problem with F(x̄) ≥ O/5.

³A detailed proof is given in Appendix C.


1. Define x̄′ = (1 + ε)^{−2}·x̄ (note that F(x̄′) ≥ (1 + ε)^{−2}·F(x̄)).
2. Convert x̄′ to x̄″ such that x̄″ is fractionally feasible, the number of fractional entries in x̄″ is O(log |U|), and F(x̄″) ≥ F(x̄′) ≥ (1 + ε)^{−2}F(x̄), as in Lemma 2.8.
3. Enumerate over all possible values D of D ∼ x̄″. For each such value, if D is ε-nearly feasible, convert it to a feasible solution D′ (see Lemma 2.2). Return the solution with maximum value among the feasible solutions found.

By Theorem 2.1, the algorithm returns a feasible solution of value at least (1 − O(ε))F(x̄). Also, the running time of the algorithm is polynomial when ε is a fixed constant. Replacing the randomized rounding in the algorithm of Section 2.4 with the above, we get the following result.

Theorem 2.9 Let f be a submodular function, and assume we have an oracle for F. If there is a deterministic polynomial time α-approximation algorithm for the continuous problem with respect to f, then there is a polynomial time deterministic (α − ε)-approximation algorithm for SUB with respect to f, for any ε > 0.

Since, given an oracle for F, both the algorithms of [20] and [15] for the continuous problem are deterministic, we get the following.

Theorem 2.10 Given an oracle for F, there is a polynomial time deterministic (1 − e^{−1} − ε)-approximation algorithm for SUB with a monotone function, for any ε > 0.

Theorem 2.11 Given an oracle for F, there is a polynomial time deterministic (1/4 − ε)-approximation algorithm for SUB, for any ε > 0.

3 The Budgeted Generalized Assignment Problem

In this section we apply our technique to solving a budgeted variant of the well-known generalized assignment problem (GAP). We start with a few definitions. An instance of the separable assignment problem (SAP) consists of n items A = {a_1, ..., a_n} and m bins. Each bin i has an associated collection of feasible sets I_i which is down-closed (S ∈ I_i implies S′ ∈ I_i for any S′ ⊆ S). Also, a profit p_{i,j} ≥ 0 is gained from assigning the item a_j to bin i. The goal is to choose disjoint feasible sets S_i ∈ I_i so as to maximize Σ_{i=1}^{m} Σ_{a_j∈S_i} p_{i,j}. The set of inputs for GAP is the restricted class of inputs for SAP in which the sets I_i are defined by a knapsack constraint. The best approximation for GAP is (1 − e^{−1} + ε), for some ε > 0, due to [7]. Our algorithm uses the (1 − e^{−1})-approximation for the problem given in [9].

We consider a budgeted variant of GAP (BGAP), where each item a_j has a d_1-dimensional cost vector c̄_{i,j} ≥ 0, incurred when a_j is assigned to bin i. Also given is a global d_1-dimensional budget vector L̄. The objective is to find a maximal profit solution whose total cost is at most L̄. A ((1 − e^{−1})/(2 − e^{−1}) − ε)-approximation was given in [13] for the case where d_1 = 1, as an example of the usage of a Lagrangian relaxation technique.

We give below an algorithm for a slightly more general budgeted linear constrained separable assignment problem (BSAP). The difference between BGAP and BSAP is that in the latter the set of feasible assignments for each bin is defined by d_2 knapsack constraints, rather than a single constraint. Formally, given are n items A = {a_1, ..., a_n} and m bins, such that bin i has a d_2-dimensional capacity b̄_i. Each item a_j has a d_2-dimensional size s̄_{i,j} ≥ 0, a d_1-dimensional cost vector c̄_{i,j} ≥ 0, and a profit p_{i,j} ≥ 0 gained when a_j is assigned to bin i. Also given is a global d_1-dimensional budget vector L̄.

We say that a subset of items S_i ⊆ A is a feasible assignment for bin i if Σ_{a_j∈S_i} s̄_{i,j} ≤ b̄_i. We also define the cost and profit of S_i as an assignment to bin i by c̄(i, S_i) = Σ_{a_j∈S_i} c̄_{i,j} and p(i, S_i) = Σ_{a_j∈S_i} p_{i,j}, respectively. A solution for the problem is a tuple of m disjoint subsets of items S = (S_1, ..., S_m), such that each S_i is a feasible assignment for bin i. We define the cost of S by c̄(S) = Σ_{i=1}^{m} c̄(i, S_i) = Σ_{i=1}^{m} Σ_{a_j∈S_i} c̄_{i,j}, and its profit by p(S) = Σ_{i=1}^{m} p(i, S_i) = Σ_{i=1}^{m} Σ_{a_j∈S_i} p_{i,j}. We say that a solution S is feasible if c̄(S) ≤ L̄ (its total cost is bounded by L̄). The problem is to find a feasible solution of maximal profit.

As before, we say that a solution S is ε-nearly feasible if c̄(S) ≤ (1 + ε)L̄. An item a_j is small if, for any bin 1 ≤ i ≤ m, it holds that c̄_{i,j} ≤ ε³·L̄; otherwise, a_j is big. Also, an assignment S_i ⊆ A of items to bin i is small if c̄(i, S_i) ≤ ε³L̄. Our algorithm uses two special cases of BSAP. The first is small items BSAP, in which all items are small; the second is small assignments BSAP in which, for any bin i and a feasible assignment S_i, it holds that S_i is a small assignment for bin i.

Overview Our algorithm proceeds in four stages. The first stage obtains an ε-nearly feasible solution with high profit for small assignments instances of BSAP, by using Theorem 2.1. The second stage shows how enumeration can be used to reduce a small items instance of BSAP to a small assignments instance, so that the algorithm of the first stage can be applied. The main idea is to guess the most profitable bins in some optimal solution and the approximate cost of those bins in this solution. This guess is used to eliminate all big assignments, by forcing d_1 additional linear constraints for each bin. The result of this stage is a 2ε-nearly feasible solution of high profit. The third stage handles small items instances. The fact that all items are small is essential for converting the nearly feasible solution of the previous stage into a feasible solution. The fourth and last stage gives a reduction from general instances to small items instances. This is done by a simple enumeration, as in Section 2.3. Due to space constraints, we describe the fourth stage in Appendix B.

3.1 Small Assignments Instances of BSAP

We solve BSAP instances with small assignments by casting BSAP as a submodular maximization problem. We note that the submodular interpretations of GAP and SAP are well known (see, e.g., [7] and [9]). Let I_i be the set of feasible assignments for bin i, and let U = {(i, S_i) | S_i ∈ I_i}. Define f_j : 2^U → R+ for any a_j ∈ A by f_j(V) = max{p_{i,j} | (i, S_i) ∈ V, a_j ∈ S_i}, where the maximum over an empty set is defined to be zero, and f : 2^U → R+ by f(V) = Σ_{a_j∈A} f_j(V). It is easy to verify that f is a non-decreasing submodular function. Also, we define the (d_1-dimensional) cost of an element (i, S_i) ∈ U by c̄((i, S_i)) = c̄(i, S_i), and, as in Section 2, the cost of a subset V ∈ 2^U is c̄(V) = Σ_{(i,S_i)∈V} c̄(i, S_i). Finally, we define a set M of all subsets in which there is no more than a single assignment set for the same bin. We can now consider the following instance of generalized SUB (as defined in Section 2.2): maximize f(V) subject to the constraints V ∈ M and c̄(V) ≤ L̄. We note that any set V ∈ M can be converted to a solution S for BSAP with c̄(S) ≤ c̄(V) and p(S) = f(V). Similarly, every solution S for BSAP can be converted to a set V ∈ M such that c̄(S) = c̄(V) and p(S) = f(V).

Now, we use a technique of [9] to obtain a fractional solution for the submodular problem. Consider the linear program LP-BSAP below, over the variables X_i^S, for all 1 ≤ i ≤ m and S ∈ I_i. We note that the optimal value of LP-BSAP is at least p(O), where O is an optimal solution for the BSAP instance. Since we have d_2 linear constraints over the bins, where d_2 ≥ 1 is some constant, a feasible solution of value (1 − ε) times the optimal solution can be found in polynomial time (see [9] for more details). Let X̄ be such a solution; then the value of the solution X̄ is at least (1 − ε)·p(O).


(LP-BSAP)   maximize   Σ_{i=1}^{m} Σ_{S∈I_i} X_i^S · p(i, S)

subject to:
   ∀ a_j ∈ A:      Σ_{i=1}^{m} Σ_{S∈I_i : a_j∈S} X_i^S ≤ 1
   ∀ 1 ≤ i ≤ m:    Σ_{S∈I_i} X_i^S = 1
   ∀ 1 ≤ r ≤ d:    Σ_{i=1}^{m} Σ_{S∈I_i} X_i^S · c_r(i, S) ≤ L_r

Let D_i be a random variable over I_i with Pr[D_i = S] = X_i^S. Define a random set D = {(1, D_1), (2, D_2), ..., (m, D_m)}, and let χ be the distribution of D. In [9] it was shown that E[f(D)] is at least (1 − e^{−1}) times the value of the solution X̄, which means that E[f(D)] ≥ (1 − e^{−1})·(1 − ε)·p(O). It is easy to verify that the conditions of Theorem 2.1 hold for the distribution χ. We define D′ = D if D is ε-nearly feasible and D′ = ∅ otherwise. Then, by Theorem 2.1, D′ is always ε-nearly feasible and E[f(D′)] ≥ (1 − e^{−1})(1 − O(ε))p(O). As before, D′ can be converted to a solution S for the BSAP instance, such that p(S) = f(D′) and c̄(S) ≤ c̄(D′). We summarize the above steps in the following algorithm.

Approximation algorithm for Small Assignments BSAP Instances
1. Find a (1 − ε)-approximate solution X̄ for LP-BSAP.
2. For any bin 1 ≤ i ≤ m, select an assignment D_i = S_i with probability X_i^{S_i}, and define D = {(1, D_1), ..., (m, D_m)}.
3. If D is ε-nearly feasible, return D as the solution; else return an empty assignment.

Lemma 3.1 The approximation algorithm for small assignments BSAP instances returns in polynomial time an ε-nearly feasible solution with expected profit at least (1 − e^{−1})(1 − O(ε))p(O), where O is an optimal solution.
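A sketch of the per-bin sampling in step 2, assuming the LP solution is given as, for each bin, a list of (assignment, probability) pairs summing to one; all names and the data layout are placeholders.

```python
import random

def round_lp_bsap(lp_solution, cost, L, eps):
    """Pick one assignment per bin according to the marginals X_i^S, and keep the
    outcome only if it is eps-nearly feasible with respect to the budget L.
    lp_solution: dict bin -> list of (assignment, probability);
    cost: dict (bin, assignment) -> d1-dimensional cost tuple."""
    chosen = {}
    for i, dist in lp_solution.items():
        u, acc = random.random(), 0.0
        for assignment, prob in dist:
            acc += prob
            if u <= acc:
                chosen[i] = assignment
                break
        chosen.setdefault(i, dist[-1][0])
    total = [sum(cost[i, S][r] for i, S in chosen.items()) for r in range(len(L))]
    if all(t <= (1 + eps) * Lr for t, Lr in zip(total, L)):
        return chosen
    return {i: frozenset() for i in lp_solution}  # empty assignment otherwise
```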

3.2 Reduction into Small Assignments BSAP Instances

Let O = (S_1, ..., S_m) be an optimal solution for an instance of BSAP. We say that the profit of bin i (with respect to O) is p^O(i) = p(i, S_i), and the cost of bin i is c̄^O(i) = c̄(i, S_i). The first step in our algorithm is to guess the set T of the h = d_1·ε^{−4} most profitable bins in the solution O. We then guess, for any bin i ∈ T, the cost of bin i in each dimension, with accuracy δ = ε·h^{−1}. That is, for each bin i ∈ T we guess a d_1-dimensional vector of integers k̄_i = (k_{i,1}, ..., k_{i,d_1}) such that, for any 1 ≤ r ≤ d_1,

k_{i,r}·δ·L_r ≤ c_r^O(i) ≤ (k_{i,r} + 1)·δ·L_r.    (1)

As the number of values k_{i,r} can take is at most ⌈δ^{−1}⌉, we can go over all possible cost vectors in polynomial time, for any fixed ε > 0.

We use our guess for the set T and the vectors k̄_i to define a residual instance of BSAP. The ground set of elements is A, and there are m bins. We define the budget of the residual instance, L̄′, by L′_r = L_r·(1 − Σ_{i∈T} k_{i,r}δ) for 1 ≤ r ≤ d_1. For any i ∉ T, the feasible set of assignments for bin i is I′_i = {S | S ∈ I_i, c̄(i, S) ≤ ε³L̄′}, and for any i ∈ T the feasible set is I′_i = {S | S ∈ I_i, c_r(i, S) ≤ (k_{i,r} + 1)·δ·L_r for all 1 ≤ r ≤ d_1}. In both cases, the set of feasible assignments for bin i is defined by d_1 + d_2 linear constraints. The new cost of a_j when assigned to bin i is c̄′_{i,j} = c̄_{i,j} if i ∉ T and c̄′_{i,j} = 0 if i ∈ T. The new profit of a_j when assigned to bin i is p′_{i,j} = p_{i,j}. A crucial property of the residual instance is that it is a small assignments instance. The relation between the solutions of the residual and original problems is stated in the next two lemmas.

Lemma 3.2 Let S = (S_1, ..., S_m) be an ε-nearly feasible solution for the residual problem with respect to a set T (of size at most h) and a collection of vectors k̄_i. Then S is a (2ε)-nearly feasible solution for the original problem with p(S) = p′(S).

Lemma 3.3 Let T be the set of the h most profitable bins in an optimal solution O, and let k̄_i be the set of cost guesses for which (1) holds. Then there is a feasible solution S′ for the residual problem with respect to T and k̄_i of profit p′(S′) ≥ (1 − ε)p(O).

Proof Sketch: Simply use all the assignments in O which remain feasible with respect to the residual problem. The number of discarded assignments is bounded by d_1·ε^{−3}, and the profit of each of them is bounded by p(O)/|T|. Based on these two claims, the profit bound can be attained.⁴ □

Now, consider the following algorithm.

Nearly Feasible Algorithm for BSAP
1. Enumerate over all subsets T, |T| ≤ h = d_1·ε^{−4}, and cost vectors k̄_i for any i ∈ T.
(a) Define a residual problem with respect to T and k̄_i, and run the algorithm for small assignments BSAP on the residual instance. Consider the resulting solution as a solution for the original problem.
2. Return the best solution found.

Theorem 3.4 Let O be an optimal solution for a BSAP instance. Then the Nearly Feasible Algorithm for BSAP returns, in polynomial time, a (2ε)-nearly feasible solution S with expected profit E[p(S)] ≥ (1 − e^{−1})(1 − O(ε))p(O).
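A sketch of the guessing in step 1 of the Nearly Feasible Algorithm: it enumerates the candidate sets T of profitable bins together with all integer cost-guess vectors k̄_i. The residual-instance construction and the small-assignments solver are left abstract, and the representation is an assumption.

```python
from itertools import combinations, product

def bsap_guesses(num_bins, d1, eps):
    """Yield every pair (T, k) where T is a set of at most h bins and k maps each
    bin of T to a d1-dimensional integer vector with entries in {0, ..., max_k}."""
    h = int(d1 * eps ** -4)
    delta = eps / h
    max_k = int(1 / delta) + 1
    for size in range(h + 1):
        for T in combinations(range(num_bins), size):
            for vectors in product(product(range(max_k + 1), repeat=d1), repeat=size):
                yield set(T), dict(zip(T, vectors))
```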

3.3 Small Items BSAP Instances

In this section we consider inputs in which all items are small. This allows us to fix a nearly feasible solution, similarly to Lemma 2.2.

Lemma 3.5 Let S = (S_1, ..., S_m) be an ε-nearly feasible solution for a small items instance of BSAP. Then S can be converted to a feasible solution S′ such that p(S′) ≥ (1 − O(ε))p(S).

Algorithm for Small Items BSAP
1. Run the Nearly Feasible Algorithm for BSAP, and let S be the resulting solution.
2. Convert S to a feasible solution S′ as in Lemma 3.5, and return S′.

The properties of the algorithm follow immediately from Theorem 3.4 and Lemma 3.5.

Lemma 3.6 The Algorithm for Small Items BSAP returns a feasible solution for the problem with expected profit at least (1 − e^{−1})(1 − O(ε))p(O), where O is an optimal solution.

Since any general instance of the problem can be reduced to a small items instance (see Appendix B), we have

Theorem 3.7 There is a polynomial time randomized (1 − e^{−1} − ε)-approximation algorithm for BSAP, for any fixed ε > 0.

⁴A detailed proof is given in Appendix C.


References

[1] A. Ageev and M. Sviridenko. Pipage rounding: A new method of constructing algorithms with proven performance guarantee. J. Combinatorial Optimization, 8(3):307–328, 2004.
[2] G. Calinescu, C. Chekuri, M. Pál, and J. Vondrák. Maximizing a submodular set function subject to a matroid constraint. SIAM Journal on Computing, to appear.
[3] G. Calinescu, C. Chekuri, M. Pál, and J. Vondrák. Maximizing a submodular set function subject to a matroid constraint. In IPCO, pages 182–196, 2007.
[4] C. Chekuri and A. Kumar. Maximum coverage problem with group budget constraints and applications. In APPROX-RANDOM, pages 72–83, 2004.
[5] E. D. Demaine and M. Zadimoghaddam. Scheduling to minimize power consumption using submodular functions. In SPAA, pages 21–29, 2010.
[6] U. Feige. A threshold of ln n for approximating set cover. J. of ACM, 45(4):634–652, 1998.
[7] U. Feige and J. Vondrák. Approximation algorithms for allocation problems: Improving the factor of 1 − 1/e. In FOCS, pages 667–676. IEEE Computer Society, 2006.
[8] U. Feige, V. S. Mirrokni, and J. Vondrák. Maximizing non-monotone submodular functions. In FOCS, 2007.
[9] L. Fleischer, M. X. Goemans, V. S. Mirrokni, and M. Sviridenko. Tight approximation algorithms for maximum general assignment problems. In SODA '06: Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm, pages 611–620, New York, NY, USA, 2006. ACM.
[10] T. Fujito. Approximation algorithms for submodular set cover with applications. IEICE Trans. Inf. and Systems, E83-D(3), 2000.
[11] J. D. Hartline, V. S. Mirrokni, and M. Sundararajan. Optimal marketing strategies over social networks. In WWW, pages 189–198, 2008.
[12] S. Khuller, A. Moss, and J. Naor. The budgeted maximum coverage problem. Inf. Process. Letters, 70(1):39–45, 1999.
[13] A. Kulik and H. Shachnai. On Lagrangian relaxation and subset selection problems. In WAOA, pages 160–173, 2008.
[14] A. Kulik, H. Shachnai, and T. Tamir. Maximizing submodular set functions subject to multiple linear constraints. In SODA, pages 545–554, 2009.
[15] J. Lee, V. S. Mirrokni, V. Nagarajan, and M. Sviridenko. Non-monotone submodular maximization under matroid and knapsack constraints. In STOC, 2009.
[16] G. Nemhauser and L. Wolsey. Best algorithms for approximating the maximum of a submodular set function. Mathematics of Operations Research, 3(3):177–188, 1978.
[17] G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of the approximations for maximizing submodular set functions. Mathematical Programming, 14:265–294, 1978.
[18] A. Schrijver. Combinatorial Optimization – Polyhedra and Efficiency. Springer-Verlag, Berlin Heidelberg, 2003.
[19] M. Sviridenko. A note on maximizing a submodular set function subject to a knapsack constraint. Operations Research Letters, 32:41–43, 2004.
[20] J. Vondrák. Optimal approximation for the submodular welfare problem in the value oracle model. In STOC, pages 67–74, 2008.

A Basic Properties of Submodular Functions

In this section we give a few simple properties of submodular functions. Recall that f : 2^U → R is a submodular function if f(S) + f(T) ≥ f(S ∪ T) + f(T ∩ S) for any S, T ⊆ U, and we define f_T(S) = f(S ∪ T) − f(T).

Lemma A.1 Let f : 2^U → R be a submodular function with f(∅) ≥ 0, and let S = S_1 ∪ S_2 ∪ ... ∪ S_k, where the S_i are disjoint sets. Then f(S) ≤ f(S_1) + f(S_2) + ... + f(S_k).

Proof: By induction on k. For k = 2, since f is a submodular function, f(S_1) + f(S_2) ≥ f(S_1 ∪ S_2) + f(S_1 ∩ S_2) = f(S) + f(∅), and since f(∅) ≥ 0, we have that f(S) ≤ f(S_1) + f(S_2). For k > 2, by using the induction hypothesis twice, we have f(S) ≤ f(S_1) + f(S_2) + ... + f(S_{k−2}) + f(S_{k−1} ∪ S_k) ≤ f(S_1) + f(S_2) + ... + f(S_k). □

Lemma A.2 Let f : 2^U → R+ be a submodular function, and let S, T_1, T_2 ⊆ U be such that T_1 ⊆ T_2 and S ∩ T_2 = ∅. Then f_{T_2}(S) ≤ f_{T_1}(S).

Proof: Since f is submodular, f(S ∪ T_1) + f(T_2) ≥ f(S ∪ T_1 ∪ T_2) + f((S ∪ T_1) ∩ T_2) = f(S ∪ T_2) + f(T_1). Hence, f_{T_2}(S) ≤ f_{T_1}(S). □



Lemma A.3 Let f : 2^U → R+ be a submodular function, and let S = S_1 ∪ S_2 ∪ ... ∪ S_k, where the S_i are disjoint sets. Then

f(S) ≥ Σ_{i=1}^{k} f_{S\S_i}(S_i).

Proof: We note that

f(S) = Σ_{i=1}^{k} f_{S_1∪...∪S_{i−1}}(S_i).

By Lemma A.2 we have that, for each i, f_{S_1∪...∪S_{i−1}}(S_i) ≥ f_{S\S_i}(S_i). Hence we get

f(S) ≥ Σ_{i=1}^{k} f_{S\S_i}(S_i),

as desired. □

B General Inputs for BSAP

In this section we use the algorithm for small items BSAP to obtain a (1 − e^{−1} − ε)-approximation for arbitrary inputs. Given an input for BSAP, let O = (S_1, ..., S_m) be an optimal solution.

Let T* = (T*_1, ..., T*_m) be a feasible assignment of elements to bins. We define a residual instance with respect to T*. The new budget is L̄* = L̄ − c̄(T*), and the new set of elements A* is the collection of all elements in A which were not assigned in T* to any bin. Consider the assignment of a_j to bin i. If c̄_{i,j} ≤ ε³L̄*, then p*_{i,j} = p_{i,j} and c̄*_{i,j} = c̄_{i,j}; otherwise, set p*_{i,j} = 0 and c̄*_{i,j} = 0. The size remains s̄*_{i,j} = s̄_{i,j}. The new capacity of bin i is b̄*_i = b̄_i − Σ_{a_j∈T*_i} s̄_{i,j}.

The residual instance with respect to T* is a small items instance. Any feasible assignment for the residual problem (with respect to T*) of profit v can be converted to a feasible assignment for the original problem of profit p(T*) + v. Now, take T* to be the assignment of the h = d_1·ε^{−4} most profitable elements in an optimal solution O (the profit of a_j in O is p_{i,j} if it was assigned to bin i, and zero otherwise). Then there is a solution to the residual problem of value p(O) − (1 + ε)p(T*) (the proof is similar to the proof of Lemma 3.3). Thus, if we guessed T* correctly, the algorithm for small items BSAP can be used to obtain a solution with expected profit at least (1 − e^{−1} − O(ε))(p(O) − (1 + ε)p(T*)) for the residual problem, which we can convert to a solution for the original instance of profit at least (1 − e^{−1} − O(ε))p(O). We get the following algorithm.

Approximation algorithm for BSAP
1. For any feasible assignment T* of at most h = d_1·ε^{−4} elements:
(a) Find a (1 − e^{−1} − O(ε))-approximate solution S* for the residual problem with respect to T*.
(b) Derive from S* a solution for the original problem of profit p(S*) + p(T*).
2. Return the solution of maximal profit found.

Theorem B.1 There is a polynomial time randomized (1 − e^{−1} − ε)-approximation algorithm for BSAP, for any fixed ε > 0.
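A sketch of the residual-instance construction above: the budget shrinks by the cost of T*, bin capacities shrink by the sizes already packed, and any item-bin pair that would be big in the residual instance has its profit and cost zeroed. The dictionary-based representation is an assumption made for the example.

```python
def residual_bsap(items, bins, profit, cost, size, capacity, L, T_star, eps):
    """Build the residual instance with respect to a partial assignment T_star
    (dict: bin -> set of items); profit/cost/size are dicts keyed by (bin, item)."""
    d1 = len(L)
    assigned = set().union(*T_star.values()) if T_star else set()
    L_star = tuple(Lr - sum(cost[i, j][r] for i, S in T_star.items() for j in S)
                   for r, Lr in enumerate(L))
    new_profit, new_cost = {}, {}
    for i in bins:
        for j in items:
            if j in assigned:
                continue
            big = any(cost[i, j][r] > eps ** 3 * L_star[r] for r in range(d1))
            new_profit[i, j] = 0 if big else profit[i, j]
            new_cost[i, j] = (0,) * d1 if big else cost[i, j]
    new_capacity = {i: tuple(b - sum(size[i, j][t] for j in T_star.get(i, set()))
                             for t, b in enumerate(capacity[i]))
                    for i in bins}
    return new_profit, new_cost, new_capacity, L_star
```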

C Some Proofs

Proof of Claim 2.1: For any dimension 1 ≤ r ≤ d, it holds that E[c_r(D)] = Σ_{k=1}^{m} E[c_r(D_k)] ≤ L_r. Define V_r = {k | c_r(D_k) is not fixed}. Then,

Var[c_r(D)] = Σ_{k=1}^{m} Var[c_r(D_k)] ≤ Σ_{k∈V_r} E[c_r²(D_k)] ≤ Σ_{k∈V_r} E[c_r(D_k)]·ε³L_r ≤ ε³L_r·Σ_{k=1}^{m} E[c_r(D_k)] ≤ ε³L_r².

The first inequality holds since Var[X] ≤ E[X²], and the second inequality follows from the fact that c_r(D_k) ≤ ε³L_r for k ∈ V_r. Recall that, by the Chebyshev-Cantelli inequality, for any t > 0 and a random variable Z,

Pr[Z − E[Z] ≥ t] ≤ Var[Z] / (Var[Z] + t²).

Thus,

Pr[c_r(D) ≥ (1 + ε)L_r] = Pr[c_r(D) − E[c_r(D)] ≥ (1 + ε)L_r − E[c_r(D)]] ≤ Pr[c_r(D) − E[c_r(D)] ≥ ε·L_r] ≤ ε³L_r² / (ε²L_r²) = ε.

By the union bound, we have that

Pr[F = 0] ≤ Σ_{r=1}^{d} Pr[c_r(D) ≥ (1 + ε)L_r] ≤ dε. □

Proof of Claim 2.2: By the Chebyshev-Cantelli inequality we have that, for any dimension 1 ≤ r ≤ d,

Pr[R_r > ℓ] = Pr[c_r(D) > ℓ·L_r] ≤ Pr[c_r(D) − E[c_r(D)] > (ℓ − 1)L_r] ≤ ε³L_r² / ((ℓ − 1)²L_r²) ≤ ε³/(ℓ − 1)²,

and by the union bound, we get that Pr[R > ℓ] ≤ dε³/(ℓ − 1)². □

Proof of Claim 2.3: The set D can be partitioned into 2dℓ sets D_1, ..., D_{2dℓ} such that each of these sets is a feasible solution. Hence, f(D_i) ≤ O. Thus, by Lemma A.1, f(D) ≤ f(D_1) + ... + f(D_{2dℓ}) ≤ 2dℓ·O. □

Proof of Claim 2.4: By Claims 2.1 and 2.2, we have that

E[f(D)] = E[f(D) | F = 1]·Pr[F = 1] + E[f(D) | F = 0 ∧ R < 2]·Pr[F = 0 ∧ (R < 2)]
        + Σ_{ℓ=1}^{∞} E[f(D) | F = 0 ∧ (2^ℓ ≤ R ≤ 2^{ℓ+1})]·Pr[F = 0 ∧ (2^ℓ ≤ R ≤ 2^{ℓ+1})]
        ≤ E[f(D) | F = 1]·Pr[F = 1] + 4d²ε·O + d²ε³·O·Σ_{ℓ=1}^{∞} 2^{ℓ+2}/(2^{ℓ−1})².

Since the last summation is a constant, and E[f(D)] ≥ O/5, we have that

E[f(D)] ≤ E[f(D) | F = 1]·Pr[F = 1] + ε·c·E[f(D)],


where c > 0 is some constant. It follows that (1 − O(ε))E[f(D)] ≤ E[f(D) | F = 1]·Pr[F = 1]. Finally, since D′ = D if F = 1 and D′ = ∅ otherwise, we have that E[f(D′)] = E[f(D) | F = 1]·Pr[F = 1] ≥ (1 − O(ε))E[f(D)]. □

Proof of Lemma 2.7: Let U′ = {i | 0 < x_i < 1} be the set of all fractional entries. We define a new cost function c̄′ over the elements in U:

c′_r(i) = c_r(i) if i ∉ U′;  c′_r(i) = 0 if c_r(i) ≤ εL_r/(2k);  c′_r(i) = (εL_r/(2k))·(1 + ε/2)^j if (εL_r/(2k))·(1 + ε/2)^j ≤ c_r(i) < (εL_r/(2k))·(1 + ε/2)^{j+1}.

Note that for any i ∈ U′, c̄′(i) ≤ c̄(i), and c_r(i) ≤ (1 + ε/2)·c′_r(i) + εL_r/(2k) for all 1 ≤ r ≤ d. The number of different values c′_r(i) can take for i ∈ U′ is bounded by 8 ln(2k)/ε (since all elements are small, and ln(1 + x) ≥ x/2). Hence the number of different values c̄′(i) can take for i ∈ U′ is bounded by k′ = (8 ln(2k)/ε)^d.

We start with x̄′ = x̄, and while there are i, j ∈ U′ such that x′_i and x′_j are both fractional and c̄′(i) = c̄′(j), define δ⁺ = δ⁺_{x̄′,i,j} and δ⁻ = δ⁻_{x̄′,i,j}. Since i and j have the same cost (by c̄′), it holds that c̄′(x̄′_{i,j}(δ⁺)) = c̄′(x̄′_{i,j}(δ⁻)) = c̄′(x̄′). If F^{x̄′}_{i,j}(δ⁺) ≥ F(x̄′), then set x̄″ = x̄′_{i,j}(δ⁺); otherwise x̄″ = x̄′_{i,j}(δ⁻). In both cases F(x̄″) ≥ F(x̄′) and c̄′(x̄″) = c̄′(x̄′). Now, repeat the step with x̄′ = x̄″. Since in each iteration the number of fractional entries in x̄′ decreases, the process terminates (after at most k iterations) with a vector x̄′ such that F(x̄′) ≥ F(x̄), c̄′(x̄′) = c̄′(x̄) ≤ L̄, and there are no two elements i, j ∈ U′ with c̄′(i) = c̄′(j) such that x′_i and x′_j are both fractional. Also, for any i ∉ U′, the entry x′_i is integral (since x_i was integral and the entry was not modified by the process). Thus, the number of fractional entries in x̄′ is at most k′. Now, for any dimension 1 ≤ r ≤ d,

c_r(x̄′) = Σ_{i∉U′} x′_i·c_r(i) + Σ_{i∈U′} x′_i·c_r(i)
        ≤ Σ_{i∉U′} x′_i·c′_r(i) + Σ_{i∈U′} x′_i·((1 + ε/2)·c′_r(i) + εL_r/(2k))
        ≤ (1 + ε/2)·Σ_{i∈U} x′_i·c′_r(i) + εL_r/2
        = (1 + ε/2)·c′_r(x̄′) + εL_r/2 ≤ (1 + ε/2)·c_r(x̄) + εL_r/2 ≤ (1 + ε)L_r,

where we used x′_i ≤ 1, |U′| ≤ k, c̄′(x̄′) = c̄′(x̄), and c′_r(i) ≤ c_r(i). This completes the proof. □

Proof of Lemma 3.2: Clearly, S is a solution for the original problem, since for every bin i it holds that I′_i ⊆ I_i. For any bin i and a_j ∈ A it holds that p_{i,j} = p′_{i,j}, so clearly p(S) = p′(S). As for the cost, for any 1 ≤ r ≤ d_1:

c_r(S) = Σ_{i∈T} c_r(i, S_i) + Σ_{i∉T} c′_r(i, S_i)
       ≤ Σ_{i∈T} (k_{i,r} + 1)·δL_r + Σ_{i∉T} c′_r(i, S_i)
       ≤ |T|·δL_r + Σ_{i∈T} k_{i,r}·δL_r + Σ_{1≤i≤m} c′_r(i, S_i)
       ≤ h·δL_r + Σ_{i∈T} k_{i,r}·δL_r + (1 + ε)L′_r
       = ε·L_r + L_r + εL′_r ≤ (1 + 2ε)L_r. □

Proof of Lemma 3.3: Let O = (S_1, ..., S_m) be an optimal solution for the original problem. We define a solution S′ = (S′_1, ..., S′_m) for the residual problem by S′_i = S_i if S_i ∈ I′_i, and S′_i = ∅ otherwise. Clearly, S′ is a solution for the residual problem. For any dimension 1 ≤ r ≤ d_1,

L_r ≥ c_r(O) = Σ_{i∈T} c_r(i, S_i) + Σ_{i∉T} c_r(i, S_i) ≥ Σ_{i∈T} k_{i,r}·δ·L_r + Σ_{i∉T} c_r(i, S_i) = Σ_{i∉T} c_r(i, S_i) + (L_r − L′_r).

This implies that Σ_{i∉T} c_r(i, S_i) ≤ L′_r. Now, for any i ∈ T it holds that c′_r(i, S′_i) = 0, and for any i it holds that c′_r(i, S′_i) ≤ c_r(i, S_i). Thus, we conclude that

c′_r(S′) = Σ_{i∈T} c′_r(i, S′_i) + Σ_{i∉T} c′_r(i, S′_i) ≤ Σ_{i∉T} c_r(i, S_i) ≤ L′_r,

i.e., S′ is a feasible solution.

Let H be the set of bins for which S_i ≠ S′_i. Since (1) holds for the vectors k̄_i, it is clear that for any i ∈ T, S′_i = S_i, and thus H ∩ T = ∅. The total cost of the bins not in T is bounded by L̄′ (since Σ_{i∉T} c_r(i, S_i) ≤ L′_r). Hence, all of them except at most d_1·ε^{−3} satisfy c_r(i, S_i) ≤ ε³L′_r for every 1 ≤ r ≤ d_1. This means that for every i ∉ T except at most d_1·ε^{−3} it holds that S_i ∈ I′_i and S_i = S′_i. From this, we conclude that |H| ≤ d_1·ε^{−3}, and that none of the bins in T is in H. The profit of any bin not in T is smaller than p(O)/|T| (since T is the set of most profitable bins). Thus,

Σ_{i∈H} p^O(i) ≤ d_1·ε^{−3}·p(O)/|T| = p(O)·(d_1·ε^{−3}/h) = ε·p(O).

Thus,

p′(S′) = p(S′) = Σ_{i=1}^{m} p(i, S′_i) = Σ_{i=1}^{m} p(i, S_i) − Σ_{i∈H} p(i, S_i) = p(O) − p(H) ≥ (1 − ε)·p(O). □

Proof of Theorem 3.4: The properties of the algorithm for small assignments BSAP are described in Lemma 3.1. As the number of possibilities for T and the vectors k̄_i is polynomial for fixed values of d_1, d_2 and ε, and the algorithm for Small Assignments BSAP is polynomial as well, we get that the Nearly Feasible Algorithm for BSAP runs in polynomial time. Since the algorithm for Small Assignments BSAP always returns an ε-nearly feasible solution (for the residual problem), by Lemma 3.2 we get that the solutions output by the algorithm are always (2ε)-nearly feasible for the original problem. Finally, in the iteration for which the set T and the vectors k̄_i satisfy the requirements of Lemma 3.3, the optimal solution for the residual problem has profit at least (1 − ε)p(O). Thus, the expected profit of the solution returned by the small assignments algorithm in this iteration is at least (1 − e^{−1})(1 − O(ε))(1 − ε)p(O) = (1 − e^{−1})(1 − O(ε))p(O). As the expected profit of the solution returned by the algorithm is at least the expected profit in any single iteration, we get that the expected profit of the solution returned by the algorithm is at least (1 − e^{−1})(1 − O(ε))p(O). □

D Approximation Algorithm for Maximum Coverage with Multiple Packing and Cost Constraints

The problem of maximum coverage with multiple packing and cost constraints (CMPC) is the following generalization of the maximum coverage problem. Given is a collection of sets S = {S_1, ..., S_m} over a ground set A = {a_1, ..., a_n}. Each element a_j has a profit p_j ≥ 0 and a d_1-dimensional size vector s̄_j = (s_{j,1}, ..., s_{j,d_1}), such that s_{j,r} ≥ 0 for all 1 ≤ r ≤ d_1. Each set S_i has a d_2-dimensional weight vector w̄_i = (w_{i,1}, ..., w_{i,d_2}). Also given are a d_1-dimensional capacity vector B̄ = (B_1, ..., B_{d_1}) and a d_2-dimensional weight limit vector W̄ = (W_1, ..., W_{d_2}). A solution for the problem is a collection of subsets H and a subset of elements E, such that for any a_j ∈ E there is S_i ∈ H such that a_j ∈ S_i. A solution is feasible if the total weight of the subsets in H is bounded by W̄ and the total size of the elements in E is bounded by B̄. The profit of a solution (H, E) is the total profit of the elements in E. The objective of the problem is to find a feasible solution with maximal profit.

Denote by f the maximal number of sets a single element belongs to, and let α_f = 1 − (1 − 1/f)^f. Our objective is to obtain an (α_f − ε)-approximation algorithm for the problem. Note that for any f ≥ 1 it holds that α_f > 1 − e^{−1}.

For solving the problem, our algorithm uses a slightly different point of view. Given an input for the problem, we say that a pair (ȳ, x̄), where ȳ ∈ [0, 1]^{S×A} and x̄ ∈ [0, 1]^S, is a solution if, for any S_i ∈ S and a_j ∉ S_i, it holds that y_{i,j} = 0 (for short, we write y_{i,j} = y_{S_i,a_j}), and for any S_i ∈ S and a_j ∈ A it holds that y_{i,j} ≤ x_i. Intuitively, x_i is an indicator for the selection of the set S_i into the solution, and y_{i,j} is an indicator for the selection of the element a_j by the set S_i into the solution. We say that such a solution is feasible if, for any 1 ≤ r ≤ d_1, it holds that Σ_{a_j∈A} s_{j,r}·Σ_{S_i∈S} y_{i,j} ≤ B_r (the total size of the elements does not exceed the capacity), and for any 1 ≤ r ≤ d_2 it holds that Σ_{S_i∈S} x_i·w_{i,r} ≤ W_r (the total weight of the subsets does not exceed the weight limit). The value (or profit) of the solution is defined by p(ȳ, x̄) = p(ȳ) = Σ_{a_j∈A} min{1, Σ_{S_i∈S} y_{i,j}}·p_j.

By the above definition, a solution consists of fractional values. We say that a solution (x̄, ȳ) is semi-fractional (or semi-integral) if x̄ ∈ {0, 1}^S (that is, sets cannot be fractionally selected, but elements can be). Also, we say that a solution is integral if both x̄ ∈ {0, 1}^S and ȳ ∈ {0, 1}^{S×A}. Two computational problems arise from the above definitions. The first is to find a semi-fractional solution of maximal profit; we refer to this problem as the semi-fractional problem. The second is to find an integral solution of maximal profit, to which we refer as the integral problem. It is easy to see that the integral problem is equivalent to CMPC, and our objective is to find an optimal solution for the integral problem.
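To fix the data format of a fractional pair (ȳ, x̄), here is a small sketch that evaluates the profit p(ȳ) and checks feasibility; the dictionary representations are assumptions made for the example.

```python
def cmpc_profit(y, p):
    """p(y) = sum_j min(1, sum_i y[i, j]) * p[j]; y maps (set, element) pairs to [0, 1]."""
    covered = {}
    for (i, j), v in y.items():
        covered[j] = covered.get(j, 0.0) + v
    return sum(min(1.0, covered.get(j, 0.0)) * pj for j, pj in p.items())

def cmpc_feasible(y, x, s, w, B, W):
    """Check the size (packing) constraints on elements and the weight constraints on sets."""
    size_used = [sum(s[j][r] * v for (i, j), v in y.items()) for r in range(len(B))]
    weight_used = [sum(w[i][r] * xi for i, xi in x.items()) for r in range(len(W))]
    return (all(u <= b for u, b in zip(size_used, B)) and
            all(u <= c for u, c in zip(weight_used, W)))
```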

Overview To obtain an approximation algorithm for the integral problem, we first reduce it to the semi-fractional problem. That is, we show that, given an (α_f − O(ε))-approximation algorithm for the semi-fractional problem, we can derive an approximation algorithm with the same approximation ratio for the integral problem. Next, we interpret the semi-fractional problem as a submodular optimization problem with multiple linear constraints and an infinite universe. We use the framework developed in Section 2 to solve this problem. As direct enumeration over the most profitable elements in an optimal solution is impossible here, we guess which sets are the most profitable in an optimal solution. We use this guess to obtain a fractional solution (with a polynomial number of non-zero entries) such that the conditions of Theorem 2.1 hold. Together with a fixing procedure for the nearly feasible solution obtained, this leads to the desired algorithm. The process can be derandomized by using the same tools as in Section 2.5.

D.1 Reduction to the semi-fractional problem

First, we show that a semi-fractional solution for the problem can be converted to a solution with at least the same profit and at most d_1 fractional entries. Next, we show how this property enables us to enumerate over the most profitable elements in an optimal solution. Throughout this section we assume that, for some constant α ∈ (0, 1), we have an α-approximation algorithm for the semi-fractional problem.

Lemma D.1 Let (ȳ^f, x̄^f) be a feasible semi-fractional solution. Then ȳ^f can be converted in polynomial time to another feasible semi-fractional solution (ȳ, x̄^f) with at most d_1 fractional entries, such that p(ȳ) ≥ p(ȳ^f).

Proof: Let (ȳ^f, x̄^f) be a semi-fractional feasible solution. W.l.o.g., we assume that for any a_j ∈ A, Σ_{S_i∈S} y^f_{i,j} ≤ 1, and that if y^f_{i,j} ≠ 0 then for any S_{i′} ≠ S_i it holds that y^f_{i′,j} = 0. Note that any solution can easily be converted to such a solution with the same profit. If there are more than d_1 fractional entries, let s̄_{j_1}, ..., s̄_{j_k} be the size vectors of the corresponding elements, and let S_{i_1}, ..., S_{i_k} be the corresponding sets. As k > d_1, there must be a linear dependency between the vectors; w.l.o.g. we can write it as λ_1·s̄_{j_1} + ... + λ_p·s̄_{j_p} = 0 for p = d_1 + 1. We can define ȳ^f(ε) by y^f_{i_ℓ,j_ℓ}(ε) = y^f_{i_ℓ,j_ℓ} + ελ_ℓ for 1 ≤ ℓ ≤ p, and y^f_{i,j}(ε) = y^f_{i,j} for any other entry. As long as ȳ^f(ε) ∈ [0, 1]^{S×A}, ȳ^f(ε) is a semi-fractional feasible solution. Let ε⁺ and ε⁻ be the maximal and minimal values of ε for which ȳ^f(ε) ∈ [0, 1]^{S×A}. The number of fractional entries in ȳ^f(ε⁺) and in ȳ^f(ε⁻) is smaller than the number of fractional entries in ȳ^f. Also, p(ε) = p(ȳ^f(ε)) is a linear function; thus either p(ȳ^f(ε⁺)) ≥ p(ȳ^f) or p(ȳ^f(ε⁻)) ≥ p(ȳ^f). This means that we can convert ȳ^f to a feasible solution ȳ′ that has fewer fractional entries and p(ȳ′) ≥ p(ȳ^f). By repeating the above process as long as there are more than d_1 fractional entries, we can obtain a fractional solution with at most d_1 fractional entries. □
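A sketch of the reduction in the proof of Lemma D.1: repeatedly perturb d_1 + 1 fractional entries along a null-space direction of their size vectors until an entry becomes integral, keeping the size constraints unchanged and never decreasing the (linear) profit. numpy is used for the linear algebra, and the representation of the solution is an assumption made for the example.

```python
import numpy as np

def reduce_fractional_entries(y, sizes, profit, d1, tol=1e-9):
    """Reduce the number of fractional entries of y to at most d1 (cf. Lemma D.1).
    y: dict (set, element) -> value in [0, 1] (at most one set per element is nonzero);
    sizes: dict element -> numpy array of length d1; profit: dict element -> number."""
    y = dict(y)
    while True:
        frac = [key for key, v in y.items() if tol < v < 1 - tol]
        if len(frac) <= d1:
            return y
        chosen = frac[:d1 + 1]
        A = np.array([sizes[j] for (_, j) in chosen]).T   # d1 x (d1+1) matrix
        lam = np.linalg.svd(A)[2][-1]                     # nonzero vector with A @ lam ~ 0
        candidates = []
        for sign in (1.0, -1.0):
            # Largest step keeping every perturbed entry inside [0, 1].
            t = min((1 - y[k]) / (sign * l) if sign * l > 0 else y[k] / (-sign * l)
                    for k, l in zip(chosen, lam) if abs(l) > 1e-12)
            new_y = dict(y)
            for k, l in zip(chosen, lam):
                v = y[k] + sign * t * l
                new_y[k] = 0.0 if v < tol else 1.0 if v > 1 - tol else v
            gain = sum(profit[j] * new_y[i, j] for (i, j) in chosen)
            candidates.append((gain, new_y))
        # The profit is linear along the perturbation, so the better endpoint is
        # at least as profitable as the current point.
        y = max(candidates, key=lambda c: c[0])[1]
```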

We use Lemma D.1 to prove the next result.

Lemma D.2 Given an α-approximation algorithm for the semi-fractional problem, an α-approximation algorithm for the integral problem can be derived in polynomial time.

Proof: Given a collection T of pairs (a_j, S_i) of an element a_j and a set S_i such that a_j ∈ S_i, denote the collection of sets in T by T_S and the collection of elements in T by T_E. We define a residual instance for the problem as follows. The elements are A_T = {a_j ∈ A | a_j ∉ T_E and p_j ≤ p_{j'} for any a_{j'} ∈ T_E}, where the size of a_j ∈ A_T is s̄_j and the profit of a_j is p_j. The sets are S_T = {S'_1, ..., S'_m}, where S'_i = S_i ∩ A_T, and w̄'_i = w̄_i if S_i ∉ T_S and w̄'_i = 0 if S_i ∈ T_S. The weight limit of this instance is W̄_T = W̄ − w(T_S), where w(T_S) = Σ_{S_i∈T_S} w̄_i, and the capacity is B̄_T = B̄ − s(T_E), where s(T_E) = Σ_{a_j∈T_E} s̄_j.

Clearly, a solution of profit v for the residual instance with respect to a collection T gives a solution of profit v + p(T_E) for the original instance, where p(T_E) = Σ_{a_j∈T_E} p_j. Let O = (x̄, ȳ) be an optimal solution for the integral problem. W.l.o.g. we assume that, for all a_j ∈ A, Σ_{S_i∈S} y_{i,j} ≤ 1 (that is, no element is selected by more than one set). Let R be the collection of the h = d_1/(1−α) most profitable elements a_j for which there is a set S_i with y_{i,j} = 1 (note that this set S_i is unique for each such a_j). Define T^O = {(a_j, S_i) | a_j ∈ R ∧ y_{i,j} = 1}. It is easy to verify that the value of an optimal integral solution for the residual problem with respect to T^O is O − p(T^O_E).

Now, assume that we have an α-approximation algorithm for the semi-fractional problem. Then it returns a fractional solution (ȳ^f, x̄^f) with p(ȳ^f) ≥ α·(O − p(T^O_E)). By Lemma D.1, this solution can be converted to a solution (z̄, x̄^f) with at most d_1 fractional entries and p(z̄) ≥ p(ȳ^f). Now, consider rounding down to zero the value of each fractional entry in z̄. This generates a feasible integral solution z̄' with p(z̄') ≥ p(ȳ^f) − d_1·p(T^O_E)/|T^O|, as the profit of any element in the residual instance is bounded by p(T^O_E)/|T^O|.

This implies a solution for the original problem of value at least

p(T^O_E) + p(ȳ^f) − d_1·p(T^O_E)/|T^O| ≥ p(T^O_E)·(1 − d_1/|T^O|) + α·(O − p(T^O_E)) ≥ α·O,

where the last inequality uses |T^O| = h = d_1/(1−α), so that 1 − d_1/|T^O| = α. That is, we obtain an α-approximation for the optimum. To use this technique, we need to guess the correct collection T, which can be done in time (n·m)^{O(1)} for a constant α. □

In Theorem D.7 we show that there is a polynomial time (α_f − ε)-approximation algorithm for the semi-fractional problem, where

α_f = 1 − (1 − 1/f)^f,    (2)

and f is the maximal number of sets to which a single element can belong. This leads to the following theorem.

Theorem D.3 There is a polynomial time (α_f − ε)-approximation algorithm for CMPC, for any fixed ε > 0, where f is the maximal number of sets to which a single element belongs.
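Returning to the guessing step in the proof of Lemma D.2, the following sketch builds the residual instance for a guessed collection T; the data layout and all names are hypothetical and serve only to illustrate the construction:

```python
def residual_instance(elements, sets, B, W, T):
    """Hedged sketch of the residual instance in Lemma D.2 (hypothetical layout).
    elements: dict a_j -> (size vector s_j, profit p_j)
    sets:     dict S_i -> (set of member element ids, weight vector w_i)
    B, W:     capacity and weight-limit vectors;  T: guessed pairs (a_j, S_i)."""
    T_E = {a for (a, _) in T}
    T_S = {s for (_, s) in T}
    p_min = min(elements[a][1] for a in T_E) if T_E else float("inf")
    # Keep only elements outside T_E whose profit does not exceed any guessed element.
    A_T = {a: v for a, v in elements.items() if a not in T_E and v[1] <= p_min}
    S_T = {}
    for s, (members, w) in sets.items():
        new_members = {a for a in members if a in A_T}
        new_w = [0.0] * len(w) if s in T_S else list(w)   # guessed sets become free
        S_T[s] = (new_members, new_w)
    # Reduced budgets: the guessed elements and sets are paid for up front.
    B_T = [b - sum(elements[a][0][r] for a in T_E) for r, b in enumerate(B)]
    W_T = [wl - sum(sets[s][1][r] for s in T_S) for r, wl in enumerate(W)]
    return A_T, S_T, B_T, W_T
```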

D.2 The Semi-fractional Problem

D.2.1 A submodular point of view

Given an instance of the semi-fractional problem, let O = (ȳ, x̄) be an optimal solution for the problem. W.l.o.g., we may assume that for any element a_j ∈ A, Σ_{S_i∈S} y_{i,j} ≤ 1. For each S_i ∈ S, we define the profit of S_i with respect to the solution O by p_O(S_i) = p(S_i) = Σ_{a_j∈A} y_{i,j}·p_j (note that if x_i = 0 then p(S_i) = 0). Since Σ_{S_i∈S} y_{i,j} ≤ 1 holds for any a_j ∈ A, this means that p(O) = Σ_{S_i∈S} p_O(S_i).

Given T, the collection of the h = ε^{-4}·d most profitable sets in O, we define a maximization problem for a submodular function with d = d_1 + d_2 linear constraints. Let W̄' = W̄ − w(T) (we guess the set T).

• For each S_i ∈ T and each z̄ ∈ [0, 1]^A such that z_j = 0 whenever a_j ∉ S_i, add the element (S_i, z̄) to the universe U. The cost of this element is c̄, where c_r = Σ_{a_j∈A} z_j·s_{j,r} for 1 ≤ r ≤ d_1, and c_r = 0 for d_1+1 ≤ r ≤ d_1+d_2 = d.

• For each set S_i ∉ T and each z̄ ∈ [0, 1]^A such that
  1. z_j = 0 whenever a_j ∉ S_i,
  2. Σ_{a_j∈A} z_j·s_{j,r} ≤ ε³·B_r for any 1 ≤ r ≤ d_1, and
  3. w_{i,r} ≤ ε³·W'_r for any 1 ≤ r ≤ d_2,
add (S_i, z̄) to U. The cost of this element is c̄, where c_r = Σ_{a_j∈A} z_j·s_{j,r} for 1 ≤ r ≤ d_1, and c_{d_1+r} = w_{i,r} for 1 ≤ r ≤ d_2.

The budget vector for the problem, L̄, is the vector B̄ concatenated with W̄'; that is, L_r = B_r for 1 ≤ r ≤ d_1, and L_{d_1+r} = W'_r for 1 ≤ r ≤ d_2. Define f_j : 2^U → R, for any a_j ∈ A, by

f_j(V) = min{ Σ_{(S_i,z̄)∈V} z_j , 1 },    (3)

and f : 2^U → R by f(V) = Σ_{a_j∈A} f_j(V)·p_j. Since each f_j is a submodular non-decreasing set function, f is a submodular non-decreasing set function as well. It is easy to see that a solution of value v for the submodular optimization problem implies a solution of the same value (profit) for the semi-fractional problem.

Let the size of a set S_i ∈ S \ T (with respect to the solution O) be s(S_i) = Σ_{a_j∈A} y_{i,j}·s̄_j. We say that a set S_i ∈ S \ T is small if s(S_i) ≤ ε³·B̄ and w̄_i ≤ ε³·W̄'. Consider the following solution for the submodular problem. For each S_i ∈ T, define the vector z̄ by z_j = y_{i,j} for all a_j ∈ A, and add the element (S_i, z̄) to the solution; for each small S_i ∉ T, define z̄ by z_j = y_{i,j} and add (S_i, z̄) to the solution. Denote the resulting solution by V. It can be easily verified that V is a feasible solution, and it holds that

f(V) = Σ_{S_i∈T} p(S_i) + Σ_{S_i∈S\T, S_i small} p(S_i) = O − Σ_{S_i∈S\T, S_i not small} p(S_i) ≥ O − (d·ε^{-3}/h)·O ≥ (1 − ε)·O,    (4)

where the first inequality holds since the number of sets in O that are not small is bounded by d·ε^{-3} and the profit of each such set is bounded by O/h, and the last inequality follows from h = d·ε^{-4}. This means that the value of the optimal solution for the submodular problem is between (1 − ε)·O and O.
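To make the objective in (3) concrete, the following small sketch evaluates f(V) for a collection V of pairs (S_i, z̄); the names and data layout are illustrative assumptions only:

```python
def coverage_value(V, profits):
    """Sketch of the objective f built from (3).  V: list of pairs (set id, z),
    where z maps element ids to fractional values in [0, 1];
    profits: dict a_j -> p_j.  All names are hypothetical."""
    total = 0.0
    for a_j, p_j in profits.items():
        covered = sum(z.get(a_j, 0.0) for (_, z) in V)   # sum of z_j over chosen pairs
        total += p_j * min(covered, 1.0)                  # each element counted at most once
    return total
```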

D.2.2 Obtaining a distribution on the universe of the submodular problem

We would like to apply the technique of Section 2 to the submodular problem. To do so, we first need to obtain a feasible fractional solution. As the size of U may be infinite, we cannot use the algorithm of Vondrák [20]. Instead, we obtain a fractional solution from a linear programming formulation of the problem. Let S' be the collection of all sets S_i ∈ S such that S_i ∈ T or w̄_i ≤ ε³·W̄'. Consider the following linear program:


(LP(T))   maximize   Σ_{S_i∈S'} Σ_{a_j∈A} y_{i,j}·p_j

subject to:
  ∀ a_j ∈ A, S_i ∈ S':                      0 ≤ y_{i,j} ≤ x_i
  ∀ a_j ∈ A, S_i ∈ S' with a_j ∉ S_i:       y_{i,j} = 0
  ∀ a_j ∈ A:                                Σ_{S_i∈S'} y_{i,j} ≤ 1
  ∀ 1 ≤ r ≤ d_1:                            Σ_{S_i∈S'} Σ_{a_j∈A} y_{i,j}·s_{j,r} ≤ B_r
  ∀ 1 ≤ r ≤ d_2:                            Σ_{S_i∈S'\T} x_i·w_{i,r} ≤ W'_r
  ∀ S_i ∈ T:                                x_i = 1
  ∀ S_i ∈ S'\T, ∀ 1 ≤ r ≤ d_1:              Σ_{a_j∈A} y_{i,j}·s_{j,r} ≤ ε³·B_r·x_i
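As an illustration, LP(T) can be written down with an off-the-shelf LP solver; the sketch below uses the PuLP library and our own (hypothetical) data layout, and is not part of the paper:

```python
import pulp

def solve_lp_T(A, S_prime, T, sizes, weights, profits, B, W_prime, eps, d1, d2):
    """Hedged sketch of LP(T).  A: element ids; S_prime: dict S_i -> member ids;
    T: guessed sets; sizes[j][r], weights[i][r]: cost coordinates; profits[j]: p_j;
    B, W_prime: budget vectors.  All names are our own assumptions."""
    prob = pulp.LpProblem("LP_T", pulp.LpMaximize)
    x = {i: pulp.LpVariable(f"x_{i}", 0, 1) for i in S_prime}
    y = {(i, j): pulp.LpVariable(f"y_{i}_{j}", 0, 1)
         for i, members in S_prime.items() for j in members}
    # Objective: total fractional profit.
    prob += pulp.lpSum(profits[j] * y[i, j] for (i, j) in y)
    # y_{i,j} <= x_i, and each element is covered at most once.
    for (i, j), var in y.items():
        prob += var <= x[i]
    for j in A:
        prob += pulp.lpSum(y[i, jj] for (i, jj) in y if jj == j) <= 1
    # d1 knapsack constraints on element sizes.
    for r in range(d1):
        prob += pulp.lpSum(sizes[j][r] * y[i, j] for (i, j) in y) <= B[r]
    # d2 knapsack constraints on set weights (sets in T are already paid for).
    for r in range(d2):
        prob += pulp.lpSum(weights[i][r] * x[i] for i in S_prime if i not in T) <= W_prime[r]
    # Guessed sets are fully selected; other sets may carry only "small" loads.
    for i in S_prime:
        if i in T:
            prob += x[i] == 1
        else:
            for r in range(d1):
                prob += pulp.lpSum(sizes[j][r] * y[i, j] for j in S_prime[i]) \
                        <= (eps ** 3) * B[r] * x[i]
    prob.solve()
    return ({i: x[i].value() for i in x}, {k: y[k].value() for k in y})
```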

Let (x̄^f, ȳ^f) be an optimal solution for LP(T), of value O^f. Clearly, O^f is greater than or equal to the value of an optimal solution of the submodular problem; thus, by (4), O^f ≥ (1 − ε)·O. We use this solution to generate a fractional solution for the submodular problem. Define X̄ ∈ [0, 1]^U as follows. For any S_i ∈ S' such that x^f_i > 0, let z̄_i ∈ [0, 1]^A with z_{i,j} = y^f_{i,j}/x^f_i, and set X_i = X_{(S_i,z̄_i)} = x^f_i. For any other u ∈ U, set X_u = 0. For any a_j ∈ A, define Y_j = Σ_{S_i∈S'} y^f_{i,j}; then clearly O^f = Σ_{a_j∈A} Y_j·p_j.

Lemma D.4 Let D be a random variable such that D ∼ X̄. Then, for any a_j ∈ A, E[f_j(D)] ≥ α_f·Y_j, where f_j is defined in (3).

We use the next claim in the proof.

Claim D.1 For any x ∈ [0, 1] and f ∈ N,

1 − (1 − x/f)^f ≥ x·α_f,

where α_f is defined in (2).

Proof: Let h(x) = 1 − (1 − x/f)^f − x·α_f; then h(0) = h(1) = 0. Also, h''(x) = −((f−1)/f)·(1 − x/f)^{f−2} ≤ 0 for x ∈ [0, 1], so h is concave on [0, 1]. Hence, h(x) ≥ 0 for x ∈ [0, 1], and the claim holds. □
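Claim D.1 can also be checked numerically on a grid; the following fragment (assuming numpy) is only a sanity check of the inequality, not part of the proof:

```python
import numpy as np

# For every f and every x in [0, 1], 1 - (1 - x/f)^f should dominate x * alpha_f,
# where alpha_f = 1 - (1 - 1/f)^f as in (2).
for f in range(1, 50):
    alpha_f = 1 - (1 - 1 / f) ** f
    x = np.linspace(0, 1, 1001)
    lhs = 1 - (1 - x / f) ** f
    assert np.all(lhs >= x * alpha_f - 1e-12)
```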

Proof of Lemma D.4: For the case where Y_j = 0 the claim trivially holds; thus, we assume below that Y_j ≠ 0. Let X_i be an indicator random variable for the event (S_i, z̄_i) ∈ D, for any S_i ∈ S'. The random variables X_i are independent, and Pr[X_i = 1] = x^f_i. Then,

f_j(D) = min{ 1, Σ_{(S_i,z̄_i)∈D} z_{i,j} } = min{ 1, Σ_{S_i∈S', a_j∈S_i} (y^f_{i,j}/x^f_i)·X_i }.

Let S'[j] = {S_i ∈ S' | a_j ∈ S_i} and τ_j = |S'[j]|. Let δ ∈ (0, 1) be such that, for any S_i ∈ S'[j] with x^f_i ≠ 0, the value y^f_{i,j}/x^f_i is an integral multiple of δ, and 1 is also an integral multiple of δ (assuming all values are rational, such δ exists). Let H = δ^{-1}, and let Z_1, ..., Z_H be a set of indicator random variables used as follows. Whenever X_i is selected in our random process (i.e., X_i = 1), randomly select (y^f_{i,j}/x^f_i)·δ^{-1} indicators among Z_1, ..., Z_H with uniform distribution. For 1 ≤ h ≤ H, Z_h = 1 if Z_h was selected by some X_i with S_i ∈ S'[j] (we say that Z_h is selected in this case); otherwise Z_h = 0. In this process, the probability that a specific indicator Z_h is selected by a specific X_i is zero when x^f_i = 0, and x^f_i·(y^f_{i,j}/x^f_i) = y^f_{i,j} otherwise. Hence, we get that, for all 1 ≤ h ≤ H,

E[Z_h] = Pr(Z_h = 1) = 1 − Π_{S_i∈S'[j]} (1 − y^f_{i,j}) ≥ 1 − ( (1/τ_j)·Σ_{S_i∈S'[j]} (1 − y^f_{i,j}) )^{τ_j} = 1 − (1 − Y_j/τ_j)^{τ_j} ≥ Y_j·α_f.    (5)

The first inequality follows from the inequality of the three means (AM-GM), and the second inequality follows from Claim D.1 applied with τ_j (note that τ_j ≤ f, hence α_{τ_j} ≥ α_f).

Let Y'_j = δ·Σ_{h=1}^{H} Z_h be δ times the number of selected indicators. An important property of Y'_j is that Y'_j ≤ f_j(D). Indeed, if f_j(D) = 1 then Y'_j ≤ 1, since there are only H = δ^{-1} indicators; and if f_j(D) < 1, then no more than f_j(D)·δ^{-1} indicators are selected. Therefore, E[Y'_j] ≤ E[f_j(D)]. From (5), we have that

E[Y'_j] = E[ Σ_{h=1}^{H} Z_h·δ ] = δ·Σ_{h=1}^{H} E[Z_h] ≥ δ·H·Y_j·α_f = Y_j·α_f.

Hence, E[f_j(D)] ≥ α_f·Y_j, as desired. □

The next lemma follows immediately from Lemma D.4 and (4). Recall that F is the extension by expectation of f.

Lemma D.5 F(X̄) ≥ α_f·O^f ≥ (1 − ε)·α_f·O.

It can be easily verified that X̄ is a feasible fractional solution for the submodular problem. Also, for any (S_i, z̄) ∈ U such that X_{(S_i,z̄)} ≠ 0, if (S_i, z̄) is big (as defined in Section 2.1), then S_i ∈ T and X_{(S_i,z̄)} = 1. Define D ⊆ U to be a random set such that D ∼ X̄, and let D' be a random set such that D' = D if D is ε-nearly feasible, and D' = ∅ otherwise. By Theorem 2.1, we get that D' is always ε-nearly feasible, and E[f(D')] ≥ (1 − O(ε))·F(X̄) ≥ (1 − O(ε))·α_f·O.

Given a nearly feasible fractional solution for the submodular problem, we now show that it can be converted to a feasible one.

Lemma D.6 Let D be an ε-nearly feasible solution for the submodular problem. Then D can be converted to a feasible solution D' such that f(D') ≥ (1 − O(ε))·f(D).

Proof: The conversion is done in two steps. First, let D_1 = {(S_i, z̄/(1+ε)) | (S_i, z̄) ∈ D}. Clearly, D_1 is feasible in the first d_1 dimensions (for any 1 ≤ r ≤ d_1, c_r(D_1) ≤ L_r); also, f(D_1) ≥ (1 − ε)·f(D). Next, we note that for any d_1+1 ≤ r ≤ d and any element u ∈ U it holds that c_{u,r} ≤ ε³·L_r. Thus, in these dimensions we can apply the fixing procedure of Lemma 2.2. Hence, we can convert D to a feasible solution D' as desired. □

This leads to the following algorithm.

Randomized approximation algorithm for CMPC

1. For any subset T ⊆ S of size at most h = d·ε^{-4}:

   (a) Solve LP(T); let (x̄^f, ȳ^f) be the solution found.

   (b) Define X̄ and let D be a random set with D ∼ X̄; set D' = D if D is ε-nearly feasible, and D' = ∅ otherwise.

   (c) Convert D' to a feasible set D'' as in Lemma D.6.

2. Let D'' be the solution of maximal profit found for the submodular problem. Convert it to a solution for the semi-fractional problem and return this solution.

Clearly, the algorithm returns a feasible solution for the problem. Consider the iteration in which T is the set of the h most profitable sets in O. In this iteration, E[f(D')] ≥ (1 − O(ε))·α_f·O; hence, by Lemma D.6, E[f(D'')] ≥ (1 − O(ε))·α_f·O. Thus, we get the following (by a proper selection of ε).

Theorem D.7 There is a polynomial time randomized (1 − ε)·α_f-approximation algorithm for the semi-fractional problem, for any fixed ε > 0.

The algorithm can be derandomized using the technique of Section 2.5 (details omitted). Note that here the extension by expectation F(X̄) can be deterministically evaluated in polynomial time, since the number of non-zero entries in X̄ is polynomial.
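For concreteness, the randomized algorithm above can be summarized in the following sketch; the helpers for LP(T), the near-feasibility test, the fixing step of Lemma D.6, and the objective are placeholders supplied by the caller, and all names are our own illustration:

```python
import random
from itertools import combinations

def cmpc_randomized(sets, eps, d, solve_lp, nearly_feasible, fix, value):
    """High-level sketch of the randomized algorithm for CMPC (hypothetical names).
    solve_lp(T) returns (x_f, y_f) for LP(T); nearly_feasible, fix, value stand in
    for the eps-near feasibility test, Lemma D.6, and the objective f."""
    h = int(d * eps ** -4)
    best, best_val = [], 0.0
    for size in range(h + 1):
        for T in combinations(sets, size):          # guess the most profitable sets
            x_f, y_f = solve_lp(T)                  # step (a): solve LP(T)
            # Step (b): independent rounding; each pair (S_i, z_i) with
            # z_{i,j} = y^f_{i,j} / x^f_i is taken with probability x^f_i.
            D = []
            for i in sets:
                if x_f.get(i, 0.0) > 0 and random.random() < x_f[i]:
                    z_i = {j: y_f[i, j] / x_f[i] for (ii, j) in y_f if ii == i}
                    D.append((i, z_i))
            if not nearly_feasible(D, eps):         # otherwise D' is the empty set
                D = []
            D2 = fix(D, eps)                        # step (c): Lemma D.6
            if value(D2) > best_val:
                best, best_val = D2, value(D2)
    return best
```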
