On-line Complexity of Monotone Set Systems
Preliminary copy: Please do not distribute

Haim Kaplan and Mario Szegedy
AT&T Labs Research

August 7, 1998

Abstract

On-line models assume a player, A (randomized or deterministic), who makes immediate responses to incoming elements of an input sequence s = a_1 ... a_r. In this paper we study the case when the response to every a_i is a single bit, interpreted as pick/not-to-pick. The player's objective is to pick as many elements as possible under the condition that the set of picked elements, A(s), satisfies a certain prescribed property. Bartal et al. in [3] give bounds on the randomized player's worst case performance when the incoming elements are vertices of a fixed graph G and the property in question is some hereditary graph property. In their model G is known to the player, but she does not know when the input sequence ends. In this paper we allow A(s) to belong to an arbitrary, but fixed, monotone set system M, where M is known to the player. We assign the performance measure e(M) = max_A min_s E(|A(s)|)/OPT(s) to every monotone set system M, where E() is the expectation over the player's random choices and OPT(s) = max_{M ∈ M} |M ∩ s|. This performance measure corresponds to the reciprocal of the competitive ratio. The question we raise in this article is a general one: what set theoretical properties of M determine the value of e(M)? It easily follows from earlier works that e(M) = 1 iff M is a matroid, but in general it seems very hard to give a combinatorial characterization of the set systems with close to optimal performance. The video on demand problem studied in [1] and [2] corresponds to the case when M is the monotone closure of l disjoint sets of size k. Denoting the performance of this set system by e(k,l), we show that e(k,2) = k/(2k-1), and that k/(3k - 0.5 - min_{z=1,...,k}(z/2 + k/z)) ≤ e(k,3) ≤ k/(3k - min_{z=1,...,k}(z + k/z)). Our extensive computer experiments suggest that the lower bound on e(k,3) is sharp. Awerbuch et al. show that e(k,l) = Ω(1/(log k log l)) and prove that this bound is optimal for a wide range of (k,l) pairs. We show that e(k,l) ≥ k^{l-1}/(k^l - (k-1)^l), which beats the logarithmic bound for some values of k and l. One motivation for obtaining bounds on e(k,l) is the following union theorem that we prove: if M = M_1 ∪ ... ∪ M_l and k is an upper bound on the size of the largest set in M, then the performance of M is at least the performance of the worst of the components multiplied by e(k,l). We show that there exists a graph G such that the performance of the system of its independent sets is as small as n^{-0.226}. We obtain this result by generalizing the product construction of Bartal et al. in [3]. We define the scaled-up performance e'(M) of a set system M, which determines the performance of a product with building blocks isomorphic to M. We raise several interesting questions toward a better understanding of the combinatorial conditions on systems that determine the performance of on-line algorithms on them. Maybe the most intriguing among them is to find the set system with the worst performance.

1 Introduction

An online algorithm receives a series of requests and must respond to each request immediately, without any knowledge of future requests. In contrast, an off-line algorithm knows the complete request sequence before it starts serving the requests. Beginning with the work of Sleator and Tarjan on list searching and paging [12], a lot of attention has been devoted over the last decade to the design and analysis of online algorithms for specific problems, as well as to the development of online computational models. Sleator and Tarjan [12] suggested measuring the performance of an online algorithm by the maximum of the ratio of the on-line and the optimal off-line solutions, where the maximum is taken over all input sequences. This ratio, called the competitive ratio, has been used to analyze algorithms for various data structures, paging, caching, and graph problems (see e.g. [6, 11, 8, 7]), and gave rise to elegant generalizations, such as metrical task systems, introduced by Borodin, Linial, and Saks [5], and the K-server problem, introduced by Manasse, McGeoch and Sleator [10] (see also [9]). Most on-line models assume that there is little common knowledge shared between the online algorithm and the adversary; in particular, online models for graph problems assume that the graph is not known to the online algorithm. An exception to this is the model of Halldorsson in [7], in which the player receives a graph at the start of the game that contains the future on-line input sequence (which is a polynomial fraction of this graph), but the adversary does not identify the upcoming vertices in the large graph (he reveals only their connections to previously shown vertices).

Bartal, Fiat and Leonardi in [3] recently introduced a model for graph optimization problems, in which the player is offered elements from an on-line sequence of distinct nodes of a fixed graph G, which she can choose to pick or not to pick, with the following provisions. The description of the game, besides G, includes a graph property π, and the set of elements picked by the player at any time during the game must have property π. The player does not know when the input sequence ends, so she should optimize her strategy for all possible lengths. Bartal et al. use this new model to explain the behavior of a wide range of real life online problems, such as optimal routing in reconfigurable optical networks and online routing in switch-less optical networks.

In our paper we take a look at a generalization of this model, which also turns out to be a generalization of another interesting, and seemingly unrelated, problem known as the Video-on-Demand (VOD) scheduling problem [1, 2]. In this model customers issue requests for movies to a video server. Upon arrival, a movie request is accepted and served (possibly with some delay), or rejected. The goal of the video server is to maximize the number of accepted requests subject to a capacity constraint c, which is the maximum number of movies that can be shown in parallel.

Our Model: A set system M on a basic set B = B(M) is said to be monotone if for every M ∈ M and M' ⊆ M we also have M' ∈ M. Let M be a set system on a basic set B(M). We shall assume that B(M) = ∪_{M ∈ M} M. If M is not monotone, we can consider its downward closure, M^c, which is the smallest monotone system containing M. For our further discussion we assume that M is monotone, and we shall call its elements independent. Let s = a_1 a_2 ... a_r, where a_1, ..., a_r ∈ B(M) and a_i ≠ a_j for 1 ≤ i ≠ j ≤ r. We define OPT(M, s), or shortly OPT(s), to be the size of the maximum cardinality subset of s that belongs to M. Notice that OPT(s) ≥ 1 for every non-empty sequence s. We shall denote the set of all possible input sequences, including the empty sequence ε, by S(B). Assume that A is a randomized online algorithm that picks or rejects elements of a sequence s in real time. If the set of elements picked by A for every input sequence and for every random choice of A belongs to M, then we say that A admits M. In this paragraph we consider only algorithms that admit M. If A(s) is the set of elements that A picks for an input sequence s, then y(A, s) = E(|A(s)|) is the yield of A on s, and e(A) = e(A, M) = min_s y(A, s)/OPT(s) is the performance of A. The performance of M is e(M) = max_A e(A), the performance of the best performing algorithm A that admits M. We remark that the on-line performance of


an algorithm is the reciprocal of what is called the competitive ratio in the literature. We choose to state our results in terms of the on-line performance, for it is always between 0 and 1, and its increase well reflects the increasing success of the algorithm against the adversary.

The fundamental issue we address here is how various properties of M affect the value of e(M). If M is the collection of subsets of the vertex set of a graph G satisfying some hereditary property, then we get the model of Bartal et al. On the other hand, if M is the downward closure of a system with disjoint maximal sets, then we get the Video-on-Demand scheduling problem posed in [2]. Let M_{k,l} denote the downward closure of l disjoint sets, each of cardinality k, and let e(k,l) denote its performance. Awerbuch et al. [1] give an algorithm for a generalization of the VOD problem whose competitive ratio is O(log l log(k/c)), which implies that e(M_{k,l}) = Ω(1/(log k log l)). This bound is tight when k and l are not super-exponentially related [1]. For another range of parameters, the bound e(k,l) ≥ k^{l-1}/(k^l - (k-1)^l), obtained in Section 4.4, is better. The simple structure of M_{k,l} makes us hope that computing e(k,l) = e(M_{k,l}) exactly as a function of k and l is possible. Though this ultimate goal is still beyond our reach at this point, we have been able to make significant progress towards a complete understanding of these systems. In Section 4 we show that e(k,2) = k/(2k-1). The l = 3 case is more difficult to analyze. We prove that k/(3k - 0.5 - min_{z=1,...,k}(z/2 + k/z)) ≤ e(k,3) ≤ k/(3k - min_{z=1,...,k}(z + k/z)), and describe a computer experiment that leads us to conjecture that the above lower bound is tight for k > 4. For practically occurring on-line problems, finding the best performance can be done with a computer if the instance is very small. Otherwise the search space needs to be restricted without losing the best algorithms. For the M_{k,l} case we define a subset of online algorithms, which we call oblivious, and conjecture that this class contains the best algorithm for any k and l. One motivation for computing e(k,l) precisely is a decomposition theorem, called the Union Theorem, which we prove in Section 5.3. The Union Theorem bounds from below the performance of the union of l systems with independent sets of size at most k, in terms of e(k,l) and the performance of the components. Bartal, Fiat and Leonardi considered set systems which arise from graph properties. They prove that 1/(2√n) ≤ e(M) for any such M, where n = |B(M)|. It is straightforward to generalize their lower bound to any monotone set system. Moreover, in Section 5.1 we improve this lower bound to 1/√(2n).


We define composition of set systems and a new performance measure, called the scaled-up performance, which allows us to estimate the performance of a composed system in terms of the performances of the components. Using these results we improve the n^{0.207} lower bound of Bartal et al. on the competitive ratio of the on-line maximum independent set problem of a graph to n^{0.226}. What we see as the main merit of our approach is that it is general, and may give rise to bounds that are very close to optimal. A refinement of the O(1/(log k log l)) bound on e(k,l) can also be obtained as a consequence of our composition theorem. Although we are far from the goal of predicting e(M) from the structure of the set system M in general, we have made slight progress in this direction as well. In Section 7 we show that the set systems with e(M) = 1 are exactly the matroids. We define extensions of matroids and show that they are characterized by certain types of greedy algorithms. We also give a combinatorial characterization of those elements in a set system (not necessarily a matroid) that any algorithm can greedily pick without hurting its performance.

2 An Algebraic Characterization of the On-Line Performance

In this section we give a formal definition of an online algorithm and an algebraic characterization of the on-line performance, and show that every randomized online algorithm admitting a set system M is a convex combination of deterministic online algorithms admitting M, when the algorithms are described in an appropriately defined space. This fact is necessary to establish a valid ground for our use of Yao's Lemma [15], which we discuss in the next section. The formal characterizations given in this section are not required for the understanding of the results in the rest of the paper. The reader who is more interested in our bounds on the performance of various set systems may want to skip this section on a first reading.

Let B = {1, 2, ..., n}, and let S(B) be the collection of all sequences a_1 a_2 ... a_r, where a_1, ..., a_r ∈ B and a_i ≠ a_j for 1 ≤ i ≠ j ≤ r, together with the empty string ε. The intersection of a sequence s and a subset X of B is a subsequence of s. If this subsequence contains all elements of X we say that s contains X and denote this by X ⊆ s. We denote the last element of a sequence s by last(s). Let

\Delta = \{(s, M) \mid s \in S(B),\ M \in \mathcal{M},\ M \subseteq s,\ last(s) \in M\}.

An on-line algorithm A is characterized by a function P : Δ → [0,1]. The function P gives the conditional probability that the algorithm A picks last(s) for an input sequence s, when the set of previously picked elements is M ∖ {last(s)}. The set of all on-line algorithms (randomized

or deterministic) corresponds to the set of points of the |Δ|-dimensional cube [0,1]^Δ. The deterministic algorithms correspond to the corners of this cube, {0,1}^Δ. Next we give a different parameterization of the above space in terms of absolute probabilities. Let

\Delta' = \{(s, M) \mid s \in S(B),\ M \in \mathcal{M},\ M \subseteq s\}.

An on-line algorithm is also characterized by a function p : Δ' → [0,1] satisfying the following conditions (below, s - last(s) denotes s with its last element removed).

1. p(ε, ∅) = 1;
2. For every (s, M) ∈ Δ', p(s, M) ≥ 0;
3. For every (s, M) ∈ Δ' such that last(s) ∉ M: if M ∪ {last(s)} ∉ M, then
   p(s, M) = p(s - last(s), M),
   and if M ∪ {last(s)} ∈ M, then
   p(s, M) + p(s, M ∪ {last(s)}) = p(s - last(s), M).

Since conditions (1), (2), and (3) above are linear, the online algorithms occupy a convex polytope ON in the |Δ'|-dimensional space. The value p(s, M) is the absolute probability that A has picked exactly the set M ∈ M after the input sequence s. Therefore we can express the on-line performance of an on-line algorithm with respect to an input sequence s as

e(A, s) = \sum_{M \in \mathcal{M},\, M \subseteq s} \frac{|M|}{OPT(s)}\, p(s, M). \qquad (1)

In order to express the absolute probability p in terms of the conditional probability P, we define P' that extends P to Δ'' = {(s, M) | s ∈ S(B), M ∈ M} in the following way. Let s ∩ M = X. Then

if last(s) ∈ X: P'(s, M) = P(s, X);
if last(s) ∉ X and X ∪ {last(s)} ∈ M: P'(s, M) = 1 - P(s, X ∪ {last(s)});
if last(s) ∉ X and X ∪ {last(s)} ∉ M: P'(s, M) = 1.

The function P' is just a formal extension of P to situations when we decide not to pick the last element, or when the element cannot be picked. For every (s, M) ∈ Δ' we have:

p(s, M) = P'(a_1, M)\, P'(a_1 a_2, M) \cdots P'(a_1 \dots a_r, M). \qquad (2)

Conversely, let (s, M) ∈ Δ. Then we have:

P(s, M) = \frac{p(s, M)}{p(s - last(s),\ M \setminus \{last(s)\})}.

It is intuitively straightforward, but requires a formal proof, that the vertices of ON correspond to deterministic algorithms. One can prove this directly, by showing that every point in ON is a convex combination of those points in ON that correspond to deterministic algorithms. Alternatively, one can use Equation (2) to this end. The right hand side of (2) is a multi-linear function on [0,1]^Δ, i.e., the variables are the elements of Δ. It is known (see e.g. [13]) that for every point x of the unit cube there are coefficients c(x, y), where y is a corner of the cube, such that for every multi-linear function f on the unit cube, f(x) = Σ_y c(x, y) f(y) (note that the c(x, y) do not depend on f). This implies that ON is the convex hull of the vectors p that correspond to deterministic algorithms. Thus we have proved the following theorem:

Theorem 2.1 Every vector p in the |Δ'|-dimensional space that corresponds to a randomized algorithm can be expressed as a convex combination of vectors that correspond to deterministic algorithms.
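The interpolation fact invoked above is easy to check numerically. A minimal sketch (ours, not from the paper), using the standard coefficients c(x, y) = ∏_i x_i^{y_i} (1 - x_i)^{1 - y_i}:

```python
# Multilinear interpolation on the unit cube: any multilinear f satisfies
# f(x) = sum over corners y of c(x, y) f(y), with c(x, y) as below.
from itertools import product

def c(x, y):
    out = 1.0
    for xi, yi in zip(x, y):
        out *= xi if yi else (1.0 - xi)
    return out

f = lambda x: 2 + 3 * x[0] - x[1] + 5 * x[0] * x[1]   # multilinear in x0, x1
x = (0.3, 0.7)
interp = sum(c(x, y) * f(y) for y in product((0, 1), repeat=2))
print(abs(f(x) - interp) < 1e-12)                      # True
```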

3 Yao's lemma

In this section we apply the often quoted lemma of Yao [15] to our model. This lemma, which is mathematically equivalent to von Neumann's duality theorem of linear programming, relates the performance of randomized algorithms to the performance of deterministic algorithms on random inputs. Yao's crucial observation is that this lemma applies to all computational models in which real vectors can be assigned to each randomized and deterministic algorithm such that:

1. Every randomized algorithm is a convex combination of deterministic ones;
2. A convex combination of algorithms (randomized or deterministic) corresponds to some (randomized or deterministic) algorithm;
3. The performance of an algorithm A can be obtained by an inner product (v_A, obj), where v_A is the vector representing A and obj is a fixed benefit vector.

We have seen in the previous section that on-line algorithms have a representation in terms of absolute probabilities that satisfies 1-3. The linearity of the performance comes from Equation (1). In order to formulate Yao's lemma we extend some functions originally defined on input sequences to probability distributions on input sequences.


Definition 3.1 Let D be an arbitrary distribution on input sequences from S(B(M)). Let A be an on-line algorithm that admits M. Then

y(A, D) \stackrel{def}{=} \sum_{s \in S(B(\mathcal{M}))} D(s)\, y(A, s),
OPT(D) \stackrel{def}{=} \sum_{s \in S(B(\mathcal{M}))} D(s)\, OPT(s).

Theorem 3.2

e(\mathcal{M}) = \min_D \max_A \frac{y(A, D)}{OPT(D)}, \qquad (3)

where the minimum is over all probability distributions D on S (B ) and the maximum is over all deterministic online algorithms A admitting M.

We shall use this lemma in all our upper bound proofs in Sections 4.3, 4.5, and 6.1.

4 Systems with Disjoint Maximal Sets

Let M_{k,l} = {H_1, ..., H_l}^c be a system such that H_i ∩ H_j = ∅ for 1 ≤ i ≠ j ≤ l, and |H_i| = k for 1 ≤ i ≤ l. For an a ∈ B(M_{k,l}) we denote by H(a) the maximal independent set that contains a. We denote e(M_{k,l}) by e(k,l). Awerbuch et al. show that e(k,l) = Ω(1/(log k log l)) and prove that this bound is optimal for a wide range of (k,l) pairs. We start by giving an explicit formula for e(k,2) in Section 4.1. Section 4.3 gives an algorithm admitting M_{k,3}. We conjecture that this algorithm is the best possible for every k > 4 and prove an almost matching upper bound. Section 4.4 reviews the algorithm of [1] that admits M_{k,l}, whose performance is Ω(1/(log k log l)), and also gives a new algorithm whose performance is k^{l-1}/(k^l - (k-1)^l), which is better for some values of k and l. Section 4.2 defines an oblivious algorithm admitting M_{k,l} as one that decides whether to make its first pick, after observing a sequence s, based upon |s ∩ H_1|, |s ∩ H_2|, ..., |s ∩ H_l| alone, ignoring the order of the elements within s. The online algorithm whose on-line performance is e(k,2), described in Section 4.1, and the algorithm of [1] for M_{k,l}, described in Section 4.4, are both oblivious. We conjecture in Section 4.2 that the best algorithm for every k and l is oblivious. Section 4.5 shows how to obtain the upper bound of Awerbuch et al. as an application of a more general theorem about the performance of compound set systems that we prove in Section 6.


4.1 Two disjoint sets

In this section we show that e(k,2) = k/(2k-1). The following theorem shows that this formula in fact holds for any system with two disjoint maximal sets, not necessarily of the same size, such that the larger has size k.

Theorem 4.1 Let M be a set system consisting of two disjoint maximal sets H_1 and H_2, such that |H_1| ≤ |H_2| = k. Then e(M) = k/(2k-1).

Proof. Let A be an online algorithm that admits M. Let x ∈ H_1 and let p be the probability that A picks x if x is the first element of an input sequence. From the performance of A on the one-element sequence x we obtain that e(A) ≤ p. By considering the sequence that starts with x and continues with all the elements in H_2, we obtain that e(A) ≤ (p + (1-p)k)/k. These two upper bounds on e(A) imply that e(A) ≤ k/(2k-1). Since A was an arbitrary online algorithm that admits M, we have e(M) ≤ k/(2k-1).

On the other hand, let A be the following online algorithm, which admits M. Let s be the input sequence. Algorithm A picks the first element x in s with probability k/(2k-1); assuming it did not pick x, if there is an element y in s such that x and y are in different sets, it picks the first such y with probability 1. It is straightforward to check that e(A) = k/(2k-1), so e(M) = k/(2k-1).

The following is an immediate corollary.

Corollary 4.2 The on-line performance e(k,2) of M_{k,2} is k/(2k-1).
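A few lines of exact arithmetic (ours, not from the paper) confirm that p = k/(2k-1) equalizes the two adversarial sequences used in the proof of Theorem 4.1, so neither bound can be improved:

```python
# Exact check of Theorem 4.1: the ratio on the one-element sequence equals
# the ratio on the sequence "x followed by all of H_2".
from fractions import Fraction

for k in range(1, 20):
    p = Fraction(k, 2 * k - 1)
    ratio_short = p                        # sequence x alone: OPT = 1
    ratio_long = (p + (1 - p) * k) / k     # x then all of H_2: OPT = k
    assert ratio_short == ratio_long == p
```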

4.2 Oblivious Algorithms

For an online algorithm A and a sequence s, let pick_A(s) denote the probability that the set of elements that A picks is non-empty.

Lemma 4.3 For every randomized on-line algorithm A that admits M_{k,l} there is a randomized on-line algorithm A' such that for every input sequence s = a_1 a_2 ... a_r we have:

y(A, s) \le y(A', s) = pick_A(a_1)\,|s \cap H(a_1)|
+ (pick_A(a_1 a_2) - pick_A(a_1))\,|a_2 \dots a_r \cap H(a_2)|
+ (pick_A(a_1 a_2 a_3) - pick_A(a_1 a_2))\,|a_3 \dots a_r \cap H(a_3)|
+ \dots
+ (pick_A(s) - pick_A(a_1 a_2 \dots a_{r-1})). \qquad (*)

Proof. We define A' so that it simulates the picking decisions of A until the first pick, and after that A' picks greedily. Clearly pick_{A'}(s) = pick_A(s) for every input sequence s. Moreover, since the maximal elements of M are disjoint, the first picking decision restricts the set of pickable elements to the elements of H_i, where H_i is the set from which the first pick was made. Under this restriction the greedy algorithm becomes optimal. Therefore y(A', s) ≥ y(A, s).

For an input sequence s and an element x ∈ s, let us denote by s_x the initial segment of s that ends with x, and by s_x^- the initial segment of s that ends with the element that immediately precedes x (if x is the first element of s, then s_x^- is the empty sequence). To prove that the r.h.s. of (*) is correct, observe that the probability that the first picking decision of A' is to pick some x ∈ s is pick_{A'}(s_x) - pick_{A'}(s_x^-) = pick_A(s_x) - pick_A(s_x^-). Because of the greedy selection of elements after the first pick, the yield of A' under the condition that the first pick from s is x is |(s ∖ s_x^-) ∩ H(x)|, and the explicit formula of the lemma follows.

In the above lemma, A' represents the on-line algorithm that is optimal for a given function pick(). The yield of A' depends solely on pick().
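Indeed, the r.h.s. of (*) is mechanical to evaluate given pick(). A minimal sketch (ours, not from the paper), where pick maps prefixes (tuples of elements) to [0,1] with pick(()) = 0, and H maps an element to an identifier of its maximal set:

```python
# Evaluate the r.h.s. of (*): the yield of the optimal algorithm A' for a
# given pick function on the sequence s.
def yield_of(pick, H, s):
    total, prev = 0.0, 0.0
    for i, a in enumerate(s):
        cur = pick(s[: i + 1])
        # increment of the first-pick probability at a_i, times |a_i...a_r ∩ H(a_i)|
        total += (cur - prev) * sum(1 for b in s[i:] if H(b) == H(a))
        prev = cur
    return total
```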

Lemma 4.4 Let pick() be a function from S(B(M_{k,l})) to the reals with the following properties:

1. pick(ε) = 0;
2. for every s ∈ S(B): 0 ≤ pick(s) ≤ 1;
3. for every s ∈ S(B) and x ∈ B ∖ s: pick(s) ≤ pick(sx).

Then there is a unique on-line algorithm A which admits M_{k,l} such that pick_A(s) = pick(s), and y(A, s) can be expressed as the r.h.s. of equation (*).

The proof of the above lemma is straightforward. Next we shall introduce an important subclass of on-line algorithms admitting a set system with disjoint maximal elements.

Definition 4.5 (Oblivious Algorithms) Let algorithm A be defined through a function pick() which satisfies conditions (1)-(3) of Lemma 4.4 and Equation (*). If pick(s) depends only on the sequence of "multiplicities" |s ∩ H_1|, ..., |s ∩ H_l|, then A is said to be oblivious. We will use the notation pick(m_1, m_2, ..., m_l) instead of pick(s), where m_i = |H_i ∩ s| (1 ≤ i ≤ l) is the sequence of multiplicities.

Conjecture 4.6 There is an oblivious algorithm A_{k,l} such that e(A_{k,l}) = e(k,l).

This conjecture is based upon the observation that the best algorithms for M_{k,l} that we know of are oblivious, or can be made oblivious. It is feasible to find the best oblivious algorithm for M_{k,l} by computer when k and l are small. The idea is to generate a linear expression for the on-line performance of the algorithm with respect to every possible input sequence, where the variables in these expressions are the values of the function pick(). Then, by solving a linear program, we find the function pick() that maximizes the value of the minimum expression. We have implemented this experiment and ran it for l = 3 and k = 2, 3, 4. The optimal pick() for k = 4, l = 3 is presented in Figure 1; the performance of the corresponding algorithm is 13/28. The pick() we obtain is invariant under permutations of the coordinates of the input vectors. This fact, together with the monotonicity of pick(), allows the reader to read off the value of pick() over any point of {0,1,2,3,4}^3 from the table. In Figure 1 we can also see a dotted line. This line corresponds to the sequence s = x_1 y_1 z_1 x_2 y_2 y_3 y_4, where the x's are elements from H_1, the y's are elements from H_2, and the z's are elements from H_3. The algorithm picks the first element of s with probability 13/28, the second with probability (19-13)/28, the third with probability (25-19)/28, etc., so

y(A, s) = 2 \cdot \frac{13}{28} + 4 \cdot \frac{6}{28} + \frac{6}{28} + \frac{3}{28} = \frac{59}{28},

and e(A, s) = 59/(4 · 28). In the next section we shall discuss the l = 3 case in greater detail.
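Before moving on, here is a compact sketch of the linear program just described (ours, not from the paper; it assumes scipy is available, and it is feasible only for very small k and l). It encodes the yield formula (*) over sequences of set indices and maximizes the guaranteed performance e:

```python
# Best oblivious algorithm for M_{k,l} via an LP.  Variables: pick(m) for every
# multiplicity vector m in {0..k}^l, plus the performance e to be maximized.
from itertools import product
from scipy.optimize import linprog

def best_oblivious(k, l):
    vecs = list(product(range(k + 1), repeat=l))
    idx = {m: i for i, m in enumerate(vecs)}
    n = len(vecs) + 1                 # last variable is e
    E = n - 1
    A_ub, b_ub = [], []

    def sequences(counts=None, seq=()):
        counts = counts if counts is not None else [0] * l
        yield seq
        for h in range(l):
            if counts[h] < k:
                counts[h] += 1
                yield from sequences(counts, seq + (h,))
                counts[h] -= 1

    for s in sequences():             # s is a sequence of set indices
        if not s:
            continue
        row = [0.0] * n
        counts = [0] * l
        for i, h in enumerate(s):     # encode the yield formula (*)
            rest = s[i:].count(h)     # |a_i ... a_r ∩ H(a_i)|
            prev = tuple(counts)
            counts[h] += 1
            row[idx[tuple(counts)]] -= rest
            row[idx[prev]] += rest
        row[E] = max(s.count(h) for h in range(l))   # OPT(s)
        A_ub.append(row)              # e*OPT(s) - y(A', s) <= 0
        b_ub.append(0.0)

    for m in vecs:                    # monotonicity: pick(m) <= pick(m + e_h)
        for h in range(l):
            if m[h] < k:
                m2 = m[:h] + (m[h] + 1,) + m[h + 1:]
                row = [0.0] * n
                row[idx[m]], row[idx[m2]] = 1.0, -1.0
                A_ub.append(row)
                b_ub.append(0.0)

    A_eq = [[0.0] * n]                # pick of the empty sequence is 0
    A_eq[0][idx[(0,) * l]] = 1.0
    c = [0.0] * n
    c[E] = -1.0                       # maximize e
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[0.0],
                  bounds=[(0.0, 1.0)] * n)
    return -res.fun

print(best_oblivious(2, 2))           # 0.666..., i.e. k/(2k-1) = 2/3
```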

4.3 The case of three disjoint sets

The performance of M_{k,3} is harder to compute than the performance of M_{k,2}. In this section we describe an algorithm admitting M_{k,3} whose performance is f(k), where

f(k) = \frac{k}{3k - \frac{1}{2} - \min_{z=1,\dots,k}\left(\frac{z}{2} + \frac{k}{z}\right)}.

Observe that \min_{z=1,\dots,k}(\frac{z}{2} + \frac{k}{z}) is attained either at z = \lfloor\sqrt{2k}\rfloor or at z = \lfloor\sqrt{2k}\rfloor + 1. So f(k) \ge \frac{k}{3k - \frac{1}{2} - \sqrt{2k}}. Our main theorem is the following.

[Figure omitted: a table of the pick values, in multiples of 1/28, over the slices x = 0 (y, z = 0,...,4) and x = 1 (y, z = 1,...,4).]

Figure 1: A best oblivious algorithm for l = 3, k = 4.

Theorem 4.7 e(k,3) ≥ f(k).

We also prove the following upper bound:

Theorem 4.8

e(k, 3) \le \frac{k}{3k - \min_{z=1,\dots,k}(z + k/z)} \approx \frac{k}{3k - 2\sqrt{k}}.

Intuitively, for large k no algorithm can do much better than the one that decides uniformly at random on a set H ∈ {H_1, H_2, H_3} and then picks every element from H. Therefore, we expect the performance of M_{k,3} to tend to 1/3 as k grows. Indeed, our algorithm approaches the algorithm above as k tends to infinity. In fact, we conjecture that e(k,3) = f(k) for k > 4. One argument in favor of this conjecture is the "good" behavior of our solution in the limit. Another basis for our conjecture is the computer experiments we describe below. These experiments, in fact, led us to the discovery of f(k).

Our study of M_{k,3} started via the computer experiment described in the previous section, where we found the best oblivious algorithm for k ≤ 4. By analyzing our results we were still unable to see what the performance of M_{k,3} was as a function of k. We needed to compute the performance for somewhat bigger values of k. To make this feasible we had to reduce the size of the space of algorithms in which we search for the optimal one. To this end we introduce the notion of semi-oblivious algorithms, defined below.

Definition 4.9 (semi-oblivious algorithm) A semi-oblivious algorithm for M_{k,3} is an algorithm whose pick function takes the following form. Let s = a_1 ... a_r be the input sequence, and let H_t and H_{t'}, t < t', be the two maximal independent sets in M_{k,3} other than H(a_1). Then pick_A(s) = pick(|s ∩ H_t|, |s ∩ H_{t'}|), where pick(i, j), 0 ≤ i, j ≤ k, is monotone and 0 ≤ pick(i, j) ≤ 1.

We will omit the subscript A from the pick function when A is clear from the context. In other words, if we assume that a_1 ∈ H_1, algorithm A picks a_1 with probability pick(0,0), and any other element from H_1 is first picked (i.e., is the first element that A picks) with probability zero. If x is the i-th element from H_2 in s and it is preceded by j elements from H_3, then A first picks x with probability pick(i, j) - pick(i-1, j); similarly, if x is the j-th element from H_3 and it is preceded in s by i elements from H_2, then A first picks x with probability pick(i, j) - pick(i, j-1).


Throughout this section "online algorithm" will mean a semi-oblivious algorithm admitting M_{k,3}. By symmetry it is clear that in order to prove a lower bound on the performance of a semi-oblivious A, it is enough to guarantee it on sequences that start with an element from H_1. Hence in the rest of this section we restrict ourselves to such sequences. The following lemma explains the computational advantage we gain from Definition 4.9.

Lemma 4.10 Let A be semi-oblivious. If e(A, s) ≥ b for every sequence s = a_1 ... a_m where a_1 ∈ H_1 and a_j ∉ H_1 for j ≠ 1, then e(A) ≥ b.

Proof. By considering the sequence a_1 we have that b ≤ pick(0,0). Let s' = a'_1 ... a'_{m'}. If s' contains an optimum solution which is a subset of H_1, then e(A, s') ≥ pick(0,0) ≥ b. If s' does not contain a maximum independent set from H_1, then construct s'' by deleting from s' all the elements of H_1 except the first. By the assumption of the lemma we have e(A, s'') ≥ b, and since y(A, s') ≥ y(A, s'') and OPT(s'') = OPT(s'), we also have e(A, s') ≥ b.

Lemma 4.11 There exists an algorithm A whose on-line performance is maximum (among all semi-oblivious algorithms) such that pick_A(i, j) is symmetric and pick_A(i, k) = 1 for 0 ≤ i ≤ k.

Proof. We define B by setting pick_B(i, k) = 1 for every 0 ≤ i ≤ k, and pick_B(i, j) = pick_A(i, j) for every i and j ≠ k. It is straightforward to verify that e(B) ≥ e(A).

Assume pick_B(i, j) is not symmetric. Define B' by pick_{B'}(i, j) = pick_B(j, i). Note that e(B') = e(B), since for every sequence s there is a sequence s' on which B and B', respectively, have the same performance. Finally, define B'' by pick_{B''}(i, j) = (pick_B(i, j) + pick_{B'}(i, j))/2. Clearly e(B'') ≥ e(B') = e(B), and pick_{B''} is symmetric.

Using a computer we obtained a best semi-oblivious algorithm for every k ≤ 10 and calculated its performance. As an example, Figure 2 shows the best semi-oblivious algorithm for M_{8,3}. Table 1 summarizes the performances of M_{k,3} for k = 1, ..., 10. As in the previous experiment, we generated a linear program containing a linear constraint for every sequence of the type specified in Lemma 4.10. The number of such sequences is much smaller than the total number of sequences, and therefore a solution for larger k's becomes feasible. By Lemma 4.11 we may assume that the pick function we look for is symmetric, which further reduces the number of variables and constraints in the linear program. Notice that from every oblivious algorithm one can generate a semi-oblivious algorithm whose performance is at least as good. Therefore a search in the semi-oblivious space might result in a semi-oblivious algorithm whose performance is better than the performance of the best oblivious algorithm.

[Figure omitted.]

Figure 2: A best semi-oblivious algorithm for M_{8,3}. The number near the circle in column i and row j is the value of pick(i, j); the values not shown in the upper triangle are all ones.

k        1    2    3    4      5      6      7      8      9      10
e(k,3)   1    3/5  1/2  13/28  15/34  3/7    28/67  16/39  36/89  10/25
         1.   .6   .5   .4643  .4412  .4286  .4179  .4103  .4045  .4

Table 1: Column k contains our best lower bound on e(k,3) and its decimal approximation.

Such a result would of course immediately prove that Conjecture 4.6 is false. On the other hand, if Conjecture 4.6 is true, then the performance of the best oblivious algorithm and the best semi-oblivious algorithm is the same. Our experiments show that the best semi-oblivious algorithm has the same performance as the best oblivious algorithm for k ≤ 4.

The dependency of the performance on k is still hard to deduce from Table 1. But the algorithms achieving this performance turned out to have a very specific structure. For every k ≤ 10 we found an optimal algorithm A whose symmetric pick function takes the following form. If we denote pick(0,0) by e, then there exist x and y such that pick(0, j) = min{1, e + x + (j-1)y} for j ≥ 1, and pick(i, j) = min{1, e + 2x + (i + j - 2)y} for i, j ≥ 1, where e, x, and y are between zero and one, and depend on k. Moreover, x, y, and e satisfy

e(A) = e,                                     (4)
e + x + (k-1)y = 1,                           (5)
1 - x + y = 2e,    for every k ≠ 2, 4.        (6)

Notice that (5) and (6) imply that

ky = e.                                       (7)

For example, our algorithm for M_{8,3} depicted in Figure 2 has e = 16/39, x = 9/39, and y = 2/39. These results lead us to conjecture that the best semi-oblivious algorithm for any k > 4 has a pick function of the type described above, with parameters satisfying equations (4), (5), and (6).
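Equations (5)-(7) pin down the parameters from e alone, and together with the formula f(k) they reproduce both Table 1 and the parameters of Figure 2. A short exact-arithmetic check (ours, not from the paper):

```python
# Compute f(k) and the parameters e = ky, x = 1 - 2e + e/k implied by (5)-(7).
from fractions import Fraction

def f(k):
    m = min(Fraction(z, 2) + Fraction(k, z) for z in range(1, k + 1))
    return k / (3 * k - Fraction(1, 2) - m)

for k in range(1, 11):
    e = f(k)               # matches Table 1 except k = 2, 4 (cf. equation (6))
    y = e / k              # equation (7)
    x = 1 - 2 * e + e / k  # from equations (5) and (7)
    print(k, e, x, y)      # k = 8 gives e = 16/39, x = 3/13 = 9/39, y = 2/39
```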

Definition 4.12 (ASO Algorithm) An Arithmetic Semi-Oblivious (ASO) algorithm for M_{k,3} is a semi-oblivious algorithm for M_{k,3} whose pick function satisfies pick(0,0) = e, pick(0, j) = min{1, e + x + (j-1)y} for j ≥ 1, and pick(i, j) = min{1, e + 2x + (i + j - 2)y} for i, j ≥ 1, where e, x, and y are between zero and one and satisfy equations (4), (5), and (6).

Let s = a_1 ... a_m be an input sequence. Lemmas 4.10 and 4.11 already guarantee that it suffices to consider sequences that start with a_1 ∈ H_1, do not contain any other elements from H_1, and contain an optimum solution from H_3. Therefore every sequence mentioned hereafter will be of this type. The following lemma shows that for an ASO algorithm we can further restrict the set of relevant input sequences.


Lemma 4.13 Let A be an ASO algorithm admitting M_{k,3}. If e(A, s) ≥ e for every s that first contains the elements from H_2, consecutively, and then the elements of H_3, consecutively, then e(A) ≥ e.

Proof. Let s be a sequence such that pick(s) = 1 and let s' be the shortest prefix of s for which pick(s') = 1. Let s'' be the sequence obtained from s by removing all the elements from H_2 that follow s'. Clearly y(A, s'') ≤ y(A, s) and OPT(s'') = OPT(s). Therefore it suffices to prove that e(A, s) ≥ e for every s such that either pick(s) < 1, or pick(s) = 1 and all the elements after the shortest prefix s' of s satisfying pick(s') = 1 are from H_3. We prove the statement for these sequences by induction on the number of pairs x, y of elements such that x ∈ H_3 and x precedes y ∈ H_2 in s.

Let x be the last H_3-element in s that precedes an H_2-element, and let y be the element from H_2 that follows x in s. Let s' be the prefix of s that ends with the element preceding the pair xy.

Case 1: pick(s') < 1 and pick(s'xy) < 1. Construct s'' from s by flipping x and y. It is straightforward to verify that e(A, s'') = e(A, s), and by induction e(A, s'') ≥ e.

Case 2: pick(s') < 1 and pick(s'xy) = 1. Construct s'' from s by flipping x and y. Clearly OPT(s'') = OPT(s), and we also claim that y(A, s'') ≤ y(A, s). Thus e(A, s'') ≤ e(A, s), and by the induction hypothesis on s'' the statement follows. To prove the claim, let m_3 be the number of H_3-elements that appear after s' in s. Notice that m_3 ≥ 1 since x ∈ H_3. By the definition of the yield we have that y(A, s) - y(A, s'') = m_3 d + d', where d = [pick(s'y) - pick(s')] - [1 - pick(s'x)] and d' = [1 - pick(s'y)] - [pick(s'x) - pick(s')]. Since d ≥ d' and d + d' = 0, we obtain that y(A, s) ≥ y(A, s''), as required.

Lemma 4.14 Let A be an ASO algorithm admitting M_{k,3}. If e(A, s) ≥ e for every s = a_1 t t', where t ⊆ H_2 and t' is either empty or contains all the elements of H_3, then e(A) ≥ e.

Proof. We prove that e(A, s) ≥ e for every s in which the elements of H_2 appear consecutively before the elements of H_3. The statement then follows using Lemma 4.13. By symmetry the statement holds for sequences that do not contain elements from H_2, so we may assume that s ∩ H_2 ≠ ∅. The proof is by induction on m = k - OPT(s) = k - |s ∩ H_3|.


Assume the statement is true for every sequence s for which k - OPT(s) = m - 1. Let s be a sequence for which k - OPT(s) = m, and let s' be the sequence obtained by adding an element from H_3 ∖ s to the end of s. Using (6) we obtain that

y(A, s') - y(A, s) \le 1 - (e + x) = e - y.

Since OPT(s') = OPT(s) + 1 and, by the inductive assumption, y(A, s') ≥ e · OPT(s'), we obtain that e(A, s) ≥ e.

Lemma 4.15 Let A be an ASO algorithm admitting M_{k,3}. If e(A, s) ≥ e for every sequence s = a s' where s' ⊆ H_2, then e(A) ≥ e.

Proof. We will prove that e(A, s_l) ≥ e for every sequence s_l = a s' s'', where a ∈ H_1, s' ⊆ H_2, |s'| = l, and s'' contains all the elements of H_3. The statement then follows by applying Lemma 4.14. Let j be the maximum integer such that x - (1 - e - x - (j-1)y) is positive. For l = j + i, 0 ≤ i ≤ k - j, we use equations (5), (6), and (7) to lower bound the yield of s_l:

y(A, s_l) = y(A, a s') + (1 - e - x - (l-1)y)k \ge le + (1 - e - x - (l-1)y)k = e + (1 - e - x)k = e + (e - y)k = ek.

Using (6) and (7) we also have

y(A, s_{j-1}) - y(A, s_j) = xk - (1 - e - x - (j-1)y)k + (1 - e - 2x - (j-2)y)(k-1) - x - (j-1)y = 2e + x - y - 1 = 0,

hence e(A, s_{j-1}) ≥ e. Finally, for l = j - i with 1 < i ≤ j - 1, we obtain that

y(A, s_{l-1}) - y(A, s_l) = y(k - i + 1) - (1 - e - 2x - (j-2)y) - x - (j-i)y = 2e + x - 1 - y = 0,

hence e(A, s_l) ≥ e also for l < j - 1, and the statement follows.

Now we are ready to calculate the performance of the best ASO algorithm admitting M_{k,3}.


Theorem 4.16 The best ASO algorithm admitting M_{k,3} has on-line performance f(k).

Proof. By Lemma 4.15 it suffices to look for an ASO algorithm A such that e(A, s_l) ≥ e for the k sequences s_l = a s', 1 ≤ l ≤ k, where s' is a sequence of length l in H_2. The requirement e(A, s_l) ≥ e for l = 1, ..., k implies that

e + lx + \frac{l(l-1)}{2} y - le \ge 0 \qquad (8)

for l = 1, ..., k. Substituting e = ky (equation (7)) and x = 1 - (2k-1)y (from (5) and (7)) into (8), dividing by l, and rearranging terms, we obtain

1 \ge \left(3k - \frac{1}{2} - \left(\frac{l}{2} + \frac{k}{l}\right)\right) y \qquad (9)

for l = 1, ..., k. Since e = ky, we want to find the largest y that satisfies (9) for l = 1, ..., k. Thus we obtain that

y = \frac{1}{3k - \frac{1}{2} - \min_{l=1,\dots,k}\left(\frac{l}{2} + \frac{k}{l}\right)},

from which the statement follows.

Theorem 4.7, with which we started this section, is a corollary of Theorem 4.16. We continue with the proof of Theorem 4.8, which is a simple application of Yao's Lemma.

Proof. (of Theorem 4.8) Consider a probability distribution D on S(B(M_{k,3})) that assigns probability p_1 to the sequence a_1, where a_1 ∈ H_1; probability p_2/b to the sequence a_1 s, where s contains b elements from H_2; and probability p_3/k to the input sequence a_1 s s', where s' consists of all the elements from H_3. Every other sequence is assigned probability 0, and p_1 + p_2 + p_3 = 1. Notice that OPT(D) = 1. Let A_1, A_2, A_3 be deterministic online algorithms admitting M_{k,3}, such that A_i picks the first element from H_i in the input sequence, for i = 1, 2, 3. Clearly,

y(A_1, D) = p_1 + p_2/b + p_3/k,
y(A_2, D) = p_2 + p_3 b/k,
y(A_3, D) = p_3,

and the average yield of any other deterministic online algorithm is dominated by the yield of either A_1, A_2, or A_3. By solving a linear system of equations one can verify that A_1, A_2, and A_3 have equal yield when we set

p_1 = \frac{k(b-1)/b}{3k - b - k/b}, \quad p_2 = \frac{k - b}{3k - b - k/b}, \quad p_3 = \frac{k}{3k - b - k/b},

and this common yield equals the value of p_3. Applying Yao's lemma, the statement now follows.
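A quick exact check (ours, not from the paper) that these probabilities sum to one and equalize the three yields:

```python
# Verify the Yao distribution in the proof of Theorem 4.8 with exact arithmetic.
from fractions import Fraction

def check(k, b):
    den = 3 * k - b - Fraction(k, b)
    p1 = Fraction(k * (b - 1), b) / den
    p2 = (k - b) / den
    p3 = k / den
    assert p1 + p2 + p3 == 1
    y1 = p1 + p2 / b + p3 / Fraction(k)
    y2 = p2 + p3 * b / Fraction(k)
    assert y1 == y2 == p3          # common yield is p3 = k/(3k - b - k/b)

for k in range(2, 12):
    for b in range(1, k + 1):
        check(k, b)
```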

4.4 Many Disjoint Sets

As mentioned in Section 1, the algorithm of Awerbuch et al. [1], when restricted to pick only one set, corresponds to an algorithm that admits M_{k,l} and has performance Ω(1/(log k log l)). In this section we review this algorithm using our framework. We also describe a different algorithm whose performance is k^{l-1}/(k^l - (k-1)^l). Let s = a_1 ... a_r be a sequence and let m_i = |s ∩ H_i|. The algorithm of Awerbuch et al. is defined by the following pick function, which satisfies the conditions of Lemma 4.4. The algorithm is oblivious, as the pick function depends only upon the |H_i ∩ s|.

pick(s) = 1 - \frac{1}{\lceil \log k \rceil + 1} \sum_{i=1}^{\lceil \log k \rceil} \prod_{j=1}^{OPT(s)} \varphi\!\left(1 - \frac{2 \cdot 2^{j/2^i}}{2^i l^2}\right)^{w_j}, \qquad (10)

where w_i = |{1 ≤ j ≤ l : m_j ≥ i}| and φ(x) equals x if x ≥ 0 and 0 otherwise. The following theorem, proven by Awerbuch et al., lower bounds the performance of their algorithm. It follows from this theorem that e(k,l) = Ω(1/(log k log l)).

Theorem 4.17 (Awerbuch et al. [1]) The online algorithm defined by the pick function in (10) has performance Ω(1/(log k log l)).

Proof. (sketch) We define Φ_0(s) = 1, and Φ_i(s) = 1 - \prod_{j=1}^{OPT(s)} \varphi(1 - \frac{2 \cdot 2^{j/2^i}}{2^i l^2})^{w_j} for i = 1, ..., ⌈log k⌉. Clearly pick(s) = \frac{1}{\lceil \log k \rceil + 1} \sum_{i=0}^{\lceil \log k \rceil} Φ_i(s). If OPT(s) ≤ 3 log² l, then the statement holds, as the contribution of Φ_0 already gives the required yield. In case OPT(s) ≥ 3 log² l, we fix i such that 3(2^i log² l) ≤ OPT(s) ≤ 3(2^{i+1} log² l), and show that Φ_i gives yield Ω(OPT(s)/log² l).


Let d = 2i and for each aq 2 s we assign a pair (x; y ) such that aq is the xth element in s from Hy . Note that this gives a one to one correspondance between the elements of s and the set of pairs S = f(x; y )jx 4. If OPT (M) p 2n, then the greedy algorithm gives e(M)  1= 2n. Otherwise, q let M 2 M with jM j  2n. Notice, that e(MjM ) = 1, and by p induction e(MjB n M)  1= 2(n ? 2 n). The statement now follows by applying Corollary 5.2.

5.2 The Performance of Disjoint Union Versus Union

It is natural to look for parameters of a set system on which the performance depends monotonically. The performance of a set system is not monotone in the sense that, when we add sets to it, the performance can increase as well as decrease. In this section we prove a monotonicity theorem that requires the notion of splitting.

Definition 5.5 (Splitting) Let x ∈ B(M) and let M_x = {M ∈ M | x ∈ M}. Let M_1 and M_2 be arbitrary monotone systems such that M_x = M_1 ∪ M_2. From M_1 we create M'_1 by replacing x with x_1 ∉ B(M). Similarly, from M_2 we create M'_2 by replacing x with x_2 ∉ B(M) ∪ {x_1}. The new system (M ∖ M_x) ∪ M'_1 ∪ M'_2 is said to be obtained from M by splitting x into x_1 and x_2.


It is easy to see that (M ∖ M_x) ∪ M'_1 ∪ M'_2 is monotone for every x and every M_1 and M_2 such that M_x = M_1 ∪ M_2.

Theorem 5.6 Let M' be a system obtained from M by splitting x into x_1 and x_2. Then e(M') ≤ e(M).

Proof. Let A' be an algorithm that admits M' and has performance e(A') = e(M'). We construct an algorithm A that admits M and has performance at least e(M'). Algorithm A simulates A' as follows. Let y be the current element. If y ≠ x, A sends y to A', and if A' picks y then so does A. If y = x, then A sends x_1 followed by x_2 to A'. If A' picks x_1 or x_2 then A picks x; otherwise it takes no action and goes on to the next input element. By Definition 5.5, A admits M. Let s' be the sequence that A' encounters while A processes s. Clearly OPT(s', M') = OPT(s, M) and y(A', s') = y(A, s), so e(M') ≤ e(A), and the statement follows.

Corollary 5.7 If M = M_1 ∪ M_2, and M' = M'_1 ⊍ M'_2 such that M'_1 is isomorphic to M_1 and M'_2 is isomorphic to M_2, then e(M) ≥ e(M') (⊍ stands for disjoint union).

Proof. M' can be obtained from M by a sequence of element splits.

5.3 The Union Theorem

We denote by OPT(M) the size of the largest set in M. Let M = M_1 ∪ ... ∪ M_l, where OPT(M_i) ≤ k. The Union Theorem, which we prove in this section, shows that the performance of M is at least min_{i=1,...,l} e(M_i) multiplied by e(k,l), i.e., the loss is at most a factor that is poly-logarithmic in k and l.

Theorem 5.8 (Union Theorem) Let M = M_1 ∪ ... ∪ M_l be a set system such that OPT(M_i) ≤ k for 1 ≤ i ≤ l. Assume moreover that e(M_i) ≥ α for 1 ≤ i ≤ l. Then

e(M) \ge α \cdot e(k, l),

where e(k,l) is the performance of the system M_{k,l} = {H_1, ..., H_l}^c introduced in Section 4.

Proof. Let A_0 be an algorithm admitting M_{k,l} such that e(A_0) = e(k,l), and let B_i be an algorithm that admits M_i such that e(B_i) ≥ α, for 1 ≤ i ≤ l. Let s be an on-line sequence.

Our algorithm A works in two phases. In the first phase, A constructs an input sequence s' from s and runs A_0 on it, until A_0 decides to pick an element, which belongs to H_p for some 1 ≤ p ≤ l. The sequence s' is constructed from s on-line. In the second phase, A runs B_p on the suffix of s which is current when A_0 makes its decision. In the first phase no selection of any element of the input sequence occurs; the sole purpose of this phase is to guess an index p for which OPT(s, M_p) is close to OPT(s, M). Below we give the details of how s' is constructed in the first phase. Originally s' is the empty sequence. For an element x of s we denote the initial segment of s that ends with x by s_x, and the initial segment of s that ends with the element just before x by s_x^-. When we read a new element x of s, we execute the following set of instructions for every i ranging from 1 through l. If OPT(s_x, M_i) = OPT(s_x^-, M_i), we take no action and continue with the next i. If OPT(s_x, M_i) = OPT(s_x^-, M_i) + 1, then we add a new element from H_i to s' and run the next step of A_0 on this element. If A_0 decides to pick this element, we start the second phase of our algorithm, in which we set p = i and run B_p on the postfix of s that starts with x. For our further discussion we denote this postfix by t. (If, after having seen all i's, A_0 does not pick anything, phase one continues with reading the next x.) For the purposes of the analysis we modify the above algorithm a little and assume that s' keeps being constructed silently during the second phase as well. We shall denote by s'' the postfix of s' which starts with the element added last in the first phase and contains all elements added during the second phase.
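A compact sketch of this two-phase algorithm (ours, not from the paper; opt, A0.offer, and B[i].offer are assumed interfaces for the OPT oracles and for the online algorithms A_0 and B_i):

```python
# Two-phase algorithm from the proof of the Union Theorem, as a sketch under
# assumed interfaces: opt(i, prefix) returns OPT(prefix, M_i); A0.offer(i)
# feeds A_0 one fresh element of H_i and returns True iff A_0 picks it;
# B[i].offer(x) runs the component algorithm B_i on x and returns True iff
# it picks x.
def union_algorithm(stream, l, opt, A0, B):
    prefix, p = [], None
    for x in stream:
        if p is not None:              # second phase: run B_p on the suffix t
            if B[p].offer(x):
                yield x                # A picks x
            continue
        old = [opt(i, prefix) for i in range(l)]
        prefix.append(x)
        for i in range(l):             # first phase: build s' and run A_0 on it
            if opt(i, prefix) == old[i] + 1:
                if A0.offer(i):        # A_0 picked an element of H_i
                    p = i              # the guessed index; switch to phase two
                    if B[p].offer(x):
                        yield x
                    break
```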

Claim 5.9 E(OPT(t, M_p)) ≥ OPT(s, M)·e(k,l), where E denotes the expected value of its argument over the random choices of A_0.

Proof. [of the claim] By the construction of s', for every 1 ≤ i ≤ l we have OPT(s, M_i) = |s' ∩ H_i|, and so:

OPT(s, \mathcal{M}) = \max_{1 \le i \le l} OPT(s, \mathcal{M}_i) = OPT(s', \mathcal{M}_{k,l}). \qquad (15)

By the super-additivity of OPT we also have that

OPT(t, \mathcal{M}_p) \ge OPT(s, \mathcal{M}_p) - OPT(s_x^-, \mathcal{M}_p) = |H_p \cap s''|,

where x is the element of s which was being read when A_0 made its pick. Since the performance of A_0 is e(k,l), we have that E(|H_p ∩ s''|) ≥ OPT(s', M_{k,l})·e(k,l). Combining this with (15), Claim 5.9 follows.


The statement of the above claim asserts that the sequence t that goes into the second phase has expected optimum at least OPT(s, M)·e(k,l) with respect to M_p, where p is the index selected randomly by A_0. Since running B_p on t results in an additional loss of a factor of no more than α (in expectation), the theorem follows.

In particular, Theorems 4.17 and 4.18 imply:

Corollary 5.10 Let M = M_1 ∪ ... ∪ M_l be a set system such that the optimum of M_i is at most k for 1 ≤ i ≤ l. Assume moreover that e(M_i) ≥ α for 1 ≤ i ≤ l. Then:

e(\mathcal{M}) \ge \frac{α}{(1 + o(1))\, 2e \log k \log^2 l},
e(\mathcal{M}) \ge α\, \frac{k^{l-1}}{k^l - (k-1)^l}.

6 Compound Systems and Scaled-up Performance

Bartal et al. invented a recursive construction, based upon a specific graph product, which allowed them to obtain lower bounds on the competitive ratio. Their lower bound increases at the rate of some fixed root of the size of the construction. In this section we generalize this construction and identify the parameter that gets carried on to the higher levels of the construction.

Definition 6.1 (Composition of Set Systems) Let M_0 be a set system on a basic set B_0, where |B_0| = l, and let M_1, ..., M_l be set systems on basic sets B_1, ..., B_l. We assume that B_i ∩ B_j = ∅ for 1 ≤ i ≠ j ≤ l. We define a set system M on B = B_1 ∪ B_2 ∪ ... ∪ B_l as the collection of sets I_{i_1} ∪ I_{i_2} ∪ ... ∪ I_{i_s}, where {i_1, i_2, ..., i_s} ∈ M_0 and I_{i_j} ∈ M_{i_j} for 1 ≤ j ≤ s. We call M the M_0-composition of the systems M_1, ..., M_l, and denote it by M_0(M_1, ..., M_l). If M_2, ..., M_l are all isomorphic to M_1, then M_0(M_1, ..., M_l) is called the M_0-composition of copies of M_1.

In the remaining part of this paper we shall give weights to the elements of the basic set of a set system M. Let w be a weight function from B(M) to the non-negative reals. We extend our basic definitions from Section 1 to the weighted case as follows. The yield y(A, s) of an algorithm A for an input sequence s is the total weight of the elements that A picks from s if A is deterministic, and the expected value of the total weight of the elements that A picks from s if A is randomized. We also define OPT(s) to be the maximum weight of an independent set


in s. The notions of on-line performance of an algorithm and of a set system change accordingly. When we do not explicitly mention the weight function, it is identically one.

Lemma 6.2 If we multiply the weights of all elements of B(M) by a constant factor c ≠ 0, then the performance of M does not change.

The proof of this lemma is easy, since y() and OPT() both scale up by a factor of c. Recall that in Section 4.2 we defined the function pick_A(s), for a randomized algorithm A and an input s, as the probability that the set of elements that A picks from s is non-empty. The same notion also makes sense for deterministic algorithms, in which case we get a function that takes 0 or 1 as its value. Below we define the linear extension of the pick() function from input sequences to distributions on input sequences.

Definition 6.3 Let D be an arbitrary distribution on input sequences from S(B(M)). Let A be an on-line algorithm that admits M. Then

pick_A(D) \stackrel{def}{=} \sum_{s \in S(B(\mathcal{M}))} D(s)\, pick_A(s).

The following definition plays a pivotal role in our further discussion.

Definition 6.4 (Scaled-up Performance) For a deterministic algorithm A for which pick_A(D) ≠ 0 we define y'(A, D) = y(A, D)/pick_A(D). The scaled-up performance e'(M) of a set system M is defined as

e'(\mathcal{M}) = \min_D \max_A \frac{y'(A, D)}{OPT(D)}, \qquad (16)

where A runs through all deterministic algorithms for which pick_A(D) ≠ 0. We get the same notion if we require that A run through all randomized algorithms for which pick_A(D) ≠ 0 (see the next lemma).

Lemma 6.5 In the above definition we could have required that A run through all randomized algorithms A for which pick_A(D) ≠ 0. In other words, the maximum on the r.h.s. of Equation (16) does not increase if A is allowed to be randomized.


Proof. Let A be a randomized algorithm. A is equivalent to running deterministic algorithms A_i, 1 ≤ i ≤ m, with probabilities q_i, 1 ≤ i ≤ m. Then y(A, D) = \sum_{i=1}^m q_i\, y(A_i, D) and pick_A(D) = \sum_{i=1}^m q_i\, pick_{A_i}(D). Thus

\frac{y'(A, D)}{OPT(D)} = \frac{\sum_{i=1}^m q_i\, y(A_i, D)}{\left(\sum_{i=1}^m q_i\, pick_{A_i}(D)\right) OPT(D)} \le \frac{\sum_{i=1}^m q_i\, pick_{A_i}(D)\, OPT(D)\, e'(\mathcal{M})}{\left(\sum_{i=1}^m q_i\, pick_{A_i}(D)\right) OPT(D)} = e'(\mathcal{M}).

The hardness proofs in [3] are based on taking a logarithmic power of a specific construction and upper bounding its performance. Our next theorem generalizes the proof in [3] by giving an upper bound on the performance of compound systems in terms of the scaled-up performance of the components. It is unknown under what set of conditions we can exchange the scaled-up performance with the performance in this theorem.

Theorem 6.6 Let M_1, ..., M_l be arbitrary set systems such that e'(M_1), ..., e'(M_l) ≤ α, and let M_0 be a weighted system such that w(i) = OPT(M_i, D_i), where D_i is the distribution that certifies that e'(M_i) ≤ α. Then

e(\mathcal{M}_0(\mathcal{M}_1, \dots, \mathcal{M}_l)) \le e(\mathcal{M}_0)\, α, \qquad (17)
e'(\mathcal{M}_0(\mathcal{M}_1, \dots, \mathcal{M}_l)) \le e'(\mathcal{M}_0)\, α. \qquad (18)

Proof.

Definition 6.7 Let D_0 be a distribution on S(B_0) and let D_1, D_2, ..., D_l be distributions on S(B_1), S(B_2), ..., S(B_l). We define the distribution D = D_0(D_1, ..., D_l) on S(B_1 ∪ B_2 ∪ ... ∪ B_l) as follows: to generate a random sequence from S(B_1 ∪ B_2 ∪ ... ∪ B_l) according to D, we first draw a random sequence s = a_1 ... a_g from S(B_0) according to D_0, and then for each a_i (1 ≤ i ≤ g) we draw a sequence from S(B_{a_i}) randomly according to the distribution D_{a_i}. The random sequence we construct is the concatenated sequence s(a_1) s(a_2) ... s(a_g).

From a deterministic algorithm A that admits M = M_0(M_1, ..., M_l), and for a distribution D = D_0(D_1, ..., D_l), we construct randomized algorithms A_i that admit M_i for 0 ≤ i ≤ l. We then use the bounds on the performance of these algorithms to bound the performance of A. We introduce the notation B_i for B(M_i).


First we construct A_0. We define A_0 for sequences in the support of D_0, and then we extend the definition to arbitrary sequences. The extended algorithm stops picking elements when it observes that the input sequence is not in the support of D_0. If A_0 gets an input sequence s = a_1 ... a_g from the support of D_0, then it creates an on-line sequence s' for A as follows. For every new element x in s, algorithm A_0 draws a random sequence s(x) = b_1 ... b_f from D_x, adds it to s', and simulates the corresponding steps of A on it. If A picks at least one element from s(x), then A_0 picks x; otherwise it does not. By the definition of M = M_0(M_1, ..., M_l), and by the fact that A admits M, it follows that the collection of those elements that A_0 picks for any s in the support of D_0 forms an independent set in M_0. From this, and from the definition of A_0 for input sequences outside the support of D_0, it follows that A_0 admits M_0.

For any x ∈ B_0, let P(x) denote the probability (over the random choices for the input sequence s drawn from D_0 and over the random choices of A_0) that algorithm A_0 picks x. An important fact, which we will use later, is that P(x) agrees with the probability (over randomly chosen input sequences from D) that A picks an element from B_x. We have that:

y(A_0, D_0) = \sum_{x \in B_0} P(x)\, w(x). \qquad (19)

For a fixed x ∈ B_0 we define the yield of algorithm A associated with x, denoted by y(A, D, x), to be the expected number of elements of B_x that A picks. We have:

y(A, D) = \sum_{x \in B_0} y(A, D, x). \qquad (20)

Now we show that for a fixed x ∈ B_0,

y(A, D, x) \le P(x)\, y'(\mathcal{M}_x, D_x), \qquad (21)

by constructing a randomized algorithm A_x that admits M_x, has yield y(A, D, x) on the distribution D_x, and satisfies pick_{A_x}(D_x) = P(x). We define A_x as follows. Let s = a_1 ... a_f ∈ S(B_x) be an on-line input sequence for A_x. Algorithm A_x consists of a preparation (silent) phase and an active phase in which it can pick elements from s. In both phases we simulate the running of A. In the preparation phase we run A on an input sequence s' randomly drawn according to the distribution D, until we encounter the first element from B_x (if we never encounter such an element, the algorithm terminates by not picking anything).


In the second phase we continue running A on the elements of s, and we pick those elements that A picks. By the definition of the composition of set systems it follows immediately that A_x admits M_x. If we draw s randomly from D_x and then run A_x on s, then the distribution of the output sets is exactly the same as when we draw s from D, run A on it, and intersect the result with B_x. This implies that P(x) = pick(A_x, D_x) and y(A, D, x) = y(A_x, D_x) = pick(A_x, D_x)·y'(A_x, D_x) = P(x)·y'(A_x, D_x).

Lemma 6.8 OPT(D, M) ≥ OPT(D_0, M_0).

Proof. Recall that the distribution D is constructed such that we take a random sequence a_1 ... a_r from D_0, for each a_i we draw a random sequence s(a_i) from D_{a_i}, and concatenate these sequences. From this construction it follows that it is enough to show that if a_1 ... a_r is an arbitrary sequence from S(B(M_0)), then OPT(a_1 ... a_r, M_0) ≤ E(OPT(s(a_1) ... s(a_r))), where E here (and later in (22)) denotes expectation over the cross product of independent random choices for s(a_i) ∈ D_{a_i} (1 ≤ i ≤ r). Let M ⊆ s, M ∈ M_0, be such that \sum_{x \in M} w(x) = OPT(a_1 ... a_r, M_0). Then we have:

OPT(a_1 \dots a_r, \mathcal{M}_0) = \sum_{x \in M} w(x) = \sum_{x \in M} OPT(D_x, \mathcal{M}_x) = \sum_{x \in M} E(OPT(s(x), \mathcal{M}_x)) = E\Big(\sum_{x \in M} OPT(s(x), \mathcal{M}_x)\Big) \le E(OPT(s(a_1) \dots s(a_r))), \qquad (22)

since \sum_{x \in M} OPT(s(x), \mathcal{M}_x) \le OPT(s(a_1) \dots s(a_r)) for every sequence s(a_1) ... s(a_r).

From the above lemma and inequalities (19), (20), (21) we obtain:

\frac{y(A, D)}{OPT(\mathcal{M}, D)\, pick_A(D)} = \frac{\sum_{x \in B_0} y(A, D, x)}{OPT(\mathcal{M}, D)\, pick_A(D)} \le \frac{\sum_{x \in B_0} P(x)\, y'(\mathcal{M}_x, D_x)}{OPT(\mathcal{M}, D)\, pick_A(D)} \le \frac{α \sum_{x \in B_0} P(x)\, w(x)}{OPT(\mathcal{M}, D)\, pick_A(D)} = \frac{α\, y(A_0, D_0)}{OPT(\mathcal{M}, D)\, pick_A(D)} \le \frac{α\, y(A_0, D_0)}{OPT(\mathcal{M}_0, D_0)\, pick_{A_0}(D_0)} \le e'(\mathcal{M}_0)\, α. \qquad (23)

Note that pick_A(D) = pick_{A_0}(D_0) is obvious from the definitions. If we maximize both sides over all deterministic algorithms A that admit M, then we get e'(M) ≤ e'(M_0)·α. If we change the derivation (23) so that we do not divide by pick_A(D), and we maximize over all deterministic algorithms A, then we get e(M) ≤ e(M_0)·α.


Corollary 6.9 Let M_1 and M_2 be two monotone set systems. Let l = |B(M_1)|, and let M_2^{(1)}, ..., M_2^{(l)} be disjoint copies of M_2. Then

e'(\mathcal{M}_1(\mathcal{M}_2^{(1)}, \dots, \mathcal{M}_2^{(l)})) \le e'(\mathcal{M}_1)\, e'(\mathcal{M}_2),
e(\mathcal{M}_1(\mathcal{M}_2^{(1)}, \dots, \mathcal{M}_2^{(l)})) \le e(\mathcal{M}_1)\, e'(\mathcal{M}_2).

Proof. Scale up the weights of the elements of M_1 by a factor of OPT(M_2). By Lemma 6.2 this does not affect the performance of M_1. The statement now follows from Theorem 6.6.

6.1 Applications of Compound Systems

As a first application we show how to strengthen the result in [3] and improve the lower bound on the competitive ratio of on-line algorithms for the maximum independent set problem of a graph.

Definition 6.10 Let G be a graph on n vertices. Then we define ind(G) as the system of all independent sets of G.

Lemma 6.11 Let G_0, G_1, …, G_l be graphs on disjoint vertex sets such that V(G_0) = {1, 2, …, l}, and let G = G_0(G_1, …, G_l) be the graph with

$$V(G) = V(G_1) \cup V(G_2) \cup \dots \cup V(G_l), \quad E(G) = E(G_1) \cup E(G_2) \cup \dots \cup E(G_l) \cup \{(x, y) \mid x \in V(G_i),\ y \in V(G_j),\ (i, j) \in E(G_0)\}. \qquad (24)$$

If M_i = ind(G_i) then ind(G) = M_0(M_1, …, M_l).

The proof of the above lemma is straightforward. We will use our composition theorem to get estimates on the competitive ratio of on-line algorithms for the independent set problem.
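As an illustration, the substitution of Lemma 6.11 is easy to carry out explicitly. The following sketch builds G_0(G_1, …, G_l) from vertex/edge lists; the function name and data layout are ours, not the paper's.

```python
def substitute(G0_edges, blocks):
    """Build G = G0(G1, ..., Gl) as in Lemma 6.11.

    blocks[i] = (V_i, E_i) gives the vertex and edge sets of G_i for
    i = 1, ..., l, with the V_i pairwise disjoint; G0_edges contains
    the edges of G0 on {1, ..., l}.  Returns (V, E).
    """
    V, E = set(), set()
    for V_i, E_i in blocks.values():
        V |= set(V_i)
        E |= set(E_i)
    # Join every vertex of G_i to every vertex of G_j whenever (i, j)
    # is an edge of G0.
    for i, j in G0_edges:
        E |= {(x, y) for x in blocks[i][0] for y in blocks[j][0]}
    return V, E

# Example: a triangle G0 with three edgeless two-vertex blocks yields the
# graph whose independent sets are exactly the six-element system with
# maximal sets {1,2}, {3,4}, {5,6} used below.
V, E = substitute([(1, 2), (1, 3), (2, 3)],
                  {1: ({1, 2}, set()), 2: ({3, 4}, set()), 3: ({5, 6}, set())})
```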

Lemma 6.12 Let G be a graph on l vertices such that e′(ind(G)) = α. Then the competitive ratio of on-line algorithms for finding maximum independent sets of graphs of size n is bounded from below by Ω(n^{−log_l α}).

Proof. It is enough to construct graphs G_m on l^m points for which e(ind(G_m)) ≤ α^m. We shall construct graphs G_m such that e′(ind(G_m)) ≤ α^m, so e(ind(G_m)) ≤ α^m will also hold.

Define G_1 = G, and G_m = G(G_{m−1}^{(1)}, …, G_{m−1}^{(l)}), where G_{m−1}^{(1)}, …, G_{m−1}^{(l)} are disjoint copies of G_{m−1}. For m = 1 we have e′(ind(G_1)) = α by our assumption. For m > 1 Corollary 6.9 gives us that e′(ind(G_m)) ≤ α · e′(ind(G_{m−1})), and by induction e′(ind(G_{m−1})) ≤ α^{m−1}, so e′(ind(G_m)) ≤ α^m.

Let M be a set system on a base set of six elements {1, 2, 3, 4, 5, 6} that consists of three disjoint maximal sets of size two ({1,2}, {3,4}, {5,6}). Let P be a probability distribution on S(B) such that P(1352) = 1/3, P(1354) = 1/3, P(1356) = 1/3, and P is zero on every other sequence. One can calculate from this distribution that e′(M) ≤ 2/3. By the above lemma we get that the performance of on-line algorithms for the maximum independent set problem can be as small as 1/n^{0.226} on suitable graphs (note that log_6(3/2) ≈ 0.226).
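The 2/3 bound can be confirmed mechanically: against a fixed distribution deterministic strategies suffice, and one can check that on this particular distribution the pick-probability normalization in the definition of e′ does not change the optimum, so the unscaled ratio E(yield)/E(OPT) is computed below. The strategy encoding is ours.

```python
from itertools import product

PAIRS = [(1, 2), (3, 4), (5, 6)]
SEQS = [(1, 3, 5, 2), (1, 3, 5, 4), (1, 3, 5, 6)]  # each with probability 1/3

def in_M(S):
    # members of M are exactly the subsets of the three maximal pairs
    return any(S <= set(p) for p in PAIRS)

def opt(seq):
    return max(len(set(p) & set(seq)) for p in PAIRS)

avg_opt = sum(opt(s) for s in SEQS) / 3.0  # = 2
best = 0.0
# Since the three sequences share the prefix 1,3,5, a deterministic online
# algorithm is determined by one pick bit per prefix element and one pick
# bit per possible fourth element (picks that violate M are refused).
for bits in product([0, 1], repeat=6):
    want = dict(zip([1, 3, 5, 2, 4, 6], bits))
    expected_yield = 0.0
    for seq in SEQS:
        picked = set()
        for a in seq:
            if want[a] and in_M(picked | {a}):
                picked.add(a)
        expected_yield += len(picked) / 3.0
    best = max(best, expected_yield / avg_opt)

print(best)  # 0.6666..., matching the 2/3 bound
```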

Next, as a completely different application of the Composition Theorem 6.6, we sharpen our bounds on e(k, l) (e(k, l) = e(M_{k,l}) is introduced in Section 4).

Theorem 6.13 Let k, l, p, q be natural numbers such that p·2^q ≤ k and q·2^p ≤ l. Then e(k, l) ≤ 1/pq.

Proof. Let M_0 be a system which contains q singletons of weights 2p, 4p, …, 2^q p. With an obvious modification of the argument of Theorem 4.21 we obtain that e(M_0) ≤ 2/q. Let M_1, …, M_q be systems isomorphic to M_{2p,2^{p−1}}, M_{4p,2^{p−1}}, …, M_{2^q p,2^{p−1}}. The following lemma gives an upper bound on the scaled-up performance of M_i, 1 ≤ i ≤ q.

Lemma 6.14 e′(M_i) ≤ 2/p and OPT(M_i, D_i) = 2^i p for every 1 ≤ i ≤ q, where D_i is the distribution that certifies that e′(M_i) ≤ 2/p.

Proof. We obtain the distribution D_i on S(B(M_i)) from the distribution defined in the proof of

Theorem 4.19 by blowing up each singleton into 2^i fixed elements. Clearly OPT(M_i, D_i) = 2^i p. Moreover e′(M_i) ≤ 2/p by an argument identical to the one given in the proof of Theorem 4.19.

Let us apply the Composition Theorem to M = M_0(M_1, …, M_q). Inequality (17) gives that e(M) ≤ 4/pq. On the other hand M_0(M_1, …, M_q) ⊆ M_{p2^q, q2^{p−1}} ⊆ M_{k,l}, and the theorem follows.
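To get a feel for the strength of Theorem 6.13, one can search for the pair (p, q) maximizing pq under its hypotheses and compare 1/(pq) with the 1/(log k · log l) scale of Awerbuch et al. [1]. A small search script; all names here are ours.

```python
from math import log

def best_pq(k, l):
    """Largest product p*q over p, q >= 1 with p*2**q <= k and
    q*2**p <= l (the hypotheses of Theorem 6.13)."""
    best = None
    for p in range(1, 64):
        for q in range(1, 64):
            if p * 2**q <= k and q * 2**p <= l:
                if best is None or p * q > best[0] * best[1]:
                    best = (p, q)
    return best

k = l = 2**20
p, q = best_pq(k, l)
# e.g. p = q = 16 here, so 1/(pq) = 1/256 versus 1/(log2 k * log2 l) = 1/400
print(p, q, 1 / (p * q), 1 / (log(k, 2) * log(l, 2)))
```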


7 Deterministic Algorithms, Greedy Algorithms, and Extensions of Matroids

Although the primary focus of our study is randomized algorithms, we ought to look at subclasses of the former, such as deterministic algorithms and variants of greedy algorithms, for at least two reasons. First, the simplest algorithms used in practice often belong to these classes. Our second reason is theoretical: it appears that characterizing set systems by generalizations of matroids has more to do with deterministic and greedy algorithms than with randomized algorithms. In this section we shall explore the relation between the performances of on-line algorithms that belong to the classes illustrated in Figure 3.

Figure 3: Classes of online algorithms (nested: Randomized ⊇ Deterministic ⊇ Greedy ⊇ Monotone Greedy).

Randomized and deterministic algorithms have already been defined earlier. Next we define a subclass of deterministic algorithms that we call generalized greedy algorithms:

Definition 7.1 (Generalized Greedy Algorithm) Let M be a monotone set system, and let M′ ⊆ M be a sub-system (not necessarily monotone). We define the deterministic algorithm greedy_{M′} to be the one that picks an element x from an input sequence s if and only if x together with the set of already picked elements belongs to M′.
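Definition 7.1 translates directly into a streaming routine; a minimal sketch, assuming M′ is given by a membership oracle (the oracle interface is our convenience):

```python
def greedy(member_of_M_prime, stream):
    """greedy_{M'} from Definition 7.1: pick x whenever the set of
    already picked elements plus x still belongs to M'.
    member_of_M_prime(S) -> bool is a membership oracle for M'."""
    picked = set()
    for x in stream:
        if member_of_M_prime(frozenset(picked | {x})):
            picked.add(x)
    return picked
```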

Note that greedy_M is just the usual greedy algorithm on M. We remind the reader that a matroid is a monotone set system M with the property that for every M_1, M_2 ∈ M with |M_1| < |M_2| it is possible to find an element x ∈ M_2 \ M_1 such that M_1 ∪ {x} ∈ M (for more information about matroid theory see [14]). It is known that matroids are exactly those set systems for which the greedy algorithm computes the rank of on-line input sequences, where the rank of the input


sequence s is OPT(s). Our first theorem proves that even if one allows an arbitrary randomized on-line algorithm, a set system has optimal on-line performance if and only if it is a matroid.

Theorem 7.2 e(M) = 1 if and only if M is a matroid.

Proof. Let A be an algorithm admitting M with performance 1. By the argument in the previous section, A is a convex combination of the deterministic algorithms admitting M. Clearly those deterministic algorithms that have a positive coefficient in the convex combination must have performance one. Thus there exists at least one deterministic algorithm D with performance 1 admitting M. Algorithm D must be greedy, since otherwise it fails on any sequence that ends with an item that could increase the size of the solution but is not picked by D. The theorem now follows by the characterization of matroids mentioned above.

We conclude from the above theorem that the class of matroids coincides with the class of set systems that have performance equal to 1, no matter whether the optimum is taken over randomized, deterministic, or generalized greedy algorithms. Unfortunately this coincidence no longer holds when the performance is smaller than 1: there are set systems for which randomized algorithms perform much better than deterministic ones. Currently we do not have a combinatorial characterization of those systems for which there exists a deterministic algorithm whose performance is above a certain threshold, or even of those systems for which there exists a generalized greedy algorithm with a guarantee on its performance. However, we can define a subclass of the generalized greedy algorithms, which we call monotone greedy algorithms, and characterize those set systems M for which there is a monotone greedy algorithm with performance at least c.

Definition 7.3 (Monotone Greedy Algorithm) A monotone greedy algorithm for a set system M is a generalized greedy algorithm greedy_{M′} for M such that M′ is a monotone set system.

Theorem 7.4 Let M be a monotone set system. Then there exists a monotone greedy algorithm for M with performance at least c (≤ 1) if and only if the following set-theoretical condition holds for M:

Monotone Extensibility Condition with Parameter c: There exists a monotone set system M′ ⊆ M such that for every M′ ∈ M′ and for every M ∈ M with c|M| > |M′| there is an x ∈ M \ M′ such that {x} ∪ M′ ∈ M′.


Proof. For the "if" part, let us take greedy_{M′}, where M′ is the sub-system of M whose existence is guaranteed by the combinatorial condition. We need to show that the performance of greedy_{M′} is at least c on every input sequence s. Let M ⊆ s be such that |M| = OPT(s), and let greedy_{M′}(s) = M′ ∈ M′. We show that c|M| ≤ |M′|. Assume to the contrary that c|M| > |M′|. Then there is an x ∈ M \ M′ such that {x} ∪ M′ ∈ M′. Let M″ be the set of elements of M′ that come before x in s. Then {x} ∪ M″ ∈ M′ by the monotonicity of M′. Hence x should have been picked, by the definition of greedy_{M′}, which is a contradiction.

For the "only if" part, let greedy_{M′} be a greedy algorithm with performance c. We show that the combinatorial condition of the theorem holds for M′. For M′ ∈ M′ and M ∈ M we create an input sequence s that first lists all elements from M′ in an arbitrary order, and then lists the elements from M \ M′ in an arbitrary order. If there is no x ∈ M \ M′ such that {x} ∪ M′ ∈ M′, then greedy_{M′}(s) = M′, which implies |M′| ≥ c · OPT(s) ≥ c|M|.
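For small explicit systems the Monotone Extensibility Condition can be verified by brute force. A minimal sketch, with systems represented as collections of frozensets (the representation is ours):

```python
def monotone_extensible(M, M_prime, c):
    """Check the Monotone Extensibility Condition with parameter c:
    for every M' in M_prime and every M in M with c*|M| > |M'|,
    some x in M but not in M' must satisfy M' + {x} in M_prime."""
    for Mp in M_prime:
        for Ms in M:
            if c * len(Ms) > len(Mp):
                if not any(Mp | {x} in M_prime for x in Ms - Mp):
                    return False
    return True
```

For instance, a matroid passes the check with M_prime = M and c = 1: in that case the condition is exactly the exchange property.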

We discovered Theorem 7.4 while trying to find a combinatorial characterization of set systems with deterministic performance at least c. Interestingly, there is a necessary condition which is very similar to the Monotone Extensibility Condition; the difference is that the monotonicity of M′ is not required.

Theorem 7.5 Let M be a monotone set system such that there is a deterministic algorithm A that admits M and e(A) ≥ c. Then the following condition holds for M:

Non-monotone Extensibility Condition with Parameter c: There exists a set system M′ ⊆ M such that ∅ ∈ M′, and for every M′ ∈ M′ and for every M ∈ M with c|M| > |M′| there is an x ∈ M \ M′ such that {x} ∪ M′ ∈ M′.

Proof. We shall need the following definition:

Definition 7.6 Let M be a monotone set system, and let A be a deterministic algorithm that admits M. We define H(A) = {M ∈ M | ∃s ∈ S(B(M)) : A(s) = M}.

We show that the Non-monotone Extensibility Condition holds with parameter c for M′ = H(A). Let M′ = A(s) be an arbitrary element of H(A), and let M be an arbitrary element of M with c|M| > |M′|. Let s′ be an arbitrary sequence constructed from those elements of M that do not belong to s. Since OPT(ss′) ≥ |M| > |M′|/c, we have that A(ss′) ≠ M′ (but A(ss′) ⊇ M′), which implies that there exists an x ∈ M \ M′ such that {x} ∪ M′ ∈ H(A).


Towards closing the gap between greedy and deterministic algorithms we show:

Lemma 7.7 Let us assume that M is such that the Non-monotone Extensibility Condition holds with parameter c and with sub-system M′. Then for every input sequence s ∈ S(B(M)), if y(greedy_{M′}, M, s) = k then 1 + Σ_{i=1}^{k} ⌈i/c⌉ > OPT(s).

Proof. Let s be an input sequence for greedy_{M′}. We prove the statement by induction on k. If k = 0, then s must be empty (because greedy_{M′} always picks the first element of a sequence), and the statement follows.

Let k ≥ 1. Let s′ be the shortest prefix of s with OPT(s′) = 1 + Σ_{i=1}^{k−1} ⌈i/c⌉, and let greedy_{M′}(s′) = X. By induction we have |X| ≥ k, so |X| = k. On the other hand, for the suffix s″ = s \ s′ we have OPT(s″) > ⌈k/c⌉. Thus there is an M ∈ M, M ⊆ s″, with |M| > k/c. By the Non-monotone Extensibility Condition there is an x ∈ M such that X ∪ {x} ∈ M′, which implies that greedy_{M′} must pick at least one element from s″, which is a contradiction.

Combining Theorem 7.5 and Lemma 7.7, we get that whenever there exists a deterministic algorithm admitting M with performance at least c, there also exists a greedy algorithm admitting M with some guarantee on its performance. It is an intriguing question whether this relation can be improved.

We say that M has extensibility-ratio c, 0 ≤ c ≤ 1, if for every M_1, M_2 ∈ M such that |M_2| < c|M_1| it is possible to find an element x ∈ M_1 \ M_2 such that M_2 ∪ {x} ∈ M. If M has extensibility-ratio c, then the Monotone Extensibility Condition holds with M′ = M and with parameter c, and therefore e(M) ≥ c. The set of all matchings in a graph G is a natural example of a monotone set system with extensibility-ratio 1/2, so the usual greedy algorithm always produces a matching of size at least half the size of the maximum matching in any given sequence of edges from G.
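For the matching example, the extensibility-ratio guarantee is exactly what the textbook online greedy matching achieves; a minimal sketch:

```python
def greedy_matching(edge_stream):
    """Online greedy matching.  The matchings of a graph form a monotone
    set system with extensibility-ratio 1/2, so the matching returned is
    at least half the size of the maximum matching among the edges seen."""
    used = set()       # endpoints of edges picked so far
    matching = []
    for u, v in edge_stream:
        if u not in used and v not in used:
            matching.append((u, v))
            used.update((u, v))
    return matching
```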

During the study of specific systems we observed that even if the set system M is not a matroid, there are elements of B = B(M) that are always good to pick greedily. Interestingly, we have found a purely set-theoretical characterization of such elements.

Definition 7.8 An element x ∈ B is greedy for M if for every M_2 ⊊ M_1 ∈ M such that M_2 ∪ {x} ∈ M, there is an M′ ∈ M such that M_2 ∪ {x} ⊆ M′ ⊆ M_1 ∪ {x} and |M′| = |M_1|.

Example 7.9 Let B = {1, 2, 3, 4} and let M = {{1,2}, {1,3,4}, {2,3,4}}^c (the monotone closure of these three sets). Then 3 and 4 are greedy for M.
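Definition 7.8 can be checked mechanically; the following brute force over all pairs M_2 ⊊ M_1 confirms Example 7.9 (the encoding, with ^c implemented as monotone closure, is ours):

```python
from itertools import combinations

def closure(maximal_sets):
    """Monotone closure: all subsets of the given maximal sets."""
    fam = set()
    for M in map(tuple, maximal_sets):
        for r in range(len(M) + 1):
            fam.update(frozenset(c) for c in combinations(M, r))
    return fam

M = closure([{1, 2}, {1, 3, 4}, {2, 3, 4}])

def proper_subsets(S):
    S = tuple(S)
    for r in range(len(S)):          # r < |S| keeps the subset proper
        for c in combinations(S, r):
            yield frozenset(c)

def is_greedy(x, M):
    """Brute-force check of Definition 7.8 for element x."""
    for M1 in M:
        for M2 in proper_subsets(M1):
            if M2 | {x} in M:
                if not any(M2 | {x} <= Mp <= M1 | {x} and len(Mp) == len(M1)
                           for Mp in M):
                    return False
    return True

print([x for x in range(1, 5) if is_greedy(x, M)])  # prints [3, 4]
```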

The following theorem describes a way to transform an algorithm into another one that works in a greedy manner on a prescribed set of greedy elements, and has the same or better performance.

Theorem 7.10 Let G be a set of greedy elements for M. Then for any randomized algorithm A there is a randomized algorithm A′ such that

1. When A′ encounters an element x ∈ G, it picks x with probability one if x and the elements already picked by A′ form an independent set.

2. For every input sequence s, e(A, s) ≤ e(A′, s).

In order to prove Theorem 7.10 we need the following lemma.

Lemma 7.11 Let G be a set of greedy elements for M, and let M_1, M_2 be two independent sets in M such that |M_2| ≤ |M_1| and M_2 \ M_1 ⊆ G. Then there exists an M′ ∈ M such that M_2 ⊆ M′ ⊆ M_1 ∪ M_2 and |M′| = |M_1|.

Proof. The proof is by a straightforward induction on |M_2 \ M_1|. We omit the details.

Using Lemma 7.11 it is easy to prove Theorem 7.10.

Proof (Theorem 7.10). Let A′ be an algorithm that simulates A's coin flips and proceeds according to the picking decisions of A for a given input sequence s, with the following exceptions. When A′ encounters x ∈ G such that x together with the elements picked by A′ so far forms an independent set, A′ picks x (regardless of the decision that A would have made for the same input and the same sequence of coin flips). When A′ encounters x ∉ G, then if A would not have picked x, A′ does not pick x either; and if A would have picked x, and x together with the elements picked by A′ so far forms an independent set, then A′ picks x.

Using Lemma 7.11 one can prove, by induction on the length of the input sequence s, that after processing s, A′ has picked at least as many elements as A would have picked for the same input and the same sequence of coin flips. We conclude that for every sequence of coin flips the performance of A′ is at least as good as that of A, and by taking expectations on both sides we obtain the statement of the theorem.
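The transformation A → A′ can be sketched as a wrapper. We model a run of A with its coin flips fixed as a function of the processed prefix and the current element; this modelling, and the membership oracle independent(S), are our conveniences rather than the paper's notation.

```python
def primed_run(A, G, independent, s):
    """Sketch of A -> A' from the proof of Theorem 7.10, for one fixed
    sequence of coin flips.  A(prefix, x) -> bool gives A's pick
    decision on element x after processing prefix; G is the set of
    greedy elements; independent(S) tests membership in M."""
    picked = []
    for i, x in enumerate(s):
        feasible = independent(set(picked) | {x})
        if x in G:
            # A' picks a greedy element whenever doing so keeps the
            # picked set independent, regardless of A's decision.
            if feasible:
                picked.append(x)
        elif A(tuple(s[:i]), x) and feasible:
            # On non-greedy elements, A' follows A whenever feasible.
            picked.append(x)
    return picked
```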


Theorem 7.10 is particularly useful when one comes to design an algorithm for a set system M for which the greedy elements are known or can be efficiently computed. We finish this section by proving that every element x ∈ B is greedy for M if and only if M is a matroid.

Theorem 7.12 A monotone set system M is a matroid if and only if every x ∈ B is greedy for M.

Proof. It is easy to show that if M is a matroid then every element is greedy. The converse follows from Lemma 7.11 and the monotonicity of M.

8 Open Problems

The starting point of our research was to find necessary and sufficient conditions on a set system M that determine its performance. In this paper we establish many relations between the structure of M and its performance, but these relations usually do not go in both directions. In particular, we do not have a characterization of those M with performance Ω(1/polylog(n)), where n = |B(M)|, or of systems with performance Ω(1).

On the other hand, we would like to find the systems with worst performance. As shown in Section 5.1, 1/√(2n) ≤ e(M), but we can neither prove nor disprove the existence of systems with performance O(1/√n). Using the compound construction of Section 6 with a carefully chosen base system we obtain a set system M for which e(M) ≤ 1/n^{0.226}, and to the best of our knowledge this is the best known upper bound on min_{|B(M)|=n} e(M).

Determining the performance of a system is difficult even when the maximal elements are disjoint. When this is the case, a subclass of randomized algorithms, called oblivious (see Section 4.2), appears to contain the best algorithms, and we conjecture that indeed the best algorithms belong to this class. In Section 4 we give an analysis of the performance of systems with l maximal sets, each of size k. If we denote this value by e(k, l), we can phrase our questions about the value of e(k, l) as:

1. What is the precise formula for e(k, 3)? (We conjecture that it is k/(3k − 0.5 − min_{z=1,…,k}(z/2 + k/z)).)

2. Is it possible to give a precise formula for e(k, l) in general?

A task maybe slightly less ambitious than characterizing set systems with small (randomized) performance would be the analogous question for the deterministic performance: in other words, find a set-theoretic characterization of those systems for which there exists a deterministic algorithm with performance at least c.


Acknowledgments: We acknowledge Joel Spencer for his contribution to Theorem 4.8 and Anna Karlin for pointing us to reference [1].

References

[1] B. Awerbuch, Y. Azar, A. Fiat, and T. Leighton. Making commitments in the face of uncertainty: How to pick a winner almost every time. In Proc. 28th ACM Symposium on Theory of Computing, pages 519–530, 1996.

[2] A. Bar-Noy, J. A. Garay, A. Herzberg, and S. Aggarwal. Sharing video on demand. IBM Technical Memorandum, 1996.

[3] Y. Bartal, A. Fiat, and S. Leonardi. Lower bounds for on-line graph problems with application to on-line circuit and optical routing. In Proc. 28th ACM Symposium on Theory of Computing, pages 531–540, 1996.

[4] S. Ben-David, A. Borodin, R. Karp, G. Tardos, and A. Wigderson. On the power of randomization in on-line algorithms. Algorithmica, 11(1):2–14, 1994.

[5] A. Borodin, N. Linial, and M. E. Saks. An optimal on-line algorithm for metrical task system. J. ACM, 39(4):745–763, 1992.

[6] A. Fiat, R. Karp, M. Luby, L. McGeoch, D. D. Sleator, and N. Young. Competitive paging algorithms. J. Algorithms, 12:685–699, 1991.

[7] M. M. Halldorsson. Parallel and online graph coloring. Technical report, Science Institute, University of Iceland. To appear in J. Algorithms.

[8] R. M. Karp, U. V. Vazirani, and V. V. Vazirani. An optimal algorithm for on-line bipartite matching. In Proc. 22nd ACM Symposium on Theory of Computing, pages 352–358, 1990.

[9] E. Koutsoupias and C. H. Papadimitriou. On the k-server conjecture. J. ACM, 42(5):971–983, 1995.

[10] M. S. Manasse, L. A. McGeoch, and D. D. Sleator. Competitive algorithms for on-line problems. J. Algorithms, 11(2):208–230, 1990.

[11] N. Reingold, J. Westbrook, and D. D. Sleator. Randomized algorithms for the list update problem. Algorithmica, 11(1):15–32, 1994.


[12] D. D. Sleator and R. E. Tarjan. Amortized efficiency of list update and paging rules. Commun. ACM, 28(2):202–208, 1985.

[13] M. Szegedy. Algebraic Methods in Lower Bounds for Computational Models with Limited Communication. PhD thesis, School of Computer Science, University of Chicago, 1989.

[14] D. J. A. Welsh. Matroid Theory. Academic Press, London, 1976.

[15] A. C. Yao. Probabilistic computations: Towards a unified measure of complexity. In Proc. 17th Annual Symposium on Foundations of Computer Science, pages 222–227, 1977.

