Optimal Plans for Aggregation

Michael Mitzenmacher
Harvard University
[email protected]

Andrei Broder
IBM Research Division
[email protected]

* Part of this work was done while the author was at AltaVista, Inc.
† Supported in part by an Alfred P. Sloan Research Fellowship and NSF grants CCR-9983832, CCR-0118701, and CCR-0121154.

ABSTRACT

We consider the following problem, which arises in the context of distributed Web computations. An aggregator aims to combine specific data from n sources. The aggregator contacts all sources at once. The time for each source to return its data to the aggregator is independent and identically distributed according to a known distribution. The aggregator at some point stops waiting for data and returns an answer depending only on the data received so far. If the aggregator returns the aggregated information from k of the n sources at time t, it obtains a reward R_k(t) that grows with k and decreases with t. The goal of the aggregator is to maximize its expected reward.

We prove that for certain broad families of distributions and broad classes of reward functions, the optimal plan for the aggregator has a simple form and hence can be easily computed.

1. INTRODUCTION

We consider the following problem: an aggregator aims to combine specific data from n sources. The aggregator contacts all sources simultaneously. The time for each source to return its data to the aggregator is independent and identically distributed according to a known distribution. The aggregator at some point stops waiting for data and returns an answer depending only on the data received so far. If the aggregator returns the aggregated information from k of the n sources at time t, it obtains a reward R_k(t) that grows with k and decreases with t.

At first blush this problem might seem artificial; however, it is quite common in the Web context, where results are composed from many sources and there is often a conflict between returning fast results and returning good results.

Here are some examples:

• Search engines. Current general search engines (AltaVista, Google, Inktomi, etc.) index a corpus of hundreds of millions of Web pages by maintaining a large collection of inverted files distributed in various ways among multiple machines. Often, when a query is received, a central controller must aggregate data from several machines. The goal of the controller is to optimize the user experience, which depends on two factors. The first is the quantity of information received; we will start with the simplifying assumption that this depends only on the number of machines that have responded to the controller's query. (Practically, ignoring some machines is equivalent to a reduction in the size of the corpus.) The second is the amount of time the user has to wait for the response. We wish to design an algorithm for determining an optimal plan for the controller. Of course in reality the situation is more complex: different machines may be experiencing different loads, or some machines may be down altogether, without the controller's knowledge.

• Metasearch engines and peer-to-peer (P2P) search. The situation here is similar except that the controller now waits for results from multiple search engines. In the P2P scenario the number of sources may not even be known a priori.


• Portal pages. Personalized portal pages (Excite, Lycos, Yahoo) integrate all sorts of personalized information, each piece provided by a different server: stock quotes, news, weather, horoscopes, movie schedules, etc. There is a conflict here between returning the page quickly and having the most updated information from each of these servers. The aggregator may decide to return out-of-date information and/or not fill certain fields in the interest of responding in a timely fashion.

• Page construction from cached components. Companies such as Akamai provide network caching services whereby certain objects such as images are cached on multiple servers located all over the world. When a user requests a page including an "Akamaized" object, the object is returned from the most appropriate cache. In more advanced implementations most of the page is stored on an Akamai server in the form of a collection of objects and only minimal requests are made to a central server. The issue here is that the objects have an expiration date. In some situations the cache server must decide whether to return a page containing some expired objects, wait for updates, or time out the user request altogether.

Again, the general problem situation uniting the above scenarios is that an aggregator simultaneously makes requests to n sources and at some time t decides to return a response in order to obtain a reward R_k(t) that depends on the number k of sources that have responded by time t. The goal of the aggregator is to maximize the expected reward obtained. In full generality, this problem is quite complex; the form of the solution depends on the reward functions, the rules governing machine responses, and the type of solution desired (approximation or exact). For this paper, we have focused on several specific goals that guide us. First, in order to provide a mathematical framework for the problem, we initially focus on a natural probabilistic model, where the return time for each of the n sources is independent and identically distributed (i.i.d.) according to a cumulative distribution function F known to the controller.

Second, we are interested in situations where we can develop plans that are optimal for the aggregator. The problem of designing fast, on-line general approximation algorithms for this scenario is clearly interesting and a worthy problem for future work. Our hope, however, is to find a broad class of instances where computing an optimal solution is possible. Third, we are interested in situations where we can develop a plan for the aggregator that is simple. While simplicity is a relative notion, we may describe it formally here as follows. The plan can be thought of as a series of binary functions P_k(t), where P_k(t) is 1 if the aggregator should return with k responses at time t and 0 otherwise. One natural measure of the complexity of P_k(t) is the number of transition points, where t_0 is a transition point if for every ε > 0 there exist points x, y in the neighborhood [t_0 − ε, t_0 + ε] such that P_k(x) and P_k(y) take on different values. Intuitively, the optimal plan changes around t_0. A priori there is no reason that the number of transition points must be finite (indeed, it is not clear that the set of transition points needs to be countable). A simple plan will have few transition points. In this work we find situations where each P_k(t) has a single transition point.

Our work relies on using statistical properties from the literature of systems reliability: increasing failure rate and decreasing failure rate. We provide a weak example of our results here, although the statement may be more meaningful only after all the definitions and framework have been established. Suppose each reward function R_k(t) equals r_k (1 − Z(t)), where the constants r_k > 0 represent the undiscounted rewards and 1 − Z(t) is a discount factor related to the time waited. Also, let the function Z(t) be a cumulative distribution function for some probability distribution. Note that Z(t) is not actually used as the distribution of a random variable; this is just a convenient way to specify the shape of the function. Indeed, one of our important contributions is to recognize that interesting results arise in this setting by considering the shape of the reward function as a probability distribution. As an example, a natural special case that is often used in Markov decision processes [4, 13] is where the reward has exponential decay; that is, R_k(t) = r_k e^{−γt} for some fixed γ. Recall that F is the cumulative distribution function of the return time of each source. Let F_0 represent the cumulative distribution function of the first response time among any of the n sources. One result we obtain is the following:

Theorem: If F has increasing failure rate, Z has decreasing failure rate, and the sum of the failure rates for F_0 and Z is decreasing, then in the optimal plan for each k there is a single transition point. Specifically, on receiving the kth response the aggregator should either return immediately or wait for the next response. Similarly, if F has decreasing failure rate, Z has increasing failure rate, and the sum of the failure rates for F_0 and Z is increasing, then in the optimal plan for each k there is a single transition point. With k responses, the aggregator should wait up to some time t_k for another response before giving up.

Hence, for extremely large classes of return distributions and reward functions, the optimal plan is incredibly simple. Using this fact, one can generally calculate the optimal plan numerically if given the return distribution in an appropriate form; a sketch of such a calculation for the exponential-decay reward appears below. Another possible application of these results is to allow an aggregator to learn a near-optimal plan effectively even if the underlying distributions are unknown. The aggregator uses a learning algorithm to discover the best plan; it can narrow the space of plans to consider by focusing only on plans with a single transition point if the aggregator knows or suspects the conditions of the theorem are satisfied.

We also provide several additional theorems. For example, we provide a result that has fewer restrictions on the distributions, but instead restricts the behavior of the values r_k. We also examine the cases where different sources may have different values or different response times. In these settings, the optimal plan also has a simple form, with one transition point per subset of sources. Finally, we present an argument showing that making another natural weaker assumption on the distributions of the response times is insufficient for our results.
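As a concrete illustration of the kind of numerical calculation this enables, the following sketch (not taken from the paper) evaluates the exponential special case just mentioned: i.i.d. exponential return times and rewards R_k(t) = r_k e^{−γt}. In that case the optimal plan simply waits for some fixed number k of responses (this is Corollary 1 later in the paper), and the expected reward of committing to k responses has a closed form. The parameters n, lam, gamma and the reward values r_k are illustrative assumptions only.

```python
# Illustrative sketch (not from the paper): with i.i.d. exponential return
# times (rate lam) and exponential-decay rewards R_k(t) = r_k * exp(-gamma*t),
# committing to k responses gives expected reward
#   r_k * prod_{i=0}^{k-1} (n-i)*lam / ((n-i)*lam + gamma),
# since the k-th arrival time is a sum of independent exponential gaps.

def expected_reward_waiting_for_k(k, n, lam, gamma, r):
    """Expected reward if the aggregator always waits for exactly k responses."""
    discount = 1.0
    for i in range(k):                     # gap i has rate (n - i) * lam
        rate = (n - i) * lam
        discount *= rate / (rate + gamma)  # E[exp(-gamma * gap_i)]
    return r[k] * discount

def best_k(n, lam, gamma, r):
    """Pick the number of responses k maximizing the expected reward."""
    return max(range(n + 1),
               key=lambda k: expected_reward_waiting_for_k(k, n, lam, gamma, r))

if __name__ == "__main__":
    n, lam, gamma = 10, 1.0, 0.5           # made-up example parameters
    r = [k ** 2 for k in range(n + 1)]     # undiscounted rewards r_0 <= ... <= r_n
    for k in range(n + 1):
        print(k, round(expected_reward_waiting_for_k(k, n, lam, gamma, r), 4))
    print("optimal number of responses:", best_k(n, lam, gamma, r))
```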

1.1 Previous work

The concepts of increasing failure rate and decreasing failure rate are utilized primarily in the literature of systems reliability, where they are used to describe the lifetime of system components [12, 15, 16, 19]. The lifetime distribution may affect the proper strategy for scheduling maintenance. In this sense, the idea that these distributions can yield algorithmic implications has been known for some time [16].

The problem we examine here fits naturally into the scheme of Markov decision processes [4, 13]; there is an underlying Markov reward process, and one wants to maximize the reward. Indeed, it specifically fits into the framework of optimal stopping theory, where one wishes to find the optimal time to stop a process in order to maximize a reward [6, 8]. For example, the well-known secretary problem, where an employer interviews n secretaries and must decide after each interview whether or not to hire that person on the spot, exemplifies the problem framework of optimal stopping theory. The reward function for the secretary problem depends on the variation; the employer may wish to maximize the probability of finding the best secretary, or optimize for some other criterion (see, e.g., [1]). We note that in order to keep this work self-contained, we eschew using notation or language specific to optimal stopping theory.

What is novel in our work is the connection between reliability theory and this natural stopping problem, and in particular the interesting relationships between the return time distribution and the shape of the reward discounting function. There is very little other prior work that builds on this natural connection. Most recently, properties of increasing failure rate distributions were used by Boyan and Mitzenmacher in the context of another scheduling and planning problem, crossing town by bus [5]. This paper extended previous work by Datar and Ranade [7], which considered the question under the assumption that the arrival distributions for buses to a stop were exponential. The main result of Boyan and Mitzenmacher is to show that the weaker assumption that bus arrival distributions have increasing failure rate results in simple plans for the optimal solution, which can then be calculated using numerical methods.

After completing this work, we found additional prior work from the statistical community that utilized the connection between stopping times for aggregation problems and reliability properties of distributions [17, 18, 9]. In all of this prior work, however, the reward function used in the analysis was R_k(t) = r_k − ct for some constant c, or in some cases R_k(t) = r_k − c(t) for a convex c(t). That is, their discount was additive over time, while ours is multiplicative. (It might seem one could move from multiplicative to additive discounts by taking an appropriate logarithm, but because we seek to maximize the expected reward, this does not appear to be the case.) While the flavor of the work is similar, all of our results are new. Moreover, some of our generalizations appear to have no parallel in this previous work. While it is arguable which model is more accurate, we believe that each may be appropriate in different situations.

Our work also has some of the flavor of problems for on-line algorithms in the face of uncertainty, including machine breakdown [2, 3, 10, 11]. Our aggregator must decide when to stop in the face of uncertain response times. Again, our goals here are different than in this previous work, as we focus on what probabilistic assumptions are necessary for the optimal plan to have a simple, calculable form.

2. DEFINITIONS AND BASIC LEMMAS

We begin by providing standard probabilistic definitions and some simple lemmas. Consider a cumulative probability distribution F(t) with density function f(t) = F'(t). For convenience we will generally assume that F and f are non-zero, continuous, and differentiable over all positive real numbers, although our results hold more generally.

We consider the case where F has increasing failure rate. (Here we follow the perhaps unfortunate but apparently standard practice of using "increasing" to mean "non-decreasing" and "decreasing" to mean "non-increasing" throughout. So a constant function is both increasing and decreasing, and having increasing failure rate really means the failure rate is non-decreasing, even though IFR is the standard term.) At an intuitive level, an event has increasing failure rate if the longer you have waited for it, the more likely it is just about to happen. More formally, we have the following.

DEFINITION 1. For a nonnegative random variable X with cumulative distribution function (cdf) F(t), we define the corresponding survival function to be \bar{F}(t) = 1 − F(t). The failure rate of X is defined as r(t) = f(t)/\bar{F}(t).

DEFINITION 2. A nonnegative random variable X with cdf F(t) is said to have increasing failure rate (or be IFR) if the failure rate r(t) is increasing. Equivalently, X is IFR if log \bar{F}(t) is concave on the support of \bar{F}; that is, \bar{F}(t) is logconcave.

We may say that F is IFR instead of X is IFR where the meaning is clear. Also note that the function r(t) satisfies

r(t) = f(t)/\bar{F}(t) = \lim_{Δt → 0} Pr(t < X ≤ t + Δt | X > t)/Δt;

that is, r(t) represents the probability that the event corresponding to X is about to occur, given that it has not occurred already. This explains our previously given intuition. Exponential, normal, and uniform distributions are, for example, all IFR, as are gamma distributions corresponding to the sum of exponential distributions.

Similarly, we have the following:

DEFINITION 3. A nonnegative random variable X with cdf F(t) is said to have decreasing failure rate (or be DFR) if the failure rate r(t) is decreasing. Equivalently, X is DFR if log \bar{F}(t) is convex on the support of \bar{F}, or \bar{F}(t) is logconvex.

Exponential distributions are also DFR; since the failure rate for exponential distributions is constant (another way of saying the distribution is memoryless), they are both IFR and DFR. Another class of interesting distributions that fall into the DFR class is certain power-law distributions, for example distributions where \bar{F}(t) = 1/(1 + t)^α for α > 0. Power-law distributions have appeared in several contexts in recent work, as they model phenomena with heavy tails. A heavy-tailed response time distribution may make sense in situations where there is some probability a machine is temporarily down, in which case there is a non-trivial possibility that it may be a long time before a response is heard.
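To make the definitions concrete, here is a small numerical check (my own illustration, not from the paper) of the monotonicity of the failure rate r(t) = f(t)/\bar{F}(t) on a time grid for three of the distributions just mentioned; the grid and parameter choices are arbitrary.

```python
# Illustrative check (not from the paper) of IFR/DFR on a time grid for a few
# of the distributions mentioned above.  Parameter values are arbitrary.
import math

def failure_rate(f, F_bar, t):
    return f(t) / F_bar(t)

examples = {
    # exponential(rate 1): constant failure rate, so both IFR and DFR
    "exponential": (lambda t: math.exp(-t), lambda t: math.exp(-t)),
    # uniform on [0, 1]: failure rate 1/(1 - t), increasing (IFR)
    "uniform[0,1]": (lambda t: 1.0, lambda t: 1.0 - t),
    # power law with survival 1/(1+t)^2: failure rate 2/(1+t), decreasing (DFR)
    "power-law a=2": (lambda t: 2.0 / (1.0 + t) ** 3, lambda t: 1.0 / (1.0 + t) ** 2),
}

for name, (f, F_bar) in examples.items():
    # stay inside the support; uniform[0,1] needs t < 1
    grid = [0.1 * i for i in range(1, 10)] if name == "uniform[0,1]" \
           else [0.5 * i for i in range(1, 10)]
    rates = [failure_rate(f, F_bar, t) for t in grid]
    if all(b >= a for a, b in zip(rates, rates[1:])):
        trend = "non-decreasing (IFR)"
    elif all(b <= a for a, b in zip(rates, rates[1:])):
        trend = "non-increasing (DFR)"
    else:
        trend = "neither"
    print(f"{name}: failure rate looks {trend} on the sampled grid")
```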

Before proceeding, we state a few short lemmas about IFR and DFR distributions that prove useful in the sequel.

LEMMA 1. If X_1, X_2, ..., X_n are IFR (resp. DFR) random variables, then min(X_1, X_2, ..., X_n) is IFR (resp. DFR).

PROOF. If r_j(t) is the failure rate for X_j, then \sum_{j=1}^{n} r_j(t) is the failure rate for min(X_1, X_2, ..., X_n). The result follows. □

LEMMA 2. If F is IFR (resp. DFR), then \bar{F}(t + x)/\bar{F}(t) is decreasing (resp. increasing) in t for a fixed x > 0.

PROOF. We prove the case where F is IFR; the other case is entirely similar. The derivative of \bar{F}(t + x)/\bar{F}(t) with respect to t is

( \bar{F}(t + x) f(t) − \bar{F}(t) f(t + x) ) / (\bar{F}(t))^2.

If F is IFR, then

f(t + x)/\bar{F}(t + x) ≥ f(t)/\bar{F}(t),

from which we may conclude the derivative is non-positive and hence the lemma holds. □

Finally, we recall that we are interested in the complexity of the plan, where the complexity is measured in terms of transition points.

DEFINITION 4. A plan has a transition point at time t with k responding sources if for every ε > 0 there exist times x, y in the neighborhood [t − ε, t + ε] such that with k responding sources the plan returns at time x but waits at time y.

3. SIMPLE PLANS FOR CLASSES OF DISTRIBUTIONS AND REWARDS

In this section, we prove our result in the case where the cumulative distribution function F for the response time from each individual source is IFR. For convenience, from here on we denote by F_j the cumulative distribution function for the next response given that j sources have already responded. Of course f and f_j are the corresponding density functions. For notational convenience we may think of \bar{F}_n as being 1 everywhere; that is, the corresponding random variable is infinite.

LEMMA 3. If F is IFR (respectively DFR), then F_j is also IFR (respectively DFR).

PROOF. This follows immediately from Lemma 1, since f_j(t)/\bar{F}_j(t) = (n − j) f(t)/\bar{F}(t). □

Let R_j(t) = r_j \bar{Z}(t), where r_0 ≤ r_1 ≤ r_2 ≤ ... ≤ r_n are constants and \bar{Z}(t) = 1 − Z(t) is the survival function of a probability distribution with cdf Z(t). Recall that the r_j can be considered the undiscounted reward for returning when j sources have been heard from, and \bar{Z}(t) is a multiplicative discount accounting for time. We emphasize again that the fact that \bar{Z}(t) is a survival function does not imply the reward R_j(t) is random; we are merely specifying the shape of the reward function in a convenient manner. For convenience we assume the support of all distributions is all positive real numbers. Let z(t) be the density function corresponding to the cumulative distribution function Z(t).

Finally, note that if \bar{F}(t) and \bar{Z}(t) are survival functions, then so is their product \bar{F}(t)\bar{Z}(t). Abusing notation, we let F · Z be the cumulative distribution function associated with the survival function \bar{F}(t)\bar{Z}(t). The failure rate associated with the distribution F_j · Z is the sum

f_j(t)/\bar{F}_j(t) + z(t)/\bar{Z}(t).

THEOREM 1. If F has increasing failure rate, Z has decreasing failure rate, and F_m · Z has decreasing failure rate, then in the optimal plan for each number of responses j ≥ m − 1 there is a single transition point. Moreover, on receiving the jth response the aggregator should either return immediately or wait for the next response.

We have stated Theorem 1 in a general way, so that it may apply once some number of responses have been obtained even if it does not apply for fewer responses. In particular, since \bar{F}_n(t) is identically one, the theorem always holds for n − 1 responses if F is IFR and Z is DFR.

PROOF. We begin by noting a useful fact. Since f_j(t)/\bar{F}_j(t) = (n − j) f(t)/\bar{F}(t), F is IFR, Z is DFR, and the failure rate associated with the distribution F_j · Z is the sum f_j(t)/\bar{F}_j(t) + z(t)/\bar{Z}(t), we have that if F_j · Z is DFR, so is F_k · Z for all k ≥ j. This fact allows us to do a backwards induction.

Let V_j(t) be the expected payoff to the aggregator using the optimal strategy from time t given that j sources have responded. Clearly V_n(t) = R_n(t); the optimal strategy when all sources have been heard from must be to return immediately. We show via a backward induction from j = n down to m the following:

• At any time t, the optimal strategy having j responses is either to wait for the next arrival, or to return immediately.

• The function W_j(t) = V_j(t)/\bar{Z}(t) is increasing in t.

The above are both true for j = n. The first part of the induction will also go through for j = m − 1, and hence the induction yields the theorem.

To begin, for any fixed t consider v_t(u) to be the value obtained by the aggregator if, starting at time t having already heard from j sources, it chooses the policy of waiting until time u ≥ t and then returning. The optimal policy must be of this form (since the distribution is known ahead of time, randomness cannot help the aggregator). Then

v_t(u) = \int_{x=t}^{u} V_{j+1}(x) ( f_j(x)/\bar{F}_j(t) ) dx + ( \bar{F}_j(u)/\bar{F}_j(t) ) R_j(u).

The first term on the right hand side corresponds to the contribution should an arrival happen before u, and the second term corresponds to the contribution if no arrival happens before u.

Now

dv_t/du = ( f_j(u) V_{j+1}(u) − f_j(u) R_j(u) − r_j \bar{F}_j(u) z(u) ) / \bar{F}_j(t) = ( f_j(u) \bar{Z}(u) [W_{j+1}(u) − r_j] − r_j \bar{F}_j(u) z(u) ) / \bar{F}_j(t).

We show that the numerator of the final right hand side above is non-positive for all u in some interval t ≤ u < u*, and non-negative for all u ≥ u* (where u* may take on the value ∞). It follows that the maximum value for v_t is achieved either when u = t or in the limit as u goes to infinity. That is, the optimal strategy is either to return immediately or wait for the next arrival.

Suppose u* is the infimum of all values such that

f_j(u*) \bar{Z}(u*) [W_{j+1}(u*) − r_j] − r_j \bar{F}_j(u*) z(u*) ≥ 0.

Equivalently, u* is the infimum of all values such that

( f_j(u*)/\bar{F}_j(u*) ) [W_{j+1}(u*) − r_j] ≥ r_j z(u*)/\bar{Z}(u*).

The left hand side above is increasing in u*, as F_j is IFR, W_{j+1}(u*) ≥ r_{j+1} ≥ r_j, and W_{j+1} is increasing by the inductive hypothesis. The right hand side is decreasing, as Z is DFR. Hence for all u ≥ u*,

( f_j(u)/\bar{F}_j(u) ) [W_{j+1}(u) − r_j] ≥ r_j z(u)/\bar{Z}(u),

or equivalently, for u ≥ u*,

f_j(u) \bar{Z}(u) [W_{j+1}(u) − r_j] − r_j \bar{F}_j(u) z(u) ≥ 0.

Hence the derivative dv_t/du is non-negative over the interval u ≥ u* and non-positive for u < u*, as was to be shown.

The above argument says that at any time t, the optimal strategy is either to return immediately or wait for another response. Let t_j be the infimum of all values of t such that it is better to wait for the (j+1)st source at time t. It must be the case that for every t > t_j the aggregator should wait for another response and for every t < t_j the aggregator should return immediately.

It follows that

V_j(t) = \max( R_j(t), \int_{x=t}^{∞} V_{j+1}(x) ( f_j(x)/\bar{F}_j(t) ) dx ).

Equivalently,

W_j(t) = \max( r_j, \int_{x=0}^{∞} W_{j+1}(t + x) ( \bar{Z}(t + x)/\bar{Z}(t) ) ( f_j(t + x)/\bar{F}_j(t) ) dx ).

Now the expression within the integral is increasing in t: W_{j+1}(t + x) is by the induction hypothesis; ( \bar{Z}(t + x)\bar{F}_j(t + x) ) / ( \bar{Z}(t)\bar{F}_j(t) ) is by Lemma 2 and the fact that F_j · Z is DFR; and f_j(t + x)/\bar{F}_j(t + x) is since F_j is IFR. It follows that the integral is increasing in t and hence so is W_j(t), completing the induction. □
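The backward induction in this proof also suggests a direct numerical procedure once the structure (return now, or wait for the next arrival) is known. The following sketch is my own illustration, not from the paper: it assumes i.i.d. exponential return times (IFR) and a Pareto-style discount with survival 1/(1+t)^β (DFR), so that F_j · Z is DFR for every j and Theorem 1 applies at every level; the grid, horizon, rates, and rewards are made-up example values.

```python
# Numerical sketch (not from the paper) of the backward induction above.
# Assumptions for illustration only: i.i.d. exponential return times (rate lam,
# which is IFR) and a discount with survival Zbar(t) = 1/(1+t)**beta (DFR), so
# F_j . Z is DFR for every j.  We tabulate W_j = V_j / Zbar on a grid and report,
# for each j, the earliest grid time at which waiting beats returning.
import math

n, lam, beta = 4, 1.0, 4.0
r = [float(j) for j in range(n + 1)]           # undiscounted rewards r_0 <= ... <= r_n
T, h = 40.0, 0.05                              # time horizon and grid step
grid = [i * h for i in range(int(T / h) + 1)]

def Zbar(t):                                   # survival function of the discount
    return 1.0 / (1.0 + t) ** beta

# W_n(t) = r_n everywhere; work backwards from j = n-1 down to 0.
W = [r[n]] * len(grid)
transition = {}
for j in range(n - 1, -1, -1):
    W_next, W_cur = W, [0.0] * len(grid)
    for i, t in enumerate(grid):
        # value of waiting for the next arrival, divided by Zbar(t):
        #   int_0^inf W_{j+1}(t+x) * (Zbar(t+x)/Zbar(t)) * f_j(t+x)/Fbar_j(t) dx,
        # approximated by a Riemann sum truncated at the horizon T.
        wait = 0.0
        for k in range(i, len(grid)):
            x = grid[k] - t
            dens = (n - j) * lam * math.exp(-(n - j) * lam * x)  # f_j(t+x)/Fbar_j(t)
            wait += W_next[k] * (Zbar(grid[k]) / Zbar(t)) * dens * h
        W_cur[i] = max(r[j], wait)
        if j not in transition and wait >= r[j]:
            transition[j] = t                  # earliest grid time where waiting wins
    W = W_cur

for j in sorted(transition):
    print(f"with {j} responses: wait once t >= {transition[j]:.2f} (grid estimate)")
```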

3.1 Decreasing Failure Rate Return Times

Naturally, we ask whether the above result can be extended in the alternative case where sources have a decreasing failure rate. Such a model is reasonable in this setting; it may be that once we have not heard from a source, there is reason to believe it is busy or broken and might not respond for some time. We prove a result similar to Theorem 1. We also provide a second result that loosens the restrictions on F and Z, but requires additional restrictions on the values r_j.

THEOREM 2. If F has decreasing failure rate, Z has increasing failure rate, and F_m · Z has increasing failure rate, then in the optimal plan for each number of responses j ≥ m − 1 there is a single transition point. Specifically, an aggregator with j responses should wait only until some time t_j for another response.

PROOF. The proof follows Theorem 1, with the directions "reversed." The goal of the backwards induction is now to show:

• With j sources, the optimal strategy is to wait only until some time t_j for another response.

• The function W_j(t) = V_j(t)/\bar{Z}(t) is decreasing in t.

The above are both true for j = n (here t_n = −∞). The first part of the induction will also go through for j = m − 1, and hence the induction yields the theorem.

Similarly to Theorem 1, we have that if F_j · Z is IFR, so is F_k · Z for all k ≥ j under these conditions. Following Theorem 1, for any fixed t again let v_t(u) be the value obtained by the aggregator if, starting at time t, it chooses the policy of waiting until time u ≥ t and then returning. As in Theorem 1 we find

dv_t/du = ( f_j(u) \bar{Z}(u) [W_{j+1}(u) − r_j] − r_j \bar{F}_j(u) z(u) ) / \bar{F}_j(t).

Suppose t_j is the infimum of all values such that dv_t/du ≤ 0. Equivalently, t_j is the infimum of all values such that

( f_j(t_j)/\bar{F}_j(t_j) ) [W_{j+1}(t_j) − r_j] ≤ r_j z(t_j)/\bar{Z}(t_j).

The left hand side above is decreasing in t_j, as F_j is DFR, W_{j+1} ≥ r_{j+1} ≥ r_j, and W_{j+1} is decreasing by the inductive hypothesis. The right hand side is increasing, as Z is IFR. Hence for all u ≥ t_j,

f_j(u) \bar{Z}(u) [W_{j+1}(u) − r_j] − r_j \bar{F}_j(u) z(u) ≤ 0.

Hence the derivative dv_t/du is non-negative over the interval u ≤ t_j and non-positive for u ≥ t_j. Hence v_t first increases up to t_j and then decreases, from which we conclude there is a point t_j that the aggregator should wait until before returning. Further note that the numerator of dv_t/du is independent of t, and hence the value t_j is in fact independent of t; that is, the value t_j completely summarizes the optimal plan for the aggregator when it has j responses.

We now need to show that W_j(t) is decreasing up to t_j. To determine the sign of the derivative of W_j(t), we begin by considering the value of W_j(t + Δ) − W_j(t), where t and t + Δ are less than t_j; indeed, we will think of Δ > 0 as going to 0 (from the right). Since for times below t_j the optimal strategy is to wait until t_j,

W_j(t + Δ) − W_j(t) = \int_{x=t+Δ}^{t_j} f_j(x) V_{j+1}(x) / ( \bar{F}_j(t + Δ)\bar{Z}(t + Δ) ) dx − \int_{x=t}^{t_j} f_j(x) V_{j+1}(x) / ( \bar{F}_j(t)\bar{Z}(t) ) dx + R_j(t_j) \bar{F}_j(t_j) ( 1/(\bar{F}_j(t + Δ)\bar{Z}(t + Δ)) − 1/(\bar{F}_j(t)\bar{Z}(t)) ).

Let us examine the first two terms on the right hand side. Note

\int_{x=t}^{t_j} f_j(x) V_{j+1}(x) / ( \bar{F}_j(t)\bar{Z}(t) ) dx = \int_{x=0}^{t_j − t} W_{j+1}(x + t) ( f_j(x + t)/\bar{F}_j(x + t) ) ( \bar{F}_j(x + t)\bar{Z}(x + t) / (\bar{F}_j(t)\bar{Z}(t)) ) dx.

The product inside the integral is decreasing in t for a fixed x by Lemma 2 and our assumptions on F_j and Z. Hence

\int_{x=t+Δ}^{t_j} f_j(x) V_{j+1}(x) / ( \bar{F}_j(t + Δ)\bar{Z}(t + Δ) ) dx − \int_{x=t}^{t_j} f_j(x) V_{j+1}(x) / ( \bar{F}_j(t)\bar{Z}(t) ) dx ≤ − \int_{x=t_j−Δ}^{t_j} f_j(x) W_{j+1}(x) \bar{Z}(x) / ( \bar{F}_j(t)\bar{Z}(t) ) dx.

Now consider (W_j(t + Δ) − W_j(t))/Δ in the limit as Δ goes to 0 (from the right). Taking the limit gives

dW_j(t)/dt ≤ − f_j(t_j) W_{j+1}(t_j) \bar{Z}(t_j) / ( \bar{F}_j(t)\bar{Z}(t) ) + R_j(t_j) \bar{F}_j(t_j) ( f_j(t)\bar{Z}(t) + \bar{F}_j(t) z(t) ) / ( \bar{F}_j(t)\bar{Z}(t) )^2,

where we have replaced the limiting terms by the appropriate derivatives. Recall that at the infimum t_j we have, by continuity,

f_j(t_j) W_{j+1}(t_j) \bar{Z}(t_j) = r_j f_j(t_j) \bar{Z}(t_j) + r_j z(t_j) \bar{F}_j(t_j),

and R_j(t) = r_j \bar{Z}(t), so simplifying the above yields

dW_j(t)/dt ≤ ( r_j \bar{F}_j(t_j)\bar{Z}(t_j) / (\bar{F}_j(t)\bar{Z}(t)) ) ( f_j(t)/\bar{F}_j(t) + z(t)/\bar{Z}(t) − f_j(t_j)/\bar{F}_j(t_j) − z(t_j)/\bar{Z}(t_j) ).

Hence dW_j(t)/dt is non-positive if

f_j(t_j)/\bar{F}_j(t_j) + z(t_j)/\bar{Z}(t_j) − f_j(t)/\bar{F}_j(t) − z(t)/\bar{Z}(t) ≥ 0.

Since the failure rate of F_j · Z is increasing and t ≤ t_j, the above holds, and the induction goes through. □
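To see Theorem 2 concretely, the following small sketch (mine, not from the paper) picks a DFR return time and an IFR discount satisfying the hypotheses for the case of one outstanding source (j = n − 1) and scans candidate deadlines numerically; the distributions, rewards, and grid are illustrative assumptions.

```python
# Numerical illustration (not from the paper) of Theorem 2 with one source
# outstanding (j = n-1).  Assumed, made-up choices: a DFR return time with
# survival Fbar(t) = 1/(1+t), an IFR discount with survival Zbar(t) = exp(-t*t/2),
# and rewards r_{n-1} = 1, r_n = 2.  We evaluate the value v_0(u) of the policy
# "wait until deadline u, return at u if the last source has not answered" and
# confirm it has a single interior maximizer, the deadline t_j of the theorem.
import math

Fbar = lambda t: 1.0 / (1.0 + t)           # DFR: failure rate 1/(1+t) decreases
f    = lambda t: 1.0 / (1.0 + t) ** 2
Zbar = lambda t: math.exp(-t * t / 2.0)    # IFR: failure rate t increases
r_now, r_all = 1.0, 2.0

def v0(u, h=1e-3):
    """Expected reward from time 0 when committing to deadline u."""
    gain_if_arrival = sum(r_all * Zbar(x) * f(x) * h
                          for x in (i * h for i in range(int(u / h))))
    gain_if_timeout = Fbar(u) * r_now * Zbar(u)
    return gain_if_arrival + gain_if_timeout

deadlines = [0.05 * i for i in range(0, 61)]      # scan u in [0, 3]
best_u, best_v = max(((u, v0(u)) for u in deadlines), key=lambda p: p[1])
print(f"best deadline ~ {best_u:.2f} with expected reward {best_v:.4f}")
# The exact deadline solves (f/Fbar)*(r_all - r_now) = r_now * z/Zbar,
# i.e. 1/(1+u) = u for these choices, giving u = (sqrt(5)-1)/2 ~ 0.618.
```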

If the reward function has exponential decay, then Theorem 1 and Theorem 2 both hold for any number of responses if the return time distributions are exponential. We therefore obtain the following corollary:

COROLLARY 1. If Z(t) and F(t) are given by exponential distributions, then the optimal strategy for the aggregator is to always wait for k sources to respond for some k.

PROOF. In this case, Theorem 1 says that for each number of responding sources we return immediately or wait forever, while Theorem 2 says that for each number of responding sources we wait until some fixed time and return. Both can only hold if for j responding sources we either always return immediately or always wait for another source, independent of the time passed. Hence the aggregator always waits for a fixed number of sources to respond. □

This corollary can also be proved directly using the algebraic properties of the exponential distribution, but the above proof based on first principles is appealing.

Our second theorem for this case (inspired by [18]) places restrictions on the values r_k. In stopping time nomenclature, it is an example of an infinitesimal stopping rule [14], where the stopping rule is easily expressed in terms of the input parameters.

THEOREM 3. If F has decreasing failure rate and Z has increasing failure rate, and moreover r_{j+1}/r_j is decreasing in j for j ≥ k, then in the optimal plan there is a time t_j for each j ≥ k so that an aggregator with j responses should wait only until time t_j for another response. Here t_j is the first time such that

( f_j(t_j)/\bar{F}_j(t_j) ) ( r_{j+1}/r_j − 1 ) ≤ z(t_j)/\bar{Z}(t_j),

or t_j = ∞ if no such time exists.

PROOF. Again the proof follows a reverse induction. Recall

dv_t/du = ( f_j(u) \bar{Z}(u) [W_{j+1}(u) − r_j] − r_j \bar{F}_j(u) z(u) ) / \bar{F}_j(t).

This is positive whenever

( f_j(u)/\bar{F}_j(u) ) ( W_{j+1}(u)/r_j − 1 ) > z(u)/\bar{Z}(u).

Let t_j be the first time where

( f_j(t_j)/\bar{F}_j(t_j) ) ( r_{j+1}/r_j − 1 ) ≤ z(t_j)/\bar{Z}(t_j).

(If no such time exists, we take t_j = ∞.) We claim that the optimal plan is to wait until time t_j and then stop. Since W_{j+1}(u) ≥ r_{j+1}, F_j has decreasing failure rate, and Z has increasing failure rate, at all times u < t_j we have that dv_t/du is positive. Hence we should always wait until time t_j.

We now claim that t_{j+1} ≤ t_j. This holds trivially when j = n − 1, and holds for other values of j from the definition, as r_{j+1}/r_j is decreasing in j and f_j(x)/\bar{F}_j(x) ≥ f_{j+1}(x)/\bar{F}_{j+1}(x). Since t_{j+1} ≤ t_j, by a reverse induction W_{j+1}(t_j) = r_{j+1}, so dv_t/du is non-positive past t_j. This implies the aggregator need not wait past time t_j for an optimal schedule. □
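Because Theorem 3 expresses the stopping times directly in terms of the input parameters, they can be computed without any backward induction. The following sketch (mine, not from the paper) does this for assumed, illustrative choices: DFR return times with survival 1/(1+t), a discount whose hazard is z(t)/\bar{Z}(t) = t, and rewards r_j = j + 1 so that r_{j+1}/r_j is decreasing.

```python
# Sketch (not from the paper) of the explicit stopping rule in Theorem 3.
# Assumed, illustrative choices: n sources with DFR return times
# (survival Fbar(t) = 1/(1+t), so f_j/Fbar_j = (n-j)/(1+t)), an IFR discount
# with hazard z(t)/Zbar(t) = t, and rewards r_j = j+1 so that r_{j+1}/r_j
# is decreasing in j.  t_j is the first time the left side drops to the right.
n = 5

def hazard_next(j, t):          # f_j(t)/Fbar_j(t) for the assumed return times
    return (n - j) / (1.0 + t)

def discount_hazard(t):         # z(t)/Zbar(t) for the assumed discount
    return t

def r(j):
    return j + 1.0

def stopping_time(j, t_max=50.0, step=1e-4):
    """First time where hazard_next * (r_{j+1}/r_j - 1) <= discount hazard."""
    ratio = r(j + 1) / r(j) - 1.0
    t = 0.0
    while t <= t_max:
        if hazard_next(j, t) * ratio <= discount_hazard(t):
            return t
        t += step
    return float("inf")         # no such time up to the horizon

for j in range(n):
    print(f"with {j} responses, wait only until t_{j} ~ {stopping_time(j):.3f}")
```

Consistent with the proof, the computed deadlines are non-increasing in j for these choices.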

4. RELATED CASES

Up to this point, we have considered the case where all sources were indistinguishable, both in terms of the distribution of their response time and the value of their response. It is natural to ask if we can generalize to less stringent cases. We show below that Theorems 1 and 2 can be generalized naturally if either the distributions or the values corresponding to individual sources are different. In these cases, however, there may be a transition point for each possible subset of sources, so the size of the representation of the optimal plan may grow exponentially in the number of sources n. This may still be suitable in some situations, where the number of sources is small (in the tens) but there may be some variability in either response time or the value of answers. For convenience, we state our results as variations of Theorem 1 only.

To begin, consider the case where for a subset S of sources there is an associated value R_S(t) = r_S \bar{Z}(t). For example, each source e could have its own corresponding value r_e, and the value r_S would be the sum of the r_e for the set. Of course the r_S should be increasing, in the sense that if S ⊆ T then r_S ≤ r_T. All sources are governed by the same distribution for the response time.

THEOREM 4. Consider the setting where R_S(t) = r_S \bar{Z}(t), the r_S are increasing, F has increasing failure rate, Z has decreasing failure rate, and F_m · Z has decreasing failure rate. In the optimal plan for each set of respondents S with |S| ≥ m − 1 there is a single transition point. On receiving a response the aggregator should either return immediately or wait for the next response.

PROOF. Let V_S(t) be the expected payoff to the aggregator using the optimal strategy given that at time t all the sources in S have been heard from. We use a similar backward induction as in Theorem 1. Consider any set S of sources of size j < n. Let T be the collection of all sets of the form S ∪ {e} for some e ∉ S, and let V_T(t) = \sum_{T ∈ T} V_T(t)/|T|. Then, following Theorem 1, consider v_t(u) to be the value obtained by the aggregator if, starting at time t, it chooses the policy of waiting until time u ≥ t for some fixed set S. We have

v_t(u) = \int_{x=t}^{u} V_T(x) ( f_j(x)/\bar{F}_j(t) ) dx + ( \bar{F}_j(u)/\bar{F}_j(t) ) R_S(u).

The proof now follows Theorem 1, with V_T(x) and W_T(x) replacing V_{j+1}(x) and W_{j+1}(x) as appropriate. The point is that if each W_T(t) is increasing, so is their average. □

In the second case, the return value function R_j(t) = r_j \bar{Z}(t) depends only on the number of sources, but the return time distribution F_e for each source e may vary, under the constraint that all distributions are IFR. We let F_S be the distribution of the time until the first response among all sources excluding those in S, and similarly define f_S appropriately.

THEOREM 5. Consider the setting where each return distribution F_e has increasing failure rate, Z has decreasing failure rate, and F_S · Z has decreasing failure rate for some subset S of the sources. In the optimal plan for each set of respondents T satisfying S ⊆ T there is a single transition point. Moreover, on receiving a response the aggregator should either return immediately or wait for the next response.

PROOF. Following Theorem 1, consider v_t(u) to be the value obtained by the aggregator if, starting at time t, it chooses the policy of waiting until time u ≥ t for some fixed set Q of respondents with |Q| = j. We have

v_t(u) = \int_{x=t}^{u} V_{j+1}(x) ( f_Q(x)/\bar{F}_Q(t) ) dx + ( \bar{F}_Q(u)/\bar{F}_Q(t) ) R_j(u),

where V_{j+1}(x) denotes the expected payoff under the optimal strategy after the next response arrives at time x. There is now a backwards induction on supersets of S that goes through exactly as in Theorem 1. □

We do not have a proof for the case when both the response times and the values may vary across sources. The problem in this case is that as time changes, which source is most likely to return may change, and this may then affect the value of waiting for the next response. Hence it is not clear, if in fact both response times and response values vary, that the optimal plan continues to have such a nice form; a stronger condition may be required.

The above theorems may be more useful in settings where we have many sources but a small number of types of sources. For example, suppose we have a small set of extremely valuable sources and a larger set of less valuable sources, where all sources have the same response time distribution. If the value to the aggregator is just a function of the number of extremely valuable and less valuable sources that have been heard from, then we may use Theorem 4. In this case, the optimal plan will have one transition point associated with each pair (x, y), where x is the number of extremely valuable sources and y is the number of less valuable sources that have responded, and so the size of the optimal plan is no longer exponential in the number of sources; a small sketch of such a plan representation follows.
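The sketch below (my own illustration, not from the paper) shows one way such a compact plan could be stored and consulted; the threshold values are hypothetical placeholders, and in practice they would come from a numerical computation such as the backward induction sketched earlier.

```python
# Sketch (not from the paper) of the compact plan representation suggested by
# the two-type example above: one transition time per pair
# (valuable responders, less-valuable responders).  The thresholds here are
# placeholders for values produced by a numerical computation.
from typing import Dict, Tuple

Thresholds = Dict[Tuple[int, int], float]

def decision(x: int, y: int, t: float, thresholds: Thresholds) -> str:
    """Action with x valuable and y less-valuable responses at elapsed time t.

    In the Theorem 1 / Theorem 4 regime the rule is: wait for another response
    once t passes the threshold, otherwise return immediately.
    """
    t_xy = thresholds.get((x, y), 0.0)     # default threshold 0: always wait
    return "wait" if t >= t_xy else "return"

if __name__ == "__main__":
    # 2 valuable sources and 3 less valuable ones: only 3*4 = 12 table entries,
    # rather than one entry per subset of the 5 sources.
    example: Thresholds = {(x, y): 5.0 - 1.5 * x - 0.5 * y
                           for x in range(3) for y in range(4)}
    print(decision(1, 2, 2.0, example))    # threshold 2.5 here, so 'return'
    print(decision(2, 3, 1.0, example))    # threshold 0.5 here, so 'wait'
```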

5. A COUNTEREXAMPLE FOR A WEAKER DISTRIBUTION PROPERTY

For completeness, we provide a counterexample to show that some strong condition is necessary for our theorems to apply. In all of our theorems, the case where we have heard from n − 1 sources is the easy first step of our backward induction. In Theorem 1, for example, just the fact that F has increasing failure rate and Z has decreasing failure rate is sufficient to guarantee that there is only one transition point when n − 1 sources have responded. Our counterexample shows that a weaker condition on the response times than that of Theorem 1 fails to guarantee that optimal plans have only a single transition point even with n − 1 responses.

The mean residual life of X at time t is defined as m_X(t) = E[X − t | X > t]. We define m_X(t) to be 0 where \bar{F}(t) = 0. The random variable X is said to have decreasing mean residual life, or be DMRL, if m_X(t) is decreasing. It is easy to show that if X is IFR then it is DMRL, but the reverse need not hold. We here show that if the response time distributions for sources have decreasing mean residual life, the optimal plan may have more than one transition point for each number of sources heard from.

For our counterexample, we use the fact that the uniform distribution over the range [0, 2] ∪ [4, 12] is DMRL. Call this distribution D. We consider a simple problem with two sources with response time distribution D. The reward is governed by an exponential decay over time. With no responses, the reward is always 0. With one response, at time t the reward is e^{−t}. With two responses, at time t the reward is 10 e^{−t}. The idea of this counterexample is that by making the reward large for receiving two responses, we encourage the aggregator to wait, if it has received a single response, up to the gap where responses will not arrive. At the same time, the gap dissuades an aggregator from waiting if one response comes in early enough.

The following four facts are easy to verify using computation and a case analysis, as shown in the Appendix.

• At time 4 and higher, if the aggregator has one response, it should always wait for a second.

• At time 2, if the aggregator has one response, it should return rather than wait at least two time units for the next response.

• For times in the range [0.5, 2], if the aggregator receives a first response, the expectation for the aggregator is larger if it waits until time 2 for the second response and then returns at time 2 if it fails to arrive.

• For times in the range [0, 0.4], if the aggregator receives a first response, the expectation for the aggregator is larger if it returns immediately rather than waits until time 2 for a second response.

More specifically, there are in fact two transition points, one at about 0.406 and one at 2. Although the support of the response distribution is not all positive numbers in our example, we could easily modify the distribution to have this property, keeping the density in portions outside the intervals suitably small. Also, we could use distributions consisting of more disjoint intervals to make the optimal schedule even more complex.

6. CONCLUSION

We have introduced the natural problem of aggregation, which arises in Web search engines as well as other distributed systems. The aggregation paradigm appears quite general; we believe it will prove a foundational framework for several applications. Our focus in this work has been to determine appropriate probabilistic assumptions under which the optimal plan for the aggregator has a simple form.

Our work suggests many further questions for future study; we list some possibilities. It would be interesting to know how complex an optimal solution can be, or if there is a natural way to tie the complexity of the response distributions to the complexity of the optimal plan. Proving the existence of and designing algorithms for finding approximately optimal plans with low complexity in terms of the number of transition points would also be useful. Designing on-line systems that deal properly with widely heterogeneous sources, either exactly or with approximate algorithms, may be possible; the full plan would not need to be computed in advance, but proper responses might be calculated on the fly. Extending results to cases where response time distributions may be correlated in non-trivial ways might be important for practical applications.

7. REFERENCES

[1] M. Ajtai, N. Megiddo, and O. Waarts. Improved Algorithms and Analysis for Secretary Problems and Generalizations. SIAM Journal on Discrete Mathematics, 14(1):1-27, 2001.

[2] S. Albers and G. Schmidt. Scheduling with Unexpected Machine Breakdowns. In Lecture Notes in Computer Science 1671, pages 269-280, 1999.

[3] B. Awerbuch, Y. Azar, A. Fiat, and T. Leighton. Making Commitments in the Face of Uncertainty: How to Pick a Winner Almost Every Time. In Proceedings of the 28th Annual ACM Symposium on Theory of Computing, pages 519-530, 1996.

[4] D. Bertsekas and J. Tsitsiklis. Neuro-Dynamic Programming. Athena Scientific, Belmont, MA, 1996.

[5] J. A. Boyan and M. Mitzenmacher. Improved Results for Stochastic Transportation Networks. In Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 895-902, 2001.

[6] Y. S. Chow, H. Robbins, and D. Siegmund. Great Expectations: The Theory of Optimal Stopping. Houghton Mifflin Company, Boston, 1971.

[7] M. Datar and A. Ranade. Commuting with Delay Prone Buses. In Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms, pages 22-29, 2000.

[8] E. B. Dynkin. Markov Processes and Related Problems of Analysis. Cambridge University Press, Cambridge, 1981.

[9] T. Ferguson. A Poisson Fishing Model. In Festschrift for Lucien Le Cam, pages 234-244, Springer, 1997.

[10] B. Kalyanasundaram and K. R. Pruhs. Fault-Tolerant Scheduling. In Proceedings of the 26th Annual ACM Symposium on Theory of Computing, pages 115-124, 1994.

[11] B. Kalyanasundaram and K. R. Pruhs. Fault-Tolerant Real Time Scheduling. In Proceedings of the 1997 European Symposium on Algorithms, pages 296-307, 1997.

[12] K. C. Kapur and L. A. Lamberson. Reliability in Engineering Design. John Wiley and Sons, New York, 1977.

[13] M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley and Sons, Inc., New York, NY, 1994.

[14] S. Ross. Infinitesimal Look-Ahead Stopping Rules. The Annals of Mathematical Statistics, 42(1):297-303, 1971.

[15] S. Ross. Stochastic Processes. John Wiley and Sons, New York, 1983.

[16] M. Shaked and J. G. Shanthikumar. Stochastic Orders and Their Applications. Academic Press, San Diego, 1994.

[17] N. Starr and M. Woodroofe. Gone Fishin': Optimal Stopping Based on Catch Times. University of Michigan Technical Report, Department of Statistics, 1974.

[18] N. Starr, R. Wardrop, and M. Woodroofe. Estimating a Mean from Delayed Observations. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 35:103-116, 1976.

[19] D. Stoyan. Comparison Methods for Queues and Other Stochastic Models. John Wiley and Sons, Berlin, 1983.

Appendix: A Counterexample

For our counterexample, we use the fact that the uniform distribution over the range [0, 2] ∪ [4, 12] is DMRL. Call this distribution D. We consider a simple problem with two sources with response time distribution D. The reward is governed by an exponential decay over time. With no responses, the reward is always 0. With one response, at time t the reward is e^{−t}. With two responses, at time t the reward is 10 e^{−t}. The idea of this counterexample is that by making the reward large for receiving two responses, we encourage the aggregator to wait up to the gap where responses will not arrive if it has received a single response. The gap will dissuade an aggregator from waiting, however, if one response comes in early enough. There are two transition points, one at about 0.406 and one at 2. We simply show that there are at least two transition points.

First, we show that at time 4 and higher, if the aggregator has one response, it should always wait for a second. In the language of Theorem 1, for t ≥ 4 the aggregator should wait if

( f_j(t)/\bar{F}_j(t) ) [ W_{j+1}(t) − r_j ] > z(t)/\bar{Z}(t).

In this case f_j(t)/\bar{F}_j(t) ≥ 1/8, W_{j+1}(t) − r_j ≥ 9, and z(t)/\bar{Z}(t) = 1, so the left hand side is at least 9/8 > 1. Hence the aggregator should wait.

Second, at time 2, if the aggregator has one response, it should return rather than waiting at least two time units for the next response. The reward by returning is e^{−2}. If the aggregator waits, from the above paragraph it should wait until the second arrival. Comparing the rewards we find

e^{−2} ≥ \int_{t=4}^{12} (10/8) e^{−t} dt.

Third, consider the case where the aggregator has one response at a time u in the range [0, 2]. We compare rewards if the aggregator waits until time 2 and if it returns immediately. If it returns immediately, the reward is e^{−u}. If it waits, the reward is

( \int_{t=u}^{2} 10 e^{−t} dt ) / (10 − u) + ( 8/(10 − u) ) e^{−2}.

Graphing the two results numerically shows that it is better to wait for times in the range [0.407, 2] but to return for times in the range [0, 0.405].
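A short numerical check of the third comparison (my own addition, not in the paper) can replace the graphing step; it evaluates both expressions on a fine grid and locates the crossing point near 0.406.

```python
# Numerical check (not in the paper) of the counterexample: with one response
# in hand at time u in [0, 2], compare returning immediately (reward e^{-u})
# with waiting until time 2 (the expression derived above), and locate the
# crossing point near 0.406.
import math

def wait_until_2(u):
    # The second response, given it has not arrived by u, is uniform on
    # [u, 2] U [4, 12] with density 1/(10-u); returning at time 2 pays e^{-2}.
    arrival_part = 10.0 * (math.exp(-u) - math.exp(-2.0)) / (10.0 - u)
    timeout_part = 8.0 * math.exp(-2.0) / (10.0 - u)
    return arrival_part + timeout_part

def return_now(u):
    return math.exp(-u)

crossing = None
u = 0.0
while u <= 2.0:
    if crossing is None and wait_until_2(u) >= return_now(u):
        crossing = u
    u += 1e-4

print(f"waiting beats returning from about u = {crossing:.3f} onward")
print("at u = 2:", wait_until_2(2.0), "vs", return_now(2.0))   # both equal e^{-2}
```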