PATTERNS OF BUFFER OVERFLOW IN A CLASS OF QUEUES WITH LONG MEMORY IN THE INPUT STREAM

David Heath, Sidney Resnick and Gennady Samorodnitsky

Cornell University

Abstract. We study the time it takes until a fluid queue with a finite, but large, holding capacity reaches the overflow point. The queue is fed by an on/off process with a heavy tailed on distribution, which is known to have long memory. It turns out that the expected time until overflow, as a function of capacity $L$, increases only polynomially fast, and so overflows happen much more often than in the "classical" light tailed case, where the expected overflow time increases as an exponential function of $L$. Moreover, we show that in the heavy tailed case, overflows are basically caused by single huge jobs. An implication is that the usual $GI/G/1$ queue with finite but large holding capacity and heavy tailed service times will overflow about equally often no matter how much we increase the service rate. We also study the time until overflow for queues fed by a superposition of $k$ iid on/off processes with a heavy tailed on distribution, and show the benefit of pooling the system resources as far as time until overflow is concerned.
1. Introduction.
Traffic on data networks, e.g. Ethernet LANs, has characteristics substantially different from those of traditional voice traffic. An important feature of data traffic lies in its dependence structure; traditional models are based on assumptions of short-range dependence (like Poisson arrivals and exponential call lengths), while recent measurement and analysis of data traffic has produced strong indications of long-range dependence and self-similarity. Several empirical studies present statistical evidence for the existence of these non-standard dependence structures. See for example Leland, Taqqu, Willinger and Wilson (1993, 1994); Willinger, Taqqu, Leland and Wilson (1995); Crovella and Bestavros (1995); Cunha, Bestavros and Crovella (1995). Seeking an explanation for the observed long range dependence and self-similarity, Willinger, Taqqu, Sherman and Wilson (1995) have modeled traffic between a single source and destination as an on/off or packet train process. In their model, an idealized source alternates between an on state, in which it produces data at a constant rate, and an off state in which it produces no data. The durations of the on and off periods are independent; on times are identically distributed, and so are off times. The data they present indicates that both on and off times are reasonably well modeled by heavy tailed distributions, with the heaviness of the tail governed by a shape parameter $\alpha$. In one example, $\alpha = 1.7$ and $1.2$ respectively for the on and off periods. A similar conclusion was drawn by Crovella and Bestavros (1995), who in their study of world wide web use found evidence of heavy tails in such things as file lengths, transfer times, and operator idle periods. Other papers dealing with on/off and related models for communication systems are Brichet et al. (1996), Kella and Whitt (1992), Choudhury and Whitt (1995).

1991 Mathematics Subject Classification. 60K25, 90B15.

Key words and phrases. Long range dependence, heavy tails, on/off models, G/G/1 queue, fluid models, long memory, heavy tailed distribution, regular variation, time to hit a level, buffer overflow, maximum workload, weak convergence.

David Heath, Sidney Resnick and Gennady Samorodnitsky were partially supported by NSA Grant MDA904-95-H-1036 at Cornell University. Resnick and Samorodnitsky also received support from NSF Grant DMS-94-00535 at Cornell University. Samorodnitsky also acknowledges support from the United States-Israel Binational Science Foundation (BSF) Grant 92-00074.
Various paradigms for on/off models can be kept in mind. One is the storage or fluid queue model where the store is filling at rate 1 during an on period and the contents are subject to constant release at rate $r$ when the content level is positive. Another paradigm allows one to imagine work entering the system at rate 1 during on periods and a server working at rate $r$. We use either paradigm as is convenient. In a previous paper we studied the stationary distributions of the simple on/off models. In the present paper we study the behavior of the first time the contents process exceeds level $L$ for large levels $L$. Since this represents the time until "buffer overflow" in an on/off system with limited capacity, it is important in understanding the behavior of traffic networks. The simplest model, consisting of a single on/off source feeding a single server queue, is defined as follows. Let $\{X_i, i = 1, 2, \dots\}$ be a sequence of iid nonnegative random variables representing on periods, and similarly let $\{Y_i, i = 1, 2, \dots\}$ be iid nonnegative random variables representing off periods. The on and off sequences are independent. Let $F_{\rm on}$ be the common distribution of the $X_i$'s, and $F_{\rm off}$ be the common distribution of the $Y_i$'s. The workload arrives in the system at rate 1 during on periods (no workload arrives in the system during off periods). The service rate is $r$. That is, whenever the system is nonempty, workload is leaving the system at rate $r$. The state of the system at time $t$ (its content at time $t$, the workload in the system at time $t$) is denoted by $X(t)$, and can be formally defined as follows. For $t \ge 0$ let

$$Z(t) = \begin{cases} 1, & \text{if } \sum_{i=1}^{n-1}(X_i + Y_i) \le t < \sum_{i=1}^{n-1}(X_i + Y_i) + X_n \text{ for some } n \ge 1,\\ 0, & \text{otherwise.} \end{cases} \tag{1.1}$$

So $Z(t)$ is the indicator of the source "being on" at time $t$. Defining the service rate at state $x$ by

$$r(x) = \begin{cases} r, & \text{if } x > 0,\\ 0, & \text{if } x = 0, \end{cases}$$

the state process $\{X(t), t \ge 0\}$ is defined by

$$dX(t) = Z(t)\,dt - r(X(t))\,dt. \tag{1.2}$$
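The dynamics (1.1)-(1.2), observed at the renewal epochs $S_n$, are straightforward to simulate. The sketch below is a minimal illustration in Python (the helper name and the choice of exponential on/off periods are ours, not the paper's): the level rises by $(1-r)X_n$ over the $n$th on period and drains at rate $r$, never below zero, over the following off period.

```python
import random

def fluid_queue_levels(n_cycles, r, draw_on, draw_off, rng):
    """Sketch of the contents process X(t) of (1.2), observed at the
    renewal epochs S_n: the level rises at net rate 1-r over an on
    period and drains at rate r (but never below 0) over an off period."""
    level, levels = 0.0, [0.0]
    for _ in range(n_cycles):
        level += (1.0 - r) * draw_on(rng)            # net input over the on period
        level = max(level - r * draw_off(rng), 0.0)  # drainage over the off period
        levels.append(level)
    return levels

rng = random.Random(1)
# exponential on/off periods: mu_on = 1, mu_off = 2; stability needs r > 1/3
levels = fluid_queue_levels(
    20_000, r=0.5,
    draw_on=lambda g: g.expovariate(1.0),
    draw_off=lambda g: g.expovariate(0.5),
    rng=rng,
)
```

With a stable choice of $r$ the simulated levels keep returning to 0, marking the regeneration times used later in Section 2.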
The analogous $GI/G/1$ queue can be thought of as model (1.2) with on periods shrunk to zero and workload arriving in the system in lumps of size $\{B_i, i \ge 1\}$. In this context, $\{Y_i, i \ge 1\}$ can be thought of as interarrival times. One can take, for example, $B_i = (1-r)X_i$, as this is the net increase in the state of the system (1.2) after the $i$th on period. However, the discussion below does not depend on this particular form of the offered work. The service rate is still $r$, so that the actual service time of the $i$th customer is $B_i/r$, $i \ge 1$. It turns out that the system overflow patterns in the on/off model, when the on times are heavy tailed, are very similar to those of the $GI/G/1$ queue when the amount of work $B_i$ is heavy tailed. The computations describing the structure of system overflows in both cases are similar, and they are easier for the $GI/G/1$ queue. We will present the detailed arguments only for the more involved case of the fluid model (1.2). To emphasize the dramatic effect of heavy tailed distributions on system behavior, we contrast the heavy tailed case with the simplest, classical case in Section 3. Consider the fluid queue with $F_{\rm on}$ being exponential with mean $\mu_{\rm on}$ and $F_{\rm off}$ being exponential with mean $\mu_{\rm off}$, such that

$$\frac{\mu_{\rm on}}{\mu_{\rm on} + \mu_{\rm off}} < r. \tag{1.3}$$

Clearly, (1.3) is the necessary and sufficient condition for the system (1.2) to be stable. If

$$\tau(L) = \inf\{t \ge 0 : X(t) \ge L\} \tag{1.4}$$
is the time until the system overflows, then based on martingale and Markov methods, we find in Section 3 that

$$E\tau(L) = a\left[\exp\left\{L\left(\frac{1}{\mu_{\rm on}(1-r)} - \frac{1}{\mu_{\rm off}\,r}\right)\right\} - 1\right] - bL, \qquad L \ge 0, \tag{1.5}$$

where

$$a = \frac{(r\mu_{\rm off})^2(\mu_{\rm on} + \mu_{\rm off})}{(r\mu_{\rm off} - (1-r)\mu_{\rm on})^2} \quad\text{and}\quad b = \frac{\mu_{\rm on} + \mu_{\rm off}}{r\mu_{\rm off} - (1-r)\mu_{\rm on}}.$$

Several conclusions are immediate from (1.5). First of all, in the case of exponentially distributed on and off periods, the expected time till the system overflows increases exponentially fast with the system holding capacity. Secondly, the average off time, $\mu_{\rm off}$, critically affects the average time until the system overflows, since it affects the multiplicative constant as well as the growth rate. Quite different conclusions are reached in the heavy tailed case. In Section 2 we show that in the case of on times having a heavy tailed distribution, the expected time to exceed $L$ is asymptotically the same as the expected time until a single on period would cause the contents to exceed $L$, assuming the contents were empty at the start of the on time. This is very different from the exponential result (1.5), for in that case the expected time until a single on period causes the system to exceed the capacity $L$ is asymptotic to

$$(\mu_{\rm on} + \mu_{\rm off})\exp\left\{\frac{L}{(1-r)\mu_{\rm on}}\right\},$$

which is of a larger order of magnitude than (1.5). In the case of heavy tailed on periods, the expected time until the system exceeds a level grows much more slowly than the exponential rate of increase seen in (1.5). Furthermore, the fact that in the heavy tailed case the system overflow is caused by a single long on period implies that the mean off time $\mu_{\rm off}$ affects the expected time until overflow only through a multiplicative factor, but does not otherwise influence the growth rate. A similar conclusion is valid for the $GI/G/1$ queue with heavy tailed amounts of work $\{B_i, i \ge 1\}$. In this case the offered workload exceeds the system capacity $L$ when a single customer brings an amount of work reaching $L$. In particular, the mean interarrival time affects the time until overflow only as a multiplicative factor, and the growth rate does not depend on the service rate $r$ (!). This provides intuition about the "failure modes" of such a system.
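To make the contrast concrete, one can evaluate (1.5) next to the heavy tailed growth rate discussed above. The sketch below uses hypothetical function names, and the Pareto tail $\bar F_{\rm on}(x) = (1+x)^{-\alpha}$ in the second function is just an illustrative choice of heavy tailed on distribution.

```python
import math

def mean_overflow_time_exponential(L, mu_on, mu_off, r):
    """Exact expected overflow time (1.5) for exponential on/off periods."""
    D = r * mu_off - (1.0 - r) * mu_on        # = -E xi_1 > 0 under stability (1.3)
    gamma = 1.0 / ((1.0 - r) * mu_on) - 1.0 / (r * mu_off)
    a = (r * mu_off) ** 2 * (mu_on + mu_off) / D ** 2
    b = (mu_on + mu_off) / D
    return a * (math.exp(gamma * L) - 1.0) - b * L

def mean_overflow_time_heavy(L, mu, r, alpha):
    """Heavy tailed asymptote: E tau(L) ~ mu * (1-r)^(-alpha) / Fbar_on(L),
    shown here for an illustrative Pareto tail Fbar_on(x) = (1+x)^(-alpha)."""
    return mu * (1.0 - r) ** (-alpha) * (1.0 + L) ** alpha

mu_on, mu_off, r = 1.0, 2.0, 0.5   # stable: mu_on/(mu_on+mu_off) = 1/3 < r
exp_time = mean_overflow_time_exponential(10.0, mu_on, mu_off, r)
heavy_time = mean_overflow_time_heavy(10.0, mu_on + mu_off, r, alpha=1.5)
```

Already at moderate capacities the exponential-case expectation dwarfs the polynomially growing heavy tailed one: frequent overflow is the price of heavy tails.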
Precise arguments showing the unusual behavior in the heavy tailed case are presented in the next section, where we study the maximum of the fluid queue (1.2) over a single "wet period", and use the findings to obtain functional limit theorems for the maximum process of the queue (1.2) and for the hitting time process of the same queue. Section 3 contrasts in detail the behavior in cases where the on and off distributions have exponentially bounded tails with the heavy tailed case. Tangentially relevant papers on extremes of queues (which typically emphasize Markovian methods and exponential tails) are Iglehart (1972), Asmussen and Perry (1992), Berger and Whitt (1995); Abate, Choudhury and Whitt (1994). In Section 4 we study the behavior of models with several on/off sources and a single server. We show again that in the case $r < 1$ the asymptotic behavior of the time at which the contents process exceeds $L$ is the same as that of the first time that any of the input processes has an on period long enough to achieve level $L$ from an empty initial content level. We then compare the behavior of a system of completely separate on/off processes with one in which the inputs are pooled and in which the capacity of the system is the sum of the capacities of the separate systems. Our conclusions quantify the benefits of pooling the system resources. Other papers on multisource models, usually emphasizing Markovian environments, are Anick, Mitra and Sondhi (1982), Prabhu and Pacheco (1995), Pacheco and Prabhu (1996).
2. Level crossing times in single input models.
In this section we consider the extreme values of the contents process specified in (1.2) and the time for the content to cross a level. The fluid or storage model is generated by an alternating renewal process which feeds a reservoir. We represent the renewal sequence as $\{S_n, n \ge 0\}$ with $S_n = \sum_{i=1}^n (X_i + Y_i)$, $n \ge 1$, and for convenience we suppose $S_0 = 0$. Both $F_{\rm on}$ and $F_{\rm off}$ have finite means $\mu_{\rm on}$ and $\mu_{\rm off}$ and we set $\mu = \mu_{\rm on} + \mu_{\rm off}$. During an on period, liquid enters at net rate $1 - r$ and during an off period liquid is released at uniform rate $r$. We ensure that neither the input rate nor the output rate overwhelms the other by assuming

$$1 > r > \frac{\mu_{\rm on}}{\mu}. \tag{2.1}$$

Define $S_n^{(X)} = \sum_{i=1}^n X_i$ and $S_n^{(Y)} = \sum_{i=1}^n Y_i$ and the stopping time

$$N = \inf\{n > 0 : (1-r)S_n^{(X)} - rS_n^{(Y)} \le 0\} \tag{2.2}$$
so that

$$[N = n] = [(1-r)S_j^{(X)} - rS_j^{(Y)} > 0,\ j = 1, \dots, n-1;\ (1-r)S_n^{(X)} - rS_n^{(Y)} \le 0] \in \mathcal{B}(X_i, Y_i,\ i = 1, \dots, n).$$

Consider $\{X(S_n), n \ge 0\}$. Comparing $X(S_n)$ with $X(S_{n+1})$ we get

$$X(S_{n+1}) = \left(X(S_n) + (1-r)X_{n+1} - rY_{n+1}\right)^+ = \left(X(S_n) + \xi_{n+1}\right)^+, \tag{2.3}$$

where $\{\xi_{n+1} = (1-r)X_{n+1} - rY_{n+1}\}$ is iid. This equation expresses that the change of contents over a renewal interval is the input during the on period less the loss during the off period. Of course (2.3) is Lindley's equation (Resnick, 1992, page 270; Asmussen, 1987; Feller, 1971) and since (2.1) implies

$$E\xi_1 = (1-r)\mu_{\rm on} - r\mu_{\rm off} = \mu_{\rm on} - r\mu < 0,$$

we know from standard theory that the process $\{W_n\} := \{X(S_n)\}$ will be stable and $EN < \infty$. As is customary, we call $\{W_n\}$ the queuing process. We suppose that

$$1 - F_{\rm on}(x) = x^{-\alpha}L(x), \qquad \alpha > 1,\ x \to \infty, \tag{2.4}$$

where $L$ is a slowly varying function. Note that the process $\{X(t), t \ge 0\}$ is regenerative (cf. Resnick, 1992; Feller, 1971; Asmussen, 1987). One set of regeneration times is

$$\{C_n\} := \{S_n : X(S_n-) = 0\},$$

which are the times when a dry period ends and input commences to fill the store. In order to understand the behavior of the extremes of $\{X(t)\}$, it is natural to study the extremes over a cycle. For this purpose, it is necessary to understand the tail behavior of the distribution of the maximum of the queuing process over one cycle. A result about this is stated next.
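The ladder epoch $N$ and its mean are easy to estimate by direct simulation of the random walk with steps $\xi_i$. The sketch below (helper names ours; exponential periods chosen so that the answer is known in closed form) checks the Wald-type value $EN = r\mu_{\rm off}/(-E\xi_1)$ derived in the exponential example of Section 3, which for these parameters equals 2.

```python
import random

def ladder_epoch(r, draw_on, draw_off, rng):
    """First downgoing ladder epoch N = inf{n > 0 : sum_i xi_i <= 0} of the
    random walk with steps xi_i = (1-r)*X_i - r*Y_i, as in (2.2)."""
    s, n = 0.0, 0
    while True:
        n += 1
        s += (1.0 - r) * draw_on(rng) - r * draw_off(rng)
        if s <= 0.0:
            return n

rng = random.Random(7)
r = 0.5   # with mu_on = 1, mu_off = 2: E xi_1 = -0.5 < 0, so N is finite a.s.
samples = [
    ladder_epoch(r, lambda g: g.expovariate(1.0), lambda g: g.expovariate(0.5), rng)
    for _ in range(20_000)
]
mean_N = sum(samples) / len(samples)   # should be close to r*mu_off/(-E xi_1) = 2
```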
Proposition 2.1. For the stable queuing process $\{W_n\}$ satisfying (2.1) and (2.4), the maximum over a cycle has a distribution tail asymptotic to the tail of the on distribution; that is, as $x \to \infty$,

$$P\left[\bigvee_{n=0}^N W_n > x\right] \sim P[\xi_1 > x]\,E(N) \sim P[(1-r)X_1 > x]\,E(N) \sim (1-r)^\alpha\,\bar F_{\rm on}(x)\,E(N). \tag{2.5}$$
The proof of this critical result is deferred to the end of this section. Note the result depends on $F_{\rm on}$ and $r$, but $F_{\rm off}$ only affects the answer through the multiplicative factor $E(N)$. We now look at the extremes of $\{X(t)\}$ over a cycle and examine the distribution tail of $\bigvee_{0 \le s \le C_1} X(s)$, where $C_1 = S_N$. Note that $N$ is the first downgoing ladder epoch of the random walk

$$\left\{\sum_{i=1}^n \xi_i,\ n \ge 0\right\} = \left\{(1-r)S_n^{(X)} - rS_n^{(Y)},\ n \ge 0\right\}$$

associated to the queuing process $\{W_n\}$, and that it is not the downgoing ladder epoch of $\{S_n\}$ which is determining the time scale.

Corollary 2.2. Assume the contents process $\{X(t)\}$ satisfies (2.1) and (2.4). The distribution tail of the maximum of the contents process over one cycle is asymptotic to the tail of the on distribution; that is, as $x \to \infty$,

$$P\left[\bigvee_{s=0}^{C_1} X(s) > x\right] \sim (1-r)^\alpha\,\bar F_{\rm on}(x)\,E(N). \tag{2.6}$$
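Both asymptotics lend themselves to a rough Monte Carlo check. The sketch below (helper names ours) simulates regeneration cycles of the embedded chain $W_n$ with a shifted-Pareto on distribution and compares the empirical tail of the cycle maximum at one value of $x$ with the asymptote $(1-r)^\alpha\bar F_{\rm on}(x)E(N)$; at moderate $x$ the two are only expected to agree in order of magnitude.

```python
import random

def cycle_max_and_length(r, alpha, off_rate, rng):
    """One regeneration cycle of the embedded chain W_n: returns the cycle
    maximum and the cycle length N.  On periods are shifted-Pareto with
    Fbar_on(x) = (1+x)^(-alpha); off periods are exponential."""
    w, m, n = 0.0, 0.0, 0
    while True:
        x = rng.paretovariate(alpha) - 1.0   # heavy tailed on period
        y = rng.expovariate(off_rate)        # exponential off period
        w = max(w + (1.0 - r) * x - r * y, 0.0)
        m, n = max(m, w), n + 1
        if w == 0.0:
            return m, n

rng = random.Random(3)
alpha, r, off_rate = 1.5, 0.8, 1.0   # mu_on = 2, mu_off = 1, E xi_1 = -0.4 < 0
cycles = [cycle_max_and_length(r, alpha, off_rate, rng) for _ in range(20_000)]
E_N = sum(n for _, n in cycles) / len(cycles)
x = 25.0
empirical = sum(m > x for m, _ in cycles) / len(cycles)
asymptote = (1.0 - r) ** alpha * (1.0 + x) ** (-alpha) * E_N
```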
Note again that $F_{\rm off}$ only affects the answer through the multiplicative factor $E(N)$.

Proof. Set $M_1 = \bigvee_{s=0}^{C_1} X(s)$. Because of the sawtooth character of the paths of $X(\cdot)$ we have that

$$M_1 = \bigvee_{j=1}^N \left[(1-r)S_j^{(X)} - rS_{j-1}^{(Y)}\right]$$

and therefore

$$M_1 \ge \bigvee_{j=0}^N \left[(1-r)S_j^{(X)} - rS_j^{(Y)}\right] \stackrel{d}{=} \bigvee_{n=0}^N W_n.$$

Thus

$$\liminf_{x \to \infty} \frac{P[M_1 > x]}{\bar F_{\rm on}(x)} \ge \liminf_{x \to \infty} \frac{P\left[\bigvee_{n=0}^N W_n > x\right]}{\bar F_{\rm on}(x)} = (1-r)^\alpha E(N),$$

where the last step uses Proposition 2.1.
To get a reverse inequality, choose $K$ such that

$$E\left((1-r)X_1 - r(Y_1 \wedge K)\right) < 0,$$

which can always be done since $E(Y_1 \wedge K) \uparrow EY_1$ as $K \uparrow \infty$. Then

$$S_j^{(Y,K)} := \sum_{i=1}^j (Y_i \wedge K) \le S_j^{(Y)},$$

which obviously gives

$$N \le N^{(K)} := \inf\{n > 0 : (1-r)S_n^{(X)} - rS_n^{(Y,K)} \le 0\}.$$

Also

$$M_1 = \bigvee_{j=1}^N \left[(1-r)S_j^{(X)} - rS_{j-1}^{(Y)}\right] \le \bigvee_{j=0}^{N^{(K)}} \left[(1-r)S_j^{(X)} - rS_{j-1}^{(Y,K)}\right],$$

and thus we have

$$M_1 \le \bigvee_{j=0}^{N^{(K)}} \left[(1-r)S_j^{(X)} - rS_j^{(Y,K)} + r(Y_j \wedge K)\right] \le \bigvee_{j=0}^{N^{(K)}} \left[(1-r)S_j^{(X)} - rS_j^{(Y,K)}\right] + rK.$$

We therefore have

$$\limsup_{x \to \infty} \frac{P[M_1 > x]}{\bar F_{\rm on}(x)} \le \limsup_{x \to \infty} \frac{P\left[\bigvee_{j=0}^{N^{(K)}} \left((1-r)S_j^{(X)} - rS_j^{(Y,K)}\right) > x - rK\right]}{\bar F_{\rm on}(x)},$$

and applying Proposition 2.1 to the random walk $\{(1-r)S_j^{(X)} - rS_j^{(Y,K)},\ j \ge 0\}$ we get this equal to $(1-r)^\alpha E(N^{(K)}) \to (1-r)^\alpha E(N)$ as $K \to \infty$. This provides the reverse inequality and completes the proof.

We are now in a position to discuss the behavior of the extremes of the contents process and also the behavior of the first passage time over a level. For a non-decreasing function $U : (0,\infty) \mapsto (0,\infty)$ define the (left continuous) inverse

$$U^{\leftarrow}(x) = \inf\{s > 0 : U(s) \ge x\}, \qquad x > 0.$$

Define the non-decreasing process

$$M(t) = \bigvee_{s=0}^t X(s)$$
and the first passage time ($L > 0$)

$$\tau(L) = \inf\{s > 0 : X(s) \ge L\} = \inf\{s > 0 : M(s) \ge L\} = M^{\leftarrow}(L).$$

Standard inversion techniques from extreme value theory (Resnick, 1987, Section 4.4) allow for the simultaneous treatment of the weak convergence properties of $(M(\cdot), M^{\leftarrow}(\cdot))$ as random elements of $D_r(0,\infty) \times D_l(0,\infty)$, where $D_r(0,\infty)$ is the space of right continuous functions with finite left limits and $D_l(0,\infty)$ is the space of left continuous functions on $(0,\infty)$ with finite right limits. Each space is equipped with the $M_1$ topology.

Theorem 2.3. Assume the contents process $\{X(t)\}$ satisfies (2.1) and (2.4). Define the quantile function

$$b(s) = \left(\frac{1}{1 - F_{\rm on}}\right)^{\leftarrow}(s).$$

Let $\{Y_\alpha(t), t > 0\}$ be the extremal process (Resnick, 1987, Section 4.3) generated by the extreme value distribution

$$\Phi_\alpha(x) = \exp\{-x^{-\alpha}\}, \qquad x > 0,$$

so that

$$P[Y_\alpha(t) \le x] = \Phi_\alpha^t(x).$$

Define

$$S(t) = \frac{1-r}{\mu^{1/\alpha}}\,Y_\alpha(t).$$

Then in $D_r(0,\infty) \times D_l(0,\infty)$, as $u \to \infty$,

$$\left(\frac{M(u\,\cdot)}{b(u)},\ \frac{M^{\leftarrow}(b(u)\,\cdot)}{u}\right) \Rightarrow (S, S^{\leftarrow}).$$

In particular we get for the first passage process, as $u \to \infty$,

$$(1 - F_{\rm on}(u))\,\tau(u) \Rightarrow Y_\alpha^{\leftarrow}\left(\frac{\mu^{1/\alpha}}{1-r}\right)$$

and

$$\lim_{L \to \infty} P\left[\frac{(1-r)^\alpha}{\mu}\,(1 - F_{\rm on}(L))\,\tau(L) \le x\right] = P[E(1) \le x] = 1 - e^{-x}, \qquad x > 0,$$

where $E(1)$ is a unit exponential random variable. Furthermore, as $L \to \infty$,

$$(1 - F_{\rm on}(L))\,E(\tau(L)) \to \mu(1-r)^{-\alpha}.$$
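The identity $\tau(L) = M^{\leftarrow}(L)$ is concrete enough to exercise on a discretely sampled path; the helper below (ours, for illustration) computes the left continuous inverse of a nondecreasing record and recovers the first passage time of a small sawtooth path from its running maximum.

```python
def left_inverse(times, values, x):
    """U^{<-}(x) = inf{s : U(s) >= x} for a nondecreasing record of
    (time, value) samples; returns inf if the level is never reached."""
    for t, v in zip(times, values):
        if v >= x:
            return t
    return float("inf")

# a sawtooth sample path X and its running maximum M
times = [0, 1, 2, 3, 4]
path = [0.0, 0.4, 0.2, 0.9, 0.1]
running_max = []
m = float("-inf")
for v in path:
    m = max(m, v)
    running_max.append(m)

tau = left_inverse(times, running_max, 0.5)                 # first passage via M
first_hit = min(t for t, v in zip(times, path) if v >= 0.5)  # direct first passage
```

Both computations give the same epoch, which is exactly the content of $\tau(L) = \inf\{s : M(s) \ge L\} = M^{\leftarrow}(L)$.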
Proof. We let $\{N_k, k \ge 1\}$ be the iterates of $N$, so that $N_k$ is the $k$th downgoing ladder epoch of the random walk $\{(1-r)S_n^{(X)} - rS_n^{(Y)},\ n \ge 0\}$. Then by the strong law of large numbers $N_k/k \to E(N)$ as $k \to \infty$. We write

$$M(S_{N_k}) = \bigvee_{s=0}^{S_{N_k}} X(s) = \bigvee_{i=1}^k \left(\bigvee_{s=S_{N_{i-1}}}^{S_{N_i}} X(s)\right) := \bigvee_{i=1}^k M_i,$$

so that $\{M_i, i \ge 1\}$ is iid. From Corollary 2.2, as $x \to \infty$,

$$P[M_1 > x] \sim (1-r)^\alpha\,\bar F_{\rm on}(x)\,E(N),$$

so that as $u \to \infty$,

$$uP[M_1 > b(u)x] \sim (1-r)^\alpha E(N)\,u\bar F_{\rm on}(b(u)x) \to (1-r)^\alpha E(N)\,x^{-\alpha}.$$

Therefore,

$$\frac{\bigvee_{i=1}^{[ut]} M_i}{b(u)} = \frac{M(S_{N_{[ut]}})}{b(u)} \Rightarrow (1-r)(EN)^{1/\alpha}\,Y_\alpha(t). \tag{2.7}$$

Observe that as $u \to \infty$,

$$\frac{S_{N_{[ut]}}}{u} \to \mu E(N)\,t$$

in $C(0,\infty)$. For the renewal sequence $\{S_{N_k}, k \ge 0\}$, let

$$\Gamma(t) = \inf\{k : S_{N_k} \ge t\} \tag{2.8}$$

be the associated counting function, so that as $u \to \infty$,

$$\frac{\Gamma(ut)}{u} \to \frac{t}{ES_{N_1}} = \frac{t}{\mu E(N)} \tag{2.9}$$

in $C(0,\infty)$. Note the inequalities

$$\frac{M(S_{N_{\Gamma(ut)-1}})}{b(u)} \le \frac{M(ut)}{b(u)} \le \frac{M(S_{N_{\Gamma(ut)}})}{b(u)}.$$

Now from Billingsley, 1968, Theorem 4.4 and composition,

$$\frac{M(S_{N_{\Gamma(ut)}})}{b(u)} = \frac{M(S_{N_{[u\,\Gamma(ut)/u]}})}{b(u)} \Rightarrow (1-r)(EN)^{1/\alpha}\,Y_\alpha(t/(\mu EN)) \stackrel{d}{=} \frac{1-r}{\mu^{1/\alpha}}\,Y_\alpha(t) =: S(t)$$

in the $J_1$ topology, and we hope the same result is true in the $M_1$ topology for the family of processes $M(u\,\cdot)/b(u)$ as $u \to \infty$. In order to verify this, we need to show

$$\frac{M(S_{N_{\Gamma(ut)}}) - M(S_{N_{\Gamma(ut)-1}})}{b(u)} \Rightarrow 0 \tag{2.10}$$

in the $M_1$ topology. For a fixed $t$ we get, for any $\delta > 0$, $\epsilon > 0$ and large $u$, that
" M (S ) ? M (S P N( ) b(u) N( ut
)?1 )
ut
# " M (S ) ? M (S # N( ( ? )) ) > P N( ) > b(u)
! P [jS(t) ? S (t ? )j > ]
ut
u t
which goes to 0 as $\epsilon \to 0$ by the stochastic continuity of $S$ (Resnick, 1987, Proposition 4.7). The multivariate analogue needed to prove $M_1$ convergence is similar.

[Figure 2.1. Pareto on/off periods, $\alpha = 1.5$, $r = 0.53$: log(mean hitting time) against log(level), simulation versus approximation, random off times.]

The weak convergence result for $\tau(\cdot)$ is obtained by taking inverses in the process convergence. Inversion is a continuous operation in the $M_1$ topology. We note that inverses of extremal processes have exponential marginals (Resnick, 1987), so as $u \to \infty$,

$$\frac{M^{\leftarrow}(b(u)x)}{u} \Rightarrow S^{\leftarrow}(x),$$

and changing variables $s \mapsto b(u)$ yields

$$(1 - F_{\rm on}(s))\,M^{\leftarrow}(sx) \Rightarrow S^{\leftarrow}(x).$$

Observe for $y > 0$

$$P[S^{\leftarrow}(1) \le y] = P[S(y) \ge 1] = P\left[Y_\alpha(y) \ge \frac{\mu^{1/\alpha}}{1-r}\right] = 1 - \exp\{-y(1-r)^\alpha\mu^{-1}\}.$$

Finally we consider the result for the expected values. On the one hand, by Fatou's lemma, we get

$$1 \le \liminf_{L \to \infty} \frac{(1-r)^\alpha}{\mu}\,(1 - F_{\rm on}(L))\,E\tau(L).$$

For a reverse inequality, note that $\tau(L) \le S_\sigma$, where

$$\sigma := \inf\{n : X_n > L/(1-r)\},$$
so that

$$E\tau(L) \le E(X_1 + Y_1)\,E\sigma = \mu E\sigma.$$

However,

$$E\sigma = \sum_{n=0}^\infty P[\sigma > n] = \sum_{n=0}^\infty P\left[\bigvee_{i=1}^n X_i \le L/(1-r)\right] = \frac{1}{1 - F_{\rm on}(L/(1-r))} \sim \frac{(1-r)^{-\alpha}}{\bar F_{\rm on}(L)},$$

and this completes the proof.

To illustrate these results we present two modest simulations. For each simulation we supposed $F_{\rm on}$ was Pareto with $\alpha = 1.5$ and $r = 0.53$. For the first simulation, $F_{\rm off}$ was the same Pareto, and for the second simulation $F_{\rm off}$ corresponded to constant off times with value 3. We used 500 replications to compute expected hitting times of various levels by simulation and compared these with the approximate mean hitting time given by Theorem 2.3. The levels used for both experiments were 2, 5, 10, 22, 46, 100, 215, 464. The plots use a log scale for both axes. Note that the dotted line appears closer to the solid one when the off time is deterministic, which may indicate a faster rate of convergence of the approximation compared to the situation where the approximation has to cope with randomness in the off time. However, no systematic investigation of the rate of convergence has been completed.
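For the first experiment, the approximate mean hitting times of Theorem 2.3 can be tabulated directly; the sketch below assumes the standard Pareto parametrization $\bar F_{\rm on}(x) = x^{-\alpha}$, $x \ge 1$ (the text does not record which Pareto form was used) and verifies that the approximation is exactly linear with slope $\alpha$ on log-log axes.

```python
import math

alpha, r = 1.5, 0.53
mu_on = alpha / (alpha - 1.0)  # standard Pareto on [1, inf): an assumed parametrization
mu_off = mu_on                 # first experiment: off times with the same Pareto law
mu = mu_on + mu_off            # stability: mu_on/mu = 0.5 < r = 0.53
levels = [2, 5, 10, 22, 46, 100, 215, 464]

def approx_mean_hitting_time(L):
    """Theorem 2.3 approximation E tau(L) ~ mu * (1-r)^(-alpha) / Fbar_on(L),
    with Fbar_on(L) = L^(-alpha) for the assumed Pareto on distribution."""
    return mu * (1.0 - r) ** (-alpha) * L ** alpha

approx = [approx_mean_hitting_time(L) for L in levels]
# on log-log axes these points lie on a line of slope alpha: polynomial growth,
# in contrast with the exponential growth (1.5) of the light tailed case
slopes = [(math.log(a2) - math.log(a1)) / (math.log(l2) - math.log(l1))
          for (l1, a1), (l2, a2) in zip(zip(levels, approx),
                                        zip(levels[1:], approx[1:]))]
```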
[Figure 2.2. Pareto on period, deterministic off period, $\alpha = 1.5$, $r = 0.53$: log(mean hitting time) against log(level), simulation versus approximation.]

We also sought experimental evidence to confirm the intuition that in the heavy tailed case the process exceeds a level $L$ because of a very long on period. As an additional experiment, we simulated 1000 runs of the process with $\alpha = 1.5$, the off distribution concentrated at 3, and $r = 0.53$. We waited until the process crossed $L = 64$ and then measured the length of the last on period $X_{\rm on}$, multiplying by $(1-r)$. We compiled 1000 realizations of

$$\frac{(1-r)X_{\rm on}}{L} \wedge 3,$$

the truncation by 3 being for the purpose of keeping the data in a comfortable range. The range of the 1000 realizations was $[0.896, 3]$ and 848 observations were at least as large as 1, meaning that in about 85%
of the simulation runs, the process crossed $L$ due to a single large on period pushing the process across. A histogram of the data follows, showing the preponderance of observations to the right of 1.

[Figure 2.3. Histogram of the ratios $\frac{(1-r)X_{\rm on}}{L} \wedge 3$ for Pareto on period, deterministic off period, $\alpha = 1.5$, $r = 0.53$.]

It remains to prove Proposition 2.1. We restate it in the following form.
Proposition 2.1$'$. Suppose $\{\xi_n, n \ge 1\}$ are iid with $E\xi_1 < 0$ and, for $x > 0$,

$$P[\xi_1 > x] =: \bar F(x) \sim x^{-\alpha}L(x), \qquad \alpha > 1,$$

where $L$ is a slowly varying function. Let

$$\{S_n^{(\xi)},\ n \ge 0\} = \left\{0,\ \sum_{i=1}^n \xi_i,\ n \ge 1\right\}$$

be the random walk with negative drift and step distribution $F$. Suppose

$$N = \inf\{n > 0 : S_n^{(\xi)} \le 0\}$$

is the downgoing ladder epoch of $\{S_n^{(\xi)}\}$. Then as $x \to \infty$,

$$P\left[\bigvee_{n=0}^N S_n^{(\xi)} > x\right] \sim E(N)\,P[\xi_1 > x] = E(N)\,\bar F(x).$$

Proof. For $x > 0$ let

$$N(x) = \inf\{n > 0 : S_n^{(\xi)} > x\}.$$
Then

$$P\left[\bigvee_{n=0}^N S_n^{(\xi)} > x\right] = P\left[\bigvee_{n=0}^N S_n^{(\xi)} > x,\ N(x) < N\right] = \sum_{n=1}^\infty P\left[S_i^{(\xi)} \in [0,x],\ i = 1, \dots, n-1;\ S_n^{(\xi)} > x\right]$$

$$\ge \sum_{n=1}^\infty P\left[S_i^{(\xi)} \in [0,x],\ i = 1, \dots, n-1;\ \xi_n > x\right] = \bar F(x)\sum_{n=1}^\infty P\left[S_i^{(\xi)} \in [0,x],\ i = 1, \dots, n-1\right] = \bar F(x)\,E(N \wedge N(x)) \sim \bar F(x)\,E(N)$$

as $x \to \infty$. Thus we get

$$\liminf_{x \to \infty} \frac{P\left[\bigvee_{n=0}^N S_n^{(\xi)} > x\right]}{\bar F(x)} \ge E(N). \tag{2.11}$$

The reverse inequality is more challenging. Observe that
$$P\left[\bigvee_{n=0}^N S_n^{(\xi)} > x\right] = P\left[\bigvee_{n=0}^{N \wedge N(x)} S_n^{(\xi)} > x\right]$$

and that

$$P\left[\bigvee_{n=0}^N \xi_n > x\right] \sim P\left[\bigvee_{n=0}^{N \wedge N(x)} \xi_n > x\right] \sim E(N)\,\bar F(x), \tag{2.12}$$

where the first equivalence follows from Resnick (1986), the second being a minor modification of the first. So we attempt to compare

$$P\left[\bigvee_{n=0}^{\bar N} S_n^{(\xi)} > x\right] \quad\text{with}\quad P\left[\bigvee_{n=0}^{\bar N} \xi_n > x\right],$$

where we write as an abbreviation $\bar N = N \wedge N(x)$. Pick $\epsilon > 0$. Then we have

$$P\left[\bigvee_{n=0}^{\bar N} S_n^{(\xi)} > x\right] - P\left[\bigvee_{n=0}^{\bar N} \xi_n > x\right] \le P\left[\bigvee_{n=0}^{\bar N} S_n^{(\xi)} > x,\ \bigvee_{n=0}^{\bar N} \xi_n \le x - \epsilon\right] + P\left[\bigvee_{n=0}^{\bar N} S_n^{(\xi)} > x,\ x - \epsilon < \bigvee_{n=0}^{\bar N} \xi_n \le x\right] =: I_1(x) + I_2(x). \tag{2.13}$$

By (2.12),

$$\frac{I_2(x)}{\bar F(x)} \le \frac{P\left[\bigvee_{n=0}^{\bar N} \xi_n > x - \epsilon\right] - P\left[\bigvee_{n=0}^{\bar N} \xi_n > x\right]}{\bar F(x)} \to 0$$

as $x \to \infty$. So $I_2$ is of small order and we turn our attention to $I_1$. Fix $K$ such that $[K/2] > \alpha/(\alpha-1)$ and write
$$I_1(x) \le P\left[\bigvee_{i=1}^{\bar N} S_i^{(\xi)} > x,\ \bigvee_{i=0}^{\bar N} \xi_i \le x/K\right] + P\left[\bigvee_{i=0}^{\bar N} S_i^{(\xi)} > x,\ x/K < \bigvee_{i=0}^{\bar N} \xi_i \le x - \epsilon\right] =: I_{11}(x) + I_{12}(x).$$

Further, splitting according to whether the index of a step exceeding $x/K$ is at most $M$ or not,

$$I_{12}(x) = P\left[\bigvee_{i=0}^{\bar N} S_i^{(\xi)} > x,\ \bigvee_{i=0}^{\bar N}\xi_i \le x-\epsilon,\ \bigvee_{i=1}^{M \wedge \bar N} \xi_i > x/K\right] + P\left[\bigvee_{i=0}^{\bar N} S_i^{(\xi)} > x,\ \bigvee_{i=0}^{\bar N}\xi_i \le x-\epsilon,\ \bigvee_{i=1}^{M \wedge \bar N}\xi_i \le x/K,\ \bigvee_{i=M+1}^{\bar N} \xi_i > x/K\right] =: I_{121}(x) + I_{122}(x).$$

For large $x$, on the event defining $I_{121}$ there is some $n \le M \wedge \bar N$ with $\xi_n > x/K$; since every step is at most $x - \epsilon$ while the walk exceeds $x$, the steps other than $\xi_n$ must contribute more than $\epsilon$, so

$$I_{121}(x) \le P\left[\text{there exists } n \le M \text{ such that } \xi_n > x/K \text{ and } \bigvee_{m \ge 0}\ \sum_{\substack{j \le m \\ j \ne n}} \xi_j > \epsilon\right] \le M\,\bar F(x/K)\,P\left[\bigvee_{j=0}^\infty S_j^{(\xi)} > \epsilon\right],$$

since the steps other than $\xi_n$ form a random walk with the same law as $\{S_j^{(\xi)}\}$. Thus we conclude

$$\limsup_{x \to \infty} \frac{I_{121}(x)}{\bar F(x)} \le MK^\alpha\,P\left[\bigvee_{j=0}^\infty S_j^{(\xi)} > \epsilon\right]. \tag{2.15}$$

Furthermore, for $I_{122}(x)$ we have

$$I_{122}(x) \le P\left\{\bigcup_{n \ge M+1}\left[\xi_n > x/K;\ S_i^{(\xi)} \in [0,x],\ i = 1, \dots, n-1\right]\right\} \le \sum_{n=M+1}^\infty P\left[S_i^{(\xi)} \in [0,x],\ i = 1, \dots, n-1;\ \xi_n > x/K\right] = \bar F(x/K)\,E\left((N \wedge N(x))\,1_{[N \wedge N(x) > M]}\right)$$
and thus

$$\limsup_{x \to \infty} \frac{I_{122}(x)}{\bar F(x)} \le K^\alpha\,E\left(N1_{[N>M]}\right). \tag{2.16}$$

Finally we deal with $I_{11}(x)$. Define stopping times

$$N_1(x/K) = \inf\{n \ge 0 : S_n^{(\xi)} \ge x/K\}, \qquad N_i(x/K) = \inf\{n \ge N_{i-1}(x/K) : S_n^{(\xi)} - S_{N_{i-1}(x/K)}^{(\xi)} \ge x/K\},\ i \ge 2.$$

Because all positive steps of the random walk must be small in the piece controlled by $I_{11}(x)$, we have

$$I_{11}(x) \le P[N_i(x/K) \le \bar N,\ i = 1, \dots, [K/2]],$$

and by the strong Markov property this is bounded by

$$P[N_i(x/K) \le \bar N,\ i = 1, \dots, [K/2]-1]\,P\left[\bigvee_{n=0}^\infty S_n^{(\xi)} > x/K\right] \le \left(P\left[\bigvee_{n=0}^\infty S_n^{(\xi)} > x/K\right]\right)^{[K/2]}.$$

Since $P[\bigvee_{n=0}^\infty S_n^{(\xi)} > z]$ is regularly varying in $z$ with index $-(\alpha-1)$ (see, for example, Bingham, Goldie and Teugels, 1987, page 387), we have

$$\frac{I_{11}(x)}{\bar F(x)} \le \frac{\left(P\left[\bigvee_{n=0}^\infty S_n^{(\xi)} > x/K\right]\right)^{[K/2]}}{\bar F(x)} \to 0$$

as $x \to \infty$. We now must put the pieces together. We have

$$\limsup_{x \to \infty} \frac{P\left[\bigvee_{n=0}^N S_n^{(\xi)} > x\right] - P\left[\bigvee_{n=0}^N \xi_n > x\right]}{\bar F(x)} \le \limsup_{x \to \infty} \frac{I_{11}(x) + I_{121}(x) + I_{122}(x) + I_2(x)}{\bar F(x)} \le MK^\alpha\,P\left[\bigvee_{n=0}^\infty S_n^{(\xi)} > \epsilon\right] + K^\alpha\,E(N1_{[N>M]}).$$

Let $\epsilon \to \infty$ and then $M \to \infty$ to get

$$\limsup_{x \to \infty} \frac{P\left[\bigvee_{n=0}^N S_n^{(\xi)} > x\right] - P\left[\bigvee_{n=0}^N \xi_n > x\right]}{\bar F(x)} \le 0.$$

Since $P[\bigvee_{n=0}^N \xi_n > x] \sim E(N)\bar F(x)$, we get

$$\limsup_{x \to \infty} \frac{P\left[\bigvee_{n=0}^N S_n^{(\xi)} > x\right]}{\bar F(x)} \le E(N),$$

which is the desired reverse inequality. The proof is complete.
3. Contrast with exponential tails.

If $F_{\rm on}$ has an exponentially bounded tail, known results of Iglehart (1972) can be applied to obtain the analogue of Theorem 2.3. We continue to assume that $\{X_i\}$ and $\{Y_i\}$ are iid, independent of each other, with distributions $F_{\rm on}$ and $F_{\rm off}$ respectively. Set $\xi_i = (1-r)X_i - rY_i$ and for stability continue to suppose $E\xi_1 < 0$. Recall $N$ is the first downgoing ladder epoch of the random walk with steps $\{\xi_i\}$. We need to suppose

(A1) for some $\gamma > 0$: $Ee^{\gamma\xi_1} = 1$;

and for this value of $\gamma$ we have

(A2) $E\xi_1 e^{\gamma\xi_1} =: \nu \in (0,\infty)$; and

(A3) $\xi_1$ has a non-lattice distribution.

An analogue of Corollary 2.2 is provided by Iglehart (1972), which suffices to give the tail behavior of the maximum contents level in a cycle. We write $S_n^{(\xi)} = (1-r)S_n^{(X)} - rS_n^{(Y)}$, $n \ge 0$.

Proposition 3.1 (Iglehart, 1972, Lemma 4). Suppose assumptions A1, A2 and A3 hold. Then for $x > 0$,

$$P[M_1 > x] = P\left[\bigvee_{s=0}^{C_1} X(s) > x\right] \sim a(0)e^{-\gamma x}, \tag{3.1}$$

where

$$a(0) = \frac{\left(1 - Ee^{\gamma S_N^{(\xi)}}\right)^2}{\gamma\nu E(N)}\,E\left(e^{\gamma(1-r)X_1}\right).$$
We may now follow the line of reasoning of Section 2. The exponential tails given in (3.1) imply

$$\gamma\bigvee_{i=1}^{[ut]} M_i - \log a(0)u \Rightarrow Y_0(t) \tag{3.2}$$

in $D_r(0,\infty)$, where $Y_0(\cdot)$ is the extremal process generated by the Gumbel distribution

$$\Lambda(x) = \exp\{-e^{-x}\}, \qquad x \in \mathbb{R}.$$

We then get as $u \to \infty$

$$\left(\gamma M(ut) - \log a(0)u,\ M^{\leftarrow}\left(\frac{x + \log a(0)u}{\gamma}\right)\Big/u\right) \Rightarrow \left(Y_0(t/(\mu E(N))),\ \mu E(N)\,Y_0^{\leftarrow}(x)\right),$$

which leads to

$$\frac{a(0)}{e^{\gamma u}}\,\tau\left(u + \frac{x}{\gamma}\right) \Rightarrow \mu E(N)\,Y_0^{\leftarrow}(x).$$

If $x = 0$ we get, as $u \to \infty$,

$$\frac{a(0)\,\tau(u)}{e^{\gamma u}} \Rightarrow \mu E(N)\,Y_0^{\leftarrow}(0).$$

Note that $Y_0^{\leftarrow}(0)$ is exponentially distributed with mean 1 and we get the final result

$$\frac{\left(1 - Ee^{\gamma S_N^{(\xi)}}\right)^2 E\left(e^{\gamma(1-r)X_1}\right)}{\gamma\nu\mu(EN)^2}\,\frac{\tau(u)}{e^{\gamma u}} \Rightarrow E(1) \tag{3.3}$$
as $u \to \infty$, where $E(1)$ is a unit exponential random variable. We may also check that $E\tau(u)/e^{\gamma u}$ converges, as follows. Observe that

$$0 \le \tau(s) \le S_{N_{V(s)}}, \tag{3.4}$$

where

$$V(s) = \inf\left\{n : \bigvee_{i=1}^n M_i \ge s\right\},$$

since the hitting time of $X(\cdot)$ must occur before the end of the cycle that has a cycle maximum bigger than the level. Since $V(s)$ is geometrically distributed,

$$\frac{a(0)V(s)}{e^{\gamma s}} \Rightarrow E(1),$$

where, as usual, $E(1)$ is a unit exponential random variable. Furthermore, as $s \to \infty$,

$$ES_{N_{V(s)}} = \mu E(N_{V(s)}) = \mu E(N)E(V(s)) = \frac{\mu E(N)}{P[M_1 > s]} \sim \mu E(N)\frac{e^{\gamma s}}{a(0)}.$$

So as $s \to \infty$,

$$\frac{E(S_{N_{V(s)}})}{e^{\gamma s}} \to \frac{\mu E(N)}{a(0)}, \qquad \frac{S_{N_{V(s)}}}{e^{\gamma s}} \Rightarrow \frac{\mu E(N)}{a(0)}\,E(1).$$

These two statements coupled with (3.3) and (3.4) and a variant of Fatou's lemma sometimes called Pratt's lemma (Pratt, 1960) yield the desired result

$$E\tau(s) \sim \frac{\mu E(N)}{a(0)}\,e^{\gamma s} \qquad (s \to \infty). \tag{3.5}$$

Note how critically $F_{\rm off}$ enters into formulas (3.3), (3.5), since $F_{\rm off}$ is important in determining the growth rate of the hitting time. In the heavy tail case, $F_{\rm off}$ did not play a role in determining the growth rate of $\tau(s)$, since levels were hit basically due just to big upward jumps, which were controlled by $F_{\rm on}$. Recall that in the heavy tailed case, as $s \to \infty$,

$$E\tau(s) \sim \mu(1-r)^{-\alpha}/\bar F_{\rm on}(s) \sim \mu E\sigma = ES_\sigma, \quad\text{where}\quad \sigma = \inf\{n : (1-r)X_n > s\}.$$

Example. Consider the standard example where both $F_{\rm on}$ and $F_{\rm off}$ are exponential distributions with means $\mu_{\rm on}$ and $\mu_{\rm off}$ respectively. The negative drift condition is

$$(1-r)\mu_{\rm on} - r\mu_{\rm off} < 0,$$
and $\gamma$ must satisfy

$$1 = Ee^{\gamma((1-r)X_1 - rY_1)} = \left((1 - \gamma(1-r)\mu_{\rm on})(1 + \gamma r\mu_{\rm off})\right)^{-1}.$$

Solving for $\gamma$ we get the solutions $\gamma = 0$ and

$$\gamma = \frac{r\mu_{\rm off} - (1-r)\mu_{\rm on}}{r(1-r)\mu_{\rm on}\mu_{\rm off}} = \frac{-E\xi_1}{r(1-r)\mu_{\rm on}\mu_{\rm off}}.$$

The numerator is positive by the drift condition. We now calculate the coefficient of $\tau(u)/e^{\gamma u}$ in (3.3) in order to compare it to the exact calculation given in (1.5). Since $X_1$ is assumed exponential with mean $\mu_{\rm on}$, we have

$$Ee^{\gamma(1-r)X_1} = \frac{r\mu_{\rm off}}{(1-r)\mu_{\rm on}}.$$

To calculate $\nu$ we compute $\frac{d}{ds}Ee^{s\xi_1}$ and substitute $s = \gamma$. This calculation is made easier by use of the formulas

$$1 + \gamma r\mu_{\rm off} = \frac{r\mu_{\rm off}}{(1-r)\mu_{\rm on}}, \qquad 1 - \gamma(1-r)\mu_{\rm on} = \frac{(1-r)\mu_{\rm on}}{r\mu_{\rm off}} = \frac{1}{1 + \gamma r\mu_{\rm off}}.$$

With these formulas we find

$$\nu = -E\xi_1.$$

Next observe that, because of exponential tails,

$$S_N^{(\xi)} \stackrel{d}{=} -r\mu_{\rm off}\,E(1),$$

where $E(1)$ is a unit exponential random variable, and hence

$$1 - Ee^{\gamma S_N^{(\xi)}} = 1 - \frac{1}{1 + \gamma r\mu_{\rm off}} = 1 - \frac{(1-r)\mu_{\rm on}}{r\mu_{\rm off}} = \frac{-E\xi_1}{r\mu_{\rm off}}.$$

Knowing the distribution of $S_N^{(\xi)}$ also enables us to compute $EN$, since

$$ES_N^{(\xi)} = -r\mu_{\rm off} = E(N)E\xi_1, \quad\text{and hence}\quad EN = \frac{r\mu_{\rm off}}{-E\xi_1}.$$

Putting the ingredients together, (3.3) becomes

$$\frac{(-E\xi_1)^2}{\mu(r\mu_{\rm off})^2}\,\frac{\tau(u)}{e^{\gamma u}} \Rightarrow E(1),$$

which agrees with the exact calculation for the expected value in (1.5).
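The constants of the exponential example can be checked numerically; the sketch below (variable names ours) verifies that $\gamma$ as displayed is a root of $Ee^{\gamma\xi_1} = 1$, agrees with the exponent $1/((1-r)\mu_{\rm on}) - 1/(r\mu_{\rm off})$ appearing in (1.5), and yields $EN = r\mu_{\rm off}/(-E\xi_1)$.

```python
mu_on, mu_off, r = 1.0, 2.0, 0.5
D = r * mu_off - (1.0 - r) * mu_on            # = -E xi_1 = 0.5 > 0 under stability
gamma = D / (r * (1.0 - r) * mu_on * mu_off)  # closed-form positive root

def mgf_xi(s):
    """E e^{s xi_1} for xi_1 = (1-r)X_1 - r*Y_1 with exponential X_1, Y_1."""
    return 1.0 / ((1.0 - s * (1.0 - r) * mu_on) * (1.0 + s * r * mu_off))

# gamma should satisfy assumption (A1): E e^{gamma xi_1} = 1
E_N = r * mu_off / D   # from Wald: E S_N = -r*mu_off = E(N) E xi_1
```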
The contrast with the heavy tailed case is very evident. Instead of $E\tau(s)$ being of the same order as $ES_\sigma$, the expected time until an on period of length at least $s/(1-r)$ occurs, we have in our exponential example

$$ES_\sigma = \mu E\sigma = \mu/\bar F_{\rm on}(s/(1-r)) = \mu\exp\{s/(\mu_{\rm on}(1-r))\}.$$

However, from (3.5),

$$E\tau(s) \sim \frac{\mu(r\mu_{\rm off})^2}{(-E\xi_1)^2}\,e^{\gamma s}.$$

Comparing the growth rates with $1/(\mu_{\rm on}(1-r))$ we have

$$0 < \gamma = \frac{1}{(1-r)\mu_{\rm on}} - \frac{1}{r\mu_{\rm off}} < \frac{1}{(1-r)\mu_{\rm on}},$$

so that $ES_\sigma$ has a faster growth rate, which is to be expected since in the exponential case the process $X(\cdot)$ jumps over a high level as a result of an accumulation of small upward movements and not typically as a result of a single large jump. To obtain the exact expression for $E\tau(s)$ in this example, proceed as follows. Defining
$$\tilde X(t) = (X(t), Z(t)), \qquad t \ge 0,$$

we describe the state of the system prior to reaching level $s$ as a Markov process $\{\tilde X(t), t \ge 0\}$ with state space $E = \{(x,i) : 0 \le x \le s,\ i = 0, 1\}$. We can express $\tau(s)$ in terms of the hitting times of the Markov process $\{\tilde X(t), t \ge 0\}$ as

$$\tau(s) = T_{(s,1)} := \inf\{t \ge 0 : \tilde X(t) = (s,1)\}.$$

For $x \in E$, let $H(x)$ be the expected hitting time of $(s,1)$ starting at $x$, and define, for $0 \le x \le s$,

$$h_1(x) = H((x,1)), \qquad h_2(x) = H((x,0)).$$

Then $E\tau(s) = h_1(0)$. Using the natural filtration $\mathcal{F}_t = \sigma(Z(u),\ 0 \le u \le t)$, $t \ge 0$, we observe that for any $t \ge 0$

$$E\left(T_{(s,1)} \mid \mathcal{F}_t\right) = H\left(\tilde X(t \wedge T_{(s,1)})\right) + t \wedge T_{(s,1)} := M(t).$$

Therefore $\{M(t), t \ge 0\}$ is a martingale, and its martingale property leads to the following system of ordinary differential equations:

$$(1-r)h_1'(x) = -1 + \frac{1}{\mu_{\rm on}}h_1(x) - \frac{1}{\mu_{\rm on}}h_2(x), \tag{3.6}$$

$$rh_2'(x) = 1 + \frac{1}{\mu_{\rm off}}h_1(x) - \frac{1}{\mu_{\rm off}}h_2(x), \tag{3.7}$$

with the obvious boundary conditions

$$h_2(0) = \mu_{\rm off} + h_1(0), \qquad h_1(s) = 0. \tag{3.8}$$

The system (3.6)-(3.8) can be solved in the standard way, and we obtain (1.5).
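The solution of (3.6)-(3.8) can be written down explicitly by first solving the linear equation satisfied by $g = h_1 - h_2$ and then integrating (3.6); the sketch below (one illustrative parameter choice, variable names ours) verifies by finite differences that the resulting $h_1, h_2$ satisfy (3.6)-(3.8) and that $h_1(0)$ reproduces (1.5).

```python
import math

mu_on, mu_off, r, s = 1.0, 2.0, 0.5, 4.0
D = r * mu_off - (1.0 - r) * mu_on          # = -E xi_1 > 0 under stability
gamma = 1.0 / ((1.0 - r) * mu_on) - 1.0 / (r * mu_off)
c0 = 1.0 / (gamma * r * (1.0 - r))
C = -mu_off - c0                            # from g(0) = h1(0) - h2(0) = -mu_off

def g(x):
    """g = h1 - h2 solves g'(x) = -1/(r(1-r)) + gamma*g(x)."""
    return C * math.exp(gamma * x) + c0

def h1(x):
    """Expected hitting time of (s,1) from (x, on), obtained by
    integrating (3.6) with the terminal condition h1(s) = 0."""
    return ((1.0 - c0 / mu_on) * (s - x)
            - (C / (gamma * mu_on)) * (math.exp(gamma * s) - math.exp(gamma * x))
            ) / (1.0 - r)

def h2(x):
    return h1(x) - g(x)

# the constants of (1.5); h1(0) should agree with a*(e^{gamma*s}-1) - b*s
a = (r * mu_off) ** 2 * (mu_on + mu_off) / D ** 2
b = (mu_on + mu_off) / D
exact_mean_overflow = a * (math.exp(gamma * s) - 1.0) - b * s
```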
4. A single server fluid queue fed by several on/off processes.

Let $\{X_i^{(j)},\ i=1,2,\dots\},\ j=1,\dots,k$, and $\{Y_i^{(j)},\ i=1,2,\dots\},\ j=1,\dots,k$, be iid copies of the on sequence $\{X_i,\ i=1,2,\dots\}$ and the off sequence $\{Y_i,\ i=1,2,\dots\}$ respectively. We construct $k$ iid on/off processes $\{Z_j(t),\ t\ge0\},\ j=1,\dots,k$, as in (1.1). Sometimes we will find it convenient to work with stationary versions of $\{Z_j(t),\ t\ge0\},\ j=1,\dots,k$. These exist because $\mu_{\rm on}$ and $\mu_{\rm off}$ are finite, and can be constructed as follows. Fix a $j=1,\dots,k$, and let $C_j^{(0)}$, $X_j^{(0)}$, $Y_j^{(0)}$ and $Y_0^{(j)}$ be four independent random variables, independent of $\{X_i^{(j)},\ Y_i^{(j)},\ i\ge1\}$, defined as follows: $C_j^{(0)}$ is a Bernoulli random variable with values in $\{0,1\}$ and mass function
$$P\big[C_j^{(0)}=1\big] = \frac{\mu_{\rm on}}{\mu_{\rm on}+\mu_{\rm off}} = 1-P\big[C_j^{(0)}=0\big],$$
and, for $x>0$,
$$P\big[X_j^{(0)}>x\big] = \frac{1}{\mu_{\rm on}}\int_x^{\infty}\bar F_{\rm on}(s)\,ds,\qquad P\big[Y_j^{(0)}>x\big] = \frac{1}{\mu_{\rm off}}\int_x^{\infty}\bar F_{\rm off}(s)\,ds.$$
Finally, $Y_0^{(j)}$ has the $F_{\rm off}$ distribution. Define a delay random variable $D_j^{(0)}$ by
$$D_j^{(0)} = C_j^{(0)}\big(X_j^{(0)}+Y_0^{(j)}\big) + \big(1-C_j^{(0)}\big)Y_j^{(0)},$$
and a delayed renewal sequence by

(4.1) $\quad \big\{S_n^{(j)},\ n\ge0\big\} := \Big\{D_j^{(0)},\ D_j^{(0)}+\sum_{i=1}^{n}\big(X_i^{(j)}+Y_i^{(j)}\big),\ n\ge1\Big\}.$
Then a stationary version of $\{Z_j(t),\ t\ge0\}$ is defined by

(4.2) $\quad Z_j(t) = C_j^{(0)}\,1_{[0,X_j^{(0)})}(t) + \sum_{n=0}^{\infty}1_{[S_n^{(j)},\,S_n^{(j)}+X_{n+1}^{(j)})}(t).$

The combined input process is

(4.3) $\quad Z^{(k)}(t) = \sum_{j=1}^{k}Z_j(t),\qquad t\ge0,$

and the content $X^{(k)}(t)$ of the fluid queue fed by this combined input and served at the rate $r$ satisfies

(4.4) $\quad dX^{(k)}(t) = Z^{(k)}(t)\,dt - r\big(X^{(k)}\big)(t)\,dt,$

where $r(X^{(k)})(t)$ denotes the instantaneous service rate, equal to $r$ whenever the system is non-empty. We impose the rate (stability) condition

(4.5) $\quad k\,\frac{\mu_{\rm on}}{\mu_{\rm on}+\mu_{\rm off}} < r,$

and let $k_0$ be the smallest number of simultaneously running on periods that, together with the stationary contribution of the remaining processes, makes the input rate exceed the service rate:

(4.6) $\quad k_0 = \min\Big\{j\ge1:\ j+(k-j)\,\frac{\mu_{\rm on}}{\mu_{\rm on}+\mu_{\rm off}} > r\Big\}.$
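The stationary construction (4.1)-(4.2) can be simulated directly whenever the integrated tail is available in closed form. The sketch below is not from the paper; the Pareto-type on law and all parameter names are assumptions made for illustration. It samples $C_j^{(0)}$, the "spread" residual on time, and the subsequent renewal cycles, and its long-run fraction of on time should be close to $\mu_{\rm on}/(\mu_{\rm on}+\mu_{\rm off})$:

```python
import random

def stationary_onoff_path(alpha, lam, horizon, seed=0):
    """Total on time in [0, horizon] for one path of the stationary on/off
    process of (4.1)-(4.2).  On times: Pareto tail (1+x)**(-alpha), mean
    mu_on = 1/(alpha-1); the integrated tail is (1+x)**(-(alpha-1)), so the
    stationary 'spread' on time is sampled by inversion.  Off times:
    exponential(lam), mean mu_off = 1/lam (its residual is again exponential)."""
    rng = random.Random(seed)
    mu_on, mu_off = 1.0 / (alpha - 1.0), 1.0 / lam
    def pareto_on():        # full on period, tail (1+x)**(-alpha)
        return (1.0 - rng.random()) ** (-1.0 / alpha) - 1.0
    def spread_on():        # stationary residual on period, tail (1+x)**(-(alpha-1))
        return (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) - 1.0
    on_time = 0.0
    if rng.random() < mu_on / (mu_on + mu_off):   # C = 1: start inside an on period
        x0 = spread_on()
        on_time += min(x0, horizon)
        t = x0 + rng.expovariate(lam)             # residual on, then a full off period
    else:                                          # C = 0: start inside an off period
        t = rng.expovariate(lam)                   # memoryless residual off time
    while t < horizon:
        x = pareto_on()                            # a full on period...
        on_time += min(x, horizon - t)
        t += x + rng.expovariate(lam)              # ...followed by a full off period
    return on_time
```

For example, with `alpha=3.0` and `lam=2.0` both means equal $1/2$, so the on fraction should hover near $1/2$.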
By the non-triviality assumption, $k_0\le k$. If $k_0=1$, then, by analogy with the case $k=1$, one expects the state of the system to reach a high level $L$ when one of the $k$ on/off processes has an on period of length $L/(1-r)$. If $k_0>1$, the same intuition says that the system will reach a high level $L$ only when $k_0$ very long on periods occur at about the same time, and so it will take much longer until this high level is reached. In this section we prove the above statement for the case $k_0=1$, thus generalizing the conclusion reached in Section 2 for $k=1$. The proof of Theorem 4.1 is significantly more involved than the argument required for $k=1$ due, once again, to the lack of renewal structure in the process $\{Z^{(k)}(t),\ t\ge0\}$. A natural way of calculating the time until the system contents reach the level $L$ is to start from the moment the system is empty and all $k$ on/off processes begin an on period. One must realize, however, that for $k>1$ such a moment in time is far from being "typical", and, even if we initialize the system in such a way, chances are that such moments will not recur. Therefore, we state our theorem in a more general way, by allowing more general initial conditions. To this end, let $H$ be an arbitrary probability law on $\mathbb R_+^k$ whose marginals have finite first moments. Let $(D_1^{(0)},\dots,D_k^{(0)})$ be an $H$-distributed random vector, independent of the sequences $\{X_i^{(j)},\ i=1,2,\dots\},\ j=1,\dots,k$, and $\{Y_i^{(j)},\ i=1,2,\dots\},\ j=1,\dots,k$. We again define a delayed renewal sequence by (4.1) and, similarly to (4.2), we define $\{Z_j(t),\ t\ge0\}$ by
(4.7) $\quad Z_j(t) = \sum_{n=0}^{\infty}1_{[S_n^{(j)},\,S_n^{(j)}+X_{n+1}^{(j)})}(t),\qquad t\ge0,\ j=1,\dots,k.$

Theorem 4.1. Let the distribution $F_{\rm on}$ satisfy $\bar F_{\rm on}(x) = x^{-\alpha}L(x)$, $\alpha>1$, as $x\to\infty$, and assume that for some $p>1$ we have $EY_1^p<\infty$. If the service rate $r$ satisfies (4.5), and $r<1$, then for any $H$ and $x_0$ we have

(4.9) $\quad \lim_{L\to\infty}\bar F_{\rm on}\Big(\frac{L}{1-r}\Big)E_{H,x_0}\tau(L) = \lim_{L\to\infty}\bar F_{\rm on}\Big(\frac{L}{1-r}\Big)E_{H,x_0}\theta(L) = \frac{\mu_{\rm on}+\mu_{\rm off}}{k},$
where
$$\theta(L) = \inf\Big\{t\ge0:\ \text{one of the $k$ on/off processes begins at time $t$ an on period of length at least}\ \frac{L}{1-r}\Big\}.$$
An argument identical to that of Theorem 2.1 immediately proves the corresponding result for the corresponding $GI/G/1$ queue. Let $\{Y_i^{(j)},\ i=1,2,\dots\},\ j=1,\dots,k$, be iid copies of the sequence of interarrival times $\{Y_i,\ i=1,2,\dots\}$, and let $\{B_i^{(j)},\ i=1,2,\dots\},\ j=1,\dots,k$, be iid copies of the sequence of offered work $\{B_i,\ i=1,2,\dots\}$, so that at time $S_n^{(j)} = Y_1^{(j)}+\dots+Y_n^{(j)}$ an amount of work $B_n^{(j)}$ is brought into the system on the $j$th channel, $n=0,1,\dots,\ j=1,\dots,k$. Let $r$ be the service rate. Note that the following theorem does not require the assumption $r<1$.

Theorem 4.2. Let the distribution $F_{\rm on}$ of $B_i$ satisfy $\bar F_{\rm on}(x) = x^{-\alpha}L(x)$, $\alpha>1$, as $x\to\infty$, and assume that for some $p>1$ we have $EY_1^p<\infty$. If

(4.11) $\quad k\,\frac{\mu_{\rm on}}{\mu_{\rm off}} < r,$

then

(4.12) $\quad \lim_{L\to\infty}\bar F_{\rm on}(L)\,E\tau(L) = \frac{\mu_{\rm off}}{k},$

where $\tau(L)$ is the first time the amount of work in the system reaches the level $L$. In particular, the expected time to reach a high level $L$ in (4.12) does not depend on the service rate $r$.
Note that the result of Theorem 4.2 remains true if one initializes the state of the system in an arbitrary way. Let us look at the implications of our results for the benefits of pooling the system resources. Think of $k$ iid $GI/G/1$ queues, with holding capacity $L$ each and service rate $r$ each. Queue number $j$ is driven by the sequences $\{Y_i^{(j)},\ i=1,2,\dots\}$ and $\{B_i^{(j)},\ i=1,2,\dots\}$ as above, $j=1,\dots,k$. We take pooling the system resources here to mean that we put together the service resources to create a "super-server" with service rate $kr$, and we feed this "super-server" with the combined stream of the $k$ input processes, as in Theorem 4.2. The holding capacity of the new system is taken to be $kL$, again as the result of pooling the resources. Let us look at a generic stream of "customers", or work (i.e., one of the $k$ original streams of work). One can imagine that when the holding capacity of the system serving these "customers" is reached, the system is blocked for a time to any future arrivals. Under the "$k$ separate servers" scenario, the expected time until the serving system is blocked is $E\tau(L)$, while when the system resources are pooled, this expected time is $E\tau(kL)$. By Theorem 4.2 the ratio $R$ of the two expected times is, asymptotically for large $L$,
$$R = \lim_{L\to\infty}\frac{k\,\bar F_{\rm on}(kL)}{\bar F_{\rm on}(L)} = \frac{1}{k^{\alpha-1}} < 1,$$
which is the expected benefit of pooling the resources. However, this benefit becomes less pronounced as $\alpha$ gets close to 1. We are now ready for the proofs.

Proof of Theorem 4.1. Obviously,

(4.13) $\quad E_{H,x_0}\tau(L) \le E_{H,x_0}\theta(L) + \frac{L}{1-r}.$
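For a pure Pareto tail $\bar F_{\rm on}(x) = x^{-\alpha}$ the ratio $R$ has the closed form $k^{1-\alpha}$, with the $L$-dependence cancelling exactly. The following few lines (illustrative only) tabulate it and show the benefit eroding as $\alpha\downarrow1$:

```python
def pooling_ratio(k, alpha):
    """R = k * F_bar(k * L) / F_bar(L) for the pure Pareto tail
    F_bar(x) = x ** (-alpha); the L-dependence cancels exactly,
    leaving k ** (1 - alpha)."""
    return k ** (1.0 - alpha)

for a in (1.1, 1.5, 2.0, 3.0):
    print(a, pooling_ratio(10, a))
```

For $k=10$ the ratio drops from about $0.79$ at $\alpha=1.1$ to $0.01$ at $\alpha=3$: the heavier the tail, the smaller the gain from pooling.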
HEATH, RESNICK AND SAMORODNITSKY
If

(4.14) $\quad \theta_j(L) = \inf\Big\{S_n^{(j)}:\ n\ge0,\ X_{n+1}^{(j)}\ge\frac{L}{1-r}\Big\},\qquad j=1,\dots,k,$

we have

(4.15) $\quad \theta(L) = \min\big(\theta_1(L),\dots,\theta_k(L)\big).$
Observe that $\theta_1(L),\dots,\theta_k(L)$ are, conditionally on $(D_1^{(0)},\dots,D_k^{(0)})$, independent, and that

(4.16) $\quad \theta_j(L) \stackrel{d}{=} D_j^{(0)} + \sum_{i=1}^{G_L^{(j)}}\big(X_i^{(\theta,j)}+Y_i^{(j)}\big),$

where $G_L^{(j)}$ is a geometric random variable with parameter $\bar F_{\rm on}\big(\frac{L}{1-r}\big)$, independent of two independent iid sequences, $\{X_i^{(\theta,j)},\ i=1,2,\dots\}$ and $\{Y_i^{(j)},\ i=1,2,\dots\}$, where the latter sequence has, as usual, the $F_{\rm off}$ distribution, and the former has the distribution
$$P\big(X_i^{(\theta,j)}\in A\big) = P\Big(X_1^{(j)}\in A\ \Big|\ X_1^{(j)}<\frac{L}{1-r}\Big) = \frac{1}{F_{\rm on}\big(\frac{L}{1-r}\big)}\int_0^{L/(1-r)}1(x\in A)\,F_{\rm on}(dx).$$
Everything is also independent of the delay random variable $D_j^{(0)}$. In particular, $\big(X_1^{(j)}\,\big|\,X_1^{(j)}<\frac{L}{1-r}\big) \stackrel{st}{\le} X_1^{(j)}$, and hence

(4.17) $\quad \theta_j(L) \stackrel{st}{\le} S^{(j)}_{G_L^{(j)}},$

where all the random variables appearing on the right hand side of (4.17) are independent. We conclude that
$$\theta(L) \stackrel{st}{\le} \bigvee_{1\le j\le k}D_j^{(0)} + \sum_{i=1}^{\bigwedge_{1\le j\le k}G_L^{(j)}}\big(X_i^{(1)}+Y_i^{(1)}\big),$$
and so
$$E_{H,x_0}\theta(L) \le E_H\bigvee_{1\le j\le k}D_j^{(0)} + \big(\mu_{\rm on}+\mu_{\rm off}\big)\,E\bigwedge_{1\le j\le k}G_L^{(j)}.$$
Since $\bigwedge_{1\le j\le k}G_L^{(j)}$ is, once again, a geometric random variable, with parameter $1-\big(1-\bar F_{\rm on}(\frac{L}{1-r})\big)^k$, we obtain immediately that
$$E_{H,x_0}\tau(L) \le E_{H,x_0}\theta(L)+\frac{L}{1-r} \le E_H\bigvee_{1\le j\le k}D_j^{(0)} + \big(\mu_{\rm on}+\mu_{\rm off}\big)\Bigg(\frac{1}{1-\big(1-\bar F_{\rm on}(\frac{L}{1-r})\big)^k}-1\Bigg) + \frac{L}{1-r},$$
which implies that

(4.18) $\quad \limsup_{L\to\infty}\bar F_{\rm on}\Big(\frac{L}{1-r}\Big)E_{H,x_0}\tau(L) \le \limsup_{L\to\infty}\bar F_{\rm on}\Big(\frac{L}{1-r}\Big)E_{H,x_0}\theta(L) \le \frac{\mu_{\rm on}+\mu_{\rm off}}{k}.$
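The upper bound rests on the elementary fact that the minimum of $k$ independent geometric variables (numbers of failures before the first success, parameter $p$) is again geometric, with success parameter $1-(1-p)^k$. This can be checked empirically; the snippet below is an illustration, not part of the proof:

```python
import random

def geom_failures(p, rng):
    """Number of failures before the first success in Bernoulli(p) trials."""
    n = 0
    while rng.random() >= p:
        n += 1
    return n

def mean_min_of_geoms(k, p, n_samples=100_000, seed=0):
    """Empirical mean of min(G_1, ..., G_k) for iid geometric(p) variables."""
    rng = random.Random(seed)
    return sum(min(geom_failures(p, rng) for _ in range(k))
               for _ in range(n_samples)) / n_samples

k, p = 4, 0.05
p_star = 1.0 - (1.0 - p) ** k        # success parameter of the minimum
exact = (1.0 - p_star) / p_star      # mean of a geometric(p_star) on {0, 1, ...}
print(mean_min_of_geoms(k, p), exact)
```

The empirical mean and the closed form $(1-p^*)/p^*$ agree to within Monte Carlo error; in the proof, $p = \bar F_{\rm on}(L/(1-r))$ is small, so this mean is of order $1/(k\,\bar F_{\rm on}(L/(1-r)))$.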
For a lower bound, we start by observing that we may assume that $\bigwedge_{1\le j\le k}D_j^{(0)}=0$, $P_{H,x_0}$-almost surely. Indeed, shortening all the initial off periods by the same amount can only bring the level crossing time closer. Now take $0<\epsilon<1$, and observe that
$$\tau(L) \ge \theta\big(L(1-\epsilon)\big)\,1\big[\tau(L)\ge\theta(L(1-\epsilon))\big],$$
and so

(4.19) $\quad E_{H,x_0}\tau(L) \ge E_{H,x_0}\theta\big(L(1-\epsilon)\big) - E_{H,x_0}\theta\big(L(1-\epsilon)\big)\,1\big[\tau(L)<\theta(L(1-\epsilon))\big].$

The theorem will therefore follow, by regular variation and by letting $\epsilon\to0$, once we prove that

(4.20) $\quad \lim_{L\to\infty}\bar F_{\rm on}\Big(\frac{L}{1-r}\Big)\,E_{H,x_0}\theta\big(L(1-\epsilon)\big)\,1\big[\tau(L)<\theta(L(1-\epsilon))\big] = 0$

and that

(4.24) $\quad \liminf_{L\to\infty}\bar F_{\rm on}\Big(\frac{L}{1-r}\Big)\,E_{H,x_0}\theta(L) \ge \frac{\mu_{\rm on}+\mu_{\rm off}}{k}.$

We prove (4.24) first. For any $x>0$ and $N\ge1$,

(4.25) $\quad P\Big(S^{(1)}_{G_L^{(1)}}>x\Big) = P\Big(\sum_{i=1}^{G_L^{(1)}}\big(X_i^{(1)}+Y_i^{(1)}\big)>x\Big) \ge \gamma_N\,P\Big(G_L^{(1)}>\max\Big(N,\ \frac{x}{(\mu_{\rm on}+\mu_{\rm off})(1-\epsilon)}\Big)\Big),$

where
$$\gamma_N = P\Big(\bigcap_{n>N}\Big[\sum_{i=1}^{n}\big(X_i^{(1)}+Y_i^{(1)}\big)>n\,(\mu_{\rm on}+\mu_{\rm off})(1-\epsilon)\Big]\Big),$$
and we have used the independence of $G_L^{(1)}$ and the sequence $\{X_i^{(1)}+Y_i^{(1)}\}$. Observe that, by the strong law of large numbers, $\gamma_N\to1$ as $N\to\infty$. We have, therefore,
$$E\theta(L) \ge \gamma_N^k\int_{N(\mu_{\rm on}+\mu_{\rm off})(1-\epsilon)}^{\infty}\Big(P\Big(G_L^{(1)}>\frac{x}{(\mu_{\rm on}+\mu_{\rm off})(1-\epsilon)}\Big)\Big)^k\,dx \ge \gamma_N^k\,(\mu_{\rm on}+\mu_{\rm off})(1-\epsilon)\sum_{n=N}^{\infty}\Big(P\big(G_L^{(1)}>n\big)\Big)^k$$
$$= \gamma_N^k\,(\mu_{\rm on}+\mu_{\rm off})(1-\epsilon)\,\frac{\big(1-\bar F_{\rm on}(\frac{L}{1-r})\big)^{k(N+1)}}{1-\big(1-\bar F_{\rm on}(\frac{L}{1-r})\big)^k}.$$
We conclude that
$$\liminf_{L\to\infty}\bar F_{\rm on}\Big(\frac{L}{1-r}\Big)E\theta(L) \ge \frac{1}{k}\,\gamma_N^k\,(\mu_{\rm on}+\mu_{\rm off})(1-\epsilon),$$
and (4.24) follows by letting $N\to\infty$ and $\epsilon\to0$. Therefore, to prove the theorem we only need to prove (4.20). By the assumptions of the theorem, there is a $p'>1$ such that both $EX_1^{p'}<\infty$ and $EY_1^{p'}<\infty$. By Hölder's inequality,

(4.26) $\quad E_{H,x_0}\theta\big(L(1-\epsilon)\big)\,1\big(\tau(L)<\theta(L(1-\epsilon))\big) \le \Big(E_{H,x_0}\big(\theta(L(1-\epsilon))\big)^{p'}\Big)^{1/p'}\Big(P_{H,x_0}\big(\tau(L)<\theta(L(1-\epsilon))\big)\Big)^{1/q'},$

where $\frac{1}{p'}+\frac{1}{q'}=1$. We use the following inequality: for any $\sigma$-finite measure spaces $(\Omega_1,\mathcal F_1,\mu_1)$ and $(\Omega_2,\mathcal F_2,\mu_2)$, a $p'>1$ and a nonnegative measurable function $f:\Omega_1\times\Omega_2\to\mathbb R$,

(4.27) $\quad \Big(\int_{\Omega_1}\Big(\int_{\Omega_2}f(\omega_1,\omega_2)\,\mu_2(d\omega_2)\Big)^{p'}\mu_1(d\omega_1)\Big)^{1/p'} \le \int_{\Omega_2}\Big(\int_{\Omega_1}f(\omega_1,\omega_2)^{p'}\,\mu_1(d\omega_1)\Big)^{1/p'}\mu_2(d\omega_2).$
See, for example, Lemma 3.3.1 of Kwapien and Woyczynski (1992). Let $j_0 = \operatorname{argmin}\big(D_1^{(0)},\dots,D_k^{(0)}\big)$ (with ties broken in, say, the lexicographical manner), and recall that $D_{j_0}^{(0)}=0$, $P_{H,x_0}$-almost surely. We have, by (4.27),
$$\Big(E_{H,x_0}\big(\theta(L(1-\epsilon))\big)^{p'}\Big)^{1/p'} \le \Big(E_{H,x_0}\big(\theta(L)\big)^{p'}\Big)^{1/p'} \le \Big(E_{H,x_0}\big(\theta_{j_0}(L)\big)^{p'}\Big)^{1/p'} \le \Big(E_{H,x_0}\Big(S^{(j_0)}_{G_L^{(j_0)}}\Big)^{p'}\Big)^{1/p'}$$
$$= \Big(E\Big(\sum_{i=1}^{\infty}\big(X_i^{(1)}+Y_i^{(1)}\big)1\big(i\le G_L^{(1)}\big)\Big)^{p'}\Big)^{1/p'} \le \sum_{i=1}^{\infty}\Big(E\big(X_i^{(1)}+Y_i^{(1)}\big)^{p'}1\big(i\le G_L^{(1)}\big)\Big)^{1/p'}$$
$$= \Big(E\big(X_1^{(1)}+Y_1^{(1)}\big)^{p'}\Big)^{1/p'}\sum_{i=1}^{\infty}\Big(P\big(G_L^{(1)}\ge i\big)\Big)^{1/p'} \le C\,\Big(\bar F_{\rm on}\Big(\frac{L}{1-r}\Big)\Big)^{-1},$$
where $C$ is a finite positive constant. It follows from (4.26) that we will prove (4.20) by showing that
(4.28) $\quad \lim_{L\to\infty}P_{H,x_0}\big(\tau(L)<\theta(L(1-\epsilon))\big) = 0.$

Let us prove first that for every

(4.29) $\quad N > \frac{2k\alpha}{\alpha-1}$

we have

(4.30) $\quad \lim_{L\to\infty}P_{H,x_0}\big(\tau(L)<\theta(L/N)\big) = 0.$
To this end, let us "unpool" the system. That is, imagine $k$ separate fluid queueing systems defined by

(4.31) $\quad dX_j(t) = Z_j(t)\,dt - \frac{r}{k}\big(X_j\big)(t)\,dt,\qquad t\ge0,$

where $\{Z_j(t),\ t\ge0\}$ is given by (4.7), and $X_j(0) = \frac{1}{k}x_0$, $j=1,\dots,k$. The $k$ processes $\{X_j(t),\ t\ge0\},\ j=1,\dots,k$, are, conditionally on the initial delays $(D_1^{(0)},\dots,D_k^{(0)})$, independent. Let $Y^{(k)}(t) = X_1(t)+\dots+X_k(t)$, $t\ge0$. The two processes $\{X^{(k)}(t),\ t\ge0\}$ and $\{Y^{(k)}(t),\ t\ge0\}$ describe the states of two queueing systems. Obviously $X^{(k)}(0)=Y^{(k)}(0)$; the two systems have identical inflow streams of work, while the outflow of work from $X^{(k)}(\cdot)$, when the system is not empty, is always at rate $r$, whereas the rate of outflow of work from $Y^{(k)}(\cdot)$ does not exceed $r$. Therefore, for every $\omega$,

(4.32) $\quad X^{(k)}(t) \le Y^{(k)}(t),\qquad t\ge0.$

Note that (4.32) is just an expression of the benefit of pooling the system resources. Define

(4.33) $\quad \tau^{(Y)}(L) = \inf\{t\ge0:\ Y^{(k)}(t)\ge L\}.$
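The pathwise comparison (4.32) survives discretization: in a simple Euler scheme for the dynamics of (4.31), the pooled buffer never exceeds the sum of the unpooled ones. The following sketch is illustrative only; the exponential on/off laws and all parameter values are assumptions, not part of the paper:

```python
import random

def pooling_dominance_gap(k, r, dt=0.01, horizon=100.0, seed=1):
    """Euler scheme: k on/off sources, one pooled buffer served at rate r
    versus k 'unpooled' buffers served at rate r/k each.  Returns the
    maximum of X(t) - Y(t) over the time grid, which by the pooling
    comparison should never be positive (up to float tolerance)."""
    rng = random.Random(seed)
    state = [True] * k                                  # all sources start on
    clocks = [rng.expovariate(1.0) for _ in range(k)]   # time left in current period
    x = 0.0                                             # pooled buffer content
    xs = [0.0] * k                                      # unpooled buffer contents
    worst, t = 0.0, 0.0
    while t < horizon:
        z = sum(state)                                  # number of sources on
        x = max(0.0, x + (z - r) * dt)
        for j in range(k):
            xs[j] = max(0.0, xs[j] + ((1.0 if state[j] else 0.0) - r / k) * dt)
        worst = max(worst, x - sum(xs))
        for j in range(k):
            clocks[j] -= dt
            if clocks[j] <= 0.0:
                state[j] = not state[j]
                # on times mean 1, off times mean 2 (illustrative choice)
                clocks[j] = rng.expovariate(1.0 if state[j] else 0.5)
        t += dt
    return worst

print(pooling_dominance_gap(k=3, r=1.5))
```

The invariant holds step by step because $\sum_j\max(0,a_j)\ge\max(0,\sum_j a_j)$: truncating each unpooled buffer at 0 separately only slows the combined outflow.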
Then (4.32) implies that $\tau^{(Y)}(L)\le\tau(L)$, so that
$$\big\{\omega:\ \tau(L)<\theta(L/N)\big\} \subseteq \big\{\omega:\ \tau^{(Y)}(L)<\theta(L/N)\big\},$$
and, therefore,

(4.34) $\quad P_{H,x_0}\big(\tau(L)<\theta(L/N)\big) \le P_{H,x_0}\big(\tau^{(Y)}(L)<\theta(L/N)\big).$
Now let, for $j=1,\dots,k$,
$$\tau^{(j)}(L) = \inf\{t\ge0:\ X_j(t)\ge L\}.$$
Then

(4.35) $\quad P_{H,x_0}\big(\tau^{(Y)}(L)<\theta(L/N)\big) \le P_{H,x_0}\big(\tau^{(j)}(L/k)<\theta(L/N)\ \text{for some}\ j=1,\dots,k\big)$
$$\le \sum_{j=1}^{k}P_{H,x_0}\big(\tau^{(j)}(L/k)<\theta(L/N)\big) \le \sum_{j=1}^{k}P_{H,x_0}\big(\tau^{(j)}(L/k)<\theta_j(L/N)\big).$$

Therefore, (4.30) will follow from (4.34) and (4.35) once we prove that, for every

(4.36) $\quad M > \frac{2\alpha}{\alpha-1},$

we have

(4.37) $\quad \lim_{L\to\infty}P_{H,x_0}\big(\tau^{(j)}(L)<\theta_j(L/M)\big) = 0,$
for $j=1,\dots,k$. We will prove (4.37) for $j=1$. Let $T_0 = \inf\{t>D_1^{(0)}:\ X_1(t)=0\}$. Write
$$P_{H,x_0}\big(\tau^{(1)}(L)<\theta_1(L/M)\big) = P_{H,x_0}\big(T_0\le\tau^{(1)}(L)<\theta_1(L/M)\big) + P_{H,x_0}\big(\tau^{(1)}(L)<\theta_1(L/M)\wedge T_0\big).$$
Since, for all $L>2x_0$,
$$P_{H,x_0}\big(\tau^{(1)}(L)<\theta_1(L/M)\wedge T_0\big) \le P_{H,x_0}\big(\tau^{(1)}(L)<T_0\big) \le P\big(\tau^{(1)}(L/2)<T_0\big) \to 0$$
as $L\to\infty$ because of the rate condition (4.5) (or recall Corollary 2.2), (4.37) will follow if we prove that

(4.38) $\quad \lim_{L\to\infty}P_{H,x_0}\big(T_0\le\tau^{(1)}(L)<\theta_1(L/M)\big) = 0.$

Clearly, at time $T_0$ the system is in an off period. Denote by $\tilde H$ the law (under $P_{H,x_0}$) of the remainder of this off period after time $T_0$. By the strong Markov property,
$$P_{H,x_0}\big(T_0\le\tau^{(1)}(L)<\theta_1(L/M)\big) \le P_{\tilde H,0}\big(\tau^{(1)}(L)<\theta_1(L/M)\big).$$
Therefore, (4.38) will follow once we prove (4.37) with $x_0=0$, and so (4.37) in its generality will follow if we prove it only for $x_0=0$. Assume, therefore, that $x_0=0$. For $K_1\ge0$ and $K_2>0$ let $\{X_1^{(K_1,K_2)}(t),\ t\ge0\}$ denote the process given by (4.31) (with $j=1$) when $D_1^{(0)}$ is replaced by $D_1^{(0)}\wedge K_1$ and each $Y_i^{(1)}$ is replaced by $Y_i^{(1)}\wedge K_2$. Let $\tau^{(1)}(L;K_1,K_2)$ and $\theta_1(L;K_1,K_2)$ be the random times analogous to $\tau^{(1)}(L)$ and $\theta_1(L)$ respectively, defined with respect to the process $\{X_1^{(K_1,K_2)}(t),\ t\ge0\}$. Observe that the event
$$A_{K_1,K_2} = \big\{\omega:\ \tau^{(1)}(L;K_1,K_2)<\theta_1(L/M;K_1,K_2)\big\}$$
increases when $K_1$ and $K_2$ decrease. It is enough, therefore, to prove (4.37) in the case when $K_1=0$ and $K_2$ is any finite positive number such that
$$\frac{\mu_{\rm on}}{\mu_{\rm on}+E\big(Y_1^{(1)}\wedge K_2\big)} < \frac{r}{k}.$$
In other words, we will prove (4.37) in the case when $D_1^{(0)}=0$, the off times are bounded (by $K_2$) and $x_0=0$. We will, therefore, once again use $P$ and $E$ without any subscripts. We define three events $A_i(L)$, $i=1,2,3$, corresponding to the following three possibilities.

(i) $\theta_1(L/M)<\tau^{(1)}(L)\wedge T_0$. That is, the process $\{X_1(t),\ t\ge0\}$ begins an on interval of length at least $\frac{L}{M(1-r)}$ before reaching either level $L$ or returning to 0.

(ii) $\tau^{(1)}(L)<\theta_1(L/M)\wedge T_0$. In other words, the process $\{X_1(t),\ t\ge0\}$ reaches level $L$ before starting an on interval of length at least $\frac{L}{M(1-r)}$ and before returning to 0.

(iii) $T_0<\tau^{(1)}(L)\wedge\theta_1(L/M)$. In other words, the process $\{X_1(t),\ t\ge0\}$ returns to 0 before reaching level $L$ and without having an on interval of length at least $\frac{L}{M(1-r)}$.

Clearly,
$$P\big(\tau^{(1)}(L)<\theta_1(L/M)\big) = P\big(\{\tau^{(1)}(L)<\theta_1(L/M)\}\cap A_2(L)\big) + P\big(\{\tau^{(1)}(L)<\theta_1(L/M)\}\cap A_3(L)\big).$$
However, by the strong Markov property,
$$P\big(\{\tau^{(1)}(L)<\theta_1(L/M)\}\cap A_3(L)\big) = P\big(A_3(L)\big)\,P\big(\tau^{(1)}(L)<\theta_1(L/M)\big).$$
Therefore,

(4.39) $\quad P\big(\tau^{(1)}(L)<\theta_1(L/M)\big) = \frac{P\big(\{\tau^{(1)}(L)<\theta_1(L/M)\}\cap A_2(L)\big)}{1-P(A_3(L))}.$
Let $\xi_i = (1-r)X_i^{(1)} - rY_i^{(1)}$, $i=1,2,\dots$, and let $S_n^{(\xi)} = \xi_1+\dots+\xi_n$, $S_0^{(\xi)}=0$. Taking into account that the off times are bounded, we can now repeat the argument used to estimate $I_{11}(x)$ in the proof of Proposition 2.10 to conclude that
$$P\big(\{\tau^{(1)}(L)<\theta_1(L/M)\}\cap A_2(L)\big) \le \Big(P\Big(\bigvee_{n=0}^{\infty}S_n^{(\xi)}>\frac{L}{M(1-r)}\Big)\Big)^{[M/2]},$$
and $P\big(\bigvee_{n=0}^{\infty}S_n^{(\xi)}>L\big)$ is regularly varying in $L$ with index $-(\alpha-1)$. Moreover,
$$1-P\big(A_3(L)\big) \ge P\big(A_1(L)\big) \ge P\Big(X_1^{(1)}>\frac{L}{M(1-r)}\Big) = \bar F_{\rm on}\Big(\frac{L}{M(1-r)}\Big).$$
We conclude by (4.39) and (4.36) that
$$\limsup_{L\to\infty}P\big(\tau^{(1)}(L)<\theta_1(L/M)\big) \le \limsup_{L\to\infty}\frac{\Big(P\big(\bigvee_{n=0}^{\infty}S_n^{(\xi)}>\frac{L}{M(1-r)}\big)\Big)^{[M/2]}}{\bar F_{\rm on}\big(\frac{L}{M(1-r)}\big)} = 0.$$
This proves (4.37), and so (4.30) is proven as well. We observe at this point that the argument above, which allowed us to assume that the initial delays are equal to 0, shows that we have also proved that, for every $N$ satisfying (4.29),

(4.40) $\quad \limsup_{L\to\infty}\sup_H P_{H,0}\big(\tau(L)<\theta(L/N)\big) = 0.$
The next step in the proof of (4.28) is to show that two "long" on periods are unlikely to happen simultaneously. Formally, let

(4.41) $\quad Q_L = \inf\{t\ge0:\ \text{at time $t$ there are two on periods running, each of length at least $L$}\}.$

We claim that there is a function $\phi_L\to0$ as $L\to\infty$ such that

(4.42) $\quad \lim_{L\to\infty}P_{H,x_0}\Big(Q_L\le\phi_L^{-1}\big(\bar F_{\rm on}(L)\big)^{-1}\Big) = 0.$
Of course, $Q_L$ will only decrease if we assume that all delay times and off times are equal to 0, and $Q_L$ is unaffected by $x_0$. We will, therefore, once again drop the subscripts from $P$ and $E$, and assume that all off times are equal to 0. For $j_1,j_2=1,\dots,k$, $j_1\ne j_2$, let
$$Q_L^{(j_1,j_2)} = \inf\{t\ge0:\ \text{at time $t$ the processes $X_{j_1}(\cdot)$ and $X_{j_2}(\cdot)$ both have on periods running, each of length at least $L$}\}.$$
Then
$$Q_L = \bigwedge_{j_1\ne j_2}Q_L^{(j_1,j_2)},$$
and so, for any $q>0$,

(4.43) $\quad P(Q_L\le q) \le \frac{k(k-1)}{2}\,P\big(Q_L^{(1,2)}\le q\big).$

Let

(4.44) $\quad Q_L^{(1)} = \inf\{t\ge0:\ \text{at time $t$, $X_1(\cdot)$ begins an on period of length at least $L$ during which $X_2(\cdot)$ also begins an on period of length at least $L$}\},$

with $Q_L^{(2)}$ defined similarly. Then
$$Q_L^{(1,2)} \ge Q_L^{(1)}\wedge Q_L^{(2)},$$
which means that, for any $q>0$,

(4.45) $\quad P\big(Q_L^{(1,2)}\le q\big) \le 2\,P\big(Q_L^{(1)}\le q\big).$
Let $Z_n,\ n=1,2,\dots$, be an iid sequence such that
$$Z_1 \stackrel{d}{=} \sum_{i=1}^{G_L}\hat X_i,$$
where $G_L$ is a geometric random variable with parameter $\bar F_{\rm on}(L)$, independent of an iid sequence $\hat X_i,\ i=1,2,\dots$, with common law
$$P\big(\hat X_1\in A\big) = P\big(X_1\in A\,\big|\,X_1<L\big) = \frac{1}{F_{\rm on}(L)}\int_0^{L}1(x\in A)\,F_{\rm on}(dx).$$
Then $Z_1$ represents the first time $X_1(\cdot)$ starts an on period of length at least $L$. Let $H_L$ be yet another geometric random variable, independent of the sequence $Z_n,\ n=1,2,\dots$, this time with parameter $p_L$ defined as follows. Let $W$ be a random variable with distribution
$$P(W\in A) = P\big(X_1\in A\,\big|\,X_1>L\big) = \frac{1}{\bar F_{\rm on}(L)}\int_L^{\infty}1(x\in A)\,F_{\rm on}(dx),$$
independent of $X_2(\cdot)$. Recall that at time 0 the process $X_2(\cdot)$ starts an on interval. Then define

(4.46) $\quad p_L = P\Big(\inf\big\{S_n^{(2)}:\ n\ge1,\ X_{n+1}^{(2)}\ge L\big\}\le W\Big).$
That is, $p_L$ is the probability that $X_2(\cdot)$ begins an on period of length at least $L$ during an on period of $X_1(\cdot)$ whose length is at least $L$. If $X_2(\cdot)$ does not start such an on period, we then have to wait until the next on period of $X_1(\cdot)$ whose length is at least $L$. Therefore,

(4.47) $\quad Q_L^{(1)} \stackrel{st}{\ge} \sum_{n=1}^{H_L}Z_n.$

We claim that

(4.48) $\quad p_L\to0\quad\text{as}\ L\to\infty.$

Indeed,

(4.49) $\quad p_L = \frac{1}{\bar F_{\rm on}(L)}\int_L^{\infty}P(Z_1\le t)\,F_{\rm on}(dt) = \frac{1}{\bar F_{\rm on}(L)}\int_L^{\infty}P\Big(\sum_{i=1}^{G_L}\hat X_i\le t\Big)\,F_{\rm on}(dt).$
Observe that, for every $t\ge L$,

(4.50) $\quad P\Big(\sum_{i=1}^{G_L}\hat X_i\le t\Big) \le P\Big(G_L\le\big(\bar F_{\rm on}(L)\big)^{-1/2}\Big) + P\Big(\inf_{n>(\bar F_{\rm on}(L))^{-1/2}}\Big(\sum_{i=1}^{n}\hat X_i - n\,\frac{\mu_{\rm on}}{2}\Big)\le0\Big) + P\Big(\frac{\mu_{\rm on}}{2}\,G_L\le t\Big).$

Notice that the first term on the right hand side of (4.50) goes to 0 as $L\to\infty$, and that, by the strong law of large numbers, so does the second term. Since neither of these two terms depends on $t$, (4.48) will follow once we prove that

(4.51) $\quad \lim_{L\to\infty}\frac{1}{\bar F_{\rm on}(L)}\int_L^{\infty}P\Big(\frac{\mu_{\rm on}}{2}\,G_L\le t\Big)\,F_{\rm on}(dt) = 0.$
However, for all $L$ big enough and all $t\ge L$ we have $[2t/\mu_{\rm on}]\ge1$, and so
$$P\Big(\frac{\mu_{\rm on}}{2}\,G_L\le t\Big) = 1-\big(1-\bar F_{\rm on}(L)\big)^{[2t/\mu_{\rm on}]} \le \Big[\frac{2t}{\mu_{\rm on}}\Big]\,\bar F_{\rm on}(L).$$
Therefore,
$$\limsup_{L\to\infty}\frac{1}{\bar F_{\rm on}(L)}\int_L^{\infty}P\Big(\frac{\mu_{\rm on}}{2}\,G_L\le t\Big)\,F_{\rm on}(dt) \le \frac{2}{\mu_{\rm on}}\limsup_{L\to\infty}\int_L^{\infty}t\,F_{\rm on}(dt) = 0,$$
since $\mu_{\rm on}<\infty$. This proves (4.51) and, therefore, we have (4.48). Define now $\phi_L = p_L^{1/2}$. Observe that by (4.43), (4.45) and (4.47) we have
(4.52) $\quad P\Big(Q_L\le\phi_L^{-1}\big(\bar F_{\rm on}(L)\big)^{-1}\Big) \le k(k-1)\,P\Big(\sum_{n=1}^{H_L}Z_n\le p_L^{-1/2}\big(\bar F_{\rm on}(L)\big)^{-1}\Big).$

However,
$$\sum_{n=1}^{H_L}Z_n \stackrel{d}{=} \sum_{i=1}^{J_L}\hat X_i,$$
where $J_L$ is a geometric random variable with parameter $p_L\bar F_{\rm on}(L)$, independent of $\{\hat X_i,\ i=1,2,\dots\}$. Since, as $L\to\infty$, $p_L\bar F_{\rm on}(L)\,J_L\Rightarrow E(1)$, where $E(1)$ is a standard exponential random variable, and $p_L^{1/2}\to0$ by (4.48), we immediately obtain (4.42) using (4.52) and the law of large numbers. Let us now go back to the proof of (4.28). Observe, first of all, that for all $L$ big enough,
$$P_{H,x_0}\big(\tau(L)<\theta(L(1-\epsilon))\big) \le P_{H,0}\big(\tau(L(1-\epsilon/2))<\theta(L(1-\epsilon))\big).$$
Therefore, it is enough to prove (4.28) for $x_0=0$, for the general case will follow by making $\epsilon$ smaller. We will, therefore, use the notation $P_H$ and $E_H$ when $x_0=0$. Fix any $N$ satisfying (4.29), and observe that it is enough to prove (4.28) for $\epsilon=1/N$. Fix, further, positive constants $\delta_2$ and $\delta_3$ satisfying

(4.53) $\quad (k-1)\delta_3 \le \frac{\epsilon}{2}-\delta_2$

(with $\delta_2<\epsilon/2$, so that the right hand side is positive). We have

(4.54) $\quad P_H\big(\tau(L)<\theta(L(1-\epsilon))\big) \le P_H\big(\tau(L)<\theta(L(1-\epsilon)),\ \tau(\delta_2L)\ge\theta(\delta_3L),\ \tau(L)<Q_{\delta_3L}\big) + P_H\big(\tau(\delta_2L)<\theta(\delta_3L)\big) + P_H\big(\tau(L)\ge Q_{\delta_3L}\big).$

Observe that, by (2.30),
$$\lim_{L\to\infty}P_H\big(\tau(\delta_2L)<\theta(\delta_3L)\big) = 0.$$
Furthermore, by (4.42), (4.18) and Markov's inequality,
$$P_H\big(\tau(L)\ge Q_{\delta_3L}\big) \le P_H\Big(Q_{\delta_3L}\le\phi_{\delta_3L}^{-1}\big(\bar F_{\rm on}(\delta_3L)\big)^{-1}\Big) + P_H\Big(\tau(L)\ge\phi_{\delta_3L}^{-1}\big(\bar F_{\rm on}(\delta_3L)\big)^{-1}\Big)$$
$$\le P_H\Big(Q_{\delta_3L}\le\phi_{\delta_3L}^{-1}\big(\bar F_{\rm on}(\delta_3L)\big)^{-1}\Big) + \phi_{\delta_3L}\,\bar F_{\rm on}(\delta_3L)\,E_H\tau(L) \to 0$$
as $L\to\infty$. Therefore, (4.28) will follow once we prove that

(4.55) $\quad \lim_{L\to\infty}P_H\big(\tau(L)<\theta(L(1-\epsilon)),\ \tau(\delta_2L)\ge\theta(\delta_3L),\ \tau(L)<Q_{\delta_3L}\big) = 0.$

As a matter of fact, we will prove an even stronger statement, namely that

(4.56) $\quad \lim_{L\to\infty}\sup_H P_H\big(\tau(L)<\theta(L(1-\epsilon)),\ \tau(\delta_2L)\ge\theta(\delta_3L),\ \tau(L)<Q_{\delta_3L}\big) = 0.$

Let
$$B(L) = \big\{\tau(L)<\theta(L(1-\epsilon)),\ \tau(\delta_2L)\ge\theta(\delta_3L),\ \tau(L)<Q_{\delta_3L}\big\}.$$
We split this event into two events, $B_1(L)$ and $B_2(L)$, according to the following two possibilities. (i) After starting the first "wet period", the process $X^{(k)}(\cdot)$ reaches level $L$ before returning to 0. (ii) The process $X^{(k)}(\cdot)$ returns to 0 before reaching level $L$. Let us look at the event $B_1(L)$ first. Since $B_1(L)\subseteq B(L)$, at time $\tau(\delta_2L)$ at most one of the $k$ on/off processes has a currently running on period whose length is at least $\delta_3L$. Depending on whether the number of such on/off processes is 0 or 1, we split the event $B_1(L)$ into $B_{11}(L)$ and $B_{12}(L)$. Let us look, for example, at $B_{12}(L)$; the treatment of the event $B_{11}(L)$ is similar. Since $B_{12}(L)\subseteq B(L)$, the single on period running at time $\tau(\delta_2L)$ of length at least $\delta_3L$ has length not exceeding $\frac{L(1-\epsilon)}{1-r}$. Let us now modify the state of the system at time $\tau(\delta_2L)$ in the following way. Bring all the work remaining in the presently running on periods in one "lump" at time $\tau(\delta_2L)$, and attach the remainders of these on periods to the subsequent off periods. Obviously, this action can only make $\tau(L)$ smaller, and so it can only increase the probability of the event $B_{12}(L)$. Observe that, after this action, the state of the system does not exceed
$$\delta_2L + L(1-\epsilon) + (k-1)\delta_3L \le L\Big(1-\frac{\epsilon}{2}\Big)$$
by (4.53). We increase, if necessary, the state of the system to exactly $L(1-\epsilon/2)$ (in the process only increasing the probability of the event $B_{12}(L)$). We conclude that, for some law $H_0$,

(4.57) $\quad P_H\big(B_{12}(L)\big) \le P_{H_0,\,L(1-\epsilon/2)}\Big(\sup_{t\ge0}V^{(k)}(t)\ge L\Big),$
where $\{V^{(k)}(t),\ t\ge0\}$ is given by
$$dV^{(k)}(t) = Z^{(k)}(t)\,dt - r\,dt,$$
with $\{Z^{(k)}(t),\ t\ge0\}$ given by (4.3) and (4.7). However, $V^{(k)}(t) = V_1(t)+\dots+V_k(t)$, $t\ge0$, where, for $j=1,\dots,k$, the process $\{V_j(t),\ t\ge0\}$ is defined by $dV_j(t) = Z_j(t)\,dt - \frac{r}{k}\,dt$, with $V_j(0) = L(1-\epsilon/2)/k$ and with initial delay governed by the $j$th marginal law, $H^{(j)}$, of $H_0$. We conclude immediately from (4.57) that
$$P_H\big(B_{12}(L)\big) \le k\,\sup_{H^{(1)}}P_{H^{(1)},\,L(1-\epsilon/2)/k}\Big(\sup_{t\ge0}V_1(t)\ge\frac{L}{k}\Big) = k\,P\Big(\sup_{t\ge0}V_1(t)\ge\frac{L\epsilon}{2k}\Big),$$
where $P$ without a subscript indicates, as usual, absence of delay and zero initial state. Because of the negative drift, we conclude that

(4.58) $\quad \lim_{L\to\infty}\sup_H P_H\big(B_{12}(L)\big) = 0.$

In exactly the same way one can show that

(4.59) $\quad \lim_{L\to\infty}\sup_H P_H\big(B_{11}(L)\big) = 0.$
Finally, we consider the event $B_2(L)$. Consider the two possibilities that are feasible after the process reaches 0: either after that time the state of the system reaches the level $\delta_2L$ before the beginning of the first on period of length at least $\frac{\delta_3L}{1-r}$, or not. Accordingly, by the strong Markov property,
$$P_H\big(B_2(L)\big) \le P_H\big(\theta(\delta_3L)<\theta(L(1-\epsilon))\big)\Big(\sup_G P_G\big(\tau(\delta_2L)<\theta(\delta_3L)\big) + \sup_G P_G\big(B(L)\big)\Big).$$
Taking the supremum over $H$, we obtain
$$\sup_H P_H\big(B(L)\big) \le \sup_H P_H\big(B_{11}(L)\big) + \sup_H P_H\big(B_{12}(L)\big) + P\big(\theta(\delta_3L)<\theta(L(1-\epsilon))\big)\Big(\sup_H P_H\big(\tau(\delta_2L)<\theta(\delta_3L)\big) + \sup_H P_H\big(B(L)\big)\Big),$$
which is the same as

(4.60) $\quad \sup_H P_H\big(B(L)\big) \le \frac{\sup_H P_H\big(B_{11}(L)\big) + \sup_H P_H\big(B_{12}(L)\big) + \sup_H P_H\big(\tau(\delta_2L)<\theta(\delta_3L)\big)}{P\big(\theta(\delta_3L)=\theta(L(1-\epsilon))\big)}.$

Now (4.56) follows from (4.60), (4.58), (4.59), (4.40) and the fact that
$$P\big(\theta(\delta_3L)=\theta(L(1-\epsilon))\big) = \frac{\bar F_{\rm on}\big(\frac{L(1-\epsilon)}{1-r}\big)}{\bar F_{\rm on}\big(\frac{\delta_3L}{1-r}\big)} \to \Big(\frac{\delta_3}{1-\epsilon}\Big)^{\alpha} > 0$$
as $L\to\infty$, by regular variation. This completes the proof of the theorem.
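The last step around (4.52) rests on the elementary fact that a geometric random variable with a small success parameter $q$, rescaled by $q$, is approximately standard exponential. A quick numerical check of this (illustrative only, not part of the proof):

```python
import math

def geom_tail_rescaled(q, x):
    """P(qG > x) for G geometric on {0,1,...} with success parameter q:
    equals (1-q)**(floor(x/q)+1), which tends to exp(-x) as q -> 0."""
    return (1.0 - q) ** (math.floor(x / q) + 1)

for q in (0.1, 0.01, 0.001):
    print(q, geom_tail_rescaled(q, 1.0), math.exp(-1.0))
```

As $q$ shrinks, the rescaled tail at $x=1$ approaches $e^{-1}\approx0.368$; in the proof, $q = p_L\bar F_{\rm on}(L)$ plays this role for $J_L$.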
5. Appendix.
We will construct explicitly a stationary version of $\{X^{(k)}(t),\ t\ge0\}$. In fact, we will construct a stationary version of $\{X^{(k)}(t),\ -\infty<t<\infty\}$ on the whole real line. We proceed as follows. Let $\{Z_j(t),\ -\infty<t<\infty\},\ j=1,\dots,k$, be stationary. Then $\{Z^{(k)}(t),\ -\infty<t<\infty\}$, defined by (4.3), is itself stationary. We will construct $\{X^{(k)}(t),\ -\infty<t<\infty\}$ by exhibiting a measurable function $\varphi: D(\mathbb R,\{0,1,\dots,k\})\to\mathbb R_+$ such that the process

(5.1) $\quad X^{(k)}(t) = \varphi\big(Z^{(k)}(\cdot+t)\big),\qquad -\infty<t<\infty,$

satisfies (4.4). Clearly, the process given by (5.1) is stationary. Let $x = (x(u),\ -\infty<u<\infty)\in D(\mathbb R,\{0,1,\dots,k\})$, and let
$$\dots<T_{-2}<T_{-1}<0\le T_1<T_2<\dots$$
be the epoch times of $\{x(u),\ -\infty<u<\infty\}$; that is, at each time $T_i$, $\{x(u),\ -\infty<u<\infty\}$ changes its value. Let
$$Z_{0,1} = \big\{i:\ x(T_i-0)=0,\ x(T_i+0)=1\big\}.$$
For $x = (Z^{(k)}(t),\ -\infty<t<\infty)$, the points $T_i$ with $i\in Z_{0,1}$ are the times when one of the on/off processes begins an on period following a period during which all $k$ processes were off. For every $i\in Z_{0,1}$ let $L_i$ be the length of the "wet period" commencing at $T_i$. Formally, set $X^{(k)}(T_i)=0$ and define $\{X^{(k)}(u),\ u\ge T_i\}$ by the following analogue of (4.4):

(5.2) $\quad dX^{(k)}(u) = x(u)\,du - r\big(X^{(k)}\big)(u)\,du.$

Then
$$L_i = \inf\big\{u>0:\ X^{(k)}(T_i+u)=0\big\}.$$
If we denote
$$G_1 = \big\{x:\ \text{for every}\ i\in Z_{0,1}(x),\ L_i(x)<\infty\big\},$$
then the rate condition (4.5) implies that the set $G_1\subseteq D(\mathbb R,\{0,1,\dots,k\})$ has full measure under the probability measure $\mu_0$ induced by $\{Z^{(k)}(t),\ -\infty<t<\infty\}$ on $D(\mathbb R,\{0,1,\dots,k\})$. Now, for $-\infty<t<\infty$, let
$$A_{0,1}(t) = \big\{T_i\le t:\ i\in Z_{0,1}\big\},$$
and define

(5.3) $\quad T(t) = \inf\big\{T_i\in A_{0,1}(t):\ T_i+L_i\ge t\big\}.$

Denote
$$G_2 = \big\{x:\ \text{for every}\ -\infty<t<\infty,\ T_i+L_i\ge t\ \text{for only finitely many}\ i\in A_{0,1}(t)\big\}.$$
We claim that the set $G_2\subseteq D(\mathbb R,\{0,1,\dots,k\})$ has full measure under the probability measure $\mu_0$. That is, we claim that

(5.4) $\quad P\big\{\omega:\ \text{for some}\ -\infty<t<\infty,\ T_i+L_i\ge t\ \text{for infinitely many}\ i\in A_{0,1}(t)\big(Z^{(k)}(\cdot)(\omega)\big)\big\} = 0.$

Now, concentrating on rational times only, we see that (5.4) will follow once we prove that, for any $-\infty<t<\infty$, we have

(5.5) $\quad P\big\{\omega:\ T_i+L_i>t\ \text{for infinitely many}\ i\in A_{0,1}(t)\big(Z^{(k)}(\cdot)(\omega)\big)\big\} = 0.$
We check (5.5) for $t=0$. We have
$$P\big\{\omega:\ T_i+L_i>0\ \text{for infinitely many}\ i\in A_{0,1}(0)\big(Z^{(k)}(\cdot)(\omega)\big)\big\} \le P\Big(\limsup_{n\to-\infty}\Big\{\omega:\ \sum_{j=1}^{k}\int_n^0 Z_j(u)\,du \ge r|n|-r\Big\}\Big),$$
and the last probability equals 0 because of the rate condition (4.5) and the obvious fact that
$$\frac{1}{|n|}\sum_{j=1}^{k}\int_n^0 Z_j(u)\,du \to k\,\frac{\mu_{\rm on}}{\mu_{\rm on}+\mu_{\rm off}}$$
as $n\to-\infty$. This proves (5.5), and so (5.4) holds. Finally, let
$$G_3 = \big\{x:\ \text{for every}\ -\infty<t<\infty\ \text{there is a}\ u\le t\ \text{such that}\ x(u)=k\big\}.$$
Note that $F_{\rm on}*F_{\rm off}$ being non-arithmetic implies that the set $G_3\subseteq D(\mathbb R,\{0,1,\dots,k\})$ has full measure under the probability measure $\mu_0$. Finally, we let $G = G_1\cap G_2\cap G_3$, and we note that $G$ also has full measure under $\mu_0$. We now define the function $\varphi(x),\ x\in D(\mathbb R,\{0,1,\dots,k\})$. If $x\in G^c$, let $\varphi(x)=0$ (an arbitrary number). Suppose now that $x\in G$. Since $x\in G_3$, we may meaningfully define $s(x) = \sup\{u\le0:\ x(u)=k\}$. Since $x\in G_2$, $T(x;s(x))$ is well defined by (5.3). Now define $\varphi(x)$ as the value $X^{(k)}(0)$ obtained by solving (5.2) on $(T(x;s(x)),\infty)$ with $X^{(k)}(T(x;s(x)))=0$. Intuitively, $s(Z^{(k)}(\cdot))$ is the last time before time 0 at which all $k$ on/off processes were on, which is identifiable as a time point when the system is in a "wet period", and $T(Z^{(k)}(\cdot);s(Z^{(k)}(\cdot)))$ is the beginning of that "wet period", at which time the system must be empty. This constructs the required function $\varphi: D(\mathbb R,\{0,1,\dots,k\})\to\mathbb R_+$, and thus a stationary version of a process satisfying (4.4).

References

Abate, J., Choudhury, G. and Whitt, W., Waiting-time tail probabilities in queues with long-tail service-time distributions, Queueing Systems Theory Appl. 16 (1994), 311-338.
Anick, D., Mitra, D. and Sondhi, M., Stochastic theory of a data-handling system with multiple sources, Bell System Tech. J. 61 (1982), 1871-1894.
Asmussen, S., Applied Probability and Queues, J. Wiley and Sons, New York, 1987.
Asmussen, S. and Perry, D., On cycle maxima, first passage problems and extreme value theory for queues, Stochastic Models 8 (1992), 421-458.
Avram, F. and Taqqu, M.S., Weak convergence of moving averages with infinite variance, Dependence in Probability and Statistics, Progr. Probab. Statist. 11, Birkhauser, Boston, 1986, pp. 399-415.
Avram, F. and Taqqu, M.S., Probability bounds for M-Skorohod oscillations, Stochastic Processes and their Applications 33 (1989), 63-72.
Avram, F. and Taqqu, M.S., Weak convergence of sums of moving averages in the α-stable domain of attraction, Ann. Probability 20 (1992), 483-503.
Berger, A. and Whitt, W., Maximum values in queueing processes, Probab. Engrg. Inform. Sci. 9 (1995), 375-409.
Billingsley, P., Convergence of Probability Measures, Wiley, New York, 1968.
Bingham, N., Goldie, C. and Teugels, J., Regular Variation, Encyclopedia of Mathematics and its Applications, Cambridge University Press, Cambridge, UK, 1987.
Brichet, F., Roberts, J., Simonian, A. and Veitch, D., Heavy traffic analysis of a storage model with long range dependent on/off sources, Preprint (1996).
Choudhury, G. and Whitt, W., Long-tail buffer-content distributions in broadband networks, Preprint, AT&T Bell Laboratories (1995).
Crovella, M. and Bestavros, A., Explaining world wide web traffic self-similarity, Preprint available as TR-95-015 from {crovella,best}@cs.bu.edu (1995).
Cunha, C., Bestavros, A. and Crovella, M., Characteristics of WWW client-based traces, Preprint available as BU-CS-95-010 from {crovella,best}@cs.bu.edu (1995).
Feller, W., An Introduction to Probability Theory and its Applications, Vol. II, second edition, Wiley, New York, 1971.
Heath, D., Resnick, S. and Samorodnitsky, G., Heavy tails and long range dependence in on/off processes and associated fluid models, Available as TR1144.ps.Z at http://www.orie.cornell.edu/trlist/trlist.html (1996).
Iglehart, D., Extreme values in the GI/G/1 queue, Ann. Math. Statist. 43 (1972), 627-635.
Kella, O. and Whitt, W., A storage model with a two-state random environment, Opns. Res. 40 (1992), 257-262.
Kwapien, S. and Woyczynski, W.A., Random Series and Stochastic Integrals: Single and Multiple, Birkhauser, Boston, 1992.
Leland, W., Taqqu, M., Willinger, W. and Wilson, D., On the self-similar nature of Ethernet traffic, Proceedings of the ACM/SIGCOMM '93, San Francisco, ACM/SIGCOMM Computer Communications Review 23 (1993), 183-193.
Leland, W., Taqqu, M., Willinger, W. and Wilson, D., On the self-similar nature of Ethernet traffic (extended version), IEEE/ACM Transactions on Networking 2 (1994), 1-15.
Pacheco, A. and Prabhu, N.U., A Markovian storage model, Ann. Applied Probability 6 (1996), 76-91.
Prabhu, N.U. and Pacheco, A., A storage model for data communication systems, Queueing Systems 19 (1995), 1-40.
Pratt, J., On interchanging limits and integrals, Ann. Math. Statist. 31 (1960), 74-77.
Resnick, S., Point processes, regular variation and weak convergence, Advances Appl. Probability 18 (1986), 66-138.
Resnick, S., Extreme Values, Regular Variation, and Point Processes, Springer-Verlag, New York, 1987.
Resnick, S., Adventures in Stochastic Processes, Birkhauser, Boston, 1992.
Willinger, W., Taqqu, M., Leland, W. and Wilson, D., Self-similarity in high-speed packet traffic: analysis and modeling of Ethernet traffic measurements, Statistical Science 10 (1995), 67-85.
Willinger, W., Taqqu, M., Sherman, R. and Wilson, D., Self-similarity through high-variability: statistical analysis of Ethernet LAN traffic at the source level, Preprint (1995).

David Heath, Cornell University, School of Operations Research and Industrial Engineering, ETC Building, Ithaca, NY 14853 USA
E-mail address: [email protected]
Sidney I. Resnick, Cornell University, School of Operations Research and Industrial Engineering, ETC Building, Ithaca, NY 14853 USA
E-mail address: [email protected]
Gennady Samorodnitsky, Cornell University, School of Operations Research and Industrial Engineering, ETC Building, Ithaca, NY 14853 USA
E-mail address: [email protected]