Multiple time scales in Markovian ATM models
I. Formal calculations

Adam Shwartz
Electrical Engineering, Technion, Israel

Alan Weiss
Bell Laboratories, Murray Hill, New Jersey

September 17, 1998

Abstract

Multiple time scale models are an attractive alternative to long-range dependence models via heavy tails. They are Markovian models, hence analyzable, and capture some of the key phenomena manifest in long-range dependence. We analyze extensions of the AMS model of ATM traffic. The models include both open types (where connections arrive and leave) and closed, fixed-population types. Connections present either transmit “fluid” through a buffer to a finite-capacity link, or are in one of several idle states. We analyze the most likely behavior of connections and buffer, as well as the probability of and path to buffer overflow. We briefly indicate how our models and analyses might be applicable to the design of connection admission controls, and to the effect of pricing on user behavior. Our analysis is asymptotic in the buffer size and link capacity, and uses techniques from the theory of large deviations.

1 Introduction

The existence of multiple time scales in teletraffic has recently received considerable attention. Willinger and his colleagues [WTLW] emphasized the subject's importance in their seminal study of Ethernet and other traffic. Subsequently many have investigated the effect of multiple time scales in queueing and other models. Most investigators, e.g., Jelenkovic et al. [JLS], Heath et al. [HRS], Erramilli et al. [ENW], Parulekar and Makowski [PM], Duffield et al. [DLORT], Evans [Ev], and Norros [No], have found the implications to be important. A few [RE] have found simple Markovian models to be adequate for describing relevant models. Despite all this activity, there remain many open questions, and the subject is currently very active (see the 420 references in [WTE]).

One of the main problems with models of long-tailed distributions is that they are often difficult to analyze. On the other hand, Markovian models, which are frequently analyzable, have exponentially bounded tails. The present work attempts to model long-tailed distributions by explicit Markovian models of multiple time scales. The models are analyzable; indeed, the analyses constitute the bulk of this paper. The main defect in our analyses is that we do not provide a complete mathematical justification for some of the calculations. We hope to provide these justifications in a later paper [SW98].

Our models all share the characteristic that there are many time scales for periods of inactivity, but only one time scale for active traffic. This feature is possibly a substantial weakness of the models. Choudhury and Whitt [CW], Jelenkovic and Lazar [JL] and Heath et al. [HRS] showed that, for the systems they studied, the distribution of periods of inactivity was not as important as the distribution of activity periods. To some extent our results confirm this; see Section 9, where we show that for some models and asymptotic regimes, the existence of multiple time scales for inactive periods has no effect on steady state queueing distributions. Our results also show that in other asymptotic regimes the multiple time scales do have an effect. Furthermore, our analysis addresses transient as well as steady state statistics, and here the multiple time scales are obviously important.

Our analysis uses the techniques of fluid limits, time reversal, and sample path large deviations. Thus our approach is asymptotic. The regimes we consider all have many multiplexed traffic sources, high speed buffers, and a very low probability of buffer overflow.
Our work is based directly on the large deviations book [SW95], the paper on reversibility and large deviations [SW93] and the dissertation [Ma]. We give two applications of our results. One is to the effect of pricing on traffic. We briefly examine the effect on capacity, throughput, and length of connection time, when traffic sources are charged a fee for call set up, and a fee per unit time for holding a call, in addition to any fee per unit of information transmitted. The other application is to connection admission control (CAC). Here we do not derive explicit results, but we do describe how our analysis might be used in designing a CAC.


1.1 Outline of results

Our explicit solutions of some models reveal striking (but not surprising) characteristics. First, let us see how a buffer would begin to fill (the detailed models are in the next section; this is just to give a flavor of the results). In our models, sources arrive or leave, and each source moves in a Markovian manner between an active state, where it transmits, and several inactive states. The most likely way for the number of sources in the active state to become large is for the system as a whole to fill, and then for the inactive states to pump their customers into the active state; see Figure 7. When the time scales are well separated, this occurs as follows. The slowest time scale sources gradually accumulate, then push into the active state. When they have filled the active state to a certain amount, the next slowest time scale kicks in, pushing accumulated sources into the active state, and so on, until all the inactive states have emptied themselves back to their steady state values. After this description, it shouldn’t be surprising that the most likely way a small amount of traffic accumulates in the buffer is for the system to behave as just described, but then for the fastest time scale sources to continue to squeeze over into the active state. Note that our analysis does not require the time scales to be well separated. If they are not, then many classes of sources will squeeze into the active state. The situation changes when we consider how a huge amount of traffic might accumulate. In that case, sources at all the time scales simultaneously accumulate to a high, easily calculated level and hold there nearly until the overflow occurs (individual sources continue to move between states, but their distribution remains skewed).

1.2 Outline of the paper

In Section 2 we provide a description of the models under consideration. We introduce the fluid scaling and set the notation. Section 3 describes two applications of our results: to pricing and to Connection Admission Control (CAC). We start our analysis in Section 4, dealing with the steady-state distribution of the sources. Section 5 describes the most likely (transient) behavior of the sources, and consequently of the (scaled) buffer size. In Section 6 we show how reversibility can be used to obtain steady state results. The analysis of the unlikely sample path behavior begins in Section 7, where we derive the local rate functions for the open and the closed models. Section 8 deals with the buffer overflow problem, with particular attention (and more explicit results) for small buffers and for large buffers. We conclude in Section 9 with some immediate consequences of our analysis. Finally, in Appendix 10 we show how to derive the results directly from the variational equations associated with the large deviation problem.


2 A plethora of models

The basic models are based on Markovian networks of infinite-server queues. We begin with a description of “open” models, those that have arrivals and departures, postponing our description of “closed” models to the next subsection. Closed models have a fixed population. We denote open models by OK, and closed models by CK, where the index K denotes the dimension of the system.

2.1 Open models

There are K ≥ 1 queues in model OK. Customers, called “sources” below, arrive at state 1 according to a Poisson process with parameter nγδ. The rationale for this complicated parameter is given later. State 1 is called the “on” or “active” state. If K ≥ 2, each customer in state 1 makes transitions to state j (2 ≤ j ≤ K) at rate λ1j. The system is Markovian, so this is a Poisson rate. Furthermore, customers in state 1 leave the system at rate δ (this δ was introduced in the arrival rate nγδ). Customers in state j ≥ 2 make transitions only to state 1, and these occur at rate λj1. Let xi(t) denote the number of sources in state i at time t, and define ~x(t) = (x1(t), . . . , xK(t))T, where T denotes transposition.

Figure 1: The open source model, K = 4

This is a Markov process, and is completely characterized by its jump rates and jump directions. The jumps are described by the vectors ~eij of dimension K, indexed by {i, j};

~eij = (m, 0, . . . , 0, k, 0, . . . , 0)T .   (2.1)

This vector represents the possible transitions: an arrival (to state 1) is represented by m = 1, k = 0, and a departure by m = −1, k = 0. A transition from state 1 to state j is represented by m = −1, k = 1, where k is the jth entry of ~eij. A transition from state j to state 1 is represented by m = 1, k = −1. The jump rates of ~x depend on the state, and (with a slight abuse of notation) are given by

λij(~x) = λij · xi .   (2.2)
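The jump rates (2.2) translate directly into a standard continuous-time Markov chain simulation. The following sketch is not from the paper: it is a minimal Gillespie-style simulation of the source vector ~x(t) for a small open model; the helper name, rates, and parameters are all illustrative assumptions.

```python
import random

def simulate_open_model(n=100, gamma=0.5, delta=0.1,
                        lam1j=(2.0, 0.2), lamj1=(4.0, 0.4),
                        t_end=10.0, seed=1):
    """Gillespie simulation of the O_K source process (here K = 3).

    State x = [x1, ..., xK]; following (2.2) and the text, the rates are
      arrival to state 1:      n * gamma * delta
      departure from state 1:  x1 * delta
      state 1 -> state j:      x1 * lam1j[j-2]
      state j -> state 1:      xj * lamj1[j-2]
    All parameter values here are illustrative, not from the paper.
    """
    rng = random.Random(seed)
    K = 1 + len(lam1j)
    x = [0] * K
    t = 0.0
    while True:
        rates = [n * gamma * delta, x[0] * delta]
        rates += [x[0] * r for r in lam1j]                    # 1 -> j
        rates += [x[j + 1] * r for j, r in enumerate(lamj1)]  # j -> 1
        total = sum(rates)          # always > 0: arrivals never stop
        t += rng.expovariate(total)
        if t > t_end:
            return x
        u = rng.random() * total
        idx = 0
        for idx, r in enumerate(rates):
            if u < r:
                break
            u -= r
        while rates[idx] == 0.0:    # guard against float round-off
            idx -= 1
        if idx == 0:
            x[0] += 1                      # arrival
        elif idx == 1:
            x[0] -= 1                      # departure
        elif idx < 2 + len(lam1j):
            x[0] -= 1
            x[idx - 1] += 1                # 1 -> j
        else:
            j = idx - 1 - len(lam1j)
            x[j] -= 1
            x[0] += 1                      # j -> 1

x = simulate_open_model()
```

Because all rates are linear in the occupancies, the simulation needs no data structure beyond the occupancy vector itself.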

We use i = 0 to denote arrivals, so that λ01 = γδ is independent of the value of ~x, and j = 0 to denote departures, so that λ10(~x) = x1 δ. So far we have described the behavior of sources. There is another component to the system: an auxiliary buffer, which is called “the buffer” throughout this paper. The buffer’s contents are denoted b(t). The buffer’s behavior is governed by the vector ~x(t) as follows. Loosely speaking, the buffer drains at rate n · C, where n is the parameter introduced in the Poisson arrival rate nγδ, and C is another fixed positive quantity. The buffer fills at rate x1(t); in other words, each active source adds 1 to the buffer’s fill rate. To be more precise, we have

Figure 2: The open model with buffer

db(t)/dt = x1(t) − nC   if b(t) > 0 or x1(t) > nC,
db(t)/dt = 0            otherwise.                    (2.3)

Equation (2.3) has the following well-known solution [SW95, Eqn. 13.4]:

b(t) = sup_{0≤s≤t} ∫_s^t [ b(0)δ(u) + x1(u) − nC ] du .   (2.4)

2.2 Closed models

A closed model with K ≥ 2 states, denoted CK, can be defined as an open model OK with δ = 0. This means that there are no arrivals or departures, so the number of sources is fixed in time. We let n denote this number. CK is again a Markovian model, with state ~x(t).

Figure 3: The closed source model, K = 5

The closed model has a buffer b(t) that fills at rate x1(t) and drains at rate nC. The behavior of b(t) will be interesting only if C < 1, so that db/dt > 0 is possible.
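As a quick illustration of the buffer dynamics just described (fill at rate x1(t), drain at rate nC, reflected at zero), here is a minimal Euler-integration sketch. The on-source path and every parameter below are invented for illustration only.

```python
def buffer_trajectory(x1_path, n, C, b0=0.0, dt=0.01):
    """Euler integration of the buffer dynamics (2.3):
    db/dt = x1(t) - n*C when b > 0 or x1(t) > n*C, and 0 otherwise;
    b is clipped at 0 since the buffer content cannot be negative."""
    b = b0
    out = [b]
    for x1 in x1_path:
        drift = x1 - n * C
        if b > 0 or drift > 0:
            b = max(0.0, b + drift * dt)
        out.append(b)
    return out

# Invented on-source path: above the drain rate n*C for 1 time unit,
# then below it, so the buffer fills and then empties.
n, C = 100, 0.5
x1_path = [80.0] * 100 + [20.0] * 200
traj = buffer_trajectory(x1_path, n, C)
```

The trajectory rises at rate x1 − nC = 30 during the burst, peaks when the input drops below the drain rate, and then empties and stays at zero.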

2.3 Scalings

We introduce the standard fluid scaling

~zn(t) = (1/n) ~x(t),   (2.5)
bn(t) = (1/n) b(t).   (2.6)

Our point of view is that, for either closed or open models, we hold the parameters λij, γ, δ, and C fixed, and let n → ∞. We then apply the powerful machinery of fluid limit theorems and large deviations theory to generate our results. Our aim at this point is to calculate the steady state behavior, and to obtain small and large buffer asymptotics for the probability that buffers are nonempty and for the overflow probability. In some cases, we shall only go as far as to suggest a numerical scheme. Following the point of view of [SW95], we first verify that the calculations provide useful insights, and postpone proofs to [SW98], which is not yet complete.

There is an interesting dichotomy between the cases δ > 0 but small (model OK), and δ = 0 (model CK). As δ ↓ 0, transitions in or out of OK become rare, making the model look like CK. However, for any fixed δ > 0, as n → ∞, the arrival rate nγδ approaches ∞. Therefore, since our analysis is asymptotic in n, it is perhaps not too surprising that our results for CK and OK are very different. There is another closed model C′K+1 that approximates OK arbitrarily well over fixed time and space intervals.

Figure 4: Nearly equivalent open and closed models; they are equivalent with appropriately chosen rates

Take ε a small number, and set nC = (1 + γδ/ε)n, where nC stands for “n closed”. You can easily check that the Poisson rates of C′K+1 and OK are nearly the same. But of course the populations n and nC are quite different. We use this observation below in monotonicity analysis.

3 Applications and motivation

There are at least two applications of our work: connection admission control (CAC) and pricing. We discuss them in this section to motivate the reader, as well as to explain our motivation in considering this class of models. We note that both pricing for network services and CAC are major areas of research, and we do not wish to attempt even a cursory overview of the voluminous extant literature.


3.1 Pricing

We consider the effect of pricing schemes on user behavior and network utilization. From a network’s point of view, there are costs associated with setting up calls (tear-down costs can be included in call set-up costs), transmitting information, and reserving bandwidth to provide good quality of service (QoS) to connected, inactive calls. This last point means that a network that has many connected but inactive calls may need to refuse to establish new connections, because of the risk of being unable to provide QoS if the inactive users should reactivate. Therefore, it is in the network’s interest to charge inactive connections a bandwidth reservation fee per unit time. From a customer’s point of view, if he knows he is entering a long inactive state, he has some incentive to disconnect from the network if he is being charged a per-unit-time connection fee. There are of course costs associated with establishing a connection, even if there are no fees involved in call setup: the time taken to establish a call, the possibility of being refused a connection, etc. Therefore, customers are likely not to disconnect during short breaks. Furthermore, if there is a fee for establishing a connection, then users have a stronger disincentive to disconnect upon entering an idle state. (There may also be a cost per unit of information transmitted; however, we assume that the source is willing to pay this price, and that its remaining interest is to minimize its overhead.) All these considerations can be incorporated simply into the model we have just developed. Here is one approach. Consider an O4 model, where the 3 off-states have well-separated rates. For example, Off2 could represent pauses between keystrokes while typing (mean off-time approximately 1/4 second), Off3 could represent pauses for thought (mean off-time approximately 1/2 minute), and Off4 could represent pauses for lunch, or to deal with a different project (mean off-time approximately 1 hour). These three rates are separated by factors of about 100. In general, assume λ21 ≫ λ31 ≫ · · · ≫ λK1. Now suppose that there is a charge of ε per unit time to stay connected, and η to establish a connection.
If the average cost ε/λK1 of remaining connected through an off period in state K is less than η, then there is no incentive for a customer to close his connection while idle. However, if ε/λK1 ≫ η, then there is strong incentive to close the connection rather than enter state K. Indeed, the index J for which ε/λJ1 ≥ η > ε/λJ−1,1 is the smallest index of the states that are uneconomical to enter. For simplicity, let us assume that 2 < J < K. Let us further suppose that a fraction pi of customers would temporarily exit the system instead of entering state i, J ≤ i ≤ K. The quantities pi include the effect of the connect-time charge ε, the per-connection charge η, and the holding times 1/λi1. We do not specify the functional form of the pi, but they should clearly be monotone increasing in ε and in the holding times 1/λi1, and monotone decreasing in η. We may model this effect by adding K − J + 1 new states denoted J′, (J + 1)′, . . . , K′. We set λi′1 = λi1 for J ≤ i ≤ K, λ1i′ = pi · λ1i, and change λ1i to (1 − pi)λ1i. This makes the population of state i plus the population of state i′ equal to the population of the original state i. We regard customers in i′ states as having exited the system temporarily. Therefore, the population of the system is x1 + . . . + xK.

Figure 5: With pricing, some customers leave rather than enter a long idle period. Here J = 3, K = 4

The average reduction in population under this scheme is nγ Σ_{i=J}^K pi λ1i/λi1. The probability of buffer overflow is, of course, unchanged. The important effect of pricing here is to lower the number of long-time connected idle customers. Note also that our calculations are for steady-state quantities. Therefore, we believe that our assumption that the system is Markovian is unnecessary for our results on pricing to hold, at least as far as steady-state means are concerned. In other words, the holding times in states J′, (J + 1)′, . . . , K′ can be taken as general independent random variables. Of course, this means that we are not concerned with buffer statistics. It is also easy to examine the effect of such a model on a closed system, to calculate the effect of supposing that some people who temporarily disconnect actually leave forever, and to calculate the average cost savings to both user and network under such a strategy. But we do not wish to belabor the point, and the calculations are straightforward. In summary, if one can estimate the probabilities pi as a function of ε, η, and the λi1, then the correct price to charge for each connection and per unit of time will be straightforward to calculate.
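To make the population-reduction formula concrete, here is a small numeric sketch of nγ Σ_{i=J}^K pi λ1i/λi1 for a hypothetical O4 example in the spirit of the keystroke/thought/lunch time scales above. Every number here is invented for illustration.

```python
# Hypothetical O4 pricing example (all numbers invented): off-times of
# roughly 1/4 s, 30 s and 1 h; entry rates lam1i, exit rates lami1.
n, gamma = 1000, 0.8
lam1i = {2: 1.0, 3: 0.1, 4: 0.01}                   # rates 1 -> i
lami1 = {2: 4.0, 3: 1.0 / 30.0, 4: 1.0 / 3600.0}    # rates i -> 1
p = {3: 0.2, 4: 0.9}                                # disconnect fractions, J = 3

# Average reduction in population: n * gamma * sum_{i >= J} p_i * lam1i/lami1
reduction = n * gamma * sum(p[i] * lam1i[i] / lami1[i] for i in p)
```

With these made-up numbers the reduction is dominated by the long (state 4) idle periods, which is exactly the effect the pricing scheme targets.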

3.2 Connection admission control (CAC)

Connection admission control was one of the main items motivating us in this study. Although we do not derive results on CAC in this paper, we feel that our approach may be fruitful. Our approach is similar in spirit to that of Mitra, Reiman, and Wang [MRW], where fast time scales were assumed to mix quickly for the purpose of CAC. We now outline our approach and comment on implementation. A good admission control should not admit a customer if that admission would cause too high a probability of error in the foreseeable future. A Markovian control policy could have an acceptance region R, where new connections would be allowed, and call attempts when ~x(t) ∉ R would be rejected. The problem is to design the region R. To see that the structure of this region is not trivial, consider for example an O2 model, where x1(0) is small, but x2(0) is very large. Then the most likely evolution of ~x(t) will make x1(t) large in the near future; see Figure 6.

Figure 6: Large initial values of x2 cause x1 to be large soon

One of the advantages of our sample-path large deviations approach is that the probability of overflow could be calculated for such a policy, simply by changing the rate function. There is some technical difficulty with boundaries, which could be defined in a smooth manner, or handled in other ways. We choose to finesse this matter. A slight modification to an OK model addresses the issue as follows. Consider any path ~x(t) starting at ~x(0) = ~x0 ∈ ∂R that causes an overflow at time T. Suppose that the entire path from t = 0 to t = T satisfies ~x(t) ∉ R. Then the appropriate model for this case has no arrivals in [0, T]. This is the same as setting γ = 0. The problem is thus reduced to the transient problem of finding P(overflow) ≈ e^{−nI(~x0)}, where γ = 0. If a fixed probability of overflow = e^{−nH} is desired, then the level set I(~x0) = H gives an appropriate definition of ∂R. A different approach [RS] is to choose T to reflect the expected sojourn time of the call under consideration: this puts the emphasis on the individual QoS guarantee, and requires similar calculations. In summary, one approach to the design of CAC consists of solving a transient large deviations problem to find the cost function I(~x0), and then finding an appropriate level set of this function. While this approach would very likely lead to an analytically intractable variational problem, it should be quite feasible numerically.

4 Steady state distribution

4.1 Open models

In order to calculate the steady state distribution of ~x(t) we quote the following result of Massey and Whitt for M/G/∞ queues [MW].

Theorem 1 (Massey and Whitt [MW]) In steady state, all components xi, 1 ≤ i ≤ K, are statistically independent with Poisson distributions.

We now calculate the distribution of each component xi by considering the “balance equations.” Write xi for the mean occupancy of state i. For i = 2, . . . , K, considering the flow into and out of state i gives

x1 λ1i = xi λi1 .   (4.1)

Furthermore, considering the external arrivals and departures from state 1 gives

nγδ = x1 δ .   (4.2)

Therefore

x1 = nγ,   (4.3)

and so

xi = nγ λ1i/λi1 .   (4.4)

In summary, we find that the steady state distribution of the vector ~x is a product of Poisson distributions in each component, with rate vector

( nγ, nγ λ12/λ21, . . . , nγ λ1K/λK1 )T .   (4.5)

We have three comments before we leave this issue. One is that, for large n, Poisson distributions are closely concentrated about their means. Recall the vector ~zn(t) = (1/n) ~x(t). Define

~z∗ def= γ · ( 1, λ12/λ21, . . . , λ1K/λK1 )T .   (4.6)

Then by the weak law of large numbers, for any ε > 0, we have, in steady state,

lim_{n→∞} P(|~zn − ~z∗| ≥ ε) = 0 .   (4.7)

In other words, the distribution of the vector ~zn in steady state is nearly a point mass at the point ~z∗. Our second point is that we did not require that the process be Markovian for the steady-state distribution to be calculated. Indeed, Massey and Whitt’s result was formulated for non-Markovian queueing networks. The third point is that, while we have calculated the steady state distribution of the vector ~x(t), we have not yet addressed the distribution of the buffer b(t).
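The means (4.3)-(4.6) are immediate to evaluate. The sketch below computes the scaled steady-state vector ~z∗ of (4.6) for an illustrative O3 model; the helper name and all parameter values are assumptions, not from the paper.

```python
def open_steady_state(gamma, lam1i, lami1):
    """Scaled steady-state mean vector z* from (4.6):
    z*_1 = gamma, and z*_i = gamma * lam1i/lami1 for i = 2, ..., K."""
    return [gamma] + [gamma * a / b for a, b in zip(lam1i, lami1)]

# Illustrative O3 parameters: one fast and one slow off state.
gamma = 0.4
lam1i = [1.0, 0.1]     # rates 1 -> 2, 1 -> 3
lami1 = [2.0, 0.05]    # rates 2 -> 1, 3 -> 1
z_star = open_steady_state(gamma, lam1i, lami1)
# The unscaled mean occupancies are n * z_star, Poisson in each component.
```

Note that the slow off state (small λi1) carries a large share of the steady-state population even when it is entered rarely, because its holding time is long.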

4.2 Closed models

In steady state, the distribution of the vector ~x is clearly multinomial. Let pj represent the probability that a particular source is in state j. By balancing the rates to and from state j, 2 ≤ j ≤ K, we see λj1 pj = λ1j p1, or

pj = (λ1j/λj1) p1 .   (4.8)

Coupled with the normalization condition Σ_{j=1}^K pj = 1 we obtain

p1 = 1 / ( 1 + Σ_{j=2}^K λ1j/λj1 ) .   (4.9)

Equations (4.8) and (4.9) define pj, and so the distribution of ~x is specified.

There is a law of large numbers that holds for closed models, analogous to (4.7). For closed models, define

~z∗ def= (p1, . . . , pK)T .   (4.10)

Then (4.7) holds with this ~z∗. Furthermore, the steady-state distribution of ~x does not depend on a Markovian model. This was shown by Ivnitskii [Iv] in general for star-shaped networks, but should be quite believable in any case. Again, we make Markovian assumptions in order to calculate the transient behavior of the sources, and so the statistics of the buffer.
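The probabilities (4.8)-(4.9) can likewise be evaluated directly; a short sketch with made-up rates (nothing here is from the paper):

```python
def closed_probs(lam1i, lami1):
    """Per-source steady-state probabilities from (4.8)-(4.9):
    p1 = 1/(1 + sum_j lam1j/lamj1), and pj = (lam1j/lamj1) * p1."""
    ratios = [a / b for a, b in zip(lam1i, lami1)]
    p1 = 1.0 / (1.0 + sum(ratios))
    return [p1] + [r * p1 for r in ratios]

# Illustrative C3 rates; the probabilities sum to 1 by construction.
p = closed_probs([1.0, 0.5], [1.0, 0.5])
```

With n sources, the occupancy vector is multinomial(n, p), concentrating at n·p for large n, in agreement with (4.10).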

4.3 Preliminary estimates of the buffer

Let us suppose that the Freidlin-Wentzell theory [SW95, Chapter 6] applies to all of our models, in order to calculate the probability that the buffer is nonempty in steady state. We will establish in [SW98] that the Freidlin-Wentzell theory indeed applies. According to the Freidlin-Wentzell theory, the probability that the buffer is nonempty is logarithmically equivalent to the probability that the number of “on” sources exceeds nC (so, in particular, it does not depend on the size of the buffer!). That is, in steady state

lim_{n→∞} (1/n) log P(bn > 0) = lim_{n→∞} (1/n) log P(zn,1 ≥ C).

Here bn is the normalized buffer content of the model with parameter n. For open models, the probability on the right can be calculated from the formula for the Poisson distribution, or we can use Chernoff’s theorem to estimate

−(1/n) log P(b > 0) ≈ C log(C/γ) + γ − C = γ [ (C/γ) log(C/γ) + 1 − C/γ ] .   (4.11)

For closed models we may use binomial statistics as in [SW95, Chapter 13], where it was shown (for the case C2) that the Freidlin-Wentzell theory indeed applies [SW95, Theorem 13.39], and that

−(1/n) log P(b > 0) ≈ C log(C/p1) + (1 − C) log( (1 − C)/(1 − p1) ) .
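The estimate (4.11) can be sanity-checked against the exact Poisson tail, since the unscaled number of on sources is Poisson with mean nγ. The sketch below (illustrative parameters; the truncation at 500 terms is an assumption, safe here because successive tail terms shrink geometrically) compares the Chernoff exponent with the exact finite-n exponent; by Chernoff's bound the exact exponent is always at least as large.

```python
import math

def chernoff_exponent(C, gamma):
    """The right-hand side of (4.11): C*log(C/gamma) + gamma - C."""
    return C * math.log(C / gamma) + gamma - C

def poisson_tail_exponent(C, gamma, n, terms=500):
    """-(1/n) log P(Poisson(n*gamma) >= n*C), summed directly.
    Truncation is safe for C > gamma: the term ratio is at most
    n*gamma/k <= gamma/C < 1 over the summed range."""
    mean = n * gamma
    k0 = math.ceil(n * C)
    logp = [k * math.log(mean) - mean - math.lgamma(k + 1)
            for k in range(k0, k0 + terms)]
    m = max(logp)
    # log-sum-exp for numerical stability
    return -(m + math.log(sum(math.exp(v - m) for v in logp))) / n
```

For moderate n the exact exponent exceeds the large deviations value by the usual O((log n)/n) correction, and the gap closes as n grows.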

5 Most likely behavior, or fluid limits

In this section we compute the most probable pathwise behavior of the scaled process. This result will later be combined with arguments of reversibility and variational methods, to find the most likely path to a nonempty buffer and its associated probability. The main tool for the analysis of probable behavior is Kurtz’s Theorem [SW95, Eq. (5.7) and Theorem 5.3]. Recall the definitions (2.1)–(2.5) of the jump rates, jump directions, and the scaled process ~zn.

Theorem 2 (Kurtz) Let λij(~x) be uniformly bounded and Lipschitz continuous, and let ~z∞(t) be the unique solution of

d ~z∞(t)/dt = Σ_{ij} λij(~z∞(t)) ~eij   (5.1)

with ~z∞(0) = ~z0. For each finite T there exists a positive constant C1 and a function C2 = C2(ε) with

lim_{ε↓0} C2(ε)/ε² ∈ (0, ∞)   and   lim_{ε↑∞} C2(ε)/ε = ∞   (5.2)

such that, for all n ≥ 1 and ε > 0,

P( sup_{0≤t≤T} ‖~zn(t) − ~z∞(t)‖ ≥ ε | ~zn(0) = ~z0 ) ≤ C1 e^{−nC2(ε)} .   (5.3)

Moreover, C1 and C2 can be chosen independently of the initial point ~z0.

Note that the assumption that the jump rates are bounded does not hold in the open case. In Corollary 7 below we adapt the theorem to our case.

5.1 The fluid limit

By Kurtz’s Theorem, we expect the most likely path to obey the following equations.

Model O1:

d z∞(t)/dt = γδ − δ z∞(t) .   (5.4)

Model OK, K ≥ 2:

d z∞,1(t)/dt = γδ − ( δ + Σ_{i=2}^K λ1i ) z∞,1(t) + Σ_{i=2}^K λi1 z∞,i(t),   (5.5)
d z∞,i(t)/dt = λ1i z∞,1(t) − λi1 z∞,i(t) ,  i ≥ 2.

For Model C2 we often abbreviate z∞ def= z∞,1. Since zn,2 = 1 − zn,1, it is fully described by

d z∞,1(t)/dt = λ(1 − z∞,1(t)) − µ z∞,1(t)   (5.6)

(here λ = λ21 and µ = λ12). For Model CK,

d z∞,1(t)/dt = Σ_{i=2}^K λi1 z∞,i(t) − ( Σ_{i=2}^K λ1i ) z∞,1(t),   (5.7)
d z∞,i(t)/dt = λ1i z∞,1(t) − λi1 z∞,i(t) ,  i ≥ 2.

These are all linear differential equations, and in our applications the initial conditions are necessarily nonnegative.

Lemma 3 If z∞,i(0) ≥ 0 for all i, then any solution of (5.4)–(5.7) remains componentwise nonnegative for all t ≥ 0.

Proof. As long as z∞,1(t) ≥ 0, the other coordinates cannot become negative. On the other hand, as long as z∞,i(t) ≥ 0, i ≥ 2, we have

d z∞,1(t)/dt ≥ γδ − Λ1 z∞,1(t) ,  where Λ1 = δ + Σ_{i=2}^K λ1i   (5.8)

(with δ = 0 for closed models), and so [Ha]

z∞,1(t) ≥ z∞,1(0) e^{−Λ1 t} + ( 1 − e^{−Λ1 t} ) γδ/Λ1 ≥ 0   (5.9)

since Λ1 > 0.

Next we show that these systems of linear equations, describing open as well as closed models, are asymptotically stable; that is, solutions are uniformly bounded, and there is a point ~z∗ so that, for any initial condition ~z∞(0), we have lim_{t→∞} ~z∞(t) = ~z∗. This implies that the solutions converge to limit points, given by (4.6) and (4.10) respectively.
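The fluid equations (5.5) are easy to integrate numerically. The sketch below uses forward Euler with illustrative parameters (all assumed, not from the paper) and checks that the solution approaches the point ~z∗ of (4.6), as Lemma 4 below asserts.

```python
def fluid_rhs_open(z, gamma, delta, lam1i, lami1):
    """Right-hand side of (5.5) for an open model O_K (illustrative helper)."""
    dz1 = gamma * delta - (delta + sum(lam1i)) * z[0] \
        + sum(l * zi for l, zi in zip(lami1, z[1:]))
    rest = [a * z[0] - b * zi for a, b, zi in zip(lam1i, lami1, z[1:])]
    return [dz1] + rest

def integrate_euler(z0, gamma, delta, lam1i, lami1, t_end=200.0, dt=0.005):
    """Forward-Euler integration of (5.5); dt is small relative to all rates."""
    z = list(z0)
    for _ in range(int(t_end / dt)):
        dz = fluid_rhs_open(z, gamma, delta, lam1i, lami1)
        z = [zi + dt * d for zi, d in zip(z, dz)]
    return z

# Illustrative parameters; start from the empty state.
gamma, delta = 0.4, 0.5
lam1i, lami1 = [1.0, 0.1], [2.0, 0.05]
z_T = integrate_euler([0.0, 0.0, 0.0], gamma, delta, lam1i, lami1)
z_star = [gamma] + [gamma * a / b for a, b in zip(lam1i, lami1)]  # (4.6)
```

Since the system is linear, the Euler iteration has the same fixed point as the continuous dynamics, so the discretization does not bias the limit.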

Lemma 4 The system of equations (5.5) is asymptotically stable.

Proof. Let ~z = (z∞,1, . . . , z∞,K)T and write (5.5) in vector form as

d ~z(t)/dt = A · ~z + b ,   (5.10)

where

        [ −δ − Σ_{i=2}^K λ1i    λ21     λ31    · · ·    λK1  ]
        [       λ12            −λ21      0     · · ·     0   ]
  A  =  [       λ13              0     −λ31    · · ·     0   ] ,      b = (γδ, 0, . . . , 0)T .   (5.11)
        [        ...             ...     ...    ...      ... ]
        [       λ1K              0       0     · · ·   −λK1  ]

From the general theory of linear differential equations [Ha], the equation (5.10) is asymptotically stable if and only if all eigenvalues of A have (strictly) negative real part. Consider the transpose AT of the matrix A, which has the same eigenvalues as A. Denote by aij the (i, j) element of AT. By Geršgorin’s circle theorem [MM, 2.2.1, Page 146], all eigenvalues of A lie in the union of the closed discs

|z − aii| ≤ Ri − |aii| ,  all i,   (5.12)

where

Ri = Σ_{j=1}^K |aij| .   (5.13)

But for AT, Ri = 2λi1 and aii = −λi1, so that each of the closed discs lies entirely in the left half of the complex plane, and the only possible point with zero real part is the point 0. However, it is easy to see that 0 cannot be an eigenvalue of this matrix. Indeed, if ~v = (v1, . . . , vK)T is an eigenvector corresponding to the eigenvalue 0 of the matrix A, then

λ1i v1 − λi1 vi = 0 ,  i = 2, . . . , K,   (5.14)

and so we can choose

~v = ( 1, λ12/λ21, . . . , λ1K/λK1 )T .   (5.15)

But then the first element of A · ~v is

−δ − Σ_{i=2}^K λ1i + Σ_{i=2}^K λi1 (λ1i/λi1) = −δ ≠ 0 ,   (5.16)

which contradicts the assumption that 0 is an eigenvalue. Therefore, all eigenvalues have negative real parts.

The stability of closed models is easy to establish.

Lemma 5 The system of equations (5.7) is asymptotically stable on the set { ~z : zi ≥ 0, Σ_{i=1}^K zi = 1 }.

Note that (5.7) is not asymptotically stable since, for this model,

Σ_{i=1}^K z∞,i(t) = Constant   (5.17)

and therefore any limit must depend on our initial point. However, we are only interested in the case where the constant equals 1.

Proof. Since Σ_{i=1}^K z∞,i(t) = 1 we have

z∞,1(t) = 1 − Σ_{i=2}^K z∞,i(t)   (5.18)

and the variable z∞,1(t) can be eliminated. The reduced system of linear differential equations, of dimension K − 1,

d z∞,i(t)/dt = λ1i ( 1 − Σ_{k=2}^K z∞,k(t) ) − λi1 z∞,i(t) ,  i = 2, . . . , K,   (5.19)

is asymptotically stable. Indeed, with ~z = (z∞,2, . . . , z∞,K)T we can write (5.19) in the vector form (5.10), where

        [ −λ12 − λ21     −λ12        · · ·      −λ12      ]
  A  =  [    −λ13       −λ13 − λ31   · · ·      −λ13      ] ,      b = (λ12, λ13, . . . , λ1K)T .   (5.20)
        [     ...            ...      ...        ...      ]
        [    −λ1K          −λ1K      · · ·   −λ1K − λK1   ]

Let α be an eigenvalue of A with eigenvector ~v, which we normalize so that Σ_{k=2}^K vk = 1 (a similar calculation shows that the sum cannot vanish). Then

α vj = −λ1j Σ_{k=2}^K vk − λj1 vj   (5.21)
     = −λ1j − λj1 vj ,   (5.22)

so that

vj = − λ1j / (λj1 + α) .   (5.23)

But this contradicts Σ_{k=2}^K vk = 1 unless Real(α) < 0.

These results imply that, as t → ∞, the fluid limit converges to the “balanced” state.

Corollary 6 For each of the models OK and CK, there exists a point ~z∗ so that

‖~z∞(t) − ~z∗‖ ≤ C e^{λt}   (5.24)

where λ < 0. The exponent λ can be chosen as any negative real number larger than the largest real part of the eigenvalues of the matrix A of (5.11) or (5.20), respectively. The constant C depends on ~z∞(0) and on the choice of λ. The point ~z∗ is given by (4.6) for open models and by (4.10) for closed models.

Proof. By Lemmas 4–5, the differential equations are asymptotically stable, and all eigenvalues of the matrix A have strictly negative real parts. So [Ha], there is a unique limit ~z∗ and the solution converges to this limit exponentially fast with the cited rate. This limit is obtained by setting the left-hand side of the differential equations to 0 (and, for models CK, using Σ_{i=1}^K z∗i = 1).
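The Geršgorin step in the proof of Lemma 4 is easy to check numerically for a concrete instance. The sketch below builds the matrix A of (5.11) for an illustrative open model, verifies that every disc of AT lies in the closed left half plane, and also checks that A~z∗ + b = 0, i.e. that ~z∗ of (4.6) is the equilibrium. All parameter values and helper names are invented.

```python
def open_model_matrix(delta, lam1i, lami1):
    """The K x K matrix A of (5.11) for an open model (illustrative helper)."""
    K = 1 + len(lam1i)
    A = [[0.0] * K for _ in range(K)]
    A[0][0] = -delta - sum(lam1i)
    for j in range(1, K):
        A[0][j] = lami1[j - 1]   # first row: lambda_{j1}
        A[j][0] = lam1i[j - 1]   # first column: lambda_{1j}
        A[j][j] = -lami1[j - 1]
    return A

def gershgorin_left_half_plane(A):
    """Check, on A^T as in the proof of Lemma 4, that every Gershgorin disc
    |z - a_ii| <= R_i - |a_ii| lies in the closed left half plane."""
    K = len(A)
    AT = [[A[j][i] for j in range(K)] for i in range(K)]
    for i in range(K):
        Ri = sum(abs(v) for v in AT[i])
        rightmost = AT[i][i] + (Ri - abs(AT[i][i]))  # disc's rightmost point
        if rightmost > 1e-12:
            return False
    return True

# Illustrative parameters; z_star is the equilibrium (4.6), so A z* + b = 0.
gamma, delta = 0.4, 0.5
lam1i, lami1 = [1.0, 0.1], [2.0, 0.05]
A = open_model_matrix(delta, lam1i, lami1)
b = [gamma * delta] + [0.0] * (len(A) - 1)
z_star = [gamma] + [gamma * a / c for a, c in zip(lam1i, lami1)]
residual = [sum(a * z for a, z in zip(row, z_star)) + bi
            for row, bi in zip(A, b)]
```

As in the proof, the discs for the rows of AT indexed i ≥ 2 touch the origin, and the δ > 0 term keeps 0 from actually being an eigenvalue.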

5.2 Most likely behavior

We can now describe the most likely behavior of our models, in the spirit of Kurtz’s theorem.

Corollary 7 Consider a model OK and let S be a closed, bounded set in the positive orthant (that is, ~z ∈ S implies zi ≥ 0 for all i). For each finite T there exists a positive constant C1 and a function C2 = C2(ε) ≥ 0 satisfying (5.2), with

P( sup_{0≤t≤T} ‖~zn(t) − ~z∞(t)‖ ≥ ε | ~zn(0) = ~z0 ) ≤ C1 e^{−nC2(ε)} ,   (5.25)

where ~z∞ is the unique solution of (5.5) with initial condition ~z∞(0) = ~z0. Moreover, C1 and C2 can be chosen independently of the initial point ~z0 ∈ S. For a CK model, let S be the set

{ ~x : xi ≥ 0, Σ_{i=1}^K xi = 1 } .

Then the conclusions hold, where (5.7) replaces (5.5).

Proof. It is easy to verify that equations (5.5) and (5.7) agree with (5.1). Since by Lemmas 4 and 5 the differential equations are asymptotically stable, they are by definition stable [Ha], and so there exists a closed bounded set S1 such that any solution with ~z0 ∈ S satisfies ~z∞(t) ∈ S1 for all t ≥ 0. Let

S2 = { ~z : inf_{~z′∈S1} ‖~z − ~z′‖ ≤ 1 } ∩ { ~z : zi ≥ 0 for all i } .   (5.26)

Then λij (~x) are bounded in S2 , so that Kurtz’s theorem applies. Kurtz’s theorem allows us to describe the most likely way the buffer drains: that is, to describe transient phenomena. In particular, when z1∗ < C, the fluid drain rate is (most likely) larger than the input rate, and consequently the buffer will eventually drain, and from that point on, the throughput of the (unscaled) model is n · γ. Corollary 8 Let the stability assumption z1∗ < C hold, and denote by b(0) the buffer content and by ~z(0) the state at time 0. Define β(t) = sup

Z

0≤s≤t s

t

 b(0)δ(u) + z∞,1 (u) − C du

(5.27)

where $\delta(\cdot)$ here denotes the Dirac delta function. Then $\beta(t) \ge 0$ and
$$\beta(t) \le b(0) - C_0(t - T_1), \qquad (5.28)$$
where $T_1$ and $C_0$ are positive and can be chosen uniformly over bounded sets of $(b(0), \vec z(0))$. Moreover, for any $\varepsilon > 0$ and any $T$,
$$P\Big( \sup_{0\le t\le T} |b_n(t) - \beta(t)| \ge \varepsilon \,\Big|\, \vec z_n(0) = \vec z_0 \Big) \le C_1 e^{-nC_2(\varepsilon)}, \qquad (5.29)$$

where $b_n$ is the scaled buffer content, given in (2.6), and $C_1$, $C_2$ are as in Corollary 7.

Equation (5.27) states that the buffer content consists of the fluid accumulated since the last time the buffer was empty: if the buffer emptied at some time after time 0, then the initial content plays no further role. This Corollary provides a pathwise description of the most likely way the buffer empties, starting from an arbitrary size and with an arbitrary number of sources in each state. The result is meaningful under the buffer stability condition $z_1^* < C$.

Proof. The first claim follows from Corollary 6, since we can choose $T_1$ as the time after which $z_{\infty,1}(t)$ remains close enough to $z_1^*$ that the buffer drains at the specified rate. To prove the second claim, note that (5.27) also describes $b_n(t)$ as a function of $z_{n,1}(t)$. So
$$|b_n(t) - \beta(t)| = \Big| \sup_{0\le s\le t} \int_s^t \big( b(0)\delta(u) + z_{n,1}(u) - C \big)\,du - \sup_{0\le s\le t} \int_s^t \big( b(0)\delta(u) + z_{\infty,1}(u) - C \big)\,du \Big| \qquad (5.30)$$
$$\le \sup_{0\le s\le t} \Big| \int_s^t z_{n,1}(u)\,du - \int_s^t z_{\infty,1}(u)\,du \Big| \qquad (5.31)$$
$$\le \int_0^t |z_{n,1}(u) - z_{\infty,1}(u)|\,du. \qquad (5.32)$$
The result follows by choosing $\varepsilon' = \varepsilon/T$ in Corollary 7.
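The reflection map (5.27) is easy to evaluate on a discrete grid. The sketch below treats the Dirac mass $b(0)\delta(u)$ as an initial lump and computes $\beta(t)$ for an illustrative exponentially draining trajectory $z_{\infty,1}(t)$ (not taken from the paper); as Corollary 8 asserts, $\beta$ stays nonnegative, is bounded by $b(0)$ here, and eventually hits zero when $z_1^* < C$.

```python
import math

C, z_star1, b0 = 0.6, 0.4, 0.5   # illustrative capacity, limit point, initial buffer
dt, T = 1e-3, 30.0
n = int(T / dt)

def z1(t):
    # Illustrative trajectory: z1 decays from C toward z1* < C.
    return z_star1 + (C - z_star1) * math.exp(-t)

beta = []
G = 0.0        # running integral of (z1(u) - C) over [0, t]
low = -b0      # the s = 0 candidate: G(0) - b0 (the Dirac lump)
for i in range(n + 1):
    low = min(low, G)        # include s = t candidate, so beta(t) >= 0
    beta.append(G - low)     # beta(t) = G(t) - min over candidates
    G += (z1(i * dt) - C) * dt
```

Once the cumulative drain exceeds $b(0)$, the running minimum tracks $G$ itself and $\beta$ sticks at zero, matching the intuition that the initial content stops mattering after the buffer first empties.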

6 Reversibility, likely and unlikely behavior

For background on reversibility, see Kelly [Ke]. For its applications to large deviations problems, see Shwartz and Weiss [SW93]. The line of reasoning we use goes as follows.

1. By Kurtz's theorem (Corollary 7), for any fixed time $T$, initial state $z$ and $\varepsilon > 0$,
$$\lim_{n\to\infty} P_z\Big( \sup_{0\le t\le T} \|\vec z_n(t) - \vec z_\infty(t)\| < \varepsilon \Big) = 1, \qquad (6.1)$$
where $\vec z_\infty$ solves (5.4) with $\vec z_\infty(0) = z$.

2. By reversibility, for any initial state $z$ and any path $\vec r(t)$ we have, in steady state,
$$P\Big( \sup_{0\le t\le T} \|\vec z_n(t) - \vec r(t)\| \le \varepsilon \,\Big|\, \vec z_n(0) = z \Big) = P\Big( \sup_{-T\le t\le 0} \|\vec z_n(t) - \vec r(-t)\| \le \varepsilon \,\Big|\, \vec z_n(0) = z \Big). \qquad (6.2)$$

Combining the arguments of (6.1)–(6.2), it follows that in steady state the most likely way to reach a state $z$ is to follow the time-reversal of $\vec z_\infty$, the solution of (5.4) with $\vec z_\infty(0) = z$. This result has been known in various guises for quite some time.

This does not yet resolve the problem of the occurrence of a nonempty buffer: we need to know exactly which point is the most likely end point of the path leading to that occurrence. For models $O_K$ we can use the independence result of Theorem 1: since in steady state the $\{z_i\}$ are statistically independent, conditioning on the first coordinate does not affect the other coordinates. Therefore, by §5 we have in steady state, for each $\varepsilon > 0$,
$$\lim_{n\to\infty} P\big( |z_2 - z_2^*| < \varepsilon, \ldots, |z_K - z_K^*| < \varepsilon \,\big|\, z_1 = C \big) = 1. \qquad (6.3)$$
Combining (6.1), (6.2) and (6.3), and using the uniform continuity of $\vec z_\infty$ with respect to initial conditions, we see that in steady state, for any $\varepsilon$ and $T > 0$,
$$\lim_{n\to\infty} P\Big( \sup_{-T\le t\le 0} \|\vec z_n(t) - \vec z_\infty(-t)\| < \varepsilon \,\Big|\, z_{n,1}(0) = C \Big) = 1, \qquad (6.4)$$
where $\vec z_\infty(0) = (C, z_2^*, \ldots, z_K^*)$.

When the time scales of the various off states are well separated, this result has a simple interpretation. The way upcrossings occur is for the slowest time scale to first fill, then empty its excess into the active state.


As it fills, all the other off states stay in relative equilibrium, filling just enough to balance the new occupancy of the slow and active states. At the moment the slow state has emptied enough customers into the active state to reach its steady-state occupancy, the next-fastest time scale empties its customers, too. Then the next-faster time scale immediately empties its customers, and so on. This is illustrated in Figure 7.

[Figure 7 shows, for the open model with separated time scales, the number of active sources and the numbers of off sources in the slow and medium time scales, with expanded views around their mean steady-state values.]

Figure 7: The open model with separated time scales

For closed models, recall that the steady state distribution for each source, $\vec\pi = (p_1, \ldots, p_K)$, satisfies (4.8)–(4.9). The Freidlin–Wentzell theory implies
$$\lim_{n\to\infty} \frac{1}{n}\log P(\text{buffer} > 0) = \lim_{n\to\infty} \frac{1}{n}\log P_{ss}\big( x_1(t) \ge C \big), \qquad (6.5)$$
where $P_{ss}$ is the steady state probability. Also
$$-\lim_{n\to\infty} \frac{1}{n}\log P_{ss}\big( x_1(t) \ge C \big) = C\log\frac{C}{\pi_1} + (1-C)\log\frac{1-C}{1-\pi_1}, \qquad (6.6)$$
due to the following. Imagine an experiment in which $n$ i.i.d. coins are flipped, one at a time, with $P(\text{heads}) = \pi_1$. If a coin shows heads it goes to $x_1$, otherwise not. The distribution of the coins in this experiment is binomial. But this is also the distribution of sources in state $x_1$, since sources are statistically independent, each with probability $\pi_1$ of being on. Thus (6.6), which describes binomial random variables, applies to $x_1$ as well.

As before, the sample path to $\{x_1 \ge C\}$ is the time reversal of $\vec z_\infty$. The only unknown left to calculate is the initial value $\vec z_\infty(0)$. Perhaps the easiest way to calculate $\vec z_\infty(0)$ is to imagine $n(1-C)$ tosses of a $(K-1)$-sided die (again, the independence of the sources implies that their distribution is the same as that of tosses of independent dice). Each face $j$ of the die has probability
$$\frac{\pi_j}{\sum_{i=2}^K \pi_i} = \frac{\pi_j}{1-\pi_1}. \qquad (6.7)$$
Then the number of times face $j$ comes up is, most likely,
$$n(1-C)\,\frac{\pi_j}{1-\pi_1} = n\pi_j\,\frac{1-C}{1-\pi_1}. \qquad (6.8)$$
That is,
$$\vec z_\infty(0) = \Big( C,\ \pi_2\,\frac{1-C}{1-\pi_1},\ \ldots,\ \pi_K\,\frac{1-C}{1-\pi_1} \Big). \qquad (6.9)$$
In summary, for the upcrossing problem to $x_1 \ge C$, the solution is the time-reversal of the solution of the differential equation (5.10).
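The binomial exponent in (6.6) and the end point (6.9) can both be spot-checked numerically. The sketch below (parameter values are illustrative, not from the paper) compares $-\frac{1}{n}\log P(\mathrm{Bin}(n, \pi_1) \ge nC)$, computed by exact summation in log space, against the rate $C\log\frac{C}{\pi_1} + (1-C)\log\frac{1-C}{1-\pi_1}$, and verifies that the coordinates of (6.9) sum to 1.

```python
import math

pi = [None, 0.2, 0.5, 0.3]   # pi_1, pi_2, pi_3 for K = 3 (illustrative)
C, n = 0.4, 4000

def log_binom_tail(n, p, k0):
    """log P(Bin(n, p) >= k0), summed stably in log space."""
    logs = [math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p)
            for k in range(k0, n + 1)]
    m = max(logs)
    return m + math.log(sum(math.exp(x - m) for x in logs))

rate = C * math.log(C / pi[1]) + (1 - C) * math.log((1 - C) / (1 - pi[1]))
empirical = -log_binom_tail(n, pi[1], int(n * C)) / n

# End point (6.9): the remaining mass 1 - C is split proportionally to pi_j.
z0 = [C] + [pi[j] * (1 - C) / (1 - pi[1]) for j in (2, 3)]
```

The agreement between `empirical` and `rate` is only up to an $O(\log n / n)$ polynomial correction, which is exactly what large deviations theory predicts.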

7 Calculus of Variations and unlikely behavior

We now turn to a different method of computing the probability of unlikely events, as well as the manner in which these events occur. The equivalence of this method with that of Section 6 is established in Appendix 10. We use the notation of Section 2: see equations (2.1)–(2.2). The derivation below applies to both open and closed models: for the latter, simply set $\delta = 0$. Define the "local rate function" $\ell$ as
$$\ell(\vec x, \vec y) = \sup_{\vec\theta}\Big\{ \langle\vec\theta, \vec y\rangle - \sum_{ij} \lambda_{ij}(\vec x)\big( e^{\langle\vec\theta, \vec e_{ij}\rangle} - 1 \big) \Big\}, \qquad (7.1)$$
where the sum is over all relevant pairs $(i, j)$. We approximate probabilities using sample path large deviations methods [SW95]. This entails minimizing the functional
$$I(\vec r\,) = \int_0^T \ell\big( \vec r(t), \vec r\,'(t) \big)\,dt \qquad (7.2)$$
over the sets of paths of interest, as well as finding the minimizing path.

In our case we can transform $\ell(\vec x, \vec y)$ into a more convenient form. By definition,
$$\ell(\vec x, \vec y) = \sup_{\vec\theta}\Big\{ \langle\vec\theta, \vec y\rangle - \sum_{ij} \lambda_{ij}(\vec x)\big( e^{\langle\vec\theta, \vec e_{ij}\rangle} - 1 \big) \Big\} \qquad (7.3)$$
$$= \sup_{\vec\theta}\Big\{ \langle\vec\theta, \vec y\rangle - \sum_{j=2}^K \lambda_{1j}x_1\big( e^{\theta_j - \theta_1} - 1 \big) - x_1\delta\big( e^{-\theta_1} - 1 \big) - \sum_{i=2}^K \lambda_{i1}x_i\big( e^{\theta_1 - \theta_i} - 1 \big) - \gamma\delta\big( e^{\theta_1} - 1 \big) \Big\}. \qquad (7.4)$$
Making the obvious substitution $\theta_1' = \theta_1$ and $\theta_j' = \theta_j - \theta_1$, $j \ge 2$, we have
$$\ell(\vec x, \vec y) = \sup_{\vec\theta'}\Big\{ \theta_1' y_1 + \sum_{j=2}^K (\theta_1' + \theta_j')y_j - \sum_{j=2}^K \lambda_{1j}x_1\big( e^{\theta_j'} - 1 \big) - x_1\delta\big( e^{-\theta_1'} - 1 \big) - \sum_{i=2}^K \lambda_{i1}x_i\big( e^{-\theta_i'} - 1 \big) - \gamma\delta\big( e^{\theta_1'} - 1 \big) \Big\} \qquad (7.5)$$
$$= \sup_{\theta_1'}\Big\{ \theta_1'\sum_{j=1}^K y_j - x_1\delta\big( e^{-\theta_1'} - 1 \big) - \gamma\delta\big( e^{\theta_1'} - 1 \big) \Big\} + \sum_{j=2}^K \sup_{\theta_j'}\Big\{ \theta_j' y_j - \lambda_{1j}x_1\big( e^{\theta_j'} - 1 \big) - \lambda_{j1}x_j\big( e^{-\theta_j'} - 1 \big) \Big\} \qquad (7.6)$$
$$= h\Big( \gamma\delta,\ x_1\delta,\ \sum_{j=1}^K y_j \Big) + \sum_{j=2}^K h\big( \lambda_{1j}x_1,\ \lambda_{j1}x_j,\ y_j \big), \qquad (7.7)$$
where $h$ is given by [SW95, Eq. (7.16)–(7.17)]
$$h(a, b, y) \stackrel{\text{def}}{=} y\log\frac{y + \sqrt{y^2 + 4ab}}{2a} + a + b - \sqrt{y^2 + 4ab}. \qquad (7.8)$$
For closed models, $\delta = 0$ and $\sum_{j=1}^K y_j = 0$ (since the total number in the system is fixed). Since $h(0,0,0) = 0$, we conclude that the rate function $\ell$ takes the form
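The closed form (7.8) is just the one-coordinate Legendre transform appearing in (7.6), worked out explicitly. The sketch below evaluates $\sup_\theta\{\theta y - a(e^\theta - 1) - b(e^{-\theta} - 1)\}$ on a fine grid (the values of $a$, $b$, $y$ are illustrative) and compares it with $h(a, b, y)$; it also checks two identities used later: $h(a, b, 0) = (\sqrt a - \sqrt b)^2$ and $h(a, b, b - a) = (b - a)\log(b/a)$.

```python
import math

def h(a, b, y):
    """Closed form (7.8); assumes a, b > 0."""
    s = math.sqrt(y * y + 4 * a * b)
    return y * math.log((y + s) / (2 * a)) + a + b - s

def legendre(a, b, y, lo=-20.0, hi=20.0, steps=200001):
    """Brute-force sup over theta of theta*y - a(e^theta - 1) - b(e^-theta - 1)."""
    best = -math.inf
    for i in range(steps):
        th = lo + (hi - lo) * i / (steps - 1)
        best = max(best, th * y - a * (math.exp(th) - 1) - b * (math.exp(-th) - 1))
    return best

a, b, y = 1.3, 0.7, 0.4   # illustrative
gap = abs(h(a, b, y) - legendre(a, b, y))
ident = abs(h(a, b, b - a) - (b - a) * math.log(b / a))
```

The grid maximizer sits at $\theta^* = \log\big( (y + \sqrt{y^2+4ab})/(2a) \big)$, which is also where $\partial h/\partial y$ comes from in (10.7).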


Open Model:
$$\ell(\vec x, \vec y) = \begin{cases} \infty & \text{if } x_i < 0, \text{ or } x_i = 0 \text{ and } y_i < 0, \text{ for some } i, \\[4pt] h\Big( \gamma\delta,\ x_1\delta,\ \sum_{j=1}^K y_j \Big) + \sum_{j=2}^K h\big( \lambda_{1j}x_1,\ \lambda_{j1}x_j,\ y_j \big) & \text{otherwise.} \end{cases}$$

Closed Model:
$$\ell(\vec x, \vec y) = \begin{cases} \infty & \text{if } x_i < 0, \text{ or } x_i = 0 \text{ and } y_i < 0, \text{ for some } i, \\[4pt] \infty & \text{if } \sum_{j=1}^K y_j \ne 0, \\[4pt] \sum_{j=2}^K h\big( \lambda_{1j}x_1,\ \lambda_{j1}x_j,\ y_j \big) & \text{otherwise.} \end{cases}$$

8 Buffer overflow

Recall that the buffer content at time $t$ is denoted by $b(t)$. The buffer content depends on the number of on sources through
$$\frac{d}{dt}b(t) = \begin{cases} x_1(t) - nC & \text{if } b(t) > 0 \text{ or } x_1(t) > nC, \\ 0 & \text{otherwise.} \end{cases} \qquad (8.1)$$
Recall that in (2.6) we defined the scaled buffer by $b_n(t) = \frac{1}{n}b(t)$. The scaled buffer therefore satisfies
$$\frac{d}{dt}b_n(t) = \begin{cases} z_{n,1}(t) - C & \text{if } b_n(t) > 0 \text{ or } z_{n,1}(t) > C, \\ 0 & \text{otherwise.} \end{cases} \qquad (8.2)$$
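Dynamics (8.2) are a one-sided integrator: the buffer grows at rate $z_{n,1} - C$ whenever it is positive or the input exceeds capacity, and sticks at zero otherwise. A minimal Euler-scheme sketch (the input trace is illustrative, not from the paper):

```python
C, dt = 0.6, 1e-3

def z1(t):
    # Illustrative input: above capacity on [1, 2), below capacity otherwise.
    return 0.8 if 1.0 <= t < 2.0 else 0.4

b, trace = 0.0, []
for i in range(int(4.0 / dt)):
    rate = z1(i * dt) - C
    if b > 0.0 or rate > 0.0:        # first branch of (8.2)
        b = max(0.0, b + rate * dt)  # clip: the buffer never goes negative
    trace.append(b)

peak = max(trace)
```

During the one-unit overload the buffer fills to about $(0.8 - 0.6)\cdot 1 = 0.2$, then drains at rate $C - 0.4 = 0.2$ and is empty again by $t = 3$.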

Let $\vec r_B^{\,*}(t)$ denote the optimal path to build a buffer of size $B$. This path starts at $\vec z^*$ and necessarily passes through $r_1(t) = C$. We shift time so that $r_{B,1}^*(0) = C$. The variational problem can be broken into two parts: the one leading to the point $\vec r_B^{\,*}(0)$, and then, from $\vec r_B^{\,*}(0)$, the part that builds a buffer of size $B$. The second variational problem takes the following form:
$$\text{Minimize} \quad F_2(\vec r) = \int_0^T \ell\big( \vec r(t), \vec r\,'(t) \big)\,dt \qquad (8.3)$$
$$\text{Subject to} \quad \vec r(0) = \vec r_B^{\,*}(0), \qquad (8.4)$$
$$\qquad\qquad\quad \int_0^T \big( r_1(t) - C \big)\,dt = B. \qquad (8.5)$$
In [SW95, Chapter 13] we showed that the optimal way to exceed $B$ is to reach exactly $B$, and that once the buffer starts filling, it does not empty before reaching level $B$. To handle the last constraint, we introduce a Lagrange multiplier $L$ and examine the stationary points of the functional
$$F(\vec r) = \int_0^T \Big( \ell\big( \vec r(t), \vec r\,'(t) \big) - L\big( r_1(t) - C \big) \Big)\,dt \qquad (8.6)$$
under the constraints (8.4)–(8.5). The starting point $\vec r_B^{\,*}(0)$ is the terminal point of the path leading to $r_{B,1}(0) = C$, and is determined by finding
$$\inf_{\vec r^{\,*}}\big( I_1 + I_2(B) \big), \qquad (8.7)$$
where $I_1$ is the cost of the path up to $t = 0$, leading to the buffer starting to fill, so that $r_{B,1}^*(0) = C$. The cost $I_2(B)$ corresponds to the path from $t = 0$ on, when the buffer fills from empty to $B$.

In general, we cannot solve analytically for the optimal cost and the optimal path to create a buffer of a given size. We can, however, analyze the frequency and the manner with which a buffer overflows in several asymptotic regimes. In §8.1 below we consider the asymptotics as $B$ becomes small and $n$ becomes large, for the open model. The closed model is discussed in §8.2. The large buffer asymptotics are analyzed in §8.3 (open model) and §8.4 (closed model).

8.1 Small buffer asymptotics: open model

Following the argument in Shwartz and Weiss [SW95, p. 383], we know that
$$\vec r_B^{\,*}(0) = (C, z_2^*, \ldots, z_K^*) + O\big( B^{1/2} \big), \qquad I^*(B) = I^*(0) + c_3\sqrt{B} + O(B). \qquad (8.8)$$
As in [SW95, pp. 383–384], for a small buffer we can approximate the jump rates by the rates $z_1\lambda_{1j}$ and $z_j\lambda_{j1}$, $j \ge 2$, frozen at the point $\vec z = (C, z_2^*, \ldots, z_K^*)$, so that $\ell$ is a function of $\vec r\,'$ alone. The Euler equation (10.4) for the variational problem of (8.6) therefore takes the form
$$-L - \frac{d}{dt}\frac{\partial\ell}{\partial r_1'} = 0, \qquad (8.9)$$
$$\frac{d}{dt}\frac{\partial\ell}{\partial r_j'} = 0, \qquad j \ge 2. \qquad (8.10)$$

Using the representation (7.7) for $\ell$ and replacing $\vec r$ by $(C, z_2^*, \ldots, z_K^*)$ yields
$$\ell(\vec r\,') = h\Big( \gamma\delta,\ C\delta,\ \sum_{j=1}^K r_j' \Big) + \sum_{j=2}^K h\big( \lambda_{1j}C,\ \lambda_{j1}z_j^*,\ r_j' \big). \qquad (8.11)$$
Therefore, using (10.7)–(10.10),
$$\frac{\partial\ell}{\partial r_1'} = \log\frac{ \sum_{j=1}^K r_j' + \sqrt{\big( \sum_{j=1}^K r_j' \big)^2 + 4\delta^2\gamma C} }{2\delta\gamma}, \qquad (8.12)$$
$$\frac{\partial\ell}{\partial r_i'} = \log\frac{ \sum_{j=1}^K r_j' + \sqrt{\big( \sum_{j=1}^K r_j' \big)^2 + 4\delta^2\gamma C} }{2\delta\gamma} + \log\frac{ r_i' + \sqrt{(r_i')^2 + 4\lambda_{1i}C\lambda_{i1}z_i^*} }{2\lambda_{1i}C}, \qquad i \ge 2. \qquad (8.13)$$

Equation (8.9) implies that
$$\frac{\partial\ell}{\partial r_1'} = K_1 - L\,t.$$
And so, with $y = \sum_{j=1}^K r_j'$,
$$\frac{y + \sqrt{y^2 + 4\delta^2\gamma C}}{2\delta\gamma} = e^{K_1 - Lt}.$$
Now substitute $y(t) = 2\delta\sqrt{C\gamma}\,\sinh\theta(t)$, so that $\sqrt{y^2 + 4\delta^2\gamma C} = 2\delta\sqrt{C\gamma}\,\cosh\theta(t)$ and
$$\frac{\sqrt{C\gamma}}{\gamma}\big( \sinh\theta(t) + \cosh\theta(t) \big) = e^{K_1 - Lt}.$$
(It is easy to see that $y(t)$ could not be a hyperbolic cosine, which also satisfies the equation, because $\cosh$ does not change sign, while $y(t)$ represents the derivative of a function that returns to its starting position.) Therefore
$$e^{\theta(t)} = K_2 e^{-Lt}, \qquad (8.14)$$
$$\theta(t) = K_3 - Lt, \qquad (8.15)$$
and so
$$y(t) = 2\delta\sqrt{C\gamma}\,\sinh(K_3 - Lt). \qquad (8.16)$$

Note that the first term on the right of (8.13) is $\partial\ell/\partial r_1'$. Combining (8.9), (8.10) and (8.13) we have
$$\frac{d}{dt}\log\frac{ r_i' + \sqrt{(r_i')^2 + 4\lambda_{1i}C\lambda_{i1}z_i^*} }{2\lambda_{1i}C} = L, \qquad i \ge 2. \qquad (8.17)$$
Following the same reasoning as above, and using $z_i^* = \gamma\lambda_{1i}/\lambda_{i1}$, we obtain
$$r_i'(t) = 2\lambda_{1i}\sqrt{C\gamma}\,\sinh(K_{3,i} + Lt). \qquad (8.18)$$
The number of constants is not as large as it seems: from the principle of smooth fit [SW95, 13.63, p. 374] we know that, at $t = 0$, the derivatives $r_j'(0)$ of $\vec r_B$ agree with those of the optimal path to $(C, z_2^*, \ldots, z_K^*)$. So, by (10.13),
$$\frac{d}{dt}r_1(0) = C\delta - \gamma\delta + \sum_{j=1}^K \lambda_{1j}C - \sum_{j=1}^K \lambda_{j1}z_j^*, \qquad \frac{d}{dt}r_j(0) = \lambda_{j1}z_j^* - \lambda_{1j}C, \quad j \ge 2. \qquad (8.19)$$
Using (4.6) and (8.18) with $t = 0$,
$$2\lambda_{1j}\sqrt{C\gamma}\,\sinh K_{3,j} = \lambda_{j1}z_j^* - \lambda_{1j}C, \qquad (8.20)$$
and since $z_j^* = \gamma\lambda_{1j}/\lambda_{j1}$,
$$\sinh K_{3,j} = \frac{\lambda_{1j}(\gamma - C)}{2\lambda_{1j}\sqrt{C\gamma}} = \frac{\gamma - C}{2\sqrt{C\gamma}}. \qquad (8.21)$$
Since the function $\sinh$ is monotone, we conclude that $K_{3,j}$ is independent of $j$, and also of $B$. A similar calculation applies to the first coordinate, yielding $K_3 = -K_{3,j}$ (using $\sinh x = -\sinh(-x)$).

Collecting these results and using $\cosh x = \cosh(-x)$, we have
$$r_j(t) = K_{4,j} + \frac{2\lambda_{1j}\sqrt{C\gamma}}{L}\cosh(K_3 - Lt), \qquad j \ge 2, \qquad (8.22)$$
$$\sum_{j=1}^K r_j(t) = K_4 - \frac{2\delta\sqrt{C\gamma}}{L}\cosh(K_3 - Lt). \qquad (8.23)$$
Using similar arguments we can obtain explicit expressions for $K_4$ and $K_{4,j}$. Since $r_j^*(0) = \gamma\lambda_{1j}/\lambda_{j1}$,
$$K_{4,j} = \gamma\frac{\lambda_{1j}}{\lambda_{j1}} - \frac{2\lambda_{1j}\sqrt{C\gamma}}{L}\cosh(K_3), \qquad j \ge 2, \qquad (8.24)$$
and since $r_1(0) = C$,
$$K_4 = C + \frac{2\delta\sqrt{C\gamma}}{L}\cosh(K_3) + \gamma\sum_{j=2}^K \frac{\lambda_{1j}}{\lambda_{j1}}. \qquad (8.25)$$
Finally, after some more algebra, we obtain
$$r_1(t) = C + \frac{2\sqrt{C\gamma}}{L}\Big( \delta + \sum_{j=2}^K \lambda_{1j} \Big)\big( \cosh(K_3) - \cosh(K_3 - Lt) \big), \qquad (8.26)$$
$$r_j(t) = \gamma\frac{\lambda_{1j}}{\lambda_{j1}} + \frac{2\lambda_{1j}\sqrt{C\gamma}}{L}\big( \cosh(K_3 - Lt) - \cosh(K_3) \big), \qquad j \ge 2. \qquad (8.27)$$
By exactly the same "soft" arguments as in [SW95, Chapter 13], the solution must satisfy $r_1^*(0) = r_1^*(T) = C$, where $T$ is the optimal time for the variational problem. This implies $K_3 = LT/2$, and consequently the entire path $\vec r^{\,*}$ is symmetric around $t = T/2$. In particular, the end point of the optimal path satisfies $\vec r_B^{\,*}(T) = (C, z_2^*, \ldots, z_K^*)$.

We have obtained explicit equations for the optimal path, where the only unknown is $L$. We can now obtain the solution of the optimization problem, and therefore also the probability of buffer overflow (in the small buffer case). This probability is, asymptotically, $e^{-nI^*}$, where $I^* = I_1 + I_2(B)$.

The first term, $I_1$, is given by (10.18), and corresponds to the cost (and probability) of the buffer starting to fill. The second term corresponds to filling a buffer of size $B$, starting at the initial state $\vec r_B^{\,*}(0) = (C, z_2^*, \ldots, z_K^*)$, and is calculated as follows:
$$I_2(B) \approx \int_0^T \ell(\vec r\,')\,dt \qquad (8.28)$$
$$= \int_0^T \Big[ h\Big( \gamma\delta,\ C\delta,\ \sum_{j=1}^K r_j' \Big) + \sum_{j=2}^K h\big( \lambda_{1j}C,\ \lambda_{j1}z_j^*,\ r_j' \big) \Big]\,dt \qquad (8.29)$$
by (8.11). From (10.11), (8.21), the discussion following (8.26)–(8.27), (8.16) and (8.18), respectively, we have
$$z_j^* = \gamma\frac{\lambda_{1j}}{\lambda_{j1}}, \qquad \sinh K_3 = -\frac{\gamma - C}{2\sqrt{C\gamma}}, \qquad K_3 = \frac{LT}{2},$$
where $T$ is the optimal time to fill the buffer, and
$$\sum_{j=1}^K r_j'(t) = 2\delta\sqrt{C\gamma}\,\sinh(K_3 - Lt), \qquad r_j'(t) = 2\lambda_{1j}\sqrt{C\gamma}\,\sinh(Lt - K_3).$$

After a good deal of tedious calculation, we obtain the following. Define $w$, $f(w)$ and $V$ through
$$\sinh(w) = \frac{\sqrt{C\gamma}}{C - \gamma},$$
$$f(w) = w\cosh w - \sinh w = \sqrt{1 + \frac{C\gamma}{(C-\gamma)^2}}\,\log\Bigg( \frac{\sqrt{C\gamma}}{C-\gamma} + \sqrt{1 + \frac{C\gamma}{(C-\gamma)^2}} \Bigg) - \frac{\sqrt{C\gamma}}{C-\gamma},$$
$$V = 2\Bigg( C + \gamma + \sqrt{C\gamma}\,\sqrt{1 + \frac{C\gamma}{(C-\gamma)^2}} \Bigg)\log\Bigg( \frac{\sqrt{C\gamma}}{C-\gamma} + \sqrt{1 + \frac{C\gamma}{(C-\gamma)^2}} \Bigg).$$
Now let
$$L = \sqrt{\frac{2\big( \delta + \sum_{j=2}^K \lambda_{1j} \big)f(w)}{B}};$$
then
$$I_2(B) = \frac{V\big( \delta + \sum_{j=2}^K \lambda_{1j} \big)}{L} = \sqrt{B\Big( \delta + \sum_{j=2}^K \lambda_{1j} \Big)}\cdot\frac{V}{\sqrt{2f(w)}}.$$
The rate function governing the probability of overflow, for a small buffer, is
$$I^* = I_1 + I_2(B) \qquad (8.30)$$
$$= C\log\frac{C}{\gamma} - C + \gamma + \sqrt{B\Big( \delta + \sum_{j=2}^K \lambda_{1j} \Big)}\cdot\frac{V}{\sqrt{2f(w)}}, \qquad (8.31)$$
using (10.18). In addition, we obtain that $r_1^*(t) > C$ while $r_j^*(t) < z_j^*$ for all $0 < t < T^*$ and $j > 1$.
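The constants above are straightforward to evaluate. The sketch below (parameter values are illustrative) computes $w$ from $\sinh w = \sqrt{C\gamma}/(C-\gamma)$, evaluates $f(w) = w\cosh w - \sinh w$ both directly and via the expanded logarithmic form (they must agree, since $\cosh w = \sqrt{1 + \sinh^2 w}$ and $w = \log(\sinh w + \cosh w)$), and assembles the $\sqrt{B}$ coefficient appearing in (8.31).

```python
import math

C, gamma, delta = 0.6, 0.3, 1.0
lam1 = [2.0, 0.5]                 # lambda_{1j}, j = 2, 3 (illustrative)

s = math.sqrt(C * gamma) / (C - gamma)
w = math.asinh(s)
f_direct = w * math.cosh(w) - math.sinh(w)
f_expanded = (math.sqrt(1 + C * gamma / (C - gamma) ** 2)
              * math.log(s + math.sqrt(1 + C * gamma / (C - gamma) ** 2)) - s)

# V written as 2*(C + gamma + sqrt(C*gamma)*cosh(w)) * w, equivalent to the log form.
V = 2 * (C + gamma + math.sqrt(C * gamma) * math.cosh(w)) * w
coef = math.sqrt(delta + sum(lam1)) * V / math.sqrt(2 * f_direct)  # I2(B) ~ coef*sqrt(B)
```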

8.2 Small buffer asymptotics: closed model

As explained at the beginning of Section 8.1 and in Equation (8.8),
$$I^*(B) = I^*(0) + c_3\sqrt{B} + O(B) \qquad (8.32)$$
for some constant $c_3$. We now calculate $c_3$. As before, our method is to fix the jump rates of the process $\vec z_n(t)$ at the point where the buffer first starts to fill; that is, at the point $\vec r^{\,*}(0) = \vec z^*$ of the zero-buffer analysis. Then we calculate the overflow probability based on those jump rates. We have, therefore,
$$\vec r^{\,*}(0) = \Big( C,\ \pi_2\frac{1-C}{1-\pi_1},\ \ldots,\ \pi_K\frac{1-C}{1-\pi_1} \Big).$$
Accordingly, we fix the jump rates at
$$L_i = z_i^*\lambda_{i1}, \qquad i \ge 2, \qquad (8.33)$$
$$M_i = C\lambda_{1i}, \qquad i \ge 2. \qquad (8.34)$$

With this simplification, the problem reduces to one equivalent to [SW95, Section 8, pp. 383–384]. Take $a_i = 1$, $i = 1, \ldots, K$, and call our Lagrange multiplier $Q$ (the symbol $K$ having been taken as the number of dimensions). Then
$$\frac{db}{dt} = r_1(t) - C \qquad (8.35)$$
$$= \Big( 1 - \sum_{j=2}^K r_j(t) \Big) - C. \qquad (8.36)$$
Also, by definition,
$$C = 1 - \sum_{j=2}^K r_j(0). \qquad (8.37)$$
In order to find the path as well as the optimal cost, we fix the value of the Lagrange multiplier $Q$, so that the quantities below depend on $Q$. Since we deal with dimension $K-1$, let us abbreviate
$$\langle \vec x, \vec y \rangle = \sum_{j=2}^K x_j y_j. \qquad (8.38)$$
Then, following the derivation in [SW95],
$$\inf_{\vec r, T}\Big\{ \int_0^T \Big( \ell\big( \vec r(t), \vec r\,'(t) \big) - Q\big( r_1(t) - C \big) \Big)dt \,:\, \vec r(0) = \vec r^{\,*}(0),\ T > 0 \Big\}$$
$$= \inf_{\vec r, T}\Big\{ \int_0^T \Big( \ell\big( \vec r(t), \vec r\,'(t) \big) - Q\big\langle \vec r(0) - \vec r(t),\ \vec a \big\rangle \Big)dt \,:\, \vec r(0) = \vec r^{\,*}(0),\ T > 0 \Big\}$$
$$= \sum_{j=2}^K \inf_{r_j, T}\Big\{ \int_0^T \Big( \ell_j\big( r_j(t), r_j'(t) \big) - Q\big( r_j(0) - r_j(t) \big) \Big)dt \,:\, r_j(0) = \pi_j\frac{1-C}{1-\pi_1},\ T > 0 \Big\}. \qquad (8.39)$$

The problem has been reduced to exactly the form of [SW95, p. 382, last line], with $Q$ replacing $(-K)$. The solution is therefore given by equations (13.143)–(13.145) there as
$$r_i(t) = r_i(0) - \frac{L_i + M_i + d_i}{Q} + \frac{2\sqrt{L_iM_i}}{Q}\cosh(Qt - w), \qquad (8.40)$$
where $QT = 2w$ and
$$d_i = \cosh(w) - L_i - M_i. \qquad (8.41)$$
Furthermore, the equation just below (13.143) implies
$$\sum_{i=2}^K d_i = 0. \qquad (8.42)$$
We complete the calculation as follows:
$$0 = \sum_{i=2}^K d_i \qquad (8.43)$$
$$= (K-1)\cosh(w) - \sum_{i=2}^K (L_i + M_i). \qquad (8.44)$$
Therefore
$$w = \cosh^{-1}\Bigg( \frac{\sum_{i=2}^K (L_i + M_i)}{K-1} \Bigg), \qquad (8.45)$$
$$QT = 2w, \qquad T = \frac{2w}{Q}, \qquad (8.46){-}(8.47)$$
and
$$B = \int_0^T \big( r_1(t) - C \big)\,dt \qquad (8.48)$$
$$= \int_0^T \sum_{j=2}^K \big( r_j(0) - r_j(t) \big)\,dt \qquad (8.49)$$
$$= \int_0^T \sum_{i=2}^K \Bigg( \frac{L_i + M_i + d_i}{Q} - \frac{2\sqrt{L_iM_i}}{Q}\cosh(Qt - w) \Bigg)dt \qquad (8.50)$$
$$= \sum_{i=2}^K \Bigg( \frac{L_i + M_i + d_i}{Q}\,T - \frac{2\sqrt{L_iM_i}}{Q^2}\sinh(Qt - w)\Big|_0^T \Bigg). \qquad (8.51)$$
Since $QT = 2w$ and
$$\cosh w = \frac{\sum_{i=2}^K (L_i + M_i)}{K-1}, \qquad (8.52)$$
we have
$$\sinh w = \sqrt{\cosh^2 w - 1} = \sqrt{\Bigg( \frac{\sum_{i=2}^K (L_i + M_i)}{K-1} \Bigg)^2 - 1}, \qquad (8.53){-}(8.54)$$
so, using $\sum_i d_i = 0$ and $\sinh(Qt - w)\big|_0^T = 2\sinh w$,
$$B = \frac{2}{Q^2}\Bigg( w\sum_{i=2}^K (L_i + M_i) - 2\sinh w\sum_{i=2}^K \sqrt{L_iM_i} \Bigg) \equiv \frac{c_4}{Q^2}, \qquad (8.55){-}(8.56)$$
which defines $c_4$. So we have
$$I = \int_0^T \ell\big( \vec r(t), \vec r\,'(t) \big)\,dt \qquad (8.57)$$
$$= \int_0^T \sum_{i=2}^K \Bigg( \frac{\partial\ell}{\partial r_i'}\big( \vec r(t), \vec r\,'(t) \big)\,r_i'(t) + Q\big( r_i(0) - r_i(t) \big) \Bigg)dt. \qquad (8.58)$$
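A quick numerical check of (8.41)–(8.47) (the rate values $L_i$, $M_i$ are illustrative, not from the paper): with $w = \cosh^{-1}\big( \sum_i (L_i + M_i)/(K-1) \big)$ and $d_i = \cosh(w) - L_i - M_i$, the $d_i$ sum to zero by construction, and the optimal time satisfies $QT = 2w$ for any fixed multiplier $Q$.

```python
import math

L = {2: 1.2, 3: 0.8}   # L_i = z_i* * lambda_{i1} (illustrative)
M = {2: 0.5, 3: 1.1}   # M_i = C * lambda_{1i}   (illustrative)
K = 3

# eq. (8.45); the argument of acosh must be >= 1 for w to be real.
w = math.acosh(sum(L[i] + M[i] for i in L) / (K - 1))
d = {i: math.cosh(w) - L[i] - M[i] for i in L}   # eq. (8.41)

Q = 2.5                 # an arbitrary positive multiplier value
T = 2 * w / Q           # eqs. (8.46)-(8.47)
```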

8.3 Large buffer asymptotics: open model

The large buffer asymptotics of $I(B)$ are easier to calculate than the small buffer asymptotics. We feel that, for applications, they are less useful. In [EHL] it was shown that
$$\lim_{B\to\infty} \frac{I(B)}{B} = \inf_{\vec x : x_1 > C} \frac{\ell(\vec x, 0)}{x_1 - C}. \qquad (8.59)$$
By (7.7),
$$\ell(\vec x, 0) = h(\gamma\delta, x_1\delta, 0) + \sum_{j=2}^K h(\lambda_{1j}x_1, \lambda_{j1}x_j, 0) \qquad (8.60)$$
$$\ge h(\gamma\delta, x_1\delta, 0), \qquad (8.61)$$
since $h$ is positive, with equality if and only if $\lambda_{j1}x_j = \lambda_{1j}x_1$. Therefore
$$\lim_{B\to\infty} \frac{I(B)}{B} = \inf_{x_1 > C} \frac{h(\gamma\delta, x_1\delta, 0)}{x_1 - C} \qquad (8.62)$$
$$= \inf_{x_1 > C} \frac{\big( \sqrt{\delta x_1} - \sqrt{\delta\gamma} \big)^2}{x_1 - C}. \qquad (8.63)$$
Setting $\alpha = \sqrt{x_1}$ and taking derivatives, we find that the minimum in the region $x_1 > C$ occurs at $\alpha = C/\sqrt{\gamma}$. So
$$\lim_{B\to\infty} \frac{I(B)}{B} = \frac{\delta(C - \gamma)}{C}.$$
In his dissertation, Mandjes [Ma] calculated (without proof) that the optimal path to build a large buffer approaches a specific form, which we now develop. We use this form to derive accurate estimates of $I(B)$ for large $B$. Since the smallest value of $\ell(\vec x, 0)/(x_1 - C)$ occurs at $x_1 = C^2/\gamma$, we know from [SW95, Chapters 7.4 and 11], or from Sanov's theorem, that the processes behave as if the arrival rate is $\delta\gamma A$ and the departure rate is $x_1\delta/A$, with $x_1 = C^2/\gamma$. That is,
$$A = \sqrt{\frac{x_1}{\gamma}} = \frac{C}{\gamma}.$$
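The one-dimensional infimum in (8.62)–(8.63) is easy to confirm by brute force. The sketch below (illustrative parameters) minimizes $h(\gamma\delta, x_1\delta, 0)/(x_1 - C) = \delta(\sqrt{x_1} - \sqrt{\gamma})^2/(x_1 - C)$ over a grid on $(C, C+4)$ and checks the minimizer $x_1 = C^2/\gamma$ and the minimum value $\delta(C-\gamma)/C$.

```python
import math

delta, gamma, C = 1.0, 0.3, 0.6   # illustrative open-model parameters

def ratio(x1):
    # h(gamma*delta, x1*delta, 0)/(x1 - C), using h(a, b, 0) = (sqrt(a)-sqrt(b))^2
    return delta * (math.sqrt(x1) - math.sqrt(gamma)) ** 2 / (x1 - C)

xs = [C + 1e-6 + i * 1e-4 for i in range(40000)]   # grid on (C, C + 4)
best_x = min(xs, key=ratio)
limit = delta * (C - gamma) / C                    # predicted lim I(B)/B
```

Here $C^2/\gamma = 1.2$, and the grid minimum lands there with value $0.5 = \delta(C-\gamma)/C$.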

d ◦ ◦ z∞,j (0) = λj1 z∞,j − λ1j z∞,1 dt d ◦ = z∞,j (0) dt ◦ ◦ = λ1j z∞,1 − λj1 z∞,j 36

(8.64) (8.65) (8.66) (8.67)

so that, for j ≥ 2, ◦ ◦ z∞,j = z∞,1 ·

=C

λ1j λj1

λ1j . λj1

(8.68) (8.69)

The final calculation involves getting a good estimate of I(B). During [0, T /2] we have   γ ◦ d ◦ z∞,1 (t) = δ C − z∞,1 (t) . dt C Therefore

q  2 γ   C − z + C − z1 Cγ + 4γz1 1 γ C ℓ (z1 , z˙1 ) = δ  C − z1 log C 2γ  r  2 γ C − z1 + γ + z1 − + 4γz1  C   C  γ γ log + 1 − (z1 − C) . = δ C − z1 C γ C Therefore, Z T /2 0

T /2 

 C  γ ℓ (z1 , z˙1 ) dt = δ log dt C − z1 γ C 0 Z T /2  γ (z1 − C) dt 1− +δ C 0 Z C 2 /γ  C γB log dz1 + δ 1 − =δ ; γ C 2 C Z

(8.70)

(8.71)

(8.72)

where the first integral comes from changing variables to z(t), and the second from the condition Z T /2 B (z1 − C) dt = . 2 0 Similarly, on the interval [T /2, T ] we have γ d ◦ ◦ z∞,1 (t) = z∞,1 (t) − C, dt C 37

so ℓ (z1 , z˙1 ) =



 z1  γ (z1 − C) . − C log + 1− C C C

(8.73)

Therefore Z

T

C

 γ B/2 log z1 C dz1 + δ 1 − C C 2 /γ   2  C C2 C γ B/2. −C− log +δ 1− =δ γ γ γ C

ℓ (z1 , z˙1 ) dt = δ T /2

Z

Combining the two expressions, we have      Z T C C γ ℓ (z1 , z˙1 ) dt = δ C B . − 1 − log + 1− γ γ C 0 That is, we obtain the following result: as B → ∞,      C C γ I(B) = I(0) + δ C B + o(1). − 1 − log + 1− γ γ C Note that the additive terms are all positive since γ < C. In [EHL] we showed that  γ B I(B) > I(0) + δ 1 − C

for B > 0, and showed that there is a constant u such that  γ I(B) < I(0) + u + δ 1 − B C

(8.74) (8.75)

(8.76)

(8.77)

(8.78)

(8.79)

for all B > 0. We believethat the smallest u which satisfies this inequality is u = δC Cγ − 1 − log Cγ , but haven’t proved it.

8.4 Large buffer asymptotics: closed model

As shown in [SW95, EHL],
$$\lim_{B\to\infty} \frac{-1}{B}\lim_{n\to\infty} \frac{1}{n}\log P(\text{buffer} > B) = \inf_{x_1 > C} \frac{\ell(\vec x, 0)}{x_1 - C} \equiv I^*. \qquad (8.80){-}(8.81)$$
That is, for large $n$ and $B$,
$$P(\text{buffer} > B) \approx e^{-nI^*B}. \qquad (8.82)$$
Our investigation is a bit incomplete, in that we have not shown that $I^*$ is attained at a unique point $\vec x^*$. Using Lagrange multipliers, we can perform the minimization over $x_2, \ldots, x_K$ for each $x_1$, thus reducing the problem to a one-dimensional minimization of a smooth function. The minimum of the resulting function occurs in the interior $C < x_1 < 1$. Therefore the problem can be solved efficiently by numerical minimization. Unfortunately, except in the case $K = 2$, we have been unable to calculate the solution analytically.

Mandjes calculated detailed sample-path asymptotics for this case, too. The calculations are exactly along the lines we have already given; indeed, Mandjes worked out the value of $I^*$ as well as the sample path for closed models. The reason we do not translate his more general solution to the present context is that his solution, in the present case, must be worked out at least partially numerically rather than analytically. This concludes our discussion.

9 End notes

Our solutions to the zero- and small-buffer cases have some simple properties that are worth noting.

• For models $O_K$, the upcrossing probability is independent of $K$, of all the $\lambda_{ij}$, and of $\delta$. It depends only on $n$, $\gamma$, and $C$.

• For models $C_K$, the upcrossing probability depends on the model parameters only through $x_1^*$, $n$, and $C$.

• For $O_K$ models, the probability of the buffer exceeding $B$ is independent of $K$, the $\lambda_{ij}$, and $\delta$, to second order. Specifically,
$$I(B) = \alpha + \beta B + o(1) \qquad \text{as } B \to \infty,$$
where $\alpha$ and $\beta$ are functions of $\delta$, $\gamma$, and $C$ only.

• $O_K$ models most likely have $z_j(t) = z_j^*$ for $j \ge 2$ at the time when $z_1(t) = C$.

• $C_K$ models most likely have $z_j(t) = z_j^*\frac{1-C}{1-z_1^*}$ for $j \ge 2$ at the time when $z_1(t) = C$.

• The derivative of the most likely path, as $z_1(t)$ approaches $C$, is given as follows. For open models,
$$\frac{dz_1(t)}{dt} = C - \gamma, \qquad (9.1)$$
$$\frac{dz_j(t)}{dt} = z_j^*\lambda_{j1} - C\lambda_{1j}. \qquad (9.2)$$
For closed models,
$$\frac{dz_1(t)}{dt} = \sum_{j=2}^K \big( C\lambda_{1j} - y_j\lambda_{j1} \big), \qquad (9.3)$$
$$\frac{dz_j(t)}{dt} = y_j\lambda_{j1} - C\lambda_{1j}. \qquad (9.4)$$

• The small buffer asymptotics are achieved by sample paths that start nearly at the upcrossing point and continue from there with the same initial direction. Thus, for all of our models, $\frac{dz_1(t)}{dt} > 0$ initially, but $\frac{dz_j(t)}{dt} < 0$ for $j \ge 2$. Furthermore, if the rates $\lambda_{j1}$ and $\lambda_{1j}$ are well separated from $\lambda_{i1}$ and $\lambda_{1i}$ for each $i \ne j$, then nearly all the buffer is caused by the component corresponding to the largest $\lambda_{i1}, \lambda_{1i}$ pair (that is, for small buffers the excursion occurs on the shortest time scale).

• For open models, when the buffer fills to a large value, it most likely begins by having the system pass through the point $z^\circ = \frac{C}{\gamma}\vec z^*$; then the system stays near $\frac{C}{\gamma}z^\circ = \big( \frac{C}{\gamma} \big)^2\vec z^*$ for a long time; then it relaxes back to $\vec z^*$, passing through $z^\circ$ on the way back. But even though the path goes through these three collinear points, it is not a straight line, because at $z^\circ$ we know that $\frac{d}{dt}z_j = 0$ for $j \ge 2$ while $\frac{dz_1}{dt} = C - \gamma \ne 0$.

• A word of caution: there is a measurement issue regarding the models we developed in this paper, especially as related to CAC. The formulas we developed assume that the $\lambda_{ij}$ are known. In practice, of course, they would have to be inferred from measurements. Measurements are subject to errors, so there is a chance that a CAC scheme would cause an unacceptably high error rate because of faulty estimation of the $\lambda_{ij}$. In fact, as Gibbens and Kelly [GK] have shown, many natural measurement/estimation schemes have an associated large deviations rate function, which properly should be incorporated into a CAC scheme. The problem can be ameliorated through proper design and analysis. The point is, it should not be ignored.

Acknowledgments Research of the first author was supported in part by the fund for the promotion of research at the Technion. Work of this author was performed in part while on Sabbatical Leave, visiting Bell Laboratories, Murray Hill.

10 Appendix: Direct variational approach, open models

By the Freidlin–Wentzell theory, the probability that the buffer is nonempty is equal (on an exponential scale) to the probability of crossing the level $z_1 = C$; moreover, the most likely path by which the buffer starts filling passes through $z_1 = C$. The probability of level crossing, that is, the probability that the first queue has more than $C$ users, is given explicitly for open models in Theorem 1. Thus the theorem allows us to compute the probability that the buffer is nonempty. In Section 6 we obtained the optimal path and the probability of a nonempty buffer using the following tools.

• Theorem 1 gives the steady state statistics of queue sizes for the open models.

• Using (time) reversibility, we obtain the path to the most likely point where the buffer starts filling as the time reversal of the most likely path from that point. The latter is computed explicitly in Section 5.

In this section we show how to calculate the most likely path using the variational problem that arises from the theory of large deviations. This serves both to illustrate how such variational problems are solved and to provide a different (although more technical) derivation of the same results.

Using the Freidlin–Wentzell theory, we know that the most likely path, if it exists, is the minimizer of the functional
$$I(\vec r\,) = \int_0^T \ell\big( \vec r(t), \vec r\,'(t) \big)\,dt \qquad (10.1)$$
subject to
$$\vec r(0) = \vec z^*, \qquad (10.2)$$
$$r_1(T) = C. \qquad (10.3)$$
Here $T$, as well as the path $\vec r$, is chosen so as to minimize the integral (this usually means that $T = \infty$). A necessary condition that any solution $\vec r_*(t)$ of (10.1) must satisfy is the Euler equation [SW95, Eq. (C.2)]
$$\frac{\partial\ell(\vec r_*(t), \vec r_*'(t))}{\partial r_i} - \frac{d}{dt}\frac{\partial\ell(\vec r_*(t), \vec r_*'(t))}{\partial r_i'} = 0. \qquad (10.4)$$
Since $T$ is a "free parameter," we obtain a second necessary condition, a transversality condition [SW95, Eq. (C.4), Section C.1]:
$$\frac{\partial\ell(\vec r_*(T_*), \vec r_*'(T_*))}{\partial r_i'} = 0, \qquad i = 2, \ldots, K. \qquad (10.5)$$
Here $T_*$ is the terminal time, the time the path reaches $r_1(T_*) = C$. This transversality condition is derived in Elsgolc [El, p. 75]; see also [SW95, p. 516, last equation]. Note that this is a multi-dimensional problem and, since the terminal point here is free in coordinates $i = 2, \ldots, K$, we can make a special variation in each coordinate and choose $r_i(T_*)$ ($x(b)$ in [SW95]) arbitrarily. This implies the transversality condition.

10.1 The state at nonempty buffer

By solving the transversality conditions, we now reestablish that the most likely terminal point is $(C, z_2^*, \ldots, z_K^*)$. By reversibility, $\vec r\,'(t) = -\vec z_\infty'(t)$ along the optimal path, and in particular, at the terminal point, $r_1(0) = C$ (we shift time so that the terminal point is reached at $t = 0$; this is convenient since the optimal time is $\infty$!). So, by (5.5),
$$r_1'(0) = C\delta - \gamma\delta + C\sum_{j=1}^K \lambda_{1j} - \sum_{j=1}^K \lambda_{j1}r_j(0), \qquad (10.6)$$
$$r_j'(0) = \lambda_{j1}r_j(0) - C\lambda_{1j}, \qquad j \ge 2. \qquad \text{(10.6b)}$$

We now invoke the transversality condition (10.5). The function $h$ has exactly the form [SW95, Eq. (7.17)] of the rate function for the M/M/1 process. Therefore, by [SW95, Eq. (7.23), Exercise 7.24 and Eq. (C.6)],
$$\frac{\partial h(a, b, y)}{\partial y} = \log\frac{y + \sqrt{y^2 + 4ab}}{2a}, \qquad (10.7)$$
and so
$$\frac{\partial\ell(\vec r, \vec r\,')}{\partial r_j'} = \begin{cases} h_1' & j = 1, \\ h_1' + h_j' & j \ge 2, \end{cases} \qquad (10.8)$$
where
$$h_1' = \log\frac{ \sum_{j=1}^K y_j + \sqrt{\big( \sum_{j=1}^K y_j \big)^2 + 4\delta\gamma\,\delta C} }{2\delta\gamma}, \qquad (10.9)$$
$$h_j' = \log\frac{ y_j + \sqrt{y_j^2 + 4\lambda_{1j}C\lambda_{j1}x_j} }{2\lambda_{1j}C}. \qquad (10.10)$$
So the transversality condition becomes
$$0 = h_1' + h_j' = \log\Bigg( \frac{ \sum_{j=1}^K y_j + \sqrt{\big( \sum_{j=1}^K y_j \big)^2 + 4\delta\gamma\,\delta C} }{2\delta\gamma}\cdot\frac{ y_j + \sqrt{y_j^2 + 4\lambda_{1j}C\lambda_{j1}x_j} }{2\lambda_{1j}C} \Bigg).$$
But from (10.6), $\sum_{j=1}^K y_j = C\delta - \gamma\delta$ and $y_j = \lambda_{j1}x_j - C\lambda_{1j}$, so
$$1 = \frac{ C\delta - \gamma\delta + \sqrt{(C\delta - \gamma\delta)^2 + 4\delta^2\gamma C} }{2\delta\gamma}\cdot\frac{ \lambda_{j1}x_j - C\lambda_{1j} + \sqrt{(\lambda_{j1}x_j - C\lambda_{1j})^2 + 4\lambda_{1j}C\lambda_{j1}x_j} }{2\lambda_{1j}C}$$
$$= \frac{C - \gamma + C + \gamma}{2\gamma}\cdot\frac{ \lambda_{j1}x_j - C\lambda_{1j} + \lambda_{j1}x_j + C\lambda_{1j} }{2\lambda_{1j}C} = \frac{C}{\gamma}\cdot\frac{\lambda_{j1}x_j}{\lambda_{1j}C},$$
and so, finally,
$$x_j = \frac{\gamma\lambda_{1j}}{\lambda_{j1}} = z_j^*. \qquad (10.11)$$
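As a numerical spot-check of this transversality calculation (rate values are illustrative): substituting $x_j = \gamma\lambda_{1j}/\lambda_{j1}$ into (10.9)–(10.10), the two logarithms evaluate to $\log(C/\gamma)$ and $\log(\gamma/C)$ and so cancel exactly.

```python
import math

delta, gamma, C = 1.0, 0.3, 0.6
lam1j, lamj1 = 1.7, 0.9               # one off state, illustrative rates

sum_y = C * delta - gamma * delta     # sum of the y_j, from (10.6)
xj = gamma * lam1j / lamj1            # candidate terminal point z_j*, eq. (10.11)
yj = lamj1 * xj - C * lam1j           # from (10.6b)

h1p = math.log((sum_y + math.sqrt(sum_y ** 2 + 4 * delta ** 2 * gamma * C))
               / (2 * delta * gamma))                       # eq. (10.9)
hjp = math.log((yj + math.sqrt(yj ** 2 + 4 * lam1j * C * lamj1 * xj))
               / (2 * lam1j * C))                           # eq. (10.10)
```

The square roots simplify to $\delta(C+\gamma)$ and $\lambda_{j1}x_j + C\lambda_{1j}$, which is exactly the algebra used in the display above.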

That is, the variational problem, through the transversality conditions, also implies that conditioned on $x_1 = C$, the other coordinates are most likely to be near their steady state values.

10.1.1 The probability of nonempty buffer

In §4.3 we showed [Eq. (4.11)] that the probability of a nonempty buffer, $P(x_1 \ge nC)$, in steady state is about $e^{-nI^*}$, where
$$I^* = C\log\frac{C}{\gamma} + \gamma - C.$$
We now derive this result directly from the variational problem. From (10.1),
$$I^* = \inf\Big\{ \int_0^T \ell\big( \vec r(t), \vec r\,'(t) \big)\,dt \,:\, \vec r(0) = \vec z^*,\ r_1(T) = C,\ r_j(T) = z_j^*,\ j \ge 2 \Big\}, \qquad (10.12)$$

where the last constraint is a consequence of reversibility [see §10.1 and (10.11)]. In §6 we calculated that the most likely path (the solution of the variational problem) is $\vec r(t) = \vec z_\infty(-t)$, where $\vec z_\infty(0) = (C, z_2^*, \ldots, z_K^*)$. That is, $\vec r$ satisfies
$$\frac{d}{dt}r_1(t) = r_1(t)\Big( \delta + \sum_{i=1}^K \lambda_{1i} \Big) - \gamma\delta - \sum_{j=1}^K \lambda_{j1}r_j(t), \qquad \frac{d}{dt}r_j(t) = \lambda_{j1}r_j(t) - \lambda_{1j}r_1(t), \quad j \ge 2, \qquad (10.13)$$
for $0 \le t \le T$, together with the boundary conditions. In order to calculate $\ell$ along the optimal path, we use (7.7). By (7.8),
$$h(a, b, b - a) = (b - a)\log\frac{b}{a}.$$
If we combine this with (10.13) and substitute $r_j$ for $x_j$ and $r_j'$ for $y_j$, we obtain
$$h\big( \lambda_{1j}r_1, \lambda_{j1}r_j, r_j' \big) = h\big( \lambda_{1j}r_1, \lambda_{j1}r_j, \lambda_{j1}r_j - \lambda_{1j}r_1 \big) = \big( \lambda_{j1}r_j - \lambda_{1j}r_1 \big)\log\frac{\lambda_{j1}r_j}{\lambda_{1j}r_1}. \qquad (10.14)$$
Since $\lambda_{11} = 0$ we have $\sum_{j=1}^K r_j' = r_1\delta - \gamma\delta$, so
$$h\Big( \gamma\delta,\ r_1\delta,\ \sum_{j=1}^K r_j' \Big) = h\big( \gamma\delta, r_1\delta, r_1\delta - \gamma\delta \big) = \delta(r_1 - \gamma)\log\frac{r_1}{\gamma}. \qquad (10.15)$$
So, by (7.7),
$$\ell(\vec r, \vec r\,') = \delta(r_1 - \gamma)\log\frac{r_1}{\gamma} + \sum_{j=2}^K \big( \lambda_{j1}r_j - \lambda_{1j}r_1 \big)\log\frac{\lambda_{j1}r_j}{\lambda_{1j}r_1}.$$

Now \vec z_\infty(t) \to \vec z^{\,*}, but does not achieve \vec z^{\,*} in finite time, so that the optimal time for the variational problem is T = ∞. Hence the minimum value is given by

    \int_0^T \ell\big(\vec r(t), \vec r\,'(t)\big)\,dt
      = \int_0^\infty \Big[ \delta(r_1(t) - \gamma)\log\frac{r_1(t)}{\gamma}
        + \sum_{j=2}^K \big(\lambda_{j1} r_j(t) - \lambda_{1j} r_1(t)\big)\log\frac{\lambda_{j1} r_j(t)}{\lambda_{1j} r_1(t)} \Big]\,dt
      = \int_0^\infty \Big[ r_1'(t)\log\frac{r_1(t)}{\gamma}
        + \sum_{j=2}^K \big(\lambda_{j1} r_j(t) - \lambda_{1j} r_1(t)\big)\Big(\log\frac{\lambda_{j1} r_j(t)}{\lambda_{1j} r_1(t)} + \log\frac{r_1(t)}{\gamma}\Big) \Big]\,dt    (10.16)

where the last equality is obtained using (10.13) as in deriving (10.14)–(10.15). Using (10.13) again to simplify the last sum, we obtain

    \int_0^T \ell\big(\vec r(t), \vec r\,'(t)\big)\,dt
      = \int_0^\infty \Big[ r_1'(t)\log\frac{r_1(t)}{\gamma} + \sum_{j=2}^K r_j'(t)\log\frac{\lambda_{j1} r_j(t)}{\lambda_{1j}\gamma} \Big]\,dt.    (10.17)

However, for any function f(t) and constant a,

    \int_0^\infty f'(t)\log\big(a \cdot f(t)\big)\,dt = \Big[ f(t)\big(\log(a \cdot f(t)) - 1\big) \Big]_0^\infty.

Since r_1(0) = γ, r_1(∞) = C and r_j(0) = r_j(∞) = z_j^*, we have

    \int_0^T \ell\big(\vec r(t), \vec r\,'(t)\big)\,dt = C\log\frac{C}{\gamma} - C + \gamma    (10.18)

which agrees with (4.11). However, our pathwise approach provides more information; here is an example. We can show that T = ∞ and, moreover, r_1(t) > γ and r_j(t) > z_j^*, j = 2, ..., K, for all t. Using reversibility, we can analyze the behavior of \vec z_\infty, starting at (C, z_2^*, \ldots, z_K^*). Define

    z(t) = \sum_{k=1}^K z_k(t).

From (5.5),

    \frac{d}{dt} z(t) = \delta\big(\gamma - z_1(t)\big).

Since z_1(0) = C > γ, it follows that (d/dt)z(t) < 0 until the first time that z_1(t) = γ. Now by (4.6) and (5.5),

    \frac{d}{dt} z_1(0) = \delta(\gamma - C) + \sum_{k=2}^K \lambda_{1k}(\gamma - C) < 0,    (10.19)

    \frac{d}{dt} z_j(0) = \lambda_{1j}(C - \gamma) > 0.    (10.20)

Therefore,

    \frac{d}{dt} z_j(t) \ge \gamma\lambda_{1j} - \lambda_{j1} z_j(t)

at least as long as z_1(t) ≥ γ. This differential inequality implies [Ha] that z_j(t) is larger than the solution to

    \frac{d}{dt} y(t) = \gamma\lambda_{1j} - \lambda_{j1} y(t)

(with the same initial conditions), and hence z_j(t) > z_j^*, again at least until z_1(t) = γ.

Similarly, we have

    \frac{d}{dt} z_1(t) \ge \delta\big(\gamma - z_1(t)\big) - \sum_{k=1}^K \lambda_{1k} z_1(t) + \sum_{k=1}^K \lambda_{k1} z_k^*    (10.21)
      = \Big(\delta + \sum_{k=1}^K \lambda_{1k}\Big)\big(\gamma - z_1(t)\big)    (10.22)

and so z_1(t) > γ, at least until one of the other coordinates satisfies z_j(t) = z_j^*. Putting these together, we see that neither z_j(t) = z_j^* nor z_1(t) = γ can occur at any finite time. Therefore, T = ∞, and moreover we conclude that z_1(t) is always above (in fact, strictly above) the solution to

    \frac{d}{dt} y(t) = \Big(\delta + \sum_{k=1}^K \lambda_{1k}\Big)\big(\gamma - y(t)\big), \qquad y(0) = z_1(0).

But this can be solved explicitly, so

    z_1(t) \ge y(t) = e^{-\Lambda_1 t}\, z_1(0) + \big(1 - e^{-\Lambda_1 t}\big)\gamma,    (10.23)

where

    \Lambda_1 = \delta + \sum_{k=1}^K \lambda_{1k}.    (10.24)
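The bound (10.23)–(10.24) is easy to illustrate numerically. The sketch below Euler-simulates the (assumed) forward relaxation dynamics dz_1/dt = δ(γ − z_1) − Σ_{k≥2} λ_{1k} z_1 + Σ_{k≥2} λ_{k1} z_k and dz_j/dt = λ_{1j} z_1 − λ_{j1} z_j, started at (C, z_2^*, ..., z_K^*); all parameter values are illustrative assumptions, not taken from the text. It checks that z_1(t) stays above the explicit solution y(t) and that each z_j(t) stays above z_j^*:

```python
import math

# Illustrative parameters (assumptions): K = 3, with lam_1j[j], lam_j1[j]
# playing the roles of lambda_{1j}, lambda_{j1} for j = 2, 3.
delta, gamma, C = 1.0, 1.0, 2.0
lam_1j = {2: 0.5, 3: 0.3}
lam_j1 = {2: 1.0, 3: 0.6}
zstar = {j: gamma * lam_1j[j] / lam_j1[j] for j in lam_1j}

Lam1 = delta + sum(lam_1j.values())      # Lambda_1 of Eq. (10.24)
dt, T = 1e-4, 20.0
z1, zj = C, dict(zstar)                  # start at (C, z_2^*, ..., z_K^*)
ok, t = True, 0.0
while t < T:
    # explicit lower bound y(t) of Eqs. (10.23)-(10.24)
    y = math.exp(-Lam1 * t) * C + (1.0 - math.exp(-Lam1 * t)) * gamma
    ok = ok and z1 >= y - 1e-6 and all(zj[j] >= zstar[j] - 1e-6 for j in zj)
    # one forward Euler step of the relaxation dynamics
    dz1 = delta * (gamma - z1) - sum(lam_1j.values()) * z1 \
          + sum(lam_j1[j] * zj[j] for j in zj)
    for j in zj:
        zj[j] += dt * (lam_1j[j] * z1 - lam_j1[j] * zj[j])
    z1 += dt * dz1
    t += dt
print(ok)
```

The small tolerances absorb Euler discretization error; near t = 0 the gap z_1(t) − y(t) is of second order, so the exact inequality is approached but not violated.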

We have just seen that our sample-path asymptotics give simple, explicit estimates of the transient behavior of the system. These estimates show that during an overflow, the most likely way for the system to behave is to have every component of the system larger than its steady-state value, except at the very end, when all components (except for state one) are at their steady-state values. Furthermore, since T = ∞, there is no critical time interval over which we can say that upcrossings occur. Instead, upcrossings occur at random times, and as n increases, we expect that upcrossings will be closer to a very gradual, smooth path, especially at the beginning of the upcrossing (which may be hard to identify).
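As a final numerical illustration of (10.18): along the optimal path the integrand of (10.16) depends only on the current position, so the action integral can be accumulated along the forward relaxation (time reversal does not change the total) and compared with I^* = C log(C/γ) − C + γ. The parameter values and the forward dynamics below are assumptions for illustration, as in the earlier sketch:

```python
import math

# Illustrative parameters (assumptions): K = 3.
delta, gamma, C = 1.0, 1.0, 2.0
lam_1j = {2: 0.5, 3: 0.3}    # lambda_{1j}
lam_j1 = {2: 1.0, 3: 0.6}    # lambda_{j1}
zstar = {j: gamma * lam_1j[j] / lam_j1[j] for j in lam_1j}

def ell(z1, zj):
    # integrand of (10.16), evaluated as a function of position only
    s = delta * (z1 - gamma) * math.log(z1 / gamma)
    for j in zj:
        s += (lam_j1[j] * zj[j] - lam_1j[j] * z1) \
             * math.log(lam_j1[j] * zj[j] / (lam_1j[j] * z1))
    return s

dt, T = 2e-4, 40.0
z1, zj = C, dict(zstar)       # relaxation starts at (C, z_2^*, ..., z_K^*)
cost, t = 0.0, 0.0
while t < T:
    cost += ell(z1, zj) * dt  # accumulate the action
    dz1 = delta * (gamma - z1) - sum(lam_1j.values()) * z1 \
          + sum(lam_j1[j] * zj[j] for j in zj)
    for j in zj:
        zj[j] += dt * (lam_1j[j] * z1 - lam_j1[j] * zj[j])
    z1 += dt * dz1
    t += dt

target = C * math.log(C / gamma) - C + gamma   # Eq. (10.18)
print(round(cost, 3), round(target, 3))
```

With these values, target = 2 log 2 − 1 ≈ 0.386, and the accumulated cost should agree up to discretization error.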

References

[AMS] D. Anick, D. Mitra and M.M. Sondhi, "Stochastic theory of a data-handling system with multiple sources," Bell Sys. Tech. J. 61 pp. 1871–1894, 1982.

[CW] Gagan L. Choudhury and Ward Whitt, "Long-tail buffer-content distributions in broadband networks," Performance Evaluation pp. 1–14, 1997.

[DLORT] N.G. Duffield, J.T. Lewis, Neil O'Connell, Raymond Russell, and Fergal Toomey, "Predicting quality of service for traffic with long-range fluctuations," in Proc. ICC '95, pp. 473–477, Seattle, 1995.

[El] L.E. Elsgolc, Calculus of Variations, Pergamon Press, 1962.

[EHL] Anwar Elwalid, Daniel Heyman, T.V. Lakshman, Debasis Mitra, and Alan Weiss, "Fundamental bounds and approximations for ATM multiplexers with application to video teleconferencing," IEEE J. Selected Areas in Comm. 13 pp. 1004–1016, 1995.

[ENW] A. Erramilli, O. Narayan, and W. Willinger, "Experimental queueing analysis with long-range dependent packet traffic," IEEE/ACM Trans. Networking 4 pp. 209–223, 1996.

[Ev] Suzanne P. Evans, "Analyzing system behaviour on different time scales," in Stochastic Networks: Theory and Applications, edited by F.P. Kelly, S. Zachary, and I. Ziedins, Royal Statistical Society Lecture Note Series 4, Clarendon Press, Oxford, pp. 231–246, 1996.

[GK] R.J. Gibbens and F.P. Kelly, "Measurement-based connection admission control," in Teletraffic Contributions for the Information Age: Proceedings of the 15th International Teletraffic Congress, Washington, DC, edited by V. Ramaswami and P.E. Wirth, Elsevier, Amsterdam, pp. 879–888, 1997.

[Ha] J.K. Hale, Ordinary Differential Equations, Pure and Applied Mathematics XXI, Robert E. Krieger Publishing Company, New York, 1980.

[HRS] David Heath, Sidney Resnick and Gennady Samorodnitsky, "Patterns of buffer overflow in a class of queues with long memory in the input stream," Ann. Appl. Prob. 7 pp. 1021–1057, 1997.

[Iv] V.A. Ivnitskii, "On the invariance of stationary probabilities of states of a closed star-shaped queueing network with state-dependent transition probabilities," Theo. Prob. Appl. 42 pp. 162–167, 1998.

[JLS] P.R. Jelenkovic, A.A. Lazar, and N. Semret, "The effect of multiple time scales and subexponentiality of MPEG video streams on queueing behavior," IEEE J. Selected Areas in Comm., Special Issue on Video Modeling, 15 no. 6, 1997.

[JL] P.R. Jelenkovic and A.A. Lazar, "Asymptotic results for multiplexing subexponential on-off sources," submitted to Adv. Appl. Prob., 1997.

[Ke] F.P. Kelly, Reversibility and Stochastic Networks, Wiley, 1979.

[Ma] Michael Mandjes, Rare Event Analysis of Communication Networks, Ph.D. Thesis, Tinbergen Institute Research Series, Vrije Universiteit, Amsterdam, 1996.

[MM] Marvin Marcus and Henryk Minc, A Survey of Matrix Theory and Matrix Inequalities, Dover, NY, 1992.

[MW] W. Massey and W. Whitt, "Networks of infinite-server queues with nonstationary Poisson input," Queueing Systems 13 pp. 183–250, 1993.

[MRW] Debasis Mitra, Martin I. Reiman, and Jie Wang, "Robust dynamic admission control for unified cell and call QoS in statistical multiplexers," IEEE J. Sel. Areas Commun., June 1998.

[No] I. Norros, "A storage model with self-similar input," QUESTA 16 pp. 387–396, 1994.

[PM] M. Parulekar and A.M. Makowski, "Tail probabilities for a multiplexer driven by M/G/∞ input processes (I): preliminary asymptotics," Queueing Systems – Theory & Applications, in press, 1998.

[RS] Martin I. Reiman and Adam Shwartz, "Call admission: a new approach to quality of service," Report, Bell Laboratories, and CC Pub. 216, Technion, 1997.

[RE] Bong K. Ryu and Anwar Elwalid, "The importance of long-range dependence of VBR video traffic in ATM traffic engineering: myths and realities," Proc. ACM SIGCOMM '96, in Computer Comm. Review 26 no. 4, pp. 3–14, October 1996.

[SW93] Adam Shwartz and Alan Weiss, "Induced rare events: analysis via large deviations and time reversal," Journal of Applied Prob. 25 pp. 667–689, 1993.

[SW95] Adam Shwartz and Alan Weiss, Large Deviations for Performance Analysis: Queues, Communication and Computing, Chapman and Hall, 1995.

[SW98] Adam Shwartz and Alan Weiss, "Multiple time scales in Markovian ATM models II. Proofs," in preparation.

[WTE] Walter Willinger, Murad Taqqu, and Ashok Erramilli, "A bibliographical guide to self-similar traffic and performance modeling for modern high-speed networks," in Stochastic Networks: Theory and Applications, edited by F.P. Kelly, S. Zachary, and I. Ziedins, Royal Statistical Society Lecture Note Series 4, Clarendon Press, Oxford, pp. 339–366, 1996.

[WTLW] Walter Willinger, Murad Taqqu, W. Leland and D. Wilson, "Self-similarity in high-speed packet traffic: analysis and modeling of Ethernet traffic measurements," Stat. Sci. 10 pp. 67–85, 1995.