Asymptotics of first passage times for random walk in an orthant
D. R. McDonald
February 1998

Key words and phrases: Random walk, rare events, change of measure.
AMS 1980 subject classifications: Primary 60K25; Secondary 60K20.
Abstract
We wish to describe how a chosen node in a network of queues overloads. The overloaded node may also drive other nodes into overload, but the remaining "super" stable nodes are only driven into a new steady state with stochastically larger queues. We model this network of queues as a Markov additive chain with a boundary. The customers at the "super" stable nodes are described by a Markov chain, while the other nodes are described by an additive chain. We use the existence of a harmonic function $h$ for a Markov additive chain provided by Nummelin and Ney (1987) and the asymptotic theory for Markov additive processes to prove asymptotic results on the mean time for a specified additive component to hit a high level $\ell$. We give the limiting distribution of the "super" stable nodes at this hitting time. We also give the steady state distribution of the "super" stable nodes when the specified component equals $\ell$. The emphasis here is on sharp asymptotics, not rough asymptotics as in large deviation theory. Moreover, the limiting distributions are for the unscaled process, not for the fluid limit as in large deviation theory.
Research supported in part by NSERC grant A4551
1 Introduction

The following introduction describes the model and states the main theorems. It is highly recommended that the Flatto-Hahn-Wright example in Section 3.1 be read in parallel with this introduction. Unless a proof immediately follows the statement of a result, it is deferred to Subsection 2.3. The symbol for a Markov chain $W$ may be twisted, in which case it will be written in calligraphic letters as $\mathcal W$. The chain $W$ may be transformed into a free process, in which case it will be denoted by $W^\infty$. The time reversal will be denoted by $\overleftarrow W$. We will use this convention for a variety of symbols throughout the paper.
1.1 Definitions
Many large stochastic systems are modelled as a Harris recurrent (irreducible) Markov chain $W$ with a state space $(S, \mathcal S)$, a stationary probability distribution $\pi$ and kernel $K$. Let $L := K - I$ be the discrete generator. We are often interested in the measure $\pi(F)$ of some rare, forbidden set $F \in \mathcal S$ where $\pi(F) > 0$. Alternatively we may consider the expected value of the time $T_F$ until the chain reaches $F$. We may also be interested in the hitting distribution on $F$. The problem of finding the mean time $m(x)$ for the chain to hit $F$ starting from any initial point $x$ means solving the linear system

(1.1) $$Lm(x) = -1 \ \text{for}\ x \in B := F^c \quad\text{and}\quad m(x) = 0 \ \text{for}\ x \in F.$$

The problem of finding $e_A(x)$, the probability of hitting $F$ in $A \subseteq F$ (where $A \in \mathcal S$) starting from $x$, means solving the problem

(1.2) $$Le_A(x) = 0 \ \text{for}\ x \in B, \quad e_A(x) = 1 \ \text{for}\ x \in A \quad\text{and}\quad e_A(x) = 0 \ \text{for}\ x \in F \setminus A.$$
If the space is high dimensional then both of these problems may become numerically intractable. Worse, if $\pi(F)$ is small then they can't even be solved by simulation, because it will take so long to simulate the event of interest; that is, an entrance into $F$! Here we consider a sequence of forbidden sets $F_\ell \equiv F$ such that $\pi(F) \to 0$. We address the problem of how to give the exact asymptotics of the expected value of the time $T_F$ to reach the forbidden set $F$. Our second problem is to find the limiting hitting distribution on $F$, and the third problem is to give an asymptotic expression for $\pi$ on $F$. The main results in this paper, Theorems 1.10, 1.13 and 1.6, give answers to these three problems for Markov chains which may be viewed as a Markov additive chain with a boundary.

Given the kernel $K$ we can always construct the kernel $K' := (K + I)/2$. Note that this new kernel has null transitions with probability $1/2$. Both $K$ and $K'$ have stationary distribution $\pi$. Moreover, let $L'$ be the discrete generator associated with $K'$ and let $m'$ and $e'_A$ be the solutions of the systems (1.1) and (1.2) with $L'$ replacing $L$. Clearly $m'(x) = 2m(x)$ and $e'_A(x) = e_A(x)$. Consequently, by adjusting the time scale, we may as well assume our Markov chain $W$ has null transitions with probability $1/2$.

We note that our results apply to continuous time Markov jump processes on $(S, \mathcal S)$ with bounded generators $G$ (see Iscoe and McDonald (1994) for definitions). We may construct a Markov chain $W$ on $(S, \mathcal S)$ with discrete generator $L := G/q$ where $q$ is the event rate of the generator $G$. $W$ has the same stationary probability distribution $\pi$ as the jump process. Moreover, the hitting distribution on $F$ by $W$ is the same as that of the original jump process. Finally, the time until the jump process hits $F$ is the same as the time for the uniformization of the Markov chain $W$ to hit $F$. By calculation, the expected time for the uniformized chain to hit $F$ is $m(x)/q$. If we assume the time scale is such that $q = 1$ then the results for Markov chains immediately imply corresponding results for Markov jump processes.

Consider a Markov additive chain $W^\infty \equiv (\tilde W^\infty, \hat W^\infty)$ defined on a probability space $(\Omega, \mathcal F, P)$.
$W^\infty$ is a Markov chain with kernel $K^\infty$ on the measurable space $(S^\infty, \mathcal S^\infty) := (R^r \times \hat S, \mathcal B^r \times \hat{\mathcal S})$, where $R$ and $\mathcal B$ respectively denote the integers and subsets of the integers in the discrete time case (in the discrete case the integers are also denoted by $\mathbb Z$), or the reals and the Borel sets in the continuous time case (in the continuous case the reals are also denoted by $\mathbb R$). The Markovian component $\hat W^\infty$ is a $\psi$-recurrent (see Meyn and Tweedie (1993) for the definition), aperiodic Markov chain with state space $(\hat S, \hat{\mathcal S})$. The additive component $\tilde W^\infty$ can be viewed as a multidimensional random walk taking values in $(R^r, \mathcal B^r)$.

Decompose a point $\vec x \in S^\infty$ as $\vec x = (\tilde x, \hat x) \equiv (x_1, x_2, \dots, x_r, \hat x)$ where $\tilde x \in R^r$, $\hat x \in \hat S$. If $\Gamma \in \mathcal B^r$ and $\hat A \in \hat{\mathcal S}$ then the kernel of $W^\infty$ satisfies
$$K^\infty(\vec x, \Gamma \times \hat A) = P(\hat W^\infty[1] \in \hat A,\ \tilde x + \tilde W^\infty[1] - \tilde W^\infty[0] \in \Gamma \mid \hat W^\infty[0] = \hat x).$$
Let $L^\infty$ denote the generator of $W^\infty$. As above, $W^\infty$ decomposes into components $(\tilde W^\infty, \hat W^\infty)$. Let $m$ denote the counting measure in the discrete case and Lebesgue measure in the continuous case. Define the additive increment associated with the jump from state $\hat W^\infty[n-1]$ to $\hat W^\infty[n]$ to be $X^\infty[n] := \tilde W^\infty[n] - \tilde W^\infty[n-1]$. Hence, if $\tilde W^\infty[0] = X^\infty[0]$, then $\tilde W^\infty[n] = \sum_{i=0}^n X^\infty[i]$ describes the sum of the increments associated with getting to state $\hat W^\infty[n]$ in $n$ steps. The components of the increment $X^\infty$ may of course be negative or zero.

Now consider a Markov chain $W$ with probability transition kernel $K$, stationary probability measure $\pi$ and state space $S \subseteq S^\infty$. We assume the existence of a boundary $\mathcal N \subseteq S^\infty$ such that the probability transition kernel $K$ of the Markov chain $W$ in $S$ has transitions which agree with those of $K^\infty$ within the interior of $S$; that is, $K(\vec x, C) = K^\infty(\vec x, C)$ if $\vec x \in S^\infty \setminus \mathcal N$ and $C \subseteq S^\infty \setminus \mathcal N$. Let the edge of the boundary $\mathcal N$ be denoted by $\Delta \equiv \mathcal N \cap S$.

In many cases $S$ will be the measurable product space of the positive orthant times $\hat S$ given by $(S, \mathcal S) := ((R^+)^r \times \hat S, (\mathcal B^+)^r \times \hat{\mathcal S})$, where $R^+$, respectively $\mathcal B^+$, denotes the nonnegative integers (also denoted by $\mathbb N_0$), respectively subsets of the nonnegative integers, in the discrete case, and the nonnegative reals (also denoted by $\mathbb R^+$), respectively the Borel sets on the nonnegative reals, in the continuous case.

Decompose $W$ into components $(\tilde W, \hat W)$ where $\hat W \in \hat S$. We shall systematically use an upper index of $\infty$ to indicate kernels (like $K^\infty$) or chains (like $W^\infty$) which are free in the sense that the additive components are unconstrained in $R^r$. Let the origin $D$ be such that $\pi(D) > 0$ and such that $D \subseteq \Delta$. For any set $C \in \mathcal S$ let $\pi_C(\cdot) := \pi(\cdot \cap C)/\pi(C)$ and let $E_C := \int_C \pi_C(d\vec z) E_{\vec z}$. Of course $E_{\vec z}$ will always represent the expectation operator for a chain started at state $\vec z$.

In some examples there is a natural extension of this theory. It is not necessary that $\tilde W^\infty$ be an additive process. Instead the kernel $K^\infty$ could have the following decomposition:
$$K^\infty((z_1, z_2, \dots, z_r, \hat z); (z_1 + dx_1, dx_2, \dots, dx_r, d\hat x)) = \hat K^\infty(\hat z, d\hat x)\, k^1(dx_1 \mid \hat z, \hat x)\, \tilde k((z_2, \dots, z_r), (dx_2, \dots, dx_r) \mid \hat z, \hat x, z_1, x_1),$$
where $\tilde k$ is a probability transition kernel conditioned on the transitions of $\hat W^\infty$. Hence, given the transition of $\hat W^\infty$, the components $W_2^\infty, \dots, W_r^\infty$ make a Markovian transition dependent on the additive first component. In this extension there is no general methodology for finding the harmonic function required in Section 1.2; however, if Conditions (1-7) given below hold, then we can redo the proofs of Theorems 1.6, 1.13 and 1.10. Only Theorem 2.10 requires an extra hypothesis.
1.2 Hypotheses
Suppose there exists a positive function $h$ such that $h(\vec x) = \int_{\vec y \in S^\infty} K^\infty(\vec x, d\vec y) h(\vec y)$. Hence $h$ is harmonic for $K^\infty$. Perform the $h$-transform to construct the twisted kernel $\mathcal K^\infty(\vec x, d\vec y) := K^\infty(\vec x, d\vec y) h(\vec y)/h(\vec x)$. Let the generator of the associated twisted chain $\mathcal W^\infty$ be $\mathcal L^\infty$. In Section 2.1 we use the theory in Nummelin and Ney (1987a) to construct a harmonic function for $K^\infty$ of the form $h(\vec x) = \exp(\alpha x_1)\hat a(\hat x)$. We shall systematically denote any object which has been twisted with calligraphic lettering.

Decompose $\mathcal W^\infty$ as a Markov additive chain $(\tilde{\mathcal W}^\infty, \hat{\mathcal W}^\infty)$. The Markov chain $\hat{\mathcal W}^\infty$ has state space $\hat S$. The vector $\tilde{\mathcal W}^\infty[n]$ can be viewed as a multidimensional additive component taking values in $R^r$. Define the increment associated with the jump from state $\hat{\mathcal W}^\infty[n-1]$ to $\hat{\mathcal W}^\infty[n]$ to be $\mathcal X^\infty[n] := \tilde{\mathcal W}^\infty[n] - \tilde{\mathcal W}^\infty[n-1]$. Hence, if $\tilde{\mathcal W}^\infty[0] = \mathcal X^\infty[0]$, then $\tilde{\mathcal W}^\infty[n] = \sum_{i=0}^n \mathcal X^\infty[i]$ describes the sum of the increments to get to state $\hat{\mathcal W}^\infty[n]$ in $n$ steps. The pairs $(\tilde{\mathcal W}^\infty[n], \hat{\mathcal W}^\infty[n])_{n=0}^\infty$ define a Markov chain on $S^\infty := R^r \times \hat S$. If $\Gamma \in \mathcal B^r$ and $\hat A \in \hat{\mathcal S}$ then the kernel of $(\tilde{\mathcal W}^\infty[n], \hat{\mathcal W}^\infty[n])$ satisfies
$$\mathcal K^\infty(\vec x, \Gamma \times \hat A) = P(\hat{\mathcal W}^\infty[1] \in \hat A,\ \tilde x + \tilde{\mathcal W}^\infty[1] - \tilde{\mathcal W}^\infty[0] \in \Gamma \mid \hat{\mathcal W}^\infty[0] = \hat x).$$
The associated twisted generator applied to a bounded measurable function $g$ satisfies
$$\mathcal L^\infty g(\vec x) := \int_{\vec y \in S^\infty} [g(\vec y) - g(\vec x)]\, K^\infty(\vec x, d\vec y) \frac{h(\vec y)}{h(\vec x)} = \int_{\vec y \in S^\infty} \left[\frac{h(\vec y)}{h(\vec x)} g(\vec y) - g(\vec x)\right] K^\infty(\vec x, d\vec y) := \frac{1}{h(\vec x)} L^\infty(h g)(\vec x) \quad (1.3)$$

for all $\vec x \in S^\infty$. We recall that, without loss of generality, we may assume $K$ has a null transition with probability $1/2$. Hence $\hat K^\infty$ and $\hat{\mathcal K}^\infty$, the kernels of the chains $\hat W^\infty$ and $\hat{\mathcal W}^\infty$, are aperiodic. We impose the following conditions:

(1) $\hat{\mathcal W}^\infty$ is a $\psi$-recurrent Markov chain with a stationary probability measure $\varphi$ (see Meyn and Tweedie (1993) for definitions).

(2) The Markov additive chain $(\tilde{\mathcal W}^\infty[n], \hat{\mathcal W}^\infty[n])$ satisfies Conditions (M1) and (N1) in Section 2.1. In the discrete case $\tilde{\mathcal W}^\infty$ is aperiodic as defined in Condition (P1), while in the continuous case $\tilde{\mathcal W}^\infty$ is spread-out as defined in Condition (P2).

(3) $\tilde d_1 > 0$ where
$$\tilde d \equiv \int_{\hat z} \varphi(d\hat z)\, \mathcal E_{(\vec 0, \hat z)} \mathcal X^\infty[1] = \int_{\hat z} \varphi(d\hat z) \int_{\hat w} \int_{\tilde u} \tilde u\, \mathcal K^\infty((\vec 0, \hat z), d\tilde u \times d\hat w).$$
(If $S = (R^+)^r \times \hat S$ then we need $\tilde d > 0$ componentwise to satisfy Condition (4).)

(4) There exists a set $C \subseteq \Delta$ such that $\pi(C) > 0$ and such that, for $\vec z \in C$, $\mathcal P_{\vec z}(\mathcal T_{\mathcal N}^\infty = \infty) > 0$, where $\mathcal T_{\mathcal N}^\infty$ is the first return time to $\mathcal N$ by $\mathcal W^\infty$.

(5) $\int_{\hat S} \hat a^{-1}(\hat z)\, \varphi(d\hat z) < \infty$.

(6) The measure $\nu(d\vec x) := \int_{\vec z \in \Delta} \pi(d\vec z) K(\vec z, d\vec x) 1\{\vec x \in \Delta^c\} h(\vec x)$ is finite.

(7) If $\hat a^{-1}$ is unbounded we require that the measure $\hat\nu(d\hat x) := \int_{\tilde x} \nu(d\tilde x, d\hat x)$ is $\hat a^{-1}$-regular for the chain $\hat{\mathcal W}^\infty$ as defined in Chapter 14.1 in Meyn and Tweedie (1993).

Let $f(\vec z) := 1_\Delta(\vec z) \int_{\vec x \in \Delta^c} K(\vec z, d\vec x) h(\vec x)$. To verify Condition (6) we must verify $\int_{\vec z} f(\vec z) \pi(d\vec z) < \infty$. This can be done using Lyapounov functions (see Meyn and Tweedie (1993), Theorem 14.3.7).

Lemma 1.1 Suppose there exist positive functions $V$ and $s$ on $S$ such that

(1.4) $$LV(\vec y) \equiv KV(\vec y) - V(\vec y) \le -f(\vec y) + s(\vec y).$$

Then $\int_{\vec z} f(\vec z)\, \pi(d\vec z) \le \int_{\vec x} s(\vec x)\, \pi(d\vec x)$.
Typically $s(\vec x) = b\, 1_C(\vec x)$ where $b$ is a positive constant and $C \in \mathcal S$. In many examples $1\{\vec z \in \Delta\} K(\vec z, d\vec x) 1\{\vec x \in \Delta^c\} = 1\{\vec z \in \Delta\} K^\infty(\vec z, d\vec x) 1\{\vec x \in \Delta^c\}$. In this case the measure $\nu$ becomes $\nu(d\vec x) = \int_\Delta \pi(d\vec z) h(\vec z) \mathcal K^\infty(\vec z, d\vec x) 1\{\vec x \in \Delta^c\}$, so to check Condition (6) it suffices to verify that $\int_\Delta \pi(d\vec z) h(\vec z) < \infty$ (using Lemma 1.1).

To check that $\hat\nu(d\hat x)$ is $\hat a^{-1}$-regular for the chain $\hat{\mathcal W}^\infty$ we need to find a petite set $\hat C \in \hat{\mathcal S}$ such that

(1.5) $$\hat E_{\hat\nu}\Big[\sum_{n=1}^{\tau_{\hat C}} \hat a^{-1}(\hat{\mathcal W}^\infty[n])\Big] < \infty.$$

The most practical method to verify that the sum (1.5) is finite is to check condition (V3) given in Meyn and Tweedie (1993). It is sufficient to find a constant $b$ and an extended-real valued function $\hat V : \hat S \to [0, \infty]$ such that
$$\int \hat{\mathcal K}^\infty(\hat y, d\hat z) \hat V(\hat z) - \hat V(\hat y) \le -\hat a^{-1}(\hat y) + b\, 1_{\hat C}(\hat y)$$
and such that $\int \hat V(\hat x)\, \hat\nu(d\hat x) < \infty$. A natural candidate for $\hat V$ is a multiple of $\hat a^{-1}$. Just note that
$$\int_{\hat z} \hat{\mathcal K}^\infty(\hat y, d\hat z)\, \hat a^{-1}(\hat z) = \int_{\tilde z} K^\infty((\vec 0, \hat y), (d\tilde z, d\hat z))\, e^{\alpha z_1} \frac{\hat a(\hat z)}{\hat a(\hat y)}\, \hat a^{-1}(\hat z) = \left[\int_{\tilde z} K^\infty((\vec 0, \hat y), (d\tilde z, d\hat z))\, e^{\alpha z_1}\right] \hat a^{-1}(\hat y).$$
If it can be shown that $\int_{\tilde z} K^\infty((\vec 0, \hat y), (d\tilde z, d\hat z))\, e^{\alpha z_1} < \infty$ uniformly outside a petite set $\hat C$, and that it is uniformly bounded in $\hat y$ on $\hat C$, then we can take $\hat V$ to be some multiple of $\hat a^{-1}$.
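The mechanics of the $h$-transform are easy to see in the simplest free case, $r = 1$ with trivial Markovian component. The following sketch (an illustration, not the paper's model; all numerical values are assumptions) twists the nearest-neighbour walk on the integers with up/down/null probabilities $p, q, r$ by the harmonic function $h(x) = (q/p)^x$, and checks numerically that the twisted kernel is stochastic and that the drift reverses sign — the mechanism used in Section 1.2 to make the first component transient to $+\infty$:

```python
# Illustration only: h-transform of the free walk on the integers with
# steps +1, -1, 0 taken with probabilities p, q, r.  h(x) = (q/p)^x is
# harmonic for this kernel, and twisting by h swaps the up/down
# probabilities, reversing the drift.
p, q, r = 0.3, 0.5, 0.2          # stable walk: drift p - q < 0 (assumed values)
a = q / p                        # h(x) = a^x solves  p*a + q/a + r = 1

def h(x):
    return a ** x

def twisted_step_probs(x):
    """Twisted kernel K(x, y) h(y) / h(x) for y = x+1, x-1, x."""
    return {x + 1: p * h(x + 1) / h(x),
            x - 1: q * h(x - 1) / h(x),
            x:     r}

# h is harmonic: sum_y K(x, y) h(y) = h(x)
assert abs(p * h(1) + q * h(-1) + r * h(0) - h(0)) < 1e-12

probs = twisted_step_probs(0)
assert abs(sum(probs.values()) - 1.0) < 1e-12   # twisted kernel is stochastic
drift = sum((y - 0) * pr for y, pr in probs.items())
print(drift)                     # twisted drift = q - p > 0
```

The twisted up-probability is $p \cdot a = q$ and the twisted down-probability is $q/a = p$, so the twisted walk is the original walk with its drift reflected, exactly as in the associated random walk of Feller, Section XII.4.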
1.3 Stochastic networks
The main application of our work is the estimation of rare event probabilities for stochastic networks. The Flatto-Hahn-Wright example in Section 3.1 is a special case of the networks considered below. Consider a network with $r + m$ nodes which is modeled as a Markov jump process with a state space $S \equiv \mathbb N_0^r \times \mathbb N_0^m$, where $\mathbb N_0$ denotes the nonnegative integers. If $\sigma = \{i_1, i_2, \dots, i_b\}$, we say $\vec x$ is on the boundary $S_\sigma$ if $x_i = 0$ for $i \in \sigma$ but $x_i > 0$ for $i \notin \sigma$. We assume the jump process modeling the stochastic network is the uniformization of a (homogeneous) nearest neighbour random walk $W$ in the orthant $\mathbb N_0^{r+m}$ having transition kernel $K$:
$$K(\vec x, \vec y) = \begin{cases} J(\vec y - \vec x) & \text{if } \vec x \notin S_\sigma \text{ for any } \sigma \ne \emptyset \\ J_\sigma(\vec y - \vec x) & \text{if } \vec x \in S_\sigma. \end{cases}$$
Here $J(\vec x)$ gives the nearest neighbour jump probability in the direction $\vec x$ in $N = \{(x_1, x_2, \dots, x_{r+m}) : x_i \in \{-1, 0, 1\}\}$. On a boundary $S_\sigma$ we can only allow transitions in the directions $N_\sigma = \{(z_1, z_2, \dots, z_{r+m}) : z_i \in \{-1, 0, 1\},\ i \in \sigma^c;\ z_i \in \{0, 1\},\ i \in \sigma\}$. So $J_\sigma$ is of the form
$$J_\sigma(\vec z) = \begin{cases} J(\vec z) & \text{if } \vec z \in N_\sigma \text{ and } \vec z \ne \vec 0 \\ 1 - \sum_{\vec s \in N_\sigma \setminus \{\vec 0\}} J(\vec s) & \text{if } \vec z = \vec 0. \end{cases}$$

We assume the kernel $K$ is associated with an irreducible Markov chain $W$ with a stationary probability distribution $\pi$. Let $D = \{(0, 0, \dots, 0)\}$. We are interested in the rare event when the first node overloads; that is, when the first coordinate of $W$ exceeds a level $\ell$. When the first node overloads, other nodes may remain stable even though they are subject to higher loads. The coordinates corresponding to these "super" stable nodes are assumed to be coordinates $r + 1$ through $r + m$. Unfortunately, when the first node overloads it may drive other nodes into overload. We assume these nodes correspond to coordinates $2$ through $r$.

We look for a harmonic function of the form $h(\vec x) := e^{\alpha x_1} \hat a(\hat x)$ since, in addition to twisting the first component to become transient, we must judiciously twist only those other components which remain recurrent after twisting. Furthermore, the twist must make nodes $1$ through $r$ transient to plus infinity. To find $h$ we take $\mathcal N = \{\vec x : x_i \le 0,\ i = 1, \dots, r;\ x_i \ge 0,\ i = r + 1, \dots, r + m\}$, thus removing the boundaries for the first $r$ coordinates. Hence $S^\infty = \mathbb Z^r \times \mathbb N_0^m$, where $\mathbb Z$ represents the integers. If $\sigma \subseteq \{r + 1, \dots, r + m\}$ and $x_i = 0$ for $i \in \sigma$ but $x_i > 0$ for $i \in \{r + 1, \dots, r + m\} \setminus \sigma$, then $\vec x \in S^\infty_\sigma$. Extend $K$ to $K^\infty$ on $S^\infty$ by defining
$$K^\infty(\vec x, \vec y) = \begin{cases} J(\vec y - \vec x) & \text{if } \vec x \notin S^\infty_\sigma \text{ for any } \sigma \subseteq \{r+1, \dots, r+m\} \\ J_{\sigma \cap \{r+1,\dots,r+m\}}(\vec y - \vec x) & \text{if } \vec x \in S^\infty_{\sigma \cap \{r+1,\dots,r+m\}}. \end{cases}$$

Finding a harmonic function $h$ for $K^\infty$ of the form $\exp(\alpha x_1)\hat a(\hat x)$ is possible in great generality, as is seen in Section 2.1. However, since $K^\infty$ is the transition kernel of a nearest neighbour walk with steps outside the boundary replaced with null transitions, it is often possible to find $h$ in exponential form. For $\vec x = (x_1, \dots, x_{r+m}) \in S^\infty$, define
$$h(\vec x) := a^{\vec x} \equiv a_1^{x_1} a_2^{x_2} \cdots a_{r+m}^{x_{r+m}},$$
where $a = (a_1, a_2, \dots, a_{r+m})$ is a vector of positive constants such that $a_2 = \cdots = a_r = 1$. If $h$ is harmonic at $\vec x \in \mathrm{int}(S^\infty)$ then

(1.6) $$\sum_{\vec z \in N} a^{\vec z} J(\vec z) = 1,$$

where $N$ is the set of nearest neighbours to the origin.
Similarly, for $\vec y \in S^\infty_\sigma$, harmonicity yields the constraint

(1.7) $$\sum_{\vec z \in N_\sigma} a^{\vec z} J_\sigma(\vec z) = 1.$$

In general this means $2^m$ constraints! Of course there is always the solution $a_i = 1$ for all $i$, but in general another positive solution may not exist. Fortunately solutions do exist in many interesting cases! Suppose, for example, we further assume that $J(\vec z) = 0$ when at least two coordinates $z_i$ and $z_j$ are $-1$ among $i \ge r + 1$, $j \ge r + 1$. In this case satisfying the constraint (1.7) gives a condition on each boundary $S_{\{k\}}$, $k \ge r + 1$:

(1.8) $$\sum_{\vec z \in N_{\{k\}}} a^{\vec z} J(\vec z) = \sum_{\vec z \in N_{\{k\}}} J(\vec z).$$

Subtracting this from (1.6) gives

(1.9) $$\sum_{\vec z \in D_{\{k\}}} a_k^{-1} \prod_{i \ne k} a_i^{z_i}\, J(\vec z) = \sum_{\vec z \in D_{\{k\}}} J(\vec z), \quad\text{where } D_{\{k\}} = N_{\{k\}}^c.$$

The $m$ constraints given by (1.9) plus the constraint (1.6) are equivalent to the $m$ constraints given by (1.8) plus the constraint (1.6). Now subtract the general constraint given at (1.7) from (1.6). Hence

(1.10) $$\sum_{\vec z \in D_\sigma} a^{\vec z} J(\vec z) = \sum_{\vec z \in D_\sigma} J(\vec z),$$

where $D_\sigma = N_\sigma^c$. However, since we are assuming at most one negative jump among the coordinates $\{r + 1, \dots, r + m\}$, it follows that
$$\sum_{\vec z \in D_\sigma} a^{\vec z} J(\vec z) = \sum_{k \in \sigma} \sum_{\vec z \in D_{\{k\}}} a^{\vec z} J(\vec z) \quad\text{and}\quad \sum_{\vec z \in D_\sigma} J(\vec z) = \sum_{k \in \sigma} \sum_{\vec z \in D_{\{k\}}} J(\vec z).$$
This means the constraint (1.10) can be obtained by summing the constraints (1.9) over $k \in \sigma$. Consequently all the constraints are equivalent to the $m$ constraints given at (1.9) plus (1.6); that is, $m + 1$ constraints in all. Consequently there is at least one solution. Of course the solution $h$ must produce a twisted process such that $\tilde{\mathcal W}_1^\infty$ drifts to plus infinity while $\hat{\mathcal W}^\infty$ must be a stable Markov chain! If this fails then we must try again by twisting another set of coordinates; that is, we must redefine the super stable nodes.
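The constraint system above can be solved by hand in small examples. The following sketch (an illustration, not one of the paper's worked examples; the jump rates are assumed values) takes the simplest case $r = m = 1$, a two-node tandem queue, and verifies numerically that the candidate twist satisfies both the interior constraint (1.6) and the boundary constraint (1.9):

```python
# Illustration: harmonicity constraints (1.6) and (1.9) for a two-node
# tandem queue, r = m = 1.  Jumps of the free walk:
#   (+1, 0) w.p. lam  (arrival at node 1),
#   (-1,+1) w.p. mu1  (service at node 1, routed to node 2),
#   ( 0,-1) w.p. mu2  (service at node 2);
# the remaining mass is a null transition.  (Rates are assumed values.)
lam, mu1, mu2 = 0.2, 0.4, 0.3
J = {(1, 0): lam, (-1, 1): mu1, (0, -1): mu2}

def interior_constraint(a1, a2):
    """Left side of (1.6), including the null-transition mass."""
    s = sum(p * a1**z1 * a2**z2 for (z1, z2), p in J.items())
    return s + (1 - sum(J.values()))

def boundary_constraint(a1, a2):
    """Both sides of (1.9) for k = 2: jumps with z2 = -1."""
    lhs = sum(p * a1**z1 * a2**z2 for (z1, z2), p in J.items() if z2 == -1)
    rhs = sum(p for (z1, z2), p in J.items() if z2 == -1)
    return lhs, rhs

# Here (1.9) forces a2 = 1; then (1.6) reduces to lam*a1 + mu1/a1 = lam + mu1,
# whose nontrivial positive root is a1 = mu1/lam.
a1, a2 = mu1 / lam, 1.0
assert abs(interior_constraint(a1, a2) - 1.0) < 1e-12
lhs, rhs = boundary_constraint(a1, a2)
assert abs(lhs - rhs) < 1e-12
print(a1)   # the twist h(x) = (mu1/lam)**x1 makes node 1 transient upward
```

In this tandem case the second node needs no twist at all ($a_2 = 1$): overloading the first node does not destabilize the second, so it is a "super" stable node in the sense above.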
1.4 Steady State
Recall that $\mathcal T_\ell^\infty = \min\{n \ge 1 : \tilde{\mathcal W}_1^\infty[n] \ge \ell\}$ is $\mathcal W^\infty$'s first hitting time at $F := \{\vec x \in S^\infty : x_1 \ge \ell\}$, and we denote the hitting time at $\mathcal N$ by $\mathcal T_{\mathcal N}^\infty$. Let $\mathcal R^\infty[\ell] = \tilde{\mathcal W}_1^\infty[\mathcal T_\ell^\infty] - \ell$ denote the excess beyond $\ell$ of the ladder height. The following is shown in Section 2.2.

Lemma 1.2 Under the Conditions (1-6),
$$\mathcal P_{\vec z}\big(\hat{\mathcal W}^\infty[\mathcal T_\ell^\infty] \in d\hat y,\ \mathcal R^\infty[\ell] \in du,\ \mathcal T_\ell^\infty < \mathcal T_{\mathcal N}^\infty\big) \to H(\vec z)\,\mu(du, d\hat y)$$
in total variation, where $H(\vec z) \equiv \mathcal P_{\vec z}(\mathcal T_{\mathcal N}^\infty = \infty)$. This means that, given $\mathcal W^\infty$ hits $F$, $\mu$ is the limiting hitting distribution of $(\mathcal R^\infty, \hat{\mathcal W}^\infty)$.

It is not necessary to impose the spread-out condition in Condition (2).

Lemma 1.3 Let $f : [0, \infty) \times \hat S \to R$ be a bounded measurable function such that $f(u, \hat y)$ is continuous (or piecewise continuous) in $u$. Then, in the continuous case under the Conditions (1-6) but without assuming the spread-out Condition,
$$\mathcal E_{\vec z}\big[f(\mathcal R^\infty[\ell], \hat{\mathcal W}^\infty[\mathcal T_\ell^\infty])\, 1\{\mathcal T_\ell^\infty < \mathcal T_{\mathcal N}^\infty\}\big] \to H(\vec z) \int\!\!\int f(u, \hat y)\,\mu(du, d\hat y).$$
The proof follows by Theorem 3.1 in Athreya, McDonald and Ney (1978). In Theorems 1.5, 1.6 and 1.7 we will need the above convergence for functions which are continuous or piecewise continuous in the (first) additive component. Consequently these theorems could be proved without the spread-out condition. On the other hand, the convergence in total variation in Theorem 1.13 will fail without the mixing condition (P2). We assume the stronger mixing condition for simplicity.

The uniform integrability afforded by Condition (7) gives

Lemma 1.4 If Conditions (1-7) hold then
$$\lim_{\ell \to \infty} \int_{\vec x} \nu(d\vec x)\, \mathcal E_{\vec x}\big[1\{\mathcal T_\ell^\infty < \mathcal T_{\mathcal N}^\infty\} \exp(-\alpha \mathcal R^\infty[\ell])\, \hat a^{-1}(\hat{\mathcal W}^\infty[\mathcal T_\ell^\infty])\big] = \int_{\vec x} \nu(d\vec x) H(\vec x) \int_{\hat y} \int_{u \ge 0} \exp(-\alpha u)\, \hat a^{-1}(\hat y)\, \mu(du, d\hat y).$$
It is well known (see (7.2) in Theorem 7.2 in Orey (1971) or Theorem 10.0.1 in Meyn and Tweedie (1993)) that we can represent the probability of $A \subseteq F$ as the expected number of visits to $A$ before returning to $\Delta$:
$$\pi(A) = \int_\Delta \pi(d\vec z)\, E_{\vec z}\left(\sum_{n=1}^{T_\Delta} 1_A(W[n])\right), \quad\text{where } T_\Delta \text{ is the first time } W \text{ hits } \Delta.$$
We use this representation of $\pi(A)$ to prove the following theorem:

Theorem 1.5 Consider a set $A$ in $\mathcal B \times \hat{\mathcal S}$ (hence measurable with respect to $x_1$ and $\hat x$) such that $m_A^\infty(u, \hat y) := E_{(u, 0, \dots, 0, \hat y)}\left(\sum_{n=0}^\infty 1_A(W^\infty[n])\right)$ is uniformly bounded in $u$ and $\hat y$. Assuming Conditions (1-7) we have
$$\int_A \pi((\ell + dx_1) \times R^{r-1} \times d\hat x) \sim e^{-\alpha\ell} f \int_{\hat w} \int_{u \ge 0} \hat a^{-1}(\hat w)\, e^{-\alpha u}\, m_A^\infty(u, \hat w)\, \mu(du, d\hat w),$$
where
$$f := \int_{\vec z \in \Delta} \pi(d\vec z) \int_{\vec x \notin \Delta} K(\vec z, d\vec x)\, h(\vec x) H(\vec x).$$

We see the asymptotics of $\pi((\ell + \Gamma) \times R^{r-1} \times \hat A)$ is given by $\exp(-\alpha\ell)$ as long as the right hand side of the above expression is strictly positive. This is the case as long as $H(\vec x)$ is strictly positive, and this follows from Condition (4). Remark that if $\hat S$ is countable and $A = \{0 \le x_1 \le L,\ \hat x = \hat p\}$ then $m_A^\infty(u, \hat y)$ is maximal at $u = L$ and $\hat y = \hat p$. For general state spaces, if $A \subseteq [0, L] \times R^{r-1} \times C_{k_0}$ for some $k_0$ and $L < \infty$, where
$$C_{k_0} = \{\vec x = (\vec 0, \hat x) \in S^\infty : \mathcal P_{\vec x}(\tilde{\mathcal W}_1^\infty[m] \ge m k_0^{-1} \text{ for all } m \ge k_0) \ge 1/4\},$$
then $A$ is directly Riemann integrable as defined in Kesten (1974) and, by Lemma 6 in Kesten (1974), $m_A^\infty(u, \hat y)$ is uniformly bounded above.

The asymptotic expression in Theorem 1.5 can be reevaluated to give:

Theorem 1.6 (Steady State) Assume Conditions (1-7) and assume that $A$ is as in Theorem 1.5. Then
$$\int_A \pi((\ell + dx_1) \times R^{r-1} \times d\hat x) \sim e^{-\alpha\ell}\, \frac{f}{\tilde d_1} \int_{\hat x} \int_{x_1 \ge 0} 1_A(x_1, \hat x)\, \hat a^{-1}(\hat x)\, \varphi(d\hat x)\, \exp(-\alpha x_1)\, m(dx_1),$$
where $\varphi$ is the stationary distribution of $\hat{\mathcal W}^\infty$ given by Condition (1). This means that, for large $\ell$, the stationary measure $\pi((\ell + dx_1) \times R^{r-1} \times d\hat x)$ is a constant times $\exp(-\alpha\ell)$ times a product of the measures $\hat a^{-1}(\hat x)\varphi(d\hat x)$ and $\exp(-\alpha x_1) m(dx_1)$.
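The exponential factor $e^{-\alpha\ell}$ in Theorem 1.6 can be checked numerically in the simplest degenerate case, $r = 1$, $m = 0$, where the chain is a reflecting birth-death walk and everything is computable. The sketch below (an illustration with assumed parameters, not the paper's network model) builds the stationary distribution directly and confirms that the stationary tail decays by the factor $e^{-\alpha} = p/q$ per unit of level, with $\alpha = \log(q/p)$ the parameter of the exponential harmonic function:

```python
# Illustration: geometric decay of the stationary tail for a reflecting
# birth-death walk on {0, ..., K} with up/down probabilities p, q (p < q).
# The decay rate per unit level is e^{-alpha} = p/q, alpha = log(q/p).
import math

p, q = 0.3, 0.5                  # assumed values; remaining mass is null
K = 60

# Detailed balance gives pi(x+1)/pi(x) = p/q; build and normalize.
pi = [1.0]
for _ in range(K):
    pi.append(pi[-1] * p / q)
Z = sum(pi)
pi = [v / Z for v in pi]

alpha = math.log(q / p)

def tail(l):                     # pi([l, K]), the rare-event probability
    return sum(pi[l:])

ratio = tail(31) / tail(30)      # tail falls by e^{-alpha} per unit of l
assert abs(ratio - p / q) < 1e-6
print(alpha, ratio)
```

Here the constant in front of $e^{-\alpha\ell}$ is trivial; in the network setting of Theorem 1.6 that constant involves $f$, $\tilde d_1$, $\varphi$, and $\hat a^{-1}$, which is exactly what the simulation machinery of Section 1.7 is designed to estimate.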
1.5 Hitting Times
We wish to describe the hitting time when $W_1$, the first coordinate of $W$, reaches some high level $\ell$. Let $T_\ell \equiv T_F$ and let $T_D$ denote the hitting time at $D$. Let $f(x)$ denote the probability that, starting from $x \in B \setminus (D \cup F)$, $W$ hits $D$ before $F$; that is,

(1.11) $$f(x) = P_x(W \text{ hits } D \text{ before } F) = P_x(T_D < T_F).$$

$f$ can be extended to all $S$ to satisfy the following Dirichlet problem:

(1.12) $$\begin{cases} Lf(\vec x) = 0 & \text{for } \vec x \in B \setminus D; \\ f(\vec x) = 0 & \text{for } \vec x \in F; \\ f(\vec x) = 1 & \text{for } \vec x \in D. \end{cases}$$

Using the fact that $f$ satisfies (1.12), we have that

(1.13) $$\lambda := \int_{\vec y \in F} \pi(d\vec y) \int_{\vec x \in B} K(\vec y, d\vec x) f(\vec x) = \int_D \pi(d\vec z) \int_{\vec x} K(\vec z, d\vec x)(1 - f(\vec x)).$$

Using the above expression we can interpret $\lambda/\pi(D)$ as the probability, starting in steady state in $D$, that the chain leaves $D$ and doesn't return before hitting $F$. Call this probability $p_D$. Define the Palm measure $P^0_{F \to D}$ associated with the stationary point process of returns to $D$ after passing through $F$. The associated expectation $E^0_{F \to D} T_\ell$ is a natural measure of the time until overload starting from the point in $D$ reached after the last recovery from overload. By the Hitting Time Theorem in Baccelli and McDonald (1996) we have
Corollary 1.7 As $\pi(F) \to 0$, $\lambda \sim (E^0_{F \to D} T_F)^{-1}$.

This result is an extension of Lemma 1 in Meyn and Frater (1993). With additional hypotheses, Baccelli and McDonald (1996) show when $E^0_{F \to D} T_F$ is asymptotic to $E_D T_F$. Using Corollary 1.7 and (1.13) to estimate $E^0_{F \to D} T_F$ seems useless without an expression for $\pi$, and if visits to the forbidden set are rare events then even simulating $\pi$ on $F$ is practically impossible. On the other hand, if $f$ is bounded away from 0 and 1 on $F$ then $E^0_{F \to D} T_F$ clearly has the same rough asymptotics as $\pi(F)$. In fact we can say much more, because we can perform an approximate $h$-transformation on $W$.

Next we replace $f(x)$ by a good guess! Let $\beta(x)$ denote the probability that the random walk $W$, starting at $x$, hits $\Delta$ before hitting $F$. Like $f$, the function $\beta$ satisfies a Dirichlet problem:

(1.14) $$\begin{cases} L\beta(\vec x) = 0 & \text{for } \vec x \in B \setminus \Delta; \\ \beta(\vec x) = 0 & \text{for } \vec x \in F; \\ \beta(\vec x) = 1 & \text{for } \vec x \in \Delta. \end{cases}$$

On $B \setminus \Delta$, the equation $L\beta(\vec x) = 0$ is equivalent to the equation $L^\infty \beta(\vec x) = 0$, because the jump kernels $K$ and $K^\infty$ agree inside $S \setminus \Delta$ and $\beta$ is constant on the edge $\Delta$. Assuming for the moment that $\beta(\vec x)$ is sufficiently close to $f(\vec x)$, we may approximate $\lambda$ by

(1.15) $$\hat\lambda := \int_{\vec y \in F} \pi(d\vec y) \int_{\vec x \in B} K(\vec y, d\vec x)\, \beta(\vec x) = -\int_{\vec z \in \Delta} \pi(d\vec z)\, L\beta(\vec z) = \int_{\vec z \in \Delta} \pi(d\vec z) \int_{\vec x} K(\vec z, d\vec x)(1 - \beta(\vec x)).$$

The above equivalence follows from the Hitting Time Theorem in Baccelli and McDonald (1996) by identifying $\Delta$ and $D$. The fact that $\lambda$ and $\hat\lambda$ are close is shown in the following key Lemma:

Lemma 1.8 (Comparison Lemma) If Conditions (1-7) hold then $\lim_{\ell \to \infty} |\lambda - \hat\lambda|/\hat\lambda = 0$.

Using Corollary 1.7 and the above Lemma 1.8 we have:

Corollary 1.9 If Conditions (1-7) hold then $\lambda = \pi(D) p_D \sim \hat\lambda = \pi(\Delta) p_\Delta$ and
$$E^0_{F \to D} T_F \sim \lambda^{-1} \sim \hat\lambda^{-1} \sim E^0_{F \to \Delta} T_F,$$
where $p_\Delta$ is the probability, starting in equilibrium on $\Delta$, of hitting $F$ before $\Delta$.
Using the fact that $h$ is harmonic for $L^\infty$, define $\psi$ (depending on $\ell$) by

(1.16) $$\beta(\vec x) = 1 - \exp(-\alpha\ell)\, h(\vec x)\, \psi(\vec x).$$

$\beta$ is harmonic for $L^\infty$ on $B \setminus \mathcal N$, so $\psi$ is harmonic with respect to $\mathcal L^\infty$ on $B \setminus \mathcal N$. Moreover, we then have the following boundary-value problem for $\psi$, defined through equation (1.16):

(1.17) $$\begin{cases} \mathcal L^\infty \psi(\vec x) = 0, & \vec x \in B \setminus \mathcal N; \\ \psi(\vec y) = \exp(\alpha\ell)\, h(\vec y)^{-1}, & \vec y \in F; \\ \psi(\vec z) = 0, & \vec z \in \mathcal N. \end{cases}$$

The boundary-value problem (1.17) has a probabilistic solution. Hence

(1.18) $$\psi(\vec x) = \mathcal E_{\vec x}\big[\hat a^{-1}(\hat{\mathcal W}^\infty[\mathcal T_\ell^\infty]) \exp(-\alpha \mathcal R^\infty[\ell])\, 1\{\mathcal T_\ell^\infty < \mathcal T_{\mathcal N}^\infty\}\big] \quad\text{for } \vec x \in S \setminus (\Delta \cup F).$$

Evaluating (1.15) we see
$$\hat\lambda = -\int_{\vec y \in \Delta} \pi(d\vec y)\, L\beta(\vec y) = e^{-\alpha\ell} \int_{\vec z \in \Delta} \pi(d\vec z) \int_{\vec x \in \Delta^c} K(\vec z, d\vec x)\, h(\vec x)\, \psi(\vec x)$$

(1.19) $$= e^{-\alpha\ell} \int_{\vec z \in \Delta} \pi(d\vec z) \int_{\vec x \in \Delta^c} K(\vec z, d\vec x)\, h(\vec x)\, \mathcal E_{\vec x}\big[\hat a^{-1}(\hat{\mathcal W}^\infty[\mathcal T_\ell^\infty]) \exp(-\alpha \mathcal R^\infty[\ell])\, 1\{\mathcal T_\ell^\infty < \mathcal T_{\mathcal N}^\infty\}\big].$$

The above expression has no closed form, although it can be estimated by simulation. The expression for $\psi$ near the origin is obtained by simulating $\mathcal W^\infty$ with absorption on $\mathcal N$. Using Theorem 2.5, expression (1.19) and Corollary 1.7 we shall prove:

Theorem 1.10 (Mean Hitting Time) Under Conditions (1-7), $E^0_{F \to D} T_\ell \sim \exp(\alpha\ell)\, g^{-1}$ as $\ell \to \infty$, where

(1.20) $$g \equiv \int_{\vec z \in \Delta} \pi(d\vec z) \int_{\vec x \notin \Delta} K(\vec z, d\vec x)\, h(\vec x) H(\vec x) \int_{\hat y} \int_{u \ge 0} \hat a^{-1}(\hat y)\, e^{-\alpha u}\, \mu(du, d\hat y),$$

where $H(\vec x)$ is the probability $\mathcal W^\infty$, starting at $\vec x$, never hits $\mathcal N$. The rough asymptotics of $E^0_{F \to D} T_\ell$ are given by $\exp(\alpha\ell)$. The constant $g$ can be obtained by simulation. This is not too onerous because we only need $\psi$ on $\Delta$ and $\mathcal W^\infty$ is transient toward $F$.
1.6 Hitting Distribution
Let the time reversal of the Markov chain $W$ with respect to $\pi$ be denoted by $\overleftarrow W$ and let the kernel of the time reversal be $\overleftarrow K$. Let $\overleftarrow f(\vec x)$ denote the probability $\overleftarrow W$, starting from $\vec x \in B$, hits $D$ before $F$. We denote the first entrance time by $\overleftarrow W$ into $F$ by $\overleftarrow T_F$; $\overleftarrow T_D$ denotes the first entrance time at $D$. By time reversal we have

Lemma 1.11 If $A \in \mathcal S$ and $A \subseteq F$ then
$$P_D(W[T_F] \in A \mid T_F < T_D) = \frac{1}{\pi(D) p_D} \int_{\vec y \in A} \pi(d\vec y) \int_{\vec x \in B} \overleftarrow K(\vec y, d\vec x)\, \overleftarrow f(\vec x),$$
where $p_D = \int_D \pi(d\vec z) P_{\vec z}(T_F < T_D)/\pi(D)$.

To apply Lemma 1.11 we again make an approximation. This time substitute $\overleftarrow\beta(\vec x)$ for $\overleftarrow f(\vec x)$, where $\overleftarrow\beta(x)$ is the probability $\overleftarrow W$ hits $\Delta$ before returning to $F$. The error introduced in the hitting distribution on $F$ is shown to be asymptotically small.

Proposition 1.12 If $A \in \mathcal S$ and $A \subseteq F$ then
$$\int_{\vec y \in A} \pi(d\vec y) \int_{\vec x \in B} \overleftarrow K(\vec y, d\vec x)\, \overleftarrow f(\vec x) \sim \int_{\vec y \in A} \pi(d\vec y) \int_{\vec x \in B} \overleftarrow K(\vec y, d\vec x)\, \overleftarrow\beta(\vec x).$$

We therefore investigate the expression

(1.21) $$\frac{1}{\pi(D) p_D} \int_A \pi(d\vec y) \int_{\vec x \in B} \overleftarrow K(\vec y, d\vec x)\, \overleftarrow\beta(\vec x) = \frac{\pi(\Delta) p_\Delta}{\pi(D) p_D}\, P_\Delta(W[T_F] \in A \mid T_F < T_\Delta)$$

using Lemma 1.11 with $D$ replaced by $\Delta$. By Corollary 1.9, $\pi(\Delta) p_\Delta / \pi(D) p_D \to 1$, so asymptotically $P_D(W[T_F] \in A \mid T_F < T_D)$ and $P_\Delta(W[T_F] \in A \mid T_F < T_\Delta)$ are the same.

Now take $A = (\ell + \Gamma) \times R^{r-1} \times \hat A$ in $\mathcal S$. By a change of measure we see
$$P_\Delta(W[T_F] \in A,\ T_F < T_\Delta) = \frac{1}{\pi(\Delta)} \int_{\vec z \in \Delta} \pi(d\vec z) \int_{\vec x \notin \Delta} K(\vec z, d\vec x) \int_A P_{\vec x}(W^\infty[T_\ell^\infty] \in d\vec y,\ T_F^\infty < T_{\mathcal N}^\infty)$$
$$= \frac{1}{\pi(\Delta)} \int_{\vec z \in \Delta} \pi(d\vec z) \int_{\vec x \notin \Delta} K(\vec z, d\vec x)\, h(\vec x) \int_A \hat a^{-1}(\hat y)\, e^{-\alpha y_1}\, \mathcal P_{\vec x}(\mathcal W^\infty[\mathcal T_\ell^\infty] \in d\vec y,\ \mathcal T_F^\infty < \mathcal T_{\mathcal N}^\infty)$$
$$= \frac{e^{-\alpha\ell}}{\pi(\Delta)} \int_{\vec z \in \Delta} \pi(d\vec z) \int_{\vec x \notin \Delta} K(\vec z, d\vec x)\, h(\vec x) \int_{\hat A} \int_{u \in \Gamma} \hat a^{-1}(\hat y)\, e^{-\alpha u}\, \mathcal P_{\vec x}(\hat{\mathcal W}^\infty[\mathcal T_\ell^\infty] \in d\hat y,\ \mathcal R^\infty[\ell] \in du,\ \mathcal T_F^\infty < \mathcal T_{\mathcal N}^\infty).$$
Similarly,
$$P_\Delta(T_F < T_\Delta) = \frac{1}{\pi(\Delta)} \int_{\vec z \in \Delta} \pi(d\vec z) \int_{\vec x \notin \Delta} K(\vec z, d\vec x)\, h(\vec x) \int_F h^{-1}(\vec y)\, \mathcal P_{\vec x}(\mathcal W^\infty[\mathcal T_\ell^\infty] \in d\vec y,\ \mathcal T_F^\infty < \mathcal T_{\mathcal N}^\infty)$$
$$= \frac{e^{-\alpha\ell}}{\pi(\Delta)} \int_{\vec x} \nu(d\vec x) \int_{\hat y} \int_{u \ge 0} \hat a^{-1}(\hat y) \exp(-\alpha u)\, \mathcal P_{\vec x}(\hat{\mathcal W}^\infty[\mathcal T_\ell^\infty] \in d\hat y,\ \mathcal R^\infty[\ell] \in du,\ \mathcal T_F^\infty < \mathcal T_{\mathcal N}^\infty).$$
Cancelling out $\exp(-\alpha\ell)$ we see $P_D(W[T_\ell] \in A \mid T_\ell < T_D)$ is estimated by

(1.22) $$\frac{\int_{\vec x \in \Delta^c} \nu(d\vec x) \int_{\hat A} \int_{u \in \Gamma} \hat a^{-1}(\hat y) \exp(-\alpha u)\, \mathcal P_{\vec x}(\hat{\mathcal W}^\infty[\mathcal T_\ell^\infty] \in d\hat y,\ \mathcal R^\infty[\ell] \in du,\ \mathcal T_F^\infty < \mathcal T_{\mathcal N}^\infty)}{\int_{\vec x \in \Delta^c} \nu(d\vec x)\, \psi(\vec x)}.$$

Again this provides a practical solution to the hitting problem, since the above expression can be simulated quickly. Proposition 1.12 gives the following theorem.

Theorem 1.13 (Hitting Distribution) Under Conditions (1-7) above, as $\ell \to \infty$,
$$\frac{1}{\pi(D)} \int_{\vec z \in D} \pi(d\vec z)\, P_{\vec z}(\hat W[T_\ell] \in d\hat y,\ R[T_\ell] \in du \mid T_\ell < T_D) \to \hat a^{-1}(\hat y)\, e^{-\alpha u}\, \mu(du, d\hat y) \Big/ \int_{\hat y} \int_{u \ge 0} \hat a^{-1}(\hat y)\, e^{-\alpha u}\, \mu(du, d\hat y),$$
where $\mu(du, d\hat y)$ is the stationary distribution of $(\mathcal R^\infty[\ell], \hat{\mathcal W}^\infty[\mathcal T_\ell^\infty])$.
1.7 Literature Review
The main technique used here is to find a function $h$ which is harmonic outside some boundary $\mathcal N$, in order to perform an $h$-transformation or twist of the Markov chain. The $h$-transform, introduced by Doob (1959), has a long history. In the case of random walk on the line the twist is the associated random walk discussed in Section XII.4 in Feller Volume II. Kesten (1974), Section 4, studied the asymptotics of a Markov additive process using the twist. Nummelin and Ney (1987) used the twist to study large deviations of the additive component of a Markov additive process.

The representation of the steady state measure of a rare event $A \subseteq F$ employed in Subsection 1.4 is the basis of the $\Delta$-cycle (called $A$-cycle there) technique in importance sampling exploited by Nicola, Shahabuddin, Heidelberger, and Glynn (1992). $\Delta$-cycles consist of trajectories which start in $\Delta$, the edge of $\mathcal N$ in $S$, and end just before a return to $\Delta$. The equilibrium measure $\pi$ on $\Delta$ is unknown (unless $\Delta$ is a single point) but by generating $N$ (untwisted) $\Delta$-cycles it can be estimated. Just sample $W$ whenever it returns to $\Delta$. Let $T_\Delta(k)$, $k = 1, \dots, N$, denote the return times to $\Delta$ in the $N$ untwisted $\Delta$-cycles. Each return by an untwisted $\Delta$-cycle determines the start of a twisted $\Delta$-cycle, and we alternate between untwisted and twisted $\Delta$-cycles. At the start of the $k$th twisted $\Delta$-cycle, first generate a step with $K$ and next generate steps using the twisted kernel $\mathcal K^\infty$ until $\mathcal W^\infty$ either hits $\mathcal N$ or it hits $F$ at time $\mathcal T_\ell^\infty$. If it hits $\mathcal N$ before $F$, set $V_k = 0$. If $\mathcal W^\infty$ hits $F$ before $\mathcal N$, then turn off the twist and count $N_k$, the number of visits to $A$ before $\mathcal W^\infty$ returns to $\Delta$. In this case let
$$V_k := \frac{h(\mathcal W^\infty[0])}{\hat a(\hat{\mathcal W}^\infty[\mathcal T_\ell^\infty])} \exp(-\alpha \mathcal R^\infty[\ell])\, N_k.$$
The expected value $\exp(-\alpha\ell)\pi(\Delta) E_\Delta V_k$ is precisely expression (2.31). Hence
$$\sum_{k=1}^N V_k \Big/ \sum_{k=1}^N T_\Delta(k)$$
is an asymptotically unbiased estimator of $\exp(\alpha\ell)\pi(A)$. This application of the twist and $\Delta$-cycles to produce an importance sampling estimator is described in detail in Bonneau (1996). Also see the survey paper by Heidelberger (1995). Note also that the $\Delta$-cycle technique may be used to estimate constants like
$$f \equiv \int_{\vec z \in \Delta} \pi(d\vec z) \int_{\vec x \notin \Delta} K(\vec z, d\vec x)\, h(\vec x) H(\vec x) \quad\text{and}\quad g \equiv \int_{\vec z \in \Delta} \pi(d\vec z) \int_{\vec x \notin \Delta} K(\vec z, d\vec x)\, h(\vec x) H(\vec x) \int_{\hat y} \int_{u \ge 0} \hat a^{-1}(\hat y)\, e^{-\alpha u}\, \mu(du, d\hat y)$$
in Theorem 1.6 and Theorem 1.10. Note that if Conditions (1-7) hold, the importance sampling estimator above will have a standard deviation of order $O(1/\sqrt N)$. If the Conditions fail, the asymptotics of $\pi(A)$ are not given by $\exp(-\alpha\ell)$. In practice the importance sampling technique will fail because $\Delta$-cycles will abort by hitting $\Delta$ before $F$.

Glasserman and Kou (1995) described problems with importance sampling when estimating the probability of overloading a network. In the two-node example discussed in Section 4.1 in Glasserman and Kou (1995) these problems arise because cycles end when the network empties. The likelihood ratio is poorly controlled on the boundary when the second queue empties. We would take $\Delta = \{(x_1, x_2) : x_2 = 0\}$ and the problem would be eliminated (but of course we have to estimate $\pi$ on $\Delta$).

The mean time until a rare event occurs is another useful descriptor. By Corollary 1.7 and Corollary 1.9, the inverse of the mean time for the chain in equilibrium to hit $F$, starting from the first return point in $D$ after visiting $F$, is asymptotic to $\hat\lambda = \pi(\Delta) p_\Delta$, where $p_\Delta$ is the probability, starting in equilibrium on $\Delta$, of hitting $F$ before $\Delta$. We can estimate $\pi(\Delta)$ by the average number of steps $T_\Delta$ in an untwisted $\Delta$-cycle. The importance sampling estimator of $\hat\lambda$ is obtained by remarking that
$$\hat\lambda = \pi(\Delta) p_\Delta = \pi(\Delta) P_\Delta(T_\ell < T_\Delta) = \int_{\vec z \in \Delta} \pi(d\vec z) \int_{\vec x \notin \Delta} K(\vec z, d\vec x)\, h(\vec x)\, \mathcal E_{\vec x}\big(h^{-1}(\mathcal W^\infty[\mathcal T_\ell^\infty])\, 1\{\mathcal T_F^\infty < \mathcal T_{\mathcal N}^\infty\}\big)$$

(1.23) $$= \exp(-\alpha\ell) \int_{\vec z \in \Delta} \pi(d\vec z) \int_{\vec x \notin \Delta} K(\vec z, d\vec x)\, h(\vec x)\, \mathcal E_{\vec x}\big(\hat a^{-1}(\hat{\mathcal W}^\infty[\mathcal T_\ell^\infty]) \exp(-\alpha \mathcal R^\infty[\ell])\, 1\{\mathcal T_F^\infty < \mathcal T_{\mathcal N}^\infty\}\big).$$
This can be estimated as above. At the start of the kth M-cycle generate rst a step K and then generate steps using the twisted kernel K1 until W 1 either hits N or it hits F at time T`1. If it hits N before F set Vk = 0. If W 1 hits F before N then set Vk = (h(W 1[0])=a^(W^ [T` ])) exp(,R[`]). The expected value exp(,`)(M)EMVk is precisely expression (1.23). Consequently N X k=1
Vk =
N X k=1
TM(k)
is an asymptotically unbiased estimator of exp( p `)b. Again this importance sampling estimator will have a standard deviation of order O(1= N ) if Conditions (1-7) hold. Parekh and Walrand (1989) and Frater, Lennon and Anderson (1991) calculated the change of measure or twist associated with a large deviation of the total number of customers in a Jackson network. They employed this twist to implement the above M-cycle technique to nd importance sampling estimates of the probability of the rare event when the total number of customers exceeds some level ` as well as an estimate of the mean time until the network overloads. They show that in some cases these estimates have optimally small standard deviation. Labreche (1995) and Labreche and McDonald (1995) nd a harmonic function for a Jackson network in order to study overloads. The associated twist is precisely the one found by Frater, Lennon and Anderson (1991). The main thrust of our work is exact asymptotics. Exact asymptotics for the steady state probabilities of rare events have been studied previously by Hoglund (1991) for ruin problems and by Schwarz and Weiss (1993) in the special case of the FHW queueing network originally proposed by Flatto and Hahn (1985). Recently Sadowsky and Szpankowski (1995) gave the exact asymptotics of a fast teller queueing system using methods similar to those employed in Section 1.4. Here we analyze the FHW example. Our method is to cut out the boundary N where W fails to be a Markov additive chain. This represents a broader point of view than that taken in Sadowsky and Szpankowski (1995) (or in previous works) since the boundary there is simply those states where a (one dimensional) queue 14
is empty. Using the theory of Nummelin and Ney (1987) one can then find a harmonic function on the complement of N. The associated h-transform produces the twisted process. The main novelty is Theorem 1.10, giving the exact asymptotics of the mean time until a rare event occurs, and Theorem 1.13, giving the limiting hitting distribution. The main ingredient in the proof is the Comparison Lemma which shows we may replace an arbitrary starting set with an initial steady state distribution on M. The generality of the boundary makes Conditions (4-7) necessary. The representation of the limiting steady state probability of a rare event given in Theorem 1.6 is also new, as is the Joint Bottlenecks Theorem 2.10.

The range of applications is quite large. In Section 3 we find the harmonic function associated with a simple example, the Flatto-Hahn-Wright model. The fast teller model is studied in Beck, Dabrowski and McDonald (1997). The join the shortest queue model was partially analysed in McDonald (1996). The harmonic function was given explicitly and Conditions (1-5) were checked. Conditions (6-7) are verified in Foley and McDonald (1997). In Huang and McDonald (1996) the representation in Theorem 1.6 was used to give the asymptotics of the queue length distribution of an M|D|1 queue. Also the key heuristic of guessing a form for f can be used in conjunction with the theory in Iscoe and McDonald (1994a, 1994b) to give error bounds for the estimated mean exit time.

In Subsection 2.1 we review the conditions necessary for determining a twist. In Subsection 2.2 we recall the asymptotic theory of Markov additive chains. Subsection 2.3 gives all the proofs which were deferred in earlier sections. In Subsection 2.4 we determine the behaviour of the queues at those nodes which become transient when the first is overloaded. In Section 3 we apply the above results to the FHW example.
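The variance reduction behind the M-cycle importance sampling technique discussed above can be illustrated on a toy example of my own (it is not one of the networks treated in this paper): estimating the probability that a nearest-neighbour random walk with negative drift hits a high level ℓ before 0, simulating under the twisted walk with up and down probabilities swapped. The gambler's-ruin formula provides an exact answer to check against.

```python
import random

def exact_hit_prob(p, start, level):
    # Gambler's ruin: P_start(hit `level` before 0) when up-probability p != 1/2.
    r = (1 - p) / p
    return (1 - r ** start) / (1 - r ** level)

def is_estimate(p, start, level, n, seed=0):
    # Simulate under the twisted walk (up/down probabilities swapped).
    # Every path that reaches `level` before 0 has the constant likelihood
    # ratio (p/q)**(level - start), so the weight carries no variability.
    rng = random.Random(seed)
    q = 1 - p
    weight = (p / q) ** (level - start)
    hits = 0
    for _ in range(n):
        x = start
        while 0 < x < level:
            x += 1 if rng.random() < q else -1   # twisted kernel: up-prob q
        if x == level:
            hits += 1
    return weight * hits / n

p, level = 0.3, 12
est = is_estimate(p, 1, level, 20000)
exact = exact_hit_prob(p, 1, level)
print(est, exact)
```

Because the likelihood ratio is constant on the paths that count, the only randomness left in the estimator is a Bernoulli indicator with success probability bounded away from 0, which is the mechanism behind the O(1/√N) standard deviation claimed above.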
2 Tools and Proofs

2.1 The Twist
We now show why we may expect there to exist a harmonic function h for K¹ of the form h(x̃) = exp(αx₁)â(x̂). To make the connection to the work of Ney and Nummelin (1987a,b) easier we make the identification (W̃¹, Ŵ¹) ≡ (V, Z) throughout this section (the superscript 1 would be redundant since (V, Z) is always free). The generating function K̂¹_α of the transition kernel of the Markov additive chain (V, Z) is given by

K̂¹_α(x̂, Â) = E(exp(α·X[1]) 1{Z[1] ∈ Â} | Z[0] = x̂)

where X[1] := V[1] − V[0]. The existence of eigenvalues and eigenvectors for this "Feynman-Kac" operator was studied in Ney and Nummelin (1987a,b) under the condition (M1) below:

(M1) There exists a probability measure ν on (Ŝ, 𝒮̂) and a family of (positive) measures {h(x̂, ·)} such that ∫ ν(dx̂) h(x̂, ℝʳ) > 0 and

h(x̂, Γ) ν(Â) ≤ K¹((0̃, x̂), Γ × Â)

for all x̂ ∈ Ŝ, Γ ∈ ℬ_r and Â ∈ 𝒮̂.

Under this hypothesis, Ney and Nummelin (1984) (see Lemma 3.1 there) constructed a regenerative structure for Markov additive chains:
Lemma 2.1 Under (M1) there exist random variables 0 < T₀ < T₁ < ⋯ with the following properties:
(i) (T_{i+1} − T_i, i = 0, 1, …) are i.i.d. random variables;
(ii) the random blocks (Z[T_i], …, Z[T_{i+1} − 1], X[T_i + 1], …, X[T_{i+1}]), i = 0, 1, …, are independent; and
(iii) P_x̃(Z[T_i] ∈ Â | F_{T_i − 1}, X[T_i]) = ν(Â) for Â ∈ 𝒮̂, where F_n is the σ-algebra generated by {Z[0], …, Z[n], X[1], …, X[n]}.

At times {T_i} the process Z behaves as if it has returned to an atom in the state space. This generates a succession of cycles [T_{i−1}, T_i). The increments of V between successive returns to this atom are therefore independent. Define

τ =_D T_{i+1} − T_i and S[τ] =_D V[T_{i+1}] − V[T_i], i = 1, 2, ….

The variables τ and S[τ] have a joint distribution which can be obtained as above from any of the identical regenerative cycles. Define 𝔖 := Supp(S[τ]/τ) where Supp(Y) is the convex hull of the support of the measure P(Y ∈ ·). To ensure that the additive component is genuinely r-dimensional we assume:

(N1) The interior of 𝔖 is nonempty.

We now impose mixing conditions so that two independent copies of (V, Z) with different initial distributions may be coupled together from some T_i onward.

(P1) In the discrete case we assume the distribution of S₁[τ], the (marginal) distribution of the first component of S[τ], is aperiodic.

(P2) In the continuous case we suppose that the distribution of S₁[τ] is spread out so it has a nonsingular component relative to Lebesgue measure.

In the continuous case we could impose a sufficient condition like (iii) given in Appendix A of Sadowsky and Szpankowski (1995). It would suffice that the measure h in Condition (M1) satisfy

h(x̂, dỹ) ≥ c ∏_{k=1}^r (b_k − a_k)⁻¹ 1_{[a_k, b_k]}(y_k) m(dỹ) for some constant c.
In fact we could dispense with the mixing Condition (P2) altogether and Theorems 1.5, 1.6 and 1.7 would still hold using Theorem 3.1 in Athreya, McDonald and Ney (1978). On the other hand the convergence in total variation in Theorem 1.13 will fail without the mixing condition (P2). Lemma 1.3 indicates how the theory goes under these weaker conditions. Under (M1) we may define the generating function

φ(α, λ) = E exp(α·S[τ] − λτ).
Let 𝒰 := {(α, λ) ∈ ℝ^{r+1} : φ(α, λ) < ∞} and define Λ(α) = inf{λ : φ(α, λ) ≤ 1}. We impose the following conditions:

(M2) 𝒰 is an open set.
(M3) E S₁[τ] < 0.

If (M1-2) hold then by Theorem 4.1 in Ney and Nummelin (1987a), φ(α, Λ(α)) = 1 and exp(Λ(α)) is an eigenvalue of K̂¹_α associated with the right eigenvector r(x̂; α). We study the case where α = (α₁, 0, …, 0) and we wish to find α₁ other than 0 such that Λ(α) = 0. By Lemma 3.3 in Ney and Nummelin (1987a) we have that

∇Λ(0̃) = (Eτ)⁻¹ E S[τ].

Hence if Condition (M3) holds then the derivative of Λ with respect to α₁ is negative at α₁ = 0. Naturally Λ(0̃) = 0. Moreover the function Λ(·) is strictly convex by Corollary 3.3 in Ney and Nummelin (1987a). Hence Λ(α₁, 0, …, 0) passes through 0 at most at two points, 0 and α > 0. If we impose the condition that P(S₁[τ] > 0) > 0 then by the definition of Λ and the fact that φ(α, Λ(α)) = 1 we have Λ(α₁, 0, …, 0) → ∞ as α₁ → ∞. Finally, by Condition (M2), Λ(α₁, 0, …, 0) increases continuously to infinity so α exists. We note in passing:

Lemma 2.2 If Condition (M3) holds then ∇Λ(α, 0, …, 0)·(1, 0, …, 0) > 0.

Proof: By strict convexity

Λ(0̃) > Λ(α, 0, …, 0) + ∇Λ(α, 0, …, 0)·(−α, 0, …, 0).

The result follows since Λ(0̃) = Λ(α, 0, …, 0) = 0. □

α is the uniquely chosen value for α₁ ensuring that the associated Perron-Frobenius eigenvalue of K̂¹_α is 1! Let α̃ := (α, 0, …, 0). Hence there exists a positive Perron-Frobenius eigenvector â(x̂) ≡ r(x̂; α̃) satisfying â(x̂) = ∫ K̂¹_{α̃}(x̂, dŷ) â(ŷ). We may therefore define a positive harmonic function for the additive chain (V, Z) by h(x̃) := exp(α̃·x̃)â(x̂) ≡ exp(αx₁)â(x̂). To show it is harmonic for K¹, pick (x̃, x̂) ∈ S¹ such that x₁ = s, so

∫ K¹(x̃, dỹ) h(ỹ) = ∫_{(ṽ,ŷ)∈S¹} K¹((x̃, x̂), (x̃ + dṽ, dŷ)) â(ŷ) exp(α(s + ṽ₁))
  = exp(αs) ∫_ŷ K̂¹_{α̃}(x̂, dŷ) â(ŷ) = exp(αs) â(x̂) = h(x̃)
since â is a right eigenvector for K̂¹_{α̃} associated with the eigenvalue 1.

The kernel K̂¹_α and its stationary distribution φ are denoted by Q(α) and π(α) in Ney and Nummelin (1987a). The increment of the additive chain is denoted by S₁ there, so the expression

(2.24) d̃₁ ≡ E_φ X₁[1] = E_{Q(α)} S₁ = ∇Λ(α, 0, …, 0)·(1, 0, …, 0)

follows by Lemma 5.3 there. Let the stationary distribution of K̂¹ be denoted by π̂¹.
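For a finite modulating chain the construction above can be carried out numerically: tune α so that the Perron-Frobenius eigenvalue of the transformed kernel K̂¹_α equals 1, read off the right eigenvector â, and check that h(x, x̂) = e^{αx} â(x̂) is harmonic for the free kernel. The two-state chain and step probabilities below are illustrative choices of mine, not data from the paper.

```python
import math

# Markov additive chain: modulating states {0,1}; from state i the additive
# part steps +1 with probability up[i] and -1 otherwise, while the modulating
# state moves according to P.  (Illustrative parameters; mean drift < 0.)
P = [[0.6, 0.4], [0.5, 0.5]]
up = [0.3, 0.4]

def pf_eig(alpha):
    # Transformed kernel K_hat[i][j] = P[i][j]*(up[i]e^a + (1-up[i])e^-a);
    # return its Perron-Frobenius eigenvalue (2x2 case) and the matrix.
    m = [u * math.exp(alpha) + (1 - u) * math.exp(-alpha) for u in up]
    K = [[P[i][j] * m[i] for j in range(2)] for i in range(2)]
    tr = K[0][0] + K[1][1]
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return (tr + math.sqrt(tr * tr - 4 * det)) / 2, K

# Bisection for the nonzero root alpha of pf_eig(alpha) = 1
# (the eigenvalue dips below 1 near 0 because the drift is negative).
lo, hi = 0.01, 5.0
for _ in range(200):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if pf_eig(mid)[0] < 1 else (lo, mid)
alpha = (lo + hi) / 2
_, K = pf_eig(alpha)

# Right Perron-Frobenius eigenvector a_hat for eigenvalue 1: (K - I)v = 0.
a_hat = [K[0][1], 1.0 - K[0][0]]

# Harmonicity of h(x, i) = e^{alpha x} a_hat[i]: summing the free kernel
# against h, the factor e^{alpha x} cancels and the condition reduces to
# (K a_hat)_i = a_hat[i].
for i in range(2):
    lhs = sum(K[i][j] * a_hat[j] for j in range(2))
    assert abs(lhs - a_hat[i]) < 1e-9
print(alpha, a_hat)
```

The bisection here plays the role of the convexity argument above: the Perron-Frobenius eigenvalue is 1 at α = 0, drops below 1 for small α > 0 by the negative drift, and grows without bound, so the second root exists and is unique.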
Lemma 2.3 (Drift) If Conditions (M1-2) hold and if E_{π̂¹} X₁[1] < 0 then d̃₁ > 0.

Proof: π̂¹ is denoted by π(0) in Ney and Nummelin (1987a). By Lemma 5.3 there, E_{π̂¹} X₁[1] = ∇Λ(0̃)·(1, 0, …, 0). It follows from the hypothesis that ∇Λ(0̃)·(1, 0, …, 0) < 0. Hence Condition (M3) holds. Therefore, by Lemma 2.2, ∇Λ(α, 0, …, 0)·(1, 0, …, 0) > 0. The result follows from the above expression for d̃₁. □

Condition (M2) is too strong for many applications. By direct computation it may be possible to find the left and right Perron-Frobenius eigenmeasure and eigenfunction associated with the eigenvalue exp(Λ(α)) of K̂¹_α. The left eigenmeasure and right eigenfunction are denoted by ψ̂_α and r(·; α) respectively and we suppose that ∫ r(x̂; α) ψ̂_α(dx̂) = 1. It may be that ψ̂_α is not even a probability measure! The conditions of Lemma 5.3 in Ney and Nummelin (1987a) may not be checkable but (2.24) still holds. Take the derivative of the expression
∫∫ ψ̂_α(dx̂) K̂¹_α(x̂, dŷ) r(ŷ; α) = exp(Λ(α)).

We get

Λ'(α) exp(Λ(α)) = ∫ [d/dα₁ ψ̂_α(dx̂)] r(x̂; α) exp(Λ(α)) + ∫ ψ̂_α(dx̂) [d/dα₁ r(x̂; α)] exp(Λ(α))
  + ∫∫ ψ̂_α(dx̂) [d/dα₁ K̂¹_α(x̂, dŷ)] r(ŷ; α).

But

d/dα₁ ∫ ψ̂_α(dx̂) r(x̂; α) = d/dα₁ 1 = 0

and

∫∫ ψ̂_α(dx̂) [d/dα₁ K̂¹_α(x̂, dŷ)] r(ŷ; α) = ∫∫ ψ̂_α(dx̂) K¹((0̃, x̂), (dỹ, dŷ)) y₁ e^{α·ỹ} r(ŷ; α)
  = ∫∫ ψ̂_α(dx̂) r(x̂; α) K¹((0̃, x̂), (dỹ, dŷ)) y₁ e^{α·ỹ} r(ŷ; α)/r(x̂; α).

Evaluating at α̃ where Λ(α̃) = 0 gives

∫ r(x̂; α̃) ψ̂_{α̃}(dx̂) ∫ 𝒦¹((0̃, x̂), (dỹ, dŷ)) y₁ = ∫ φ(dx̂) ∫ 𝒦¹((0̃, x̂), (dỹ, dŷ)) y₁ = d̃₁.

Therefore d̃₁ = ∂Λ/∂α₁(α̃) so we can establish d̃₁ > 0 from the convexity of Λ.
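The identity d̃₁ = ∂Λ/∂α₁(α̃) is easy to verify numerically in the degenerate case of i.i.d. increments, where Λ(α) = log E exp(αX) and the twisted step distribution is e^{α*x} P(X = x). The step distribution below is an arbitrary illustration of mine.

```python
import math

# i.i.d. increments: X = +1 w.p. 0.3, -1 w.p. 0.7 (negative drift).
p = 0.3

def Lambda(a):
    # Cumulant generating function log E exp(aX).
    return math.log(p * math.exp(a) + (1 - p) * math.exp(-a))

# Nonzero root of Lambda: E exp(aX) = 1 gives exp(a) = (1-p)/p.
alpha = math.log((1 - p) / p)
assert abs(Lambda(alpha)) < 1e-12

# Twisted step probabilities: P_twist(X = x) = exp(alpha*x) P(X = x).
p_up = p * math.exp(alpha)          # = 1 - p
p_dn = (1 - p) * math.exp(-alpha)   # = p
drift_twist = p_up - p_dn           # this is d_1 in this toy case

# Finite-difference derivative of Lambda at the root.
h = 1e-6
deriv = (Lambda(alpha + h) - Lambda(alpha - h)) / (2 * h)
print(drift_twist, deriv)
```

The twisted drift is positive because the root α lies on the increasing branch of the strictly convex function Λ, exactly as in the convexity argument above.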
2.2 Asymptotics of Markov additive processes
In this section we apply the results in Kesten (1974) to obtain the asymptotic behaviour of 𝒲¹. To make the connection easier we identify (𝒲̃¹, 𝒲̂¹) ≡ (𝒱, 𝒵) (again the superscript 1 is redundant since (𝒱, 𝒵) is always free). Following Kesten (1974), consider a two-sided process {X[n]^#, Z[n]^#, −∞ < n < ∞} with probability measure P^# determined by

P^#(Z[k+i]^# ∈ dẑ_i, 0 ≤ i ≤ n; X^#[k+i] ∈ dχ_i, 1 ≤ i ≤ n)
  = φ(dẑ₀) 𝒦¹((0̃, ẑ₀), dχ₁ × dẑ₁) 𝒦¹((0̃, ẑ₁), dχ₂ × dẑ₂) ⋯ 𝒦¹((0̃, ẑ_{n−1}), dχ_n × dẑ_n)

for any sequence of states {ẑ_i, 0 ≤ i ≤ n} in Ŝ, any sojourn increments {χ_i, 1 ≤ i ≤ n} and any integer k. The pairs {(X[n]^#, Z[n]^#)}, −∞ < n < ∞, form a stationary sequence. As ℓ → ∞,

m_A(ℓ, u, ŷ) → m_A^∞(u, ŷ) := E_{(u,0,…,0,ŷ)} ( Σ_{n=0}^∞ A(W¹[n]) )

by monotone convergence. By hypothesis m_A^∞ is uniformly bounded above so, using Lemma 1.4, we see that

e^{αℓ} ∫ A(x₁, x̂) π((ℓ + dx₁) × ℝ^{r−1} × dx̂) → ∫_x̃ μ(dx̃) H(x̃) ∫_{ŷ∈Ŝ} ∫_{u≥0} â⁻¹(ŷ) e^{−αu} m_A^∞(u, ŷ) ψ(du, dŷ).
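For the M/M/1 queue a limit of this type is exact at every finite ℓ, which makes it a convenient sanity check: with π(n) = (1 − ρ)ρⁿ and twist e^α = 1/ρ, the rescaled tail e^{αℓ} π(ℓ + x) does not depend on ℓ at all. This is my illustration, not a computation from the paper.

```python
import math

rho = 0.6                                  # M/M/1 load (illustrative)
alpha = math.log(1 / rho)                  # twist parameter, e^alpha = 1/rho

def pi(n):
    # Stationary queue length distribution of M/M/1.
    return (1 - rho) * rho ** n

# e^{alpha*l} * pi(l + x) is independent of the level l: the limit of the
# rescaled tail is attained exactly at every finite level.
for x in range(5):
    vals = [math.exp(alpha * l) * pi(l + x) for l in (5, 10, 20)]
    assert max(vals) - min(vals) < 1e-12
    assert abs(vals[0] - (1 - rho) * rho ** x) < 1e-12
print("rescaled tail:", [(1 - rho) * rho ** x for x in range(3)])
```

In the general Markov additive setting the rescaled tail only converges as ℓ → ∞; the geometric stationary distribution is the special case where the prelimit already equals the limit.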
Proof of Theorem 1.6: By Theorem 1.5,

(2.32) ∫ A(x₁, x̂) π((ℓ + dx₁) × ℝ^{r−1} × dx̂)
  ≈ e^{−αℓ} ∫_{z̃∈M} π(dz̃) ∫_{x̃∉M} K(z̃, dx̃) h(x̃) H(x̃) ∫_ŷ ∫_{u≥0} â⁻¹(ŷ) e^{−αu} m_A^∞(u, ŷ) ψ(du, dŷ)

where m_A^∞(u, ŷ) := E_{(u,0,…,0,ŷ)} ( Σ_{n=0}^∞ A(W¹[n]) ). For ỹ = (u, y₂, …, y_r, ŷ),

P_ỹ(W¹[n] ∈ dv × ℝ^{r−1} × dŵ) = (K¹)ⁿ(ỹ, dv × ℝ^{r−1} × dŵ) = exp(α(u − v)) (â(ŷ)/â(ŵ)) (𝒦¹)ⁿ(ỹ, dv × ℝ^{r−1} × dŵ).
Hence,

m_A^∞(u, ŷ) = E_{(u,0,…,0,ŷ)} ( Σ_{n=0}^∞ A(W¹[n]) )
  = ∫_{v≥0} ∫_ŵ exp(α(u − v)) A(v, ŵ) (â(ŷ)/â(ŵ)) E_{(u,0,…,0,ŷ)} ( Σ_{n=0}^∞ 1{𝒲¹[n] ∈ dv × ℝ^{r−1} × dŵ} ).

Substituting this into (2.32) gives

exp(αℓ) ∫ A(x₁, x̂) π((ℓ + dx₁) × ℝ^{r−1} × dx̂)
  → f̄ ∫_ŷ ∫_{u≥0} ∫_{v≥0} ∫_ŵ ψ(du, dŷ) exp(−αv) (A(v, ŵ)/â(ŵ)) E_{(u,0,…,0,ŷ)} ( Σ_{n=0}^∞ 1{𝒲¹[n] ∈ dv × ℝ^{r−1} × dŵ} )
  = (f̄/d̃₁) ∫_ŵ ∫_{v≥0} A(v, ŵ) â⁻¹(ŵ) φ(dŵ) exp(−αv) m(dv),

where f̄ := ∫_{z̃∈M} π(dz̃) ∫_{x̃∉M} K(z̃, dx̃) h(x̃) H(x̃), since φ is the steady state of 𝒦̂¹ so

m(dv) φ(dŵ)/d̃₁ = ∫_{ŷ,u} ψ(du, dŷ) E_{(u,0,…,0,ŷ)} ( Σ_{n=0}^∞ 1{𝒲¹[n] ∈ dv × ℝ^{r−1} × dŵ} )

is the mean number of visits of 𝒲¹ to dv × ℝ^{r−1} × dŵ. The asymptotics of π(dỹ) conditioned on y₁ ∈ ℓ + Γ follow from the above by summing. □

Note that ϖ(dx̃) 𝒦̄¹(x̃, dỹ) = ϖ(dỹ) 𝒦¹(ỹ, dx̃) for x̃, ỹ ∈ S¹ ∖ N, where we denote the measure h(ỹ)π(dỹ) by ϖ(dỹ). From Theorem 1.6 we have that, as ℓ tends to infinity,

ϖ(du × ℝ^{r−1} × dx̂) → ( ∫_{ŷ,v} â⁻¹(ŷ) e^{−αv} ψ(dv, dŷ) ) φ(dx̂) m(du)/d̃₁

where we recall m denotes counting measure in the discrete case and Lebesgue measure in the continuous case. Consequently 𝒦̄(x̃, dy₁ × ℝ^{r−1} × dŷ) is given asymptotically by the time reversal of 𝒦¹ with respect to the measure m(dx₁) × φ(dx̂). This connects the results in Subsection 2.2 and Lemma 1.11. By Lemma 1.11 the probability W̄ hits the set du × ℝ^{r−1} × dŷ is asymptotically

π(D)⁻¹ p_D(du dy₂ ⋯ dy_r dŷ).

Suppose now that the parameters are such that when the first node is overloaded the second also overloads. To treat this case, create a fictitious node which neither receives nor serves customers. In other words consider a state space S = ℕ₀ × ℕ₀ × {0}. The chain W = (W₁, W₂, Ŵ) is such that Ŵ[t] ≡ 0 while (W₁[t], W₂[t]) represents the number of customers queued at the first and second nodes respectively. To construct W¹ simply extend the transition probabilities of W to S¹ := ℤ × ℤ × {0} by taking N = {(x, y) : x ≤ 0 or y ≤ 0, x, y ∈ ℤ}. Now twist W¹ with twist coordinates (a₁, 1, a₃).
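The reversal statement above can be checked in the simplest setting: for the twisted nearest-neighbour walk on ℤ, the time reversal with respect to counting measure m simply exchanges the up and down probabilities. This toy check is mine, assuming an illustrative up-probability.

```python
# Twisted nearest-neighbour walk on Z: up-probability q, down-probability p
# (the twist of a negative-drift walk with up-probability p < 1/2).
# The time reversal w.r.t. counting measure m satisfies
#   m(x) K(x, y) = m(y) K_bar(y, x),  i.e.  K_bar(y, x) = K(x, y).
p = 0.3
q = 1 - p

def K(x, y):
    # transition kernel of the twisted walk
    return q if y == x + 1 else (p if y == x - 1 else 0.0)

def K_bar(y, x):
    # time reversal with respect to counting measure (m(x) = 1 for all x)
    return K(x, y)

for y in range(-3, 4):
    # the reversed walk steps up with probability p and down with q
    assert K_bar(y, y + 1) == p
    assert K_bar(y, y - 1) == q
    # detailed reversal identity with m identically 1
    for x in (y - 1, y + 1):
        assert K(x, y) == K_bar(y, x)
```

So the reversal of the upward-drifting twisted walk is again a downward-drifting walk, which is the one-dimensional shadow of the statement that 𝒦̄ is asymptotically the time reversal of 𝒦¹ with respect to m(dx₁) × φ(dx̂).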
Since Ŵ¹ has only one state, a₃ = 1 and the only equation to be satisfied is

(λ + λ₁)a₁ + (λ₂ + μ₂) + μ₁ a₁⁻¹ = λ + λ₁ + λ₂ + μ₂ + μ₁.

The only solution other than a₁ = 1 is a₁ = μ₁/(λ + λ₁). This is the same twist as was obtained above. The stationary distribution of Ŵ¹ is a unit mass at 0 so

(d̃₁, d̃₂) = (μ₁ − (λ + λ₁), λ₂ + λμ₁/(λ + λ₁) − μ₂)

by adding the mean increments of the components of (W₁¹, W₂¹). Hence d̃₁ and d̃₂ are positive since we are assuming μ₁ > λ + λ₁ and λ₂ + λμ₁/(λ + λ₁) > μ₂
so Conditions (1-5,7) are automatic. To check Condition (6) we must show that

Σ_{x=0}^∞ h(x, 0) π(x, 0) < ∞ and Σ_{y=0}^∞ h(0, y) π(0, y) < ∞.

The second sum is surely finite since h(0, y) = 1 for y ≥ 0. Next,

Σ_{x=0}^∞ h(x, 0) π(x, 0) = Σ_{x=0}^∞ Σ_{y=0}^∞ f(x, y) π(x, y) where f(x, y) := a₁^x 1{y = 0}.
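The twist equation solved above is a quadratic in a₁: any balance equation of the form c₁a + c₂ + c₃/a = c₁ + c₂ + c₃ has roots a = 1 and a = c₃/c₁. A quick numerical confirmation (the rate values are illustrative placeholders of mine):

```python
import math

def twist_roots(c1, c2, c3):
    # Multiplying c1*a + c2 + c3/a = c1 + c2 + c3 by a gives the quadratic
    # c1*a^2 - (c1 + c3)*a + c3 = 0; the neutral terms c2 cancel.
    b = c1 + c3
    disc = math.sqrt(b * b - 4 * c1 * c3)
    return (b - disc) / (2 * c1), (b + disc) / (2 * c1)

# Illustrative rates: c1 multiplies a (upward jumps), c2 is neutral,
# c3 multiplies 1/a (downward jumps).
c1, c2, c3 = 1.5, 1.2, 3.0
r1, r2 = twist_roots(c1, c2, c3)
for a in (r1, r2):
    # both roots satisfy the original balance equation
    assert abs(c1 * a + c2 + c3 / a - (c1 + c2 + c3)) < 1e-9
print(r1, r2)
```

The trivial root a = 1 corresponds to no change of measure; the nontrivial root c₃/c₁ is the twist used throughout this example.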
By Lemma 1.1 it suffices to find a positive function V(x, y) such that LV(x, y) ≤ −f(x, y) + s(x, y) where s is a positive function such that Σ_{x,y} s(x, y) π(x, y) < ∞. Define

V(x, y) := a₁^x (β/δ)^y / (δ − β), where β := μ₂ and δ := λ₂ + λa₁.

It is easy to check that LV(x, y) = 0 if x > 0 and y > 0. Next, if x > 0 then

LV(x, 0) = (β − δ) V(x, 0) = −a₁^x = −f(x, 0).

Finally, if y > 0 then

LV(0, y) = (μ₁ − (λ + λ₁)) V(0, y) = (μ₁ − (λ + λ₁)) (β/δ)^y / (δ − β)

and

LV(0, 0) = (β − δ + μ₁ − (λ + λ₁)) V(0, 0) = (β − δ + μ₁ − (λ + λ₁)) / (δ − β).

Define

s(x, y) = ((μ₁ − (λ + λ₁))/(δ − β)) (β/δ)^y 1{x = 0} + (1 + (β − δ + μ₁ − (λ + λ₁))/(δ − β)) 1{x = 0, y = 0}.

Note that s is positive because μ₁ > λ + λ₁ and δ > β by hypothesis. It follows that LV(x, y) ≤ −f(x, y) + s(x, y). The last step is to verify that

Σ_{x,y} s(x, y) π(x, y) = (1 + (β − δ + μ₁ − (λ + λ₁))/(δ − β)) π(0, 0) + ((μ₁ − (λ + λ₁))/(δ − β)) Σ_{y=1}^∞ (β/δ)^y π(0, y) < ∞

but this holds because β < δ. Applying Theorem 1.10 we get that E_M T_ℓ is of order a₁^ℓ. Again this is not very surprising because the first node behaves like an M|M|1 queue with input rate λ + λ₁ and service rate μ₁. Applying Theorem 2.10 we get that W₂[T_ℓ] grows linearly in ℓ with rate d̃₂/d̃₁.
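The drift vector used in this example can be sanity-checked numerically: multiply each jump rate by h(new)/h(old) with h(x, y) = a₁^x, confirm that the total jump rate is preserved (h harmonic in the interior), and read off the twisted drifts. The rate names and values below are my illustrative reconstruction, not the paper's verbatim notation.

```python
# Two-node example: pair arrivals at rate lam (step (1,1)), individual
# arrivals lam1 (step (1,0)) and lam2 (step (0,1)), services mu1
# (step (-1,0)) and mu2 (step (0,-1)).  Illustrative hypothetical values
# satisfying mu1 > lam + lam1 and lam2 + lam*a1 > mu2.
lam, lam1, lam2, mu1, mu2 = 1.0, 0.5, 0.2, 3.0, 1.0
a1 = mu1 / (lam + lam1)                 # nontrivial root of the twist equation

jumps = {(1, 1): lam, (1, 0): lam1, (0, 1): lam2, (-1, 0): mu1, (0, -1): mu2}

# Twisted rate of a jump (dx, dy) is rate * a1**dx since h(x, y) = a1**x.
twisted = {step: rate * a1 ** step[0] for step, rate in jumps.items()}

# h is harmonic for the free chain iff the total jump rate is unchanged.
assert abs(sum(twisted.values()) - sum(jumps.values())) < 1e-12

d1 = sum(rate * step[0] for step, rate in twisted.items())
d2 = sum(rate * step[1] for step, rate in twisted.items())
print(d1, d2)   # twisted mean drifts of the two components
assert abs(d1 - (mu1 - (lam + lam1))) < 1e-12
assert abs(d2 - (lam2 + lam * a1 - mu2)) < 1e-12
assert d1 > 0 and d2 > 0
```

Both twisted drifts are positive, so under the twist the walk marches away from the boundary in both coordinates, which is the cascade behaviour described in this subsection.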
Acknowledgments. I wish to thank Professors Arie Hordijk, Ian Iscoe and Bob Foley as well as Benoît Beck and Nadine Labreche for their help. Thanks also to the rather harsh but very professional referees who obliged me to clearly set out what is novel. Finally I should like to express my appreciation to François Baccelli at INRIA and Dick Serfozo at Georgia Tech for their hospitality during the completion of this work.
References

Aldous, D. (1989). Probability Approximations via the Poisson Clumping Heuristic. Springer-Verlag, New York.

Athreya, K. B., McDonald, D., Ney, P. (1978). Limit theorems for semi-Markov processes and renewal theory for Markov chains. Ann. Probab. 6, No. 5, 788-797.

Baccelli, F., McDonald, D. (1996). Rare events for stationary processes. Preprint.

Beck, B., Dabrowski, A., McDonald, D. (1997). A unified approach to fast teller queues and ATM. Preprint.

Bonneau, M.-C. (1996). Accelerated simulation of a leaky bucket controller. M.Sc. thesis, University of Ottawa.

Brémaud, P. (1981). Point Processes and Queues: Martingale Dynamics. Springer-Verlag, New York.

Doob, J. L. (1959). Discrete potential theory and boundaries. J. Math. and Mech. 8, 433-458.

Foley, R., McDonald, D. (1997). Join the shortest queue systems: stability and exact asymptotics. Manuscript.

Frater, M. R., Lennon, T. M. and Anderson, B. D. O. (1991). Optimally efficient estimation of the statistics of rare events in queueing networks. IEEE Transactions on Automatic Control 36, No. 12, 1395-1405.

Höglund, T. (1991). The ruin problem for finite Markov chains. Ann. Probab. 19, 1298-1310.
Huang, C.-C., McDonald, D. (1996). Connection admission control for constant bit rate traffic at a multi-buffer multiplexer using the oldest-cell-first discipline. Manuscript.

Iscoe, I. and McDonald, D. (1994a). Asymptotics for exit times of Markov jump processes I. Ann. Probab. 22, No. 1, 372-397.

Iscoe, I. and McDonald, D. (1994b). Asymptotics for exit times of Markov jump processes II: application to Jackson networks. Ann. Probab. 22, No. 4, 2168-2182.

Iscoe, I., McDonald, D., Xian, K. (1993). Capacity of ATM switches. Annals of Applied Probability 3, No. 2, 277-295.

Iscoe, I., McDonald, D., Xian, K. (1994). Asymptotics of the exit distribution for Markov jump processes; application to ATM. Can. J. Math. 46, No. 6, 1238-1262.

Flatto, L., Hahn, S. (1985). Two parallel queues created by arrivals with two demands I. SIAM J. Appl. Math. 44, 1041-1053.

Feller, W. (1971). An Introduction to Probability Theory and Its Applications, Vol. II. John Wiley & Sons, New York.

Glasserman, P., Kou, S.-G. (1995). Analysis of an importance sampling estimator for tandem queues. ACM Transactions on Modeling and Computer Simulation 5, No. 1, 22-42.

Heidelberger, P. (1995). Fast simulation of rare events in queueing and reliability models. ACM Transactions on Modeling and Computer Simulation 5, No. 1, 43-85.

Keilson, J. (1979). Markov Chain Models: Rarity and Exponentiality. Springer-Verlag, New York.

Kesten, H. (1974). Renewal theory for functionals of a Markov chain with general state space. Ann. Probab. 2, No. 3, 355-386.

Labreche, N. (1995). Overloading a Jackson network of shared queues. M.Sc. thesis, University of Ottawa.

Labreche, N., McDonald, D. (1995). Overloading a Jackson network of shared queues. Manuscript.

McDonald, D. (1996). Overloading parallel servers when arrivals join the shortest queue. In Stochastic Networks: Stability and Rare Events (P. Glasserman, K. Sigman and D. Yao, eds.), Lecture Notes in Statistics 117, Springer-Verlag, New York, 169-196.

Meyn, S. P.
and Frater, M. R. (1993). Recurrence times of buffer overloads in Jackson networks. IEEE Trans. Information Theory 39, No. 1, 92-97.

Meyn, S. P. and Tweedie, R. L. (1993). Markov Chains and Stochastic Stability. Springer-Verlag, London.

Ney, P. and Nummelin, E. (1984). Some limit theorems for Markov additive processes. In Proc. of Symposium on Semi-Markov Processes, Brussels.
Ney, P. and Nummelin, E. (1987a). Markov additive processes I. Eigenvalue properties and limit theorems. Ann. Probab. 15, 561-592.

Ney, P. and Nummelin, E. (1987b). Markov additive processes II. Large deviations. Ann. Probab. 15, 593-609.

Nicola, V. F., Shahabuddin, P., Heidelberger, P. and Glynn, P. W. (1992). Fast simulation of steady-state availability in non-Markovian highly dependable systems. IBM Research Report.

Orey, S. (1971). Limit Theorems for Markov Chain Transition Probabilities. Van Nostrand Reinhold, 108 pp.

Parekh, S. and Walrand, J. (1989). A quick simulation method for excessive backlogs in networks of queues. IEEE Trans. Auto. Control 34, No. 1, 54-66.

Shwartz, A., Weiss, A. (1993). Induced rare events: analysis via large deviations and time reversal. Adv. Appl. Prob. 25, 667-689.

Sadowsky, J., Szpankowski, W. (1995). The probability of large queue lengths and waiting times in a heterogeneous multiserver queue I: Tight limits. Adv. Appl. Prob. 27, 532-566.

Wright, P. (1992). Two parallel processors with coupled inputs. Adv. Appl. Prob. 24, 986-1007.

D. McDonald
Department of Mathematics and Statistics
University of Ottawa
Ottawa, Ontario, Canada K1N 6N5
Email:
[email protected]