
Queueing Systems 41, 13–44, 2002. © 2002 Kluwer Academic Publishers. Manufactured in The Netherlands.

Large Deviations Rate Function for Polling Systems

FRANCK DELCOIGNE
Université Paris 10, UFR SEGMI, 200 av. de la République, 92000 Nanterre, France

ARNAUD DE LA FORTELLE [email protected]
INRIA, Domaine de Voluceau, Rocquencourt, BP 105, 78153 Le Chesnay Cedex, France

Received 5 December 2000; Revised 9 December 2001

Abstract. In this paper, we identify the local rate function governing the sample path large deviation principle for a rescaled process $n^{-1}Q_{nt}$, where $Q_t$ represents the joint number of clients at time $t$ in a polling system with $N$ nodes, one server and Markovian routing. Along the way, the large deviation principle is proved and the rate function is shown to have the form conjectured by Dupuis and Ellis. We introduce a so-called empirical generator consisting of $Q_t$ and of two empirical measures associated with $S_t$, the position of the server at time $t$. One of the main steps is to derive large deviations bounds for a localized version of the empirical generator. The analysis relies on a suitable change of measure and on a representation of fluid limits for polling systems. Finally, the rate function is the solution of a meaningful convex program. The method seems to have a wide range of applications, including the famous Jackson networks, as shown at the end of this study. An example illustrates how this technique can be used to estimate the decay rate of stationary probabilities.

Keywords: large deviations, local rate function, polling system, fluid limits, empirical generator, change of measure, contraction principle, entropy, convex program

1. Introduction

The model. Consider a polling system (see figure 1) consisting of $N$ nodes attended by a single server, and denote by $S \stackrel{\text{def}}{=} \{1, \dots, N\}$ the set of nodes. At node $i$, arrivals of clients form a Poisson process with rate $\lambda_i$. Each customer at node $i$ requires a service whose duration is exponentially distributed with parameter $\mu_i$. Let $\rho_i \stackrel{\text{def}}{=} \lambda_i/\mu_i$ be the intensity factor at node $i$. When the server arrives at a busy node, say $i$, it serves one customer and then moves to some node chosen according to a routing matrix $P = (P_{ij})_{i,j\in S}$ with invariant measure $\eta = (\eta_i)_{i\in S}$. If it reaches an empty node, it immediately switches to some other node, still chosen according to $P$. The switch-over time from node $i$ to node $j$, for $i, j \in S$, is exponentially distributed with mean $\tau_{ij}$. All stochastic input sequences (inter-arrival times, services, switch-over times) are assumed to be mutually independent. When the joint number of clients and the position of the server at time 0 are given by $x = (x_1, \dots, x_N)$ and $s$ respectively, $Q(t, x, s) = (q_1(t, x, s), \dots, q_N(t, x, s))$ and $S(t, x, s)$ represent the joint number of clients at each node and the position of the server at time $t$. As a rule, we shall write $S(t, x, s) = i$ if the server is serving some


Figure 1. The polling system under study: one server, Poisson arrivals, exponential service and switch-over times, Markov routing.

customer at node $i$ and $S(t, x, s) = ij$ if it is in transit between nodes $i$ and $j$. Set $S_0 \stackrel{\text{def}}{=} S \cup S^2$, the state space of the server. Then

$$X_{x,s} \stackrel{\text{def}}{=} \big\{\big(Q(t, x, s), S(t, x, s)\big),\ t \ge 0\big\}$$

is a Markov process.

Previous work. A huge literature has been devoted to the study of polling systems because of their wide range of applicability. In [2,11,18], necessary and sufficient conditions of ergodicity have been established for systems with one or several servers under a rich variety of service policies. However, the problem of determining the invariant measure for such systems is still open. Even for limited policies, the mean waiting time can be computed only under symmetry assumptions [3]. The reader is referred to [23] for an overview of polling systems.

In the present paper, a sample path large deviation principle (sample path LDP) is established for the rescaled process

$$Q^n_{x,s} \stackrel{\text{def}}{=} \Big\{\frac{1}{n} Q(nt, [nx], s),\ t \ge 0\Big\}.$$

Let us recall that $\{Q^n_{x,s}, n \ge 1\}$ satisfies a LDP in $D([0, T], \mathbb{R}^N_+)$ with good rate function $I_T(\cdot)$ if, for every $T > 0$, $x \in \mathbb{R}^N_+$ and $s$:

1. For any compact set $C \subset \mathbb{R}^N_+$, $\bigcup_{x\in C} \Phi_x(K)$ is compact in $C([0, T], \mathbb{R}^N_+)$, where

$$\Phi_x(K) = \big\{\varphi \in D([0, T], \mathbb{R}^N_+) : I_T(\varphi) \le K,\ \varphi(0) = x\big\}$$

stands for the level set of $I_T$.


2. For each closed set $F$ of $D([0, T], \mathbb{R}^N_+)$,

$$\limsup_{n\to\infty} \frac{1}{n} \log P\big[Q^n_{x,s} \in F\big] \le -\inf\big\{I_T(\phi),\ \phi \in F,\ \phi(0) = x\big\}.$$

3. For each open set $O$ of $D([0, T], \mathbb{R}^N_+)$,

$$\liminf_{n\to\infty} \frac{1}{n} \log P\big[Q^n_{x,s} \in O\big] \ge -\inf\big\{I_T(\phi),\ \phi \in O,\ \phi(0) = x\big\}.$$

This could be a preliminary step in order to obtain large deviations estimates for the stationary distribution. In view of future applications, our main concern will be to compute the rate function governing the sample path LDP. An application is described in section 3.

This whole program falls into the framework of LDP for Markov processes with discontinuous statistics, i.e., those for which the coefficients of the generator are not spatially continuous. It seems that one of the first papers dealing with such processes is [15], where large deviations problems for Jackson networks were investigated using partial differential equations techniques. Quite recently, the LDP for a large class of Markov processes with discontinuous statistics has been proved in [13]. Roughly speaking, the authors of [13] express the logarithm of large deviation probabilities as the minimal cost of some stochastic optimal-control problem, and the limit of the optimal cost is shown to exist by means of a sub-additivity argument. However, the rate function is not explicit. Note that in [14], an explicit upper bound of large deviations involving Legendre transforms is proved. While for Jackson networks and some processor sharing models this bound is tight [1], in general the problem of the lower bound remains open.

Until now, the identification of the rate function has been carried out in some particular cases and usually for low dimensional systems. General results were obtained in [12,20], where the LDP has been established for random walks whose generator has a discontinuity along a hyperplane. These results are applied in [20] to compute the exponential decay of the stationary distribution of ergodic random walks in $\mathbb{Z}^2_+$. Nevertheless, in such examples, there are at most two boundaries with codimension one or two where discontinuities arise.

In [9], using the contraction principle, the exponential decay of the stationary distribution of the waiting time is computed for a two-dimensional tandem network, taking advantage of the fact that it can be expressed simply as a continuous function of the input processes. It should be noted that in this setting, a sample path LDP for processes with independent increments over infinite intervals of time is needed [10]. Ultimately, the identification of the rate function governing the LDP for Jackson networks has been carried out in [1,19]. In [1], the rate function is computed for a class of Markov processes evolving in $\mathbb{Z}^N_+$ under the assumptions of [13]. Moreover, it is required that some Skorokhod maps associated to the process are regular, as well as other assumptions.

In the present paper, we identify the rate function for polling systems. It is worth emphasizing that in our case, $Q_t$ is not a Markov process. Note that in [13], the LDP is said to hold for Markov driven processes (i.e., $S_t$ is Markov), but our model still


does not fall within that scope, so the polling model does not satisfy the assumptions of [1,13]. Moreover, to our knowledge the regularity of Skorokhod maps associated to polling systems needed in [1] has not been proved, and this issue turns out to be unnecessary for our purpose. Instead, we use a different method based on empirical measures, which is reminiscent of the proof of Sanov's theorem for jump processes (see [4] and [8, chapter 3] for finite state space and discrete time). The use of empirical measures is also a constructive method, allowing a careful description of how large deviations events occur. Besides, it is well suited when fluid limits and ergodicity conditions are known. In section 8, we show briefly how one can recover the results of [1,19] for Jackson networks. Note that this method has been used successfully in [7] to identify the rate function for a model arising in the context of bandwidth sharing. Finally, it is worth noting that the lower and the upper bounds are proved in one step.

Structure of the paper. The paper is organized as follows. In section 4, the notion of localized polling system is discussed as well as fluid limits; the local bounds (2.11) are then restated in a more convenient way using bounds for such systems. In section 5, the empirical generator of a localized polling system is introduced, as well as the entropy $H(\cdot\|R)$ and the local rate function $L(x, D)$. The connections between empirical generator and fluid limits are discussed together with the properties of $H(\cdot\|R)$ and $L(x, D)$. In section 6, for the sake of completeness, the properties of $L(x, D)$ and $I_T(\cdot)$ needed for the proof of the sample path LDP are established. However, this section can be skipped since these properties are not used in the proof of the main result, theorem 2.6. In section 7, we derive large deviations bounds for the empirical generator of a localized polling system. Then, using some kind of contraction principle, the local bounds (2.11) are proved. As a conclusion, we show briefly in section 8 how one can recover the results of [1,19] for Jackson networks by means of the empirical generator.
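To make the model of this section concrete, its transition mechanism (Poisson arrivals, exponential services and switch-over times, Markovian routing, immediate switching at an empty node) can be sketched as a small Gillespie-type simulation. This is only an illustrative sketch: the function name `simulate_polling`, the encoding of the server state as tuples, and the numerical values are assumptions made for the example and do not come from the paper.

```python
import random

def simulate_polling(lam, mu, tau, P, x0, s0, t_end, seed=0):
    """Simulate the polling system up to time t_end (hypothetical helper).

    The server state is ("node", i) while serving at node i, or
    ("move", i, j) while in transit from node i to node j.
    """
    rng = random.Random(seed)
    nodes = range(len(lam))
    x, s, t = list(x0), s0, 0.0

    def route(i):
        # Next node drawn from row i of the routing matrix P.
        return rng.choices(nodes, weights=P[i])[0]

    while True:
        # Arrivals are always possible, whatever the server is doing.
        events = [(lam[i], ("arrival", i)) for i in nodes]
        if s[0] == "node":
            if x[s[1]] > 0:
                events.append((mu[s[1]], ("service", s[1])))
        else:
            # Switch-over from s[1] to s[2] ends at rate 1 / tau.
            events.append((1.0 / tau[s[1]][s[2]], ("arrive", s[2])))
        total = sum(rate for rate, _ in events)
        t += rng.expovariate(total)
        if t > t_end:
            return x, s
        kind, i = rng.choices([e for _, e in events],
                              weights=[r for r, _ in events])[0]
        if kind == "arrival":
            x[i] += 1
        elif kind == "service":
            x[i] -= 1
            s = ("move", i, route(i))
        elif x[i] > 0:   # the server reaches a busy node
            s = ("node", i)
        else:            # empty node: switch immediately to another node
            s = ("move", i, route(i))

# Illustrative run: 3 nodes, cyclic routing, assumed parameter values.
N = 3
cyclic = [[1.0 if j == (i + 1) % N else 0.0 for j in range(N)] for i in range(N)]
tau = [[0.5] * N for _ in range(N)]
x, s = simulate_polling([0.1] * N, [1.0] * N, tau, cyclic, [0] * N, ("node", 0), 1000.0)
print(x, s)
```

The sketch draws the next event among all currently enabled transitions, with probabilities proportional to their rates, which is one standard way of simulating a continuous-time Markov chain; it is not meant as an efficient estimator of rare events.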

2. Informal description of the method

Since there are several steps in order to achieve the main result (theorem 2.6) and the cumulative length of the proofs is large, we intend to show here the path of the proofs and their meaning without too many technicalities. Recall that the main object considered here is the Markov process $X_{x,s}$ with generator $R$ such that

$$Rf(x, s) = \sum_{(y,s')\in\mathbb{Z}^N_+\times S_0} q(x, s; y, s')\big[f(y, s') - f(x, s)\big], \qquad \forall (x, s) \in \mathbb{Z}^N_+ \times S_0,$$

where $f \in B(\mathbb{Z}^N_+ \times S_0)$ and, for all $i, j \in S$,


$$q(x, s; y, s') = \begin{cases} \lambda_i, & \text{if } y = x + e_i,\ s' = s,\\ \mu_i P_{ij}, & \text{if } x_i > 0,\ s = i,\ y = x - e_i,\ s' = ij,\\ 1/\tau_{ji}, & \text{if } x_i > 0,\ s = ji,\ y = x,\ s' = i,\\ P_{il}/\tau_{ji}, & \text{if } x_i = 0,\ s = ji,\ y = x,\ s' = il,\\ 0, & \text{otherwise.} \end{cases}$$

Whenever no confusion arises, the initial state $(x, s)$ will be dropped. Let us now introduce a definition and some notation which will be of constant use in the sequel:

Definition. For every $x = (x_1, \dots, x_N) \in \mathbb{R}^N_+$, denote by $\Lambda(x)$ the set of indices $i$ such that $x_i > 0$. If $\Lambda$ is a subset of $S$, the subset of $\mathbb{R}^N_+$

$$\big\{x \in \mathbb{R}^N_+ \mid x_i > 0,\ \forall i \in \Lambda \text{ and } x_i = 0,\ \forall i \in \Lambda^c\big\}$$

is called face $\Lambda$. We denote by $\mathbb{R}^\Lambda$ the subspace of vectors $D \in \mathbb{R}^N$ with $D_i = 0$ for $i \in \Lambda^c$.

• For any set $A$, $A^c$ denotes its complement and $1_{\{A\}}$ its indicator function;
• for any space $E$, $B(E)$, $M(E)$ and $P(E)$ represent respectively the sets of bounded functions on $E$, of positive measures on $E$ and of probability measures on $E$;
• $D([0, T], \mathbb{R}^N)$ is the space of right continuous functions with left limits $f : [0, T] \to \mathbb{R}^N$, endowed with the Skorokhod metric denoted by $d_d$.

Local bounds, empirical generator and entropy. Following [13], in order to get a sample path LDP for polling systems, the main step is to prove the large deviations local bounds (see figure 2):

    1 inf log P sup Q(t, y) − nx − Dt  < δn lim lim lim inf δ→0 ε→0 n→∞ n |y−nx| 0, ∀i ∈ )(x),

    1 inf log P sup Q(t, y) − nx − Dt  < δn δ→0 ε→0 n→∞ n |y−nx| 0, x ∈ R+ and s.

3. Example: the cyclic polling

In this section, we consider the case of an ergodic polling system with cyclic routing. In this case, the routing matrix satisfies $p_{i,i+1} = 1$ for all $i$ (with $N + 1 \equiv 1$) and $\tau_i$ stands for the mean transfer time between nodes $i$ and $i + 1$. We show how the identification of the rate function $L(x, D)$ could be useful to compute the tail of the distribution of a node, say 1. The proposed algorithm assumes the following conjecture, which remains open at the present time.

Conjecture. The path leading to the saturation of node 1 is a straight line.

Freidlin and Wentzell's theory, as presented in [17], suggests that the tail of the stationary distribution of node 1 is related to $I_T$ by the following formula:

$$\lim_{n\to\infty} \frac{1}{n} \log P[q_1 > n] = -\inf_{T \ge 0}\ \inf_{\varphi} \big\{I_T(\varphi) : \varphi(0) = 0,\ \varphi_1(T) = 1\big\}. \tag{3.1}$$

Although the proof is technical, it is reasonable to argue that the preceding equality holds in our case. The optimization problem (3.1) is infinite dimensional but is reduced, under our conjecture, to

$$\lim_{n\to\infty} \frac{1}{n} \log P[q_1 > n] = -\inf_{\{1\}\subset\Lambda,\ D_1 > 0} \frac{L(\Lambda, D)}{D_1} = -\inf_{G:\ D_1 > 0} \frac{H(G\|R)}{D_1}. \tag{3.2}$$

The reduction of the variational problem (3.1) to a finite-dimensional optimization problem remains open in general when $N \ge 3$. Nonetheless, this reduction holds for two-dimensional reflected random walks [20], and it has been proved recently [16] for a class of reflected Brownian motions whose invariant measure has product form.

Figure 3. The cyclic polling system: the routing is deterministic.


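Table 1. Constraint equations and Lagrange's multipliers.

Constraints                                    Multipliers    Convention
$\sum_j a_{ij} - \sum_j a_{ji} = 0$            $\phi'_i$      $\phi'_1 = 0$
$\sum_j a_{ij} - a_i \ge 0$                    $\phi_i$       $\phi_i = 0,\ \forall i \in \Lambda^c$
$\sum_s \pi_s = 1$                             $-\theta$
$d_i \ge 0$

The optimization$^4$ program (3.2) depends heavily on the face $\Lambda$. Using the constraint equations (2.2)–(2.4) ((2.5) is never active), Lagrange's multipliers as described in table 1 and optimizing w.r.t. $a_i$, $a_{i,i+1}$, $\pi_i$, $\pi_{i,i+1}$, $d_i$ for $i \in \Lambda$, one finds the following set of equations, simplified by using (2.6)–(2.9):

$$\begin{aligned}
(a_i)\qquad & \tilde\lambda_i \tilde\mu_i = \lambda_i \mu_i e^{\phi_i}, && (3.3)\\
(a_{i,i+1})\qquad & \tilde\tau_i^{-1} = \tau_i^{-1} e^{-\phi_i - \phi'_i + \phi'_{i+1}}, && (3.4)\\
(\pi_i)\qquad & \tilde\mu_i = \mu_i - \theta, && (3.5)\\
(\pi_{i,i+1})\qquad & \tilde\tau_i^{-1} = \tau_i^{-1} - \theta, && (3.6)\\
(d_i)\qquad & \tilde\lambda_i = \lambda_i \ \text{ or } \ \tilde\lambda_i = Z, \quad \forall i \in \Lambda \setminus \{1\}, && (3.7)\\
(d_1)\qquad & \sum_{i\in S} (\tilde\lambda_i - \lambda_i) = \theta, && (3.8)
\end{aligned}$$

with $H/D_1 = \log(\tilde\lambda_1/\lambda_1)$ and $Z$ given by

$$Z = \Bigg[1 - \sum_{i\notin\Lambda} \frac{\lambda_i \mu_i}{(\mu_i - \theta)^2}\Bigg] \Bigg[\sum_{i\in\Lambda} \frac{1}{\mu_i - \theta} + \sum_{i\in S} \frac{1}{\tau_i^{-1} - \theta}\Bigg]^{-1}. \tag{3.9}$$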

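We did not mention Lagrange's multipliers for the constraints $d_i \ge 0$ since the result (3.7) of the optimization is very simple. Due to (3.7), for each queue $i$, there are three possible behaviors: either non-saturation ($i \notin \Lambda$ and $d_i = 0$), or light saturation ($i \in \Lambda$ and $d_i = 0$, i.e. the queue does not grow linearly), or a positive linear increase ($i \in \Lambda$ and $d_i > 0$). Note that $d_1 > 0$ implies that queue 1 is in the third case. One obtains a polynomial in $\theta$ for each case by simplifying the rational equation:

$$\prod_{i\in\Lambda} \frac{\tilde\lambda_i \tilde\mu_i}{\lambda_i \mu_i}\ \prod_{i\in S} (1 - \theta\tau_i) = 1. \tag{3.10}$$

Partitioning $\Lambda$ into $\{1\}$, $\Lambda_1$ (for which $\tilde\lambda_i = \lambda_i$) and $\Lambda_2$ (for which $\tilde\lambda_i = Z$), (3.10) becomes

$$1 = \frac{\tilde\lambda_1}{\lambda_1} \prod_{i\in\Lambda_1} \frac{\mu_i - \theta}{\mu_i} \prod_{i\in\Lambda_2} \frac{Z(\mu_i - \theta)}{\lambda_i \mu_i} \prod_{i\in S} (1 - \theta\tau_i), \tag{3.11}$$

$$\tilde\lambda_1 = \lambda_1 + \theta + \sum_{i\in\Lambda_2} (\lambda_i - Z) - \sum_{i\notin\Lambda} \frac{\lambda_i \theta}{\mu_i - \theta}. \tag{3.12}$$

4 Due to the denominator, this is no longer a convex program.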


The algorithm now appears clearly:

Algorithm 1. Calculation of the optimal $\theta^*$ (decay rate).
 1: $\theta^* \leftarrow \mu_1$
 2: for all partitions $(\Lambda^c, \Lambda_1, \Lambda_2)$ do
 3:   Calculate equation (3.11)
 4:   Substitute $Z$ (3.9) and $\tilde\lambda_1$ (3.12) in (3.11)
 5:   Simplify the rational equation into a polynomial $P(\theta)$
 6:   Find the two roots $\theta_1 < \theta_2$ of $P$ belonging to $[0, \min\{\mu_i, \tau_i^{-1}\}]$
 7:   Calculate $Z(\theta_2)$ (3.9)
 8:   if $\lambda_i \mu_i < (\mu_i - \theta_2) Z$ for all $i \in \Lambda^c$ and $\lambda_i > Z$ for all $i \in \Lambda_1$ then
 9:     $\theta^* \leftarrow \min\{\theta^*, \theta_2\}$
10:   end if
11: end for

Note that the drift of the first queue is $D_1 = \tilde\lambda_1 - Z$. Among the two roots $\theta_1 < \theta_2$ (step 6), the first corresponds to a negative drift and the second to a positive drift (this is why there are exactly two roots). The condition of step 8 ensures that the queues $i \in \Lambda^c$ are indeed ergodic and that the queues $i \in \Lambda_1$ are strongly saturated.

Figure 4 shows the result of these calculations for a homogeneous cyclic polling system: $\lambda_i = \lambda = 0.1$, $\mu_i = \mu = 1$, $N = 5$ and $\tau$ varies in $[0, 1]$. The load $\rho = N\lambda(1/\mu + \tau)$ is the ratio between arrivals and departures; the system is ergodic if, and only if, $\rho < 1$. The large deviation estimate is $e^{-\theta^*} = \lambda_1/\tilde\lambda_1$. We compare these two quantities with an estimator of the decay rate for stationary distributions obtained by simulation. The decay rate is well described by our large deviation estimate. A refined approach would be to use the optimal generator as a change of measure for importance sampling. This should improve drastically the accuracy of simulation estimates, but it leads to much more complex programming.

Figure 4. Results from theoretical calculations and simulation.

Note that the search for the optimal $\theta^*$ implies that we have to test each queue (except queue 1) with the three possible behaviors: non-saturated ($\Lambda^c$), lightly saturated ($\Lambda_2$) or strongly saturated ($\Lambda_1$). This means $3^{N-1}$ possibilities. However, in the homogeneous cyclic polling, queues are exchangeable, so that there are "only" $N(N + 1)/2$ configurations. It seems worth mentioning that there is only one optimal configuration in all the cases we have treated by solving the equations numerically: queue 1 is strongly saturated and all other queues are lightly saturated.
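The two elementary counts used in this example can be checked directly. The sketch below evaluates the load $\rho = N\lambda(1/\mu + \tau)$ for the homogeneous parameters of figure 4 and enumerates the behaviors of queues $2, \dots, N$, first as labelled assignments and then up to exchanging queues. The function names and the chosen values of $\tau$ are assumptions made for the illustration; the code does not implement Algorithm 1 itself.

```python
from itertools import product

def load(n_nodes, lam, mu, tau):
    """Load rho = N * lam * (1/mu + tau) of the homogeneous cyclic system."""
    return n_nodes * lam * (1.0 / mu + tau)

def count_configurations(n_nodes):
    """Count the behaviours of queues 2..N (queue 1 is always strongly saturated).

    Returns the number of labelled assignments and the number of
    assignments up to exchanging queues (homogeneous case).
    """
    labels = ("non-saturated", "lightly saturated", "strongly saturated")
    assignments = list(product(labels, repeat=n_nodes - 1))
    # Exchangeable queues: only the number of queues per behaviour matters.
    up_to_exchange = {tuple(sorted(a)) for a in assignments}
    return len(assignments), len(up_to_exchange)

N, lam, mu = 5, 0.1, 1.0
for tau in (0.0, 0.5, 0.9):
    rho = load(N, lam, mu, tau)
    print(f"tau={tau}: rho={rho:.2f}, ergodic={rho < 1}")

total, reduced = count_configurations(N)
print(total, reduced)
```

For $N = 5$ the enumeration gives $3^{4} = 81$ labelled assignments and $5 \cdot 6 / 2 = 15$ configurations up to exchange, in agreement with the counts quoted in the text.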

4. Localized polling systems and fluid limits

4.1. Localized polling systems

The choice of $x$, $D$ and $\tau$ in theorem 2.6 ensures that the bounds (2.11) will not depend on the transition mechanism when some components indexed by $\Lambda(x)$ are null. This allows one to restate (2.11) in terms of bounds for a local model without boundary conditions on the nodes $i \in \Lambda(x)$. More precisely, for each subset $\Lambda$ of $S$, let us define the process

$$X^\Lambda_{x,s} \stackrel{\text{def}}{=} \big\{\big(Q^\Lambda(t, x, s), S^\Lambda(t, x, s)\big),\ t \ge 0\big\},$$

where $Q^\Lambda(t, x, s) = (q^\Lambda_1(t, x, s), \dots, q^\Lambda_N(t, x, s))$ and $S^\Lambda(t, x, s)$ represent the joint number of clients at each node and the position of the server at time $t$, and where $(x, s)$ stands for the initial state. The transition mechanism defining the evolution of $X^\Lambda$ is identical to that of $X$, except that there is no boundary condition on the nodes $i \in \Lambda$: the components $q^\Lambda_i(t, x, s)$ for $i \in \Lambda$ may be negative. $X^\Lambda$ is then called a localized polling system. All

Figure 5. The localization procedure.


the notation defined for polling systems in section 1 is adapted in a straightforward way for localized polling systems. Note that if $Q^\Lambda_{\Lambda^c}(t, x, s)$ denotes $\{q^\Lambda_i(t, x, s),\ i \in \Lambda^c\}$, then

$$\big\{\big(Q^\Lambda_{\Lambda^c}(t, x, s), S^\Lambda(t, x, s)\big),\ t \ge 0\big\} \tag{4.1}$$

is a Markov process.

Definition. By an abuse of notation, one says that $X^\Lambda_{x,s}$ is ergodic if the Markov process (4.1) is ergodic.

Take $\tau$ satisfying $x_i + D_i\tau > 0$, $\forall i \in \Lambda(x)$. Then for $\varepsilon$ and $\delta$ sufficiently small and all $y$ with $|y - nx| < \varepsilon n$,

$$P\Big[\sup_{t\in[0,n\tau]} \big\|Q(t, y) - nx - Dt\big\| < \delta n\Big] = P\Big[\sup_{t\in[0,n\tau]} \big\|Q^{\Lambda(x)}(t, y) - nx - Dt\big\| < \delta n\Big].$$

Indeed, on the event considered, the components indexed by $\Lambda(x)$ do not vanish. Since $x_i = 0$, $\forall i \in \Lambda(x)^c$, the distribution of $X^\Lambda_{y,s}$ is invariant with respect to the shift $nx$. Then, one can restate the local bounds (2.11) in terms of local bounds for $X^{\Lambda(x)}_{0,s}$.

Proposition 4.1. Let $x \in \mathbb{R}^N_+$ and $D \in \mathbb{R}^N$ such that $D_i = 0$, $\forall i \in \Lambda(x)^c$. Then, for $\tau$ satisfying $x_i + D_i\tau > 0$, $\forall i \in \Lambda(x)$,

    1 inf log P sup Q(t, y) − nx − Dt  < δn lim lim lim inf δ→0 ε→0 n→∞ nτ |y−nx| 0;

• $I_p(\nu\|\lambda)$ is null if, and only if, $\nu = \lambda$; for all $\lambda$, it has compact level sets in $\nu$.

We then recall the form of the entropy function for generators, as given in definition 2.3, and its properties:

$$H(G\|R) \stackrel{\text{def}}{=} \sum_{i\in S} \Big[I_p(\tilde\lambda_i\|\lambda_i) + \pi_i I_p(\tilde\mu_i\|\mu_i)\Big] + \sum_{i,j\in S} \pi_{ij} I_p\big(\tilde\tau_{ij}^{-1}\big\|\tau_{ij}^{-1}\big) + H_d(A\|P).$$

The relative entropy $H(\cdot\|R)$ is positive, finite, continuous and strictly convex on $G^\Lambda_s(P)$; it is infinite otherwise; it is null if, and only if, $G = R$. It has compact level sets.

Proof. All those properties are immediately derived from the same properties of $H_d$ and $I_p$, under the condition that $\mu_i$, $\lambda_i$ and $\tau_{ij}$ are strictly positive and finite, and that $P$ is irreducible, conditions that are naturally satisfied here.

Lemma 5.4. The set of irreducible generators is a dense subset of $G^\Lambda_s(P)$.

Proof. Let $G \in G^\Lambda_s(P)$. For $\varepsilon > 0$, $(1 - \varepsilon)G + \varepsilon R$ is irreducible because $R$ is irreducible. It converges to $G$ when $\varepsilon$ tends to 0, hence the lemma.

6. Properties of the rate functions $L(x, D)$ and $I_T(\cdot)$

6.1. Properties of the local functional $L(x, D)$

We recall the definition of the local functional (see definition 2.5):

$$L(\Lambda, D) \stackrel{\text{def}}{=} \inf_{G\in f_\Lambda^{-1}(D)} H(G\|R), \qquad \forall D \in \mathbb{R}^\Lambda,$$

where $f_\Lambda : G^\Lambda_s \to \mathbb{R}^\Lambda$ is the projection $f_\Lambda(G) = D$. The properties are proved here.

Proposition 6.1. The rate function $L(\Lambda, D)$ is positive, finite, continuous and strictly convex with respect to $D \in \mathbb{R}^\Lambda$; it is null if, and only if, $D$ is the drift of the localized polling system; it has compact level sets. Moreover, the infimum over $G \in G^\Lambda_s$ such that $H(G\|R) = L(\Lambda, D)$ is reached at a unique point $G(D) \in G^\Lambda_s(P)$, and $G(D)$ is a continuous function of $D$.

Proof. Since $f_\Lambda$ is linear and $H(\cdot\|R)$ is strictly convex, by [4, proposition 4.8], $L(\Lambda, \cdot)$ is convex. The compactness of the level sets is immediately derived from the compactness of the level sets of $H(\cdot\|R)$. Since $H(\cdot\|R)$ is strictly convex, since its level sets are compact and since $L(\Lambda, D)$ is the solution of a convex program, the infimum $G(D)$ is reached at a unique point, which is obviously in $G^\Lambda_s(P)$ where $H$ is finite.

Let $D_n \to D$ as $n \to \infty$. Since $H(\cdot\|R)$ has compact level sets and since it is continuous on $G^\Lambda_s(P)$, it is easily seen that the sequence $\{G(D_n), n \ge 1\}$ is relatively compact. Let $\{G(D_{n_k}), k \ge 1\}$ be a subsequence such that $G(D_{n_k}) \to G$ as $k \to \infty$. $H(\cdot\|R)$ being continuous,

$$L(\Lambda, D_{n_k}) = H\big(G(D_{n_k})\|R\big) \xrightarrow[k\to\infty]{} H(G\|R).$$

By definition of $L$, $H(G\|R) \ge L(\Lambda, D)$. Now $G(D) = (A, \pi, D)$, and consider $G_{n_k} = (A, \pi, D_{n_k})$. Since $H$ is lower semi-continuous and since $G_{n_k}$ tends to $G(D)$,

$$L(\Lambda, D) = H\big(G(D)\|R\big) = \lim_{k\to\infty} H(G_{n_k}\|R) \ge \lim_{k\to\infty} H\big(G(D_{n_k})\|R\big) = H(G\|R).$$

Then, $H(G\|R) = H(G(D)\|R)$, and by the uniqueness of the infimum, $G = G(D)$. So $G(D_n) \to G(D)$ as $n \to \infty$. Finally $G(D)$ is continuous and so is $L(\Lambda, D)$.

Remark. We shall extend definition 2.5 to all $D \in \mathbb{R}^N$ by extending the validity of equation (2.10). First, extend the set $G^\Lambda_s$ of generators to

$$\bar G^\Lambda_s \stackrel{\text{def}}{=} \big\{(A, \pi, D) \in M^\Lambda_s(S_0) \times P(S_0) \times \mathbb{R}^N,\ a_i + D_i \ge 0,\ \forall i\big\}.$$

This actually means that all localized generators are considered, not only the stable ones. The projection $\bar f_\Lambda : \bar G^\Lambda_s \to \mathbb{R}^N$ is the analog of $f_\Lambda$, and now

$$L(\Lambda, D) \stackrel{\text{def}}{=} \inf_{G\in\bar f_\Lambda^{-1}(D)} H(G\|R)$$

is defined for $D \in \mathbb{R}^N$. Of course the definitions coincide for $D \in \mathbb{R}^\Lambda$. One checks that this extension preserves all the properties proved here, particularly the lower semi-continuity and the convexity, which is convenient for the proof of the sample path LDP. Moreover, as underlined in remark 2, these values are almost never used, so that finally this extension does not modify the rate function $I_T$.

Lemma 6.1. $L(\Lambda, D)$ is increasing w.r.t. $\Lambda$. $L(x, D)$ is jointly lower semi-continuous w.r.t. $x$ and $D$.

Proof. By definition 2.5, $H(G\|R)$ does not depend on $\Lambda$ and the domain $f_\Lambda^{-1}(D)$ where the infimum is taken decreases w.r.t. $\Lambda$; therefore $L(\Lambda, D)$ is increasing w.r.t. $\Lambda$. Actually $(A, \pi, D) \in f_\Lambda^{-1}(D)$ if, and only if, $\pi \in P(S_0)$ and $A \in M^\Lambda_s(S_0)$ with $a_i + D_i \ge 0$ if $i \in \Lambda$, so that, by (5.5), it decreases w.r.t. $\Lambda$.

Let $(x_n, D_n)$ tend to $(x, D)$. Since all strictly positive coordinates of $x$ have to be strictly positive for $n$ large enough,

$$\exists n_0, \qquad \Lambda(x) \subset \Lambda(x_n), \qquad \forall n \ge n_0.$$

Therefore, using both the continuity w.r.t. $D$ and the growth of $L(\cdot, D)$,

$$\liminf_{n\to\infty} \big[L(x_n, D_n) - L(x, D)\big] \ge \liminf_{n\to\infty}\ \min_{\Lambda\supset\Lambda(x)} \big[L(\Lambda, D_n) - L(\Lambda, D)\big] + \liminf_{n\to\infty} \big[L(x_n, D) - L(x, D)\big] \ge 0.$$

Hence the lower semi-continuity with respect to both variables $x$ and $D$.



Lemma 6.2. There exists $M \in \mathbb{R}$ such that

$$L(x, D) \ge \frac{1}{2} \|D\| \log \|D\|, \qquad \forall x \in \mathbb{R}^N_+,\ \forall \|D\| \ge M. \tag{6.1}$$

Proof. Assume that $|D_j| = \|D\| = \max_i |D_i|$. From the definitions of $L(x, D)$ (definition 2.5), of $H(G\|R)$ (definition 2.3) and from the positivity of $I_p$ and $H_d$,

$$L(x, D) \ge \inf_{G\in f_\Lambda^{-1}(D)} \Big[I_p(a_j + D_j\|\lambda_j) + I_p(a_j\|\pi_j\mu_j)\Big].$$

But $I_p(x\|y)$ is increasing in $x$ and decreasing in $y$ for $x \ge y$, and necessarily $a_j \ge -D_j$, so that

$$L(x, D) \ge \min\big\{I_p(|D_j|\,\|\lambda_j),\ I_p(|D_j|\,\|\mu_j)\big\}, \qquad \forall |D_j| \ge \max\{\lambda_j, \mu_j\}.$$

Now $I_p(x\|y) \sim x \log x$ when $x$ tends to infinity, so that there exists $M(y)$ such that

$$I_p(x\|y) \ge \frac{1}{2} x \log x, \qquad \forall x \ge M(y).$$

Therefore (6.1) holds with

$$M = \max_{i\in S} \big\{M(\lambda_i), M(\mu_i), \lambda_i, \mu_i\big\}.$$

The proof is concluded.
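The growth estimate used in this proof can be illustrated numerically. The sketch below assumes that $I_p$ is the usual Poisson rate function $I_p(x\|y) = x \log(x/y) - x + y$ (its definition is not restated in this section, so this form is an assumption), and uses the explicit threshold $M(y) = \max\{1, (e y)^2\}$, which is a sufficient but not minimal choice derived for this example rather than taken from the paper: for $x \ge (ey)^2$ one has $\tfrac12 \log x \ge 1 + \log y$, hence $x\log(x/y) - x + y \ge \tfrac12 x \log x$.

```python
import math

def poisson_rate(x, y):
    """Assumed form I_p(x||y) = x log(x/y) - x + y, with I_p(0||y) = y."""
    if x == 0:
        return y
    return x * math.log(x / y) - x + y

def threshold(y):
    """A sufficient (not minimal) M(y), chosen for this illustration."""
    return max(1.0, (math.e * y) ** 2)

for y in (0.1, 1.0, 5.0):
    m = threshold(y)
    # Check I_p(x||y) >= (1/2) x log x on a grid of points x >= M(y).
    grid = [m * (1 + 0.5 * k) for k in range(200)]
    ok = all(poisson_rate(x, y) >= 0.5 * x * math.log(x) for x in grid)
    print(f"y={y}: M(y)={m:.3f}, bound holds on grid: {ok}")
```

A grid check of this kind is of course not a proof; it only confirms that the inequality behaves as claimed for the assumed form of $I_p$.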



6.2. Properties of the rate function $I_T(\cdot)$

For the sake of completeness, the properties of $I_T(\cdot)$ needed for the proof of the sample path LDP are stated. Recall that

$$\Phi_x(K) = \big\{\varphi \in D([0, T], \mathbb{R}^N_+) : I_T(\varphi) \le K,\ \varphi(0) = x\big\}.$$

Proposition 6.3.

1. Assume $I_T(\varphi) \le K$ for some $K$. Then, for all $\varepsilon > 0$, there exists $\delta > 0$ independent of $\varphi$ such that for any collection of non-overlapping intervals $[t_j, t_{j+1}]$ in $[0, T]$ with $\sum_j (t_{j+1} - t_j) = \delta$,

$$\sum_j \big\|\varphi(t_{j+1}) - \varphi(t_j)\big\| \le \varepsilon.$$

2. $I_T(\cdot)$ is lower semi-continuous in $(D([0, T], \mathbb{R}^N_+), d_d)$.

3. For any compact set $C \subset \mathbb{R}^N_+$, $\bigcup_{x\in C} \Phi_x(K)$ is compact in $C([0, T], \mathbb{R}^N_+)$.

4. Let $\varphi \in AC([0, T], \mathbb{R}^N_+)$ with $I_T(\varphi) < \infty$. Then, for all $\varepsilon > 0$, there exists $\varphi_\varepsilon \in PL([0, T], \mathbb{R}^N_+)$ such that:
(a) $d_c(\varphi_\varepsilon, \varphi) \le \varepsilon$,
(b) $I_T(\varphi_\varepsilon) \le I_T(\varphi) + \varepsilon$.

Proof. As in lemma 5.18 of [22], one can prove (i) using lemma 6.2. By (i), in order to prove the lower semi-continuity of $I_T(\cdot)$, it is sufficient to consider sequences of absolutely continuous functions. Since on $C([0, T], \mathbb{R}^N_+)$ the metrics $d_c$ and $d_d$ are equivalent, one can use $d_c$. Now, using lemma 6.2, since $L(x, D)$ is lower semi-continuous in $(x, D)$ and convex with respect to $D$ (see lemma 6.1 and proposition 6.1), theorem 3 of section 9.1.4 of [21] yields (ii). (iii) is a consequence of (i) and (ii) (see [22, proposition 5.46]). For a complete proof of (iv), the reader is referred to [6, appendix 2].

7. Large deviations for the empirical generator

It is to be stressed that the following theorem 2.4 relies on the one-to-one mapping between empirical generators and polling systems expressed in proposition 5.2, which itself depends on the linear description of the fluid limit when the polling system starts from an empty state (see theorem 4.3). Thus, we shall assume throughout this section that the queues are initially empty$^6$ ($Q^\Lambda(0) = 0$). Now, by proposition 4.1, the problem of large deviations has been reduced precisely to this case. We shall prove in this section a LDP for the combination of the empirical generator and the uniform variable

$$\Big\{\sup_{s\in[0,t]} \big\|Q^\Lambda(s) - sD\big\| < t\delta\Big\},$$

defined for any $D \in \mathbb{R}^\Lambda$. We shall also define the projection $f_\Lambda : G^\Lambda_s \to \mathbb{R}^\Lambda$ by $f_\Lambda(G) = D$, and the ball $B(G, r)$ of center $G$ and radius $r$. Note that $f_\Lambda(B(G, r)) = B(D, r)$, the second ball being in $\mathbb{R}^N$, since the distance is a max.

6 For this reason, the initial condition $(0, s)$ in $Q^\Lambda(t, 0, s)$ will be dropped.


Theorem 7.1 (Generator's local bounds). Let $\Lambda$ be a face and $G = (\pi, A, D) \in G^\Lambda_s$. Then,

$$\begin{aligned}
& \lim_{\delta\to 0} \liminf_{t\to\infty} \frac{1}{t} \log P\Big[G_t \in B(G, \delta),\ \sup_{s\in[0,t]} \big\|Q^\Lambda(s) - sD\big\| < t\delta\Big] && (7.1)\\
&\quad = \lim_{\delta\to 0} \limsup_{t\to\infty} \frac{1}{t} \log P\Big[G_t \in B(G, \delta),\ \sup_{s\in[0,t]} \big\|Q^\Lambda(s) - sD\big\| < t\delta\Big] && (7.2)\\
&\quad = -H(G\|R).
\end{aligned}$$

The proof will be done in four steps: first, a change of measure, so as to treat only neighborhoods of a generator $G$; second, the lower and upper local LDP bounds for particular generators; third, a continuity argument, in order to extend the bounds to all generators; fourth, the exponential tightness of the measures.

7.1. Exponential change of measure

Fix $G \in G^\Lambda_s$. In this section the bijection of proposition 5.2 is used to describe $G$ indifferently as $G = (A, \pi, D)$ or as $G = (\tilde\lambda_i, \tilde\mu_i, \tilde\tau_{ij}, \tilde P_{ij})$. In the next section, fluid limit results will be used. Since these results are restricted to irreducible polling systems, the generator $G$ will henceforth be assumed to be irreducible. Secondly, in order for the mapping $h_2$ to be properly defined (see below), $\tilde P_{ij}$ is assumed to be null when $P_{ij}$ is, i.e. $G \in G^\Lambda_s(P)$.

The first step is to reduce the problem to neighborhoods of $G$. For this purpose we introduce a change of measure. Consider a localized polling system $X^\Lambda$ with generator $R = (\lambda_i, \mu_i, \tau_{ij}, P_{ij})$. Then define

• the vector $\alpha \in \mathbb{R}^N$ by $\alpha_i = \log(\tilde\lambda_i/\lambda_i)$, for all $i \in S$;

• the mapping $h_1 : S_0 \to \mathbb{R}$ by
$$h_1(i) \stackrel{\text{def}}{=} \alpha_i + \log\frac{\tilde\mu_i}{\mu_i}, \qquad h_1(ij) \stackrel{\text{def}}{=} \log\frac{\tau_{ij}}{\tilde\tau_{ij}}, \qquad \text{for all } i, j \in S;$$

• the mapping $h_2 : S_0 \to \mathbb{R}$ by
$$h_2(ij) \stackrel{\text{def}}{=} \log\frac{\tilde P_{ij}}{P_{ij}}, \qquad h_2(i) \stackrel{\text{def}}{=} 0, \qquad \text{for all } i, j \in S;$$

• the compensator $K : S_0 \to \mathbb{R}$ by
$$K(i) \stackrel{\text{def}}{=} \sum_{k\in S} (\tilde\lambda_k - \lambda_k) + \tilde\mu_i - \mu_i, \qquad K(ij) \stackrel{\text{def}}{=} \sum_{k\in S} (\tilde\lambda_k - \lambda_k) + \tilde\tau_{ij}^{-1} - \tau_{ij}^{-1}, \qquad \text{for all } i, j \in S;$$


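• and the process $E^\Lambda_t$ by
$$E^\Lambda_t \stackrel{\text{def}}{=} \exp\Bigg\{\big\langle\alpha, Q^\Lambda(t, x, s) - x\big\rangle + \sum_{i=0}^{N^\Lambda_t - 1} \Big[h_1\big(S^\Lambda_i\big) + h_2\big(S^\Lambda_{i+1}\big)\Big] - \int_0^t K\big(S^\Lambda_v\big)\,dv\Bigg\}.$$

Since $K$ has been exactly defined so that$^7$

$$K(s) = \frac{d}{dt} E\Bigg[\exp\Bigg\{\big\langle\alpha, Q^\Lambda(t, x, s) - x\big\rangle + \sum_{i=0}^{N^\Lambda_t - 1} \Big[h_1\big(S^\Lambda_i\big) + h_2\big(S^\Lambda_{i+1}\big)\Big]\Bigg\}\Bigg]_{t=0},$$

it is easily checked that the derivative of $E[E^\Lambda_t]$ at $t = 0$ is null. Then, using the Markov property, one gets that the derivative is null for all $t \ge 0$, so that $E[E^\Lambda_t] = 1$. Using the Markov property again, this proves that $E[E^\Lambda_t \mid \mathcal{F}_s] = E^\Lambda_s$ for all $t \ge s \ge 0$, hence $\{E^\Lambda_t, t \ge 0\}$ is a martingale w.r.t. the natural filtration $\mathcal{F}_t$. Then define a new probability measure by

$$\tilde P[B] \stackrel{\text{def}}{=} E\big[1_{\{B\}} E^\Lambda_t\big], \qquad \forall B \in \mathcal{F}_t.$$

It is a matter of routine to show that under $\tilde P$, $X^\Lambda$ is again a Markov process. Moreover, if $q^\Lambda(x, s; y, s')$ denotes the intensity of jump from $(x, s)$ to $(y, s')$ of $X^\Lambda$ under $P$, then the generator of $X^\Lambda$ under $\tilde P$ is given by

$$\sum_{y,s'} q^\Lambda(x, s; y, s')\, e^{\langle\alpha, y - x\rangle + 1_{\{s \ne s'\}}(h_1(s) + h_2(s'))} \big[f(y, s') - f(x, s)\big].$$

Hence under $\tilde P$, $X^\Lambda$ depicts the evolution of a localized polling system without boundary conditions on the nodes belonging to $\Lambda$. The arrival rate at node $i$ is $\lambda_i e^{\alpha_i} = \tilde\lambda_i$, whereas the service rate is equal to $\mu_i e^{-\alpha_i + h_1(i)} = \tilde\mu_i$. The intensity of switch-over time and the probability of routing between $i$ and $j$ are respectively given by $\tau_{ij}^{-1} e^{h_1(ij)} = \tilde\tau_{ij}^{-1}$ and $P_{ij} e^{h_2(ij)} = \tilde P_{ij}$. This is the change of measure we wanted.

7.2. Upper and lower bounds

The ball $B(G, \delta)$ is precisely defined by

$$B(G, \delta) \stackrel{\text{def}}{=} \big\{G' \in G(P) : \|G' - G\|_\infty < \delta\big\}.$$

Define the continuous mapping $\phi : G(P) \to \mathbb{R}$ by

$$\phi(G') \stackrel{\text{def}}{=} \langle\alpha, D'\rangle + \sum_{s\in S_0} a'_s h_1(s) + \sum_{i,j\in S} a'_{ij} h_2(ij) - \sum_{s\in S_0} \pi'_s K(s). \tag{7.3}$$

7 Note that the derivative is independent of $\Lambda$ and $x$, so that they are dropped.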


Simple manipulations using (5.6)–(5.9) yield φ(G) = H (GR). Note that φ is finite everywhere as soon as H (GR) is finite, since then there are no infinite terms in (7.3). The change of measure has been done so that log Et) = tφ(Gt ). For the sake of brevity, we shall denote by Et (G, δ) the event     def Et (G, δ) = Gt ∈ B(G, δ), sup Q) (s) − sD  < tδ . s∈[0,t ]

Applying the change of measure and the previous relation yields   −1 1 1 log P Et (G, δ) = log  E Et) 1Et (G,δ) t t    1 P Et (G, δ) .  − sup φ G + log  t G ∈B(G,δ)

(7.4)

Now, it has been shown through fluid limits that, under $\tilde{P}$, $G_t$ converges to $G$ in probability and $Q^\Lambda(s, 0)$ converges to $sD$ uniformly for $s \in [0, t]$, so that the probability $\tilde{P}[E_t(G, \delta)]$ converges to 1. Moreover, by the continuity of $\phi$, the supremum in the first term of (7.4) tends to $\phi(G) = H(G\|R)$ as $\delta \to 0$, hence the local lower bound,

$$\lim_{\delta \to 0} \liminf_{t \to \infty} \frac{1}{t} \log P\bigl[E_t(G, \delta)\bigr] \ge -H(G\|R). \tag{7.5}$$

Then, reversing the inequality in (7.4) yields

$$\frac{1}{t} \log P\bigl[E_t(G, \delta)\bigr] \le - \inf_{G' \in B(G,\delta)} \phi(G') + \frac{1}{t} \log \tilde{P}\bigl[E_t(G, \delta)\bigr],$$

which turns into the local upper bound by the same argument,

$$\lim_{\delta \to 0} \limsup_{t \to \infty} \frac{1}{t} \log P\bigl[E_t(G, \delta)\bigr] \le -H(G\|R). \tag{7.6}$$
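The change-of-measure step behind (7.4) can be illustrated on the simplest possible case, a single Poisson process whose rate $\lambda$ is tilted to $\lambda e^{\alpha}$. The sketch below is not the polling-system construction of the paper: it is a one-dimensional toy, with hypothetical function names, in which the log-density is $\alpha N_t - t\lambda(e^{\alpha} - 1)$. It checks numerically that $P[N_t = k] = e^{-\log\text{density}}\,\tilde{P}[N_t = k]$ and that the normalised log-density evaluated on the typical tilted path equals the Poisson relative entropy $\nu \log(\nu/\lambda) - \nu + \lambda$.

```python
import math

def poisson_pmf(k, mean):
    """P[N = k] for a Poisson variable with the given mean (computed in log space)."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def log_density(k, lam, alpha, t):
    """log(dP~/dP) on {N_t = k} when the rate lam is tilted to lam * exp(alpha).

    Toy analogue of log E_t = t * phi(G_t); not the paper's polling density.
    """
    return alpha * k - t * lam * (math.exp(alpha) - 1.0)

def entropy_ip(nu, lam):
    """Poisson relative entropy nu log(nu/lam) - nu + lam."""
    return nu * math.log(nu / lam) - nu + lam

lam, alpha, t = 2.0, 0.7, 5.0          # illustrative values only
lam_tilde = lam * math.exp(alpha)      # tilted arrival rate

# Change of measure: P[N_t = k] = exp(-log_density) * P~[N_t = k] for every k.
max_gap = max(
    abs(poisson_pmf(k, lam * t)
        - math.exp(-log_density(k, lam, alpha, t)) * poisson_pmf(k, lam_tilde * t))
    for k in range(60)
)

# On the typical tilted path N_t / t = lam_tilde, so phi equals the relative entropy.
phi_at_typical = log_density(lam_tilde * t, lam, alpha, t) / t

print(max_gap, phi_at_typical, entropy_ip(lam_tilde, lam))
```

In this toy setting the tilted probability of staying near the typical path tends to 1, so only the density term contributes to the exponential rate, which is the mechanism used to obtain (7.5) and (7.6).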

Now the irreducibility assumption will be relaxed. First note that the upper bound (7.6) is obtained by bounding the probability $\tilde{P}[E_t(G, \delta)]$ by 1, thus it is valid for reducible generators in $\mathcal{G}_s^\Lambda(P)$. By lemma 5.4, the set of irreducible generators is dense in $\mathcal{G}_s^\Lambda(P)$. For any $G \in \mathcal{G}_s^\Lambda(P)$ and any $\delta > 0$, it is possible to find an irreducible $G' \in \mathcal{G}_s^\Lambda(P)$ whose distance to $G$ is less than $\delta$. Hence there exists $\delta' > 0$ such that $B(G', \delta') \subset B(G, \delta)$. Therefore

$$\liminf_{t \to \infty} \frac{1}{t} \log P\bigl[E_t(G, \delta)\bigr] \ge \liminf_{t \to \infty} \frac{1}{t} \log P\bigl[E_t(G', \delta')\bigr] \ge -H(G'\|R).$$

Since this is true for any $G'$ close to $G$, by continuity of $H$ on $\mathcal{G}_s^\Lambda(P)$, (7.5) is also true on $\mathcal{G}_s^\Lambda(P)$ for reducible generators. If $G \notin \mathcal{G}_s^\Lambda(P)$, by proposition 5.3, $H(G\|P) = \infty$. Obviously (7.5) is valid. When $G$ does not belong to $\mathcal{G}_s^\Lambda(P)$, which is closed, there exists $\delta > 0$ such that $B(G, \delta)$ is entirely outside $\mathcal{G}_s^\Lambda(P)$. The probability for such an event is null. Hence the bound (7.6) is true for all generators.

7.3. Exponential tightness

We shall prove here that the measures are exponentially tight, that is, for all $C \in \mathbb{R}$, there exists a compact set $K \subset \mathcal{G}$ such that

$$\limsup_{t \to \infty} \frac{1}{t} \log P\bigl[G_t \in K^c\bigr] \le -C. \tag{7.7}$$

For a Poisson process with intensity bounded by $r$, the measured intensity $N_t/t$ is bounded by

$$P\left[\frac{N_t}{t} > l_0\right] = e^{-rt} \sum_{n > l_0 t} \frac{(rt)^n}{n!} \le e^{-rt}\, \frac{(rt)^{l_0 t}}{(l_0 t)!}\, \frac{l_0}{l_0 - r}.$$
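The tail bound above can be checked numerically. The following sketch assumes a Poisson process of rate exactly $r$, takes $l_0 > r$ with $l_0 t$ an integer, and uses hypothetical helper names. It compares the exact log-tail with the logarithm of the right-hand side, and with the per-unit-time rate $-(l_0 \log(l_0/r) - l_0 + r)$ that Stirling's formula gives for that right-hand side; for large $l_0$ this rate is of order $-l_0 \log l_0$.

```python
import math

def log_poisson_tail(r, t, l0, terms=2000):
    """log P[N_t > l0 * t] for N_t ~ Poisson(r * t), summed in log space.

    The series is truncated after `terms` terms, which can only lower the value.
    """
    m = math.floor(l0 * t)
    logs = [-r * t + n * math.log(r * t) - math.lgamma(n + 1)
            for n in range(m + 1, m + 1 + terms)]
    top = max(logs)
    return top + math.log(sum(math.exp(x - top) for x in logs))

def log_tail_bound(r, t, l0):
    """log of e^{-rt} (rt)^{l0 t} / (l0 t)! * l0 / (l0 - r); requires l0 > r."""
    m = l0 * t
    return (-r * t + m * math.log(r * t) - math.lgamma(m + 1)
            + math.log(l0 / (l0 - r)))

def stirling_rate(r, l0):
    """Limit of log_tail_bound / t as t grows, obtained from Stirling's formula."""
    return -(l0 * math.log(l0 / r) - l0 + r)

r, l0 = 1.0, 5  # illustrative values only
for t in (10, 100, 1000):
    print(t, log_poisson_tail(r, t, l0) / t, log_tail_bound(r, t, l0) / t,
          stirling_rate(r, l0))
```

Working in log space avoids underflow, since the probabilities involved are far below the smallest positive floating-point number once $t$ is large.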

Using Stirling's expansion, the logarithm of the right-hand side is shown to be about $-t l_0 \log l_0$ when $t$ and $l_0$ are large, which yields the exponential tightness. Now, the intensity of jumps for the server and for the arrivals is bounded by

$$r \stackrel{\text{def}}{=} \max_{i,j \in S} \left\{ \lambda_i, \mu_i, \frac{1}{\tau_{ij}} \right\}.$$

Therefore, taking $l_0$ big enough, $K \stackrel{\text{def}}{=} \{(A, \pi, D) : l \le l_0\}$, where $l = \sum_{s \in S_0} a_s$, is a compact set verifying the bound (7.7). Exponential tightness classically allows one to extend a weak LDP to a full LDP. We did not prove an LDP here, but exponential tightness is required for the kind of contraction that follows.

7.4. Contraction

Since $f_\Lambda(G) = D$ is a continuous mapping from $\mathcal{G}_s^\Lambda$ into $\mathbb{R}^\Lambda$, the contraction principle applied to the local bounds (2.10)–(2.10) suggests the following contraction.

Theorem 7.2 (Local bounds). Let $\Lambda$ be a face and $D \in \mathbb{R}^\Lambda$. Then,

$$\begin{align}
-L(\Lambda, D) &= \lim_{\delta \to 0} \liminf_{t \to \infty} \frac{1}{t} \log P\Bigl[ \sup_{s \in [0,t]} \bigl\| Q^\Lambda(s) - sD \bigr\| < t\delta \Bigr] \tag{7.8} \\
&= \lim_{\delta \to 0} \limsup_{t \to \infty} \frac{1}{t} \log P\Bigl[ \sup_{s \in [0,t]} \bigl\| Q^\Lambda(s) - sD \bigr\| < t\delta \Bigr], \tag{7.9}
\end{align}$$

where $L(\Lambda, \cdot)$ is the good rate function

$$L(\Lambda, D) \stackrel{\text{def}}{=} \inf_{G \in f_\Lambda^{-1}(D)} H(G\|R), \qquad \forall D \in \mathbb{R}^\Lambda. \tag{7.10}$$


Proof. The sketch of the proof is exactly the same as a proof of the contraction principle, except that it is restricted around $\mathcal{G}_s^\Lambda$. First note that

$$f_\Lambda^{-1}\bigl(B(D, \delta)\bigr) = \bigcup_{G \in f_\Lambda^{-1}(D)} B(G, \delta). \tag{7.11}$$

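The lower bound is checked with a sequence $G(n)$ such that $H(G(n)\|R)$ converges to $L(\Lambda, D)$. For the upper bound, consider the right-hand side of (7.11). By the exponential tightness, a compact set $K \subset \mathcal{G}$ can be extracted from this set, so that

$$\limsup_{t \to \infty} \frac{1}{t} \log P\Bigl[ G_t \in K,\ \sup_{s \in [0,t]} \bigl\| Q^\Lambda(s) - sD \bigr\| < t\delta \Bigr] = \limsup_{t \to \infty} \frac{1}{t} \log P\Bigl[ \sup_{s \in [0,t]} \bigl\| Q^\Lambda(s) - sD \bigr\| < t\delta \Bigr].$$

Fix $\varepsilon > 0$. By (2.10), for all $G \in \mathcal{G}_s^\Lambda$, there exists $\delta > 0$ such that

$$\limsup_{t \to \infty} \frac{1}{t} \log P\Bigl[ G_t \in B(G, \delta),\ \sup_{s \in [0,t]} \bigl\| Q^\Lambda(s) - sD \bigr\| < t\delta \Bigr] \le -H(G\|R) + \varepsilon.$$

Since $K \cap f_\Lambda^{-1}(D)$ is compact, it can be covered by a finite number of balls, as in (7.11). Moreover, there exists $\delta_0$ such that

$$K \cap f_\Lambda^{-1}\bigl(B(D, \delta_0)\bigr) \subset \bigcup_{i=1}^{n} B\bigl(G(i), \delta(i)\bigr) \qquad \text{with } f_\Lambda(G(i)) = D.$$

But $B(D, \delta)$ is decreasing with $\delta$, so that the previous inclusion is valid for $\delta \le \delta_0$. Finally

$$\limsup_{t \to \infty} \frac{1}{t} \log P\Bigl[ G_t \in K,\ \sup_{s \in [0,t]} \bigl\| Q^\Lambda(s) - sD \bigr\| < t\delta \Bigr] \le - \min_i H\bigl(G(i)\|R\bigr) + \varepsilon \le -L(\Lambda, D) + \varepsilon.$$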

This is valid for all $\varepsilon > 0$ and $\delta < \delta_0$, hence the upper bound. Since (7.8) is less than (7.9), they are both equal to $-L(\Lambda, D)$. The proof is completed. □

8. Conclusion and comments

As it emerges, in proving an LDP for $Q(t)/t$, it can be fruitful to work at a higher level. In fact, rather than studying $Q_t$ itself, we focused on the empirical generator, which allows one to know how the different transition rates have to be modified in order that $Q_t$ stays near a given drift $D$. Then, using the explicit expression of fluid limits for polling systems, by means of proposition 5.2 one can select in this case the unique change of measure under which the system follows a prescribed empirical generator. Finally, using a contraction principle, the local rate function can be identified as the solution of an optimization problem, in our case a convex program. As was pointed out, in general for Markov processes in $\mathbb{Z}_+^N$, the fluid limits cannot be characterized. So it seems hard to identify the rate function for arbitrary queueing systems using directly the present methodology. Nevertheless, in our opinion, it could be applied successfully to other networks. For instance, let us briefly discuss the well known example of Jackson networks.

Consider an open network consisting of $N$ nodes denoted by $S \stackrel{\text{def}}{=} \{1, \ldots, N\}$. At node $i$, arrivals of clients form a Poisson process with rate $\lambda_i$. Each customer at node $i$ requires service, whose duration is exponentially distributed with parameter $\mu_i$. When a client has been served at node $i$, it goes to node $j$ with probability $P_{ij}$ or exits the system with probability $P_{i0}$. $Q_t \stackrel{\text{def}}{=} (q_1(t), \ldots, q_N(t))$ is a Markov process where $q_i(t)$ represents the number of clients at node $i$ at time $t$. For $i \in S$, $j \in S \cup \{0\}$, let $N_n\{i, j\}$ denote the number of customers, among the first $n$ transitions, that are served at node $i$ and go to node $j$. In this case, the empirical generator is the process $G_t$ defined by

$$G_t \stackrel{\text{def}}{=} \left( \frac{N_t}{t}\, \hat{L}_{N_t},\ L_t,\ \frac{Q_t - Q_0}{t} \right),$$

where

• $N_t$ is the number of transitions till $t$;

• $\hat{L}_n \stackrel{\text{def}}{=} (1/n) N_n \in M(S \times (S \cup \{0\}))$ is a subprobability. It represents in some sense the empirical measure of the embedded Markov chain, except that it does not take into account the arrivals of clients in the network;

• $L_t = (L_1(t), \ldots, L_N(t))$, where

$$L_i(t) \stackrel{\text{def}}{=} \frac{1}{t} \int_0^t \delta_{q_i(u) \neq 0}\, du.$$

$L_i(t)$ is the fraction of time up to $t$ during which the server at node $i$ is not idle.

Like for polling systems, in order to identify $L(\Lambda, D)$ with $D_i = 0$, $\forall i \in \Lambda^c$, one considers localized Jackson networks $X^\Lambda$. The transition mechanism describing the evolution of $X^\Lambda$ is identical to that of $X$, except that the components indexed by $\Lambda$ can be negative. Now, let us introduce the set of localized empirical generators for $X^\Lambda$.

Definition 8.1 (Localized empirical generators).
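The set $\mathcal{G}^\Lambda$ of localized empirical generators is defined by the triples $(A, \pi, D) \in M(S_0) \times [0, 1]^N \times \mathbb{R}^\Lambda$ that verify

$$\pi_i = 1, \quad \forall i \in \Lambda, \qquad\qquad D_i - \sum_{j \in S} a_{ji} + a_i \ge 0, \quad \forall i \in S,$$

where $a_i$ stands for $\sum_{j \in S \cup \{0\}} a_{ij}$. We denote by $\mathbb{R}^\Lambda$, for the sake of simplicity, the subspace of $D \in \mathbb{R}^N$ with $D_i = 0$ for $i \in \Lambda^c$.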


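Remark. Note that $\mathcal{G}^\Lambda$ is a convex set. To each generator $(A, \pi, D) \in \mathcal{G}^\Lambda$ corresponds a localized Jackson network, described by its intensities $(\tilde\lambda_i, \tilde\mu_i, \tilde{P}_{ij})$. The correspondence is given by

$$\tilde\lambda_i \stackrel{\text{def}}{=} D_i - \sum_{j \in S} a_{ji} + a_i, \qquad \tilde\mu_i = \frac{a_i}{\pi_i}, \qquad \tilde{P}_{ij} = \frac{a_{ij}}{a_i}.$$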
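The correspondence in the remark can be written out as a small routine. The sketch below is only an illustration: the function name, the 0-based node indexing and the split of $A$ into a routing matrix `a_route` (flows $i \to j$ between nodes) and an exit vector `a_exit` (flows $i \to 0$) are assumptions made for the example, not notation from the paper. It rejects inputs with $a_i = 0$ or $\pi_i = 0$, for which the ratios are undefined.

```python
def generator_to_rates(a_route, a_exit, pi, d):
    """Map a generator (A, pi, D) to Jackson intensities (lam~, mu~, P~).

    a_route[i][j]: flow of customers served at node i and sent to node j.
    a_exit[i]:     flow of customers served at node i and leaving the network.
    pi[i]:         fraction of time node i is busy; d[i]: drift of queue i.
    """
    n = len(pi)
    lam_t, mu_t, p_route, p_exit = [], [], [], []
    for i in range(n):
        a_i = a_exit[i] + sum(a_route[i])            # a_i = sum over j in S u {0}
        if a_i <= 0 or pi[i] <= 0:
            raise ValueError(f"node {i}: a_i and pi_i must be positive")
        inflow = sum(a_route[j][i] for j in range(n))
        lam_i = d[i] - inflow + a_i                  # lam~_i = D_i - sum_j a_ji + a_i
        if lam_i < 0:
            raise ValueError(f"node {i}: constraint D_i - sum_j a_ji + a_i >= 0 fails")
        lam_t.append(lam_i)
        mu_t.append(a_i / pi[i])                     # mu~_i = a_i / pi_i
        p_route.append([a_route[i][j] / a_i for j in range(n)])  # P~_ij = a_ij / a_i
        p_exit.append(a_exit[i] / a_i)               # P~_i0 = a_i0 / a_i
    return lam_t, mu_t, p_route, p_exit

# Two queues in tandem: node 0 feeds node 1, node 1 feeds the exit, zero drift.
lam_t, mu_t, p_route, p_exit = generator_to_rates(
    a_route=[[0.0, 1.0], [0.0, 0.0]],
    a_exit=[0.0, 1.0],
    pi=[0.5, 0.5],
    d=[0.0, 0.0],
)
print(lam_t, mu_t, p_route, p_exit)
```

In the tandem example each node is busy half of the time and processes one customer per unit time, so the implied service rate while busy is 2, and only node 0 needs external arrivals to keep the drift at zero.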

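Definition 8.2 (Relative entropy). Let $R = (\lambda_i, \mu_i, P_{ij})$ denote the generator of the Jackson network, $G = (A, \pi, D) \in \mathcal{G}_s^\Lambda$ be a localized empirical generator and $(\tilde\lambda_i, \tilde\mu_i, \tilde{P}_{ij})$ its representation as a localized Jackson network. The relative entropy of $G$ with respect to $R$ is8

$$H(G\|R) \stackrel{\text{def}}{=} \sum_{i \in S} \Bigl[ I_p\bigl(\tilde\lambda_i \,\|\, \lambda_i\bigr) + \pi_i \sum_{j \in S \cup \{0\}} I_p\bigl(\tilde\mu_i \tilde{P}_{ij} \,\|\, \mu_i P_{ij}\bigr) \Bigr],$$

where $I_p(\nu\|\lambda) \stackrel{\text{def}}{=} \nu \log(\nu/\lambda) - \nu + \lambda$.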
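As a hedged illustration of how this entropy enters a convex program, consider the simplest case: one node whose customers all leave after service ($P_{10} = 1$), with $\Lambda = \{1\}$ so that $\pi_1 = 1$. Under these assumptions the entropy reduces to $I_p(\tilde\lambda\|\lambda) + I_p(\tilde\mu\|\mu)$ and the drift constraint is $\tilde\lambda - \tilde\mu = D$. The sketch below, with hypothetical function names, minimises this by golden-section search and compares the result with the value obtained by setting the derivative to zero, which gives $\tilde\lambda\tilde\mu = \lambda\mu$ and hence $D \log\bigl((D + \sqrt{D^2 + 4\lambda\mu})/(2\lambda)\bigr) - \sqrt{D^2 + 4\lambda\mu} + \lambda + \mu$.

```python
import math

def ip(nu, lam):
    """I_p(nu || lam) = nu log(nu/lam) - nu + lam, with I_p(0 || lam) = lam."""
    if nu == 0.0:
        return lam
    return nu * math.log(nu / lam) - nu + lam

def entropy_single_queue(lam_t, mu_t, pi, lam, mu):
    """H(G || R) for one node with P_10 = 1 (toy case, assumption of this sketch)."""
    return ip(lam_t, lam) + pi * ip(mu_t, mu)

def min_entropy_free_queue(d, lam, mu, iters=200):
    """Minimise I_p(lam~||lam) + I_p(mu~||mu) subject to lam~ - mu~ = d, pi = 1.

    Golden-section search over mu~; the objective is convex in mu~.
    Returns (minimal value, minimising mu~).
    """
    lo = max(0.0, -d)                       # keeps both mu~ and lam~ = d + mu~ >= 0
    hi = lo + 10.0 * (lam + mu + abs(d))    # generous upper bracket
    g = (math.sqrt(5.0) - 1.0) / 2.0

    def f(m):
        return entropy_single_queue(d + m, m, 1.0, lam, mu)

    x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
    for _ in range(iters):
        if f(x1) < f(x2):
            hi, x2 = x2, x1
            x1 = hi - g * (hi - lo)
        else:
            lo, x1 = x1, x2
            x2 = lo + g * (hi - lo)
    m = 0.5 * (lo + hi)
    return f(m), m

def closed_form(d, lam, mu):
    """Value of the same minimisation from the first-order condition lam~ mu~ = lam mu."""
    root = math.sqrt(d * d + 4.0 * lam * mu)
    return d * math.log((d + root) / (2.0 * lam)) - root + lam + mu

lam, mu = 1.0, 2.0  # illustrative rates only
for d in (-1.0, 0.0, 0.5):
    value, mu_star = min_entropy_free_queue(d, lam, mu)
    print(d, value, closed_form(d, lam, mu), mu_star)
```

The cost vanishes at $D = \lambda - \mu$, the natural drift of the unconstrained queue, and is positive for any other drift, which is the behaviour one expects from a rate function.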

This entropy has an easy interpretation in terms of information theory. $H(\cdot\|R)$ is decomposed as the sum of the information gains: first $I_p(\tilde\lambda_i \| \lambda_i)$ for the arrivals, second $I_p(\tilde\mu_i \tilde{P}_{ij} \| \mu_i P_{ij})$ for the service times and the routing, multiplied by the fraction of time $\pi_i$ that the server at node $i$ is not idle.

Definition 8.3. The rate function $L(\Lambda, D)$ is defined by

$$L(\Lambda, D) \stackrel{\text{def}}{=} \inf_{G \in f_\Lambda^{-1}(D)} H(G\|R), \qquad \forall D \in \mathbb{R}^\Lambda,$$

where $f_\Lambda : \mathcal{G}^\Lambda \to \mathbb{R}^\Lambda$ is the projection $f_\Lambda(G) = D$. For all $x \in \mathbb{R}_+^N$, $L(x, D)$ is defined to be $L(\Lambda(x), D)$.

Like for polling systems, $L(\Lambda, D)$ appears naturally as the solution of a convex program. This was proved in [19] after a suitable change of variable. In our case, the main point will be to establish theorem 2.4. This can be done using the preceding discussion, a suitable change of measure and the representation of fluid limits for Jackson networks. The reader is referred to sections 5 and 6 of [19] for the last two points.

8 Note that the relative entropy $H(G\|R)$ is independent of $\Lambda$.

References

[1] R. Atar and P. Dupuis, Large deviations and queueing networks: Methods for rate function identification, Stochastic Process. Appl. 84(2) (1999) 255–296.
[2] A.A. Borovkov and R. Schassberger, Ergodicity of a polling network, Stochastic Process. Appl. 50(2) (1994) 253–262.


[3] O. Boxma and J. Weststrate, Waiting time in polling systems with Markovian server routing, in: Messung, Modellierung und Bewertung von Rechensysteme (Springer, Berlin, 1989) pp. 89–104.
[4] A. de La Fortelle, Large deviation principle for Markov chains in continuous time, Technical Report 3877, INRIA (2000).
[5] A. de La Fortelle and G. Fayolle, Large deviation principle for Markov chains in discrete time, Technical Report 3791, INRIA (1999).
[6] F. Delcoigne and A. de La Fortelle, Large deviations for polling systems, Technical Report 3892, INRIA (2000).
[7] F. Delcoigne and A. de La Fortelle, Large deviations problems for star networks: The min policy, Technical Report 4143, INRIA (2001).
[8] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications, 2nd ed. (Springer, New York, 1998).
[9] R.L. Dobrushin and E.A. Pechersky, Large deviations for tandem queueing systems, J. Appl. Math. Stochastic Anal. 7(3) (1994) 301–330.
[10] R.L. Dobrushin and E.A. Pechersky, Large deviations for random processes with independent increments on infinite intervals, in: Probability Theory and Mathematical Statistics, St. Petersburg, 1993 (Gordon and Breach, Amsterdam, 1996) pp. 41–74.
[11] D. Down, On the stability of polling models with multiple servers, J. Appl. Probab. 35(4) (1998) 925–935.
[12] P. Dupuis and R.S. Ellis, Large deviations for Markov processes with discontinuous statistics. II. Random walks, Probab. Theory Related Fields 91(2) (1992) 153–194.
[13] P. Dupuis and R.S. Ellis, The large deviation principle for a general class of queueing systems. I, Trans. Amer. Math. Soc. 347(8) (1995) 2689–2751.
[14] P. Dupuis, R.S. Ellis and A. Weiss, Large deviations for Markov processes with discontinuous statistics. I. General upper bounds, Ann. Probab. 19(3) (1991) 1280–1297.
[15] P. Dupuis, H. Ishii and H.M. Soner, A viscosity solution approach to the asymptotic analysis of queueing systems, Ann. Probab. 18(1) (1990) 226–255.
[16] P. Dupuis and K. Ramanan, A time-reversed representation for the tail probabilities of stationary reflected Brownian motion, Technical Report, Brown University (2001).
[17] M.I. Freidlin and A.D. Wentzell, Random Perturbations of Dynamical Systems (Springer, New York, 1984).
[18] C. Fricker and M. Jaibi, Stability of multi-server polling models, Technical Report 3347, INRIA (1998).
[19] I. Ignatiouk-Robert, Large deviations of Jackson networks, Technical Report 14/99, Université de Cergy-Pontoise (1999).
[20] I.A. Ignatyuk, V.A. Malyshev and V.V. Shcherbakov, The influence of boundaries in problems on large deviations, Uspekhi Mat. Nauk 49(2) (296) (1994) 43–102.
[21] A.D. Ioffe and V.M. Tihomirov, Theory of Extremal Problems (North-Holland, Amsterdam, 1979) (translated from the Russian by K. Makowski).
[22] A. Shwartz and A. Weiss, Large Deviations for Performance Analysis, Queues, Communications, and Computing, with an Appendix by R.J. Vanderbei, Stochastic Modeling Series (Chapman & Hall, London, 1995).
[23] H. Takagi, Queuing analysis of polling models, ACM Comput. Surveys 20(1) (1988) 5–28.