A Heuristic for Determining Efficient Staffing Requirements for Call Centres

Shane G. Henderson, Andrew J. Mason
Department of Engineering Science, University of Auckland, Auckland, New Zealand
[email protected], [email protected]

Ilze Ziedins
Department of Statistics, University of Auckland, Auckland, New Zealand
[email protected]

Richard Thomson
Voice Technology Ltd, Auckland, New Zealand
[email protected]

Abstract

An important new feature on the business scene is the development of call centres, whereby a pool of staff is used to answer incoming calls from customers. We develop a model that enables staffing levels to be determined to meet specified quality targets on customer wait times. Unlike most previous work, the new model explicitly considers call arrival rates that vary during the day, and exploits linkage between periods to attempt to keep staffing costs to a minimum.
1. Introduction
Many companies now employ a call centre, whereby a pool of staff is used to answer incoming calls from customers. A large component of cost in call centre operation is staffing cost. Therefore, it is highly desirable to keep staffing levels to a minimum throughout the day. Of course, the objective of minimizing staffing costs is subject to the constraint that customer service remain at a reasonable level throughout the day. We refer to the problem of minimizing staffing (rostering) costs subject to maintaining a minimum level of customer service as "the rostering problem". The problem is nontrivial because demand can vary considerably throughout the day, and there is usually great flexibility in staffing decisions.

This paper is an outgrowth of work performed as part of a Masters in Engineering Thesis (Thomson 1998) for VoiceTech Systems under the guidance of the other authors.

Perhaps the primary measure of customer service is customer waiting time in the queue before reaching a server. It is also desirable to incorporate customer abandonment (reneging), where a customer hangs up before receiving service, in any measure of customer service. This can be done by defining a customer specific Customer Grade of Service (CGOS). The CGOS is typically expressed as a percentage, with 100% meaning ideal service, and lower percentages representing lower levels of service. For example, a customer that abandons might receive a CGOS of 0%. The constraint that customer service be satisfactory may be expressed in many different forms. For example:
• CGOS should exceed 50% for all customers.
• 95% of customers should receive a CGOS > 80%.
• In any 2 hour window, the average CGOS should exceed 80%.
• During peak times, the average CGOS should exceed 50%, while at other times it should exceed 80%.
• The expected CGOS for a customer arriving at any time throughout the day should exceed 80%.
Because of the random nature of call centre operation, for any given day these requirements (with the exception of the last one) can only be met with a certain probability. The quality measure used in the current VoiceTech product, Q-Master, calculates the performance at any instant by averaging the CGOS over all customers served in the last 2 hours. VoiceTech clients typically require this measure (GOS) to exceed 80% during the day. However, the company was willing to consider using other performance measures to generate its staffing requirements.
The rostering problem is typically solved in two phases (Mehrotra 1997). If we assume that the day has been broken into a number of periods, perhaps each of 15 minutes duration, then the first phase is to determine the staffing requirements for each period of the day. We term these staffing levels the work requirement. We seek a work requirement that is "minimum" in the sense that the service target cannot be met if staffing is reduced in any period. In the second phase, one attempts to build staff rosters that "cover" the minimum staffing requirements determined in the first phase.

The Erlang-C delay and loss formulae (see Gross and Harris 1998) give steady-state results for the M/M/c and M/G/c/c queueing models respectively. They are often used in practice to estimate the work requirement via the pointwise stationary approximation (PSA) (Green and Kolesar 1995). The PSA approximates the performance of the system at a given instant of time by steady-state quantities associated with the parameters of the system at that instant. Staffing levels are then increased until the PSA predicts that service is "satisfactory". Once the work requirements are known, the workforce allocation problem is solved. The workforce allocation problem is the problem of determining staff numbers and shift start times so as to "cover" the work requirements at minimal staffing cost.

There are several problems with the standard approaches to computing the work requirements. First, there can be linkage between time periods in the sense that the CGOS depends on staffing levels in both the current and previous periods. If the PSA is used to calculate the required staffing levels, then this linkage will be ignored, and thus the calculated staffing levels may not yield a feasible solution. Second, the linkage between periods typically increases the number of minimum work requirements that exist, i.e. it may be possible to change the staffing requirements in such a way that customer service remains satisfactory, but the solution is still a minimum (in the sense given above). Clearly, where there are many such minimum work requirements, we would like to choose that work requirement that is "easiest"
to roster, in the sense that the roster generated in Phase 2 is cheaper, or of higher quality. While the final cost/quality of a work requirement is difficult to determine without solving the Phase 2 optimisation problem, some estimate of these measures would be useful in directing the search for a good minimum work requirement.

Several authors have recognized the first of these problems with the use of the PSA, and have developed methods to deal with it. Green and Kolesar (1995) noted that the PSA was occasionally not as accurate as one might hope. Green and Kolesar (1997), (1998) and Green, Kolesar and Soares (1999) develop simple modifications to the PSA that improve its performance. Thompson (1993) discusses the multiperiod impact of server staffing in some detail, and models the effect by heuristically modifying the call arrival rate in different periods, and then using the modified arrival rate in steady-state queueing models. We have developed a new heuristic (Section 3) partly because we have noticed an interesting effect, which is not captured by these earlier models, that arises when the server staffing levels are changed. However, it should be noted that one could use these earlier methodologies in conjunction with the dynamic program introduced in Section 4 to determine staffing levels.

Thompson (1997) developed an integer program that attempts to solve the work requirements problem and workforce allocation problems simultaneously. He used his arrival rate heuristic (Thompson 1993) together with standard Erlang delay results to predict performance for a given staffing level in a given period. Although he was therefore addressing, to some degree, the linkage between periods in terms of service performance, we believe that our work enables one to exploit this linkage when setting staffing levels to a much greater extent. It should be noted, however, that Thompson's model explicitly includes the workforce allocation step, whereas the work in this paper does not.

Abate and Whitt (1998) provide a methodology for computing transient measures in the Erlang loss model. Jennings et al. (1996) capture time-dependent (non steady-state) effects
through the use of approximations to time-dependent infinite-server queueing model results. Whitt (1999) looks at determining staffing levels through an infinite server model, with the objective of immediately answering all calls. Simulation can also be used to determine the work requirements; see Eitzen (1994), Brigandi et al. (1994). Andrews and Parsons (1993) use Erlang models to predict performance, and then apply economic considerations and regression in determining how to set the staffing levels.

The primary contributions of this paper are four-fold. First, we employ a birth-death model of a call centre that captures customer reneging (leaving the queue before reaching service). We give a rigorous proof of convergence in distribution of customer-centred measures. Steady-state results relating to customer-centred performance measures in this model have appeared previously (see Palm 1937 and Ancker and Gafarian 1963), so that we have not supplied proofs of these results in this paper. However, proofs of some of the steady-state results (established before we became aware of the earlier work) appear in Thomson (1998).

Second, we attempt to capture some of the linkage effects between time periods in the model used to determine work requirements. To achieve this, we use steady-state queueing models to identify the expected CGOS if the system were run under constant conditions for a sufficiently long period of time. These steady-state results are then modified using a "smoothing heuristic" (Section 3) to attempt to capture the time-dependent effects.

Third, we identify what we believe is a previously unrecognized phenomenon. When staffing levels are increased in a call centre and there are customers at the head of the line, some customers immediately enter service, and the rate at which customers are being processed by the system increases abruptly. This leads to far greater service performance in the short term than might otherwise be expected. A similar, although less marked, effect occurs when server numbers are reduced.
Fourth, a Dynamic Program (DP) is introduced that incorporates the time-dependent approximations, and the DP is solved to obtain minimum staffing levels throughout the day. The cost structure of the dynamic program is designed to construct staffing levels that are "easy" to roster. Another approach that attempts to capture the linkage between time periods is discussed in Henderson and Mason (1998). This approach utilizes simulation and integer programming in concert to obtain optimal rosters. The method shows great promise but is computationally very expensive. The remainder of this paper is organized as follows. In Section 2, we introduce a queueing model of call centre operation, provide steady-state results for customer centred measures, and establish the convergence in distribution that partly underlies the smoothing heuristic. We also discuss how VoiceTech’s GOS requirements are approximated. In Section 3, we explain how the steady-state results of Section 2 are modified to obtain approximations to the time-dependent CGOS using the smoothing heuristic. In Section 4, we discuss the dynamic program that is used to determine the minimum staffing levels. Section 5 discusses the complication that the GOS as measured by VoiceTech is random, so that adjustments need to be made to the results of our earlier sections if we wish to apply these ideas in their context. We present an example of our methodology in Section 6, and proofs are given in an appendix.
2. Steady-State Results

Consider the following model of a call centre. Calls arrive at the call centre according to a Poisson process with rate $\lambda$. (We will generalize the arrival process to a non-homogeneous Poisson process in Section 3.) There are $c$ servers, and if a server is free, the customer immediately enters service. Service times are exponentially distributed with mean $\mu^{-1}$. If no server is available, the customer queues for service. Customers are willing to wait for a certain maximum period of time that we call their renege time or abandonment time. Customers who are still in the queue when their renege time is reached leave the system without ever receiving service. Renege times are exponentially distributed with mean $\alpha^{-1}$. All arrival times, service times and renege times are independent.

Under these assumptions, the number of customers in the system is a birth-death process, and as long as $\alpha > 0$, the model is stable for all choices of arrival and service rates. We can therefore apply standard results on birth-death processes (see, e.g., Gross and Harris 1998) to obtain stationary distributions. Let $q(\cdot; \lambda, \mu, \alpha, c)$ denote the steady-state probability mass function of the number of customers in the system for a given arrival rate, service rate, abandonment rate and number of servers, so that $q(n; \lambda, \mu, \alpha, c)$ is the steady-state probability that there are $n$ customers in the system. It is straightforward to show that for $n \ge 0$,

$$q(n; \lambda, \mu, \alpha, c) = \frac{\lambda^n \, q(0; \lambda, \mu, \alpha, c)}{\prod_{i=1}^{n} \mu_i},$$

where $\mu_i = \min(i, c)\mu + \max(i - c, 0)\alpha$ and $q(0; \lambda, \mu, \alpha, c)$ is chosen so that $q(\cdot; \lambda, \mu, \alpha, c)$ is a probability mass function.

Of course, we are primarily interested in the waiting times of customers. Let $A_i = 1$ if the $i$th customer abandons (reneges), and 0 otherwise. Let $W_i$ be the $i$th customer's waiting time in the queue (typically measured in seconds) when the $i$th customer does not abandon, and the time till abandonment otherwise, and define the customer performance process $CP = ((A_n, W_n) : n \ge 1)$. Theorem 1 (below) establishes that $CP$ possesses a steady-state, and that $(A_n, W_n)$ converges (in distribution) to a steady-state. The proof is based on the analysis of a certain general state space Markov chain. Let $X_k(i)$ denote the residual workload (time remaining to complete all work) of the $i$th server immediately after the arrival of the $k$th customer. If the system is initially empty, then
we take $X_0(i) = 0$ for all $i$. Let $X_k = (X_k(1), \ldots, X_k(c))$ be the vector of server workloads and define the process $X = (X_k : k \ge 0)$.

Theorem 1: The Markov chain $X$ is positive Harris recurrent and therefore possesses a stationary distribution. Furthermore, $(A_n, W_n)$ converges in distribution to a steady-state random variable $(A_\infty, W_\infty)$ as $n \to \infty$.

Theorem 1 follows as a direct corollary of a stronger result (Theorem 2 below) that relaxes the assumption that the renege times, service times and interarrival times are exponentially distributed random variables. Let the renege time and service time of the $k$th customer be denoted $R_k$ and $V_k$ respectively. Let the interarrival time between the $(k-1)$th and $k$th customers be denoted $U_k$.

Theorem 2: Suppose that the sequences $(R_k : k \ge 1)$, $(V_k : k \ge 1)$ and $(U_k : k \ge 1)$ are mutually independent sequences of i.i.d. random variables. Suppose further that $0 < EU_1 < \infty$, $EV_1 < \infty$ and $ER_1 < \infty$. Finally, assume that $V_1$ has a strictly positive density (with respect to Lebesgue measure) on the interval $[0, v]$ and $P(U_1 > v) > 0$. Then the Markov chain $X$ as defined above is aperiodic and positive Harris recurrent, and $X_n \Rightarrow X_\infty$ as $n \to \infty$. Furthermore, if either $R_1$ has a density, or $U_1$ has a density and $P(R_1 > 0) = 1$, then $(A_n, W_n)$ converges in distribution to a steady-state random variable $(A_\infty, W_\infty)$ as $n \to \infty$.

Theorem 2 is proved in the appendix.

It is possible to obtain the steady-state probability $p(\lambda, \mu, \alpha, c) = P(A_\infty = 1)$ that a customer abandons, and the steady-state density function $f_w(\cdot; \lambda, \mu, \alpha, c)$ of the waiting time in the queue when the customer has to wait and does not abandon. These results were first obtained by Palm (1937) and later generalized by Ancker and Gafarian (1963). As noted in the
introduction, we established these same results before becoming aware of this earlier work, so that proofs of these results may be found in Thomson (1998). We have that

$$p(\lambda, \mu, \alpha, c) = P(A_\infty = 1) = \frac{\alpha}{\lambda} E \max(Q_\infty - c, 0),$$

where $Q_\infty$ is a random variable distributed according to $q(\cdot; \lambda, \mu, \alpha, c)$, and for $t > 0$,

$$f_w(t; \lambda, \mu, \alpha, c) = P(W_\infty \in dt, A_\infty = 0) = \lambda \, q(c - 1; \lambda, \mu, \alpha, c) \exp\left[-(c\mu + \alpha)t + \frac{\lambda}{\alpha}\left(1 - e^{-\alpha t}\right)\right].$$

Of course, the probability that a customer enters service immediately and does not abandon is given by the probability that upon arrival there is at least 1 free server, so that

$$P(W_\infty = 0, A_\infty = 0) = \sum_{k=0}^{c-1} q(k; \lambda, \mu, \alpha, c).$$
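As a rough numerical illustration of the formulae above, the following Python sketch evaluates the stationary distribution $q$, the abandonment probability $p$ and the probability of immediate service. The function names, the truncation rule for the infinite state space and the arrival rate and server count in the example are assumptions made for illustration only; they are not taken from Thomson (1998) or from Q-Master.

```python
def stationary_pmf(lam, mu, alpha, c, tol=1e-15):
    """Truncated stationary pmf q(n) of the birth-death model with reneging.

    Assumption of this sketch: the state space is cut off once the
    unnormalised weights are decreasing and have fallen below `tol`.
    Requires alpha > 0 so that the death rates grow without bound.
    """
    weights = [1.0]  # unnormalised weight of state 0
    n = 0
    while True:
        n += 1
        death_rate = min(n, c) * mu + max(n - c, 0) * alpha  # mu_n in the text
        weights.append(weights[-1] * lam / death_rate)
        if n >= c and death_rate > lam and weights[-1] < tol:
            break
    total = sum(weights)
    return [w / total for w in weights]


def abandon_probability(lam, mu, alpha, c):
    """p = (alpha / lam) * E max(Q - c, 0)."""
    q = stationary_pmf(lam, mu, alpha, c)
    mean_queue = sum(max(n - c, 0) * qn for n, qn in enumerate(q))
    return alpha * mean_queue / lam


def immediate_service_probability(lam, mu, alpha, c):
    """P(W = 0, A = 0): the arrival finds at most c - 1 customers present."""
    q = stationary_pmf(lam, mu, alpha, c)
    return sum(q[:c])


# Illustrative values only (rates per hour); mu and alpha match Figure 1,
# while lam and c are hypothetical.
lam, mu, alpha, c = 40.0, 4.0, 20.0, 10
p_abandon = abandon_probability(lam, mu, alpha, c)
p_immediate = immediate_service_probability(lam, mu, alpha, c)
print(f"P(abandon) = {p_abandon:.4f}, P(immediate service) = {p_immediate:.4f}")
```

A useful consistency check on such a computation is flow balance: in steady state the rate of served customers, $\mu E \min(Q_\infty, c)$, should equal $\lambda(1 - p)$.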
These steady-state results may be used to approximate the required CGOS statistics as follows. The $i$th customer's CGOS is given by $\mathrm{CGOS}(A_i, W_i)$, where the function CGOS is defined (for example) by

$$\mathrm{CGOS}(a, w) = \begin{cases} 0 & \text{if } a = 1 \\ 100 & \text{if } a = 0,\ w < 30 \\ 75 & \text{if } a = 0,\ 30 \le w < 60 \\ 50 & \text{if } a = 0,\ 60 \le w < 90 \\ 25 & \text{if } a = 0,\ w \ge 90. \end{cases}$$
The steady-state expected customer GOS, SSGOS, is then

$$\mathrm{SSGOS}(\lambda, \mu, \alpha, c) = 100\, P(0 \le W_\infty < 30, A_\infty = 0) + 75\, P(30 \le W_\infty < 60, A_\infty = 0) + \cdots$$

The CGOS function can be defined so that the SSGOS yields the fraction of calls that wait less than $x$ seconds, which is a standard performance measure in call centre analysis. In particular, simply define the CGOS function to be 100 if $a = 0$ and $w < x$, and 0 otherwise. To compute the probabilities $P(a \le W_\infty < b, A_\infty = 0)$, one can simply integrate $f_w$. This is not possible in closed form, but standard quadrature schemes have proven effective.
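The SSGOS computation can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: composite Simpson's rule stands in for the unspecified "standard quadrature schemes", the last band is obtained from the fact that the probabilities sum to one rather than by integrating to infinity, and the names `stationary_pmf`, `wait_density`, `simpson` and `ssgos` are hypothetical. All rates must be expressed per second because the CGOS thresholds are in seconds.

```python
import math

# Example CGOS bands from the text: (lower, upper, score), waits in seconds.
CGOS_BANDS = [(0.0, 30.0, 100.0), (30.0, 60.0, 75.0), (60.0, 90.0, 50.0)]
TAIL_SCORE = 25.0  # score for served customers who waited 90 s or more


def stationary_pmf(lam, mu, alpha, c, tol=1e-15):
    """Truncated stationary pmf q(n); the cut-off rule is an assumption."""
    weights, n = [1.0], 0
    while True:
        n += 1
        death_rate = min(n, c) * mu + max(n - c, 0) * alpha
        weights.append(weights[-1] * lam / death_rate)
        if n >= c and death_rate > lam and weights[-1] < tol:
            break
    total = sum(weights)
    return [w / total for w in weights]


def wait_density(t, lam, mu, alpha, c, q_cm1):
    """f_w(t): density of the wait for customers who queue and are served."""
    exponent = -(c * mu + alpha) * t + (lam / alpha) * (1.0 - math.exp(-alpha * t))
    return lam * q_cm1 * math.exp(exponent)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def ssgos(lam, mu, alpha, c):
    """Steady-state expected CGOS (0 to 100); rates are per second."""
    q = stationary_pmf(lam, mu, alpha, c)
    p_abandon = alpha * sum(max(n - c, 0) * qn for n, qn in enumerate(q)) / lam
    p_immediate = sum(q[:c])

    def density(t):
        return wait_density(t, lam, mu, alpha, c, q[c - 1])

    score, served_mass = 0.0, 0.0
    for lower, upper, value in CGOS_BANDS:
        mass = simpson(density, lower, upper)
        if lower == 0.0:
            mass += p_immediate  # the atom at W = 0 belongs to the first band
        score += value * mass
        served_mass += mass
    # Remaining served customers waited >= 90 s (total probability is 1).
    score += TAIL_SCORE * max(0.0, 1.0 - p_abandon - served_mass)
    return score


# Illustrative values: per-hour rates converted to per-second rates.
HOUR = 3600.0
example = ssgos(40.0 / HOUR, 4.0 / HOUR, 20.0 / HOUR, c=10)
print(f"SSGOS = {example:.1f}%")
```

Replacing `CGOS_BANDS` with a single band `(0.0, x, 100.0)` and setting `TAIL_SCORE` to 0 gives the percentage of calls answered within $x$ seconds.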
3. Time-Dependent Approximations (The Smoothing Heuristic)

Of course, the arrival rate of calls to a call centre varies over the day, and practically speaking, staffing decisions can only be made at "reasonable" times. So we now introduce time effects into the call centre model. Break the day up into 15 minute time intervals $[t_0, t_1), [t_1, t_2), \ldots$, where $t_0 = 0$, and for $k \ge 1$, $t_k = t_{k-1} + 15$. Let us call the interval $[t_{k-1}, t_k)$ period $k$. We extend our previous model by assuming that calls arrive at the call centre according to a nonhomogeneous Poisson process with right-continuous arrival rate $(\lambda(t) : t \ge 0)$. To simplify the presentation, we further assume that the arrival rate function $\lambda$ is constant over each 15 minute time interval, although this assumption is easily relaxed. The arrival rate during the $k$th period $[t_{k-1}, t_k)$ is denoted by $\lambda_k = \lambda(t_{k-1})$. We allow the number of servers on duty to vary during the day, where $c_k$ denotes the number of servers on duty in period $k$. We will assume that $\mu$ and $\alpha$ are constant during the day, but this assumption can be relaxed if required.

Now, one could use the PSA to approximate the expected instantaneous CGOS (ECGOS) $g_k$ at time $t_k$. (The instantaneous CGOS may be defined as the CGOS a customer would receive were they to arrive at that particular time instant.) In particular, the arrival rate and the number of servers over the interval $[t_{k-1}, t_k)$ are $\lambda_k$ and $c_k$ respectively, so that we could approximate $g_k$ by $\mathrm{SSGOS}(\lambda_k, \mu, \alpha, c_k)$. One would expect the steady-state approximation to be quite reasonable if the time intervals were of a "large" length. In fact, in the limit as the time intervals become large, the error in the steady-state approximation converges to zero.

Proposition 3: If the arrival rate and the number of servers (and $\mu$ and $\alpha$) remain constant over the time interval $[0, t]$, then as $t \to \infty$, the ECGOS at time $t$ converges to the expected steady-state customer CGOS.

A proof of this result is given in the appendix.
Our time intervals are only 15 minutes in duration, so it is perhaps unreasonable to expect full convergence to a steady-state regime, but certainly it is reasonable to expect that some convergence towards steady-state will have occurred. This observation is the central motivation behind our time-dependent approximation. It is also confirmed by our simulation runs in Figure 1. (Full details of the simulation are given in Section 6.)

[Figure 1: plot against Time (hours), from 8 to 18, with a vertical scale from 0 to 100. Series: ECGOS (Sim), SSGOS, Arrival Rate, Servers.]

Figure 1: Steady state and simulated ECGOS values for various arrival rates and numbers of servers. The service rate is 4 and the abandonment rate is 20. All rates are per hour.
Observe that for a fixed number of servers (the left half of the figure), we see the expected convergence to steady state for different arrival rates. The right half of the figure demonstrates what happens when the number of servers changes, and will be discussed further shortly.

We model the system behaviour as follows. We begin by approximating the ECGOS that would be received by a customer at the start of the day. Assuming that the system is empty at time 0, it is reasonable to define $g_0$, the ECGOS at time $t_0$, to be 100%. According to Proposition 3, during the first time interval, the ECGOS received by an arriving customer begins to converge to $\mathrm{SSGOS}(\lambda_1, \mu, \alpha, c_1)$. We therefore approximate the ECGOS $g_1$ at the end of the first time interval (at time $t_1$) by a convex combination of $g_0$ and $\mathrm{SSGOS}(\lambda_1, \mu, \alpha, c_1)$. In other words, we approximate $g_1$ by

$$g_1 \cong \delta g_0 + (1 - \delta)\, \mathrm{SSGOS}(\lambda_1, \mu, \alpha, c_1),$$

where $0 < \delta < 1$ is a tuning parameter. Recursively then, we approximate the ECGOS $g_k$ at the end of period $k$ by

$$g_k \cong \delta g_{k-1} + (1 - \delta)\, \mathrm{SSGOS}(\lambda_k, \mu, \alpha, c_k). \tag{1}$$
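The recursion (1) is simple to evaluate once the steady-state values are available. The following Python sketch assumes that the per-period SSGOS values have already been computed and are passed in as a list; the function name `smooth_ecgos` and the numbers in the example are illustrative assumptions, not values from this paper.

```python
def smooth_ecgos(ssgos_by_period, delta, g0=100.0):
    """Apply g_k = delta * g_{k-1} + (1 - delta) * SSGOS_k for k = 1, 2, ...

    Returns [g_0, g_1, ..., g_K]; the default g_0 = 100 corresponds to an
    empty system at time 0.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    g = [g0]
    for ss in ssgos_by_period:
        g.append(delta * g[-1] + (1.0 - delta) * ss)
    return g


# Hypothetical steady-state values for four periods with constant staffing.
path = smooth_ecgos([80.0, 80.0, 80.0, 60.0], delta=0.5)
print(path)
```

When the steady-state value is a constant $s$ over several periods, the recursion gives $g_k = s + \delta^k (g_0 - s)$, so the approximation converges geometrically to $s$ at a rate governed by $\delta$.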
This approximation is depicted in Figure 2.

[Figure 2: schematic plot of ECGOS against Time, showing $g_0$, $g_1$, $g_2$ at times $t_0$, $t_1$, $t_2$ relative to the levels $\mathrm{SSGOS}(\lambda_1, \mu, \alpha, c_1)$ and $\mathrm{SSGOS}(\lambda_2, \mu, \alpha, c_2)$.]

Figure 2: The sequence of ECGOS approximations ($c$ constant, $\lambda$ varying).

The approximation bears a similarity to the method of exponential smoothing in forecasting, so that we term this approximation the "smoothing heuristic". Note that as the interval $t_k - t_{k-1}$ between successive points becomes smaller, the smoothed ECGOS approaches an exponential decay towards the steady-state value. Exponential decay towards steady-state is a feature of many stochastic systems; see Meyn and Tweedie (1993).

If we examine Figure 1, we see that in periods of constant staffing, the ECGOS does display the convergence we expect. However, in the right half of the figure, where the number of servers is changing, we observe a very different behaviour. There are sudden ECGOS improvements associated with increases in staffing. Indeed, the ECGOS values we observe in the simulation after an increase in staffing lie above their new steady state values! Clearly, the
system responds much faster to changes in the number of servers than it does to changes in arrival rates. Changes in arrival rates can only impact on the system through the arrival of new customers, and thus these changes take time to be felt. However, changes in staffing have an immediate impact on both customers in the system and customers about to arrive, and so they impact the ECGOS immediately.

We model changes in staffing levels as follows. As before, assume the system has an ECGOS $g_{k-1}$ at time $t_{k-1}$, and that during the period $[t_{k-1}, t_k)$ there are $c_k$ staff on duty. Suppose that at time $t_k$, the number of staff on duty changes to $c_{k+1}$. Now, as before, we assume that during the period $[t_{k-1}, t_k)$ the ECGOS approaches the steady state $\mathrm{SSGOS}(\lambda_k, \mu, \alpha, c_k)$, so that at time $t_k^-$ (being the end of period $[t_{k-1}, t_k)$), we have a predicted ECGOS

$$g_k^- = \delta g_{k-1} + (1 - \delta)\, \mathrm{SSGOS}(\lambda_k, \mu, \alpha, c_k). \tag{2}$$

We now have to adjust $g_k^-$ for the staffing change (if any) at time $t_k$. If the system were in steady state in period $k$, then it would have an ECGOS given by $\mathrm{SSGOS}(\lambda_k, \mu, \alpha, c_k)$. If the staffing in this period were changed to $c_{k+1}$, then in steady state the ECGOS in period $k$ would change to $\mathrm{SSGOS}(\lambda_k, \mu, \alpha, c_{k+1})$. The change in ECGOS is approximately given by $\mathrm{SSGOS}(\lambda_k, \mu, \alpha, c_{k+1}) - \mathrm{SSGOS}(\lambda_k, \mu, \alpha, c_k)$. In our heuristic, we simply assume that some multiple of this change of ECGOS occurs at time $t_k$, giving

$$g_k = g_k^- + \begin{cases} \kappa_{\mathrm{up}} \left[\mathrm{SSGOS}(\lambda_k, \mu, \alpha, c_{k+1}) - \mathrm{SSGOS}(\lambda_k, \mu, \alpha, c_k)\right] & c_{k+1} \ge c_k \\ \kappa_{\mathrm{down}} \left[\mathrm{SSGOS}(\lambda_k, \mu, \alpha, c_{k+1}) - \mathrm{SSGOS}(\lambda_k, \mu, \alpha, c_k)\right] & c_{k+1} < c_k \end{cases} \tag{3}$$
where $\kappa_{\mathrm{up}}$ and $\kappa_{\mathrm{down}}$ are experimentally-determined tuning parameters. Of course, $g_k$ must lie between 0 and 100, and so we map any values outside this range to the closest limit. Note that we allow reductions in staffing to immediately impact the system just as increases do. However, in a call centre, staff do not leave the system immediately, but depart as their calls come to an end. Thus, the immediate impact of staffing decreases is much less than that of increases. Note also that while our model gives instantaneous ECGOS changes, in practice these will be smooth because customers arriving just before a staffing increase (decrease) will likely benefit (suffer) from that change. Thus, the ECGOS will change rapidly but smoothly in the lead-up to a staffing change. However, our simulation runs suggest that this backwards spreading effect is small, and that the discontinuous model is a good one. The ECGOS function obtained using Equations (2) and (3) is depicted in Figure 3.
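Equations (2) and (3) can be combined into a single pass over the periods, sketched below in Python. The sketch makes several assumptions that are not in the paper: the steady-state function is supplied by the caller as `ssgos(lam, c)` (here a hypothetical lookup table with made-up values), no staffing change is applied after the final period, and the names and tuning values are purely illustrative.

```python
def smooth_with_staffing(lams, servers, ssgos, delta, kappa_up, kappa_down, g0=100.0):
    """Return [(g_k_minus, g_k)] for periods k = 1..K using equations (2) and (3).

    lams[k-1] and servers[k-1] are the arrival rate and staffing in period k;
    ssgos(lam, c) is any callable returning the steady-state CGOS in [0, 100].
    """
    results = []
    g_prev = g0
    last = len(lams) - 1
    for k, (lam, c_now) in enumerate(zip(lams, servers)):
        ss_now = ssgos(lam, c_now)
        g_minus = delta * g_prev + (1.0 - delta) * ss_now  # equation (2)
        g = g_minus
        if k < last:  # assumption: no staffing change after the last period
            c_next = servers[k + 1]
            change = ssgos(lam, c_next) - ss_now
            kappa = kappa_up if c_next >= c_now else kappa_down
            g = g_minus + kappa * change  # equation (3)
        g = min(100.0, max(0.0, g))  # map out-of-range values to the nearest limit
        results.append((g_minus, g))
        g_prev = g
    return results


# Hypothetical steady-state values, keyed by (arrival rate, servers).
TABLE = {(40.0, 10): 70.0, (40.0, 12): 90.0}


def lookup_ssgos(lam, c):
    return TABLE[(lam, c)]


result = smooth_with_staffing([40.0, 40.0], [10, 12], lookup_ssgos,
                              delta=0.5, kappa_up=0.5, kappa_down=0.2)
print(result)
```

In this example the staffing increase at the end of the first period lifts the ECGOS from its pre-change value to a higher post-change value, which then serves as the starting point for the smoothing step in the second period.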
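[Figure 3: schematic plot of ECGOS against Time, showing $g_0$, $g_1^-$, $g_1$, $g_2^-$ and $g_2$ at times $t_0$, $t_1$, $t_2$ relative to the levels $\mathrm{SSGOS}(\lambda_1, \mu, \alpha, c_1)$ and $\mathrm{SSGOS}(\lambda_2, \mu, \alpha, c_2)$.]

Figure 3: Schematic of ECGOS smoothing with staffing changes c1 [...]

[...] v) > 0. Then $X$ is $\varphi$-irreducible for a certain nontrivial measure $\varphi$. Furthermore, all compact sets are petite, and the chain $X$ is aperiodic.

Proof: Since $P(U_1 > v) > 0$, there exists a $\delta > 0$ such that $P(U_1 > v + \delta) > 0$. Suppose that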
$V_1 \le v$ and $U_1 > v + \delta$. From (A.3), it follows that with the exception of the first server (the one with the minimal workload at time 0), all server workloads either hit 0, or decrease by at least $\delta$, on the next transition. The first server's workload decreases by at least $\delta$ if it was initially greater than $v + \delta$, but otherwise finishes in the interval $[0, v)$. It follows that after a sequence of transitions, it is possible to return the chain to the set $B = [0, v) \times \{0\} \times \cdots \times \{0\}$. If we define $|x| = \max_{1 \le i \le c} t_i$, then it follows that $B$ may be reached from $x$ in at most $|x| / \delta$ transitions.

If $P(R_1 = 0) = 1$, then all customers renege, and 0 is an absorbing state for the Markov chain. We can then take $\varphi$ to be a point mass at 0 and the result is trivial. So assume that $P(R_1 = 0) < 1$. In this case, the first server's workload can reach any value in $[0, v)$. Hence $X$ is $\varphi$-irreducible, where $\varphi$ may be defined via $\varphi([0, a_1] \times [0, a_2] \times \cdots \times [0, a_c]) = \min(a_1, v)$. (This defines the measure on a $\pi$-class of sets, so that the measure is well defined.)

Now, for a given compact set $K$ say, $x \in K \Rightarrow |x| \le M$, for some constant $M < \infty$. Hence it is possible to reach $B$ in at most $M / \delta$ transitions from every $x \in K$. It immediately follows that $K$ is petite. It is easy to see, although tedious to prove, that the chain is aperiodic, and so we omit this part of the proof. $\square$

We now turn to establishing conditions under which the chain $X$ returns to a compact set infinitely often using a Lyapunov function approach. Define $P(x, \cdot) = P(\cdot \mid X_0 = x)$ to be the 1-step transition kernel for $X$, and for a real-valued function $g$, define

$$Pg(x) = E(g(X_1) \mid X_0 = x). \tag{A.4}$$
Lemma 5: Suppose that $0 < EU_1 < \infty$, $EV_1 < \infty$ and $ER_1 < \infty$. Then there exists a compact set $K$ and a finite constant $b$ such that

$$Pg(x) - g(x) \le -\frac{EU_1}{3} + b\, I(x \in K),$$

where $g(x) = |x|$.

Proof: Suppose that $X_0 = x = (t_1, t_2, \ldots, t_c)$, where $t_1 \le t_2 \le \cdots \le t_c$. Given that $X_0 = x$, we have from (A.3) that

$$g(X_1) - g(x) = \begin{cases} [t_c - U_1]^+ - t_c & \text{if } A_1 = 1 \text{ or } \left(A_1 = 0 \text{ and } [t_1 - U_1]^+ + V_1 \le [t_c - U_1]^+\right) \\ [t_1 - U_1]^+ + V_1 - t_c & \text{otherwise.} \end{cases} \tag{A.5}$$

The first case in (A.5) corresponds to the situation where the arriving customer reneges, or the arriving customer does not renege and his service time is sufficiently small that the $c$th server remains the one with the highest workload. The second case occurs when the arriving customer does not renege, and his service time is so large that the first server becomes the one with the highest workload. For an arbitrary random variable $X$ and event $A$, define $E(X; A) = E(X I_A)$, the expectation of the product of $X$ and the indicator of the event $A$. From the above,
$$\begin{aligned} Pg(x) - g(x) = {} & E\left([t_c - U_1]^+ - t_c;\ R_1 \le [t_1 - U_1]^+ \text{ or } \left(R_1 > [t_1 - U_1]^+ \text{ and } [t_1 - U_1]^+ + V_1 \le [t_c - U_1]^+\right)\right) \\ & + E\left([t_1 - U_1]^+ + V_1 - t_c;\ R_1 > [t_1 - U_1]^+ \text{ and } [t_1 - U_1]^+ + V_1 > [t_c - U_1]^+\right). \end{aligned} \tag{A.6}$$

We will show that if $x \notin K$, where $K = [0, \tau] \times \cdots \times [0, \tau]$ and $\tau$ will be defined later, then the second term in (A.6) contributes very little to $Pg(x) - g(x)$, while the first term is sufficiently negative to yield our result. But first note that since $t_1 \le t_c$, (A.5) gives

$$Pg(x) - g(x) = E(g(X_1) \mid X_0 = x) - g(x) \le EV_1 < \infty, \tag{A.7}$$

so that we can take $b = EV_1 + EU_1 / 3$ in the statement of the lemma. Now, let $\varepsilon > 0$ be such that $\varepsilon < EU_1 / 6$. The second term in (A.6) is bounded above by
$$E\left(V_1;\ R_1 > [t_1 - U_1]^+ \text{ and } [t_1 - U_1]^+ + V_1 > [t_c - U_1]^+\right) \le E\left(V_1;\ R_1 > [t_1 - U_1]^+\right) \le E\left(V_1;\ R_1 + U_1 > t_1\right), \tag{A.8}$$

where the inequalities follow from the fact that $V_1$ is a nonnegative random variable, so that increasing the size of the set one is computing the expectation over can only increase the result. Since $EV_1 < \infty$, for sufficiently large $t_1$, say $t_1 > \tau_1$, (A.8) is bounded by $\varepsilon$.
Now, if $t_1 \le \tau_1$, then the second term in (A.6) is bounded above by

$$E\left(V_1;\ [t_1 - U_1]^+ + V_1 > [t_c - U_1]^+\right) \le E\left(V_1;\ \tau_1 + V_1 > [t_c - U_1]^+\right). \tag{A.9}$$

For $t_c$ sufficiently large, say $t_c > \tau_2$, (A.9) is also bounded by $\varepsilon$. Hence, in either case, if $t_c > \tau_2$, then the second term in (A.6) is bounded by $\varepsilon$.

We turn now to the first term in (A.6). This is bounded by

$$E\left([t_c - U_1]^+ - t_c;\ R_1 \le [t_1 - U_1]^+\right). \tag{A.10}$$

Now, as $t_1 \to \infty$, $P\left(R_1 \le [t_1 - U_1]^+\right) \to 1$, so that, by dominated convergence, for $t_1 > \tau_1'$, (A.10) is bounded above by $-EU_1 / 2$. For $t_1 \le \tau_1'$, note that the first term in (A.6) is bounded above by

$$E\left([t_c - U_1]^+ - t_c;\ [t_1 - U_1]^+ + V_1 \le [t_c - U_1]^+\right). \tag{A.11}$$

Again by dominated convergence, it follows that for sufficiently large $t_c$, say $t_c > \tau_2'$, (A.11) is bounded above by $-EU_1 / 2$. Hence, for $t_c > \tau_2'$, the first term in (A.6) is bounded above by $-EU_1 / 2$. Now, if $\tau \triangleq \max(\tau_2, \tau_2')$, then $x \notin K$ implies that $t_c > \max(\tau_2, \tau_2')$, and then, from (A.6) and the above bounds,

$$Pg(x) - g(x) \le -\frac{EU_1}{2} + \varepsilon \le -\frac{EU_1}{3} < 0.$$
Hence, the result is established. $\square$

Proof of Theorem 2: Positive Harris recurrence and convergence in distribution of $X_n$ follow from Lemmas 4 and 5 and Theorem 14.0.1 of Meyn and Tweedie (1993). From (A.1) and (A.2), the distribution of $(A_{n+1}, W_{n+1})$ depends on the distribution of $(X_n, R_{n+1}, U_{n+1})$, or, by independence of $X_n$ and $(R_{n+1}, U_{n+1})$, on the distribution of $X_n$ and that of $(R_{n+1}, U_{n+1})$. From (A.1) and (A.2), we see that the functions defining $(A_{n+1}, W_{n+1})$ have discontinuities only when $R_{n+1} = [X_n(1) - U_{n+1}]^+$ (assuming that $X_n(1) = \min_{1 \le i \le c} X_n(i)$). Our conditions on the distributions of $R_1$ and $U_1$ ensure that these discontinuities are realized with probability 0, so that Theorem 5.1 of Billingsley (1968) yields the result that $(A_n, W_n) \Rightarrow (A_\infty, W_\infty)$ as $n \to \infty$. $\square$

Proof of Proposition 3: Let $(S(t) : t \ge 0)$ denote the continuous-time analogue of the discrete-time process $X$ defined earlier, so that $S(t)$ represents the instantaneous server workloads at time $t$. Let $(t_n : n \ge 1)$ denote a deterministic sequence of increasing times with $t_n \to \infty$ as $n \to \infty$. Let $(R_n : n \ge 1)$ denote an i.i.d. sequence of exponentially distributed random variables with mean $\alpha^{-1}$ that is independent of $(S(t) : t \ge 0)$. It was shown earlier that the Markov process $X$ converges in distribution. With a straightforward extension of that argument, one can show that $(S(t) : t \ge 0)$ is a regenerative process with nonlattice cycle length distribution, so that, by Theorem 20 on p. 120 of Wolff (1989), $S(t) \Rightarrow S(\infty)$ say, as $t \to \infty$. The "Poisson arrivals see time averages" property (Wolff 1989) ensures that the discrete-time and continuous-time workload processes have the same steady-state distribution. It follows that $(S(t_n), R_n) \Rightarrow (S(\infty), R_\infty)$, as $n \to \infty$, where $S(\infty)$ is distributed according to the steady-state distribution of $X$. Theorem 5.1 of Billingsley (1968) then allows us to conclude that $[I(R_n \le \min S(t_n)), \min S(t_n)] \Rightarrow (A_\infty, W_\infty)$ as $n \to \infty$. A second application of Theorem 5.1 of Billingsley (1968) shows that $\mathrm{CGOS}[I(R_n \le \min S(t_n)), \min S(t_n)] \Rightarrow \mathrm{CGOS}(A_\infty, W_\infty)$ as $n \to \infty$, since the discontinuities of the function CGOS have probability 0 (relative to $(A_\infty, W_\infty)$). The Bounded Convergence Theorem completes the proof of the result. $\square$