
Submitted to Operations Research manuscript (Please, provide the mansucript number!)

Stabilizing Customer Abandonment in Many-Server Queues with Time-Varying Arrivals

Yunan Liu, Ward Whitt
Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027-6699, {yl2342, ww2040}@columbia.edu

We develop analytical approximations to determine time-dependent staffing levels that stabilize abandonment probabilities and expected delays in the $M_t/GI/s_t + GI$ many-server queueing model, which has a nonhomogeneous Poisson arrival process (the $M_t$), general service times (the first $GI$) and customer abandonment according to a general patience distribution (the $+GI$). In particular, we propose new offered-load (OL) and modified-offered-load (MOL) approximations involving infinite-server models and show how they can be applied, when the arrival rate is suitably high, to determine a staffing function that makes the time-dependent abandonment probability and expected potential waiting time (the virtual waiting time of a customer with infinite patience) nearly constant over time at targeted levels, after an initial transient. We also develop approximations for other performance measures under this staffing function. These approximations show that the other performance measures are not stabilized to the same extent, and quantify their fluctuations. We perform simulations to show that the approximations are effective.

Key words: staffing; capacity planning; many-server queues; queues with time-varying arrivals; queues with abandonment; infinite-server queues; offered-load approximations; service systems
History: Submitted on June 6, 2009

1. Introduction

Unlike most textbook queueing models, real service systems typically have time-varying arrival rates, usually with significant variation over the day. When service times are relatively short, as in call centers where service can be provided by a short telephone call, it is possible to apply a pointwise stationary approximation (PSA) to analyze the performance and determine appropriate time-dependent staffing levels, which primarily means acting as if the system at each time is in steady state with the arrival rate that prevails at that time; see Green and Kolesar (1991), Whitt (1991). However, if service times are relatively long, as in many healthcare settings, then the PSA


can be inaccurate, leading to inappropriate staffing levels, so that more complex refinements are required, e.g., as in Jennings et al. (1996). For a review of methods to cope with time-varying arrival rates, see Green et al. (2007). For reviews of call centers and associated research problems, see Aksin et al. (2007), Gans et al. (2003).

Recently, Feldman et al. (2008) developed a simulation-based iterative staffing algorithm (ISA) to determine appropriate staffing levels over time for a many-server service system with customer abandonment and a time-varying arrival process. Specifically, the ISA was developed for the $M_t/GI/s_t + GI$ queueing model, which has a nonhomogeneous Poisson arrival process (the $M_t$), a large number $s_t$ of homogeneous servers working in parallel at time $t$ (a function of time $t$), independent and identically distributed (i.i.d.) service times with a general distribution (the first $GI$), unlimited waiting space, the first-come first-served (FCFS) service discipline, and customer abandonment from the queue of customers waiting to start service, with i.i.d. times to abandon having a general distribution (the second $GI$). There is strong interest in considering models with non-exponential service-time and abandonment-time distributions, because system measurements indicate that these distributions are often far from exponential; see Brown et al. (2005).

Since the ISA is based on simulation, it can easily be applied to this non-Markovian setting and even to much more general models. It also has the advantage of providing automatic validation: in the process of performing the ISA, we directly confirm that the ISA achieves its goal. Feldman et al. (2008) applied the ISA to stabilize the delay probability (the probability that an arrival must wait in queue before starting service); i.e., they determined a time-dependent staffing function so that the time-dependent delay probability is approximately constant at any desired target level.
As illustrated by Figure 3 of Feldman et al. (2008), the ISA was shown to be remarkably effective at stabilizing the delay probability in the fully Markovian $M_t/M/s_t + M$ model, for all possible quality-of-service levels, provided that the arrival rate is suitably large, so that a large number of servers (e.g., 100) is actually required. The target delay probability was allowed to range from 0.1 (lightly loaded) to 0.9 (heavily loaded). Indeed, as illustrated by Figures 5 and 6 of Feldman et al. (2008), with ISA staffing the delay probability becomes essentially the



same as in a corresponding stationary model with constant arrival rate. Thus, from the perspective of the delay probability, the effect of the time-varying arrival rate can be eliminated by applying the ISA to choose the staffing level appropriately.

Even though the ISA can stabilize the delay probability, the effect of the time-varying arrival rate cannot be eliminated entirely. When the ISA is applied to stabilize the delay probability, it also stabilizes other performance measures to some extent, but as illustrated by Figure 4 of Feldman et al. (2008) and Figure 6 of the accompanying e-companion, some of these other performance measures are not completely stabilized. In particular, significant fluctuations are seen in the time-varying abandonment probability and average delay with a sinusoidal arrival-rate function under heavier loads. From Feldman et al. (2008), we already know that it is not possible to stabilize all performance measures at once across a wide range of loads in the $M_t/GI/s_t + GI$ setting.

In this paper we develop new analytical approximation methods for staffing to stabilize the abandonment probability and the expected delay at a wide range of target levels in the $M_t/GI/s_t + GI$ model. We also develop methods to determine the extent to which other performance measures are stabilized as well. We then apply simulation to verify that the staffing algorithm and approximation formulas are effective. We give an overview of our approach in the next section. At the end of that section, we indicate how the rest of the paper is organized.

2. New Offered-Load (OL) Approximations

To stabilize the abandonment probability and the expected delay at target levels in the $M_t/GI/s_t + GI$ model, we introduce new offered-load (OL) and modified-offered-load (MOL) approximations, involving infinite-server models. The MOL approximation is a refinement of the OL approximation, so in this section we only discuss the OL approximation.

The general idea of an OL approximation is to initially assume that there are as many resources (here servers) as we need and then see how many are actually used. After we have determined how many servers would be needed if there were no constraint on their availability, we staff to provide the number of servers needed. Since the model is stochastic, the time-dependent number of servers



used itself is random, so we must use some deterministic time-dependent partial characterization of this time-dependent distribution, such as the time-dependent mean or that time-dependent mean plus some multiple of the associated time-dependent standard deviation, as in the staffing algorithm proposed by Jennings et al. (1996). In application, the number of servers provided will not always be adequate, so that there will be some congestion.

This OL approach is promising largely because the associated $M_t/GI/\infty$ infinite-server model is remarkably tractable, as indicated by Eick et al. (1993a,b), Massey and Whitt (1993) and the references therein. Indeed, in this model, the number of customers in the system at time $t$ has a Poisson distribution for each $t$, with a convenient integral expression for the time-dependent mean. Since the variance equals the mean for a Poisson random variable, we know the standard deviation when we know the mean. Moreover, the distribution is approximately normal and is relatively close to the mean as the mean increases (which happens as the arrival rate increases), because the standard deviation is the square root of the mean.

This general OL approach using infinite-server models is not new, but here we use infinite-server models in a new way. Instead of directly replacing the $M_t/GI/s_t + GI$ model by the $M_t/GI/\infty$ infinite-server model, which ignores the customer abandonment, we represent our model as two infinite-server facilities in series: first the waiting room (or the queue), and then the service facility, as depicted in Figure 1. Suppose that our target delay is the constant value $w$. Even though the service facility in our approximation has infinitely many servers, we require that all external arrivals enter the waiting room, which has infinite capacity, and spend the fixed time $w$ there before they move on to the service facility. (That is done in the approximation, not in the actual system.)
While in the waiting room, each customer may abandon instead of entering service, after which the customer is lost. As in the original model, the abandonment times of successive arrivals to the queue are i.i.d. random variables with cumulative distribution function (cdf) $F$. We call this particular OL approximating model the delayed infinite-server-OL (DIS-OL) model. In the DIS-OL approximation with parameter $w$, the customer always enters service after spending time $w$ in the queue, if the customer has not



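Figure 1. The delayed infinite-server offered-load (DIS-OL) approximation for the $M_t/GI/s_t + GI$ queueing model. [Diagram: arrivals at rate $\lambda(t)$ enter the waiting room, with content $Q(t)$ and patience cdf $F$, where each customer waits $w$; abandonments leave the waiting room at rate $\alpha(t)$; the remaining customers enter the service facility, with content $B(t)$ and service-time cdf $G$, at rate $\beta(t) = \lambda(t - w)\bar F(w)$ and depart at rate $\sigma(t)$.] The contents $Q(t)$ and $B(t)$ are independent Poisson random variables for each $t$; the three flows are Poisson processes.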

yet abandoned. That rule is possible because the service facility has infinitely many servers. We assume the system starts empty at time 0 and we let the first customers enter service after time $w$. Thus, for $t \ge w$, customers enter the service facility at rate $\lambda(t - w)\bar F(w)$, where $\lambda(\cdot)$ is the arrival-rate function and $\bar F \equiv 1 - F$.

Since all arrivals wait precisely the target time $w$ before entering service, if they do not elect to abandon, with DIS-OL the approximate abandonment probability is always $F(w)$. Hence we can initially specify either the target abandonment probability $\alpha \equiv P(Ab)$ or the target delay $w$. If $F$ is continuous, then there always is a $w$ such that $F(w) = \alpha$ for any given $\alpha$. If $F$ is also strictly increasing, then $w = F^{-1}(\alpha)$. We assume that $F$ is continuous and strictly increasing.

Clearly the DIS-OL approximation produces constant abandonment probability $\alpha$ and delay $w$ for all customers that do not abandon. An important theoretical reference point for this property is the deterministic fluid approximation that is appropriate for many-server queues in the efficiency-driven (ED, heavily loaded) many-server heavy-traffic limiting regime, as in Whitt (2006). Thus the DIS-OL approximation is consistent with known behavior under heavy loads. Indeed, the fact that the DIS-OL approximation should be asymptotically correct in that regime was a primary source of motivation for considering the approximation.

This particular OL approximation is remarkably tractable because both facilities can be analyzed as $M_t/GI/\infty$ queues. We use the stochastic process $\{B(t) : t \ge 0\}$ representing the number of busy servers through time as a basis for setting the staffing levels in the actual system. As in Jennings et al. (1996), we can staff at the mean $m(t) \equiv E[B(t)]$ plus some constant multiple of the standard deviation $\sqrt{m(t)}$, with that constant depending on the desired quality of service (QoS).

Closely paralleling Feldman et al. (2008), in simulation experiments, we find that it is possible

to stabilize the abandonment probability (or the expected waiting time) at any desired target, for a wide range of targets and associated loads, when we apply the DIS-OL approximation to the $M_t/GI/s_t + GI$ queueing model. To do so, we can staff to achieve target $\alpha$ according to the classical square-root-staffing (SRS) rule, letting

$$s_\alpha(t) = m_\alpha(t) + \beta_\alpha \sqrt{m_\alpha(t)}, \tag{1}$$

where $m_\alpha(t) \equiv E[B(t)]$ and $\beta_\alpha$ is the appropriate QoS parameter with abandonment probability target $\alpha$. This SRS rule is just as in Feldman et al. (2008), except that we use the new DIS-OL version of the OL $m_\alpha(t)$ (as opposed to the OL from a direct infinite-server approximation, with a single facility, which corresponds to $m_0(t)$, i.e., $m_\alpha(t)$ when $\alpha = 0$). Under heavier loads, the two offered loads are different, but under lighter loads, the two offered loads are very close. Under heavy loads, SRS with our DIS-OL successfully stabilizes the abandonment probability, whereas SRS with the direct OL approximation does not. Under lighter loads, where they agree, both are effective.

The first conclusion from our research is that the SRS rule based on the DIS-OL is effective for stabilizing the abandonment probability and the expected waiting time across a wide range of loads



(or targets). The second conclusion is that there are effective ways to determine the appropriate value of the QoS parameter $\beta_\alpha$ in (1), depending on the model and the performance target.

The MOL refinement plays an important role in this second step. First, we derive a full DIS-MOL staffing algorithm, which itself is effective, directly yielding an appropriate staffing function $s(t)$. Then we show that the DIS-MOL staffing algorithm is consistent with the SRS rule for some QoS parameter $\beta_\alpha$. (This logic follows Feldman et al. (2008), where first the ISA was developed to stabilize the delay probability and then, afterwards, it was shown to be consistent with SRS. In that context, the QoS parameter $\beta_\alpha$ could be related to many-server heavy-traffic limits in Garnett et al. (2002).) Afterwards, we develop an explicit approximation formula for $\beta_\alpha$ for the Markovian $M_t/M/s_t + M$ model.

We are also able to derive explicit expressions for many other performance measures with the DIS-OL approximation. Thus we see that they are indeed not simultaneously stabilized under heavy loads and we obtain estimates of the extent of their fluctuations.

Here is how the rest of this paper is organized: We start in §3 by giving explicit expressions for all the key performance measures for the proposed DIS-OL approximation of the $M_t/GI/s_t + GI$ model with fixed delay target $w$. To provide additional insight, we also give explicit formulas in structured special cases, when the arrival-rate function is sinusoidal or quadratic. In §4 we first indicate how we can apply the DIS-OL approximation to set staffing levels in the actual system, but initially it remains to determine the QoS parameter $\beta_\alpha$ in (1). (That could be done by simulation.) In order to obtain a direct analytical staffing method, we then develop the modified-offered-load (MOL) refinement, which we call the DIS-MOL approximation. The MOL refinement follows §3.2 of Feldman et al. (2008) and references therein.
In §5 we perform simulation experiments to validate the approximations, considering Markovian $M_t/M/s_t + M$ examples with a sinusoidal arrival-rate function. We also consider corresponding many-server service systems with non-exponential service times and abandonment times in the e-companion. Our simulations add real system constraints. First, the staffing levels must be integer-valued, so they must be rounded. Second, when the staffing levels decrease, we do not remove



servers until they complete the service in progress. Throughout we assume that server assignments can be switched when a server leaves, so that only the total number of servers matters. As a consequence, when the number of servers is scheduled to decrease by one while all servers are busy, a server leaves at the time of the next service completion. In addition, we have considered specified staffing intervals, e.g., requiring that staffing changes can only be made on the half hour. We discuss this modification in the e-companion.

In §6 we relate the SRS formula to the DIS-MOL staffing function and the DIS-OL function. We give a relatively simple approximation formula for the QoS parameter $\beta_\alpha$. We also discuss many-server heavy-traffic limits and approximations, which provide useful insight into these approximations, just as in Feldman et al. (2008). Finally, in §7 we draw conclusions.

We present additional material in the e-companion. In §EC.1 we present results of simulation experiments showing that the approximations and staffing algorithm are effective for $M_t/GI/s_t + GI$ models with non-exponential service and abandonment distributions. In §EC.2 we discuss a connection between the offered-load analysis and an associated deterministic fluid model studied in Liu and Whitt (2009). In §EC.3 we consider several discretization issues and real-world constraints on the staffing functions. In §EC.4 we present results of simulation experiments showing that the approximations are effective when both the arrival rates and the number of servers are small. In §EC.5 we study the sensitivity of the performance to a change of a single server.

3. The DIS-OL Approximation

We now describe the performance in the DIS-OL approximating model depicted in Figure 1. We start with a targeted waiting time $w$ and the original $M_t/GI/s_t + GI$ model, specified by the arrival-rate function $\lambda$, the service-time cdf $G$ and the abandonment-time cdf $F$. Let $S$ and $A$ be generic service-time and abandonment-time random variables; i.e., $G(x) \equiv P(S \le x)$ and $F(x) \equiv P(A \le x)$ for $x \ge 0$. Assume that $E[S] < \infty$. (We do not need to assume that $E[A] < \infty$ because in our approximation abandonments can only occur before time $w$.) We also assume that $F$ has no point mass at $w$, i.e., $P(A = w) = 0$. Doing so avoids ambiguity about customer action after waiting



in queue for time $w$. We aim to specify the time-dependent number of servers $s(t)$ such that the expected potential delay in the $M_t/GI/s_t + GI$ model is stabilized at $w$. The potential delay is the virtual waiting time of a customer with unlimited patience. Our approximating DIS-OL model stabilizes delays at $w$ perfectly.

While in the waiting room (prior to time $w$), each customer may abandon, after which the customer is lost. As in the $M_t/GI/s_t + GI$ model, the times to abandon are i.i.d. random variables with cdf $F$. As a consequence, the abandonment probability is the same for each customer: Each customer eventually abandons with probability $F(w)$, and is eventually served with probability $\bar F(w) \equiv 1 - F(w)$. Customers entering service have i.i.d. service times with cdf $G$.

The approximating model thus becomes a network of two $M_t/GI/\infty$ queues in series. The waiting room has arrival-rate function $\lambda$ and service times distributed as $T \equiv A \wedge w \equiv \min\{A, w\}$, while the service facility has arrival rate $\lambda(t - w)\bar F(w)$ and the given service-time cdf $G$. Let $F_T$ be the associated cdf of the truncated random variable $T$, i.e.,

$$F_T(x) \equiv P(T \le x) = \begin{cases} F(x), & 0 \le x < w, \\ 1, & x \ge w. \end{cases} \tag{2}$$
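As a small numerical illustration of the truncated patience $T = A \wedge w$ in (2), the following Python sketch computes $E[T] = \int_0^w \bar F(x)\,dx$ by the trapezoid rule and compares it with the closed form for exponential patience. The exponential patience distribution, the rate `theta = 0.5`, the target `alpha = 0.1` and the helper name `mean_truncated_patience` are assumptions made only for this illustration.

```python
import math

def mean_truncated_patience(F_bar, w, n=2000):
    """E[min(A, w)] = integral over [0, w] of F_bar(x), by the composite trapezoid rule."""
    if w <= 0:
        return 0.0
    h = w / n
    total = 0.5 * (F_bar(0.0) + F_bar(w))
    for i in range(1, n):
        total += F_bar(i * h)
    return total * h

theta = 0.5                          # assumed abandonment rate (exponential patience)
alpha = 0.1                          # assumed target abandonment probability
w = -math.log(1.0 - alpha) / theta   # w = F^{-1}(alpha) for exponential F
F_bar = lambda x: math.exp(-theta * x)

mean_T = mean_truncated_patience(F_bar, w)
closed_form = (1.0 - math.exp(-theta * w)) / theta   # equals alpha / theta here
print(w, mean_T, closed_form)
```

In this exponential case the point mass of $T$ at $w$ is $\bar F(w) = 1 - \alpha$, and $E[T] = \alpha/\theta$.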

We see that $T$ has a point probability mass at $w$, since $P(T = w) = P(A \ge w) = \bar F(w)$.

As in Eick et al. (1993a), we assume that the system starts in the infinite past (at $t = -\infty$), with the policy above (customers entering service after waiting $w$ if they have not yet abandoned). With this convention, all processes are defined on the entire real line. That is convenient both for some formulas and for representing the dynamic steady state associated with periodic arrival-rate functions, as in Eick et al. (1993b). If we want the system to start empty at time 0, then we can simply let $\lambda(t) = 0$ for all $t < 0$.

We can split the arrival process into two independent Poisson processes, one for the customers who will eventually be served and the other for the customers who will eventually abandon. Each customer is eventually served with probability $\bar F(w)$. We can further modify these two Poisson processes to obtain independent Poisson processes for the process counting customers entering service (at the times they enter service) and the process counting customers abandoning (at the



times they abandon). Each can be represented as the departure process from an $M_t/GI/\infty$ queue, which corresponds to the original Poisson arrival process modified by having its points translated to the right by i.i.d. random variables; that is well known to preserve the Poisson property. For the counting process counting customers entering service, the service time is the constant value $w$; for the counting process counting customers abandoning, the service time is the random value $(T \mid T < w)$. By this construction, we have proved that the arrival process to the service facility is indeed a nonhomogeneous Poisson process with rate $\lambda(t - w)\bar F(w)$ at time $t$.

We can thus apply established results for the $M_t/GI/\infty$ model, in particular Theorem 1 of Eick et al. (1993a). Let $Q(t, y)$ denote the number of customers in queue at time $t$ that have remaining time before abandonment greater than $y$ and let $Q(t) \equiv Q(t, 0)$ denote the total number of customers in queue at time $t$. The random variable $Q(t, y)$ is depicted in Figure 2, which parallels Figure 1 of Eick et al. (1993a). We put a point at $(x, y)$ in the plane if there is an arrival at $x$ with abandonment time $y$. Thus $Q(t, y)$ is the number of points in the shaded region above the interval $[t - w, t]$, as shown. All arrivals with

Figure 2. The random variable $Q(t, y)$ representing the queue content at time $t$ that has remaining time before abandonment greater than $y$, in the DIS-OL approximation. [Diagram: horizontal axis, arrival time, with $t - w$ and $t$ marked; vertical axis, abandonment time, with $w$ and $y$ marked; regions labeled "have entered service", "will be served", "have abandoned", "will abandon" and $Q(t, y)$.]



abandonment times greater than $w$ will be served. All arrivals before time $t - w$ with abandonment times greater than $w$ will have entered service before time $t$.

Let $B(t, y)$ denote the number of customers in the service facility at time $t$ that have remaining service time greater than $y$ and let $B(t) \equiv B(t, 0)$ denote the total number of busy servers (number of customers in the service facility) at time $t$. Let $X(t) \equiv Q(t) + B(t)$ be the number of customers in the system at time $t$. Let $W(t)$ be the potential waiting time at time $t$, i.e., the virtual waiting time before entering service of a customer with infinite patience arriving at time $t$.

We now summarize the main structural results for the approximation. For a nonnegative random variable $Z$ with finite mean $E[Z]$ and cdf $H$, let $Z_e$ denote a random variable with the associated stationary-excess cdf (or residual-lifetime cdf) $H_e$, defined by

$$H_e(y) \equiv P(Z_e \le y) \equiv \frac{1}{E[Z]} \int_0^y \bar H(x)\,dx, \quad y \ge 0. \tag{3}$$

The moments of $Z_e$ can be easily expressed in terms of the moments of $Z$ via

$$E[Z_e^k] = \frac{E[Z^{k+1}]}{(k+1)E[Z]}, \quad k \ge 1. \tag{4}$$
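To make (4) concrete, here is a minimal Python sketch that evaluates the mean and variance of $Z_e$ from the raw moments of $Z$. The two test distributions (exponential with rate `mu = 2` and deterministic with value `d = 3`) and the helper names are assumptions chosen only because their stationary-excess distributions are known in closed form: $Z_e$ is again exponential in the first case and uniform on $[0, d]$ in the second.

```python
import math

def excess_moment(raw_moment, k):
    """k-th moment of the stationary-excess variable Z_e via (4).

    raw_moment(j) must return E[Z^j] for j >= 1.
    """
    return raw_moment(k + 1) / ((k + 1) * raw_moment(1))

def excess_mean_and_variance(raw_moment):
    """Return (E[Z_e], Var[Z_e]) computed from the first three moments of Z."""
    m1 = excess_moment(raw_moment, 1)
    m2 = excess_moment(raw_moment, 2)
    return m1, m2 - m1 ** 2

mu = 2.0
exponential = lambda j: math.factorial(j) / mu ** j   # E[Z^j] for Z ~ Exp(mu)
d = 3.0
deterministic = lambda j: d ** j                      # E[Z^j] for Z = d

print(excess_mean_and_variance(exponential))
print(excess_mean_and_variance(deterministic))
```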

Both $B(t, y)$ and $Q(t, y)$ are functions of the model parameter $\alpha$ (or $w = F^{-1}(\alpha)$), but in this section we drop the subscript for simplicity.

Theorem 1. Consider the DIS-OL approximation for the $M_t/GI/s_t + GI$ model specified above, starting in the distant past with specified delay target (parameter) $w \ge 0$ or abandonment probability target $\alpha = F(w)$. The approximation makes $W(t) = w$ with probability 1 and the probability of abandonment $F(w)$ for all arrivals. Moreover, $Q(t, y_1)$ and $B(t, y_2)$ are independent Poisson random variables for each $t$ and each pair $(y_1, y_2)$ with $y_1 > 0$ and $y_2 > 0$, having means

$$\begin{aligned}
E[Q(t, y)] &= \int_{t-w}^{t} \lambda(x)\,\bar F(t + y - x)\,dx = \int_{0}^{w} \lambda(t - x)\,\bar F(y + x)\,dx, \\
E[B(t, y)] &= \bar F(w) \int_{-\infty}^{t} \lambda(x - w)\,\bar G(t + y - x)\,dx = \bar F(w) \int_{0}^{\infty} \lambda(t - w - x)\,\bar G(y + x)\,dx.
\end{aligned} \tag{5}$$

The total numbers of customers in queue and in service at time $t$, $Q(t)$ and $B(t)$ respectively, are independent Poisson random variables with means

$$\begin{aligned}
E[Q(t)] &= E[Q(t, 0)] = \int_{-\infty}^{t} \lambda(x)\,\bar F_T(t - x)\,dx = \int_{t-w}^{t} \lambda(x)\,\bar F_T(t - x)\,dx \\
&= E\left[\int_{t-T}^{t} \lambda(x)\,dx\right] = E[\lambda(t - T_e)]\,E[T], \\
E[B(t)] &= E[B(t, 0)] = \bar F(w) \int_{-\infty}^{t} \lambda(x - w)\,\bar G(t - x)\,dx \\
&= \bar F(w)\,E\left[\int_{t-w-S}^{t-w} \lambda(x)\,dx\right] = \bar F(w)\,E[\lambda(t - w - S_e)]\,E[S].
\end{aligned} \tag{6}$$

Thus, $X(t)$, the total number of customers in the system at time $t$, is a Poisson random variable with mean $E[Q(t)] + E[B(t)]$. The processes counting the numbers of customers abandoning and entering service are independent Poisson processes with rate functions $\alpha$ and $\beta$, respectively, where

$$\alpha(t) = \int_{0}^{w} \lambda(t - x)\,dF(x) = F(w)\,E[\lambda(t - T) \mid T < w], \qquad \beta(t) = \lambda(t - w)\bar F(w). \tag{7}$$

The departure process (counting the number of customers completing service) is a Poisson process with rate

$$\sigma(t) = \bar F(w) \int_{0}^{\infty} \lambda(t - w - x)\,dG(x) = \bar F(w)\,E[\lambda(t - w - S)]. \tag{8}$$

Proof. For the most part, these results are a direct application of Theorem 1 of Eick et al. (1993a). Understanding is facilitated by drawing pictures of the underlying Poisson random measure, as in Figure 2. Here we elaborate on only one point: To establish the claim that $Q(t, y_1)$ is independent of $B(t, y_2)$ for each $(y_1, y_2)$, first observe that the departure process from the queue prior to time $t$ is independent of $\{Q(t, y) : y \ge 0\}$. That implies that the process of customers entering service prior to time $t$ is also independent of $\{Q(t, y) : y \ge 0\}$. But $B(t, y_2)$ depends on the history of $Q$ up to time $t$ only through its own arrival process up to time $t$.

As discussed in Eick et al. (1993a), the last two representations for $E[Q(t)]$ and $E[B(t)]$ in (6) are appealing because they show random time lags, but these random time lags appear inside the expectation in a nonlinear fashion. We get convenient explicit approximations when the arrival-rate function $\lambda$ is a polynomial, as we discuss below. We can see directly that the targeted wait of $w$ before starting service increases the random time lags in $E[B(t)]$ by $w$. This will be negligible if $w$ is small, but not otherwise. As noted in Eick et al. (1993a), the time lag in $E[B(t)]$ involving $w + S_e$ is different from the time lag $w + S$ appearing in the departure rate $\sigma(t)$.
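The integral expressions for $E[Q(t)]$ and $E[B(t)]$ in Theorem 1 can be evaluated numerically for any arrival-rate function and any pair of cdfs. The following Python sketch does so with a composite Simpson rule, using only the standard library. The function names, the truncation `horizon` of the infinite integral, and the parameter values (sinusoidal arrivals defined on the whole real line, exponential service and exponential patience) are illustrative assumptions, not part of the theorem.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def mean_queue(lam, F_bar, w, t):
    """E[Q(t)] = integral over [0, w] of lam(t - x) * F_bar(x), i.e. (5) with y = 0."""
    return simpson(lambda x: lam(t - x) * F_bar(x), 0.0, w)

def mean_busy(lam, F_bar, G_bar, w, t, horizon=40.0, n=8000):
    """E[B(t)] = F_bar(w) * integral over [0, inf) of lam(t - w - x) * G_bar(x).

    The infinite integral is truncated at `horizon`, which must be large
    enough for G_bar(horizon) to be negligible.
    """
    integral = simpson(lambda x: lam(t - w - x) * G_bar(x), 0.0, horizon, n)
    return F_bar(w) * integral

# Assumed example: sinusoidal arrivals, exponential service (rate mu),
# exponential patience (rate theta), target abandonment probability alpha.
a, b, c = 100.0, 20.0, 1.0
mu, theta, alpha = 1.0, 0.5, 0.1
w = -math.log(1.0 - alpha) / theta          # w = F^{-1}(alpha)
lam = lambda t: a + b * math.sin(c * t)
F_bar = lambda x: math.exp(-theta * x)
G_bar = lambda x: math.exp(-mu * x)

for t in (0.0, 2.0, 4.0):
    print(t, mean_queue(lam, F_bar, w, t), mean_busy(lam, F_bar, G_bar, w, t))
```

To model a system that starts empty at time 0, one would instead pass an arrival-rate function that returns 0 for negative arguments, as described in the text.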



Since the departure process from the service facility is a Poisson process, we see that the approximation extends immediately to yield corresponding approximations for any number of $M_t/GI/s_t + GI$ models in series, in the spirit of Massey and Whitt (1993).

Since many service systems have daily cycles, it is natural to consider sinusoidal and other periodic arrival-rate functions, as was done in Jennings et al. (1996), Feldman et al. (2008). When we do so, we can apply Eick et al. (1993b) to obtain explicit formulas for the performance measures in that setting. For periodic arrival processes, it is natural to focus upon the dynamic steady state, which prevails because we have started the system empty at time $t = -\infty$. The position within the cycle is determined by simply defining the arrival-rate function over the entire real line, and assuming that it is periodic. In that way, time 0 corresponds to a definite place within a cycle.

Theorem 2. Consider the DIS-OL approximation for the $M_t/GI/s_t + GI$ model with sinusoidal arrival-rate function $\lambda(t) = a + b \sin(ct)$ and delay target $w$. Then $Q(t)$ and $B(t)$ are independent Poisson random variables having sinusoidal means

$$\begin{aligned}
E[Q(t)] &= E[T]\,\bigl(a + \gamma \sin(ct - \theta)\bigr), && \text{for } \theta \equiv \arctan[\phi_1(T)/\phi_2(T)], \quad \gamma \equiv b\sqrt{\phi_1(T)^2 + \phi_2(T)^2}, \\
E[B(t)] &= \bar F(w)\,E[S]\,\bigl(a + \tilde\gamma \sin[c(t - w) - \tilde\theta]\bigr), && \text{for } \tilde\theta \equiv \arctan[\phi_1(S)/\phi_2(S)], \quad \tilde\gamma \equiv b\sqrt{\phi_1(S)^2 + \phi_2(S)^2},
\end{aligned}$$

where $\phi_1(X) \equiv E[\sin(cX_e)]$ and $\phi_2(X) \equiv E[\cos(cX_e)]$ for a nonnegative random variable $X$.

It is easy to see that the extreme values of $E[Q(t)]$ and $E[B(t)]$ occur at $t_Q = t_\lambda + \theta/c$ and $t_B = t_\lambda + w + \tilde\theta/c$, where $t_\lambda = \pi/(2c) + n\pi/c$, for integer $n$, are the times at which the extreme values of $\lambda(t)$ occur. The extreme values are $E[Q(t_Q)] = E[T](a \pm \gamma)$ and $E[B(t_B)] = \bar F(w)E[S](a \pm \tilde\gamma)$. As shown in Eick et al. (1993b), much nicer formulas are obtained in the special case of exponential service times, but we do not display them.

As discussed in §4 of Eick et al. (1993a), Massey and Whitt (1997) and §4 of Green et al. (2007), for general arrival-rate functions, it is convenient to use simple approximations stemming



from linear and quadratic approximations formed by Taylor series approximations. These generate simple time lags and space shifts.

Theorem 3. Consider the DIS-OL approximation for the $M_t/GI/s_t + GI$ model with quadratic arrival-rate function $\lambda(t) = a + bt + ct^2$ and delay target $w$. Then $Q(t)$ and $B(t)$ are independent Poisson random variables having quadratic means

$$\begin{aligned}
E[Q(t)] &= E[T]\,\lambda(t - E[T_e]) + c\,E[T]\,\mathrm{Var}[T_e], \\
E[B(t)] &= \bar F(w)\,E[S]\,\lambda(t - w - E[S_e]) + c\,\bar F(w)\,E[S]\,\mathrm{Var}[S_e].
\end{aligned}$$

The quadratic case shows a deterministic time shift, which corresponds to interchanging the order of expectation in Theorem 1, and a space shift that is the variance term multiplied by a constant. We obtain expressions for linear arrival-rate functions by simply letting the quadratic coefficient be $c = 0$.
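As a sanity check on the time lag and space shift in Theorem 3, the following Python sketch compares the closed-form expression for $E[B(t)]$ with direct numerical integration of the corresponding integral in Theorem 1. It assumes exponential service times with rate `mu`, for which $S_e$ has the same distribution as $S$, and exponential patience with rate `theta`. The quadratic coefficients, the target `alpha` and all function names are hypothetical choices made for the illustration.

```python
import math

def simpson(f, lo, hi, n):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

# Hypothetical quadratic arrival-rate function lam(t) = a + b t + c t^2.
a, b, c = 100.0, 5.0, -0.5
lam = lambda t: a + b * t + c * t * t

mu, theta, alpha = 1.0, 0.5, 0.1          # assumed rates and target
w = -math.log(1.0 - alpha) / theta        # w = F^{-1}(alpha), exponential patience
F_bar_w = 1.0 - alpha                     # probability of eventually being served

mean_S = 1.0 / mu                         # E[S]
mean_Se = 1.0 / mu                        # E[S_e]: exponential S_e is distributed as S
var_Se = 1.0 / mu ** 2                    # Var[S_e]

def busy_closed_form(t):
    """E[B(t)] from Theorem 3: shifted arrival rate plus the variance term."""
    return (F_bar_w * mean_S * lam(t - w - mean_Se)
            + c * F_bar_w * mean_S * var_Se)

def busy_by_integration(t, horizon=60.0, n=12000):
    """E[B(t)] from Theorem 1, truncating the infinite integral at `horizon`."""
    integral = simpson(lambda x: lam(t - w - x) * math.exp(-mu * x), 0.0, horizon, n)
    return F_bar_w * integral

for t in (0.0, 3.0, 6.0):
    print(t, busy_closed_form(t), busy_by_integration(t))
```

The two columns should agree up to quadrature error, since for a quadratic $\lambda$ the expectation $E[\lambda(t - w - S_e)]$ depends on $S_e$ only through its mean and variance.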

4. Direct DIS-OL Staffing and the DIS-MOL Approximation

We first indicate how the DIS-OL approximation can be used to specify the staffing function $s_\alpha(t)$ in the original $M_t/GI/s_t + GI$ model in order to achieve target $\alpha \equiv P(Ab)$. Afterwards we introduce the MOL refinement.

For any given target abandonment probability $\alpha$ with the direct DIS-OL approximation, the number of servers actually used at time $t$ would be the random variable $B_\alpha(t)$. With the DIS-OL approximation, $B_\alpha(t)$ has a Poisson distribution with mean $m_\alpha(t) \equiv E[B_\alpha(t)]$, for which (6) gives an explicit formula. Hence $B_\alpha(t)$ is approximately normally distributed with both mean and variance equal to $m_\alpha(t)$. The simple DIS-OL staffing approximation is to set $s_\alpha(t) = m_\alpha(t)$. We will show that the simple DIS-OL staffing policy is effective for heavily loaded systems, but not for normally loaded systems.

More generally, we propose the SRS formula (1) for some constant $\beta_\alpha$ that should be a decreasing function of the targeted abandonment probability $\alpha = F(w)$, but which should be independent of time $t$. Since $\beta_\alpha$ is independent of time, we can find it by an elementary search using simulation. After we find it, it could be used in subsequent applications.
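The SRS rule (1) is straightforward to evaluate once $m_\alpha(t)$ is available. The Python sketch below uses the closed form of Theorem 2 for exponential service times with rate `mu`, for which $\phi_1(S) = c\mu/(\mu^2 + c^2)$ and $\phi_2(S) = \mu^2/(\mu^2 + c^2)$. The parameter values, the value of `beta` (which in practice would come from a simulation search or an approximation formula) and the choice to round up with `math.ceil` are all assumptions made for illustration.

```python
import math

# Assumed example parameters: lam(t) = a + b sin(c t), exponential service
# (rate mu), exponential patience (rate theta), target abandonment alpha.
a, b, c = 100.0, 20.0, 1.0
mu, theta = 1.0, 0.5
alpha = 0.1
beta = 0.5                                   # hypothetical QoS parameter

w = -math.log(1.0 - alpha) / theta           # w = F^{-1}(alpha)
gamma_tilde = b * mu / math.sqrt(mu ** 2 + c ** 2)   # amplitude in Theorem 2
theta_tilde = math.atan(c / mu)                      # phase lag in Theorem 2

def offered_load(t):
    """DIS-OL offered load m_alpha(t) = E[B(t)] for exponential service."""
    return (1.0 - alpha) / mu * (a + gamma_tilde * math.sin(c * (t - w) - theta_tilde))

def srs_staffing(t, beta):
    """Square-root staffing (1), rounded up to an integer number of servers."""
    m = offered_load(t)
    return math.ceil(m + beta * math.sqrt(m))

for t in (0.0, 1.5, 3.0, 4.5):
    print(t, round(offered_load(t), 2), srs_staffing(t, beta))
```

Setting `beta = 0` recovers the simple DIS-OL staffing $s_\alpha(t) = m_\alpha(t)$, up to rounding.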



However, for lighter loads, we can do better. It has been observed that an approximation based on an infinite-server model can often be improved by replacing it by an associated modified-offered-load (MOL) approximation; see §4 of Jennings et al. (1996) and §3.2 of Feldman et al. (2008). (Such MOL approximations go back to Jagerman (1975); also see Massey and Whitt (1994, 1997).) For the $M_t/GI/s_t + GI$ model, the MOL approximation for the number of customers in the system at time $t$, $X_\alpha(t)$, is the steady-state number of customers in an associated stationary $M/GI/s + GI$ model with $s = s(t)$ and a modified constant arrival rate $\lambda = \lambda^{MOL}_\alpha(t)$, based on the offered load, interpreted as the expected number of busy servers.

We explain in our context: For that purpose, let $P_t(Ab)$ be the time-dependent probability that an arrival at time $t$ will eventually abandon. For a stationary model, the offered load is defined as $\lambda(1 - P(Ab))E[S]$, the rate customers enter service, $\lambda(1 - P(Ab))$, multiplied by the mean service time, $E[S]$. A candidate time-dependent generalization would be the PSA offered load $\lambda(t)(1 - P_t(Ab))E[S]$, but $\lambda(t)(1 - P_t(Ab))$ is the rate of arrivals at time $t$ that will eventually enter service; it is not the rate customers actually enter service at time $t$. By Little's law applied to the service facility in the stationary model, $\lambda(1 - P(Ab))E[S]$ is also the expected number of busy servers in steady state. Experience indicates that the mean number of busy servers in the infinite-server system tends to be a far better analog of the offered load in a nonstationary model.

Thus, to obtain the DIS-MOL approximation for the staffing $s(t)$ at time $t$ to achieve the target $\alpha$, we let $w = F^{-1}(\alpha)$ and we look at the steady-state distribution of the stationary $M/GI/s + GI$ model applied with $s = s(t)$ and arrival rate

$$\lambda^{MOL}_\alpha(t) \equiv \frac{m_\alpha(t)}{E[S](1 - \alpha)}. \tag{9}$$

For each fixed time t, we let the DIS-MOL staffing level $s^{MOL}_\alpha(t)$ be the least integer staffing level such that the stationary abandonment probability P (Ab) in the M/GI/s + GI model with arrival rate $\lambda = \lambda^{MOL}_\alpha(t)$ in (9) and $s = s^{MOL}_\alpha(t)$ satisfies P (Ab) ≤ α.

At first glance, the MOL approximation may not seem too helpful, because the stationary M/GI/s + GI model itself is quite complicated. However, accurate approximations have been



developed in Whitt (2005), based on an approximation of the M/GI/s + GI model by an associated Markovian M/M/s + M (n) model with state-dependent service rates. Alternatively, if we approximate the general service-time distribution by an exponential distribution with the same mean, one can use the exact solution or asymptotic approximations for the associated M/M/s + GI model from Zeltyn and Mandelbaum (2005). Either way, we are exploiting the property that the service-time distribution beyond its mean tends not to matter too much in the M/GI/s + GI model; see Whitt (2005, 2006). In any given application, that property can be checked with simulation.
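The search for the least staffing level can be sketched as follows. This is a minimal illustration only, assuming exponential service and patience so that the stationary model is M/M/s + M and its abandonment probability can be computed exactly from the birth-death stationary distribution; it is not the approximation of Whitt (2005) or the asymptotics of Zeltyn and Mandelbaum (2005). The function names and the sample value m_alpha = 90 are hypothetical.

```python
import math


def erlang_a_abandonment(lam, mu, theta, s, tol=1e-12):
    """Stationary P(Ab) in the M/M/s+M model (exponential special case only)."""
    # Unnormalized log-probabilities of the birth-death stationary distribution.
    log_w = [0.0]
    peak = 0.0
    n = 0
    while True:
        n += 1
        death = min(n, s) * mu + max(n - s, 0) * theta  # total departure rate in state n
        log_w.append(log_w[-1] + math.log(lam / death))
        peak = max(peak, log_w[-1])
        # Truncate once the terms are decreasing and negligible relative to the mode.
        if n > s and death > lam and log_w[-1] < peak + math.log(tol):
            break
    weights = [math.exp(v - peak) for v in log_w]
    mean_queue = sum(max(k - s, 0) * p for k, p in enumerate(weights)) / sum(weights)
    return theta * mean_queue / lam  # abandonment rate divided by arrival rate


def mol_arrival_rate(m_alpha, mean_service, alpha):
    """Modified constant arrival rate from (9)."""
    return m_alpha / (mean_service * (1.0 - alpha))


def dis_mol_staffing(m_alpha, mu, theta, alpha):
    """Least integer s whose stationary P(Ab) is at most alpha at the MOL arrival rate."""
    lam = mol_arrival_rate(m_alpha, 1.0 / mu, alpha)
    s = 0
    while erlang_a_abandonment(lam, mu, theta, s) > alpha:
        s += 1
    return s


# Hypothetical offered load m_alpha(t) = 90 at some fixed time t, with alpha = 0.1.
print(dis_mol_staffing(m_alpha=90.0, mu=1.0, theta=0.5, alpha=0.1))
```

The linear search from s = 0 relies on the stationary abandonment probability being decreasing in s; applying it at each time t of interest would give a staffing function for the exponential case.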

5. An Mt /M/st + M Example with a Sinusoidal Arrival-Rate Function In this section, we demonstrate the effectiveness of the DIS-MOL staffing algorithm using simulation experiments. We show that both the abandonment probability $P_t(Ab)$ and the expected delay E[W (t)] are indeed stabilized (independent of time) in Markovian Mt /M/st + M examples. The DIS-MOL approximation is also effective for non-Markovian Mt /GI/st + GI models; we illustrate in the e-companion. To capture the spirit of real systems, we consider a sinusoidal arrival-rate function

$$\lambda(t) = a + b \sin(ct), \quad 0 \le t \le T, \qquad (10)$$

letting a = 100, b = 20, and c = 1. Here we let the individual service rate µ be 1 and the individual abandonment rate θ be 0.5. (Letting µ = 1 does not lose generality because we are free to choose the time units.) Our staffing procedure applies to arbitrary arrival-rate functions, because we can apply Theorem 1 to calculate the OL function m(t). An important issue for applications is the rate of fluctuation in the arrival-rate function compared to the expected service time. To be concrete, we will measure time in hours and let T = 20 hours. A cycle of the sinusoidal arrival-rate function in (10) is 2π/c; we have set c = 1, so a cycle is 2π ≈ 6.3 hours. Thus there will be about three cycles in the interval [0, T ]; see Figures 3 and 5. We have already observed that PSA is effective if the service times are short. As in Jennings et al. (1996) and Feldman et al. (2008), we are deliberately choosing a rapidly fluctuating arrival-rate function



(or, equivalently, relatively long service times) in order to make our problem more challenging. The method will be essentially the same as PSA if the arrival-rate function changes more slowly (or, equivalently, the service times are relatively short). Before demonstrating the effectiveness of the DIS-MOL approximation, we first show that PSA does a poor job here. We let the abandonment probability target α be 0.1. As shown in Figure 3, simulation estimates of the expected delay and abandonment probability (the solid lines) under PSA are significantly different from their targets (the dashed lines). One can see that the system alternates periodically between being overstaffed and understaffed, although the time average over a day is close to the target. Figure 3 shows a time lag between the peaks of the arrival rate and the performance measures. We next show that this time lag is captured by the DIS-MOL approximation. We also show that we can treat the initial transient as well as the dynamic steady state. To do so, we let $\bar F(x) = e^{-\theta x}$, $\bar G(x) = e^{-\mu x}$, and $\lambda(t) \equiv 0$ for $t < 0$. We obtain

$$E[B(t)] = \begin{cases} e^{-\theta w}\dfrac{a}{\mu} - \left(\dfrac{a}{\mu} - \dfrac{bc}{\mu^2+c^2}\right) e^{-\theta w - \mu(t-w)} + \dfrac{b\,e^{-\theta w}}{\mu\sqrt{1+c^2/\mu^2}} \sin[c(t-w) - \phi], & t \ge w, \\ 0, & 0 \le t < w, \end{cases} \qquad (11)$$

$$E[Q(t)] = \begin{cases} \dfrac{a}{\theta}\left(1 - e^{-\theta w}\right) + b\sqrt{\dfrac{x^2+y^2}{\theta^2+c^2}}\, \sin(ct + \beta - \gamma), & t \ge w, \\ \dfrac{a}{\theta} - \left(\dfrac{a}{\theta} - \dfrac{bc}{\theta^2+c^2}\right) e^{-\theta t} + \dfrac{b}{\sqrt{\theta^2+c^2}} \sin(ct - \gamma), & 0 \le t < w, \end{cases} \qquad (12)$$

[Figure 3: three panels plotted against time (0 to 20): arrival rate, abandonment probability, and expected delay.]

Figure 3 Simulation estimates of the abandonment probability and expected delay under the PSA staffing for the Mt /M/st + M example with the specified sinusoidal arrival rate, α = 0.1.



where $\phi = \arctan(c/\mu)$, $\gamma = \arctan(c/\theta)$, $x = 1 - e^{-\theta w}\cos(cw)$, $y = e^{-\theta w}\sin(cw)$, and $\beta = \arctan(y/x)$.
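As a check on the closed forms, the following sketch evaluates (11) and (12) and compares them with a direct numerical integration. It assumes that the two means are given by the usual infinite-server integrals, E[Q(t)] = ∫ λ(t − x) e^{−θx} dx over 0 ≤ x ≤ min(t, w) and E[B(t)] = e^{−θw} ∫ λ(t − w − x) e^{−µx} dx over 0 ≤ x ≤ t − w, which is our reading of the DIS model rather than a formula quoted from this section; it also takes γ = arctan(c/θ), by analogy with φ. The function names are hypothetical.

```python
import math


def dis_means(t, w, a=100.0, b=20.0, c=1.0, mu=1.0, theta=0.5):
    """Closed-form (E[B(t)], E[Q(t)]) from (11)-(12), for t >= 0."""
    phi = math.atan(c / mu)
    gamma = math.atan(c / theta)
    if t < w:
        mean_busy = 0.0
        mean_queue = (a / theta
                      - (a / theta - b * c / (theta**2 + c**2)) * math.exp(-theta * t)
                      + b / math.sqrt(theta**2 + c**2) * math.sin(c * t - gamma))
    else:
        u = t - w
        mean_busy = math.exp(-theta * w) * (
            a / mu
            - (a / mu - b * c / (mu**2 + c**2)) * math.exp(-mu * u)
            + b / math.sqrt(mu**2 + c**2) * math.sin(c * u - phi))
        x = 1.0 - math.exp(-theta * w) * math.cos(c * w)
        y = math.exp(-theta * w) * math.sin(c * w)
        beta = math.atan2(y, x)  # equals arctan(y/x) because x > 0 when w > 0
        mean_queue = (a / theta * (1.0 - math.exp(-theta * w))
                      + b * math.sqrt((x**2 + y**2) / (theta**2 + c**2))
                      * math.sin(c * t + beta - gamma))
    return mean_busy, mean_queue


def dis_means_numeric(t, w, a=100.0, b=20.0, c=1.0, mu=1.0, theta=0.5, n=20000):
    """Midpoint-rule evaluation of the assumed defining integrals, for comparison."""
    lam = lambda s: a + b * math.sin(c * s) if s >= 0 else 0.0

    def integrate(f, hi):
        if hi <= 0:
            return 0.0
        h = hi / n
        return h * sum(f((i + 0.5) * h) for i in range(n))

    mean_queue = integrate(lambda x: lam(t - x) * math.exp(-theta * x), min(t, w))
    mean_busy = math.exp(-theta * w) * integrate(
        lambda x: lam(t - w - x) * math.exp(-mu * x), t - w)
    return mean_busy, mean_queue


w = -math.log(1.0 - 0.1) / 0.5  # target delay for alpha = 0.1, theta = 0.5
for t in (0.1, 5.0, 17.5):
    print(t, dis_means(t, w), dis_means_numeric(t, w))
```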

From (11), we see that E[B(t)] is asymptotically sinusoidal as t → ∞, eventually coinciding with the formula in Theorem 2 for the steady-state behavior of E[B(t)]. From (12), we see that E[Q(t)] is sinusoidal when t ≥ w. We first compare the DIS-MOL staffing function $s^{MOL}_\alpha(t)$ and the DIS-OL staffing function, which is the offered-load (OL) function $m_\alpha(t)$. In Figure 4, we plot $\{s^{MOL}_\alpha(t) : 0 \le t \le T\}$ and $\{m_\alpha(t) : 0 \le t \le T\}$ (T = 20) with two abandonment probability targets: α = 0.1 and α = 0.001.

These two targets are extreme cases. The first one (α = 10%) represents a heavy load; the second one (α = 0.1%) represents a light load. Figure 4 shows that $\{s^{MOL}_\alpha(t) : 0 \le t \le T\}$ and $\{m_\alpha(t) : 0 \le t \le T\}$ coincide when the system is heavily loaded. Therefore, in order to stabilize abandonment

[Figure 4: staffing levels plotted against time (0 to 20), showing $m_\alpha(t)$ and $s^{MOL}_\alpha(t)$ for α = 0.1 and α = 0.001.]

Figure 4 A comparison of the OL function $m_\alpha(t)$ and the DIS-MOL staffing function $s^{MOL}_\alpha(t)$ for the Mt /M/st + M example with sinusoidal arrival rate for abandonment probability targets α = 0.1 and 0.001.



under heavy loads, we can just staff the system with the OL function mα (t), which has an explicit expression, given in Theorem 1. However, when the system is lightly loaded, Figure 4 shows that the DIS-MOL staffing function is significantly above the OL function, which makes the MOL refinement (or SRS with β > 0) necessary. We now show that the DIS-MOL approximation achieves the desired time-stable performance under all loadings. First we evaluate the simple DIS-OL approximation at higher loads (where it coincides with DIS-MOL). Figure 5 shows simulation estimates of key performance measures with target abandonment probability α for 0.05 ≤ α ≤ 0.20. Figure 5 shows that both the abandonment probability and the expected delay are stabilized at the targets α and w = F −1 (α) = −(1/θ) log(1 − α).
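For exponential patience the mapping between the two targets is a one-line computation. The sketch below, with the hypothetical helper name target_delay, evaluates w = −(1/θ) log(1 − α) for the heavy-load targets used in Figure 5 with θ = 0.5.

```python
import math


def target_delay(alpha, theta):
    """Delay target w = F^{-1}(alpha) when patience is exponential with rate theta."""
    return -math.log1p(-alpha) / theta


for alpha in (0.05, 0.10, 0.15, 0.20):
    w = target_delay(alpha, theta=0.5)
    print(f"alpha = {alpha:.2f}  ->  w = {w:.4f}")
```

For a general patience distribution F, the same step would require inverting F numerically.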

[Figure 5: five panels plotted against time (0 to 20): arrival rate, expected queue length, abandonment probability, delay probability, and expected delay.]

Figure 5 A comparison of the simple DIS-OL approximation with simulation: Estimated time-dependent expected queue lengths, abandonment probabilities, delay probabilities and expected delays for the Mt /M/st + M example with sinusoidal arrival rate under the OL staffing, with heavy loading (α = 0.05, 0.10, 0.15, 0.20).



Moreover, as predicted, the queue-length processes are not stabilized; they agree closely with the formulas in (12). All these quantities are estimated by performing multiple (500) independent replications under the staffing function sα (t) = mα (t) = E[B(t)] in (11). Figure 5 shows that the simple DIS-OL approximation works quite well for α between 0.05 and 0.20, and associated expected delays ranging from 0.1 to 0.45. In other words, the simple DIS-OL approximation is effective when the system is relatively heavily loaded. However, an abandonment probability of at least 5% may not be acceptable. That is when the DIS-MOL approximation is needed. In Figure 6, we again plot the expected queue lengths, abandonment probabilities, delay probabilities and expected delays, using the DIS-MOL approximations with relatively small abandonment

[Figure 6: four panels plotted against time (0 to 20): expected queue length, abandonment probability, delay probability, and expected delay.]

Figure 6 Estimated time-dependent expected queue lengths, abandonment probabilities, delay probabilities and expected delays for the Mt /M/st + M example with sinusoidal arrival rate under the DIS-MOL staffing with the system relatively lightly loaded (α = 0.005, 0.01 and 0.02).



probability targets 0.005 ≤ α ≤ 0.02. The DIS-MOL approximation works remarkably well after the initial transient. Unlike the expected queue length, the delay probability cannot be approximated with the DIS-OL model: since all customers in that model are required to wait in queue for time w before entering service, the DIS-OL approximation predicts that every customer is delayed. Of course, in the original stochastic model, the delay probability is always less than 1. The delay probability increases as the abandonment probability target α increases. When α is large enough, e.g., α = 0.2, the delay probability is nearly 1, as shown in Figure 5; when α is as small as 0.005, the delay probability goes down to 0.2, as shown in Figure 6. Figures 5 and 6 show that the delay probabilities are stabilized when α is small and are sinusoidal otherwise. This phenomenon is consistent with Figure 4 in Feldman et al. (2008), which shows the abandonment probabilities when the delay probability is the target. There, both were stabilized in light loads, but the abandonment probabilities were not stabilized under heavier loads.

6. The Square-Root Staffing (SRS) Formula We next verify that the DIS-MOL approximation is indeed consistent with the SRS staffing in (1); i.e., we show that the DIS-MOL staffing function $s^{MOL}_\alpha(t)$ has the form of the SRS formula (1). Following Feldman et al. (2008), for an abandonment probability target α, we let $D_\alpha(t) \equiv s^{MOL}_\alpha(t) - m_\alpha(t)$ be the difference between the DIS-MOL staffing and the OL functions, and let $\beta_\alpha(t) \equiv D_\alpha(t)/\sqrt{m_\alpha(t)}$ be the implicit QoS function for DIS-MOL. In Figure 7, we plot the implicit QoS function $\beta_\alpha(t)$ for different targets α for the Markovian example with sinusoidal arrival rate in §5. Figure 7 shows that, except for an initial transient, $\beta_\alpha(t)$ is independent of time and decreasing in α. Feldman et al. (2008) showed that the delay probability can be stabilized under a different SRS staffing function, namely,

$$s_0(t) = m_0(t) + \beta_\gamma \sqrt{m_0(t)}, \qquad (13)$$



[Figure 7: implicit QoS function $\beta_\alpha(t)$ plotted against time (0 to 20) for α = 0.1, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001, 0.0005, 0.0002, 0.0001.]

Figure 7 The implicit QoS function $\beta_\alpha(t) \equiv (s_\alpha(t) - m_\alpha(t))/\sqrt{m_\alpha(t)}$ for different abandonment probability targets α, in the Mt /M/st + M example, where a = 100, b = 20, c = 1, θ = 0.5, µ = 1.

where $m_0(t)$ is the offered-load (OL) function of the Mt /GI/∞ infinite-server model, i.e., $m_0(t) = E[\lambda(t - S_e)]E[S]$, where λ(t) is the arrival rate and $S_e$ is a random variable with the stationary-excess distribution associated with the service time S as in (3). In other words, $m_0(t)$ is just our $m_\alpha(t)$ in (5) with α = w = 0. In Feldman et al. (2008), the QoS parameter $\beta_\gamma$ is a function of the delay probability target γ. Moreover, this $\beta_\gamma$ can be approximated using the Garnett function developed as a heavy-traffic approximation in Garnett et al. (2002); in particular,

$$P(Delay) \approx \omega\left(-\beta, \sqrt{\mu/\theta}\right) \equiv G_1(\beta), \qquad (14)$$

where $\omega(x, y) \equiv [1 + h(-xy)/(y\,h(x))]^{-1}$, $h(x) \equiv \phi(x)/(1 - \Phi(x))$ is the standard normal hazard function, $\phi(x) \equiv (1/\sqrt{2\pi})\,e^{-x^2/2}$, and $\Phi(x) \equiv \int_{-\infty}^{x} \phi(y)\,dy$. In order to stabilize the delay probability at target γ, it suffices to obtain a $\beta_\gamma$ by inverting the Garnett function, letting P (Delay) = γ.
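The inversion of G1 can be done with any one-dimensional root finder. The following is a minimal sketch using bisection, assuming that h is the standard normal hazard rate φ(x)/(1 − Φ(x)) as in Garnett et al. (2002) and that G1 is decreasing in β on the search interval; the function names, the search interval [−8, 8] and the rounding up to an integer staffing level are our own illustrative choices.

```python
import math


def hazard(x):
    """Standard normal hazard rate phi(x) / (1 - Phi(x))."""
    if x > 30.0:
        return x + 1.0 / x  # asymptotic form; avoids 0/0 far in the upper tail
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(x / math.sqrt(2.0))
    return pdf / tail


def garnett_delay_prob(beta, mu, theta):
    """G1(beta) = omega(-beta, sqrt(mu/theta)) from (14)."""
    y = math.sqrt(mu / theta)
    x = -beta
    return 1.0 / (1.0 + hazard(-x * y) / (y * hazard(x)))


def invert_garnett(gamma, mu, theta, lo=-8.0, hi=8.0, iters=200):
    """Find beta with G1(beta) = gamma by bisection (G1 is decreasing in beta)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if garnett_delay_prob(mid, mu, theta) > gamma:
            lo = mid  # delay probability still too high: need a larger beta
        else:
            hi = mid
    return 0.5 * (lo + hi)


def srs_staffing(offered_load, beta):
    """Square-root staffing as in (13), rounded up to an integer number of servers."""
    return math.ceil(offered_load + beta * math.sqrt(offered_load))


beta_gamma = invert_garnett(gamma=0.5, mu=1.0, theta=0.5)
print(beta_gamma, srs_staffing(100.0, beta_gamma))
```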

We now compare the performance of our DIS-MOL approximation (the dashed lines) to the performance of the delay-probability offered-load (P(Delay)-OL) staffing (13) (the solid lines) in



Figure 8. We plot simulation estimates for the expected delay E[W ], abandonment probability P (Ab) and delay probability P (Delay) of the Markovian model considered in §5 under the staffing functions of these two approximations, with abandonment probability targets α = 0.1, 0.05, 0.02. We first treat the time averages of the simulation estimates of P (Delay) obtained from DIS-MOL as the delay target γ (the bold numbers in Table 1). We then invert the function G1 in (14) at γ to obtain βγ , and use the P(Delay)-OL staffing function (13). Figure 8 shows that the DIS-MOL staffing function works better in stabilizing P (Ab) and E[W ], while the P(Delay)-OL staffing function works better in stabilizing P (Delay), when α = 0.1, 0.05, which is consistent with the different targets used: Here we aim to stabilize P (Ab) and E[W ];

[Figure 8: three panels plotted against time (0 to 20): abandonment probability, expected delay, and delay probability.]

Figure 8 Simulation estimates for the Mt /M/st + M example with sinusoidal arrival rate (10), under two staffing functions: (i) DIS-MOL staffing $s^{MOL}_\alpha(t)$ (the dashed lines); (ii) P(Delay)-OL staffing $s_0(t)$ in (13) (the solid lines). In (13) we obtain βγ by inverting function G1 in (14), letting the delay probability target γ be the time average of P (Delay) under DIS-MOL. The abandonment probability targets are α = 0.1, 0.05, 0.02 (light straight lines); the other parameters are fixed: a = 100, b = 20, c = 1, θ = 0.5, µ = 1.



Feldman et al. (2008) aim to stabilize P (Delay). Figures 3 and 4 in Feldman et al. (2008) show that the delay probability can be stabilized over a wide range of targets, but in doing so, the abandonment probability is not stabilized under heavy loads. Here, with our different method aimed at the abandonment probability, we find that it can be stabilized over a wide range of targets, but in doing so, the delay probability is not stabilized under heavy loads. From Figure 8 we see that when α = 0.02 both methods work well. This observed “duality” is consistent with the heavy-traffic theory. The limits in Garnett et al. (2002) are for the quality-and-efficiency-driven (QED) regime. As α increases, we tend to shift to the different efficiency-driven (ED) regime, as in Whitt (2004, 2006). Our DIS-OL method is asymptotically correct in the ED regime, because it is directly consistent with the ED heavy-traffic limit, where w and P (Ab) are asymptotically constant. But then our DIS-OL approaches the OL as α decreases, so it should be consistent with Feldman et al. (2008) under lighter loads. In Table 1, we give more detailed measurements. For each curve, we compute its time average, which characterizes how far it is from its target, and the difference between its maximum and minimum, which quantifies the magnitude of its fluctuation. Both (max − min) and the time average are obtained by sampling t ∈ [3, 20], to avoid the initial transient. In Table 1, the measurements under columns ’DIS-MOL’ and ’Garnett 1’ correspond to the DIS-MOL and P(Delay)-OL staffing (13). We see that as the abandonment probability target α decreases, the relative error, i.e., (max − min)/average, increases. The reason is that, as α goes to 0, the error becomes more and more sensitive to the discreteness of the staffing function. In other words, adding or removing a single server affects the performance more when α is smaller. We now explain the second Garnett approximation in Table 1. In Garnett et al. (2002), in addition to the heavy-traffic approximation for P (Delay), there is also a heavy-traffic approximation for the abandonment probability in the M/M/s + M model:

$$P(Ab) \approx \left[1 - \frac{h\left(\beta\sqrt{\mu/\theta}\right)}{h\left(\beta\sqrt{\mu/\theta} + \sqrt{\theta/(N\mu)}\right)}\right] \cdot \omega\left(-\beta, \sqrt{\mu/\theta}\right) \equiv G_2(\beta), \qquad (15)$$

where $N \equiv \lambda/\mu + \beta\sqrt{\lambda/\mu}$, and λ is the constant arrival rate for the M/M/s + M model.
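To illustrate how a QoS parameter might be obtained from (15) for a given abandonment target, the sketch below evaluates G2 and inverts it by bisection. It assumes that h is the standard normal hazard rate φ(x)/(1 − Φ(x)), that G2 is decreasing in β over the search interval, and that N stays positive there; the function names and the interval endpoints are hypothetical choices, not part of the procedure described in the text.

```python
import math


def hazard(x):
    """Standard normal hazard rate phi(x) / (1 - Phi(x))."""
    if x > 30.0:
        return x + 1.0 / x  # asymptotic form; avoids 0/0 far in the upper tail
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return pdf / (0.5 * math.erfc(x / math.sqrt(2.0)))


def garnett_abandon_prob(beta, lam, mu, theta):
    """G2(beta) from (15) for the stationary M/M/s+M model with arrival rate lam."""
    y = math.sqrt(mu / theta)
    beta_hat = beta * y
    n_servers = lam / mu + beta * math.sqrt(lam / mu)  # N in (15); must be positive
    shift = math.sqrt(theta / (n_servers * mu))
    abandon_given_delay = 1.0 - hazard(beta_hat) / hazard(beta_hat + shift)
    delay_prob = 1.0 / (1.0 + hazard(beta_hat) / (y * hazard(-beta)))
    return abandon_given_delay * delay_prob


def invert_abandon_prob(alpha, lam, mu, theta, hi=5.0, iters=200):
    """Find beta with G2(beta) = alpha by bisection, assuming G2 is decreasing."""
    lo = max(-3.0, -0.9 * math.sqrt(lam / mu))  # keeps N positive
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if garnett_abandon_prob(mid, lam, mu, theta) > alpha:
            lo = mid  # abandonment still too high: need a larger beta
        else:
            hi = mid
    return 0.5 * (lo + hi)


for alpha in (0.1, 0.05, 0.02):
    print(alpha, invert_abandon_prob(alpha, lam=100.0, mu=1.0, theta=0.5))
```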


Target      Measure     Statistic             DIS-MOL    Garnett 1    Garnett 2
α = 0.1     P (Ab)      Time avg.             0.1013     0.1007       0.111
                        Target - time avg.    -0.0013    -6.93E-04    -0.011
                        Max - Min             0.0131     0.041        0.0443
                        (Max-Min)/avg.        0.1289     0.4073       0.3991
            E[W ]       Time avg.             0.213      0.2127       0.2371
                        Target - time avg.    -0.0022    -0.002       -0.0264
                        Max - Min             0.0292     0.093        0.1101
                        (Max-Min)/avg.        0.1369     0.4374       0.4644
            P (Delay)   Time avg.             0.9285     0.9281       0.9442
                        Target - time avg.    0          0.0004       -0.0148
                        Max - Min             0.0826     0.0445       0.0316
                        (Max-Min)/avg.        0.0889     0.048        0.0335
α = 0.05    P (Ab)      Time avg.             0.0458     0.0444       0.0499
                        Target - time avg.    0.0042     0.0056       9.57E-05
                        Max - Min             0.0139     0.0191       0.0215
                        (Max-Min)/avg.        0.3028     0.4304       0.4316
            E[W ]       Time avg.             0.0928     0.0899       0.1016
                        Target - time avg.    0.0098     0.0127       9.79E-04
                        Max - Min             0.0264     0.0409       0.0528
                        (Max-Min)/avg.        0.284      0.4545       0.5196
            P (Delay)   Time avg.             0.6962     0.6924       0.7256
                        Target - time avg.    0          0.0038       0.011
                        Max - Min             0.1135     0.0678       0.0907
                        (Max-Min)/avg.        0.163      0.0979       0.125
α = 0.02    P (Ab)      Time avg.             0.0184     0.0179       0.0203
                        Target - time avg.    0.0016     0.0021       -3.13E-04
                        Max - Min             0.0073     0.0083       0.0106
                        (Max-Min)/avg.        0.3975     0.4643       0.5197
            E[W ]       Time avg.             0.0362     0.035        0.0403
                        Target - time avg.    0.0042     0.0054       7.74E-05
                        Max - Min             0.0168     0.019        0.0183
                        (Max-Min)/avg.        0.4623     0.5414       0.4528
            P (Delay)   Time avg.             0.4117     0.4076       0.4371
                        Target - time avg.    0          0.0041       0.0131
                        Max - Min             0.1032     0.0951       0.0715
                        (Max-Min)/avg.        0.2507     0.2332       0.1636

Table 1 Simulation estimates for the Mt /M/st + M example with sinusoidal arrival rate in (10), under three staffing functions: (i) DIS-MOL staffing $s^{MOL}_\alpha(t)$ (column ’DIS-MOL’); (ii) P(Delay)-OL staffing $s_0(t)$ in (13), where we obtain βγ by inverting function G1 in (14), letting the target γ be the time-average P (Delay) under DIS-MOL shown in bold numbers (column ’Garnett 1’); (iii) the same SRS formula in (13), but with βα instead of βγ , where βα is obtained by inverting function G2 in (15) with target α (column ’Garnett 2’). The abandonment probability targets are α = 0.1, 0.05, 0.02; the other parameters are fixed: a = 100, b = 20, c = 1, θ = 0.5, µ = 1.



[Figure 9: βα plotted against −log(α) (0 to 10); curves: Garnett, DIS-MOL, and linear fit.]

Figure 9 The α − βα relation in the DIS-MOL approximation (red), in approximation (15) (blue), and its linear approximation (black dashed) for the Mt /M/st + M model with sinusoidal arrival rate. The other parameters are fixed: a = 100, b = 20, c = 1, θ = 0.5, µ = 1.

Instead of inverting G1 in (14) with a delay probability target γ (which we here obtain from the average delay probability under DIS-MOL), we can invert G2 in (15) to obtain βα for a given abandonment probability target α, and apply P(Delay)-OL staffing in (13) based on the OL m0 (t) with QoS parameter βα . The simulation estimates are under column ’Garnett 2’ of Table 1. Table 1 shows that ’Garnett 2’ performs nearly the same as ’Garnett 1.’ This is consistent with the results here and in Feldman et al. (2008): In order to stabilize the abandonment probability under heavy loads, it is necessary to use our DIS offered load mα (t) instead of the direct infinite-server OL m0 (t). Figure 7 shows that there exists a function βα , which is strictly decreasing in α with the other parameters fixed, that makes the SRS function perform as well as DIS-MOL in stabilizing the abandonment probability and the expected waiting time. We now proceed to study this function. Figure 9 shows the values of βα with 0.001 ≤ α ≤ 0.4 in natural log scale (the red curve) for the Markovian example with sinusoidal arrival rate function with a = 100, b = 20, c = 1. Figure 9 shows that the function βα is approximately a linear function of − log(α) (thus logarithmic in α) for



[Figure 10: βα plotted against −log(α) (2 to 7); DIS-MOL ("MOL") and linear-approximation ("Linear") curves for (a, θ, µ) = (100, 0.5, 1), (1000, 0.5, 1), (10, 0.5, 1), (100, 1, 1), (100, 0.25, 1), (100, 4, 1).]

Figure 10 Comparison of the linear approximation in (16) for βα to the value of βα obtained from the DIS-MOL approximation.

small α (α < 0.1). When α is relatively large (α > 0.1), βα is nearly constant (close to 0), which is consistent with Figure 4, i.e., the simple DIS-OL approximation works well when the system is in heavy load. In Figure 9 we also plot the approximation (15) in Garnett et al. (2002) (the blue curve). It is seen that DIS-MOL is close to (15) for 0.001 < α < 0.1, where the system may be regarded as being in the QED regime. We also want to characterize how βα depends on the other model parameters. First, experiments show that βα depends upon the arrival rate function λ(t) only through its time average $\bar\lambda \equiv (1/T)\int_0^T \lambda(t)\,dt$. Next, we can eliminate µ, i.e., we can let it be 1 by choosing appropriate measurement units. That is equivalent to dividing $\bar\lambda$ and θ by µ in general. Therefore, we consider $\beta(\alpha, \bar\lambda, \theta, \mu) = \beta(\alpha, \bar\lambda/\mu, \theta/\mu, 1)$, which depends on µ only through these ratios, as a function of the abandonment probability target α, the average scaled arrival rate $\bar\lambda/\mu$, and the scaled abandonment rate θ/µ.



For the linear part of Figure 9, we propose the following linear approximation (in − log(α)) as an engineering solution:

$$\beta(\alpha, \bar\lambda, \theta, \mu) \approx K \cdot (-\log(\alpha)) + C(\bar\lambda/\mu, \theta/\mu), \quad \text{where} \quad C(\bar\lambda/\mu, \theta/\mu) \equiv -K \log\left(\sqrt{\bar\lambda/\mu}\right) + H(\theta/\mu), \qquad (16)$$

where the constant K = 0.43, $\bar\lambda$ is the average arrival rate, and H(·) is a third-order polynomial, i.e., $H(x) = p_1 x^3 + p_2 x^2 + p_3 x + p_4$ with p1 = 0.0024, p2 = −0.012, p3 = 0.28, p4 = 0.011. This approximation is simple since it is linear in − log(α) when the other parameters are fixed; also, its dependence on $\bar\lambda/\mu$ and θ/µ is separable. In Figure 10 we plot both βα obtained from the DIS-MOL approximation and its linear approximation (16), with different $\bar\lambda/\mu$ and θ/µ. Here we still use the Markovian example with sinusoidal arrival rate considered in §5. We fix µ = 1. Since the arrival rate is sinusoidal, $a = \bar\lambda$ is the average arrival rate. Figure 10 shows that the linear approximation (16) is good when a is not too small (a > 10), so that we are in the many-server regime, and when θ/µ is not too big (θ/µ < 4), where the model begins to behave as a loss model.
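The approximation (16) is easy to evaluate numerically. The sketch below uses the constants given above and natural logarithms; the function names are hypothetical, and rounding the SRS staffing level up to an integer is our own illustrative choice. It is intended only for the range where the text reports the fit to be good (roughly α < 0.1, a > 10 and θ/µ < 4).

```python
import math

K = 0.43
P1, P2, P3, P4 = 0.0024, -0.012, 0.28, 0.011


def h_poly(x):
    """Third-order polynomial H(x) in (16), evaluated by Horner's rule."""
    return ((P1 * x + P2) * x + P3) * x + P4


def beta_linear(alpha, lam_bar, theta, mu=1.0):
    """Linear-in-(-log alpha) approximation (16) for the QoS parameter."""
    c_term = -K * math.log(math.sqrt(lam_bar / mu)) + h_poly(theta / mu)
    return K * (-math.log(alpha)) + c_term


def srs_staffing(m_alpha, beta):
    """SRS staffing from the DIS offered load m_alpha(t), rounded up."""
    return math.ceil(m_alpha + beta * math.sqrt(m_alpha))


for alpha in (0.02, 0.01, 0.005):
    print(alpha, round(beta_linear(alpha, lam_bar=100.0, theta=0.5), 3))
```

Combined with the explicit formula for the offered load, this gives a staffing function without any search or simulation, at the cost of the fitting error visible in Figure 10.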

7. Conclusions Our goal has been to develop a systematic procedure to stabilize the abandonment probability in an Mt /GI/st + GI model with a time-varying arrival rate, across a wide range of performance targets, by providing an appropriate staffing function s(t). We especially wanted to treat the challenging case in which the service times are relatively long compared to the fluctuations of the arrival rate function. We have found that we can indeed stabilize both the abandonment probability and the expected waiting time across a wide range of performance targets by using the delayed-infiniteserver (DIS) model based on two Mt /GI/∞ infinite-server (IS) models in series, introduced in §2. In the approximating DIS model, each customer arrives at the first IS system (representing

the queue) and stays (is delayed) there for a fixed time w = F −1 (α) (the target). If the customer does not abandon, he will proceed to the second IS system (representing the service facility) and



receive service. Since the number of busy servers in the service facility, B(t), is a Poisson random variable for each t, it is easy to analyze. Its mean mα (t) ≡ E[B(t)] as a function of the abandonment probability target α is the offered-load function we use in this paper. We gave explicit formulas for mα (t) in §3. We found that the DIS-OL mean mα (t) itself provides an excellent staffing function for heavily loaded systems, but in order to obtain a staffing function that works for all loads, we applied the DIS-OL mα (t) with a new modified-offered-load (MOL) approximation, obtaining our overall DIS-MOL approximation. As in previous MOL approximations, our MOL approximation exploits the associated stationary model with a constant arrival rate depending on the appropriate offered load. To treat the steady state of the stationary M/GI/s + GI model, we use an approximation developed in Whitt (2005), which is based on an approximation of the M/GI/s + GI model by an associated Markovian M/M/s + M (n) model with state-dependent service rates. Our simulation experiments have shown that the DIS-MOL approximation not only stabilizes expected delays and abandonment probabilities, but also describes other performance measures, e.g., the expected queue length E[Q(t)]. As in Feldman et al. (2008), we find that other performance measures are stabilized to a great extent, but not fully (across a wide range of performance targets). That was illustrated in Figures 5, 6 and 8. In addition, we have shown that the classical square-root-staffing (SRS) formula in (1) can be applied for staffing to meet abandonment-probability targets provided that we use the DIS offered-load function mα (t), for which we have given explicit formulas in §3. However, in general, it remains to determine an appropriate quality-of-service (QoS) parameter βα . In §6 we developed an explicit approximation formula for the QoS parameter βα for the Markovian Mt /M/st + M special case; see (16). 
(It is not based on a heavy-traffic limit; it is fit directly to the DIS-MOL numerical results.) Its special linear separable structure reveals how performance depends on the model parameters. This work was intended to complement and extend the recent paper by Feldman et al. (2008), which focused on a different performance target: the delay probability. A main contribution of Feldman et al. (2008) was a simulation-based staffing algorithm, the so-called iterative staffing



algorithm (ISA). In contrast, our method is entirely based on analytical formulas involving approximations. However, Feldman et al. (2008) also showed that an MOL approximation is effective. Moreover, for the Mt /M/s + M model they too showed that the SRS staffing function can be applied with the appropriate offered load, which for the delay probability is the standard offered load m0 (t). Feldman et al. (2008) showed that the QoS parameter βγ in that context can be determined by inverting the Garnett function in (14), which is obtained from the many-server heavy-traffic limit for the M/M/s + M model in Garnett et al. (2002). Unfortunately, we have not yet obtained an analytical expression for the QoS parameter βα in our context directly from a heavy-traffic limit. However, our procedure is asymptotically correct in the efficiency-driven (ED, overloaded) many-server heavy-traffic regime, because then the waiting time is indeed constant; see Whitt (2004, 2006). We have demonstrated that the DIS-MOL approximation is remarkably effective by performing simulation experiments for both Markovian and non-Markovian (in the e-companion) models with sinusoidal arrival-rate functions, when the arrival rates are not too small (around 100). In general, the performance of DIS-MOL tends to improve as the scale (arrival rate and number of servers) increases. In the e-companion we also consider both smaller and larger systems, in particular, for average arrival rates ranging from a = 20 to a = 1000. The performance of DIS-MOL is spectacular for a = 1000 and still reasonable for a = 20. While conducting the simulation experiments, we considered several discretization issues: how to convert a continuous staffing function into an integer-valued staffing function; what is the consequence of agents being required to finish their current services when called to leave; and how the size of a fixed-staffing period affects this approach (discussed in the e-companion). 
Much work remains to be done in the future. Here are some natural next steps: 1. From different points of view, one may want to stabilize different performance measures. For instance, system managers care about queue lengths, production (departure) rates, loss (abandonment) rates; customers care about delays, abandonment probabilities, delay probabilities; agents



care about service utilization, etc. It remains to find alternative approaches to better stabilize other performance measures, such as the expected queue length and service utilizations. 2. The model we have discussed in this paper has a single customer class and a single service pool. It remains to extend the DIS-MOL approximation for multi-class multi-pool systems. 3. The service discipline used here is first-come first-served (FCFS). It remains to find new approaches for stabilizing performance measures under other service disciplines, such as last-come first-served (LCFS), random selection (RS), processor sharing (PS), etc.

References
Aksin, Z., M. Armony, V. Mehrotra. 2007. The modern call center: a multi-disciplinary perspective on operations management research. Production and Operations Management 16 665–688.
Brown, L., N. Gans, A. Mandelbaum, A. Sakov, H. Shen, S. Zeltyn, L. Zhao. 2005. Statistical analysis of a telephone call center: a queueing-science perspective. J. Amer. Statist. Assoc. 100 36–50.
Eick, S. G., W. A. Massey, W. Whitt. 1993a. The physics of the Mt /G/∞ queue. Oper. Res. 41 731–742.
Eick, S. G., W. A. Massey, W. Whitt. 1993b. Mt /G/∞ queues with sinusoidal arrival rates. Management Sci. 39 241–252.
Feldman, Z., A. Mandelbaum, W. A. Massey, W. Whitt. 2008. Staffing of time-varying queues to achieve time-stable performance. Management Sci. 54 324–338.
Gans, N., G. Koole, A. Mandelbaum. 2003. Telephone call centers: tutorial, review and research prospects. Manufacturing Service Oper. Management 5 79–141.
Garnett, O., A. Mandelbaum, M. Reiman. 2002. Designing a call center with impatient customers. Manufacturing Service Oper. Management 4 208–227.
Green, L. V., P. J. Kolesar. 1991. The pointwise stationary approximation for queues with nonstationary arrivals. Management Sci. 37 84–97.
Green, L. V., P. J. Kolesar, W. Whitt. 2007. Coping with time-varying demand when setting staffing requirements for a service system. Production and Operations Management 16 13–39.
Jagerman, D. L. 1975. Nonstationary blocking in telephone traffic. Bell System Tech. J. 54 625–661.
Jennings, O. B., A. Mandelbaum, W. A. Massey, W. Whitt. 1996. Server staffing to meet time-varying demand. Management Sci. 42 1383–1394.
Kaspi, H., K. Ramanan. 2008. Law of large numbers limits for many-server queues. Carnegie-Mellon University.
Krichagina, I., A. A. Puhalskii. 1997. A heavy-traffic analysis of a closed queueing system with a GI/∞ service center. Queueing Systems 25 235–280.
Liu, Y., W. Whitt. 2009. A fluid model for the Gt /GI/st + GI many-server queue with abandonments and time-varying arrivals. Columbia University, in preparation.
Mandelbaum, A., W. A. Massey, M. I. Reiman. 1998. Strong approximations for Markovian service networks. Queueing Systems 30 149–201.
Massey, W. A., W. Whitt. 1993. Networks of infinite-server queues with nonstationary Poisson input. Queueing Systems 13 183–250.
Massey, W. A., W. Whitt. 1994. An analysis of the modified offered load approximation for the nonstationary Erlang loss model. Ann. Appl. Probabil. 4 1145–1160.
Massey, W. A., W. Whitt. 1997. Peak congestion in multi-server service systems with slowly varying arrival rates. Queueing Systems 25 157–172.
Whitt, W. 1991. The pointwise stationary approximation for Mt /Mt /s queues is asymptotically correct as the rates increase. Management Sci. 37 307–314.
Whitt, W. 2004. Efficiency-driven heavy-traffic approximations for many-server queues with abandonments. Management Sci. 50 1449–1461.
Whitt, W. 2005. Engineering solution of a basic call-center model. Management Sci. 51 221–235.
Whitt, W. 2006. Fluid models for multiserver queues with abandonments. Oper. Res. 54 37–54.
Zeltyn, S., A. Mandelbaum. 2005. Call centers with impatient customers: many-server asymptotics of the M/M/n+G queue. Queueing Systems 51 361–402.

e-companion to Liu and Whitt: Stabilizing Performance



E-Companion
This e-companion has five sections. In §EC.1 we perform simulation experiments to show that the DIS-MOL approximation is effective for stabilizing the abandonment probability and the expected waiting time in Mt /GI/st + GI models with non-exponential service-time and abandonment-time distributions; all experiments in the main paper were for the Mt /M/st + M model. In §EC.2 we observe that the simple DIS-OL approximation, which just uses the offered load mα(t) directly to staff, actually coincides with an associated deterministic fluid model, studied in Liu and Whitt (2009). Since the fluid model arises as the many-server heavy-traffic limit in the efficiency-driven (ED) regime, we better understand why the simple DIS-OL approximation is effective under heavy loads. We show that our DIS-OL model approaches the fluid model as the scale increases. In §EC.3 we consider several discretization issues and real-world constraints for the DIS-OL and DIS-MOL staffing functions. In §EC.4 we report simulation experiments for larger and smaller Mt /M/st + M systems, specifically for the same sinusoidal arrival rate function in (10) for average arrival rates a = 20, a = 50 and a = 1000; the main paper considered the case a = 100. Finally, to put the staffing issue in perspective, in §EC.5 we study the sensitivity of the performance to a change of a single server. To do so, we give numerical results for the stationary M/M/s + M model. These show that errors in meeting the target may be partly due to the impact of a single server.

EC.1. Non-Exponential Distributions
In this section we show that the DIS-MOL approximation is also effective for non-Markovian Mt /GI/st + GI models by performing simulation experiments for models with non-exponential service-time and abandonment-time distributions. In particular, we consider the following three models: (i) deterministic service times and hyperexponential abandonment times (Mt /D/st + H2); (ii) Erlang service times and exponential abandonment times (Mt /E2/st + M); (iii) hyperexponential service times and exponential abandonment times (Mt /H2/st + M).

Let A and S be the generic abandonment and service times. Here we fix their means E[A] ≡ 1/θ and E[S] ≡ 1/µ with µ ≡ 1 and θ ≡ 0.5. For the Erlang distribution, we consider Erlang-2 (E2), which has squared coefficient of variation (SCV) c²_X ≡ Var(X)/E[X]² = 1/2, where X can be A or S. For the hyperexponential distribution, X is a mixture of two exponential random variables, i.e., X has complementary cdf 1 − G(x) = p · e^{−λ1 x} + (1 − p) · e^{−λ2 x}. We choose p = 0.5(1 − √0.6), λ1 = 2p/E[X] and λ2 = 2(1 − p)/E[X], which yields c²_X = 4 (and balanced means). We use the same sinusoidal arrival rate function in (10) with the same parameters: a = 100, b = 20, c = 1. Just as for the Markovian example in the main paper, here we plot the simulation estimates of the expected queue lengths, abandonment probabilities and expected delays for different abandonment probability targets α in a 20-hour day for the Mt /D/st + H2 (in Figures EC.1 and EC.2), Mt /H2/st + M (in Figures EC.3 and EC.4) and Mt /E2/st + M (in Figures EC.5 and EC.6) models. The figures show that the DIS-MOL approximation consistently performs well.
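The hyperexponential parameter choice above can be checked numerically. The following is a minimal sketch, not part of the original experiments: the function names are hypothetical, and the general-SCV formula p = 0.5(1 − √((c² − 1)/(c² + 1))) is the standard balanced-means construction, which reduces to p = 0.5(1 − √0.6) when c² = 4.

```python
import math


def h2_balanced_means(mean, scv):
    """Return (p, lam1, lam2) for an H2 distribution with balanced means.

    Hypothetical helper: the text only states the case scv = 4; the
    general-scv formula used here is the standard balanced-means choice.
    """
    if scv < 1.0:
        raise ValueError("an H2 distribution requires SCV >= 1")
    p = 0.5 * (1.0 - math.sqrt((scv - 1.0) / (scv + 1.0)))
    lam1 = 2.0 * p / mean
    lam2 = 2.0 * (1.0 - p) / mean
    return p, lam1, lam2


def h2_mean_and_scv(p, lam1, lam2):
    """Mean and squared coefficient of variation of the exponential mixture."""
    m1 = p / lam1 + (1.0 - p) / lam2
    m2 = 2.0 * p / lam1 ** 2 + 2.0 * (1.0 - p) / lam2 ** 2
    return m1, (m2 - m1 ** 2) / m1 ** 2


# Service times (mean 1/mu = 1) and abandonment times (mean 1/theta = 2).
for mean in (1.0, 2.0):
    p, lam1, lam2 = h2_balanced_means(mean, scv=4.0)
    print(mean, p, lam1, lam2, h2_mean_and_scv(p, lam1, lam2))
```

Balanced means here refers to p/λ1 = (1 − p)/λ2, so each branch of the mixture contributes half of E[X].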

Figure EC.1. Simulation estimates of expected queue length, abandonment probability and expected delay for the Mt /D/st + H2 model under the DIS-OL staffing, with the system heavily loaded (0.05 ≤ α ≤ 0.20). (Panels, each plotted against time: arrival rate, expected queue length, abandonment probability and expected delay.)


Figure EC.2. Simulation estimates of expected queue length, abandonment probability and expected delay for the Mt /D/st + H2 model under the DIS-MOL staffing, with the system lightly loaded (0.01 ≤ α ≤ 0.05). (Panels, each plotted against time: expected queue length, abandonment probability, delay probability and expected delay.)


Figure EC.3. Simulation estimates of expected queue length, abandonment probability and expected delay for the Mt /H2/st + M model under the DIS-OL staffing, with the system heavily loaded (0.05 ≤ α ≤ 0.20). (Panels, each plotted against time: arrival rate, expected queue length, abandonment probability and expected delay.)


Figure EC.4. Simulation estimates of expected queue length, abandonment probability and expected delay for the Mt /H2/st + M model under the DIS-MOL staffing, with the system lightly loaded (0.01 ≤ α ≤ 0.05). (Panels, each plotted against time: expected queue length, abandonment probability, delay probability and expected delay.)


Figure EC.5. Simulation estimates of expected queue length, abandonment probability and expected delay for the Mt /E2/st + M model under the DIS-OL staffing, with the system heavily loaded (0.05 ≤ α ≤ 0.20). (Panels, each plotted against time: arrival rate, expected queue length, abandonment probability and expected delay.)


Figure EC.6. Simulation estimates of expected queue length, abandonment probability and expected delay for the Mt /E2/st + M model under the DIS-MOL staffing, with the system lightly loaded (0.01 ≤ α ≤ 0.05). (Panels, each plotted against time: expected queue length, abandonment probability, delay probability and expected delay.)

EC.2. Connection to Time-Varying Deterministic Fluid Models
In this section we relate the simple DIS-OL approximation to a deterministic fluid model for the Gt /GI/st + GI queueing model with time-varying arrival rate, introduced in Whitt (2006) and further studied in Liu and Whitt (2009). The simple DIS-OL approximation staffs exactly at the mean value of B(t), the time-dependent Poisson number of busy servers determined by the offered-load analysis; this mean is the OL function of the DIS-OL approximation. Interestingly, that coincides exactly with what we would do with the deterministic fluid model. We can obtain the deterministic fluid model by ignoring the stochastic fluctuations. Alternatively, we can obtain the fluid model by considering a many-server heavy-traffic limit in the overloaded or efficiency-driven (ED) regime; see Mandelbaum et al. (1998), Krichagina and Puhalskii (1997), Whitt (2006), Kaspi and Ramanan (2008). In this limit, both the arrival rate and the number of servers are allowed to increase without bound, in such a way that the system is overloaded for at least a significant portion of the time. This many-server fluid model is very useful, in large part because the service-time cdf G and the abandonment-time cdf F still play a role beyond their means. The deterministic fluid limit is elementary to establish in the context of the OL approximation. In that case we can simply multiply the arrival-rate function by a constant c and let c → ∞. The corresponding random variables Qc (t) and Bc (t) depending on the scaling parameter c will still be Poisson, but with mean values precisely c times the values for c = 1. Thus the scaled random variables Qc (t)/c and Bc (t)/c have mean values independent of c, but their variances are these common mean values divided by c. Thus these scaled random variables converge to the deterministic mean values E[Q(t)] and E[B(t)] as c → ∞.
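The scaling argument above can be written out in a few lines. The following is a sketch for Qc (t); the same steps apply to Bc (t). The only ingredient beyond the text is Chebyshev's inequality, used in the last step to turn the vanishing variance into convergence in probability.

```latex
\begin{align*}
E\!\left[\frac{Q_c(t)}{c}\right] &= \frac{c\,E[Q(t)]}{c} = E[Q(t)], \\
\mathrm{Var}\!\left(\frac{Q_c(t)}{c}\right) &= \frac{\mathrm{Var}(Q_c(t))}{c^{2}}
  = \frac{c\,E[Q(t)]}{c^{2}} = \frac{E[Q(t)]}{c}, \\
P\!\left(\left|\frac{Q_c(t)}{c} - E[Q(t)]\right| > \epsilon\right)
  &\le \frac{E[Q(t)]}{c\,\epsilon^{2}} \longrightarrow 0
  \quad \text{as } c \to \infty, \ \text{for each } \epsilon > 0 .
\end{align*}
```

The second line uses the fact that a Poisson random variable has variance equal to its mean, so the standard deviation of the scaled variable is of order 1/√c.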
However, the deterministic fluid model we obtain in this way is only a special case of the fluid model considered in Liu and Whitt (2009), because it applies only to the case in which the waiting time is stabilized at the constant value w. The more general fluid model is constructed and analyzed in Liu and Whitt (2009).

Using the Mt /M/st + M example with sinusoidal arrival rate that we considered in §5, we showed that the DIS-MOL approximation stabilizes the expected delays and the abandonment probabilities, which are estimated by averaging 500 independent samples. However, for the fluid model, no averaging is needed. Thus we should anticipate that less and less averaging is needed as the scale increases, and that is confirmed by simulation experiments. In order to investigate the extent to which averaging is needed, we now plot single sample paths of expected delays and abandonment probabilities with differently scaled arrival-rate functions in Figures EC.7-EC.9. Here we still let λ(t) be the sinusoidal function in (10), but we increase the average arrival rate a (from 100 to 3000) while maintaining the ratio b/a = 0.2. Notice that the DIS-OL staffing functions are scaled by the same factors, because the DIS-OL staffing function is linear in a. In Figures EC.7, EC.8 and EC.9, the abandonment probability targets are α = 0.1, 0.05 and 0.02, respectively. We observe from Figure EC.7 that stochastic fluctuations are large and neither expected delays nor abandonment probabilities are flat when a is relatively small (a = 100, 300). This helps explain why, in §5, we can only stabilize the means estimated from 500 independent samples. However, when a gets large (a = 1000, 3000), even single sample paths of expected delays and abandonment probabilities are stable. In other words, we do not even have to consider the means of these performance measures when a is sufficiently large. Figure EC.7 shows that the DIS model converges to the corresponding fluid model as λ(t) and s(t) both go to infinity in the prescribed way. It is easy to see that the stochastic arrival process, abandonment process, entering-service process and departure process all become continuous (fluid) in this case. However, Figures EC.8 and EC.9 show that this convergence becomes slower when the system is lightly loaded. In other words, a larger a is required to stabilize a single sample path of these performance measures when α is smaller.

Figure EC.7. One sample path of expected delays and abandonment probabilities with a = 100, 300, 1000, 3000, α = 0.1. (Panels for a = 100, 300, 1000 and 3000: abandonment probability and expected delay, each plotted against time.)

Figure EC.8. (Panels for a = 100, 300, 1000 and 3000; labelled axes: expected delay against time.)

Figure EC.9. (Panels for a = 100, 300, 1000 and 3000; labelled axes: expected delay against time.)

With this deterministic approximation, the time-dependent number of servers used is deterministic, so we can staff according to it. However, we have just seen that the time-dependent fluid content in this deterministic fluid model coincides with the time-dependent mean in the offered-load approach described above, so the fluid model only adds a different interpretation to the offered-load approximation. The offered-load approximation has the additional advantage that it yields approximations for the distributions beyond their means. Even better approximations for the distributions may be possible from a refined many-server heavy-traffic stochastic-process limit for the Mt /GI/st + GI model, but such a limit remains to be established and, of course, it remains to investigate the resulting approximations.



EC.3. Discretization Issues and Real-World Constraints
In this section we consider several discretization issues and real-world constraints for the DIS-OL and DIS-MOL staffing functions. First, our basic method produces continuous staffing levels, but we have to staff at integer levels. To be conservative, we could always round up, but to obtain greater accuracy, we round to the nearest integer, as shown in Figure EC.10. This does not affect the performance much. When the staffing schedule calls for decreasing the number of servers, we impose a real system constraint: We do not release any server until that server completes the service in process. However, we make an additional assumption that greatly helps achieve the required staffing level: We assume that server assignments can be switched when a server is scheduled to leave. In other words, if a specific server is scheduled to depart, then any other available server can complete the service. That means that we must only wait until the next service completion by any server when all servers are busy and the staffing level is scheduled to decrease by one. This realistic feature makes the actual staffing level always lie above the integer-valued scheduled staffing level, as shown in Figure EC.10. As a consequence, the system will be slightly overstaffed when the staffing function s(t) decreases. However, this effect tends to be insignificant when there are a large number of servers (the number of servers ranges from 70 to 120 in the Markovian example considered in §5), provided we allow server switching as assumed above. If the service-time distribution is exponential and all servers are busy when the number of servers is called to decrease, the lack-of-memory property of the exponential distribution implies that the time lag between the actual jump-down time of s(t) and the scheduled one is exponentially distributed with mean 1/(nµ), if there are currently n servers in the system.

Figure EC.10. The discretized staffing function and the actual staffing function. (Axes: number of servers against time; curves: Target (Continuous), Target (Discrete) and Actual.)

We can propose methods to counter this effect. In particular, during periods of planned staffing decrease, we can deliberately set the staffing level a bit lower to anticipate that lag in releasing servers. One direct approach is to re-schedule a new staffing function that takes these time lags into consideration, since they can easily be quantified. In §5 we do not have to modify the DIS-OL and DIS-MOL staffing functions for the Mt /M/st + M example with sinusoidal arrival rate, since this effect is insignificant, as shown in Figures 5 and 6.

It is also interesting to consider another realistic constraint for the staffing function of the DIS-MOL approximation: Many real service systems have staffing periods with constant staffing levels. For example, it is common to only change staffing levels every half hour or every hour. We now consider this case. It is evident that the shorter the staffing period is, the less accuracy we should lose because of the requirement that the staffing function remain constant over each staffing interval. In Figures EC.11, EC.12 and EC.13, we plot the abandonment probabilities and expected delays with abandonment probability targets α = 0.1, 0.05 and 0.02, respectively, for two cases: fixed staffing periods of 2 hours and of 30 minutes. When the staffing period is long (2 hours), fluctuations caused by the coarseness of the discretized staffing functions are unavoidable. However, these performance measures are still quite stable except during the initial transient periods, which are caused by the steep rise of the DIS-MOL staffing function.
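The discretization steps discussed in this section can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to produce the figures: all function names are hypothetical, the continuous staffing function `example_staffing` is a made-up stand-in for a DIS-OL or DIS-MOL staffing function, and the rule of using the midpoint value on each fixed staffing period is our own assumption, since the rule for choosing the constant level is not specified above.

```python
import math


def round_to_nearest(s_continuous):
    """Round a continuous staffing level to the nearest integer (halves go up)."""
    return int(math.floor(s_continuous + 0.5))


def mean_release_lag(n, mu=1.0):
    """Mean time until the next service completion when all n servers are busy.

    With exponential service at rate mu, the minimum of n residual service
    times is exponential with rate n * mu, so the mean lag is 1 / (n * mu).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 / (n * mu)


def staffing_by_period(staffing, period, horizon):
    """Piecewise-constant integer staffing levels, one per staffing period.

    Assumption (not from the text): each period uses the continuous
    staffing function evaluated at the period midpoint, then rounded.
    """
    n_periods = int(round(horizon / period))
    return [round_to_nearest(staffing((k + 0.5) * period)) for k in range(n_periods)]


def example_staffing(t):
    """Hypothetical continuous staffing function, for illustration only."""
    return 100.0 + 20.0 * math.sin(t)


levels_30min = staffing_by_period(example_staffing, period=0.5, horizon=20.0)
levels_2hr = staffing_by_period(example_staffing, period=2.0, horizon=20.0)
print(levels_30min[:4], levels_2hr[:4])
print(mean_release_lag(70), mean_release_lag(120))
```

With µ = 1 and between 70 and 120 servers, the mean lag is roughly between 0.008 and 0.014 time units, which is consistent with the remark that the effect is insignificant for the example in §5.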



EC.4. Small Arrival Rate and Number of Servers
In this section we investigate how the DIS-OL and DIS-MOL approximations perform for smaller systems, when the average arrival rate (and thus the number of servers) is small. We also consider a very large system to give an overall perspective on scale. (Large systems were also considered in §EC.2.) We again consider the Markovian Mt /M/st + M example with sinusoidal arrival rate (λ(t) = a + b · sin(c · t)). As before, we use the parameters µ = 1 and θ = 1/2. In Figures EC.14-EC.21, we plot simulation estimates with parameters a = 20, 50, and 1000, b = 0.2a, c = 1. In each case, we consider the system heavily and lightly loaded. When a = 20 and 50, the number of servers is around 20 and 50. Performance measures (expected delay and abandonment probability) have bigger fluctuations than in the case a = 100. But when a = 1000, all performance measures coincide with their targets perfectly, because of the fluid effect that we have discussed in §EC.2.

Figure EC.11. Simulation estimates for the Mt /M/st + M example with sinusoidal arrival under DIS-MOL staffing with fixed staffing period d = 30 minutes and 2 hours, α = 0.1. (Panels: (a) period is 2 hours; (b) period is 30 minutes. Labelled axes: staffing levels and expected delay, each plotted against time.)

Figure EC.12. (Panels: (a) period is 2 hours; (b) period is 30 minutes. Labelled axes: staffing levels and expected delay, each plotted against time.)
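For readers who wish to reproduce experiments of this kind, the sinusoidal arrival process can be generated by thinning a homogeneous Poisson process. The following is a minimal sketch; thinning is a standard technique that we assume here for illustration, it is not stated above how the simulations were implemented, and the function names are hypothetical.

```python
import math
import random


def arrival_rate(t, a=100.0, b=20.0, c=1.0):
    """Sinusoidal arrival-rate function lambda(t) = a + b * sin(c * t)."""
    return a + b * math.sin(c * t)


def simulate_arrivals(horizon, a=100.0, b=20.0, c=1.0, seed=None):
    """Arrival epochs on [0, horizon] of a nonhomogeneous Poisson process.

    Thinning: generate candidates at the constant rate lam_max = a + |b|,
    then keep a candidate at time t with probability lambda(t) / lam_max.
    Requires a >= |b| so that the rate is nonnegative.
    """
    if a < abs(b):
        raise ValueError("need a >= |b| for a nonnegative arrival rate")
    rng = random.Random(seed)
    lam_max = a + abs(b)
    t = 0.0
    arrivals = []
    while True:
        t += rng.expovariate(lam_max)
        if t > horizon:
            break
        if rng.random() * lam_max <= arrival_rate(t, a, b, c):
            arrivals.append(t)
    return arrivals


# The three system sizes of this section, each with b = 0.2 * a and c = 1.
for a in (20.0, 50.0, 1000.0):
    arrivals = simulate_arrivals(20.0, a=a, b=0.2 * a, c=1.0, seed=1)
    print(a, len(arrivals))
```

The expected number of arrivals over a 20-hour day is the integral of λ(t), i.e., 20a + (b/c)(1 − cos(20c)), so the counts grow in proportion to a while the relative fluctuations shrink.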

Figure EC.13. (Panels: (a) period is 2 hours; (b) period is 30 minutes. Labelled axes: staffing levels and expected delay, each plotted against time.)

Figure EC.14. Simulation estimates for the Mt /M/st + M example with sinusoidal arrival rate function in (10) when a = 20, b = 4, c = 1. The system is heavily loaded (α = 0.05, 0.10, 0.15, 0.20). (Labelled panels, each plotted against time: arrival rate, delay probability and expected delay.)

For the smaller system with a = 20, Figures EC.14 and EC.16 show that the realized abandonment probabilities are stabilized at a level somewhat below the target. It is natural to wonder whether that is due to the effect of a single server (so that the staffing is actually correct) or whether it is instead a genuine error due to the smaller scale. In order to address this question, we consider the simple refinement of using one server less, i.e., we replace s(t) by s(t) − 1. Indeed, that can be achieved by rounding down instead of rounding up. The results of this modification are shown in Figures EC.15 and EC.17. We see that the abandonment probability sometimes exceeds the target with staffing by s(t) − 1, but it is much closer to the target. Such a modification is easy to consider in applications when verifying the results by simulation.

Figure EC.15. Simulation estimates for the Mt /M/st + M example with sinusoidal arrival rate function in (10) with staffing s(t) − 1, when a = 20, b = 4, c = 1. The system is heavily loaded (α = 0.05, 0.10, 0.15, 0.20). (Labelled panels, each plotted against time: arrival rate, delay probability and expected delay.)

We also observe that a single server has a large impact. We explore this issue further in §EC.5.


EC.5. Sensitivity of Performance to a Single Server
In order to evaluate the performance of alternative staffing algorithms, it is useful to understand the impact of a single server. We should not expect more precision than the effect of one server. We could staff too high (so that the abandonment probability is below the target), but one server less might fail to satisfy the target constraint. In that case, we have actually achieved our goal, even though the simulation shows some error. In order to gain insight into the sensitivity of the abandonment probability P (Ab) to a single server, we give exact numerical results for the stationary M/M/s + M model in Table EC.1. In particular, we do a sensitivity analysis showing the impact of a single server as a function of the target α and the arrival rate λ. We consider the stationary M/M/s + M model with abandonment probability targets α = 0.2, 0.1, 0.01 and 0.005, and arrival rates λ = 20, 100 and 1000. The other parameters are fixed: µ = 1, θ = 0.5.

Figure EC.16. Simulation estimates for the Mt /M/st + M example with sinusoidal arrival rate function in (10) when a = 20, b = 4, c = 1. The system is lightly loaded (α = 0.005, 0.01, 0.02). (Labelled panels, each plotted against time: delay probability and expected delay.)


Table EC.1 shows how the errors depend on the two parameters λ and α. As we would expect, the sensitivity increases as λ, and thus the number of servers, decreases. Under heavy loads, we can predict the quantitative impact from the heavy-traffic approximation P (Ab) ≈ (λ − sµ)/λ; see Whitt (2004). The impact of α depends on how we look at it. The absolute error increases as α increases. As α increases, the system becomes more and more overloaded, so that the formula above applies. But for lower α, there has to be more slack, so that one server makes less difference. On the other hand, Table EC.1 shows that the relative error as a function of α actually goes the other way. It does not change so much.
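The sensitivity calculation described above can be sketched numerically. The code below is an illustration under stated assumptions, not the program used to produce the table: it computes the steady-state abandonment probability of the M/M/s + M model from the stationary distribution of the underlying birth-and-death process, using the identity P (Ab) = θ E[Q]/λ, and compares it with the heavy-traffic approximation. The function names and the truncation rule for the state space are hypothetical choices.

```python
import math


def heavy_traffic_abandonment(lam, s, mu=1.0):
    """Heavy-traffic approximation P(Ab) ~ (lam - s*mu)/lam, floored at zero."""
    return max(0.0, (lam - s * mu) / lam)


def exact_abandonment(lam, s, mu=1.0, theta=0.5, extra_states=None):
    """Steady-state abandonment probability in the M/M/s+M model.

    Birth rate lam; death rate min(n, s)*mu + max(n - s, 0)*theta in state n.
    The state space is truncated at s + extra_states (a hypothetical rule,
    chosen generously). Weights are kept in log space to avoid overflow.
    """
    if extra_states is None:
        extra_states = int(10.0 * lam / theta) + 200
    n_max = s + extra_states
    log_w = [0.0]
    for n in range(1, n_max + 1):
        death = min(n, s) * mu + max(n - s, 0) * theta
        log_w.append(log_w[-1] + math.log(lam / death))
    shift = max(log_w)
    w = [math.exp(x - shift) for x in log_w]
    total = sum(w)
    mean_queue = sum((n - s) * w[n] for n in range(s + 1, n_max + 1)) / total
    # Abandonment rate theta * E[Q] divided by the arrival rate lam.
    return theta * mean_queue / lam


def single_server_sensitivity(lam, s, mu=1.0, theta=0.5):
    """Absolute and relative change in P(Ab) from removing one server."""
    p_s = exact_abandonment(lam, s, mu, theta)
    p_s_minus_1 = exact_abandonment(lam, s - 1, mu, theta)
    abs_err = p_s_minus_1 - p_s
    return p_s, p_s_minus_1, abs_err, abs_err / p_s


for lam, s in ((20.0, 17), (100.0, 81), (1000.0, 801)):
    print(lam, s, single_server_sensitivity(lam, s),
          heavy_traffic_abandonment(lam, s))
```

The staffing levels in the usage loop are those listed for α = 0.2 in the table, so the printed values can be compared with its first three rows.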


Figure EC.17. Simulation estimates for the Mt /M/st + M example with sinusoidal arrival rate function in (10) with staffing s(t) − 1, when a = 20, b = 4, c = 1. The system is lightly loaded (α = 0.005, 0.01, 0.02). (Labelled panels, each plotted against time: delay probability and expected delay.)

Figure EC.18. (Labelled panels, each plotted against time: arrival rate, delay probability and expected delay.)



Figure EC.19. (Labelled panels, each plotted against time: delay probability and expected delay.)

Figure EC.20. (Labelled panels, each plotted against time: arrival rate, delay probability and expected delay.)



Figure EC.21. (Labelled panels, each plotted against time: delay probability and expected delay.)


Table EC.1. Sensitivity analysis for the stationary M/M/s + M model, with abandonment probability target α = 0.2, 0.1, 0.01 and 0.005, arrival rate λ = 20, 100 and 1000. The other parameters are fixed: µ = 1, θ = 0.5. Ps (Ab) is the steady-state abandonment probability with s servers. The absolute error is Ps−1 (Ab) − Ps (Ab); the relative error is the absolute error divided by Ps (Ab).

α       λ      Staffing level s   Ps (Ab)   Ps−1 (Ab)   Absolute error   Relative error
0.2     20     17                 0.1681    0.2095      0.0415           0.2467
0.2     100    81                 0.1901    0.2001      0.01             0.0524
0.2     1000   801                0.199     0.2000      0.001            0.005
0.1     20     19                 0.0997    0.1312      0.0315           0.3165
0.1     100    91                 0.0945    0.1034      0.0088           0.0933
0.1     1000   901                0.099     0.1000      0.0010           0.0101
0.01    20     26                 0.0072    0.0112      0.004            0.5624
0.01    100    108                0.0088    0.0106      0.0018           0.2027
0.01    1000   1001               0.01      0.0105      0.000467         0.0468
0.005   20     27                 0.0045    0.0072      0.0027           0.5984
0.005   100    111                0.0049    0.006       0.0011           0.2228
0.005   1000   1015               0.0049    0.0052      0.000275         0.0557
