Admission Control for a Multi-Server Queue with Abandonment

Yaşar Levent Koçağa¹ and Amy R. Ward²

April 13, 2010

Abstract

In an M/M/N+M queue, when there are many customers waiting, it may be preferable to reject a new arrival rather than risk that arrival later abandoning without receiving service. On the other hand, rejecting new arrivals increases the percentage of time servers are idle, which also may not be desirable. We address these trade-offs by considering an admission control problem for an M/M/N+M queue when there are costs associated with customer abandonment, server idleness, and turning away customers. First, we formulate the relevant Markov decision process (MDP), show that the optimal policy is of threshold form, and provide a simple and efficient iterative algorithm that does not presuppose a bounded state space to compute the minimum infinite horizon expected average cost and associated threshold level. Under certain conditions, we can guarantee that the algorithm provides an exact optimal solution when it stops; otherwise, the algorithm stops when a provided bound on the optimality gap is reached. Next, we solve the approximating diffusion control problem (DCP) that arises in the Halfin-Whitt many-server limit regime. This allows us to establish that the parameter space has a sharp division. Specifically, there is an optimal solution with a finite threshold level when the cost of an abandonment exceeds the cost of rejecting a customer; otherwise, there is an optimal solution that exercises no control. This analysis also yields a convenient analytic expression for the infinite horizon expected average cost as a function of the threshold level. Finally, we propose a policy for the original system that is based on the DCP solution, and show that this policy is asymptotically optimal. Our extensive numerical study shows that the control that arises from solving the DCP achieves a very similar cost to the control that arises from solving the MDP, even when the number of servers is small.
Keywords: admission control, customer abandonment, Markovian decision process, diffusion control problem, Halfin-Whitt QED limit regime, average cost.
1 Information and Operations Management Department, Marshall School of Business, University of Southern California, [email protected]
2 Information and Operations Management Department, Marshall School of Business, University of Southern California, [email protected]

1 Introduction
Customers will not wait an indefinite amount of time for service. Hence a customer who has already joined a queue may leave before service begins, because he has decided that the wait time for that service is excessive. In an effort to prevent this phenomenon, it may be preferable to turn away a newly arriving customer rather than have that customer join the queue, but possibly abandon the queue later without receiving service. This suggests that it is natural to consider admission control in any queueing model that incorporates customer abandonment. Loosely motivated by a call center application, we consider a many-server setting. The admission control problem is relevant for call centers because it could be equated to a decision on whether or not to outsource an arriving call, and many companies use an outsourcer to help them answer calls when the number of callers on hold is large. For example, 41.3% of the respondents in a recent survey (ICMI 2006) cited handling overload as one of the major drivers of the outsourcing decision. We formulate an admission control problem for an M/M/N+M queue when there are costs associated with customer abandonment, server idleness, and turning away customers. Our objective is to minimize the expected long-run average cost. The control problem is a continuous-time Markov Decision Process (MDP). Because we do not assume an upper bound on the state space, the transition rates and customer abandonment costs are potentially unbounded. Hence it is not possible to solve the MDP by constructing a uniformized chain. Therefore, we consider the following alternative approach: through an iterative algorithm that truncates the state space, we show that a threshold control policy is optimal for this problem under certain conditions. The algorithm stops either when a local minimum is found or when a provided bound on the optimality gap is reached.
When the abandonment cost $a$ exceeds the cost of rejecting a customer $c$, we can guarantee that finding a local minimum coincides with finding an optimal policy having a finite threshold level. This leads us to conjecture that there is a sharp division of the parameter space. Specifically, when $a > c$, there is an optimal policy that has a finite threshold level, and when $a \le c$, there is an optimal policy that exercises no control (i.e., that admits every customer). We show this conjecture is valid by solving the approximating diffusion control problem that arises in the Halfin-Whitt many-server limit regime. The solution to the approximating diffusion control problem (DCP) is also of threshold form, and there is a convenient analytic expression for the expected long-run average cost as a function of the threshold level. We compute the threshold level $l^\star$ that minimizes the expected long-run average cost, and propose a threshold admission control policy for the original M/M/N+M system that turns away customers whenever the queue reaches the level $N + [\sqrt{N}\, l^\star]$.³ We prove that this policy is asymptotically optimal in the Halfin-Whitt limit regime. We also provide numerics that show that the control that arises from solving the DCP achieves a cost very similar to that of the optimal control that arises from solving the

3 Throughout the text, [x] denotes the nearest integer function, that is, the closest integer to x ∈
0. Each customer abandonment costs $a > 0$. Finally, each idle server costs $h_I \ge 0$ per unit time. We capture the control decision by the function $\theta : \mathbb{Z}_+ \to \{0, 1\}$, where $\mathbb{Z}_+ = \{0, 1, 2, \ldots\}$ is the set of non-negative integers. The system manager rejects an arriving customer that finds $x$ customers in the system if $\theta(x) = 1$ and admits the customer otherwise. Note that we are restricting the class of admissible policies to the class of stationary policies. The further restriction to deterministic policies is without loss of generality because we show in Section 3 that, under certain conditions, there exists a deterministic optimal policy within the class of stationary policies. We are interested in threshold policies. A threshold policy with threshold level $l$ has $\theta(x) = 0$ for all $x < l$ and $\theta(x) = 1$ for all $x \ge l$. Any policy within the class of deterministic stationary policies is equivalent to a threshold policy when the system starts empty.⁴ The threshold level $l$ is exactly the smallest state in which customers are rejected. When the threshold level is finite, the system dynamics are equivalent to an M/M/N+M queueing model with a finite buffer whose size is determined by the threshold level.

4 Consider a policy that rejects customers in state 1, does not reject customers in states 2 and 3, and rejects customers in states 4 and higher. If the system starts empty, this policy is equivalent to a threshold policy with threshold level 1. However, if the system starts in state 2, it is not.
We now specify the system evolution equations for the process $X$ that denotes the total number of customers in the system. Let $\lambda > 0$ and $\mu > 0$ represent the system arrival rate and service rate of an individual server. Let $1/\gamma$ for $\gamma > 0$ be the mean time until a customer abandons. The arrival, service and abandonment (or lost customer) processes are formed from the independent, standard Poisson processes $A$, $S$, and $L$. Then,

$$X(t) := X(0) + A(t) - \int_0^t \theta(X(s))\,dA(s) - S\left(\mu\int_0^t [N \wedge X(s)]\,ds\right) - L\left(\gamma\int_0^t [X(s) - N]^+\,ds\right), \quad t \ge 0. \tag{2.1}$$

The total system cost up to time $t$ is

$$\xi(t) := c\int_0^t \theta(X(s))\,dA(s) + h_I\int_0^t [N - X(s)]^+\,ds + a\,L\left(\gamma\int_0^t [X(s) - N]^+\,ds\right), \quad t \ge 0. \tag{2.2}$$

Let $\mathcal{A}$ denote the set of functions $\theta$ defined on the non-negative integers and having range $\{0, 1\}$. Our objective is to find the policy that achieves the minimum average cost defined as

$$z^\star := \min_{\theta \in \mathcal{A}} \liminf_{t \to \infty} \frac{E[\xi(t)]}{t}. \tag{2.3}$$

It is useful to define the holding cost rate function

$$h(x) := h_I [N - x]^+ + a\gamma [x - N]^+,$$

when the total number of customers in the system is $x$, and note that

$$E[\xi(t)] = E\left[\int_0^t h(X(s))\,ds\right] + c\,E\left[\int_0^t \theta(X(s))\,dA(s)\right].$$
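As a minimal sketch of how the cost decomposition above can be evaluated for a fixed threshold level, the following computes the stationary distribution of the resulting finite birth-and-death chain and adds the holding cost to the rejection cost. The function names and the parameter name `h_idle` (for $h_I$) are illustrative assumptions, not taken from the paper. The rejection rate is taken to be $\lambda$ times the stationary probability of the threshold state, which relies on Poisson arrivals seeing time averages.

```python
def holding_cost(x, N, h_idle, a, gamma):
    """Holding cost rate h(x) = h_I [N - x]^+ + a * gamma * [x - N]^+."""
    return h_idle * max(N - x, 0) + a * gamma * max(x - N, 0)


def death_rate(x, N, mu, gamma):
    """Total service plus abandonment rate with x customers in the system."""
    return mu * min(x, N) + gamma * max(x - N, 0)


def threshold_policy_cost(level, lam, mu, gamma, N, h_idle, a, c):
    """Long-run average cost when arrivals are rejected in states >= level.

    Hypothetical helper: assumes the system starts empty, so the chain
    lives on the states {0, ..., level}.
    """
    # Unnormalised stationary weights of the birth-and-death chain.
    weights = [1.0]
    for i in range(1, level + 1):
        weights.append(weights[-1] * lam / death_rate(i, N, mu, gamma))
    total = sum(weights)
    probs = [w / total for w in weights]

    holding = sum(p * holding_cost(i, N, h_idle, a, gamma)
                  for i, p in enumerate(probs))
    # Arrivals are Poisson, so customers are rejected at rate lam * P(X = level).
    return holding + c * lam * probs[level]


if __name__ == "__main__":
    for level in range(6):
        cost = threshold_policy_cost(level, lam=1.0, mu=0.5, gamma=0.25,
                                     N=2, h_idle=1.0, a=3.0, c=1.0)
        print(level, round(cost, 4))
```

With threshold level 0 every arrival is rejected and all $N$ servers idle, so the cost reduces to $N h_I + \lambda c$, which gives a quick sanity check on the sketch.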
Next, we present the Markovian Decision Process analysis for our model.
3 The Markov Decision Process (MDP)
Observe that when there is no admission control the process $X$ in (2.1) is a birth-and-death process with birth rate $\lambda$ and state-dependent death rate $\mu(n) := \mu (n \wedge N) + \gamma [n - N]^+$, and we can formulate the admission control problem as a Markov decision process. Note however that the state space for the process $X$ in (2.1) can potentially be countably infinite, which will result in unbounded death rates. Therefore our model does not fit into the standard MDP framework where one can apply uniformization. Instead, we pursue an alternative approach and devise an optimal policy through the following iterative algorithm. The algorithm first determines the system cost under a threshold policy at level $l$. At each next step, the algorithm increases the threshold level $l$ by one and calculates the system cost. Under the assumption that $c < a$, we establish that the first local minimum reached is a global minimum. In general, there is a stopping criterion for the algorithm that is based on a computable optimality gap. We start our analysis by showing in Section 3.1 that within the class of stationary policies there exists a deterministic optimal policy. In Sections 3.2 and 3.3, we provide the aforementioned iterative algorithm and establish its properties.
3.1 The Optimality Equations
The first step is to provide the optimality equations for the relative value function $v$, and average cost constant $z$ (see, for example, Puterman (1994)). Standard arguments in MDP analysis show that $v$ and $z$ must satisfy

$$v(n) = \min\left\{ \begin{array}{l} \dfrac{h(n) - z}{\lambda + \mu(n)} + \dfrac{\mu(n)}{\lambda + \mu(n)}\,v(n-1) + \dfrac{\lambda}{\lambda + \mu(n)}\,v(n+1), \\[2ex] \dfrac{h(n) - z}{\lambda + \mu(n)} + \dfrac{\mu(n)}{\lambda + \mu(n)}\,v(n-1) + \dfrac{\lambda}{\lambda + \mu(n)}\,(v(n) + c) \end{array} \right\}, \quad \text{for } n \in \{1, 2, \ldots\},$$

and $\lambda v(1) = \lambda v(0) - h(0) + z$. It is preferable for our analysis to work with the relative cost differences $y(n) := v(n) - v(n-1)$ for $n \in \{1, 2, \ldots\}$. Then, we can re-write the preceding equation that $v$ must satisfy in terms of $y$ as follows

$$h(n) - z = \mu(n)\,y(n) - \lambda \min\left(y(n+1), c\right), \quad \text{for } n \in \{1, 2, \ldots\}, \tag{3.1}$$

and

$$\lambda y(1) = z - h(0). \tag{3.2}$$

Our first theorem shows that there is no randomized policy that has a lower infinite horizon average expected cost than $z$ when $z$ is finite and $(y(1), y(2), \ldots)$ is a uniformly bounded sequence. Let $\Pi^S$ denote the set of stationary policies. A policy in this class $\pi^S = (p(n) : n \in \{0, 1, \ldots\})$ specifies the probability $p(n)$ of admitting a customer for each possible system state $n$. Note that the set $\Pi^S$ includes both deterministic and randomized policies.

Theorem 3.1 Suppose there exists $z < \infty$ and a uniformly bounded sequence $(y(1), y(2), \ldots)$ satisfying equations (3.1) and (3.2). Then if $z^S$ is the average cost associated with a policy $\pi^S \in \Pi^S$, $z^S \ge z$.
Given that there exists a solution to the optimality equations (3.1)-(3.2), one can easily see that the first level n with y(n + 1) ≥ c will be the optimal threshold level. The problem is that we do not know upfront that a solution to equations (3.1)-(3.2) exists.
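The passage from the equations for $v$ to the form (3.1) in terms of $y$ is a routine rearrangement. The following sketch spells it out under the definitions of this section; it is offered as a reading aid, not as the authors' own derivation.

```latex
\begin{align*}
\text{Admit in state } n:\quad
  (\lambda + \mu(n))\,v(n) &= h(n) - z + \mu(n)\,v(n-1) + \lambda\,v(n+1) \\
  \Longleftrightarrow\quad h(n) - z
  &= \mu(n)\,[v(n) - v(n-1)] - \lambda\,[v(n+1) - v(n)]
   = \mu(n)\,y(n) - \lambda\,y(n+1), \\[1ex]
\text{Reject in state } n:\quad
  (\lambda + \mu(n))\,v(n) &= h(n) - z + \mu(n)\,v(n-1) + \lambda\,(v(n) + c) \\
  \Longleftrightarrow\quad h(n) - z
  &= \mu(n)\,y(n) - \lambda\,c .
\end{align*}
% The two branches differ only in the continuation term, and
% min( v(n+1), v(n) + c ) = v(n) + min( y(n+1), c ),
% so taking the minimum over the two actions gives
\[
  h(n) - z = \mu(n)\,y(n) - \lambda \min\bigl(y(n+1),\, c\bigr).
\]
```

Read this way, rejecting in state $n$ is attractive exactly when $y(n+1) \ge c$, that is, when the marginal cost of holding one more customer is at least the rejection cost.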
3.2 The n-Terminating Problem
Unfortunately, it is not obvious how to find a solution to (3.1)-(3.2). The key is to introduce the n-terminating problem. In the n-terminating problem, we would like to find a constant $z^n$ and a vector $(y^n(1), y^n(2), \ldots, y^n(n), y^n(n+1))$ that satisfy

$$h(k) - z^n = \mu(k)\,y^n(k) - \lambda\,y^n(k+1) \quad \text{for all } k \in \{1, \ldots, n\}, \tag{3.3}$$

$$\lambda\,y^n(1) = z^n - h(0), \text{ and} \tag{3.4}$$

$$y^n(n+1) = c. \tag{3.5}$$

The interpretation for $z^n$ is that $z^n$ is the average cost associated with a threshold policy at level $n$. Also, because the n-terminating problem is a Markov reward process with finite state space, the existence of a unique solution to equations (3.3), (3.4) and (3.5) is well established in theory (see, for example, Proposition 8.2.1 and Corollary 8.2.7 in Puterman (1994)). The following Lemma establishes a useful property of the n-terminating problem, that the vector $y^n$ is increasing in $z^n$.

Lemma 3.1 Suppose $z^{n_1} > (\ge)\ z^{n_2}$ for some non-negative integers $n_1$ and $n_2$. Then, $y^{n_1}(k) > (\ge)\ y^{n_2}(k)$ for $k \in \{1, 2, \ldots, \min(n_1, n_2) + 1\}$.

We use the n-terminating problem (3.3)-(3.5) to construct a threshold policy that achieves the minimum possible average cost within the class of stationary policies. Specifically, we solve a sequence of n-terminating problems (3.3)-(3.5), and search for the first local minimum $z^m$ having $z^m < z^k$ for all $k \in \{0, 1, \ldots, m-1\}$, with $z^{m+1} \ge z^m$. Consistent with the intuition that it does not make sense to reject an arriving customer when there is a free server, the following Lemma establishes that we do not expect to find a local minimum before state $N$.

Lemma 3.2 The sequence of solutions to the n-terminating problem (3.3)-(3.5) has $z^0 > z^1 > \cdots > z^{N-1} > z^N$.

We end this section with a theorem that shows that finding such a $z^m$ is equivalent to knowing the $z$ in a solution to the optimality equations (3.1)-(3.2).
Theorem 3.2 Let $c < a$ and suppose there exists a sequence of solutions to the n-terminating problem (3.3)-(3.5) such that $z^m < z^k$ for all $k \in \{0, 1, \ldots, m-1\}$, and $z^{m+1} \ge z^m$. Then, if $z^S$ is the average cost associated with a policy $\pi^S \in \Pi^S$, $z^S \ge z^m$.
3.3 Policy Computation
We are now in a position to construct an algorithm to find a threshold policy that achieves the minimum possible average cost within the class of stationary policies. The algorithm is similar in spirit to Adusumilli and Hasenbein (2008), but is adapted to our setting. In particular, the proofs supporting the algorithm (Theorem 3.2 above and Theorems 3.3 and 3.4 below) must be changed in a non-trivial fashion to account for the state-space dependence caused by customer abandonments. The algorithm leverages Lemma 3.2 and starts by setting the threshold level to $N$.

Initialization: Set $n = N + 1$.

Step 1: Solve the n-terminating equations (3.3)-(3.5).

Step 2: If a solution to the n-terminating equations (3.3)-(3.5) exists and $z^n \ge z^{n-1}$, then the $(n-1)$-terminating policy is optimal; that is, if $z^S$ is the average cost associated with a policy $\pi^S \in \Pi^S$, $z^S \ge z^{n-1}$. Otherwise, increase $n$ by 1 and go to Step 1.

There are two cases: either there will be a local minimum, or there will not be a local minimum. In the case that there is a local minimum and $c < a$, Theorem 3.2 implies we have found an optimal policy. If there is not a local minimum, the sequence of solutions to the n-terminating problem converges to $z^\star$.

Theorem 3.3 If the sequence of solutions to the n-terminating problem is decreasing; i.e., if $z^0 > z^1 > z^2 > \cdots$, then $z^\star = \lim_{n \to \infty} z^n$.

If no local minimum has been found after $n$ iterations, the following theorem can be used as a stopping criterion.

Theorem 3.4 If $z^0 > z^1 > \cdots > z^n$, then $z^n - z^\star \le \lambda (c - y^n(n))$.

Note we do not know in advance whether the algorithm will stop at an optimal threshold. Moreover, there is no sharp condition that determines when it is never optimal to reject a customer.
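The search described in this section can be sketched as follows. The function names, the tolerance `gap_tol`, and the cap `max_n` are illustrative assumptions, not part of the paper. The solver exploits the fact that, by (3.3)-(3.4), every $y^n(k)$ is an affine function of $z^n$, so the terminal condition (3.5) pins down $z^n$ directly. The zero gap reported at a local minimum is justified by Theorem 3.2 only when $c < a$.

```python
def holding_cost(x, N, h_idle, a, gamma):
    """Holding cost rate h(x) = h_I [N - x]^+ + a * gamma * [x - N]^+."""
    return h_idle * max(N - x, 0) + a * gamma * max(x - N, 0)


def death_rate(x, N, mu, gamma):
    """State-dependent death rate mu(x) = mu (x ^ N) + gamma [x - N]^+."""
    return mu * min(x, N) + gamma * max(x - N, 0)


def solve_n_terminating(n, lam, mu, gamma, N, h_idle, a, c):
    """Solve (3.3)-(3.5); returns (z_n, y) with y[k] = y^n(k), k = 1..n+1."""
    # Write y(k) = alpha[k] + beta[k] * z; index 0 is an unused placeholder.
    alpha = [0.0, -holding_cost(0, N, h_idle, a, gamma) / lam]  # from (3.4)
    beta = [0.0, 1.0 / lam]
    for k in range(1, n + 1):
        # (3.3) rearranged: lam * y(k+1) = mu(k) * y(k) - h(k) + z.
        rate = death_rate(k, N, mu, gamma)
        alpha.append((rate * alpha[k] - holding_cost(k, N, h_idle, a, gamma)) / lam)
        beta.append((rate * beta[k] + 1.0) / lam)
    z = (c - alpha[n + 1]) / beta[n + 1]  # impose (3.5): y(n+1) = c
    y = [None] + [alpha[k] + beta[k] * z for k in range(1, n + 2)]
    return z, y


def find_threshold(lam, mu, gamma, N, h_idle, a, c, gap_tol=1e-6, max_n=10_000):
    """Return (threshold, average cost, bound on the optimality gap)."""
    z_prev, _ = solve_n_terminating(N, lam, mu, gamma, N, h_idle, a, c)
    for n in range(N + 1, max_n + 1):
        z_n, y = solve_n_terminating(n, lam, mu, gamma, N, h_idle, a, c)
        if z_n >= z_prev:
            # First local minimum; optimal by Theorem 3.2 when c < a.
            return n - 1, z_prev, 0.0
        gap = lam * (c - y[n])  # stopping bound of Theorem 3.4
        if gap <= gap_tol:
            return n, z_n, gap
        z_prev = z_n
    raise RuntimeError("no stopping condition met before max_n")


if __name__ == "__main__":
    print(find_threshold(lam=1.0, mu=0.5, gamma=0.25, N=2,
                         h_idle=1.0, a=3.0, c=1.0))
```

The affine recursion keeps each solve linear in $n$, at the price of coefficients that grow with the product of the ratios $\mu(k)/\lambda$; for very large thresholds a normalised formulation would be safer numerically.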
4 The Diffusion Control Problem
We formulate and solve an approximating diffusion control problem. Section 4.1 states the approximating diffusion control problem. In Section 4.2 we characterize the long-run average expected cost associated with a threshold policy. Finally, in Section 4.3 we show that the policy that minimizes long-run average expected cost among all policies is a threshold policy.
4.1 The Approximating Control Problem
Let $B$ be a standard Brownian motion, and let $\hat{U}$ be a process that is adapted to $B$. The process

$$\hat{X}(t) = \hat{X}(0) + \sigma B(t) + \int_0^t m\left(\hat{X}(s)\right) ds - \hat{U}(t), \tag{4.1}$$

where $\sigma^2 = 2\mu$ and

$$m(x) = \begin{cases} -m - \mu x, & x \le 0 \\ -m - \gamma x, & x > 0, \end{cases}$$

approximates the centered and scaled number of customers in the system, $(X(t) - N)/\sqrt{N}$, and the process $\hat{U}$ approximates the scaled cumulative number of rejected customers. The constant $m$ can be used to approximate the capacity imbalance of the system; see Theorem 5.1 and the discussion in the paragraph following Theorem 5.2. Let

$$\hat{h}(x) := h_I x^- + a\gamma x^+,$$

and

$$\hat{\xi}(t) = \int_0^t \hat{h}\left(\hat{X}(s)\right) ds + c\,\hat{U}(t). \tag{4.2}$$

Let $\hat{\mathcal{A}}$ be the set of all non-negative, non-decreasing and RCLL processes that are adapted to $B$, under which (1) a strong solution to the stochastic equation (4.1) above exists, and (2) $E\,\hat{X}(t)/t \to 0$ as $t \to \infty$. We will solve

$$\min_{\hat{U} \in \hat{\mathcal{A}}} \liminf_{t \to \infty} \frac{E\left[\hat{\xi}(t)\right]}{t}. \tag{4.3}$$

The diffusion control problem (4.3) approximates the original control problem (2.3).
4.2 Threshold Policies
We initiate our analysis by considering the class of threshold policies. We will show that a policy in this class solves the diffusion control problem (4.3); see Theorem 4.1 in Section 4.3.
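A threshold policy $\hat{U}_T$ that regulates at the level $l < \infty$ has

$$\hat{U}_T(0) = \left[\hat{X}(0) - l\right]^+, \qquad \hat{X}(t) \le l \text{ for all } t \ge 0, \qquad \int_0^\infty \left[l - \hat{X}(t)\right]^+ d\hat{U}_T(t) = 0. \tag{4.4}$$

The following lemma characterizes the long-run average expected cost associated with a threshold policy.

Lemma 4.1 Suppose there exists a twice continuously differentiable function $V$ having bounded first derivative and a finite constant $\kappa$ that solve

$$\frac{\sigma^2}{2} V''(x) + m(x) V'(x) + \hat{h}(x) = \kappa, \tag{4.5}$$

for $x \le l$ with the boundary condition $V'(l) = c$. Then, $\kappa$ represents the average cost associated with controlling the diffusion $\hat{X}$ in (4.1) using the threshold policy $\hat{U}_T$ regulating at the level $l$; i.e.

$$\kappa = \lim_{t \to \infty} \frac{E\left[\hat{\xi}(t)\right]}{t}.$$

We now construct a solution to (4.5) that is twice continuously differentiable on $(-\infty, l]$. For this, it is sufficient to specify the function $V'$. Let $\phi$ and $\Phi$ be respectively the standard normal pdf and cdf. The function

$$V'(x) := \begin{cases} V_1'(x), & x \le 0 \\ V_2'(x), & 0 < x \le l \end{cases} \tag{4.6}$$

for

$$V_1'(x) := \frac{2}{\sigma}\sqrt{\frac{\pi}{\mu}}\left(\kappa - \frac{h_I m}{\mu}\right)\exp\left(\frac{\mu}{\sigma^2}\left(x + \frac{m}{\mu}\right)^2\right)\Phi\left(\frac{\sqrt{2\mu}}{\sigma}\left(x + \frac{m}{\mu}\right)\right) - \frac{h_I}{\mu}, \tag{4.7}$$

$$V_2'(x) := \frac{2}{\sigma}\sqrt{\frac{\pi}{\gamma}}\,(\kappa + am)\exp\left(\frac{\gamma}{\sigma^2}\left(x + \frac{m}{\gamma}\right)^2\right)\left[\Phi\left(\frac{\sqrt{2\gamma}}{\sigma}\left(x + \frac{m}{\gamma}\right)\right) - \Phi\left(\frac{\sqrt{2\gamma}}{\sigma}\left(l + \frac{m}{\gamma}\right)\right)\right] + a + (c - a)\exp\left(-\frac{\gamma}{\sigma^2}\left(\left(l + \frac{m}{\gamma}\right)^2 - \left(x + \frac{m}{\gamma}\right)^2\right)\right), \tag{4.8}$$

and

$$\kappa := \kappa(l) := \frac{A(l)}{B(l)}, \tag{4.9}$$

$$A(l) := \frac{h_I}{\mu}\left[1 + \frac{2}{\sigma}\sqrt{\frac{\pi}{\mu}}\, m \exp\left(\frac{m^2}{\mu\sigma^2}\right)\Phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right)\right] + c\exp\left(-\frac{\gamma}{\sigma^2}\left(l^2 + \frac{2m}{\gamma}l\right)\right) + a\left[1 - \exp\left(-\frac{\gamma}{\sigma^2}\left(l^2 + \frac{2m}{\gamma}l\right)\right) + \frac{2}{\sigma}\sqrt{\frac{\pi}{\gamma}}\, m \exp\left(\frac{m^2}{\gamma\sigma^2}\right)\left(\Phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right) - \Phi\left(\frac{\sqrt{2\gamma}}{\sigma}\left(l + \frac{m}{\gamma}\right)\right)\right)\right],$$

$$B(l) := \frac{2\sqrt{\pi}}{\sigma}\left\{\frac{1}{\sqrt{\mu}}\exp\left(\frac{m^2}{\mu\sigma^2}\right)\Phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right) + \frac{1}{\sqrt{\gamma}}\exp\left(\frac{m^2}{\gamma\sigma^2}\right)\left[\Phi\left(\frac{\sqrt{2\gamma}}{\sigma}\left(l + \frac{m}{\gamma}\right)\right) - \Phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)\right]\right\},$$

solves (4.5) and has $V'(l) = c$. Define

$$V(x) := \int_0^x V'(y)\,dy \quad \text{for } x \le l. \tag{4.10}$$

Then, the function $V$ is twice continuously differentiable on $(-\infty, l]$. To see this, note that it is straightforward to check that $V_1'(0) = V_2'(0)$ and $V_1''(0) = V_2''(0)$ after deriving

$$V_1''(x) = \frac{2\mu}{\sigma^2}\left(x + \frac{m}{\mu}\right)V_1'(x) + \frac{2}{\sigma^2}\left(\kappa + h_I x\right),$$

$$V_2''(x) = \frac{2\gamma}{\sigma^2}\left(x + \frac{m}{\gamma}\right)V_2'(x) - \frac{2}{\sigma^2}\,a\gamma x + \frac{2}{\sigma^2}\,\kappa.$$

Finally, the function $V'$ is bounded because it is continuous and

$$\lim_{x \to -\infty} V_1'(x) = -\frac{h_I}{\mu}.$$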
The following Lemma verifies the intuition that $\kappa$ is positive, and is increasing in the cost parameters $a$, $h_I$, and $c$.

Lemma 4.2 The function $\kappa := \kappa(l, a, h_I, c)$ satisfies $\kappa > 0$ for all $l, a, h_I, c \ge 0$. Furthermore,

$$\frac{\partial \kappa}{\partial c} > 0, \quad \frac{\partial \kappa}{\partial a} > 0, \quad \text{and} \quad \frac{\partial \kappa}{\partial h_I} > 0.$$

We view $\kappa$ as a function of $l$, and minimize $\kappa$ over $l$ for fixed values of the cost parameters $a$, $h_I$, and $c$. Our next proposition shows the conditions under which $\kappa(l)$ has a minimum that will be attained at some finite $l$.

Proposition 4.1 If $a > c$, then there exists a unique $l^\star$ that satisfies

$$(a - c)\gamma l^\star - \kappa(l^\star) = cm, \tag{4.11}$$

and $\kappa(l) \ge \kappa(l^\star)$ for all $l \ge 0$. Otherwise, $\kappa'(l) \le 0$ for all $l \ge 0$.

The formula (4.11) has the following intuition. In order that $l$ is a minimum, it is necessary that $\kappa'(l) = 0$. We show in the proof of Proposition 4.1 that in order that $\kappa'(l) = 0$, it is necessary that $(a - c)\gamma l - cm - \kappa(l) = 0$, which implies that $V_2''(l) = 0$ (using the expression for $V_2''(x)$ that appears below (4.10)). Then, (4.11) is obtained by letting $x = l$ in (4.5). Extend the definition of $V'$ in (4.6) so that $V'(x) = c$ for all $x \ge l^\star$. Then, the function $V$ is twice continuously differentiable because $V_2''(l^\star) = 0$. Hence the conditions of Lemma 4.1 are satisfied, and we can conclude that $\kappa(l^\star)$ represents the average cost associated with controlling the diffusion $\hat{X}$ in (4.1) through the threshold policy that regulates at the level $l^\star$.

Corollary 4.1 Let $\hat{U}_T^\star$ be the threshold policy that regulates at level $l^\star$, with associated cumulative cost process $\hat{\xi}_T^\star$. Then,

$$\kappa(l^\star) = \lim_{t \to \infty} \frac{E\left[\hat{\xi}_T^\star(t)\right]}{t}.$$
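A minimal numerical sketch of Proposition 4.1 follows. It evaluates $\kappa(l) = A(l)/B(l)$ from (4.9) and solves (4.11) for $l^\star$ by bisection. The function names, the bracketing strategy, and the tolerance are illustrative assumptions. The bisection relies on the proposition's uniqueness claim for $a > c$ and on $(a-c)\gamma l - cm - \kappa(l)$ being negative at $l = 0$.

```python
import math


def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kappa(l, mu, gamma, m, h_idle, a, c):
    """Average cost kappa(l) = A(l) / B(l) of regulating at level l, as in (4.9)."""
    sigma = math.sqrt(2.0 * mu)  # sigma^2 = 2 * mu
    e_mu = math.exp(m * m / (mu * sigma ** 2))
    e_gamma = math.exp(m * m / (gamma * sigma ** 2))
    Phi_mu = Phi(m / sigma * math.sqrt(2.0 / mu))
    Phi_gamma = Phi(m / sigma * math.sqrt(2.0 / gamma))
    Phi_l = Phi(math.sqrt(2.0 * gamma) / sigma * (l + m / gamma))
    decay = math.exp(-gamma / sigma ** 2 * (l * l + 2.0 * m * l / gamma))

    A = (h_idle / mu) * (1.0 + 2.0 / sigma * math.sqrt(math.pi / mu) * m * e_mu * Phi_mu)
    A += c * decay
    A += a * (1.0 - decay
              + 2.0 / sigma * math.sqrt(math.pi / gamma) * m * e_gamma * (Phi_gamma - Phi_l))
    B = 2.0 * math.sqrt(math.pi) / sigma * (
        e_mu * Phi_mu / math.sqrt(mu)
        + e_gamma * (Phi_l - Phi_gamma) / math.sqrt(gamma))
    return A / B


def optimal_level(mu, gamma, m, h_idle, a, c, tol=1e-10):
    """Solve (4.11) for l* when a > c; return infinity (no control) otherwise."""
    if a <= c:
        return math.inf

    def g(l):
        return (a - c) * gamma * l - c * m - kappa(l, mu, gamma, m, h_idle, a, c)

    lo, hi = 0.0, 1.0
    while g(hi) <= 0.0:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    params = dict(mu=1.0, gamma=0.5, m=0.5, h_idle=1.0, a=3.0, c=1.0)
    l_star = optimal_level(**params)
    print(l_star, kappa(l_star, **params))
```

A root finder on (4.11) is used here instead of a direct minimisation of $\kappa$ because $\kappa$ is nearly flat around its minimiser, which makes the first-order condition the better-conditioned target.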
4.3 The Diffusion Control Problem Solution
We show that the threshold policy that regulates at the level $l^\star$ solves the diffusion control problem when $a > c$, and that it is optimal to exercise no control when $a \le c$. The first step is to provide a verification Lemma that characterizes the minimum achievable long-run average expected cost.

Lemma 4.3 Suppose there exists a twice continuously differentiable function $V$ having bounded first derivative, and a constant $\kappa$ that satisfy

$$\frac{\sigma^2}{2} V''(x) + m(x) V'(x) + \hat{h}(x) \ge \kappa, \tag{4.12}$$

and $V'(x) \le c$ for all $x \in$ c, and that exercising no control attains the minimum achievable long-run average cost when $a \le c$. To do this, it is sufficient to show that the function $V$ defined in (4.10) satisfies the conditions of Lemma 4.3. In particular, it is enough to
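show that $V_1'$ and $V_2'$ in (4.7) and (4.8) are increasing. This is because when $a > c$, $V_2'(x) = c$ for all $x \ge l^\star$ by construction, and when $a \le c$, $V_2'(x) \to a$ as $x \to \infty$. Also, it is straightforward to check that (4.12) is satisfied (and we show this explicitly in the proof of Theorem 4.1).

Theorem 4.1 Let $\hat{U} \in \hat{\mathcal{A}}$ be an admissible control, and let $\hat{\xi}$ be the associated cumulative cost process in (4.2).

(i) Suppose $a > c$, and let $l^\star$ be defined as in the statement of Proposition 4.1. Let $\hat{U}_T^\star$ be the threshold control at $l^\star$, as defined in (4.4), and let $\hat{\xi}_T^\star$ be the associated cumulative cost process in (4.2). Then,

$$\liminf_{t \to \infty} \frac{E\left[\hat{\xi}(t)\right]}{t} \ge \lim_{t \to \infty} \frac{E\left[\hat{\xi}_T^\star(t)\right]}{t} = \kappa(l^\star).$$

(ii) Suppose $a \le c$. Let $\hat{U}^0(t) = 0$ for all $t \ge 0$, and let $\hat{\xi}^0$ be the associated cumulative cost process in (4.2). Then,

$$\liminf_{t \to \infty} \frac{E\left[\hat{\xi}(t)\right]}{t} \ge \lim_{t \to \infty} \frac{E\left[\hat{\xi}^0(t)\right]}{t} = \kappa_0,$$

where

$$\kappa_0 = \lim_{l \to \infty} \kappa(l) = \frac{\dfrac{h_I}{\mu}\left[1 + \dfrac{2}{\sigma}\sqrt{\dfrac{\pi}{\mu}}\, m \exp\left(\dfrac{m^2}{\mu\sigma^2}\right)\Phi\left(\dfrac{m}{\sigma}\sqrt{\dfrac{2}{\mu}}\right)\right] + a\left[1 + \dfrac{2}{\sigma}\sqrt{\dfrac{\pi}{\gamma}}\, m \exp\left(\dfrac{m^2}{\gamma\sigma^2}\right)\left(\Phi\left(\dfrac{m}{\sigma}\sqrt{\dfrac{2}{\gamma}}\right) - 1\right)\right]}{\dfrac{2\sqrt{\pi}}{\sigma}\left\{\dfrac{1}{\sqrt{\mu}}\exp\left(\dfrac{m^2}{\mu\sigma^2}\right)\Phi\left(\dfrac{m}{\sigma}\sqrt{\dfrac{2}{\mu}}\right) + \dfrac{1}{\sqrt{\gamma}}\exp\left(\dfrac{m^2}{\gamma\sigma^2}\right)\left(1 - \Phi\left(\dfrac{m}{\sigma}\sqrt{\dfrac{2}{\gamma}}\right)\right)\right\}}.$$

It is worthwhile to double-check that the expression for $\kappa_0$ agrees with standard results in the literature. Since $\hat{U}^0(t) = 0$ for all $t \ge 0$, it follows that

$$\lim_{t \to \infty} \frac{E\left[\hat{\xi}^0(t)\right]}{t} = h_I E\left[\hat{X}(\infty)^-\right] + a\gamma E\left[\hat{X}(\infty)^+\right], \tag{4.13}$$

where $\hat{X}(\infty)$ has the steady-state density associated with the process $\hat{X}$ in (4.1) under control $\hat{U}^0$. The expressions for $E[\hat{X}(\infty) \mid \hat{X}(\infty) \le 0]$ and $E[\hat{X}(\infty) \mid \hat{X}(\infty) > 0]$ are given in (18.29) of Browne and Whitt (1995), and the expressions for $P(\hat{X}(\infty) \le 0)$ and $P(\hat{X}(\infty) > 0)$ are given in (18.5) of this same paper. Hence it follows from Browne and Whitt (1995) that

$$E\left[\hat{X}(\infty)^-\right] = -E\left[\hat{X}(\infty) \mid \hat{X}(\infty) \le 0\right] \times P\left(\hat{X}(\infty) \le 0\right) = \frac{\dfrac{1}{\sqrt{\mu}}\,\dfrac{\sigma}{\sqrt{2\mu}}\left[\dfrac{\sqrt{2}\,m}{\sigma\sqrt{\mu}}\,\dfrac{\Phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right)}{\phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right)} + 1\right]}{\dfrac{1}{\sqrt{\mu}}\,\dfrac{\Phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right)}{\phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right)} + \dfrac{1}{\sqrt{\gamma}}\,\dfrac{1 - \Phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)}{\phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)}},$$

$$E\left[\hat{X}(\infty)^+\right] = E\left[\hat{X}(\infty) \mid \hat{X}(\infty) > 0\right] \times P\left(\hat{X}(\infty) > 0\right) = \frac{\dfrac{1}{\sqrt{\gamma}}\,\dfrac{\sigma}{\sqrt{2\gamma}}\left[-\dfrac{\sqrt{2}\,m}{\sigma\sqrt{\gamma}}\,\dfrac{1 - \Phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)}{\phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)} + 1\right]}{\dfrac{1}{\sqrt{\mu}}\,\dfrac{\Phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right)}{\phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right)} + \dfrac{1}{\sqrt{\gamma}}\,\dfrac{1 - \Phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)}{\phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)}}.$$

It is now straightforward to verify that

$$h_I E\left[\hat{X}(\infty)^-\right] + a\gamma E\left[\hat{X}(\infty)^+\right] = \lim_{l \to \infty} \kappa(l),$$

by plugging the expressions for $E[\hat{X}(\infty)^-]$ and $E[\hat{X}(\infty)^+]$ into (4.13). We end this Section by plotting the minimum achievable average cost, $\kappa(l^\star)$, and the associated threshold level $l^\star$ as a function of the abandonment cost $a$ and cost per customer rejected $c$. Figure 4.1 shows that $\kappa(l^\star)$ is increasing in both the parameters $a$ and $c$, while $l^\star$ is decreasing in $a$ but increasing in $c$. We do not include the cases where $a \le c$, because in this case $l^\star = \infty$.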
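The agreement between (4.13) and $\kappa_0$ can also be checked numerically. The sketch below integrates the uncontrolled steady-state density by the trapezoid rule and compares the result with the closed form for $\kappa_0$. The function names, the integration window, and the step size are assumptions chosen for the illustrative parameter values. The density is taken proportional to $\exp(-(\mu x^2 + 2mx)/\sigma^2)$ for $x \le 0$ and $\exp(-(\gamma x^2 + 2mx)/\sigma^2)$ for $x > 0$, which is the standard stationary density of a one-dimensional diffusion with drift $m(x)$ and variance $\sigma^2$.

```python
import math


def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kappa_no_control(mu, gamma, m, h_idle, a):
    """Closed form for kappa_0, the limit of kappa(l) as l grows."""
    sigma = math.sqrt(2.0 * mu)
    e_mu = math.exp(m * m / (mu * sigma ** 2))
    e_gamma = math.exp(m * m / (gamma * sigma ** 2))
    Phi_mu = Phi(m / sigma * math.sqrt(2.0 / mu))
    Phi_gamma = Phi(m / sigma * math.sqrt(2.0 / gamma))
    num = (h_idle / mu) * (1.0 + 2.0 / sigma * math.sqrt(math.pi / mu) * m * e_mu * Phi_mu)
    num += a * (1.0 + 2.0 / sigma * math.sqrt(math.pi / gamma) * m * e_gamma * (Phi_gamma - 1.0))
    den = 2.0 * math.sqrt(math.pi) / sigma * (
        e_mu * Phi_mu / math.sqrt(mu) + e_gamma * (1.0 - Phi_gamma) / math.sqrt(gamma))
    return num / den


def stationary_cost_by_quadrature(mu, gamma, m, h_idle, a,
                                  lower=-12.0, upper=20.0, step=1e-3):
    """Right side of (4.13) via trapezoid integration of the steady-state density."""
    sigma2 = 2.0 * mu

    def density(x):
        rate = mu if x <= 0.0 else gamma
        return math.exp(-(rate * x * x + 2.0 * m * x) / sigma2)

    n = int(round((upper - lower) / step))
    mass = 0.0
    cost = 0.0
    for i in range(n + 1):
        x = lower + i * step
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        d = weight * density(x)
        mass += d
        cost += d * (h_idle * max(-x, 0.0) + a * gamma * max(x, 0.0))
    return cost / mass


if __name__ == "__main__":
    args = dict(mu=1.0, gamma=0.5, m=0.5, h_idle=1.0, a=3.0)
    print(kappa_no_control(**args), stationary_cost_by_quadrature(**args))
```

The integration window must be wide enough that the density is negligible at both ends. The defaults here are an assumption tuned to the example parameters and would need widening for small $\mu$ or $\gamma$.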
5 The Performance of the Policy Arising from the Diffusion Control Problem
There is a natural translation from the optimal policy for the diffusion control problem in Theorem 4.1 to a policy for the original system. Specifically, when there are $N$ servers, let

$$\theta^{\star,N}(x) := 1\left\{[x - N]^+ \ge \left[\sqrt{N}\, l^\star\right]\right\},$$

where $l^\star$ satisfies (4.11). The following Theorem, which is Theorem 7.6 in Pang et al. (2007) re-stated in our setting, justifies that this is the right translation.

Theorem 5.1 Consider a sequence of systems having $N$ servers and arrival rate $\lambda^N = N\mu - m\sqrt{N} + o(\sqrt{N})$, for some $m \in$ $y^{n_2}(2)$. Continued iteration of the last two sentences shows $y^{n_1}(k) > y^{n_2}(k)$ for $k \in \{1, \ldots, \min(n_1, n_2) + 1\}$.
Proof of Lemma 3.2: Without loss of generality, in this proof only, let $\lambda = 1$. The proof is by induction. For the base case, first note that it follows from (3.3)-(3.5) that $z^0 = N h_I + c$, and

$$h(1) - z^1 = \mu(1)\,y^1(1) - y^1(2), \qquad y^1(1) = z^1 - h(0), \qquad y^1(2) = c.$$

Solving the above equations for $z^1$ shows

$$z^1 = \frac{h_I(N - 1) + \mu h_I N + c}{1 + \mu}.$$

Since

$$\frac{h_I(N - 1) + \mu h_I N + c}{1 + \mu} = N h_I + c - \frac{h_I + c\mu}{1 + \mu} < z^0,$$

we conclude that $z^1 < z^0$. Next, select any $n \in \{2, 3, \ldots, N - 1\}$, and suppose that $z^n < z^{n-1} < \cdots < z^1 < z^0$. To complete the proof, it is sufficient to show that $z^{n+1} < z^n$. The argument is by contradiction. Suppose not; i.e., suppose that $z^{n+1} \ge z^n$. It follows from Lemma 3.1 that $y^n(k) < y^{n-1}(k)$ for all $k \in \{1, 2, \ldots, n\}$, and $y^n(k) \le y^{n+1}(k)$ for all $k \in \{1, 2, \ldots, n + 1\}$. Since from (3.5), $y^{n-1}(n) = y^n(n + 1) = c$, it follows that $y^n(n) < c$ and $y^{n+1}(n + 1) \ge c$. Equations (3.3)-(3.5) show that

$$z^n = h(n) - \mu(n)\,y^n(n) + c, \qquad z^{n+1} = h(n + 1) - \mu(n + 1)\,y^{n+1}(n + 1) + c,$$

and so

$$z^n > h(n) - \mu(n)c + c, \qquad z^{n+1} \le h(n + 1) - \mu(n + 1)c + c.$$

The fact that $h(n) > h(n + 1)$ and $\mu(n) < \mu(n + 1)$ implies $z^n > h(n + 1) - \mu(n + 1)c + c$. We conclude that $z^n > z^{n+1}$, which is a contradiction.
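Proof of Theorem 3.2: Let $m \ge N$ be such that $z^m < z^k$ for all $k \in \{0, 1, \ldots, m - 1\}$ and $z^m \le z^{m+1}$. Assume also that $a > c$. For $\delta = z^{m+1} - z^m \ge 0$ define the modified problem as

$$\tilde{\mu}(k) = \begin{cases} \bar{\mu}(k) & \text{for } k \in \{0, 1, \ldots, m\} \\ \bar{\mu}(m + 1) & \text{for } k \in \{m + 1, m + 2, \ldots\} \end{cases}$$

and

$$\tilde{h}(k) = \begin{cases} h(k) & \text{for } k \in \{0, 1, \ldots, m\} \\ h(m + 1) - \delta & \text{for } k \in \{m + 1, m + 2, \ldots\} \end{cases}$$

and finally

$$y(k) = \begin{cases} y^m(k) & \text{for } k \in \{0, 1, \ldots, m\} \\ y^{m+1}(m + 1) & \text{for } k \in \{m + 1, m + 2, \ldots\}. \end{cases}$$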
The idea is to show that if the threshold level $m$ is optimal for this modified problem then it is also optimal for the original problem. This will be done by assuming there exists $k > m$ with $z^k < z^m$ and arriving at a contradiction. It is immediate by Lemma 3.1 that $y^m(k) < c$ for all $k \in \{0, 1, \ldots, m\}$ and $y^{m+1}(m + 1) \ge c$. Then letting $z = z^m$, it is straightforward to verify that $(z, (y(1), y(2), \ldots))$ satisfies the optimality equations (3.1)-(3.2) for the modified problem and therefore the threshold level $m$ is optimal with associated optimal average cost $z^m$. From this point on, let $\tilde{z}^k$ be the long run average expected cost of the policy with threshold level $k$ for the modified problem. Similarly, let $\tilde{y}^k(\cdot)$ be the relative cost difference for the associated modified problem. Note that these values will coincide with those of the original problem for $k \le m + 1$. Now assume that $m$ is not the optimal threshold level for the original problem. Then there exists $n \ge m + 2$ such that

$$n = \inf\left\{k \ge m + 2 : z^k < z^m\right\}.$$

To proceed it is enough to prove that $\tilde{z}^n < z^n$, which would imply such an $n$ cannot exist because we know $z^m = \tilde{z}^m \le \tilde{z}^n$ as threshold level $m$ is optimal for the modified problem. Now we show that $\tilde{z}^n < z^n$. First note from Lemma 3.1, $y^n(k) < c$ for all $k \in \{0, 1, \ldots, n\}$. From the n-terminating equation (3.3) for the original and modified problems we have

$$h(n) - z^n = \bar{\mu}(n)\,y^n(n) - c\lambda \quad \text{and} \quad \tilde{h}(n) - \tilde{z}^n = \tilde{\mu}(n)\,\tilde{y}^n(n) - c\lambda.$$

Now assume $\tilde{z}^n \ge z^n$. Then subtracting the above equations from each other

$$\begin{aligned} \tilde{z}^n - z^n &= \tilde{h}(n) - h(n) + \bar{\mu}(n)\,y^n(n) - \tilde{\mu}(n)\,\tilde{y}^n(n) \\ &= (m + 1)a\gamma - na\gamma - \delta + (N\mu + (n - N)\gamma)\,y^n(n) - (N\mu + (m + 1 - N)\gamma)\,\tilde{y}^n(n) \\ &= -(n - m - 1)\gamma\,(a - y^n(n)) - \bar{\mu}(m + 1)\left(\tilde{y}^n(n) - y^n(n)\right) - \delta < 0, \end{aligned}$$

where the last inequality follows since $\delta \ge 0$, $y^n(n) < c < a$ and $y^n(n) < \tilde{y}^n(n)$.
To see how the latter follows, first note that, under the assumption $\tilde{z}^n \ge z^n$, the logic of Lemma 3.1 applies directly for states $k = 1, 2, \ldots, m + 1$ giving $y^n(k) \le \tilde{y}^n(k)$. To see that $y^n(k) \le \tilde{y}^n(k)$ for $k \ge m + 2$ we rewrite the n-terminating equation (3.3) for the original and modified problem to get

$$\lambda y^n(k) = \bar{\mu}(k - 1)\,y^n(k - 1) + z^n - h(k - 1) \tag{A-2}$$

and

$$\lambda \tilde{y}^n(k) = \tilde{\mu}(k - 1)\,\tilde{y}^n(k - 1) + \tilde{z}^n - \tilde{h}(k - 1). \tag{A-3}$$

Hence it follows immediately that $y^n(m + 2) \le \tilde{y}^n(m + 2)$. Next, subtracting equation (A-3) from equation (A-2) with $k = m + 3$ we get

$$\begin{aligned} \lambda\left(y^n(m + 3) - \tilde{y}^n(m + 3)\right) &= \bar{\mu}(m + 2)\,y^n(m + 2) + z^n - h(m + 2) - \tilde{\mu}(m + 2)\,\tilde{y}^n(m + 2) - \tilde{z}^n + \tilde{h}(m + 2) \\ &= (\bar{\mu}(m + 1) + \gamma)\,y^n(m + 2) - \bar{\mu}(m + 1)\,\tilde{y}^n(m + 2) + z^n - \tilde{z}^n - h(m + 2) + h(m + 1) - \delta \\ &< \bar{\mu}(m + 1)\left(y^n(m + 2) - \tilde{y}^n(m + 2)\right) + \gamma c - \gamma a - \delta < 0, \end{aligned}$$

where the inequalities follow since $y^n(m + 2) \le \tilde{y}^n(m + 2)$ and $y^n(m + 2) < c < a$. Thus $y^n(m + 3) < \tilde{y}^n(m + 3)$. An inductive argument similar to Lemma 3.1 establishes $y^n(k) < \tilde{y}^n(k)$ for all $m + 2 < k \le n$, and so it follows similarly that $y^n(m + j) < \tilde{y}^n(m + j)$ for all $j \in \{3, 4, 5, \ldots, n - m\}$. Thus our assumption cannot be correct and it must be true that $\tilde{z}^n < z^n$. This in turn implies such an $n$ cannot exist, which ultimately shows that $z^m < z^k$ for all $k \in \{0, 1, \ldots\}$.
Proof of Theorem 3.3: Consider the modified holding cost function

$$\hat{h}(x) = \begin{cases} h(x) & \text{for } x \le n \\ h(n) & \text{for } x > n. \end{cases}$$

Similar to Lemma 2 in Adusumilli and Hasenbein (2008), there exists $(\hat{z}^n, (\hat{y}^n(1), \ldots, \hat{y}^n(n)))$ that satisfies the optimality equations and has $\hat{y}^n(k) \le c$ for all $k \ge n$. Hence, it is never optimal to reject a customer. Let $p_n(i)$ and $p_\infty(i)$ be the steady state probability of being at state $i$ under the n-terminating policy and the policy that does not exercise any control, respectively. Observe that

$$p_n(i) = \frac{\lambda^i}{\prod_{j=1}^{i} \mu_j}\,p_n(0) \quad \text{and} \quad p_\infty(i) = \frac{\lambda^i}{\prod_{j=1}^{i} \mu_j}\,p_\infty(0).$$

Using $\sum_i p_n(i) = 1$ we get

$$p_n(0) = \frac{1}{1 + \sum_{i=1}^{n} \dfrac{\lambda^i}{\prod_{j=1}^{i} \mu_j}} \quad \text{and} \quad p_\infty(0) = \frac{1}{1 + \sum_{i=1}^{\infty} \dfrac{\lambda^i}{\prod_{j=1}^{i} \mu_j}}.$$

By construction

$$z^n \le \hat{z}^n + p_n(n)\,c + \sum_{i=1}^{n} \left[p_n(i) - p_\infty(i)\right] h(i).$$

Furthermore, since $\hat{h}(i) \le h(i)$ for all $i \in \{0, 1, \ldots\}$, it follows that

$$z^n \le z^\star + p_n(n)\,c + \sum_{i=1}^{n} \left[p_n(i) - p_\infty(i)\right] h(i).$$

It is trivial to see that $p_n(n) \to 0$ as $n \to \infty$. Thus if we can show that

$$\sum_{i=0}^{n} \left[p_n(i) - p_\infty(i)\right] h(i) \to 0 \text{ as } n \to \infty \tag{A-4}$$

then

$$\liminf_{n \to \infty} z^n \le \limsup_{n \to \infty} z^n \le z^\star.$$

Since $z^n \ge z^\star$ for each $n$ and the sequence of solutions $\{z^n\}$ is decreasing by assumption, we can conclude that

$$\lim_{n \to \infty} z^n = z^\star.$$
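We now show that (A-4) is correct. Note that

$$\sum_{i=0}^{n} \left[p_n(i) - p_\infty(i)\right] h(i) = \sum_{i=0}^{n} \left[p_n(0) - p_\infty(0)\right] \frac{\lambda^i}{\prod_{j=1}^{i} \mu_j}\,h(i).$$

Define $i^\star = \inf\left\{i : \lambda/\mu_i < 1\right\}$ and note that

$$p_\infty(i) = \frac{\lambda^{i - i^\star}}{\prod_{j = i^\star + 1}^{i} \mu_j}\,p_\infty(i^\star)$$

since $\mu_i$ is increasing. Then

$$\sum_{i = i^\star}^{\infty} p_\infty(i)\,h(i)$$

and so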
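0 for all $l \ge 0$.

Finally, the coefficient for $h_I/\mu$ is

$$1 + \frac{2}{\sigma}\sqrt{\frac{\pi}{\mu}}\, m \exp\left(\frac{m^2}{\mu\sigma^2}\right)\Phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right) = \sqrt{2\pi}\exp\left(\frac{m^2}{\mu\sigma^2}\right)\left[\phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right) + \frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\,\Phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right)\right].$$

To see this term is positive, it is sufficient to establish that $\phi(x) + x\Phi(x) \ge 0$ for all $x \in$ 0), by (A-8),

$$\begin{aligned} x e^{x^2/2}\,\Phi(x) + e^{x^2/2}\,\phi(x) &= \frac{x}{\sqrt{\pi}}\,e^{x^2/2}\int_{-x/\sqrt{2}}^{\infty} e^{-t^2}\,dt + \frac{1}{\sqrt{2\pi}} \\ &> \frac{x}{\sqrt{\pi}}\,\frac{1}{-\dfrac{x}{\sqrt{2}} + \sqrt{\dfrac{x^2}{2} + \dfrac{4}{\pi}}} + \frac{1}{\sqrt{2\pi}} \\ &= \frac{\sqrt{2}\,x + \left(-\dfrac{x}{\sqrt{2}} + \sqrt{\dfrac{x^2}{2} + \dfrac{4}{\pi}}\right)}{\sqrt{2\pi}\left(-\dfrac{x}{\sqrt{2}} + \sqrt{\dfrac{x^2}{2} + \dfrac{4}{\pi}}\right)} > 0. \end{aligned}$$

We conclude $\phi(x) + x\Phi(x) \ge 0$ for all $x \in$ 0.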
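It follows from the expression for $\kappa$ that

$$\kappa(0) = \frac{\left(c + \dfrac{h_I}{\mu}\right)\sigma\sqrt{\mu}}{2\sqrt{\pi}\exp\left(\dfrac{m^2}{\mu\sigma^2}\right)\Phi\left(\dfrac{m}{\sigma}\sqrt{\dfrac{2}{\mu}}\right)} + \frac{h_I m}{\mu},$$

and so

$$\kappa(0) + cm = \left(c + \frac{h_I}{\mu}\right)\frac{\sigma\sqrt{\mu}}{\sqrt{2}\,\Phi\left(\dfrac{m}{\sigma}\sqrt{\dfrac{2}{\mu}}\right)}\left[\phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right) + \frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\,\Phi\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right)\right].$$

We conclude that $\kappa(0) + cm > 0$ because $\phi(x) + x\Phi(x) > 0$ for all $x \in$ c. Any stationary point $l$ must satisfy $\kappa'(l) = 0$, which implies

$$(a - c)\gamma l - cm - \kappa(l) = 0. \tag{A-9}$$

At the point $l$, it follows from the above equality that

$$\kappa''(l) = \frac{2\sqrt{2\pi}}{\sigma^2}\,\frac{\exp\left(\dfrac{m^2}{\gamma\sigma^2}\right)\phi\left(\dfrac{\sqrt{2\gamma}}{\sigma}\left(l + \dfrac{m}{\gamma}\right)\right)}{B(l)}\,(a - c)\gamma > 0.$$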
Hence any point $l$ that satisfies (A-9) is a local minimum. To see that such a point exists, note that because $\kappa'(0) < 0$, it is sufficient to show that $\kappa'(l) > 0$ for some $l > 0$. This follows from the expression for $\kappa'(l)$ and the fact that $(a - c)\gamma l - cm - \kappa(l) \to \infty$ as $l \to \infty$, because $\lim_{l\to\infty}\kappa(l) < \infty$. This point is also unique, as we cannot have another local minimum without first having a local maximum. Thus we define $l^\star$ as the unique point that solves $(a - c)\gamma l^\star - cm - \kappa(l^\star) = 0$.

Suppose $a < c$. It is sufficient to show that no local minima exist. The argument is by contradiction. Let $l$ be such that $\kappa'(l) = 0$. Then it follows from the expression for $\kappa''(l)$ that $\kappa''(l) < 0$, and so $l$ is a local maximum. But since $\kappa'(0) < 0$, there must exist $\tilde l \in (0, l)$ such that $\kappa'(\tilde l) = 0$ and $\tilde l$ is a local minimum, and so $\kappa''(\tilde l) > 0$. This is a contradiction, because the expression for $\kappa''$ implies that $\kappa''(\tilde l) < 0$.

Finally, in the case that $a = c$, we show directly that $\kappa'(l) \le 0$ for all $l \ge 0$. For this, it follows from the expression for $\kappa'(l)$ that it is sufficient to show $cm + \kappa(l) \ge 0$ for all $l \ge 0$.
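For the case $a > c$, the threshold $l^\star$ can be approximated by bisection on the stationarity condition $(a-c)\gamma l - cm - \kappa(l) = 0$. The sketch below is a minimal illustration under stated assumptions: the formula for $\kappa$ is assembled from the expressions appearing in this appendix, with $B(l)$ taken to be the denominator of $\kappa$, and the function names and parameter values are hypothetical.

```python
import math


def Phi(x):
    """Standard normal cdf."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def kappa(l, m, mu, gamma, sigma, h_idle, a, c):
    """Average cost kappa(l) for threshold l (assumed closed form)."""
    x_mu = (m / sigma) * math.sqrt(2.0 / mu)
    x_ga = (m / sigma) * math.sqrt(2.0 / gamma)
    y_l = (math.sqrt(2.0 * gamma) / sigma) * (l + m / gamma)
    pref = 2.0 * math.sqrt(math.pi) / sigma
    b_mu = pref / math.sqrt(mu) * math.exp(m * m / (mu * sigma ** 2)) * Phi(x_mu)
    b_ga = (pref / math.sqrt(gamma) * math.exp(m * m / (gamma * sigma ** 2))
            * (Phi(y_l) - Phi(x_ga)))
    e_l = math.exp(-(gamma / sigma ** 2) * (l * l + 2.0 * m * l / gamma))
    num = (h_idle / mu) * (1.0 + m * b_mu) + a * (1.0 - e_l - m * b_ga) + c * e_l
    return num / (b_mu + b_ga)  # denominator plays the role of B(l)


def stationarity(l, m, mu, gamma, sigma, h_idle, a, c):
    """Left-hand side of (a - c) * gamma * l - c * m - kappa(l) = 0."""
    return (a - c) * gamma * l - c * m - kappa(l, m, mu, gamma, sigma, h_idle, a, c)


def threshold(m, mu, gamma, sigma, h_idle, a, c, tol=1e-10):
    """Bisection for l_star; requires a > c so that a root exists."""
    if a <= c:
        raise ValueError("a finite threshold is only sought when a > c")
    args = (m, mu, gamma, sigma, h_idle, a, c)
    lo, hi = 0.0, 1.0
    while stationarity(hi, *args) <= 0.0:  # expand until the sign changes
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if stationarity(mid, *args) <= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical parameters: m, mu, gamma, sigma, h_I, a, c.
PARAMS = (0.5, 1.0, 0.5, math.sqrt(2.0), 1.0, 3.0, 1.0)
l_star = threshold(*PARAMS)
```

Bisection is used here because the stationarity function is negative at $l = 0$ and eventually positive, so a sign change is guaranteed; any bracketing root finder would serve equally well.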
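Since, when $a = c$,
\[
\kappa(l) + cm = \frac{\dfrac{h_I}{\mu}\left[1 + \dfrac{2}{\sigma}\sqrt{\dfrac{\pi}{\mu}}\, m \exp\!\left(\dfrac{m^2}{\mu\sigma^2}\right)\Phi\!\left(\dfrac{m}{\sigma}\sqrt{\dfrac{2}{\mu}}\right)\right] + c\left[1 + m\left(\dfrac{2}{\sigma}\sqrt{\dfrac{\pi}{\gamma}}\exp\!\left(\dfrac{m^2}{\gamma\sigma^2}\right)\left[\Phi\!\left(\dfrac{m}{\sigma}\sqrt{\dfrac{2}{\gamma}}\right) - \Phi\!\left(\dfrac{\sqrt{2\gamma}}{\sigma}\left(l + \dfrac{m}{\gamma}\right)\right)\right] + B(l)\right)\right]}{B(l)},
\]
and we established in the proof of Lemma 4.2 that
\[
1 + \frac{2}{\sigma}\sqrt{\frac{\pi}{\mu}}\, m \exp\!\left(\frac{m^2}{\mu\sigma^2}\right)\Phi\!\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right) \ge 0,
\]
it is sufficient to show that
\[
1 + m\left(\frac{2}{\sigma}\sqrt{\frac{\pi}{\gamma}}\exp\!\left(\frac{m^2}{\gamma\sigma^2}\right)\left[\Phi\!\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right) - \Phi\!\left(\frac{\sqrt{2\gamma}}{\sigma}\left(l + \frac{m}{\gamma}\right)\right)\right] + B(l)\right) \ge 0.
\]
This follows because
\[
1 + m\left(\frac{2}{\sigma}\sqrt{\frac{\pi}{\gamma}}\exp\!\left(\frac{m^2}{\gamma\sigma^2}\right)\left[\Phi\!\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right) - \Phi\!\left(\frac{\sqrt{2\gamma}}{\sigma}\left(l + \frac{m}{\gamma}\right)\right)\right] + B(l)\right)
= \exp\!\left(\frac{m^2}{\mu\sigma^2}\right)\sqrt{2\pi}\left(\phi\!\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right) + \frac{\sqrt{2}\,m}{\sigma\sqrt{\mu}}\,\Phi\!\left(\frac{m}{\sigma}\sqrt{\frac{2}{\mu}}\right)\right),
\]
and $\phi(x) + x\Phi(x) > 0$ for all $x \in \mathbb{R}$. [...] $x > l^\star$. To see this, observe that $l^\star$ satisfies (4.11) and $V''(x) = 0$ for $x \ge l^\star$. Thus, for $x \ge l^\star$,
\[
\begin{aligned}
\frac{\sigma^2}{2}V''(x) + m(x)V'(x) + \hat h(x) &= -(m + \gamma x)c + a\gamma x \\
&= -mc + \gamma(a - c)x \\
&\ge -mc + \gamma(a - c)l^\star = \kappa(l^\star),
\end{aligned} \tag{A-12}
\]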
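where the inequality follows since $a > c$ and the last equality holds because $l^\star$ satisfies (4.11). Thus $V'(x)$ satisfies (4.12) for all $x \in \mathbb{R}$.

Next we show that the function $V_1'$ is strictly increasing. For this, first note that the function $\Phi(x)/\phi(x)$ is strictly increasing because
\[
\frac{d}{dx}\,\frac{\Phi(x)}{\phi(x)} = \frac{\phi(x) + x\Phi(x)}{\phi(x)},
\]
and $\phi(x) + x\Phi(x) > 0$, as shown in the last paragraph of the proof of Lemma 4.2. Then, since it follows from the expression for $V_1'$ in (4.7) that $V_1'$ is equivalently written as
\[
V_1'(x) = \frac{\sqrt{2}}{\sigma\sqrt{\mu}}\left(\kappa(l) - \frac{h_I m}{\mu}\right)\frac{\Phi\!\left(\frac{\sqrt{2\mu}}{\sigma}\left(x + \frac{m}{\mu}\right)\right)}{\phi\!\left(\frac{\sqrt{2\mu}}{\sigma}\left(x + \frac{m}{\mu}\right)\right)} - \frac{h_I}{\mu},
\]
it is sufficient to show
\[
\kappa(l) - \frac{h_I m}{\mu} > 0 \quad\text{for all } l \ge 0 \tag{A-13}
\]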
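to conclude that $V_1'$ is strictly increasing. It follows from the expression for $\kappa$ in (4.9) that
\[
\kappa(l) - \frac{h_I m}{\mu} =
\frac{
\begin{aligned}
&\frac{h_I}{\mu}\left[1 - \frac{2}{\sigma}\sqrt{\frac{\pi}{\gamma}}\, m \exp\!\left(\frac{m^2}{\gamma\sigma^2}\right)\left(\Phi\!\left(\frac{\sqrt{2\gamma}}{\sigma}\left(l + \frac{m}{\gamma}\right)\right) - \Phi\!\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)\right)\right] \\
&\quad + a\left[1 - \exp\!\left(-\frac{\gamma}{\sigma^2}\left(l^2 + \frac{2m}{\gamma}\,l\right)\right) + \frac{2}{\sigma}\sqrt{\frac{\pi}{\gamma}}\, m \exp\!\left(\frac{m^2}{\gamma\sigma^2}\right)\left(\Phi\!\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right) - \Phi\!\left(\frac{\sqrt{2\gamma}}{\sigma}\left(l + \frac{m}{\gamma}\right)\right)\right)\right] \\
&\quad + c\,\exp\!\left(-\frac{\gamma}{\sigma^2}\left(l^2 + \frac{2m}{\gamma}\,l\right)\right)
\end{aligned}
}{
\dfrac{2\sqrt{\pi}}{\sigma}\left[\dfrac{1}{\sqrt{\mu}}\exp\!\left(\dfrac{m^2}{\mu\sigma^2}\right)\Phi\!\left(\dfrac{m}{\sigma}\sqrt{\dfrac{2}{\mu}}\right) + \dfrac{1}{\sqrt{\gamma}}\exp\!\left(\dfrac{m^2}{\gamma\sigma^2}\right)\left(\Phi\!\left(\dfrac{\sqrt{2\gamma}}{\sigma}\left(l + \dfrac{m}{\gamma}\right)\right) - \Phi\!\left(\dfrac{m}{\sigma}\sqrt{\dfrac{2}{\gamma}}\right)\right)\right]
}.
\]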
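The role of the sign condition (A-13) can be made explicit with a short worked derivation. It is a sketch that takes $V_1'$ in the form $V_1'(x) = \frac{\sqrt{2}}{\sigma\sqrt{\mu}}\left(\kappa(l) - \frac{h_I m}{\mu}\right)\frac{\Phi(y)}{\phi(y)} - \frac{h_I}{\mu}$, where the shorthand $y = \frac{\sqrt{2\mu}}{\sigma}\left(x + \frac{m}{\mu}\right)$ is introduced here for convenience and is not notation from the paper.

```latex
\begin{align*}
\frac{d}{dy}\,\frac{\Phi(y)}{\phi(y)}
  &= \frac{\phi(y)^2 - \Phi(y)\,\phi'(y)}{\phi(y)^2}
   = \frac{\phi(y) + y\,\Phi(y)}{\phi(y)}
   && \text{since } \phi'(y) = -y\,\phi(y), \\
V_1''(x)
  &= \frac{\sqrt{2}}{\sigma\sqrt{\mu}}
     \left(\kappa(l) - \frac{h_I m}{\mu}\right)
     \frac{\sqrt{2\mu}}{\sigma}\,
     \frac{\phi(y) + y\,\Phi(y)}{\phi(y)}
   = \frac{2}{\sigma^2}
     \left(\kappa(l) - \frac{h_I m}{\mu}\right)
     \frac{\phi(y) + y\,\Phi(y)}{\phi(y)}.
\end{align*}
```

Because $\phi(y) + y\Phi(y) > 0$ and $\phi(y) > 0$, the sign of $V_1''(x)$ coincides with the sign of $\kappa(l) - h_I m/\mu$.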
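We have already shown in the proof of Lemma 4.2 that the term multiplying $a$ is positive, and the term multiplying $c$ is clearly positive. It remains to show that the term multiplying $h_I/\mu$ is positive. Since $\Phi$ is increasing,
\[
1 - \frac{2}{\sigma}\sqrt{\frac{\pi}{\gamma}}\, m \exp\!\left(\frac{m^2}{\gamma\sigma^2}\right)\left[\Phi\!\left(\frac{\sqrt{2\gamma}}{\sigma}\left(l + \frac{m}{\gamma}\right)\right) - \Phi\!\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)\right]
\ge 1 - \frac{2}{\sigma}\sqrt{\frac{\pi}{\gamma}}\, m \exp\!\left(\frac{m^2}{\gamma\sigma^2}\right)\left[1 - \Phi\!\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)\right],
\]
and since $1 - \Phi\!\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right) = \Phi\!\left(-\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)$ by the properties of the normal cdf, it is enough to show that
\[
1 - \frac{2}{\sigma}\sqrt{\frac{\pi}{\gamma}}\, m \exp\!\left(\frac{m^2}{\gamma\sigma^2}\right)\left[1 - \Phi\!\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)\right] \ge 0.
\]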
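This follows because
\[
\begin{aligned}
1 - \frac{2\sqrt{\pi}}{\sigma}\frac{1}{\sqrt{\gamma}}\, m \exp\!\left(\frac{m^2}{\gamma\sigma^2}\right)\Phi\!\left(-\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)
&= \sqrt{2\pi}\,\exp\!\left(\frac{m^2}{\gamma\sigma^2}\right)\left[\phi\!\left(\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right) - \frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\,\Phi\!\left(-\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)\right] \\
&= \sqrt{2\pi}\,\exp\!\left(\frac{m^2}{\gamma\sigma^2}\right)\left[\phi\!\left(-\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right) + \left(-\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)\Phi\!\left(-\frac{m}{\sigma}\sqrt{\frac{2}{\gamma}}\right)\right] \\
&\ge 0,
\end{aligned}
\]
because $\phi(x) + x\Phi(x) \ge 0$ for all $x \in \mathbb{R}$. [...] $> 0$. This is a contradiction, because the right-hand side of (A-15) is 0.

Proof of (ii): We have already shown, at the end of Section 4.3, that
\[
h_I\, E\!\left[\hat X(\infty)^-\right] + a\gamma\, E\!\left[\hat X(\infty)^+\right] = \lim_{l\to\infty}\kappa(l).
\]
It remains to show
\[
\liminf_{t\to\infty} \frac{E\!\left[\hat\xi(t)\right]}{t} \ge \kappa'. \tag{A-16}
\]
For this, first define $V$ as in (4.10), except modify the definition of $V_2'$ in (4.8) as follows:
\[
V_2'(x) := \left(\kappa' + am\right)\frac{2}{\sigma}\sqrt{\frac{\pi}{\gamma}}\,\exp\!\left(\frac{\gamma}{\sigma^2}\left(x + \frac{m}{\gamma}\right)^2\right)\left[\Phi\!\left(\frac{\sqrt{2\gamma}}{\sigma}\left(x + \frac{m}{\gamma}\right)\right) - 1\right] + a.
\]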
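A finite-difference check can confirm that a function of this form satisfies a second-order linear ODE with a constant right-hand side on $x > 0$. The sketch below rests on assumptions inferred from (A-12), namely that the drift is $m(x) = -(m + \gamma x)$ and the cost is $\hat h(x) = a\gamma x$ for $x > 0$. The name `kappa_c` (standing for the constant appearing in (A-16)) and all numerical values are hypothetical.

```python
import math


def v2_prime(x, kappa_c, a, m, gamma, sigma):
    """Modified V_2'(x); Phi(y) - 1 is computed as -0.5 * erfc(y / sqrt(2))."""
    u = x + m / gamma
    y = math.sqrt(2.0 * gamma) / sigma * u
    phi_minus_one = -0.5 * math.erfc(y / math.sqrt(2.0))
    scale = (kappa_c + a * m) * (2.0 / sigma) * math.sqrt(math.pi / gamma)
    return scale * math.exp(gamma * u * u / sigma ** 2) * phi_minus_one + a


def ode_residual(x, kappa_c, a, m, gamma, sigma, step=1e-5):
    """(sigma^2 / 2) V'' + m(x) V' + h_hat(x) - kappa_c, with V'' by central differences."""
    args = (kappa_c, a, m, gamma, sigma)
    second = (v2_prime(x + step, *args) - v2_prime(x - step, *args)) / (2.0 * step)
    drift = -(m + gamma * x)   # assumed drift for x > 0
    cost = a * gamma * x       # assumed abandonment cost rate for x > 0
    return 0.5 * sigma ** 2 * second + drift * v2_prime(x, *args) + cost - kappa_c


# Hypothetical values: kappa_c, a, m, gamma, sigma.
DEMO = (1.1, 3.0, 0.5, 0.5, math.sqrt(2.0))
residuals = [ode_residual(0.1 * k, *DEMO) for k in range(1, 31)]
```

Under these assumptions the residuals should vanish up to discretization error, whatever value is chosen for `kappa_c`.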
The function $V_2'$ and the constant $\kappa'$ solve the ODE $\frac{\sigma^2}{2}V''(x) + m(x)V'(x) + \hat h(x) = \kappa'$ for all x ∈