Maximizing Throughput in Finite-Source Parallel Queue Systems

Mohammad Delasay, Bora Kolfal, Armann Ingolfsson∗

School of Business, University of Alberta, Edmonton, AB T6G 2R6, Canada
[email protected] · [email protected] · [email protected]

Abstract

Motivated by the dispatching of trucks to shovels in surface mines, we study optimal routing in a Markovian finite-source, multi-server queueing system with heterogeneous servers, each with a separate queue. We formulate the problem of routing customers to servers to maximize the system throughput as a Markov Decision Process. When the servers are homogeneous, we demonstrate that the Shortest Queue policy is optimal, and when the servers are heterogeneous, we partially characterize the optimal policy and present a near-optimal and simple-to-implement policy. We use the model to illustrate the substantial benefits of pooling by comparing it to permanent assignment of customers to servers.

Keywords: Markov processes, Queueing, Routing to parallel queues, Dispatching systems, Markov decision processes

1. Introduction

In this paper, we formulate a Markov Decision Process (MDP) model for the routing of customers from a finite population to a set of parallel queues with the objective of maximizing average throughput (number of customers served) per time unit. The left panel of Figure 1 illustrates the system. Customers arrive from the population of size N to a routing point, where they are routed (assigned) to one of s parallel servers. Each server has a separate queue and customers cannot jockey between queues. Service times are independent and exponentially distributed with parameter µi for server i, i = 1, . . . , s. After completing service, customers return to the population, where they spend independent exponentially distributed amounts of time with parameter λ before returning to the routing point. Our primary motivating example is the routing of trucks (customers) to shovels (servers) in a surface mine (for example, in the Alberta oilsands). In oilsands mines, the size of the

∗Corresponding Author: Armann Ingolfsson ([email protected]), Phone: 1 (780) 492-7982, Fax: 1 (780) 492-3325.

Preprint submitted for publication, June 8, 2011

[Figure 1 shows, in its left panel, the general system: customers from a finite population arrive at a routing point and are assigned to one of s parallel servers (Queue 1/Server 1 through Queue s/Server s), each with a separate queue. The right panel shows the surface-mining application.]

Figure 1: General finite population routing to parallel queues problem and its application in surface mining

extraction plant determines the throughput rate required to keep the plant running. Given a required throughput rate, one can view the planning problem as minimizing the cost of shovels and trucks needed to achieve that throughput rate. We frame the problem as one of routing trucks to shovels so as to maximize throughput, given a set of shovels and trucks. Our model could be solved with different sets of shovels and with different truck fleet sizes in order to determine the least costly configuration that achieves the required throughput. The right panel of Figure 1 illustrates a surface mining operation, where trucks circulate between shovels, where they are filled with ore, a dump location, and a dispatch point, where they are routed to one of the shovels. Ta et al. (2010) provide further information about oilsands mines and discuss how one can assign trucks to shovels in a static fashion so as to minimize the number of trucks while achieving the required throughput and satisfying other constraints.

We use our model to illustrate the substantial pooling benefits of real-time truck routing over static assignment of trucks to shovels. In one numerical example, we estimate that pooling could save $12.8 million per year. Our model and results could also be relevant to the assignment of waste collection trucks to routes or the assignment of airplanes to repair facilities for preventive maintenance. A distinguishing feature of both of these examples, as well as the routing of trucks in surface mines, is that jockeying (switching from one queue to another) involves traveling a substantial distance and is therefore unlikely to occur. We assume no jockeying in our model.

Optimal customer routing policies have been studied extensively; see Stidham and Weber (1993) for an overview. We classify the related literature based on whether the customer population is finite or infinite and based on the objective function.
Most research on customer routing assumes an infinite customer population. Models with a finite customer population have mainly been studied in the context of machine interference. Two widely used objectives are to maximize throughput (customers served) and to minimize holding cost (which translates to minimizing the average number of customers waiting or being served if the holding costs are homogeneous). Winston (1977) and Hordijk and Koole (1990) seek to maximize throughput in an infinite customer population system with homogeneous parallel servers, each with a separate queue, where service times are exponentially distributed with equal mean. Winston's model assumes a Poisson arrival process while Hordijk and Koole allow a general arrival process. However, the arrival process in Hordijk and Koole (1990) is assumed independent of the number of customers in service or waiting to get service, which rules out an arrival process from a finite population. Weber (1978) extends Winston's analysis to allow for service time distributions with a non-decreasing hazard rate. All three studies demonstrate that the Shortest Queue (SQ) policy is optimal. Koole et al. (1999) show that in homogeneous infinite customer population systems, the SQ policy is optimal with respect to various cost functions, for systems with two identical servers and a broad range of service time distributions. In infinite customer population models with homogeneous servers, minimizing the average number of customers waiting or receiving service is often equivalent to minimizing the average waiting time and minimizing the average workload (Koole (2005)). For heterogeneous servers, Hordijk and Koole (1992) partially characterize optimal routing policies to more than two parallel queues with exponential servers by proving that routing to the faster server when that server has a shorter queue minimizes expected cost. Xu and Zhao (1996) extend the study of two heterogeneous servers to permit jockeying between the queues and they characterize the routing policy that minimizes the expected holding and jockeying cost. Larsen and Agrawala (1983), Xu et al.
(1992), and Koyanagi and Kawai (1995) propose optimal threshold routing policies for two-server systems with respect to various cost functions. With these policies, customers are routed to the fast server (regardless of the status of the slow server) until a certain threshold value for the fast-server queue size, at which point a customer is removed from that queue and sent to the slow server.

Much of the study of finite-population queueing systems has focused on a machine repair context. The prescriptive analysis of the multi-server "machine repairman problem" mainly focuses on two types of decisions: (1) the number of servers, the servers' service rates, and the machines' failure rates (Albright (1980); Ching (2001); Ke and Wang (1999); Wartenhorst (1995)), and (2) the sequencing of repair service on failed machines (Frostig (1993)). Few papers address the optimal assignment of repair people to failed machines. Righter (1996) studies the routing of failed machines with arbitrary arrivals to heterogeneous exponential servers and shows that sending the next job to the fastest available server stochastically maximizes the number of service completions. The main difference between Righter's model and our model is that in Righter's model failed machines wait in a common buffer, whereas in our model, the servers have separate queues and jockeying is not permitted. (In the surface mining context, jockeying corresponds to a truck traveling from one shovel to another, which is unlikely to happen.) The work that is perhaps closest to ours is that of Goheen (1977) and Cinlar (1972), who address the problem of routing failed machines to repair stations (with separate buffers for each station, as in our model) so as to minimize the long-run average cost (as opposed to maximizing throughput, as in our model). Cinlar (1972) assumes exponential repair times whereas Goheen (1977) assumes Erlang repair times. Goheen and Cinlar demonstrate that an optimal policy exists and that it can be found by solving either a linear (Cinlar (1972)) or nonlinear (Goheen (1977)) program, but they do not analyze the structure of those policies and they do not provide computational results.

We extend the study of customer routing to parallel queues to finite source populations. We partially characterize the optimal routing policy for two-server systems and demonstrate that the SQ policy is optimal for an arbitrary number of homogeneous servers. When the servers are heterogeneous, the optimal routing policy is complex. We propose an easy-to-implement and near-optimal heuristic policy for systems with an arbitrary number of heterogeneous servers. The policy begins by eliminating servers that are so slow that the optimal policy rarely or never uses them and then uses an easily computed index policy for the remaining servers. Our numerical results show that our policy performs extremely well for a wide range of model parameters.
Section 2 presents the MDP model and our assumptions. Section 3 presents several structural properties of the optimal routing policy for two-server systems and proves that the optimal policy for systems with an arbitrary number of homogeneous servers is SQ. Section 4 describes our near-optimal heuristic policy for systems with arbitrarily many heterogeneous servers, and Section 5 numerically evaluates the performance of our proposed heuristic policy, illustrates how server utilization depends on server speed under the optimal policy, and illustrates the benefits of pooling.


2. Model Formulation

Let S = {1, . . . , s} be the set of servers, let ni ∈ {0, . . . , N} be the number of customers waiting for or being served by server i, i ∈ S, and let n = (n1, . . . , ns). The state space of our MDP model is Ω = {n ∈ Z+^s : Σ_{i∈S} ni ≤ N}. There are two types of decision epochs: (1) a service completion by server i, where one decides whether to begin serving the next customer (if ni > 0) or to idle, and (2) the arrival of a customer to the routing point, where one decides where to route the arriving customer.

Let IR be an indicator function for event R and let Λ = Σ_{i=1}^s µi + Nλ. Recall that 1/λ is the average time that customers spend in the population before returning to the routing point and 1/µi is the average service time for server i. Using uniformization (Lippman (1975)), we express the MDP optimality equation as:

g + ν(n) = Σ_{i=1}^s (µi/Λ) max{ I{ni≥1} + ν(n − ei × I{ni≥1}), ν(n) }
  + ((N − Σ_{i=1}^s ni) λ/Λ) max_{i∈S} ν(n + ei) + (λ Σ_{i=1}^s ni /Λ) ν(n),   (1)

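To make the formulation concrete, the optimality equation (1) can be solved numerically by relative value iteration over the uniformized chain. The sketch below is ours, not from the paper; the instance parameters at the bottom are illustrative assumptions.

```python
from itertools import product

def optimal_throughput(N, lam, mus, iters=2000):
    """Relative value iteration for the uniformized finite-source routing MDP."""
    s = len(mus)
    Lam = sum(mus) + N * lam                      # uniformization constant
    states = [n for n in product(range(N + 1), repeat=s) if sum(n) <= N]
    v = {n: 0.0 for n in states}
    g = 0.0
    for _ in range(iters):
        w = {}
        for n in states:
            total = 0.0
            for i, mu in enumerate(mus):          # service-completion epochs
                if n[i] >= 1:
                    served = tuple(c - (j == i) for j, c in enumerate(n))
                    total += mu / Lam * max(1.0 + v[served], v[n])  # serve or idle
                else:
                    total += mu / Lam * v[n]      # fictitious transition
            free = N - sum(n)                     # customers in the population
            if free > 0:                          # routing decision on arrival
                total += free * lam / Lam * max(
                    v[tuple(c + (j == i) for j, c in enumerate(n))]
                    for i in range(s))
            total += sum(n) * lam / Lam * v[n]
            w[n] = total
        zero = (0,) * s
        g = w[zero] - v[zero]                     # average reward per uniformized step
        v = {n: w[n] - w[zero] for n in states}   # renormalize (relative VI)
    return g * Lam                                # throughput per original time unit

thr = optimal_throughput(N=3, lam=1.0, mus=[2.0, 3.0])
print(round(thr, 3))
```

One uniformized step lasts 1/Λ time units on average, so the per-step gain g is rescaled by Λ to recover throughput per time unit.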
where ν is the optimal value function, g is the optimal throughput per time unit, and n ± ei = (n1, . . . , ni ± 1, . . . , ns). The first term on the right-hand side represents decisions made upon a service completion: idle or serve the next customer. If server i serves a customer, the throughput increases by one unit and ni decreases by one. If the server idles, then the throughput and ni stay the same. The second term on the right-hand side represents the routing decision, where ni increases by one if the arriving customer is routed to server i, while the number of customers in the other queues does not change.

3. Structural Properties of the Optimal Policy

In this section, we partially characterize the optimal routing policy for systems with two heterogeneous servers. The proofs of all results in this section are in the Appendix. We prove that unforced idling (idling when there is at least one customer in the queue) is never optimal. We also show that the optimal routing policy for queue i is monotone in ni. For systems with an arbitrary number of homogeneous servers, we show that the SQ policy is optimal.

Let Dj be the first difference operator, defined as Dj ν(n) = ν(n + ej) − ν(n) for a function ν of a vector n of integer variables, and define the second difference operators Dii = Di Di and Dij = Dji = Di Dj. We use "increasing" and "decreasing" in the weak sense of "non-decreasing" and "non-increasing" throughout. Let Ψ be the set of functions ν defined on the state space Ω that have the following properties:

P1. Submodularity (Dij ν ≤ 0; ∀i, j ∈ S, i ≠ j): Di ν is decreasing in nj and Dj ν is decreasing in ni.
P2. Diagonal submissiveness (Dii ν ≤ Dij ν; ∀i, j ∈ S, i ≠ j): Dj ν − Di ν is increasing in ni and decreasing in nj.
P3. Concavity (Dii ν ≤ 0; ∀i ∈ S): Properties P1 and P2 together imply concavity. If ν is concave, then Di ν is decreasing in ni.
P4. Upper boundedness (Di ν ≤ 1; ∀i ∈ S).

We define the operators Tµi, i ∈ S, and Tλ as follows:

Tµi ν(n) = max{ I{ni≥1} + ν(n − ei × I{ni≥1}), ν(n) },  i ∈ S,   (2)
Tλ ν(n) = max_{i∈S} ν(n + ei).   (3)

Setting Λ = 1, we define the operator T to represent the right-hand side of (1) as

T ν(n) = Σ_{i=1}^s µi Tµi ν(n) + (N − Σ_{i=1}^s ni) λ Tλ ν(n) + Σ_{i=1}^s ni λ ν(n).   (4)

Lemma 1 shows that the properties P1-P4 are preserved under the operator T and that the optimal value function ν ∈ Ψ:

Lemma 1. Let τ be a real-valued function defined on Ω. If τ ∈ Ψ, then (1) Tµi τ ∈ Ψ, i ∈ S, (2) Tλ τ ∈ Ψ, and (3) T τ ∈ Ψ. Furthermore, the optimal value function ν ∈ Ψ.

Lemma 1 enables us to prove the following two theorems:

Theorem 1. If a queue is nonempty, then it is suboptimal for its server to be idle.

Theorem 2. If routing to server i is optimal in state n, then routing to server i is also optimal in states n − ei (when ni > 0) and n + ej (when nj < N, j ≠ i).

Theorem 1 simplifies the optimality equation (1) to:

g + ν(n) = Σ_{i=1}^s µi (I{ni≥1} + ν(n − ei × I{ni≥1})) + (N − Σ_{i=1}^s ni) λ Tλ ν(n) + λ Σ_{i=1}^s ni ν(n).   (5)

[Figure 2 shows a table of the optimal routing decision (1 or 2) for each state (n1, n2) of this instance.]

Figure 2: Optimal routing policy for a heterogeneous system with N = 6, λ = 2, µ1 = 2, and µ2 = 4 (1: route to server 1, 2: route to server 2)

Based on Theorem 2, given the optimal decision in one state, we can deduce the optimal routing decision for several other states. For example, for the two-server system shown in Figure 2, routing to server 2 is optimal in state (1, 2). Therefore, based on Theorem 2, it is also optimal to route to server 2 in all the states to the left of or below state (1, 2). Lemma 2 allows us to characterize the optimal policy further, in Theorem 3.

Lemma 2. For i, j ∈ {1, 2},
(i) If ni ≥ nj > 0, µj ≥ µi, and ν(n + ej) ≥ ν(n + ei), then ν(n − ei + ej) ≥ ν(n + ei − ej).
(ii) If ni = nj and µj ≥ µi, then ν(n + ej) ≥ ν(n + ei).

Theorem 3. When n1 = n2, it is optimal to route arriving customers to the faster server.

Theorems 2 and 3 together allow us to determine the optimal routing decision for more than half of all states in two-server heterogeneous systems (as illustrated in Figure 2): Theorem 3 specifies the optimal decision for states on the diagonal n1 = n2, and Theorem 2 specifies the optimal decision either for all states above or for all states below the diagonal. For homogeneous s-server systems (µi = µ for i ∈ S, s ≥ 2), the SQ policy is optimal, as shown in the following theorem.

Theorem 4. Routing to the Shortest Queue is optimal for systems with an arbitrary number of homogeneous servers.

In the proof of this theorem in the Appendix, we show that SQ stochastically maximizes throughput, which implies that SQ maximizes expected throughput.
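The propagation argument of Theorem 2 can be sketched in a few lines of code (ours, not the paper's): starting from one state where routing to server i is known optimal, we close the set of states under the two moves n − ei and n + ej. The starting state and N follow the Figure 2 instance.

```python
def propagate(n0, i, N):
    """States where routing to server i is implied optimal by Theorem 2,
    starting from one state n0 where it is known to be optimal."""
    known, stack = set(), [n0]
    while stack:
        n = stack.pop()
        if n in known:
            continue
        known.add(n)
        if n[i] > 0:                               # Theorem 2: n - e_i
            stack.append(tuple(c - (k == i) for k, c in enumerate(n)))
        for j in range(len(n)):                    # Theorem 2: n + e_j, j != i
            if j != i and sum(n) < N:              # stay inside the state space
                stack.append(tuple(c + (k == j) for k, c in enumerate(n)))
    return known

# Figure 2 instance (N = 6): routing to server 2 (index 1) is optimal at (1, 2).
implied = propagate((1, 2), i=1, N=6)
print(len(implied))  # → 15 states inherit the decision
```

A single observed decision thus pins down the policy on a sizable region of the state space, which is what makes the partial characterization useful in practice.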


4. Two-stage Near-optimal Policy for General Systems

The complex structure of the optimal policy, even for two-server systems, motivated us to develop an easy-to-use, near-optimal heuristic policy. Implementing the optimal policy would require the use of an s-dimensional lookup table (for a system with s shovels), as well as convincing the system operator that the resulting policy is a sensible one. Thus, we seek a policy that is easily justified, in addition to being simple to implement and close to optimal. Our proposed policy begins with Server Elimination (SE) and then uses a Modified Least Remaining Workload (MLRW) index policy for the remaining servers.

The SE stage was motivated by our observation, based on numerical experiments, that if a server is sufficiently slow (µi is sufficiently small) then the optimal policy never routes to that server. Based on extensive numerical experiments with two-server systems, we found that removing server i was near-optimal if the following inequality was satisfied:

µj ≥ N µi + λ(N/2 − 1),   (6)

where µi, µj, and λ are not normalized. We developed this expression by considering two special cases of two-server systems. Suppose server 1 is the fast one, that is, µ1 > µ2. First, consider a system where λ is small enough that λN ≈ 0, suppose that (n1, n2) = (N − 1, 0), and suppose that the remaining customer arrives to the routing point while the system is in this state. The expected time until that customer completes service is minimized by routing to the fast server if the condition µ1 ≥ N µ2 holds. In states where n1 < N − 1, there would be even greater reason to route an arriving customer to the fast server. Therefore, if µ1 ≥ N µ2, then it will never be beneficial to route a customer to the slow server, and nothing is lost by removing the slow server. Second, suppose that the term N λ is not close to zero. The risk that we take when routing to the slow server is that the fast server becomes idle before the slow server completes serving the customer that was routed to it. As N λ increases, this risk decreases, because the probability that another customer arrives to the routing point before the slow server completes its service increases. Therefore, the threshold rate that the fast server must exceed in order to remove the slow server should increase with N λ. The term we used, λ(N/2 − 1), was chosen because in the special case when N = 2, as long as µ1 ≥ 2µ2, routing to the fast server will minimize the expected time until the routed customer completes service, regardless of the magnitude of λ. For systems with more than two servers, we check whether (6) holds for the fastest server

(j) and the slowest server (i) and, if so, we eliminate server i. We continue this procedure recursively until no more servers can be eliminated.

To motivate the MLRW policy, consider the following two greedy strategies, which focus only on the customer that is currently at the routing point and ignore all future arrivals to the routing point.

• Least Time to Complete Service (LTCS): Route to the queue i∗ with the least expected time until service of the current customer is completed, that is,

i∗ = arg min_{i∈S} (ni + 1)/µi.   (7)

• Least Remaining Workload (LRW): Route to the queue i∗ with the least expected time until all currently assigned customers have been served, that is,

i∗ = arg min_{i∈S} ni/µi.   (8)

Neely et al. (2003) introduced LTCS and LRW in the context of satellite and wireless networks and referred to them as a greedy strategy (LTCS) and a work-conserving strategy (LRW), respectively. Our MLRW policy chooses the server to route to by minimizing a quantity that is between the expected time to complete service and the expected remaining workload, and it incorporates a term that depends on the parameters N and λ of the finite source population. Specifically, the MLRW policy routes to the queue i∗, where

i∗ = arg min_{i∈S} { ni/µi + (N − Σ_{j∈S} nj)/(N µi) }.   (9)

The first term, ni/µi, is the expected remaining workload for server i. The second term,
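The elimination rule (6) and the indices (7)-(9) are simple enough to state directly in code. The sketch below is ours, not the paper's implementation; the function names and the parameter values at the bottom are illustrative.

```python
def eliminate_servers(mus, lam, N):
    """Server Elimination stage: repeatedly drop the slowest server i while
    condition (6), mu_j >= N*mu_i + lam*(N/2 - 1), holds for the fastest j."""
    active = sorted(range(len(mus)), key=lambda i: mus[i])   # slowest first
    while len(active) > 1 and \
            mus[active[-1]] >= N * mus[active[0]] + lam * (N / 2 - 1):
        active.pop(0)                                        # eliminate slowest
    return active

def ltcs(n, mus, active):
    """(7): least expected time until the routed customer completes service."""
    return min(active, key=lambda i: (n[i] + 1) / mus[i])

def lrw(n, mus, active):
    """(8): least expected remaining workload."""
    return min(active, key=lambda i: n[i] / mus[i])

def mlrw(n, mus, N, active):
    """(9): modified least remaining workload (note: (9) involves N but not lam)."""
    free = N - sum(n)
    return min(active, key=lambda i: n[i] / mus[i] + free / (N * mus[i]))

mus, lam, N = [1.0, 4.0, 9.0], 0.5, 6
active = eliminate_servers(mus, lam, N)   # server 0 satisfies (6) and is dropped
n = [0, 2, 3]
print(active, mlrw(n, mus, N, active))
```

With these parameters the slowest server (µ = 1) is eliminated because 9 ≥ 6·1 + 0.5·2, after which the MLRW index is compared only over the remaining servers.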
We will show that w has properties P1-P4 with respect to n ∈ Ω for a given u ∈ {0, 1}. For brevity, we will omit "with respect to n for a given u" in what follows. By definition,

Tµ1 τ(n) = max_{u∈{0,1}} w(u, n),   (A.2)

and u = 0, 1 correspond to server 1 idling or starting to serve the next customer, respectively. The first and second differences of w are:

D1 w(u, n) = w(u, n + e1) − w(u, n) = { D1 τ(n) if u = 0; 1 if u = 1 and n1 = 0; D1 τ(n − e1) if u = 1 and n1 ≥ 1 },   (A.3)
D2 w(u, n) = w(u, n + e2) − w(u, n) = { D2 τ(n) if u = 0; D2 τ(n) if u = 1 and n1 = 0; D2 τ(n − e1) if u = 1 and n1 ≥ 1 },   (A.4)
D12 w(u, n) = D21 w(u, n) = { D21 τ(n) if u = 0; 0 if u = 1 and n1 = 0; D21 τ(n − e1) if u = 1 and n1 ≥ 1 },   (A.5)
D11 w(u, n) = { D11 τ(n) if u = 0; D1 τ(n) − 1 if u = 1 and n1 = 0; D11 τ(n − e1) if u = 1 and n1 ≥ 1 },   (A.6)
D22 w(u, n) = { D22 τ(n) if u = 0; D22 τ(n) if u = 1 and n1 = 0; D22 τ(n − e1) if u = 1 and n1 ≥ 1 }.   (A.7)

Since τ is submodular, based on (A.5), w is also submodular. As τ satisfies P2-P4, each term in (A.6) and (A.7) is smaller than or equal to the corresponding term in (A.5); therefore, w is diagonally submissive. Since w is submodular and diagonally submissive, it is concave. As w is concave and upper bounded, D1 w(u, n) in (A.3) is increasing in u for a given n. The submodularity of w implies that D2 w(u, n) in (A.4) is increasing in u for a given n. Expressions (A.3), (A.4), and τ ∈ Ψ imply that w is upper bounded. Therefore, w satisfies P1-P4, that is, w ∈ Ψ. Next, we use w and its properties to show that Tµ1 preserves P1-P4.

Proof that Tµ1 τ is submodular: We need to show that for any n, D12 Tµ1 τ = D21 Tµ1 τ ≤ 0, which is equivalent to:

Tµ1 τ(n + e1) + Tµ1 τ(n + e2) ≥ Tµ1 τ(n + e1 + e2) + Tµ1 τ(n).   (A.8)

Let u1, u2 ∈ {0, 1} be the maximizers of w at n and n + e1 + e2, respectively, that is:

Tµ1 τ(n) = w(u1, n) and Tµ1 τ(n + e1 + e2) = w(u2, n + e1 + e2).   (A.9)

We separate the proof that (A.8) holds into two cases: u1 ≥ u2 and u1 < u2.

1. Case 1: u1 ≥ u2
Tµ1 τ(n + e1) + Tµ1 τ(n + e2)
≥ w(u1, n + e1) + w(u2, n + e2)   by (A.2),
= w(u2, n + e1 + e2) + w(u1, n) − D1 w(u2, n + e2) + D1 w(u1, n)
≥ w(u2, n + e1 + e2) + w(u1, n) − D1 w(u2, n + e2) + D1 w(u2, n)   as D1 w(u, n) is increasing in u,
= w(u2, n + e1 + e2) + w(u1, n) − D21 w(u2, n)
≥ w(u2, n + e1 + e2) + w(u1, n)   as w is submodular (P1),
= Tµ1 τ(n + e1 + e2) + Tµ1 τ(n)   by (A.9).

2. Case 2: u1 < u2 , which implies u1 = 0 and u2 = 1. Tµ1 τ (n + e1 ) + Tµ1 τ (n + e2 ) ≥ w (1, n + e1 ) + w (0, n + e2 ) by (A.2), = 1 + τ (n) + τ (n + e2 ) = w (0, n) + w (1, n + e1 + e2 ) by (A.1) = Tµ1 τ (n + e1 + e2 ) + Tµ1 τ (n) by (A.9) Proof that Tµ1 τ is diagonally submissive: We need to show that for any n, D22 Tµ1 τ ≤ D12 Tµ1 τ and D11 Tµ1 τ ≤ D12 Tµ1 τ holds, or, Tµ1 τ (n + e1 + e2 ) + Tµ1 τ (n + e2 ) ≥ Tµ1 τ (n + 2e2 ) + Tµ1 τ (n + e1 ) , Tµ1 τ (n + e1 + e2 ) + Tµ1 τ (n + e1 ) ≥ Tµ1 τ (n + 2e1 ) + Tµ1 τ (n + e2 ) .

(A.10) (A.11)

The proofs for (A.10) and (A.11) are similar to that of (A.8). To prove (A.10), let u1 , u2 ∈ {0, 1} be the maximizers of w at n + e1 and n + 2e2 : Tµ1 τ (n + e1 ) = w (u1 , n + e1 ) and Tµ1 τ (n + 2e2 ) = w (u2 , n + 2e2 ) .

(A.12)

We again consider the cases u1 ≥ u2 and u1 < u2:

1. u1 ≥ u2
Tµ1 τ(n + e1 + e2) + Tµ1 τ(n + e2)
≥ w(u1, n + e1 + e2) + w(u2, n + e2)   by (A.2),
= w(u1, n + e1) + w(u2, n + 2e2) + D2 w(u1, n + e1) − D2 w(u2, n + e2)
≥ w(u1, n + e1) + w(u2, n + 2e2) + D2 w(u1, n + e1) − D2 w(u1, n + e2)   as D2 w(u, n) is increasing in u,
≥ w(u1, n + e1) + w(u2, n + 2e2) + D2 w(u1, n + e1) − D2 w(u1, n + e1)   as w is diagonally submissive (P2),
= w(u1, n + e1) + w(u2, n + 2e2)
= Tµ1 τ(n + 2e2) + Tµ1 τ(n + e1)   by (A.12).

2. u1 < u2, which implies u1 = 0 and u2 = 1. We separately consider n1 ≥ 1 and n1 = 0. When n1 ≥ 1:
Tµ1 τ(n + e1 + e2) + Tµ1 τ(n + e2)
≥ w(1, n + e1 + e2) + w(0, n + e2)   by (A.2),
= 1 + τ(n + e2) + τ(n + e2)   by (A.1),
= 1 + τ(n + e2) + D2 τ(n) − D1 τ(n) + τ(n + e1)
= 1 + τ(n + e1) + D2 τ(n) − D1 τ(n) + τ(n − e1 + 2e2) + D1 τ(n − e1 + e2) − D2 τ(n − e1 + e2)
= 1 + τ(n + e1) + τ(n − e1 + 2e2) + D2 τ(n) − D1 τ(n) + D1 τ(n − e1 + e2) − D2 τ(n − e1 + e2)
= w(0, n + e1) + w(1, n + 2e2) + A
≥ Tµ1 τ(n + e1) + Tµ1 τ(n + 2e2)   as A ≥ 0,

which follows from

A = D2 τ(n) − D1 τ(n) + D1 τ(n − e1 + e2) − D2 τ(n − e1 + e2)
= D2 τ(n) − D2 τ(n − e1) + D2 τ(n − e1) − D2 τ(n − e1 + e2) − D1 τ(n) + D1 τ(n − e1) − D1 τ(n − e1) + D1 τ(n − e1 + e2)
= (D21 τ(n − e1) − D22 τ(n − e1)) + (D12 τ(n − e1) − D11 τ(n − e1)) ≥ 0

because τ is diagonally submissive (P2). When n1 = 0:
Tµ1 τ(n + e1 + e2) + Tµ1 τ(n + e2)
≥ w(0, n + e1 + e2) + w(1, n + e2)   by (A.2),
= τ(n + e1 + e2) + τ(n + e2)   by (A.1),
= τ(n + e1) + τ(n + 2e2) + D2 τ(n + e1) − D2 τ(n + e2)
≥ τ(n + e1) + τ(n + 2e2)   because τ is diagonally submissive (P2),
= w(0, n + e1) + w(1, n + 2e2)
= Tµ1 τ(n + 2e2) + Tµ1 τ(n + e1)   by (A.12).

To prove (A.11), define u1, u2 ∈ {0, 1} such that

Tµ1 τ(n + e2) = w(u1, n + e2) and Tµ1 τ(n + 2e1) = w(u2, n + 2e1),   (A.13)

and again consider two cases:

1. u1 ≥ u2
Tµ1 τ(n + e1 + e2) + Tµ1 τ(n + e1)
≥ w(u1, n + e1 + e2) + w(u2, n + e1)   by (A.2),
= w(u1, n + e2) + w(u2, n + 2e1) + D1 w(u1, n + e2) − D1 w(u2, n + e1)
≥ w(u1, n + e2) + w(u2, n + 2e1) + D1 w(u1, n + e2) − D1 w(u1, n + e1)   as D1 w(u, n) is increasing in u,
≥ w(u1, n + e2) + w(u2, n + 2e1) + D1 w(u1, n + e2) − D1 w(u1, n + e2)   as w is diagonally submissive (P2),
= w(u1, n + e2) + w(u2, n + 2e1)
= Tµ1 τ(n + 2e1) + Tµ1 τ(n + e2)   by (A.13).

2. u1 < u2, which implies u1 = 0 and u2 = 1.
Tµ1 τ(n + e1 + e2) + Tµ1 τ(n + e1)
≥ w(1, n + e1 + e2) + w(0, n + e1)   by (A.2),
= 1 + τ(n + e2) + τ(n + e1)   by (A.1),
= w(1, n + 2e1) + w(0, n + e2)
= Tµ1 τ(n + 2e1) + Tµ1 τ(n + e2)   by (A.13).

Proof that Tµ1 τ is upper bounded: We need to show that for any n, D1 Tµ1 τ(n) ≤ 1 and D2 Tµ1 τ(n) ≤ 1. Let u1, u2 ∈ {0, 1} be the maximizers of w at n and n + e1:

Tµ1 τ(n) = w(u1, n) and Tµ1 τ(n + e1) = w(u2, n + e1).   (A.14)

We have:

D1 Tµ1 τ(n) = Tµ1 τ(n + e1) − Tµ1 τ(n)
= w(u2, n + e1) − w(u1, n)   by (A.14),
≤ w(u2, n + e1) − w(u2, n)   as u1 is the maximizer at n by (A.14),
= D1 w(u2, n) ≤ 1   as w satisfies P4.

The proof that D2 Tµ1 τ(n) ≤ 1 follows the same steps with the indices 1 and 2 reversed.

Proof that Tλ τ ∈ Ψ: For any τ ∈ Ψ, we begin by redefining w to be the following function of (u, n) ∈ {1, 2} × Ω:

w(u, n) = { τ(n + e1) if u = 1; τ(n + e2) if u = 2 }.   (A.15)

In this function, u = 1 or 2 correspond to routing to server 1 or 2, respectively. By definition,

Tλ τ(n) = max{ τ(n + e1), τ(n + e2) } = max_{u∈{1,2}} w(u, n).   (A.16)

The first and second differences of w are:

D1 w(u, n) = w(u, n + e1) − w(u, n) = { D1 τ(n + e1) if u = 1; D1 τ(n + e2) if u = 2 },   (A.17)
D2 w(u, n) = w(u, n + e2) − w(u, n) = { D2 τ(n + e1) if u = 1; D2 τ(n + e2) if u = 2 },   (A.18)
D12 w(u, n) = D21 w(u, n) = { D21 τ(n + e1) if u = 1; D21 τ(n + e2) if u = 2 },   (A.19)
D11 w(u, n) = { D11 τ(n + e1) if u = 1; D11 τ(n + e2) if u = 2 },   (A.20)
D22 w(u, n) = { D22 τ(n + e1) if u = 1; D22 τ(n + e2) if u = 2 }.   (A.21)

For a given u, w equals τ with one coordinate shifted by one, and therefore τ ∈ Ψ implies w ∈ Ψ. Furthermore, as w is diagonally submissive, D1 w(u, n) in (A.17) is increasing in u and D2 w(u, n) in (A.18) is decreasing in u. We need to show that properties P1 to P4 are preserved under the maximum operator, Tλ.

Proof that Tλ τ is submodular: We need to show that for any n, D12 Tλ τ ≤ 0, or,

Tλ τ(n + e1) + Tλ τ(n + e2) ≥ Tλ τ(n + e1 + e2) + Tλ τ(n).   (A.22)

Let u1, u2 ∈ {1, 2} be the maximizers of (A.16) at n + e1 + e2 and n:

Tλ τ(n + e1 + e2) = w(u1, n + e1 + e2) and Tλ τ(n) = w(u2, n).   (A.23)

We prove (A.22) separately for u1 ≥ u2 and u1 < u2:

1. u1 ≥ u2
Tλ τ(n + e1) + Tλ τ(n + e2)
≥ w(u1, n + e1) + w(u2, n + e2)   by (A.16),
= w(u1, n + e1 + e2) + w(u2, n) − D2 w(u1, n + e1) + D2 w(u2, n)
≥ w(u1, n + e1 + e2) + w(u2, n) − D2 w(u2, n + e1) + D2 w(u2, n)   as D2 w(u, n + e1) is decreasing in u,
= w(u1, n + e1 + e2) + w(u2, n) − D21 w(u2, n)
≥ w(u1, n + e1 + e2) + w(u2, n)   as w is submodular (P1),
= Tλ τ(n + e1 + e2) + Tλ τ(n)   by (A.23).

2. u1 < u2
Tλ τ(n + e1) + Tλ τ(n + e2)
≥ w(u2, n + e1) + w(u1, n + e2)   by (A.16),
= w(u1, n + e1 + e2) + w(u2, n) − D1 w(u1, n + e2) + D1 w(u2, n)
≥ w(u1, n + e1 + e2) + w(u2, n) − D1 w(u2, n + e2) + D1 w(u2, n)   as D1 w(u, n + e2) is increasing in u,
= w(u1, n + e1 + e2) + w(u2, n) − D21 w(u2, n)
≥ w(u1, n + e1 + e2) + w(u2, n)   as w is submodular (P1),
= Tλ τ(n + e1 + e2) + Tλ τ(n)   by (A.23).

Proof that Tλ τ is diagonally submissive: We need to show that for any n, D22 Tλ τ ≤ D12 Tλ τ and D11 Tλ τ ≤ D12 Tλ τ hold, or,

Tλ τ(n + e1 + e2) + Tλ τ(n + e2) ≥ Tλ τ(n + 2e2) + Tλ τ(n + e1),   (A.24)
Tλ τ(n + e1 + e2) + Tλ τ(n + e1) ≥ Tλ τ(n + 2e1) + Tλ τ(n + e2).   (A.25)

The proofs for (A.24) and (A.25) are similar to that of (A.22). We prove (A.24); the proof for (A.25) follows by symmetry. Let u1 and u2 be the maximizers of w at n + 2e2 and n + e1:

Tλ τ(n + 2e2) = w(u1, n + 2e2) and Tλ τ(n + e1) = w(u2, n + e1).   (A.26)

We prove (A.24) separately for u1 ≥ u2 and u1 < u2:

1. u1 ≥ u2
Tλ τ(n + e1 + e2) + Tλ τ(n + e2)
≥ w(u2, n + e1 + e2) + w(u1, n + e2)   by (A.16),
= w(u2, n + e1) + w(u1, n + e2) + D2 w(u2, n + e1)
≥ w(u2, n + e1) + w(u1, n + e2) + D2 w(u2, n + e2)   as w is diagonally submissive (P2),
≥ w(u2, n + e1) + w(u1, n + e2) + D2 w(u1, n + e2)   as D2 w(u, n + e2) is decreasing in u,
= w(u2, n + e1) + w(u1, n + 2e2)
= Tλ τ(n + 2e2) + Tλ τ(n + e1)   by (A.26).

2. u1 < u2, which implies u1 = 1 and u2 = 2.
Tλ τ(n + e1 + e2) + Tλ τ(n + e2)
≥ w(u2, n + e1 + e2) + w(u1, n + e2)   by (A.16),
= τ(n + e1 + 2e2) + τ(n + e1 + e2)   by (A.15),
= w(u1, n + 2e2) + w(u2, n + e1)   by (A.15),
= Tλ τ(n + 2e2) + Tλ τ(n + e1)   by (A.26).

Proof that Tλ τ is upper bounded: We need to show that for any n, D1 Tλ τ(n) ≤ 1 and D2 Tλ τ(n) ≤ 1. Let u1, u2 ∈ {1, 2} be the maximizers of w at n and n + e1:

Tλ τ(n) = w(u1, n) and Tλ τ(n + e1) = w(u2, n + e1).   (A.27)

We have:

D1 Tλ τ(n) = Tλ τ(n + e1) − Tλ τ(n)
= w(u2, n + e1) − w(u1, n)   by (A.27),
≤ w(u2, n + e1) − w(u2, n)   as u1 is the maximizer at n by (A.27),
= D1 w(u2, n) ≤ 1   as w satisfies P4.

The proof that D2 Tλ τ(n) ≤ 1 is identical, with the indices 1 and 2 reversed.

Proof that T τ ∈ Ψ: The set Ψ is closed under convex combinations, that is, f, g ∈ Ψ implies αf + (1 − α)g ∈ Ψ for α ∈ [0, 1], as one can easily verify. From (4), T τ is a convex combination of Tµi τ (i ∈ S), Tλ τ, and τ, and therefore T τ ∈ Ψ when τ ∈ Ψ, that is, T preserves the properties P1-P4. □

Proof of Theorem 1: From Lemma 1, the optimal value function ν is upper bounded (P4), that is, Di ν(n) ≤ 1 for all n ∈ Ω. Assume that n1 ≥ 1 (queue 1 is nonempty). Then D1 ν(n − e1) ≤ 1 implies

1 + ν(n − e1) ≥ ν(n),

which implies that the optimal action for server 1 when it completes service is to begin serving the next customer in its queue. The proof for server 2 is identical. □

Proof of Theorem 2: We need to show:

ν(n + ei) ≥ ν(n + ej) ⇒ ν(n) ≥ ν(n − ei + ej) and ν(n + ei + ej) ≥ ν(n + 2ej).   (A.28)

To prove the first part of (A.28), we use diagonal submissiveness (P2) at state n − ei:

Dii ν(n − ei) ≤ Dij ν(n − ei) ⇒ Di ν(n) ≤ Di ν(n − ei + ej)
⇒ ν(n + ei) − ν(n) ≤ ν(n + ej) − ν(n − ei + ej).   (A.29)

The left-hand side of (A.28), ν(n + ei) ≥ ν(n + ej), and (A.29) imply ν(n) ≥ ν(n − ei + ej). To prove the second part of (A.28), we use diagonal submissiveness (P2) at state n:

Djj ν(n) ≤ Dji ν(n) ⇒ Dj ν(n + ej) ≤ Dj ν(n + ei)
⇒ ν(n + 2ej) − ν(n + ej) ≤ ν(n + ei + ej) − ν(n + ei).   (A.30)

The left hand side of (A.28), ν(n + ei) ≥ ν(n + ej), together with (A.30), implies ν(n + ei + ej) ≥ ν(n + 2ej).  □

Proof of Lemma 2: Without loss of generality, we take i = 1 and j = 2. We need to show:

   n1 ≥ n2 > 0, µ2 ≥ µ1, ν(n + e2) ≥ ν(n + e1)  ⇒  ν(n − e1 + e2) ≥ ν(n + e1 − e2),        (A.31)
   n1 = n2, µ2 ≥ µ1  ⇒  ν(n + e2) ≥ ν(n + e1).        (A.32)

We use induction and value iteration to prove (A.31) and (A.32). We first show that (A.31) and (A.32) hold at iteration k = 1. Then, we prove that if both hold at iteration k ≥ 1, they are preserved at iteration k + 1. The value iteration algorithm iterates as follows:

   νk+1(n) = Σ_{i=1}^{s} µi (I_{ni≥1} + νk(n − ei × I_{ni≥1})) + λ Σ_{i=1}^{s} ni νk(n) + (N − Σ_{i=1}^{s} ni) λ Tλ νk(n).        (A.33)

(The optimality equation (5) corresponds to the limit as k → ∞ of this iteration, with g = lim_{k→∞} {νk+1(n) − νk(n)}.) For states n − e1 + e2 and n + e1 − e2 in (A.31), using (A.33) we obtain:

   νk+1(n − e1 + e2) = µ1 (I_{n1≥2} + νk(n − e1 − e1 × I_{n1≥2} + e2)) + µ2 (1 + νk(n − e1))
                       + λ(n1 + n2) νk(n − e1 + e2)
                       + (N − n1 − n2) λ max{νk(n + e2), νk(n − e1 + 2e2)},        (A.34)

   νk+1(n + e1 − e2) = µ1 (1 + νk(n − e2)) + µ2 (I_{n2≥2} + νk(n + e1 − e2 − e2 × I_{n2≥2}))
                       + λ(n1 + n2) νk(n + e1 − e2)
                       + (N − n1 − n2) λ νk(n + e1).        (A.35)

By the assumption in (A.31), routing to server 2 is optimal at n. Therefore, from Theorem 2, it is also optimal to route to server 2 at n + e1 − e2, which is why the last term on the right hand side of (A.35) has no maximum operator. To prove (A.32), we use (A.33) at n + e1 and n + e2 when n1 = n2 = n:

   νk+1(n + e1) = µ1 (1 + νk(n)) + µ2 (I_{n≥1} + νk(n + e1 − e2 × I_{n≥1}))
                  + λ(2n + 1) νk(n + e1)
                  + (N − 2n − 1) λ max{νk(n + 2e1), νk(n + e1 + e2)},        (A.36)

   νk+1(n + e2) = µ1 (I_{n≥1} + νk(n − e1 × I_{n≥1} + e2)) + µ2 (1 + νk(n))
                  + λ(2n + 1) νk(n + e2)
                  + (N − 2n − 1) λ max{νk(n + e1 + e2), νk(n + 2e2)}.        (A.37)
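For concreteness, the recursion (A.33) is easy to evaluate numerically. The following is a minimal Python sketch for s = 2; the function name, the parameter values, and the iteration count are illustrative choices, not from the paper. Rates are uniformized so that µ1 + µ2 + Nλ = 1, and the gain g (throughput per unit time, after undoing the normalization) is estimated from successive value differences.

```python
def value_iteration(N, lam, mu, iters=2000):
    """Value iteration for the two-queue finite-source routing MDP,
    following the form of recursion (A.33)."""
    # Uniformize: scale rates so that mu[0] + mu[1] + N*lam == 1.
    total = mu[0] + mu[1] + N * lam
    lam, mu = lam / total, (mu[0] / total, mu[1] / total)
    # Feasible states: (n1, n2) with n1 + n2 <= N customers at the queues.
    states = [(n1, n2) for n1 in range(N + 1) for n2 in range(N + 1 - n1)]
    v = {n: 0.0 for n in states}
    g = 0.0
    for _ in range(iters):
        w = {}
        for (n1, n2) in states:
            # Service completions: reward 1 per completed service;
            # an idle server contributes a fictitious self-transition.
            val = mu[0] * ((1.0 + v[(n1 - 1, n2)]) if n1 >= 1 else v[(n1, n2)])
            val += mu[1] * ((1.0 + v[(n1, n2 - 1)]) if n2 >= 1 else v[(n1, n2)])
            # Fictitious self-transitions for customers already at the queues.
            val += lam * (n1 + n2) * v[(n1, n2)]
            # Arrivals: route to the queue with the larger continuation value.
            if n1 + n2 < N:
                val += (N - n1 - n2) * lam * max(v[(n1 + 1, n2)], v[(n1, n2 + 1)])
            w[(n1, n2)] = val
        g = w[(0, 0)] - v[(0, 0)]  # gain estimate per uniformized transition
        v = w
    return v, g * total  # g * total: throughput per unit time
```

Besides estimating the optimal throughput, the converged relative values can be inspected to read off the optimal routing action in each state, which is how the structural results above can be checked numerically on small instances.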

Iteration 1: Setting ν0 = 0 as the initial value of the value iteration algorithm, we prove that (A.31) and (A.32) hold at iteration 1. For (A.31), we have:

   ν1(n − e1 + e2) − ν1(n + e1 − e2) = (µ1 I_{n1≥2} − µ1) − (µ2 I_{n2≥2} − µ2)
     = 0                                                     if n1 ≥ 2, n2 ≥ 2,
     = µ2 ≥ 0                                                if n1 ≥ 2, n2 < 2,
     = µ2 − µ1 ≥ 0 by the assumption in (A.31)               if n1 < 2, n2 < 2.

For (A.32), we have:

   ν1(n + e2) − ν1(n + e1) = (µ2 − µ1)(1 − I_{n≥1}) ≥ 0   since µ2 ≥ µ1.

Iteration k: We now assume that both (A.31) and (A.32) hold at iteration k. That is,

   n1 ≥ n2 > 0, µ2 ≥ µ1, νk(n + e2) ≥ νk(n + e1)  ⇒  νk(n − e1 + e2) ≥ νk(n + e1 − e2),        (A.38)
   n1 = n2 = n, µ2 ≥ µ1  ⇒  νk(n + e2) ≥ νk(n + e1).        (A.39)

Iteration k + 1: Based on the induction assumption at iteration k ((A.38) and (A.39)), we now prove (A.31) and (A.32) separately at iteration k + 1.

Proof of (A.31): We consider three regions: (1) n1 ≥ n2 ≥ 2, (2) n1 ≥ 2, n2 = 1, and (3) n1 = n2 = 1.

Region 1: n1 ≥ n2 ≥ 2. In this region, using (A.34) and (A.35), we obtain

   νk+1(n − e1 + e2) − νk+1(n + e1 − e2) = µ1 (νk(n − 2e1 + e2) − νk(n − e2))
     + µ2 (νk(n − e1) − νk(n + e1 − 2e2))
     + λ(n1 + n2)(νk(n − e1 + e2) − νk(n + e1 − e2)) + (N − n1 − n2) λ A,        (A.40)

where

   A = max{νk(n + e2), νk(n − e1 + 2e2)} − νk(n + e1).        (A.41)

By (A.38), the first three terms on the right hand side of (A.40) are nonnegative. It remains to prove that A is also nonnegative, which we do by considering two cases for the optimal routing policy in state n − e1 + e2.

Case 1: It is optimal to route to server 1 in state n − e1 + e2, which implies A = νk(n + e2) − νk(n + e1), which is nonnegative by the assumption in (A.38).

Case 2: It is optimal to route to server 2 in state n − e1 + e2, which implies A = νk(n − e1 + 2e2) − νk(n + e1). In this case, the optimality of routing to server 2 (that is, νk(n − e1 + 2e2) ≥ νk(n + e2)) and (A.38) imply that A is nonnegative.

Region 2: n1 ≥ 2, n2 = 1. In this region,

   νk+1(n − e1 + e2) − νk+1(n + e1 − e2) = µ1 (νk(n − 2e1 + e2) − νk(n − e2))
     + µ2 (1 + νk(n − e1) − νk(n + e1 − e2))
     + λ(n1 + n2)(νk(n − e1 + e2) − νk(n + e1 − e2)) + (N − n1 − n2) λ A.        (A.42)

All terms on the right hand side of (A.42) are the same as the terms in (A.40), except the second term. The following shows that this second term is nonnegative:

   1 + νk(n − e1) − νk(n + e1 − e2)
     ≥ νk(n − e1 + e2) − νk(n + e1 − e2)   by P4
     ≥ 0                                    by (A.38).

Region 3: n1 = n2 = 1. In this region,

   νk+1(n − e1 + e2) − νk+1(n + e1 − e2) = µ1 (νk(n − e1 + e2) − 1 − νk(n − e2))
     + µ2 (1 + νk(n − e1) − νk(n + e1 − e2))
     + λ(n1 + n2)(νk(n − e1 + e2) − νk(n + e1 − e2)) + (N − n1 − n2) λ A.        (A.43)

In the Region 1 proof, we showed that the third and fourth terms on the right hand side of (A.43) are nonnegative. In the Region 2 proof, we showed that the second term is nonnegative. To prove that (A.43) is nonnegative, we now show that the sum of the first and second terms is nonnegative:

   µ1 (νk(n − e1 + e2) − 1 − νk(n − e2)) + µ2 (1 + νk(n − e1) − νk(n + e1 − e2))
     ≥ µ1 (νk(n − e1 + e2) − νk(n + e1 − e2) + νk(n − e1) − νk(n − e2))
           because µ2 ≥ µ1 by (A.31) and the second term is nonnegative
     ≥ µ1 (νk(n − e1) − νk(n − e2))   by (A.38)
     ≥ 0   because n1 = n2 and routing to server 2 is optimal at n − e1 − e2, by (A.39).

Proof of (A.32): Given the induction assumption (A.39), we now prove (A.32) at iteration k + 1, by considering two regions for n1 = n2 = n: (1) n = 0, (2) n ≥ 1.

Region 1: n = 0. In this region,

   νk+1(n + e2) − νk+1(n + e1) = µ2 (1 + νk(n) − νk(n + e1)) − µ1 (1 + νk(n) − νk(n + e2))
     + λ(2n + 1)(νk(n + e2) − νk(n + e1)) + (N − 2n − 1) λ B,        (A.44)

where

   B = max{νk(n + e1 + e2), νk(n + 2e2)} − max{νk(n + 2e1), νk(n + e1 + e2)}.        (A.45)

First, we show that the sum of the first two right hand side terms in (A.44) is nonnegative:

   µ2 (1 + νk(n) − νk(n + e1)) − µ1 (1 + νk(n) − νk(n + e2))
     ≥ (µ2 − µ1)(1 + νk(n) − νk(n + e2))   because νk(n + e2) ≥ νk(n + e1) by (A.39)
     = (µ2 − µ1)(1 − D2 νk(n)) ≥ 0         because µ2 ≥ µ1 and D2 νk(n) ≤ 1 by P4.

The induction assumption (A.39) implies that the third term on the right hand side of (A.44) is nonnegative. It remains to prove that B is nonnegative. By the induction assumption (A.39), routing to server 2 is optimal at state n and consequently at state n + e1 (Theorem 2). This leaves two possible combinations of optimal routing policies at states n + e1 and n + e2.

Case 1: It is optimal to route to server 2 in states n + e1 and n + e2, which implies B = νk(n + 2e2) − νk(n + e1 + e2), which is nonnegative because routing to server 2 is optimal in state n + e2.

Case 2: It is optimal to route to server 1 in state n + e2 and to server 2 in state n + e1, which implies B = νk(n + e1 + e2) − νk(n + e1 + e2) = 0.

Region 2: n ≥ 1. In this region,

   νk+1(n + e2) − νk+1(n + e1) = µ2 (νk(n) − νk(n + e1 − e2)) − µ1 (νk(n) − νk(n − e1 + e2))
     + λ(2n + 1)(νk(n + e2) − νk(n + e1)) + (N − 2n − 1) λ B,        (A.46)

where B is as in (A.45). The third term on the right hand side is nonnegative by the induction hypothesis, and we showed in the Region 1 proof that B is nonnegative. We complete the proof by showing that the sum of the first two right hand side terms is nonnegative:

   µ2 (νk(n) − νk(n + e1 − e2)) − µ1 (νk(n) − νk(n − e1 + e2))
     ≥ (µ2 − µ1)(νk(n) − νk(n + e1 − e2))   as νk(n − e1 + e2) ≥ νk(n + e1 − e2) by (A.38)
     ≥ 0   because µ2 ≥ µ1 and because routing to server 2 is optimal at n, and consequently also at n − e2, by Theorem 2.  □

Proof of Theorem 3: Follows directly from Lemma 2 (ii).  □

Proof of Theorem 4: We use an argument similar to the proof in Winston (1977). We use the uniformized Markov Decision Process from Section 2, with rates normalized so that Λ = sµ + Nλ = 1, where µi = µ is the common service rate. Letting X(n) = {j | nj > 0, j ∈ S} be the set of queues with busy servers, if the process begins in state n, then the next transition will result in one of the following:

1. With probability (N − Σ_{i∈S} ni)λ, an arrival occurs, requiring routing to one of the queues.
2. With probability |X(n)|µ, a service is completed at one of the queues in X(n).
3. With probability 1 − |X(n)|µ − (N − Σ_{i∈S} ni)λ, nothing happens.

We will show that the SQ policy stochastically maximizes throughput over t transitions. Figure A.6 illustrates the notation that we will use in the proof. Let R = (δ1, δ2, · · · , δt) be a decision rule, where δt−u(n) = k means that action k (routing to server k) is taken with t − u − 1 transitions to go, if the preceding transition was an arrival, given that the state before the preceding transition was n. Let ∆ be the set of all decision rules.

For R ∈ ∆, νt−u(n|R) is the random variable denoting the throughput with t − u transitions to go, not counting throughput during the final transition, given that the state with t − u transitions to go is n. If δt(n) = k, then we have

   νt(n|R) = 1 + νt−1(n − ei|R)   with probability µ, for each i ∈ X(n),
             νt−1(n + ek|R)        with probability (N − Σ_{i∈S} ni)λ,
             νt−1(n|R)             with probability 1 − |X(n)|µ − (N − Σ_{i∈S} ni)λ.        (A.47)

Let L(n) = arg min_{i∈S} {ni} and let R∗ be the SQ policy, which routes each new arrival to the shortest queue, L(n), with ties broken arbitrarily. Our goal is to prove that R∗ is optimal,

[Figure A.6: Notation for proof of optimality of the SQ policy. The figure shows the transitions to go (t, t − 1, . . . , 1), the decision δ applied after each arrival, and the corresponding throughput variables νt, νt−1, . . . , ν1.]
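Before the formal induction, the SQ policy itself is simple to exercise numerically. Below is a small, self-contained Monte Carlo sketch (not from the paper; the function name, parameters, and horizon are illustrative) that simulates the closed finite-source system under shortest-queue routing, Gillespie-style, and estimates throughput as completed services per unit time.

```python
import random

def simulate_sq(N, lam, mu, horizon=10000.0, seed=1):
    """Monte Carlo throughput estimate for the finite-source parallel-queue
    system under the Shortest-Queue (SQ) routing policy."""
    rng = random.Random(seed)
    s = len(mu)
    n = [0] * s          # queue lengths, including the customer in service
    t, served = 0.0, 0
    while t < horizon:
        # Active event rates: one aggregate arrival clock, one clock per busy server.
        arrival_rate = (N - sum(n)) * lam
        rates = [arrival_rate] + [mu[i] if n[i] > 0 else 0.0 for i in range(s)]
        total = sum(rates)
        t += rng.expovariate(total)
        u = rng.random() * total
        if u < arrival_rate:
            # Arrival: route to a shortest queue (ties to the lowest index).
            n[n.index(min(n))] += 1
        else:
            # Service completion: pick the busy server whose rate covers u.
            u -= arrival_rate
            for i in range(s):
                if n[i] > 0:
                    u -= mu[i]
                    if u < 0:
                        n[i] -= 1
                        served += 1
                        break
    return served / t
```

Such a simulation can be run side by side with any alternative routing rule to observe, on sample paths, the dominance that the proof below establishes in the stochastic-order sense.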

in the sense that, for any decision rule R ∈ ∆,

   νt(n|R∗) ≥st νt(n|R),        (A.48)

where X ≥st Y (X is stochastically greater than or equal to Y) means that Pr{X > c} ≥ Pr{Y > c} for all real numbers c. As stochastic ordering implies ordering of the expected values, (A.48) is stronger than the statement in Theorem 4. As part of the proof of (A.48), we will prove the following two properties:

   νt(n|R∗) ≥st νt(n̄|R∗),        (A.49)

where n = (n1, n2, n3, n4, · · · , ns) and n̄ = (n̄1, n̄2, n3, n4, · · · , ns) satisfy n1 + n2 = n̄1 + n̄2 and |n1 − n2| ≤ |n̄1 − n̄2|, and

   1 + νt(n|R∗) ≥st νt(n + ei|R∗),   i ∈ S.        (A.50)

Note that the states n and n̄ are identical except for two servers, which we label here as servers 1 and 2, without loss of generality. We prove that (A.48), (A.49), and (A.50) hold for t ≥ 1 by induction.

We begin by proving that the three properties hold for t = 1. To prove that (A.48) holds, set ν0(n|R) = 0 for all n and R, which implies that ν1(n|R) and ν1(n|R∗) have the same distribution; that is, both equal 1 with probability |X(n)|µ and zero otherwise, and hence ν1(n|R) =st ν1(n|R∗). To prove that (A.49) holds, note that X(n̄) ⊆ X(n), as one can easily verify, and therefore

   Pr{ν1(n|R∗) = 1} = |X(n)|µ ≥ |X(n̄)|µ = Pr{ν1(n̄|R∗) = 1},

which implies ν1(n|R∗) ≥st ν1(n̄|R∗). To prove that (A.50) holds, note that

   Pr{1 + ν1(n|R∗) > 1} = |X(n)|µ ≥ 0 = Pr{ν1(n + ei|R∗) > 1}

and

   Pr{1 + ν1(n|R∗) > 0} = 1 ≥ |X(n) ∪ {i}|µ = Pr{ν1(n + ei|R∗) > 0},

which together imply that 1 + ν1(n|R∗) ≥st ν1(n + ei|R∗).

As the induction assumption, assume that (A.48)–(A.50) hold for t − 1. We complete the proof by proving that (A.48)–(A.50) continue to hold for t.

Proof of (A.48): Let δt(n) = k. From (A.47), property (A.48) will be valid for t if

   1 + νt−1(n − ei|R∗) ≥st 1 + νt−1(n − ei|R),   i ∈ X(n),        (A.51)
   νt−1(n + e_{L(n)}|R∗) ≥st νt−1(n + ek|R),        (A.52)
   νt−1(n|R∗) ≥st νt−1(n|R).        (A.53)

Inequalities (A.51) and (A.53) hold because (A.48) holds for t − 1. Inequality (A.52) holds because

   νt−1(n + e_{L(n)}|R∗) ≥st νt−1(n + ek|R∗)   as (A.49) holds at t − 1
                         ≥st νt−1(n + ek|R)     as (A.48) holds at t − 1.

Proof of (A.49): Assume, without loss of generality, that n̄1 ≤ n̄2, and let us first assume that n1 ≤ n2; one can easily obtain the proof for n2 ≤ n1 by interchanging the indices of n1 and n2 in the following. We show that (A.49) holds in the five possible cases.

Case I: n1 = n̄1 and n2 = n̄2. In this case n = n̄ and (A.49) holds with equality.

Case II: L(n) = L(n̄) = k and n1, n2, n̄1, n̄2 > 0, which implies X(n) = X(n̄). To prove (A.49), we need to show that

   1 + νt−1(n − ei|R∗) ≥st 1 + νt−1(n̄ − ei|R∗),   i ∈ X(n),
   νt−1(n + ek|R∗) ≥st νt−1(n̄ + ek|R∗),
   νt−1(n|R∗) ≥st νt−1(n̄|R∗).

All three inequalities follow directly from the induction assumption on (A.49) at t − 1.

Case III: L(n) = L(n̄) = 1 and n̄1 = 0, n1, n2, n̄2 > 0. In this case X(n) = X(n̄) ∪ {1}. To prove (A.49) for this case, we need to show that

   1 + νt−1(n − ei|R∗) ≥st 1 + νt−1(n̄ − ei|R∗),   i ∈ X(n̄),        (A.54)
   νt−1(n + e1|R∗) ≥st νt−1(n̄ + e1|R∗),        (A.55)
   νt−1(n|R∗) ≥st νt−1(n̄|R∗),        (A.56)
   1 + νt−1(n − e1|R∗) ≥st νt−1(n̄|R∗).        (A.57)

Inequalities (A.54)–(A.56) hold because (A.49) holds at t − 1. To verify (A.57), based on (A.50) and (A.49) at t − 1, we have

   1 + νt−1(n − e1|R∗) ≥st νt−1(n|R∗) ≥st νt−1(n̄|R∗).

Case IV: L(n) = k > 1, L(n̄) = 1 and n1, n2, n̄1, n̄2 > 0. In this case X(n) = X(n̄). We need to show (A.54), (A.56), and

   νt−1(n + ek|R∗) ≥st νt−1(n̄ + e1|R∗).        (A.58)

As discussed in Case III, (A.54) and (A.56) follow directly from the induction assumption that (A.49) holds at t − 1. To prove (A.58), we have

   νt−1(n + ek|R∗) ≥st νt−1(n + e1|R∗)   as (A.48) holds at t − 1
                   ≥st νt−1(n̄ + e1|R∗)   as (A.49) holds at t − 1.

Case V: L(n) = k > 1, L(n̄) = 1 and n̄1 = 0, n1, n2, n̄2 > 0. In this case X(n) = X(n̄) ∪ {1}. The validity of (A.49) in this case follows from the validity of (A.54), (A.55), (A.57), and (A.58), shown in the previous cases.

Cases I–V cover all possibilities, as we now explain. Case I covers n1 = 0, while Cases II–V cover n1 > 0. Note that if n1 = 0, then |n2 − n1| = n2 = n1 + n2 = n̄1 + n̄2, and this means that either (n̄1, n̄2) = (0, n2) = (n1, n2) or (n̄1, n̄2) = (n2, 0) = (n2, n1). In the former case, n̄ = n, as in Case I. In the latter case, n̄ is equal to n with the positions of n1 and n2 interchanged; by an easy induction, we can show that νt(n|R∗) and νt(n̄|R∗) then have the same distribution in the homogeneous-server case, as n̄ is n with the positions of two of its elements rearranged. Since n1 > 0 in Cases II–V, in these cases n2 > 0 (as n2 ≥ n1) and n̄2 > 0 (as n̄1 + n̄2 = n1 + n2 > 0 and n̄2 ≥ n̄1) as well, but n̄1 can be either 0 or positive. In Cases III and V, n̄1 = 0, while in Cases II and IV, n̄1 > 0. In Cases II and III, L(n) = L(n̄), while in Cases IV and V, L(n) ≠ L(n̄).

Proof of (A.50): The two possible cases are:

Case I: X(n) = X(n + ei). In this case, we need to show

   2 + νt−1(n − ej|R∗) ≥st 1 + νt−1(n + ei − ej|R∗),   j ∈ X(n),        (A.59)
   1 + νt−1(n + e_{L(n)}|R∗) ≥st νt−1(n + ei + e_{L(n+ei)}|R∗),        (A.60)
   1 + νt−1(n|R∗) ≥st νt−1(n + ei|R∗).        (A.61)

Inequalities (A.59) and (A.61) follow directly because (A.50) holds at t − 1. If L(n) = L(n + ei), then (A.50) at t − 1 also yields (A.60). If L(n) ≠ L(n + ei), then

   1 + νt−1(n + e_{L(n)}|R∗) ≥st 1 + νt−1(n + e_{L(n+ei)}|R∗)   as (A.49) holds at t − 1
                              ≥st νt−1(n + ei + e_{L(n+ei)}|R∗)   as (A.50) holds at t − 1.

Case II: X(n + ei) = X(n) ∪ {i}. Inequalities (A.59)–(A.61) are still valid in this case. All that remains is to verify that (A.59) holds when i = j, which we see from the following:

   1 + νt−1(n|R∗) ≥st νt−1(n|R∗) = νt−1(n + ei − ei|R∗).  □
