Queueing Systems 21 (1995) 245-266
On the optimal design of tandem queueing systems with finite buffers

Frederick S. Hillier
Department of Operations Research, Stanford University, Stanford CA 94305, USA

Kut C. So
Graduate School of Management, University of California, Irvine, CA 92717, USA

Received 22 April 1994; revised 5 June 1995

We consider tandem queueing systems that can be formulated as a continuous-time Markov chain, and investigate how to maximize the throughput when the queue capacities are limited. We consider various constrained optimization problems where the decision variables are of one or more of the following types: (1) expected service times, (2) queue capacities, and (3) the number of servers at the respective stations. After surveying our previous studies of this kind, we open up consideration of three new problems by presenting some numerical results that should give some insight into the general form of the optimal design.
Keywords: Tandem queues, optimal design, bowl phenomenon.
1. Introduction
Systems of finite queues in series arise frequently in a variety of contexts, including production line systems (e.g., see Hillier and Boling [8-10]), flexible manufacturing systems (e.g., see Stecke and Solberg [30, 31]), communication networks (e.g., see Schwartz [28]), and computer systems with limits on multiprogramming (e.g., see Konheim and Reiser [22]). When service times are highly variable and the queue capacities (buffers between stations) are small, the performance of the system as measured by its throughput (output rate) is seriously degraded due to frequent blocking of customers ready to move to the next station. This raises the issue of how to design the system so as to maximize throughput subject to constraints on the design variables. Considerable research has been devoted to studying this issue, and this undoubtedly will continue for some time.

Three of the key kinds of design variables considered are (1) the division of work among the stations, (2) the size of the buffers (queue capacities) between stations, and (3) the allocation of servers to the stations. For example, the authors [11-19] recently have systematically studied each of these three kinds of design variables individually (as well as the first and third kinds jointly). These studies were viewed, in part, as building blocks for subsequent research into the simultaneous optimization of at least two of these kinds of design variables together.

The main contribution of this paper is to help open up this subsequent research. Section 4 examines four optimization problems, namely, the simultaneous optimization of each of the three pairs of kinds of design variables, and then the simultaneous optimization of all three kinds. Although the authors [14] have previously studied one pair (briefly reviewed here), the other three optimization problems are new. For each of these three problems, the optimal design is obtained for a number of problem instances that should give some insight into the general form of the optimal design. It is hoped that this preliminary work will motivate further research into these important problems.

We begin with the formulation of the model (section 2) and a brief review of the building blocks, namely the optimization of each kind of design variable individually (section 3). The main focus in section 3 is on the authors' recent studies, but details (including discussion of other related work) are left to the references.

© J.C. Baltzer AG, Science Publishers

2. Model formulation
We use the classical queueing model formulated by Hunt [21] for finite queues in series. In particular, every customer must be processed through each of $N$ service facilities (stations) in the same fixed sequence, namely, stations $1, 2, \ldots, N$. Station $j$ has $s_j$ servers operating in parallel ($j = 1, 2, \ldots, N$). There always is a customer available to begin service at station 1. For $j = 2, 3, \ldots, N$, the maximum number of customers that can be held in the queue (buffer) before station $j$ (not counting any customers being held at station $j - 1$ due to blocking or any customers being served at station $j$) is $q_j$. Thus, if a customer completes service at station $j - 1$ when the queue before station $j$ is full, this customer must be held by its server at station $j - 1$ without service beginning for the next customer. Until there is room for this customer in the queue before station $j$, this server must remain idle, simply holding the customer, so station $j$ is said to be blocking station $j - 1$. (This blocking mechanism is referred to in the literature by various names: type 1 blocking, transfer blocking, production blocking, and non-immediate blocking.)

We assume that all service times are independent, and that all service times at station $j$ are identically distributed with mean $w_j$ ($j = 1, 2, \ldots, N$). We also assume that the service-time distributions at the various stations are the same except perhaps for their means (scale factors), so that their coefficients of variation are the same.

For phase-type service-time distributions, the overall queueing process formulated above is a continuous-time Markov chain. Efficient procedures for its numerical solution are given by Hillier and Boling [7], Mitra and Tsoucas [24], Papadopoulos, Heavey, and O'Kelly [26, 27], and Heavey, Papadopoulos, and Browne [6]. To investigate different coefficients of variation (cv) for service times, we use the exponential distribution for cv = 1, the Erlang distribution for cv < 1, and the two-stage Coxian distribution for cv > 1 (mainly).

Our primary measure of performance of the system is the throughput (output rate), denoted by $R(q, s, w)$, where $q = (q_2, q_3, \ldots, q_N)$, $s = (s_1, s_2, \ldots, s_N)$, and $w = (w_1, w_2, \ldots, w_N)$. (Some of the arguments of $R$ are suppressed when their values are understood.) We normalize (choose the unit of time) by setting the average of the $w_j$ values equal to 1. We also calculate $L(q, s, w)$, the expected total number of customers in the entire system (excluding customers waiting to begin service at station 1), as a measure of work-in-process inventory. However, no $L$ values are recorded in this paper. Both $R$ and $L$ are easily calculated from the stationary distribution for the continuous-time Markov chain representing this queueing process.

Let $Q$ be the total number of available buffer spaces to be allocated to the $N - 1$ buffers (queues). Let $S$ be the total number of available servers to allocate to the $N$ stations. Then the most general version of our current optimization model is
\[
\begin{aligned}
\text{maximize} \quad & R(q, s, w), \\
\text{subject to} \quad & \sum_{j=2}^{N} q_j = Q, \\
& \sum_{j=1}^{N} s_j = S, \\
& \sum_{j=1}^{N} w_j = N, \\
& q_j \ge 0 \text{ and integer } (j = 2, 3, \ldots, N), \\
& s_j \ge 1 \text{ and integer } (j = 1, 2, \ldots, N), \\
& w_j > 0 \quad (j = 1, 2, \ldots, N),
\end{aligned}
\]

where $Q$, $S$, and $N$ are fixed constants, whereas $q$, $s$, and $w$ are decision vectors. The third constraint indicates that the sum of the expected service times is a fixed constant, where normalizing sets this constant equal to $N$, but this total workload can be divided as desired among the $N$ stations.

Given the values of $q$ and $s$, we use steepest ascent PARTAN (method of parallel tangents), developed by Buehler, Shah, and Kempthorne [1], to find the maximizing value of $w$. Enumeration is used to find the maximizing values of $q$ and $s$. Both reversibility results (e.g., see Yamazaki, Kawashima, and Sakasegawa [33]) and concavity results (e.g., see Meester and Shanthikumar [23]) reduce the number of cases that need to be considered through enumeration.

The next section reviews the cases where two of the three vectors are fixed at default values, so only one actually is a decision vector. Cases where at least two are decision vectors are considered in section 4.
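As a small illustration of the enumeration step, consider the special case $N = 2$ with exponential service times, where the Markov chain reduces to a birth-death process. The Python sketch below is not the authors' procedure. The function names `throughput_two_stations` and `best_server_allocation` are hypothetical, and $w$ is simply held fixed at $(1, 1)$ rather than optimized by PARTAN.

```python
def throughput_two_stations(s1, s2, q2, w1=1.0, w2=1.0):
    """Throughput of a two-station line with exponential service times.

    Illustrative assumption: the state k counts customers that have finished
    service at station 1 but not yet at station 2 (in service at station 2,
    waiting in the buffer, or held by a blocked server at station 1),
    so k = 0, 1, ..., s1 + q2 + s2.
    """
    mu1, mu2 = 1.0 / w1, 1.0 / w2  # service rate of each individual server
    top = s1 + q2 + s2

    def up(k):
        # Completion rate at station 1; blocked servers do not work.
        blocked = max(0, k - s2 - q2)
        return (s1 - blocked) * mu1

    def down(k):
        # Completion rate at station 2.
        return min(k, s2) * mu2

    # Birth-death balance equations: p[k] * down(k) = p[k - 1] * up(k - 1).
    p = [1.0]
    for k in range(1, top + 1):
        p.append(p[-1] * up(k - 1) / down(k))
    total = sum(p)

    # Throughput = long-run departure rate from station 2.
    return sum(pk * down(k) for k, pk in enumerate(p)) / total


def best_server_allocation(S, q2):
    """Enumerate all (s1, s2) with s1 + s2 = S and s1, s2 >= 1; keep the best."""
    candidates = [(s1, S - s1) for s1 in range(1, S)]
    return max(candidates, key=lambda s: throughput_two_stations(s[0], s[1], q2))


if __name__ == "__main__":
    for s in [(1, 3), (2, 2), (3, 1)]:
        print(s, throughput_two_stations(s[0], s[1], q2=0))
    print("best:", best_server_allocation(S=4, q2=0))
```

Under these assumptions, the allocations $(1, 3)$ and $(3, 1)$ give the same throughput, which is consistent with the reversibility property, so only one of each mirrored pair needs to be evaluated.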
3. Individual allocation of work, buffer space and servers
We begin by considering $w$ as the only decision vector, then $q$ alone, and finally just $s$.

Allocation of work to stations
Assume now that all $q_j$ are fixed and equal to the same constant $q$, and that all $s_j$ are fixed and equal to 1. Under the model of section 2, Hillier and Boling [8, 9] discovered empirically that the maximizing value of $w$ (call it $w^*$) consistently satisfies the following "bowl phenomenon."

The Bowl Phenomenon: For $j = 1, 2, \ldots, N$,

\[
w_j^* = w_{N+1-j}^*,
\]

with $w_j^* > w_{j+1}^*$ for stations $j$ in the first half of the line and $w_j^* < w_{j+1}^*$ for stations $j$ in the second half.
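The bowl shape can be checked mechanically for any candidate allocation. The following Python sketch is only an illustration: the function name `is_bowl_shaped` is hypothetical, the exact index ranges for the strict inequalities (strict decrease up to the center of the line, strict increase after it) are an assumption, and the sample vectors are made-up values rather than results from the paper.

```python
def is_bowl_shaped(w, tol=1e-9):
    """Return True if w = (w_1, ..., w_N) is symmetric and bowl shaped."""
    n = len(w)
    # Symmetry: w_j = w_{N+1-j} (0-based: w[j] equals w[n - 1 - j]).
    symmetric = all(abs(w[j] - w[n - 1 - j]) <= tol for j in range(n))
    # Strict decrease over the first half of the line (assumed index range).
    decreasing = all(w[j] > w[j + 1] + tol for j in range((n - 1) // 2))
    # Strict increase over the second half of the line (assumed index range).
    increasing = all(w[j] + tol < w[j + 1] for j in range(n // 2, n - 1))
    return symmetric and decreasing and increasing


if __name__ == "__main__":
    # Made-up allocations whose entries sum to N, matching the normalization.
    print(is_bowl_shaped([1.1, 0.8, 1.1]))
    print(is_bowl_shaped([1.05, 0.95, 0.95, 1.05]))
    print(is_bowl_shaped([1.0, 1.0, 1.0]))
```

For even $N$ the two center stations are equal by symmetry, so the strict inequalities are applied only on either side of that pair.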