Generalized Lindley-Type Recursive Representations for Multi-Server Tandem Queues with Blocking

WAI KIN VICTOR CHAN
Rensselaer Polytechnic Institute
________________________________________________________________________
Lindley's recursion is an explicit recursive equation that describes the relationship between consecutive waiting times in a single-stage, single-server queue. In this paper, we develop explicit recursive representations for multi-server tandem queues with blocking. We demonstrate the application of these recursive representations with simulations of large-scale tandem queueing networks. We compare the computational efficiency of these representations with two other popular discrete-event simulation approaches, namely, event scheduling and process interaction. Experimental results show that these representations are seven (or more) times faster than their counterparts based on the event-scheduling and process-interaction approaches.

Additional Key Words and Phrases: Lindley Recursion, Tandem Queue, Blocking, Fast Simulation
________________________________________________________________________

1. INTRODUCTION
One important equation for modeling queueing dynamics is the Lindley recursion [Lindley 1952]. This recursion, which models the waiting times of a single-server queue, has been widely used in analyzing queueing systems, for example to derive distributions, properties, bounds, and performance measures of the underlying systems. For instance, the stochastic convexity property of the waiting time with respect to the arrival and service parameters of a single-server queue can be readily established using the Lindley recursion [Shaked and Shanthikumar 1988]; the waiting time distribution under certain assumptions can also be derived using the Lindley recursion [Buzacott and Shanthikumar 1993]. Other applications of the Lindley recursion and its variations in performance evaluation, calculation of bounds for performance measures, and optimization can be found, among many others, in Glasserman [1991], Fu and Hu [1997], Baccelli and Bremaud [2003], and the references therein.
There have been efforts to extend the Lindley recursion to model multi-server queues. Kiefer and Wolfowitz [1954] describe an algorithmic recursive procedure (called the "KW procedure" in this paper) to analyze multi-server queues, for example to derive the limiting distribution of the queue size. Unlike the Lindley recursion, the KW procedure is not described by an explicit expression. In particular, to obtain the waiting time of a job in a queue with several parallel servers, one is required to algorithmically sort a list whose length equals the number of servers. Another recursion that models customer delay in a multi-server queue is given in Scheller-Wolf and Sigman [1997]. This recursion, while also requiring the sorting of a list of waiting times in a way similar to the KW procedure, allows the authors to establish conditions for finite moments of customer delay that are weaker than those obtained using the KW procedure. One of the main reasons for the variety of applications of the Lindley recursion is that it explicitly describes the relationship between successive waiting times using simple arithmetic operations, including only addition (subtraction) and maximum. In this paper,
Author address: W. K. V. Chan, Department of Decision Sciences and Engineering Systems, Rensselaer Polytechnic Institute, CII 5015, 110 Eighth Street, Troy, NY 12180; email:
[email protected]. © ACM, (YEAR). This is the author’s version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in PUBLICATION, {VOL#, ISS#, (DATE)} http://doi.acm.org/10.1145/{nnnnnn.nnnnnn}
explicit recursions using only simple arithmetic operations (i.e., addition, subtraction, maximum, and minimum) are called Lindley-type recursive representations (LRRs) to draw a distinction between explicit recursions and algorithmic recursive procedures such as the KW procedure.
Advantages of having an explicit recursive expression, as demonstrated by the Lindley recursion, motivated the development of an explicit recursion for multi-server queues by Krivulin [1994]. This recursion is an LRR because it consists of a set of explicit recursive equations that involve only minimum, maximum, and addition operations. However, the number of required arithmetic operations in this recursion is exponential: to compute the waiting time of a job in a queue with several parallel servers, this recursion needs an exponentially large number of minimum or maximum operations. The computational burden also grows as the job index increases, limiting its practical usage in, for instance, simulation studies that simulate thousands of jobs and dozens of servers. Despite its complexity, the recursion introduced by Krivulin [1994] is one of the first LRRs for modeling multi-server queues. It also inspires the development of our LRR, which requires only a number of simple arithmetic operations on the order of the number of servers for each job. This LRR maintains a list of sorted elements in a way similar to the KW procedure. However, compared with the KW procedure, this LRR has the important advantage of being in an explicit form, which allows us to extend it to new explicit LRRs for modeling the dynamics (such as finish times and waiting times) of tandem queueing networks with multiple servers in each stage and finite buffer spaces between stages.
Therefore, the first objective of this paper is to introduce an LRR for multi-server queues that uses only a number of simple arithmetic operations proportional to the number of servers. The second objective of this paper is to extend this LRR to model a class of more complex queueing systems: tandem queueing networks with multiple servers at each stage and limited buffer spaces between stages. To demonstrate the applications of the LRR, the last objective of this paper is to apply the LRR to fast simulations of tandem queueing networks.
Tandem queueing networks with finite buffer space for jobs are useful and popular tools in modeling, for example, manufacturing systems and computer and communication systems. A tandem queueing network is called a single-server tandem queue (STQ) when every stage has exactly one server. When there is at least one stage with more than one server, the tandem queueing network is called a multi-server tandem queue (MTQ). Recursions for STQs have been widely used in the literature. For example, Chen and Chen [1990] develop a fast simulation approach based on recursive relationships of system dynamics in an STQ. By exploiting these recursive relationships, this fast simulation approach avoids the scheduling of any event. As a result, it runs much faster than a traditional discrete-event simulation, for example, with a one-fourth reduction in runtime. Cheng [1995] uses a recursion to prove the reversibility property of STQs under general blocking. Liu and Buzacott [1992] use a recursion for closed STQs to show the reversibility property of closed STQs. Chan and Schruben [2009] model the recursive relationships of closed STQs as mathematical programming models and use those models to establish reversibility, symmetry, reverse-symmetry, and semi-reverse-symmetry properties.
Chen [1997] discusses the use of recursions for STQs in parallel simulation. Performance evaluation (such as deriving bounds for performance measures) and optimization (such as gradient estimation) can also be carried out using the recursions of STQs [Buzacott and Shanthikumar 1993, Fu and Hu 1997]. In fact, the applications of these recursions can be traced back to the 1970s, when Yamazaki and Sakasegawa [1975] reported the reversibility phenomenon of STQs with blocking.
While recursions of STQs are prevalent, explicit recursions for MTQs are not well developed. Nevertheless, there are some studies that utilize the recursive properties of the dynamics of MTQs to assist the analysis of these queueing systems. One of these studies is by Cheng [1997], who investigates the reversibility property of MTQs. The recursive procedure used in Cheng [1997] is an algorithmic recursive procedure similar to the KW procedure. Hunt and Foote [1995] describe a recursive algorithm for simulating queueing systems without blocking. This algorithm is based on both traditional discrete-event simulation and the recursive relationships used in Chen and Chen [1990]. This algorithm is efficient because it generalizes the recursive relationships in tandem queues to queueing networks and significantly reduces the number of scheduled events compared with a traditional discrete-event simulation. It is shown to be one to three times faster than traditional discrete-event simulations. All these publications and results are evidence of the importance of recursive relationships and equations in simulating and analyzing queueing systems. Therefore, there is a need to develop an explicit recursive representation of MTQs to facilitate both fast simulation and performance analysis of these systems.
In the present paper, we introduce LRRs for MTQs under various blocking schemes. These LRRs are explicit expressions that are easy to implement and can significantly speed up MTQ simulations. The resulting simulation models completely eliminate the need to schedule any event and might facilitate the development of theoretical distributions or performance measures of MTQs. As mentioned earlier, the last objective of this paper is to demonstrate the application of the LRRs to fast simulations of MTQs. By doing so, we are able to benchmark the computational efficiency of the LRRs against two traditional discrete-event simulation approaches: the process-interaction (PI) approach and the event-scheduling (ES) approach. Experimental results show that the runtimes of the LRR simulation models are less than thirteen percent of those of the PI and ES simulation models under study.
The rest of the paper is organized as follows. Section 2 defines MTQs and their blocking schemes. Section 3 reviews the KW procedure and existing explicit recursions for STQs. Based on this review, Section 4 introduces an LRR for single-stage multi-server queues whose size is linear in the number of servers. The main result of this paper is presented in Section 5, where the explicit LRRs for MTQs are developed. Section 6 illustrates the application of the LRRs to fast simulations of MTQs with blocking. Section 7 offers conclusions.

2. MULTI-SERVER TANDEM QUEUES WITH BLOCKING
A tandem queueing system (or, in short, tandem queue) is composed of multiple single-stage queues connected in series through intermediate buffers. The queueing systems considered in this paper are open multi-server tandem queues (MTQs). An MTQ has K consecutive stages, labeled 1, ..., K, each of which consists of c_k parallel servers and a finite storage space of size a_k for jobs (k = 1, ..., K). The first stage is assumed to have an infinite buffer space. Figure 1(a) gives an example of a 3-stage tandem queue. Each big rectangle in the figure represents the location of the servers at a stage and does not contain any buffer space. The small rectangles in the figure denote the buffer spaces; the number of small rectangles in front of a stage reflects the size of the buffer at that stage.
It should be noted that a_k already includes the spaces for the jobs that are currently being served by the servers. Hence, the condition a_k ≥ c_k is assumed.
Arriving jobs are processed first-come-first-served (FCFS) sequentially at Stages 1, 2, 3, ..., K, and depart the system from Stage K. Every job must be processed exactly once at each stage by any one of the servers. The i-th service initiated at Stage k requires S_{k,i} units of time to complete. Since the size of the intermediate buffers is finite, blocking may happen, e.g., when the number of jobs in the succeeding buffer reaches the buffer capacity. During the blocking period, the movement of jobs from the preceding stage to the succeeding stage is temporarily paused. A control policy is needed to coordinate the service process between stages. Common control policies considered in the literature include communication blocking, production blocking, kanban blocking, variations of kanban blocking, and general blocking [Onvural and Perros 1986, Buzacott 1989, Onvural 1990, Cheng 1993, Liberopoulos and Dallery 2000].
Under communication blocking, a server at Stage k starts processing a job whenever the following three conditions are satisfied: (C1) a job is available for processing, (C2) a server is available, and (C3) there is at least one empty space at the next stage. The maximum number of jobs (in queue and in service) allowed in a stage at any time is specified by the parameter a_k. Figure 1(a) gives an example with a_2 = 3, which already includes the spaces for jobs currently being served.
Under production blocking (also called manufacturing blocking), a server at Stage k starts processing a job whenever the following three conditions are satisfied: (P1) a job is available for processing, (P2) a server is available, and (P3) the server is not holding a finished job.
A more general blocking scheme (in terms of having more control parameters) is general blocking. It uses three parameters (or kanban cards) per stage to control, respectively, the maximum number of jobs in queue and in service (a_k), the maximum number of finished jobs allowed to be blocked at the stage (b_k), and the maximum total number of jobs allowed at the stage (including the jobs in queue, in service, and blocked; see Figure 1(b)). Under general blocking, a server at Stage k starts processing a job whenever the following three conditions are satisfied: (G1) a job is available for processing, (G2) a server is available, and (G3) there are strictly fewer than b_k finished jobs blocked at Stage k. Detailed reviews of these blocking schemes can be found in Chan and Schruben [2009] and the references therein.
Fig. 1. Different Blocking Schemes: (a) communication or production blocking (example with c_2 = 2 and a_2 = 3); (b) general blocking (example with a_2 = 2 and b_2 = 1).
3. BACKGROUND
Under a FCFS queueing discipline, an STQ is a first-in-first-out (FIFO) system, i.e., jobs depart the system in the order of arrival. This FIFO property facilitates the development of recursive equations for STQs. However, an MTQ is not a FIFO system (i.e., the i-th job to arrive at a stage is not necessarily the i-th job to depart from that stage) even though the jobs are still served in FCFS order. This is because the service time of each job is typically random, and a job that started earlier might require a service time so long that some
of the subsequently arriving jobs would have already finished their services and left the stage/system. This non-FIFO property is also called overtaking and is one of the difficulties in studying MTQs. In this section, we first define the notation and then review some of the existing recursive procedures and equations for single-stage multi-server queues and STQs.
The following notation will be used in this paper:
A_i : time of the i-th external arrival event (the i-th interarrival time is the interval between the (i-1)-th and the i-th external arrivals).
S_{k,i} : time interval needed for the i-th started service at Stage k (see Remark 1 below).
F_{k,i} : completion time of the i-th service event at Stage k (the corresponding service start time is F_{k,i} minus S_{k,i}).
D_{k,i,j} : time of the j-th scheduled departure event at Stage k when the i-th job departs, j = 1, ..., c_k (see Remark 2 below).
U_{k,i,j} : the maximum of the completion time of the i-th service event and the j-th scheduled departure event at Stage k, i.e., U_{k,i,j} = max(F_{k,i}, D_{k,i-1,j}) (see Remark 3 below).
The waiting time associated with a departure event at Stage k is the difference between the corresponding service start time and the job's arrival time at that stage.
Remark 1: For single-stage queues, the subscript k will be dropped.
Remark 2: Because there is no future-event list, the calculations of the event times do not need to follow the occurrence sequence of the events. Therefore, each job can see scheduled future departure events, which are calculated using Eq. (5) in Proposition 1 in Section 4. However, the j-th departure event scheduled to occur when the i-th job departs might not be the j-th actual departure event, because it could be overtaken by a later arriving job. For example, in Figure 2, when a job departs at time D_{k,i-1,1} (mark (a) in the figure), D_{k,i-1,3} is still only a scheduled departure time (mark (b)). A later arriving job (the (i+r-1)-th job) overtakes this scheduled departure (mark (c)) by finishing at time F_{k,i+r-1}, earlier than D_{k,i-1,3}; the actual departure event therefore occurs at that earlier time (mark (d)).
Remark 3: The variables U_{k,i,j} help locate the completion time of a service within the ordered list of scheduled departures and will be useful in developing the LRRs. In Figure 2, for example, the completion time of the i-th service is larger than D_{k,i-1,2}; as a result, U_{k,i,1} = U_{k,i,2} (mark (e)).
Fig. 2. Updating of the Scheduled Departure Events (job i+r-1 overtakes a scheduled departure, D_{k,i-1,3}).
Kiefer and Wolfowitz [1954] use an algorithmic recursive procedure to describe the dynamics of multi-server queues. To help draw a connection between the KW procedure and the LRRs introduced in this paper, we review the KW procedure here. Suppose that the first i-1 services have already been initiated and the i-th job has not started yet. Before the procedure starts, a sorted list is maintained whose j-th element is the waiting time for the i-th started job to wait for the j-th earliest available machine. The KW procedure includes three steps. First, the i-th started job is assumed to be served by the first available machine (with an arbitrary tie-breaking rule, such as using the available machine with the smallest index), i.e., its service time is added to the first element of the list, resulting in a temporary list. Second, it computes the remaining waiting times when the (i+1)-th job arrives by subtracting the interarrival time from each of the elements in the list, yielding a second temporary list. Finally, the elements of this list are sorted in increasing order with all negative elements replaced by zero, resulting in a new list of waiting times, whose j-th element is similarly the waiting time for the (i+1)-th started job to wait for the j-th earliest available machine.
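To make the three steps concrete, the following is a minimal C sketch of one KW update. The array layout, function name, and tie-breaking rule are illustrative assumptions, not the original formulation.

```c
/* One Kiefer-Wolfowitz update step.  w[0..c-1] is the sorted workload
 * vector (w[0] is the waiting time of the job about to start), s is that
 * job's service time, and t is the interarrival time to the next job. */
#include <stdlib.h>

static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

void kw_update(double *w, int c, double s, double t) {
    w[0] += s;                       /* job served by earliest available server */
    for (int j = 0; j < c; j++) {    /* time passes until the next arrival      */
        w[j] -= t;
        if (w[j] < 0.0) w[j] = 0.0;  /* negative remaining waits become zero    */
    }
    qsort(w, (size_t)c, sizeof(double), cmp_double);   /* re-sort ascending */
}
```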
As discussed in Section 1, recursions for STQs have been widely used in the literature to study properties or performance measures of STQs. The explicit recursions for STQs under communication blocking and production blocking are, respectively, shown below [Buzacott and Shanthikumar 1993]. The notation for the departure times in the following two equations is D_{k,i}, which drops the third index from the notation D_{k,i,j} defined at the beginning of this section, because these two equations are for STQs (i.e., c_k = 1).

Communication blocking:
  D_{k,i} = max( D_{k-1,i}, D_{k,i-1}, D_{k+1,i-a_{k+1}} ) + S_{k,i}                 (1)

Production blocking:
  D_{k,i} = max( max( D_{k-1,i}, D_{k,i-1} ) + S_{k,i}, D_{k+1,i-a_{k+1}} )          (2)

with the conventions D_{0,i} = A_i (external arrivals), D_{K+1,i} = 0 (no blocking after the last stage), and D_{k,i} = 0 for i ≤ 0.
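As an illustration of how these two recursions can be iterated, the following is a minimal C sketch under the reconstruction above. The boundary conventions (the caller presets D[0][i] = A_i and D[k][0] = 0) and all names are assumptions of this sketch.

```c
/* Iterate Eqs. (1)/(2) for an STQ.  D[k][i] is the departure time of the
 * i-th job from Stage k (1-based indices); S[k][i] is its service time;
 * a[k] is the buffer size of Stage k (including the in-service space). */
static double max2(double x, double y) { return x > y ? x : y; }

void stq_departures(double **D, double **S, const int *a,
                    int K, int n, int production_blocking) {
    for (int i = 1; i <= n; i++) {
        for (int k = 1; k <= K; k++) {
            double arrive = D[k - 1][i];        /* job i reaches Stage k           */
            double server = D[k][i - 1];        /* previous job has left Stage k   */
            double space  = 0.0;                /* room available at Stage k+1     */
            if (k < K && i - a[k + 1] >= 1)
                space = D[k + 1][i - a[k + 1]];
            if (production_blocking)            /* Eq. (2): may block after service  */
                D[k][i] = max2(max2(arrive, server) + S[k][i], space);
            else                                /* Eq. (1): space needed before start */
                D[k][i] = max2(max2(arrive, server), space) + S[k][i];
        }
    }
}
```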
These two recursions explicitly define the recursive relationships among successive departure times at consecutive stages of the STQ. For example, Eq. (1) says that the i-th started job at Stage k departs the stage S_{k,i} units of time after it starts its service, which occurs when the following three events have happened: (1) the i-th job departs from Stage k-1, (2) the (i-1)-th job departs from Stage k, and (3) the (i - a_{k+1})-th job departs from Stage k+1. These relationships hold for all jobs and stages.
The popularity of the STQ recursions, however, did not lead to prevalent applications of MTQ recursions, perhaps due to the complexity of the MTQ dynamics. Most of the studies that exploit the recursive relationships in MTQs make use of implicit algorithmic recursive procedures rather than explicit recursions similar to the Lindley recursion or Eqs. (1) and (2). In the subsequent sections, we will first develop explicit recursive equations (i.e., an LRR) for single-stage multi-server queues, similar to the Lindley recursion in that they can be expressed explicitly using simple arithmetic operations. The idea of the LRR is similar to that of the KW procedure, which is to maintain an ordered list of event times.
The key difference is the explicit expression of the LRR, which enables us to extend it to model the dynamics of MTQs.

4. LINDLEY-TYPE RECURSIONS FOR MULTI-SERVER QUEUES
The dynamics of a queueing system are the trajectories of the arrival, finish, and departure times. In this section, we introduce an LRR that requires 2c + 1 recursive equations, with only one minimum or maximum operator in each equation, to model the dynamics of a queue with c parallel servers. The key is to implement the sorting step of the KW procedure using simple arithmetic operations. The idea is to insert the new finish time (or waiting time) into the sorted list by repeatedly using the minimum and maximum operators. To illustrate, consider an example in which a new finish time is to be inserted into the sorted list of scheduled departure times. First, the smallest element of the list is the actual departure time and is therefore removed from the list. Second, the first element of the new list is the minimum of the new finish time and the smallest remaining element. Third, calculate the maximum of these same two values and carry it forward. Fourth, the second element of the new list is the minimum of this carried maximum and the next remaining element, and so on. Finally, the last element of the new list is simply the last carried maximum. The resulting list is again sorted and contains the new finish time in its correct position.
An initial sorted list of the first c departure times is needed. The finish times of the first c jobs sorted in ascending order give these departure times; therefore, the first c ordered finish times serve as the initial conditions for the LRR. In particular, the initial conditions are D_{c,j} = the j-th minimum of {F_1, ..., F_c}, j = 1, ..., c, where F_i = A_i + S_i for i = 1, ..., c. The next proposition describes the LRR for a multi-server queue. The symbols "∨" and "∧" represent the maximum and minimum operators, respectively.

PROPOSITION 1. Given a set of arrival times {A_i} and service times {S_i}, the dynamics of a queue with c parallel servers can be determined by the following LRR:

  F_i = (A_i ∨ D_{i-1,1}) + S_i,                          (3)
  U_{i,j} = F_i ∨ D_{i-1,j},         j = 1, ..., c,       (4)
  D_{i,j} = U_{i,j} ∧ D_{i-1,j+1},   j = 1, ..., c,       (5)

for i > c, where D_{i-1,c+1} = ∞ by convention.
PROOF. The start time of the i-th job's service is the maximum of the job's arrival time A_i and the earliest time a server is available to serve this job. If D_{i-1,1} is the earliest time at which a server is available to serve the i-th job, then Eq. (3) gives the finish time of the i-th job. Therefore, we show that, by updating the event times recursively using Eqs. (4) and (5), D_{i-1,1} is indeed the earliest time at which a server is available to serve the i-th job. The proof is by induction. The first c jobs can always be served immediately by one of the c servers, so the induction starts with the (c+1)-th job. When i = c+1, the initial conditions ensure that D_{c,1} is the first actual departure time, which is also the earliest time at which a server is available to serve the (c+1)-th job. On the other hand, the remaining D_{c,j}'s, j = 2, ..., c, constitute an ordered list of the other scheduled departure times. The finish time of the (c+1)-th job, F_{c+1}, calculated by Eq. (3), could fall in any position within this list. We need to show that Eqs. (4) and (5) correctly yield a new ordered list of departure times after including F_{c+1}, i.e., D_{c+1,1} ≤ D_{c+1,2} ≤ ... ≤ D_{c+1,c}. Using Eq. (4) repeatedly and noting that F_{c+1} ≥ D_{c,1}, the values U_{c+1,1}, ..., U_{c+1,c} equal F_{c+1} up to the position at which F_{c+1} falls in the ordered list and equal the corresponding D_{c,j} afterwards. Using Eq. (5) and this expression for U_{c+1,j}, each D_{c+1,j} equals D_{c,j+1} before that position, equals F_{c+1} at that position, and equals D_{c,j} after it; hence the new list is again ordered and D_{c+1,1} is the earliest time at which a server becomes available for the next job. Now assume the claim is true for some i ≥ c+1, i.e., D_{i,1} ≤ D_{i,2} ≤ ... ≤ D_{i,c} is the ordered list of scheduled departure times seen by the (i+1)-th job. This is the same situation as the initial conditions; as a result, we can repeat the same argument as in the base case with the subscripts c and c+1 replaced by i and i+1, respectively. This completes the proof.
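As a concrete illustration, the following is a minimal C sketch of Proposition 1 as reconstructed above. The function name, array layout, and in-place update of the departure list are implementation choices of this sketch and are not taken from the paper.

```c
/* Single-stage multi-server LRR (Eqs. (3)-(5)).  A[i], S[i]: arrival and
 * service times of job i (0-based); F[i] is filled with the finish times.
 * D is a caller-supplied work array of length c; assumes n >= c. */
#include <float.h>

static double dmax(double a, double b) { return a > b ? a : b; }
static double dmin(double a, double b) { return a < b ? a : b; }

void ggc_lrr(const double *A, const double *S, double *F,
             int n, int c, double *D) {
    for (int i = 0; i < c; i++) {          /* first c jobs start on arrival  */
        F[i] = A[i] + S[i];
        D[i] = F[i];
    }
    for (int p = 1; p < c; p++)            /* sort initial departure list    */
        for (int q = p; q > 0 && D[q] < D[q - 1]; q--) {
            double tmp = D[q]; D[q] = D[q - 1]; D[q - 1] = tmp;
        }
    for (int i = c; i < n; i++) {
        F[i] = dmax(A[i], D[0]) + S[i];            /* Eq. (3)                */
        double u = F[i];
        for (int j = 0; j < c; j++) {              /* Eqs. (4)-(5): insert F */
            double next = (j + 1 < c) ? D[j + 1] : DBL_MAX;
            u = dmax(u, D[j]);
            D[j] = dmin(u, next);
        }
    }
}
```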
The computational complexity of this LRR is as follows. For each job, Proposition 1 determines the corresponding finish time by using 2c + 1 equations (Eqs. (3)-(5)). As for the number of operations, Eq. (3) needs one maximum and one addition, and each instance of Eqs. (4) and (5) involves only one minimum or maximum operation. Therefore, the computational complexity is on the order of c for each job. As for the memory usage, the number of variables used by Proposition 1 to record the event times is also on the order of c, namely F_i, U_{i,1}, ..., U_{i,c}, and D_{i,1}, ..., D_{i,c}. In summary, both computational time and memory usage are on the order of c for each job (or on the order of nc for n jobs).
The waiting time is one of the main performance measures of interest in many situations. We relate the above LRR to the Lindley recursion by changing the variables to waiting times. To accomplish this, we first define two new variables: the difference between the i-th job's arrival time and the j-th departure time scheduled to occur when the i-th job departs, and the difference between the i-th job's arrival time and the maximum of that scheduled departure time and the i-th finish time. Replacing the variables F_i, U_{i,j}, and D_{i,j} by these differences, and using the identity that the waiting time equals the finish time minus the service time minus the arrival time, results in the following waiting-time-based LRR for multi-server queues.
PROPOSITION 2. Given a set of interarrival times and service times, and initial conditions obtained from the first c sorted finish times, the waiting times of a queue with c parallel servers can be determined by an LRR obtained from Eqs. (3)-(5) by replacing each event time with its difference from the corresponding arrival time; the resulting equations again use only addition, subtraction, maximum, and minimum operations.
Observe that in Proposition 1, the U_{i,j}'s are auxiliary variables (event times), which can be eliminated by plugging Eq. (4) into Eq. (5); this will not affect the values of the other event times. Eliminating the auxiliary variables reduces the number of recursive equations in the LRR at the cost of increasing the number of minimum or maximum operators in each remaining recursive equation. Similarly, in the above waiting-time-based LRR the auxiliary variables can be eliminated by plugging them into the expressions for the remaining variables. The following example discusses the resulting LRR.
EXAMPLE 1. Eliminating the auxiliary variables in the waiting-time-based LRR yields a compact set of recursive equations for the waiting times of a multi-server queue; a sketch of the two-server case is given below.
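To make the change of variables concrete, the following is a minimal sketch of a waiting-time-based recursion for a two-server queue, written under the reconstructed form of Proposition 1 above. The interarrival times t_i = A_i - A_{i-1} and the differences V_{i,j} = D_{i,j} - A_i are symbols introduced here for illustration and are not necessarily the paper's original notation.

```latex
% Waiting-time form of the two-server LRR (a sketch under the assumptions above).
\begin{align*}
  W_i     &= \max\bigl(0,\; V_{i-1,1} - t_i\bigr)          && \text{waiting time of the $i$-th job}\\
  V_{i,1} &= \min\bigl(W_i + S_i,\; V_{i-1,2} - t_i\bigr)  && \text{earlier server-release offset}\\
  V_{i,2} &= \max\bigl(W_i + S_i,\; V_{i-1,2} - t_i\bigr)  && \text{later server-release offset}
\end{align*}
% With a single server the last two lines collapse to V_{i,1} = W_i + S_i,
% and the first line becomes Lindley's recursion W_{i+1} = (W_i + S_i - t_{i+1})^+.
```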
5. LINDLEY-TYPE RECURSIVE REPRESENTATIONS FOR MULTI-SERVER TANDEM QUEUES
In this section, we introduce LRRs for modeling the dynamics of MTQs. The derivation of these LRRs requires two steps. The first step is to establish the recursions for the dynamics (i.e., the departure times) within a stage. The second step is to connect the recursions among stages based on their actual departure times. It should be noted that the departure times obtained in the first step are the earliest departure times at each stage. Under communication blocking they are also the actual departure times for jobs to move
to downstream stages; as a result, jobs can always depart to downstream stages upon completion. However, under production blocking or general blocking, these earliest departure times might not be the actual departure times for jobs to move to downstream stages, because these two blocking schemes in general do not require an empty space at the next stage for a service to start; as a result, a job might be held at the current stage upon completion if the next stage is full.
Similar to the LRR for a single-stage multi-server queue, the LRR for an MTQ also assumes that the first c_k finish times at each Stage k are sorted in ascending order to serve as the initial conditions, i.e., D_{k,c_k,j} is the j-th minimum of {F_{k,1}, ..., F_{k,c_k}}, j = 1, ..., c_k.
The first step uses the results in Proposition 1. In particular, Eqs. (4) and (5) are used to determine an ordered list of the scheduled departure times D_{k,i,1} ≤ D_{k,i,2} ≤ ... ≤ D_{k,i,c_k} as seen by the i-th job when it departs (with D_{k,i,1} being the actual departure time) at Stage k. This results in the last two recursive equations in each of the three LRRs given in the following.
Given the earliest departure times D_{k,i,1} at all stages, the second step is to calculate the completion times of jobs based on these departure times, which determine when the services can start. Since the completion time is equal to the service initiation time plus the service time, determining the completion times is equivalent to finding the service initiation times. As discussed in Section 2, for communication blocking, the three conditions for the i-th job at Stage k to initiate its service are: (1) the i-th job has arrived at Stage k from Stage k-1; (2) at least one server is available at Stage k; and (3) at least one space is available at Stage k+1. The earliest time at which the i-th job arrives at Stage k from Stage k-1 is the corresponding earliest departure time at Stage k-1. The earliest time at which at least one server is available at Stage k is the earliest scheduled departure time at Stage k seen by the previous job, i.e., D_{k,i-1,1}. The earliest time of the occurrence of the third condition is an earlier departure time at Stage k+1, with the job index shifted by the buffer size a_{k+1}. The maximum of these three departure times gives the service initiation time of the i-th job at Stage k. The LRR for an MTQ under communication blocking (LRR-MTQ-C) is given below.
LRR-MTQ-C: Given a set of external arrival times and service times, the dynamics of an MTQ under communication blocking can be determined by an LRR of the form of Eqs. (6)-(8): Eq. (6) sets the finish time F_{k,i} to the service initiation time (the maximum of the three departure times identified above) plus the service time S_{k,i}, and Eqs. (7) and (8) update the ordered list of scheduled departures D_{k,i,j} at Stage k exactly as Eqs. (4) and (5) do in Proposition 1, for every stage k = 1, ..., K and every job i.
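As an illustration, the following minimal C sketch combines the service-start rule of Eq. (6) with the list update of Eqs. (7) and (8) for a single stage. How the three input times are indexed across stages follows the paper's equations and is not reproduced here, so the function, its name, and its arguments are illustrative assumptions.

```c
/* One Stage-k update under communication blocking.  "arrive", "server_free",
 * and "space" are the three condition times identified in the text; D is
 * Stage k's ordered list of scheduled departures (length c).  Returns the
 * new earliest scheduled departure D[0]. */
#include <float.h>

static double cmax(double x, double y) { return x > y ? x : y; }
static double cmin(double x, double y) { return x < y ? x : y; }

double stage_update_comm(double *D, int c, double arrive,
                         double server_free, double space,
                         double service_time) {
    double F = cmax(arrive, cmax(server_free, space)) + service_time; /* Eq. (6) */
    double u = F;
    for (int j = 0; j < c; j++) {                    /* Eqs. (7)-(8): insert F   */
        double next = (j + 1 < c) ? D[j + 1] : DBL_MAX;
        u = cmax(u, D[j]);
        D[j] = cmin(u, next);
    }
    return D[0];
}
```

Under communication blocking, the returned value is also an actual departure time, since a finished job can always move downstream immediately.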
For the LRRs of production blocking and general blocking, the first step of the derivation is similar to that of communication blocking. Also, in the second step, the
first two conditions for a service to start are the same as those for communication blocking. The main difference is the third condition, which is discussed in the following.
For production blocking, this third condition is that the server is not holding a finished job; this condition becomes true when a blocked job at Stage k is transferred to Stage k+1, which occurs under one of the following possibilities: (1) a job departs from Stage k+1 to Stage k+2 when Stage k+2 has available spaces; (2) a job leaves Stage k+2 for Stage k+3 when Stage k+3 has available spaces and both Stages k+1 and k+2 are blocked, triggering a job to move from Stage k+1 to Stage k+2 and consequently another job from Stage k to Stage k+1; ...; (K-k) a job departs the system (from Stage K) when all Stages from k+1 to K are blocked, triggering a series of job departures from Stage K-1 to Stage K, ..., and finally from Stage k to Stage k+1. The earliest times of these possibilities are given by departure times at Stages k+1, k+2, ..., K with appropriately shifted job indices. As a result, the maximum of all these event times gives the earliest time at which a space is available at Stage k+1. Taking the maximum among these departure times and the departure times obtained from the other two conditions yields the service initiation time of the i-th job at Stage k. The resulting LRR for an MTQ under production blocking (LRR-MTQ-P) is given below.
LRR-MTQ-P: Given a set of external arrival times and service times, the dynamics of an MTQ under production blocking can be determined by an LRR of the form of Eqs. (9)-(11): Eq. (9) sets the finish time to the service initiation time identified above plus the service time, and Eqs. (10) and (11) update the ordered list of scheduled departures at each stage as in Proposition 1, for every stage k and every job i.
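For comparison with the communication-blocking sketch above, the following sketch shows how the same list update can be combined with production-blocking dynamics, in which a finished job may be held at the stage. It is an operational illustration, not the paper's Eqs. (9)-(11) verbatim.

```c
/* One Stage-k update under production blocking.  The service may start
 * regardless of downstream space (conditions (P1)-(P3)), but the finished
 * job leaves only once "space" frees up downstream (the cascaded time
 * described in the text).  server_free should be the stage's current
 * earliest scheduled departure D[0], since a server is released only when
 * the job it holds actually departs. */
#include <float.h>

static double pmax(double x, double y) { return x > y ? x : y; }
static double pmin(double x, double y) { return x < y ? x : y; }

double stage_update_prod(double *D, int c, double arrive,
                         double server_free, double space,
                         double service_time) {
    double finish = pmax(arrive, server_free) + service_time;
    double depart = pmax(finish, space);          /* job may be held (blocked)   */
    double u = depart;
    for (int j = 0; j < c; j++) {                 /* insert into the ordered list */
        double next = (j + 1 < c) ? D[j + 1] : DBL_MAX;
        u = pmax(u, D[j]);
        D[j] = pmin(u, next);
    }
    return D[0];
}
```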
The LRR for general blocking can be developed similarly to the LRR for production blocking, except that the third condition for the i-th job at Stage k to initiate its service now becomes that there are strictly fewer than b_k finished jobs blocked at Stage k (b_k being a control parameter distinct from a_k). This condition also becomes true under one of the possibilities described in the production blocking case. However, the earliest times at which these possibilities can occur are now departure times at Stages k+1, k+2, ..., K with job indices shifted by the total number of possible jobs between the stages. Similarly, taking the maximum among these departure times and the departure times obtained from the other two conditions yields the service initiation time of the i-th job at Stage k. The resulting LRR for an MTQ under general blocking (LRR-MTQ-G) is given in the following.
LRR-MTQ-G: Given a set of external arrival times and service times, the dynamics of an MTQ under general blocking can be determined by an LRR of the form of Eqs. (12)-(14): Eq. (12) sets the finish time to the service initiation time identified above plus the service time, and Eqs. (13) and (14) update the ordered list of scheduled departures at each stage as in Proposition 1, for every stage k and every job i.
The computational complexity of these LRRs is as follows. We focus on LRR-MTQ-G as it has the highest complexity among the three LRRs. First, for each job and each stage, the complexity of Eq. (12) is proportional to the number of downstream stages, and the complexity of Eqs. (13) and (14) is proportional to the number of servers at the stage. When all stages are considered, these complexities become O(K^2) and O(K c_max) per job, respectively, where c_max is the maximum of the numbers of servers among all stages. Therefore, the computational complexity of LRR-MTQ-G is O(nK(K + c_max)) for n jobs.

6. APPLICATION
Explicit recursions for discrete-event dynamic systems or queueing systems have many applications; some of them are discussed in Section 1. The applications of explicit recursions can be broadened by adopting techniques in max-plus algebra and linear system theory, which are areas with a wide spectrum of applications. Indeed, many discrete-event dynamic systems exhibit recursive or periodic behavior involving a set of repeated activities, which can be described by a set of max-plus equations and characterized by their eigenvalues and eigenvectors. This provides an opportunity for performance evaluation of discrete-event dynamic systems (see details in Cohen et al. [1989], Baccelli et al. [1992], and Cohen et al. [1999]). The LRRs developed in Section 5 represent the recursive relationships in MTQs using max-plus equations; therefore, techniques in max-plus algebra and linear system theory could be applied to these LRRs to broaden their applications.
In this section, we illustrate the application of these LRRs using fast simulations. The goal is to examine the computational efficiency of these LRRs in simulating MTQs. Three MTQs of different sizes are simulated. The first one is a small-scale 5-stage MTQ with a large number of simulated jobs ranging from 0.1M (10^5) to 1B (10^9). The second one is a moderate-scale 30-stage MTQ with the number of simulated jobs ranging from 10^5 to 10^8. The last one is a large-scale 100-stage MTQ with the number of simulated jobs ranging from 10^5 up to 10^8. Testing the performance of these LRRs in handling such a large number of jobs is necessary for large-scale simulation studies, e.g., a large-scale design of experiments involving hundreds of simulations and tens of replications for each simulation, with each replication requiring hundreds of thousands of jobs.
Production blocking is assumed in all the simulations presented in this section. The LRR-MTQ-P models of the three MTQ systems were implemented in C. These LRR models were benchmarked against two traditional discrete-event simulation approaches: process interaction (PI) and event scheduling (ES). All experiments were executed on a desktop
computer equipped with an Intel quad-core 3GHz CPU and 8GB of RAM, running the 64-bit Windows Vista operating system. The PI models of the 5-stage and 30-stage MTQs were implemented using the simulation package Arena, version 11 [Kelton et al. 2007]. The ES models of all three MTQs were created using the simulation package SIGMA [Schruben and Schruben 2001]. One useful feature of SIGMA is that it can automatically translate simulation models into C code; as a result, users can fully exploit the efficiency of C code as opposed to other high-level programming languages. Our ES models for the three MTQs were also translated into C code and executed in standalone mode.
Different software packages use different methods to trace and record statistical information, resulting in variations in their performance. To ensure a fair comparison of the computational efficiency of the three methods, all tracing and recording functionalities were disabled in all simulation models. Specifically, in the PI models, all animation features were turned off and the models were run in the "Batch Run (No Animation)" mode. Moreover, no statistics were collected, i.e., all the "Report Statistics" boxes were left unchecked and statistics output files were also disabled. In the ES models, none of the events or state variables were traced and no statistical output files were generated. Similarly, no statistics were traced in the LRR models.
All the simulation models assume an infinite number of jobs in front of the first stage, which is a useful practice in estimating the system throughput [Buzacott and Shanthikumar 1993]. As a result, the arrival process (or arrival events) was not included in these models. Nevertheless, the arrival process can be treated as a virtual service stage with a single server and the original interarrival times as the virtual service times (see, e.g., Chan [2005]). Therefore, this assumption does not affect the generality of the studies.
Because the PI approach stores all existing entities in memory, the number of entities allowed at runtime is constrained by the physical memory of the computer. Therefore, to mimic the infinite supply of jobs at the first stage, we circulate the jobs (entities) from the last stage back to the first stage. That is, a certain number of entities (greater than the maximum number of jobs that can be inside the system) are created at time zero and wait in front of Stage 1. They are then processed stage by stage until the last stage, where they are routed back to Stage 1 and treated as new jobs. Since the maximum number of jobs allowed inside the system (excluding those in front of Stage 1) is finite (i.e., bounded by the total buffer capacity), this circulation of jobs mimics the original tandem queue with an infinite number of jobs waiting in front of Stage 1.
The ES models were created using the so-called resource-driven approach. Using this modeling approach, we save a significant amount of memory and computational time in tracking and handling individual jobs (see Schruben and Roeder [2003] for details about resource-driven simulations). In addition, to further speed up these models, the arrival event was not included in each model. This is done by using a single integer variable to represent the number of jobs in front of Stage 1. This variable is initialized at a value larger than the total buffer capacity and is increased by one whenever a job leaves the system from the last stage. This ensures that there are always jobs to be processed by the first stage.
The LRR models do not model the jobs as entities or events; as a result, there is no limitation on the number of entities or events in the LRR models. Moreover, setting an infinite number of jobs in front of Stage 1 is as simple as setting all the external arrival times in the equations to zero or simply removing them from the equations.
Table 1 gives the configurations of the 5-stage MTQ. For simplicity, all service rates are set to 1. Table 2 presents the configurations of the 30-stage MTQ. All parameters of the 30-stage MTQ were randomly generated.

Table 1. Configurations of a 5-Stage MTQ.
  # of servers                     = {3, 2, 1, 2, 2}
  Buffer size (Stages 2-5)         = {3, 2, 4, 3}
  Mean service time (Exponential)  = {1, 1, 1, 1, 1}

Table 2. Configurations of a 30-Stage MTQ.
  # of servers                     = {6, 9, 1, 6, 9, 2, 1, 7, 5, 5, 2, 7, 8, 3, 6, 6, 4, 5, 3, 7, 9, 6, 4, 2, 8, 9, 9, 7, 8, 3}
  Buffer size (Stages 2-30)        = {9, 3, 8, 10, 2, 2, 8, 6, 8, 2, 10, 9, 4, 9, 6, 4, 8, 6, 8, 10, 8, 5, 4, 10, 12, 11, 11, 11, 3}
  Mean service time (Exponential)  = {7.59, 0.48, 9.41, 7.49, 8.50, 9.48, 2.40, 7.76, 7.05, 4.66, 7.53, 1.25, 7.65, 1.02, 6.65, 4.17, 6.47, 7.04, 1.22, 7.65, 6.84, 2.46, 0.38, 3.52, 0.72, 8.78, 2.38, 4.51, 7.33, 7.12}
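For concreteness, the Table 1 parameters can be written out directly as inputs to an LRR-based simulator; the following snippet and its exponential sampler are illustrative and not taken from the paper's implementation.

```c
/* The 5-stage configuration of Table 1 as plain C arrays, with a tiny
 * exponential sampler for the service times.  Illustrative only. */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

static double exp_sample(double mean) {
    double u = (rand() + 1.0) / ((double)RAND_MAX + 2.0);  /* u in (0,1) */
    return -mean * log(u);
}

int main(void) {
    const int    K            = 5;
    const int    servers[5]   = {3, 2, 1, 2, 2};
    const int    buffers[5]   = {0, 3, 2, 4, 3};   /* Stage 1 is infinite; 0 is a placeholder */
    const double mean_serv[5] = {1.0, 1.0, 1.0, 1.0, 1.0};
    for (int k = 0; k < K; k++)
        printf("stage %d: c=%d a=%d sample S=%.3f\n",
               k + 1, servers[k], buffers[k], exp_sample(mean_serv[k]));
    return 0;
}
```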
Although a more thorough experimental design on the parameters, such as different service rates or distributions, could be conducted to examine the effect of the degree of congestion, the runtime of the LRR models is not sensitive to these parameters because the number of equations is unchanged as long as the number of simulated jobs is fixed, regardless of how frequently jobs are blocked and/or how long they stay in queue or in service.
Figures 3(a)-(b) give the log-log plots of the runtimes of the three simulation models for the 5-stage MTQ and 30-stage MTQ, respectively. Tables 3 and 4 detail how these runtimes change with the number of simulated jobs, row by row, from 0.1M to 1B jobs. Each entry in the second, third, and fourth columns is the average runtime over five replications using different random streams; the 95% confidence intervals are also included. Experiments that ran longer than 10,000 seconds were manually terminated; this situation happened only to the PI and ES models (the entries marked "> 10,000" in the tables). The LRR model can simulate 10^9 jobs in approximately 660 seconds. The last two columns of these two tables are the runtime ratios of LRR to PI and of LRR to ES, respectively. It can be seen that the runtimes of the LRR model are only about 10% or less of those of the PI and ES models. The results also suggest a linear relationship between the runtime and the number of simulated jobs in all three models. Such a linear relationship is consistent with the literature [Chen and Chen 1990, Hunt and Foote 1995].
For validation purposes, we also performed two additional experiments for the 5-stage system, in which code was added to both the LRR and ES models in order to collect and compare waiting time statistics. A simulation experiment consisting of 10 replications, each simulating 10K jobs, resulted in the following approximate 95% confidence intervals for Stages 2-5 (in vector notation): [3.246 ± 0.008, 2.058 ± 0.005, 0.246 ± 0.001, 0.198 ± 0.002] for the LRR model and [3.227 ± 0.018, 2.040 ± 0.013, 0.248 ± 0.006, 0.203 ± 0.005] for the ES model. The waiting time statistics for Stage 1 were not collected because there is an infinite number of jobs in front of that stage.
Fig. 3. (a) Runtimes of 5-stage MTQ (log-log); (b) Runtimes of 30-stage MTQ (log-log).
Table 3. Runtimes of the Three Simulation Models (seconds) (5-Stage MTQ).
  Jobs (millions)   PI               ES               LRR             Ratio LRR/PI   Ratio LRR/ES
  0.1               2.2 ± 0.73       1.6 ± 0.18       0.2 ± 0.15      9.09%          12.50%
  1                 23.6 ± 0.09      15.4 ± 0.18      0.8 ± 0.15      3.39%          5.19%
  10                230.8 ± 0.91     157.8 ± 0.27     7.6 ± 0.18      3.38%          4.94%
  100               2305.8 ± 1.17    1579.2 ± 0.48    65.4 ± 0.18     2.83%          4.13%
  1,000             > 10,000         > 10,000         659.6 ± 0.29    N/A            N/A

Table 4. Runtimes of the Three Simulation Models (seconds) (30-Stage MTQ).
  Jobs (millions)   PI               ES               LRR             Ratio LRR/PI   Ratio LRR/ES
  0.1               13.4 ± 0.18      8.4 ± 0.18       0.7 ± 0.25      5.22%          8.33%
  1                 132.6 ± 0.19     82.8 ± 0.15      7.6 ± 0.27      5.73%          9.18%
  10                1333.2 ± 0.43    828.8 ± 0.63     74.9 ± 0.16     5.62%          9.04%
  100               > 10,000         8284.6 ± 11.02   749.3 ± 0.25    N/A            9.04%
  1,000             > 10,000         > 10,000         7509.6 ± 1.29   N/A            N/A

Next, we focus our study on the large-scale 100-stage MTQ using only the ES and LRR models. For simplicity, all 100 stages of this MTQ have the same parameters, i.e., each stage has 3 servers and 5 buffer spaces, and all service times follow an exponential distribution with mean 1. Figure 4 gives a pictorial view of the runtimes of the two models, with the details provided in Table 5. The runtime ratios of LRR to ES are around 13%.
Fig. 4. Runtimes of 100-stage MTQ.
Table 5. Runtimes of the Two Simulation Models (100-Stage MTQ).
  Jobs (millions)   ES               LRR              Ratio LRR/ES
  0.1               31.2 ± 0.15      4.2 ± 0.15       13.46%
  1                 314.8 ± 0.27     42.4 ± 0.18      13.47%
  10                3154.2 ± 1.38    422.6 ± 0.29     13.40%
  100               > 10,000         4245.6 ± 2.63    N/A
The future-event list is a key element in most traditional discrete-event simulation models. The processing of the future-event list, if not intelligently programmed, can be a computational burden (e.g., in scheduling events). The LRR models, on the other hand, completely eliminate the need for a future-event list. In addition, they have a low computational complexity that grows only linearly in the number of simulated jobs and is insensitive to the traffic intensity of the queueing systems. The LRR models are also easy to scale up; for example, adding or removing stages only requires changing the number of stages (and the associated stage parameters) in the recursions.
Remark 4: The literature has reported that some ES models can outperform their respective PI models [Schruben and Roeder 2003, Schruben 2009]. Our experimental results support this observation. However, a more general conclusion on the relative performance of ES-based and PI-based models would require a thorough comparison, such as using various kinds of languages or packages to create the two types of models; such a comparison, however, is outside the scope of this paper.

7. CONCLUSION
This paper presented explicit recursive representations for multi-server tandem queueing systems with blocking. All the job-job, job-server, and stage-stage interactions in an MTQ can be described using only three sets of recursive equations. These LRRs are fully scalable and insensitive to the traffic intensity. They are also easy to implement; the resulting simulation model simply loops through the three sets of recursions once per simulated job, which can be written in less than twenty lines of code, plus some code for initialization.
The applications of explicit recursive equations of queueing systems were discussed in Sections 1 and 6 and the cited references. The application to fast simulations was demonstrated in this paper. The LRR models were benchmarked against the ES and PI models of 5-stage, 30-stage, and 100-stage MTQs. The LRR models were shown to be more efficient than the other two types of models, with a runtime ratio ranging from 3% to 13%, which is equivalent to being roughly 30 to 7 times faster in execution time. The reasons the LRR models outperformed these two types of models were discussed in Section 6. What should be noted again here is the fact that the LRR models do not model each job as an entity and do not require a future-event list (event calendar), thus saving a significant amount of time in handling entities and processing events, both of which are required in the PI and ES models.

ACKNOWLEDGMENTS
The author wishes to thank Professor Lee Schruben for his support on some of the work reported here. This research is partially supported by the National Science Foundation through grant CMMI-0644959.

REFERENCES
BACCELLI, F. AND BREMAUD, P. 2003. Elements of Queueing Theory: Palm Martingale Calculus and Stochastic Recurrences. 2nd ed., Springer Verlag, New York.
BACCELLI, F., COHEN, G., OLSDER, G. J., AND QUADRAT, J. P. 1992. Synchronization and Linearity: An Algebra for Discrete-Event Systems. Wiley, New York.
BUZACOTT, J. A. AND SHANTHIKUMAR, J. G. 1993. Stochastic Models of Manufacturing Systems. Prentice Hall, New Jersey.
CHAN, W. K. V. 2005. Mathematical Programming Representations of Discrete-Event System Dynamics. Ph.D. dissertation, Industrial Engineering and Operations Research Department, University of California, Berkeley.
CHAN, W. K. V. AND SCHRUBEN, L. W. 2008. Optimization models of discrete-event system dynamics. Operations Research 56, 1218-1237.
CHAN, W. K. V. AND SCHRUBEN, L. W. 2009. Mathematical programming models of closed tandem queueing networks. ACM Transactions on Modeling and Computer Simulation (to appear).
CHEN, L. 1997. Parallel simulation by multi-instruction, longest-path algorithms. Queueing Systems 27, 1, 37-54.
CHEN, L. AND CHEN, C.-L. 1990. A fast simulation approach for tandem queueing systems. In Proceedings of the 1990 Winter Simulation Conference, New Orleans, LA. Institute of Electrical and Electronics Engineers, Piscataway, New Jersey, 539-546.
CHENG, D. W. 1995. Line reversibility of tandem queues with general blocking. Management Science 41, 5, 864-873.
CHENG, D. W. 1997. Line reversibility of multiserver systems. Probability in the Engineering and Informational Sciences 11, 177-188.
CHENG, D. W. AND YAO, D. D. 1993. Tandem queues with general blocking: a unified model and comparison results. Discrete Event Dynamic Systems 2, 3-4, 207-234.
COHEN, G., GAUBERT, S., AND QUADRAT, J. P. 1999. Max-plus algebra and system theory: where we are and where to go now. Annual Reviews in Control 23, 207-219.
COHEN, G., MOLLER, P., QUADRAT, J. P., AND VIOT, M. 1989. Algebraic tools for the performance evaluation of discrete event systems. Proceedings of the IEEE 77, 1, 39-85.
FU, M. AND HU, J. Q. 1997. Conditional Monte Carlo: Gradient Estimation and Optimization Applications. Kluwer Academic Publishers, Boston.
GLASSERMAN, P. 1991. Gradient Estimation via Perturbation Analysis. Kluwer Academic Publishers, Boston.
HUNT, C. S. AND FOOTE, B. L. 1995. Fast simulation of open queueing systems. Simulation 65, 3, 183-190.
KELTON, W. D., SADOWSKI, R. P., AND STURROCK, D. T. 2007. Simulation with Arena, 4th ed. McGraw-Hill, New York, NY.
KIEFER, J. AND WOLFOWITZ, J. 1954. On the theory of queues with many servers. Bulletin of the American Mathematical Society 60, 1, 35-35.
KRIVULIN, N. K. 1994. A recursive equations based representation of the G/G/M queue. Applied Mathematics Letters 7, 3, 73-77.
LIBEROPOULOS, G. AND DALLERY, Y. 2000. A unified framework for pull control mechanisms in multi-stage manufacturing systems. Annals of Operations Research 93, 325-355.
LINDLEY, D. V. 1952. The theory of queues with a single server. Proceedings of the Cambridge Philosophical Society 48, 2, 277-289.
LIU, X. G. AND BUZACOTT, J. A. 1992. The reversibility of cyclic queues. Operations Research Letters 11, 4, 233-242.
ONVURAL, R. O. 1990. Survey of closed queueing networks with blocking. Computing Surveys 22, 2, 83-121.
SCHELLER-WOLF, A. AND SIGMAN, K. 1997. Delay moments for FIFO GI/GI/s queues. Queueing Systems, Theory and Applications 25, 1-4, 77-95.
SCHRUBEN, D. AND SCHRUBEN, L. W. 2001. Graphical Simulation Modeling using SIGMA. Custom Simulation, Ithaca, NY.
SCHRUBEN, L. W. 2009. Simulation modeling for analysis. ACM Transactions on Modeling and Computer Simulation (to appear).
SCHRUBEN, L. W. AND ROEDER, T. M. 2003. Fast simulations of large-scale highly congested systems. Simulation 79, 3, 115-125.
SHAKED, M. AND SHANTHIKUMAR, J. G. 1988. Stochastic convexity and its applications. Advances in Applied Probability 20, 2, 427-446.
YAMAZAKI, G. AND SAKASEGAWA, H. 1975. Properties of duality in tandem queuing systems. Annals of the Institute of Statistical Mathematics 27, 2, 201-212.