COMPARISON OF BUFFER USAGE UTILIZING SINGLE AND MULTIPLE SERVERS IN NETWORK SYSTEMS WITH POWER-TAIL DISTRIBUTIONS
John Edward Hatem B.A., Boston College, 1987
A Dissertation Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy at the University of Connecticut 1997
Copyright by John Edward Hatem
1997
APPROVAL PAGE
Doctor of Philosophy Dissertation COMPARISON OF BUFFER USAGE UTILIZING SINGLE AND MULTIPLE SERVERS IN NETWORK SYSTEMS WITH POWER-TAIL DISTRIBUTIONS
Presented by John Edward Hatem, B.A.
Major Advisor: Lester Lipsky
Associate Advisor: Howard A. Sholl
Associate Advisor: Ian Greenshields
University of Connecticut 1997
ACKNOWLEDGEMENTS

First of all I would like to thank my major advisor, Lester Lipsky, for his help, guidance and patience throughout the process. Most of the theoretical material for this dissertation comes from his book titled "Queueing Theory: A Linear Algebraic Approach", and a paper he co-authored, "The Importance of Power-tail Distributions for Telecommunication Traffic Models". I would also like to thank my associate advisors, Howard Sholl and Ian Greenshields, for their suggestions and assistance. Thanks are also due to my family, who have been very supportive throughout the process: my parents, Victor and Grace Hatem, my brother Peter and his wife Jan, my brother Jim and his wife Susan, and my brothers Dan and Bill. I would like to thank the BRC support staff for the help over the years, especially Sue Lipsky, who has not only been a great supervisor, but a good friend and confidante over the years. Also, thanks to John Marshall, Sandi Lizee and Elizabeth Moore. I would also like to thank Mary Ellen O'Donnell and Pierre Fiorini for both their friendship and technical discussions over the years. Also thanks are due to Dave Olafson, Jolyn Parker, Kenny Salyer, Rich Landers, Gerard and Elizabeth Fogarty, Jennifer McCormick, The Flaxen Hackers (Miles P. and Tookie Gerlek, Scott Green, James E! Henderson, Rich and Jackie Shaugnessy, Pat and Bonnie Durante, Robert Ska), Charline Mahoney and Robert Hanlon for their support. Also thanks to Michael Hatem, Gerard Fogarty, Dan Stefek, Jaime Drach and Mr. Q for their inspiration.
TABLE OF CONTENTS

Chapter 1: Introduction ... 1

Chapter 2: Queueing Model ... 5
  2.1 Probability Distribution Functions ... 5
  2.2 LAQT Basics ... 6

Chapter 3: Power-tail and Truncated Power-tail Distributions ... 9
  3.1 Power-tail Distributions ... 9
  3.2 Truncated Power-tail Distributions and LAQT Representations ... 10

Chapter 4: Buffer Sizing in Single-Server Systems with Power-Tail Distributions ... 13
  4.1 Modeling Queue Length Probabilities ... 13
    4.1.1 GI/M/1 Queues - No Packets Lost ... 14
    4.1.2 Buffer Size for High Utilization Rates ... 16
      4.1.2.1 Some Examples ... 16
    4.1.3 GI/M/1/N Queues - Finite Buffer ... 26
    4.1.4 M/G/1 Queues - Infinite Buffer ... 29
    4.1.5 M/G/1/N Queues - Finite Buffer ... 33
  4.2 Analyses of Single Server Systems ... 35
  4.3 Asymptotic Behavior for Full Power-Tails in M/G/1 Systems ... 35
    4.3.1 Formulas for a(n) ... 35
  4.4 Asymptotic Convergence ... 36

Chapter 5: Comparison of Buffer Usage Utilizing Single vs. Multiple Servers in Network Systems with Power-Tail Distributions ... 39
  5.1 Multiple Server Queueing Systems ... 39
    5.1.1 Steady State M/ME/C//N Loop ... 40
    5.1.2 Formal Definition of the M/ME/C//N System ... 43
    5.1.3 Load Dependence at S2 (Time-Sharing Systems with Population Constraints) ... 44
    5.1.4 Open Generalized M/G/C Queue ... 44
  5.2 M/G/C Queues - No Packets Lost ... 45
  5.3 M/G/C/N Queues - Packets Rejected ... 49
  5.4 G/M/C ... 52
    5.4.1 G/M/C/N Queues - Packets Rejected ... 52
    5.4.2 G/M/C Queues - No Packets Lost ... 55

Chapter 6: The Parameter α ... 57
  6.1 Multiple Servers with Different Values of α ... 57
  6.2 Convergence to Steady State ... 57

Chapter 7: Conclusions and Future Work ... 61

Bibliography ... 63
LIST OF TABLES
LIST OF FIGURES

1. Typical server model ... 6
2. Truncated power-tail reliabilities R_M for M ∈ {1, 2, 3, 5, 7, 9, 12, 100}, α = 1.5 and θ = 0.5, plotted on a log-log scale. For M = 100, log(R(x)) is a straight line for many orders of magnitude of change in x. Even with only 12 terms, log(R(x)) looks like a straight line for a factor of 100 change in x. ... 11
3. Primary buffer size needed in a GI/M/1 queue for overflow to be less than 1%, as a function of ρ = 1/(μ·x̄). ... 17
4. Primary buffer size needed in a GI/M/1 queue for overflow, plotted on a logarithmic scale, multiplied by 1 − ρ. The curves are discontinuous because N is an integer function, and have negative slopes for small ρ because of the factor 1 − ρ. ... 18
5. Primary buffer size needed in a GI/M/1 queue for overflow, plotted on a logarithmic scale, multiplied by 1 − ρ. As we can see here, all curves (except that for M = ∞) are finite at ρ = 1. ... 19
6. Comparison of primary buffer sizes between a TPT(32) and various H2 distributions with the same C², as a function of ρ = 1/(μ·x̄). Buffer size × (1 − ρ) is plotted on a logarithmic scale. For the H2's, p = 10^(−k) for k = 4, 5, 6, 7. The M/M/1 queue is included for reference. ... 21
7. Comparison of primary buffer sizes between a TPT(32) and various H2 distributions with the same C², for 0.99 ≤ ρ ≤ 1. ... 22
8. Primary buffer size of a GI/M/1 queue for the PT and various TPT distributions as a function of α, with ρ = 1/(μ·x̄) = 0.8, α = 1.1-1.3, buffer size = 1-10^8. ... 23
9. Primary buffer size of a GI/M/1 queue for the PT and various TPT distributions as a function of α, with ρ = 1/(μ·x̄) = 0.8, α = 1.3-2.0, buffer size = 1-10^8. ... 24
10. Primary buffer size of a GI/M/1 queue for the PT and various TPT distributions as a function of α, with ρ = 1/(μ·x̄) = 0.8, α = 1.1-1.3, buffer size = 10^8-10^14. ... 24
11. Primary buffer size needed for 1% overflow in a GI/M/1 queue for the TPT(64) distribution as a function of α, with ρ = 1/(μ·x̄) = 0.7, 0.8 and 0.9. ... 25
12. Buffer size for 1% loss in a GI/M/1/N queue for various distributions as a function of ρ = 1/(μ·x̄). ... 27
13. Same as the previous figure except for higher values of ρ, and we do not multiply by (1 − ρ) here. Note that the figures blow up at ρ = 100/99 = 1.010101.... ... 28
14. Primary buffer size needed for overflow of an M/G/1 queue to be less than 1%, as a function of ρ = λ·x̄. Note that we are plotting buffer size × (1 − ρ) on a log scale. ... 30
15. Same as the previous figure except for higher values of ρ. There is a slight upturn for the H2 curve near ρ = 1, but this is almost surely due to numerical instability. All curves appear to be finite at ρ = 1. ... 31
16. Primary buffer size of an M/G/1/N queue for various TPT distributions as a function of ρ = λ·x̄. ... 33
17. Same as the previous figure except for higher values of ρ. ... 34
18. An M/TPT/1 queue with a truncated power-tail service time distribution. In this example, we fix θ at 0.50 and vary M. ... 37
19. An M/TPT(72)/1 queue with a truncated power-tail service time distribution. In this example, we fix M and vary θ. ... 38
20. Steady-state M/ME/C//N loop. S1 is an exponential server, but S2 is made up of two or more identical nonexponential matrix-exponential servers, each made up of m phases and represented by the vector-matrix pair <p, B>. ... 41
21. An M/G/2 queue with a power-tail service time distribution. ... 45
22. Mean response time for an M/PT/C queue with TPT(8) and TPT(20) and C = 1, 2 servers. M/M/1 and M/M/2 queues are included for comparison. ... 46
23. Mean response time for an M/TPT(8)/C queue and C = 1, 2, 3 servers. For this case, two servers are better than one for ρ > 0.38 and three are better than two at ρ > 0.65. ... 47
24. Primary buffer size needed in an M/PT/C queue for overflow to be less than 1%, as a function of ρ. ... 48
25. Primary buffer size needed in an M/PT/C queue for overflow to be less than 1%, as a function of ρ, for power-tail and hyperexponential-2 distributions. The H2 distributions jump quickly once they hit a certain ρ, whereas the TPT distributions rise evenly across all ρ. ... 50
26. Primary buffer size needed in an M/TPT/C/N queue for the rejection rate to be less than 1%. We assume that when the system fills up (we have N customers), any newly arriving customers are rejected. ... 51
27. A G/M/C/N queue with a power-tail arrival distribution and exponential servers. The smallest buffer size needed is for the TPT(16)/M/1/N queue and the largest is for the TPT(16)/M/4/N curve, but the difference is minimal. ... 54
28. A G/M/C queue with a power-tail arrival distribution and exponential servers. The smallest buffer size needed is for the TPT(16)/M/1/N queue and the largest is for the TPT(16)/M/4/N curve, but the difference is minimal. ... 56
29. Primary buffer size needed in an M/PT/C queue for overflow to be less than 1%, for different α's. ... 58
30. Sample size for the mean to be accurate within two digits. For α = 2, we need 10,000 samples for two-digit accuracy. As α → 1, we need more samples, and at α = 1, we need an infinite number of samples! ... 59
Chapter 1
Introduction

In recent years, there has been a push towards increasingly faster communications networks, especially in light of the World Wide Web and other systems with high-bandwidth requirements. As network demands steadily grow, increasing network speed is seen as the only way to handle the increasing traffic. However, while doubling the service rate on a system does indeed double the throughput, other measurements, such as response time [ERRA96], probability of buffer overflow [LIPS97] and mean time to overflow [HEAT96], in many cases show only marginal improvement or are unaffected. The key to understanding this apparent paradox is in the distribution of the service times of these networks. In recent years there has been ever increasing interest in the development of systems which will be able to process incoming traffic from various communications networks. Numerous papers have appeared indicating that the traffic to be expected in the future will be of an extraordinary character. Leland et al. [LELA94] measured and analyzed the arrival of millions of packets on ETHERNET networks at Bellcore, and found an enormous instability of arrival rates. Beran et al. [BERA95] have measured and analyzed several millions of frames from Variable-Bit-Rate (VBR) video services. Park et al. [PARK96] have analyzed file sizes and network traffic on the World Wide Web, and discovered that such traffic shows variability at a wide range of scales; all see the same instability of the number of arrivals whether they measure in .01, .1, 1, 10 or 100 second intervals. These, and other papers, have argued that r(k|Δ), the lag-k autocorrelation function of the number of arrivals per time interval Δ (not to be confused with the autocorrelation function for the inter-arrival times themselves), must go to zero so slowly that

    Σ_{k=1}^∞ r(k|Δ) = ∞.

They imply that any realistic model of such traffic must include very long-range correlation effects.
This has been described as "self-similar" ("bursty", "fractal") behavior. This behavior is not new or exclusive to multimedia; it was evident in text file sizes and CPU time distributions years ago, but was not thought to be important. We are just now gaining an understanding of this self-similarity. Willinger et al. [WILL95] have shown that self-similarity can be caused by the superimposition
of heavy-tailed on-off processes, which they called packet trains. Each node on a network is either transmitting (which corresponds to the on-times) or not transmitting (which corresponds to the off periods). As the Web grows, the number of systems in this model increases; at high levels of aggregation, the bursts are more prominent. They measure this by using the Hurst parameter, H, to show the degree of self-similarity. Greiner, Jobmann and Lipsky [GREI95] define a class of heavy-tail distributions known as power-tails, which have finite mean but infinite variance. The degree of self-similarity for a power-tail distribution is determined by α, which is related to the Hurst parameter by H = (3 − α)/2. Lipsky and Fiorini [LIPS95] have shown that a renewal process (no correlation of interarrival times) where the interarrival times have a power-tail (PT) distribution (i.e., distribution functions which behave as 1 − F(x) ≈ 1/x^α for large x) can have such autocorrelations. Furthermore, Van de Liefvoort and Weng have generated "self-similar" data of the kind described in [LELA94] by simulating a renewal process where the interarrival times come from a single PT distribution with a finite mean (but infinite variance) [LIEF94]. Many attempts to do the same using various compound Poisson processes (without power tails, but with built-in burstiness) have not been successful. Greiner et al. [GREI95] describe in detail the properties of PT distributions, defining an analytic class of well-behaved distributions (a sub-class of which are Phase Distributions, which can be used in Markov Chain models) that have truncated power tails (TPT), and in the limit become PT distributions. This class was first used in [LIPS86] to explain the long-tail behavior of measured CPU times at Bellcore in 1986 [LELA86]. It was also used to show what might happen in data-retrieval systems which have power-tail file sizes [GARG92], and even to explain the distribution of medical insurance claims [LOWR93].
Greiner et al. [GREI95] then used these distributions to study the behavior of steady-state GI/M/1 queues, as a model for telecommunications networks. They described the effects of different α's on the geometric parameter, s [LIPS92], as a function of the utilization parameter, ρ, where

    ρ := [arrival rate] × [mean service time].    (1)

The variance of a PT distribution is infinite if α ≤ 2. Their calculations showed that s stays close to 1 as ρ decreases from 1, and drops off ever more slowly for smaller α. Note that the mean queue length for GI/M/1 queues is proportional to 1/(1 − s). Thus the steady-state performance of these queues worsens gradually as α drops below 2 with ρ fixed, becoming disastrous as α approaches 1 from above (i.e., while the mean still exists). Current efforts have been made to model data with these distributions, with different formulas for finding the α (or Hurst) parameter. The difficulty lies in gaining an accurate measurement of the parameter. Data with these distributions appear to be linear over several orders of magnitude on a log-log scale. One attempt at modeling has used a Hill estimator [TAQQ95] and attempts to infer a value of α from a stable region of the plot. This method is heuristic at best, since it is difficult to get a point estimate from a graph, and the graph may exhibit considerable volatility. Attempts at smoothing can yield somewhat better results; however, finding accurate values of α remains an open topic. Current empirical estimates place α somewhere between 1.05 [PARK96] and 1.4 [LELA94] for network traffic. In any case, there is no reason to suppose that α is the same every day, on every line, or even for every application. In this thesis, we maintain that the effects of noisy/bursty traffic can be modeled adequately with PT distributions, and most if not all of the pathological
behavior which might occur in real systems will be reflected, at least qualitatively, in an appropriate queueing model where the arrival process is a renewal process. We show that self-similarity creates tremendous bottlenecks for single-server computer networks. Traditionally, the way to reduce bottlenecks is to identify and upgrade the offender. This increases the speed of the system. However, the self-similar nature of file sizes and network traffic indicates that this will work only marginally. No matter how fast you make the network or how much you speed up the system, eventually a single job of significant size will arrive and cause a bottleneck. We will show here how multiple servers improve performance dramatically, especially for moderately loaded systems. Many analyses focus on unrealistically high utilization parameters (i.e., ρ = 0.9 and above) where few systems would actually run. By focusing on systems with more realistic utilizations, we can hope to find ways of improving working systems. Another advantage of using multiple servers is expense; a server (or network line) that runs twice as fast is almost always more expensive than two systems (network lines) at single speed. (Note: In this dissertation, we use the word "system" to indicate any shared resource.) These analyses will focus on network lines such as the World Wide Web, which epitomizes the self-similar behavior we are examining. We show that even with limited buffer size, where overflow packets are rejected, slower multiple servers still outperform single fast servers in systems with power-tail distributions. The rationale behind this approach lies with the fundamental nature of the packet traffic: its self-similar behavior. With this behavior, eventually a job with a long service time will arrive at the server. When this arrives, it occupies the one server, no matter how fast, for a long period of time.
While that one server is occupied, the queue backs up, waiting time increases, and the buffer size needed to prevent overflows (or packets going to a backup buffer) increases. With two servers, this long job can go to one server while the smaller jobs can still go through the other. This is optimal for moderate utilizations; at low utilization, a double-fast server will be able to process the infrequent large job before too many smaller ones arrive, and at high utilization both servers eventually are occupied with long jobs, so the system behaves similarly to one with a double-fast server. Throughout this dissertation, we will show that for self-similar distributions, multiple slower servers are almost always better than a single faster one. Of course, all this is done assuming steady-state behavior for open systems. Closed systems limit maximum queue lengths. But this may require inordinately many arrivals before such large queue lengths could be seen in reality. Discrete event simulation models necessarily suffer from the same problem. [GREI95] presented an argument showing that the closer α is to 1, the more arrivals must occur before any system's steady state can be approached. They show that the number of events needed for α = 1.4 is 100 times that needed for a similar queue with Poisson arrivals and exponential service times. Part of this is certainly due to the unbounded variance inherent in power-tail distributions, but as we will see, the variance is not enough to explain the behavior. Simulations we have run have failed to produce consistent results which would indicate the length of time required to reach steady state. The outline of the rest of this dissertation is as follows: In Chapter 2, we give a brief overview of Linear Algebraic Queueing Theory. In Chapter 3, we cover the definition and characteristics of power-tail distributions, and truncated power tails, which in their limits become power-tails.
We show the history of these distributions, how they have been used to describe different physical phenomena, the parameters we use for the power-tail distributions, and the justification behind choosing those parameters. In Chapter 4, we start by looking at systems with single servers, and show how quickly queue length probabilities grow for these systems. We also compare the queue length probabilities of systems with power-tail distributions to systems with hyperexponential-2, Erlangian and Poisson distributions with similar characteristics, and show how these distributions are inadequate for modeling telecommunications traffic. We specifically look at hyperexponential distributions with the same mean and variance, and show how they are inadequate as simpler descriptions of the systems at which we are looking. We show how the degree of truncation of the power-tails affects the buffer sizing. We include some full power-tail examples, which can be calculated numerically. We see how, for M/G/1 queueing systems, a(n), the probability of an arriving customer seeing n customers already in the system, asymptotically approaches f(ρ)/n^α. In Chapter 5, we start by discussing the formulas for open systems with multiple servers (M/G/C systems), closed systems with multiple servers (M/G/C/N systems), and systems with a generalized arrival process and multiple exponential servers, both open and closed (G/M/C and G/M/C/N systems). We also cover the formulas for queue length probabilities for these systems. We compare systems with single vs. multiple servers for systems with power-tail distributions, with respect to buffer size. We show the difference in rejection rates for systems with single faster servers versus multiple slower ones. We look at queue length probabilities and buffer requirements to reduce the rejection (overflow) rate to certain levels. We look at these rejection rates for our four different system models.
We also briefly examine the response time, to see at what point the segmentation becomes optimal. In Chapter 6, we look at the power parameter, α, to see how it affects our models. Here we show the sensitivity of these systems to the power parameter. We also briefly review how large a sample size we need to reach a steady state for different values of α. Finally, we summarize our research and attempt to draw conclusions from this work. We also outline some suggestions for future work.
Chapter 2
Queueing Model

2.1 Probability Distribution Functions
A Probability Distribution Function (PDF), for some random variable, X, is defined as

    F(x) := Prob(X ≤ x),    (2)

while its Reliability Function is given as

    R(x) := Prob(X > x) = 1 − F(x).    (3)

Also, if it exists, the probability density function (pdf) is defined as

    f(x) := dF(x)/dx = −dR(x)/dx.    (4)

We define the moments of f(x) as follows. We can write

    E(X) := ∫₀^∞ x f(x) dx.    (5)

We read this as "the expected value of X is equal to ...". In general, we can write

    E(X^n) := ∫₀^∞ x^n f(x) dx.    (6)

We read this as "the expected value of X^n", or "the nth moment of f(x)". We also define the variance, symbolized by σ², as

    σ² := E([X − E(X)]²) = ∫₀^∞ (x − E(X))² f(x) dx,    (7)

which can be shown to be equal to

    σ² = E(X²) − [E(X)]².    (8)
Figure 1: Typical server model

The standard deviation of f(x) is symbolized by σ, which satisfies the obvious σ = √(σ²). σ represents the average spread about the mean; the smaller σ is, the narrower the distribution. Since we will usually deal with functions that are defined for positive x, a relative width is often useful. Therefore, we have the coefficient of variation, defined by

    C² := σ² / [E(X)]².    (9)

2.2 LAQT Basics
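As a quick numerical illustration of Eq. (9) (a sketch of ours, not code from the dissertation), the following computes C² for an exponential distribution and for a two-phase hyperexponential; the particular weights and rates are arbitrary examples:

```python
def hyperexp_c2(weights, rates):
    """C^2 = sigma^2 / E(X)^2, Eq. (9), for a hyperexponential mixture
    f(x) = sum_i w_i * r_i * exp(-r_i * x)."""
    m1 = sum(w / r for w, r in zip(weights, rates))           # E(X)
    m2 = sum(2.0 * w / r**2 for w, r in zip(weights, rates))  # E(X^2)
    return (m2 - m1**2) / m1**2

print(hyperexp_c2([1.0], [1.0]))             # pure exponential: C^2 = 1.0
print(hyperexp_c2([0.9, 0.1], [2.0, 0.1]))   # two-phase hyperexponential: C^2 >> 1
```

An exponential always has C² = 1; mixing a fast phase with a rare slow phase drives C² well above 1, which is the regime the later chapters care about.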
Most of the theory presented here is an outgrowth of "Queueing Theory: A Linear Algebraic Approach" [LIPS92]. Here we provide the foundations for the formulas needed to understand the algorithms, the implementation and the obtained results. Much of this chapter is taken from the theses of Christian Callegari [CALL94] and Siddhartha Roy [ROYS92] and the dissertation of Yiping Ding [DING92], which are in turn taken from [LIPS92].

The following notation scheme is adopted. All matrices are represented by boldface capital letters (e.g., P). Their components are represented either as Pij or [P]ij. Similarly, all vectors are represented by boldface lower-case letters (e.g., q). Their components are represented as qi or [q]i. We also need to refer to data structures, like arrays, in our algorithms; these arrays are represented in italics (e.g., T or Pc).

Figure 1 shows a typical server. Each server in our server model is as follows. Let our subsystem be called S. A subsystem can be accessed by only one customer at a time; the rest of the customers wait outside (in the queue) until the active one leaves. A subsystem is equivalent to a single server whose probability density function (pdf) is certainly not exponential. A subsystem is represented by m phases or states which have exponentially distributed completion times, with mean completion rate μi, where i = 1, 2, ..., m. A customer enters the box, and with probability pi goes to phase i. Thus we can define the entrance vector, p:

    p := [p1, p2, ..., pm].

After spending a random time taken from the exponential distribution with rate μi, he goes to phase j with probability Pij, or leaves with probability qi. The exit vector, q, is defined as

    q := [q1, q2, ..., qm],

and the transition matrix, P, is

    P := [ P11 P12 ... P1m ]
         [ P21 P22 ... P2m ]
         [ ...             ]
         [ Pm1 Pm2 ... Pmm ]

The completion rate matrix M has diagonal elements which are the completion rates of the individual states in S; that is,

    M := diag(μ1, μ2, ..., μm).

We also define the column vector ε′, all of whose components are 1, i.e., ε′ := [1, 1, ..., 1]′ (the ′ denotes transpose). Since a customer must go somewhere when he enters the subsystem, it follows that

    Σ_{i=1}^m pi = p ε′ = 1.

Also, since the customer, upon leaving phase i, must either go to another phase (Σ_{j=1}^m Pij) or leave (qi), we have

    Σ_{j=1}^m Pij + qi = 1,  or  P ε′ + q′ = ε′,  or  q′ = (I − P) ε′.

Next let

    B = M(I − P)

and V = B⁻¹, where B is the service rate matrix and V is the service time matrix, whose individual components, Vij, are interpretable as the mean time a customer spends at phase j (counting all visits to it) from the time it first visits phase i until it leaves S.

Let us define τ′ to be that vector whose ith component is the mean time it will take for a customer to leave S, given that it started at phase i. The average time at each visit that a customer is served by phase i is

    1/μi = (M⁻¹ ε′)i.

Then either he leaves with probability qi or he goes to phase j with probability Pij. Thus, in vector form,

    τ′ = M⁻¹ ε′ + P τ′.

Solving for τ′,

    (I − P) τ′ = M⁻¹ ε′,  so  τ′ = (I − P)⁻¹ M⁻¹ ε′ = [M(I − P)]⁻¹ ε′ = V ε′.

The mean service time of S is

    x̄ = p τ′ = p V ε′.    (10)

For any matrix X, we use the notation

    Ψ[X] := p X ε′.    (11)

Thus, the mean service time of S is

    x̄ = Ψ[V].    (12)

It can be shown in general that

    R(x) = Ψ[exp(−xB)].    (13)

These formulas form the basis for Matrix Exponential distributions. More formulas will be introduced as necessary for each different queueing system.
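The formulas above can be checked numerically. The following sketch (ours, not the dissertation's code) builds <p, B> for an arbitrarily chosen two-phase subsystem, an Erlang-2 with per-phase rate μ, for which x̄ = 2/μ and R(x) = (1 + μx)e^(−μx) are known in closed form:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scal(c, A):
    return [[c * a for a in row] for row in A]

def inv2(A):
    # Inverse of a 2x2 matrix.
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mexp(A, terms=60):
    # Taylor-series matrix exponential; fine for small, modest-norm matrices.
    out = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = scal(1.0 / k, matmul(term, A))
        out = mat_add(out, term)
    return out

def psi(p, X):
    # Psi[X] = p X eps', Eq. (11), with eps' the column vector of 1's.
    return sum(p[i] * sum(X[i]) for i in range(len(p)))

mu = 3.0
p = [1.0, 0.0]                              # entrance vector: always start in phase 1
P = [[0.0, 1.0], [0.0, 0.0]]                # phase 1 -> phase 2, then exit
I = [[1.0, 0.0], [0.0, 1.0]]
M = [[mu, 0.0], [0.0, mu]]                  # completion-rate matrix

B = matmul(M, mat_add(I, scal(-1.0, P)))    # B = M(I - P)
V = inv2(B)                                 # V = B^{-1}

xbar = psi(p, V)                            # Eq. (12): mean service time = 2/mu
Rx = psi(p, mexp(scal(-1.0, B)))            # Eq. (13): R(1) = (1 + mu) e^{-mu}
print(xbar, Rx)
```

The matrix results agree with the closed forms, which is a useful sanity check before moving to representations (like the TPTs below) that have no elementary closed form.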
Chapter 3
Power-tail and Truncated Power-tail Distributions

3.1 Power-tail Distributions

Power-tail distributions were first outlined in, and are thoroughly described in, [GREI95]. A summary is given here. A power-tail distribution can be defined by its behavior for very large x. That is, if

    R(x) → c / x^α,    (14)

then R(x) [or F(x)] is a power-tail distribution with power α, where α is a positive, real number. From elementary calculus it is easy to show that if α ≤ 1 then the distribution has an infinite mean. If 1 < α ≤ 2 then F(x) has a finite mean, but an infinite variance. In general,

    E(X^ℓ) := ∫₀^∞ x^ℓ f(x) dx = ∞  for all ℓ ≥ α.    (15)

Such distributions have been known to exist for a long time. Pareto used them in describing the distribution of wealth in economics. Levy showed that all stable distributions with infinite variance have power tails. Thus they are also referred to as Pareto, or Levy-Pareto, or simply Levy distributions in the literature of various disciplines. For a more complete discussion, the reader is referred to [FELL71], [SAMO94] or [GREI95]. These distributions have been ignored in computer science and related fields in the past because an extremely large number of events (a number of the order of 10^7 would not be unreasonable) must occur for the effects of this distribution to be relevant. What does it mean for a model to predict a steady-state mean queue length of, say, 10,000 customers, when there are hardly that many customers in the user community? Only now, with the presence of the information super-highway (and the World-Wide Web) can we expect to see (and, in fact, have already seen) so many events (customers, packets) in a relatively short time. Additionally, since the World Wide Web is global, there is not the "overnight" most current systems enjoy, where the queue drains while the users sleep. The sun never sets on the World Wide Web.

3.2 Truncated Power-tail Distributions and LAQT Representations
In general, simple power-tail distributions [the one used by Pareto was of the form f(x) = c/(1 + x)^(α+1)] are difficult to use with Laplace transforms, and do not have direct matrix representations. But a most useful sub-class of them is given in [GREI95]. The particular one we use here is defined as

    R_M(x) = (1 − θ)/(1 − θ^M) · Σ_{n=0}^{M−1} θ^n R₀(x/γ^n).    (16)

We call these functions truncated power-tail (TPT) distributions. In the TPT distributions, θ and γ are parameters satisfying the inequalities 0 < θ < 1 and γ > 1. They are well behaved (or Phase, or matrix-exponential) if R₀(x) is, and converge to

    R(x) := lim_{M→∞} R_M(x) = (1 − θ) Σ_{n=0}^∞ θ^n R₀(x/γ^n)    (17)

and

    f(x) := lim_{M→∞} f_M(x) = (1 − θ) Σ_{n=0}^∞ (θ/γ)^n f₀(x/γ^n).    (18)

R_M(x) is well behaved for all M, but the limit function R(x) is not. We have a finite sum of well-behaved phase distributions which approach a power-tail in their limit. Hereafter, for simplicity, we will use

    R₀(x) = exp(−μx).    (19)

Then it is not hard to show that the ℓth moments are given by

    E(X_M^ℓ) = (ℓ!/μ^ℓ) · (1 − θ)/(1 − θ^M) · Σ_{k=0}^{M−1} (θ γ^ℓ)^k.    (20)

Alternatively, using the closed form of the sum, we have

    E(X_M^ℓ) = (ℓ!/μ^ℓ) · (1 − θ)/(1 − θ^M) · (1 − (θγ^ℓ)^M)/(1 − θγ^ℓ).    (21)

It can be shown that R(x) satisfies (14), and that α is related to θ and γ by

    θ γ^α = 1,  or  α := −log(θ)/log(γ).    (22)
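Equations (16) and (22) are easy to experiment with. This sketch (ours; μ = 1, α = 1.5 and θ = 0.5 are arbitrary choices matching Figure 2) evaluates R_M(x) directly and estimates the local log-log slope of a TPT(32) in its power-law region, which should be close to −α:

```python
import math

alpha, theta, mu = 1.5, 0.5, 1.0
gamma = theta ** (-1.0 / alpha)     # Eq. (22): theta * gamma**alpha = 1

def R_M(x, M):
    """Truncated power-tail reliability, Eq. (16), with R0(x) = exp(-mu*x)."""
    norm = (1.0 - theta) / (1.0 - theta ** M)
    return norm * sum(theta ** n * math.exp(-mu * x / gamma ** n)
                      for n in range(M))

# Local log-log slope of R_32 across one decade inside the tail region;
# for a true power tail it would be exactly -alpha.
x1, x2 = 10.0, 100.0
slope = (math.log(R_M(x2, 32)) - math.log(R_M(x1, 32))) \
        / (math.log(x2) - math.log(x1))
print(slope)   # close to -1.5
```

This reproduces numerically the straight-line behavior seen in Figure 2: the truncated sum tracks 1/x^α until x approaches γ^(M−1), after which it falls off exponentially.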
Figure 2: Truncated power-tail reliabilities R_M for M ∈ {1, 2, 3, 5, 7, 9, 12, 100}, α = 1.5 and θ = 0.5, plotted on a log-log scale. For M = 100, log(R(x)) is a straight line for many orders of magnitude of change in x. Even with only 12 terms, log(R(x)) looks like a straight line for a factor of 100 change in x.

From this it follows that

    E(X^ℓ) := lim_{M→∞} E(X_M^ℓ) = ∞  for ℓ ≥ α.

Since Σ_{k=0}^∞ (θγ^ℓ)^k diverges for θγ^ℓ ≥ 1, equation (15) is satisfied. We refer to the functions R_M(x) as truncated power-tail (TPT) distributions because they look like their limit function, the true power-tail R(x), until, for some large x depending upon M, they drop off exponentially. In Figure 2, we plot truncated power-tail reliabilities for M = 1, 2, 3, 5, 7, 9, 12 and 100, with α = 1.5 and θ = 0.5, on a log-log scale. For M = 100, log(R(x)) is a straight line for many orders of magnitude of change in x. Even with only 12 terms, log(R(x)) looks like a straight line for a factor of 100 change in x. We note here that, for notation, we use TPT(32) to indicate a truncated power-tail with M = 32. For full power-tails, we simply use PT, which is equivalent to TPT(∞). Also note that TPT(1) is an exponential distribution, although we may also use the standard notation for that. These distribution functions have the benefit of being easy to manipulate algebraically. Furthermore, they are M-dimensional phase distributions whose vector-matrix representations, <p_M, B_M>, are given by (using the notation of [LIPS92]):
    B_M = μ · diag(1, γ^{−1}, γ^{−2}, …, γ^{−(M−1)}),
    p = (1−θ)/(1−θ^M) · [1, θ, θ², …, θ^{M−1}].        (23)
We need these matrices to calculate the properties of finite-buffer queues. They generate the functions given above by the relation

    R_M(x) = Ψ_M[exp(−x B_M)],

and the moments can be shown to be

    E(X_M^ℓ) = ℓ! Ψ_M[V_M^ℓ].

Furthermore, the Laplace transform of F_M(·) is

    B*(s) := ∫_0^∞ e^{−sx} f_M(x) dx = Ψ_M[(I + s V_M)^{−1}].
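These relations are easy to verify numerically for a small TPT. In this sketch (our own notation and parameter values), Ψ[X] = pXε′, with ε′ a column vector of 1's:

```python
import math
import numpy as np

theta, alpha, M, mu = 0.5, 1.4, 8, 1.0
gamma = theta ** (-1.0 / alpha)              # Eq. (22)
c = (1 - theta) / (1 - theta ** M)

p = c * theta ** np.arange(M)                # entrance vector of Eq. (23)
B = np.diag(mu / gamma ** np.arange(M))      # completion-rate matrix of Eq. (23)
V = np.linalg.inv(B)
Psi = lambda X: p @ X @ np.ones(M)           # Psi[X] = p X eps'

# p is a probability vector, and l! * Psi[V^l] reproduces the sum of Eq. (20)
assert abs(Psi(np.eye(M)) - 1.0) < 1e-12
for l in (1, 2):
    matrix_moment = math.factorial(l) * Psi(np.linalg.matrix_power(V, l))
    sum_moment = math.factorial(l) * c * sum(theta**k * (gamma**k / mu)**l for k in range(M))
    assert abs(matrix_moment - sum_moment) < 1e-9
```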
Note that the matrix B representing R(x) is infinite-dimensional, and has an infinite set of eigenvalues, {μ_n}, with an accumulation point at 0. So, in principle, its inverse does not exist. However, with judicious use of limits, all calculations can be carried out. Because only truncated power-tails can be represented in LAQT format, we use them to do our analysis. Work is currently underway on analyses of full power-tails, which can be done algebraically by taking limits of sums; computationally, however, we are limited to truncated power-tail solutions. The code for the analysis places these phase distributions in matrix-exponential format, with square matrices of size D(k) × D(k), where D(k), the number of states, is

    D(k) = (M+k−1 choose k) = (M+k−1)! / [k!(M−1)!],

with M being the number of phases and k the number of active servers. Computer speed, storage space and numerical precision limits have so far restricted our analysis to matrices up to approximately 210 by 210, depending on the analysis.
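The state-space count is a simple multiset coefficient, which makes it easy to see where the practical limit comes from: with k = 2 active servers, M = 20 phases already gives 210 × 210 matrices. A small helper (our own naming):

```python
from math import comb

def D(M, k):
    """Number of states with k active servers over M phases: C(M+k-1, k)."""
    return comb(M + k - 1, k)

assert D(20, 2) == 210    # M = 20 phases, 2 servers: the ~210x210 limit above
assert D(32, 1) == 32     # one server: one state per phase
```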
Chapter 4

Buffer Sizing in Single-Server Systems with Power-Tail Distributions

We present new exact formulas for calculating buffer overflow or loss probabilities in steady-state GI/M/1, GI/M/1/N, M/G/1, and M/G/1/N queueing systems. We show how PT distributions can be incorporated into these models. We then present the results of a parametric study of the buffer size needed to prevent overflow or loss in such systems. We examine the effects of PT and other distributions on buffer overflow and compare the behavior of such systems with that of better-behaved systems, namely those where the interarrival times, or service times, have hyperexponential-2 or Erlangian distributions. We show that power tails can cause problems for intermediate values of the utilization parameter, ρ. These problems become very serious (beyond the usual 1/(1−ρ) factor) when ρ is close to 1, and/or when α approaches 1. We found that buffer overflow, though much larger than for the M/M/1 queue, can be kept under control unless α is less than 1.5. The empirical data of [LELA94] indicated an α of approximately 1.4, so serious buffer problems can be expected in the future. We also found that two-phase hyperexponential (H₂) distributions with large squared coefficients of variation, C², comparable to an equivalent truncated power-tail distribution, yielded wildly inconsistent results. (H₂ distributions are 3-parameter functions, two parameters being fixed by the mean and variance; performance varies drastically depending upon the choice of the third parameter.)
4.1 Modeling Queue Length Probabilities
For this analysis, our system is made up of a single processor receiving data from an arrival stream of variable-sized packets. The arrival stream may be considered a renewal process. The receiving processor has a finite primary memory buffer which can hold at most N packets. If a no-loss system is required, then we assume there exists an unbounded secondary or backup buffer that will store the overflow (e.g., a disc-array subsystem), and transfer the data to the primary buffer when space becomes available. The assumption of an (almost) "infinite buffer" is not unreasonable, given the emerging technologies for fairly high-speed massive storage. Qi Gan (amongst others) proposed such a system in an unpublished 1995 paper, "High Performance I/O Network Systems". We will show presently that for GI/M/1 queues under heavy load, if the primary buffer is large enough so that only 1% of arriving packets have to be placed in the backup, then a backup buffer that is k times the size of the primary will overflow only one time in 10^{2k}. Although there may be many problems associated with the transfer (e.g., loss of first-come-first-served sequencing, extra processors needed), we assume that the transfer can be made at least as quickly as it takes to drain the primary buffer, so there is no change in effective service rate. This is, then, a GI/G/1 open queueing system. If there is no backup buffer, then there must be losses, and we have a GI/G/1/N system. We will assume that either "GI" or "G" is exponential, yielding a total of four different types of queues. First we will assume that packet arrivals can be considered a general renewal process, where each packet must be serviced in a time taken from an exponential distribution with mean time 1/μ (a GI/M/1 queue). If no backup buffer is provided, then we have a GI/M/1/N queue. In an alternate view (see [LIKH95]), a Poisson process with a "disbursed" batch of packets whose number is distributed by a power tail can generate self-similar data. If the packets can be "reassembled" at the receiving node and counted as one customer whose service time is taken from a PT distribution, then we have an M/G/1 queue, or an M/G/1/N queue if there is no backup buffer.
4.1.1 GI/M/1 Queues - No Packets Lost
Here we assume that the time to process a packet is exponentially distributed, with mean 1/μ, and the buffer is unbounded in size (bigger than we'll ever need). The packet arrivals constitute a renewal process, with interarrival-time distribution F(x). As already discussed, this constitutes an open GI/M/1 queue. It is well known [LIPS92] that the steady-state probability of finding k customers in the queue, π(k), is given by

    π(0) = 1 − ρ,
    π(k) = ρ(1 − s)s^{k−1},  k > 0,        (24)

where s is the geometric parameter satisfying the equation

    s = B*(μ(1 − s)).        (25)

B*(z) is the Laplace transform of the interarrival distribution. Alternatively, in the LAQT representation, s is the smallest eigenvalue of the matrix

    A := I + (1/μ)B − Q,        (26)

with B as defined in Chapter 2, I being the identity matrix of appropriate size, and

    Q = ε′p.        (27)

Let x̄ be the mean interarrival time. Then

    λ = 1/x̄  and  ρ = λ/μ,

and the mean queue length (including the one being served) for the process is

    q̄ := Σ_{k=1}^∞ k·π(k) = ρ/(1 − s).

What is needed for our analysis is the probability that an arriving packet will find exactly k ≥ 0 other packets already enqueued. The arrival probabilities are different from the π(k)'s and are given by

    a(k) = (1 − s)s^k  for k ≥ 0;

for k > 0 this equals π(k)·(s/ρ). The probability, then, that an arriving packet will have to be stored in the backup buffer is

    Pr(N) = Σ_{k≥N} a(k) = (1 − s)·Σ_{k≥N} s^k = s^N.

We see that the smaller s is, the less likely it becomes that overflow will occur. Equivalently, the closer s is to 1, the bigger q̄ and Pr(N) will be, giving less desirable performance. There are some general statements one can make. For instance, when ρ = 1, so is s. If R(0) = 1 (a non-defective distribution) then s = 0 when ρ = 0. Also, only for the M/M/1 queue does s = ρ for all ρ. We say that if s > ρ then system performance is worse, and if s < ρ then system performance is better, than one would "expect". It has been shown [LIPS92] that the slope of the curve of s versus ρ at ρ = 0 is x̄f(0). So if this is less than (greater than) 1, then for small ρ, performance is better (worse) than for the equivalent M/M/1 queue. At the other end, at ρ = 1, the slope is 2/(C² + 1). If C² > 1 (C² < 1) then performance is worse (better). In general, near ρ = 1 performance depends only on the moments of the interarrival-time distribution [LIPS92]. But for power-tail distributions, if α ≤ 2 then C² = ∞ and the slope is 0. This means that s will remain close to 1 even as ρ decreases. At the other end, for small ρ, performance depends only on the behavior of R(x) when x is small. For instance, in modelling PT distributions using (17) with μ = (1−θ)/(1−θγ), it follows that f(0) > 1 for all θ and all α. A different function (other than e^{−x}) could have been chosen which would have yielded a smaller s for small ρ (e.g., μ²xe^{−μx} instead of μe^{−μx}), but the performance for ρ → 1 would be similar. This shows the difficulty in selecting test functions when exploring the general performance of systems. Without more knowledge of a particular system, no model can be relied upon to give an accurate picture of the performance for small or intermediate ρ. However, qualitative behavior can be surmised.
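Equation (25) can be solved numerically by successive substitution, since B*(z) behaves as a contraction on [0, 1) here. A minimal sketch for Erlang-2 interarrival times, whose Laplace transform is (2λ/(2λ+z))²; the function names and parameter values are our own:

```python
def solve_s(lam, mu, tol=1e-12, max_iter=100000):
    """Fixed point of s = B*(mu*(1-s)), Eq. (25), for Erlang-2 arrivals."""
    Bstar = lambda z: (2.0 * lam / (2.0 * lam + z)) ** 2
    s = 0.5
    for _ in range(max_iter):
        s_next = Bstar(mu * (1.0 - s))
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s

s = solve_s(lam=0.8, mu=1.0)   # rho = lam/mu = 0.8
assert s < 0.8                 # C^2 = 1/2 < 1, so s < rho: better than M/M/1
overflow = s ** 20             # Pr(N) = s**N for a primary buffer of N = 20
```

The same iteration works for any renewal arrival process whose Laplace transform is available in closed form.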
4.1.2 Buffer Size for High Utilization Rates

When ρ is close to 1, it is better to look at Pr(N) as a function of t := 1 − s, for then

    P := Pr(N) = s^N = e^{N log s} = e^{N log(1−t)} ≈ e^{−tN}  for t ≪ 1.

In other words, for high utilization, the probability of an arriving customer finding N customers ahead of him in a GI/M/1 system is roughly e^{−tN}. Since s is a function of ρ, t is also a function of ρ. Here we are trying to find a relation between buffer size and rejection rate. Assuming we want a rejection rate of P = 10^{−k} (in our examples, k = 2 for a 1% rejection rate), we get

    N(ρ) = log_e(P)/log_e(s) ≈ k·log_e(10)/t.

So for some fixed ρ (and thus fixed t), the buffer size is a function of the overflow probability. From the previous formula, clearly, a doubling of the buffer size will reduce the probability of overflow to 10^{−2k}. Therefore we see that inexpensive backup buffers can reduce packet loss to arbitrarily small values if t is not too small. As will be shown presently, for power-tail distributions t can be extremely small even for moderate ρ.
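The buffer-size relation can be sketched directly, comparing the exact integer N from s^N ≤ P with the small-t approximation N ≈ k·ln(10)/t; the function names are our own:

```python
import math

def buffer_size(s, P=0.01):
    """Smallest N with s**N <= P (exact form of N(rho))."""
    return math.ceil(math.log(P) / math.log(s))

def buffer_size_approx(s, k=2):
    """N(rho) ~ k*ln(10)/t with t = 1 - s, valid for t << 1."""
    return k * math.log(10.0) / (1.0 - s)

assert buffer_size(0.5) == 7      # 0.5**7 ~ 0.0078 <= 1%, 0.5**6 is not
assert buffer_size(0.99) == 459   # close to the approximation ~460.5
```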
4.1.2.1 Some Examples
Our purpose in this chapter is to examine the effect which PT distributions and their truncated cousins have upon buffer sizes. For our "base" function we have chosen α = 1.4 and θ = 0.5. [GREI95] has shown that for fixed α, system behavior is quite insensitive to changes in θ, so any intermediate value will do. On the other hand, performance is very sensitive to α. The value α = 1.4 fits the data given in [LELA94]. For comparison, we have included the Erlangian-2 density, E₂(x) = μ²xe^{−μx}, and the hyperexponential-2 density, H₂(x) = pμ₁e^{−μ₁x} + (1 − p)μ₂e^{−μ₂x}. In all cases, the μ's have been chosen to give a mean interarrival time of 1. As always, the Erlangian-2 has a squared coefficient of variation of C² = 1/2. The H₂ function, however, is a 3-parameter function, and even after choosing an appropriate C², one arbitrary parameter remains. For our first set of calculations we chose p = .0001 and C² = 4.75, the same as TPT(8). The results for P = .01 (1 percent primary-buffer overflow probability) are given in Figure 3. In Figure 3, both the X and Y axes are linear. On the X axis we have the utilization rate ρ; on the Y axis, the buffer size needed in a GI/M/1 queue for overflow to be less than 1%. As ρ approaches 1, the buffer size grows unboundedly. Plotting this on a linear scale does not show us what is happening, due to the immensity of the buffer size required. We can control the variation along the Y axis by using a logarithmic scale; even so, for large M, C² becomes unboundedly large. In Figure 4, we plot log[(1 − ρ)N] versus ρ instead. For the rest of the charts in this paper, we will plot using logarithmic scales, usually multiplying by 1 − ρ. The curves are discontinuous because N is an integer function, and have negative slopes for small ρ because of the factor 1 − ρ. Figure 5 shows that although the buffer size can become very large as ρ approaches 1, all curves (except that for M = ∞) are finite at ρ = 1.
Figure 3: Primary buffer size needed in a GI/M/1 queue for overflow to be less than 1%, as a function of ρ = 1/(μ·x̄).
Figure 4: Primary buffer size needed in a GI/M/1 queue for 1% overflow, multiplied by 1 − ρ and plotted on a logarithmic scale. The curves are discontinuous because N is an integer function, and have negative slopes for small ρ because of the factor 1 − ρ.
Figure 5: Primary buffer size needed in a GI/M/1 queue for 1% overflow, multiplied by 1 − ρ and plotted on a logarithmic scale, for 0.99 ≤ ρ ≤ 1. As we can see here, all curves (except that for M = ∞) are finite at ρ = 1.
It is true for all distributions with finite variance (see [LIPS92]) that

    lim_{ρ→1} t/(1 − ρ) = 2/(C² + 1).

Therefore,

    lim_{ρ→1} [(1 − ρ)·N(ρ)] = log_e(1/P)·(C² + 1)/2.

Except for very large M, Figure 5 clearly shows that a limit exists. But since C² goes to infinity as M does, the PT distribution itself must yield an infinite limit. This is certainly true. [GREI95] has shown that as ρ approaches 1,

    t(ρ) → (1 − ρ)^{1/(α−1)}.

Therefore,

    (1 − ρ)·N(ρ) → (1 − ρ)/(1 − ρ)^{1/(α−1)} → ∞.

How interesting. The buffer size required to prevent overflow approaches infinity as ρ approaches 1 even when we multiply by 1 − ρ. Put differently, for PT distributions with 1 < α ≤ 2,

    lim_{ρ→1} (1 − ρ)^{1/(α−1)}·N(ρ)

is finite and greater than 0. While these limits are interesting in their own right, and show that the calculations are consistent with theory, they may not be of much use for practical performance analysis. Most of the extreme behavior occurs for ρ > 0.9, where almost any system would be expected to behave badly. Of more interest is the range 0.5 < ρ < 0.9. In this range it is clear that E₂ and Poisson renewal processes predict no great buffer problems. Even the H₂ distribution gives results very similar to those of the M/M/1 queue, except above ρ = 0.9, where it finally rises abruptly to approach the same limit as the curve for the TPT(8), as it must, since they have the same C². The implication is strong here that C² is not as significant as it is, say, in the mean queue length of the steady-state M/G/1 queue. To explore this further we compared the TPT(32), with C² = 5033.44…, with various H₂ distributions with exactly the same mean and squared coefficient of variation. For the different H₂'s we selected p = .0001, .00001, .000001, and .0000001 (p must be smaller than .0004 in order to get such a big C²). The results are given in Figures 6 and 7. We see that rare-event models are insufficient to adequately model this behavior. In such models most traffic is well behaved, with the rare events or outliers creating the bad behavior; with power-tail distributions, however, the bad behavior lies across all scales, so it is impossible to point to a few outliers as the cause of the problem. Recall that all the buffer sizes grow unboundedly as ρ approaches 1, so we once again plot (1 − ρ)·N(ρ) on a logarithmic scale. It is clear from Figure 6 that none of the curves have anything in common, except near ρ = 0 and at ρ = 1. Figure 7 shows they have the same asymptotic value as ρ approaches 1.
Figure 6: Comparison of primary buffer sizes between a TPT(32) and various H₂ distributions with the same C², as a function of ρ = 1/(μx̄). Buffer size × (1 − ρ) is plotted on a logarithmic scale. For the H₂'s, p = 10^{−k}, for k = 4, 5, 6, 7. The M/M/1 queue is included for reference.
Figure 7: Comparison of primary buffer sizes between a TPT(32) and various H₂ distributions with the same C², for 0.99 ≤ ρ ≤ 1.
Figure 8: Primary buffer size of a GI/M/1 queue for the PT and various TPT distributions as a function of α, with ρ = 1/(μx̄) = 0.8; α = 1.1 to 1.3, buffer size = 1 to 10^8.
The full PT curve increases smoothly throughout the range, but the others behave as an M/M/1 queue would for small α, and at different values of α jump rapidly to a higher level. We expect that this is an artifact of the H₂ distributions. Each can be thought of as generating a Poisson stream of packets, interspersed infrequently (with probability p) with an extremely long pause. When ρ is small, the pause is long enough for the queue to drain. As ρ increases, enough packets arrive during the busy times to back up the queue sufficiently that it cannot drain during the quiet time. The smaller p is, the closer ρ must be to 1 before the queue can build up even during the busy times. Clearly, this describes such specialized behavior that H₂ functions cannot be used to describe a general behavior pattern, at least not for buffer problems. We see as a general rule that in the range of interest, C² does not tell the whole story (or even a good part of it). In the previous figures we chose the power parameter to be α = 1.4, matching the experimental value that appeared in [LELA94]. We now describe how performance varies over the critical range 1 < α ≤ 2 for intermediate utilization, 0.7 ≤ ρ ≤ 0.9. In Figures 8, 9 and 10 we see how the degree of truncation of the power tail affects performance when ρ = 0.8. The M/M/1 queue (M = 1) is again included as reference. We see that even for an M as small as 16, the TPT and PT distributions yield comparable results for α ≥ 1.4. But below that value the buffer sizes become extraordinarily large, and below α = 1.1 different truncations yield very different results; even TPT(64) does not come close to the full PT distribution. In this region the buffer sizes are so big that they become meaningless for a real-world situation: in the near future, at least, can we expect a host to process 10^{12} packets in a single hour, let alone store them? Thus we must conclude that systems experiencing PT arrivals with α < 1.1 never reach a steady state. Another modelling procedure must be found.
Figure 9: Primary buffer size of a GI/M/1 queue for the PT and various TPT distributions as a function of α, with ρ = 1/(μx̄) = 0.8; α = 1.3 to 2.0, buffer size = 1 to 10^8.
Figure 10: Primary buffer size of a GI/M/1 queue for the PT and various TPT distributions as a function of α, with ρ = 1/(μx̄) = 0.8; α = 1.1 to 1.3, buffer size = 10^8 to 10^{14}.
Figure 11: Primary buffer size needed for 1% overflow in a GI/M/1 queue for the TPT(64) distribution as a function of α, with ρ = 1/(μx̄) = 0.7, 0.8 and 0.9.
Figure 11 shows that for α near 2, PT distributions and their truncated cousins behave like other distributions; their unusual behavior only becomes significant when α goes below 1.4. System behavior seems to vary smoothly with increasing α, but keep in mind that buffer size is given on a log scale: even at α = 2 there is a factor of 2 difference between the ρ = 0.7 and the ρ = 0.9 curves.

4.1.3 GI/M/1/N Queues - Finite Buffer
An explicit steady-state expression for the probability of finding k customers in a GI/M/1/N queue is only known in terms of LAQT and is given in [LIPS92]:

    π(0 | N) = g(N)·Ψ[U^N V],
    π(k | N) = g(N)·Ψ[U^{N−k}],  1 ≤ k ≤ N,        (28)

where g(N) is the normalization constant fixed by Σ_{k=0}^{N} π(k | N) = 1, U := A^{−1}, Ψ[·] is defined in (11), and A is given by (26). The arrival probabilities (i.e., the probabilities that an arriving packet will see k packets already in the buffer) are different. Let N be the size of the buffer; then

    a(k | N) = K(N)·Ψ[U^{N−k}]  for 0 ≤ k ≤ N.

K(N) is the normalizing factor making the sum of the probabilities equal to 1. That is,

    Σ_{k=0}^{N} a(k | N) = 1  ⟹  1/K(N) = Ψ[U^N + U^{N−1} + ··· + U + I] = Ψ[(I − U)^{−1}(I − U^{N+1})].

Details of how to compute this are given in [LIPS92]. The probability that a packet will be lost is the same as the probability that an arriving packet will see a full buffer, and is given by a(N | N). Thus

    Pr(N) = K(N).
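The loss probability Pr(N) = K(N) can be sanity-checked numerically: with a single phase (M = 1) it must reduce to the classical M/M/1/N loss probability. This sketch assumes A = I + (1/μ)B − Q as in (26); all function names are our own:

```python
import numpy as np

def gim1n_loss(p, B, mu, N):
    """Loss probability a(N|N) = K(N) for a GI/M/1/N queue (LAQT form)."""
    M = len(p)
    I = np.eye(M)
    Q = np.outer(np.ones(M), p)          # Q = eps' p, Eq. (27)
    A = I + B / mu - Q                   # Eq. (26)
    U = np.linalg.inv(A)
    S = np.linalg.solve(I - U, I - np.linalg.matrix_power(U, N + 1))
    invK = p @ S @ np.ones(M)            # 1/K(N) = Psi[(I-U)^-1 (I-U^{N+1})]
    return 1.0 / invK

# sanity check against the M/M/1/N loss formula (M = 1, lambda = 1)
mu, N = 1.25, 10                         # rho = 0.8
loss = gim1n_loss(np.array([1.0]), np.array([[1.0]]), mu, N)
rho = 1.0 / mu
closed = (1 - rho) * rho**N / (1 - rho**(N + 1))
assert abs(loss - closed) < 1e-10
```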
Figures 12 and 13 are similar to Figures 4 and 5, except that N(ρ) need not be multiplied by (1 − ρ), since N(ρ = 1) itself is finite. In Figure 5 we saw the functions blow up as ρ approaches 1; the same pattern occurs here, with the blow-up now at ρ = 1/(1 − P): when the packets arrive too quickly, the buffer size must increase appropriately to keep the rejection rate down to P. The distributions included in Figure 12 are truncated power-tails with M = 1, 8, 16, 32, plus E₂ and H₂ (C² = 4.75). The H₂/M/1/N system behaves no differently from the M/M/1/N until ρ > 0.9, even though it has the same C² value as the curve labeled TPT(8). So we see again that C² does not tell the story, certainly not in the range of primary interest.
Figure 12: Buffer size for 1% loss in a GI/M/1/N queue for various distributions, as a function of ρ = 1/(μx̄).
Figure 13: Same as the previous figure except for higher values of ρ, and here we do not multiply by (1 − ρ). Note that the curves blow up at ρ = 1/(1 − P) = 1/0.99 = 1.010101….
4.1.4 M/G/1 Queues - Infinite Buffer
Erratic traffic may be caused by files whose sizes are distributed according to a power-tail law, but which are broken up into numerous smaller packets that disburse upon transmission, giving an appearance of burstiness. If we imagine that they are reassembled at the server and stored as one job, then this could be adequately described as an M/G/1 system. But now we are faced with an obvious problem. The Pollaczek-Khinchin formula states clearly that a service distribution with infinite variance must produce an infinite mean queue length for steady-state M/G/1 queues, for all ρ. If the time to process a packet is proportional to the size of the packet, then the amount of buffer space needed to hold waiting packets must be proportional to the waiting time, which over a long time must grow unboundedly if C² = ∞. However, if an entire reassembled packet can be stored in a single slot, then the steady-state probability that an arriving packet will find that N or more buffer slots are already taken is not infinite, even though the mean queue length is. We hypothesize, without proof, that the steady-state (and arrival) probabilities for PT distributions satisfy, for large n,

    a(n) = π(n) → f(ρ)·(const/n^α).

We explore this more closely in the next section. Then, if α > 1,

    Σ_{n=1}^∞ 1/n^α

is a convergent series, and

    lim_{N→∞} Pr(N) = lim_{N→∞} Σ_{n=N}^∞ a(n) = 0,

where now (and hereafter) ρ = λx̄; λ is the Poisson arrival rate and x̄ is the mean service time. This says that there exists a finite N for which Pr(N) = ε for any ε with 1 > ε > 0, however small ε is. Notice that this would be true even though q̄ is infinite if α ≤ 2, for then

    q̄ := Σ_{n=1}^∞ n·a(n) ≥ (1 − ρ)·const·Σ_{n=1}^∞ n/n^α = (1 − ρ)·const·Σ_{n=1}^∞ 1/n^{α−1} = ∞.
(Note that if α ≤ 1 then there can be no steady state.) We assume here, then, that each reconstituted packet takes up one slot. The steady-state and arrival probabilities are the same for an M/G/1 queue, and are given in LAQT form by:

    a(n) = π(n) = (1 − ρ)·Ψ[U^n].

The probability that an arriving packet will find N or more slots full is given by:

    Pr(N) = (1 − ρ)·Σ_{n=N}^∞ Ψ[U^n] = (1 − ρ)·Ψ[U^N(I − U)^{−1}].
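The tail sum collapses to a single linear solve, since (I − U)^{−1} and U^N are polynomials in U and therefore commute. A minimal sketch taking U and ρ as given (for the degenerate M/M/1 case, U = [[ρ]] and Pr(N) reduces to ρ^N); the function name is our own:

```python
import numpy as np

def mg1_overflow(p, U, rho, N):
    """Pr(N) = (1-rho) * Psi[U^N (I-U)^{-1}] for the M/G/1 queue."""
    M = len(p)
    I = np.eye(M)
    tail = np.linalg.solve(I - U, np.linalg.matrix_power(U, N))  # (I-U)^-1 U^N
    return (1.0 - rho) * (p @ tail @ np.ones(M))

# M/M/1 check: U = [[rho]] gives Pr(N) = rho**N
assert abs(mg1_overflow(np.array([1.0]), np.array([[0.8]]), 0.8, 10) - 0.8**10) < 1e-12
```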
Figure 14: Primary buffer size needed for overflow of an M/G/1 queue to be less than 1%, as a function of ρ = λx̄. Note that we are plotting buffer size × (1 − ρ) on a log scale.
Figure 15: Same as the previous figure except for higher values of ρ. There is a slight upturn for the H₂ curve near ρ = 1, but this is almost surely due to numerical instability. All curves appear to be finite at ρ = 1.
We have calculated these probabilities for the usual collection of distributions, and present the results in Figures 14 and 15. All curves appear to be finite at ρ = 1, as shown by Figure 15. The behavior is similar to that of the GI/M/1 queue (Figure 4), but the primary buffer sizes are somewhat bigger here. However, these curves are concave downward; that is, the buffer size needed grows somewhat more slowly than 1/(1 − ρ). Also, the H₂ system behaves peculiarly, even for the relatively small C² = 4.75, though it does approach the same value as the TPT(8) system as ρ approaches 1. Figure 15 shows a slight upturn for this curve very close to ρ = 1, but this is almost surely due to numerical instability.
Figure 16: Primary buffer size of an M/G/1/N queue for various TPT distributions as a function of ρ = λx̄.

4.1.5 M/G/1/N Queues - Finite Buffer
Finally we reach our last system, and the last figure in this section. As with the open M/G/1 system, the arrival probabilities and the steady-state probabilities are equal. (This is not true for GI/M/1 and GI/M/1/N queues.) Therefore, from [LIPS92],

    a(n; N) = π(n; N) = G(N)·Ψ[U^n]  for 0 ≤ n < N,
    a(N; N) = π(N; N) = G(N)·Ψ[U^{N−1}λV],

where

    [G(N)]^{−1} = Ψ[(I − U)^{−1}(I − U^N)] + Ψ[U^{N−1}λV].

The probability that an arriving packet will be rejected is a(N; N). Therefore,

    Pr(N) = Ψ[U^{N−1}λV] / (Ψ[(I − U)^{−1}(I − U^N)] + Ψ[U^{N−1}λV]).

Figures 16 and 17 show the results of our calculations for 1.0% rejection. It is not clear how the true PT would behave, but the buffer size needed for P% rejection
Figure 17: Same as the previous figure except for higher values of ρ.
will be finite at ρ = 1. As with the GI/M/1/N queue, N(ρ = 1) is finite but should grow unboundedly as ρ approaches 1/(1 − P), with P = 0.01 throughout this chapter.
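The M/G/1/N rejection formula above can also be sanity-checked numerically: with a single phase (M = 1) it must reduce to the classical M/M/1/N loss probability. The sketch below assumes, by duality with (26), that A = I + (1/λ)B − Q for the M/G/1-type systems, and reads the n = N term as G(N)Ψ[U^{N−1}λV]; all names are our own:

```python
import numpy as np

def mg1n_loss(p, B, lam, N):
    """Pr(N) = a(N;N) for an M/G/1/N queue, reconstructed LAQT form."""
    M = len(p)
    I = np.eye(M)
    V = np.linalg.inv(B)
    Q = np.outer(np.ones(M), p)
    A = I + B / lam - Q                 # assumed dual of Eq. (26)
    U = np.linalg.inv(A)
    Psi = lambda X: p @ X @ np.ones(M)
    top = Psi(np.linalg.matrix_power(U, N - 1) @ (lam * V))
    body = Psi(np.linalg.solve(I - U, I - np.linalg.matrix_power(U, N)))
    return top / (body + top)

# check against the M/M/1/N loss formula (M = 1)
lam, mu, N = 0.8, 1.0, 10
rho = lam / mu
loss = mg1n_loss(np.array([1.0]), np.array([[mu]]), lam, N)
closed = (1 - rho) * rho**N / (1 - rho**(N + 1))
assert abs(loss - closed) < 1e-10
```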
4.2 Analyses of Single Server Systems
We have shown how to integrate power-tail distributions and their truncated cousins into the analysis of communications networks using various GI/G/1 queues with and without finite buffers. Other types of test functions (e.g., H₂(x)) must surely be inadequate. On the other hand, more complicated processes which heuristically build in correlations may well be unnecessary. The models given here, though relatively simple, can be used by researchers who use discrete event simulations. The reason this could be very important is that statistics for PT distributions converge to their mean much more slowly than for distributions with finite variance [GREI95]. Our simple analytic model and results could serve as base-line comparisons to see whether an equivalent simulation is anywhere near convergence, before the researcher attempts a more complicated simulation model. Whereas for most processes the average of a set of data converges to its mean as 1/√n (n is the number of data points), data generated by PT distributions with 1 < α < 2 converge as 1/n^β, where β = 1 − 1/α. If α ≤ 1, the data do not converge at all! In fact, if one were to test whether a given set of data was generated by a PT distribution with α < 2, the test would fail even if the hypothesis were true. If it turns out that PT distributions play a significant role in communications systems, then much more research will have to be carried out on the statistical convergence of simulations. It will probably be necessary to find new tools for analysis, based on transient systems, because the number of events needed to bring a PT system to its steady state may exceed the lifetime of that system.
4.3 Asymptotic Behavior for Full Power-Tails in M/G/1 Systems
Earlier in this chapter, we calculated the queue-length probabilities and buffer overflow probabilities in M/PT/1 and PT/M/1 queueing systems. We did this using formulas from [LIPS92] and the matrix representations of the truncated power-tail distributions; steady-state analytical solutions only exist for the truncated power-tail subclass. Here we show that the probability of an arriving customer seeing n customers in an M/G/1 system approaches a constant as a function of the utilization factor ρ, the power parameter α, and the queue length n. How quickly the probabilities converge depends on M and the value of the utilization parameter ρ. Fortunately, they converge most quickly for larger M (the larger M, the more the truncated power-tail approaches the full power-tail) and low-to-mid values of ρ, where most systems operate. We now present a numerical study of these formulas.

4.3.1 Formulas for a(n)

Remember that a(n) is the probability of an arriving packet seeing n packets already in the queue ahead of it at arrival, and α is the power parameter. In the previous section, we hypothesized that the steady-state and arrival probabilities for a full power-tail distribution satisfy, for large n,
    a(n) = π(n) → f(ρ)/n^α.        (29)

The steady-state and arrival probabilities are equal for an M/G/1 queue, and are given in LAQT form by a(n) = π(n) = (1 − ρ)Ψ[U^n]. The probability that an arriving packet will find N or more slots full is given by

    Pr(N) = (1 − ρ)·Σ_{n=N}^∞ Ψ[U^n] = (1 − ρ)·Ψ[U^N(I − U)^{−1}].

4.4 Asymptotic Convergence

Our claim is that a(n) asymptotically approaches this function. We now examine the question numerically. Given (29), we can see that

    a(n)·n^α → f(ρ).

For our examples we chose α = 1.4, which matches the data from [LELA94], and arbitrarily chose θ = 0.50. We then vary the value of M to see if we can capture this convergence. In Figure 18, the X-axis is the queue length and the Y-axis is a(n)·n^α. Here we see that as M increases, a(n)·n^α approaches a constant, f(ρ), quickly even for lower values of n. We also see, because the tails are truncated, that a(n)·n^α must necessarily drop off; only for the full power-tail does it approach a constant. For instance, for TPT(16) the curve quickly reaches its pinnacle and then drops off quickly. For TPT(18) the pinnacle is also reached quickly, but the curve drops off more slowly. For TPT(20) and TPT(22) we see the same pattern: the curve reaches its peak and drops off even more slowly. At TPT(32), the curve doesn't drop off until n > 1500, but it must drop off. Analysis of higher values of M shows the same thing: the larger the truncation parameter, the larger n will be before the curve drops off. In Figure 19, we set M = 72. We have found that for this value, a(n) is close to that of a full power-tail even for n as big as 1500; increasing M further does not seem to help significantly. As in the previous chart, we set α = 1.4, but this time, instead of varying M, we vary ρ. From this diagram it is clear that a(n)·n^α approaches some function of ρ, although it is not yet clear what f(ρ) is. Several hypotheses have been tried, but analysis so far has yet to bear them out.
Figure 18: a(n)·n^α versus n for an M/TPT/1 queue with a truncated power-tail service-time distribution. In this example, we fix ρ at 0.50 and vary M.
38
2 Rho=.95 1.8 1.6
a(n)*n^Alpha
1.4 1.2 1
Rho=.85
0.8 0.6 Rho=.75 0.4 Rho=.65 0.2 0 0
500
Rho=.05
Rho=.15 1000
Rho=.25
Rho=.35
Rho=.55 Rho=.45 1500
n
Figure 19: a(n)·n^α versus n for an M/TPT(72)/1 queue with a truncated power-tail service-time distribution. In this example, we fix M and vary ρ.
Chapter 5

Comparison of Buffer Usage Utilizing Single vs. Multiple Servers in Network Systems with Power-Tail Distributions

In this chapter, we extend the formulas to include multiple servers, and present the results of a parametric study of the buffer size needed to prevent overflow or loss in single- and multiple-server systems where data arrivals or service times are distributed according to the power-tail law. For systems with a power-tail arrival distribution and multiple servers (PT/M/C), we gain no performance increase by utilizing multiple exponential servers. However, for systems with a Poisson arrival rate and power-tail service times (M/PT/C), the improvement from using multiple slower servers over a single faster one can be great indeed. Our purpose in this dissertation is to examine the effect which power-tail distributions and their truncated cousins have upon buffer sizes, and how utilizing multiple servers can reduce the buffer space required to limit buffer overflow to some threshold. The formulas we reviewed in Chapter 2 are all appropriate for single-server and some multiple-server models. For multiple servers, extended formulas are included as required; in each section we will introduce the additional terms needed to generate them.

5.1 Multiple Server Queueing Systems
Linear Algebraic Queueing Theory (LAQT) [LIPS92] is currently the only analytical solution available for an M/G/C system, i.e., one with a generalized service time. LAQT uses phase distributions to get arbitrarily close to a full power-tail by modeling it with truncated power-tails. With LAQT, we can model the buffer overflow probabilities for systems with truncated power-tail distributions of arbitrarily long length.
M/G/C type systems are of great practical importance but have not been studied in detail, mainly because of the computational intractability involved. The linear algebraic approach casts the entire system into a matrix form which makes it amenable to easy computation. The idea of dealing with multiple servers which have non-exponential service time distributions is not trivial. Only when they are exponential can the idea of multiple servers be replaced by the concept of load dependence, which states that the service rate of a server depends on the number of customers present at the server. Consider the following two models. In the first model there is a single server that doubles its service rate when a second customer is present. In the second model a subsystem has two servers, one for each customer. For exponential servers these two models yield the same equations unless we mark the customers to keep track of them. For non-exponential servers, however, the two scenarios lead to different models. The first scenario can be modeled by multiplying the completion rate matrix, M, by a constant factor when a second customer arrives; this is correct since the internal state description and the dimension of the matrix representation are unchanged. The second scenario is completely different: when two customers are in the subsystem, each of them is at a certain phase, so a formal notation must be used which keeps track of where both customers are at any point in time.

5.1.1 Steady State M/ME/C//N Loop
Consider the queueing system shown in Figure 20. S_1 contains C identical servers, each in turn made of m phases. The box in the bottom half of the figure shows what each of the servers looks like inside. While dealing with single-server subsystems we made no real distinction between the server and S_1. Now, since S_1 has C servers, we must keep track of all the active customers; but no more than C customers can be active in S_1 at the same time. If there are n ≥ C customers at S_1, then n − C customers are waiting in the queue at S_1. When a customer completes service at S_1 he leaves the subsystem and joins the queue at S_2, which has an exponential (perhaps load dependent) service time distribution. If there are customers in the queue at S_1 at the time, one of them comes in and takes the place of the departed customer.
[Figure 20 diagram: an M/ME/C//N loop. Subsystem S_1 contains Server 1, Server 2, …, Server C in parallel, holding n customers; subsystem S_2 is an exponential server with rate λ, holding k = N − n customers. The inset shows the inside of one server: entrance probabilities p_1, …, p_i, …, p_m; phase completion rates μ_1, …, μ_i, …, μ_m; internal transition probabilities P_ij; and exit probabilities q_j, …, q_m.]

Figure 20: Steady state M/ME/C//N loop. S_2 is an exponential server, but S_1 is made up of two or more identical nonexponential matrix-exponential servers, each made up of m phases and represented by the vector-matrix pair ⟨p, B⟩.
Each of the servers is described by the following set of objects:

  p, the entrance vector;
  q′, the exit vector;
  ε′, the column vector of all 1's;
  P, the transition matrix;
  M, the completion rate matrix;
  B = M(I − P), the service rate matrix;
  V = B⁻¹, the service time matrix.

For a detailed study refer to Chapter 6 of [LIPS92]. The system has a total of N customers, and n is the number that are at S_1 (including the ones being served) at any particular time.
Definition 1

  Δ_k := { i = ⟨ε_1, ε_2, …, ε_m⟩ | 0 ≤ ε_l ≤ k, and Σ_{l=1}^{m} ε_l = k }

for 1 ≤ k ≤ C. Δ_k gives the list of internal states of S_1 when there are k active customers. Each object i (which is an m-tuple) is an internal state of the subsystem S_1, which has C identical servers; there are ε_1 customers at phase 1, ε_2 customers at phase 2, and so forth, each of them however in a different server. Taking a typical example from [LIPS92], suppose that m = 5 and k = 4. A typical internal state would be {2, 2, 4, 5}, meaning that one customer is at phase 2, another customer is also at phase 2, and one each is at phases 4 and 5, all of them in different servers. Since the servers are identical, there is no need to distinguish among them, so this can also be read as: no customers at phase 1 (ε_1 = 0) in any server, two customers at phase 2 of different servers (ε_2 = 2), none at phase 3 (ε_3 = 0), one at phase 4 (ε_4 = 1) and one at phase 5 (ε_5 = 1). Thus the new, equivalent, ordered sequence is ⟨0, 2, 0, 1, 1⟩.
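Definition 1 is easy to make concrete: the states in Δ_k are just the occupancy m-tuples summing to k. A small sketch (the function names are ours, not from [LIPS92]):

```python
from itertools import combinations_with_replacement
from math import comb

# Delta_k: all ways to place k indistinguishable active customers
# onto m phases, recorded as occupancy m-tuples <e_1, ..., e_m>.
def delta(m, k):
    states = []
    for phases in combinations_with_replacement(range(1, m + 1), k):
        occ = [0] * m
        for ph in phases:
            occ[ph - 1] += 1
        states.append(tuple(occ))
    return states

# The example from the text: m = 5 phases, k = 4 active customers.
# The multiset {2, 2, 4, 5} corresponds to the occupancy vector <0,2,0,1,1>.
states = delta(5, 4)
print(len(states))                # C(k+m-1, m-1) = C(8, 4) = 70 states
print((0, 2, 0, 1, 1) in states)  # True
```

The count C(k+m−1, m−1) is exactly the dimension of the Δ_k state space, which is why the matrices below grow so quickly with m and C.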
Definition 2

μ_ν(l) is the probability rate that one of the customers at phase ν will complete, given that there are l customers at that phase. Note that there is no distinction between having k identical servers and having only one server whose phases are load dependent.

For 1 ≤ k ≤ C, let ⟨i⟩ = ⟨ε_1, ε_2, …, ε_ν, …, ε_m⟩ ∈ Δ_k and ⟨j⟩ = ⟨δ_1, δ_2, …, δ_ν, …, δ_m⟩ ∈ Δ_k, where the ε's and δ's are nonnegative integers whose sum is k.

Definition 3

M_k is a diagonal matrix with components

  [M_k]_ii = Σ_{ν=1}^{m} μ_ν(ε_ν).   (30)

This is called the completion rate matrix.
Definition 4

[P_k]_ij, for i, j ∈ Δ_k, is zero unless i = j or [⟨i⟩ − ⟨j⟩] has one 1 (at position ν) and one −1 (at position η), the rest being 0. If this is satisfied, then

  [P_k]_ij = μ_ν(ε_ν) [P_1]_νη / [M_k]_ii   for i ≠ j,   (31)

and

  [P_k]_ii = (1 / [M_k]_ii) Σ_{ν=1}^{m} μ_ν(ε_ν) [P_1]_νν.   (32)

This is the probability that a customer, upon completing at some phase in S_1, will go to another phase in the same server in S_1, thereby taking the system from state i ∈ Δ_k to j ∈ Δ_k.
Definition 5

[R_k]_ij, for all i ∈ Δ_{k−1}, j ∈ Δ_k, is zero unless [⟨j⟩ − ⟨i⟩] has one 1 (at position ν) and the rest 0's. If this is satisfied, then

  [R_k]_ij = p_ν.   (33)

This is the probability that a customer who, upon entering S_1, finds it in internal state i ∈ Δ_{k−1}, will go to the server and phase that puts the system in state j ∈ Δ_k.

Definition 6

[Q_k]_ij, for all i ∈ Δ_k, j ∈ Δ_{k−1}, is zero unless [⟨i⟩ − ⟨j⟩] has one 1 (at position ν) and the rest 0's. If this is satisfied, then

  [Q_k]_ij := μ_ν(ε_ν) q_ν / [M_k]_ii.   (34)

This is the probability that a customer, upon leaving S_1 when the system was in state i ∈ Δ_k, leaves the system in state j ∈ Δ_{k−1} after he exits.
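Definitions 3 through 6 translate directly into code. The sketch below (all names ours) builds the four matrices for a pure M/G/C station, so it assumes μ_ν(l) = l·μ_ν; a useful structural check is that each row of P_k plus the matching row of Q_k sums to 1, since upon a completion a customer either moves to another phase or exits.

```python
from itertools import combinations_with_replacement

def delta(m, k):
    """All occupancy m-tuples <e_1,...,e_m> with sum k (Definition 1)."""
    out = set()
    for phases in combinations_with_replacement(range(m), k):
        occ = [0] * m
        for ph in phases:
            occ[ph] += 1
        out.add(tuple(occ))
    return sorted(out)

def build_matrices(k, p, P1, q, mu):
    """M_k, P_k, R_k, Q_k of Definitions 3-6, assuming mu_nu(l) = l * mu_nu."""
    m = len(p)
    Dk, Dk1 = delta(m, k), delta(m, k - 1)
    idx = {s: i for i, s in enumerate(Dk)}
    idx1 = {s: i for i, s in enumerate(Dk1)}
    n, n1 = len(Dk), len(Dk1)
    Mk = [[0.0] * n for _ in range(n)]
    Pk = [[0.0] * n for _ in range(n)]
    Rk = [[0.0] * n for _ in range(n1)]    # Delta_{k-1} -> Delta_k
    Qk = [[0.0] * n1 for _ in range(n)]    # Delta_k -> Delta_{k-1}
    for s in Dk:
        i = idx[s]
        Mk[i][i] = sum(e * mu[v] for v, e in enumerate(s))           # (30)
        for v, e in enumerate(s):
            if e == 0:
                continue
            rate = e * mu[v]
            for h in range(m):                   # internal move v -> h; h == v
                t = list(s); t[v] -= 1; t[h] += 1            # gives (32)
                Pk[i][idx[tuple(t)]] += rate * P1[v][h] / Mk[i][i]   # (31)
            t = list(s); t[v] -= 1
            Qk[i][idx1[tuple(t)]] = rate * q[v] / Mk[i][i]           # (34)
    for s in Dk1:
        for v in range(m):
            t = list(s); t[v] += 1
            Rk[idx1[s]][idx[tuple(t)]] = p[v]                        # (33)
    return Mk, Pk, Rk, Qk

# A 2-phase example: hyperexponential entrance, no internal transitions.
p = [0.6, 0.4]; P1 = [[0.0, 0.0], [0.0, 0.0]]; q = [1.0, 1.0]; mu = [1.0, 0.2]
Mk, Pk, Rk, Qk = build_matrices(2, p, P1, q, mu)
print([abs(sum(row) - 1.0) < 1e-12 for row in Rk])  # R_k rows sum to 1
```

Since P_1's rows plus q sum to 1, the rows of P_k + Q_k also sum to 1, which is a cheap sanity test on any implementation of these definitions.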
Definition 7

π_k(n; N) := the steady-state probability vector that there are n customers at S_1 and N − n customers at S_2. The i-th component of the vector is the steady-state probability that the active customers at S_1 are collectively in state i ∈ Δ_k. r(n; N) = π_k(n; N) ε′_k is the associated scalar probability, where k = n if n ≤ C and k = C if n > C.
5.1.2 Formal Definition of the M/ME/C//N System

Definition 8

The generalized M/ME/C//N system can be defined as a two-subsystem loop where S_2 is exponential (maybe load dependent) and S_1 is a network of load dependent exponential servers conforming to the following restrictions. S_1 can have a maximum of C active customers at any time. If S_1 has more than C customers, the excess customers queue up at S_1. However, if there are fewer than C customers at S_1, an arriving customer enters immediately. When a customer leaves S_1, a new customer, if available, enters immediately to take its place.

Now that we have redefined things quite a bit, it is worth redescribing our subsystem. When a customer enters S_1 he goes to server ν with probability [p]_ν. The probability rate of leaving that server is μ_ν(l), where l is the number of customers at server ν. If the customer completes service at server ν he goes to server η with probability [P_1]_νη, or leaves S_1 with probability [q]_ν. In this terminology, if all the servers in S_1 satisfy μ_ν(l) = l μ_ν, we call it a pure M/G/C queue. If one or more of the servers is load dependent [i.e., if μ_ν(l) < l μ_ν for some l > 1], queueing delays can actually occur inside S_1.
5.1.3 Load Dependence at S_2 (Time-Sharing Systems with Population Constraints)

In order to study time-sharing systems, or for that matter any system that has population size constraints, it is important to convert S_2 into a load dependent server. To propagate this change we need only investigate those equations that contain λ: each λ can be replaced by λ(l), where l is the number of customers at S_2. Since M_k, R_k, Q_k and P_k do not depend on λ, they remain unchanged. The idea of converting the balance equations is very simple: one replaces λ with λ(N − n), where n is the number of customers at S_1, since that leaves N − n customers at S_2. However, our two families of matrices, the U_c's and the U_k(N|C)'s, do change, since they depend on λ. They become:

  U_c(0) := λ(1) V_c,   (35)

  U_c(l) = λ(l + 1) [B_c + λ(l) I_c − U_c(l − 1) M_c Q_c R_c]⁻¹   for l ≥ 1,   (36)

  U_k(N|C) := λ(N − k + 1) [B_k + λ(N − k) I_k − R_{k+1} U_{k+1}(N|C) M_{k+1} Q_{k+1}]⁻¹.   (37)

If we look closely at the U_c's we find that they depend on S_2.
We assume that (l) = 8 l. We can assume that S behaves as a Poisson source of customers to S . This should be the case because if the maximal service rate of S is greater than , then as N becomes large the probability that S is idle falls towards 0. For our open system we let N tend to in nity and we expect Uc (N ) to converge. 0 Uc := Nlim U (38) c (N ) = [Bc + Ic 0 UcMc Qc Rc ] : !1 Also, Uc (1jC ) := Nlim U (N jC ) = Uc : (39) !1 c We can calculate our family of Uk(1jC) by Uk (1jC) := Nlim U (N jC ) = [Bk + Ik 0 Rk 1 Uk 1 (1jC)Mk 1 Qk 1 ]0 : !1 k (40) Fortunately everything else remains the same except that we replace N by 1. Our xc0s simplify to (41) xc(C + n) := Nlim x (n; N ) = xc (C )Ucn !1 c where xc(C ) := xc(1jC ): (42) The steady-state probability vectors are as follows, k (k ) = r(0)xk (1jC ) (43) 2
1
1
2
1
+
+
+
+
1
for 0 ≤ k ≤ C, and

  π_c(n) = r(0) x_c(n) = r(0) x_c(C) U_c^{n−C}   (44)

for C ≤ n. The steady-state scalar probabilities are given by

  r(n) = π_k(n) ε′_k,   (45)

where k = n for n < C and k = C otherwise. As usual, we obtain r(0) by the normalization

  1/r(0) = Σ_{k=0}^{C−1} x_k(∞|C) ε′_k + Σ_{n=C}^{∞} x_c(n) ε′_c.   (46)

We have an infinite sum in the second term, but since the x_c(n)'s have a matrix geometric nature we can express this as a closed sum:

  1/r(0) = Σ_{k=0}^{C−1} x_k(∞|C) ε′_k + x_c(C) [I_c − U_c]⁻¹ ε′_c.   (47)

[Figure 21 diagram: Poisson arrivals with rate λ feed a single queue served by two identical servers, each with a power-tail service time distribution.]

Figure 21: An M/G/2 queue with a power-tail service time distribution.
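Equation (38) is a fixed-point equation and can be solved by successive substitution starting from U_c = 0. As a sanity check, the sketch below runs the iteration for the simplest possible case, m = 1 (exponential phases) and C = 2 with μ_ν(l) = lμ, where Δ_C has a single state, every matrix is 1×1, and the fixed point should collapse to the M/M/2 geometric factor λ/(2μ). The scalar reduction is ours; a full implementation would use the Δ_k machinery of Definitions 3-6.

```python
# Scalar instance of (38): U = lam * (B + lam - U * MQR)**-1 for m = 1, C = 2.
# Here Delta_2 = {<2>}, so
#   M_2 = 2*mu (completion rate with both servers busy),
#   B_2 = M_2  (no internal transitions when m = 1),
#   M_2 Q_2 R_2 = 2*mu (a completion drops the state to Delta_1 and the
#                       next customer enters with probability 1).
def solve_Uc(lam, mu, iters=2000):
    B = 2.0 * mu
    MQR = 2.0 * mu
    U = 0.0
    for _ in range(iters):
        U = lam / (B + lam - U * MQR)   # successive substitution on (38)
    return U

lam, mu = 0.8, 1.0
U = solve_Uc(lam, mu)
print(U)                               # ~ lam / (2*mu) = 0.4, the M/M/2 tail ratio
print(abs(U - lam / (2 * mu)) < 1e-9)  # True
```

The iteration converges to the smaller root of the underlying quadratic, which is the stable geometric factor; the larger root would give an unnormalizable solution.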
5.2 M/G/C Queues - No Packets Lost
As before, we assume that the time to process a packet has service time distribution F(x). We also assume that the arrival process is Poisson with rate λ, and that we have both a primary buffer of limited size and a secondary buffer of infinite size. We now allow more than one server (C = 1, 2, …), all with the same service time distribution F(x). For our model, we assume that the total service rate of the system is 1; i.e., if there are two servers, each has a mean service time of 2. This allows a fair comparison: we wish to find out whether a single double-speed server is better than two single-speed servers. This model constitutes an open M/G/C queue. A diagram of the system can be found in Figure 21; in this example C = 2, and we also use C = 1 for comparison.

There are some obvious statements we can make. If there is one customer in the system, then a faster server will be better than two servers, since only one server can be utilized. At what point does the performance of one system surpass the other? For an M/M/2 queue, the doubly fast server is always better: it is impossible to distinguish between two servers and a double-fast one when both servers are active, so when both servers are active the performance is the same, but with one customer in the system the performance is slower for the two-server configuration.

[Figure 22 plot: mean response time (log scale) vs. ρ for M/TPT(8)/1, M/TPT(8)/2, M/TPT(20)/1, M/TPT(20)/2, M/M/1 and M/M/2 queues.]

Figure 22: Mean response time for an M/PT/C queue with TPT(8) and TPT(20) and C = 1, 2 servers. M/M/1 and M/M/2 queues are included for comparison.

In Figure 22, we chart the mean response time for M/TPT(8)/C and M/TPT(20)/C systems with C = 1, 2 servers; M/M/1 and M/M/2 queues are included for comparison. For lower utilizations, a single, faster server gives better response times. However, for TPT(8) systems there is a "crossover" point at around ρ = 0.38, above which two servers give a better response time than a single server. This is what we would expect: at lower utilizations a faster server is better than two servers. However, as the truncated power-tail grows (M increases), this point shifts towards lower values of ρ; for the TPT(20) system, the crossover occurs at a point below ρ = 0.05! So for all but the systems with the shortest truncated power-tail distributions (which do not look power-tail at all), two servers have a better response time than a single double-fast one.

The question for this example, and the primary question of this chapter, is: "For which systems is it better to use multiple servers?" We look at response times in Figure 23, which shows mean response times for M/TPT(8)/C queues with C = 1, 2, 3 servers. For this case, two servers are better than one for ρ > 0.38, and three are better than two for ρ > 0.65. The crossover points distinctly indicate that it is better to use two slower servers than one faster one starting at ρ = 0.38; also, three are better than one at about ρ = 0.5. However, there is only marginal improvement from 2 to 3 servers at about ρ = 0.7. Clearly, the optimal number of servers depends not only on the distribution, but on ρ as well. For ρ > 0.9 almost any system would be expected to behave intolerably badly; of more interest is the range 0.4 < ρ < 0.8.

[Figure 23 plot: mean response time (log scale) vs. ρ for M/TPT(8)/1, M/TPT(8)/2 and M/TPT(8)/3 queues.]

Figure 23: Mean response time for an M/TPT(8)/C queue and C = 1, 2, 3 servers. For this case, two servers are better than one for ρ > 0.38 and three are better than two for ρ > 0.65.

For truncated power-tails with larger M, the advantage goes to the multiple server system; this supports our contention that the larger M is, the better multiple servers will perform relative to a faster, single server. One could argue that this is merely a result of the higher coefficient of variation, C_v², of the higher order power-tails. Indeed, some system performance characteristics, such as the mean queue length, depend solely on the first two moments. But in the previous chapter we showed that for many of the parameters, such as buffer size choice, the first two moments are inadequate to explain what is going on: systems with the same mean service time and variance can yield wildly different behaviors. We will look at this a little more in this chapter.

We now look at the buffer size required to limit overflow. In Figure 24 we chart the buffer size required so that only 1.0% of the arriving packets go to our "overflow" buffer. As would be expected, the necessary buffer size grows unboundedly as ρ approaches 1. In order to control the variation along the y-axis, N was multiplied by 1 − ρ; and as with C = 1, for large values of M, C_v² becomes unboundedly large, so we have plotted log[(1 − ρ)·N] versus ρ.

[Figure 24 plot: buffer size required × (1 − ρ), log scale, vs. ρ for M/TPT(20)/1, M/TPT(20)/2, M/TPT(8)/1, M/TPT(8)/2, M/M/1 and M/M/2 queues.]

Figure 24: Primary buffer size needed in an M/PT/C queue for overflow to be less than 1%, as a function of the utilization ρ.

Our current model assumes that the request size is uniform and each request takes up exactly one slot in the buffer. However, the request size may be longer, or variable (e.g., a complex SQL request), and the service time for that request can still have its own distribution. In this case, we could put the request header in the buffer, where it would take up one slot (i.e., a pointer to the request), and keep the request itself in some other location (a backup buffer). When the request comes to the head of the queue, it can be pulled from backup and serviced. Modeling variable arrival packet size would be an interesting topic and is certainly worthy of future research.

Note that as the truncated power-tail grows (M gets bigger), and it will as the World-Wide Web gets larger and more international, the double server requires a significantly smaller buffer than the double-fast single server. Also of interest is the region of greatest improvement. For lower-middle values of ρ (ρ = 0.2 to 0.5), the required buffer size is reduced by one or two orders of magnitude. As the system gets busier (ρ → 1), the buffer sizes required for the same overflow percentage come together again. Some of this behavior can be explained by the large variance inherent in power-tail distributions. The high coefficient of variation indicates that large jobs will arrive, although infrequently: the system will be processing many small jobs, some mid-size jobs and a rare large job. As ρ increases, the probability of a large job being present increases. For a relatively idle system, a large job will occupy only one of the servers; the other server remains to service the small and mid-sized jobs in the meanwhile. With a busier system, the probability that both servers are simultaneously occupied with large jobs increases; the system saturates, and the arriving jobs land in the backup queue, just as they would with a single server. So as ρ → 1, the advantage of the double server diminishes.

As we did in the previous chapter, we briefly examine the hyperexponential-2 distribution to see if we can obtain the same results.
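The single-fast-versus-several-slow comparison in this section is easy to explore by discrete-event simulation. The sketch below is our own illustrative harness, not the LAQT computation used for the figures: it simulates a FIFO M/G/C queue with total service rate 1 (so C servers each run at rate 1/C) and reports the mean response time. With a heavy-tailed F(x) it would need vastly more samples than shown here to approach steady state.

```python
import heapq
import random
from collections import deque

def mgc_mean_response(lam, C, draw_service, n_jobs=100_000, seed=1):
    """Mean response time of a FIFO M/G/C queue in which each of the
    C servers runs at rate 1/C, so total capacity is 1 for every C."""
    rng = random.Random(seed)
    t, arrivals = 0.0, []
    for _ in range(n_jobs):                  # Poisson arrival stream
        t += rng.expovariate(lam)
        arrivals.append(t)
    in_service = []                          # heap of (finish_time, arrival_time)
    waiting = deque()                        # arrival times of queued jobs
    total, i = 0.0, 0
    while i < n_jobs or in_service:
        if in_service and (i >= n_jobs or in_service[0][0] <= arrivals[i]):
            ft, at = heapq.heappop(in_service)       # a departure
            total += ft - at                         # response time of that job
            if waiting:                              # next queued job starts
                heapq.heappush(in_service,
                               (ft + C * draw_service(rng), waiting.popleft()))
        else:
            a = arrivals[i]; i += 1                  # an arrival
            if len(in_service) < C:
                heapq.heappush(in_service, (a + C * draw_service(rng), a))
            else:
                waiting.append(a)
    return total / n_jobs

# At low utilization the single fast server wins, as the text argues.
exp_service = lambda rng: rng.expovariate(1.0)       # mean-1 service draw
w1 = mgc_mean_response(0.2, 1, exp_service)
w2 = mgc_mean_response(0.2, 2, exp_service)
print(w1, w2)    # M/M theory: 1.25 vs ~2.08 at rho = 0.2
```

Swapping a heavy-tailed draw_service into this harness (e.g., a TPT sampler) reverses the ordering at moderate ρ, in line with Figure 22, but stable estimates then require enormously larger sample sizes.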
We compare a TPT(20) distribution with α = 1.4 against a hyperexponential-2 (H₂) distribution having the same mean and variance, with p = 0.9999. We see in Figure 25 that we get similar improvements in performance by using multiple servers with an H₂ distribution; however, the curves are genuinely distinct. The performance of a system with an H₂ distribution shows a jump around ρ = 0.15 for a queue with a single server and around ρ = 0.55 for a queue with multiple servers. This is consistent with our assertion that H₂ service time distributions change radically when they hit a certain ρ (which varies with the choice of p), while power-tail distributions are consistent across all scales.

[Figure 25 plot: buffer size required × (1 − ρ), log scale, vs. ρ for M/H₂(M=20)/1, M/TPT(20)/1, M/TPT(20)/2 and M/H₂(M=20)/2 queues.]

Figure 25: Primary buffer size needed in an M/PT/C queue for overflow to be less than 1%, as a function of ρ, for power-tail and hyperexponential-2 distributions. The H₂ distributions jump quickly once they hit a certain ρ, whereas the TPT distributions rise evenly across all ρ.

5.3 M/G/C/N Queues - Packets Rejected

In a closed system, as with most systems, buffer size is limited. If preserving packets is critical, we can add our "infinite buffer" (the secondary buffer on disk which feeds the first buffer), but rarely are packets so critical that we must take these steps. Traditionally, upper layers in a protocol recover missorted or lost packets. For example, the IP protocol is defined as an unreliable, best-effort, connectionless packet delivery system. It does not discard packets capriciously; unreliability arises when resources are exhausted, and discarding packets is not in itself unreasonable. The TCP layer is what guarantees in-order delivery [COME91]; other protocols have similar methods for handling dropped packets. Here, we assume that when the system fills up (we have N customers), any newly arriving customers are rejected.

The probability that an M/G/C/N queue will drop a packet is the same as the probability that an arriving customer will see a full queue: r(N; N) is the probability that a customer will see N customers in a system that only holds N customers. This is

  r(N; N) = r(0; N) x_c(N; N) ε′   for C ≤ N.   (48)

The derivation of these formulas is covered in Chapter 2; the reader is invited to consult [LIPS92] for a full discussion. In the previous chapter, it was shown that discarding packets reduces the required buffer size, compared with having an infinite secondary buffer, by several orders of magnitude in some cases. It was also shown that the high variance of the truncated power-tails was inadequate to fully explain this behavior: hyperexponential distributions with the same variance yielded fundamentally different results. Here we reduce the buffer size even further by splitting the servers. We still discard packets when the buffer fills, but the buffer fills less frequently because the multiple servers compensate for the arrival of large jobs.

[Figure 26 plot: buffer size (log scale) vs. ρ for M/TPT(20)/1/N, M/TPT(20)/2/N, M/TPT(8)/1/N and M/TPT(8)/2/N queues.]

Figure 26: Primary buffer size needed in an M/TPT/C/N queue for the rejection rate to be less than 1%. We assume that when the system fills up (we have N customers), any newly arriving customers are rejected.

In Figure 26, we chart the buffer size required so that only 1.0% of the arrivals are rejected in systems with single and multiple servers. Of interest here are the great savings in buffer size due to splitting the servers. Even when we reject jobs when the queue fills, there is still the problem of a large
job occupying the server. Therefore, although part of the problem is that the queue length can grow unboundedly, especially due to the larger jobs that appear periodically, we can still gain significant savings in buffer size by splitting the servers.

5.4 G/M/C
Up until now, we have assumed that the service time has a power-tail distribution. Let us look at what happens when the interarrival times have a power-tail distribution instead. The premise behind this chapter has been that we can gain performance by splitting a faster server into two slower, independent servers sharing a single queue. This works well, as we have seen, when the service time has a power-tail distribution. But with G/M/C queues, we are splitting an exponential server into C exponential servers, each with 1/C of the original rate. When all of the servers are busy, it is impossible to distinguish between a load dependent server and multiple servers, and the performance of the two systems is almost exactly the same. In a G/M/C queue, either there are multiple servers, or there is a single server which acts faster when more than one customer is present. Since exponential subsystems have only one internal state, the two views are mathematically equivalent: modifying the service rates corresponds to changing M as a function of queue length while leaving (I − P) alone, and since B = M(I − P), changing the rates in either view yields the same result.

5.4.1 G/M/C/N Queues - Packets Rejected
Using the notation of Chapter 2, we have

  λ(N) π_2(N; N) = π_2(N − 1; N) B Q,   (49)

  π_2(0; N) M = π_2(1; N) λ(1) + π_2(0; N) M P,   (50)

and for 0 < k < N,

  π_2(k; N)[B + λ(k) I] = π_2(k − 1; N) B Q + π_2(k + 1; N) λ(k + 1).   (51)

So we define

  π_2(N; N) := r_2(N; N) p.   (52)

We now solve for the π's in terms of r_2(N; N), which is the probability of a random observer seeing N customers in the queue. (Note that this is different from a_2(N; N), which is the probability of an arriving customer seeing N in the queue.) Thus we have

  π_2(0; N) M [I − P] = π_2(1; N) λ(1),   (53)

or

  π_2(0; N) = π_2(1; N) U(0),   (54)

where

  U(0) := λ(1) V.   (55)
For k = 1, we get

  π_2(1; N)[B + λ(1) I] = π_2(0; N) B Q + π_2(2; N) λ(2)
                        = λ(1) π_2(1; N) Q + λ(2) π_2(2; N),   (56)

or

  π_2(1; N) = π_2(2; N) U(1),   (57)

where

  A(1) := (λ(1)/λ(2)) [ I + B/λ(1) − Q ]   (58)

and

  U(1) = [A(1)]⁻¹.   (59)

Next we define

  A(k) := (λ(k)/λ(k + 1)) [ I + B/λ(k) − Q ] = [U(k)]⁻¹.   (60)
We now define the steady state solution of the ME/M/X//N loop by the following formulas. (Note that if λ(k) = kλ, then we have an ME/M/C//N system; if λ(k) is some other function, then we have an ME/M/X//N system.) For arbitrary λ(k) > 0, let

  U(k) = (λ(k + 1)/λ(k)) [ I + B/λ(k) − Q ]⁻¹;   (61)

then

  π_2(N; N) = r_2(N; N) p,
  π_2(N − 1; N) = r_2(N; N) p U(N − 1),
  π_2(N − 2; N) = r_2(N; N) p U(N − 1) U(N − 2),
  π_2(k; N) = r_2(N; N) p U(N − 1) U(N − 2) ⋯ U(k),
  π_2(0; N) = r_2(N; N) p U(N − 1) U(N − 2) ⋯ U(k) ⋯ U(0),   (62)

and

  r_2(k; N) = π_2(k; N) ε′   for all k.   (63)

We need to normalize with

  K(N) = I + U(N − 1) + U(N − 1) U(N − 2) + ⋯ + U(N − 1) U(N − 2) ⋯ U(N − k) + ⋯ + U(N − 1) U(N − 2) ⋯ U(0).   (64)

The probability of a random observer seeing exactly N customers is then given by

  [r_2(N; N)]⁻¹ = Ψ[K(N)],   (65)

where Ψ[X] := p X ε′. However, we are looking for the rejection probability, which is the same as the probability of an arriving customer seeing exactly N − 1 customers in the queue; that is, the buffer is full and the "last" customer to arrive must be rejected. To calculate a_2(N − 1; N), we first need
[Figure 27 plot: buffer size needed × (1 − ρ) vs. ρ for TPT(16)/M/C/N queues with C = 1, 2, 3, 4.]

Figure 27: A G/M/C/N queue with a power-tail arrival distribution and exponential servers. The smallest buffer size needed is for the TPT(16)/M/1/N queue and the largest is for the TPT(16)/M/4/N curve, but the difference is minimal.
  Λ(N) = Σ_{k=1}^{N} λ(k) r_2(k; N).   (66)

The probability of an ME/M/X//N queue rejecting a packet is then

  a_2(N − 1; N) = (λ(N)/Λ(N)) r_2(N; N).   (67)

In Figure 27, we examine the buffer space needed so that we need only reject 1% of the arriving packets in a G/M/C/N system. We need a smaller buffer since we reject packets (versus sending packets to an overflow buffer), yet employing multiple servers causes behavior inferior to that of a single server. Since multiple exponential servers of slower speed perform like a single higher-speed server (as long as the cumulative service rates are the same), we get the same performance as long as all the servers are busy. It is only when the utilization factor is low that we see any difference.
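Equations (61)-(65) are straightforward to implement numerically. As a check, the sketch below (our own code, with an illustrative choice of rates and N) runs the recursion with an exponential (m = 1) arrival process, where the ME/M/1//N loop must reduce to the classical M/M/1/N queue, whose queue-length probabilities are known in closed form.

```python
import numpy as np

def me_m_x_loop(p, B, lam, N):
    """Random-observer probabilities r_2(k; N), k = 0..N, for the
    ME/M/X//N loop via (61)-(65). p, B: the ME arrival pair;
    lam(k): exponential service rate with k customers present."""
    m = len(p)
    I, e = np.eye(m), np.ones(m)
    Q = np.outer(e, p)                      # Q = eps' p restarts the arrival process
    def U(k):                               # (61), as lam(k+1)[lam(k)(I-Q)+B]^-1
        lk = lam(k) if k > 0 else 0.0       # k = 0 recovers U(0) = lam(1) V of (55)
        return lam(k + 1) * np.linalg.inv(lk * (I - Q) + B)
    terms = [np.eye(m)]                     # I, U(N-1), U(N-1)U(N-2), ..., to U(0)
    for k in range(N - 1, -1, -1):
        terms.append(terms[-1] @ U(k))
    K = sum(terms)                          # (64)
    rNN = 1.0 / (p @ K @ e)                 # (65): Psi[K(N)] = p K(N) eps'
    r = [rNN * (p @ T @ e) for T in terms]  # (62),(63); terms[j] <-> k = N - j
    return list(reversed(r))                # r[k] = r_2(k; N)

# Sanity check: an exponential (m = 1) "ME" arrival process with rate alpha
# must reproduce the M/M/1/N queue-length distribution.
alpha, mu, N = 0.7, 1.0, 10
r = me_m_x_loop(np.array([1.0]), np.array([[alpha]]), lambda k: mu, N)
rho = alpha / mu
exact = [rho ** k * (1 - rho) / (1 - rho ** (N + 1)) for k in range(N + 1)]
print(max(abs(a - b) for a, b in zip(r, exact)))   # ~ 0 (machine precision)
```

A genuinely power-tailed arrival process would use a TPT matrix-exponential pair ⟨p, B⟩ in place of the 1×1 exponential here; the recursion is unchanged, only m grows.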
5.4.2 G/M/C Queues - No Packets Lost
Here we have the infinite backup buffer, truncated power-tail interarrival times, exponential service times and multiple servers. We now give the equations required to find the steady-state probability that an arriving customer will find the server full upon arrival in a G/M/C queueing system. Let us assume that there are multiple servers, but that there can be more customers than servers in the system at any time (thus allowing queueing), and that

  λ(k) = λ(C)   for k ≥ C.   (68)

We see that (60) becomes

  A(k) := I + B/λ(C) − Q = A(C)   for all k ≥ C.   (69)

We define

  X(k; N) = [U(C)]^{N−k}   for k ≥ C,   (70)

and for k ≤ C,

  X(k; N) = [U(C)]^{N−C} U(C − 1) ⋯ U(k) = [U(C)]^{N−C} X(k; C).   (71)

K(C) is as defined before in (64):

  K(C) = I + U(C − 1) + ⋯ + U(C − 1) ⋯ U(0).   (72)

Just as we had s and û in the previous chapter, we define s_c and û_c in this chapter: s_c is the smallest eigenvalue in magnitude of A(C), and u_c is the associated left eigenvector. û_c is defined as

  û_c := u_c / (u_c ε′).   (73)

Now we need the normalizing factor g(C):

  g(C) := (1 − s_c) / (s_c + (1 − s_c) û_c K(C) ε′).   (74)

With these equations, we can now state the steady-state probability of an arriving customer seeing k customers in the queue of a G/M/C system as

  a_2(k|C) = g(C) s_c^{k−C+1}   for k ≥ C − 1,   (75)

and for k < C − 1,

  a_2(k|C) = g(C) û_c X(k + 1; C) ε′.   (76)

Using these equations, we generate the buffer size required so that only 1% of the customers go to the overflow buffer. As Figure 28 shows, the required buffer sizes differ when … 0.95), but are completely unreliable for middle values of ρ, where most systems operate. Other, more complicated processes which heuristically build in correlations and burstiness (such as Compound Poisson processes) appear to be less accurate if they do not incorporate PT behavior. The elegant simplicity of this model for power-tail distributions makes it a powerful modeling tool.
The reason multiple servers offer such a gain in performance is that a single large job does not tie up the whole system. With the secondary buffer, while the system is serving large jobs, there is a higher probability that a second large job will arrive. Server splitting offsets this, but ultimately enough large jobs may arrive to drag down the system. We have also seen the buffer savings gained by rejecting packets when the buffer fills, instead of sending them to an overflow buffer; in this case, large arriving jobs can be rejected if the system is already full.

The results of this thesis suggest further questions. For example, we showed that even for smaller truncated power-tail distributions, having two servers improves performance over having one that is double fast. However, we saw that having three servers, each one third as fast as the single one, offers only slight savings in buffer space and response time. Evidently there is an optimal level of parallelism; perhaps some sort of dynamic server splitting (similar to time sharing) can occur. Also, most of these formulas use truncated power-tails to simulate full power-tails, because there are no analytical solutions for full power-tails. Using matrix exponential distributions, we may be able to use the Spectral Decomposition Theorem to analyze full power-tails; however, from what we have seen, given large enough M, truncated power-tails are sufficient to simulate full power-tails. Also of future interest is a simulation of these systems. As discussed in the last chapter, we may need extraordinarily large numbers of samples to approach the steady state. Smaller networks, while "expected" to behave in a certain way, rarely do so within the small time and sample sizes available. It would be interesting to examine large Internet hubs to see if their performance more closely matches our results.
Bibliography

[BERA95] J. Beran, R. Sherman, M. S. Taqqu and W. Willinger, "Variable-Bit-Rate Video Traffic and Long-Range Dependence", IEEE/ACM Trans. on Networking, 1995.

[CALL94] Christian Callegari, The Distribution of Arrivals During a Busy Period: An Algorithmic Solution for the M/G/1 and GI/M/1 Queues, M.S. Thesis, University of Connecticut, Storrs, Connecticut, 1994.

[COME91] Douglas E. Comer and David L. Stevens, Internetworking with TCP/IP, Vol. I, Prentice Hall, Englewood Cliffs, NJ, 1991.

[DING92] Yiping Ding, On Performance of Real-Time Systems, Ph.D. Dissertation, University of Connecticut, Storrs, Connecticut, 1994.

[ERRA96] Ashok Erramilli, Onuttom Narayan and Walter Willinger, "Experimental Queueing Analysis with Long-Range Dependent Packet Traffic", IEEE/ACM Trans. on Networking, pp. 209-223, April 1996.

[FELL71] William Feller, An Introduction to Probability Theory and its Applications, Vol. II, John Wiley and Sons, New York, 1971.

[GARG92] Sharad Garg, Lester Lipsky and Maryann Robbert, "The Effect of Power-tail Distributions on the Behavior of Time Sharing Computer Systems", 1992 ACM SIGAPP Symposium on Applied Computing, Kansas City, MO, March 1992.

[GREI95] Michael Greiner, Manfred Jobmann and Lester Lipsky, "The Importance of Power-tail Distributions for Telecommunication Traffic Models", Technical Report, Department of Informatics, Technical University of Munich (TUM), submitted for publication.

[HATE97] John Hatem and Lester Lipsky, "Steady State Queue Length Probabilities For M/G/1 Systems with Power-Tail Service Times", Technical Report, Booth Research Center, University of Connecticut, Storrs, CT, 1997.

[HEAT96] David Heath, Sidney Resnick and Gennady Samorodnitsky, "Patterns of Buffer Overflow in a Class of Queues with Long Memory in the Input Stream", Technical Report, Cornell University.

[KLEI75] Leonard Kleinrock, Queueing Systems: Volume I, Wiley-Interscience, New York, 1975.

[LELA86] Will E. Leland and Teunis Ott, "Load-Balancing Heuristics and Process Behavior", Proceedings of ACM SIGMETRICS 1986, May 27-30, 1986, pp. 54-69. (The proceedings appeared as vol. 14, no. 1, Performance Evaluation Review, May 1986.)

[LELA94] Will E. Leland, Murad S. Taqqu, Walter Willinger and Daniel V. Wilson, "On the Self-Similar Nature of Ethernet Traffic (Extended Version)", IEEE/ACM Trans. on Networking, 2, 1, Feb. 1994.

[LIEF94] Appie van de Liefvoort and Hai Feng Weng, "On the Instability (`Fractal' Behavior) of Arrival Counts Generated By Power-tail Renewal Processes", Department Report, Computer Science and Telecommunications Program, University of Missouri-Kansas City, 1994.

[LIKH95] Nikolai Likhanov, Boris Tsybakov and Nicolas D. Georganas, "Analysis of an ATM Buffer with Self-Similar ("Fractal") Input Traffic", Proc. IEEE INFOCOM'95, Boston, April 1995.

[LIPS86] Lester Lipsky, "A Heuristic Fit of an Unusual Set of Data", Bell Communications Research Report, January 1986.

[LIPS92] Lester Lipsky, QUEUEING THEORY: A Linear Algebraic Approach, MacMillan and Company, New York, 1992.

[LIPS95] Lester Lipsky and Pierre Fiorini, "Auto-Correlation of Counting Processes Associated with Renewal Processes", Technical Report, Booth Research Center, University of Connecticut, August 1995.

[LIPS97] Lester Lipsky and John Hatem, "Buffer Problems in Telecommunications Systems", Fifth International Conference on Telecommunication Systems, Nashville, Tennessee, March 1997.

[LOWR93] Walter Lowrie and Lester Lipsky, "A Model For The Probability Distribution of Medical Expenses", Proceedings of the Conference of Actuaries in Public Practice, 1993.

[PARK96] Kihong Park, Gitae Kim and Mark Crovella, "On the Relationship between File Sizes, Transport Protocols, and Self-Similar Traffic", Technical Report TR-96-016, Boston University, submitted for publication.

[SCHE96] Alan Scheller-Wolf and Karl Sigman, "Delay Moments for FIFO GI/GI/s Queues", scheduled to appear in Queueing Systems: Theory and Applications, October 1996.

[SCHW97] Hans-Peter Schwefel, "Performance of Packet Switches Using Markov Modulated Poisson Processes That Have Power-tail Bursts", M.S. Thesis, Institut für Informatik, Technische Universität München, 1997.

[ROYS92] Siddhartha Roy, The Generalized M/G/C Queue: An Implementation with Some Results, M.S. Thesis, University of Connecticut, Storrs, Connecticut, 1994.

[SAMO94] Gennady Samorodnitsky and Murad S. Taqqu, Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance, Chapman and Hall, New York, NY, 1994.

[TAQQ95] Walter Willinger, Murad S. Taqqu, Robert Sherman and Daniel V. Wilson, "Self-Similarity in High-Speed Packet Traffic: Analysis and Modeling of Ethernet Traffic Measurements", Statistical Science, vol. 10, pp. 67-85, 1995.

[TEHR83] Aby Tehranipour, Explicit Solutions of Generalized M/G/C//N Systems Including an Analysis of Their Transient Behaviour, Ph.D. Thesis, University of Nebraska, Lincoln, December 1983.

[TEHR90] Aby Tehranipour and Lester Lipsky, "The Generalized M/G/1//N Queue as a Model for Time-Sharing Systems", ACM-IEEE Joint Symposium on Applied Computing '90, Fayetteville, AR, April 1990.

[WILL95] Walter Willinger, Murad S. Taqqu, Robert Sherman and Daniel V. Wilson, "Self-Similarity Through High-Variability: Statistical Analysis of Ethernet LAN Traffic at the Source Level (Extended Version)", ACM SIGCOMM'95.