Accelerated Simulation of a GPS Scheduler Using Traffic Aggregation Vindya Amaradasa & John Schormans Queen Mary, University of London Mile End Road, London E1 4NS
(vindya.amaradasa, john.schormans)@elec.qmul.ac.uk
Abstract: We develop an accelerated simulation technique for Generalised Processor Sharing (GPS) systems. The technique provides a substantial reduction in the number of events simulated, by removing background traffic. This paper includes an analytical method for the validation of the proposed acceleration technique, along with results. It is demonstrated that the accelerated model maintains a very close approximation to the original simulation model over a range of scenarios. Keywords: accelerated simulation, event reduction, Generalised Processor Sharing (GPS), vacation time
1 Introduction

The main difficulty in simulating packet networks at the packet level is the large amount of computer resources (i.e. time and memory) required. The problem worsens as networks grow larger and more complex (e.g. when they include CoS or AQM mechanisms). As a solution, Accelerated Simulation (AS) develops methods of manipulating simulations so that the results can be determined with fewer events, and hence fewer resources.

Researchers have experimented with a number of AS techniques over the years, achieving varying degrees of success. One of the earliest methods is the RESTART (REpetitive Simulation Trials After Reaching Thresholds) technique, which increases the probability of rare events by restarting the simulation on specific occasions, i.e. when the queue reaches certain thresholds. This is done by conditioning rare events, such as packet loss, on less rare events, such as threshold levels in the queue [1]. Another method, Importance Sampling, modifies (or biases) the underlying probabilities of certain events to make the important, rare events occur more frequently; forcing these events to occur with a higher probability reduces the overall number of events to be simulated [2]. Another AS technique (mainly aimed at ATM networks) is the cell-rate method, where an 'event' is characterised by a change in the cell arrival rate rather than the arrival of an individual cell, thus reducing the number of events [3]. Among the more recent techniques is a hybrid technique, where the network traffic is classified as Foreground Traffic and Background Traffic, and only the Foreground Traffic (which is of interest to us) is simulated; the Background Traffic is modelled by an analytical method. This gives a substantial reduction in the overall number of events to be simulated [4]. Yet another technique uses the aggregation of sources: Traffic Aggregation (TA) replaces N multiplexed sources by a single equivalent ON-OFF source, which not only reduces the simulation time but also simplifies the scenario. TA has also been successfully demonstrated in conjunction with the RESTART technique, an example of the successful concatenation of AS techniques [5].

Almost all existing AS methods have been applied to FIFO queueing systems; acceleration of GPS systems is therefore a novel area to study. The significance of GPS scheduling is explained in section 2. Section 3 describes our acceleration technique, together with an analytical solution that approximates the accelerated model and is used for validation purposes. The outcome of the analysis, i.e. the queue state probabilities of the accelerated model, is compared with the original simulation model in section 4. This is followed by the conclusions and a summary of further work.
2 Fairness and Generalised Processor Sharing (GPS)

The obvious disadvantage of FIFO scheduling arises from its lack of differentiation between sources or destinations, which may result in 'unfair' allocation of bandwidth. By fairness we mean here its most common form, namely 'max-min fairness', where all flows get the same share of the available bandwidth; if particular flows cannot fully use their share (e.g. due to low sending rates), the excess bandwidth is shared equally among the remaining flows [6]. In a FIFO system it is possible for some sources to get more than their fair share (e.g. ill-behaved sources with extremely high sending rates), leaving others with much less than theirs. Thus, FIFO scheduling is not satisfactory for modern, QoS-oriented networks. A variation of FIFO, Priority Queueing, determines the order of service by priority level rather than by order of arrival. Although this is a slight improvement over traditional FIFO, lower priority packets may be starved by higher priority ones, again creating unfairness [7].

Parallel queueing was introduced as a means of offering fair bandwidth allocation. Weighted Round Robin scheduling is one such approach: there is one sub-queue for the packets of each traffic flow, and the sub-queues are served, as the name suggests, in a weighted round robin manner. However, as the weights are pre-defined based on packet sizes, this approach offers fairness only if the packet sizes within each sub-queue are constant. To provide fair bandwidth allocation irrespective of the nature of the packets, Processor Sharing (PS) must be used. An ideal PS scheduler serves packets in a Bit-by-bit Round robin (BR) manner, i.e. one bit from each sub-queue is served at a time. This provides exactly the same service rate for each flow, i.e. max-min fairness. Note that this kind of fairness may not always be the most preferred, e.g. when some customers pay more than others, or when each sub-queue of the GPS system holds an aggregate of a different number of traffic flows and we wish to treat flows equally [8]. In such cases the per-sub-queue service rates are modulated according to weights attributed to each sub-queue: this is called 'Generalised Processor Sharing' (GPS) [9].

The ideal GPS scheduler cannot be implemented, as it requires serving a specific number of bits from each sub-queue at a time, i.e. the packets would need to be infinitely divisible. However, several practical approximations have been introduced, referred to as Weighted Fair Queueing (WFQ) algorithms. These algorithms calculate the time at which each packet would have finished service in the GPS system and then serve the packets in increasing order of their finishing times [10]; WFQ algorithms differ from each other in the mechanisms used to calculate these finishing times. The Packet-by-packet GPS (PGPS) scheme, which uses the notion of 'virtual time', is considered the closest approximation so far to the theoretical ideal GPS queue [11]. Our simulation experiments have been carried out using NS2's implementation of the PGPS system.
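To make the finish-time principle behind WFQ concrete, the sketch below stamps each arriving packet with an approximate virtual finish time and serves packets in increasing order of those stamps. It is an illustration only, not the PGPS algorithm of [11] nor the NS2 implementation used for our experiments: the class name, the simplified virtual-time update and the data structures are our own assumptions.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Packet:
    finish: float                        # virtual finish time (the sort key)
    flow: int = field(compare=False)     # index of the sub-queue
    length: int = field(compare=False)   # packet length in bits

class WfqSketch:
    """Toy weighted fair queueing scheduler: each packet is stamped with the
    virtual time at which it would finish service under GPS, and packets are
    then served in increasing order of that stamp.  The virtual-time update
    is deliberately simplified (it advances with each dequeued packet)."""

    def __init__(self, weights):
        self.weights = weights                   # weight per sub-queue
        self.last_finish = [0.0] * len(weights)  # finish time of each flow's last packet
        self.virtual_time = 0.0
        self.heap = []                           # packets ordered by virtual finish time

    def enqueue(self, flow, length):
        # A packet's virtual finish time: it starts at the later of the current
        # virtual time and the finish time of its flow's previous packet, and
        # needs length / weight units of virtual time to complete.
        start = max(self.virtual_time, self.last_finish[flow])
        finish = start + length / self.weights[flow]
        self.last_finish[flow] = finish
        heapq.heappush(self.heap, Packet(finish, flow, length))

    def dequeue(self):
        # Serve the packet that would finish first in the ideal GPS system.
        if not self.heap:
            return None
        pkt = heapq.heappop(self.heap)
        self.virtual_time = max(self.virtual_time, pkt.finish)
        return pkt
```

For example, a scheduler for three sub-queues with weights 1, 1 and 2 would be created as WfqSketch([1, 1, 2]); packets of the flow with weight 2 accumulate virtual service twice as fast, so under sustained load that flow receives roughly twice the bandwidth of the others.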
3 Accelerated simulation technique for a GPS scheduler

3.1 The technique

Our acceleration technique is based on classifying network traffic as ForeGround (FG) and BackGround (BG) traffic, in a similar manner to [4], but for a GPS system. FG traffic is the portion of traffic that is of interest to us; BG traffic is the rest. The exact definition of the FG traffic depends on the application, e.g. a specific set of data connections across a typical network, or a single voice connection. The technique is to completely remove the BG traffic from the simulator, greatly reducing the number of events the simulator must handle. To maintain accuracy, the behaviour of the FG traffic in the simulator must be manipulated so as to give the same (or nearly the same) effect as if the background traffic were in fact present.

Let us define the original model as having N flows, one carrying FG traffic and the rest carrying BG traffic, with each flow occupying one sub-queue of the GPS system. The server takes turns serving packets from each of the N sub-queues in the order determined by the BR algorithm, which depends on the individual sending rates of the flows and the weights assigned to each sub-queue.
Assume equal packet sizes in all traffic flows, and let the service time per packet be the fundamental time unit. If an FG packet has just completed service, then the next FG packet will not start service immediately, but only after i time units, where i is the number of BG packets that will be served between these two FG packets (Figure 1 (a)). These i units of time can also be referred to as a 'vacation time': as far as the FG flow is concerned, the server working on the BG flows is equivalent to the server taking a 'vacation' and not being available to the FG packets during this period. Therefore, the instants at which the server is available to the FG traffic depend on i (i.e. the vacation time), which in turn depends on the behaviour of the BG traffic. Note that i is not a constant but varies over time, as some sub-queues may be empty at times (depending on their respective utilisations). For example, when equal sending rates and equal weights are applied, i ranges from 0 to N-1.
[Figure 1 (a): Original model with all flows present (the FG flow and the N-1 BG flows, with i BG packets served between consecutive FG packets). Figure 1 (b): Accelerated model with the BG flows removed, leaving their sub-queues empty.]
Now consider the accelerated model. The BG traffic is removed (Figure 1 (b)); however, the service offered to the FG traffic must remain the same as in the original model, in order to create equivalence between the two systems. We therefore need to estimate the instants at which the server would have been available to the FG flow, had the BG flows been present. This, as explained above, requires the parameterisation of the distribution of i. In the case where each traffic flow has a Poisson arrival process with mean rate λ, the distribution of i will be Binomial(N, λ); further, for large N, this can also be approximated by a Poisson distribution. In the accelerated model, the server taking a Binomial/Poisson distributed vacation time between serving FG packets can equivalently be viewed as spending a Binomial/Poisson distributed time serving each FG packet, rather than taking a vacation. This observation is useful because it makes our accelerated model similar to an M/G/1 system with a Binomial/Poisson service time distribution, thus enabling the application of well-known queueing solutions. In this paper, we apply the Excess Rate (ER) queueing analysis [12, 13] as a form of validation for our acceleration technique; this is summarised below.
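The sketch below illustrates the accelerated model in code; it is not the NS2 model used for the results in section 4. Only the FG sub-queue is simulated, and each FG packet's service is extended by a vacation of i time units, with i drawn here from a Binomial over the N-1 BG sub-queues with per-flow success probability λ (a simplified stand-in for the Binomial distribution of i described above). The function names, the slotted event loop and the way queue states are sampled are all illustrative assumptions.

```python
import random

def poisson_count(rng, mean):
    """Number of Poisson(mean) events in one time unit, generated by
    accumulating exponential inter-arrival times."""
    count, t = 0, rng.expovariate(mean)
    while t < 1.0:
        count += 1
        t += rng.expovariate(mean)
    return count

def simulate_accelerated_fg(lam, n_flows, slots=200_000, seed=42):
    """Slot-based sketch of the accelerated model: only the FG sub-queue is
    simulated, and the missing BG flows are replaced by a Binomially
    distributed vacation added to each FG packet's service time."""
    rng = random.Random(seed)
    fg_queue = 0       # FG packets waiting (excluding the one in service)
    remaining = 0      # time units left for the FG packet currently in service
    seen = {}          # histogram of queue lengths seen by arriving FG packets

    for _ in range(slots):
        # FG arrivals in this time unit (Poisson with mean lam per time unit)
        for _ in range(poisson_count(rng, lam)):
            seen[fg_queue] = seen.get(fg_queue, 0) + 1
            fg_queue += 1
        # If the server is free, start the next FG packet: one unit for the
        # packet itself plus a vacation of i units, i ~ Binomial(n_flows-1, lam),
        # standing in for the BG packets that would have been served in between.
        if remaining == 0 and fg_queue > 0:
            vacation = sum(rng.random() < lam for _ in range(n_flows - 1))
            remaining = 1 + vacation
            fg_queue -= 1
        if remaining > 0:
            remaining -= 1

    total = sum(seen.values())
    return {k: count / total for k, count in sorted(seen.items())}
```

For instance, simulate_accelerated_fg(0.3 / 20, 20) would correspond roughly to the ρ = 0.3, N = 20 scenario of section 4, under the assumption that λo = ρo for unit packet service times.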
3.2 Analytical approximation using Excess Rate (ER) analysis

The ER queueing analysis is based on the identification of an Imbedded Markov Chain (IMC) at ER arrivals. ER packets are those which must be buffered because they represent an 'excess' of instantaneous arrival rate over the service rate. Define the fundamental time unit as the time required to serve an average-length packet; then, if n packets arrive in any time unit, that time unit experiences n-1 excess packets. The definition of ER packets is important because of its relationship to the change in queue state: for every ER packet, the queue size increases by one. Accordingly, the arrivals of ER packets are connected via balance equations representing adjacent queue states, from which formulae for the queue state probability are derived [12]. The general formula for p(k), the probability that an arriving packet sees k packets already in the queue, for an M/G/1 system with utilisation ρ, is derived in [12, 13] as follows:
p(k) = (1 − ρ) · [ (q + (1 − q)·(1 − a[0] − a[1])/(1 − a[1])) / (a[0]/(1 − a[1])) ]^k
where a[k] is the probability of having k arrivals in a packet service time, and q is defined as the probability of having another ER packet in a time unit which has just had one: q = a[3]/a[2] = λ/3. The definitions of a[0] and a[1] are determined by the packet size distribution of the queueing system concerned. For a Poisson distribution they are defined in [13] as:
a[0] = (e^(−ρ)/ρ)·[exp(ρe^(−λ)) − 1]

a[1] = λe^(−λ)·e^(−ρ)·(1 + ρe^(−λ))·exp(ρe^(−λ))

For our accelerated model, λ and ρ take the corresponding values of the FG flow, i.e. λ = λo/N and ρ = ρo/N, where λo and ρo refer to the arrival rate and utilisation of the original model (i.e. including the BG flows) respectively. The queue state probabilities of the accelerated model can now be derived.
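For completeness, the routine below is a direct transcription of the above formulae (as reconstructed here) into code, with the scaling λ = λo/N and ρ = ρo/N applied before evaluation. It is a sketch of the calculation, not code taken from [12, 13]; the function name and parameter names are our own.

```python
import math

def queue_state_probability(k, lam_o, rho_o, n_flows):
    """Evaluate p(k) for the accelerated model using the ER / M/G/1 formulae
    of section 3.2.  lam_o and rho_o are the arrival rate and utilisation of
    the original model; they are scaled down to the per-flow (FG) values."""
    lam = lam_o / n_flows      # FG arrival rate, lambda = lambda_o / N
    rho = rho_o / n_flows      # FG utilisation, rho = rho_o / N

    # a[0] and a[1] for the Poisson case, as given above
    a0 = (math.exp(-rho) / rho) * (math.exp(rho * math.exp(-lam)) - 1.0)
    a1 = (lam * math.exp(-lam) * math.exp(-rho)
          * (1.0 + rho * math.exp(-lam)) * math.exp(rho * math.exp(-lam)))
    q = lam / 3.0              # q = a[3]/a[2] = lambda/3

    # geometric decay term raised to the power k
    decay = (q + (1.0 - q) * (1.0 - a0 - a1) / (1.0 - a1)) / (a0 / (1.0 - a1))
    return (1.0 - rho) * decay ** k
```

Evaluating queue_state_probability(k, 0.3, 0.3, 20) for k = 0, 1, 2, 3 would then correspond to the analytical curve for the ρ = 0.3, N = 20 case, again assuming λo = ρo for unit packet service times.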
4 Results

The queue state probabilities estimated by the analysis in section 3 are compared with those of the FG flow of each corresponding simulation model, for the 8 scenarios listed below.

ρ = 0.3: N=20, N=100 (Figure 2 (a))
ρ = 0.4: N=20, N=100 (Figure 2 (b))
ρ = 0.5: N=20, N=100 (Figure 2 (c))
ρ = 0.6: N=20, N=100 (Figure 2 (d))
[Figure 2 (a)-(d): queue state probability p(k) plotted against queue size k on a logarithmic scale (1.0E+00 down to 1.0E-08), for ρ = 0.3, 0.4, 0.5 and 0.6; each panel shows the N=20 and N=100 curves for the original (Orig) and accelerated (Acc) models.]
Figure 2: Comparison of Queue State Probabilities – Original Model Vs. Accelerated Model
As illustrated, the FG-flow queue state probabilities of the original simulation model are very closely approximated by the M/G/1 system over a range of ρ and N. This shows that applying a Poisson service time distribution to the FG packets yields an accelerated model that closely reproduces the behaviour of the original system.
5 Conclusions and Further work

We have presented an acceleration technique for GPS schedulers, along with initial validation results. Acceleration is achieved by removing the BG traffic and modifying the server so that it takes a vacation (equivalent to serving the missing BG traffic) between serving FG packets. As explained in the paper, this type of model can be closely approximated by an M/G/1 system with a Poisson service time distribution, and this has formed the basis of our validation.

There are two options for the way forward. The first is an analytical approach: derive analytical solutions for performance metrics of the accelerated system other than the queue state probability, e.g. delay and loss probabilities. The second is a simulation-based approach: create an accelerated simulation model with Poisson distributed packet service times, from which all other performance metrics can be obtained experimentally. The plan is to implement both approaches and test their accuracy by comparison with the original simulation model. In addition to accuracy, the advantages offered by each approach over the original simulation model will also be considered, i.e. the level of computational complexity in the analytical approach and the amount of event reduction achieved in the simulation-based approach. Moreover, we will extend our acceleration technique to be applicable to systems with different sending rates across flows, other types of traffic (e.g. exponential, Pareto) and different weights for each sub-queue.
References

[1] Villen-Altamirano, M. and Villen-Altamirano, J., "RESTART: a Straightforward Method for Fast Simulation of Rare Events", Proceedings of the 1994 Winter Simulation Conference (WSC'94), Orlando, Florida, pp. 282-289, December 1994.
[2] Pagano, M. and Sandmann, W., "Efficient rare event simulation -- A tutorial on importance sampling", in D. Kouvatsos (ed.), Proceedings of the 3rd International Working Conference on Performance Modelling and Evaluation of Heterogeneous Networks (HetNets), Ilkley, UK, July 2005.
[3] Pitts, J. M., "Cell-rate modelling for accelerated simulation of ATM at the burst level", IEE Proceedings - Computers and Digital Techniques, vol. 142, no. 6, December 1995.
[4] Schormans, J. A., Enjie, L., Cuthbert, L. and Pitts, J. M., "A hybrid technique for the accelerated simulation of ATM networks and network elements", ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 11, no. 2, pp. 182-205, 2001.
[5] Schormans, J. A. and Ma, A. H. I., "Fast, stable simulation of power-law packet traffic using concatenated acceleration techniques", IEE Proceedings - Communications, vol. 152, no. 4, pp. 420-426, August 2005.
[6] Rakocevic, V., "Dynamic Bandwidth Allocation in Multi-Class IP Networks using Utility Functions", PhD Dissertation, Queen Mary, University of London, January 2002.
[7] Ariffin, S. and Schormans, J., "Efficient Accelerated Simulation Technique for Packet Switched Networks: A Buffer with Two Priority Inputs", IEEE International Conference on Communications (ICC 2004), Paris, June 2004.
[8] Chrysos, N. and Katevenis, M., "Weighted Fairness in Buffered Crossbar Scheduling", Proceedings of the IEEE Workshop on High Performance Switching and Routing (HPSR 2003), Torino, Italy, pp. 17-22, June 2003.
[9] Roberts, J., Mocci, U. and Virtamo, J. (eds.), "Broadband Network Teletraffic: Performance Evaluation and Design of Broadband Multiservice Networks", Final Report of Action COST 242, Lecture Notes in Computer Science, vol. 1155, June 1996.
[10] Demers, A., Keshav, S. and Shenker, S., "Analysis and simulation of a fair queueing algorithm", Internetworking: Research and Experience, vol. 1, 1990.
[11] Parekh, A. and Gallager, R., "A generalized processor sharing approach to flow control in integrated services networks: the single node case", IEEE/ACM Transactions on Networking, vol. 1, no. 3, pp. 344-357, June 1993.
[12] Pitts, J. and Schormans, J., "Introduction to IP and ATM Design and Performance", 2nd Edition, 2000.
[13] Schormans, J. A. and Pitts, J. M., "Solution for M/G/1 queues", Electronics Letters, vol. 33, no. 25, pp. 2109-2111, 4 December 1997.