Statistical Multiplexing of Self-Similar VBR Videoconferencing Traffic
R.G. Garroppo, S. Giordano, S. Miduri, M. Pagano, F. Russo
{garrop, simone}@indy.iet.unipi.it; {giordano, pagano, russo}@iet.unipi.it
Department of Information Engineering, University of Pisa - Italy
ABSTRACT: ATM-based telecommunication networks are designed to obtain high resource utilisation by means of statistical multiplexing of traffic sources corresponding to heterogeneous multimedia services. As highlighted by recent measurements and statistical studies, these applications are quite different from traditional narrowband services. In particular, self-similarity and long range dependence, which characterise the bursty nature of the traffic pattern, have emerged as predominant characteristics to be considered in the design of new networks. In order to achieve an efficient resource utilisation, peak rate allocation must be replaced by some form of dynamic rate allocation. This paper presents a Dynamic Bandwidth Allocation method that employs linear prediction of the input rate based on self-similar modelling of the data. The performance of the proposed strategy, in terms of multiplexing gain for reasonable values of cell loss rate and queuing delay, has been evaluated considering experimental data collected over a DQDB MAN during a multiparty videoconferencing session.

1 INTRODUCTION

Many of the studies carried out during recent years have shown that self-similar traffic models are emerging as a realistic mathematical representation of multimedia sources. Long Range Dependence (LRD) is one of the most important features of these processes and can be captured by self-similar models with a limited number of parameters. As pointed out by previous studies on queuing performance [1] [2], LRD and the related high burstiness of the actual traffic have a deep impact on queue behaviour. In fact, they determine a heavy-tailed complementary probability distribution of the occupancy level in a queue with infinite buffer. The same effect is relevant even with finite buffers [3], [4]: a low utilisation of the network resources is required in order to operate with acceptable values of cell loss rate and queuing delay.
The growing interest in inexpensive desktop videoconferencing tools (such as CU-SeeMe by Cornell University) has led us to analyse the traffic offered by an Ethernet LAN to a DQDB MAN (MAN “Tuscany”) during a multiuser videoconferencing session. The measurements, carried out at the Telecommunication Network Laboratory of the University of Pisa, have highlighted that the traffic generated by this kind of application [5] is characterised by the typical features of self-similar models both at the frame level and at the segment level. Because of the strong impact of LRD on resource utilisation, adequate Dynamic Bandwidth Allocation algorithms are required to
take into account the statistical nature of the multiplexed traffic sources. Section 2 presents the experimental test-bed adopted to collect three traffic data series corresponding to different multiuser videoconferencing sessions, together with the relative statistical analysis, aimed at pointing out the self-similar nature of the acquired sample paths. Section 4 describes, in detail, the proposed multiplexer scheme and its Dynamic Bandwidth Allocation algorithm, based on the prediction filter and on the underlying fractal model introduced in Section 3. Section 5 compares the proposed multiplexing strategy with fixed rate allocation schemes in terms of cell loss probability and mean waiting time, while the advantages of the proposed solution are highlighted in the Conclusions.

2 EXPERIMENTAL MEASUREMENTS AND DATA ANALYSIS

The goal of the traffic measurements carried out in Pisa (Italy) is the analysis of the traffic offered by a specific multimedia application for multiuser videoconferencing to a broadband network infrastructure, represented by a 140 Mbit/s DQDB MAN known as MAN “Tuscany” [6]. Our interest focused on desktop conferencing tools for common PCs performing software coding of voice and video signals. The traffic was collected by a UNIX workstation and forwarded (on a multicast tunnel) to another multimedia workstation located at a remote site connected to the MAN via an SMDS server. The videoconferencing sessions were implemented over this broadband infrastructure considering only two nodes of the MAN (figure 2.1), connected respectively via a LAN bridging access (local site) and an SMDS access (remote site). To avoid the transmission of local broadcast traffic to all the sites reached by a LAN bridging service, the interconnection of the users’ LANs to the MAN is provided by a LAN-to-LAN router, which effectively controls and filters the traffic offered by the local LANs to the LAN directly interconnected to the MAN.
In this way, it is possible to collect only the traffic offered to the MAN, using a UNIX workstation running TCPDump as a protocol analyser. The video and voice packets were generated by a group of four PCs at the local site and software coded by the CU-SeeMe application (a widely used public-domain Internet software package developed by Cornell University). The encoded data were then sent as unicast packets to a multipoint communication unit, called a reflector, which transmitted them as multicast IP packets to the remote site. There was no special advantage in using multicast in this videoconferencing scenario, since at the remote site there was only one receiver. In general, however, several remote users could receive the videoconferencing transmission without increasing the traffic
generated by a local reflector. At the remote site the receiver was a multicast videoconferencing station (Silicon Graphics Indy), running video (NV) and audio (VAT) applications, which can be configured to be compatible with CU-SeeMe. This station had multimedia capability and exchanged video and audio signals with the multicast reflector, which redistributed them to the PCs at the local site.

Fig 2.1 Measurement testbed

The acquired traces represent the number of arrivals (segmented as ATM cell payloads) in disjoint intervals of duration Tu = 100 msec, collected over the Ethernet LAN. A portion of the data measured on May 29th, 1996 is shown in figure 2.2, which points out the high burstiness of the traffic pattern.

Fig 2.2 Number of cells per Time Unit for trace 1

Table I summarises some basic statistical indexes of the measured traces.

Trace          | Length (sec) | Average (cells/Tu) | Peak (cells/Tu) | Std dev (cells/Tu) | Peak-to-mean ratio
1 - May 29th   | 1956         | 95.814             | 300             | 54.92              | 3.3
2 - June 6th   | 2766         | 138.14             | 344             | 55.471             | 2.5
3 - July 30th  | 4000         | 27.72              | 120             | 13.96              | 4.3

Tab. I

The July 30th session is characterised by the lowest activity, since it involved only two videoconferencing sessions, while four were open during the other two measurements. The measured traces showed a persistence phenomenon (i.e. the local trend can differ significantly from the global behaviour) typical of LRD traffic. Hence the acquired data have been examined to test their fractal nature, and the values of the Hurst parameter have been estimated considering both the Variance-Time plot and the R/S statistic [7]. To highlight the self-similar nature of the acquired data, the relevant statistics (namely the autocovariance coefficients, the V-T plot and the R/S statistic) for the trace of figure 2.2 are shown in figures 2.3, 2.4 and 2.5 respectively. As shown in figure 2.3, the autocovariance coefficients of the considered trace exhibit a hyperbolic decay, typical of processes with LRD. The shape of the autocovariance coefficients of the other traces is very similar, although they correspond to traces measured in different time periods and under different load conditions.

Fig 2.3 Autocorrelation coefficients of trace 1

Fig 2.4 Variance-Time plot. The dotted line represents the best fitting line used to evaluate the H parameter.

Fig 2.5 Average value of the R/S statistic (plotted with a '◊' mark). The dotted lines correspond to slopes equal to 1 (∝ k, the upper) and 0.5 (∝ k^0.5, the lower), which represent the limiting values of H for fractal data.

All these statistical indexes confirm the self-similar nature of the measured traces and are the basis for the evaluation of the Hurst parameter H (see Table II). Rough estimates of H are easily related to the slope β of the best fitting line in log-log scale, both for the R/S statistic (H = β) and for the V-T plot (H = 1 − β/2).
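As an illustration (not the authors' code), a variance-time estimate of H can be computed as sketched below. The i.i.d. input is only a crude stand-in for a measured cells-per-Tu trace, and for uncorrelated data the estimate should come out close to 0.5:

```python
import numpy as np

def hurst_variance_time(x, agg_levels):
    """Variance-time estimate of H: Var(X^(m)) ~ m^(2H-2), so the slope of
    log Var versus log m is 2H - 2, i.e. H = 1 + slope/2."""
    log_m, log_var = [], []
    for m in agg_levels:
        n = len(x) // m
        agg = x[:n * m].reshape(n, m).mean(axis=1)  # aggregate over blocks of size m
        log_m.append(np.log10(m))
        log_var.append(np.log10(agg.var()))
    slope, _ = np.polyfit(log_m, log_var, 1)        # best fitting line in log-log scale
    return 1.0 + slope / 2.0

rng = np.random.default_rng(0)
x = rng.normal(100.0, 50.0, 100_000)                # stand-in for a cells-per-Tu trace
h = hurst_variance_time(x, [1, 2, 4, 8, 16, 32, 64, 128])  # close to 0.5 for i.i.d. data
```

An LRD trace such as those in Table I would instead give a slope noticeably shallower than −1 and hence H well above 0.5.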
Trace          | H estimate (V-T statistic) | H estimate (R/S statistic)
1 - May 29th   | 0.93                       | 0.91
2 - June 6th   | 0.89                       | 0.875
3 - July 30th  | 0.88                       | 0.77

Tab. II

3 BANDWIDTH ALLOCATION STRATEGIES AND TRAFFIC MODELLING

Dynamic Bandwidth Allocation schemes have to be employed in order to achieve high resource utilisation. In this paper, the allocated bandwidth is determined by a linear prediction of the input traffic [8]. This strategy will be compared to deterministic multiplexing in terms of cell loss probability and mean waiting time in the buffer, for different values of the output link capacity. The allocation strategy is based on traffic prediction obtained by a Minimum Mean Square Error (MMSE) linear predictor, described by the following equation:

x̂(n + k) = Σ_{i=0}^{p−1} w(i) x(n − i)    (3.1)

where p is the predictor order, {w(i)}_{i=0}^{p−1} are the filter coefficients, and x̂(n) and x(n) are respectively the forecasted and the actual number of arrivals during the n-th time interval. Hence, the number of cells at the generic step n + k is predicted with a linear law from the knowledge of the samples at steps n, n−1, …, n−p+1. Using the notation:

w = [w(0), w(1), …, w(p−1)]^T
x(n) = [x(n), x(n−1), …, x(n−p+1)]^T    (3.2)

the prediction error can be written as:

e(n) = x(n + k) − x̂(n + k) = x(n + k) − w^T x(n)    (3.3)

The optimum MMSE predictor is obtained by minimising the mean square error with respect to the w(i) coefficients. In matrix form, this leads to the following linear system (the Wiener-Hopf equations), whose solution gives the optimum filter coefficients:

R_x w = P(k)    (3.4)

where

      | r_x(0)    r_x(1)    …  r_x(p−1) |
R_x = | r_x(1)    r_x(0)    …  r_x(p−2) |
      | …                               |
      | r_x(p−1)  r_x(p−2)  …  r_x(0)   |

P(k) = [r_x(k)  r_x(k+1)  …  r_x(k+p−1)]^T    (3.5)

with r_x(k) = E{x(n) x(n + k)}. Since the coefficients {w(i)}_{i=0}^{p−1} can be determined from the knowledge of the autocorrelation function of the data, the entire trace would have to be acquired before determining the predictor. For this reason we used the autocorrelation function of fGn processes, which represent a proper traffic model in the fitting of real traffic data [7] and are described by only three parameters, namely the mean, the variance and the Hurst parameter.
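A minimal sketch of this model-based predictor follows: it builds R_x and P(k) from the fGn autocorrelation (so only the mean, the variance and H are needed), solves the Wiener-Hopf system, and predicts deviations from the mean. The sample history, mean value and H = 0.93 are illustrative numbers, and the deviation-from-mean formulation is an assumption of this sketch, not necessarily the authors' implementation:

```python
import numpy as np

def fgn_autocorr(k, H):
    """Normalised fGn autocorrelation: rho(k) = 0.5*(|k+1|^2H - 2|k|^2H + |k-1|^2H)."""
    k = np.abs(k).astype(float)
    return 0.5 * ((k + 1) ** (2 * H) - 2 * k ** (2 * H) + np.abs(k - 1) ** (2 * H))

def mmse_fgn_predictor(H, p, k):
    """Solve the Wiener-Hopf system R_x w = P(k) of eqs. (3.4)-(3.5),
    with the autocorrelation taken from the fGn model."""
    lags = np.arange(p)
    R = fgn_autocorr(lags[:, None] - lags[None, :], H)  # Toeplitz matrix R_x
    P = fgn_autocorr(np.arange(k, k + p), H)            # right-hand side P(k)
    return np.linalg.solve(R, P)                        # filter coefficients w

def predict(w, history, mean):
    """k-step-ahead prediction: x_hat(n+k) = mean + sum_i w(i)*(x(n-i) - mean).
    `history` is given oldest-first, so reversing yields x(n), x(n-1), ..."""
    dev = np.asarray(history[::-1], dtype=float) - mean
    return mean + float(w @ dev)

w = mmse_fgn_predictor(H=0.93, p=10, k=5)  # p and k as in Section 5; H from Table II
# hypothetical last p samples of a cells-per-Tu trace with mean rate ~95.8 cells/Tu
x_hat = predict(w, [90, 110, 95, 130, 150, 140, 120, 100, 85, 95], mean=95.8)
```

Because the fGn autocorrelation matrix is positive definite for 0.5 < H < 1, the system always has a unique solution.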
Fig 4.1 Multiplexing structure

The rate of the output link reserved to each source depends on the linear prediction of the incoming traffic: if the overall prediction is greater than the link capacity, the exceeding cells are tagged with the low priority level. The priority management strategy (for a generic input flow labelled by i), presented in figure 4.2, consists of the following steps:
• at the beginning of each Time Unit, the arrival counter Ai is set to 0 and a threshold Pi is evaluated according to the selected prediction strategy;
• at the arrival of a new cell from the considered input stream, Ai is incremented by 1;
• the new value of Ai is compared with the threshold Pi: if (Ai > Pi) CellPriority = 1;
• if the previous condition is not matched, the overall number of arrivals (over all the input streams) is compared with the link capacity C: if (Σ_{i=1}^{N} Ai > C) CellPriority = 1; else CellPriority = 0;
• the cell is then sent to the queue.

Fig 4.2 Priority Management Strategy

As mentioned above, in the queuing system of the multiplexer, incoming cells are served according to the FIFO strategy. The priority is used exclusively to manage cell discarding: at the arrival of a high-priority cell, if the buffer is full, the new cell is queued in the last position of the buffer, provided that at least one low-priority cell is currently stored (otherwise it must be rejected). The free space in the buffer is obtained by discarding the oldest low-priority cell. Low-priority arrivals are simply refused if the buffer is full (irrespective of the priorities of the cells in the buffer).

5 EXPERIMENTAL RESULTS

The multiplexing performance obtained with the proposed dynamic bandwidth allocation strategy has been compared with a particular fixed rate resource reservation in terms of cell loss probability and mean waiting time. The simulation studies have been conducted using the previously described Ethernet LAN traces in the following working conditions:
• each input trace represents the number of arrivals per Time Unit;
• the interarrival time between cells is assumed to be uniformly distributed over each Time Unit;
• the waiting queue length has been chosen equal to 1000 cells (i.e. a maximum delay of around 300 msec).
In the fixed rate scheme, the input flows share the output capacity proportionally to their peak values. The prediction for each source trace has been made using the MMSE predictor with order p = 10 and number of steps ahead k = 5 (corresponding to an update time of half a second for the bandwidth allocated to each input stream). The chosen value of p achieves a negligible mean square error with a limited complexity of the linear filter. Although the results presented in this section refer to the model-based approach, they do not show significant deterioration with respect to the use of the real sample autocorrelation function. In figure 5.1 the cell loss probabilities of the two strategies are compared versus the normalised offered load (i.e. offered load to capacity ratio) of the multiplexing system.

Fig 5.1 Cell loss probability with linear prediction (continuous line) and fixed rate allocation (dashed line).

The dynamic multiplexing strategy gives rise to significantly better performance than that achievable with the fixed rate allocation: in terms of offered load, the multiplexing gain is around 15%.
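As an illustration, the priority tagging of figure 4.2 and the pushout discarding of the multiplexer buffer can be sketched as follows. This is a toy two-flow example with hypothetical thresholds P (standing for the per-flow predictions) and capacity C, not the simulator used for the results reported here:

```python
def tag_priority(i, A, P, C):
    """Priority logic of figure 4.2 for a cell of flow i (A[i] already incremented):
    tag = 1 marks a LOW-priority (discardable) cell, tag = 0 a high-priority one."""
    if A[i] > P[i]:              # flow exceeded its predicted share Pi
        return 1
    if sum(A) > C:               # overall arrivals exceeded the link capacity C
        return 1
    return 0

def enqueue(queue, tag, buffer_size):
    """FIFO buffer with pushout: a high-priority arrival may displace the oldest
    low-priority cell when the buffer is full; other full-buffer arrivals are lost."""
    if len(queue) < buffer_size:
        queue.append(tag)
        return True
    if tag == 0:
        for idx, t in enumerate(queue):
            if t == 1:           # discard the oldest low-priority cell
                del queue[idx]
                queue.append(tag)
                return True
    return False                 # cell rejected

# One Time Unit with hypothetical thresholds and a small buffer.
A, P, C = [0, 0], [3, 2], 4
queue, buffer_size, accepted = [], 5, 0
for i in [0, 0, 1, 0, 1, 1, 0]:  # flow index of each arriving cell
    A[i] += 1
    if enqueue(queue, tag_priority(i, A, P, C), buffer_size):
        accepted += 1

# Start of a new Time Unit: counters reset; a high-priority arrival now pushes
# out the low-priority cell left in the full buffer.
A = [0, 0]
A[0] += 1
if enqueue(queue, tag_priority(0, A, P, C), buffer_size):
    accepted += 1
```

Note that service remains strictly FIFO: the tag only decides which cell is sacrificed when the buffer is full.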
Fig 5.2 Comparison based on mean waiting time with linear prediction (continuous line) and fixed rate allocation (dashed line).

As shown in figure 5.2, the analysis based on the mean waiting time (a fundamental parameter when dealing with real-time applications) leads to similar results. For example, values of the mean waiting time around 10 msec can be reached only when the normalised load is equal to 0.61 and 0.85, respectively, for the fixed and the dynamic allocation strategy. From the previous results it can be observed that, even with the fixed rate allocation, an output link capacity of around 50% of the overall peak rate (corresponding to a normalised load equal to 0.85) is sufficient to avoid relevant loss probabilities. This can be explained by observing that, although persistence manifests itself in the different activity levels, the highest peaks of the analysed traces are relatively short in duration. Hence, in the considered scenario, the multiplexer can manage the peaks thanks to the queue buffer, without introducing any cell loss.

The results of the previous simulations have been validated also by using “shuffled” traces [3] with the same autocovariance structure as the original data, confirming in this way the dependence of network performance on long-term correlation. Even a particular rearrangement of the traces, leading to the coincidence of the three peaks, does not introduce significant differences in queuing performance. In fact, the multiplexing strategy efficiently handles the incoming traffic: in steady-state conditions, the number of queued cells is equal to the prediction error, and so the probability that a sudden spike finds a relatively empty buffer is quite high. This behaviour absorbs traffic peaks without increasing the loss probability. The various simulations showed that a higher resource utilisation can be reached when the number of multiplexed sources increases, as highlighted in the following table.

Number of sources | Allocation | Mean Traffic/C | E{Tw} (sec) | Pl
3                 | dynamic    | 0.7            | 1.6E-3      | 0
3                 | fixed      | 0.7            | 19E-3       | 9.3E-5
3                 | dynamic    | 0.8            | 0.01        | 1.3E-4
3                 | fixed      | 0.8            | 0.102       | 6.6E-3
6                 | dynamic    | 0.7            | 6E-4        | 0
6                 | fixed      | 0.7            | 16E-3       | 1E-3
6                 | dynamic    | 0.8            | 5E-3        | 6.5E-5
6                 | fixed      | 0.8            | 62E-3       | 12E-3

Tab. III
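The text does not specify how the correlation-preserving “shuffles” were generated; one standard construction (shown here purely as an illustration) is a phase-randomisation surrogate, which keeps the amplitude spectrum of the trace, and hence its circular autocovariance, while randomising the Fourier phases:

```python
import numpy as np

def surrogate(x, rng):
    """Phase-randomised surrogate of x: preserve the amplitude spectrum
    (and therefore mean, variance and autocovariance), randomise the phases."""
    n = len(x)
    X = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, len(X))
    phases[0] = 0.0                      # keep the zero-frequency term real: preserves the mean
    if n % 2 == 0:
        phases[-1] = 0.0                 # the Nyquist term must stay real for even n
    return np.fft.irfft(np.abs(X) * np.exp(1j * phases), n)

rng = np.random.default_rng(1)
x = rng.normal(96.0, 55.0, 4096)         # stand-in for a measured cells-per-Tu trace
y = surrogate(x, rng)                    # mean, variance and autocovariance preserved
```

A plain random permutation of the samples, by contrast, destroys the LRD structure, which is why the two kinds of rearrangement let the authors separate the effect of long-term correlation from that of the marginal distribution.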
As in the previous simulations, the results are “robust” with respect to trace shuffling (provided that LRD is preserved). Moreover, since in a static peak-rate allocation strategy the bandwidth is determined by the sum S of the peak values of all the input streams, the multiplexing gain grows as the ratio between S and the mean activity level increases. This confirms the efficiency of the proposed bandwidth allocation scheme and of the adopted multiplexing strategy.

6 CONCLUSIONS

The analysis carried out refers to several videoconferencing sessions set up among workstations and PCs on different LANs interconnected by the MAN “Tuscany”. The acquired traces showed significant burstiness and high values of the peak-to-mean ratio, with a short time duration of the most relevant peaks. Starting from these statistical features, the objective of the paper is to emphasise the advantages, over the conservative method of peak allocation for each source, of a more efficient Dynamic Bandwidth Allocation strategy, able to manage the peaks of each traffic flow using the channel capacity left unused by the other sources. The simulations, driven by several shuffled versions of the measured traces, highlighted that the proposed scheme is able to reach a resource utilisation of around 0.77 with negligible loss probabilities (lower than 10^-5) and, at most, a few milliseconds of queuing delay. The obtained values of resource utilisation are remarkable, especially considering the high burstiness of the analysed traces. The main advantage of the proposed scheme is its structural and algorithmic simplicity, based on the use of an extremely agile logic.

At least for traffic characterised by spikes and high peak-to-mean ratios, the analysed multiplexing algorithm can be used to handle the set-up of a new connection. An equivalent bandwidth can be defined as the ratio between the mean traffic intensity and the normalised traffic load. The simulation results indicated that a normalised load of around 0.7 is able to match the Quality of Service requirements of VBR video services. The connection can be accepted only if the available capacity in the system is higher than the just defined equivalent bandwidth. This strategy, extremely simple in its implementation, is relatively conservative since, as stated above, the performance of the multiplexer improves as the number of independent input streams increases: it is reasonable to assume that a new call, provided it respects the negotiated traffic parameters, does not alter the overall network performance.

REFERENCES
[1] M. Livny, B. Melamed, A. K. Tsiolis, “The impact of autocorrelation on queuing systems”, Management Science, 39, pp. 322-339, 1993.
[2] I. Norros, “A storage model with self-similar input”, Queueing Systems and their Applications, 16, pp. 387-396, 1994.
[3] A. Erramilli, O. Narayan, W. Willinger, “Experimental queueing analysis with long-range dependent packet traffic”, IEEE/ACM Transactions on Networking, vol. 4, no. 2, pp. 209-223, April 1996.
[4] R. G. Garroppo, S. Giordano, M. Pagano, F. Russo, “Self Similar Source Modelling of VBR Packet-Video Traffic”, PVW’96, Brisbane, March 18-19, 1996, pp. 1-6.
[5] A. Fenyves, S. Giordano, A. Lazzari, M. Pagano, F. Romani, F. Russo, “Traffic Measurements on Broadband Networks and discrete event queuing performances simulations”, Eighth IEEE Workshop on Local and Metropolitan Area Networks, August 25-28, 1996.
[6] S. Giordano, G. Pierazzini, F. Russo, “Multimedia experiments at the University of Pisa: from videoconference to random fractals”, INET ’95, Honolulu (Hawaii), June 26-28, 1995, pp. 543-550.
[7] W. E. Leland, M. S. Taqqu, W. Willinger, D. V. Wilson, “On the Self-Similar Nature of Ethernet Traffic (extended version)”, IEEE/ACM Transactions on Networking, vol. 2, no. 1, February 1994, pp. 1-15.
[8] A. Adas, “Supporting real time VBR video using Dynamic reservation based on linear prediction”, Proc. of IEEE Infocom ’96, San Francisco, March 24-28, 1996, pp. 1476-1483.