
VoIP Playout Buffer Adjustment using Adaptive Estimation of Network Delays

Miroslaw Narbutt and Liam Murphy*
Department of Computer Science, University College Dublin, Belfield, Dublin 4, IRELAND

Abstract

The poor quality of Voice over IP can be improved by adaptive playout buffering at the receiver. This technique dynamically adapts the playout deadline to network conditions, thus minimizing both late packet loss and buffering time. A standard playout buffer strategy uses an estimate (Exponentially Weighted Moving Average) of the mean and variance of network delay to set the playout deadline. This estimation is characterized by a fixed, constant weighting factor. We show that tuning of this parameter so that the strategy works very well for all network conditions is not feasible. Therefore we propose to extend this standard buffer strategy by replacing the fixed, constant weighting factor with a dynamic one. In our solution, the weighting factor is dynamically adjusted according to the observed delay variations. When these variations are high (which implies that the network conditions are changing), the parameter is set low, and vice versa. This allows rapid adaptation to network variations and reduces the frequency of late packets (or buffering time). Simulations and experimental results show that with our strategy, the trade-off between buffering delay and late packet loss at the receiver is improved significantly.

1. INTRODUCTION

A typical VoIP application buffers incoming packets and delays their playout in order to compensate for variable network delays (jitter). This allows the slowest packets to arrive in time to be played out. Fluctuating end-to-end network delays may cause playout times to increase to a level that is irritating to users (when the buffer is too big) or may cause packet losses due to late arrivals (when the buffer is too small). The two conflicting goals of minimizing buffering time and minimizing late packet loss have engendered various playout algorithms. The need for adaptive buffering arises when the end-to-end delay is high (close to or above the interactivity constraint of 100-150 ms) and when the delay is unknown, so that the receiver does not know how to select appropriate playout times [1]. An adaptive playout mechanism makes it possible to balance the length of the buffer, a major addition to end-to-end delay, against the possibility of packet loss. Generally, a good playout algorithm should achieve the best possible trade-off between loss and delay. In this paper we present a new playout buffer algorithm that significantly improves this trade-off.

In section two we demonstrate the motivation for our work and outline the basic idea of the proposed algorithm. In section three the new algorithm is described and potential improvements are outlined. Its effectiveness is then evaluated through simulations using a network emulator (section four) and through experiments on a real network (section five). In section six the effects of the new buffering scheme on subjective quality are addressed. Finally, in section seven conclusions are drawn.

2. MOTIVATION

Most of the adaptive playout algorithms described in the literature perform continuous estimation of the network delay and its variation to dynamically adjust the talkspurt playout time. The standard adaptive playout algorithm [2] is based on Jacobson’s work on TCP round-trip time estimation [3]. The algorithm estimates two statistics, the delay itself and its variance, and uses them to calculate the playout time. Both estimates take the form:

$$\hat{d}_i = \alpha \cdot \hat{d}_{i-1} + (1 - \alpha) \cdot n_i$$

$$\hat{v}_i = \alpha \cdot \hat{v}_{i-1} + (1 - \alpha) \cdot |\hat{d}_i - n_i|$$

where $\hat{d}_i$ and $\hat{v}_i$ are the i-th estimates of the delay and its variance respectively, while $n_i$ is the i-th packet delay. Parameter α has a critical impact on the rate of convergence of this estimation. Following the claim made in [2], and in accordance with NeVot [4], the weighting factor α is fixed and chosen to be high (α = 0.998002) to limit the sensitivity of the estimation to short-term packet jitter. By experimenting with different values of α we observed that such a high value of α is good only when network conditions are stable (delay and jitter are constant). When network conditions are changing rapidly (sudden increases or decreases in delay), smaller values of α (0.7, 0.8, 0.9) were more appropriate. Figure 1 illustrates that as α decreases, the calculated playout times (solid lines) track variations of network delays (dots) more efficiently. As a result fewer packets arrive too late (down from 3.5% to 1%) and the average buffering time is smaller (down from 27.8 ms to 7.4 ms).

Fig. 1. Calculated playout times for two different values of α

Unfortunately, finding a single value of the parameter α that works well for all network conditions is not easy, and may not even be feasible. Figures 2 and 3 show that there is no optimal fixed value of α when network conditions vary in time.

Fig. 2, 3. Calculated playout times for various values of α

When jitter is small and fluctuations in the end-to-end delays are large (Fig. 2), the best results are achieved when α is small. In this case both the packet loss ratio and average buffering time are relatively small (3.7% of lost packets and 3ms of buffering time). When α is set to 0.998002, the packet loss ratio is high (11.7%), and the buffering time is much larger than necessary (36.6ms). On the other hand, when jitter is large but average network delay is constant (Fig. 3), the best results are achieved when α = 0.998002. In this case, the packet loss ratio is below 1%. When α is small, the algorithm is too sensitive to short-term delay jitter and this causes larger late packet loss (2.7%). Since there is no optimal fixed value of α that works well for all network conditions we claim that the accuracy of the estimates can be greatly improved by dynamically choosing the values of α.
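The effect of a fixed α on convergence can be illustrated with a minimal sketch of the two estimator equations from this section. The synthetic trace (a single 50 ms step in delay with no jitter), the initial values of the estimates, and the function name `ewma_estimates` are assumptions made for illustration; they are not taken from the experiments reported here.

```python
def ewma_estimates(delays, alpha):
    """Return per-packet (d_hat, v_hat) using a fixed weighting factor alpha."""
    d_hat = delays[0]  # assumption: seed the delay estimate with the first sample
    v_hat = 0.0        # assumption: start with zero estimated variation
    out = []
    for n in delays:
        # d_hat_i = alpha * d_hat_{i-1} + (1 - alpha) * n_i
        d_hat = alpha * d_hat + (1 - alpha) * n
        # v_hat_i = alpha * v_hat_{i-1} + (1 - alpha) * |d_hat_i - n_i|
        v_hat = alpha * v_hat + (1 - alpha) * abs(d_hat - n)
        out.append((d_hat, v_hat))
    return out


# Synthetic trace (ms): the delay steps from 100 to 150 halfway through.
trace = [100.0] * 50 + [150.0] * 50

slow = ewma_estimates(trace, 0.998002)  # the fixed value used in [2]
fast = ewma_estimates(trace, 0.8)       # one of the smaller values tried above

print(f"alpha=0.998002: final delay estimate {slow[-1][0]:.1f} ms")
print(f"alpha=0.8:      final delay estimate {fast[-1][0]:.1f} ms")
```

On this trace the estimate with α = 0.8 reaches the new delay level within the 50 packets that follow the step, while the estimate with α = 0.998002 has barely moved, which is the behaviour described for Fig. 2. The opposite case (Fig. 3, large jitter around a constant mean) would need a noisy trace and is not shown.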

3. PLAYOUT BUFFER ALGORITHM WITH ADAPTIVE α

The idea behind our algorithm is to adaptively adjust the value of α depending on the variation in the network delays (α is set high when end-to-end variations are small, and vice versa). This new, dynamic parameter α (recomputed with each incoming packet) can be used to perform continuous estimation of the network delay and its variation in the same way as before. Let $\alpha_i$ be a dynamic parameter based on new estimates of the variance $\hat{v}'_i$ of the end-to-end delays between source and destination:

$$\alpha_i = f(\hat{v}'_i)$$

where the function $f(\hat{v}'_i)$ was chosen experimentally to maximize the performance of our algorithm over a large set of network traces. The dynamic version of parameter α is now used to maintain adaptive estimates of the average delay and its variation:

$$\hat{d}_i = \alpha_i \cdot \hat{d}_{i-1} + (1 - \alpha_i) \cdot n_i$$

$$\hat{v}_i = \alpha_i \cdot \hat{v}_{i-1} + (1 - \alpha_i) \cdot |\hat{d}_i - n_i|$$

Finally, the playout time $p_i$ at which the i-th packet, assumed to be the first packet in a talkspurt, is played at the destination is calculated as follows:

$$p_i = t_i + \hat{d}_i + \beta \cdot \hat{v}_i$$

where $t_i$ is the time at which the i-th packet was sent (Fig. 4). Parameter β controls the delay/packet loss ratio. The larger the coefficient, the more packets are played out, at the expense of longer delays. Any subsequent packets of that talkspurt are played out at a rate equal to the generation rate at the sender, that is,

$$p_j = p_i + t_j - t_i$$
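The adaptive estimator can be sketched as follows. The function f is described above only as experimentally chosen, so the piecewise-linear mapping `alpha_from_variation`, its thresholds, and the way the auxiliary variation estimate is computed (a short-memory average of successive delay differences) are purely hypothetical stand-ins used to show the mechanism, not the authors' choices.

```python
def alpha_from_variation(v, v_low=0.5, v_high=5.0, a_min=0.7, a_max=0.998002):
    """Hypothetical f: alpha is high for small variation and low for large variation."""
    if v <= v_low:
        return a_max
    if v >= v_high:
        return a_min
    frac = (v - v_low) / (v_high - v_low)
    return a_max - frac * (a_max - a_min)


def adaptive_estimates(delays, probe_alpha=0.9):
    """Return per-packet (alpha_i, d_hat, v_hat) with alpha recomputed per packet."""
    d_hat, v_hat = delays[0], 0.0  # assumption: same seeding as a fixed-alpha run
    v_probe = 0.0                  # stand-in for the auxiliary variance estimate v'_i
    prev = delays[0]
    out = []
    for n in delays:
        # Assumed form of v'_i: short-memory average of |n_i - n_{i-1}|.
        v_probe = probe_alpha * v_probe + (1 - probe_alpha) * abs(n - prev)
        prev = n
        a = alpha_from_variation(v_probe)           # alpha_i = f(v'_i)
        d_hat = a * d_hat + (1 - a) * n             # delay estimate
        v_hat = a * v_hat + (1 - a) * abs(d_hat - n)  # variation estimate
        out.append((a, d_hat, v_hat))
    return out


# Synthetic trace (ms): the delay steps from 100 to 150 halfway through.
trace = [100.0] * 50 + [150.0] * 50
est = adaptive_estimates(trace)
print(f"alpha before the step: {est[10][0]:.6f}")
print(f"alpha at the step:     {est[50][0]:.3f}")
print(f"final delay estimate:  {est[-1][1]:.1f} ms")
```

With these assumed settings α stays at its high value while the delay is stable, drops when the step occurs, and returns to the high value as the observed variation dies away, so the delay estimate moves most of the way to the new level.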

Fig. 4. Playout time estimation. [Timing diagram with sender, receiver and speaker timelines, showing the sending times t_i and t_j, the reception time, the playout times p_i and p_j, the network delay n_i, the buffering delay and the playout delay.]

This mechanism uses the same playout delay throughout a given talkspurt but permits different playout delays for different talkspurts. The variation of the playout delay introduces artificially elongated or reduced silence periods between successive talkspurts.
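The per-talkspurt scheduling rule can be illustrated with a short sketch that fixes the playout delay at the first packet of each talkspurt and marks packets arriving after their playout time as late. The function name, the input layout, the hand-picked estimate values and β = 4.0 are assumptions for illustration only.

```python
def schedule_playout(packets, estimates, beta=4.0):
    """Compute playout times per talkspurt.

    packets:   list of (send_time, network_delay, first_in_talkspurt)
    estimates: list of (d_hat, v_hat) aligned with packets
    Returns a list of (playout_time, is_late) per packet.
    """
    result = []
    p_first = t_first = None
    for (t, n, first), (d_hat, v_hat) in zip(packets, estimates):
        if first or p_first is None:
            # First packet of a talkspurt: p_i = t_i + d_hat_i + beta * v_hat_i
            t_first = t
            p_first = t + d_hat + beta * v_hat
        # Every packet of the talkspurt: p_j = p_i + t_j - t_i
        p = p_first + (t - t_first)
        arrival = t + n
        result.append((p, arrival > p))
    return result


# One talkspurt of three packets sent 30 ms apart (all times in ms).
packets = [(0.0, 100.0, True), (30.0, 110.0, False), (60.0, 140.0, False)]
# Hand-picked estimates; only the first pair is used within the talkspurt.
estimates = [(100.0, 5.0), (101.0, 5.5), (104.0, 8.0)]

for (p, late), (t, n, _) in zip(schedule_playout(packets, estimates), packets):
    print(f"sent {t:5.1f}  arrives {t + n:5.1f}  playout {p:5.1f}  late={late}")
```

Here the first packet fixes a playout delay of 120 ms for the whole talkspurt, so the third packet, whose network delay is 140 ms, misses its playout time. A larger β would have saved it at the cost of more buffering for the other two.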

4. BUFFERING PERFORMANCE TESTS THROUGH NETWORK EMULATIONS

We tested the performance of the new algorithm through network emulations. For the tests we chose the NISTNET 2.1.0 network emulation software [5] and modeled various delay patterns (Figs. 5, 6, 7, 8) using its default Pareto distribution.

Fig. 5. First delay pattern - delay and jitter are constant (delay = 100 ms, jitter = 50 ms).

Fig. 6. Second delay pattern - delay is constant and jitter varies in time (delay = 100 ms, jitter jumps between 0, 10, 20, 30, 40, 50 ms every minute).

Fig. 7. Third delay pattern - delay varies in time, jitter is constant (delay jumps between 100, 150 and 200 ms every minute, jitter = 30 ms).

Fig. 8. Fourth delay pattern - delay and jitter vary in time (delay jumps between 50, 100 and 150 ms, jitter jumps between 0, 10, 20, 30, 40, 50 ms every 10 seconds).

During the experiments we used two voice sources (with and without hangover time). Following ITU-T Recommendation P.59 [6], human speech was modeled as a process that alternates between talkbursts and silence periods that follow exponential distributions (Figs. 9, 10), with means of 227 ms and 596 ms respectively without hangover time, or 1004 ms and 1587 ms with hangover time. In our model voice packets were generated every 30 ms. No packets were generated during silence periods. The total duration of each simulation was 1 hour.
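A minimal sketch of such an on/off voice source is given below. It draws talkburst and gap durations from exponential distributions with the means quoted above and emits one packet every 30 ms during talkbursts. The function name, the seeding, and the rule that every talkburst produces at least one packet are assumptions. The source used in the experiments may have differed in these details.

```python
import random


def voice_source(total_ms, mean_talk_ms, mean_gap_ms, packet_ms=30, seed=0):
    """Yield (send_time_ms, first_in_talkspurt) for an on/off voice source."""
    rng = random.Random(seed)
    t = 0.0
    while t < total_ms:
        # Talkburst duration: exponential with the given mean.
        talk = rng.expovariate(1.0 / mean_talk_ms)
        # Assumption: a talkburst always carries at least one packet.
        n_packets = max(1, int(talk // packet_ms))
        for k in range(n_packets):
            send = t + k * packet_ms
            if send >= total_ms:
                return
            yield send, k == 0
        # Silence period: exponential with the given mean, no packets sent.
        t += talk + rng.expovariate(1.0 / mean_gap_ms)


# One minute of speech without hangover time (means of 227 ms and 596 ms).
packets = list(voice_source(60_000, 227, 596))
talkspurts = sum(1 for _, first in packets if first)
print(f"{len(packets)} packets in {talkspurts} talkspurts")
```

The `first_in_talkspurt` flag marks the packets at which a receiver would recompute its playout delay.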

Fig. 9. Talkbursts and gaps generated by the voice source without hangover time. [Histograms of talkburst and gap durations; x-axis: duration [ms], y-axis: number of talkbursts / gaps. Talkbursts: mean 227 ms, min 33 ms, max 1760 ms, total talk time 1001 s. Gaps: mean 596 ms, min 52 ms, max 5122 ms, total gaps time 2599 s.]

Fig. 10. Talkbursts and gaps generated by the voice source with hangover time. [Histograms of talkburst and gap durations; x-axis: duration [ms], y-axis: number of talkbursts / gaps. Talkbursts: mean 1004 ms, min 79 ms, max 7363 ms, total talk time 1447 s. Gaps: mean 1587 ms, min 79 ms, max 11840 ms, total gaps time 2152 s.]

In order to compare the performance of the new playout algorithm with the basic one, we recorded network delays at the receiver and processed that data with a program that simulated the behaviour of the two algorithms. The delay/packet loss ratio was controlled by different values of the β factor (2
