Prediction of TCP Throughput: Formula-based and History-based Methods

Qi He, Constantinos Dovrolis, Mostafa Ammar
Georgia Tech
[email protected], [email protected], [email protected]


Categories and Subject Descriptors: C.2.5 [Computer Communication Networks]: Internet

General Terms: Experimentation, Measurement

Copyright is held by the author/owner. SIGMETRICS'05, June 6–10, 2005, Banff, Alberta, Canada. ACM 1-59593-022-1/05/0006.

1. INTRODUCTION

With the advent of overlay and peer-to-peer networks, Grid computing, and CDNs, network performance prediction becomes an essential task. Such predictions are used in path selection schemes for overlay and multihomed networks, dynamic server selection, and peer-to-peer parallel downloads. In this work, we focus on the throughput prediction of a bulk TCP transfer ("target flow") on a particular network path, prior to starting that flow. We first classify the existing prediction techniques into two categories: Formula-Based (FB) and History-Based (HB). Within each class, we develop representative prediction algorithms that we evaluate empirically over the RON testbed. Our goal is to examine the key issues with each prediction category, evaluate their accuracy under different conditions, and provide insight regarding the factors that affect the predictability of TCP throughput. This note is a summary of [2]; the interested reader can find more details about our approach and results in that paper.

2. FORMULA-BASED PREDICTION

The central component of a Formula-Based (FB) predictor is a mathematical formula that expresses the TCP throughput as a function of the underlying path characteristics, such as loss rate, RTT, and available bandwidth. The TCP throughput formula that we use is the PFTK result of [3]:

E[R] = \min\left( \frac{W}{T},\ \frac{M}{T\sqrt{\frac{2bp}{3}} + T_0 \min\left(1, 3\sqrt{\frac{3bp}{8}}\right) p \left(1 + 32p^2\right)} \right)    (1)

where p and T are the average loss rate and RTT experienced by the target TCP flow, T_0 is the TCP retransmission timeout period, W is the maximum window size (limited by the send or receive socket buffer size), while M and b are, respectively, the TCP segment size and the number of segments released per ACK.

The main advantage of FB prediction is that it does not require any history of previous TCP transfers. A typical FB predictor measures the loss rate and RTT before the transfer with utilities such as ping, and then applies the estimated loss rate p̂ and RTT T̂ to the throughput formula. The FB predictor that we use is given below:

\hat{R} = \begin{cases} \min\left( \frac{W}{\hat{T}},\ \frac{M}{\hat{T}\sqrt{\frac{2b\hat{p}}{3}} + T_0 \min\left(1, 3\sqrt{\frac{3b\hat{p}}{8}}\right) \hat{p} \left(1 + 32\hat{p}^2\right)} \right) & \text{if } \hat{p} > 0 \\ \min\left( \frac{W}{\hat{T}},\ \hat{A} \right) & \text{if } \hat{p} = 0 \end{cases}    (2)

Note that for lossless paths (p̂ = 0), formula (1) does not apply, and the prediction is based on the available bandwidth estimate Â (avail-bw), unless the flow is window-limited, i.e., W/T̂ < Â. For lossy paths (p̂ > 0), the accuracy of predictor (2) depends on how close T̂ is to T, and p̂ is to p. For lossless paths, the accuracy depends on how close Â is to R. The following factors could cause prediction errors:

Errors due to the extra load introduced by the target flow: The target flow can cause an increase in the queueing delay and/or the loss probability on the network path, especially if the bottleneck link is heavily loaded. Therefore, the estimates T̂ and p̂ can be lower than the RTT T and loss rate p experienced by the flow. The net effect is that the FB predictor would overestimate the actual TCP throughput.

Errors due to TCP sampling behavior: TCP reduces its throughput when the path is congested, so it tends to underestimate the RTT and loss rate in heavy load conditions. On the other hand, when self-clocking fails, TCP tends to send long packet bursts, experiencing higher queueing delays and loss rates than measurement techniques that generate a periodic packet stream (such as ping). As a result, the FB prediction can be either an underestimate or an overestimate of the actual TCP throughput.

Errors due to the difference between avail-bw and TCP throughput: In lossless paths, the avail-bw Â of the path prior to a flow can be lower or higher than the actual throughput the flow attains, depending on factors such as the congestion responsiveness of the background traffic and the buffering at bottleneck links [4].
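To make predictor (2) concrete, the following minimal Python sketch evaluates it from pre-transfer estimates of the loss rate, RTT, and avail-bw. The parameter defaults (segment size, segments per ACK, and the retransmission timeout taken as 4·T̂) are illustrative assumptions, not the settings used in our experiments.

```python
from math import sqrt

def fb_predict(p_hat, rtt_hat, availbw_hat, W, M=1460.0, b=2, T0=None):
    """Formula-based prediction in the spirit of predictor (2).

    p_hat       : estimated loss rate (fraction of packets lost)
    rtt_hat     : estimated RTT T^ in seconds
    availbw_hat : estimated avail-bw A^ in bytes/sec (used only when p_hat == 0)
    W           : maximum window size in bytes (socket-buffer limit)
    M           : TCP segment size in bytes (1460 is an illustrative default)
    b           : segments acknowledged per ACK (2 with delayed ACKs)
    T0          : retransmission timeout in seconds; taken as 4 * rtt_hat if not
                  given, which is an illustrative assumption
    Returns the predicted throughput in bytes/sec.
    """
    if T0 is None:
        T0 = 4.0 * rtt_hat
    window_limit = W / rtt_hat  # throughput cap imposed by the maximum window
    if p_hat > 0:
        # PFTK formula (1), evaluated with the pre-transfer estimates
        denom = (rtt_hat * sqrt(2.0 * b * p_hat / 3.0)
                 + T0 * min(1.0, 3.0 * sqrt(3.0 * b * p_hat / 8.0))
                 * p_hat * (1.0 + 32.0 * p_hat ** 2))
        return min(window_limit, M / denom)
    # Lossless path: fall back to the avail-bw estimate, unless window-limited.
    return min(window_limit, availbw_hat)
```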

3. HISTORY-BASED PREDICTION

History-based (HB) techniques predict the throughput of TCP flows from a time series of previous TCP throughput measurements on the same path. The HB approach is feasible in applications where large TCP transfers are performed repeatedly over the same path. Our investigation of HB prediction is based on simple linear predictors, including the Moving Average, the Exponentially Weighted Moving Average (EWMA), and non-seasonal Holt-Winters.
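For concreteness, these three predictors can be written in a few lines of Python; the smoothing parameters shown below are illustrative defaults, not the settings used in our experiments.

```python
def moving_average(history, k=5):
    """Predict the next value as the mean of the last k throughput samples."""
    window = history[-k:]
    return sum(window) / len(window)

def ewma(history, alpha=0.3):
    """Exponentially Weighted Moving Average: the smoothed value is the forecast."""
    pred = history[0]
    for x in history[1:]:
        pred = alpha * x + (1.0 - alpha) * pred
    return pred

def holt_winters(history, alpha=0.3, beta=0.1):
    """Non-seasonal Holt-Winters: smooths both the level and the trend."""
    level, trend = history[0], 0.0
    for x in history[1:]:
        prev_level = level
        level = alpha * x + (1.0 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1.0 - beta) * trend
    return level + trend  # one-step-ahead forecast
```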



We have not examined more complex predictors, such as ARMA or ARIMA, because selecting both their order and their coefficients requires a large number of past measurements. While experimenting with various predictors, we found that two heuristics can noticeably improve the accuracy of HB predictors: the first is to detect and ignore outliers, and the second is to detect level shifts and restart the HB predictor. In addition, after applying these two heuristics, we found no major differences among the simple linear HB predictors listed above.
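One possible realization of the two heuristics is sketched below, layered on top of any of the base predictors from the previous sketch; the detection thresholds are illustrative assumptions, not the exact rules used in our experiments.

```python
def robust_predict(history, base_predict, outlier_factor=3.0, shift_len=3):
    """Apply the two heuristics on top of a base HB predictor (e.g. ewma or
    holt_winters): ignore isolated outliers, and restart the predictor when a
    level shift is detected. The thresholds are illustrative assumptions."""
    kept = [history[0]]   # samples actually fed to the base predictor
    suspect = []          # consecutive samples that deviate strongly from the forecast
    for x in history[1:]:
        pred = base_predict(kept)
        deviation = max(x, pred) / max(min(x, pred), 1e-9)
        if deviation > outlier_factor:
            suspect.append(x)
            if len(suspect) >= shift_len:
                # Sustained deviation: treat it as a level shift and restart
                # the predictor from the samples at the new level.
                kept, suspect = suspect[:], []
            # Otherwise: treat the sample as an outlier and ignore it for now.
        else:
            suspect = []
            kept.append(x)
    return base_predict(kept)
```

For example, robust_predict(series, holt_winters) would combine the level-shift restart with the Holt-Winters predictor from the previous sketch.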

4. RESULTS

We collected 245 TCP throughput time series from 35 Internet paths connecting RON nodes [1]. Seven "traces" were collected on each path, and each trace consists of 150 back-to-back "epochs". An epoch consists of an avail-bw measurement Â using Pathload, followed by a measurement of p̂ and T̂ using a ping utility that generates a 41-byte probing packet every 100 ms, followed by a 50-second TCP transfer (target flow) generated by IPerf, which gives us the actual TCP throughput R on the path. Unless otherwise noted, we used W = 1 MB, which is large enough to cause congestion on all the paths in our experiments.

Based on the measurements p̂, T̂, and Â, we calculate the FB predicted throughput R̂_F using (2). Similarly, we apply the predictors mentioned in §3 to the time series of past TCP throughput measurements to obtain the HB prediction R̂_H. For a prediction R̂ and an actual measurement R, we define the relative prediction error E as

E = \frac{\hat{R} - R}{\min(\hat{R}, R)}

To report a single figure for n measurements in a time series, we use the Root Mean Square Relative Error (RMSRE).
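In code, the two error metrics translate directly into the following minimal sketch (the function names are ours):

```python
from math import sqrt

def relative_error(pred, actual):
    """Relative prediction error E = (R^ - R) / min(R^, R)."""
    return (pred - actual) / min(pred, actual)

def rmsre(preds, actuals):
    """Root Mean Square Relative Error over n prediction/measurement pairs."""
    errors = [relative_error(p, a) for p, a in zip(preds, actuals)]
    return sqrt(sum(e * e for e in errors) / len(errors))
```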

We highlight next some key results of our empirical evaluation.

Formula-based Prediction: Our evaluation has shown that FB prediction can be very inaccurate, mostly in lossy paths and when the target flow saturates the underlying path. Figure 1 shows the CDF of E for all measurements. It also shows separately the CDF of E for the subset of predictions that are based on the PFTK model versus the avail-bw estimate Â. The fact that overestimation occurs more often than underestimation indicates that the major cause of prediction errors is that the RTT and loss rate before the transfer are significantly different than while the transfer is in progress. The error caused by the different sampling behavior between TCP and periodic probing is also important. The significance of both errors suggests that more accurate estimates of p and T would help improve FB prediction. We also observed that FB prediction for window-limited flows is much more accurate than for congestion-limited flows, and that FB prediction is highly path dependent.

Figure 1: CDF of relative error E for FB prediction. (Curves: all predictions, lossy-path predictions, lossless-path predictions; x-axis: relative error E; y-axis: CDF (%).)

History-based Prediction: We have evaluated the accuracy of HB prediction with respect to several factors, some of which have not been examined before. Specifically, our empirical results indicate that: (1) even a limited history of sporadic TCP transfers is often sufficient to achieve a fairly good prediction accuracy; (2) simple heuristics to detect outliers and level shifts can significantly reduce the number of large prediction errors; (3) HB prediction is on average much more accurate than FB prediction; (4) different paths can exhibit distinct patterns of prediction accuracy; (5) HB prediction is more accurate when the transfer is window-limited.

Figure 2: HB prediction error with different measurement periods (Holt-Winters predictor). (Curves: 3-, 6-, 24-, and 45-minute intervals; x-axis: RMSRE, log scale; y-axis: CDF.)

Figure 2 shows how the measurement period affects the prediction error. To obtain these results, we down-sampled the original traces at different frequencies and applied the HB predictor (Holt-Winters) to the resulting TCP throughput traces. Using simple queuing models, we also explained why some paths are much more predictable than others, based on the utilization and the degree of multiplexing at the bottleneck link. We concluded that: (1) the relative prediction error increases with the CoV of the underlying time series; (2) the CoV of the avail-bw process (on a non-congested link) or the CoV of a flow's throughput (on a congested link) increases with the offered load on the link; (3) the CoV of the avail-bw process decreases with the number of competing flows on the link, as long as the utilization remains constant.
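As an illustration of this down-sampling procedure, the sketch below (reusing the holt_winters and rmsre helpers from the earlier sketches) reports the RMSRE of one-step-ahead Holt-Winters forecasts at several sampling periods; the mapping of steps to the intervals of Figure 2 assumes roughly 3-minute epoch spacing, which is our approximation.

```python
def downsample(trace, step):
    """Keep every step-th sample, emulating a coarser measurement period."""
    return trace[::step]

def rmsre_vs_period(trace, steps=(1, 2, 8, 15)):
    """RMSRE of one-step-ahead Holt-Winters forecasts at several sampling periods.
    Assuming roughly 3-minute epoch spacing, steps of 1, 2, 8, and 15 approximate
    the 3-, 6-, 24-, and 45-minute intervals of Figure 2 (our approximation)."""
    results = {}
    for step in steps:
        series = downsample(trace, step)
        preds = [holt_winters(series[:i]) for i in range(1, len(series))]
        actuals = series[1:]
        results[step] = rmsre(preds, actuals)
    return results
```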

5. REFERENCES

[1] D. Andersen, H. Balakrishnan, F. Kaashoek, and R. Morris. Resilient Overlay Networks. In Proceedings of ACM Symposium on Operating Systems Principles, October 2001.
[2] Q. He, C. Dovrolis, and M. Ammar. On the Predictability of Large Transfer TCP Throughput. Technical Report GIT-CERCS-05-06, College of Computing, Georgia Tech, 2005.
[3] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose. Modeling TCP Throughput: A Simple Model and its Empirical Validation. IEEE/ACM Transactions on Networking, 8(2):133–145, April 2000.
[4] R. S. Prasad, M. Jain, and C. Dovrolis. Socket Buffer Auto-Sizing for High-Performance Data Transfers. Journal of Grid Computing, Special Issue on High Performance Networking, 1(4):361–376, 2004.
