TCP New Vegas: Improving the Performance of TCP Vegas Over High Latency Links

Joel Sing and Ben Soh
Department of Computer Science and Computer Engineering
La Trobe University
Bundoora VIC 3083, Australia
Email:
[email protected],
[email protected]
Abstract

TCP Vegas provides significant performance gains over most TCP variants, especially when used over networks that utilise error prone links. However, whilst examining the impact of latency on TCP we noticed that the performance of TCP Vegas decreases significantly when the network round trip time exceeds 50ms. This paper details research undertaken to identify the cause of this performance-decreasing behaviour in TCP Vegas. Three sender-side modifications are proposed and implemented as TCP New Vegas, before being validated via simulation.
1. Introduction

Instead of using the traditional congestion control algorithms as designed by Jacobson, TCP Vegas [7] implements congestion control via the use of Round Trip Time (RTT) measurements. Unlike most variants, TCP Vegas does not force the network to drop packets in order to gauge the available network capacity. As a result, the same amount of data can be transferred with less network traffic, achieving higher goodput. Additionally, TCP Vegas is able to more readily identify segments lost due to corruption, retransmitting lost segments sooner than Jacobson based TCP variants. It has been shown [7, 8, 3] that due to these properties TCP Vegas achieves 40-70% more throughput than TCP Reno. Whilst investigating the impact that delayed acknowledgments, latency and initial window size have on a range of TCP variants, we observed some unusual behaviour with TCP Vegas [16]. In general we observed that TCP Vegas' performance decreased significantly when the network latency exceeded an RTT of 50ms, making it a poor choice for users of long delay satellite links, despite its ability to normally perform well in a wireless environment. It was
also noted that performance fluctuated when delayed acknowledgments were implemented at the receiver or an initial window size other than two segments was used. This paper* details research undertaken to identify the cause of the performance-decreasing behaviour within TCP Vegas' congestion control algorithms. Solutions are provided for each of the three primary problems identified. In section 2 we provide a summary of the traditional TCP congestion control algorithms and TCP Vegas' congestion control algorithms. Performance problems exhibited by TCP Vegas are described in section 3. Possible solutions are proposed in section 4, an implementation of these solutions is outlined in section 5 and the results achieved from the implementation are detailed in section 6. Finally, future research is presented and conclusions are drawn.
2. Background

2.1. TCP Tahoe, Reno and New Reno

Most TCP variants implement the slow-start and congestion-avoidance algorithms as designed by Jacobson during the late 1980s, detailed in his renowned paper [11] and implemented in TCP Tahoe. During the slow-start phase, TCP increases its congestion window (cwnd) exponentially, effectively increasing the window size by one segment for each segment that is successfully delivered and acknowledged. At any time, the maximum amount of unacknowledged traffic that may be in transit across the network is the lesser of cwnd and the receiver's advertised window (rwnd). Slow-start is terminated if a segment is dropped by the network, the congestion window reaches the slow-start threshold (ssthresh, which is initialised to 65536 bytes when the connection starts) or the rwnd size is reached.

* Due to space constraints, some content and graphs have been omitted. For a full copy of this paper please contact the authors via email.

If slow-start was terminated due to the loss of a packet, ssthresh is
Proceedings of the 2005 Fourth IEEE International Symposium on Network Computing and Applications (NCA’05) 0-7695-2326-9/05 $20.00 © 2005
IEEE
set to half of the current value of cwnd, cwnd is reinitialised to the initial window size and slow-start is restarted. If slow-start was terminated due to cwnd reaching the ssthresh or rwnd value, the congestion-avoidance phase begins. During congestion-avoidance, the congestion window is increased by 1/cwnd for each acknowledged segment. If a packet is lost during the congestion-avoidance phase, the slow-start phase will be resumed after setting ssthresh to half of cwnd and reinitialising cwnd to the initial window size. This follows the additive increase, multiplicative decrease model, providing a stable network that will not suffer from congestion collapse [11]. Two additional algorithms, fast-retransmit and fast-recovery, are implemented in TCP Reno, allowing for the recovery of a single lost segment within a window without causing a major impact on throughput. The initial versions of the fast-retransmit and fast-recovery algorithms are detailed in RFC2001 [17] and are further refined in RFC2581 [4]. TCP New Reno implements a modified version of these algorithms, which is capable of recovering from multiple lost segments within a single window. The New Reno version of these algorithms is detailed in RFC3782 [10].
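The window dynamics described above can be sketched as a simple per-acknowledgment model. The structure and function names below are ours, purely for illustration; windows are tracked in segments rather than bytes:

```cpp
#include <algorithm>

// Minimal sketch of Jacobson-style window evolution, in segments.
struct JacobsonWindow {
    double cwnd = 2;        // initial window of two segments
    double ssthresh = 128;  // slow-start threshold, in segments

    // Called once per acknowledged segment.
    void on_ack() {
        if (cwnd < ssthresh)
            cwnd += 1.0;          // slow-start: window doubles each RTT
        else
            cwnd += 1.0 / cwnd;   // congestion-avoidance: ~1 segment per RTT
    }

    // Called when a loss is detected: multiplicative decrease,
    // then a Tahoe-style restart of slow-start.
    void on_loss(double initial_window) {
        ssthresh = std::max(cwnd / 2.0, 2.0);
        cwnd = initial_window;
    }
};
```

During slow-start each of the cwnd acknowledgments in an RTT adds one segment, doubling the window per RTT; during congestion-avoidance the cwnd acknowledgments each add 1/cwnd, giving roughly one segment per RTT.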
2.2. TCP SACK

The use of Selective Acknowledgments (SACK) [13] provides increased feedback to the sender in regard to segments that have been received and queued by the receiver, but are not yet acknowledged due to one or more segments that are missing from the window. This information allows the sender to selectively retransmit the packets that have not been received, reducing the occurrence of costly timeouts. This is especially critical where packets have been lost due to corruption, as it allows for these packets to be simply retransmitted without retransmitting packets that have already been successfully received but not yet fully acknowledged. The base congestion control algorithms used for a TCP SACK implementation are the same as the Reno congestion control algorithms, with minor modifications to allow for the use of the SACK information during retransmissions.

2.3. TCP Vegas

Unlike most TCP variants, TCP Vegas does not rely on lost packets in order to gauge network capacity, instead using RTT measurements to determine the available network capacity. The congestion control algorithms within TCP Vegas calculate the expected throughput rate and the actual throughput rate once per RTT. The difference between the actual and expected rates is then calculated, effectively indicating the number of packets which are being queued within the network. Once this difference (known as delta) exceeds a certain threshold (gamma, typically set to one packet), slow-start is terminated and congestion-avoidance is commenced. Upon exiting slow-start, TCP Vegas decreases the congestion window by one eighth of its current size in order to ensure that the network does not remain congested. Unlike Jacobson based variants, TCP Vegas has the ability to terminate slow-start before it exceeds the network's available capacity, instead of doubling the congestion window until congestion occurs and packets are dropped by the network. During slow-start, cwnd is increased by one segment per acknowledgment during every second RTT, so the window doubles every other RTT rather than every RTT as in traditional TCP. When in the congestion-avoidance phase cwnd will be increased by 1/cwnd, decreased by one segment or left unchanged, with this decision being made once per RTT. As shown by a number of researchers [7, 8, 3, 14], the use of RTT measurements results in congestion control algorithms that achieve better throughput and transfer more data for the number of packets transmitted across the network, resulting in increased goodput. TCP Vegas is also more resilient to error prone links and will retransmit packets that have been lost due to corruption far sooner than other variants.
3. TCP Vegas performance problems

Our previous simulation work [16] has shown that the performance of TCP Vegas decreases significantly when the network latency exceeds an RTT of 50ms. Additionally, performance has been shown to fluctuate when delayed acknowledgments are implemented at the receiver or an initial window size other than two segments is used. In order to investigate the exact impact of each of these parameters on TCP Vegas, numerous simulations were performed allowing RTT measurements to be taken, congestion window growth to be monitored and achieved throughput to be measured. All simulations have emulated a 30 second FTP file transfer between two hosts (A and B), over a single hop (R). In order to prevent third-order effects caused by packet loss and associated error recovery, all links have been simulated as being error free, with router queue sizes being large enough to prevent packets from being dropped. All simulations use a Maximum Segment Size (MSS) of 500 bytes, an initial window size of two segments and delayed acknowledgments at the receiver, unless otherwise stated. During our research we have compared the performance of TCP Vegas against that of TCP SACK, the most recent variant based on the Jacobson congestion control algorithms. TCP SACK is also arguably the most ubiquitous variant in today's Internet.
Figure 1. TCP Vegas per packet RTT - 560ms RTT network, non-delayed acknowledgments
3.1. Impact of latency

In order to understand the performance problems exhibited by TCP Vegas when used over long delay links, a number of base simulations were executed and results recorded. Since the Vegas congestion control algorithms are based on averaged RTT measurements, traces of these measurements, as calculated by TCP Vegas, were created using a modified version of Network Simulator 2 (ns2) [1]. These traces provided a number of insights. Firstly, the averaged RTT measurements oscillate and increase, with the wavelength almost doubling with every second measurement taken. Secondly, the duration of the oscillations increases as the RTT increases. Thirdly, the lower the RTT, the higher the oscillation frequency. In order to better understand the oscillation in the averaged RTT measurements during slow-start, traces of the per packet RTT and congestion window growth were made. A per packet RTT trace is provided in figure 1, for a network exhibiting an RTT of 560ms, typical of a network which utilises a geostationary satellite link. As can be seen from this trace, the oscillation in the averaged RTT measurements is a direct result of the per packet RTT increasing in a near linear fashion, which occurs as the congestion window is doubled during the slow-start phase. The exponential window growth occurs every second RTT and is implemented by increasing the congestion window size by one segment upon the receipt of an acknowledgment. This process causes bursts of packets to enter the network, with the size of the burst doubling every second RTT, resulting in large transient queues as the packet burst attempts to traverse the network.

This causes two flow-on effects. Firstly, since TCP Vegas measures the per packet RTT as being the duration between the packet leaving the sender's transmission queue through to the time that the packet is acknowledged, a queue of packets within the network will result in each consecutive packet having an RTT that is larger than the previous packet. Secondly, in a real-world environment, a burst of packets may result in buffers reaching capacity in one or more routers across the network path. The bursty nature of TCP traffic has been documented in previous research [2, 15, 12], occurring during slow-start, when recovering from multiple losses (where a single acknowledgment can result in a large number of segments being transmitted), when acknowledgments are compressed on the return path and when traffic multiplexing occurs at a bottleneck link. The extreme window growth caused by the exponential slow-start algorithm, when used over links that have a large Bandwidth Delay Product (BDP), has also been documented [9, 12].
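The near linear per-packet RTT growth can be reproduced with a back-of-the-envelope queueing model: each packet in a back-to-back burst waits behind those queued ahead of it at the bottleneck. This single-bottleneck model and the function name are our simplification, with the 560ms base RTT and 500 byte MSS taken from the scenario above and a 2Mbps bottleneck assumed:

```cpp
#include <vector>

// Per-packet RTT for a back-to-back burst through a single bottleneck:
// packet i waits for the serialisation of the i packets queued ahead of it.
std::vector<double> burst_rtts(int burst, double base_rtt_s,
                               double mss_bytes, double link_bps) {
    double serialise = mss_bytes * 8.0 / link_bps;  // seconds per packet
    std::vector<double> rtts;
    for (int i = 0; i < burst; ++i)
        rtts.push_back(base_rtt_s + i * serialise);
    return rtts;
}
```

Under these assumptions a 16-segment burst gives the last packet an RTT of 560ms + 15 ∗ 2ms = 590ms, the kind of linear ramp visible in figure 1, and each doubling of the burst doubles the height of the ramp.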
3.2. Impact of delayed acknowledgments

Internet hosts should implement delayed acknowledgments as detailed in RFC1122 [6] and RFC2581 [4]; that is, one acknowledgment should be sent for every two packets of data received, without delaying an acknowledgment for more than 500ms [6]. As a result, a TCP Vegas sender should expect that the receiver will implement delayed acknowledgments. When delayed acknowledgments are used the averaged RTT fluctuates more and is significantly higher during the first several RTTs of the connection. Whilst the previously noted oscillation still occurs, the waveform becomes irregular. Additionally, more time is required for the averaged RTT to settle and maintain a consistent value. A per packet RTT trace is provided in figure 2 for a network exhibiting an RTT of 560ms with the receiver implementing delayed acknowledgments. As can be seen from the trace, the per packet RTT varies from 564ms through to 664ms on a regular basis. This variance equates to the maximum delay implemented by the ns2 delayed acknowledgment receiver, which by default is 100ms. It is worth noting that in a real-world implementation this delay would typically be around 200ms [4] and is permitted to be as large as 500ms [6]. A delay will occur whenever a packet arrives at the receiver without a second packet arriving immediately after. Obviously a large fluctuation in the per packet RTT measurements will affect the averaged RTT calculation as used by the Vegas congestion control algorithms, potentially resulting in an incorrect decision being made with regards to congestion window growth. Delayed acknowledgments also decrease the amount of feedback from the receiver to the sender by half, typically returning one acknowledgment for every two packets. This
in turn slows the rate at which the congestion window is increased, since this is usually done each time an acknowledgment is received. When delayed acknowledgments are used with TCP Vegas, the congestion window will increase at an effective rate of cwnd/4 per RTT during slow-start.

Figure 2. TCP Vegas per packet RTT - 560ms RTT network, delayed acknowledgments

3.3. Early termination of slow-start

One of the design features of TCP Vegas' congestion control algorithms is that it does not force the network to drop packets in order to gauge network capacity. As a result, TCP Vegas will terminate slow-start and commence the congestion-avoidance phase as soon as it determines that doubling the congestion window will result in a significantly increased RTT and potential packet loss due to congestion. Whilst this approach works well for networks with a small RTT, where a minimal congestion window size is required in order to perform optimally, it quickly fails when a network has a large RTT and requires a large congestion window size. In certain situations the optimal congestion window size may be less than double the current size, however this may still equate to a very large number of segments. As an example, for a 2Mbps network exhibiting an RTT of 330ms and using an MSS of 500 bytes, the optimal congestion window size will be (2 ∗ 10^6 ∗ 0.330) / (500 ∗ 8) = 165 segments. If the current window size is 128 segments, doubling this to 256 segments will result in the network becoming congested. However, increasing this at a rate of one segment per RTT as the Vegas congestion-avoidance algorithm does (assuming non-delayed acknowledgments), will require 12.21 seconds before it converges to the optimal size. This will result in decreased performance until the congestion window becomes large enough to allow the sender to fully utilise the available bandwidth. Additionally, many TCP sessions are short-lived, meaning that the congestion window size may never reach the optimal level before the session is terminated. The impact of this can be compared to a network with an RTT of 260ms, which has an optimal congestion window size of (2 ∗ 10^6 ∗ 0.260) / (500 ∗ 8) = 130 segments. If the current window size is 128 segments, the window size will converge to the optimal value within two RTTs or 520ms.

4. Solutions
In order to increase the performance of TCP Vegas, particularly when used over networks that exhibit high latency, we propose three solutions - one for each of the problems identified in the previous section.
4.1. Packet pacing

As detailed in section 3.1, doubling the congestion window during the slow-start phase results in large bursts of packets being injected into the network. This causes large fluctuations in the per packet RTT, which in turn impacts the averaged RTT calculations used by the congestion control algorithms. In order to prevent bursts of packets from being injected into the network, resulting in inaccurate RTT measurements, it is proposed that pacing be used during slow-start; that is, a delay of RTT/cwnd should be inserted between the transmission of each segment. This delay will result in the window of packets being evenly spaced over the network RTT, reducing the queue of packets within the network and resulting in more accurate per packet RTT measurements. Once slow-start is terminated the congestion window should be at a near optimal size, with each incoming acknowledgment resulting in a new outgoing packet, keeping the packets evenly spread over the entire RTT. The concept of pacing packets is not new and has been suggested by other researchers in numerous contexts [15, 5, 12, 18]. However, at least one paper [2] has shown that TCP Reno performs worse when paced, due to the fact that pacing reduces queuing delays. This results in the network becoming fully congested before a loss occurs and congestion is detected. Unlike Jacobson based variants, TCP Vegas does not rely on packet loss to detect congestion, instead detecting impending congestion when the actual throughput rate begins to differ from the estimated throughput rate. This allows it to terminate slow-start before the network becomes congested. Due to these aspects we believe that packet pacing will have a positive impact on TCP Vegas performance.
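The proposed inter-segment delay is a one-line calculation; the sketch below assumes the RTT is in seconds and the window in segments, and the function name is ours:

```cpp
// Inter-segment pacing delay used during slow-start: spread the
// cwnd segments of one window evenly across a single RTT instead
// of transmitting them back to back.
double pacing_delay(double rtt_s, double cwnd_segments) {
    return rtt_s / cwnd_segments;
}
```

For the 560ms RTT network above with a 16-segment window, each segment is held back 35ms, so the burst that would otherwise leave back-to-back is spread across the full RTT and the transient queue at the bottleneck never forms.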
4.2. Packet pairing

The impact of delayed acknowledgments on TCP Vegas is primarily a result of the maximum delay incurred when the receiver receives a single packet without a second immediately following it. As such, sending packets in pairs will result in an acknowledgment being returned immediately, preventing the receiver from delaying the acknowledgment and skewing the RTT measurements. We propose that during every second RTT of the slow-start phase, the congestion window be increased by two segments on every second acknowledgment instead of one segment on every acknowledgment. Providing the initial window size used is an even number, this process will ensure that the congestion window is kept at an even number of segments, preventing the transmission of a single packet without a second being immediately available for transmission. If the packet pacing detailed in the previous section is implemented, the delay should be inserted after every second packet, causing packets to be transmitted in pairs without a delay in between. To the best of our knowledge this technique has not been suggested or documented elsewhere.
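The pairing rule can be sketched as follows; the structure, counter and names are illustrative rather than the ns2 variable names:

```cpp
// Packet-pairing growth during the exponential RTTs of slow-start:
// grow by two segments on every second acknowledgment, so the window
// stays even and segments can always be transmitted in pairs.
struct PairedSlowStart {
    int cwnd;      // congestion window, in segments (kept even)
    int acks = 0;  // acknowledgments seen so far

    explicit PairedSlowStart(int initial_window) : cwnd(initial_window) {}

    void on_ack() {
        if (++acks % 2 == 0)
            cwnd += 2;  // same aggregate growth as +1 per ack, in pairs
    }
};
```

Starting from an even initial window the window never becomes odd, so the receiver is never left holding a delayed acknowledgment for a lone packet, and the per-RTT growth rate is unchanged from standard slow-start.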
4.3. Rapid window convergence

In order to provide a solution to the early termination of slow-start problem detailed in section 3.3, an increase which is slower than the exponential growth used during slow-start and faster than the very slow linear growth used during congestion-avoidance is required. This increase needs to occur after slow-start terminates, during the initial part of the congestion-avoidance phase. From our experimentation, the delta value calculated by the Vegas congestion control algorithms will exceed the gamma threshold by one or two packets when doubling the window size would result in the network becoming congested. It will, however, immediately return to a lower or zero value when congestion-avoidance is commenced. If the network is congested at the current congestion window size, the delta value will remain large for several RTTs as the window size is reduced or, in extreme cases, packet loss occurs.

In order to utilise this information, the congestion window size is recorded in a variable known as sscwnd at the termination of slow-start. If, during the congestion-avoidance phase, the delta value is less than the alpha threshold, the congestion window would normally be increased by 1/cwnd for each acknowledged segment. Instead we propose a linear increase corresponding to a percentage of the congestion window size at the termination of slow-start. We have termed this process rapid window convergence since the intention is to have the congestion window converge to an optimal value at a faster rate than would normally occur during the congestion-avoidance phase. Experimentation has shown that a linear increase of sscwnd/8 per RTT yields the best performance. However, we have noticed that an increase of sscwnd/8 per RTT can still result in TCP Vegas detecting impending congestion (delta exceeds beta) even when the congestion window can still safely be increased. If this occurs, very slow window growth results once again, as it attempts to converge at a rate of one segment per RTT. False detection of impending congestion appears to primarily occur for links with large BDPs (which have very large congestion windows upon termination of slow-start), especially when the congestion window reaches the size that it was just prior to slow-start being terminated. To overcome this problem our implementation halves the increase rate the first two times that impending congestion is noticed within the network, resulting in an increase of sscwnd/8 per RTT, followed by sscwnd/16 per RTT and finally sscwnd/32 per RTT. When impending congestion is detected for a third time we return to the normal increase of one segment per RTT. This process appears to allow for optimal convergence without overloading the network. Given that TCP Vegas will not increase the congestion window unless it believes that there is sufficient capacity available within the network (delta is less than alpha), this increased linear growth should never result in congestion collapse. In some ways this approach is similar to the limited slow-start suggested by Floyd in [9], however it differs by being a linear growth that only occurs after slow-start has terminated. The use of limited slow-start in combination with rapid window convergence may provide even better performance.
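The back-off schedule above can be sketched as follows; the structure and names are ours, not the ns2 implementation's:

```cpp
// Rapid window convergence: after slow-start ends with window sscwnd,
// grow linearly by sscwnd/8 per RTT, halving the rate on each of the
// first two impending-congestion detections, then fall back to the
// normal Vegas growth of one segment per RTT on the third.
struct RapidConvergence {
    double sscwnd;     // cwnd recorded at slow-start termination
    int backoffs = 0;  // impending-congestion events seen so far

    explicit RapidConvergence(double cwnd_at_ss_exit)
        : sscwnd(cwnd_at_ss_exit) {}

    // Per-RTT additive increase applied while delta < alpha.
    double increment_per_rtt() const {
        if (backoffs == 0) return sscwnd / 8.0;
        if (backoffs == 1) return sscwnd / 16.0;
        if (backoffs == 2) return sscwnd / 32.0;
        return 1.0;  // third detection: normal growth
    }

    // Called when delta exceeds beta.
    void on_impending_congestion() { ++backoffs; }
};
```

For a window of 128 segments at slow-start exit this gives increments of 16, 8 and 4 segments per RTT, closing the gap to a large optimal window in a handful of RTTs rather than the tens of RTTs needed at one segment per RTT.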
5. Implementing New Vegas

In order to validate and test the solutions proposed in the previous section, a modified version of TCP Vegas was developed within ns2. In the spirit of past TCP variant naming conventions we have named this modified version TCP New Vegas. Our congestion control algorithms have been modified to include the three solutions as outlined in the following sections. It is worth noting that each of the three solutions is implemented at the sender, with no changes being required at the receiver.
5.1. Packet pacing

At the end of the output() function a delay of baseRTT/cwnd is inserted after the transmission of every second segment, if Vegas is currently in slow-start. The following code implements the packet pacing and part of the packet pairing solution proposed:
// If in slow-start, pace window over RTT
if (cwnd_ < ssthresh_ && v_cntRTT_ > 0) {
	if (seqno > 0 && seqno % 2 != 0) {
		double delay = v_baseRTT_ / cwnd_;
		delsnd_timer_.resched(delay);
	}
}

5.2. Packet pairing

In the recv() function the congestion window is increased by two segments on the receipt of every second acknowledgment, during the slow-start phase. It is worth noting that v_incr_ alternates between 0 and 1 every RTT, as assigned in the once per RTT code. This implements the second part of the packet pairing solution, as follows:

if (v_incr_ > 0 && (cwnd_ - (t_seqno_ - last_ack_))

5.3. Rapid window convergence

} else if (delta > v_beta_) {
	/*
	 * slow down a bit, retrack
	 * back to prev. rtt's cwnd
	 * and dont incr in the nxt rtt
	 */
	--cwnd_;
	if (v_rwcdec_ > 1) {
		v_ssdelta_ = 0;
	} else {
		v_sscwnd_ = v_sscwnd_ * 1/2;
	}
	v_rwcdec_++;
} else if (delta

6. Results