Department of Computer Sciences, University of California, Los Angeles
Technical Report No. 990038

BA-TCP: A Bandwidth Aware TCP for Satellite Networks

Mario Gerla (1), Wenjie Weng (1), Renato Lo Cigno (2)

(1) Computer Science Department, UCLA, Los Angeles, CA 90095, USA; e-mail: gerla, [email protected]; Fax: +1-310-825 7578
(2) Dipartimento di Elettronica, Politecnico di Torino, Italy; e-mail: [email protected]; Fax: +39-011-564 4099

August 24, 1999

Abstract
In the presence of satellite channels, one of the most challenging problems for TCP is to achieve fair bandwidth sharing among several competing connections whose round-trip propagation delays may differ by more than two orders of magnitude. The Bandwidth Aware TCP (BA-TCP) provides a fair solution while maintaining the end-to-end semantics of transport protocols. At the same time, it is backward compatible and does not require substantial changes to existing TCP. In BA-TCP, the network layer is assumed to be able to convey propagation delay and available bandwidth measurements to end users, for instance using IPv6 optional fields. TCP receivers employ this information to compute a generalized advertised window, which, in turn, controls the amount of data the source injects into the network. Experimental results show that a satellite connection can fairly share a bottleneck with wired connections. Furthermore, since BA-TCP exploits the bandwidth-delay product to control the source congestion window, the queuing delay at the bottleneck link tends to zero at steady state and buffer overflow is negligible in most situations.

This work was supported by NASA contract NAG2-1249, by NSF contract 8NI9805436 and by a contract between Politecnico di Torino and CSELT (Telecom Italia Research Center). This is an extended version of the paper accepted by IEEE ICCCN99.
1 Introduction

Satellite systems are expected to provide remote users with Internet access and broadband data services. However, it is not clear whether TCP will work appropriately in the satellite environment, given the large round-trip delay, asymmetries, and transmission errors that satellite links introduce. Many communication satellites are located in geostationary orbits (GEOs) at an altitude of approximately 36,000 km [1]. Therefore, the round-trip time (RTT) of a satellite connection is at least 479.2 ms [2]. This large RTT leads to several problems when end-to-end TCP is used for data transfer over satellite networks, such as a longer connection setup time, a coarser retransmission timer, and a longer loss recovery time. Another problem, inherent in current TCP, is the unfair sharing of bandwidth among competing flows with different round-trip delays. This unfairness is exacerbated when both satellite and ground links are present in the network.

TCP uses a three-way handshake to set up a connection [3]. This connection setup requires 1-1.5 RTTs, depending upon whether the data sender started the connection actively or passively. To reduce connection setup time for connections with large RTT, T/TCP [4] proposes to allow senders to begin transmitting data in the first segment along with the SYN after the first connection between a pair of hosts is established. This can improve TCP performance for short transfers over satellite channels. Another proposal for short transfers is to use an initial congestion window of 4380 bytes (or a maximum of four segments) rather than one segment [5]. By increasing the number of bytes sent during the initial transmission, transfers of files smaller than 4 Kbytes can usually be completed in one RTT. This may be helpful for Web servers since many Web pages are smaller than 4 Kbytes. However, other mechanisms are needed for long-lived connections.

TCP acknowledgements are data driven. A connection with a large RTT usually has a smaller window growth rate and requires a coarser retransmission timer for loss recovery. Satellite links have higher transmission error rates than wired links. Depending on the satellite techniques and transponders used, the bit-error rate ranges from $10^{-10}$ to $10^{-4}$. The high bit-error rate combined with the large RTT seriously degrades performance for satellite connections.

To overcome the performance degradation, researchers have proposed to shield high-latency lossy links from the rest of the network by "splitting" transport connections in a manner transparent to end users, using techniques such as TCP spoofing [6], TCP splitting, and Web caching. In TCP spoofing, the gateway prematurely acknowledges data coming from the ground network side of the connection and destined to the satellite host.
TCP splitting proposes to split the connection at the gateway and to run a separate TCP on each side of the network. Web caching splits any TCP connection for requests that result in a cache miss. These approaches can improve performance for the satellite network, but they require a substantial amount of buffers. Especially if the number of active connections is large, it is not trivial to manage and to allocate the computing resources for the connections. Besides, these proposals introduce a single point of failure within the network.

Satellite channels can provide Internet access to home users at a speed 20 times faster than that of an average telephone modem [7]. However, this introduces both path and bandwidth asymmetries in the satellite network. Some satellite channels also exhibit a bandwidth asymmetry, with a larger data rate in one direction than the other, because of limits on the transmission power and the antenna size at one end of the link. A recent paper [8] proposes a Satellite Transport Protocol (STP) which uses selective negative acknowledgements, rather than the positive acknowledgement method of TCP. The STP transmitter retransmits only those specific packets that have been explicitly requested by the receiver. Their results show that STP achieves an order of magnitude reduction in the bandwidth used in the reverse path, as compared to standard TCP, when conducting large file transfers.

Perhaps the most challenging problem in applying the TCP/IP protocol suite to combined satellite and ground networks is how to achieve a fair allocation of bandwidth among multiple connections with different RTTs sharing a bottleneck link. This paper proposes an enhanced end-to-end TCP that provides fair utilization of bandwidth in satellite networks. We first review TCP congestion control in the next section and then describe our proposed scheme in section 3. Section 4 presents the simulation results of our proposed scheme and compares it with other recent TCP proposals/implementations. Section 5 concludes the paper.
2 TCP congestion control

TCP is a window protocol with dynamic window adjustment. Current TCP implementations are devised for wired networks where the link latency is small, and are based on the idea of probing the network by increasing the offered traffic until the network becomes congested. Packets are then dropped, and TCP interprets the drops as congestion symptoms. TCP steps back by reducing the offered load and starts a new cycle. This mechanism inherently induces oscillations in the network. However, it works well when the time needed to transmit a window of data is larger than the round-trip propagation delay (RTPD). It avoids flooding the network with inappropriately large bursts of data.

The most commonly used mechanisms in TCP are slow start, congestion avoidance, timeout expiration, fast retransmit, and fast recovery [9]. Slow start and congestion avoidance are the normal procedures for increasing the congestion window (cwnd) when data packets are transmitted and acknowledged successfully, while the other algorithms are related to congestion recovery. The congestion window is a sender-computed limit on the amount of data that can be transmitted in one window. The receiver, using ACK packets, advertises the amount of data it can store (rwnd). The actual data transmission is governed by the minimum of cwnd and rwnd.
A TCP sender operates in slow start mode when the connection is opened and every time a timeout expires. The initial value of cwnd is one packet. During slow start, a sender increases its window by one packet every time it receives an acknowledgment (ACK). If every packet is ACKed, cwnd is doubled every RTT. The sender enters the congestion avoidance mode when the congestion window reaches the slow start threshold (ssthresh). During congestion avoidance, instead of doubling the congestion window, the sender increases it by one maximum segment size (MSS) every RTT. Senders initialize ssthresh to infinity (i.e., to the maximum allowed window size) and reset it to half the current window size when a packet loss is detected.

Early TCP implementations detected losses using timeout expiration at the source [3], causing slow start to be triggered upon every such loss [10]. This can lead to severe oscillations in buffer occupancy at the routers, as well as in the TCP session throughput, when losses are random. This deficiency was corrected in more recent versions (e.g., Reno TCP) by adding the Fast Retransmit and Fast Recovery mechanisms. When an out-of-order packet is received, the receiver sends a duplicate ACK (i.e., an ACK that carries the same sequence number as a previous one) to inform the sender that a packet was received out of order and which packet is expected next. The fast retransmit algorithm uses the arrival of three duplicate ACKs as an indication of a packet loss. When three duplicate ACKs are received, the sender immediately retransmits the packet expected by the receiver. The subsequent behavior depends on the protocol version. For example, Reno TCP invokes the fast recovery procedure by setting both ssthresh and cwnd to half the window size and transmitting one new packet for each duplicate cumulative ACK received, even if this causes it to exceed the congestion window size.
The fast recovery algorithm is terminated when a non-duplicate ACK is received. The sender then enters the congestion avoidance mode.

In Reno TCP, the fast retransmit and fast recovery algorithms cannot always recover the lost packets. Consider the case where the congestion window is smaller than 4 packets: fast retransmit will not be triggered because the receiver cannot send three duplicate ACKs. Another example arises when two consecutive packets are lost and three duplicate ACKs are received: the sender performs fast retransmit for the first lost packet. If at this point the congestion window reaches the receiver's advertised window size, no new packets can be transmitted in fast recovery. After recovering one packet loss, the sender will be forced to wait for a timeout and then enter the slow start mode.

To eliminate the possibility of a timeout when multiple packets are lost from one window of data, NewReno TCP [11, 12] treats partial ACKs (i.e., ACKs with a sequence number smaller than the last byte transmitted when fast retransmit was invoked) as an indication that the following packet was lost and should be retransmitted. Thus, when multiple packets are lost from one window of data, NewReno can recover one lost packet per RTT until all of the lost packets from that window have been retransmitted. NewReno remains in fast recovery until all the data outstanding at the beginning of the fast recovery is ACKed.

TCP Selective Acknowledgement (SACK) is another proposal for improving TCP performance in the presence of multiple packet losses from one window [13, 14]. The SACK option proposed in [13] contains a number of SACK blocks, where each block reports a non-contiguous set of data received and queued at the data receiver. The first SACK block specifies the data receiver's most recently received segment, and the additional SACK blocks repeat the most recently reported SACK blocks. With the SACK option, the data receiver can inform the sender about all packets that have arrived successfully, so the sender only needs to retransmit the missing data packets. SACK TCP introduces additional overhead to manage the selective acknowledgements and requires that both senders and receivers implement SACK. However, it can be very effective for recovering multiple losses, especially for connections with large RTT.

Previous studies show that NewReno and SACK TCPs provide better performance than Reno TCP for loss recovery [14] when there is a single satellite connection in isolation. However, when these schemes are used in the presence of both satellite and ground connections competing for the same (ground) bottleneck link, the satellite connection usually experiences substantial performance degradation, as will be shown in our experimental results in section 4. This is because current TCPs have no knowledge of the link bandwidth and the current load. To overcome this and to improve TCP performance in satellite networks, we propose a bandwidth awareness protocol, described in detail in the following section.
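
As an illustration of the window adjustment rules reviewed in this section, the following minimal Python sketch traces cwnd over loss-free RTTs and shows the Reno-style responses to loss. It is not taken from any TCP implementation: the function names, the per-RTT granularity, the use of whole segments as the unit, the cap at ssthresh when leaving slow start, and the lower bound of one segment when halving are all simplifying assumptions made here.

```python
def next_cwnd(cwnd, ssthresh):
    """Window (in segments) after one RTT in which every packet is ACKed."""
    if cwnd < ssthresh:
        # Slow start: one extra segment per ACK, i.e. doubling per RTT.
        # Capping at ssthresh is a simplification of this sketch.
        return min(2 * cwnd, ssthresh)
    # Congestion avoidance: one extra segment (one MSS) per RTT.
    return cwnd + 1


def on_triple_dupack(cwnd):
    """Reno fast retransmit/fast recovery: cwnd and ssthresh are both halved."""
    half = max(cwnd // 2, 1)  # lower bound of 1 segment is an assumption
    return half, half  # (new cwnd, new ssthresh)


def on_timeout(cwnd):
    """Timeout: restart slow start from one segment, ssthresh is halved."""
    return 1, max(cwnd // 2, 1)  # (new cwnd, new ssthresh)


def trace(rtts, cwnd=1, ssthresh=16):
    """Return the window at the start and after each of `rtts` loss-free RTTs."""
    windows = [cwnd]
    for _ in range(rtts):
        cwnd = next_cwnd(cwnd, ssthresh)
        windows.append(cwnd)
    return windows


print(trace(6))              # [1, 2, 4, 8, 16, 17, 18]
print(on_triple_dupack(20))  # (10, 10)
print(on_timeout(20))        # (1, 10)
```

Because the growth is counted per RTT, a connection with a 580 ms RTT needs roughly seven times longer than one with an 80 ms RTT to reach the same window, which is the root of the unfairness discussed in the rest of the paper.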
3 A Bandwidth Awareness TCP Protocol (BA-TCP)

Recently, we explored the possibility of decoupling the error recovery capability of TCP from its congestion control capability. This is obtained by delegating the latter to an explicit control scheme based on network feedback [15, 16]. The scheme is based on sound control theory, which ensures the stability and performance of the system. It has proven easy to implement, backward compatible, resilient to varying network conditions and friendly to older TCP versions.

The bandwidth awareness protocol proposed in this paper has many similarities with the generalized window advertising TCP (GWA-TCP) proposed in [16], such as requiring explicit feedback from the network. Like GWA-TCP, it is a preventive algorithm for end-to-end TCP congestion control. GWA-TCP requires the network layer to notify end users of the routers' available buffer space, which in turn is used to set an upper bound on the congestion window. GWA-TCP does guarantee no packet loss due to congestion. However, this comes at the expense of a large buffer size at network routers when the propagation delay is large and the bandwidth is high. BA-TCP reduces this cost and at the same time decouples the error recovery capability from the congestion control. This decoupling is crucial for achieving good performance for an end-to-end TCP in wireless (e.g., satellite) networks.

In BA-TCP, the receiver's advertised window is used to convey to the sender both flow control and congestion control information. To achieve this, the network layer must provide the TCP layer with the round-trip propagation delay $R_t$ and the available bandwidth $B_a$ for each connection. Assume that each IP router knows the propagation delay for each outgoing link connected to it and is able to compute the bandwidth $B_i$ available for each flow (see footnote 1). Then, $R_t$ is obtained by summing all the link delays in both the forward and backward paths. $B_a$ is the minimum of the available bandwidths ($B_i$ for $i = 1, 2, \ldots, m$, assuming $m$ links along the path) offered by the links in the forward path. Each IP router computes the available bandwidth for an outgoing link by
$$B_i = B/n$$

where $B$ is the total bandwidth of the outgoing link and $n$ is the number of active flows. By using IPv6 with a round-trip propagation delay (RTPD) field and an available bandwidth (ABW) field in the IPv6 extension header, $R_t$ and $B_a$ can be obtained from the network layer and conveyed to the end users. We assume that all packets of a connection follow the same forward and backward paths, as is the case in typical Internet conditions. The connection setup packets (SYN) with the RTPD field in the IPv6 header are used for obtaining a connection's round-trip propagation delay. The data packets with the ABW field in the IPv6 header are used for obtaining the available bandwidth of the forward path of the connection. Both the round-trip propagation delay and the available bandwidth are conveyed to the TCP receiver. As a consequence, the modifications to the TCP receiver are: (1) add a state variable for storing the round-trip propagation delay $R_t$ upon receiving it from the connection setup packets (SYN); (2) after receiving a data packet with an available bandwidth notification, the receiver computes $B_a R_t$ and puts the minimum of $B_a R_t$ and the initial $rwnd_{in}$ in the receiver's advertised window field $rwnd_{out}$ in the ACK packet. Namely,

$$rwnd_{out} = \min(B_a R_t,\ rwnd_{in}).$$

With this modification, the advertised window field in ACK packets has been "generalized" to combine both flow control and congestion control. The TCP transmitter reacts to the receiver's advertised window the same way as in conventional TCP. In the absence of receiver buffer constraints (declared in $rwnd_{in}$), the window $B_a R_t$ guarantees that the "pipe is kept full", which is the optimal operating condition for a sliding window protocol [17]. Moreover, each flow gets an equal share of bandwidth.

As we know, TCP has a slow start feature with an initial window of 1 segment. A sudden increase of a reasonable number of active flows usually does not lead to packet loss when BA-TCP is used. This is because BA-TCP is aware of the available bandwidth provided by the network. Therefore, the queuing delay under steady state is very close to zero. If each router has adequate buffer space to hold the sudden burst of packets, the offered load will be adjusted after one RTT. This suggests that there will be no packet loss due to congestion when BA-TCP is used.

When a packet loss occurs, BA-TCP treats it as an indication of a link error. This property is important in wireless and satellite environments, where wireless links have a higher BER and sometimes become unavailable due to weather conditions, handoffs, etc. In fact, it allows us to avoid shrinking the transmission window when packets are lost for reasons other than congestion. Accordingly, we modify the TCP sender in BA-TCP such that it does not reduce the window in response to packet losses. More specifically, BA-TCP implements fast retransmit and fast recovery as in SACK TCP, except that it does not reduce the window size and the slow start threshold after the fast retransmit. Upon timeout, the source retransmits one data packet and resets the retransmission timer to the previous value (i.e., the timer does not grow exponentially). The source resumes transmission when the retransmitted data packet is ACKed.

For backward compatibility, it is advisable to superimpose the BA feature on top of an existing TCP. In this paper, the proposed BA-TCP version is implemented by modifying Reno and SACK TCPs in the ns-2 simulator (footnote 2), developed at UC Berkeley within the VINT project (footnote 3). We have used the available ns-2 modules wherever possible. We introduce the RTPD and ABW fields in the IPv6 header and make the routers update the fields if needed. The drop-tail queuing discipline (no per-flow queuing) is used since this is sufficient for supporting BA-TCP.

Footnote 1: Current routers do not know the propagation delay of the outgoing links. One possible way is to load delay and bandwidth values at link configuration time. Another way to estimate the link delay is by sending data between the two nodes ("pinging") during light load. The bandwidth of a link can be estimated by directly measuring the throughput of the outgoing link during a busy period, when the queue is not empty. In order to provide QoS in the Internet, we believe future routers will be able to provide this information.
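
The computations described in this section can be sketched in a few lines of Python. This is only an illustration of the formulas $B_i = B/n$, $B_a = \min_i B_i$ and $rwnd_{out} = \min(B_a R_t, rwnd_{in})$; it is not the ns-2 implementation used in the paper. The function names, the units (bits/s, seconds, bytes), the rounding to whole bytes, and the example path are assumptions introduced here.

```python
def link_fair_share(capacity_bps, active_flows):
    """B_i = B / n for one outgoing link (guarding against n = 0)."""
    return capacity_bps / max(active_flows, 1)


def path_available_bandwidth(forward_links):
    """B_a: minimum fair share over the links of the forward path.

    forward_links is a list of (capacity_bps, active_flows) pairs.
    """
    return min(link_fair_share(c, n) for c, n in forward_links)


def round_trip_propagation_delay(forward_delays_s, backward_delays_s):
    """R_t: sum of the link delays on the forward and backward paths."""
    return sum(forward_delays_s) + sum(backward_delays_s)


def advertised_window(ba_bps, rt_s, rwnd_in_bytes):
    """rwnd_out = min(B_a * R_t, rwnd_in), expressed in bytes."""
    bdp_bytes = round(ba_bps * rt_s / 8)  # rounding is a choice of this sketch
    return min(bdp_bytes, rwnd_in_bytes)


# Hypothetical two-link path: a 10 Mbit/s link with 4 flows, a 2 Mbit/s link with 1.
ba = path_available_bandwidth([(10e6, 4), (2e6, 1)])
rt = round_trip_propagation_delay([0.01, 0.1], [0.01, 0.1])
print(ba)                                 # 2000000.0
print(advertised_window(ba, rt, 65535))   # 55000
print(advertised_window(ba, rt, 32768))   # 32768
```

In the last call the receiver buffer is the tighter constraint, so the advertised window falls back to ordinary flow control, which is what keeps the scheme compatible with an unmodified sender.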
4 Simulation results

This section presents the performance of five different TCP implementations and network congestion control schemes in a satellite network. They are:
BA feature on top of Reno TCP with droptail queuing discipline, named BA-Reno;
BA feature on top of SACK TCP with droptail queuing discipline, named BA-SACK;
Reno TCP with Random Early Detection (RED) [18] IP routers, named Reno;
NewReno TCP with RED IP routers, named NewReno;
SACK TCP with RED IP routers, named SACK.
Footnote 2: ns-2 is the latest version of the simulator; all necessary information and the software can be retrieved at the URL http://www-mash.cs.berkeley.edu/ns/ns.html
Footnote 3: See the URL http://netweb.usc.edu/vint for additional information on VINT.
Figure 1: Simulation scenario.

In the last three schemes, the minimum and maximum thresholds for the average queue size of the RED routers are set to 1/10 and 3/10 of the buffer size, respectively. In all simulations, the router's buffer size is 200 Kbytes.

We want to focus on a case where several ground connections compete for a resource with a satellite connection. We therefore choose the simulation scenario shown in Fig. 1. There are 10 sources sending data to 10 distinct destinations through a single bottleneck link. Source S0 sends data to the destination D0 (on the path through G1, G2, and G3) with a satellite link between G3 and D0. The satellite link has a propagation delay of 250 ms and a bandwidth of 1.5 Mbits/s. All the other links are wired with a bandwidth of 10 Mbits/s. The propagation delay is 30 ms between G1 and G2 and 5 ms for all the other links. With this topology, the RTT is 580 ms for connection 0 (i.e., flow ID 0) and 80 ms for all the other connections. We are interested in TCP performance over a satellite connection in the presence of other competing ground flows at a wired bottleneck link.

We assume that each source has an infinite data backlog to send. The size of data packets is 1 Kbyte. The simulation is run for 105 seconds. The start times of the flows are randomly and uniformly distributed over the first second of the simulation. The TCP clock granularity is set to 0.3 seconds. The receiver's window size is 128 Kbytes.
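
As a rough check of the numbers in this scenario, the sketch below recomputes the two round-trip propagation delays and the windows that the rule $\min(B_a R_t, rwnd_{in})$ of section 3 would yield. The per-hop delay lists are our reading of the topology description, 1 Kbyte is taken as 1000 bytes, and all names are hypothetical; none of this is taken from the simulation scripts.

```python
# One-way per-hop propagation delays in ms (assumed hop lists).
SATELLITE_HOPS_MS = [5, 30, 5, 250]  # S0-G1, G1-G2, G2-G3, G3-D0
GROUND_HOPS_MS = [5, 30, 5]          # Si-G1, G1-G2, G2-Di

BOTTLENECK_BPS = 10_000_000      # wired bottleneck link
SATELLITE_LINK_BPS = 1_500_000   # carries only flow 0
N_FLOWS = 10
RWND_IN_BYTES = 128_000          # receiver window, taking 1 Kbyte = 1000 bytes


def rtpd_ms(hops_ms):
    """Round-trip propagation delay: forward plus backward path."""
    return 2 * sum(hops_ms)


def window_bytes(bandwidth_bps, rtpd):
    """min(B_a * R_t, rwnd_in) in bytes, with the delay given in ms."""
    return min(bandwidth_bps * rtpd // (1000 * 8), RWND_IN_BYTES)


fair_share_bps = BOTTLENECK_BPS // N_FLOWS                 # B / n at the bottleneck
satellite_ba = min(fair_share_bps, SATELLITE_LINK_BPS)     # path minimum for flow 0

print(rtpd_ms(SATELLITE_HOPS_MS), window_bytes(satellite_ba, rtpd_ms(SATELLITE_HOPS_MS)))
# 580 72500
print(rtpd_ms(GROUND_HOPS_MS), window_bytes(fair_share_bps, rtpd_ms(GROUND_HOPS_MS)))
# 80 10000
```

Under these assumptions the satellite flow would be granted a window of about 72.5 Kbytes and each ground flow about 10 Kbytes, both well below the 128 Kbyte receiver window, so the bandwidth-delay product rather than the receiver buffer would determine the advertised window.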
4.1 The "perfect" link case

We first focus on the performance of these schemes assuming that there are no link errors (i.e., a perfect link). Fig. 2 shows the goodput of each flow during different intervals of the simulation run. The goodput shown in this figure has been normalized by the bottleneck fair share (i.e., by 1 Mbits/s). Therefore, a goodput of 1 represents a data throughput of 1 Mbits/s. The first 5 seconds of the simulation run are discarded to exclude the network transient behavior. Fig. 2(a) shows the goodput averaged over the time period 5-15 seconds for the 10 flows. This figure shows an unfair division of bandwidth when the Reno, NewReno, or SACK scheme is used. However, the throughputs obtained by BA-Reno and BA-SACK are fair in spite of the very large RTT of the satellite connection (i.e., flow ID 0). Figs. 2(b), (c), and (d) show the goodputs averaged over longer time periods. These figures suggest that Reno, NewReno, and SACK can achieve an equal division of bandwidth among non-satellite connections if the experiment is long enough. Fairness, however, cannot be achieved for the satellite connection (i.e., connection 0). The goodput of the satellite connection obtained by Reno, NewReno, and SACK is between 0.2 and 0.3, which is much smaller than its fair share.

Figure 2: Average goodput for each connection over time period (a) 5-15 second; (b) 5-35 second; (c) 5-55 second; and (d) 5-105 second of the run. (Each panel plots goodput versus flow ID for BA-SACK, BA-Reno, NewReno, SACK, and Reno.)

To understand the dynamic behavior of each scheme, we plot the sender's congestion window size as a function of time for the satellite connection and for the connection with flow ID 7 (randomly chosen from the 9 ground connections). We also plot the queue length at the bottleneck link for NewReno, SACK, and BA-SACK. For the NewReno and SACK schemes [Figs. 3(a) and 4(a)], the satellite congestion window shows strong fluctuations in the first 20 seconds and then reaches a relatively steady state with an average window size between 15 and 20 Kbytes. The ground connection reaches the steady state within 5 seconds. As expected, the connection with the larger RTT needs a longer time to adjust its window, and thus its offered traffic, to the network condition. Another interesting feature is that after about 20 seconds, the average values of the congestion window for both satellite and ground connections are almost identical. This is the main cause of the unfair division of bandwidth among flows with different RTTs, since for equal windows the throughput is roughly inversely proportional to the RTT. This unfairness derives from the RED router's dropping policy, which drops all incoming packets with the same probability. As can be expected, similar behavior exists in the Reno scheme (figure not shown). The queue length at the bottleneck link [Figs. 3(b) and 4(b)] indicates the oscillatory feature of the RED routers.

Figure 3: The evolution of (a) congestion window; and (b) queue length at bottleneck link obtained from simulation of NewReno TCP. (Panel (a) plots window size in Kbytes versus time in seconds for the satellite connection and the wired connection; panel (b) plots queue length in Kbytes versus time in seconds.)

Figure 4: The evolution of (a) congestion window; and (b) queue length at bottleneck link obtained from simulation of SACK. (Same axes as Figure 3.)

The congestion window obtained from BA-SACK [Fig. 5(a)] shows a very different behavior from those obtained from Reno, NewReno, and SACK. The satellite connection reaches steady state in about 5 seconds, i.e., the convergence is much faster than with the previous schemes. During steady state, the congestion windows of both the satellite and ground connections hold an almost constant value. However, the window size of the satellite connection is much larger than that of the ground connection. This is due to the fact that the steady state window size in this scheme reflects the fair bandwidth-delay product. Since each source transmits data packets based on the available bandwidth at the bottleneck link and on the round-trip propagation delay, the queue length is almost 0 [Fig. 5(b)] at steady state. The increase of the sequence number with time (Fig. 6) for the satellite connection also confirms the superior throughput performance of BA-SACK as compared with the NewReno and SACK schemes. The simulation results obtained by BA-Reno are similar to Fig. 5.

Figure 5: The evolution of (a) congestion window; and (b) queue length at bottleneck link obtained from simulation of BA-SACK. (Panel (a) plots window size in Kbytes versus time in seconds for the satellite connection with 580 ms RTT and the wired connection with 80 ms RTT; panel (b) plots queue length in Kbytes versus time in seconds.)

Figure 6: The sequence number versus sending time for the satellite connection. (Sequence number in Kbytes versus send time in seconds for BA-SACK, SACK, and NewReno.)
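
The satellite goodput reported above for Reno, NewReno, and SACK can be roughly cross-checked against the observed window sizes. The sketch below assumes that throughput is approximately the window divided by the RTT, ignores queuing delay, and takes 1 Kbyte as 1000 bytes; the function name and these simplifications are ours, not part of the simulation.

```python
def normalized_goodput(window_bytes, rtt_s, fair_share_bps=1e6):
    """Approximate goodput (window / RTT) as a fraction of the fair share."""
    return window_bytes * 8 / rtt_s / fair_share_bps


SATELLITE_RTT_S = 0.58  # round-trip propagation delay of connection 0

low = normalized_goodput(15_000, SATELLITE_RTT_S)   # 15 Kbyte average window
high = normalized_goodput(20_000, SATELLITE_RTT_S)  # 20 Kbyte average window
print(round(low, 3), round(high, 3))  # 0.207 0.276
```

Under these assumptions, a steady-state window of 15 to 20 Kbytes over a 580 ms round trip corresponds to a normalized goodput of roughly 0.21 to 0.28, which is consistent with the range of 0.2 to 0.3 read from Fig. 2.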
4.2 Simulation results with lossy link

Our next experiments investigate the performance of Reno, NewReno, SACK, BA-Reno, and BA-SACK in the same network topology shown in Fig. 1 when errors are introduced in the satellite link. We first assume that there are random bit errors in both the forward and backward directions with the same error rate. Fig. 7(a) gives the throughput obtained when the packet loss rate changes from $10^{-6}$ to $10^{-3}$. The throughputs obtained by BA-Reno and BA-SACK show only a slight decrease as the error rate increases. The throughputs obtained by Reno, NewReno, and SACK are very similar to those without link errors (see Fig. 2).
Figure 7: The throughput of satellite connection with various packet error rates. (Panel (a), with random losses, plots goodput versus packet error rate; panel (b), with burst losses, plots goodput versus down time interval in seconds; curves for BA-SACK, BA-Reno, NewReno, SACK, and Reno.)

We then investigate the performance of each scheme when multiple consecutive packets are lost from one window of data. This is done by introducing a link down time interval for the satellite link. During this down time interval, all the packets transmitted on this link are lost. In our experiment, the link down time is randomly distributed with an average frequency of once every 10 seconds. We choose down time intervals of 0.01, 0.03, 0.05, 0.07, and 0.1 seconds, which correspond to maximum losses of 2, 6, 10, 14, and 19 packets, respectively. Our simulation results [Fig. 7(b)] show that the use of BA results in a much higher throughput than the other three TCP schemes. The throughput obtained by BA-Reno decreases as the down time interval increases. This is because, although BA prevents the congestion window reduction after a packet loss, Reno TCP only recovers one packet in each RTT. The throughput obtained by BA-SACK is much better than that of BA-Reno because the SACK mechanism, combined with a selective repeat retransmission policy, can recover multiple losses in one RTT. The throughputs obtained by Reno, NewReno, and SACK do not change much with the down time interval. The throughput fluctuation is probably due to the effect of random drops at the RED routers.

Figure 8: Simulation scenario.
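
The maximum loss counts quoted for each down time interval can be reproduced with a short calculation. The sketch below assumes that the maximum loss is the number of 1 Kbyte packets (taken as 1000 bytes) that the 1.5 Mbits/s satellite link can carry during the down time, rounded up; this reading and the function name are our assumptions, not something stated in the text.

```python
import math


def max_packets_lost(down_time_s, link_bps=1.5e6, packet_bytes=1000):
    """Upper bound on packets hit by an outage of the given duration."""
    return math.ceil(down_time_s * link_bps / (packet_bytes * 8))


down_times_s = [0.01, 0.03, 0.05, 0.07, 0.1]
print([max_packets_lost(t) for t in down_times_s])  # [2, 6, 10, 14, 19]
```

With 1 Kbyte packets, even the shortest outage removes up to two consecutive packets from a window, which is the multiple-loss case that Reno-style recovery, at one packet per RTT, handles poorly.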
Figure 9: The throughput of satellite connection with various maximum link down time intervals. (Throughput in Mbps versus maximum down time in seconds for BA-SACK, BA-Reno, NewReno, SACK, and Reno.)

Our next experiment investigates the performance of these schemes in the presence of both satellite and cellular networks (i.e., both space and ground wireless segments). We modify our simulation scenario as shown in Fig. 8, where there is a base station at the right side of the satellite link. The destination D0 is a mobile host. The link between the base station and the mobile host has a bandwidth of 1.5 Mbit/s and a propagation delay of 0.01 ms. In these experiments, we still assume that the ten sources have infinite data to send. The satellite link has a bit error rate of $10^{-5}$ and the ground wireless link has a down time interval randomly and uniformly distributed between 0 and t milliseconds.

Fig. 9 shows the throughput of the satellite connection when t is chosen to be 50, 100, 200, and 300 ms. Despite the link errors introduced in the satellite connection, the BA-SACK scheme is quite robust in achieving a fair share of the bandwidth at the bottleneck link for the satellite connection with large RTT. BA-Reno obtains a throughput smaller than BA-SACK, but still much larger than the throughputs obtained by the other three schemes. This suggests that with BA added on top of other TCPs, the connections with large RTTs get fair treatment, as opposed to the case where no BA is used. The throughputs obtained by the other three schemes are around 0.2 Mbit/s or smaller and do not show a clear advantage of one over the others. This is because we introduced random variables in the simulation and therefore each simulation has different dynamics.
5 Conclusions

This paper investigates the performance of end-to-end TCP implementations on satellite channels in the presence of competing ground flows in the wide-area Internet. We have chosen three TCP implementations (Reno, NewReno, and SACK) since these schemes provide relatively better performance for loss recovery. Our simulation results show that the satellite connection experiences substantial throughput loss and has a clear disadvantage with respect to the smaller RTT connections. This is due to the fact that current TCPs do not have appropriate information about the underlying networks and therefore their congestion avoidance algorithm results in an unfair bandwidth allocation when multiple connections with different RTTs share a bottleneck link. This problem has been reported by several researchers (e.g., [8], [19]).

To overcome this deficiency, we propose a bandwidth awareness TCP protocol (BA-TCP) in which the TCP source sets the upper bound of the congestion window by considering both the receiver's buffer space and the bandwidth-delay product conveyed from the IP layer. BA-TCP can be implemented by slightly modifying an existing TCP with fast retransmit and fast recovery algorithms. For the case of perfect links, we only need to modify the TCP receiver by (1) adding one state variable to store the round-trip propagation delay and (2) advertising the minimum of the receiver's buffer space and the available bandwidth-delay product. Considering that data packets may be lost due to satellite link errors, we modify the TCP source by keeping the congestion window and the slow start threshold unchanged after a fast retransmit. This is because BA-TCP can avoid losses due to network congestion. The reception of three duplicate ACKs is treated as an indication of packet loss due to link error. We have implemented BA-TCP by modifying Reno and SACK TCPs in ns-2.
Our simulation results suggest that BA-TCP achieves a fair allocation of bandwidth to a large-RTT satellite connection competing with ground connections. The throughput of the satellite connection improves roughly threefold over the TCP implementations presented in this paper. Furthermore, BA-TCP has a much shorter transient time, and the queuing delay at the bottleneck link is almost zero at steady state. Our simulations also show that BA-TCP is robust in the presence of random link errors with an error rate smaller than 10^-3.
The BA-TCP proposed in this paper is an effort to maintain end-to-end semantics and backward compatibility. BA-TCP provides a way to achieve fair sharing of bandwidth between satellite and ground connections by appropriately setting the upper bound of the congestion window and by keeping the window unchanged in response to packet losses. However, we have left loss recovery to the existing TCP. For example, Reno TCP recovers packet losses at a rate of one packet per RTT. When a single packet is lost from one window of data, we would expect that the BA feature on top of Reno TCP does not suffer much. This is confirmed by our simulation results. When multiple packets are lost from one window of data, the satellite connection experiences significant performance degradation while competing with ground connections. When BA is implemented on top of SACK TCP, we showed that the enhanced TCP is quite robust in the presence of burst losses. We therefore suggest combining the BA feature with SACK in satellite networks.
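To see why multiple losses in one window hurt the satellite connection most, the one-packet-per-RTT recovery rate quoted above can be turned into a rough estimate. The following Python sketch is an idealization, not a result from our simulations: it ignores retransmission timeouts, and the function name and the 0.5 s round-trip time are assumptions chosen only to represent a GEO-scale delay.

```python
def recovery_time_s(lost_packets: int, rtt_s: float) -> float:
    """Idealized recovery time when one lost packet is repaired per RTT."""
    if lost_packets < 0 or rtt_s < 0:
        raise ValueError("lost_packets and rtt_s must be non-negative")
    return lost_packets * rtt_s


if __name__ == "__main__":
    # Hypothetical GEO-scale RTT of 0.5 s: compare a single loss with a
    # burst of four losses from the same window.
    print(recovery_time_s(1, 0.5))
    print(recovery_time_s(4, 0.5))
```

Under this idealization the recovery time grows linearly with the number of losses, and each step costs a full satellite round trip. This is consistent with the observation that the selective acknowledgments of SACK are the better companion for the BA feature under burst losses.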
6 Acknowledgements
The authors wish to thank Dr. Saverio Mascolo for the useful discussions about control theory and its applications to network congestion control.
References
[1] W. Stallings, Data and Computer Communications. Macmillan, 4th edition, 1994.
[2] J. Martin, Communication Satellite Systems. Prentice Hall, 1978.
[3] J. Postel, Transmission control protocol. Internet RFC 793, 1981.
[4] R. Braden, T/TCP -- TCP extensions for transactions, functional specification. Internet RFC 1644, 1994.
[5] M. Allman, S. Floyd, and C. Partridge, Increasing TCP's initial window. Internet RFC 2414, 1998.
[6] Y. Zhang, D. DeLucia, B. Ryu, and S. Dao, Satellite communications in the global internet: Issues, pitfalls, and potential. In Proc. INET'97, June 1997.
[7] I. Minei and R. Cohen, High-speed Internet access through unidirectional geostationary satellite channels. IEEE J. on Selected Areas in Communications, vol. 17, no. 2, 1999.
[8] T. R. Henderson and R. H. Katz, Transport protocols for internet-compatible satellite networks. IEEE J. on Selected Areas in Communications, vol. 17, no. 2, 1999.
[9] W. Stevens, TCP/IP Illustrated, Volume 1: The Protocols. Addison-Wesley, 1994.
[10] V. Jacobson, Congestion avoidance and control. In Proc. of ACM SIGCOMM'88, pp. 314-329, 1988.
[11] J. Hoe, Start-up dynamics of TCP's congestion control and avoidance schemes. Master's thesis, MIT, June 1995.
[12] K. Fall and S. Floyd, Simulation-based comparisons of Tahoe, Reno, and SACK TCP. ACM Comput. Commun. Rev., vol. 26, pp. 5-21, July 1996.
[13] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow, TCP selective acknowledgment options. Internet RFC 2018, 1996.
[14] S. Floyd, SACK TCP: the sender's congestion control algorithms for the implementation "sack1" in LBNL's "ns" simulator (viewgraphs). Technical report, Mar. 1996.
[15] S. Mascolo, Smith's predictor for congestion control in TCP Internet protocol. To appear in Proc. of the American Control Conference, 1999.
[16] M. Gerla, R. Lo Cigno, S. Mascolo, and W. Weng, Generalized window advertising for TCP congestion control. CSD-TR 990012, UCLA, CA, USA, Feb. 1999.
[17] A. S. Tanenbaum, Computer Networks. Prentice Hall PTR, 3rd edition, 1996.
[18] S. Floyd and V. Jacobson, Random early detection gateways for congestion avoidance. IEEE/ACM Trans. Networking, vol. 1, no. 4, pp. 397-413, Aug. 1993.
[19] T. Lakshman and U. Madhow, The performance of TCP/IP for networks with high bandwidth-delay products and random loss. IEEE/ACM Trans. Networking, vol. 5, pp. 336-350, June 1997.