Adaptive Scheduling in DOCSIS-based CATV Networks

Huei-Jiun Ju and Wanjiun Liao
Department of Electrical Engineering, National Taiwan University
Taipei, Taiwan
Email: [email protected]


Abstract

This paper studies the effect of the DOCSIS MAC layer on the performance of two-way TCP transfers in Hybrid Fiber Coax (HFC) networks. We propose a new adaptive scheduling scheme, called Long Packet Deferment (LPD), at the headend to improve TCP performance in DOCSIS-based HFC networks. LPD reduces the transmission frequency of long packets and, when such packets are transmitted, schedules them towards the end of each transmission period. LPD thus allows the system to behave as in a symmetric network earlier, reduces the round-trip delay of sending data packets, and improves the aggregate downstream throughput. We have conducted simulations using the ns-2 network simulator to compare LPD with the simple first-come-first-served scheduling of DOCSIS and with an IEEE 802.14-like mechanism. The results show that LPD performs better in terms of higher aggregate downstream throughput and shorter access delay.

Keywords: TCP, HFC, DOCSIS, asymmetric network

1. Introduction

A Hybrid Fiber Coax (HFC) network is a promising technology for providing broadband access to the Internet. It is a tree-and-branch, point-to-multipoint access network in the downstream direction but a multipoint-to-point, bus access network in the upstream direction. The Multimedia Cable Network System Partners (MCNS) [1] and the IEEE 802.14 Working Group [2] are developing standards to enable data communication capabilities over HFC networks.

MCNS's Data-Over-Cable Service Interface Specifications (DOCSIS) specify the physical layer modulation and the MAC layer operation for HFC networks. In DOCSIS, the upstream channel is modeled as a stream of mini-slots, is arbitrated by a contention-based reservation mechanism, and is controlled by the Cable Modem Termination System (CMTS) at the headend. Each Cable Modem (CM) sends requests to the CMTS for the use of the upstream channel and waits for data grants (i.e., slots) allocated by the CMTS. The CMTS regularly transmits downstream management messages called Upstream Bandwidth Allocation maps (denoted MAP), each of which defines transmission intervals on the upstream channel. Each transmission interval specified by a MAP contains request mini-slots and data mini-slots: request mini-slots are used by CMs to request upstream bandwidth; data mini-slots are used by CMs to transmit data frames. All CMs learn the assignment of bandwidth from the MAP. Thus, each MAP must be received by all CMs before the beginning of the described transmission period (i.e., its effective time).

Previous studies [1-4] have shown that TCP performance degrades in HFC networks due to the bandwidth asymmetry between the downstream and upstream channels. [5] investigates how the DOCSIS MAC layer affects bandwidth asymmetry. This paper proposes a new protocol called "Long Packet Deferment" at the headend to solve the TCP performance problems caused by the DOCSIS MAC layer scheduling and allocation mechanisms. In particular, we focus on two-way TCP transfers, i.e., both data and ACK packets are transmitted on the upstream channel. The rest of the paper is organized as follows. Section 2 analyzes the behavior of two-way TCP transfers over the DOCSIS MAC layer. Section 3 presents the proposed mechanism, called "Long Packet Deferment (LPD)", to improve TCP performance in DOCSIS-based HFC networks. Section 4 shows simulation results obtained with ns-2, comparing the original DOCSIS MAC layer control mechanism with our mechanism. Finally, we conclude in Section 5.

2. TCP Performance over the DOCSIS MAC Layer

2.1 The Effect of Bandwidth Asymmetry on TCP Performance

Asymmetric networks, such as HFC and xDSL, are networks with different channel capacities in the downstream and upstream directions. The main effect of bandwidth asymmetry on TCP performance is that TCP ACK clocking may be disrupted. [1] defines a bandwidth asymmetry ratio k to better understand the behavior of TCP in asymmetric networks:

$$k = \frac{\text{forward channel bandwidth}}{\text{reverse channel bandwidth}} \times \frac{\text{ACK packet length}}{\text{data packet length}} = \frac{C_d}{C_u} \times \frac{L_{ack}}{L_{data}} \qquad (1)$$

TCP behaves normally when k is less than or equal to one. When bandwidth is asymmetric (i.e., k > 1), ACK packets arrive at the bottleneck link in the reverse direction at a rate faster than the bottleneck link can support. As a result, the sender clocks out data at a slower rate and the congestion window grows more slowly, which in turn lowers the throughput in the downstream direction.
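As a quick numeric check of eq. (1), the short script below evaluates k for one illustrative operating point; the channel rates and packet sizes are our own example values, not figures from the paper:

# Illustrative evaluation of the bandwidth asymmetry ratio k of eq. (1).
# All parameter values are example assumptions, not taken from the paper.
C_d = 27e6          # downstream channel capacity (bits/s)
C_u = 2.56e6        # upstream channel capacity (bits/s)
L_data = 1500 * 8   # TCP data packet length (bits)
L_ack = 40 * 8      # TCP ACK packet length (bits)

k = (C_d / C_u) * (L_ack / L_data)
print(f"k = {k:.2f}")   # ~0.28: raw bandwidth asymmetry alone keeps k <= 1 here

Even with a raw capacity ratio above 10, k stays below one at this operating point; as [5] observed and as Section 2.2 elaborates, it is the MAC layer operation, not raw channel asymmetry alone, that can push k above one.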

2.2 The Effect of DOCSIS MAC Layer on Bandwidth Asymmetry

[5] showed that eq. (1) alone cannot adequately explain TCP's behavior in DOCSIS-based HFC networks. Considering the MAC layer operation of DOCSIS v1.1, k is modified as

$$k = \alpha \times \frac{C_d \, T_{usv}}{d \, L_{data} \, N_{dCM}} \qquad (2)$$

where

$$\alpha = \begin{cases} 1, & \text{for one-way transfers} \\ \dfrac{N_{dCM} L_{data}}{N_{dCM} L_{data} + N_{uCM} L_{ack}/d}, & \text{for two-way transfers} \end{cases}$$

Here d is a parameter of the delayed ACK policy (i.e., one ACK packet is sent to acknowledge the receipt of d data packets), Tusv is the average time between sending two consecutive packets from the CM buffer, NdCM is the number of simultaneous TCP downloads (downloading CMs), and NuCM is the number of simultaneous TCP uploads (uploading CMs).

In DOCSIS, the upstream channel is modeled as a stream of mini-slots, and a transmission starts only at the beginning of a mini-slot. Let Nu_ack be the number of mini-slots used to transmit one ACK packet on the upstream. Given Lack, Cu, and tms, the number of mini-slots used to transmit an ACK packet is

$$N_{u\_ack} = \left\lceil \frac{L_{ack}}{C_u} \times \frac{1}{t_{ms}} \right\rceil \qquad (3)$$

where Lack is the size of an ACK packet, tms is the duration of one mini-slot on the upstream channel, and Cu is the upstream channel capacity. Similarly, the number of mini-slots used to transmit a data packet is $N_{u\_data} = \lceil (L_{data}/C_u) \times (1/t_{ms}) \rceil$, where Ldata is the size of a data packet.

A MAP describes the bandwidth allocation in a transmission period and must be received by all participating CMs before its effective time. Each MAP may therefore be transmitted before some requests, especially piggybacked ones, have arrived and been processed at the CMTS. These late requests are deemed pending in the current transmission period and become backlogged requests in the next period. The pending requests, plus the new requests that arrive at the CMTS during the next transmission period, wait to be granted in the next MAP. Let DMAP be the time difference between when a MAP is transmitted and when it goes into effect. Since the size of an ACK packet is fixed, the maximum number of pending requests Np_REQ (i.e., those that arrive at the CMTS during one DMAP) can be expressed as

$$N_{p\_REQ} = \left\lceil \frac{D_{MAP}}{t_{ms}} \times \frac{1}{N_{u\_ack}} \right\rceil \qquad (4)$$

Let Nc be the number of mini-slots allocated to the request contention period. Then, for an ACK packet:

(1) when NdCM ≤ 2Np_REQ, Tusv is bounded by

$$\left(N_c + N_{uCM} N_{u\_data} + N_{dCM} N_{u\_ack}\right) t_{ms} \le T_{usv} \le \left(2N_c + 2N_{uCM} N_{u\_data} + N_{dCM} N_{u\_ack}\right) t_{ms} \qquad (5)$$

(2) when NdCM > 2Np_REQ, Tusv is bounded by

$$\left(N_c + N_{uCM} N_{u\_data} + N_{dCM} N_{u\_ack}\right) t_{ms} \le T_{usv} \le \frac{N_{dCM}}{N_{dCM} - N_{p\_REQ}} \left[N_c + N_{uCM} N_{u\_data} + \left(N_{dCM} - N_{p\_REQ}\right) N_{u\_ack}\right] t_{ms} \qquad (6)$$

Substituting eqs. (5) and (6) into eq. (2), we can derive the two bounds of k for the two cases accordingly.
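To make eqs. (2)-(6) concrete, the sketch below computes the mini-slot counts of eqs. (3) and (4) and the resulting bounds on Tusv and k. The parameter values are again illustrative assumptions of ours, not values from the paper:

import math

# Illustrative parameters (our assumptions, not values from the paper).
C_d  = 27e6          # downstream capacity (bits/s)
C_u  = 2.56e6        # upstream capacity (bits/s)
t_ms = 50e-6         # mini-slot duration (s) -> 16 bytes per upstream mini-slot
slot_bits = 128      # C_u * t_ms: bits carried per mini-slot
L_ack, L_data = 64 * 8, 1500 * 8   # upstream ACK / data frame sizes (bits)
D_MAP_slots = 40     # MAP lead time D_MAP, expressed in mini-slots (2 ms)
N_c = 12             # contention mini-slots per transmission period
N_dCM, N_uCM = 10, 2 # simultaneous downloads / uploads
d = 2                # delayed-ACK factor

# Eq. (3): mini-slots per ACK packet and per data packet.
N_u_ack  = math.ceil(L_ack  / slot_bits)   # -> 4
N_u_data = math.ceil(L_data / slot_bits)   # -> 94

# Eq. (4): maximum number of pending requests during one D_MAP.
N_p_REQ = math.ceil(D_MAP_slots / N_u_ack) # -> 10

# Eqs. (5)/(6): bounds on T_usv for an ACK packet.
lower = (N_c + N_uCM * N_u_data + N_dCM * N_u_ack) * t_ms
if N_dCM <= 2 * N_p_REQ:   # case of eq. (5)
    upper = (2 * N_c + 2 * N_uCM * N_u_data + N_dCM * N_u_ack) * t_ms
else:                      # case of eq. (6)
    upper = (N_dCM / (N_dCM - N_p_REQ)) * \
            (N_c + N_uCM * N_u_data + (N_dCM - N_p_REQ) * N_u_ack) * t_ms

# Eq. (2): the resulting bounds on k for two-way transfers.
alpha = (N_dCM * L_data) / (N_dCM * L_data + N_uCM * L_ack / d)
for T_usv in (lower, upper):
    k = alpha * C_d * T_usv / (d * L_data * N_dCM)
    print(f"T_usv = {T_usv * 1e3:.1f} ms  ->  k = {k:.2f}")

With these assumed numbers, k rises to roughly 1.3-2.5 once the MAC-layer service time is accounted for, even though eq. (1) alone gave k ≈ 0.28 for the same channels.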

3. Long Packet Deferment

In DOCSIS, the CMTS allocates at most one data grant to each client in a MAP, irrespective of the number of mini-slots requested. The design philosophy behind this "one-CM-one-Data-IE" rule is to share the channel bandwidth fairly, i.e., no CM can monopolize the bandwidth in a transmission period. Such fairness, however, holds only for one-way TCP transfers. With two-way TCP transfers, both data and ACK packets are transmitted upstream. Typically, TCP data packets are ten times larger than ACK packets, so the two types of packets require data grants of very different sizes from the CMTS. [5] showed that with two-way transfers, the upstream TCP traffic (i.e., long data packets) may throttle the downstream traffic (i.e., short ACK packets), for two reasons. (1) Long data packets going upstream impose a high asymmetry ratio, and thus a long round-trip time, on the short ACK packets. The long round-trip delay in turn reduces the growth rate of a downloading CM's congestion window, because the TCP congestion window grows at a rate inversely proportional to the average round-trip delay. (2) The "one-CM-one-Data-IE" allocation of DOCSIS makes upstream and downstream transfers experience the same round-trip delay, so the congestion windows of the two types of transfers grow at approximately the same rate. Because the downstream channel typically has far higher capacity than the upstream one, this leaves the downstream channel underutilized.
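To see the disparity in concrete terms, reusing the illustrative mini-slot counts computed in the sketch above: each data grant consumes roughly 24 ACK-sized grants' worth of upstream mini-slots, even though both count as a single Data IE in the MAP:

N_u_ack, N_u_data = 4, 94      # mini-slots per ACK / data grant (illustrative)
print(N_u_data / N_u_ack)      # ~23.5: one data grant ~ 23.5 ACK grants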

To summarize, DOCSIS treats TCP data packets the same way as ACK packets on the upstream channel: it allocates each CM at most one data grant in every MAP, irrespective of frame size. This imposes the same round-trip delay on downstream and upstream transfers and results in poor bandwidth utilization on the downstream channel. To solve this problem, we propose a mechanism called "Long Packet Deferment" (LPD), which treats long packets (data packets) differently from short packets (ACK packets) on the upstream channel. The design goal of LPD is to reduce the sending rate of long packets and increase that of short packets, in an attempt to achieve true fairness in resource sharing. We can thus shorten the round-trip delay of downstream TCP transfers without seriously degrading the performance of upstream TCP transfers. Note that the LPD mechanism operates at the CMTS only; no modification is required on the CMs.

3.1 LPD Fundamentals

We assume that all data packets are of fixed size, and distinguish between only two types of requests (i.e., for long and for short data grants); the extension to variable packet sizes is deferred to Sec. 3.3. Let δ be the threshold that determines the type of a request, where

$$\delta = \frac{N_{u\_data} - N_{u\_ack}}{2}$$

Intuitively, if the requested data grant size exceeds the threshold δ, the request should be deferred a few more steps before the allocation is granted. Let Ndef be the number of steps a long packet is deferred. Each downloading CM (i.e., one transferring short ACK packets upstream) can get a data grant in every MAP, but each uploading CM (one transferring long data packets) can get a data grant only in every Ndef-th MAP.

Suppose that the CMTS keeps two types of queues to store requests: a long job queue and a short job queue. Upon receipt of a new request, the CMTS processes it as follows. If the requested data grant size is larger than δ, the CMTS initializes the request's deferred-step count to Ndef and puts it into the long job queue; otherwise, the count is set to one and the request is put into the short job queue. When it is time to transmit the next MAP, the CMTS processes requests from the short job queue first, followed by the long job queue, each on a first-come-first-served basis. The CMTS allocates data grants only to requests whose deferred-step count is less than or equal to one, and removes the granted requests from their queues. For requests with counts larger than one, the CMTS decrements the count, issues a data pending IE in the MAP, and puts the request back into its queue. This process continues until a limit of the MAP is reached (2048 mini-slots or 240 IEs), at which point the CMTS stops both allocating data grants and decrementing deferred steps. It then issues data pending IEs in the MAP to the remaining eligible requests (i.e., those with counts less than or equal to one), because the CMs should be notified that their requests are pending, not lost.
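A minimal sketch of this allocation loop follows, assuming simplified request records and MAP-limit accounting; the Request fields, queue representation, and return values are our own simplifications, and real DOCSIS IE encoding is elided:

from collections import deque
from dataclasses import dataclass

MAP_MINISLOT_LIMIT = 2048   # per-MAP limits quoted in the text
MAP_IE_LIMIT = 240

@dataclass
class Request:
    cm_id: int
    size: int            # requested data grant size, in mini-slots
    deferred: int = 1    # remaining deferment steps

class LPDScheduler:
    # Sketch of the CMTS-side LPD allocation loop of Sec. 3.1.

    def __init__(self, delta, n_def):
        self.delta = delta    # long/short threshold, (N_u_data - N_u_ack) / 2
        self.n_def = n_def    # deferment steps assigned to long requests
        self.queues = {"short": deque(), "long": deque()}

    def on_request(self, req):
        # Long requests start with N_def deferment steps, short ones with 1.
        if req.size > self.delta:
            req.deferred = self.n_def
            self.queues["long"].append(req)
        else:
            req.deferred = 1
            self.queues["short"].append(req)

    def build_map(self):
        grants, pending_ies = [], []   # data grants and data pending IEs
        slots = ies = 0
        full = False
        for name in ("short", "long"):     # short job queue is served first
            q, keep = self.queues[name], deque()
            while q:
                req = q.popleft()
                if not full and (slots + req.size > MAP_MINISLOT_LIMIT
                                 or ies >= MAP_IE_LIMIT):
                    full = True    # limit reached: stop granting and decrementing
                if full:
                    if req.deferred <= 1:
                        pending_ies.append(req.cm_id)   # pending, not lost
                    keep.append(req)                    # stays backlogged
                elif req.deferred <= 1:
                    grants.append((req.cm_id, req.size))  # allocate a data grant
                    slots += req.size
                    ies += 1
                else:
                    req.deferred -= 1                   # defer one more MAP
                    pending_ies.append(req.cm_id)       # notify via pending IE
                    ies += 1
                    keep.append(req)
            self.queues[name] = keep
        return grants, pending_ies

For instance, with Nu_data = 94 and Nu_ack = 4 mini-slots as in the earlier sketches, δ = 45, so a 94-slot data request initialized with Ndef = 5 is granted only in every fifth MAP, while 4-slot ACK requests are granted in every MAP.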

3.2 Analysis of the LPD Mechanism

The LPD mechanism makes downloading CMs experience round-trip delays different from those of uploading CMs, with the ratio of the upstream packet sending rates of the two types of CMs being Ndef : 1. In addition, LPD forces the CMTS to place long data grants at the end of each MAP, because allocation starts from the short job queue. Thus, the LPD mechanism keeps the asymmetry ratio and the TCP round-trip delay at their lower bounds whenever a long data grant is allocated in a MAP.

(a) Asymmetry ratio

The LPD mechanism causes different round-trip delays for downstream and upstream TCP transfers. We first derive the upstream data service times for both downloading and uploading CMs.

(1) For downloading CMs (i.e., ACK packets),

$$T_{usv} = \left(N_c + \frac{N_{uCM}}{N_{def}} N_{u\_data} + N_{dCM} N_{u\_ack}\right) t_{ms} \qquad (7)$$

Since the number of mini-slots for a long data grant (for a TCP data packet) is much larger than that for a short data grant (for an ACK packet), Nu_data >> Nu_ack and Ndef >> 1. Comparing eq. (7) with eqs. (5) and (6), we see that the upstream data service time of a downloading CM is significantly reduced.

(2) For uploading CMs (i.e., data packets),

$$T_{usv} = \left(N_{def} N_c + N_{uCM} N_{u\_data} + N_{def} N_{dCM} N_{u\_ack}\right) t_{ms} \qquad (8)$$

Comparing eq. (8) with eqs. (5) and (6), we see that the LPD mechanism may slightly increase the upstream data service time of an uploading CM.

The asymmetry ratio k of downstream TCP transfers under LPD can be derived as

$$k = \frac{C_d}{d \, L_{data}} \times \frac{N_{def} N_{dCM} L_{data}}{N_{def} N_{dCM} L_{data} + N_{uCM} L_{data}/d} \times \frac{\left[N_c + (N_{uCM}/N_{def}) N_{u\_data} + N_{dCM} N_{u\_ack}\right] t_{ms}}{N_{dCM}} \qquad (9)$$

Compared with the original scheduling (i.e., simple FCFS), k in eq. (9) is smaller due to the smaller upstream data service time. A smaller k speeds up the drop of k to one, at which point the system behaves normally, i.e., as in a symmetric network. This in turn shortens the round-trip delay, which results in larger downstream throughput.

(b) Round-trip delay

With two-way transfers, the system mostly operates as in an asymmetric network. Eq. (10) gives the average round-trip delay of sending a packet when k > 1:

$$RTT = 2T + T_{trans} + B_{CM} \times T_{usv} \qquad (10)$$

where BCM is the buffer size of the CM, and

$$T_{trans} = \begin{cases} \dfrac{L_{data}}{C_d} + \dfrac{L_{ack}}{C_u}, & \text{for downstream traffic} \\[6pt] \dfrac{L_{data}}{C_u} + \dfrac{L_{ack}}{C_d}, & \text{for upstream traffic} \end{cases}$$

Substituting eqs. (7) and (8) into eq. (10), we can derive the average round-trip delays of downstream and upstream TCP transfers accordingly. The last term of eq. (10) dominates the RTT. Thus, LPD yields a far shorter round-trip delay for downstream TCP transfers than the original scheduling (i.e., simple FCFS), and only a slightly longer round-trip delay for upstream TCP transfers.
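Carrying over the illustrative numbers from Section 2, the sketch below evaluates eqs. (7)-(10); Ndef, BCM, and the propagation term T are additional assumptions of ours:

import math

# Illustrative parameters carried over from the Section 2 sketch, plus
# three additional assumptions of ours: N_def, B_CM, and T of eq. (10).
C_d, C_u, t_ms, d = 27e6, 2.56e6, 50e-6, 2
L_data, L_ack = 1500 * 8, 64 * 8
N_u_data, N_u_ack = 94, 4
N_c, N_dCM, N_uCM = 12, 10, 2
N_def = 5       # deferment steps for long requests (assumed)
B_CM = 20       # CM buffer size, in packets (assumed)
T = 5e-3        # propagation/processing delay term of eq. (10) (assumed)

# Eq. (7): upstream service time seen by downloading CMs (ACKs) under LPD.
T_usv_down = (N_c + (N_uCM / N_def) * N_u_data + N_dCM * N_u_ack) * t_ms
# Eq. (8): upstream service time seen by uploading CMs (data) under LPD.
T_usv_up = (N_def * N_c + N_uCM * N_u_data + N_def * N_dCM * N_u_ack) * t_ms

# Eq. (9): asymmetry ratio of downstream transfers under LPD.
alpha = (N_def * N_dCM * L_data) / (N_def * N_dCM * L_data + N_uCM * L_data / d)
k = (C_d / (d * L_data)) * alpha * T_usv_down / N_dCM

# Eq. (10): average round-trip delay of downstream traffic in the k > 1 regime.
T_trans = L_data / C_d + L_ack / C_u
RTT = 2 * T + T_trans + B_CM * T_usv_down

print(f"T_usv down/up = {T_usv_down*1e3:.2f} / {T_usv_up*1e3:.2f} ms")
print(f"k = {k:.2f}, RTT (downstream) = {RTT*1e3:.1f} ms")

With these assumed values, the downloading CMs' service time drops from the 12-22 ms range of eqs. (5)/(6) to about 4.5 ms and k falls below one, while the uploading CMs' service time stays around 22 ms, matching the comparisons drawn above.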

3.3 Extension to Variable Packet Sizes

So far we have assumed fixed-size data packets and a single deferment value Ndef for all long requests. With variable packet sizes, each request Rx for a data grant of size Lx should be assigned its own deferment Dx. Typically, NuCM is smaller than NdCM, and both values are usually very hard to determine dynamically. From the discussion above, Dx should be upper bounded by $r\lceil L_x/N_{u\_ack}\rceil$ and $r\lceil C_d/C_u\rceil$, where 0 < r < 1. In the LPD protocol, we set the number of deferment groups $n = r\lceil C_d/C_u\rceil$, where 0 < r < 1. Each group corresponds to a certain range of data grant sizes. The Dx of a request Rx is determined as follows.

(1) If 0 < Lx
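The case analysis that assigns Dx is truncated in our copy of the text right after "(1) If 0 < Lx", so the mapping below is only a hypothetical illustration: it computes the number of groups n and assigns deferments by measuring a request in ACK-sized grants, clipped to [1, n]. The function name and the proportional rule are our assumptions, not the paper's.

import math

def deferment_steps(L_x, N_u_ack, C_d, C_u, r=0.5):
    # Hypothetical D_x assignment: the paper's exact per-group rule is
    # truncated above, so this proportional mapping is our assumption.
    n = math.ceil(r * C_d / C_u)        # deferment groups, rounded to an integer
    group = math.ceil(L_x / N_u_ack)    # grant size measured in ACK-sized grants
    return max(1, min(group, n))        # clip D_x to [1, n]

# With the illustrative channels used earlier (C_d/C_u ~ 10.5) and r = 0.5,
# a 94-slot data request is deferred 6 MAPs, a 4-slot ACK request none:
print(deferment_steps(94, 4, 27e6, 2.56e6))   # -> 6
print(deferment_steps(4, 4, 27e6, 2.56e6))    # -> 1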