Experiences With WM: A Centralised Scheduling Approach for Performance Management of IEEE 802.11 Wireless LANs

Malati Hegde∗, Pavan Kumar∗, K. R. Vasudev∗, S. V. R. Anand∗, Anurag Kumar∗, Joy Kuri†
∗ Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, India
† Centre for Electronic Design and Technology, Indian Institute of Science, Bangalore, India
Email: malati, pavank, vasudevkr, anand, [email protected], [email protected]

This work was supported by the Department of Information Technology (DIT), Ministry of Communications and Information Technology, India.

Abstract—In our earlier work [1] we proposed WLAN Manager (WM), a centralised controller for QoS management of infrastructure WLANs based on the IEEE 802.11 DCF standards. The WM approach is based on queueing and scheduling packets in a device through which all traffic between the APs and the wireline LAN flows; it requires no changes to the APs or the STAs, and can be viewed as implementing a "Split-MAC" architecture. The objectives of WM were to manage various TCP performance related issues (such as the throughput "anomaly" that arises when STAs associate with an AP at mixed PHY rates, and the upload–download unfairness induced by finite AP buffers), and also to serve as the controller for VoIP admission control and handovers, and for other QoS management measures. In this paper we report our experiences in implementing the proposals in [1]: the insights gained, the new control techniques developed, and the effectiveness of the WM approach in managing TCP performance in an infrastructure WLAN. We report results from a hybrid experiment in which a physical WM manages actual TCP controlled packet flows between a server and clients, with the WLAN being simulated, and also from a small physical testbed with an actual AP.

Index Terms—WLAN QoS management, WLAN controllers, split-MAC WLAN controllers, IEEE 802.11 wireless networks

Fig. 1. Simulation results showing TCP download throughputs for n STAs when all n STAs are associated at the PHY rate of 54 Mbps, or at 24 Mbps, or at 6 Mbps, and also when one third of the STAs are at each of the three rates.

I. INTRODUCTION

Enterprise and home WLANs are based on the CSMA/CA based Distributed Coordination Function (DCF) MAC (medium access control) standardised in IEEE 802.11. It is well known that, due to the intrinsic properties of the DCF MAC protocol, there are several limitations to the QoS (quality of service) offered by WiFi WLANs. For example, from Figure 1 we see that when, out of 3, 6, 9, 12, etc., STAs, one third of the STAs are associated with an AP at each of the rates 54 Mbps, 24 Mbps, and 6 Mbps, then the aggregate bulk download throughputs of the STAs associated at each rate are the same (just over 3 Mbps: the bottom three curves) and the total download throughput is about 9.5 Mbps
(third curve from the top), whereas when all the STAs are associated at 54 Mbps the aggregate throughput is about 22.5 Mbps (the top curve); see also [2] for the same observation in a saturated node setting. In practical WLAN deployments such mixed rate associations can be expected, leading to a reduction in the overall network throughput. For an analytical understanding of this phenomenon in the context of TCP controlled transfers see [3] and [4]. Another QoS problem that has been reported (see [5]) is that of unfairness between upload and download TCP controlled bulk transfers for practical sizes of the AP buffer (which stores the packets contending for access to the wireless medium). Real time interactive traffic, such as that due to voice over IP (VoIP) telephony, is known to perform poorly in the presence of TCP traffic in WLANs. The IEEE 802.11e standard provides an Enhanced DCF (EDCF) in which STAs carrying voice transfers can obtain prioritised service. While these mechanisms serve to isolate VoIP traffic from TCP transfers, admission control of VoIP connections is needed, especially if some rate guarantees are to be provided to TCP transfers (see [6]). Further, after a VoIP call has been set up, if the wireless voice end-point moves across the coverage areas of APs then handover mechanisms are needed.

Fig. 2. Top: An enterprise network in which all traffic between the WLAN and the wireline network passes through one link. Bottom: The same network with a WM device in place.

WLAN Manager (WM): It is with these QoS issues in mind that we proposed WLAN Manager (WM [1]), a device through which all traffic between the WLAN and the wireline LAN passes (see Figure 2). While several design principles employed in WM were elucidated in [1], and were supported by simulations or small experiments, we have since developed a WM prototype (on a standalone Linux machine). In this paper we report some of our experiences in implementing WM, the highly encouraging results obtained, the insights gained, and the new control techniques developed; we focus here exclusively on QoS management of TCP transfers.

Contributions of this Paper: In [1] we proposed a Split-MAC based architecture (see the CAPWAP RFC [7]) that works with off-the-shelf commercial products without requiring any changes to them. In this paper we present the TCP performance enhancement features that have been incorporated in the WM prototype based on the same architecture. As a part of enterprise WLAN management, these features allow the administrator to easily implement different WLAN performance policies in order to provide different service levels to the WLAN users. We present a series of experimental results to demonstrate the effectiveness of the proposed solution under different WLAN scenarios, and report on the insights that were gained and the control techniques that were developed. We show that it is indeed possible to meet the QoS performance objectives without compromising wireless channel utilization or scalability, entirely from outside the WLAN. We show that WM can effectively manage upload–download fairness and fairness among stations while maximizing channel utilization, and can provide policy based service differentiation.

Related Literature: Since the publication of our proposal [1], another research group has independently reported a similar idea (see [8] and the references therein). While the general motivating concerns in the two proposals are similar, we have taken some specific TCP performance issues, namely (i) fairness and efficiency with different PHY rates, (ii) fairness for arbitrary combinations of upload and download connections, and (iii) policy based TCP throughput differentiation, to the stage of a complete implementation as a WM prototype. Further, we have provided a technique by which WM adaptively learns the effective service capacity of the WLAN.

In the published literature, several other solutions have been proposed for the TCP QoS problems discussed earlier. These solutions are based on queueing and scheduling policies to be implemented in an AP, on the regulation of MAC parameters, or on changes in the client drivers. Techniques that employ modifications to the AP software include: selective packet marking (Wu et al. [9]), weighted round robin service in the AP (Vacirca et al. [10]), TBF (Blefari-Melazzi et al. [11]), receiver window modification (Pilosof et al. [12]), rate limiting (Detti et al. [13]), and Weighted Fair–EDCA (WF–EDCA) (Lee et al. [14]). Some proposals are based on setting various 802.11 parameters to appropriate values: MTU (Yoo et al. [15]), CWmin (Kim et al. [16]), PIFS (Kim et al. [17]), AIFS, TXOP, and CWmin (Leith et al. [18]). Some of the prominent vendors that manufacture WLAN controllers/switches for managing the WLAN channel are Meru Networks, Extricom, and Cisco/Airespace. These controllers appear to require proprietary APs. The QoS functionality provided by such access controllers can be as simple as mapping connections to the IEEE 802.11e access categories (ACs). Meru Networks and Extricom use a single channel across all the APs to provide QoS for voice over WLANs; the controllers of other vendors focus on RF management and/or security authentication. Thus, these approaches require changes to MAC parameters and/or modifications to the firmware running on the AP. On the other hand, WM [1] does not require any modification to the MAC parameters, and works with any existing IEEE 802.11 based infrastructure WLAN.

II. OUR EXPERIMENTAL APPROACH

In this paper we report results from a hybrid testbed that combines physical network devices and a QualNet simulator, and also from a testbed with all physical network devices. The hybrid testbed is shown in Figure 3; it permits us to experiment with a larger WLAN than is practical to set up in our laboratory. The QualNet simulator (v 4.0) is used to emulate the MAC and PHY of the WLAN devices. QualNet provides a mechanism to inject actual Internet traffic into the simulated wireless nodes through its external interface APIs.

Fig. 3. The hybrid testbed.

Fig. 4. Aggregate upload and download throughputs of 10 STAs (5 performing uploads and 5 performing downloads) at 11 Mbps with varying input AP buffer size. TCP window scaling is enabled.

As shown in Figure 3, the hybrid testbed comprises a physical server, an Internet link emulator, a router, an Ethernet LAN switch, the Linux machine on which WM is implemented, and laptops on which the client applications run. The QualNet simulator models virtual STAs, each of which is mapped to a real end system. The TCP transfers are initiated from the end systems; the packets of the TCP flows then pass through the simulated MAC in the QualNet simulator, thus experiencing the WLAN contention and delays that they would have experienced in a real network. Results from a completely physical experiment were based on a single WME (WiFi Multimedia Extension) compliant Cisco Aironet AP (1240 series), and standard laptops running the Linux OS (Fedora 8).

III. TCP QOS: ILLUSTRATION OF THE PROBLEMS

A. Multi-rate Unfairness

This problem was already mentioned in the Introduction and illustrated via Figure 1, which was obtained from the hybrid testbed. The mixed PHY rate scenario leads to a drastic reduction in the overall WLAN throughput, while equalising connection throughputs: the DCF MAC is packet fair, and hence gives an equal chance to all backlogged nodes to contend, irrespective of the PHY rates at which they can send. A back-of-the-envelope calculation, sketched below, shows why equal packet shares drag every STA down to a common throughput dictated by the slowest rates.
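The following is an idealised calculation only; it ignores MAC/PHY overheads, contention, and TCP dynamics, so the absolute numbers are higher than the measured throughputs in Figure 1, but it shows the direction and rough magnitude of the effect:

```python
# Idealised packet-fair DCF: in one "round" each backlogged STA sends one
# packet of L bits, so a round lasts sum(L / r_i) seconds and every STA
# obtains the same throughput L / sum(L / r_i), regardless of its own rate.
L = 1500 * 8  # bits per packet

def per_sta_throughput_bps(rates_bps):
    round_time = sum(L / r for r in rates_bps)
    return L / round_time

all_fast = 12 * per_sta_throughput_bps([54e6] * 12)
mixed    = 12 * per_sta_throughput_bps([54e6] * 4 + [24e6] * 4 + [6e6] * 4)
print(all_fast / 1e6, mixed / 1e6)  # ~54.0 vs ~13.2 Mbps aggregate
```

Even in this loss-free, overhead-free model, adding slow STAs collapses the aggregate throughput for everyone, which is the anomaly seen in Figure 1.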

B. Upload–Download Unfairness

In addition to the literature discussed earlier, we have made observations about TCP upload–download unfairness that we briefly present here. Let us first recall the reason for the upload–download unfairness reported in [5]. In an infrastructure WLAN, all traffic between the STAs and a server on the wireline network must pass through the AP. The AP is a single device that contends to send packets for all the connections, whereas each STA contends only for its own connection. The AP is thus a bottleneck in the infrastructure WLAN setting (see [3] and [4]). The packets from upload and download TCP connections (respectively, TCP ACKs and TCP data packets) fill up the AP queue, which is stored in a finite buffer. When the TCP windows grow, packet drops take place at the tail of the AP buffer. For the downlink transfers, data packet drops at the AP buffer affect the growth of the TCP congestion windows. Since ACK packets (corresponding to uplink transfers) are smaller, they suffer a smaller drop rate. Further, TCP ACKs are cumulative: each ACK acknowledges all the data up to the sequence number it carries, so the loss of an ACK is repaired by a later one. Hence, uplink transfers may not be as severely affected by tail-drop losses at the AP buffer. In our experiments we have found that with recent versions of TCP, which implement window scaling, this reported upload–download unfairness can be very severe, and is not relieved by increasing the AP buffer space; see the results reported later in this section.

To illustrate the unfairness between uplink and downlink traffic flows, we ran experiments on the hybrid testbed. Ten wireless STAs are connected at the 11 Mbps PHY rate to an AP. STA 1 to STA 5 each upload a large file to a local server via the AP, while STA 6 to STA 10 download files of the same size from the server. All the transfers are initiated simultaneously, using the Linux wget file transfer utility. TCP window scaling is turned on by default on most operating systems; hence, the receiver advertised windows can grow to very large values. The TCP receivers employ the delayed ACK mechanism. The buffer size in the AP is varied from 10 to 350 packets (15 KBytes to 525 KBytes of buffer at the AP) and the TCP maximum segment size (MSS) is set to 1500 bytes. The 10 transfers are allowed to run for a period of 300 seconds. Figure 4 shows the aggregate upload and download throughputs for the two groups of 5 STAs each, for varying AP buffer size. We see that the 5 STAs uploading files to the server obtain by far

Fig. 5. Congestion windows (vs. time) for the 5 upload TCP transfers, when 5 upload transfers and 5 download transfers are in progress. AP buffer size = 100 packets.

Fig. 6. Congestion windows (vs. time) for the 5 download TCP transfers, when 5 upload transfers and 5 download transfers are in progress. AP buffer size = 100 packets.

Fig. 7. Individual upload and download throughputs of 10 STAs at 11 Mbps with varying input AP buffer size. TCP window scaling is disabled.

Fig. 8. Upload transfer throughputs of 10 STAs associated at 11 Mbps PHY rate.

the larger part of the network throughput. We also observe that an increase in the AP buffer size has little impact on the aggregate throughputs seen by the uplink and downlink transfers. Figures 5 and 6 show the evolution of the TCP congestion windows for the uplink and downlink transfers of these 10 STAs, for an input AP buffer size of 100 packets (150 KBytes). It is seen from these plots that the TCP congestion windows of the uplink transfers grow to large values; the AP therefore more often serves packets of these connections, which consequently obtain a very high throughput compared to the downlink transfers, whose congestion windows stay at much smaller values. In a similar experiment reported in [5], the throughputs of upload and download transfers become comparable at an AP buffer size of 150 packets, without TCP window scaling. For confirmation we conducted a similar experiment on our hybrid testbed, with window scaling turned off, where we observed that the aggregate upload and

download throughputs become roughly equal at an AP buffer size of 230 packets, as shown in Figure 7. It is important to note, however, that most Linux kernels and Microsoft Windows systems implement TCP window scaling in order to efficiently fill a pipe with a large bandwidth-delay product.

C. Unfairness Among Multiple Uploads

We have also found that when there are several upload transfers, a newly initiated TCP flow can lose the ACK packets for its first few data transmissions, thus inducing a timeout. The ACK packets for the retransmitted data packets can also be lost, leading to further timeouts (with the associated doubling of the retransmission timer), and so the flow can be completely starved for long periods. We conducted an experiment with 10 STAs, all associated at the 11 Mbps rate with the AP, on the hybrid testbed. Upload transfers are initiated simultaneously from all the STAs, and throughput observations are made for 300 seconds. The bar plot in Figure 8 shows the average throughputs of the 10 upload connections. Only some of the STAs (6 to 10) obtain considerable bulk transfer throughputs, whereas some STAs (3, 4 and 5) obtain much smaller throughputs. Thus, even among upload transfers there can be random unfairness, depending on loss events and the consequent disparities in TCP window evolution.

IV. WM: ALGORITHMS AND IMPLEMENTATION

Fig. 9. WM: High level software architecture. When the positions of the two "switches" are as shown, WM is in the path of the traffic flow; when both positions are reversed, WM is bypassed.

A. The WM Approach [1]

We have implemented WM on a Linux based PC. The device is located (see Figure 2) so that all packets that pass through the AP(s) also pass through the device. For TCP-controlled traffic, all packets (data or acknowledgments) going towards the WLAN are queued in a hierarchical WFQ queueing engine in WM (see Figure 9). These packets are then served by a virtual server that serves the queues using a fair queueing algorithm (see Figure 10). A key innovation here is that the rate of the virtual server is dynamically adjusted by a rate adaptation algorithm. The service rate is meant to track an effective service rate of the wireless medium, which depends on the number of STAs connected at each PHY rate and on the fairness objective. Details are provided in this section. The basic ideas behind WM [1] are the following:

1) If packets are allowed to accumulate in the AP and the STAs, then the usual behavior of IEEE 802.11 DCF shows up; see Figure 1. WM aims to retain the AP queue within itself, and to manage the release of packets in such a way that the user configured throughput "fairness" objectives are met. Upload TCP connections are controlled by managing the downlink TCP ACKs in separate queues in WM.

Fig. 10. WM: The core queueing and rate adaptation schematic. The left panel shows the hierarchical queue. The right panel shows packet classification, admission control, and virtual server with an adaptive service rate.

2) WM implements a hierarchical fair queuing engine with an adaptive rate virtual server. The choice of the virtual server's service rate is crucial to the proper working of WM. If the service rate is too high, packets accumulate in the AP rather than in WM, and WM loses the ability to enforce a desired service ratio between the packets of the various connections. On the other hand, if the WM service rate is too low, then the WLAN "starves" and the system is inefficient. We have developed an on-line rate adaptation algorithm that dynamically adapts the service rate, even as the number of STAs associated at each rate varies.

3) WM schedules the packets such that there are no packet drops at the AP. Further, any desired throughput fairness (e.g., proportional fairness, or max-min fairness) between uplink and downlink data transfers can be achieved.

4) WM is able to handle connection admission control of CBR packet voice while providing the desired fairness among TCP connections. This is achieved by exploiting the service differentiation (between voice and TCP) provided by IEEE 802.11e, along with the adaptive rate virtual server described above.

B. WM Software Architecture

A schematic of the high level WM software architecture is depicted in Figure 9. In the packet capture module we capture the incoming traffic from the wired link towards the AP. The idea is to buffer only the packets coming from the wired link, going towards the WLAN, in order to achieve the desired fairness in both directions. Note that this is possible for TCP upload connections since the flow of ACKs in the downlink direction can be controlled, thereby controlling the TCP throughput.

14

"54M x 4, 24M x 4, 6M x 4" "54M x 3, 24M x 3, 6M x 3" "54M x 1, 24M x 2, 6M x 3"

12

Throughput in Mbps --->

The packet classifier module places an arriving packet into one of the physical queues at a leaf node of the hierarchy shown in the left panel of Figure 10. The hierarchy is created based on configurable policies that define the allocation of time on the wireless channel to various services, such as http, ftp, and smtp, and to various users or groups of users. The information in the IP and TCP headers is used to form a 4-tuple (source address, destination address, source port, destination port) that identifies the direction and the leaf node in the hierarchy; a minimal classifier along these lines is sketched below. Voice packets are given strict priority, and appropriate DSCP marking is done in WM before forwarding to the AP; these marked packets get mapped to the voice access category, AC3, within an 802.11e AP. WM therefore takes advantage of the QoS offered by 802.11e. The SNMP module periodically polls the APs to obtain information on their associated STAs; how this information is used is explained later in Section IV-B2. The SMCD module captures and stores various statistics of the traffic passing through WM. The WM Core is where all the QoS algorithms are implemented; below we describe its major building blocks (see Figure 10).
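As a concrete illustration of the classification step, the sketch below maps the 4-tuple to a (service, direction, flow) path in the hierarchy. The port-to-service table and the WLAN subnet test are purely our own illustrative assumptions; the actual WM policy configuration is richer than this.

```python
from typing import NamedTuple

class FlowKey(NamedTuple):
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int

# Illustrative policy only: well-known ports identify the service class.
SERVICE_BY_PORT = {80: "http", 21: "ftp", 25: "smtp"}

def classify(key: FlowKey, wlan_prefix: str = "10.0.0.") -> tuple:
    """Return the path to a leaf queue: (service, direction, flow key).
    A packet whose destination lies in the WLAN subnet is headed downlink."""
    service = SERVICE_BY_PORT.get(key.dst_port,
              SERVICE_BY_PORT.get(key.src_port, "other"))
    direction = "downlink" if key.dst_addr.startswith(wlan_prefix) else "uplink"
    return (service, direction, key)
```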


Fig. 11. Aggregate throughput for bulk file downloads, with WM, for various numbers of STAs associated at 54, 24 and 6 Mbps (IEEE 802.11a), plotted vs. the WM service rate C.

1) Optimal service rate in WM: The methodology for achieving the desired throughput fairness requires the WM scheduler to serve the packets at an optimal service rate; recall the discussion in Section IV-A. We need to obtain the optimal WM service rate that permits the queueing to take place in WM, so that WM can control the throughput ratios, while keeping the WLAN maximally utilized. Here is how we can think of the optimal service rate. Suppose there are m STAs associated with the AP, at the PHY rates r_i, 1 ≤ i ≤ m. Each is carrying out a large file download from the server on the Ethernet LAN. Let us imagine m queues in WM, one for each connection. The queues contain all the application level bits that need to be transmitted on the wireless medium for that connection. Thus, we imagine that for each downlink data packet even its uplink TCP ACK is queued in WM; and for each downlink ACK packet, the corresponding TCP data packets that will be generated are also queued in WM. Suppose that ideal (bit level) fair queueing is used to serve these queues using the weights (φ_1, φ_2, ..., φ_m), with Σ_{j=1}^m φ_j = 1. Suppose all the queues are backlogged. Then, out of b bits (where b is a large number) sent by WM, φ_i b bits belong to connection i. Let C_i be the effective rate at which these bits are actually transmitted over the wireless medium, where we now account for the MAC and PHY overheads, and the various interframe spaces. Then the time occupied on the medium by the bits from connection i is φ_i b / C_i. The total time taken to transmit all the b bits from the m connections is then Σ_{i=1}^m φ_i b / C_i. Dividing b by this expression yields the effective rate at which the medium will carry bits above the MAC layer, i.e.,

    C* = 1 / (φ_1/C_1 + φ_2/C_2 + · · · + φ_m/C_m)    (1)

Thus, C* depends on the weights φ_i, 1 ≤ i ≤ m, and on the PHY rates at which the STAs are connected. Now the idea of serving the queues of actual packets in WM is the following:

1) Virtually replace each packet in the WM queue by the number of higher layer bits (above MAC and PHY) that will need to be sent on the medium if the packet is released into the WLAN (details are provided later in this section).

2) Adapt the service rate of these queues (of virtual bits) so that the rate is a little less than C*. Adaptation of C* is needed since the population of active STAs, and the PHY rates at which they are associated, will keep varying over time.

As an example, if m_j STAs are associated at rate r_j, 1 ≤ j ≤ n, then we have n queues in WM. Consider the following weights, for 1 ≤ i ≤ n,

    φ_i = m_i C_i / (Σ_{j=1}^n m_j C_j)    (2)

i.e., proportional fairness or time fairness (i.e., the target throughputs are proportional to the PHY rates of the connections); then

    C* = (Σ_{i=1}^n m_i C_i) / (Σ_{j=1}^n m_j)    (3)

The above discussion is illustrated by the hybrid experiment results shown in Figure 11, where the scenario is that n_1, n_2 and n_3 STAs are associated, respectively, at the PHY rates of 54 Mbps, 24 Mbps, and 6 Mbps. In WM there is one queue for each rate class. A small worked check of Equations (1)–(3), using numbers from Section V-A, is sketched below.
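This minimal check uses the contention-free per-class TCP rates C_i quoted in Section V-A for the 11, 5.5 and 2 Mbps rate classes, with two STAs in each class:

```python
def c_star(phi, C):
    """Eq. (1): effective service rate for weights phi_i and effective rates C_i."""
    return 1.0 / sum(p / c for p, c in zip(phi, C))

def time_fair_weights(m, C):
    """Eq. (2): m_i STAs in rate class i, each class with effective rate C_i."""
    total = sum(mi * ci for mi, ci in zip(m, C))
    return [mi * ci / total for mi, ci in zip(m, C)]

m = [2, 2, 2]                   # STAs per rate class
C = [4.89e6, 3.33e6, 1.58e6]    # effective rates (bps), from Section V-A
phi = time_fair_weights(m, C)
print(round(c_star(phi, C) / 1e6, 2))  # 3.27, matching C* = 3.26 Mbps via Eq. (3)
```

With the time-fair weights of Equation (2), Equation (1) reduces algebraically to Equation (3), which is what the check confirms numerically.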

Fig. 12. WM service rate adaptation for various combinations of STAs associated at different PHY rates. The curves show the analysed throughput, the aggregate throughput with WM, and the aggregate throughput without WM.

The service weights are given by the expression in Equation (2). The WM service rate C is then varied, and the aggregate download throughput is plotted vs. C. There are three plots, corresponding to n_1 = n_2 = n_3 = 4; n_1 = n_2 = n_3 = 3; and n_1 = 1, n_2 = 2, n_3 = 3. The calculated values of C*, from Equation (3) (using the parameter values of IEEE 802.11a), are 14.14 Mbps, 14.14 Mbps, and 11.00 Mbps, respectively. We notice that as C increases from a small value the aggregate throughput equals C, showing that WM is the bottleneck and the wireless medium is underutilized. However, when C approaches C*, the aggregate throughput drops sharply, and it stabilises at a value significantly smaller than C* as C increases further: the queues move into the AP and the default behaviour takes over.

2) WM service rate adaptation: Above we arrived at the optimal service rate C* analytically, for a given set of weights φ_i, 1 ≤ i ≤ m. The experimental result in Figure 11 illustrates the necessity of adapting the WM service rate as the number of STAs associated at each rate changes over time. There is another important reason why the WM service rate needs to be adaptively learnt, and cannot simply be computed from Equation (1). The rates C_i are achieved only if there are no wireless channel losses. When there are losses, because the SINR for connection i is not the best possible for the PHY rate at which the STA is associated, more packets are sent on the medium than are accounted for by the virtual bits in the WM queues. Such losses result in a smaller value of C_i, which will show up if the WM service rate is adapted based on on-line measurements. Based on the insights gained from Figure 11 we have implemented a simple rate adaptation algorithm. This algorithm requires information about the rates at which the STAs are associated with the AP, and this information is obtained by WM via SNMP polls to the AP. One possible shape of such an adaptation loop is sketched below.
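The paper does not spell out the adaptation rule, so the following is only one plausible shape of the loop, under our own assumptions: on each SNMP poll the target rate is recomputed from Equation (3) with a safety factor, and the operating rate is smoothed towards it; a measurement-based correction for channel losses would hook in where indicated.

```python
BETA = 0.95    # assumed safety margin: operate a little below C*
ALPHA = 0.2    # assumed EWMA smoothing factor

def target_rate_bps(stas_per_class):
    """Eq. (3): stas_per_class maps the effective class rate C_j (bps) to m_j,
    the number of STAs associated in that class, as learnt from an SNMP poll."""
    num = sum(m * c for c, m in stas_per_class.items())
    den = sum(stas_per_class.values())
    return BETA * num / den if den else 0.0

class RateAdapter:
    def __init__(self):
        self.c_hat = 0.0   # current WM service rate, the Ĉ* of Section IV-B2

    def on_snmp_poll(self, stas_per_class, measured_drain_bps=None):
        target = target_rate_bps(stas_per_class)
        if measured_drain_bps is not None:
            # Hook for the loss correction: never exceed what the WLAN is
            # actually observed to carry (the paper does not give details).
            target = min(target, measured_drain_bps)
        self.c_hat = target if self.c_hat == 0.0 else \
            (1 - ALPHA) * self.c_hat + ALPHA * target
        return self.c_hat
```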

To illustrate the performance of our approach we provide a sample result from the hybrid testbed; see Figure 12. The setup is the same as that for the "open-loop" experiment conducted for Figure 11. Now the number of STAs in each rate class is changed at intervals of 20 minutes, in the following order:

• 3 STAs in each rate class for the first 20 minutes,
• 1, 2, and 3 STAs in the 54 Mbps, 24 Mbps and 6 Mbps rate classes, respectively, for the next 20 minutes, and
• 4 STAs in each rate class for the last 20 minutes.

In Figure 12 we show the values of C* for each interval, and the service rate (say, Ĉ*) to which WM adapts. We also show the aggregate throughput without WM; this is the bottom plot in the figure. We see that with WM the aggregate throughput tracks (but falls a little short of) the computed value of C*. When C* changes, the adaptation takes 50 to 60 seconds.

3) STFQ based packet scheduler: The WM queues carry TCP DATA packets for download connections and TCP ACK packets for upload connections; separate queues are maintained for DATA and ACK packets. The WM packet scheduler releases these packets into the WLAN. A TCP ACK packet transmitted by the AP results in a certain number of uplink DATA packets towards the AP. One issue we need to address when trying to use an existing fair queueing scheduler, meant for a wired full-duplex link, is the half-duplex nature of the shared wireless link. The WM packet scheduler therefore has to release DATA and ACK packets in such a way that it accounts for the time required on the medium by the packets that will be triggered by the reception of these packets at the STAs. When an ACK packet corresponding to uplink traffic is served, in the TCP steady state this results in two uplink DATA packets at the STA, because of TCP's delayed ACK mechanism. The scheduler replaces each packet queued in the WM buffer with a number of "virtual bits" corresponding to the number of bits that will actually be carried by the WLAN MAC and PHY as a result of the packet being released into the WLAN. Thus, for each ACK, the scheduler assumes a number of virtual bits equal to an ACK and two DATA packets; for a DATA packet, the scheduler assumes a number of virtual bits equal to a DATA packet and half an ACK (to roughly account for delayed ACKs). Packet scheduling is performed using the Start Time Fair Queuing (STFQ) [19] scheduling policy, by tagging each packet with suitable start and finish numbers, and scheduling the packet transmissions accordingly. Consider the arrival of a packet k+1, of virtual length l^(j)_{k+1} (see above), into queue j, at arrival instant a^(j)_{k+1}. Let F^(j)_k denote the finish number of packet k in queue j.

V(t) denotes the (global) virtual time at time t. The start number S^(j)_{k+1} is computed as

    S^(j)_{k+1} = max{ F^(j)_k, V(a^(j)_{k+1}) }    (4)

Then

    F^(j)_{k+1} = S^(j)_{k+1} + l^(j)_{k+1} / φ_j    (5)

where F^(j)_0 = 0. In STFQ, instead of computing the virtual time from a simulation of the corresponding GPS system, the following approximations are made. The virtual time is initialised to 0, and increases in jumps as follows. When a packet arrives (say, at time t) and there is a packet in service, then STFQ approximates V(t) by the start number of the packet in service. If the packet arrives, at time t, to an idle system, then the value of V(t) is taken to be the finish number of the last packet of the previous busy period. See [19] for details. Packets are released into the WLAN in the order of their start numbers; this is because the actual server is the wireless medium itself. After releasing a packet into the WLAN from queue i, the WM scheduler allows time for the virtual bits corresponding to the packet to be served at the current WM service rate Ĉ*. A compact sketch of this tagging and pacing logic follows.
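The sketch below puts Equations (4) and (5), the virtual-bit replacement, and the release pacing together. The packet sizes are assumptions (a 1500 byte MSS as in Section III-B, and a nominal 40 byte ACK), and the busy-period bookkeeping is simplified:

```python
import heapq

DATA_BITS = 1500 * 8
ACK_BITS = 40 * 8   # assumed TCP ACK size; illustrative only

def virtual_bits(is_ack: bool) -> float:
    # An ACK released to the WLAN triggers ~2 uplink DATA packets; a DATA
    # packet triggers roughly half a (delayed) ACK (Section IV-B3).
    return ACK_BITS + 2 * DATA_BITS if is_ack else DATA_BITS + ACK_BITS / 2

class STFQScheduler:
    def __init__(self, weights):            # weights: queue id -> phi_j
        self.phi = dict(weights)
        self.F = {j: 0.0 for j in self.phi} # per-queue finish numbers, F_0 = 0
        self.V = 0.0                        # global virtual time
        self.in_service_start = None        # start number of packet in service
        self.heap, self.seq = [], 0         # backlog ordered by start number

    def on_arrival(self, j, is_ack):
        # STFQ virtual-time approximation: V(t) is the start number of the
        # packet in service; at an idle system, self.V already holds the
        # finish number of the last packet of the previous busy period.
        if self.in_service_start is not None:
            self.V = self.in_service_start
        l = virtual_bits(is_ack)
        S = max(self.F[j], self.V)          # Eq. (4)
        self.F[j] = S + l / self.phi[j]     # Eq. (5)
        heapq.heappush(self.heap, (S, self.seq, j, l))
        self.seq += 1

    def release_next(self, c_hat_bps):
        """Release the packet with the smallest start number and return
        (queue, hold_seconds): WM waits l / C_hat before the next release."""
        S, _, j, l = heapq.heappop(self.heap)
        self.in_service_start = S
        if not self.heap:                   # busy period ends (simplified)
            self.V, self.in_service_start = self.F[j], None
        return j, l / c_hat_bps
```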

V. EXPERIMENTAL RESULTS

A. Results from the Hybrid Testbed

Recall Figure 3, which showed a schematic of the hybrid testbed. The wireless PHY and MAC are simulated in QualNet. The WAN between the web servers and WM is emulated by a computer imposing a propagation delay of 300 ms. The AP buffer is set to the default size of 100 packets. The wireless STAs and the web server run the Linux OS, with the default kernel settings; thus the TCP window scaling option is enabled, and delayed acknowledgments are on. Bulk file transfers (uploads and downloads) are started between the STAs and the web server. WM is configured to provide equal aggregate throughputs to uploads and downloads. In the experiments described below, initially WM is bypassed (cases labeled "Without WM"), and later WM is introduced (cases labeled "With WM").

Fig. 13. Hybrid testbed: Uplink and downlink throughputs vs. time for 6 STAs associated at the 11, 5.5 and 2 Mbps PHY rates, with and without WM.

1) STAs associated at multiple PHY rates: We consider 6 wireless STAs, with half of them downloading files from the web server and the other half uploading files. The 6 STAs are grouped into three sets of 2 STAs each, associated at 11 Mbps, 5.5 Mbps and 2 Mbps, respectively. The plot in Figure 13 shows the throughputs. During the initial 300 seconds WM is bypassed. We notice that the download transfers obtain very small throughputs (the cluster of 4 plots near the bottom, including the solid plot for the aggregate throughput); recall the discussion in Section III-B. The upload transfers obtain equal throughputs of 0.8–0.9 Mbps (the three jagged

plots in the middle), with an aggregate throughput of about 2.6 Mbps (top-most plot). After WM is introduced, the upload and download throughputs for each PHY rate become equal; the aggregate upload and download throughputs become about 1.578 Mbps each, totaling 3.156 Mbps, which is just less than the C* = 3.26 Mbps obtained from the parameters of this experiment. Further, proportional service differentiation leads to the 11 Mbps, 5.5 Mbps, and 2 Mbps transfers obtaining aggregate throughputs of, respectively, 2 × 0.778 Mbps = 1.556 Mbps, 2 × 0.545 Mbps = 1.09 Mbps, and 2 × 0.255 Mbps = 0.51 Mbps, which are roughly in the ratio of C_1 = 4.89 Mbps, C_2 = 3.33 Mbps, and C_3 = 1.58 Mbps, the ideal contention free TCP throughputs at 11, 5.5 and 2 Mbps. Thus, WM provides a higher aggregate throughput for the network, while providing upload–download fairness and proportional throughput differentiation across the rate classes.

2) STAs associated at a single PHY rate: In the setup shown in Figure 3, 10 STAs are now associated with the AP at the 11 Mbps PHY rate, keeping the rest of the setup the same. The following experiments are concerned with fairness between upload and download TCP transfers, and among upload TCP transfers alone.

Uploads and Downloads: In Figure 14 we have plotted the throughputs of 10 TCP transfers of large files to and from the web server, one from each of 10 STAs, half of which perform uploads and the other half downloads. We observe from the "Without WM" segment that the download throughputs are very small; these are the plots near the bottom, before 300 sec. On the other hand, the aggregate upload throughput is about 4.8 Mbps. WM is introduced at 300 sec, and, after a transient period, the aggregate upload and download throughputs become equal (about 2.4 Mbps each), and so do the individual throughputs of the 10 file transfers

Fig. 14. Hybrid testbed: Throughputs of 10 STAs, 5 of which are performing uploads and 5 performing downloads. The STAs are associated at 11 Mbps with an AP. Results with and without WM are shown.

Fig. 15. Hybrid testbed: Upload throughputs obtained by 10 STAs associated at 11 Mbps, with and without WM.

Fig. 16. Physical AP: Upload and download throughputs for 2 STAs associated at 54 Mbps (IEEE 802.11g), with and without WM.

Fig. 17. Physical AP: Upload throughputs for 2 STAs associated at 54 Mbps (IEEE 802.11g), with and without WM.

(about 0.48 Mbps each).

Uploads only: There are 10 upload file transfers from the 10 STAs. Figure 15 shows the results from the hybrid testbed. Before WM is introduced there is considerable unfairness between the upload throughputs; recall the discussion in Section III-C. With WM, on the other hand, the upload throughputs become exactly equal.

B. Results from a Testbed with a Physical AP

In this testbed an IEEE 802.11g Cisco Aironet AP is used, with which two Linux based laptops (STA1 and STA2) are associated at 54 Mbps. The rest of the experimental testbed and the parameters are the same as in Section V-A. Delayed ACKs were enabled in the TCP receivers; RTS/CTS was used for data packets, whereas Basic Access was used for TCP ACKs. With this, the maximum possible throughput on the wireless medium (with the 54 Mbps PHY rate) can be calculated to be 23.14 Mbps; the throughputs actually achieved will, of course, be less.

Upload and Download: See Figure 16. A large file download from the wireline web server was initiated at STA1, and was allowed to run for 200 seconds; we observe a throughput of about 18.56 Mbps. Then an upload was started from STA2; the upload throughput was 21.2 Mbps, whereas the download throughput fell to 0.34 Mbps. An upload transfer obtains a slightly higher throughput than a download alone because some of the TCP ACKs are lost at the AP, and hence fewer bits are carried on the medium per data packet. Once WM is turned on, at 400 seconds, the throughputs equalise, giving a total throughput of about 19 Mbps.

Two Uploads: This experiment demonstrates unfairness among uploads, and WM's ability to enforce fairness. In Figure 17, in the "Without WM" period only the upload from STA1 obtains throughput (about 21.5 Mbps), whereas the upload from STA2 is starved completely. Once WM is introduced, both uploads obtain equal throughputs, with the aggregate being 19 Mbps.

C. Policy Based Service Differentiation

Suppose there is a requirement to give privileged access to an STA. For example, consider two 11 Mbps stations, STA1 and STA2, each downloading a large file. STA1 is a privileged user and requires some guaranteed fraction of the total throughput. WM provides for such configurable policies. In Table I we present two cases: in the first row WM is configured to give STA1 five times the throughput of STA2; in the second row STA1 gets nine times the throughput of STA2. In WM a queue is created for each STA, and appropriate weights (recall the φ_i values from Section IV-B1) are assigned. We see from Table I that WM provides the required service differentiation, and the total throughput in each case is the same: 4.655 Mbps. A quick consistency check of these numbers is sketched below.

TABLE I. SERVICE DIFFERENTIATION USING WM.

    Configured Ratio | STA1 Thpt (Mbps) | STA2 Thpt (Mbps) | Total Thpt (Mbps)
    5:1              | 3.879            | 0.776            | 4.655
    9:1              | 4.190            | 0.465            | 4.655
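A one-line consistency check of Table I, assuming WM simply splits the (measured) total throughput in proportion to the configured weights:

```python
def split(total_mbps, w1, w2):
    """Per-STA throughputs when the total is shared in the ratio w1:w2."""
    return (total_mbps * w1 / (w1 + w2), total_mbps * w2 / (w1 + w2))

print(split(4.655, 5, 1))   # ~(3.879, 0.776), the first row of Table I
print(split(4.655, 9, 1))   # ~(4.190, 0.465), the second row
```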

VI. CONCLUSION

We had proposed WM [1], a centralised WLAN controller that can manage certain QoS issues in an infrastructure WLAN without any modification to the APs or the clients; WM can be viewed as implementing a "Split-MAC" architecture. In this paper we have reported our experiences in implementing some of the WM proposals on a Linux platform, and the performance results we have obtained. All the results reported here were obtained from testbeds in which an actual physical WM was in place, carrying TCP traffic between a server and clients. We have shown that several TCP QoS problems can be effectively solved by WM: the throughput inefficiency when STAs are associated at a mixture of high and low PHY rates, upload–download unfairness due to losses at the AP buffer, and unfairness between upload connections. WM can also be used to create desired service differentiation between STAs. Another experience we must report is that, in the course of evolving and experimenting with the WM idea, the insights gained from analytical models (such as [3] and [4], to list only a couple of many papers) were invaluable in (i) understanding, explaining and predicting the experimental results, and (ii) guiding the design of the algorithmic techniques. In ongoing work we have also shown the efficacy of WM in managing web-like short TCP controlled transfers. We have implemented voice admission control in WM, and we are currently experimenting with a WME (WiFi Multimedia Extension) enabled AP so that TCP

transfer performance in the presence of VoIP telephony can be studied.

REFERENCES

[1] M. Hegde, S. V. R. Anand, A. Kumar, and J. Kuri, "WLAN manager (WM): a device for performance management of a WLAN," International Journal of Network Management, vol. 17, pp. 155–170, January 2007.
[2] G. Berger-Sabbatel, F. Rousseau, M. Heusse, and A. Duda, "Performance anomaly of 802.11b," in Proceedings of IEEE Infocom 2003, IEEE, 2003.
[3] R. Bruno, M. Conti, and E. Gregori, "An accurate closed-form formula for the throughput of long-lived TCP connections in IEEE 802.11 WLANs," Computer Networks, vol. 52, pp. 199–212, 2008.
[4] G. Kuriakose, S. Harsha, A. Kumar, and V. Sharma, "Analytical models for capacity estimation of IEEE 802.11 WLANs using DCF for internet applications," Wireless Networks, to appear.
[5] Q. Wu, M. Gong, and C. Williamson, "TCP fairness issues in IEEE 802.11 wireless LANs," Computer Communications, to appear.
[6] S. Harsha, A. Kumar, and V. Sharma, "An analytical model for the capacity estimation of combined VoIP and TCP file transfers over EDCA in an IEEE 802.11e WLAN," in Proceedings of the International Workshop on Quality of Service (IWQoS), 2006.
[7] L. Yang, P. Zerfos, and E. Sadot, "Architecture taxonomy for control and provisioning of wireless access points (CAPWAP)," RFC 4118, June 2005.
[8] K. Cai, R. Lotun, M. Blackstock, J. Wang, M. Feeley, and C. Krasic, "A wired router can eliminate 802.11 unfairness, but it's hard," in ACM HotMobile 2008: The Ninth Workshop on Mobile Computing Systems and Applications, 2008.
[9] Q. Wu, M. Gong, and C. Williamson, "TCP fairness issues in IEEE 802.11 wireless LANs," Computer Communications, pp. 2150–2161, 2008.
[10] F. Vacirca and F. Cuomo, "Experimental results on the support of TCP over 802.11b: an insight into fairness issues," 2007.
[11] N. Blefari-Melazzi, A. Detti, A. Ordine, and S. Salsano, "Controlling TCP fairness in WLAN access networks: problem analysis and solutions based on rate control," IEEE Transactions on Wireless Communications, vol. 6, pp. 1346–1355, April 2007.
[12] S. Pilosof, R. Ramjee, D. Raz, Y. Shavitt, and P. Sinha, "Understanding TCP fairness over wireless LANs," in INFOCOM 2003: Twenty-Second Annual Joint Conference of the IEEE Computer and Communications Societies, vol. 2, pp. 863–872, March 2003.
[13] A. Detti, E. Graziosi, V. Minichiello, S. Salsano, and V. Sangregorio, "TCP fairness issues in IEEE 802.11 based access networks."
[14] J. F. Lee, W. Liao, and M. C. Chen, "Proportional fairness for QoS enhancement in IEEE 802.11e WLANs," pp. 503–504, November 2005.
[15] S.-H. Yoo, J.-H. Choi, J.-H. Hwang, and C. Yoo, "Eliminating the performance anomaly of 802.11b," Springer: Berlin, 2005.
[16] H. Kim, S. Yun, I. Kang, and S. Bahk, "Resolving 802.11 performance anomalies through QoS differentiation," IEEE Communications Letters, pp. 655–657, 2005.
[17] S. W. Kim, B.-S. Kim, and Y. Fang, "Downlink and uplink resource allocation in IEEE 802.11 wireless LANs," IEEE Transactions on Vehicular Technology, vol. 54, pp. 320–327, January 2005.
[18] D. J. Leith and P. Clifford, "Using the 802.11e EDCF to achieve TCP upload fairness over WLAN links," pp. 109–118, 2005.
[19] P. Goyal, H. M. Vin, and H. Cheng, "Start-time fair queuing: a scheduling algorithm for integrated services packet switching networks," Technical Report CS-TR-96-02, January 1996.