Proceedings of the IEEE International Performance, Computing and Communications Conference, Phoenix, Arizona, USA, April 4–6, 2001

Fair, Efficient and Scalable Scheduling Without Per-Flow State∗

Salil S. Kanhere and Harish Sethu
Department of ECE, Drexel University
3141 Chestnut Street, Philadelphia, PA 19104-2875
E-mail: {salil, sethu}@ece.drexel.edu

Abstract

In recent years, parallel computer systems are being increasingly used in multi-user environments, with several users sharing the interconnection network at the same time. As a result, a large number of flows frequently contend for link bandwidth at the core switches in the network. Traditional fair scheduling disciplines need to maintain per-flow state and perform packet scheduling on a per-flow basis, which increases the complexity of implementation at high speeds for large numbers of flows. In this paper we present an efficient, fair, simple and scalable solution which requires the schedulers in the entire network to maintain only a per-link state as opposed to a per-flow state. In our scheme, the edge switches in the network employ a modified version of the Deficit Round Robin (DRR) scheduler, whereas the core switches use the proposed Aggregated Flow Fair Queueing (AFFQ) scheduling discipline. We prove that AFFQ maintains the same relative fairness bound as Deficit Round Robin and has a work complexity of O(1). Our scheme is also applicable in other networks such as the Internet, where the number of active flows in the core routers can be very large.

∗ This work was supported in part by NSF CAREER Award CCR-9984161 and U.S. Air Force Contract F30602-00-2-0501.

1 Introduction

Most of the switches used for designing interconnection networks of parallel systems do not provide any fairness guarantees beyond eliminating starvation of the flows. Fair resource allocation is essential for improving performance by eliminating bottlenecks and for isolation between the traffic generated by the different users. First-in First-out (FIFO) scheduling, which is popular in current switches, schedules packets in the order of their arrival times. The disadvantage of this scheme is that a rogue source sending packets at a rate higher than its fair share can grab an arbitrary share of the bandwidth. An alternate technique is Packet-based Round Robin (PBRR), wherein the scheduler transmits one packet from each flow contending for the outgoing link in a round-robin fashion. The PBRR

scheduler, however, is not fair, since flows transmitting longer packets use up an unfairly high fraction of the available transmission bandwidth. Efficient scheduling disciplines with an O(1) work complexity, such as Deficit Round Robin (DRR) [1] and Elastic Round Robin (ERR) [2], have recently been proposed to achieve fair bandwidth allocation. These disciplines maintain per-flow state, i.e., a state for each active flow through the system, and perform operations on a per-flow basis. For each arriving packet, the scheduler needs to identify the flow to which it belongs, enqueue the packet in the appropriate per-flow queue and update certain per-flow state variables maintained at the scheduler. In recent years, with the growth in the popularity of multi-user environments in parallel systems [3], an increasing number of users are competing for the shared resources in the interconnection network, resulting in a large number of active flows in the core switches of the network. Maintaining per-flow state and manipulating flow information for this large number of flows requires substantial computational overhead. A similar problem exists in the backbone routers in the Internet, where tens of thousands of sessions compete for a single link. A simpler and scalable approach is desirable given a large number of flows. Recently, two schemes, Core-Stateless Fair Queueing (CSFQ) [4] and Rainbow Fair Queueing (RFQ) [5], have been proposed to eliminate the problem of maintaining per-flow state in Internet routers. Both these architectures distinguish between the edge routers and the core routers in a network. While the edge routers do perform per-flow management, the core routers do not, and hence can be easily implemented at very high speeds. In the CSFQ architecture, the edge routers estimate the arrival rate for each flow using exponential averaging and label the packets with these rate estimates. The core routers periodically estimate the fair share rate of the output links based on measurements of the aggregate traffic arriving at the router.

These routers perform FIFO scheduling, where the forwarding probability for each packet is a function of the flow rate estimate carried in the packet labels and the fair share estimate maintained by the core router. Finally, the core scheduler has to relabel the forwarded packets to reflect the change in the flow's rate. In RFQ, each flow is divided into a set of layers based on the rates, with a color assigned to each layer. The labeling of the packets with their respective colors is performed at the edge schedulers. As compared to CSFQ, the packets now have to carry only the color labels and not explicit rate estimates. The core schedulers, as in CSFQ, use simple FIFO scheduling with a color threshold-based packet dropping mechanism. Note that the operation of the core scheduler is simplified as compared to that in CSFQ, since it no longer has to estimate the link fair share. However, both these schemes approach, but do not achieve, the same level of fairness as obtained by DRR. In this paper, we present Aggregated Flow Fair Queueing (AFFQ), a simpler approach which, in addition, maintains the same relative fairness bound as DRR with a work complexity of O(1). AFFQ, as in [6], requires the schedulers in the entire network to maintain only a per-link state as opposed to a per-flow state. In our scheme, the schedulers closest to the traffic sources employ a modified version of the DRR scheduling discipline. The DRR schedulers and the traffic sources label selected packets with some per-flow information. All other schedulers in the network then treat the traffic arriving at each link as an aggregated flow, and maintain a per-link state which is equal to the number of constituent flows. These schedulers implement AFFQ over the aggregated flows, with the number of constituent flows as the weight assigned to each link, and make use of the labels for fair allocation of the bandwidth. The labels used to carry the per-flow information are contained in the packet headers. Our scheme can also be applied to the backbone routers in the Internet, where tens of thousands of sessions compete for a single link. As in [5], we can make use of the Type of Service (TOS) field of the IPv4 header for the labels.
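To make the labeling concrete, one possible encoding of the per-flow labels introduced later in the paper (an SQ flag marking the start of a scheduled quantum, and on/off flags marking a source's active period) is as single-bit flags packed into one header byte such as the TOS field. The bit positions and helper names below are an assumption for illustration only; the paper fixes only that the labels travel in the packet header.

```python
# Hypothetical bit layout for the three AFFQ labels inside one header byte.
SQ_FLAG  = 0x01   # first packet of a scheduled quantum (set by the edge DRR scheduler)
ON_FLAG  = 0x02   # first packet of a source's active period (set by the source)
OFF_FLAG = 0x04   # last packet of a source's active period (set by the source)

def set_flags(header_byte: int, *flags: int) -> int:
    """Return the header byte with the given flag bits set."""
    for f in flags:
        header_byte |= f
    return header_byte

def has_flag(header_byte: int, flag: int) -> bool:
    """Check whether a flag bit is set in the header byte."""
    return bool(header_byte & flag)

# Example: the first packet a source emits in a new active period that also
# starts a scheduled quantum carries both the on and SQ labels.
label = set_flags(0, ON_FLAG, SQ_FLAG)
assert has_flag(label, ON_FLAG) and has_flag(label, SQ_FLAG)
```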

2 Deficit round robin

In this section we briefly describe the DRR scheduling algorithm, which is employed by the edge switches in our scheme, and state some of its important analytical results. A more detailed description of the scheduling discipline can be found in [1]. Consider an output link at a switch, access to which is controlled by the DRR scheduler. Let there be a total of n flows, each with an associated queue, all headed for the same output link. A flow is said to be active during a time interval if it always has packets awaiting service during this time interval. The DRR scheduler maintains a linked list of the active flows, called the ActiveList. At the start of the active period of a flow, the flow is added to the tail of this list.

A round consists of one round-robin iteration during which the DRR scheduler serves all the flows that are present in the ActiveList at the onset of the round. The DRR scheduler assigns a Quantum to each flow, defined as the service that the flow should receive during each round-robin service opportunity. Note that during a certain service opportunity, a flow may not be able to transmit a packet because doing so would cause the service received by the flow in that service opportunity to exceed its allocated quantum. In that case, the scheduler remembers the remainder of the quantum in the deficit count associated with that flow. This deficit count is added to the quantum in the subsequent round. Hence, a flow that does not receive its fair share of the bandwidth during a certain round is given an opportunity to receive proportionately more service in the next round. We refer to the actual service received by a flow i during a round-robin service opportunity as the Scheduled Quantum for the flow during that round. Note that the scheduled quantum for the same flow in different rounds of service may not be the same. Let Q_i represent the quantum assigned to flow i. Let SQ_i(s) represent the scheduled quantum for flow i during round s, i.e., the total service actually received by flow i during the s-th round-robin service opportunity. Also let DC_i(s) represent the deficit count for flow i after it has finished its service opportunity during the s-th round. The total bandwidth available to flow i during its service opportunity in the s-th round is the sum of its deficit count from the previous round and its quantum, i.e., DC_i(s − 1) + Q_i. The DRR scheduler will begin the transmission of the packet at the head of the queue associated with flow i if the sum of the packet length and the total service received by flow i so far in this opportunity is less than or equal to the sum of Q_i and DC_i(s − 1). If this condition is not satisfied, the scheduler stops the service of flow i and begins serving the next flow in the ActiveList. If flow i is still active, it is added to the tail of the ActiveList and its new deficit count, DC_i(s), is calculated as follows,

DC_i(s) = Q_i + DC_i(s − 1) − SQ_i(s)    (1)
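For example (with illustrative numbers), suppose Q_i = 128 bytes, DC_i(s − 1) = 20 bytes, and flow i has packets of 100 and 60 bytes at the head of its queue. The available budget is 148 bytes, so the scheduler transmits the 100-byte packet (100 ≤ 148) but stops before the 60-byte packet (100 + 60 = 160 > 148); hence SQ_i(s) = 100 and, by Equation (1), DC_i(s) = 128 + 20 − 100 = 48 bytes, which is added to the quantum in round s + 1.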

In our scheme, when an edge DRR scheduler commences the service opportunity of any flow, it sets a flag, SQ, in the packet header of the first packet scheduled from that flow. The SQ flag allows the core schedulers to distinguish between the scheduled quanta from different flows.

Definition 1. Define m as the size in bits of the largest packet that is actually served during the execution of a scheduling algorithm.

Definition 2. Define M as the size in bits of the largest packet that may potentially arrive during the execution of a scheduling algorithm. Note that M ≥ m.
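To make the edge behavior concrete, the following is a minimal Python sketch of one service opportunity of the modified edge DRR scheduler: the flow is served up to its quantum plus carried-over deficit, Equation (1) determines the new deficit count, and the first packet of the opportunity is tagged with the SQ flag. The dictionary-based packet representation and function names are assumptions made for illustration, quantities are in bytes, and the deficit is reset when the flow's queue empties, as in standard DRR [1].

```python
from collections import deque

M = 128  # largest packet size that may arrive, in bytes (illustrative value)

def drr_serve_one_opportunity(queue, deficit, quantum=M):
    """Serve one round-robin service opportunity for a single flow.

    Returns (packets sent, new deficit count).  The deficit update follows
    Equation (1); the deficit is reset to zero if the flow's queue empties.
    """
    budget = quantum + deficit            # Q_i + DC_i(s-1)
    sent_bytes = 0
    sent_packets = []
    first = True
    while queue and sent_bytes + queue[0]['length'] <= budget:
        pkt = queue.popleft()
        if first:
            pkt['SQ'] = True              # tag the start of this scheduled quantum
            first = False
        sent_packets.append(pkt)
        sent_bytes += pkt['length']
    new_deficit = budget - sent_bytes if queue else 0   # Equation (1)
    return sent_packets, new_deficit

# Example: a flow with a 100-byte and a 60-byte packet and no prior deficit.
q = deque([{'length': 100}, {'length': 60}])
sent, dc = drr_serve_one_opportunity(q, deficit=0)
# sent contains only the 100-byte packet (with SQ set); dc == 28
```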

Figure 1: Network Model. (Edge scheduler A, fed by S_A traffic sources, and edge scheduler B, fed by S_B traffic sources, forward packets over links L_A and L_B respectively to the core scheduler C.)

Consider k consecutive rounds in an execution of the DRR scheduler during which flow i is continuously active. It has been proved in [1] that the total service received by flow i during these k round-robin service opportunities, N, is bounded by

kQ_i − (m − 1) ≤ N ≤ kQ_i + (m − 1)    (2)

In order for the work complexity of the DRR scheduler to be O(1), the quantum assigned to each flow i, Q_i, should be greater than or equal to M [1].
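For example, with M = 128 bytes and Q_i = 128 bytes for every flow, the budget Q_i + DC_i(s − 1) available at each service opportunity is at least as large as any packet that can arrive, so at least one packet is transmitted per visit to a backlogged flow and the scheduling work per transmitted packet remains constant.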

3 Aggregated flow fair queueing

This section describes Aggregated Flow Fair Queueing (AFFQ) using the following abstraction of the network model. Our model consists of edge schedulers, which are closest to the traffic sources, and core schedulers, which are all other schedulers in the network. As shown in Figure 1, consider an edge scheduler A connected to S_A traffic sources. The packets from these flows are queued at scheduler A. Assume that all these packets are headed for output link L_A. Similarly, consider another edge scheduler B, connected to S_B traffic sources with flows headed for output link L_B. A core scheduler C, which receives packets from links L_A and L_B, maintains only per-link queues as opposed to per-flow queues. Assume that all packets at scheduler C are competing for access to the same output link. In our scheme, the edge schedulers A and B employ the DRR scheduling discipline described in the previous section to schedule the packets arriving for each output link. All edge schedulers use identical DRR algorithms with the same assumed value of M. The core scheduler C employs the AFFQ discipline at each outgoing link to schedule packets arriving from A and B.

Typically, over its lifetime, a traffic source oscillates between a period of activity, during which it generates a stream of packets for the flow, and a period of inactivity, during which no packets are generated. We consider the traffic source to be on during the active period and off otherwise. We assume that a traffic source has the capability to announce the start and the end of its active periods. To do this, the source sets a certain on flag in the packet header of the first packet it creates during the active period. Similarly, when the traffic source is about to become inactive, i.e., no further packets will be generated by the source until the next active period, a similar off flag is set in the packet header of the last packet generated during that active period. These on and off flags allow the core AFFQ schedulers to determine the number of constituent flows present in the aggregated packet stream arriving at each input link.
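As a small illustration of this convention, the sketch below (with an assumed dictionary-based packet representation) shows a source marking the first and last packets of an active period; how the source detects the end of its own active period is outside the scope of the paper and is simply modeled here as the end of a known burst.

```python
def label_active_period(packets):
    """Set the on/off flags on the packets of one active period, in place."""
    if not packets:
        return packets
    packets[0]['on'] = True      # first packet announces the start of the active period
    packets[-1]['off'] = True    # last packet announces that the source goes idle
    return packets

# Example: a three-packet burst generated during one active period.
burst = [{'length': 64}, {'length': 100}, {'length': 32}]
label_active_period(burst)
# burst[0] now carries the on flag, burst[-1] the off flag
```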

Assume that all of the S_A + S_B traffic sources are active. The DRR schedulers A and B, in each of their respective rounds, serve one scheduled quantum from each of the S_A and S_B traffic sources, respectively. The aggregated packet stream queued at the input links L_A and L_B of the core scheduler C is therefore a repeated sequence consisting of one scheduled quantum from each of the S_A and S_B traffic sources, respectively. Hence, between any two scheduled quanta of the same flow in a per-link queue at the core scheduler, there will be exactly one scheduled quantum from each of the other active flows that are multiplexed on the same link. Let t1 be the time instant when a core scheduler starts serving the packet with the on flag set for flow i. Also let t2 be the time instant when the scheduler finishes transmitting the packet with the corresponding off flag set. Flow i is said to be active with respect to the scheduler during the time interval (t1, t2). Note that the time interval during which a flow is active with respect to a certain core scheduler will not necessarily coincide with the time interval during which the traffic source of that particular flow is active. This is because the packets generated by the traffic source of a flow experience some delay as they traverse the network. Hence, the same flow will be active with respect to different core schedulers over different, non-coinciding time intervals. The core schedulers maintain a per-link state called the FlowCount, which equals the number of flows that are active at a link with respect to the core scheduler. Let FlowCount_i represent the flow count associated with link i. This quantity simply allows the scheduler to determine the number of constituent flows in the aggregated packet stream arriving at each input link. We define a link as active when the queue associated with that link is not empty. The scheduler maintains a linked list, called the ActiveList, of links which are active. This is in contrast to DRR, for example, where the ActiveList is a list of active flows. The active links are served by the AFFQ scheduler in a round-robin fashion.

A round consists of a round-robin iteration over all queues that are backlogged at the outset of the round. An inactive link is added to the tail of the ActiveList when a new packet arrives on that link. The AFFQ scheduler maintains a quantity known as the RoundRobinVisitCount, which keeps track of the number of links that the scheduler has yet to serve during the round in progress. The RoundRobinVisitCount is initialized to the total number of active links present in the ActiveList at the start of a round, and is decremented by one after the AFFQ scheduler finishes serving each active link. When this quantity equals zero, it indicates the end of the round in progress.

Assume that all the S_A flows multiplexed on link L_A are active with respect to the core scheduler C. When the AFFQ scheduler begins serving link L_A, it copies FlowCount_LA into the variable RunningCount. Hence, at the start of the service opportunity for link L_A, the RunningCount is initialized to S_A, the total number of flows on link L_A that are active with respect to the AFFQ scheduler. Each time the scheduler begins serving a new scheduled quantum, RunningCount is decremented by one. The scheduler can detect the start of a quantum by checking whether the SQ flag is set in the header of the packet at the head of the queue. Once RunningCount is decremented to zero and the scheduler finishes serving the current scheduled quantum, the scheduler does not continue serving that link. Following the service of link L_A, if its associated queue is empty, then link L_A is removed from the ActiveList. Otherwise, if the associated queue has packets awaiting service, the link is added to the tail of the ActiveList. Note that, if during the service opportunity the queue for link L_A is found to be empty, then the service to that link is discontinued. Thus, in one service opportunity the core AFFQ scheduler C serves S_A scheduled quanta from the queue associated with link L_A. Since the aggregated traffic at link L_A consists of a repeated sequence of one scheduled quantum from each of the S_A flows, we can conclude that the core scheduler C serves one scheduled quantum from each of the S_A active flows. If the core scheduler C serves a packet with the on flag set during the service opportunity of link L_A, then FlowCount_LA is incremented by one, since a new flow j ∉ S_A will now be active with respect to the scheduler on link L_A. Note that flow j is not included in the RunningCount since it was not active with respect to the scheduler at the start of this service opportunity. In order to ensure that all the active flows S_A on link L_A and the new flow j are served by the AFFQ scheduler during this service opportunity, the RunningCount is also incremented by one. On the other hand, if the scheduler C serves a packet with the off flag set during the service opportunity of link L_A, it implies that some flow j ∈ S_A will no longer be active with respect to the AFFQ scheduler.

Hence, the scheduler decrements FlowCount_LA by one, since there will be no further packets arriving from this flow on link L_A. Note that the RunningCount remains unaffected, since the quantum belonging to flow j, containing the packet with the off flag set, has already been served by the scheduler C in this service opportunity.

In this section we have focused our discussion on the core scheduler C, which serves packets arriving from the edge schedulers. Note that everything said about scheduler C holds true for any other core scheduler in the network. The on and off flags, together with the per-link state FlowCount, enable a core scheduler to determine the number of flows that are contained in the aggregated traffic at each input link. Note, however, that the scheduler cannot explicitly determine the time interval during which each of the constituent flows of the traffic stream is active with respect to the scheduler. The scheduler uses the per-flow information carried in the SQ flags of the packets to serve one scheduled quantum from each constituent flow of the packet stream. Note that the core scheduler operates on a per-link basis and hence cannot distinguish packets on a per-flow basis.

Figure 2 presents a pseudo-code description of the AFFQ scheduling algorithm, consisting of Initialize, Enqueue and Dequeue routines. The Enqueue routine is invoked whenever a new packet arrives at a link of the AFFQ scheduler. As long as there are packets queued for the output link, the Dequeue routine is active. When the AFFQ scheduler finishes the transmission of a packet, the Dequeue routine selects the next packet to be scheduled and begins its service.
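For readers who prefer an executable form, the following Python sketch mirrors the per-link logic of the Dequeue routine in Figure 2. The class and function names and the dictionary-based packet representation are illustrative, and the per-round bookkeeping (RoundRobinVisitCount and SizeOfActiveList) is omitted for brevity; it is a sketch of the mechanism, not the implementation used in the paper.

```python
from collections import deque

class AFFQCore:
    """Per-link state of a core AFFQ scheduler (illustrative sketch)."""

    def __init__(self, num_links):
        self.queues = [deque() for _ in range(num_links)]   # one FIFO queue per input link
        self.flow_count = [0] * num_links                   # constituent flows per link
        self.active_list = deque()                          # round-robin list of active links

    def enqueue(self, link, packet):
        if not self.queues[link] and link not in self.active_list:
            self.active_list.append(link)                   # link becomes active
        self.queues[link].append(packet)

    def serve_one_link(self, transmit):
        """One service opportunity for the link at the head of the ActiveList."""
        link = self.active_list.popleft()
        q = self.queues[link]
        running_count = self.flow_count[link]               # quanta still owed this visit
        while q:
            pkt = q.popleft()
            if pkt.get('on'):                               # a new flow joins this link
                self.flow_count[link] += 1
                running_count += 1
            if pkt.get('off'):                              # a flow leaves this link
                self.flow_count[link] -= 1
            if pkt.get('SQ'):                               # a scheduled quantum begins
                running_count -= 1
            transmit(pkt)
            # Stop once the allotted quanta are done and the next packet
            # would start a new scheduled quantum.
            if running_count == 0 and q and q[0].get('SQ'):
                break
        if q:                                               # still backlogged: back to the tail
            self.active_list.append(link)

# Example: one link carrying a single one-packet quantum from a newly active flow.
core = AFFQCore(num_links=2)
core.enqueue(0, {'length': 64, 'SQ': True, 'on': True})
core.serve_one_link(transmit=print)
```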

4 Performance analysis

Assume that a total of n flows are contending for a transmission link, access to which is controlled by a scheduler. Consider an execution of the scheduler over the n flows. The work involved in processing each packet at the scheduler has two parts: enqueueing and dequeueing. Hence, the work complexity of a scheduler is defined as the order of the time complexity, with respect to n, of enqueueing and then dequeueing a packet for transmission [1, 2]. It has been proved in [1] that the work complexity of the DRR scheduler is O(1). It is easily observed that our modification does not affect this measure, and so the edge schedulers have a work complexity of O(1). One may also readily verify from the pseudo-code in Figure 2 that the work complexity of the core AFFQ scheduler is O(1). In the following, we briefly present an analysis of the fairness properties of AFFQ. In our fairness analysis, we use the popular metric, the Relative Fairness Bound (RFB), proposed in [7].

Initialize: (Invoked when the scheduler is initialized)
  RunningCount = 0;
  for (i = 0; i < n; i = i + 1)
    FlowCount(i) = 0;
  end for

Enqueue: (Invoked when a packet arrives)
  i = QueueInWhichPacketArrives;
  if (ExistsInActiveList(i) == FALSE) then
    AddToActiveList(i);
    Increment SizeOfActiveList;
  end if

Dequeue:
  do
    if (RoundRobinVisitCount == 0) then
      RoundRobinVisitCount = SizeOfActiveList;
    end if
    i = HeadOfActiveList;
    RemoveHeadOfActiveList;
    RunningCount = FlowCount(i);
    ContinueService = TRUE;
    do
      Packet = PacketAtHeadOfQueue(i);
      if (Packet has On flag set) then
        Increment FlowCount(i);
        Increment RunningCount;
      end if
      if (Packet has Off flag set) then
        Decrement FlowCount(i);
      end if
      if (Packet has SQ flag set) then
        Decrement RunningCount;
      end if
      Transmit Packet;
      if ((Queue(i) is empty) or
          ((RunningCount == 0) and (Packet at head of Queue(i) has SQ flag set))) then
        ContinueService = FALSE;
        Decrement SizeOfActiveList;
      end if
    while (ContinueService == TRUE)
    if (Queue(i) not empty) then
      AddQueueToActiveList(i);
    end if
    Decrement RoundRobinVisitCount;
  while (TRUE)

Figure 2: Pseudo-Code for AFFQ

Definition 3. Let Sent_i^S(t1, t2) represent the total number of bits served by the scheduler S from flow i during the time interval (t1, t2). The relative fairness over the time interval (t1, t2), RF(t1, t2), is defined as the maximum value of |Sent_i^S(t1, t2) − Sent_j^S(t1, t2)| over all pairs of flows i and j that are active during this time interval. The maximum value of RF(t1, t2) over all possible time intervals (t1, t2) is defined as the relative fairness bound (RFB) for the scheduler S.

Lemma 1. Consider n consecutive rounds in an execution of the AFFQ scheduler during which flow i is continuously active with respect to the scheduler. The bounds on the total service received by flow i during these n round-robin service opportunities, S_(i,n), are given by

nM − (m − 1) ≤ S_(i,n) ≤ nM + (m − 1)    (3)

Proof. The AFFQ scheduler serves one scheduled quantum from flow i during each of the n consecutive rounds under consideration. Since a scheduled quantum for flow i equals the actual service received by flow i in some round-robin service opportunity at the edge DRR scheduler for flow i, S_(i,n) is equal to the total service received by flow i at the edge DRR scheduler during some n consecutive rounds. Hence, the lemma is proved using Equation (2) and assuming that Q_i is equal to M.

Definition 4. Consider an execution of the AFFQ scheduling discipline. Let T be the set of all the time instants during this execution. Let the set Ts include all the time instants at which the scheduler finishes serving one flow and begins scheduling another one.

It has been proved in [2] that, to obtain the relative fairness bound of the Elastic Round Robin scheduling discipline, we only need to consider the time intervals (t1, t2) such that both the time instants t1 and t2 belong to Ts. It can be easily verified that this proof holds true for the edge DRR schedulers as well as the core AFFQ schedulers. Hence, in order to prove Theorem 1, we need to consider only those time intervals which are bounded by time instants that coincide with the starting or ending of the service of a flow.

Theorem 1. For the AFFQ scheduling discipline, RFB < M + 2m.

Proof. Let us consider a time interval (t1, t2) during the execution of a core AFFQ scheduler S, such that both t1 and t2 belong to Ts. Consider two flows i and j which are active with respect to the scheduler S during this time interval. Let m_i and m_j be the number of service opportunities received by flows i and j, respectively, at the core scheduler S during the time interval (t1, t2). During one round, the AFFQ scheduler serves one scheduled quantum from each flow that is active with respect to the scheduler. Since flows i and j are active during the time interval under consideration, between any two service opportunities received by flow i, flow j will be served once by the AFFQ scheduler, and hence |m_i − m_j| ≤ 1. Without loss of generality, we will assume that, during the time interval under consideration, flow i receives more service than flow j. From Lemma 1, for flow i,

Sent_i^S(t1, t2) ≤ m_i M + (m − 1)    (4)

Similarly, for flow j,

Sent_j^S(t1, t2) ≥ m_j M − (m − 1)    (5)

Subtracting Equation (5) from Equation (4), we get Sent_i^S(t1, t2) − Sent_j^S(t1, t2) ≤ (m_i − m_j)M + 2(m − 1). Using the fact that |m_i − m_j| ≤ 1, this difference is at most M + 2(m − 1) < M + 2m, and the statement of the theorem is proved.

5 Simulation results

In this section, we present simulation results on the fairness properties of the AFFQ scheduler. We show that the fairness achieved by the AFFQ scheduler is identical to that obtained by a DRR scheduler which maintains per-flow state. For our simulations we use the network topology illustrated in Figure 1. Schedulers A and B employ the DRR scheduling discipline with an identical value of M, whereas scheduler C employs the proposed AFFQ scheduling discipline. We assume that there are 6 traffic sources with flow ids 0 to 5 connected to scheduler A and 8 sources with flow ids 6 to 13 connected to scheduler B. We collect results for a period of 4 million cycles, during which we ensure that all the flows except flow 0 at scheduler A and flow 13 at scheduler B are active. These two flows keep oscillating between active and inactive periods. The flows remain active for a time interval which is randomly chosen from a range of 100 to 200 cycles. The active period is followed by a period of inactivity, the length of which is uniformly distributed between 4000 and 5000 cycles. The arrival rate, in packets per second, into the queues corresponding to flows 2 and 8 is twice that of the other flows. Also, the packet lengths are uniformly distributed between 1 and 64 bytes for all flows except flow 1 and flow 11. Packets arriving at these two flows have lengths uniformly distributed between 1 and 128 bytes. Note that in this experiment M is equal to 128 bytes, and the largest packet that actually arrives is also 128 bytes, since the simulation is executed for a sufficiently large number of cycles. We assume that each scheduler dequeues one byte from one of its queues in each cycle. We plot the total service received by all the active flows at scheduler C during the time interval of the simulation. In order to compare the fairness properties of AFFQ with DRR, we replace the AFFQ scheduler with a DRR scheduler which maintains per-flow state, as opposed to the per-link state of AFFQ, and repeat the simulation. Figures 3(a) and 3(b) demonstrate that both DRR and AFFQ offer equal throughput to all the active flows, independent of the packet sizes and the packet arrival rates of the different flows.
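The traffic model of this experiment can be summarized by the sketch below; the parameter values are taken from the text, while the use of Python's uniform integer sampling is an assumption about details the paper does not specify.

```python
import random

def next_active_period():
    """Length, in cycles, of an on period for flows 0 and 13 (per the text)."""
    return random.randint(100, 200)

def next_idle_period():
    """Length, in cycles, of the following off period for flows 0 and 13."""
    return random.randint(4000, 5000)

def packet_length(flow_id):
    """Packet length in bytes: flows 1 and 11 use 1-128 bytes, all others 1-64."""
    return random.randint(1, 128) if flow_id in (1, 11) else random.randint(1, 64)
```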

Figure 3: Fairness comparison between AFFQ and DRR. (a) Number of KBytes transmitted for each flow id, for DRR and AFFQ. (b) Average relative fairness in bytes versus the number of flows at each scheduler (n), for DRR and AFFQ.

From these two plots we can also conclude that AFFQ is as fair as DRR in terms of the total service received by each of the active flows. Flows 0 and 13 are not included in the plots since they are not active throughout the duration of this simulation. Since these two flows oscillate between active and inactive periods, they result in frequent updates to the FlowCounts of the two links L_A and L_B, and possibly also the RunningCount, in the AFFQ scheduler. However, as seen from Figure 3(b), these changes to the per-link state in the AFFQ scheduler do not affect the service received by the other active flows.

In our second set of experiments we compare the RFB of AFFQ and DRR. We assume that n flows are connected to each of the edge schedulers A and B. The flow ids for the flows connected to scheduler A are 0 to n − 1, whereas the ids for those connected to scheduler B are from n to 2n − 1.

As in the first experiment, the simulation is run for a period of 2 million cycles during which all the flows except the first flow at scheduler A, with flow id 0, and the last flow at scheduler B, with flow id 2n − 1, are active. These two flows alternate between active and inactive periods as described in the first experiment. The packet lengths are uniformly distributed between 1 and 64 bytes for all the flows except flows 1 and n + 2, for which the packet lengths vary uniformly between 1 and 128 bytes. The packet arrival rates for flows 2 and n + 1 are twice that of the other flows. We compute the average RFB achieved by the AFFQ and DRR schedulers over 10,000 randomly chosen intervals during the period of 2 million cycles. The number of flows at each scheduler, n, is varied from 3 to 8. Figures 4(a) and 4(b) demonstrate that the average values of RFB for AFFQ and DRR are similar.
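The averaging procedure can be sketched as follows; the trace format (a per-interval record of bytes served for each flow that is active throughout the interval) is an assumption, but the computed quantity follows Definition 3.

```python
def relative_fairness(sent_bytes):
    """RF over one interval: max |Sent_i - Sent_j| over pairs of active flows.

    sent_bytes maps flow id -> bytes served during the interval, restricted to
    flows active throughout the interval; the max difference equals max - min.
    """
    values = list(sent_bytes.values())
    return max(values) - min(values)

def average_rfb(intervals):
    """Average relative fairness over a collection of sampled intervals."""
    samples = [relative_fairness(iv) for iv in intervals]
    return sum(samples) / len(samples)

# Example with two hypothetical sampled intervals:
# average_rfb([{1: 1600, 2: 1550, 3: 1580}, {1: 820, 2: 790, 3: 805}]) == 40.0
```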

6 Conclusion

In this paper, we presented a simple and efficient scheme, Aggregated Flow Fair Queueing (AFFQ), for achieving fair bandwidth allocation without maintaining per-flow state. The scheme requires the schedulers in the entire network to maintain only a per-link state as opposed to a per-flow state, allowing a scalable mechanism for fair queueing. The edge schedulers, closest to the traffic sources, implement the modified DRR scheduling discipline, while the core schedulers, which are all the other schedulers in the network, use AFFQ. We have proved that our scheme achieves a work complexity of O(1), with the same bound as DRR on the relative fairness measure. Our scheme is also applicable in other contexts, such as the scheduling of datagrams in the Internet, where a very large number of flows contend for the link bandwidth at the core routers.

References

[1] M. Shreedhar and G. Varghese, "Efficient Fair Queuing Using Deficit Round-Robin," IEEE/ACM Transactions on Networking, vol. 4, no. 3, pp. 375-386, June 1996.

[2] S. Kanhere, A. Parekh and H. Sethu, "Fair and Efficient Packet Scheduling in Wormhole Networks," Proceedings of IPDPS 2000, pp. 623-631, Cancun, Mexico, May 2000.

[3] H. Sethu, C. B. Stunkel and R. F. Stucke, "IBM RS/6000 SP Large System Interconnection Network Topologies," Proceedings of the International Conference on Parallel Processing, pp. 620-627, Minneapolis, MN, August 1998.

[4] I. Stoica, S. Shenker and H. Zhang, "Core-Stateless Fair Queueing: Achieving Approximately Fair Bandwidth Allocations in High Speed Networks," Proceedings of SIGCOMM'98, pp. 118-130, Vancouver, Canada, September 1998.

[5] Z. Cao, Z. Wang and E. Zegura, "Rainbow Fair Queueing: Fair Bandwidth Sharing Without Per-Flow State," Proceedings of IEEE INFOCOM 2000, vol. 2, pp. 922-931, Israel, March 2000.

[6] W. Almesberger, T. Ferrari and J.-Y. Le Boudec, "Scalable Resource Reservation for the Internet," Proceedings of the IEEE Conference on Protocols for Multimedia Systems - Multimedia Networking, pp. 18-27, 1997.

[7] S. J. Golestani, "A Self-Clocked Fair Queuing Scheme for Broadband Applications," Proceedings of IEEE INFOCOM'94, pp. 634-646, Toronto, June 1994.
