LAPLUS: An Efficient, Effective and Stable Switch Algorithm for Flow Control of the Available Bit Rate ATM Service
Sharat Prasad (1), Kamran Kiasaleh and Poras Balsara
Broadband Network Lab, Samsung Telecommunications America, Richardson, TX 75081
Dept. of Elect. Engg., University of Texas at Dallas, P.O. Box 830688, Richardson, TX 75083
ABSTRACT
LAPLUS is a novel switch algorithm for flow control of the Available Bit Rate (ABR) Asynchronous Transfer Mode (ATM) service. It ensures a steady-state rate allocation satisfying the MCR-plus-equal-share criterion. It requires only constant-time processing and two tags to be stored per flow. It is naturally able to take the Peak Cell Rates of flows into account. LAPLUS solves in a novel way the problem of selecting a measurement interval. The solution allows it to contain queue growth and keep utilization high on the one hand, and to control low-speed flows and operate stably on the other. We describe the results of a simulation study of LAPLUS. The results show it to be fair, responsive and stable.
1. Introduction
The ATM Forum has defined the Available Bit Rate (ABR) service for applications that require from the network, in addition to an optional minimum bandwidth, an amount of bandwidth that is difficult to specify precisely. The ABR service dynamically varies the bandwidth allowed to an application based on both the needs of the application and the congestion state of the network. The members of the ATM Forum have agreed to employ rate-based closed-loop feedback flow control to support the ABR service. The Forum has specified [1] the behavior required of end-stations and switches to be compliant with the ABR service. In this paper we focus our attention on the algorithm a switch may use to manage congestion, maximize the utilization of resources and meet the QOS guarantees applicable to ABR connections. There are two applicable guarantees: fair access to the available bandwidth and a specified low Cell Loss Ratio (CLR).

There are several possible fairness criteria [2]. The two we will use here are the Max-Min [3] and the MCR-plus-equal-share criteria. The simple distributed algorithm [3] for Max-Min fair rate allocation, and many others reported in the literature [4, 5, 12], require the current rate of each flow to be stored. This calls for a large amount of high-speed memory in the switches. Some variations of the simple distributed algorithm update the rate allocation each time a forward RM cell arrives. Others recompute the allocation periodically. Both the update and the recomputation require time proportional to the square of the number of flows, making the simple distributed algorithm impractical. On the positive side, the algorithm quickly converges to the Max-Min fair allocation. At the other end of the spectrum are algorithms which rely on approximations to reduce the computation time and memory size [6-9, 13]. Network configurations can be constructed which make algorithms of this class converge to an unfair rate allocation [10]. In the middle are clever implementations of the basic algorithm, which require a significantly smaller amount of memory and enable efficient computation of the rate allocation [11].

All algorithms that recompute the allocation periodically are prone to problems that arise from the period being too short or too long. A long period makes an algorithm less responsive. A short period, when low speed flows are present, leads to errors in the measurement of traffic load on a link. If a short period is used for measuring the available bandwidth, an unstable control results [16].

All switch algorithms for flow controlling the ABR ATM service have to deal with growth of switch queues during transients. Techniques for containing the queue growth include requiring source end-stations to delay rate increases but to effect rate reductions immediately upon being notified [11, 12], setting aside a fraction of the link bandwidth [7-9, 13], etc. Both techniques lower link bandwidth utilization. Delaying rate increases limits low utilization to periods of transients, but it requires mechanisms in the source end-station which have not been specified by the ATM Forum.

In this paper we present the algorithm named LAPLUS. LAPLUS approximates the bottleneck rate of a link as the LArgest among flow rates PLUs the Surplus bandwidth divided by the number of flows with the largest rate. This estimate, when bounded below at a suitable value, enables links to succeed in determining their bottleneck rates in the order of their bottleneck levels.
While rates of flows differ greatly from their respective MCR-plus-fair-share values, this approximation causes flows to change their rates by amounts that have the correct sign but whose magnitude may be imprecise. When the rates of flows satisfy a lock condition, LAPLUS estimates the correct bottleneck rate. The algorithm is naturally able to take Peak Cell Rates of flows into account. The algorithm requires only constant-time processing and two tags per flow to be stored while ensuring MCR-plus-equal-share rate allocation in the steady state. It solves in a novel way the problem of selecting a control or measurement interval. It uses nested measurement intervals. High rate flows are controlled using inner (therefore short) intervals to keep utilization high and to contain queue growth. Outer (therefore long) intervals are used to control low rate flows and to measure available bandwidth. This ensures stable operation in the presence of short-lived large magnitude changes in the available bandwidth.

The rest of this paper is organized as follows. In Section 2 we consider fair rate allocations and switch algorithms for computing them. Section 4 presents SLAPLUS, the simple version of our algorithm to compute Max-Min fair rate allocation. In Section 5 we describe LAPLUS, the enhanced algorithm which uses nested intervals and the MCR-plus-equal-share criterion and considers PCRs of flows. Simulations highlighting problems and their solutions are presented throughout the paper. Section 6 presents results of simulating the LAPLUS algorithm. We summarize the paper in Section 7 and mention our ongoing work.

(1) This work was performed at Texas Instruments, Incorporated.
0-7803-4386-7/98/$10.00 (c) 1998 IEEE

2. Switch Algorithms and Fairness
Fairness is one of the two assurances the network offers to the users of the ABR service. Informally, for a rate allocation to be fair, it must offer a flow as big a share of the bandwidth of the most congested link it traverses as any other flow traversing the same link [3]. We define the Max-Min fair rate allocation by describing an iterative procedure [3] for computing it. At the outset, the set variables $u_1$ and $v_1$ are initialized to the set of all links making up the network and the set of all ABR flows traversing the network, respectively. Variables $b_j$ and $n_j$ are initialized to the bandwidth available to and the number of ABR flows sharing the link $L_j$, respectively. During iteration $l$, we determine $r_l$ as the smallest among the ratios $b_j / n_j$ for all links $L_j \in u_l$. Let $W_l = \{L_j\} \subseteq u_l$ such that $b_j / n_j = r_l$ for each link $L_j \in W_l$. Let $S_l = \{F_i\} \subseteq v_l$ where each flow $F_i$ travels over at least one link in $W_l$. Links in $W_l$ are the level $l$ bottleneck links. Flows in $S_l$ are the level $l$ bottleneck flows. $r_l$ is the bottleneck rate of each link in $W_l$ and the constraint rate of each flow in $S_l$.

Now a reduced network $u_{l+1}$ is constructed by subtracting the set $W_l$ from $u_l$. $v_{l+1} = v_l - S_l$ is the set of flows whose constraint rates remain to be determined. Let $m$ be the number of flows which are in $S_l$ and which also travel over any link $L_j \in u_{l+1}$. To complete the construction, we subtract $m r_l$ from $b_j$ and $m$ from $n_j$ for each $L_j \in u_{l+1}$. If $u_{l+1}$ is null, the bottleneck rate of each link and the constraint rate of each flow have been found.

The above procedure cannot be used as part of a switch algorithm as it requires a central entity having global knowledge. A practical algorithm must allow the links in the network to determine their respective bottleneck rates in a distributed fashion. Consider again the link $L_j$ with ABR bandwidth $R_j$ shared by $N_j$ ABR flows. The set of flows $V_j$ traversing link $L_j$ can be divided into two subsets: the subset $C_j$ containing the flows that are constrained by other links and the subset $B_j$ containing the flows for which $L_j$ is the bottleneck link. If the bottleneck level of $L_j$ is $k$, then $C_j$ contains flows with bottleneck levels 1 through $k - 1$ inclusive. Let, for each flow $F_i$ in $C_j$, $r_i$ be the constraint rate of flow $F_i$. Each link in the network learns the constraint rates of the flows which traverse it, determines $C_j$ and then computes its bottleneck rate as [3]

$$BR_j = \frac{R_j - \sum_{F_i \in C_j} r_i}{|B_j|} \qquad (1)$$

In effect a link $L_j$ assigns flows in $C_j$ their respective constraint rates and then equally divides the remaining bandwidth among flows in $B_j$. The amount of bandwidth received by each flow in $B_j$ is the bottleneck rate of the link. While this is an algorithm for distributed computation of rate allocation, its time complexity is $O(N_j^2)$ and it needs to store constraint rates of all flows.

3. A Simple Distributed Algorithm
As we just saw, the computation of the rate allocation for the set of ABR flows $V$ sharing a link $L$ involves determining the set $C$ of flows. Each flow in $C$ is constrained at a link other than $L$. Any flow with a constraint rate $r_l$ which is less than the bottleneck rate $BR$ of link $L$ is clearly constrained at a link other than $L$. This fact suggests the following method for determining $C$ [11]. Consider a hypothetical sequence with the constraint rates of the flows as elements. Each rate $r_l$ occurs only once in the sequence and has a tag $m_l$ which gives the number of flows which have the rate $r_l$. Let the elements be arranged in descending order and $k$ be the length of the sequence. Consider the following inequality

$$r_{l^*-1} \ge \frac{R - \sum_{l=l^*}^{k} m_l r_l}{N - \sum_{l=l^*}^{k} m_l} > r_{l^*} \qquad (2)$$

where $N = \sum_{l=1}^{k} m_l$ is the total number of, and $R$ the total bandwidth available to, ABR flows. Comparing the middle subexpression of (2) with the right-hand side of expression (1), we see that they are equal if $C$ only contains flows which have a rate $r_l \le r_{l^*}$. Hence inequality (2) tries to find $r_{l^*}$, the smallest $r_l$, such that if all flows with rate smaller than $r_{l^*}$ are considered constrained at other links and BR of the link is computed, it is found to satisfy $r_{l^*-1} \ge BR > r_{l^*}$, as it should. If the inequality (2) is not satisfied even for $l^* = 2$, then, [...]

[...] $G + 1$, we are forced to use $l^* = G + 1$. This amounts to assuming that flows with rates $r_l < r_G$ are constrained elsewhere in the network. An incorrect BR is computed when it is not so. The search to determine $l^*$ now only requires $O(G)$ time. In this paper we only concern ourselves with the case of $G = 1$, i.e. when only the largest rate is stored. In [17] we report simulation studies of the use of values of $G$ greater than one. It is seen that for the GFC II network configuration described earlier, the performance of the algorithm when $G = 3$ is indistinguishable from when the complete sequence of rates is maintained. Maintaining only $r_1$, i.e. the largest rate, and assuming that flows with rates $r_l < r_1$ are constrained elsewhere in the network corresponds to $l^* = 2$. Substituting in (5),

$$BR' = \frac{R - \sum_{l=2}^{k} m_l r_l}{N - \sum_{l=2}^{k} m_l} = r_1 + \frac{R - \sum_{l=1}^{k} m_l r_l}{m_1}$$

This is the largest rate plus the surplus bandwidth $R - \sum_{l=1}^{k} m_l r_l$ divided equally among the flows with rate $r_1$. The phrase Simple LArgest PLUs Surplus (SLAPLUS) will be used to refer to the algorithm just described.

Results from simulation of the GFC II network with switches employing the SLAPLUS algorithm are presented graphically in Figure 6. It can be seen that the sources converge to the exact Max-Min fair rates. Preceding the attainment of steady values there are oscillations that last a few measurement intervals. These oscillations are due to two reasons. For over-subscribed links, in the absence of knowledge of rates other than the largest rate, SLAPLUS computes a BR which reduces the largest rate by the amount necessary to end the over-subscription. But this value of BR may cause some of the sources with rates smaller than the largest rate to also reduce their rates. Overall, a larger than required reduction results. Conversely, SLAPLUS may compute a larger than required value for BR for an under-subscribed link. These under- and over-estimations prolong the time required to reach convergence and affect the amount of buffers required in switches. But they do not lead to persistent oscillations or unfairness [17].

[Figure 6A: Source ACRs when SLAPLUS algorithm is used. ACR (bps) versus Time (sec); curves src_a* through src_h*.]
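The largest-plus-surplus estimate $BR'$ can be illustrated with a few lines of Python. This is a minimal sketch, not the authors' implementation: the function name `slaplus_estimate` and the example rates are assumptions made for illustration, and rates are plain numbers in arbitrary units. The sketch accumulates the aggregate rate, the largest rate $r_1$ and its multiplicity $m_1$ in a single pass, then returns $r_1$ plus the surplus divided by $m_1$.

```python
def slaplus_estimate(link_rate, flow_rates):
    """Largest rate plus the surplus bandwidth shared by the largest-rate flows.

    link_rate  -- bandwidth R available to ABR flows (hypothetical units)
    flow_rates -- current rate of each flow seen on the link (at least one)
    """
    total = 0.0   # running sum of all flow rates
    r1 = 0.0      # largest rate seen so far
    m1 = 0        # number of flows having the largest rate
    for rate in flow_rates:
        total += rate
        if rate > r1:
            r1, m1 = rate, 1
        elif rate == r1:
            m1 += 1
    surplus = link_rate - total   # negative when the link is over-subscribed
    return r1 + surplus / m1


# Over-subscribed link: the estimate drops below the second-largest rate.
print(slaplus_estimate(100, [60, 50, 20]))   # 30.0
# Under-subscribed link: the whole surplus is offered to the largest flow.
print(slaplus_estimate(100, [30, 20, 10]))   # 70.0
```

In the first call the link is over-subscribed by 30 units, so the estimate pulls the largest rate down to 30, which would also throttle the flow currently at 50. In the second call the full surplus is credited to the single largest flow. Both are instances of the over- and under-estimation that the text identifies as the cause of the initial oscillations.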
$R/N$ is a lower bound on BR. BR equals $R/N$ when all flows are bottlenecked at the link under consideration. BR is larger than $R/N$ when one or more flows are constrained at other links in the network. Hence for an over-subscribed link,

$$BR'' = \max\left(\frac{R}{N},\; r_1 + \frac{R - \sum_{l=1}^{k} m_l r_l}{m_1}\right) \qquad (6)$$

The aggregate rate of all the flows $\sum_{l=1}^{k} m_l r_l$, the total number of flows $N$, the largest rate $r_1$ and the number $m_1$ of flows with the largest rate are all computed incrementally. To prevent multiple accounting, a flag $seen_i$ is associated with each flow $F_i$ in $S$. The flows are observed over an interval $T_0$. Each time a forward RM cell for a flow $F_i$ with $seen_i = false$ arrives, $\sum_{l=1}^{k} m_l r_l$, $N$, $r_1$ and $m_1$ are updated and $seen_i$ is set to true. Each update is a constant time operation. At the end of the interval, BR is computed and reinitialization is performed. Computing BR using Eqn. (7) also requires only $O(1)$ time. Unfortunately, resetting the flags $seen_i$ in a straightforward manner requires $O(N)$ time. This is avoided by using two arrays instead of one. During a measurement interval, one array is in use while the other is being reset. At the start of a new measurement interval the roles of the two arrays are swapped.

[Figure 6B: Switch queues when SLAPLUS algorithm is used. Number of cells versus Time (sec); curves cc_0 through cc_6.]

5. LAPLUS : LArgest PLUs Surplus
A short measurement interval improves the responsiveness of an algorithm. On the other hand, too short a measurement interval has two undesirable effects. First, no RM cell for a flow may be seen during the interval $\Delta T$ if the flow has a cell rate less than $N_{rm} / \Delta T$. This will make the $m_i$s smaller than they really are and the changes in BR larger than they need to be. Oscillations in cell rates result and queues in the switches may grow until buffers overflow. Second, if too short an interval is used to measure the available bandwidth, the flow control tries to adapt to even short-lived changes in the available bandwidth. An unstable control results [16]. In the presence of these conflicting considerations, some researchers have chosen to use large measurement intervals and accept the reduced responsiveness [11]. Others use a short measurement interval but also use methods such as exponential averaging to deal with the errors [4, 9].

A better solution to the problem is to use nested intervals. Each of the innermost intervals is as small as is necessary to achieve the desired responsiveness. Each interval other than the innermost is an integer multiple of the length of the next inner interval. For example the innermost intervals may have a duration of $T_0$ and an interval of level $k$ may have a duration of $T_k = T_0 M^k$. The ATM Forum mandates that an active flow must send an RM cell every 100 ms. Hence the outermost interval must be larger than 100 ms to ensure that at least one RM cell from each active flow is seen during the outermost interval. Nested intervals enable an outer and hence a large interval to be used for the determination of the available bandwidth. This has the effect of low-pass filtering the available bandwidth versus time function and is known to make the control stable [16].

Use of nested intervals partitions the set of flows traversing a link into levels (Figure 1). A flow with a Current Cell Rate of CCR is said to be of level $k$ given by

$$k = \left\lfloor \log_M \frac{N_{rm}}{CCR \times T_0} \right\rfloor + 1 \qquad (8)$$

where $N_{rm} / CCR$ is the inter-arrival time of RM cells. The above relation assigns to a flow the level $k$ such that $N_{rm} / CCR$ is strictly smaller than the duration of the level $k$ measurement interval given by $T_0 \times M^k$. Alternatively, a flow is said to be of
level $k$ iff $RL_{k-1} \ge CCR > RL_k$, where $RL_k = N_{rm} / (T_0 M^k)$. A useful observation to make at this point is that when a level $k$ interval expires, intervals at levels $k - 1$, $k - 2$, ..., 0 also expire.

Now a pair $\{R_k, N_k\}$ is maintained for each level $k$ in the nesting. The elements of the pair are intended to be the aggregate rate of level $k$ flows and their number. At the start of every level $k$ measurement interval, $R_k$ and $N_k$ are initialized to zero. When a forward RM cell for a flow arrives, the CCR for the flow is used to assign the flow a rate level $k$. If this is the first forward RM cell for the flow seen during the current level $k$ interval, CCR is added to $R_k$ and $N_k$ is incremented. When intervals at levels $k \le k^*$ expire, BR needs to be computed. To compute BR, the aggregate rate $R_T$ of all flows and the total number $N_T$ of flows are required.

Figure 3 shows a two-level-deep nesting being used. The inner level is referred to as level zero and the outer level as level one. At $t = T_0$, a level zero interval has ended and $N_0$ gives the number of level zero flows. But $N_1$ is not equal to the number of level one flows at $t = T_0$ or any $t < T_1 = 4T_0$. Only at $t = T_1$, when a level one interval has expired, does $N_1$ equal the number of level one flows. To work around this problem, when an interval expires, the corresponding number of flows and the aggregate rate of flows are saved away for use during the following interval. Let $SN_k$ and $SR_k$ be the saved away $N_k$ and $R_k$. The question we seek to answer is whether, at any instant when intervals at levels $k \le k^*$ have expired, $\sum_{k \le k^*} N_k + \sum_{k > k^*} SN_k$ equals $N_T$.

When nested intervals, rather than a single interval, are being used, changes in rates of the flows may result in changes in their levels. Referring again to Figure 3, an RM cell for a flow A is seen at the point in time marked a1 with a value for CCR which classifies the flow as a level one flow. Rate allocation is recomputed at time $4T_0$. $N_0$ and $N_1$ are saved away in $SN_0$ and $SN_1$, respectively, and are initialized to zero. If the flow A then increases its rate sufficiently to move to level zero, an RM cell for the flow may arrive during the level zero interval ending at time $5T_0$. As the seen flag for the flow is clear, the flow is classified as a level zero flow and $N_0$ is incremented. At the expiry of the level zero interval ending at time $5T_0$,
$N_T \ne N_0 + SN_1$ as flow A is being counted twice, once in $N_0$ and once in $SN_1$. To guard against this error, the $SN_k$s and $SR_k$s are incrementally updated as necessary. At the beginning of a level $k$ measurement interval, $SR_k$ and $SN_k$ are equal respectively to the aggregate rate of all level $k$ flows and the total number of level $k$ flows seen during the previous level $k$ interval. During the interval, if a flow moves from level $k_P$ to level $k$, $k_P \ne k$, then $SN_k$, $SR_k$, $SN_{k_P}$ and $SR_{k_P}$ are appropriately updated.

To understand the remaining problem, consider the case when the only level zero flow, e.g. C in Figure 3, reduces its rate. An RM cell for flow C arrives at time c1, $2T_0 \le c1 < 3T_0$, with a value for CCR which classifies the flow as a level zero flow. Rate allocation is re-computed at $t = 3T_0$ and $N_0$ is initialized to zero. Then the flow reduces its rate sufficiently to move to level one. When this happens no RM cell for the flow may be seen during the level zero interval ending at $t = 4T_0$. Flow C is not counted in $N_T = N_0 + SN_1$ and so is invisible to the computation at $t = 4T_0$. The number of such level $k$ flows which were previously at level $k - 1$ and then reduced their rates is given by

$$NI_k = SN_{k-1} - N_{k-1} \qquad (9)$$

Note that (9) is an approximation as, in practice, a level $k - 1$ flow may have reduced its rate to any of the lower levels and not specifically to level $k$. But note that any error introduced is short-lived as eventually an RM cell for the flow arrives, causes the correct $N_k$ to be incremented and, one measurement interval later, $SN_k$ to be set to $N_k$. It also helps the stability of the control as any flow which climbs several levels down is taken down one level per measurement interval until an RM cell for the flow does arrive. The total number of flows is now approximated by

$$N_T = \sum_{k} SN_k \qquad (10)$$

The aggregate rate of all the flows is now approximated by

$$R_T = \sum_{k \le k^*} R_k + \sum_{1 \le k \le k^*} (SN_k - N_k)\, RL_k + \sum_{k > k^*} SR_k \qquad (11)$$

The bottleneck rate is once again computed as before, except that the aggregate rate of all flows and the total number of all flows given above are used. Hence,

$$BR = \max\left(\frac{R}{N_T},\; r_1 + \frac{R - R_T}{m_1}\right) \qquad (12)$$
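As a rough illustration of the level classification and of Eqns. (10) to (12), the following Python sketch assigns a flow to a rate level and computes the bottleneck rate at an interval expiry. It is a sketch under stated assumptions rather than the LAPLUS implementation: the function names, the list-based per-level state and the example numbers are hypothetical, time is measured in units of $T_0$, and the level is clamped to the deepest available level, which the text does not state explicitly.

```python
def rate_level(ccr, n_rm, t0, m, depth):
    """Smallest level k with n_rm / ccr < t0 * m**k, clamped to depth - 1."""
    k = 0
    while k < depth - 1 and n_rm / ccr >= t0 * m ** k:
        k += 1
    return k


def level_rate_bound(k, n_rm, t0, m):
    """RL_k = N_rm / (T_0 * M**k): rates above RL_k belong to level k or faster."""
    return n_rm / (t0 * m ** k)


def bottleneck_rate(link_rate, r1, m1, R, N, SR, SN, RL, k_star):
    """BR at the expiry of levels 0..k_star, following Eqns. (10)-(12).

    R, N   -- per-level aggregate rate and flow count of the current intervals
    SR, SN -- per-level saved (incrementally updated) aggregates
    RL     -- per-level rate bounds RL_k
    """
    n_total = sum(SN)                                              # Eqn. (10)
    r_total = (sum(R[k] for k in range(k_star + 1))
               + sum((SN[k] - N[k]) * RL[k] for k in range(1, k_star + 1))
               + sum(SR[k] for k in range(k_star + 1, len(SR))))   # Eqn. (11)
    return max(link_rate / n_total,
               r1 + (link_rate - r_total) / m1)                    # Eqn. (12)


# Hypothetical two-level example: N_rm = 32 cells, T0 = 1 time unit, M = 4.
RL = [level_rate_bound(k, 32, 1, 4) for k in range(2)]   # [32.0, 8.0]
print(rate_level(50, 32, 1, 4, 2))   # 0: an RM cell arrives within T0
print(rate_level(10, 32, 1, 4, 2))   # 1: an RM cell arrives only within 4*T0
# Level one expires (k* = 1) with one level-one flow not seen this interval.
print(bottleneck_rate(100, 50, 1, [50, 10], [1, 1], [50, 20], [1, 2], RL, 1))  # 82.0
```

In the last call the unseen level-one flow is charged at $RL_1 = 8$, so $R_T = 50 + 10 + 8 = 68$ and the largest-rate flow is offered $50 + 32 = 82$, which exceeds the floor $R / N_T = 100/3$.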
5.1. The Rate-Level and Time-Stamp Tags
A one-bit seen flag is inadequate as the flag must also convey the rate level of the flow. A two-bit tag may be used for a three-level-deep nesting of intervals, with one of the four values standing for unseen. A two-bit time-stamp is also associated with each flow. As many modulo-4 counters are maintained as there are levels in the nesting of intervals. All the counters are initialized to zero at power-up and each is incremented whenever an interval of its level expires. Whenever the first RM cell for a flow arrives, its time-stamp is set to the modulo-4 count then associated with its current rate level. This time-stamp makes it possible to tell whether an RM cell for a flow last arrived during the current interval, the previous interval or two intervals back. The complete pseudo-code for the LAPLUS algorithm is given in the Appendix.
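The way a two-bit time-stamp distinguishes "current", "previous" and "two intervals back" can be sketched in Python as follows. The sketch loosely mirrors the `seen` and `prevseen` functions of the Appendix, but the class `FlowTag`, the helper names and the explicit handling of a never-seen flow are assumptions added for illustration.

```python
UNSEEN = None  # stands for the reserved "unseen" value of the rate-level tag


class FlowTag:
    """Per-flow state: a rate-level tag and a modulo-4 time-stamp."""

    def __init__(self):
        self.ratelev = UNSEEN
        self.tsv = 3


def mark(tag, level, ts):
    """Record that an RM cell classified the flow at `level` just now."""
    tag.ratelev = level
    tag.tsv = ts[level]


def expire_levels(ts, k_star):
    """Advance the modulo-4 counters of levels 0..k_star at an interval expiry."""
    for k in range(k_star + 1):
        ts[k] = (ts[k] + 1) % 4


def seen(tag, ts):
    """Level of the flow if it was seen in the current interval, else UNSEEN."""
    if tag.ratelev is UNSEEN or tag.tsv != ts[tag.ratelev]:
        return UNSEEN
    return tag.ratelev


def prevseen(tag, ts):
    """Level to attribute to a flow not yet seen in the current interval."""
    if tag.ratelev is UNSEEN:
        return UNSEEN                      # assumption: flow never seen before
    age = (ts[tag.ratelev] - tag.tsv) % 4  # intervals elapsed since last seen
    if age == 0:
        raise ValueError("flow already seen in the current interval")
    if age == 1:
        return tag.ratelev                 # seen during the previous interval
    return tag.ratelev + 1                 # older: treat as one level slower


ts = [0, 0, 0]          # one counter per level, three levels assumed
tag = FlowTag()
mark(tag, 0, ts)
expire_levels(ts, 0)
print(seen(tag, ts), prevseen(tag, ts))   # None 0
```

Because the difference is taken modulo 4, the comparison keeps working when a counter wraps from 3 back to 0.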
5.2. Minimum Cell Rate (MCR) and Peak Cell Rate (PCR) of Flows
It is simple to change the criterion from Max-Min to MCR-plus-equal-share. The available bandwidth is reduced by the sum of the MCRs of the ABR flows, Max(0, ACR-MCR) is used as the constraint rate of a flow in place of its ACR, and the ER field of returning RM cells is compared against BR + MCR. LAPLUS determines the bottleneck rates by allowing flows to increase, or asking them to decrease, their rates over a number of measurement intervals. For a link with a bottleneck rate greater than the PCR of some of the flows traversing it, once the largest rate grows beyond the PCRs of any flows, those flows come to be regarded as being constrained elsewhere (in this case, at the source), as they should be. Thus the PCRs of flows are taken into consideration in the normal course of operation.
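The three adjustments for the MCR-plus-equal-share criterion can be written down directly. The helper names and the two-flow example below are assumptions for illustration only, with rates in arbitrary units.

```python
def shareable_bandwidth(link_rate, mcrs):
    """Bandwidth left for equal sharing after every flow's MCR is set aside."""
    return link_rate - sum(mcrs)


def constraint_rate(acr, mcr):
    """Rate above the MCR, used in place of the ACR when estimating BR."""
    return max(0, acr - mcr)


def limit_er(er, br, mcr):
    """Explicit rate written into a returning RM cell: capped at BR + MCR."""
    return min(er, br + mcr)


# Hypothetical link of 100 units shared by two flows with MCRs 10 and 30,
# both bottlenecked at this link.
mcrs = [10, 30]
br = shareable_bandwidth(100, mcrs) / len(mcrs)
print(br)                                   # 30.0
print([limit_er(100, br, m) for m in mcrs])  # [40.0, 60.0]
```

Each flow ends up with its own MCR plus an equal share of the remaining 60 units, and the two allocations together use the full link.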
6. Simulation results
Models of an end-station and a switch were built using the OPNET tool [14]. The network shown in Figure 4 was set up. This network is called the Generic Fairness Configuration II in [15]. cc_0 through cc_6 represent seven switches connected by six links. This network has embedded within it the parking-lot and the chain configurations. It is known [10] that the parking-lot configuration causes utilization to be low for some algorithms (e.g. [6]) and the chain configuration causes some algorithms (e.g. [8]) to converge to unfair allocations. The links connecting cc_0 - cc_1 and cc_5 - cc_6 run at 50 Mbps. The links connecting cc_2 - cc_3 run at 100 Mbps. Finally the links connecting cc_3 - cc_4 and cc_4 - cc_5 run at 150 Mbps. Table 1 below gives the fair rate allocation. The end-station to switch distance is assumed to be 200 m and the distance between switches 200 km.

Figure 7 shows the source ACRs and the switch queue sizes when the LAPLUS algorithm is employed. Comparing with the results for SLAPLUS we see that the magnitude of oscillations is much reduced. Queues at every switch except cc_5 are seen to be shorter. This is because SLAPLUS used a single measurement interval T = 3 ms. LAPLUS controls flows g1 through g7, which are bottlenecked at link cc_5 - cc_6, using the level one measurement interval T1 = 4 ms and the rest of the flows using the level zero measurement interval T0 = 1 ms. A smaller measurement interval makes the algorithm more responsive and better at preventing queue growth.

To study the performance of the LAPLUS algorithm when there are sudden changes in bandwidth demand or availability, simulation was carried out with all but sources g1 through g7 starting transmission first and attaining the Max-Min fair rates for this reduced configuration, and sources g1 through g7 then beginning transmission. We again see (Figures 8A and 8B) that all sources attain their correct Max-Min fair rates. It is known [10] that some algorithms (e.g. [9]) are unfair to flows starting after other flows. A recent version of [9] does not suffer from this unfairness problem.

[Figure 7A: Source ACRs when LAPLUS algorithm is used. ACR (bps) versus Time (sec); curves src_a* through src_h*.]
[Figure 7B: Switch queues when LAPLUS algorithm is used. Number of cells versus Time (sec); curves cc_0 through cc_6.]
[Figure 8B: Switch queues when some sources start after a delay. Number of cells versus Time (sec); curves cc_0 through cc_6.]
Finally, to study the performance of the LAPLUS algorithm in the presence of frequent and sharp changes in CBR/VBR traffic, simulation was carried out with the traffic offered by sources g1 through g7 assigned to the guaranteed class, whereas the traffic offered by the rest of the sources was assigned to the ABR class. The handling of the traffic classes by the switches was chosen to be the simplest possible so as to offer the least help to the flow-control algorithm. The switches maintain two queues at each output, one for each traffic class, and serve the queues in strict static priority. Hence the ABR queue is only served when the guaranteed queue is empty.

The bandwidth available to ABR flows was set to be 95 % of the bandwidth remaining after guaranteed flows are provided for. The sources g1 through g7 were modeled as ON-OFF processes. The ON-OFF periods were made short enough to not allow the network to reach a steady state during them. To simplify interpretation of results, the amount of traffic offered by these sources, when they are ON, was set to their fair share. Hence if the flow-control algorithm works properly, when sources g1 through g7 are on, the other sources must receive the same bandwidth as specified in Table 1, and when sources g1 through g7 are off, the sources must be given their fair share for the reduced configuration (without sources g1 through g7). It can be seen from Figure 9A that this is indeed the case. Figure 9B shows that there is no uncontrolled growth of switch queues.

[Figure 8A: Source ACRs when some start after a delay. ACR (bps) versus Time (sec); curves src_a* through src_h*.]
[Figure 9A: Source ACRs with frequent and sharp changes in the available bandwidth. ACR (bps) versus Time (sec); curves src_a* through src_h*.]
[Figure 9B: Switch queues with frequent and sharp changes in the available bandwidth. Number of cells versus Time (sec); curves cc_0 through cc_6.]
7. Conclusions
We described LAPLUS, a switch algorithm for flow control of the ABR ATM service. LAPLUS requires as few as four flag bits of memory per flow: two to store the rate level and two to store a modulo-4 time stamp. The algorithm has an operational time complexity of $O(1)$. The low operational time complexity is a consequence of the use of the novel largest-plus-surplus heuristic to estimate the bottleneck rate of a link. Note, though, that on account of the flags associated with each flow, the algorithm scales only as well as $O(N)$. We simulated LAPLUS on network configurations which are known to cause algorithms to be unfair. LAPLUS is seen to enable flows to attain their exact Max-Min fair rates in the steady state.

LAPLUS uses nested measurement intervals. Use of the outermost (therefore large) measurement interval to determine available bandwidth filters out short-lived large magnitude changes. Appropriate measurement intervals, chosen according to the rates of the flows, are used to reliably determine the number of flows and the bandwidth they use. Rate allocation is recomputed once every inner (therefore short) interval. The specific interval used depends on the largest rate. Therefore LAPLUS is able to contain queue growth and keep link utilization high. In practice, though, only a small depth of nesting (e.g. three) may be used.

Hardware modules implementing LAPLUS in a 32 port switch handling 64k flows and having an aggregate bandwidth of 320 Gbps have been designed. Work with the objective of analytically proving the convergence property and deriving the upper bound on convergence time is being pursued.

Acknowledgments
The authors thank Martin Izzard and Nick McKeown for discussions and feedback which made it possible to maintain the focus on minimizing the time and space requirements of the algorithm. They also thank Dave Scott and Bob Hewes.
Appendix
Pseudo-code for the LAPLUS algorithm

Design Parameters
    K = Nesting depth of intervals
    T_0 = Interval base
    M = Interval scale factor

Parameters
    R = Bandwidth available to ABR flows
    RL_k = Minimum rate for a flow to be considered level k

Variables
    N_k = Number of level k flows
    R_k = Aggregate constraint rate of the level k flows
    r_1 = Largest among flow rates
    m_1 = Number of flows with rate r_1
    SN_k = Number of level k flows computed incrementally
    SR_k = Aggregate constraint rate of level k flows computed incrementally
    TS_k = A modulo-4 counter incremented every level k interval
    i = VCI of the RM cell
    ratelev_i = Rate level of flow i
    TSV_i = "Time-stamp" identifying when ratelev_i was updated

Functions
    function prevseen(i)
        Let k_P = ratelev_i
        if TSV_i = TS_{k_P}
            error                          /* Information not available */
        else if TSV_i = TS_{k_P} - 1       /* Modulo-4 */
            return k_P
        else
            return k_P + 1
    end

    function seen(i)
        Let k_P = ratelev_i
        if TSV_i = TS_{k_P}
            return k_P
        else
            return unseen
    end

Initialization
    For all i : ratelev_i = unseen, TSV_i = 3
    For all k : N_k = R_k = r_k = m_k = N_T = R_T = 0, TS_k = 0

Event : Forward RM cell arrival
    Let k be such that RL_{k-1} > CCR >= RL_k
    If seen(i) = unseen
        N_k = N_k + 1, R_k = R_k + CCR
        If CCR > r_1
            r_1 = CCR, m_1 = 1
        else if CCR = r_1
            m_1 = m_1 + 1
        Let k_P = prevseen(i)
        if k != k_P
            SN_k = SN_k + 1, SR_k = SR_k + CCR
            If k_P != unseen
                SN_{k_P} = SN_{k_P} - 1    /* the flow has left level k_P */
                if k_P > k
                    SR_{k_P} = SR_{k_P} - RL_{k_P - 1}
                else
                    SR_{k_P} = SR_{k_P} - RL_{k_P}
        ratelev_i = k, TSV_i = TS_k

Event : Expiry of measurement interval
    Let k* be the lowest level where the interval has expired.
    R_T = sum_{k <= k*} R_k + sum_{1 <= k <= k*} (SN_k - N_k) RL_k + sum_{k > k*} SR_k
    N_T = sum_k SN_k
    BR = max( R / N_T , r_{k*} + (R - R_T) / m_{k*} )
    For k = k* downto 0 do
        SR_k = R_k, SN_k = N_k
        N_k = R_k = r_k = m_k = r_1 = m_1 = 0
        TS_k = TS_k + 1                    /* Modulo-4 */
    Initiate Sequential Re-initialization.

Event : Backward RM cell arrival
    If ER > BR
        ER = BR

Sequential Re-initialization
    Let k* be the lowest level where the interval has expired.
    For i = 1 to number of flows
        Let k_P = ratelev_i
        if k_P <= k*
            if TSV_i = TS_{k_P} - 2        /* Modulo-4 */
                ratelev_i = k_P + 1, TSV_i = TS_{k_P} - 1
Bibliography
[1] F. Bonomi and K. W. Fendick, "The Rate-Based Flow Control Framework for the Available Bit Rate ATM Service," IEEE Network, March/April 1995, pp. 25-39.
[2] S. S. Sathaye, "ATM Forum Traffic Management Specification," AFTM-0056, June 1996.
[3] D. Bertsekas and R. Gallager, Data Networks, Englewood Cliffs, NJ: Prentice Hall, 1992.
[4] N. Ghani and J. W. Mark, "Dynamic Rate-Based Control Algorithm for ABR Service in ATM Networks," Proc. GLOBECOM'96, November 1996.
[5] G. Bianchi et al., "Congestion Control Algorithms for the ABR Service in ATM Networks," Proc. GLOBECOM'96, November 1996.
[6] S. Muddu et al., "Max-Min Rate Control Algorithm for Available Bit Rate Service in ATM Networks," Proc. GLOBECOM'96, November 1996.
[7] A. Barnhart, "Enhanced Switch Algorithm for Section 5.4 of TM Spec.," AF-TM 95-0195, Feb 1995.
[8] L. Roberts, "Enhanced PRCA (Proportional Rate Control Algorithm)," AF-TM 94-0735R1, Aug 1994.
[9] R. Jain et al., "ABR Switch Algorithm Testing: A Case Study With ERICA," AF-TM 96-1267, October 1996.
[10] F. M. Chiussi and A. Varma, "QOS and Congestion Control in ATM Networks," IEEE Workshop on VLSI in Communications, 1996.
[11] A. Charny et al., "Time Scale Analysis and Scalability Issues for Explicit Rate Allocation in ATM Networks," IEEE/ACM Trans. on Networking, pp. 569-581, August 1996.
[12] D. H. K. Tsang et al., "A New Rate-Based Switch Algorithm for ABR Traffic to Achieve Max-Min Fairness with Analytical Approximation and Delay Adjustment," INFOCOM'96, Mar 1996.
[13] Y. Afek et al., "Phantom: A Simple and Effective Flow Control Scheme," Proc. SIGCOMM'96, August 1996.
[14] OPNET Modeler, Volumes 1-8. MIL 3 Inc., Washington.
[15] R. Simcoe, "Test Configurations for Fairness and other Tests," AF-TM 94-0557, Jul 1994.
[16] Y. Zhao et al., "Feedback Control of Multiloop ABR Traffic in presence of CBR/ABR Traffic Transmission," Proc. ICC'96, June 1996.
[17] S. Prasad et al., "LAPLUS: A Provably Convergent Switch Algorithm for Flow Control of the Available Bit Rate ATM Service," In preparation.
Figure 4 - The Generic Fairness Configuration II network