Adaptive Dynamic Traffic Engineering for DiffServ-Enabled MPLS Networks

A. Hafid (1), N. Natarajan (2)

(1) Network Research Laboratory, University of Montreal, Pavillon André-Aisenstadt, H3C 3J7, Canada
[email protected]
(2) Telcordia Technologies, Inc., 331 Newman Springs Road, Red Bank, NJ 07701
[email protected]
Abstract

The current practice in traffic engineering for Multi-Protocol Label Switching (MPLS) networks is to use the expected traffic matrix and Service Level Agreement (SLA) policies as input and compute a set of Label Switched Paths (LSPs) that satisfy the stated requirements. However, a mismatch between the traffic forecast and the actual load, including its distribution among Differentiated Services (DiffServ) classes, is unavoidable in dynamic network environments. This may cause overload and unbalanced utilization of LSPs. Furthermore, network failures may occur, causing the failure of LSPs and possibly of their planned backups; even in the case of dynamic backups, the MPLS network may not be able to establish backup paths because of resource shortages. Consequently, there is a critical need for adaptive dynamic traffic engineering. In this paper, we propose two novel schemes to realize such adaptive dynamic traffic engineering: a traffic rearrangement scheme that executes when changes to the link bandwidth partitions among DiffServ classes are needed to alleviate overload situations, and a traffic restoration scheme that kicks into action whenever the MPLS network fails to restore failed LSPs after network failures.

Keywords: MPLS, DiffServ, traffic restoration, traffic rearrangement
1. Introduction

Traffic engineering is the ability to plan and control the routing of traffic through a network, ensuring efficient and balanced utilization of network resources while simultaneously satisfying traffic Quality of Service (QoS) requirements and Service Level Agreements with users. Using the expected traffic matrix and SLA policies as input, Multi-Protocol Label Switching (MPLS) traffic engineering tools (e.g., [1]) compute a set of Label Switched Paths (LSPs) that satisfy the given requirements. Policies for routing flows over LSPs are also computed. The policies and LSP configuration produced by these tools work well as long as the current
network conditions do not deviate from the input data given to these tools. However, a mismatch between the traffic forecast and the actual load, including its distribution among Differentiated Services (DiffServ) classes, is unavoidable in dynamic network environments. This may cause overutilization of some LSPs. Further, in situations where admission control based on the bandwidth available on LSPs is used to regulate the admission of new flows or sessions into the network, underutilization of some LSPs may also occur. Such anomalies may be alleviated by creating new LSPs [2, 3, 4, 5] to adapt to increases in traffic demand for specific traffic classes, either on an edge-pair or a network-wide basis. LSPs may be dynamically created for each DiffServ class; DiffServ-aware LSP creation can be realized by viewing the MPLS network as a collection of logical networks (one logical network per DiffServ class). However, this is not sufficient in the case of a resource shortage for one or more DiffServ classes; in that case, the resource partitions, per DiffServ class, must be adjusted. Indeed, we may find that the resources assigned to one DiffServ class (e.g., EF) are underutilized while the resources assigned to another DiffServ class (e.g., AF1) are over-utilized (thus causing overload for that class). To alleviate this type of situation, the bandwidth partitions (hereafter called DiffServ policies) need to be adjusted in such a way that resource utilization is maximized while assured service levels are still provided to customers. Adjusting these ratios changes the amount of bandwidth assigned to each service class. To enforce such an adjustment, LSP rearrangement is necessary; we are not aware of any existing approach that rearranges LSPs to satisfy changes in bandwidth partitions. In this paper, we propose a rearrangement scheme that reconfigures existing LSPs by adjusting their bandwidth, creates new LSPs if needed, and assigns flows to LSPs so as to satisfy the new DiffServ policies while minimizing the disruption of ongoing traffic. Furthermore, the proposed scheme accommodates high-priority traffic first and then low-priority traffic. In MPLS networks, in case of failures (e.g., a link failure), traffic is switched over from failed LSPs to the corresponding backup LSPs. However, this does not guarantee the
restoration of failed traffic upon every failure; indeed, a network failure may cause the failure of some primary LSPs and their corresponding pre-planned backups, and/or there may not be enough available resources to set up dynamic backups or to restore/reroute the traffic of failed primary LSPs that share backup LSPs; MPLS network restoration is based on the provisioned bandwidth of LSPs [6]. In this paper, we propose mechanisms for finer-grained traffic restoration than that supported by the MPLS network: we present a restoration scheme that is based on utilized bandwidth. This means that the LSPs created to restore traffic, called restoration LSPs, are set up with a reserved bandwidth equal to the bandwidth actually used on the failed LSPs rather than their provisioned bandwidth; the scheme therefore has a better chance of finding such LSPs than MPLS network restoration schemes do. Furthermore, the proposed scheme restores high-priority traffic first and then, if possible, lower-priority traffic. The restoration LSPs are used only to carry the failed traffic and are removed once the failures are repaired and the traffic is rerouted to the repaired LSPs; this is done to maintain the pre-defined optimal layout of LSPs [7]. The remainder of the paper is organized as follows. Section 2 describes the traffic rearrangement scheme. Section 3 describes the restoration scheme. Section 4 describes a prototype implementation of a policy-based network management system developed at Telcordia that makes use of the rearrangement and restoration schemes. Finally, Section 5 concludes the paper.
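To illustrate the bandwidth-sizing idea, the following Python sketch dimensions restoration LSPs from utilized rather than provisioned bandwidth and restores classes in priority order. The data structures, the priority ordering, and the find_route/setup_lsp helpers are hypothetical placeholders for the underlying MPLS control functions, not the scheme of Section 3.

```python
# Minimal sketch (not the paper's implementation): size restoration LSPs by the
# bandwidth actually used on the failed LSPs, restoring higher-priority DiffServ
# classes first. FailedLsp, find_route, and setup_lsp are hypothetical.
from dataclasses import dataclass
from typing import Dict, List

# Assumed priority order: EF before the AF classes before best effort.
PRIORITY = ["EF", "AF1", "AF2", "AF3", "AF4", "BE"]

@dataclass
class FailedLsp:
    ingress: str
    egress: str
    provisioned: Dict[str, float]   # Mbps provisioned per DiffServ class
    allocated: Dict[str, float]     # Mbps actually carried per DiffServ class

def restore(failed: List[FailedLsp], find_route, setup_lsp) -> List[str]:
    """Restore failed traffic class by class, highest priority first.

    find_route(ingress, egress, cls, bw) -> route or None
    setup_lsp(route, cls, bw) -> lsp_id
    Both are assumed to be supplied by the MPLS control layer.
    """
    restoration_lsps = []
    for cls in PRIORITY:
        for lsp in failed:
            # Reserve only what the failed LSP was actually carrying for this
            # class, not what was provisioned for it.
            bw = lsp.allocated.get(cls, 0.0)
            if bw <= 0.0:
                continue
            route = find_route(lsp.ingress, lsp.egress, cls, bw)
            if route is not None:
                restoration_lsps.append(setup_lsp(route, cls, bw))
            # If no route is found, this class of this LSP is left unrestored
            # and the loop moves on to the remaining traffic.
    return restoration_lsps
```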
2. The traffic rearrangement scheme

The goal of the traffic rearrangement scheme is to perform adaptive dynamic traffic engineering in response to policy changes. Two types of policy changes are supported:
• Edge-to-edge capacity change: this type of change is used to accommodate changes in the traffic load between a specific edge pair; the request includes the amount of bandwidth, per DiffServ class, to be allocated between a pair of ingress and egress nodes;
• DiffServ policy change: this type of change is requested when the current traffic class distribution is significantly different from the previously planned distribution; the request specifies the new DiffServ policies (i.e., the bandwidth partitions among DiffServ classes in terms of link bandwidth percentages) to be enforced.
The traffic rearrangement scheme realizes adaptive traffic engineering via LSP reconfiguration, LSP creation, flow-to-LSP assignment, and preemption of lower-priority traffic in order to satisfy the new policies. The design goal of the scheme is to produce a minimum number of LSPs and to minimize the disruption of ongoing traffic. The scheme for determining and generating policy changes is out of the scope of this paper; it is the subject of a patent that has been filed [8]. Let us first define the following terms, which will be used in the remainder of the section:
(1) Provisioned bandwidth (L, C), where C is a DiffServ class and L is an LSP: the bandwidth provisioned for C in L;
the sum of Provisioned bandwidth (L, C) over all classes C supported by L is equal to the bandwidth reserved for L by the network at LSP setup time.
(2) Allocated bandwidth (L, C): the amount of bandwidth on L that has been assigned to traffic/flows of C (e.g., if an LSP L carries two 1-Mbps EF flows, then Allocated bandwidth (L, EF) is 2 Mbps).
(3) Available bandwidth (L, C): Provisioned bandwidth (L, C) minus Allocated bandwidth (L, C).
(4) Available link bandwidth (E, C), where E is a link: the maximum capacity that is permitted for C on E minus the sum of Provisioned bandwidth (L, C) over all LSPs L that traverse E. It is defined only for links for which this value is greater than zero.
(5) Violated bandwidth of link (E, C): the sum of Provisioned bandwidth (L, C) over all LSPs L that traverse E minus the maximum capacity that is permitted for C on E. It is defined only for links for which this value is greater than zero.
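As a concrete illustration of this bookkeeping, the short Python sketch below computes quantities (3) through (5) from per-LSP and per-link records; the Lsp and Link classes and their field names are introduced here only for the example and are not taken from the scheme itself.

```python
# Illustrative bookkeeping for the per-class bandwidth quantities defined above.
# The data structures (Lsp, Link) and field names are hypothetical; all values
# are in Mbps.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Lsp:
    links: List[str]                                              # link ids the LSP traverses
    provisioned: Dict[str, float] = field(default_factory=dict)   # per DiffServ class
    allocated: Dict[str, float] = field(default_factory=dict)     # per DiffServ class

@dataclass
class Link:
    max_capacity: Dict[str, float]   # max bandwidth permitted per class (set by the DiffServ policy)

def available_bandwidth(lsp: Lsp, cls: str) -> float:
    # (3) Available bandwidth (L, C) = Provisioned bandwidth (L, C) - Allocated bandwidth (L, C)
    return lsp.provisioned.get(cls, 0.0) - lsp.allocated.get(cls, 0.0)

def provisioned_on_link(link_id: str, cls: str, lsps: List[Lsp]) -> float:
    # Sum of Provisioned bandwidth (L, C) over all LSPs L that traverse link E.
    return sum(l.provisioned.get(cls, 0.0) for l in lsps if link_id in l.links)

def available_link_bandwidth(link_id: str, link: Link, cls: str, lsps: List[Lsp]) -> float:
    # (4) Defined only when positive; callers should treat non-positive values as "not defined".
    return link.max_capacity.get(cls, 0.0) - provisioned_on_link(link_id, cls, lsps)

def violated_bandwidth(link_id: str, link: Link, cls: str, lsps: List[Lsp]) -> float:
    # (5) Defined only when positive: by how much the policy for C is exceeded on E.
    return provisioned_on_link(link_id, cls, lsps) - link.max_capacity.get(cls, 0.0)
```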
2.1. Traffic rearrangement

In response to an edge-to-edge capacity change request, the traffic rearrangement scheme first tries to reconfigure existing LSPs to accommodate the requested amount of bandwidth. More specifically, it identifies the LSPs that can support the requested DiffServ classes between the requested ingress and egress nodes; an LSP can support one or more DiffServ classes, depending on the traffic engineering policies specified by the network provider. Then, for each of these LSPs and for each DiffServ class C, it determines the potential bandwidth increase B, which is the minimum of Available link bandwidth (E, C) over all links E traversed by the LSP. It then selects a minimum number of LSPs such that the sum of the corresponding Bs is larger than the requested amount. If this selection succeeds, the rearrangement scheme reconfigures these LSPs by upgrading their provisioned bandwidth; otherwise, the reconfiguration of these LSPs is supplemented with the computation of new LSPs (see Section 2.2 for more details) to accommodate the remainder of the requested bandwidth. In response to a policy change request that requires DiffServ policy changes, the traffic rearrangement scheme tries to reconfigure a minimum number of LSPs by decreasing their reserved bandwidth without rerouting any traffic; the reduction corresponds to bandwidth that is reserved for the LSP but not allocated to any flow. If such LSP reconfigurations produce a network configuration that satisfies the new policies (i.e., the provisioned bandwidth on each network link, for each DiffServ class, does not exceed the amount defined by the new DiffServ policy), then the traffic rearrangement scheme is complete. Otherwise, the scheme creates a minimum set of new LSPs (see Section 2.2 for details) and reroutes the minimum amount of traffic needed to produce a network configuration that satisfies the new policies. Thus, the proposed rearrangement scheme will not affect ongoing traffic unless it is really necessary, and in the
case when it is necessary, it will minimize the amount of traffic rerouted. If the new set of LSPs cannot accommodate the traffic that needs to be rerouted, lower-priority traffic will be preempted to allow the accommodation of higher-priority traffic.
The traffic rearrangement scheme for DiffServ policy changes consists of three phases, as described below; Phases 2 and 3 are executed only when Phase 1 does not succeed in producing a network configuration that satisfies the new DiffServ policies.
Phase 1: Reduction of provisioned bandwidth of LSPs
This phase consists of identifying the list of LSPs to reconfigure (i.e., whose provisioned bandwidth is to be reduced) in order to satisfy, if possible, the new DiffServ policies. A simple approach to realize this phase consists of considering each link that violates the new policies and then selecting, for reconfiguration, one or more LSPs that traverse this link. However, this would produce a large set of LSPs to reconfigure. To minimize the number of LSPs to reconfigure, the proposed traffic rearrangement scheme uses a heuristic, called "LSP over maximum violated links"; this heuristic considers first, for bandwidth reduction, the LSPs that traverse the most links violating the new DiffServ policies; this means that an LSP that traverses n (violating) links is considered before an LSP that traverses m (violating) links, where m < n.
The computation of routes for new LSPs is summarized by the following pseudocode.
Input
• L: a list of bandwidth requests of the form <ingress, egress, DiffServClass, BW>;
• Current network state: the list of links in the network with their available bandwidth per DiffServ class.
Output
• RouteLSPs = [(Route1, {<…>, <…>, …}), (Route2, {<…>, <…>, …}), …]
Variables
• DB0, DB1, …, DBmaxIndex: variables that represent lists of the form {<…>, …}; initially, they are set to null.
Begin
  While (L is not empty) do
    Remove <ingressj, egressj, DiffServClassj, BWj> from L; /* <ingressj, egressj, DiffServClassj, BWj> is the first element in L */
    Compute a route, Routej, with maximum bandwidth, BWmj, for DiffServClassj, between ingressj and egressj; /* this step uses a modified version of the shortest path algorithm; it uses as input the current network state */
    Set all DBm (0 <= m <= maxIndex) to null;
    If (BWmj >= BWj) then
      /* an element of L with requested bandwidth BWk can be accommodated by Routej if the bandwidth still available on Routej for its DiffServ class >= BWk; otherwise, it cannot */
      Add to DB0 any element of L that has been accommodated by Routej;
      Add (Routej, DB0) to RouteLSPs;
      Remove from L any element that has been accommodated by Routej;
      Update the network state by updating the available bandwidth, for each DiffServ class that is accommodated, on each link traversed by Routej; /* new available bandwidth, for DiffServClassj, on a link traversed by Routej = (current available bandwidth, for DiffServClassj, on the link) – BWj */
    else
      Compute as few routes, Routes = {Routej0, Routej1, …, Routejm}, as possible that can accommodate BWj; /* each route, Routejp (0 <= p <= m), … */
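A compact Python rendering of the single-route branch of this greedy computation may help clarify the control flow. The graph model, the widest-path search, and the accommodation test below are illustrative assumptions of ours (the modified shortest-path algorithm and the multi-route else branch of the pseudocode are not reproduced), and all identifiers are hypothetical.

```python
# Sketch of the greedy route computation above (single-route branch only).
# The network model and the widest-path search are illustrative assumptions.
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]                      # adjacency: node -> neighbor nodes
Avail = Dict[Tuple[str, str], Dict[str, float]]   # (u, v) -> available Mbps per DiffServ class

def widest_path(graph: Graph, avail: Avail, src: str, dst: str, cls: str):
    """Return (route, bottleneck bandwidth) maximizing the bottleneck for class cls."""
    best = {src: float("inf")}
    prev: Dict[str, str] = {}
    heap = [(-float("inf"), src)]
    while heap:
        neg_bw, u = heapq.heappop(heap)
        if u == dst:
            break
        for v in graph.get(u, []):
            bw = min(-neg_bw, avail.get((u, v), {}).get(cls, 0.0))
            if bw > best.get(v, 0.0):
                best[v] = bw
                prev[v] = u
                heapq.heappush(heap, (-bw, v))
    if dst != src and dst not in prev:
        return [], 0.0
    route, node = [dst], dst
    while node != src:
        node = prev[node]
        route.append(node)
    return list(reversed(route)), best[dst]

def compute_route_lsps(graph: Graph, avail: Avail, L: List[Tuple[str, str, str, float]]):
    """L holds (ingress, egress, DiffServClass, BW) requests; returns [(route, accommodated requests)]."""
    route_lsps = []
    while L:
        ingress, egress, cls, bw = L.pop(0)
        route, max_bw = widest_path(graph, avail, ingress, egress, cls)
        if max_bw >= bw:
            accommodated = [(ingress, egress, cls, bw)]
            remaining = max_bw - bw
            # Also fit, onto the same route, pending requests for the same edge pair and class.
            for req in list(L):
                if req[0] == ingress and req[1] == egress and req[2] == cls and req[3] <= remaining:
                    accommodated.append(req)
                    remaining -= req[3]
                    L.remove(req)
            # Update the network state on every link of the chosen route.
            used = sum(r[3] for r in accommodated)
            for u, v in zip(route, route[1:]):
                link = avail.setdefault((u, v), {})
                link[cls] = link.get(cls, 0.0) - used
            route_lsps.append((route, accommodated))
        # else: the pseudocode computes several routes that jointly accommodate BW;
        # that branch is omitted from this sketch.
    return route_lsps
```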