Photon Netw Commun (2009) 17:35–47 DOI 10.1007/s11107-008-0141-2
p-cycle Protection in multi-domain optical networks János Szigeti · Ricardo Romeral · Tibor Cinkler · David Larrabeiti
Received: 6 May 2008 / Accepted: 16 July 2008 / Published online: 13 September 2008 © Springer Science+Business Media, LLC 2008
Abstract Providing resilient inter-domain connections in multi-domain optical GMPLS networks is a challenge. On the one hand, the integration of different GMPLS domains to run traffic engineering operations requires the development of a framework for inter-domain routing and control of connections, while keeping the internal structure and available resources of the domains undisclosed to the other operators. On the other hand, the definition of mechanisms to take advantage of such automatically switched inter-domain connectivity is still an open issue. This article focuses on the analysis of the applicability of one of these mechanisms: p-cycle-based protection. The proposed solution is based on the decomposition of the multi-domain resilience problem into two sub-problems, namely, the higher level inter-domain protection and the lower level intra-domain protection. Building a p-cycle at the higher level is accomplished by certain tasks at the lower level, including straddling link connection, capacity allocation and path selection. In this article, we present several methods to realize inter-domain p-cycle protection at both levels and evaluate their performance in terms of availability and spent resources. A discussion on a proposal of implementation of signalling based on extensions of existing protocols such as RSVP-TE and the PCE architecture illustrates the practical viability of the approach.

Keywords Resilience · p-cycle · Multi-domain networks

J. Szigeti (B) · T. Cinkler
Department of Telecommunications and Media Informatics, High Speed Networks Laboratory, Budapest University of Technology and Economics, Magyar tudósok krt. 2, 1117 Budapest, Hungary
e-mail: [email protected]

T. Cinkler
e-mail: [email protected]

R. Romeral · D. Larrabeiti
Department of Telematic Engineering, Carlos III University of Madrid, Leganes 28911, Spain
e-mail: [email protected]

D. Larrabeiti
e-mail: [email protected]
1 Introduction

In optical transport networks, the huge amount of transported traffic belongs to several service classes. These service classes usually have different reliability requirements to be guaranteed for the connections or for the traffic belonging to the class. Without a suitable resilience method, most of these reliability requirements cannot be fulfilled. The simplest way to enhance the reliability of optical networks is to form BLSRs (Bidirectional Line Switched Rings); however, there exist several other failure handling methods, which provide different availability improvements and traffic loss, and require different amounts of spare capacity, switching intelligence and routing protocols. In a multi-domain environment, the expectation of the end-users is that they get the same, or nearly the same, reliability for long inter-domain connections as for short intra-domain connections. This expectation is, however, set back by several difficulties: (a) physically longer connections may fail with higher probability, and hence by using merely those traditional protection schemes that were satisfactory in intra-domain cases, we do not get the desired grade of availability; and (b) setting up protection in a multi-domain environment, where multiple service providers are present, also raises common management and control issues. In this article, we present a method to provide resilient inter-domain connections based on the so-called p-cycle protection scheme and also give the protocollar background
of the method to realize it in today's and tomorrow's GMPLS-controlled optical networks. The article is organized as follows. The next section recalls the p-cycle concept and specifies the method used in the article to estimate the availability of a connection. Section 3 deals with the definition of the target problem. Then, Sect. 4 gives an overview of the options for intra-domain interconnection for the p-cycle and of the optimization objective. Section 5 describes the key aspects of signalling required to realize the proposed approach. Section 6 presents the results of the simulations, and finally, Sect. 7 draws general conclusions.
2 p-cycle protection

The p-cycle protection scheme was originally proposed by Grover and Stamatelakis [1], combining the advantages of the ring and of the mesh: it realizes ring-like recovery speed while retaining the capacity efficiency of mesh-based methods. The p-cycle is a cyclic, pre-calculated, pre-assigned, closed path with a certain amount of allocated spare capacity [2]. It provides protection for any link that has both end nodes on the cycle, either as an on-cycle link or as a straddling link. Similar to BLSR rings, if an on-cycle link fails, it is replaced by a protection path along the remaining cycle. In contrast to BLSR rings, the p-cycle is also able to protect straddling links, which results in higher capacity efficiency than for a ring of the same size [3]. In the case of a straddling link failure, the p-cycle can simultaneously protect two units of working capacity on the straddling link by providing two alternative backup paths around the p-cycle. It must also be mentioned that the p-cycle is a link-protecting scheme, i.e. it does not protect the connections against node failures; however, there are extensions that make it capable of protecting nodes too [4].

2.1 Classification

Network failures can be handled in several ways. However, it is common to each failure handling strategy that it intends to provide a backup path for the traffic that is going through the broken part (cable or switching equipment) of the network. The differences between the various resilience mechanisms, as partially presented in [5], can be viewed from many aspects and classified by

1. their scope:
   – is the whole working path protected by an alternate, disjoint backup path (end-to-end protection), are the segments of the working path separately protected [6,7], or are the spans between two adjacent nodes (e.g. p-cycles or RPR (Resilient Packet Ring) [8]) the subject of the protection;
2. the moment of their assignment:
   – is the backup path defined and allocated in advance (protection) or after the failure is detected (restoration);
3. the responsibility of activation:
   – do the end nodes of the connection or some intermediate nodes along the connection need to be notified about the failure event to activate the protection (connection oriented), or is the backup path activation performed locally;
4. the quality (does the backup path provide the same service level as the original one, what is the protection preemption strategy of the scheme, etc.) and the quantity of the backup connection:
   – is the whole amount of the traffic protected or only a certain ratio of it.

There may be other aspects taken into consideration as well. According to these four aspects of classification, the p-cycle fits into the set of local(1) span(2) protection(3) schemes. Regarding the quality and quantity(4), passing by more devices, the backup connection suffers longer transmission delay, and the amount of the protected traffic as well as the preemption strategy may be set dynamically.

2.2 Availability of p-cycles

Availability is one of the most important Quality of Service (QoS) attributes of a connection. In Service Level Agreements (SLA), the minimally expected availability is usually described as the allowed yearly outage (e.g. hours/year). In order to provide the desired grade of availability, the availability of connections must be estimated in advance. In order to estimate the availability of the optical cables, we take two basic invariants, namely, MTTR (Mean Time To Repair) and CC (Cable-Cuts, the average cable length suffering one cut a year). Having a cable of length l, its MTBF (Mean Time Between Failures) value is given in hours as

MTBF = (CC · 365 · 24) / l.

Next, the unavailability ratio is calculated as

U = MTTR / MTBF = MTTR / (CC · 365 · 24) · l.

Finally, we get the availability metric (A) of a network element as the complement of unavailability: A = 1 − U.
In our simulations (Sect. 6), we denote the compound invariant with LFC (Link Failure Coefficient), LFC = MTTR / (CC · 365 · 24), taking its value from the range [3 · 10−7, 3 · 10−4], which covers the optimistic, nominal and conservative values of MTTR and CC listed in [9]. The total availability of the whole connection can be calculated as a compound of the availability metrics of the devices along the connection paths. If there are many overlapping parts in the default and backup paths of the connections, and especially if the backup resources are shared, getting the exact values of availability based on the parallel/serial calculation model [10] gets overly complex. There are, however, heuristics that estimate the availability fast and very accurately. Huang et al. [11] trace connection availability back to link availabilities in networks without resource sharing, Mello et al. [12] suggest a method for estimating Shared Backup Path Protection (SBPP) availability, Pándi and Gricser [13] estimate connection availability by evaluating N failures in the network and give an upper bound for the estimation deviation, Clouqueur and Grover [14] investigate dual failures, and Schupke [15] analyses multiple failures in p-cycles. Another possibility is using heuristics like Monte-Carlo simulation or Tabu-Search for estimating connection availability. Szigeti and Cinkler [16] pointed out that, for p-cycle protected connections, using the serial/parallel model without considering network resource overlaps, the inaccuracy of the calculated unavailability remains moderate. In the simulations, we calculate connection availability by a simple serial/parallel model, where two elements having availabilities A1 and A2 have the availability As = A1 · A2 when connected serially and Ap = A1 + A2 − A1 · A2 when connected in parallel. According to these equations, we estimate the availability of a path made up of a sequence (series) of n protected (parallel) links as A^P = ∏_{i=1}^{n} A_i^P, where A_i^P = A_i^w + A_i^b − A_i^w · A_i^b. A^P is the total availability of the whole path, A_i^w is the availability of link i without protection, A_i^P is the total availability of the protected link i and A_i^b is the availability of the path protecting link i. Note that this is a lower-bound estimation, since the different backup paths protecting each link may share segments with each other, leading to dependency among protected segments and hence a higher product of availabilities.
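As a quick illustration, the serial/parallel availability model above can be sketched in a few lines of Python. The function names and the default MTTR and CC values are illustrative choices of ours, not figures taken from the article.

```python
# Sketch of the serial/parallel availability model of Sect. 2.2.
# MTTR is in hours, CC is the average cable length (km) suffering
# one cut per year, l is the cable length in km.

def unavailability(l_km, mttr_h=24.0, cc_km=450.0):
    """U = MTTR / MTBF, with MTBF = CC * 365 * 24 / l (hours)."""
    mtbf_h = cc_km * 365 * 24 / l_km
    return mttr_h / mtbf_h

def availability(l_km, mttr_h=24.0, cc_km=450.0):
    """A = 1 - U."""
    return 1.0 - unavailability(l_km, mttr_h, cc_km)

def serial(availabilities):
    """Elements in series: A_s is the product of the availabilities."""
    a = 1.0
    for ai in availabilities:
        a *= ai
    return a

def parallel(a1, a2):
    """Two parallel elements: A_p = A1 + A2 - A1 * A2."""
    return a1 + a2 - a1 * a2

def path_availability(links):
    """links: list of (A_w, A_b) pairs, i.e. working-link and
    backup-path availability per protected link.
    A^P = prod_i (A_i^w + A_i^b - A_i^w * A_i^b)."""
    return serial(parallel(aw, ab) for aw, ab in links)
```

For example, with the illustrative values MTTR = 24 h and CC = 450 km, a 100 km cable gets U = 24 · 100 / (450 · 365 · 24) ≈ 6.1 · 10−4, i.e. A ≈ 0.99939.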
3 Multi-domain optical networks and resilience

Today, optical circuit provisioning is usually performed manually and constrained to a single operator. Upcoming standards such as ASON and GMPLS will make automatic set-up of optical circuits usual within a few years. Each optical node will have an attached CPU running an IP routing protocol and a signalling protocol that will be used to set connections up according to the topology and addressing plan. At a first stage, automatically switched connections will be intra-domain. In a second phase, the benefits of extending the service to multiple domains will probably make operators sign cooperation agreements to create optical net exchanges supporting signalling. One of the advantages of these exchanges is fast provisioning and exploitation of optical circuits when the client sites are connected to different carriers; another advantage is the sharing of inter-domain protection costs, which is the focus of this article. Indeed, the possibility of having shared connections traversing multiple domains for the sake of resilience virtually extends the ability of the operators to protect their connections beyond their own domains.

3.1 Inter-domain resilience in GMPLS

Failure handling in an inter-domain environment is more complex than in the intra-domain case due to the traditional isolation of domains. No information on topology, capacity, etc. is disclosed to competing operators, and the responsibility for a failure is difficult to locate. Furthermore, existing protocols are not designed to deal with such a context. The consequence is that today, inter-domain resilience for optical networks is merely based on local protection, through the duplication of links and nodes in the inter-domain. This is an expensive static solution that can be improved in the case of groups of operators willing to cooperate in a very controlled way. Essentially, in order to make multi-domain resilience feasible, a number of key elements are required: (a) a signalling protocol suitable to set up connections throughout domains; (b) an inter-domain routing protocol to at least partially distribute the domain topology; (c) a secure mechanism to provide access to a domain's resources and, at the same time, constrain the view of most of these resources (internal topology, used/spare capacities, etc.) for neighbouring domains.
One possible triplet to solve these questions consists of extensions of IETF initiatives: (a) RSVP-TE, (b) BGP [17] and (c) the Path Computation Element (PCE). In fact, quite a few of the concepts developed for multi-domain IP-MPLS networks are also applicable to the optical circuit switching domain. However, to make this possible, various aspects must be addressed. In order to make it possible for RSVP-TE to signal a connection, it is necessary to provide a unique IP destination address as the connection tail-end, and a path. This means that the GMPLS operators should coordinate with each other to have a global IP addressing scheme, at least for edge nodes. Thus, BGP and the successive interior gateway routing protocols would help to route the RSVP request. However, since it is well known that BGP provides a slanted view of the domain topology (BGP only propagates one AS path to a destination), it is not appropriate for computing disjoint domain paths or cycles as required. Therefore,
path computation can be provided by an additional element, such as the PCE [18], a server specialised in path computation in and through domains. The advantage of this type of service is that PCEs can become the brokers that implement the operators' exchange policies, including privacy policies, for example, hiding private path information behind opaque tokens delivered to the originating domain, which are carried by the connection signalling messages. An example of the sensitive information dealt with is SRLG data. Moreover, a connection brokering scheme could even make it possible to do without an inter-domain routing protocol and a common addressing plan. This would require specific ways to denote domains and external nodes. It implicitly suggests the creation of a complex common management plane, in which all the operators are involved; simpler examples of this exist in other contexts like Metro Ethernet, with Ethernet Connectivity Fault Management [19]. This common plane decides how to protect against inter-domain link failures. In the presented solution, to support Multi-Domain p-cycles (MDPC), we assume the existence of such a common management plane for dealing with inter-domain link protection. In the following, we discuss why it is worth using MDPCs.
3.2 Multi-domain p-cycles

Although the p-cycle protection scheme applied in a multi-domain environment may consume more spare capacity than others, e.g. end-to-end dedicated protection (see Sect. 6.3), its application has several advantages. It inherits the fast reaction time of p-cycles, it does not require detailed intra-domain topology information (Sect. 3.3) and it is easy to deploy (Sect. 5) in GMPLS-controlled networks. It also provides higher availability for the connections, and this availability, compared to the conventional protection schemes, is predictable. For example, an end-to-end protected connection may fail with higher probability than expected (pre-calculated as the common outage of the two disjoint paths) in the unfortunate case when failures or SRLGs in foreign, competitor domains are not independent. Typically, disasters cause problems in many networks of different operators at the same place. If we also consider the feasibility of a protection scheme in this respect and want to eliminate such dependencies, topologically hierarchical networks with inter-operator contracts, and protection based on such contracts, gain importance. However, this requires a (hierarchically) higher-level management plane where these inter-operator contracts are agreed. In addition, at this point, the p-cycle, or any other non-connection-oriented protection scheme, becomes important, as such schemes do not require a new contract for each connection.
Fig. 1 Multi-domain network model: the higher level consists of the aggregated topology and the inter-domain links; the lower level contains the complete network topology and the intra-domain parts
3.3 Computing topology for MDPCs

Both for privacy and scalability reasons, the interior topology of domains is not propagated outside the domain. However, either via an EGP or a PCE, it is possible to find out which domain a node belongs to, the domain paths that lead to it, and the border nodes traversed. Therefore, it is possible to model a multi-domain network by a two-level topology, as shown in Fig. 1. At the higher level, only the border nodes and inter-domain links are known definitely. The internal topology of the domains, i.e., the real nodes and links, is hidden at the higher level; only virtual links between the border nodes denote the connectivity. We shall call this virtual intra-domain topology Border Node Connectivity Information (BNCI). At the lower level, each interior node has knowledge of its own internal topology, but has no information about the surrounding world outside the domain. In order to construct the upper level topology available to the common management plane of PCEs, the domains must share their own BNCI as an aggregation of their topology. Practically, this BNCI is described as a full-mesh graph consisting of logical links representing lower level (real or possible) paths between border nodes. If we want to select candidate p-cycles, e.g. with an availability restriction, the BNCI should contain availability metrics; in the case of a hop constraint on the length of the cycles, the BNCI must contain a hop-count metric, etc. The simplest case is when there are no constraints on MDPCs. In that case, the BNCI is uniform for each border node pair and does not carry any information; each domain can be represented by a single node, and the aggregation becomes much like that used in the Private Network-Node Interface (PNNI) [20]. In the following sections, we present how MDPCs can be realized using this model. The problem can be split into two parts:
1. Section 4 discusses the planning of MDPCs, including the selection of candidate p-cycles, the intra-domain cycle-part composition and the assignment of p-cycles to connections.
2. Section 5 identifies additional control aspects that are required when the failure occurs on a straddling link.
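The availability-annotated BNCI export described in Sect. 3.3 can be sketched as follows. This is a toy Python sketch: the domain topology, the function names and the choice of a most-reliable-path metric are our own illustrative assumptions, not material from the article.

```python
# Sketch of BNCI aggregation (Sect. 3.3): each domain exports a full
# mesh over its border nodes, annotated here with the availability of
# the most reliable internal path between every border-node pair.
import heapq
import math

def most_reliable(adj, src, dst):
    """Dijkstra on -log(availability): maximises the product of link
    availabilities along the path. adj: {u: [(v, A_uv), ...]}."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            return math.exp(-d)        # back to an availability value
        if d > dist.get(u, float("inf")):
            continue                   # stale queue entry
        for v, a in adj[u]:
            nd = d - math.log(a)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return 0.0                         # dst unreachable

def bnci(adj, border_nodes):
    """Full-mesh aggregation: best internal-path availability for
    every unordered border-node pair."""
    return {(s, t): most_reliable(adj, s, t)
            for s in border_nodes for t in border_nodes if s < t}
```

For a triangle domain with border nodes b1, b2 and interior node x, where the direct b1–b2 link has availability 0.98 and each link via x has 0.99, the exported BNCI entry for (b1, b2) is 0.99 · 0.99 = 0.9801, i.e. the path via x is preferred.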
4 Assigning candidate MDPCs

Having the upper level domain topology available, there are several fast and efficient cycle-search algorithms [21] that may be applied to obtain the (upper level) inter-domain p-cycle candidate set. Selecting the candidate p-cycles is the task of the common management plane (proposed in Sect. 3.2). A first approach to exploiting inter-domain p-cycles would be to pass a p-cycle through every pair of inter-connected border nodes between two or three domains that have several inter-domain links. This way, it would be possible to fast-reroute the connections on a straddling and on an on-cycle failing link. In this case, the protection is provided by a set of short p-cycles and there is not much difference from 1:1 protection of inter-domain links. In this article, we propose and study the usage of longer p-cycles traversing a greater number of domains, with the purpose of achieving greater sharing of reserved bandwidth on inter-domain links. The idea is to avoid passing the p-cycle through all the border nodes of a domain, which would otherwise be required to protect straddling links. Instead, a pre-configured intra-domain circuit is linked to the p-cycle just when a straddling link fails. The examples in Figs. 2 and 3 show which inter-domain links and intra-domain connections between border nodes (intra-domain parts) must be activated in the case of on-cycle and straddling link failures, respectively. Note that there are two types of border nodes in the figures: a CBN is the ending of an on-cycle link (on-Cycle link Border Node) and an SBN is the ending of a straddling link (Straddling link Border Node). The figures also show that the intra-domain part of the cycles is reconfigured to cross different border nodes depending on the inter-domain link that fails. Normally, the cycle passes through the shortest path between the two CBNs. However, in the case of a straddling link failure, in both domains linked by the straddling link, the cycle uses both paths between the SBN and the two CBNs (Fig. 3).
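As a plain illustration of building a candidate set on the aggregated topology, a brute-force DFS enumeration of elementary cycles might look like the sketch below. This is not one of the dedicated algorithms of [21]; the topology and the length cap are hypothetical.

```python
# Sketch of a simple candidate-cycle search on the aggregated upper
# level topology (Sect. 4): enumerate elementary cycles through a
# fixed start node with a bounded number of links.

def find_cycles(adj, start, max_len=6):
    """Return elementary cycles through `start` with at most max_len
    links. adj: {node: set(neighbours)}, undirected."""
    cycles = []

    def dfs(node, path):
        for nxt in sorted(adj[node]):
            if nxt == start and len(path) >= 3:
                cycles.append(path[:])          # closed a cycle
            elif nxt not in path and len(path) < max_len:
                dfs(nxt, path + [nxt])

    dfs(start, [start])
    # Each cycle is found in both orientations; keep one per cycle.
    uniq = {tuple(min(c, c[:1] + c[:0:-1])) for c in cycles}
    return [list(c) for c in sorted(uniq)]
```

On a four-node ring A–B–C–D–A, `find_cycles(adj, 'A')` returns the single candidate `['A', 'B', 'C', 'D']`.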
Fig. 3 Straddling link failure handling with higher level p-cycle

Fig. 4 Logical internal p-cycle connections and alternate resolutions: (a) internal connections to realize; (b) most reliable internal connections (MR); (c) least cost internal connections (LC); (d) ring-based internal connections (RB)
As long as there are no straddling links connected to the domain, the intra-domain planning task (finding a route between two gateways) is easy and self-evident: the gateways should be connected via the most suitable route. However, if one or more straddling link ends are not attached to a CBN, the planning of the intra-domain connection becomes more complex. Figure 4a shows the logical connections between border nodes that must be preconfigured inside the domain. These logical connections must be further refined into real paths, i.e. we have to define which intra-domain links realize the logical connections. This can be done in different ways, which we present below. Each of these ways provides a different global availability and spends a different amount of resources.

4.1 Optimization goal
Fig. 2 On-cycle link failure handling with higher level p-cycle
At this stage of network resolution, from the perspective of global optimization, there are two opposite goals: either we focus on the expenses and get the connections with the least cost (LC), or we optimize for reliability, providing the highest availability for the connection (Most Reliable, MR). If we take the latter choice, first, we must select the most reliable paths separately between each SBN–CBN pair and
also between the two CBNs (Fig. 4b). After the paths are found, the capacity assignment algorithm is easy: initially, we allocate one unit of capacity on each selected link, i.e. each link that is part of at least one path. Next, we examine the SBNs node by node: on those links that are common to the two paths leading from the SBN to the two CBNs, we must allocate two units (dashed lines) of capacity instead of one unit (solid lines), as they have to carry both units of capacity of the protected straddling link. Finding the cheapest and least resource-consuming connection between border nodes inside the domain (Fig. 4c) is a more complex optimization problem, which may be solved by ILP or another general solver. The LC-finder algorithm we use is a greedy heuristic that in each step connects the closest border node to the growing spanning tree between border nodes. The capacity assignment in this simple heuristic is the same as described for the MR case. In both approaches, it is assumed that the network supports LSP aggregation (this may require specific hardware) and that in the event of a failure the SBN notifies one of the CBNs to connect the preconfigured intra-domain connection (this is discussed in Sect. 5).

4.2 Achieving further reliability

Compared to intra-domain p-cycles, inter-domain cycles may be, and usually are, much longer and also less reliable. This makes it reasonable to improve their availability and to protect the inter-domain cycles, or at least some parts of them. One possibility is to assign two separate p-cycles to the inter-domain links, which have only the protected link in common. This way, the first protection is protected by another p-cycle. This choice raises many questions, some of which also apply to general (not only inter-domain) p-cycles, and it is not always topologically feasible.
Similarly, by routing the default path on a straddling link of a cycle, the cycle can serve as two disjoint backup paths for the link (see p-cycle Multi-Restorability Capacity Placement [22]). In this article, we do not deal with these options. We propose to assign protection to those parts of the long loop that are easy (from topological and protocollar aspects) to protect. In our case, these easy-to-protect parts are the intra-domain parts of the long inter-domain cycles. We get each internal connection realized if we find a cycle that passes through each affected border node (Fig. 4d). Additionally, in this case, there are two link-disjoint paths between each border node pair. We have to deal, however, with the following problems:

– There may be no suitable cycle stringing up all the border nodes. In that case, we may find the cycle that is richest in border nodes, i.e. the cycle that contains the most CBNs and SBNs. Afterwards, we can connect the remaining border nodes to this cycle. Of course, the nodes connected this way will not benefit from the advantage of the RB solution.
– There may be more than one cycle that contains each border node. Which one should we choose among them? If we also take capacity efficiency into account, a reasonable choice is the cycle, or the shortest of those cycles, which has a direct SBN-free connection between the two CBNs. These cycles require less capacity on the direct connection; the discussion of the resource requirement of RB below shows why.
It is clear that, compared to LC and MR, the RB intra-domain connection realization requires the most resources, as we define (at least) two paths for each CBN–CBN or SBN–CBN connection. Of course, these connections share the resources among themselves, but the capacity requirement is still high. Figure 5 illustrates the capacity requirement of the ring-based intra-domain solution. The normal case is when the protection is routed between the two CBNs. As Fig. 5a shows, the internal cycle provides two alternate routes; we use the shorter one by default, and whenever a link failure makes that route unavailable, we can still route the protection via the longer path. This scenario requires one unit of capacity on each link. The next figures, Figs. 5b–d, show scenarios where there is an inter-domain straddling link failure. In Fig. 5b, we see the default case where there is no failure in the internal cycle at all, or the failure does not affect either route1 or route2.
Fig. 5 Capacity requirement of the RB solution in different failure scenarios: (a) topology of RB internal routes; (b) no internal failure; (c) failure on route1; (d) failure on route2
Figures 5c–d illustrate scenarios where either route1 or route2 fails. In these cases, the two routes have parts in common, and on those shared links two units of capacity must be allocated. Generally, for the RB solution, to protect the inter-domain p-cycle against intra-domain failures, two units of capacity (instead of one) are required on those links that route the protection from the SBN directly to the CBNs. Consequently, if there are SBNs on both half-cycles between the two CBNs, two units of capacity must be allocated everywhere on the cycle.
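The capacity-allocation rule used in Sects. 4.1 and 4.2 (one unit of spare capacity per selected link, two units on links shared by the two paths leading from an SBN to the two CBNs) can be sketched as follows. This is a toy Python sketch; the helper names and the example topology are ours, not from the article.

```python
# Sketch of the MR/LC capacity assignment (Sect. 4.1): links shared
# by an SBN's two backup paths must carry both protected units of
# the straddling link, so they get two capacity units.

def assign_capacity(paths_per_sbn, cbn_path=()):
    """paths_per_sbn: {sbn: (path_to_cbn1, path_to_cbn2)}, each path a
    list of undirected links (frozenset node pairs). cbn_path is the
    CBN-CBN connection, if any. Returns {link: capacity units}."""
    cap = {}
    for lnk in cbn_path:                     # CBN-CBN connection
        cap[lnk] = max(cap.get(lnk, 0), 1)
    for p1, p2 in paths_per_sbn.values():
        shared = set(p1) & set(p2)           # carries both units
        for lnk in set(p1) | set(p2):
            units = 2 if lnk in shared else 1
            cap[lnk] = max(cap.get(lnk, 0), units)
    return cap

def link(a, b):
    """Undirected link between nodes a and b."""
    return frozenset((a, b))
```

For an SBN `s` whose paths to the two CBNs `c1`, `c2` both start with the link s–x and then diverge (s–x–c1 and s–x–c2), the shared link s–x gets two units and the diverging links one unit each, matching Fig. 4b.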
5 Protocollar issues of inter-domain p-cycles

As stated before, there are many protocollar aspects of the target scenario still to be developed and standardised by the research community. The main one is reaching a consensus on an inter-domain interaction framework, which should include a common addressing scheme and a means to stitch connections through multiple domains, in this case to build p-cycles. Depending on this expected consensus, which will be mainly driven by commercial reasons, one or another combination of protocols to support multi-domain connections will eventually succeed. Therefore, as of today, we can only identify the key aspects of signalling for the proposed solution.

5.1 Entities and protocols for p-cycle computation and set up
In order to resolve the higher level inter-domain protection using p-cycles, it is necessary to find a cycle that contains all the domains involved. Given the limitations of BGP [23,24] in providing the complete topology of domains, the IP-GMPLS multi-domain interconnection framework will probably have to rely on the PCE concept currently under definition in the IETF PCE Working Group [18] (or a specific connection-brokering scheme performing a similar function). A set of operators willing to collaborate and share p-cycles are assumed to set up and federate their PCEs. A PCE is a server associated with a domain that keeps a traffic engineering database and serves intra- and inter-domain path computation requests. The procedure that we propose is the following. In order to create a p-cycle, a traffic engineering tool running a PCE client (or a network node itself) would issue a request to its PCE for a path through a given domain sequence (denoted by the AS numbers if BGP is run). After querying the other domains' PCEs, the PCE would return either the explicit path for the p-cycle or a list of tokens to be handed over by RSVP-TE at each domain traversed. Each token would be translated into a concrete path within the domain, again with the help of the PCE. Thanks to this approach, it is theoretically possible to issue p-cycle computation requests by delegating to the PCEs all complex inter-domain issues, such as path computation, addressing of external nodes, connection admission control, and security and privacy. Once the PCE transaction is complete and all the information is available, the actual p-cycle setup and activation is expected to be signalled by a protocol such as RSVP-TE.

5.2 Link failure recovery procedure

Let us show two examples to illustrate our proposal for inter-domain p-cycle activation upon a link failure. Figure 6 shows the p-cycle configuration for five domains. The p-cycle uses the inter-domain links and the internal connections R12–R21–R23–R32–R35–R53–R54–R45–R41–R14–R12 to protect the inter-domain links; to protect the straddling link R24–R42, it additionally uses the internal connections R21–R24, R23–R24 and R41–R42, R45–R42.

If an on-cycle inter-domain link fails, e.g. R14–R41 in Fig. 7, some inter-domain LSPs may be affected, like LSP1 in the figure. When node R14 detects the failure, it sends a Notify message along the p-cycle to inform the whole p-cycle control plane that a failure has been detected and that the cycle is in use. Immediately, nodes R14 and R41 re-route the affected connections over the p-cycle. If the connections were MPLS LSPs with generic MPLS encapsulation rather than switched circuits, a single p-cycle LSP could be shared by all recovered LSPs by means of label stacking. Otherwise, the p-cycle must carry as many protection circuits as there are circuits to be protected.

If the failure is on a straddling inter-domain link, such as R24–R42 in Fig. 8, the signalling is somewhat more complicated. When node R24 detects the failure, the affected connections, e.g. LSP2 in Fig. 8, must be switched over the p-cycle. For this purpose, node R24 sends to its preconfigured CBN R23 a Notify TE message to indicate that the traffic/connections of LSP2 will be forwarded through it using the backup p-cycle

Fig. 6 Inter-domain p-cycle protecting six inter-domain links
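The two levels of the procedure can be sketched in code. The snippet below is an illustrative Python sketch (the domain-level graph of Fig. 6 is hard-coded, and the function names are ours, not part of any PCE API): it first searches for a domain-level cycle by brute force, then derives the protection paths a p-cycle offers for on-cycle and straddling link failures.

```python
from itertools import permutations

# Domain-level view of Fig. 6: five domains, six inter-domain links,
# of which 2-4 is a straddling link of the p-cycle found below.
links = {(1, 2), (2, 3), (3, 5), (4, 5), (1, 4), (2, 4)}

def connected(a, b):
    return (a, b) in links or (b, a) in links

def find_domain_cycle(domains):
    """Brute-force search for a cycle visiting every domain once.
    A real PCE federation would rely on TE policies and per-domain
    path tokens instead of global topology knowledge."""
    first, *rest = sorted(domains)
    for order in permutations(rest):
        cycle = [first, *order]
        if all(connected(cycle[i], cycle[(i + 1) % len(cycle)])
               for i in range(len(cycle))):
            return cycle
    return None

def protection_paths(cycle, failed):
    """Paths a p-cycle offers when link `failed` fails: the surviving
    arc for an on-cycle link, both arcs for a straddling link."""
    a, b = failed
    i, j = cycle.index(a), cycle.index(b)
    n = len(cycle)
    fwd = [cycle[(i + k) % n] for k in range((j - i) % n + 1)]
    bwd = [cycle[(i - k) % n] for k in range((i - j) % n + 1)]
    if (j - i) % n == 1:      # failed link lies on the cycle
        return [bwd]          # go the other way around
    if (i - j) % n == 1:
        return [fwd]
    return [fwd, bwd]         # straddling link: either arc works

pc = find_domain_cycle({1, 2, 3, 4, 5})
print(pc)                             # [1, 2, 3, 5, 4]
print(protection_paths(pc, (1, 2)))   # on-cycle:   [[1, 4, 5, 3, 2]]
print(protection_paths(pc, (2, 4)))   # straddling: [[2, 3, 5, 4], [2, 1, 4]]
```

The domain order found, 1-2-3-5-4, corresponds to the cycle R12–R21–R23–R32–R35–R53–R54–R45–R41–R14 of Fig. 6, and the two arcs returned for the straddling link 2-4 mirror the two recovery halves used in Fig. 8.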
and that its next hop is R42 in Domain4. R24 switches LSP2 to R23 via an intra-domain LSP. R23 receives the Notify message and the traffic from R24. Then, it sends a Notify message over the p-cycle to report the straddling-link failure over this half of the p-cycle. A similar interaction takes place in Domain4 between R42 and R45: R45 extends the p-cycle to include R42 thanks to the intra-domain LSP, and connectivity is restored.

Fig. 7 On-cycle failure

Fig. 8 Straddling link failure

6 Numerical results

In this section, we analyse the performance of the different MDPC solutions, focussing on the newly proposed, ring-based (RB) intra-domain cycle resolution scheme (see Sect. 4.2). We compare it to the LC (Least Cost) and MR (Most Reliable) solutions and examine it from the aspects of resource usage and provided availability. The figures also show curves corresponding to Dedicated End-to-End Protection (DP) and No Protection (NP); however, these are used only as references in a global network without any domain boundaries and topology aggregation.

6.1 Simulation setup

Numerically, we have investigated the topic of inter-domain p-cycles in five dimensions:

1. the network connectivity (described in Sect. 6.1.1);
2. the value of the Link Failure Coefficient (defined in Sect. 2.2);
3. which intra-domain cycle resolution we use (LC, MR or RB, see Sects. 4.1 and 4.2);
4. which intra-domain protection we apply (described in Sect. 6.1.2) and
5. how much spare capacity we provide for resilience (results in Sect. 6.5).

6.1.1 Test networks

We examined the protection schemes on three different networks:

– E1Net [25], a realistic European multi-domain network consisting of 17 national domains (Fig. 9).
– Xnet, a fairly regular grid network organized into grid groups of 16 nodes (Fig. 10).
– Tnet, which gives a fair compromise between realistic and regular/artificial networks, with eight domains and an average nodal degree of three (Fig. 11). This network was also used for simulations in [21].

Whilst Xnet is well suited for testing the dependence between resource consumption and provided availability, Tnet provides a good testbed to examine availability improvement, and with E1Net we can make observations on realistic, heterogeneous cases where double connectivity between node or domain pairs cannot always be achieved, the traffic is not uniformly distributed among the nodes, etc.

6.1.2 Protection on the intra-domain part of working paths

Besides having different topologies, each inter-domain p-cycle protection can be combined with a different intra-domain protection scheme protecting the intra-domain segments of the working path. We use the same combinations as presented in [21]. Regarding the intra-domain protection, this means:

– p-cycle protection (CIDA),
– dedicated protection (CIDED) or
– no protection at all (CIDA0).
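The scheme labels used in the figures combine these intra-domain options with the border-node connection strategies of Sect. 4. As a naming aid, the grid of combinations can be enumerated as follows (this is our reading of the labels, not code from the study, and not all nine combinations appear in every figure):

```python
from itertools import product

# Intra-domain working-path protection (Sect. 6.1.2) ...
intra = {"CIDA": "intra-domain p-cycle protection",
         "CIDED": "dedicated protection",
         "CIDA0": "no protection"}
# ... combined with a border-node connection strategy (Sects. 4.1-4.2):
strategies = ("LC", "MR", "RB")   # Least Cost, Most Reliable, Ring Based

schemes = [f"{i}-{s}" for i, s in product(intra, strategies)]
print(schemes)
# ['CIDA-LC', 'CIDA-MR', 'CIDA-RB', 'CIDED-LC', 'CIDED-MR',
#  'CIDED-RB', 'CIDA0-LC', 'CIDA0-MR', 'CIDA0-RB']
```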
Fig. 9 Topology of the E1Net

Fig. 11 Topology of Tnet
Fig. 12 Average unavailabilities of protection schemes
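Curves like those of Fig. 12 rest on a first-order composition of link unavailabilities. The sketch below illustrates the idea under a simplifying assumption of ours: each link's unavailability is taken to be just its LFC value (the actual availability model of Sect. 2.2 and [16] is more detailed).

```python
def path_unavailability(link_unavs):
    """Series composition: a path is unavailable if any link is down
    (first-order product form; simultaneous failures are neglected)."""
    avail = 1.0
    for u in link_unavs:
        avail *= (1.0 - u)
    return 1.0 - avail

def protected_unavailability(working, backup):
    """Parallel composition of a working path and a disjoint backup path;
    shared-risk groups and switching failures are ignored for simplicity."""
    return path_unavailability(working) * path_unavailability(backup)

# Hypothetical per-link unavailability equal to LFC = 3e-6, a 4-link
# working path and a 6-link disjoint backup:
lfc = 3e-6
print(path_unavailability([lfc] * 4))                  # ~1.2e-5 unprotected
print(protected_unavailability([lfc] * 4, [lfc] * 6))  # ~2.2e-10 protected
```

The several-orders-of-magnitude gap between the two printed values is what separates the No Protection curve from the protected schemes in Fig. 12.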
Fig. 10 Topology of the regular Xnet

6.2 Unavailability reduction

Our basic aim in planning the different p-cycle protection schemes was to reduce the unavailability of connections. We already know how the LC and MR strategies perform when applied in the CIDA and CIDED schemes; here we compare the RB strategy to them.

Figure 12 shows that on the regular topology (Xnet), CIDED-RB results in the lowest unavailability, and for LFC < 3.0 · 10−6, CIDA-RB also provides better availability than any other strategy, except for the theoretical Dedicated End-to-End protection.

6.3 Resource consumption
As expected, the strategies that result in higher availability demand somewhat more resources; the differences are all below one order of magnitude. Figure 13 points out this behaviour. Dedicated protection requires 2.5 times more capacity than connections without any protection (i.e. the backup routes are on average 1.5 times longer than the working paths), and each cycle-based protection scheme requires even more: CIDED-RB approximately four times as much. The intra-domain links employed by higher level p-cycles are wasted in the sense that their resources are allocated; however, in contrast to the inter-domain links, the higher level p-cycle does not offer protection for their traffic. This explains the relatively high resource consumption of MDPCs.

The relative thrift of the schemes can be measured as the ratio of the gained reduction in unavailability to the additional resource requirement. Figure 14 illustrates that with the Link Failure Coefficient in the range 10−6 < LFC < 10−5, the unavailability reduction is 10–100 times larger than the relative additional resource requirement of the schemes. Moreover, it can be seen that it is worth investing in CIDED-RB because its relative thrift is the highest.

Fig. 13 Resource consumption of protection schemes compared to the No Protection case

Fig. 14 Relative thrift of different protection schemes

Fig. 15 Tail behaviour of protection schemes in Xnet

6.4 Getting the desired ratio of connections with a defined availability

The previous figures are not enough to fully understand the differences among the protection schemes. In this section, we compare the protection schemes on different topologies by means of the tail behaviour of the connection availability. The figures here and in the following section (Sect. 6.5) show results for LFC = 3 · 10−6 (conforming to the nominal values of CC = 300 km and MTTR = 8 h). Figures 15, 16 and 17 show what percentage of 3,000–5,000 connections have higher availability than a given lower limit (the x axis). Figure 15 illustrates the strength of the CIDED and RB schemes on the regular topology: CIDED-RB, CIDED-MR and CIDA-RB satisfy the most connections with high availability.
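The relative thrift metric of Sect. 6.3, the ratio of the gained unavailability reduction to the additional resource requirement, can be sketched as a one-liner. The numbers below are hypothetical, merely in the spirit of Figs. 12 and 13:

```python
def relative_thrift(unav_scheme, unav_noprot, cap_scheme, cap_noprot):
    """Relative thrift (our reading of Sect. 6.3): the factor by which a
    scheme reduces unavailability, divided by its relative capacity cost."""
    unavailability_gain = unav_noprot / unav_scheme
    capacity_cost = cap_scheme / cap_noprot
    return unavailability_gain / capacity_cost

# A scheme that reduces unavailability 100-fold while consuming 4 times
# the capacity of unprotected provisioning (hypothetical values):
print(relative_thrift(1e-6, 1e-4, 4.0, 1.0))   # ~25
```

A value well above 1, as here, is what makes the investment in schemes such as CIDED-RB worthwhile despite their capacity cost.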
Fig. 16 Tail behaviour of protection schemes in Tnet
Figure 16 depicts the trends in Tnet. It is worth observing the behaviour of the curve corresponding to the DP scheme. For a small fraction of the connections (basically the short ones) it can provide high availability, as much as CIDED-RB does; however, for most connections it offers relatively low availability.

The Pan-European multi-domain network, E1Net, shows different behaviour from the other test networks. Figure 17 shows that on the E1Net topology CIDED performs worse than CIDA. This can be explained by the fact that dedicated protection cannot be established inside every domain. The same fact explains the negligible dominance of the RB schemes, as the cycle connecting the border nodes in the RB strategy also requires two disjoint paths between the border node pairs.

6.5 Performance in overloaded networks

Section 6.3 pointed out that the price of providing high availability is high: a complex protection scheme demands three or four times more free resources than connection provisioning without protection. Because of this, inter-domain p-cycles are suggested for use in networks with plenty of free capacity. However, what happens if the network becomes partially overloaded?
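The tail curves of Figs. 15–18 plot, for each availability threshold, the fraction of connections meeting it. The computation behind such a curve is simply an empirical complementary distribution (hypothetical data with five connections, instead of the 3,000–5,000 used in the study):

```python
def availability_tail(availabilities, thresholds):
    """Fraction of connections whose availability meets each threshold,
    i.e. one curve of Figs. 15-18 sampled at the given thresholds."""
    n = len(availabilities)
    return [sum(a >= t for a in availabilities) / n for t in thresholds]

# Hypothetical availabilities of five connections:
avs = [0.99999, 0.999995, 0.99992, 0.999999, 0.99997]
print(availability_tail(avs, [0.9999, 0.99995, 0.999999]))  # [1.0, 0.8, 0.2]
```

A scheme dominates another when its curve lies above the other's for every threshold; this is the sense in which CIDED-RB dominates in Fig. 15.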
Fig. 17 Tail behaviour of protection schemes in E1Net

Fig. 18 Tail behaviour of protection schemes in overloaded networks

Figure 18 shows the tail behaviour of the investigated protection schemes in Xnet, where without any protection there was an average link load of 20% (intra-domain) and 31% (inter-domain). It can be seen that, compared to Fig. 15, Dedicated End-to-End protection is the only scheme that does not suffer any loss due to resource shortage; with the other schemes, only about 80% of the connections can be fully protected. The performance of CIDA-RB is surprising: despite its relatively high resource usage (see Fig. 13), it provides high availability for more connections than schemes requiring less spare capacity. The reason for this behaviour is that the p-cycle used at the lower level to protect the intra-domain working path is much more elastic than dedicated protection: the dedicated protection scheme leaves the working path unprotected if no disjoint backup path can be found, whereas the p-cycle tries to protect as many links as possible and finally leaves far fewer working-path links unprotected than dedicated protection does. The simulations also showed that the inter-domain links did not get overloaded; only the intra-domain links did at the lower level. Moreover, when the link capacity was increased by 10%, which means setting the initial intra-domain load to 18%, there was no resource shortage at all. To summarize, even though CIDED strategies provide the highest availabilities, due to their rigidity they perform poorly in an overloaded network, and since the network links are not homogeneously loaded, resource shortage occurs sooner than the resource consumption of the different schemes (Fig. 13) would suggest.

7 Conclusions
In this article, we proposed and analysed a procedure for using p-cycles to provide resilience of inter-domain links in a multi-domain GMPLS scenario. The proposal consists of two parts: we first proposed a mechanism to realise and exploit inter-domain p-cycles over the topology of domains, and then we combined the inter-domain p-cycle protection with different intra-domain protection schemes. We also presented the fundamental protocol issues related to this solution: (a) the practical construction of the topology of domains, (b) the set-up of p-cycles and (c) the procedure to recover connections over on-cycle and straddling links. The discussion of a possible implementation of the signalling, based on extensions of existing protocols such as RSVP-TE and the PCE architecture, illustrated the practical viability of the approach.

Comparing the different p-cycle-based multi-domain protection schemes and setting them against the dedicated end-to-end protection scheme, we found that the schemes provide nearly the same amount of unavailability reduction with respect to the no-protection case (Fig. 12). The dominance of the CIDED strategies (inter-domain p-cycles combined with intra-domain dedicated protection) is confirmed by analysing the tail behaviour of the availability metric, at the cost of additional resource consumption. If we adopt an optimization objective that takes resource consumption and provided availability into account together, e.g. the relative thrift metric as the ratio of unavailability reduction to additional resource usage, it is clear that the CIDED scheme with the RB (Ring Based) intra-domain border node connection strategy shows the best performance. These results are general except for poorly connected networks, e.g. E1Net, where disjoint paths between border nodes cannot be found inside the domains and the CIDED and RB solutions consequently have weak performance metrics.
Furthermore, these protection schemes should be used only in networks with plenty of spare resources, where it is convenient to use them instead of investing in local protection links. Finally, we can observe that, in general, p-cycles are a viable traffic engineering tool to improve the resilience of inter-domain connections and to benefit from the well-known advantages of this technique, such as low maintenance, short recovery times and low cost. Further research towards the development of an appropriate business model to support these concepts may draw the interest of operators to this type of technique in the coming years.

Acknowledgements The work described in this article was carried out with the support of the BONE project ("Building the Future Optical Network in Europe"), a Network of Excellence funded by the European Commission through the 7th ICT-Framework Programme, ePhoton/ONe+ and Spanish grant TSI2005-07384-C03-02. Many thanks to the LEMON team (http://lemon.cs.elte.hu) for developing and providing their excellent graph template library to accelerate our simulations.
References

[1] Grover, W.D., Stamatelakis, D.: Cycle-oriented distributed preconfiguration: ring-like speed with mesh-like capacity for self-planning network restoration. In: ICC 1998, Atlanta, GA, June 1998, pp. 537–543 (1998)
[2] Blouin, F.J., Sack, A., Grover, W.D., Nasrallah, H.: Benefits of p-cycles in a mixed protection and restoration approach. In: DRCN 2003, Banff, Alberta, Canada, October 2003, pp. 203–211 (2003)
[3] Grover, W.D., Stamatelakis, D.: Bridging the ring-mesh dichotomy with p-cycles. In: DRCN 2000, Munich, Germany, April 2000, pp. 92–104 (2000)
[4] Shen, G., Grover, W.D.: Extending the p-cycle concept to path segment protection for span and node failure recovery. IEEE J. Sel. Areas Commun. 21(8), 1306–1319 (2003)
[5] Ramamurthy, S., Mukherjee, B.: Survivable WDM mesh networks. II. Restoration. In: ICC '99, vol. 3, pp. 2023–2030 (1999)
[6] Ho, P.-H., Mouftah, H.T.: A framework for service-guaranteed shared protection in WDM mesh networks. IEEE Commun. Mag. 40(2), 97–103 (2002)
[7] Ho, P.-H., Tapolcai, J., Cinkler, T.: Segment shared protection in mesh communications networks with bandwidth guaranteed tunnels. IEEE/ACM Trans. Netw. 12(6), 1105–1118 (2004)
[8] Spadaro, S., Solé-Pareta, J., Careglio, D., Wajda, K., Szymański, A.: Positioning of the RPR standard in contemporary environments. IEEE Netw. 18, 35–40 (2004)
[9] Verbrugge, S., Colle, D., Demeester, P., Huelsermann, R., Jaeger, M.: General availability model for multilayer transport networks. In: DRCN 2005, Ischia, Italy, October 2005, 8 pp. (2005)
[10] Dhillon, B.S.: Reliability in Computer System Design. Ablex Publishing Corporation, Norwood, NJ (1987)
[11] Huang, Y., Wen, W., Heritage, J.P., Mukherjee, B.: A generalized protection framework using a new link-state availability model for reliable optical networks. J. Lightwave Technol. 22(11), 2536–2547 (2004)
[12] Mello, D.A.A., Schupke, D.A., Waldman, H.: A matrix-based analytical approach to connection unavailability estimation in shared backup path protection. IEEE Commun. Lett. 9(9), 844–846 (2005)
[13] Pándi, Zs., Gricser, A.: Improving connection availability by means of backup sharing restrictions. OSA J. Opt. Netw. 5(5), 383–397 (2006)
[14] Clouqueur, M., Grover, W.D.: Availability analysis of span-restorable mesh networks. IEEE J. Sel. Areas Commun. 20(4), 810–821 (2002)
[15] Schupke, D.A.: Multiple failure survivability in WDM networks with p-cycles. In: ISCAS '03, vol. 3, pp. 866–869 (2003)
[16] Szigeti, J., Cinkler, T.: Incremental availability evaluation model for p-cycle protected connections. In: DRCN 2007, La Rochelle, France, October 2007 (2007)
[17] Rekhter, Y., Li, T., Hares, S.: A Border Gateway Protocol 4 (BGP-4). In: IETF RFC 4271, January 2006 (2006)
[18] Farrel, A., Vasseur, J.-P., Ash, J.: A Path Computation Element (PCE)-based architecture. In: IETF RFC 4655 (Informational), August 2006 (2006)
[19] Draft standard for local and metropolitan area networks—Virtual bridged local area networks—Amendment 5: connectivity fault management. In: IEEE 802.1ag standard (2006)
[20] ATM Forum: Private network–network interface specification version 1.0. In: af-pnni-0055.000 (1996)
[21] Farkas, A., Szigeti, J., Cinkler, T.: p-cycle based protection schemes for multi-domain networks. In: DRCN 2005, Ischia, Italy, October 2005, 8 pp. (2005)
[22] Clouqueur, M., Grover, W.D.: Availability analysis and enhanced availability design in p-cycle-based networks. Photonic Netw. Commun. 10(1), 55–71 (2005)
[23] Romeral, R., Yannuzzi, M., Larrabeiti, D., Masip, X., Urueña, M.: Multi-domain G/MPLS recovery paths using PCE. In: 10th European Conference on Networks & Optical Communications, NOC 2005, UK, July 2005 (2005)
[24] Quoitin, B., Pelsser, C., Swinnen, L., Bonaventure, O., Uhlig, S.: Interdomain traffic engineering with BGP. IEEE Commun. Mag. 41(5), 122–129 (2003)
[25] Meskó, D., Viola, G., Cinkler, T.: A hierarchical and a non-hierarchical European multi-domain reference network: routing and protection. In: Networks 2006, New Delhi, India, November 2006 (2006)
Author Biographies

János Szigeti received his M.Sc. degree in Computer Science from the Budapest University of Technology and Economics, Hungary, in 2002. He is currently a Ph.D. student at the same university and a member of the High Speed Networks Lab. His research interests focus on network control, inter-domain routing, resilience, modelling and generalization. He has participated in several research projects supported by the EU, including IP NOBEL and NOBEL2, NoE e-Photon/ONe and NoE e-Photon/ONe+.
Ricardo Romeral is assistant professor of Telematics and Switching at University Carlos III of Madrid (UC3M). He received his M.Sc. in Telecommunications Engineering from University Carlos III of Madrid in 2001 and obtained his Ph.D. in 2007 with a thesis focused on inter-domain protection schemes. His research interests include traffic engineering and reliability in inter-domain IP/MPLS networks.
Tibor Cinkler [M'95] received his M.Sc. (1994) and Ph.D. (1999) degrees from the Budapest University of Technology and Economics (BME), Hungary, where he is currently associate professor at the Department of Telecommunications and Media Informatics. His research interests focus on optimisation of routing, traffic engineering, design, configuration, dimensioning and resilience of IP, Ethernet, MPLS, ngSDH, OTN and particularly of heterogeneous GMPLS-controlled WDM-based multilayer networks. He is author of over 130 refereed scientific publications and of three patents. He has been involved in numerous related European and Hungarian projects, including ACTS METON and DEMON; COST 266, 291, 293; IP NOBEL I and II and MUSE; NoE e-Photon/ONe and NoE e-Photon/ONe+; CELTIC PROMISE; NKFP, GVOP and ETIK; and he is a member of the Scientific and Programme Committees of ONDM, DRCN, BroadNets, AccessNets, IEEE ICC and Globecom, EUNICE, CHINACOM, Networks, WynSys, ICTON, etc. He has been guest editor of a Feature Topic of the IEEE Communications Magazine and a reviewer for many journals.
David Larrabeiti is professor of Switching and Networking Architectures at University Carlos III of Madrid (UC3M). He received his M.Sc. and Ph.D. in Telecommunications Engineering from Universidad Politécnica de Madrid in 1991 and 1996, respectively. From 1998 to 2006 he was associate professor at UC3M and led a number of international research projects. His research interests include the design of the future Internet infrastructure, ultra-broadband multimedia transport and traffic engineering over IP-MPLS backbones. He is the UC3M lead for the ePhoton/ONe+ network of excellence on Optical Networking, and has been an IEEE member since 1996.