Mitigation of Transient Loops in IGP Networks

Mohammed Yousef, David K. Hunter

School of Computer Science and Electronic Engineering, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, UK

{mayous,dkhunter}@essex.ac.uk

Abstract. Routing loops have recently re-emerged as an important issue in new carrier-class Ethernet technologies such as IEEE 802.1aq. Routing loops were originally mitigated in IP networks by TTL expiration, but looping packets can still fully utilize a link and waste router resources for several seconds before they are dropped. In this paper a new mitigation approach based upon Early Packet Dropping (EPD) is developed, in which a router can drop looping packets earlier than usual if full link utilization would otherwise occur before TTL expiration. While the EPD algorithm deals only with loops between two nodes, a proposed solution known as TTL Relaxation is also combined with this loop mitigation mechanism in order to minimize the use of EPD and to reduce the performance degradation introduced by loops between more than two nodes.

Keywords: routing loops, network performance.

1 Introduction

Routing loops are a potential issue in data networks which employ dynamic routing protocols. Recently, this has re-emerged as an important consideration in new carrier Ethernet technologies such as IEEE 802.1aq, where a link-state routing protocol determines the shortest path between Ethernet bridges. Routing loops which arise because of unsynchronized routing databases in different nodes are known as transient loops, because they exist only for a limited period before the network converges again. Such routing loops have always existed within Interior Gateway Protocol (IGP) networks, and loops occurring between just two nodes comprised 90% of the total in one study [1]. The Time-to-Live (TTL) field within an IP packet implements the basic loop mitigation process. It ensures that packets will not loop forever, as the TTL eventually expires, causing the packet to be dropped. Packets may still circulate for a long time before expiring, during which the link is fully utilized and router resources are wasted, delaying other traffic passing through the affected links.

This paper introduces a new loop mitigation mechanism based upon Early Packet Dropping (EPD), where packets are dropped only when a loop traversing two nodes is expected to utilize the link fully before the network converges and before the circulating packets are dropped as a result of TTL expiration. The overall proposed solution is divided into two parts. The first part of the mitigation mechanism is known as TTL Relaxation, which is a slight modification of the "exact hop count" approach proposed in [2]. The basic idea is that once an ingress edge router receives an IP packet from the clients (i.e. from the Ethernet port), the TTL of the packet is relaxed to a new TTL which is two hops greater than the diameter of the network (the longest shortest path in terms of hop count) from the perspective of the ingress edge router. Two extra hops are added so that traffic can tolerate rerouting. In the event of a loop, packets with the new TTL will be dropped sooner, relieving the problem and reducing bandwidth consumption. Conversely, when an egress edge router forwards an IP packet to a client, it sets the TTL to a reasonable value (64 in this paper).

The remainder of this paper is organized as follows. Section 2 discusses related studies. Section 3 introduces the methodology, while Section 4 provides a full discussion of the EPD algorithm. Sections 5 and 6 provide the results and conclusions respectively.

2 Related Studies

While minimizing the convergence time by altering the default routing protocol timers has been proposed to reduce the duration of these routing loops [3], [4], other approaches aim to avoid such loops completely [5], [6], [7] by developing routing algorithms that ensure loop-free convergence. Another loop prevention approach for IGP networks performs Forwarding Information Base (FIB) updates in a specific order, so that a node does not update its FIB until all the neighbours that use this node to reach their destinations through the failed path have updated their FIBs [8], [9], [10]. IP fast reroute (IPFRR) solutions, which include the installation of backup routes, have also considered loop-free convergence [11]. Another technique was proposed where loops are mitigated, but not avoided, by maintaining an interface-specific forwarding table [12], [13]; packets are forwarded according to both their destination and the incoming interface, so that looped packets can be identified and dropped at the FIB because they arrive from an unexpected interface. Other studies aim to avoid routing loops only in cases where network changes are planned [14], [15]. A more general overview of the routing loop problem and existing techniques to mitigate or avoid such loops is given in [16].

3 Methodology

Part of the GEANT2 network (Fig. 1) is simulated with the OPNET simulator as a single-area Open Shortest Path First (OSPF [17]) network. In Fig. 1, the numbers on the links represent the routing metrics. Full-segment File Transfer Protocol (FTP) traffic is sent from the servers to the clients (150 flows), using the Transmission Control Protocol (TCP). The servers and clients are connected to the backbone through access nodes. Traffic from the servers follows the shortest path through nodes TR, RO, HU, SK, CZ, DE, DK, SE, and FI. Backbone optical links are configured as OC1 (50 Mbps) with utilization maintained below 30%. The transmission buffer capacity (in packets) in all routers is equal to Backbone Link Capacity × TCP Round Trip Time (RTT), with an estimated RTT of 250 ms; hence router buffers have a capacity of 1020 packets. IP packets of 1533 bytes are assumed throughout the paper (including headers for IP, TCP, Ethernet and PPP).
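The buffer figure follows directly from these assumptions; as a quick back-of-the-envelope check using the stated 1533-byte packet size and the 50 Mbps OC1 rate:

$\text{Buffer} = \dfrac{C \times RTT}{\text{packet size}} = \dfrac{50 \times 10^{6}\ \text{bit/s} \times 0.25\ \text{s}}{1533 \times 8\ \text{bit}} \approx 1019 \approx 1020\ \text{packets}$

The same figures give a per-packet link capacity of $C \approx 50 \times 10^{6} / (1533 \times 8) \approx 4080$ packets per second, a value that reappears in the Efficient EPD example of Section 5.4.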

Fig. 1. Simulated network – part of GEANT2

The looped arrows in Fig. 1 identify the six links containing loops. Such loops involve TCP traffic and arise because of a link failure followed by delayed IGP convergence. Every simulation run involves only one link failure, creating one of the loops shown in Fig. 1. The shaded nodes delay their shortest path calculation after the corresponding link failure, thus creating the loop. Every case was simulated with loop durations varying from 200 ms to 1 second in steps of 200 ms. Voice over IP (VoIP) traffic passed through each loop; this was carried by the User Datagram Protocol (UDP) and consumed 10% of the link capacity. However, the UDP traffic was not caught in the loop because it did not originally traverse the failed link; only TCP traffic is caught in the loop. Since in every case the loop takes place at a different "hop count distance" from the source, TCP traffic enters it with a different Relaxed TTL ($R_{TTL}$), as shown in Table 1. This is because node TR rewrites the TTL of the source traffic with an $R_{TTL}$ of 11, which is decremented by one every time the packet passes a backbone node. During the loop, the utilization of the corresponding link and the performance of the VoIP traffic passing through it were monitored and analysed with and without the EPD mechanism.

Table 1. $R_{TTL}$ of the traffic entering the loop for each case in Fig. 1

Case                           1    2    3    4    5    6
$R_{TTL}$ entering the loop   10    9    8    7    6    5
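A small sketch may make the values in Table 1 concrete. The helper names below are hypothetical, and the per-node hop counts are assumed to be available from the ingress router's link-state database; node TR uses $R_{TTL} = 11$, which implies a diameter of 9 hops from its perspective.

# Sketch of TTL Relaxation at an ingress edge router (hypothetical helpers).
def relaxed_ttl(hop_counts):
    """hop_counts: shortest-path hop count from this ingress edge router to
    every other backbone node, taken from its link-state database."""
    diameter = max(hop_counts)      # longest shortest path, in hops
    return diameter + 2             # two extra hops to tolerate rerouting

def ttl_entering_loop(r_ttl, hops_to_loop):
    """TTL with which traffic reaches a loop located hops_to_loop backbone
    hops downstream of the ingress edge router (one decrement per hop)."""
    return r_ttl - hops_to_loop

# Reproducing Table 1: the six loop cases sit 1..6 backbone hops beyond TR.
r_ttl = 11
for case, hops in enumerate(range(1, 7), start=1):
    print(f"Case {case}: R_TTL entering the loop = {ttl_entering_loop(r_ttl, hops)}")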

4 Early Packet Dropping Loop Mitigation

The EPD algorithm has been developed to mitigate loops arising between two nodes due to link failure, node failure and increases in link metric. However, due to lack of space, this paper introduces only the part of the algorithm which mitigates loops that arise between the node that detects the link failure and performs the mitigation process (the Master), and the node to which the traffic is rerouted (the Neighbour). As the node detecting the failure is the first to converge, a routing loop is most likely to form with the Neighbour node which includes the failed link in its Shortest Path Tree and to which traffic is rerouted, because the Neighbour node takes longer to converge. Before describing the details of the algorithm, the following timers and parameters are considered:

• Link Failure Detection Delay ($D_{LFD}$): The time between detecting the link failure at hardware level (known in advance) and notifying the routing protocol stack (a configurable timer).
• SPF Delay ($D_{SPF}$): The default delay between receiving the Link-State Advertisement (LSA) and starting the Shortest Path First (SPF) calculation (a configurable timer).
• Neighbour Dummy SPT Calculation Delay ($D_{SPT}^{Neighbour}$): The delay due to the Shortest Path Tree (SPT) calculation and routing table update performed at the Neighbour node, assuming the link to the Master node fails.
• Master Dummy SPT Calculation Delay ($D_{SPT}^{Master}$): The delay due to the SPT calculation and routing table update performed at the Master node, assuming the link to the Neighbour node fails.
• FIB Delay ($D_{FIB}$): The time required for any node to upload the original routing table to the FIB.
• Neighbour Total Convergence Delay ($D_{Total}^{Neighbour}$): $D_{SPF} + D_{SPT}^{Neighbour} + D_{FIB}$.
• $TTL_{Avg}$: The average TTL of the packets transmitted by each interface on the Master node.
• $S$: The throughput in packets per second (pps) of every interface on the Master node.

All delays are measured in seconds, while the dummy SPT calculations and routing updates do not take part in forwarding decisions; their only purpose in the model is to measure the time consumed.

As shown in the algorithm in Fig. 2, in the negotiation phase each Neighbour node sends $D_{Total}^{Neighbour}$ to the corresponding Master node; at this stage, each node takes on the roles of both Master and Neighbour at the same time. This provides the Master node with the approximate maximum time required by any of its Neighbours to converge when each receives an update (LSA) from the Master node informing it of a link failure. During the monitoring phase, the Master node registers the Link Capacity in pps ($C$), while monitoring $S$, $TTL_{Avg}$ and the Link Utilization ($U_{Link}$) for every interface (Neighbour).
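For concreteness, the state gathered during these two phases might be held per interface in a structure like the following sketch; the field names are illustrative and not taken from the paper.

# Per-interface state at a Master node (illustrative field names).
from dataclasses import dataclass

@dataclass
class InterfaceState:
    capacity_pps: float             # C: link capacity in packets per second
    throughput_pps: float = 0.0     # S: monitored throughput of this interface
    ttl_avg: float = 0.0            # TTL_Avg of packets sent on this interface
    utilization: float = 0.0        # U_Link: current link utilization (0..1)
    d_total_neighbour: float = 0.0  # D_Total^Neighbour, received during negotiation
    d_spt_master: float = 0.0       # D_SPT^Master if the link on this interface fails

In the negotiation phase each Neighbour's $D_{Total}^{Neighbour}$ fills the corresponding entry; the remaining fields are refreshed continuously during the monitoring phase.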

Fig. 2. The Early Packet Dropping Loop Mitigation Algorithm. Int denotes Interface

If a failure forms a loop that traps TCP traffic, the TCP sources send packets up to their Congestion Window (CWND) and wait until the Retransmission Timeout (RTO) expires, because no Acknowledgments (ACKs) are received from the destination. The utilization of the link increases while these packets bounce between both nodes, until either the routing protocol converges or the TTL expires. The sum of the CWNDs of all TCP sources measured in packets ($W_{Total}$) will be equal to the capacity of the bottleneck buffer in the access node, which is 1020 packets. If such traffic bounces over an OC1 link more than 3 times, the link will be fully utilized. If the Master node knew $W_{Total}$ in advance, it could estimate the amount of link capacity consumed by the rerouted and looped traffic. As the nodes cannot know the exact $W_{Total}$ in advance, no more than $S$ packets per second (throughput) will enter the loop if the interface on which $S$ was registered has failed and those $S$ packets per second were rerouted. This is why $S$ needs to be registered for every interface.

We denote by $TTL_{Avg}^{Unreachable}$ and $S^{Unreachable}$ the $TTL_{Avg}$ and $S$, respectively, of the Master node's interface whose link has failed. Once the Master node detects a link failure, it registers $TTL_{Avg}^{Unreachable}$ and $S^{Unreachable}$ and calculates the Master Total Convergence Delay $D_{Total}^{Master} = D_{LFD} + D_{SPT}^{Master} + D_{FIB}$ (where $D_{SPT}^{Master}$ belongs to the failed interface). In the calculation phase, the Master node performs a set of calculations (equations 1-8) for every active interface (Neighbour). All parameters other than $TTL_{Avg}^{Unreachable}$ and $S^{Unreachable}$ belong to the interface on which the calculation is performed. The difference in convergence time between the Master ($D_{Total}^{Master}$) and the Neighbour ($D_{Total}^{Neighbour}$) indicates the maximum loop duration in seconds ($\delta_{Loop}$):

$\delta_{Loop} = D_{Total}^{Neighbour} - D_{Total}^{Master}$    (1)

Assuming that all the $S^{Unreachable}$ packets will be rerouted (an alternative path exists) to a single interface, the maximum Queuing Delay in seconds ($Q_D$) at that interface is equal to:

$Q_D = \dfrac{S^{Unreachable}\ (\text{in packets})}{C}$    (2)

As justified later, only the first second of the loop is considered, which is why $S^{Unreachable}$ is effectively in units of packets, not packets per second. We define $LTT$ as the Loop Trip Time, $T_D$ as the transmission delay and $P_D$ as the propagation delay (all in seconds). The Master node can estimate the duration of a loop trip with any of its Neighbours (Master → Neighbour → Master) by calculating:

$LTT = 2 \times (T_D + P_D) + Q_D$    (3)

In the backbone network, where $S^{Unreachable}$ is large and the transmission speed is high, $T_D$ and $P_D$ can be neglected, so that $LTT \approx Q_D$. $LT$ is defined as the Loop Traversals, which indicates the maximum total number of times a packet travels round the loop before being dropped:

$LT = \dfrac{TTL_{Avg}^{Unreachable}}{2}$    (4)

$LT_{\delta_{Loop}}$ is the number of times packets travel round the loop within the loop duration calculated by the Master ($\delta_{Loop}$):

$LT_{\delta_{Loop}} = \dfrac{\delta_{Loop}}{LTT}$    (5)

$LT'$ is the number of Loop Traversals during the first second of the loop. The rationale for considering the first second is discussed later.

$LT' = \dfrac{1}{LTT}$    (6)

Hence, the Master node is able to calculate the utilization that will be introduced by a loop ($U_{Loop}$) with any Neighbour, and the total utilization during the loop ($U_{Total}$):

$U_{Loop} = S^{Unreachable} \times \min(LT, LT', LT_{\delta_{Loop}})\,/\,C$    (7)

$U_{Total} = U_{Loop} + U_{Link}$    (8)
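To make the calculation phase concrete, the sketch below chains equations (1) to (8) under the backbone approximation of LTT by Q_D; the function name, parameters and the numbers in the example are illustrative, not values taken from the simulations.

# Sketch of the calculation phase (equations 1-8) performed by the Master node
# for one active interface, with LTT approximated by Q_D as in the text.
def total_utilization(s_unreachable, ttl_avg_unreachable, capacity_pps,
                      d_total_master, d_total_neighbour, u_link):
    delta_loop = d_total_neighbour - d_total_master       # (1) maximum loop duration
    q_d = s_unreachable / capacity_pps                    # (2) queuing delay
    ltt = q_d                                             # (3) T_D and P_D neglected
    lt = ttl_avg_unreachable / 2.0                        # (4) traversals until TTL expiry
    lt_delta = delta_loop / ltt                           # (5) traversals within delta_loop
    lt_prime = 1.0 / ltt                                  # (6) traversals in the first second
    u_loop = s_unreachable * min(lt, lt_prime, lt_delta) / capacity_pps   # (7)
    return u_loop + u_link                                # (8) U_Total

# Illustrative numbers in the spirit of Case 1; if U_Total >= 1 the interface
# is marked for mitigation (an LMI, as described below).
u_total = total_utilization(s_unreachable=1143, ttl_avg_unreachable=10,
                            capacity_pps=4080, d_total_master=0.1,
                            d_total_neighbour=1.1, u_link=0.3)
print(f"U_Total = {u_total:.2f}", "-> mark LMI" if u_total >= 1 else "")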

Any interface with $U_{Total} \geq 1$ will be marked as a Loop Mitigation Interface (LMI). A timer equal to the largest $\delta_{Loop}$ of any LMI interface is invoked, and packets received by and destined to the same LMI interface are dropped (mitigation phase). In equation (7), $\min(LT, LT', LT_{\delta_{Loop}})$ is considered for the following reason. If $LT$ is the minimum value, then all the packets will be dropped during the loop (TTL expiration). If $LT_{\delta_{Loop}}$ is the minimum, then the loop will end before the TTL expires. If $LT'$ is the minimum, then $U_{Loop}$ will reach 100% (assuming that $S^{Unreachable}$ is large), except when $LTT$ is large. While $Q_D$ is known in advance and $T_D$ is normally very small, the propagation delay $P_D$ would have to be large to make the packets take longer to traverse between the nodes, hence maximizing $LTT$ and minimizing $U_{Loop}$. Since $P_D$ between any two nodes is not expected to be large in current optical backbone networks, traffic bouncing inside the loop can easily utilize the link fully within the one-second period.

4.1 The Algorithm's Complexity and Applicability

Although in the EPD algorithm every node runs the SPT algorithm once for every link connected to it, introducing a complexity of $O(N(E + V \log V))$ where $N$ is the average node degree, this computational cost only arises after the network has fully converged and hence does not affect the convergence time; the nodes can perform the SPT calculations while traffic is being forwarded normally. The complexity of the calculations performed between a link failure or an LSA being received, and convergence taking place, is only $O(N)$. Because these calculations do not involve any complex computational operations such as calculating a tree or building a complex data structure, we expect them to consume negligible processing time. Measuring through simulation the time taken for these calculations on a router with 10 interfaces revealed that they take less than 1 millisecond.

The requirements for implementing the EPD algorithm in today's routers fall into two categories: protocol modification and router modification. The protocol modifications do not change the way shortest paths are calculated, but they require the link-state protocol to register and keep track of the different timers and to be able to send them to the router's neighbours in an LSA/LSP. Router modification must permit monitoring of $TTL_{Avg}$ at every interface, and performance of the calculations presented earlier. Once the router marks an interface as an LMI, there must be a way for the router to determine whether packets have the same incoming and outgoing interface (i.e. packets arriving from a next-hop node) before dropping them. This requires the router to maintain interface-specific forwarding tables instead of interface-independent forwarding tables (the same FIB for all interfaces). These modifications were proposed in [12]. The results in [12] showed no extra convergence delay compared with the convergence delay of a traditional link-state protocol. Table 2 shows the interface-independent forwarding table, while Table 3 shows the interface-specific forwarding table for node RO in the network of Fig. 3. These forwarding tables are produced after failure of link RO↔HU. Table 3 is produced after node RO has carried out all the calculations in equations 1 to 8, in order to determine whether the loop utilization ($U_{Loop}$) on any interface is equal to or greater than 1 (100% utilization). The entries in Table 2 are all marked with the next hop, while the entries in Table 3 are marked with the next hop, '−', or the next hop referenced with X. Entries with '−' will never be used; for example, node RO will never receive traffic from node BG destined to node BG itself. Entries marked with the next hop and referenced with X indicate a possibility of a loop; for example, if node RO receives traffic from node BG with destination HU and sends it back again to node BG, a loop will be triggered because node BG has not converged yet and still believes that node RO is the next hop to destination HU.

Fig. 3. Example network illustrating the interface-specific forwarding table to be implemented with the EPD mechanism.

Table 2. Interface-independent forwarding table at node RO. Int denotes Interface.

                 Destination
Int          HU        BG        TR
BG → RO      BG        BG        TR
TR → RO      BG        BG        TR

Table 3. Interface-specific forwarding table at node RO when implementing the EPD algorithm. Int denotes Interface.

                 Destination
Int          HU        BG        TR
BG → RO      BG (X)    −         TR
TR → RO      BG        BG        −

If any interface that is referenced with an 'X' for any destination in the interface-specific forwarding table has $U_{Loop}$ equal to or greater than 1, that interface will be referenced with LMI (instead of 'X') for those destinations. Packets that are then received by such an interface and destined to the same interface will be dropped. The difference between our approach and the approach introduced in [12] is that the forwarding table may contain loops, but these loops are not harmful. Once a loop is predicted to be harmful, the interfaces that were referenced as X are referenced as LMI. The establishment of this interface-specific FIB requires only the original SPT, which was computed with Dijkstra's algorithm.
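To illustrate how the X and LMI markings would be used at forwarding time, the following is a minimal sketch of node RO's interface-specific table from Table 3; the data layout and function are assumptions for illustration, not the authors' router implementation.

# Interface-specific forwarding table for node RO after failure of link RO-HU,
# mirroring Table 3. "-" entries are never used; "X" marks a possible loop.
FIB_RO = {
    # incoming interface: {destination: (next hop, flag)}
    "BG": {"HU": ("BG", "X"), "BG": (None, "-"), "TR": ("TR", None)},
    "TR": {"HU": ("BG", None), "BG": ("BG", None), "TR": (None, "-")},
}

def forward(fib, incoming, destination, u_loop):
    next_hop, flag = fib[incoming][destination]
    if flag == "X" and u_loop >= 1.0:
        flag = "LMI"                 # loop predicted to be harmful: upgrade X to LMI
    if flag == "LMI" and next_hop == incoming:
        return None                  # drop: the packet would be sent back into the loop
    return next_hop

print(forward(FIB_RO, "BG", "HU", u_loop=1.2))   # None: dropped at the LMI
print(forward(FIB_RO, "TR", "HU", u_loop=1.2))   # "BG": forwarded normally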

5 Implementations and Results

5.1 TTL Relaxation

Fig. 4 shows the utilization of the link containing the loop for all six cases, for varying loop durations, without EPD. In the first four cases (1-4), utilization increases linearly with the duration of the loop until it reaches 100% when the loop duration is one second: the packets forming the total CWND of all TCP sources ($W_{Total}$) traverse the link four or more times, resulting in full link utilization. The benefit of TTL Relaxation is shown clearly in Cases 5 and 6, where almost all packets were dropped (TTL expired) before full link utilization took place, because of the low $R_{TTL}$ of the TCP packets in the loop. In both of these cases, the packets which are bottlenecked by the access router's buffer travel the link no more than 3 times, and hence full utilization does not occur.

Fig. 4. Utilization of the link containing the loop for all cases, with different loop durations and different $R_{TTL}$ when entering the loop in each case.

Fig. 5. Total TTL expirations during different loop durations.

Fig. 5 shows the total number of dropped packets (TTL expirations) in all six cases. The graph shows that when $R_{TTL}$ was small, more packets were dropped. Because the $R_{TTL}$ in Case 1 was the largest, almost half the packets were not dropped, which allowed the utilization to reach 100%. We expect that when the link is fully utilized during a loop, any traffic passing through it will be delayed as the node's buffer becomes congested. To demonstrate this, the VoIP traffic which passes through the loop (but is not looping) was analysed during the loop, in order to see whether such real-time traffic is affected and delayed by the loop. The Mean Opinion Score (MOS), which measures the quality of the call at the destination and is expressed as a numeric value ranging from 1 to 5, was analysed for the VoIP traffic. Fig. 6 shows the average MOS value (as generated by the simulator) of the VoIP traffic received by the VoIP receivers in all six cases during the loop. The graph shows that for Cases 5 and 6 the MOS value stayed almost constant during the loop at approximately 3.7 (high quality), as the link was not fully utilized. The MOS value for the other four cases decreased as the duration of the loop increased, until it reached about 3.1, at which point the quality of the VoIP call degrades. Although here the VoIP traffic was not greatly affected by the loop (because of the low $R_{TTL}$), there are cases in which the quality of such traffic may be severely compromised by the loop. To show a worst-case scenario, the same simulation was repeated for Case 1, except that $R_{TTL}$ was set to 64 and the loop duration was set to 1, 3 and 5 seconds respectively. Fig. 7 shows the average MOS value of the VoIP traffic for this scenario. While the VoIP quality is still acceptable during the first second of the loop, it falls as low as 2 when the loop persists. Such a low MOS value indicates that almost all call receivers would be dissatisfied with the call quality, which degrades noticeably during the loop.

Fig. 6. The MOS value for the VoIP traffic that passes through the loop in all six cases for different loop durations.

Fig. 7. VoIP MOS for Case 1. $R_{TTL}$ is 64. The loop took place when the simulation time was 210 seconds.

5.2 TTL Relaxation with Big Loops

As the lifetime of a packet depends upon the number of nodes that it travels through, increasing the number of nodes participating in the loop should cause packets to be dropped sooner, and hence reduce bandwidth consumption during the loop. By changing the link metrics, a three-node loop was created between nodes TR, RO and BG, and another between nodes HR, AT and SI, with no traffic passing through the loop. In the first case, where packets entered the loop with $R_{TTL} = 11$, the utilization of links TR→RO and RO→BG was almost 100%, while on link BG→TR it decreased to 75%. This is because nodes TR and RO each sent the traffic four times, while node BG sent the traffic only three times. In the second case, where traffic entered the loop with $R_{TTL} = 8$, the utilization of links HR→AT and AT→SI was almost 65%, while on link SI→HR it decreased to 45%. The results show that TTL Relaxation reduces the bandwidth consumed in larger loops while limiting the number of links that are fully utilized by them.

5.3 TTL Relaxation and EPD

After implementing the algorithm in OPNET according to the analysis and equations provided earlier, the algorithm performed as expected and mitigated the loop in Cases 1, 2, 3 and 4, but not in Cases 5 and 6, where the loop was not expected to fully utilize the link according to the calculations. Fig. 8 shows the utilization of the link containing the loop for Case 1 when the loop duration was 1 second. The utilization was expected to be 100% during the loop (Fig. 4), but the graph shows that the packets that bounced back from node BG were dropped by the Master node RO, which had marked the interface connected to node BG as an LMI interface. When the loop took place, node RO sent the traffic only once, which is why the utilization at that time was approximately 35%: 10% was VoIP traffic and 25% was TCP traffic, which is determined by $W_{Total}$ and limited by the access router buffer. The algorithm was extended further to drop a certain number of packets during the loop (instead of dropping all of them) in order to minimize packet loss, although this is not discussed in detail here due to space limitations.

Fig. 8. Link utilization in Case 1 when the loop duration is 1 second with EPD applied. The loop took place when the simulation time was 210 seconds.

5.4 Efficient Early Packet Dropping

While the EPD algorithm mitigates harmful loops by dropping all looped packets, in some cases it may be more efficient to drop only a certain number of looped packets, so that utilization during the loop does not reach 100% while some of the looped packets still reach their destination. Such an increase in the efficiency of the algorithm provides better quality for UDP Video-on-Demand and UDP video downloading; in these two cases, dropping all the looped packets increases the probability that I-frames (compressed video frames on which all subsequent frames depend) are dropped, and hence noticeable quality degradation takes place at the video receiver. Therefore, equation (7) can be altered so that:

$S_{Safe} = U_{Loop} \times C\ (\text{in packets})\,/\,LT$    (9)

$U_{Loop}$ is the utilization during the loop. Because the Master node knows the original utilization of the link that contains the loop, it can calculate the number of packets ($S_{Safe}$) that may loop in the link without causing the utilization to reach 100%, while dropping the rest. In this case only $LT$ is considered (the maximum Loop Traversals), because once some packets have been dropped, the queuing delay decreases, causing the remaining packets to circulate up to their maximum limit ($TTL_{Avg}^{Unreachable}/2$). For example, in Case 1, when the loop duration is 1 second and a harmful loop is expected, the number of packets that may loop without causing the link to be fully utilized is:

$S_{Safe} = 0.8 \times 4080 / 5 = 653$

The number of packets that need to be dropped ($S_{Drop}$) is:

$S_{Drop} = S^{Unreachable}\ (\text{in packets}) - S_{Safe} = 1143 - 653 = 490$

$DropRatio = \dfrac{S^{Unreachable}}{S_{Drop}} = \dfrac{1143}{490} = 2.3$

The Drop Ratio is rounded up to the next integer, so that the Master node drops 1 packet out of every 3. After running the Efficient EPD algorithm, the utilization of the link containing the loop for Cases 1, 2, 3 and 4 was approximately 90%, instead of 100% without EPD or 35% with EPD. Fig. 9 shows the total number of packet drops during the loop in all four cases with the implementation of either the EPD algorithm or the Efficient EPD algorithm. The graph shows that when packets enter the loop with a large $R_{TTL}$, the efficient algorithm saves many packets from being dropped and allows them to be delivered to the destination. With a small $R_{TTL}$, the number of packet drops is normally the same, because the remaining packets that were not dropped by the efficient algorithm circulate faster owing to the decreased queuing delay, and are dropped once their TTLs expire.
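The worked example above can be expressed as a small helper; this is only a sketch of the drop-ratio calculation (illustrative names), not the authors' OPNET implementation.

import math

# Efficient EPD drop-ratio calculation (equation 9 and the Case 1 example).
# u_loop_budget corresponds to U_Loop in equation (9), interpreted here as the
# utilization available to the looping traffic; capacity_packets is C in packets.
def efficient_epd_drop_ratio(u_loop_budget, capacity_packets, lt, s_unreachable):
    s_safe = u_loop_budget * capacity_packets / lt   # (9) packets allowed to keep looping
    s_drop = s_unreachable - s_safe                  # packets that must be dropped
    return math.ceil(s_unreachable / s_drop)         # drop 1 out of every `ratio` packets

# Case 1: 0.8 * 4080 / 5 = 653 safe packets, 1143 - 653 = 490 to drop,
# ratio 1143/490 = 2.3, rounded up so that 1 packet in 3 is dropped.
print(efficient_epd_drop_ratio(0.8, 4080, 5, 1143))   # -> 3

Rounding the ratio up drops slightly more packets than the minimum required, which keeps the utilization safely below 100%.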

Fig. 9. Total number of packet drops for loop duration of 1 second when applying both algorithms.

6 Conclusion

This paper introduced an easily implementable loop mitigation mechanism which mitigates routing loops that take place between two nodes in IGP networks, and does so only when the loop is predicted to utilize the link fully, which would congest router buffers and delay any traffic sharing links on the loop. The overall mechanism was divided into two sub-solutions. The first was based upon TTL Relaxation, where the TTL of packets entering the network is relaxed so that packets caught in loops are dropped sooner than before. The second part of the mechanism is the Early Packet Dropping mechanism, which takes place only in cases where the loop is expected to utilize the link fully. The TTL Relaxation scheme reduces bandwidth consumption in loops of any size, and minimizes the delay experienced by traffic passing through them, especially when the loop is close to the egress edge router. The VoIP traffic passing through loops closer to the source was slightly degraded (delay up to 130 ms) because of the small network diameter, which implies that $R_{TTL}$ is small and hence packets are dropped sooner. In networks with large diameters, $R_{TTL}$ can be large, which in turn causes the packets to circulate longer, potentially delaying VoIP traffic by more than 150 ms. While the EPD mitigation mechanism mitigates the loop by dropping all the looped packets, the Efficient EPD mechanism showed better performance by dropping only a certain number of packets, thus ensuring that the link is not fully utilized while delivering the rest of the packets to their final destination.

References

1. U. Hengartner, S. Moon, R. Mortier, and C. Diot, "Detection and analysis of routing loops in packet traces," in Proceedings of the Second Internet Measurement Workshop (IMW 2002), pp. 107-112, November 2002.
2. M. Seaman, "Exact hop count," December 2006. [Online]. Available: http://www.ieee802.org/1/files/public/docs2006/aq-seaman-exact-hop-count-1206-01.pdf
3. P. Francois, C. Filsfils, J. Evans, and O. Bonaventure, "Achieving sub-second IGP convergence in large IP networks," Computer Communication Review, vol. 35, pp. 35-44, July 2005.
4. G. Iannaccone, C. N. Chuah, S. Bhattacharyya, and C. Diot, "Feasibility of IP restoration in a tier 1 backbone," IEEE Network, vol. 18, pp. 13-19, March-April 2004.
5. S. Murthy and J. J. Garcia-Luna-Aceves, "A loop-free algorithm based on predecessor information," in Proceedings of IEEE ICCCN, October 1994.
6. J. J. Garcia-Luna-Aceves and S. Murthy, "A path-finding algorithm for loop-free routing," IEEE/ACM Transactions on Networking, vol. 5, pp. 148-160, February 1997.
7. F. Bohdanowicz, H. Dickel, and C. Steigner, "Detection of routing loops," in Proceedings of the 23rd International Conference on Information Networking, Chiang Mai, Thailand: IEEE Press, 2009.
8. P. Francois and O. Bonaventure, "Avoiding transient loops during the convergence of link-state routing protocols," IEEE/ACM Transactions on Networking, vol. 15, pp. 1280-1292, 2007.
9. P. Francois, O. Bonaventure, M. Shand, S. Bryant, S. Previdi, and C. Filsfils, "Loop-free convergence using oFIB," Internet Draft, March 2010. [Online]. Available: http://tools.ietf.org/html/draft-ietf-rtgwg-ordered-fib-03
10. J. Fu, P. Sjödin, and G. Karlsson, "Loop-free updates of forwarding tables," IEEE Transactions on Network and Service Management, vol. 5, no. 1, March 2008.
11. M. Gjoka, V. Ram, and X. W. Yang, "Evaluation of IP fast reroute proposals," in 2nd International Conference on Communication Systems Software & Middleware, pp. 686-693, 2007.
12. S. Nelakuditi, Z. F. Zhong, J. L. Wang, R. Keralapura, and C. N. Chuah, "Mitigating transient loops through interface-specific forwarding," Computer Networks, vol. 52, pp. 593-609, February 2008.
13. S. Lei, F. Jing, and F. Xiaoming, "Loop-free forwarding table updates with minimal link overflow," in ICC '09: IEEE International Conference on Communications, pp. 1-6, 2009.
14. H. Ito, K. Iwama, Y. Okabe, and T. Yoshihiro, "Avoiding routing loops on the Internet," Theory of Computing Systems, vol. 36, pp. 597-609, November-December 2003.
15. P. Francois, M. Shand, and O. Bonaventure, "Disruption free topology reconfiguration in OSPF networks," in Proceedings of IEEE INFOCOM 2007, pp. 89-97, 2007.
16. M. Shand and S. Bryant, "RFC 5715: A Framework for Loop-Free Convergence," January 2010. [Online]. Available: http://www.ietf.org/rfc/rfc5715.txt
17. J. Moy, "RFC 2328: OSPF Version 2," April 1998. [Online]. Available: http://www.ietf.org/rfc/rfc2328.txt