A Path-Finding Algorithm for Loop-Free Routing

J.J. Garcia-Luna-Aceves and Shree Murthy

Abstract: A loop-free path-finding algorithm (LPA) is presented; this is the first routing algorithm that eliminates the formation of temporary routing loops without the need for internodal synchronization spanning multiple hops or the specification of complete or variable-size path information. Like other previous algorithms, LPA operates by specifying the second-to-last hop and distance to each destination; this feature is used to ensure termination. In addition, LPA uses an inter-neighbor synchronization mechanism to eliminate temporary routing loops. A detailed proof of LPA's correctness and loop-freedom property is presented and its complexity is evaluated. LPA's average performance is compared by simulation with the performance of algorithms representative of the state of the art in distributed routing, namely an ideal link-state (ILS) algorithm, a loop-free algorithm that is based on internodal coordination spanning multiple hops (DUAL), and a path-finding algorithm without the inter-neighbor synchronization mechanism. The simulation results show that LPA is a more scalable alternative than DUAL and ILS in terms of the average number of steps, messages, and operations needed for each algorithm to converge after a topology change.

J.J. Garcia-Luna-Aceves is with the Computer Engineering Department, University of California, Santa Cruz, California. Shree Murthy is with Sun Microsystems, Inc., Mountain View, California. This work was supported in part by the Office of Naval Research under Contract No. N-00014-92-J-1807 and by the Defense Advanced Research Projects Agency (DARPA) under contract F19628-93-C-0175.

1 Introduction

RIP [9] is widely used in internets today. However, it is based on the distributed Bellman-Ford algorithm (DBF) for shortest-path computation [1], which suffers from the bouncing effect and counting-to-infinity problems. These problems are overcome in one of three ways in existing Internet routing protocols. OSPF [16] relies on broadcasting complete topology information among routers, and organizes an internet hierarchically to cope with the overhead incurred with topology broadcast. BGP [12] exchanges distance vectors that specify complete paths to destinations. EIGRP [2] uses a loop-free routing algorithm called DUAL [5], which is based on internodal coordination that can span multiple hops; DUAL also eliminates temporary routing loops.

Recently, distributed shortest-path algorithms [3, 5, 8, 10, 17, 19] that utilize information regarding the length and second-to-last hop (predecessor) of the shortest path to each destination have been proposed to eliminate the performance problems of DBF. We call this type of algorithm a path-finding algorithm. Although these algorithms provide a marked improvement over DBF, they do not eliminate the possibility of temporary loops. All the loop-free algorithms reported to date rely on mechanisms that require routers to either synchronize along multiple hops [5, 11, 15], or exchange path information that can include all the routers in the path from source to destination [7].

This paper presents the loop-free path-finding algorithm (LPA), which is the first routing algorithm that is loop-free at every instant and does not use either of these two techniques. Like previous path-finding algorithms, LPA eliminates the counting-to-infinity problem of DBF using predecessor information. Because each router reports to its neighbors the predecessor to each destination, any router can traverse the path specified by the predecessors from any destination back to a neighbor router to determine if using that neighbor as its successor would create a path that contains a loop (i.e., involves the router itself). Furthermore, a router detects a temporary loop within a finite time that depends on the speed with which correct predecessor information reaches the router, and not on the distance values of the paths offered by its neighbors; therefore, temporary loops are detected much faster than in DBF and its variations. Of course, updates take time to be propagated and routers have to update their routing tables using information that can be out of date, which can lead to temporary loops. In LPA, when a router detects that it can create a routing table loop if it changes its successor to a destination, it blocks such a potential loop. The router accomplishes this by reporting an infinite distance for the destination to all its neighbors and by waiting for those neighbors to acknowledge its message with their own distances and predecessor information, before the router changes its successor.
Because of the overhead involved, a router should not send a query every time it has to change its successor to a destination; a router decides when to block a potential loop by comparing the distances reported by its neighbors against a feasible distance, which is defined to be the smallest value achieved by the router's own distance since the last query sent by the router. The router is forced to block a potential loop with a query only when no neighbor reports a distance smaller than the router's own feasible distance. This feature accounts for the low overhead incurred in LPA to accomplish loop-free paths at every instant.

The rest of the paper is organized as follows. Section 2 presents the network model assumed in LPA. Section 3 provides a description of LPA. Section 4 and Section 5 provide a detailed proof of LPA's loop-freedom and convergence to correct routing tables, respectively. Section 6 addresses the complexity of LPA, analyzes its average performance by simulation, and compares it with other algorithms. Finally, Section 7 presents our conclusions.

2 Network Model

A computer network is modeled as an undirected finite graph represented as $G(N, E)$, where $N$ is the set of nodes and $E$ is the set of edges or links connecting the nodes. For simplicity, each node represents a router that is a computing unit involving a processor, local memory, and input and output queues with unlimited capacity. LPA can be applied to networks in which nodes represent address ranges [22]. A functional bidirectional link connecting nodes $i$ and $j$ is represented by $(i, j)$ and is assigned a positive weight in each direction. The link is assumed to exist in both directions at the same time. All the messages received (transmitted) by a node are put in the input (output) queue and are processed on a first-come-first-serve basis. Each node has a unique identifier. Any link cost can vary in time but is always positive. The distance between two nodes in the network is measured as the sum of the link costs of the shortest path between the nodes. An underlying protocol assures that:

• Within a finite time, a node detects the existence of a new neighbor, loss of connectivity with a neighbor, or the change in the cost of an adjacent link.
• All packets transmitted over an operational link are received correctly and in the proper sequence within a finite time. This assumption is made for convenience. Reliable message transmission can be easily incorporated into the routing protocol (e.g., [16, 21]).
• All update messages, changes in the link cost, link failures and link recoveries are processed one at a time in the order in which they occur.

When a link fails, the corresponding distance entries in the node's distance and routing tables are marked as infinity. A node failure is modeled as all the links incident on that node failing at the same time.

A path from node $i$ to node $j$ is a sequence of nodes in which $(i, n_1)$, $(n_x, n_{x+1})$, $(n_r, j)$ are the links in the path. A simple path from $i$ to $j$ is a sequence of nodes in which no node is visited more than once. The paths between any pair of nodes and their corresponding distances change over time in a dynamic network. At any point in time, node $i$ is connected to node $j$ if a path exists from $i$ to $j$ at that time. The network is said to be connected if every pair of operational nodes is connected at a given time.

3 LPA Description

3.1 Information Stored and Exchanged

In LPA's description, the time at which the value of a variable $X$ of the algorithm applies is specified only when it is necessary; the value of $X$ at time $t$ is denoted by $X(t)$.

Each router maintains a distance table, a routing table and a link-cost table. The distance table at each router $i$ is a matrix containing, for each destination $j$ and for each neighbor $k$ of router $i$, the distance and the predecessor reported by router $k$, denoted by $D^i_{jk}$ and $p^i_{jk}$, respectively. The set of neighbors of router $i$ is denoted by $N_i$.

The routing table at router $i$ is a column vector containing, for each destination $j$, the minimum distance ($D^i_j$), the predecessor ($p^i_j$), the successor ($s^i_j$), and a marker ($tag^i_j$) used to update the routing table. For destination $j$, $tag^i_j$ specifies whether the entry corresponds to a simple path ($tag^i_j = correct$), a loop ($tag^i_j = error$) or a destination that has not been marked ($tag^i_j = null$).

The link-cost table lists the cost of each link adjacent to the router. The cost of the link from $i$ to $k$ is denoted by $d_{ik}$ and is considered to be infinity when the link fails.

An update message from router $i$ consists of a vector of entries reporting incremental updates to its routing table; each entry specifies an update flag (denoted by $u^i_j$), a destination $j$, the reported distance to that destination (denoted by $RD^i_j$), and the reported predecessor in the path to the destination (denoted by $rp^i_j$). The update flag indicates whether the entry is an update ($u^i_j = 0$), a query ($u^i_j = 1$) or a reply to a query ($u^i_j = 2$). The distance in a query is always set to $\infty$.

Because every router reports to its neighbors the predecessor in the shortest path to each destination, the complete path that a router assumes to any destination (called its implicit path to the destination) at a given time $t'$ is known by the router's neighbors at the subsequent time $t'' > t'$. This is done by means of a path traversal routine on the predecessor entries reported by the router. In the specification of LPA, the successor to destination $j$ for any router is simply referred to as the successor of the router, and the same reference applies to other information maintained by a router. Similarly, updates, queries and replies refer to destination $j$, unless stated otherwise.

Figures 1 and 2 specify LPA in pseudo-code. The rest of this section provides an informal description of LPA. The procedures used for initialization are Init1 and Init2; Procedure Message is executed when a router processes an update message; procedures linkUp, linkDown and linkChange are executed when a router detects a new link, the failure of a link, or the change in the cost of a link. We refer to these procedures as event-handling procedures. For each entry in an update message, Procedure Message calls procedure Update, Query, or Reply to handle an update, a query, or a reply, respectively. An important characteristic of all event-handling procedures is that they mark $tag^i_j = null$ for each destination $j$ affected by the input event.

Router $i$ initializes itself in passive state with an infinite distance for all its known neighbors and with a zero distance to itself. After initialization, router $i$ sends updates containing the distance to itself to all its neighbors.
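The tables and message entries described above can be pictured with a small sketch. The class names, field names, and the `implicit_path` helper below are illustrative assumptions, not the pseudo-code of Figures 1 and 2; in particular, the predecessor carried in a query is set to `None` here only because the text does not specify it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

INF = float("inf")  # stands for an infinite distance


@dataclass
class RouteEntry:
    """Hypothetical routing-table row at router i for one destination j."""
    distance: float = INF              # D^i_j
    predecessor: Optional[str] = None  # p^i_j
    successor: Optional[str] = None    # s^i_j
    tag: Optional[str] = None          # tag^i_j: "correct", "error", or None (null)


@dataclass
class UpdateEntry:
    """Hypothetical entry of an update message sent by router i."""
    flag: int                          # u^i_j: 0 = update, 1 = query, 2 = reply
    destination: str                   # j
    distance: float                    # RD^i_j
    predecessor: Optional[str]         # rp^i_j


def make_query(destination: str) -> UpdateEntry:
    # The distance in a query is always infinite; the predecessor value is an assumption.
    return UpdateEntry(flag=1, destination=destination, distance=INF, predecessor=None)


def implicit_path(reported_pred: Dict[str, Optional[str]],
                  neighbor: str, dest: str) -> Optional[List[str]]:
    """Rebuild the neighbor's implicit path to dest by walking the predecessors
    it reported, from dest back to the neighbor. Returns None if the walk
    breaks off or revisits a node."""
    path = [dest]
    node = dest
    while node != neighbor:
        node = reported_pred.get(node)
        if node is None or node in path:
            return None
        path.append(node)
    path.reverse()
    return path
```

For instance, if neighbor `a` has reported predecessor `b` for `j` and predecessor `a` for `b`, the traversal yields the implicit path `a, b, j`.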

3.2 Distance Table Updating

When router $i$ receives an input event regarding neighbor $k$ (an update message from neighbor $k$ or a change in the cost or status of link $(i, k)$), it updates its link-cost table with the new value of link $d_{ik}$ if needed, and then executes procedure DT. The intent of this procedure is for the router to erase the outdated path information in the distance table by making path information from all neighbors consistent with the latest update. To accomplish this, DT updates the distance and predecessor of neighbor $k$ as $D^i_{jk} = D^k_j$ and $p^i_{jk} = p^k_j$ for each destination $j$ affected by the input event.

In addition, DT determines whether the path to any destination $j$ through any of the other neighbors of router $i$ includes neighbor $k$. This is done by traversing the path specified by the predecessor entries reported by a neighbor, starting from destination $j$. If the path implied by the predecessors reported by router $b$ ($b \neq k$ and $b \in N_i$) to destination $j$ includes router $k$, then node $i$ assumes that $b$ has outdated path information and substitutes the subpath from $k$ to $j$ reported by $b$ as part of its path to $j$ with the path to $j$ reported by $k$ itself. This is easily done by updating $D^i_{jb} = D^i_{kb} + D^k_j$ and $p^i_{jb} = p^k_j$.

The example in Figure 3 illustrates how procedure DT helps to expedite LPA's convergence. In the example, router $a$ has reported to $i$ the predecessors along its path to $j$, and $i$ infers that $a$'s path to $j$ is $abj$. Router $c$ has likewise reported its predecessors to $i$, and $i$ infers that $c$'s path to $j$ is $cdabj$. Router $x$ has reported its predecessors to $i$, and $i$ infers that $x$'s path to $j$ is $xyj$. With these conditions, assume that $a$ sends an update stating that $D^a_j = \infty$ and $p^a_j = null$. Router $i$ uses procedure DT to ensure that the path information from the other neighbors reflects the most recent update obtained from any other neighbor for any destination. Since $c$'s path to $j$ includes $a$, $i$ substitutes the out-of-date subpath $abj$ in $c$'s path information with the information supplied by $a$, which makes the path from $c$ to $j$ non-existent. The path from $x$ to $j$ does not include $a$, so $i$ does not change $x$'s information. Note that $i$ does not have to wait for an update from $c$ to infer that it should not use $c$ as successor to $j$.
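The substitution step can be sketched as follows. This is a minimal illustration under stated assumptions, not the DT procedure of the paper's figures: the nested-dictionary layout (`dist[x][b]` and `pred[x][b]` holding the distance and predecessor to `x` reported by neighbor `b`), the function names, and the numeric distances in the usage note are all hypothetical.

```python
INF = float("inf")


def path_includes(pred_b, b, dest, k):
    """True if the path from b to dest implied by b's reported predecessors
    passes through k as an intermediate node."""
    node, seen = pred_b.get(dest), set()
    while node is not None and node != b and node not in seen:
        if node == k:
            return True
        seen.add(node)
        node = pred_b.get(node)
    return False


def dt(dist, pred, k, dest, d_kj, p_kj):
    """Record neighbor k's report for dest, then repair the entries of every
    other neighbor b whose implied path to dest goes through k."""
    dist.setdefault(dest, {})[k] = d_kj      # D^i_jk = D^k_j
    pred.setdefault(dest, {})[k] = p_kj      # p^i_jk = p^k_j
    for b in list(dist[dest]):
        if b == k:
            continue
        # Predecessors reported by b, indexed by destination.
        pred_b = {x: row[b] for x, row in pred.items() if b in row}
        if path_includes(pred_b, b, dest, k):
            # D^i_jb = D^i_kb + D^k_j and p^i_jb = p^k_j
            dist[dest][b] = dist.get(k, {}).get(b, INF) + d_kj
            pred[dest][b] = p_kj
```

Applied to the situation of Figure 3 with made-up distances, a report of an infinite distance from `a` makes the entry through `c` infinite as well, while the entry through `x` is left alone.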

3.3 Blocking Temporary Loops

The example shown in Figure 4 illustrates the possibility of looping, even when path information is used. In the example, it is assumed that $a$ has reported the implicit path $aj$ to $i$ and that $b$ has reported the implicit path $bcdj$ to $i$. Furthermore, this path information is outdated, because a router on the reported path has changed its successor and the new path information has not reached $i$. If link $(i, a)$ fails, simply using path information would permit $i$ to use $b$ as successor to $j$. However, this would create a temporary routing loop.

To eliminate temporary loops, a router forces its neighbors not to use it as a successor (next hop) when it detects the possibility of creating a temporary loop before it changes its own successor. This is done using an inter-neighbor synchronization mechanism based on the notion of feasible distance. The feasible distance of router $i$ for destination $j$ (denoted by $FD^i_j$) is the smallest value achieved by its own distance to $j$ since the last time $i$ initialized itself or sent a query reporting an infinite distance to $j$. LPA allows a router to use a neighbor as its successor to destination $j$ only if it satisfies the following condition.

Feasibility Condition (FC): If at time $t$ router $i$ needs to update its current successor, it can choose as its new successor $s^i_j(t)$ any router $n \in N_i(t)$ such that $D^i_{jn}(t) + d_{in}(t) = D_{min}(t) = \min\{ D^i_{jx}(t) + d_{ix}(t) \mid x \in N_i(t) \}$ and $D^i_{jn}(t) < FD^i_j(t)$. If no such neighbor exists and $D^i_j(t) < \infty$, router $i$ must keep its current successor. If $D_{min}(t) = \infty$ then $s^i_j(t) = null$.

FC is used to establish an ordering of routers along a given loop-free path to $j$, i.e., all the routers in a loop-free path to $j$ have feasible distances to $j$ that decrease as $j$ is approached. If router $i$ does not find a neighbor that satisfies FC, it is forced to send a query to its neighbors reporting an infinite distance to $j$ and waits for the replies before it can change its own route. Because every router uses FC to decide whether to adopt a successor or to block paths through itself (as described in Section 3.4), no temporary loops can exist.
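The feasibility condition can be read as a small selection function. The sketch below is an illustration under assumptions: the function name, argument layout, alphabetical tie-breaking, and the choice to test $D_{min} = \infty$ first are hypothetical, and the case in which the router's own distance is already infinite is not modeled.

```python
INF = float("inf")


def choose_successor(reported, link_cost, fd, current):
    """Apply FC at router i for one destination j.

    reported[n]  -- D^i_jn, the distance to j reported by neighbor n
    link_cost[n] -- d_in, the cost of the link from i to n
    fd           -- FD^i_j, the feasible distance of i for j
    current      -- the current successor of i for j

    Returns (successor, found), where found says whether some neighbor
    satisfied FC. When found is False, the router would have to send a
    query before changing its route.
    """
    totals = {n: reported[n] + link_cost[n] for n in reported}
    d_min = min(totals.values(), default=INF)
    if d_min == INF:
        return None, False                 # D_min = infinity, so the successor is null
    feasible = [n for n in sorted(totals)
                if totals[n] == d_min and reported[n] < fd]
    if feasible:
        return feasible[0], True           # shortest and below the feasible distance
    return current, False                  # keep the current successor
```

Note that a neighbor reporting a distance below the feasible distance is not eligible unless it also offers the minimum total distance, which is how the condition is stated above.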

[...] $\infty$, and it sets [...] $= \infty$, because the paths to $j$ reported by [...] and [...] include node [...]. Because [...] uses [...] to reach $j$, [...] must also update its routing table. After updating its distance table, the neighbor that offers the shortest distance to $j$ is $j$ itself. Furthermore, $D^d_{jj} = 0 < 2 = FD^d_j$, and $d$ sends an update to all its neighbors with $D^d_j = 1 + 9 = 10$ and a reply to node [...] (Figure 5(c)).

When node $b$ receives [...]'s query (before receiving a new update from [...]), it must set $D^b_{ja} = D^b_{jc} = D^b_{jd} = \infty$, because all its neighbors have reported a path to $j$ that includes node [...]. Because $b$'s own path to $j$ includes node [...], it must update its routing table. Node $b$ sends a query to its neighbors because every distance to $j$ through any neighbor is infinity (Figure 5(c)). When [...] receives the replies from [...] and [...], it makes node [...] its new successor and also sends a reply to [...] (Figure 5(d)).

When $c$ receives the update from [...] and the query from [...], it makes $e$ its successor, because $D^c_{je} = 1 < 4 = FD^c_j$ and $e$ offers the shortest path to $j$ among all of $c$'s neighbors. Accordingly, $c$ sends an update with its new distance of 10 and a reply to [...]'s query. Finally, when [...] receives all the replies to its queries, it sets [...] as its successor and sends updates accordingly (Figure 5(e)).

[...] because router $s[k+1, new]$ must be a feasible successor for router $s[k, new]$ to make it its successor at time $t_{s[k+1, new]}$. Therefore, if router $s[k, new]$ is already passive when it changes successor at time $t_{s[k+1, new]}$, then $D^{s[k-1, new]}_{j\,s[k, new]}(t) > D^{s[k, new]}_{j\,s[k+1, new]}(t)$.

Alternatively, consider the case in which router $s[k, new]$ is active from time $t_k$ to time $t_{s[k+1, new]}$, when it becomes passive again to join $P_{aj}(t)$. In this case, regardless of the value of $D^{s[k, new]}_j(t_k)$, router $s[k, new]$ must have sent a query to its neighbors with $RD^{s[k, new]}_j(t_k) = \infty$ at time $t_k$, and all of those neighbors must acknowledge that value of $RD^{s[k, new]}_j(t_k)$ before router $s[k, new]$ can make any changes to its distance at time $t_{s[k+1, new]}$. When router $s[k-1, new]$ makes router $s[k, new]$ its successor when it joins $P_{aj}(t)$ at time $t_{s[k, new]}$, it may or may not have processed any update or query sent by node $s[k, new]$ at time $t_{s[k+1, new]}$ when that node joins $P_{aj}(t)$. In the first case, [...]. In the second case, $D^{s[k-1, new]}_{j\,s[k, new]}(t) = RD^{s[k, new]}_j(t_k)$; this is impossible, because $RD^{s[k, new]}_j(t_k) = \infty$ and node $s[k-1, new]$ could not have chosen a neighbor reporting an infinite distance as its successor.

From the above argument it follows that, if a router $s[k, new]$ is passive at time $t$, then $D^{s[k-1, new]}_{j\,s[k, new]}(t) > D^{s[k, new]}_{j\,s[k+1, new]}(t)$. However, because all the routers in the loop $C_j(t)$ are passive at time $t$, traversing path $P_{ai}(t)$ leads to the erroneous conclusion $D^i_{ja}(t) > D^i_{ja}(t)$. This implies that a loop cannot be formed when $S_j(G)$ is loop free before time $t$ and $G$ has a stable topology. On the other hand, the handling of queries and replies in LPA is not modified with the establishment or failure of links, and $S_j(G)$ is loop-free when it is first stored before any router has a finite distance to any other node. Therefore, the theorem is true. □

5 Correctness of LPA

To prove that LPA converges to correct routing-table values in a finite time, we assume that there is a finite time $T_c$ after which no more link-cost or topology changes occur. Procedure DT makes node $i$ update its distance table in a way that the routing information assumed from all neighbors is consistent, as discussed in Section 3.2. However, LPA is correct even if such consistency of neighbor information is not enforced, and it is not assumed in the following proof.

Lemma 1 LPA is live.

Proof: Consider the case in which the network has a stable topology. When a router is in the active state and receives a query from a neighbor, the router replies to the query with an infinite distance. The router updates its distance table entries when either an update or a reply message is received in active state. On the other hand, when a router in passive state receives a query from its neighbor, it computes the feasible distance and updates its distance and routing tables accordingly. If the router finds a feasible successor, it replies to its neighbor's query with its current distance to the destination. If the router cannot find a feasible successor, it forwards the query to the rest of its neighbors and sends a reply with an infinite distance to the neighbor that originated the query. Accordingly, in a stable topology, a router that receives a query from a neighbor for any destination must answer with a reply within a finite time, which means that any router that sends a query in a stable topology must become passive after a finite time.

Consider now the case in which the network topology changes. When a link fails or is reestablished, an active router that detects the link status change simply assumes that the router at the other end of the link has reported an infinite distance and has replied to the ongoing query. Because an active router must detect the failure or establishment of a link within a finite time, and because router failures or additions are treated as multiple link failures or additions, it follows from the previous case that no router can be active for an indefinite period of time, and the lemma is true. □

Lemma 2 TRT correctly enforces Property 1.

Proof: TRT correctly enforces Property 1 if the tag value given by TRT at router $i$ for destination $j$ equals correct. This is true only when the neighbor $n$ that router $i$ chooses as successor to $j$ offers the smallest distance from $i$ to each node in its reported implied path from $i$ to $j$.

First note that procedure DT is executed before TRT and ensures that router $i$ sets $D^i_{jb} = \infty$ if its neighbor $b$ reports a path to $j$ that includes $i$. Therefore, TRT deals with simple paths only.

According to procedure TRT, there are two cases in which a router stops tracing the routing table: (1) the trace reaches node $i$ itself (i.e., $p^i_{x\,n_s} = i$), and (2) a node $x$ on the path to $j$ is found with $tag^i_x = correct$. We prove that the correct path information is reached in both cases.

Case 1: Assume that TRT is executed for destination $j$ after an input event. The tag for each destination affected by the input event is set to null before procedure TRT is executed. Therefore, if TRT is executed for destination $j$ and node $i$ (the source) is reached, the tag of each node in the path from $i$ to $j$ through neighbor $n$ must be null. Therefore, the distance from $i$ to $j$ through $n$ is the shortest path among all neighbors, because node $i$ chooses the minimum in the row entry among its neighbors for a given destination $j$, and the lemma is true for this case.

Case 2: If node $x_1$ with $tag^i_{x_1} = correct$ is reached, then it must be true that either node $i$ or a node $x_2$ with $tag^i_{x_2} = correct$ is reached from $x_1$. If node $i$ is reached from $x_1$, then it follows from Case 1 that neighbor $n$ offers the smallest distance among all of $i$'s neighbors to each node in the implied subpath from $i$ to $x_1$ reported by neighbor $n$. Furthermore, because $x_1$ is reached from $j$, node $n$ must also offer the smallest distance among all of $i$'s neighbors to each node in the implied subpath from $x_1$ to $j$ reported by $n$. Therefore, it follows from Case 1 that the lemma is true. Otherwise, if $x_2$ is reached, the argument used when $i$ is reached from $x_1$ can be applied to $x_2$. Because router $i$ always sets $tag^i_i = correct$ and TRT deals with simple paths only, this argument can be applied recursively only for a maximum of $h < \infty$ times until $i$ is reached, where $h$ is the number of hops in the implicit path from $i$ to $j$ reported by $n$ to $i$. Therefore, Case 2 must reduce to Case 1 and it follows that the lemma is true. □

[...] the successor graph) is finite. Accordingly, the proof can [...]
