Parallel Routing in Hypercube Networks with Faulty Nodes

Eunseuk Oh, Jianer Chen
Department of Computer Science, Texas A&M University, College Station, TX 77843-3112, USA
{eunseuko, [email protected]

Abstract

The concept of strong fault-tolerance was introduced to characterize the property of parallel routing [15]. A network $G$ of degree $d$ is said to be strongly fault-tolerant if, with at most $d-2$ faulty nodes, any two nodes $u$ and $v$ in $G$ are connected by $\min\{\deg_f(u), \deg_f(v)\}$ node-disjoint paths, where $\deg_f(u)$ and $\deg_f(v)$ are the numbers of non-faulty neighbors of the nodes $u$ and $v$ in $G$, respectively. We show that the hypercube networks are strongly fault-tolerant and develop an algorithm that constructs the maximum number of node-disjoint paths in a hypercube network with faults. Our algorithm is optimal in terms of both running time and the length of the node-disjoint paths.

1. Introduction

Parallel routing on large networks with faults is an important issue in the study of computer interconnection networks, since it allows networks to use alternative routes to tolerate faulty nodes. The concept of network strong fault tolerance was introduced to measure fault tolerance for interconnection networks and has been studied for the star networks [15]. In the current paper, we continue the study of the strong fault tolerance of interconnection networks, in particular of the popular hypercube networks.

The $n$-dimensional hypercube $Q_n$ has been studied extensively by many researchers as an interconnection network topology for multicomputer systems. Parallel routing on hypercube networks without faulty nodes was first studied in [18]. An algorithm that constructs node-disjoint paths between disjoint source-destination pairs was proposed in [8, 14]. The problem of determining the diameter of hypercube networks with faults was considered in [10, 11]. Many fault-tolerant communication algorithms concentrating on one-to-one routing or broadcasting in hypercube networks have been proposed [2, 4, 6, 7, 12, 13, 16, 17].

A network $G$ of degree $d$ is said to be strongly fault-tolerant [15] if, with at most $d-2$ faulty nodes, any two nodes $u$ and $v$ in $G$ are connected by $\min\{\deg_f(u), \deg_f(v)\}$ node-disjoint paths, where $\deg_f(u)$ and $\deg_f(v)$ are the numbers of non-faulty neighbors of the nodes $u$ and $v$ in $G$, respectively. The study of strong fault tolerance in the star networks showed that node-disjoint paths can be constructed efficiently based on the orthogonal partition of the star networks with faults, which decomposes the $n$-star network into $n-1$ $(n-1)$-dimensional substar networks and an independent set $I$ of $(n-1)!$ nodes [3]. Roughly speaking, a path from a non-faulty neighbor of the source node $u$ to a non-faulty neighbor of the destination node $v$ is constructed in a separate $(n-1)$-dimensional substar, and the independent set $I$ helps the paths to enter the substar from a proper node. The algorithm proposed in [15] constructs the maximum number of node-disjoint paths of nearly optimal length in the $n$-star networks with at most $n-3$ faulty nodes.

We observe that the techniques used in the previous study of star networks [15] are not applicable to hypercube networks. Specifically, the hypercube networks do not seem to have a similar orthogonal decomposition structure. Parallel routing in the $n$-dimensional hypercube networks may require constructing $n$ node-disjoint paths, while an $n$-dimensional hypercube can be decomposed into at most $n$ $(n-1)$-dimensional subcubes. Therefore, there may be no extra nodes available that can help to distribute the paths into the subcubes.

We develop new techniques that construct node-disjoint paths between pairs of neighbors of the source node $u$ and the destination node $v$. First, a prematching process pairs non-faulty neighbors of $u$ and $v$ in $Q_n$. For given pairs of neighbors of $u$ and $v$, we introduce three procedures to construct paths by permutations of edge sequences between them. Node-disjoint paths are constructed by searching for proper paths, ensuring that each node in a path is not used by other paths. Our algorithm constructs node-disjoint paths in optimal time, and the length of the paths is also optimal in the hypercube network $Q_n$: for any two non-faulty nodes $u$ and $v$ in $Q_n$, the algorithm constructs $\min\{\deg_f(u), \deg_f(v)\}$ node-disjoint paths of minimum length plus 4 between $u$ and $v$ in time $O(n^2)$.

The paper is organized as follows. Notation and terminology are introduced in Section 2. In Section 3, we discuss parallel paths between two non-faulty nodes. When neither the source nor the destination node has a faulty neighbor, we pre-pair the neighbors of the source and the destination nodes by a process called Prematch-I. There is a special situation that may block all possible sets of parallel paths between two neighbors of the source and the destination nodes induced by Prematch-I. In this situation, we use a different process, called Prematch-II, instead. The third process, Prematch-III, covers the case in which there is at least one faulty neighbor of the source or the destination. The algorithm is presented and discussed in Section 4, and the final section concludes the paper.

This work is supported in part by the National Science Foundation under Grant CCR-0000206.

2. Preliminaries

An $n$-dimensional hypercube $Q_n$ is an undirected graph consisting of $2^n$ nodes represented by binary numbers from $0$ to $2^n - 1$, and $n2^{n-1}$ edges connecting nodes whose binary representations differ in exactly one bit. An edge is called an $i$-edge if the two nodes connected by it differ in the $i$th bit (the first bit is the leftmost bit). The Hamming distance between two nodes $u$ and $v$, $\mathrm{dist}(u, v)$, is the length of the shortest path from $u$ to $v$. Equivalently, $\mathrm{dist}(u, v)$ is the number of bits in which the binary representations of $u$ and $v$ differ.

Since the hypercube $Q_n$ is vertex-symmetric, a set of node-disjoint paths from a node $u'$ to a node $v'$ can be mapped to a set of node-disjoint paths from the node $u = 1^r 0^{n-r}$ to the node $v = 0^n$ in a straightforward way, where $r = \mathrm{dist}(u, v)$. Therefore, we will concentrate on the construction of node-disjoint paths from the node $u$ to the node $v$ in $Q_n$. The node connected from the node $u$ by an $i$-edge is denoted by $u_i$, and the node connected from the node $u_i$ by a $j$-edge is denoted by $u_{i,j}$.

A path $P$ from the node $u = 1^r 0^{n-r}$ to the node $v = 0^n$ can be uniquely specified by the sequence of labels of the edges on $P$ in the order of traversal. In particular, a path from the node $u$ to the node $v$ that uses an $i_1$-edge, an $i_2$-edge, $\ldots$, an $i_r$-edge, in that order, will be denoted by $u\langle i_1, i_2, \ldots, i_r\rangle v$. For example, for the nodes $u = 111100$ and $v = 000000$, $u\langle 3, 1, 4, 2\rangle v$ specifies the path $111100 \to 110100 \to 010100 \to 010000 \to 000000$. We extend this notation from a single permutation to a set of permutations, as follows. Let $S$ be a set of permutations; then the notation $u\langle S\rangle v$ denotes the set of paths:

$$\{\, u\langle j_1, j_2, \ldots, j_r\rangle v \;\mid\; (j_1, j_2, \ldots, j_r) \text{ is a permutation in } S \,\}$$

For example, suppose $S = \{(3, 1, 4, 2), (1, 4, 2, 3), (4, 2, 3, 1), (2, 3, 1, 4)\}$; then $u\langle S\rangle v$ consists of four paths from $u$ to $v$: $u\langle 3, 1, 4, 2\rangle v$, $u\langle 1, 4, 2, 3\rangle v$, $u\langle 4, 2, 3, 1\rangle v$, and $u\langle 2, 3, 1, 4\rangle v$.

We say that an edge $[w_1, w_2]$ does not lead to a shortest path to a node $w_3$ if $\mathrm{dist}(w_1, w_3) \le \mathrm{dist}(w_2, w_3)$. The following fact can be easily verified.

Fact 2.1 If an edge $[w_1, w_2]$ in $Q_n$ does not lead to a shortest path to $w_3$, then $\mathrm{dist}(w_2, w_3) = \mathrm{dist}(w_1, w_3) + 1$. In general, if in a path $P$ from a node $w_1$ to a node $w_3$ there are exactly $k$ edges that do not lead to a shortest path to $w_3$, then the length of the path $P$ is equal to $\mathrm{dist}(w_1, w_3) + 2k$.

It is known [18] that for any two nodes $u$ and $v$ in $Q_n$, there exist $n$ node-disjoint paths such that $\mathrm{dist}(u, v)$ of these paths are of length $\mathrm{dist}(u, v)$, and the remaining $n - \mathrm{dist}(u, v)$ paths are of length $\mathrm{dist}(u, v) + 2$.
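The path notation and Fact 2.1 can be made concrete with a short sketch. The helper names below (`flip`, `trace`, `dist`, `cyclic_permutations`, `detours`) are illustrative assumptions and do not appear in the paper; nodes are written as binary strings with the leftmost bit counted as bit 1, as in the definitions above.

```python
def flip(node: str, i: int) -> str:
    """Return node with its i-th bit flipped (1-indexed, leftmost bit first)."""
    bit = "1" if node[i - 1] == "0" else "0"
    return node[: i - 1] + bit + node[i:]


def trace(start: str, labels) -> list:
    """Nodes visited when following the given edge labels from start."""
    path = [start]
    for i in labels:
        path.append(flip(path[-1], i))
    return path


def dist(a: str, b: str) -> int:
    """Hamming distance between two equal-length binary strings."""
    return sum(x != y for x, y in zip(a, b))


def cyclic_permutations(seq):
    """All rotations of seq, ordered by starting position."""
    seq = list(seq)
    return [tuple(seq[k:] + seq[:k]) for k in range(len(seq))]


def detours(path, target: str) -> int:
    """Number of edges on path that do not lead to a shortest path to target."""
    return sum(dist(b, target) > dist(a, target) for a, b in zip(path, path[1:]))


u, v = "111100", "000000"
p = trace(u, (3, 1, 4, 2))        # the example path u<3,1,4,2>v
q = trace(u, (5, 3, 1, 4, 2, 5))  # one detour through dimension 5
print(" -> ".join(p))
print(len(q) - 1, dist(u, v) + 2 * detours(q, v))
```

The second printed line compares the length of `q` with $\mathrm{dist}(u, v) + 2k$ for $k = 1$, which is the relation stated in Fact 2.1.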

3. Parallel paths between two non-faulty nodes

In this section, we show how a set of paths between two non-faulty nodes $u$ and $v$ in the hypercube network $Q_n$ can be constructed. Our parallel routing algorithm is based on an effective pairing of the neighbors of the nodes $u = 1^r 0^{n-r}$ and $v = 0^n$. We first assume that the nodes $u$ and $v$ have no faulty neighbors. We pair the neighbors of $u$ and $v$ by the following strategy.

Prematch-I {Assumption: $u$ and $v$ have no faulty neighbors.}
1. pair $u_i$ with $v_{i-1}$ for $1 \le i \le r$ (see footnote 1);
2. pair $u_j$ with $v_j$ for $r + 1 \le j \le n$.

Under the pairing given by Prematch-I, we construct parallel paths between the paired neighbors of $u$ and $v$ using Procedure-I, stated below.

Footnote 1. The calculation for indices between $1$ and $r$ can be given by a rather lengthy formula based on modular operations. For simplicity, we only need to remember the following three special cases. Let $i$ be an index between $1$ and $r$: (1) for $i = 1$, $i - 1$ is interpreted as $r$ and $i - 2$ is interpreted as $r - 1$; (2) for $i = 2$, $i - 2$ is interpreted as $r$; and (3) for $i = r$, $i + 1$ is interpreted as $1$.
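The pairing rule of Prematch-I, together with the cyclic index convention of footnote 1, can be sketched in a few lines. The function names `wrap` and `prematch_1` are hypothetical and chosen only for this illustration; a pair `(a, b)` stands for "neighbor $u_a$ is paired with neighbor $v_b$".

```python
def wrap(i: int, r: int) -> int:
    """Map any integer index onto 1..r cyclically (0 -> r, -1 -> r - 1, r + 1 -> 1)."""
    return (i - 1) % r + 1


def prematch_1(n: int, r: int):
    """Index pairs (a, b): neighbor u_a of u is paired with neighbor v_b of v."""
    pairs = [(i, wrap(i - 1, r)) for i in range(1, r + 1)]  # rule 1
    pairs += [(j, j) for j in range(r + 1, n + 1)]          # rule 2
    return pairs


print(prematch_1(6, 4))
```

For $n = 6$ and $r = 4$, the first rule pairs $u_1$ with $v_4$ (the wrap-around case), and the second rule pairs $u_5$ with $v_5$ and $u_6$ with $v_6$.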


Figure 1. Parallel paths between $u = 1110$ and $v = 0000$ in $Q_4$ with two faulty (dark) nodes.

Procedure-I
1. For $1 \le i \le r$ and the paired neighbors $u_i$ and $v_{i-1}$, we construct $n - 2$ node-disjoint paths between $u_i$ and $v_{i-1}$, which consist of $r - 2$ paths of the form

$$u_i\langle S_1\rangle v_{i-1}, \qquad (1)$$

where $S_1$ is the set of all cyclic permutations of the sequence $(i + 1, \ldots, r, 1, \ldots, i - 2)$, and of $n - r$ paths of the form

$$u_i\langle h, i + 1, \ldots, r, 1, \ldots, i - 2, h\rangle v_{i-1}, \qquad (2)$$

for all $h$, $r + 1 \le h \le n$.

2. For $r + 1 \le j \le n$ and the paired neighbors $u_j$ and $v_j$, we construct $n - 1$ node-disjoint paths between $u_j$ and $v_j$, which consist of $r$ paths of the form

$$u_j\langle S_2\rangle v_j, \qquad (3)$$

where $S_2$ is the set of all cyclic permutations of the sequence $(1, 2, \ldots, r)$, and of $n - r - 1$ paths of the form

$$u_j\langle h, 1, 2, \ldots, r, h\rangle v_j, \qquad (4)$$

for all $h \ne j$, $r + 1 \le h \le n$.

The paths constructed by cyclic permutations of a sequence are pairwise disjoint (see, for example, [18]). It is easy to verify that for each pair of neighbors of $u$ and $v$, the paths constructed between them are pairwise disjoint. Figure 1 shows an example of parallel paths in $Q_4$ with two faulty (dark) nodes.

For a path $P_i$ from $u_i$ to $v_j$, we define $u_i\langle \ldots k\rangle$ as the node on the path $P_i$ reached by starting from $u_i$ and following the edge labels in $\langle \ldots k\rangle$.

Lemma 3.1 Let $(u_x, v_y)$ and $(u_s, v_t)$ be two pairs given by Prematch-I. Then there is at most one path in the path set constructed by Procedure-I for the pair $(u_x, v_y)$ that shares common nodes with a path in the path set constructed by Procedure-I for the pair $(u_s, v_t)$.
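Before the proof, a small sketch may help make Procedure-I and the statement of Lemma 3.1 concrete. It enumerates the Procedure-I path sets and counts, for every ordered pair of Prematch-I pairs, how many paths of the first set touch the second set. All function names are assumptions made for this illustration, and the check covers only the tiny instance $n = 4$, $r = 3$ of Figure 1; it is not a substitute for the proof.

```python
def flip(node, i):
    """Flip the i-th bit (1-indexed from the left) of a binary string."""
    bit = "1" if node[i - 1] == "0" else "0"
    return node[: i - 1] + bit + node[i:]


def trace(start, labels):
    """List of nodes visited when following the edge labels from start."""
    path = [start]
    for i in labels:
        path.append(flip(path[-1], i))
    return path


def wrap(i, r):
    """Cyclic index in 1..r."""
    return (i - 1) % r + 1


def rotations(seq):
    """All cyclic permutations of a list."""
    return [seq[k:] + seq[:k] for k in range(len(seq))]


def procedure_1(n, r):
    """Map each Prematch-I pair (a, b) to the Procedure-I paths from u_a to v_b."""
    u = "1" * r + "0" * (n - r)
    paths = {}
    for i in range(1, r + 1):
        base = [wrap(i + k, r) for k in range(1, r - 1)]  # i+1, ..., i-2, cyclically
        seqs = rotations(base) + [[h] + base + [h] for h in range(r + 1, n + 1)]
        paths[(i, wrap(i - 1, r))] = [trace(flip(u, i), s) for s in seqs]
    for j in range(r + 1, n + 1):
        base = list(range(1, r + 1))
        extra = [[h] + base + [h] for h in range(r + 1, n + 1) if h != j]
        paths[(j, j)] = [trace(flip(u, j), s) for s in rotations(base) + extra]
    return paths


def conflicts(ps, qs):
    """Number of paths in ps sharing at least one node with some path in qs."""
    used = set().union(*qs)
    return sum(any(x in used for x in p) for p in ps)


paths = procedure_1(4, 3)  # the setting of Figure 1: u = 1110, v = 0000
worst = max(conflicts(paths[a], paths[b]) for a in paths for b in paths if a != b)
print(worst)
```

In this instance the only collisions involve the detour path of a pair $(u_i, v_{i-1})$ and one cyclic-permutation path of the pair $(u_4, v_4)$, which is the kind of overlap the lemma allows.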

PROOF. For two paths $P_x$ and $P_s$ such that $P_x$ is for the pair $(u_x, v_y)$ and $P_s$ is for the pair $(u_s, v_t)$, $x \ne s$, assume that $P_x$ and $P_s$ have a common node. Then the same set of bits in $u_x\langle \ldots k\rangle$ and $u_s\langle \ldots k'\rangle$ differ from those of $u$. Since the $x$th bit and the $s$th bit in the common node are different from those of $u$, the common node must be of the form $u_x\langle \ldots s, \ldots k\rangle = u_s\langle \ldots x, \ldots k'\rangle$. We show below that this node must have the form $u_x\langle s, \ldots k\rangle$. Thus, the path $P_x$ is uniquely determined by $u_x$ and $u_{x,s}$.

Case 1. Suppose $1 \le x \le r$ and $1 \le s \le r$. Suppose the common node is of the form $u_x\langle \ldots s', s, \ldots k\rangle = u_s\langle \ldots x, \ldots k'\rangle$; then $s'$ must be $s - 1$ (if $s \ne x + 1$), $s - 2$ (if $s = x + 1$), or $h$ for some $h > r$. If $s' = s - 1$, then the node $u_s\langle \ldots x, \ldots k'\rangle$ has its $(s - 1)$th bit identical to that of $u$, while the node $u_x\langle \ldots s', s, \ldots k\rangle$ has its $(s - 1)$th bit different from that of $u$. If $s' = s - 2$ and $s = x + 1$, then there is no node of the form $u_s\langle \ldots x, \ldots k'\rangle$. Thus, the index $s'$ does not exist, and $P_x$ must be of the form $P_x = u_x\langle s, \ldots\rangle v_{x-1}$. If $s' = h$, then $u_x\langle \ldots s, \ldots k\rangle$ must be of the form $u_x\langle h, \ldots s, \ldots k\rangle$. In that case, the sequences in $P_x$ and $P_s$ are constructed by cyclic permutations of a sequence $(1, \ldots, r)$ except for the index $h$. It is known that paths constructed by cyclic permutations of a sequence are disjoint. This property still holds when the index $h > r$ is added, as in $u_x\langle h, \ldots\rangle v_{x-1}$. This contradicts the assumption that $P_x$ and $P_s$ have a common node. Thus, the index $s' = h$ does not exist.

Case 2. Suppose $1 \le x \le r$ and $r + 1 \le s \le n$, or $r + 1 \le x \le n$ and $1 \le s \le r$. First assume $1 \le x \le r$ and $r + 1 \le s \le n$. The sequence in the path $P_x$ must be of the form $\langle h, x + 1, \ldots, h\rangle$ for some $h > r$ since $s > r$. Since $h$ is the only index larger than $r$ in this sequence and $s > r$, we must have $h = s$. Thus, the path $P_x$ must be of the form $u_x\langle s, \ldots k, \ldots\rangle v_{x-1}$. The case $r + 1 \le x \le n$ and $1 \le s \le r$ can be proved by symmetry.

Case 3. Suppose $r + 1 \le x \le n$ and $r + 1 \le s \le n$. The sequences in $P_x$ and $P_s$ cannot be permutations of $(1, \ldots, r)$ because $x, s > r$. Thus, $P_x$ must be of the form $u_x\langle h, 1, \ldots, r, h\rangle v_x$ and $P_s$ must be of the form $u_s\langle g, 1, \ldots, r, g\rangle v_s$ for some $h, g > r$. In that case, $s$ should be $h$ and $x$ should be $g$ because $h$ and $g$ are the only indices larger than $r$. Thus, the path $P_x$ is of the form $u_x\langle s, \ldots k, \ldots\rangle v_x$.

Combining all cases, we complete the proof.

Fact 3.1 For a pair $(u_i, v_{i-1})$, $1 \le i \le r$, given by Prematch-I, a path of the form $u_i\langle i + 1, \ldots, r, 1, \ldots, i - 2\rangle v_{i-1}$ has no common nodes with any other paths constructed by Procedure-I.

We have shown that for each pair of nodes given by Prematch-I, Procedure-I constructs at least $n - 2$ disjoint paths between them. Since there may be up to $n - 2$ faulty nodes, in the worst case there can be a pair $(u_i, v_{i-1})$ of nodes given by Prematch-I for which all $n - 2$ paths constructed by Procedure-I are blocked. In this case, we pair the neighbors of $u$ and $v$ by the following rule:

Prematch-II {Assumption: there is a pair $(u_i, v_{i-1})$, $1 \le i \le r$, given by Prematch-I such that all $n - 2$ paths constructed by Procedure-I for $(u_i, v_{i-1})$ are blocked by faulty nodes.}
1. $u_i$ is paired with $v_{i-2}$;
2. $u_{i-1}$ is paired with $v_i$;
3. $u_{i+1}$ is paired with $v_{i-1}$;
4. for other neighbors of $u$ and $v$, use Prematch-I.

In Prematch-II, operations on indices between $1$ and $r$ are taken mod $r$. For each pair given by Prematch-II, we construct a path as follows.

Procedure-II
1. For a pair of the form $(u_i, v_{i-2})$, the path is $u_i\langle i - 1, i + 1, \ldots, r, 1, \ldots, i - 3\rangle v_{i-2}$;
2. For a pair of the form $(u_{i-1}, v_i)$, the path is $u_{i-1}\langle i + 1, i + 2, \ldots, r, 1, \ldots, i - 2\rangle v_i$;
3. For a pair of the form $(u_{i+1}, v_{i-1})$, the path is $u_{i+1}\langle i + 2, \ldots, r, 1, \ldots, i - 2, i\rangle v_{i-1}$;
4. For other pairs, use Procedure-I to construct paths between them: for a pair $(u_g, v_{g-1})$, $g \ne i - 1, i, i + 1$, $1 \le g \le r$, the path is $u_g\langle g + 1, \ldots, r, 1, \ldots, g - 2\rangle v_{g-1}$; for a pair $(u_j, v_j)$, $r + 1 \le j \le n$, the path is $u_j\langle 2, 3, \ldots, r, 1\rangle v_j$ if $i = 1$, and $u_j\langle 1, \ldots, r\rangle v_j$ if $i \ne 1$.

Lemma 3.2 Under the conditions of Prematch-II, Procedure-II constructs $n$ fault-free parallel paths of length at most $\mathrm{dist}(u, v) + 2$ from $u$ to $v$.

PROOF.

It is easy to see that the paths constructed by Procedure-II have length bounded by $\mathrm{dist}(u, v) + 2$. Except for paths of the form $u_j\langle 2, \ldots\rangle v_j$ or $u_j\langle 1, \ldots\rangle v_j$, $r + 1 \le j \le n$, whose length is $\mathrm{dist}(u, v) + 2$, the other paths have length $\mathrm{dist}(u, v)$.

Now we show that all $n$ paths constructed by Procedure-II are fault-free. After that, we show that these $n$ paths are disjoint. Recall that all possible paths constructed by Procedure-I for the pair $(u_i, v_{i-1})$ are blocked by faulty nodes. Denote these paths by $FP_i$. The path $P_i = u_i\langle i - 1, i + 1, \ldots, r, 1, \ldots, i - 3\rangle v_{i-2}$ and $FP_i$ only share the node $u_i$, because every node in $FP_i$ has its $(i - 1)$th bit identical to that of $u$, while the nodes other than $u_i$ in $P_i$ have their $(i - 1)$th bit different from that of $u$. Since the node $u_i$ is non-faulty, the path $P_i$ is fault-free. The path $P_{i-1} = u_{i-1}\langle i + 1, \ldots, r, 1, \ldots, i - 2\rangle v_i$ and $FP_i$ have no common nodes because the $i$th bits of nodes in $P_{i-1}$ and in $FP_i$ are not identical. Thus, $P_{i-1}$ is fault-free. The path $P_{i+1} = u_{i+1}\langle i + 2, \ldots, r, 1, \ldots, i - 2, i\rangle v_{i-1}$ and $FP_i$ only share the node $v_{i-1}$, because the nodes other than $v_{i-1}$ $(= u_{i+1}\langle i + 2, \ldots, i - 2, i\rangle)$ have their $i$th bit identical to that of $u$, while nodes in $FP_i$ have their $i$th bit different from that of $u$. A path of the form $P_g = u_g\langle g + 1, \ldots, r, 1, \ldots, g - 2\rangle v_{g-1}$, $1 \le g \le r$, has no common nodes with any other paths constructed by Procedure-I by Fact 3.1. Since $g \ne i$, the path $P_g$ is fault-free. Finally, consider a path $P_j$ constructed for a pair $(u_j, v_j)$, $r + 1 \le j \le n$. If $i = 1$, all faulty nodes are in paths between $u_1$ and $v_r$, and $P_j$ is of the form $u_j\langle 2, 3, \ldots, r, 1\rangle v_j$. Since a node $u_j\langle 2, \ldots k\rangle$ in the path $P_j$ can be identical only to a node of the form $u_2\langle j, \ldots k'\rangle$, $k' \ne 1$, by Lemma 3.1, the path $P_j$ and $FP_i$ have no common nodes. In case $i \ne 1$, all faulty nodes are in paths between $u_2$ and $v_1$, and $P_j$ is of the form $u_j\langle 1, \ldots, r\rangle v_j$. Similarly, we can prove that $P_j$ is fault-free, since a node $u_j\langle 1, \ldots k\rangle$ in $P_j$ can be identical only to a node of the form $u_1\langle j, \ldots k'\rangle$. Therefore, all paths constructed by Procedure-II are fault-free.

Now we show that the paths constructed by Procedure-II are disjoint. It is easy to see that $P_i$ and $P_{i-1}$ have no common nodes because of the index $i$. Similarly, $P_{i-1}$ and $P_{i+1}$ have no common nodes because of the index $i - 1$. Also, $P_i$ and $P_{i+1}$ have no common nodes, because the nodes other than $v_{i-1}$ in $P_{i+1}$ have their $i$th bit identical to that of $u$, while nodes in $P_i$ have their $i$th bit different from that of $u$, and the node $v_{i-1}$ is not included in $P_i$. Thus, the paths $P_{i-1}$, $P_i$, and $P_{i+1}$ are disjoint. Each path of the form $u_g\langle g + 1, \ldots, r, 1, \ldots, g - 2\rangle v_{g-1}$, $1 \le g \le r$, is disjoint from the others by Fact 3.1. To show that paths constructed for pairs $(u_g, v_{g-1})$, $g \ne i - 1, i, i + 1$, are disjoint from the paths $P_{i-1}$, $P_i$, and $P_{i+1}$, suppose $\langle \ldots g', g, \ldots\rangle$ is the sequence of the path $P_i$. Since $g \ne i - 1, i + 1$, $g'$ becomes $g - 1$ and the path $P_g$ cannot have common nodes with the path $P_i$. Suppose $\langle \ldots g', g, \ldots\rangle$ is the sequence of $P_{i-1}$; then since $g \ne i + 1$, $g'$ becomes $g - 1$, and $P_g$ and $P_{i-1}$ have no common nodes. Also, if $\langle \ldots g', g, \ldots\rangle$ is the sequence of $P_{i+1}$, then since $g \ne i$, $g'$ becomes $g - 1$ when $g \ne i + 2$, and $g - 1$ becomes $i + 1$ when $g = i + 2$. In both cases, it is easy to see that $P_g$ and $P_{i+1}$ have no common nodes. Finally, each path of the form $u_j\langle 1, \ldots, r\rangle v_j$ or $u_j\langle 2, \ldots, r, 1\rangle v_j$, $r + 1 \le j \le n$, is disjoint from the other paths because of the unique index $j$. Therefore, all paths constructed by Procedure-II are pairwise disjoint.

Figure 2 shows paths constructed by Procedure-II between $u = 1111$ and $v = 0000$ with two faulty (dark) nodes in $Q_4$.

Figure 2. Paths constructed by Procedure-II between $u = 1111$ and $v = 0000$ in $Q_4$.

So far, we have assumed that all neighbors of the source node $u$ and the destination node $v$ are non-faulty. Now we relax this restriction to deal with faulty neighbors of the two nodes $u$ and $v$. We introduce Prematch-III to pair the edges incident on the neighbors of the nodes $u$ and $v$, instead of the neighbors of $u$ and $v$.

Prematch-III {Assumption: $u$ or $v$ has at least one faulty neighbor, and edges are paired only when all nodes on them are non-faulty.}
for each edge $[u_i, u_{i,i'}]$ where both $u_i$ and $u_{i,i'}$ are non-faulty do
1. if $1 \le i, i' \le r$ and $i' = i + 1$, then pair $[u_i, u_{i,i'}]$ with the edge $[v_{i-1,i-2}, v_{i-1}]$;
2. if $1 \le i, i' \le r$ and $i' = i - 1$, then pair $[u_i, u_{i,i'}]$ with the edge $[v_{i'-1,i'-2}, v_{i'-1}]$;
3. otherwise, pair $[u_i, u_{i,i'}]$ with $[v_{j,j'}, v_j]$, where the indices $j$ and $j'$ are such that Prematch-I pairs the node $u_{i'}$ with $v_j$, and the node $u_i$ with $v_{j'}$.

Note that it is possible that an edge $[u_i, u_{i,i'}]$ with both $u_i$ and $u_{i,i'}$ non-faulty is not paired with any edge, because the corresponding edge in Prematch-III contains faulty nodes. For each pair of edges given by Prematch-III, we construct a path between them by the algorithm called Procedure-III. Unless stated otherwise, the sequences in paths constructed by Procedure-III follow Procedure-I. We assume that for edges $[u_i, u_{i,i'}]$, $1 \le i' (\ne i) \le n$, their pairs are given in increasing order of $i'$, and operations on indices between $1$ and $r$ are taken mod $r$.

Procedure-III
1. for $1 \le i, i' \le r$ and $i' = i + 1$ and paired edges $[u_i, u_{i,i'}]$, $[v_{i-1,i-2}, v_{i-1}]$, construct the path $u_i\langle i', \ldots, i - 2\rangle v_{i-1}$;
2. for $1 \le i, i' \le r$ and $i' = i - 1$ and paired edges $[u_i, u_{i,i'}]$, $[v_{i'-1,i'-2}, v_{i'-1}]$, construct a path by flipping $i$ and $i'$ in the path $u_{i'}\langle i, \ldots, i' - 2\rangle v_{i'-1}$;
3. otherwise, for paired edges $[u_i, u_{i,i'}]$, $[v_{j',j}, v_j]$, if $i < i'$, construct a path by flipping $j$ and $j'$ in the path $u_i\langle i', \ldots, j\rangle v_{j'}$; if $i > i'$, construct a path by flipping $i$ and $i'$ in the path $u_{i'}\langle i, \ldots, j'\rangle v_j$.

Notice that Procedure-III forces the sequence between a pair induced by Prematch-III to be obtained from a path $P$ constructed by Procedure-I such that $P$ is of the form $u_i\langle i', \ldots, j'\rangle v_j$, where $i < i'$, or $i = r$ when $i' = 1$. Thus, consider paths of the form $u_i\langle i', \ldots\rangle v_j$ constructed by Procedure-I such that $i < i'$, or $i = r$ when $i' = 1$. The following fact shows that any two paths of the forms $u_x\langle x', \ldots\rangle v_y$ and $u_s\langle s', \ldots\rangle v_t$ as described above have no nodes in common. It follows directly from Lemma 3.1.

Fact 3.2 Let $P_x$ be a path of the form $u_x\langle x', \ldots\rangle v_y$ such that $x < x'$, or $x = r$ when $x' = 1$, as given in Procedure-I. Then $P_x$ has no nodes in common with a path $P_s$ of the form $u_s\langle s', \ldots\rangle v_t$, $s < s'$.

Lemma 3.3 For a non-faulty node $u_i$, Procedure-III constructs at most $n - 1$ fault-free one-to-many disjoint paths from the node $u_i$ to all non-faulty nodes $v_j$, $1 \le j (\ne i) \le n$.

PROOF.

For all non-faulty edges of the form $[u_i, u_{i,i'}]$, $1 \le i' \le n$, consider the edges paired by Prematch-III. Suppose $1 \le i \le r$. For $1 \le i' \le r$, $[u_i, u_{i,i'}]$ is paired with $[v_{i-1,i-2}, v_{i-1}]$ when $i' = i + 1$, and with $[v_{i'-1,i'-2}, v_{i'-1}]$ when $i' = i - 1$. Otherwise, $[u_i, u_{i,i'}]$ is paired with $[v_{j,j'}, v_j]$ such that $v_j$ is paired with $u_{i'}$ by Prematch-I. Since $v_j$ is uniquely paired with $u_i$ and $u_{i'}$ by Prematch-I, there are at most $n - 1$ non-faulty edge pairs constructed by Prematch-III. Suppose $r + 1 \le i \le n$. Since for $1 \le i' \le n$, a non-faulty node $v_j$, $1 \le j \le n$, is uniquely paired with a node $u_{i'}$ by Prematch-I, there are at most $n - 1$ non-faulty edge pairs constructed by Prematch-III.

Now, consider the paths between the above edge pairs. All paths constructed by Procedure-III are obtained from paths constructed by Procedure-I of the form $u_i\langle i', \ldots\rangle v_j$, with $i < i'$, or $i = r$ when $i' = 1$, by flipping the first two indices or the last two indices. We have shown in Fact 3.2 that such paths of the form $u_i\langle i', \ldots\rangle v_j$, with $i < i'$, or $i = r$ when $i' = 1$, are disjoint. Thus, flipping the first two indices on such paths makes the paths share the node $u_i$, and flipping the last two indices on these paths makes the paths have pairwise different entering nodes to $v$, which are given by Prematch-III. Thus, Procedure-III constructs at most $n - 1$ fault-free one-to-many disjoint paths from $u_i$ to all non-faulty nodes $v_j$, $1 \le j \le n$.

Further, the following fact can be easily verified from the above discussion.

Fact 3.3 For any two edge pairs $([u_x, u_{x,x'}], [v_{y,y'}, v_y])$ and $([u_s, u_{s,s'}], [v_{t,t'}, v_t])$ given by Prematch-III, paths constructed by Procedure-III for these pairs have common nodes of the form $u_x\langle x'\rangle = u_s\langle s'\rangle$, where $x' = s$ and $x = s'$.
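Returning to the fault-free-neighbor case, the re-pairing of Prematch-II and the single path per pair of Procedure-II can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the blocked index `i` is supplied by the caller rather than detected, and $r \ge 3$ is assumed so that the cyclic index runs are well defined. Each run is generated by its starting index and length, which avoids the ambiguity of writing "$i + 1, \ldots, i - 3$" when indices wrap.

```python
def flip(node, i):
    """Flip the i-th bit (1-indexed from the left) of a binary string."""
    bit = "1" if node[i - 1] == "0" else "0"
    return node[: i - 1] + bit + node[i:]


def trace(start, labels):
    """List of nodes visited when following the edge labels from start."""
    path = [start]
    for i in labels:
        path.append(flip(path[-1], i))
    return path


def wrap(i, r):
    """Cyclic index in 1..r."""
    return (i - 1) % r + 1


def run(start, length, r):
    """Cyclic run of `length` consecutive indices in 1..r beginning at `start`."""
    return [wrap(start + k, r) for k in range(length)]


def procedure_2(n, r, i):
    """Procedure-II paths, assuming every Procedure-I path for (u_i, v_{i-1}) is blocked.

    Returns {(a, b): path from u_a to v_b} under the Prematch-II pairing.
    """
    u = "1" * r + "0" * (n - r)
    seqs = {
        (i, wrap(i - 2, r)): [wrap(i - 1, r)] + run(i + 1, r - 3, r),  # rule 1
        (wrap(i - 1, r), i): run(i + 1, r - 2, r),                      # rule 2
        (wrap(i + 1, r), wrap(i - 1, r)): run(i + 2, r - 3, r) + [i],   # rule 3
    }
    special = {i, wrap(i - 1, r), wrap(i + 1, r)}
    for g in range(1, r + 1):          # rule 4, indices up to r
        if g not in special:
            seqs[(g, wrap(g - 1, r))] = run(g + 1, r - 2, r)
    for j in range(r + 1, n + 1):      # rule 4, indices above r
        seqs[(j, j)] = run(2, r, r) if i == 1 else run(1, r, r)
    return {pair: trace(flip(u, pair[0]), s) for pair, s in seqs.items()}


paths = procedure_2(5, 4, 1)  # Q_5, u = 11110, pair (u_1, v_4) assumed blocked
for (a, b), p in paths.items():
    print(f"u_{a} -> v_{b}:", " -> ".join(p))
```

In this instance the three Procedure-I paths for $(u_1, v_4)$ pass through $00110$, $01010$, and one of $01111$, $00111$, $00011$, so three faults can block them all; none of these nodes appears on the five paths printed above, in line with Lemma 3.2.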

4. Parallel routing algorithm on faulty hypercube networks

First, consider the lower bound on the length of the $\min\{\deg_f(u), \deg_f(v)\}$ node-disjoint paths from a node $u = 1^r 0^{n-r}$ to a node $v = 0^n$ in the hypercube $Q_n$, where $r \ge 4$. Suppose a neighbor $u_i$ of $u$, $r + 1 \le i \le n$, is non-faulty, and we want to find a path from $u$ to $v$ via $u_i$. Assume that all neighbors of $u_i$ are faulty except the two nodes $u$ and $u_{i,i'}$, $r + 1 \le i' (\ne i) \le n$. Then a fault-free path of the form $u\langle i, i', \ldots\rangle v$ from $u$ to $v$ has length at least $\mathrm{dist}(u, v) + 4$, since both the $i$-edge and the $i'$-edge must later be traversed again. Thus, the length of the $\min\{\deg_f(u), \deg_f(v)\}$ disjoint paths from $u$ to $v$ is at least $\mathrm{dist}(u, v) + 4$.

We are now ready to present our main algorithm for parallel routing in the hypercube networks $Q_n$ with at most $n - 2$ faulty nodes. For two non-faulty nodes $u = 1^r 0^{n-r}$ and $v = 0^n$ in $Q_n$, our algorithm constructs $\min\{\deg_f(u), \deg_f(v)\}$ node-disjoint fault-free paths from $u$ to $v$ such that the length of the paths is bounded by $\mathrm{dist}(u, v) + 4$. The algorithm, called Parallel-Routing, is given in Figure 3.

Parallel-Routing

Input: non-faulty nodes $u = 1^r 0^{n-r}$ and $v = 0^n$ in $Q_n$ with at most $n - 2$ faulty nodes.
Output: $\min\{\deg_f(u), \deg_f(v)\}$ parallel fault-free paths of length at most $\mathrm{dist}(u, v) + 4$ from $u$ to $v$.

1. case 1. $u$ and $v$ have no faulty neighbors
   for each pair $(u_i, v_j)$ given by Prematch-I do
   1.1 if all paths for $(u_i, v_j)$ by Procedure-I include faulty nodes
       then use Prematch-II and Procedure-II to construct $n$ parallel paths from $u$ to $v$; STOP.
   1.2 if there is a fault-free unused path from $u_i$ to $v_j$ by Procedure-I
       then mark the path as used by $(u_i, v_j)$;
   1.3 if all fault-free paths constructed for $(u_i, v_j)$ include used nodes
       then pick any fault-free path $P$ for $(u_i, v_j)$, and for the pair $(u_{i'}, v_{j'})$ that uses a node on $P$, find a new path;
2. case 2. $u$ or $v$ has at least one faulty neighbor
   for each edge pair $([u_i, u_{i,i'}], [v_{j,j'}, v_j])$ given by Prematch-III do
   2.1 if there is a fault-free unused path from $u_i$ to $v_j$ by Procedure-III
       then mark the path as used by the pair $([u_i, u_{i,i'}], [v_{j,j'}, v_j])$;
   2.2 if all fault-free paths constructed for the pair include used nodes
       then pick any fault-free path $P$ for the edge pair, and for the edge pair that uses a node on $P$, find a new path;

Figure 3. Parallel routing on the hypercube network with faulty nodes.

Lemma 3.2 guarantees that step 1.1 of the algorithm Parallel-Routing constructs $n$ fault-free parallel paths of length at most $\mathrm{dist}(u, v) + 2$ from $u$ to $v$. Step 1.3 of the algorithm requires further explanation. In particular, we need to show that for the pair $(u_{i'}, v_{j'})$, we can always construct a new fault-free path from $u_{i'}$ to $v_{j'}$ in which no nodes are used by other paths. This is ensured by the following lemma.

Lemma 4.1 Let $(u_i, v_j)$ and $(u_{i'}, v_{j'})$ be two pairs given by Prematch-I such that two paths constructed for $(u_i, v_j)$ and $(u_{i'}, v_{j'})$ share a node. Then the algorithm Parallel-Routing can always find fault-free paths for $(u_i, v_j)$ and $(u_{i'}, v_{j'})$ in which no nodes are used by other paths.

PROOF. In the following discussion, we assume that we search for a fault-free and unused path for each pair given by Prematch-I in the order of cyclic permutations as given in Procedure-I, for example, $u_1\langle 2, \ldots\rangle v_r$, $u_1\langle 3, \ldots\rangle v_r$, $\ldots$, $u_2\langle 3, \ldots\rangle v_1$. Suppose all fault-free paths constructed for $(u_i, v_j)$ include used nodes, and one fault-free path $P$ is picked for $(u_i, v_j)$, which includes a node used for $(u_{i'}, v_{j'})$, $i' < i$. We want to show that we can always find a fault-free unused path for $(u_{i'}, v_{j'})$. Since all fault-free paths constructed for $(u_i, v_j)$ are used, all unused paths must be blocked by faulty nodes. By Lemma 3.1, a node of the form $u_i\langle g, \ldots k\rangle$ can be identical to a node of the form $u_g\langle i, \ldots k'\rangle$. Thus, all paths leaving $u_i$ with $g > i$ are unused, and there are $(n - r) + (r - i) = n - i$ such paths. The remaining paths of the form $u_i\langle g, \ldots\rangle v_j$, $1 \le g \le i - 2$, are used or faulty. Since a used path means that the unused paths searched before it must also be blocked by faulty nodes, when we search for a new path for $(u_{i'}, v_{j'})$ among paths unused by other pairs, there exists at least one unused fault-free path that can be used for $(u_{i'}, v_{j'})$. From the above discussion, we can easily show that for a pair $(u_x, v_y)$, $x > i$, there exists at least one unused fault-free path between them. Therefore, step 1.3 of the algorithm Parallel-Routing occurs at most once during the whole execution.

A similar analysis shows that step 2.2 of the algorithm Parallel-Routing can always construct a new fault-free path without nodes used by other paths.

Lemma 4.2 Let $([u_x, u_{x,x'}], [v_{y,y'}, v_y])$ and $([u_s, u_{s,s'}], [v_{t,t'}, v_t])$ be edge pairs given by Prematch-III such that two paths constructed for them share a node. Then the algorithm Parallel-Routing can always find fault-free paths between them in which no nodes are used by other paths.

We summarize all these discussions in the following theorem.

Theorem 4.3 If the hypercube network $Q_n$ has at most $n - 2$ faulty nodes, then for each pair of non-faulty nodes $u$ and $v$ in $Q_n$, in time $O(n^2)$ the algorithm Parallel-Routing constructs $\min\{\deg_f(u), \deg_f(v)\}$ node-disjoint fault-free paths of length bounded by $\mathrm{dist}(u, v) + 4$ from $u$ to $v$.

PROOF. We first discuss the length of the $\min\{\deg_f(u), \deg_f(v)\}$ node-disjoint fault-free paths. It is easy to see that paths constructed by Procedure-I have length at most $\mathrm{dist}(u, v) + 4$. If paths are constructed by Procedure-II or Procedure-III, the length is still at most $\mathrm{dist}(u, v) + 4$, because all paths constructed by Procedure-II or Procedure-III are based on Procedure-I, only flipping the first or last two indices in the paths.

We now discuss the time complexity of the algorithm Parallel-Routing. For each pair given by Prematch-I, a path is constructed by the algorithm by searching for a proper path in a set of paths between them, which takes time $O(k_i \cdot n + n)$, where $k_i$ is the number of faulty nodes in the set of paths for the pair $(u_i, v_j)$. If we find a fault-free and unused path of the form $u_{i'}\langle i, \ldots\rangle v_{j'}$ for a pair $(u_{i'}, v_{j'})$, $i' < i$, then we mark the node $u_{i,i'}$ as a used node. In this way, we can detect used paths in time $O(i)$, since there are at most $i - 1$ used paths for $(u_i, v_j)$. If all fault-free paths for the pair $(u_i, v_j)$ include used nodes, we pick any fault-free path $P$ for $(u_i, v_j)$, and for the pair $(u_{i'}, v_{j'})$ that uses a node on $P$, we find a new path. As discussed in Lemma 4.1, this happens at most once during the whole execution. Thus, the time complexity is bounded by $O(k_1 n + \cdots + k_n n + n^2) = O(n^2)$, since the number $k_1 + \cdots + k_n$ is bounded by $n - 2$. If for a pair $(u_i, v_j)$ given by Prematch-I all possible paths are blocked by faulty nodes, we simply ignore all paths constructed for other pairs $(u_{i'}, v_{j'})$, $i' < i$, and apply Procedure-II. Thus, it takes additional $O(n^2)$ time to construct paths for pairs given by Prematch-II. For pairs given by Prematch-III, paths are constructed in a way similar to the construction for pairs given by Prematch-I. Thus, without a detailed explanation, we conclude that the time complexity for constructing paths between non-faulty neighbors of $u$ and $v$ is bounded by $O(n^2)$.

Figure 4 shows paths constructed by Parallel-Routing between $u = 1111$ and $v = 0000$ in $Q_4$.

Figure 4. Paths constructed by Parallel-Routing between $u = 1111$ and $v = 0000$ in $Q_4$.
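The greedy core of case 1 (step 1.2) can be sketched as below. This is a deliberately simplified illustration, not the full algorithm: it omits the fallback to Prematch-II (step 1.1), the rerouting of step 1.3, and all of case 2, and it returns `None` whenever one of those steps would be needed. It also scans whole paths rather than using the marking scheme from the proof, so it does not achieve the $O(n^2)$ bound. The function names are assumptions made for this sketch.

```python
def flip(node, i):
    """Flip the i-th bit (1-indexed from the left) of a binary string."""
    bit = "1" if node[i - 1] == "0" else "0"
    return node[: i - 1] + bit + node[i:]


def trace(start, labels):
    """List of nodes visited when following the edge labels from start."""
    path = [start]
    for i in labels:
        path.append(flip(path[-1], i))
    return path


def rotations(seq):
    """All cyclic permutations of a list."""
    return [seq[k:] + seq[:k] for k in range(len(seq))]


def procedure_1(n, r):
    """Map each Prematch-I pair (a, b) to the Procedure-I paths from u_a to v_b."""
    u = "1" * r + "0" * (n - r)
    paths = {}
    for i in range(1, r + 1):
        base = [(i + k - 1) % r + 1 for k in range(1, r - 1)]  # i+1, ..., i-2
        seqs = rotations(base) + [[h] + base + [h] for h in range(r + 1, n + 1)]
        paths[(i, (i - 2) % r + 1)] = [trace(flip(u, i), s) for s in seqs]
    for j in range(r + 1, n + 1):
        base = list(range(1, r + 1))
        extra = [[h] + base + [h] for h in range(r + 1, n + 1) if h != j]
        paths[(j, j)] = [trace(flip(u, j), s) for s in rotations(base) + extra]
    return paths


def greedy_case_1(n, r, faults):
    """Step 1.2 only: per pair, take the first path avoiding faults and used nodes."""
    u, v = "1" * r + "0" * (n - r), "0" * n
    used, routes = set(), []
    for candidates in procedure_1(n, r).values():
        for p in candidates:
            if not any(x in faults or x in used for x in p):
                used.update(p)
                routes.append([u] + p + [v])
                break
        else:
            return None  # step 1.1 or 1.3 of Parallel-Routing would be required
    return routes


routes = greedy_case_1(4, 3, faults={"0011", "1001"})
for route in routes:
    print(" -> ".join(route))
```

With the two faults chosen here, the first three pairs take their shortest paths and the pair $(u_4, v_4)$ falls through to its third cyclic permutation, giving one route of length $\mathrm{dist}(u, v) + 2$.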

5. Conclusion

Network strong fault tolerance is a natural extension of the study of network fault tolerance and network parallel routing. In particular, it studies the fault tolerance of large networks with faulty nodes. In this paper, we have studied the strong fault tolerance of the popular hypercube networks, and shown that hypercube networks are strongly fault-tolerant. We developed an $O(n^2)$ time algorithm that, for two given non-faulty nodes $u$ and $v$ in an $n$-dimensional hypercube $Q_n$ with at most $n - 2$ faulty nodes, constructs $\min\{\deg_f(u), \deg_f(v)\}$ node-disjoint fault-free paths from $u$ to $v$ such that the length of the paths is bounded by $\mathrm{dist}(u, v) + 4$. The time complexity of our algorithm is optimal, since each path from $u$ to $v$ may have length as large as $n$, and there can be as many as $n$ node-disjoint paths from $u$ to $v$. Thus, even printing these paths takes time $O(n^2)$. The length of the paths constructed by our algorithm is also optimal, as we can construct pairs of nodes $u$ and $v$ in the hypercube $Q_n$ with $n - 2$ faulty nodes for which any set of $n$ parallel paths connecting $u$ and $v$ has at least one path of length $\mathrm{dist}(u, v) + 4$. Finally, our algorithm does not require prior knowledge of the failures.

Strong fault tolerance for networks with bounded degree, such as ring networks, mesh networks, and butterfly networks, is relatively easy to establish. On the other hand, strong fault tolerance for unbounded-degree networks, such as networks based on Cayley graphs, seems much more difficult. The hypercube networks and the star networks are the first two such classes of networks whose strong fault tolerance has been proved. For star networks, the strong fault tolerance was proved based on the orthogonal partition of the star networks, while for hypercube networks, the strong fault tolerance was proved by careful pre-matching of the neighbors of the source and destination nodes. It will be interesting to study the strong fault tolerance of other hierarchical networks with unbounded degree.

References

[1] G. Birkhoff and S. MacLane, "A Survey of Modern Algebra", The Macmillan Company, New York, 1965.
[2] G.-M. Chiu and S.-P. Wu, "A fault-tolerant routing strategy in hypercube multicomputers", IEEE Trans. Computers 45, 1996, pp. 143-154.
[3] C. C. Chen and J. Chen, "Nearly optimal one-to-many parallel routing in star networks", IEEE Trans. Parallel, Distrib. Syst. 8, 1997, pp. 1196-1202.
[4] M.-S. Chen and K. G. Shin, "Adaptive fault-tolerant routing in hypercube multi-computers", IEEE Trans. Computers 39, 1990, pp. 1406-1416.
[5] K. Day and A. Tripathi, "A comparative study of topological properties of hypercubes and star graphs", IEEE Trans. Parallel, Distrib. Syst. 5, 1994, pp. 31-38.
[6] P. Fraigniaud, "Asymptotically optimal broadcasting and gossiping in faulty hypercube multicomputers", IEEE Trans. Computers 41, 1992, pp. 1410-1419.
[7] Q.-P. Gu and S. Peng, "Optimal algorithms for node-to-node fault tolerant routing in hypercubes", The Computer Journal 39, 1996, pp. 626-629.
[8] Q.-P. Gu and S. Peng, "An efficient algorithm for the k-pairwise disjoint paths problem in hypercubes", J. Parallel and Distributed Computing 60, 2000, pp. 764-774.
[9] P. Hall, "On representatives of subsets", J. London Math. Soc. 10, 1935, pp. 26-30.
[10] M. S. Krishnamoorthy and B. Krishnamurthy, "Fault diameter of interconnection networks", Comput. Math. Appl. 13, 1987, pp. 577-582.
[11] S. Latifi, "Combinatorial analysis of the fault-diameter of the n-cube", IEEE Trans. Computers 42, 1993, pp. 27-33.
[12] S. Lee and K. Shin, "Interleaved all-to-all reliable broadcast on meshes and hypercubes", Proc. Int'l Conf. Parallel Processing, 1990, pp. III-110-113.
[13] T. C. Lee and J. P. Hayes, "Routing and broadcasting in faulty hypercube computers", Proc. Third Conf. Hypercube Concurrent Computers and Applications, 1988, pp. 625-630.
[14] S. Madhavapeddy and I. H. Sudborough, "A topological property of hypercubes: node disjoint paths", 2nd IEEE Symposium on Parallel and Distributed Processing, 1990, pp. 532-539.
[15] E. Oh and J. Chen, "Strong fault-tolerance: parallel routing in star networks with faults", Tech. Report, Dept. Computer Science, Texas A&M University, 2001.
[16] S. Park and B. Bose, "All-to-all broadcasting in faulty hypercubes", IEEE Trans. Computers 46, 1997, pp. 749-755.
[17] P. Ramanathan and K. Shin, "Reliable broadcast in hypercube multicomputers", IEEE Trans. Computers 37, 1988, pp. 1654-1657.
[18] Y. Saad and M. H. Schultz, "Topological properties of hypercubes", IEEE Trans. Computers 37, 1988, pp. 867-872.