3rd Euromicro Workshop on Parallel and Distributed Processing - San Remo (Italy) January 25th-27, 1995. pp 288-295

An Adaptive Deadlock and Livelock Free Routing Algorithm

M. Coli, P. Palazzari
Dipartimento di Ingegneria Elettronica, Università "La Sapienza", Via Eudossiana 18, 00184 Roma
E-mail: [email protected] [email protected]

Abstract
This paper is concerned with Store and Forward deadlocks (DL) arising in interprocessor networks with buffered, packet switched communications. Algorithms which implement DL free routing use either an adaptive or a non-adaptive routing modality. Non-adaptive algorithms underuse the interconnection network bandwidth because they impose restrictions on the routing paths; adaptive algorithms are DL free only under certain hypotheses on the communication topology. In order to overcome these drawbacks, we have implemented an adaptive DL free routing which fully exploits the connectivity of the network and which is independent of its topology. DL is handled by adopting a recovery policy: whenever DL arises, our algorithm removes it within a finite time. We demonstrate deadlock and livelock avoidance by ensuring the presence of a hole in the network buffers; the hole is subjected to random movement. Performance tests, executed on a transputer based parallel machine, show the effectiveness of the algorithm and demonstrate its fault tolerance capabilities.

Keywords: adaptive, routing, deadlock, livelock, fault tolerance.

1 Introduction

In parallel systems the deadlock (DL) phenomenon can take place when resources are shared [8]. In this work we refer to DLs caused by the filling of the buffers present in the nodes of a packet switched communication network (Store and Forward DL [13]). In order to avoid deadlocks in packet switched communication networks, several techniques have been proposed. These techniques can be divided into two categories:
- static routing. Only minimal length paths are used; a path between two nodes is uniquely determined by the source and destination nodes ([4], [5], [7]);
- adaptive routing. All the minimal length paths are used [14]; in order to avoid DL, non-minimal length paths are also used ([1]-[3], [6], [9], [10], [12], [13]).


Static routing algorithms avoid DL by choosing communication paths so that the communication graph is acyclic. As a consequence, they underuse the communication network by imposing restrictions on the number of paths available to transmit a message between two processors. Adaptive routing algorithms can use all the available minimal length paths. DL is avoided either by ensuring that the communication queue graph is acyclic [14] or by performing a misrouting (i.e. a communication which increases the distance of the message from its destination) whenever a channel which moves a message closer to its destination cannot be found (deflection or hot potato routing; chaotic routing waits for the internal queue to become full before performing a misrouting). These algorithms ensure DL free routing only if each communication node has a number of input channels equal to the number of output channels. Deflection routing algorithms perform a misrouting as soon as a message cannot be transmitted in a direction which moves it closer to its destination; as a consequence, a message can be deflected many times before reaching its destination, so the communication latency can grow considerably. We have developed a DL free adaptive routing algorithm which
A- does not require communication nodes to have input degree equal to output degree;
B- waits for a fixed time T before executing a misrouting, so that a misrouting is performed only when the probability that a DL has occurred is high.
A fault on one or more (input or output) communication channels does not stop the transmission of messages, because they are routed around the faulty channels. Our algorithm ensures that the routing modality remains DL free when one or more faulty communication channels are present (point A). Other algorithms do not ensure that routing remains DL free in the presence of faulty channels, because a fault causes the input degree of a communication node to differ from its output degree.
The algorithm we developed is based on a policy of DL recovery and implements adaptive routing because, at


each step of retransmission, a free communication channel is chosen among those channels moving the message closer to its destination. The algorithm is based on the concept of the hole in the network buffers [11]; we demonstrate that an adaptive, deadlock and livelock free routing modality can be implemented under non-restrictive hypotheses.

The parallel machine PM can be described as PM = <P, Ls>, in which P = {P1, P2, ..., PN} is the set of processors and Ls ⊆ P×P is the set of communication channels. A channel cij which connects Pi to Pj is specified by the ordered pair (Pi, Pj). Each processor Pi has a set of communication channels
C(Pi) = Cin(Pi) ∪ Cout(Pi)

in which

Cin(Pi) = {(Px, Pi) | Px ∈ P, (Px, Pi) ∈ Ls}
Cout(Pi) = {(Pi, Py) | Py ∈ P, (Pi, Py) ∈ Ls}.

Ls is a network which makes PM connected: thus, given any two processors Pi and Pj, they are connected in Ls by a path. No other constraint on Ls exists; we may also have
Cin(Pi) ≠ Cout(Pi) for some i,
C(Pi) ≠ C(Pj) for some i, j.
Message transmission among processors is packet switched. Each processor Pi has (fig. 1) a set of buffers given by
SBUF(Pi) = Bin ∪ {Bi} ∪ Bout
in which
- Bi is an internal buffer;
- Bin = {Bin1, Bin2, ..., BinNi} is the set of the input buffers; Bink is the buffer on the kth input channel; its output goes toward Bi;
- Bout = {Bout1, Bout2, ..., BoutNo} is the set of the output buffers; Boutk is the buffer on the kth output channel; its input comes from Bi.
We suppose that each buffer can hold only one message; whenever ambiguity arises, we indicate explicitly the processor P which buffer B belongs to, i.e. B(P).

fig. 1 Connections among the Pi internal buffers (diagram omitted: the input channels Cin1..CinNi feed the buffers Bin1..BinNi, which feed Bi; Bi feeds the buffers Bout1..BoutNo, which feed the output channels Cout1..CoutNo)

2 DL in communications

Let us define a processor Pi to be in a transmission deadlock state when it tries to execute a transmission without ever being successful. We suppose that a message, as soon as it arrives at the target processor Pi, is absorbed by a receiving process without being transmitted to Bi. In order to describe the routing algorithm, we premise some definitions:
- Txi(Pk): (routing function) the channel which must be used by processor Pi (i = 1, 2, ..., N) to transmit a message directed to processor Pk; the transmission of the message through the channel Txi(Pk) moves the message closer to its destination;
- Dest(Pk): gives the processor to which the message, contained in the internal buffer Bi of processor Pk, is directed;
- Free(B, Pk): the number of free positions in buffer B of processor Pk;
- Bin(Pi, Pj): the input buffer associated with channel (Pi, Pj);
- Bout(Pi, Pj): the output buffer associated with channel (Pi, Pj);
- Neigh(Pk) = {Pi | Pi ∈ P, Pi ≠ Pk, C(Pi) ∩ C(Pk) ≠ {}}: the set of processors adjacent to Pk;
- Next(Pk, Pi): the next processor to which a message must be transmitted; the message is in processor Pk and is directed to processor Pi; Next(Pk, Pi) ∈ Neigh(Pk) and Txk(Pi) = Cout(Pk) ∩ Cin(Next(Pk, Pi)).

We demonstrate that our algorithm is DL free by finding a necessary condition for DL existence and, subsequently, by determining a routing modality which ensures that the necessary condition does not occur. In order to establish a theorem which supplies a necessary and sufficient condition for DL to set up in a processor network, we define in Ls a cyclic path of length n as
pn = (Pk0, Pk1), (Pk1, Pk2), ..., (Pkn-1, Pkn = Pk0),
(Pki, Pk(i+1)) ∈ Ls, i = 0, 1, ..., n-1; Pkj ∈ P, j = 0, 1, ..., n-1.

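Before stating the deadlock condition, the buffer model above can be made concrete with a minimal Python sketch; the class and field names below are assumptions of this sketch, not notation from the paper, and each buffer holds a single message as in the model:

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical sketch of the node model: one input buffer per input channel,
# one internal buffer Bi, one output buffer per output channel; every buffer
# holds at most one message.

@dataclass
class Node:
    name: str
    b_in: dict = field(default_factory=dict)    # Bin: channel (src, self) -> message or None
    b_i: Optional[str] = None                   # internal buffer Bi
    b_out: dict = field(default_factory=dict)   # Bout: channel (self, dst) -> message or None

def free(buf):
    """Free(B, P) for one-message buffers: 1 if empty, 0 if full."""
    return 1 if buf is None else 0

# The model puts no constraint on the channel sets: a node may have
# different numbers of input and output channels.
p1 = Node("P1", b_in={("P0", "P1"): None},
          b_out={("P1", "P2"): None, ("P1", "P3"): None})
```

Note that `len(p1.b_in) != len(p1.b_out)`: an input degree different from the output degree is exactly the situation, excluded by deflection routing schemes, under which this algorithm must still remain DL free.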
The processors set involved in pn is PC = {Pk0, Pk1, ..., Pkn-1}.

theorem 2.1: a necessary and sufficient condition (NSC) for DL to take place in PM is that ∃ pn, n ≥ 2, such that:
1) Pk(i+1) = Next(Pki, Dest(Pki)), i = 0, 1, ..., n-1
2) Free(Bin(Pk(i-1), Pki), Pki) = 0, i = 1, 2, ..., n
3) Free(Bi, Pki) = 0 ∀ Pki ∈ PC



4) Free(Bout(Pki, Pk(i+1)), Pki) = 0, i = 0, 1, ..., n-1.
The proof of theorem 2.1 is reported in appendix A. Theorem 2.1 shows that DL can exist if and only if:
- a cyclic path pn exists in Ls; the internal buffer Bi of each processor belonging to pn contains a message that must be transmitted toward the next processor of the cycle (condition 1);
- the cyclic path pn involves the existence of a cycle of full buffers; this cycle is constituted by 3n buffers (conditions 2, 3, 4).
In order to avoid DLs, it is sufficient to ensure that at least one of conditions 1), 2), 3) and 4) of theorem 2.1 does not hold.

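As an illustration of theorem 2.1, the following Python sketch checks the four conditions on a snapshot of the network state; the dictionary-based representation and all names are assumptions of this sketch:

```python
# Follow condition 1), Pk(i+1) = Next(Pki, Dest(Pki)), from each processor;
# if the chain closes into a cycle whose 3n internal, output and input
# buffers are all full (conditions 2-4), deadlock exists.

def find_deadlock_cycle(procs, next_hop, dest, free_bi, free_bout, free_bin):
    """next_hop[p][d]: Next(p, d); dest[p]: Dest(p), or None if Bi(p) is empty;
    free_*: Free(.) for internal/output/input buffers (0 = full)."""
    for start in procs:
        path, p = [], start
        while dest.get(p) is not None and p not in path:
            path.append(p)
            p = next_hop[p][dest[p]]          # condition 1)
        if dest.get(p) is None or p not in path:
            continue                          # chain broke: no cycle through here
        cycle = path[path.index(p):]
        n = len(cycle)
        # conditions 2), 3), 4): the cycle of 3n buffers is entirely full
        if all(free_bi[cycle[i]] == 0
               and free_bout[(cycle[i], cycle[(i + 1) % n])] == 0
               and free_bin[(cycle[i], cycle[(i + 1) % n])] == 0
               for i in range(n)):
            return cycle
    return None
```

On a three-processor ring with every buffer full the function returns the cycle; freeing a single internal buffer (condition 3) makes it return `None`, which is precisely the property the recovery policy of the next section exploits.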
3 A deadlock recovery policy for adaptive routing

We say that buffer B in processor Pi contains a hole when it is empty [11]; the number of holes in B is given by Free(B, Pi). When a message is transmitted from buffer Bi to buffer Bj, we say that a hole has moved from Bj to Bi. In order to implement an adaptive routing modality, the routing functions Txi (i = 1, 2, ..., N) must not be single valued. As adaptive routing management is not well suited to a DL prevention policy, we adopted a DL recovery policy. Our algorithm performs adaptive routing by transmitting each message toward the first available channel which moves it closer to its destination; if, after a time interval T has elapsed, no free channel has been found to move the message closer to its destination, we assume a DL has occurred and a misrouting is performed. The time dependence of pn, with obvious notation, is introduced:
- pn(t) is a cyclic path on Ls which, at instant t, satisfies condition 1) of theorem 2.1;
- PC(t) is the set of processors belonging to pn(t).
In order to avoid DL we must ensure that, if a t0 exists for which a cyclic path pn(t0) satisfies conditions 1), 2), 3) and 4) of

theorem 2.1, a t1 (t0 < t1) exists for which pn(t1) = pn(t0) and a buffer B, with Free(B, Pki(t1)) > 0, is found among those of some processor Pki ∈ PC(t0). After the transmission from Bi toward Bouti is executed, Free(Bi, Pki) = 1 results. Condition 3) of theorem 2.1 is no longer verified and the DL is removed. In order to remove DL it is thus sufficient to ensure that,


if a pn(t) exists which satisfies all the conditions of theorem 2.1, it is always possible to find a Pki ∈ PC(t) and an instant t' > t for which condition 3) of theorem 2.1 is not verified. In the following we shall suppose holes to be subjected to random movement, according to the following

definition 3.1: a hole is subjected to random movement if, being Free(Bi, P) = 1, the input buffer Binj from which to receive a message is chosen in a random and equiprobable way (Binj ∈ {B | Free(B, P) = 0, B ∈ Bin(P)}).

Since, to avoid DL, it is sufficient that a hole breaks the cycle of full buffers by reaching a locked processor Pki ∈ PC in a finite time, let us demonstrate the following

theorem 3.1: sufficient conditions for which a hole reaches a Pi within a [...] time t0, NL = 0 cannot ever exist.1 (Here NL denotes the total number of holes in the network and NL(Pi) the number of holes in the buffers of Pi.)

The hypothesis that a message m is introduced into B ∈ SBUF(Pi) only if NL(Pi) ≥ 2 means that m can be introduced into B only if NL ≥ 2, because NL(Pi) ≥ 2 implies NL ≥ 2; after a message injection NL decreases by one unit. If the message injection causes NL = 1, we have that ∃ Pi : NL(Pi) = 1 and NL(Pj) = 0 ∀ Pj ∈ P, Pj ≠ Pi; thus no more messages can be injected, because ∄ Pi ∈ P : NL(Pi) ≥ 2. As NL decreases only when a new message is injected, and when NL = 1 new messages cannot be introduced, NL = 0 can never be verified. In order to satisfy also condition 1) of theorem 3.2, a sufficient condition is to allow the injection of new messages from Pi ∈ P only if at least two holes are present in the buffers of Pi.

1 After reset it is reasonable to suppose all the buffers are empty.

6 Performances evaluation

We have implemented our adaptive routing algorithm and the non-adaptive one presented in [5] on a 3x3 mesh connected transputer machine (fig. 2). The algorithm presented in [5] achieves immunity from DL by transmitting messages along the horizontal direction in a first phase and along the vertical direction in a second phase. The data reported in the following refer to the performance of the two algorithms submitted to the same traffic. The first test compares the two routing algorithms under completely random traffic of growing intensity; we have fixed a time T0 (two minutes) and we have computed the average number of messages which, during T0, each processor sends to the other, randomly selected, processors. In order to generate random traffic of growing intensity, a process like the following is executed in each processor:

tp := 10 (seconds)
while <tp > 64 ms> do
    while <...> do
        <send a message to a randomly selected processor>
        <pause for tp> {simulates a computing part}
    endwhile
    tp := tp/2
endwhile

The previous process behaves like a program in which the communication part becomes ever more predominant over the computing part (tp is halved at each iteration; the pause of tp seconds simulates the program computing phase).

fig. 2: 3x3 transputer grid (nodes numbered, row by row, 0 1 2 / 5 4 3 / 6 7 8)

For both algorithms (A = adaptive, NA = non-adaptive), figure 3 reports the average number of messages transmitted by a processor for each tp value; on the abscissa the iteration number ni is reported; tp is related to ni by tp = 10 s / 2^ni.

fig. 3: average number of messages transmitted by a processor for the A (with T = 0 ut and T = 200 ut) and NA routing algorithms

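The routing step of section 3 can be sketched as follows; `approaching`, `others`, `channel_free` and `send` are hypothetical helpers standing for the channel tables and link primitives of the real implementation, and T is the recovery timeout:

```python
import random
import time

# Hypothetical sketch of the recovery policy: try for at most T time units to
# transmit on a free channel moving the message closer to its destination;
# after the timeout assume a DL and misroute on any other free channel.

def route_step(approaching, others, channel_free, send, T):
    """approaching/others: output channels that do/do not move the message
    closer to its destination; channel_free(c) -> bool; send(c) transmits."""
    deadline = time.monotonic() + T
    while True:
        free = [c for c in approaching if channel_free(c)]
        if free:
            send(random.choice(free))      # adaptive, minimal-path step
            return "routed"
        if time.monotonic() >= deadline:
            break
    # timeout elapsed: DL is assumed to have occurred, misroute the message
    free = [c for c in others if channel_free(c)]
    if free:
        send(random.choice(free))
        return "misrouted"
    return "blocked"                       # retry at the next step
```

With T = 0 the misrouting branch is taken as soon as no approaching channel is free, which is the hot-potato-like behaviour compared in fig. 3; with T = 200 ut the misrouting is performed only when a DL is likely.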
Our algorithm performs a misrouting if, after a time T has elapsed, no free channel has been found to move the message closer to its destination; when T = 0, our algorithm is similar to hot potato routing. In figure 3 we report the average number of messages transmitted by a processor when our algorithm is adopted with T = 200 ut (unit of time: ut = 64 ms) or with T = 0 ut. Fig. 3 underlines that, if the traffic has random behaviour, the NA algorithm and the A algorithm with T = 200 ut are equivalent. The A algorithm with T = 0 ut gives the worst performance because, when the traffic load is heavy and equally distributed, many misroutings are performed and the latency of each message increases. Usually the traffic in a parallel machine shows a certain regularity and is not correctly modelled by equiprobable random traffic. In order to better characterize the features of A and NA, let us consider some traffic situations.


case 1) Processor 4 sends 10 messages (each of 16 Kbytes) to every other processor in the network (broadcast operation).
case 2) Each processor sends 10 messages (16 Kbytes) to all the processors.
case 3) Transmission of 100 messages (16 Kbytes) between two opposite processors (processor 2 sends 100 messages to processor 6).
case 4) Transmission of 100 messages (16 Kbytes) between the opposite processors of the main diagonal and of the underdiagonal (processor 0 sends 100 messages to processor 8 and processor 5 sends 100 messages to processor 7).

Figure 4 reports the normalized times spent by the A and NA (T = 200 ut) algorithms in the execution of the communications involved in cases 1)-4).

figure 4: normalized communication times spent by A and NA algorithms to execute the communications of cases 1)-4)

In order to show the influence of the parameter T (condition 3 of theorem 3.2) on the performances of our routing algorithm, we have considered its behaviour for many values of T and for different kinds of traffic. Figure 5 reports, as a function of log10(T), the communication time necessary to exchange 1000 messages (16 Kbytes) between processors 0 and 8 and between processors 2 and 6; the two plots concern the communication network without faults (solid line) and with one communication channel fault (dotted line).

figure 5: communication time vs log(T) a) with no faults and b) with the C05 fault

Figure 5 shows that the communication time is minimum when T = 200 ut. When T decreases the communication time grows because messages are frequently misrouted; when T increases the communication time grows because, once a DL situation has occurred, more time is required before it is removed. The figure also shows that the routing algorithm remains DL free when one link fault occurs: all the messages are correctly delivered and the DL situations are correctly removed. The only consequence of the channel fault is a reduction of the network bandwidth.

7 Conclusions

The analysis of the DL free routing algorithms presented in the literature shows that adaptive routing algorithms are dependent on the network topology: DL free behaviour can be achieved only by using communication nodes with input degree equal to output degree. This is a strong limitation, because immunity from DL cannot be ensured when a link fault occurs. We have presented an adaptive routing algorithm which does not suffer from this drawback and which gives better communication time responses than the already known adaptive and non-adaptive routing algorithms. This better behaviour is achieved by performing a misrouting only when it is actually required. We have demonstrated that our routing algorithm is deadlock and livelock free. The actual execution of the algorithm on a 9-transputer parallel machine has validated the theoretical results, i.e.:
- the algorithm showed better time responses with respect to adaptive and non-adaptive hot potato routing algorithms;
- the algorithm removed all the DLs;
- even when there was a communication link fault, the algorithm correctly performed all the communications and all DLs were removed.

Appendix A: proofs of theorems

In order to demonstrate theorem 2.1 we premise definition A.1 and demonstrate the auxiliary theorem 2.2. Let B1 and B2 be two buffers connected through a channel from B1 to B2, and let TX(B1, B2) be a transmission directed from B1 to B2.

definition A.1: the transmission TX(B1, B2) directed from B1 to B2 is said to be locked if B1 attempts to


transmit the message to B2 without ever being successful. Since each buffer holds one message, TX(B1, B2) locked means that buffer B1 is full and, consequently, cannot receive new data.

theorem 2.2: let us consider the two transmissions TX(B1, B2) and TX(B2, B3), B1, B2 and B3 being three buffers sequentially connected; a necessary and sufficient condition for TX(B1, B2) to be locked is that TX(B2, B3) is locked.
proof NC: if TX(B1, B2) is locked, buffer B2 is never available to receive data coming from B1; this happens only if B2 never becomes empty, that is, only if B2 never succeeds in transmitting its own content toward B3; but this means that TX(B2, B3) is locked.
SC: if TX(B2, B3) is locked, B2 is never free because it never succeeds in transmitting its own content to B3; as a consequence B1 never succeeds in transmitting its content to B2, so TX(B1, B2) is locked.

proof of theorem 2.1:
NC: the presence of a deadlock implies that ∃ B1 for which TX(B1, B2) is locked; by virtue of theorem 2.2, TX(B2, B3) is also locked (B2 being a buffer connected with the output of B1 and B3 a buffer connected with the output of B2). As a locked transmission implies the presence of another locked transmission, we can consider the infinite buffer sequence
SB = B1, B2, ..., Bn, Bn+1, ...
in which
- Bn is a buffer which transmits toward the buffer Bn+1;
- TX(Bn, Bn+1) is locked.
By analyzing SB starting from B1, as the number of buffers in PM is finite, we surely meet a buffer Bk already

appeared in SB in a previous position p (p ≥ 1). SB can thus be written as
SB = B1, B2, ..., Bp-1, Bp, Bp+1, ..., Bk-1, Bk = Bp, Bp+1, ...
which underlines the existence in SB of a periodic part
SBP = Bp, Bp+1, ..., Bk-1.
In order to show the periodic nature of SB, it is sufficient to note that TX(Bp, Bp+1) is locked and, when we meet TX(Bk-1, Bk = Bp), this is again locked by TX(Bk = Bp, Bp+1). If we recall the structure of the internal buffers of a processor, we see that SBP is a sequence of buffer terns, that is:
- the internal buffer Bi of a processor Pki (indicated with Bi(Pki));
- the output buffer of the channel which is the destination of the message contained in Bi, that is Bout(Pki, Next(Pki, Dest(Pki)));
- the input buffer of the same channel, that is Bin(Pki, Next(Pki, Dest(Pki))).
Without loss of generality, we substitute explicitly the buffers in SBP by supposing that Bp is the internal buffer of processor Pk0; so we have
SBP = Bi(Pk0), Bout(Pk0, Pk1 = Next(Pk0, Dest(Pk0))), Bin(Pk0, Pk1),
Bi(Pk1), Bout(Pk1, Pk2 = Next(Pk1, Dest(Pk1))), Bin(Pk1, Pk2), ...,
Bi(Pkn-1), Bout(Pkn-1, Pk0 = Next(Pkn-1, Dest(Pkn-1))), Bin(Pkn-1, Pk0).
Our hypothesis about DL existence leads to select a set of processors PC which satisfies condition 1) of the theorem; as all the transmissions among consecutive buffers in SBP are locked, all the buffers must be full, so that conditions 2), 3) and 4) of the theorem are also verified. As the connection of an output channel to an input channel of the same processor must be excluded, n ≥ 2 is justified, because a cyclic dependence cannot exist among buffers of the same processor.
SC: let us consider the internal buffer Bi(Pki) of any Pki ∈ PC; by 1) the transmission of the message in Bi(Pki) is directed to processor Pk(i+1) = Next(Pki, Dest(Pki)) and, by 4), this transmission cannot be executed because Bout(Pki, Pk(i+1)) is full. Similarly, the transmission from Bout(Pki, Pk(i+1)) to Bin(Pki, Pk(i+1)) cannot be executed by 2), and the transmission from Bin(Pki, Pk(i+1)) to Bi(Pk(i+1)) cannot be executed by 3). As the previous reasoning holds ∀ Pki ∈ PC and, on the basis of 1), there does not exist a Pki such that Next(Pki, Dest(Pki)) ∉ PC, each Bi(Pki) is full and is connected with another full Bi(Pkj) through two full buffers; because any transmission is impossible in this situation, DL subsists.

proof of theorem 3.1: we demonstrate the theorem by considering the worst case, i.e. a) only one hole is present, and b) in its movement the hole does not generate new holes. Under these hypotheses the hole movement in the network is a stationary Markov process of order 1; in fact, Lj being the event for which the hole is in Pj, we have
Pr(Lj|Li, Lk, ..., Lm; tk|tk-1, tk-2, ..., tk-n) = Pr(Lj|Li; tk|tk-1)2
because the hole position at a certain instant depends only on the hole position at the previous instant (time is discrete and is incremented at each transmission). In the following we shall not explicitly show the time

2 Pr(E2|E1; t2|t1) is the probability that, at the instant t2, the event E2 takes place, having supposed the event E1 verified at the instant t1.


dependence, because the process is stationary and the steps in time are unitary (tk - tk-1 = 1). The matrix Pr1, which supplies the probability with which the hole, present in a processor Pi ∈ P, is transmitted to processor Pj ∈ P, is

        | Pr(L1|L1)  Pr(L1|L2)  ...  Pr(L1|LN) |
Pr1 =   | Pr(L2|L1)  Pr(L2|L2)  ...  Pr(L2|LN) |
        | ...        ...        ...  ...       |
        | Pr(LN|L1)  Pr(LN|L2)  ...  Pr(LN|LN) |

being, from definition 3.1,
Pr(Lj|Li) = 0 if Pj ∉ Neigh(Pi); Pr(Lj|Li) = 1/Cin(Pi) otherwise   (3).

Through Pr1 we calculate the matrix Prk = [Pr1]^k, whose generic element prk[ki, kj] supplies the probability that, if the hole is in Pkj, it is in Pki after k steps. Because Ls is connected (hypothesis 1)), given any two processors Pk0 and Pkn, it is possible to find a path Pk0, Pk1, ..., Pkn in which Pki-1 ∈ Neigh(Pki) (i = 1, 2, ..., n). We demonstrate, by induction on n, that prn[k0, kn] > 0: pr1[k0, k1] = Pr(Lk0|Lk1) > 0 on the basis of (3), Pk0 ∈ Neigh(Pk1) resulting; let us now suppose that prn-1[k0, kn-1] > 0; we have Prn = [Pr1]^n = Prn-1 Pr1, so that
prn[k0, kn] = ∑i prn-1[k0, i] pr1[i, kn] = prn-1[k0, kn-1] pr1[kn-1, kn] + ∑i≠kn-1 prn-1[k0, i] pr1[i, kn] > 0,
prn-1[k0, kn-1] > 0 holding from the induction hypothesis and pr1[kn-1, kn] > 0 on the basis of (3), because Pkn-1 ∈ Neigh(Pkn). As a consequence, prk[i, j] > 0 when k is the distance between Pi and Pj, that is, a hole in Pj has probability > 0 of reaching any Pi in a number of steps equal to the distance between Pi and Pj; because this reasoning is valid starting from each processor Pj, we are sure that, after an indeterminate but finite number of steps, the hole reaches the processor Pi ∈ PC.

proof of theorem 4.1: condition 2) of theorem 3.2 assures that no privileged directions exist to receive a message; condition 3) involves the possibility of transmitting a message toward any direction, those directions which move it closer to the destination being privileged. Livelock would arise if there existed a message mi, directed to Pi, which is transmitted an arbitrarily high number N of times without ever reaching its destination; for this to happen, a set of processors LP = {PL1, PL2, ..., PLm}, crossed N times by mi and such that Pi ∉ LP, must exist. Let us consider a PLi ∈ LP and indicate with Mk the event for which mi is in processor Pk. Livelock can arise only if Pr{Mi|MLi} = 0; if Pr{Mi|MLi} > 0, for N sufficiently high we are sure that message mi reaches its destination, avoiding livelock. Now let us prove that, if the hypotheses subsist, we have Pr{Mi|MLi} > 0. As Ls is connected (hypothesis 4) of theorem 3.2), we can find a set of processors PLi = Pk0, Pk1, ..., Pkn = Pi such that (PLi, Pk1), (Pk1, Pk2), ..., (Pkn-1, Pi) is a path from PLi to Pi. The probability that mi is transmitted from Pki to Pki+1 is given by
Pr{Mki+1|Mki} = Pr{mi is transmitted toward Bout(Pki, Pki+1) | Mki} · Pr{a hole is in Pki+1} · Pr{a hole is in Bin(Pki, Pki+1) | a hole is in Pki+1};
therefore Pr{Mki+1|Mki} > 0 results, because
a) Pr{mi is transmitted toward Bout(Pki, Pki+1) | Mki} > 0 by condition 3) of theorem 3.2,
b) Pr{a hole is in Pki+1} > 0, as the hypotheses of theorem 3.1 are verified,
c) Pr{a hole is in Bin(Pki, Pki+1) | a hole is in Pki+1} > 0 on the basis of definition 3.1.
Since it results that
Pr{Mi|MLi} ≥ ∏i=0..n-1 Pr{Mki+1|Mki}   (4)3
we have proved that Pr{Mi|MLi} > 0, i.e. that livelock does not subsist.

3 Pr{Mi|MLi} = ∏i=0..n-1 Pr{Mki+1|Mki} if there are no other paths between PLi and Pi.

References

[1] Bolding, K., Snyder, L.: 'Mesh and torus chaotic routing'. Advanced Research in VLSI and Parallel Systems: Proceedings of the 1992 Brown/MIT Conference, Mar. 1992.
[2] Bolding, K., Snyder, L.: 'Overview of fault handling for the chaos router'. Proceedings of the IEEE International Workshop on Defect and Fault Tolerance in VLSI Systems. IEEE, Nov. 1991.
[3] Borgonovo, F., Cadorin, E.: 'Locally-optimal deflection routing in the bidirectional Manhattan network'. Proceedings of IEEE INFOCOM '90. IEEE, Jun. 1990.

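The Markov argument of theorem 3.1 can be checked numerically. The sketch below (plain Python; the helper names are assumptions of this sketch, and nodes are numbered row by row rather than as in fig. 2, which does not change the topology) builds the transition matrix of equation (3) for a 3x3 mesh and verifies that pr_d[i, j] > 0 whenever d is the distance between Pi and Pj:

```python
from collections import deque

# Random walk of the hole on a 3x3 mesh: from Pi the hole moves to a
# neighbour Pj with probability 1/Cin(Pi) (equation (3)); on this mesh every
# node's input degree equals its number of neighbours.

def grid_neighbours(rows, cols):
    nb = {}
    for r in range(rows):
        for c in range(cols):
            nb[r * cols + c] = [(r + dr) * cols + (c + dc)
                                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                                if 0 <= r + dr < rows and 0 <= c + dc < cols]
    return nb

def transition_matrix(nb):
    n = len(nb)
    return [[1.0 / len(nb[i]) if j in nb[i] else 0.0 for j in range(n)]
            for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def distances(nb, src):                    # breadth-first distances from src
    dist, q = {src: 0}, deque([src])
    while q:
        u = q.popleft()
        for v in nb[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

nb = grid_neighbours(3, 3)
powers = [None, transition_matrix(nb)]     # powers[k] = Pr1^k
for _ in range(3):                         # diameter of the 3x3 mesh is 4
    powers.append(mat_mul(powers[-1], powers[1]))
for i in nb:
    for j, d in distances(nb, i).items():
        if d > 0:
            assert powers[d][i][j] > 0     # pr_d[i, j] > 0 when d = distance(Pi, Pj)
```

Since the mesh is bipartite, no single power of Pr1 is strictly positive; positivity at k equal to the distance is, however, exactly what the proof uses: the hole reaches any processor with nonzero probability in a finite number of steps.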


[4] Dally, W.J., Seitz, C.L.: 'Deadlock free message routing in multiprocessor interconnection networks'. IEEE Transactions on Computers, Vol. C-36, N. 5, May 1987.
[5] De Carlini, U., Villano, U.: 'The routing problem in transputer based parallel systems'. Microprocessors and Microsystems, Vol. 15, N. 1, Jan. 1991.
[6] Fang, C., Szymanski, T.: 'An analysis of deflection routing in multi-dimensional regular mesh networks'. Proceedings of IEEE INFOCOM '91. IEEE, Apr. 1991.
[7] Gelernter, D.: 'A DAG-based algorithm for prevention of store-and-forward deadlock in packet networks'. IEEE Transactions on Computers, Vol. C-30, Oct. 1981.
[8] Isloor, S.S., Marsland, T.A.: 'The deadlock problem: an overview'. IEEE Computer, Sep. 1980.
[9] Maxemchuk, N.F.: 'Comparison of deflection and store-and-forward techniques in the Manhattan street and shuffle-exchange networks'. Proceedings of IEEE INFOCOM '89. IEEE, 1989.
[10] Ngai, J.Y., Seitz, C.L.: 'A framework for adaptive routing in multicomputer networks'. Proceedings of the Symposium on Parallel Algorithms and Architectures. ACM, 1989.
[11] Pramanik, P., Das, P.K.: 'A deadlock free communication kernel for loop architecture'. Information Processing Letters, Vol. 38, N. 3, May 1991.
[12] Szymanski, T.: 'An analysis of hot potato routing in a fiber optic packet switched hypercube'. Proceedings of IEEE INFOCOM '90. IEEE, Apr. 1990.
[13] Tanenbaum, A.S.: 'Computer Networks'. Englewood Cliffs, NJ: Prentice-Hall, 1981.
[14] Pifarré, G.D., Gravano, L., Felperin, S.A., Sanz, J.L.C.: 'Fully Adaptive Minimal Deadlock-Free Packet Routing in Hypercubes, Meshes, and Other Networks: Algorithms and Simulations'. IEEE Transactions on Parallel and Distributed Systems, Vol. 5, N. 3, March 1994.

