Dynamic Deflection Routing on Arrays
(Preliminary version)

Andrei Broder*    Eli Upfal†

*Digital Systems Research Center, 130 Lytton Avenue, Palo Alto, CA 94301, USA. E-mail: broder@src.dec.com.
†IBM Almaden Research Center, San Jose, CA 95120, USA, and Department of Applied Mathematics, The Weizmann Institute of Science, Rehovot, Israel. Work at the Weizmann Institute supported in part by the Norman D. Cohen Professorial Chair of Computer Science, a MINERVA grant, and a grant from the Israeli Academy of Science. E-mail: eli@wisdom.weizmann.ac.il.

Permission to make digital/hard copies of all or part of this material for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copyright is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires specific permission and/or fee. STOC'96, Philadelphia PA, USA. © 1996 ACM 0-89791-785-5/96/05..$3.50

Abstract

We study the performance of a simple one-bend packet routing algorithm on arrays with no buffering in the routing switches, under a stochastic model in which new packets are continuously generated at each node at random times and with random destinations. We prove that on the two dimension torus network our algorithm is stable for an arrival rate that is within a constant factor of the hardware bandwidth. Furthermore, we show that in the steady state the expected time a packet spends in the system is optimal (up to a constant factor). Sharper results (in terms of the constants) are obtained for the ring (dimension one torus).

1 Introduction

Most theoretical work on communication networks has focused on batch, or static routing: A set of packets is injected into the system at time 0, and the routing algorithm is measured by the time it takes to deliver all these packets to their destinations, assuming that no more packets are injected in the meantime. This paradigm leads to a rich and interesting theory (see Leighton [12] for an extensive survey) but rarely reflects the practical reality of communication networks. Most real-life networks operate in a dynamic mode whereby new packets are continuously injected into the system. Each processor usually controls only the rate at which it injects its own packets and has only a limited knowledge of the global state.

This situation is better modeled by a stochastic paradigm whereby packets are injected according to some distribution, and the routing algorithm is evaluated according to its long term behavior. In particular, quantities of interest are the maximum arrival rate for which the system is stable (that is, an arrival rate that ensures that the expected number of packets waiting in queues does not grow with time), and the expected time a packet spends in the system in the steady state.

One might assume that queuing theory would provide ready answers, at least for the simplest topologies and algorithms, but this is not the case: most of the work in queuing theory is based on independence assumptions (such as between successive processing times). These assumptions do not model accurately the routing process on a communication network where there are complex and hard-to-analyze interactions between packets.

Several recent articles do address the dynamic routing problem, in the context of packet routing on arrays [11, 6, 14], and on the hypercube and the butterfly [16]. The analysis in all these works requires unbounded queues in the routing switches (though some works give a high probability bound on the size of the queue used [11, 6]). Unbounded queues allow the application of some tools from queuing theory (see [4, 5]) and help bounding the correlation between events in the system, thus simplifying the analysis at the cost of a less realistic model.

Here we address the problem of dynamic routing with bounded or no buffers in the routing switches. This paradigm is a better model for current network technologies in which routing switches are built with either very small or no buffers at all. We are not aware of any previous work that presents a rigorous analysis of a dynamic routing problem on a network with bounded buffers.
Our routing model resembles the deflection (hot-potato) routing model that has been studied in a number of papers in the context of batch routing theory [1, 9, 10, 3]. A node in that model consists of a processor and a routing switch. The processor has a queue in which it stores the packets it generates. The routing switches have no buffers to store packets. All packets that reach a routing switch at a given step must leave the switch at the next step. If more than one packet needs to leave the switch through the same edge, all but one of the packets are deflected through other outgoing edges. Thus, packets are always moving, but some packets may temporarily move further away from their destinations.

This model suffices for batch routing. To model accurately dynamic routing we need to augment the model with some 'flow-control' mechanism that exists in most routing networks. In 'real-life' routing algorithms a processor keeps a copy of the packet it sent, until it verifies that the packet was received. If no acknowledgment is received within a reasonable amount of time the packet is retransmitted. On the other hand, packets not delivered during some pre-specified time interval are removed. This feature helps the network recover from overloading. We indirectly model this feature by allowing a packet that is not delivered within a given number of steps to return to its sender. The sending processor must send it again at a later time (to the same destination).

Parallel machines such as the HEP multiprocessor [15], and high speed communication networks [17] use various forms of deflection routing.

2 Routing on a Two Dimension Torus

2.1 Model

We consider an $n \times n$ torus network. Each node consists of a processor and a communication switch, each edge represents a bidirectional communication link between communication switches. The network is synchronized. In each step at most two packets can traverse a link, one in each direction. At the beginning of each time step a switch receives packets along its incoming links, at most one packet per link. At the end of the time step the switch sends every incoming packet along some outgoing link, at most one packet per link. There are no buffers at the communication switches: once a packet enters the communication network it continuously moves until it reaches its destination or returns to its sender.

Packets are generated within the processors. A processor can inject a packet into the communication switch in a step in which the switch has a free outgoing link, that is, only when the switch did not receive a packet through every one of its four incoming links. We assume that the processor (not the switch!) has a queue that stores packets generated within that processor.
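The bufferless switch rule described above (every incoming packet leaves on a distinct outgoing link at the next step, with losers deflected) can be illustrated with a minimal sketch. The names `route_step` and `can_inject`, the link labels, and the tie-breaking rule (earlier packets in the list win, deflected packets take the first free link) are assumptions made for this illustration; the paper does not prescribe them.

```python
from typing import List, Optional

# Labels for the four outgoing links of a torus switch (illustrative names).
DIRECTIONS = ("UP", "DOWN", "LEFT", "RIGHT")


def route_step(preferred: List[str]) -> List[str]:
    """Assign a distinct outgoing link to every incoming packet.

    preferred[k] is the link packet k would like to use. Assumed priority
    rule: the first packet asking for a link gets it; every other packet
    is deflected to one of the links that remain free.
    """
    if len(preferred) > len(DIRECTIONS):
        raise ValueError("at most one packet arrives per incoming link")
    free = list(DIRECTIONS)
    assigned: List[Optional[str]] = [None] * len(preferred)
    # First pass: grant each requested link to its first claimant.
    for k, want in enumerate(preferred):
        if want in free:
            free.remove(want)
            assigned[k] = want
    # Second pass: deflect the losers through the remaining links.
    for k in range(len(preferred)):
        if assigned[k] is None:
            assigned[k] = free.pop(0)
    return [link for link in assigned if link is not None]


def can_inject(num_incoming: int) -> bool:
    """A processor may inject only if some outgoing link is left free."""
    return num_incoming < len(DIRECTIONS)
```

With two packets competing for RIGHT, `route_step(["RIGHT", "RIGHT", "DOWN"])` keeps the first on RIGHT and deflects the second, so all three packets still leave the switch in the same step.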
We measure the performance of the network in a stochastic model in which at each step, within each processor, one new packet is generated with probability $p$, and the generation events at different times and different processors are independent. We refer to $p$ as the arrival rate of the system. Let $Z_t$ denote the total number of packets at the system at time $t$. We say that the routing system is stable if $Z_t$ has a limit distribution as $t \to \infty$.

Our main result is an algorithm for routing on the two dimensional torus. The routes chosen are the simplest one-bend routes; the gist of the algorithm is in the choices a processor makes regarding when to insert a packet into the network, and what to do when a packet fails to reach its intended destination.

Theorem 1 Consider an $n \times n$ 2-dimension torus, with no buffers in the routing switches. Assume that at each step, at each node independently, a new packet with a random destination is inserted with probability $\lambda/n$. There is a constant $\lambda_0$ such that our algorithm is stable for any constant $\lambda > 0$ such that $\lambda \le \lambda_0$, and the expected time a packet spends in the system is $O(n)$.

Since the expected minimum number of edges that a packet with a random destination must traverse is $\Omega(n)$, our result is optimal up to constant factors with respect to both parameters.

In section 7 we analyze the special case of routing on a ring of $n$ processors with no intermediate queues. Packets already in the ring have priority over new packets. A node moves a new packet to its routing buffer whenever it is empty, and the packet moves on the ring until it reaches its destination. We prove that this procedure can sustain an optimal injection rate.

For the analysis it is helpful to view the routing system on the torus as a set of $4n^2$ containers. At each time step there are four containers in each switch, the UP, DOWN, LEFT, and RIGHT containers. In each routing step all the UP containers in the system move one step up, all the DOWN containers move one step down, etc. At any time a container may be empty, or it may contain one packet. Before each routing step the switch can move packets between the four containers it currently has, as long as every packet ends up in one container, and no container contains more than one packet. To inject a new packet into the system the processor needs to put the packet into an empty container currently at its switch.

For simplicity we consider an algorithm whereby a packet first goes into a RIGHT container, then it goes into a DOWN container. (Using all four links improves only the constants.)
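The ring special case mentioned above (packets already on the ring have priority, and a node injects only into an empty container) can be sketched as a small simulation. This is an illustrative model under stated assumptions, not the paper's analysis: the ring is taken to be unidirectional with one container per node, destinations are uniform over the other nodes, and the function name `simulate_ring` and its parameters are hypothetical.

```python
import random
from collections import deque
from typing import Deque, List, Optional, Tuple


def simulate_ring(n: int, p: float, steps: int, seed: int = 0) -> Tuple[int, int, int, int]:
    """Simulate a bufferless ring and return
    (generated, delivered, in_flight, queued) packet counts.

    Assumes n >= 2. containers[v] holds the destination of the packet
    currently at node v, or None when the container is empty.
    """
    rng = random.Random(seed)
    containers: List[Optional[int]] = [None] * n
    queues: List[Deque[int]] = [deque() for _ in range(n)]
    generated = delivered = 0
    for _ in range(steps):
        # Each processor generates a new packet with probability p.
        for v in range(n):
            if rng.random() < p:
                dest = rng.randrange(n - 1)
                if dest >= v:
                    dest += 1  # uniform over the n - 1 other nodes
                queues[v].append(dest)
                generated += 1
        for v in range(n):
            # A packet that reached its destination leaves the ring.
            if containers[v] == v:
                containers[v] = None
                delivered += 1
            # Ring traffic has priority: inject only into an empty container.
            if containers[v] is None and queues[v]:
                containers[v] = queues[v].popleft()
        # All containers move one step along the ring.
        containers = [containers[-1]] + containers[:-1]
    in_flight = sum(c is not None for c in containers)
    queued = sum(len(q) for q in queues)
    return generated, delivered, in_flight, queued
```

Because the switches never drop or store packets, every generated packet is either delivered, in a container, or waiting in a processor queue, which is the invariant one would check first when experimenting with different arrival rates.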
3 Notation

The $n^2$ locations on the grid are denoted $M_{i,j}$ for $i, j \in [0, n-1]$. We make the convention that operations involving these $i, j$ indices are always done mod $n$. The RIGHT (resp. DOWN) containers are denoted $H_{i,j}$ (resp. $V_{i,j}$). These notations are fixed, that is, at time $t$ the container $H_{i,j}$ is in position $M_{i,j+t}$ and $V_{i,j}$ is in position $M_{i+t,j}$.

We further define a subset of the RIGHT containers as diagonal $\Delta_j$, via $\Delta_{j,i} = H_{i,j-i}$, that is, $\Delta_j$ consists of $\{H_{0,j}, H_{1,j-1}, \ldots, H_{n-1,j-n+1}\}$. The reason for defining diagonals is as follows: consider $V_{i,j}$. At time $t$ it will be in position $M_{i+t,j}$ where it will meet the RIGHT container $H_{i+t,j-t} = \Delta_{i+j,i+t}$. So $V_{i,j}$ only meets the RIGHT containers that belong to $\Delta_{i+j}$; these are the only containers that "feed" it.

4 Description of the algorithm

Our algorithm tries to deliver a packet by a one-bend path, first along the row within which it was generated, then along the column of its destination. The algorithm consists of three stages:

1. Packets generated at $M_{i,j}$ are stored in a FIFO queue. Once a packet is at the top of the queue it waits for an indicator flag, $f_{i,j}$, to become 0. When this happens, the packet becomes current and the flag is set back to 1. Eventually the flag will be reset to 0 by the processor at $M_{i,j}$ after the current packet is certainly delivered. This ensures that for every $(i, j)$ there is at most one current packet at time $t$, denoted $P_{i,j}(t)$.

2. Once current, the packet waits for a random amount of time, geometrically distributed with parameter $\theta_1/n$, until it becomes active.

3. Once active, the packet repeatedly waits $n$ steps, waits for a random H-container, tosses a coin with probability of success $\theta_2$, boards the H-container, and transfers to the corresponding V-container, given its column destination. If the toss is unsuccessful or if either container is full, this stage is repeated until successful. (If the V-container is full, the packet returns to its start point in the H-container with which it came.)

The precise description of the algorithm is given in Figure 1. Note that system time "flows" only during the execution of wait statements.

    begin
     1.  wait until at head of the queue.
     2.  wait until f_{i,j} = 0.
     3.  Set f_{i,j} <- 1 and become current.
     4.  while not active
     5.  do
     6.     wait 1 step.
     7.     With probability theta_1/n
     8.        become active.
     9.  od
    10.  A: wait n steps.
    11.  Choose k uniformly at random in [0, n - 1].
    12.  wait for the container H_{i,k} to arrive at M_{i,j}.
    13.  With probability 1 - theta_2 goto A.
    14.  if H_{i,k} is not empty
    15.  then goto A.
    16.  else Board H_{i,k}.
    17.     wait until H_{i,k} arrives in column j'. (There it meets V_{i-j'+k,j'}.)
    18.     if V_{i-j'+k,j'} is empty
    19.     then Transfer to V_{i-j'+k,j'}.
    20.        wait until V_{i-j'+k,j'} arrives in row i', then leave.
    21.        wait n steps and set f_{i,j} <- 0.
    22.     else
    23.        wait until H_{i,k} returns to M_{i,j}.
    24.        Leave H_{i,k}.
    25.        goto A.
    26.     fi
    27.  fi
    29.  end

Figure 1: Algorithm for packets at $M_{i,j}$ with destination $M_{i',j'}$.

5 Analysis of the algorithm

Specific constants have been chosen for convenience; we made no attempt to optimize them, and, in general, we only claim that inequalities dependent on $n$ hold for $n$ sufficiently large. For the purpose of the analysis we take $\theta_1 = 1/2000$ and $\theta_2 = 1/5$. To simplify the analysis we shall initially assume that at all times all queues are not empty.
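As a sanity check on the container notation of Section 3 and on the transfer step used by the algorithm, the index arithmetic can be verified mechanically. The sketch below is illustrative only; the function names are hypothetical, and it assumes the conventions stated in the text: at time $t$, $H_{i,j}$ is at $M_{i,j+t}$, $V_{i,j}$ is at $M_{i+t,j}$, and $H_{i,j}$ belongs to diagonal $\Delta_{i+j}$.

```python
from typing import Set, Tuple


def pos_H(i: int, j: int, t: int, n: int) -> Tuple[int, int]:
    """Grid position of RIGHT container H_{i,j} at time t."""
    return (i % n, (j + t) % n)


def pos_V(i: int, j: int, t: int, n: int) -> Tuple[int, int]:
    """Grid position of DOWN container V_{i,j} at time t."""
    return ((i + t) % n, j % n)


def diagonal_of_H(i: int, j: int, n: int) -> int:
    """Index d of the diagonal containing H_{i,j}, since Delta_{d,i} = H_{i,d-i}."""
    return (i + j) % n


def feeders_of_V(i: int, j: int, n: int) -> Set[int]:
    """Diagonals of all RIGHT containers that V_{i,j} meets in n steps."""
    met = set()
    for t in range(n):
        row, col = pos_V(i, j, t, n)
        # The RIGHT container located at (row, col) at time t is H_{row, col - t}.
        met.add(diagonal_of_H(row, col - t, n))
    return met


def transfer_container(i: int, k: int, j_dest: int, n: int) -> Tuple[int, int]:
    """Indices of the DOWN container that H_{i,k} meets in column j_dest."""
    return ((i - j_dest + k) % n, j_dest % n)
```

For every $(i, j)$ the set returned by `feeders_of_V(i, j, n)` is the single diagonal $(i + j) \bmod n$, and `transfer_container` reproduces the index $V_{i-j'+k,j'}$ that appears in the algorithm.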
We shall see that even under this pessimistic scenario, every packet, once at the top of its queue, is "usually" (the meaning of this will be made precise later) delivered within $O(n)$ time.

We say that the system is in a normal state at time $t$ if for all $j$ no more than $n/8$ active packets at time $t$ have destination in column $j$. Otherwise we say that the system is in an abnormal state.

We say that an event $\mathcal{E}$ occurs with extremely high probability (wehp) if there exists a constant $\alpha > 0$ such that $\Pr(\neg\mathcal{E}) < e^{-\alpha n}$. The analysis has three main components:

● In an interval of length $O(n)$ the number of newly active packets with destination in column $j$ is wehp bounded by a constant times $n$, where the constant can be made as small as desired by choosing $\theta_1$ appropriately (Theorem 2).

● If the system is in a normal state then every active packet has a constant probability of being delivered within $6n$ steps (Theorem 3) and thus if the system is in a normal state at time $t$ it will be wehp in a normal state at time $t + 6n$ (Corollary 1) since the number of newly active packets is bounded as above.

● If the system is in an abnormal state then the number of packets that compete for a popular destination wehp decreases every $5n$ steps by at least a constant times $n$. Thus wehp within $O(n^2)$ the system returns to a normal state (Theorem 4).

In section 6 we show how the facts above imply Theorem 1.

Fact 2 There is an interval of at least $n$ between the time a packet leaves an H-container and the same packet enters again an H-container (line 10 in the algorithm) and thus the number of H-containers a packet can enter in an interval of length $\ell$ is $\lceil \ell/n \rceil$.

Lemma 1 Let $\mathcal{E}_{i,j}(t)$ be the event that there exists a time $\tau$ with $t + n \le \tau < t + 2n$ such that $P_{i,j}(\tau)$ occupies some H-container. For every $t$,

$$\Pr(\mathcal{E}_{i,j}(t)) \le \Pr(X_{i,j}(t) = 1),$$

where $X_{i,j}(t)$ are independent Bernoulli variables, with $\Pr(X_{i,j}(t) = 1) = 2\theta_2$.

Proof: In view of facts 1 and 2, between $t + n$ and $t + 2n$ there is at most one packet from $M_{i,j}$ that might occupy an H-container. Furthermore this packet must succeed (that is, not return to A) at line 13 in the algorithm at some time $\tau$ with $t < \tau < t + 2n$ and it has at most two opportunities to try. □

Lemma 2 The probability that $H_{i,k}$ is full when it arrives at $M_{i,j}$ (line 11) is less than 1/2.

Proof: Let $t$ be the time when $P_{i,j}$ is at A (start of line 10) in the algorithm. By Lemma 1 the probability that a packet from $(i, j')$ occupies an H-container any time between $t + n$ and $t + 2n$ is independently less than 2/5. Therefore wehp no more than, say, $9n/20$ H-containers in row $i$ are occupied at any time between $t + n$ and $t + 2n$. Since $k$ is chosen uniformly at random, the probability that $H_{i,k}$ is full when it arrives at $M_{i,j}$ is less than 9/20 + an exponentially small amount. □

Theorem 2 For every time $t$, the number of newly active packets in interval $[t, t + \ell)$, with destination in column $j$, denoted $N_j(t, t + \ell)$, satisfies $E(N_j(t, t + \ell))$
chosen diagonal
entire
distributed over the diagonals. Hence the probability that $V_{i-j'+k,j'}$ is empty at line 16 is at least 1/2. (Observe that the fact that $H_{i,k}$ was empty and the fact that $D_{i+k}$ is free are positively correlated.) □
Theorem
abnormal state at any fixed time by $e^{-\beta n}$, for a constant $\beta > 0$.
Proof sketch: First assume that $t_1 \le t + \gamma n^2$. Then the result follows from Corollary 1. Otherwise condition on
16”
a randomly
~ ~
for allj.
state
$50\theta_1 n$

Theorem 4 There exist constants $\beta > 0$ and $\gamma > 1$ such that if the system is in an abnormal state at time $t$, then with probability at least $1 - e^{-\beta n}$ the system will be in a normal state at time $t + \gamma n^2$.

Proof sketch: Suppose that $A_j(t) > n/9$. Then it can be shown that wehp at least $n/100$ of these packets will attempt to board a V-container before $t + 4n$. Since each packet attempts to board a random diagonal, the number of "busy" diagonals (see proof of Theorem 3) will be wehp at least $n/120$. From each such busy

Theorem 6 The expected length of a given queue at a given time is $O(1)$. The expected time that a packet spends in the system is $O(n)$.

Proof sketch: Define the following random variables:

1. $Y_1$ is a random variable geometrically distributed with parameter $\theta_1/n$; it counts the number of steps server $S$ stays in the Start state, which stochastically dominates the number of steps from the time a packet becomes current until it becomes active.
l-0s,6n
~
and 2e-an V~(y3y4)
of steps server S’ spends from the Start
state to the Delivered
of steps from
x.
variables, geometri6. The variables YG,i are i.i.d. cally distributed with parameter 1 —e-~”. Assume that S’ moved to state S2 after the ith time it en-
The number
the number
that of the queue of node v, that geometrically with parameter A/n.
5. The variables Y5,i are i.i.d. Bernoulli variables: Y5,i = 1 if S’ moves to state S2 after the ith time it entered
bounds
the time a packet reaches the head of the queue to the time it is delivered.
S’
the state SO to state S2 then Y3 = 1; other-
wise (that
for server S’.
= V=(ys,iye,j)
$ ‘—. (1 -
e-Pn)2
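state is given by

$$X = Y_1 + \gamma n^2 Y_3 Y_4 + \sum_{i=1}^{Y_2} \left(6n + \gamma n^2 Y_{5,i} Y_{6,i}\right).$$

Since the random variables $Y_1$, $Y_2$, $Y_3$, $Y_4$, $Y_{5,i}$ and $Y_{6,i}$ are independent we compute:

$$E(X) = E(Y_1) + \gamma n^2 E(Y_3 Y_4) + E(Y_2)\left(6n + \gamma n^2 E(Y_{5,1} Y_{6,1})\right)$$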
v. The expected time a packet occupies a container is n/2. Every time a container becomes empty it either gets a packet that with probability ~ has destination v, or it continues expected
= ~(n).
empty
to the next
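the queue of node $v$ is bounded
the average arrival

If $Z_i$ are i.i.d. random variables, independent of $T$, then

$$\mathrm{var}\Big(\sum_{i=1}^{T} Z_i\Big) = E(T)\,\mathrm{var}(Z_1) + (E(Z_1))^2\,\mathrm{var}(T).$$

Thus,

$$\mathrm{var}(X) = \mathrm{var}(Y_1) + \gamma^2 n^4\,\mathrm{var}(Y_3 Y_4) + \mathrm{var}(Y_2)\left(6n + \gamma n^2 E(Y_{5,1} Y_{6,1})\right)^2 + E(Y_2)\,\gamma^2 n^4\,\mathrm{var}(Y_{5,1} Y_{6,1}) = O(n^2).$$

Let $\rho = \frac{\lambda}{n} E(X)$. Applying General mean-value formula
expected number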
for
any A such that
that
the
=
~ E(X)
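The variance identity for a random sum used in the proof of Theorem 6, $\mathrm{var}(\sum_{i=1}^{T} Z_i) = E(T)\,\mathrm{var}(Z_1) + (E(Z_1))^2\,\mathrm{var}(T)$, can be checked exactly on small distributions. The sketch below is illustrative only: the helper names and the two toy distributions are assumptions chosen for the check, and exact rational arithmetic is used so that no tolerance is needed.

```python
from fractions import Fraction
from itertools import product
from typing import Dict, Tuple

Dist = Dict[int, Fraction]  # value -> probability


def mean_var(dist: Dist) -> Tuple[Fraction, Fraction]:
    """Exact mean and variance of a finite distribution."""
    mean = sum((p * x for x, p in dist.items()), Fraction(0))
    var = sum((p * (x - mean) ** 2 for x, p in dist.items()), Fraction(0))
    return mean, var


def random_sum_dist(t_dist: Dist, z_dist: Dist) -> Dist:
    """Exact distribution of Z_1 + ... + Z_T, with T independent of the i.i.d. Z_i."""
    out: Dist = {}
    for t, p_t in t_dist.items():
        # Enumerate every outcome of (Z_1, ..., Z_t); t = 0 gives the empty sum.
        for outcome in product(z_dist.items(), repeat=t):
            total = sum(z for z, _ in outcome)
            prob = p_t
            for _, p_z in outcome:
                prob *= p_z
            out[total] = out.get(total, Fraction(0)) + prob
    return out


def check_identity(t_dist: Dist, z_dist: Dist) -> bool:
    """Compare the directly computed variance with the formula."""
    e_t, var_t = mean_var(t_dist)
    e_z, var_z = mean_var(z_dist)
    _, var_sum = mean_var(random_sum_dist(t_dist, z_dist))
    return var_sum == e_t * var_z + e_z ** 2 * var_t


# Toy distributions chosen only for the check.
T_DIST: Dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}
Z_DIST: Dist = {1: Fraction(1, 3), 4: Fraction(2, 3)}
```

Calling `check_identity(T_DIST, Z_DIST)` compares both sides of the identity term by term; the same helper can be reused with any finite distributions for $T$ and $Z_1$.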