Counting Networks and Multi-Processor Coordination

James Aspnes*   Maurice Herlihy†   Nir Shavit‡

* Carnegie Mellon University.
† Digital Equipment Corporation, Cambridge Research Lab.
‡ MIT Lab. for Computer Science. Supported by ONR contract N00014-91-J-1046, NSF grant CCR-8915206, DARPA contract N00014-89-J-1988, and by a Rothschild postdoctoral fellowship. A large part of this work was performed while the author was at IBM's Almaden Research Center.

Abstract

Many fundamental multi-processor coordination problems can be expressed as counting problems: processes must cooperate to assign successive values from a given range, such as addresses in memory or destinations on an interconnection network. Conventional solutions to these problems perform poorly because of synchronization bottlenecks and high memory contention.

Motivated by observations on the behavior of sorting networks, we offer a completely new approach to solving such problems. We introduce a new class of networks called counting networks, i.e., networks that can be used to count. We give a counting network construction of depth $\log^2 n$ using $n \log^2 n$ "gates." Based on this construction, we provide algorithms that avoid the sequential bottlenecks inherent to former solutions, and have substantially lower memory contention.

Finally, to show that counting networks are not merely mathematical creatures, we provide experimental evidence that they outperform conventional synchronization techniques under a variety of circumstances.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1991 ACM 089791-397-3/91/0004/0348 $1.50

1 Introduction

Many fundamental multi-processor coordination problems can be expressed as counting problems: processors collectively assign successive values from a given range, such as addresses in memory or destinations on an interconnection network. In this paper, we offer a completely new approach to solving such problems, by introducing counting networks, a new class of networks that can be used to count.

Counting networks, like sorting networks [2, 4, 5], are constructed from simple two-input two-output computing elements called balancers, connected to one another by wires. However, while an $n$ input sorting network sorts a collection of $n$ input values only if they arrive together, on separate wires, and propagate through the network in lockstep, a counting network can count any number $N \gg n$ of input values even if they arrive at arbitrary times, are distributed unevenly among the input wires, and propagate through the network asynchronously.

Figure 2 provides an example of an execution of a 4-input, 4-output counting network. A balancer is represented by two dots and a vertical line (see Figure 1). Intuitively, a balancer is just a toggle mechanism¹, repeatedly sending the inputs it receives, one to the left, one to the right. It thus balances the number of values on its output wires. In the example of Figure 2, input values arrive on the network's input lines one after the other. For convenience we have numbered them by the order of their arrival (these numbers are not used by the network). As can be seen, the first input (numbered 1) enters on line 2 and leaves on line 1, the second leaves on line 2, and in general, the $N$th value leaves on line $N \bmod 4$. (The reader is encouraged to try this for him/herself.) Thus, if on the $i$th output line the network assigns to consecutive outputs the numbers $i, i+4, i+2\cdot 4, \ldots$, it is counting the number of input values without passing them all through a shared computing element!

¹ It is easy to implement a balancer using a Compare & Swap, Test & Set, or a randomized consensus primitive.

Counting networks achieve high throughput by decomposing interactions among processes into pieces that can be performed in parallel. This decomposition has two benefits: it eliminates serial bottlenecks and reduces memory contention. In practice, the performance of many shared-memory algorithms is often limited by conflicts at certain widely-shared memory locations, often called hot spots [19]. Reducing hot-spot conflicts has been the focus of hardware architecture design [1, 8, 12, 14, 11] and experimental work in software [3, 9, 10, 16, 20].

Counting networks are also non-blocking: processes that undergo halting failures or delays while using a counting network do not prevent other processes from making progress. This property is important because existing shared-memory architectures are themselves inherently asynchronous; process step times are subject to timing uncertainties due to variations in instruction complexity, page faults, cache misses, and operating system activities such as preemption or swapping.

To illustrate the utility of counting networks, we construct highly concurrent implementations of two common data structures: shared counters and producer/consumer buffers. A shared counter is simply an object that issues the numbers 1 to $n$ in response to $n$ requests by processes. Shared counters are central to a number of shared-memory synchronization algorithms (e.g., [6, 12, 15, 20]). A producer/consumer buffer is a data structure in which items inserted by a pool of producer processes are removed by a pool of consumer processes. Compared to conventional techniques such as spin locks or semaphores, our counting network implementations provide higher throughput, less memory contention, and better tolerance for failures and delays.

We show that our construction produces a counting network of depth $\log^2 n$ using $n \log^2 n$ balancers, and argue that it has low levels of contention. Our analysis of the counting network construction is supported by experiment. In an appendix, we compare the performance of several implementations of shared counters and producer/consumer buffers on an eighteen-processor Encore MultiMax. When the level of concurrency is sufficiently high, the counting network implementations outperform conventional implementations based on spin locks, sometimes dramatically.

In summary, counting networks represent a new class of concurrent algorithms. They have a rich mathematical structure, they provide effective solutions to important problems, and they perform well in practice. We believe that counting networks have other potential uses, for example as interconnection networks [21] or as load balancers [18], and that they deserve further attention.
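The counting scheme sketched in the introduction, where output line $i$ hands out $i, i+4, i+2\cdot 4, \ldots$, can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions: the function name `assign_values` and the exit trace are hypothetical, wires are numbered from 0 rather than from 1, and the network itself is not modelled; only the per-wire local counters are.

```python
def assign_values(exit_wires, width):
    """Give each departing token the next value owned by its output wire.

    Wire i hands out i, i + width, i + 2*width, ... (0-based wires here,
    whereas the text numbers lines from 1).
    """
    next_value = list(range(width))  # one local counter per output wire
    assigned = []
    for wire in exit_wires:
        assigned.append(next_value[wire])
        next_value[wire] += width
    return assigned


WIDTH = 4
# Hypothetical trace: the k-th token (k = 0, 1, 2, ...) leaves on wire k mod 4.
exit_wires = [k % WIDTH for k in range(10)]
values = assign_values(exit_wires, WIDTH)
print(values)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

No single counter is touched by every token: each output wire owns a disjoint arithmetic progression, so the only shared state a token sees is the state along its own path.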

2 Networks that Count

2.1 Counting Networks

Counting networks belong to a larger class of networks called balancing networks, constructed from wires and computing elements called balancers, in a manner very similar to that in which comparison networks [5] are constructed from wires and comparators. We begin by describing balancing networks.

A balancer is a computing element with two input wires and two output wires² (see Figure 1). Tokens arrive on one of the balancer's input wires at arbitrary times, and are output on its output wires. Intuitively, one may think of a balancer as a toggle mechanism that, given a stream of input tokens, repeatedly sends one token to the upper output wire and one to the lower, effectively balancing the number of tokens that have been output on its output wires. We denote by $x_i$, $i \in \{0, 1\}$, the number of input tokens ever received on the balancer's $i$th input wire, and similarly by $y_i$, $i \in \{0, 1\}$, the number of tokens ever output on its $i$th output wire. Throughout the paper we will abuse this notation and use $x_i$ ($y_i$) both as the name of the $i$th input (output) wire and as a count of the number of input tokens received on the wire.

² In Figure 1 as well as in the sequel, we adopt the notation of [5] and draw wires as horizontal lines with balancers stretched vertically.

Let the state of a balancer at a given time be defined as the collection of tokens on its input and output wires. We can now formally state the safety and liveness properties of a balancer:

1. In any state, $x_0 + x_1 \geq y_0 + y_1$ (i.e. a balancer never creates output tokens).
2. Given any finite number of input tokens $m = x_0 + x_1$ to the balancer, it is guaranteed that within a finite amount of time it will reach a quiescent state, that is, one in which the sets of input and output tokens are the same.
3. In any quiescent state, $x_0 + x_1 = y_0 + y_1$ (i.e. a balancer never swallows input tokens).
4. In any quiescent state, $y_0 = \lceil m/2 \rceil$ and $y_1 = \lfloor m/2 \rfloor$.

It is important to note that we make no assumptions regarding the timing of token transitions from input to output in a balancer; its behavior is completely asynchronous.

A balancing network of width $w$ is a collection of balancers, where output wires are connected to input wires, having $w$ designated input wires $x_0, x_1, \ldots, x_{w-1}$ (which are not connected to output wires of balancers), $w$ designated output wires $y_0, y_1, \ldots, y_{w-1}$ (similarly unconnected), and containing no cycles. Let the state of a network at a given time be defined as the union of the states of all its component balancers. The safety and liveness of the network follow naturally from the above network definition and the properties of balancers, namely, that it is always the case that $\sum_{i=0}^{w-1} x_i \geq \sum_{i=0}^{w-1} y_i$, and for any finite sequence of $m$ input tokens, within finite time the network reaches a quiescent state, i.e. one in which $\sum_{i=0}^{w-1} y_i = m$.

To give the reader a feeling for the way in which a balancing network might be implemented, consider an implementation on a shared memory multiprocessor. A balancing network is implemented as a shared data structure, where balancers are records and wires are pointers from one record to another. Each of the machine's asynchronous processors runs a program that repeatedly traverses the data structure from some input pointer to some output pointer, each time shepherding a new token through the network.

The depth of a wire is defined as 0 for a network input wire, and as $\max(\mathrm{depth}(x_0), \mathrm{depth}(x_1)) + 1$ for the output wires of a balancer having input wires $x_0$ and $x_1$. The depth of a balancing network is defined as the maximal depth of any wire.

A counting network of width $w$ is a balancing network of width $w$ whose outputs $y_0, \ldots, y_{w-1}$ have the following additional step property in quiescent states:

In any quiescent state, $0 \leq y_i - y_j \leq 1$ for any $i < j$.
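The balancer properties and the step property defined in this section can be checked on a small model. The following is a minimal sequential sketch, not the paper's implementation: the names `Balancer`, `traverse`, and `has_step_property` are assumptions introduced for illustration, output wire 0 stands for the upper wire, and tokens are processed one at a time, so every state the code observes is quiescent.

```python
class Balancer:
    """Toggle that sends tokens alternately to output 0 (upper) and 1 (lower)."""

    def __init__(self):
        self.toggle = 0          # output wire the next token will take
        self.outputs = [0, 0]    # y_0 and y_1: tokens emitted on each wire

    def traverse(self):
        """Pass one token through and return the output wire it left on."""
        wire = self.toggle
        self.toggle ^= 1
        self.outputs[wire] += 1
        return wire


def has_step_property(y):
    """True if 0 <= y[i] - y[j] <= 1 for every pair i < j."""
    return all(
        0 <= y[i] - y[j] <= 1
        for i in range(len(y))
        for j in range(i + 1, len(y))
    )


balancer = Balancer()
for _ in range(7):
    balancer.traverse()
print(balancer.outputs)                      # [4, 3]
print(has_step_property(balancer.outputs))   # True
```

With $m = 7$ tokens the model ends with $y_0 = \lceil 7/2 \rceil = 4$ and $y_1 = \lfloor 7/2 \rfloor = 3$, matching property 4, and a single balancer trivially satisfies the step property on its two outputs.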

[Figure: a balancer; tokens 1, 3, 5, 7 leave on one output wire and tokens 2, 4, 6 on the other.]
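The shared-data-structure view of a balancing network (balancers as records, wires as pointers, each processor shepherding a token from an input pointer to an output pointer) can be sketched as follows. Two things here are assumptions rather than anything this excerpt specifies: the wiring in `build_width4_network` is the standard width-4 bitonic layout, and the traversal is sequential, with no atomic toggle update, so it models only executions in which tokens pass through one completely after the other.

```python
import random


class Balancer:
    """Record holding a toggle bit and two 'pointers' to the next hop."""

    def __init__(self):
        self.toggle = 0
        self.next = [None, None]  # each entry: a Balancer or an output wire index


def build_width4_network():
    """Return the entry balancer for each of the 4 input wires (assumed wiring)."""
    a, b = Balancer(), Balancer()  # layer 1: wires (0, 1) and (2, 3)
    c, d = Balancer(), Balancer()  # layer 2: cross-connects the two halves
    e, f = Balancer(), Balancer()  # layer 3: feeds output wires (0, 1) and (2, 3)
    a.next = [c, d]
    b.next = [d, c]
    c.next = [e, f]
    d.next = [e, f]
    e.next = [0, 1]
    f.next = [2, 3]
    return [a, a, b, b]


def traverse(entry):
    """Shepherd one token from an input pointer to an output wire index."""
    node = entry
    while isinstance(node, Balancer):
        out = node.toggle
        node.toggle ^= 1
        node = node.next[out]
    return node


random.seed(0)
inputs = build_width4_network()
counts = [0, 0, 0, 0]
exits = []
for _ in range(25):
    wire = traverse(random.choice(inputs))  # token enters on an arbitrary wire
    counts[wire] += 1
    exits.append(wire)
print(counts)  # [7, 6, 6, 6]
```

Whichever input wires the tokens enter on, the k-th token (counting from 0) leaves on wire k mod 4 in this sequential model, so the output counts satisfy the step property after every token.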

To illustrate tion

in which

tially,

one

shows

such

work As

property,

consider

traverse

the network

completely we will

be

seen,

kens to output w.

Balancing

called

are currently so that

in,

assigned

l)w.

(This

tail

The

step

of ways

numbers

wire

is done wire

i,

are consec-

i, i+ w, i+2w,

. . . i+(gi

-

in greater

de-

property

can

we will

between

2.1

If

non-negative are

all

be defined

them

is stated

2.

$y_0, \ldots, y_{w-1}$ the

The

in the

is

Either

~i

some

=

c such

a

sequence

following

of

statements

2.2

in

for

all

for

i, ~,

any

or

i