Enterprise : a market-like task scheduler for distributed ... - DSpace@MIT


Center for Information Systems Research
Massachusetts Institute of Technology
Sloan School of Management
77 Massachusetts Avenue
Cambridge, Massachusetts, 02139

ENTERPRISE: A MARKET-LIKE TASK SCHEDULER FOR DISTRIBUTED COMPUTING ENVIRONMENTS

Thomas W. Malone
Richard E. Fikes
Michael T. Howard

October 1983

CISR WP #111
Sloan WP #1537-84

© T. W. Malone, R. E. Fikes, M. T. Howard 1983

Also published as a working paper by the Intelligent Systems Laboratory, Xerox Palo Alto Research Center, October 1983.

Center for Information Systems Research
Sloan School of Management
Massachusetts Institute of Technology

Enterprise: A Market-like Task Scheduler for Distributed Computing Environments

Thomas W. Malone*
Richard E. Fikes
Michael T. Howard*

Working Paper
Cognitive and Instructional Sciences Group
Xerox Palo Alto Research Center
October 1983

Abstract:

This paper describes a system for sharing tasks among processors on a network of personal computers and presents an analysis of the problem of scheduling tasks on such a network. The system, called Enterprise, is based on the metaphor of a market: processors send out "requests for bids" on tasks to be done and other processors respond with bids giving estimated completion times that reflect machine speed and currently loaded files. The system includes a language independent Distributed Scheduling Protocol (DSP), and an implementation of this protocol for scheduling remote processes in Interlisp-D. The Enterprise implementation assigns processes to the best machine available at run-time (either remote or local) and includes facilities for asynchronous message passing among processes. In a series of simulations of different load conditions and network configurations, DSP was found to be substantially superior to both random assignment and a more complex alternative that maintained detailed schedules of estimated start and finish times.

*Now at Massachusetts Institute of Technology.

Enterprise: A Market-like Task Scheduler for Distributed Computing Environments

Introduction

With the rapid spread of personal computer networks and the increasing availability of low cost VLSI processors, the opportunities for massive use of parallel and distributed computing are becoming more and more compelling. Parallel hardware can often increase overall program speed by concurrently executing independent subparts of an algorithm on different processors ([24], [47]). In some cases, parallel search algorithms can improve average execution time even more rapidly than the increase in the number of processors ([28], [1]). With their potential for redundancy, systems with multiple parallel processors can be made much more reliable than their single-processor counterparts. There are also important cases of inherently distributed problem solving where the initial information and resulting actions necessary to solve a problem occur in widely distributed locations (e.g., [3], [39], [7], [48]).

Providing appropriate programming facilities for specifying, controlling, and debugging parallel computations involves a new set of problems. For example, Jones & Schwarz [27] discuss three main classes of problems in multiprocessing systems: (1) resource scheduling: how to allocate processor time and memory space, (2) reliability: how to deal with various kinds of component failures, and (3) synchronization: how to coordinate concurrent activities that depend on each other's actions.

In this paper we describe a prototype system called Enterprise that addresses these problems and present the results of analyzing one aspect of the problem of resource scheduling, namely the scheduling of tasks on processors. This analysis was based on simulations of several alternative task scheduling algorithms operating under various network configurations and load conditions. Although we have focused on decentralized methods for scheduling tasks, our results on this topic are applicable to many forms of parallel computation, regardless of whether or not the processors are geographically separated and whether or not they share memory.

The Enterprise system schedules and runs processes on a local area network of high-performance personal computers. It is implemented in Interlisp-D and runs on the Xerox 1100 (Dolphin), 1108 (Dandelion), and 1132 (Dorado) Scientific Information Processors connected with an Ethernet.

Overview of the Enterprise System

A new philosophy for distributed processing

As summarized in Table 1, the traditional philosophy used in designing systems based on local area networks such as Ethernets [34] is to have dedicated personal workstations which remain idle when not used by their owners, and dedicated special purpose servers such as file servers, print servers, and various kinds of data base servers. A system like Enterprise that schedules tasks on the best processor available at run time (either remote or local) enables a new philosophy in designing such distributed systems. In this new philosophy, personal workstations are still dedicated to their owners, but during the (often substantial) periods of the day when their owners are not using them, these personal workstations become general purpose servers, available to other users on the network. "Server" functions can migrate and replicate as needed on otherwise unused machines (except for those such as file servers and print servers that are required to run on specific machines). Thus programs can be written to take advantage of the maximum amount of processing power and parallelism available on a network at any time, with little extra cost when there are few extra machines available.

System Architecture

In order to use this new philosophy, at least the following three facilities must be provided: (1) a way of communicating between processes on different machines, (2) a way of scheduling tasks on the best available machines (either remote or local), and (3) programming language constructs for dealing with remoteness. As shown in Figure 1, the Enterprise system provides these facilities in three layers of software.

The first layer provides an Inter-Process Communication (IPC) facility by which different processes, either on the same or different machines, can send messages to each other. When the processes are on different machines, the IPC protocol uses internetwork datagrams called PUPs (see [4]) to provide reliable non-duplicated delivery of messages over a "best efforts" physical transport medium such as an Ethernet [34]. Enterprise uses a pre-existing protocol that is highly optimized for remote procedure calls ([2], [42]) in which messages are passed to remote machines as procedure calls on the remote machines.

The next layer of the Enterprise system is the Distributed Scheduling Protocol (DSP) which, using the IPC, locates the best available machine to perform a task (even if the best machine for a task turns out to be the one on which the request originated). Finally, the top layer is a Remote Process Mechanism, which uses both the DSP and IPC to create processes on different machines that can communicate with each other.

Table 1. A new distributed processing philosophy

1. Traditional distributed processing philosophy

a. dedicated personal workstations

b. unused workstations are idle

c. dedicated special purpose servers

2. New distributed processing philosophy

a. dedicated personal workstations

b. unused workstations are general purpose servers

c. special purpose servers only where required by hardware

[Figure 1. Protocol layers used in the Enterprise system. From bottom to top, the diagram shows the physical transport media (Ethernet, Arpanet, packet radio); Level 1, the Pup internetwork datagram; Level 2, BSP, IPC, and RPC; Level 3, DSP; and Levels 4, 5, REMOTEPROC. DSP and REMOTEPROC together are labeled as the Enterprise system.]

Decentralized Task Scheduling

One of the most obvious solutions to the task scheduling problem is to simply assign tasks to the first available processor. This approach, or one similar to it, is taken in many existing distributed systems (e.g., [2]). As we will see in more detail below, there are three important conditions in which such a simple approach may be radically sub-optimal: (1) when network-wide demand is high, (2) when processors have different capabilities (e.g., different speeds or different files loaded into virtual memory), and (3) when tasks have different priorities (e.g., some subtasks in a search problem are more promising than others).

The problem of scheduling tasks on processors is, of course, a well-known problem in traditional operating systems and scheduling theory, and there are a number of mathematical and software techniques for solving this problem in both single and multi-processor systems (e.g., [5], [8], [9], [27], [29], [30], [44]). Almost all the traditional work in this area, however, deals with centralized scheduling techniques, where all the information is brought to one place and the decisions are made there. In highly parallel systems where the information used in scheduling and the resulting actions are distributed over a number of different processors, there may be substantial benefits from developing decentralized scheduling techniques [33]. For example, when a centralized scheduler fails, the entire system is brought to a halt, but systems that use decentralized scheduling techniques can continue to operate with all the remaining nodes. Furthermore, much of the information used in scheduling is inherently distributed and rapidly changing (e.g., momentary system load). Thus, decentralized scheduling techniques can "bring the decisions to the information" rather than having to constantly transmit the information to a centralized decision maker.

Overview of the Distributed Scheduling Protocol

Since many of the problems of coordinating concurrent computer systems are isomorphic to problems in coordinating human organizations ([7], [13], [31], [39]), we have explicitly drawn upon organizational metaphors in the design of the Enterprise scheduling mechanism. In particular, the system's Distributed Scheduling Protocol (DSP) is based on the metaphor of a market (similar to the contract net protocol of Smith and Davis [11], [38], [39] and the scheduler in the Distributed Computing System ([12])). We use the term clients for the machines with tasks to be done and contractors for the remote server machines (as in [13]). The essential idea is that a client requests bids on a task he wants done, contractors who can do the task respond with bids, and the client selects a contractor from among the received bids. Figure 2 illustrates the messages used in this scheduling process. In the standard case, the following steps occur:

[Figure 2. Messages in the Distributed Scheduling Protocol. The diagram shows the typical message sequence between the client machine and the contractor machine, with optional messages marked.]

1. The client broadcasts a "request for bids". The request for bids includes the priority of the task, any special requirements, and a summary description of the task that allows contractors to estimate its processing time.

2. Idle contractors respond with "bids" giving their estimated completion times. Busy contractors respond with "acknowledgements" and add the task to their queues (in order of priority).

3. When a contractor becomes idle, it submits a bid for the next task on its queue.

4. When the client receives a bid, it sends the task to the contractor who submitted the bid. If a later bid is significantly better than the early one, the client cancels the task on the first bidder and sends the task to the later bidder. If the later bid is not significantly better (or if the task has side-effects and cannot be restarted), the client sends a cancel message to the later bidder.

5. When a contractor finishes a task, it returns the result to the client.

6. When a client receives the result of a task, it broadcasts a "cancel" message so that all the contractors can remove the task from their queues.

If a task takes much longer than it was estimated to take, the contractor aborts the task and notifies the client that it was "cut off." This cutoff feature prevents the possibility of a few people or tasks monopolizing an entire system.
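The bidding cycle in steps 1, 2, and 4 can be sketched in a few lines of Python. This is only an illustrative model, not the Interlisp-D implementation: the class and function names, the `speed` attribute, and in particular the `margin` used to decide when a later bid is "significantly better" are assumptions, since the protocol description does not fix a threshold.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Contractor:
    name: str
    speed: float                      # work units per time unit (assumed)
    busy: bool = False
    queue: list = field(default_factory=list)   # heap of (-priority, task)

    def respond(self, task, priority, size):
        """Step 2: an idle contractor bids, a busy one acknowledges and queues."""
        if self.busy:
            heapq.heappush(self.queue, (-priority, task))
            return ("ack", self.name, None)
        return ("bid", self.name, size / self.speed)


def request_bids(contractors, task, priority, size):
    """Step 1: broadcast the request for bids and collect the replies."""
    return [c.respond(task, priority, size) for c in contractors]


def should_switch(early_estimate, late_estimate, restartable=True, margin=0.5):
    """Step 4: move the task only if the later bid is significantly better.

    'Significantly better' is modeled as beating the early estimate by the
    fraction `margin`; this threshold is an assumption of the sketch.
    """
    return restartable and late_estimate < early_estimate * (1 - margin)


contractors = [
    Contractor("slow", speed=1.0),
    Contractor("fast", speed=4.0, busy=True),
]
replies = request_bids(contractors, "task-1", priority=5, size=8.0)
bids = [(est, name) for kind, name, est in replies if kind == "bid"]
first_estimate, first_bidder = min(bids)
```

Here the idle slow machine bids immediately and receives the task, while the busy fast machine queues it; when the fast machine later becomes idle and bids, `should_switch` decides whether the client cancels and restarts.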

Protection against processor failure. In addition to this bidding cycle, clients periodically query the contractors to which they have sent tasks about the status of the tasks. If a contractor fails to respond to a query (or any other message in the DSP), the client assumes the contractor has failed. Failures might result from hardware or software malfunctions or from a person preempting a machine for other uses. In any case, unless the task description specifically prohibits restarting tasks, the client automatically reschedules the task on another machine. Similarly, if a contractor fails to receive periodic queries from one of its clients, the contractor assumes the client has failed and the contractor aborts that client's task.

The "remote or local" decision. In the protocol as described above, if the local machine submits bids for its own tasks (i.e., offers to be its own contractor), then the local machine would presumably always be the first bidder and would therefore receive every task. To prevent this from happening, the client waits for other bids during a specified interval before processing its own bid. Since contractor machines are assumed to be processing tasks for only one user at a time, the client machine's own bid is also inflated by a factor that reflects the current load on the client's machine. (Human users of a processor can express their willingness to have tasks scheduled locally by setting either of these two parameters.)
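As a rough illustration of the two parameters just described, the following Python sketch treats the local bid as arriving only after the waiting interval and with an inflated estimate. The function name, the tuple layout, and the parameter names `wait_interval` and `load_factor` are hypothetical; the description above specifies only that such an interval and inflation factor exist.

```python
def first_bid(remote_bids, local_estimate, wait_interval, load_factor):
    """Return the bid the client acts on first.

    remote_bids: list of (arrival_time, estimated_completion, name).
    The local bid is held back for `wait_interval` and its estimate is
    inflated by `load_factor` (hypothetical names for the two settings).
    """
    local = (wait_interval, local_estimate * load_factor, "local")
    return min(remote_bids + [local])   # the earliest arrival is handled first


# A remote bid that arrives within the interval beats the local machine.
prompt_remote = first_bid([(0.3, 5.0, "remote-a")], 2.0,
                          wait_interval=0.5, load_factor=1.5)
# If no remote bid arrives in time, the inflated local bid is processed.
late_remote = first_bid([(0.8, 5.0, "remote-a")], 2.0,
                        wait_interval=0.5, load_factor=1.5)
```

A longer `wait_interval` or a larger `load_factor` makes local execution less likely, which is how a user could express reluctance to have tasks scheduled on their own machine.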

('ompiiiiMni

the ainlnni

Willi

conticiciors:

use.

specifying

\siihoui

task

invimi'l.

iko the conir.icl noi piolocitl (|11|.

I

This

scihicikc.

clmosc

ainmnwus

also iiliows for

It

of a

coiitr.ielors

is.

(li.it

iiri

.iw.ird"

bid.

".innoiiiiccmciu.

;iii

iiiwouinii where

which

processor

clients

below, this specialization of the contract net protcKol allows

make some

among

serve

I-^S|.

schciiDii

imiintil

programmers merely describe

is

DSP and

difference between

mutual selection b\

of cMinmicJ coiiiplciion nines (from

lor

wliicli clients to

The most imponant

clients" tasks in the

.illdws

facilities available

of remote procedures seems

efficiently

for

than

to

be

remote processes.

remote prtKCSscs make them

tlic

tliat

is

for

users

to

certainly not

their simplicity allows

However, the control and construct of choice in most

The Task Scheduling Problem

DSP is only one of a large class of potentially effective task scheduling protocols, and it is not immediately clear which of the possible schedulers is most desirable. For example, a protocol that allows a contractor to bid on one task while performing another one might be more efficient than DSP. Or, random scheduling might be almost as good as a sophisticated scheduler in most commonly occurring situations. In the development of DSP we analyzed the objectives of the task scheduler and used a simulation program to compare the performance of promising alternative scheduling protocols. In this section we present the results of that study.

Global Scheduling Objectives

Traditional schedulers for centralized computing systems often use list scheduling as a basis for layering the design of a system (e.g., [8]). In this approach, one level of the system sequences jobs according to their order in a priority list while the policy decisions about how priorities are assigned are made at a higher level in the system (see [30]). DSP allows precisely this same kind of separation of policy and mechanism. The DSP protocol itself is concerned only with sequencing jobs according to priorities assigned at some higher level. By assigning these priorities in different ways, the designers of distributed systems can achieve different global objectives. For example, it is well known that in systems of identical processors, the average waiting time of jobs is minimized by doing the shortest jobs first [9]. Thus, by assigning priorities in order of job length, the completely decentralized decisions based on priority result in a globally optimal sequencing of tasks on processors.
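The shortest-jobs-first claim can be checked on a small case. The sketch below is a minimal list scheduler for identical processors (each job in the list goes to whichever processor frees up first); the job lengths and the function name are made up for illustration, and all jobs are assumed to be available at time zero.

```python
import heapq
from itertools import permutations


def mean_flow_time(jobs, n_processors):
    """List-schedule jobs in the given order on identical processors and
    return the mean completion time (all jobs available at time 0)."""
    free_at = [0.0] * n_processors
    heapq.heapify(free_at)
    total = 0.0
    for length in jobs:
        start = heapq.heappop(free_at)      # earliest-free processor
        finish = start + length
        total += finish
        heapq.heappush(free_at, finish)
    return total / len(jobs)


jobs = [5.0, 1.0, 3.0, 2.0]                 # illustrative job lengths
as_submitted = mean_flow_time(jobs, 2)
shortest_first = mean_flow_time(sorted(jobs), 2)
best_order = min(mean_flow_time(p, 2) for p in permutations(jobs))
```

On this example, ordering the list by job length gives the same mean flow time as the best of all 24 orderings, and a lower one than the order in which the jobs happened to be submitted.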

Optimality results for mean flow time and maximum flow time. Traditional scheduling theory (e.g., [9]) has been primarily concerned with minimizing one of two objectives: (1) the average flow time of jobs (F_ave), the average time from availability of a job until it is completed, and (2) the maximum flow time of jobs (F_max), the time until the completion of the last job. Minimizing F_max also maximizes the utilization of the processors being scheduled ([6], [23]). (A third class of results from scheduling theory, involving the "tardiness" of jobs in relation to their respective deadlines, appears to be less useful in most computer system scheduling problems.) The most general forms of both these problems are NP-complete [8], so much of the literature in this field has involved comparing scheduling heuristics in terms of bounds on computational complexity and "goodness" of resulting schedules relative to optimal schedules (e.g., [10], [26]).

A number of results suggest the value of using two simple heuristics, shortest processing time first (SPT) and longest processing time first (LPT), to achieve the objectives F_ave and F_max, respectively. First we consider cases where all jobs are available at the same time and their processing times are known exactly. In these cases, if all the processors are identical, then SPT exactly minimizes F_ave [9] and LPT is guaranteed to produce an F_max that is no worse than 4/3 of the minimum possible value [17]. If some processors are uniformly faster than others, then the LPT heuristic is guaranteed to produce an F_max no worse than twice the best possible value [10].

Next, we consider cases where all jobs are available at the same time but their exact processing times are not known in advance. Instead the processing times have certain random distributions (e.g., exponential) with different expected values for different jobs. In these cases, if the processors are identical and allow preemption, then SPT and LPT exactly minimize the expected values of F_ave and F_max, respectively ([45], [15]). Finally, we consider cases where the jobs are not all available at the same time but instead arrive randomly and have exponentially distributed processing times. In these cases, if the system contains identical processors on which preemptions and sharing are allowed, then SPT and LPT exactly minimize the expected values of F_ave and F_max [43].
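To make the 4/3 bound for LPT on identical processors concrete, the following sketch compares the LPT schedule with a brute-force optimum on a small, commonly used example. The job lengths are illustrative and the function names are hypothetical; the brute-force search is only practical for a handful of jobs.

```python
from itertools import product


def lpt_makespan(jobs, n_processors):
    """Longest processing time first: give each job, longest first,
    to the currently least-loaded processor."""
    loads = [0.0] * n_processors
    for length in sorted(jobs, reverse=True):
        loads[loads.index(min(loads))] += length
    return max(loads)


def optimal_makespan(jobs, n_processors):
    """Exhaustively try every assignment of jobs to processors."""
    best = float("inf")
    for assignment in product(range(n_processors), repeat=len(jobs)):
        loads = [0.0] * n_processors
        for length, proc in zip(jobs, assignment):
            loads[proc] += length
        best = min(best, max(loads))
    return best


jobs = [3.0, 3.0, 2.0, 2.0, 2.0]
lpt = lpt_makespan(jobs, 2)          # 7.0: loads end up as (7, 5)
opt = optimal_makespan(jobs, 2)      # 6.0: {3, 3} on one, {2, 2, 2} on the other
```

LPT is not optimal here, but its maximum flow time stays within the 4/3 guarantee quoted above.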

Other scheduling objectives. DSP can be used to achieve many other possible objectives besides the traditional ones of minimizing mean or maximum flow time for independent jobs. For example:

(1) Parallel heuristic search. Many artificial intelligence programs use various kinds of heuristics for determining which of several alternatives in a search space to explore next. For example, in a traditional "best first" heuristic search, the single most promising alternative at each point is always chosen to be explored next [36]. By using the heuristic evaluation function to determine priorities for DSP, a system with n processors available can be always exploring the n most promising alternatives rather than only one. Furthermore, if the processors have different capabilities, each task will be executing on the best processor available to it, given its priority.

(2) Arbitrary market with priority points. Another obvious use of DSP is to assign each human user of the system a fixed number of priority points in each time period. Users (or their programs) can then allocate these priority points to tasks in any way they choose in order to obtain the response times they desire (see [41] for a similar, though non-automated, scheme).

(3) Incentive market with priority points. If the personal computers on a network are assigned to different people, then a slight modification of the arbitrary market in (2) can be used to give people an incentive to make their personal computers available as contractors. In this modified scheme, people accumulate additional priority points for their own later use, every time their machine acts as a contractor for someone else's task.

Alternative Scheduling Protocols

For comparison purposes, consider the following two alternative protocols. The first protocol is a scheme designed to be the most logical extension of the traditional techniques from scheduling theory (e.g., [9]). (In fact, it is the protocol we initially thought would provide the best performance.) The second is a random assignment method that provides a comparison with designs where no attention is given to the scheduling decision.

Eager assignment. The first alternative protocol overcomes the possible deficiency of DSP that no estimates of completion times are provided by processors that are not ready to start immediately. That is, clients using DSP may start a task on a machine that is available immediately (possibly their own local machine), only to find that another much faster machine becomes available soon. If the task is canceled and restarted, the processing time on the first machine is wasted. If not, the task finishes later than it could have. If reasonable estimates of completion times on currently busy machines could be made, then clients would know enough to wait for faster machines that were not immediately available. These estimates might also be useful to the human users who, when their tasks are submitted, have no idea of when the tasks will actually start and finish.

In this alternative, tasks are assigned to contractors as soon as possible after the tasks become available and then reassigned as necessary when conditions change. By analogy with "lazy" evaluation of variables ([14], [18]), the original DSP could be called "lazy assignment" because clients defer assigning a task to a specific contractor until the contractor is actually ready to start the job. This alternative protocol, therefore, will be called "eager assignment," since it assigns tasks as soon as possible.

In this protocol, contractors bid on all tasks even if they are currently busy. Each contractor maintains a schedule of all tasks it is expected to do, along with their estimated start and finish times. In this way, the contractor can make estimates of when it could complete any new task that is submitted. A contractor estimates its starting time for a task by finding the first time in its schedule at which no task of higher priority is scheduled. Then the client picks the best bid and sends the task to the contractor who submitted it. When new tasks are added to a contractor's schedule, or when a task takes longer than expected to complete, the contractor notifies the owners of later tasks in its schedule that their reservations have been "bumped." These clients may then try to reschedule their tasks on other contractors.
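A contractor's schedule under eager assignment can be modeled as a priority-ordered list of reservations. The sketch below is a simplification under stated assumptions: it ignores any task already executing and the passage of time, treats equal priorities as first come first served, and uses hypothetical names throughout.

```python
from dataclasses import dataclass, field


@dataclass
class EagerContractor:
    speed: float                                   # work units per time unit
    schedule: list = field(default_factory=list)   # (priority, task, duration)

    def estimate(self, priority, size):
        """Bid without committing: the task would start at the first time
        at which no task of equal or higher priority is scheduled."""
        start = sum(d for p, _, d in self.schedule if p >= priority)
        return start, start + size / self.speed

    def add(self, priority, task, size):
        """Reserve a slot and return the lower-priority tasks that are
        pushed later ('bumped') and whose owners must be notified."""
        bumped = [t for p, t, _ in self.schedule if p < priority]
        self.schedule.append((priority, task, size / self.speed))
        self.schedule.sort(key=lambda entry: -entry[0])   # stable sort
        return bumped


c = EagerContractor(speed=2.0)
c.add(5, "a", 4.0)                 # runs from 0.0 to 2.0
c.add(1, "b", 2.0)                 # runs from 2.0 to 3.0
bid = c.estimate(3, 2.0)           # would start after "a": (2.0, 3.0)
bumped = c.add(3, "c", 2.0)        # "b" is pushed back and must be notified
order = [task for _, task, _ in c.schedule]
```

The bookkeeping that `add` performs on every reservation is the extra complexity that the lazy protocol avoids.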

It is important to note that even in cases where there is a lot of bumping, this scheduling process is guaranteed to converge. Since tasks can only bump the reservations of other tasks of lower priority, the scheduling of a new task can never cause more than a finite (but possibly large) number of bumps. To reduce the amount of rescheduling in rapidly changing situations, rescheduling can occur only when a task is bumped by more than a certain amount.

Random assignment. In the second alternative protocol, clients pick the first contractor who responds to their request for bids and contractors pick the first tasks they receive after an idle period. Contractors do not bid at all when they are executing a task, and they answer all requests for bids when they are idle. If a client does not receive any bids, it continues to rebroadcast the request for bids periodically. When contractors receive a task after already beginning execution of another one, the new task is rejected (with a "bump" message) and the client continues trying to schedule it elsewhere. In the simulations discussed below, the selection of the first bidder when more than one machine is available, and the selection of the first task submitted when more than one is waiting, are both modeled as random choices since the delay times for message transmission and processing are presumably random. (In reality, fast contractor machines might often respond more quickly to requests for bids than slow ones and so would be more likely to be the first bidders; thus the performance of this scheduling mechanism in a real system might be somewhat better than the simulated performance.)

Simulation Results: Minimizing Mean Flow Time

Since minimizing the mean flow time of independent jobs is likely to be the primary objective in many real distributed scheduling environments, and since analytic results about this topic are so scarce, it is appropriate to use simulations to compare alternative strategies for achieving this objective. In this section we summarize the results of a series of simulation studies of the three distributed scheduling alternatives outlined above: (1) lazy assignment (the original DSP), (2) eager assignment, and (3) random assignment. In both the eager and lazy alternatives, priorities are determined according to the shortest processing time first (SPT) heuristic. In the random alternative, priorities are not used. These simulation studies are described in more detail by Howard [21].

Method

To simulate the performance of the alternative scheduling strategies, most of the code for the operational Enterprise system was used, with some of the functions (primarily those for sending messages between machines) redefined so that a complete network was simulated on one machine. The completion of jobs was simulated using elapsed time of the simulation clock, with faster machines completing simulated jobs proportionally more quickly than slow ones.

Job loads. For all the simulations, "scripts" of random job submissions were created. All jobs were assumed to be independent of each other and required to be run remotely. The job arrivals were assumed to be a Poisson process and the amount of processing in each job was assumed to be exponentially distributed. This means that, at every instant, there was a constant probability that a new job would arrive in the system, and that, at any instant, for each job currently executing, there was a constant probability that it would end. By varying the parameters of the random number generators, we created scripts with average loads of 0.1, 0.5, and 0.9, where average load is defined as the expected amount of processing requested per time interval divided by the total amount of processing power in the system. Thus, 0.1 represents a light load and 0.9 a heavy load.
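A job script of this kind could be generated as in the Python sketch below. It is not the generator used in the study: the function name, the `mean_work` parameter, and the seed handling are assumptions. The arrival rate follows from the definition of average load given above, load = (arrival rate × mean work per job) / total processing power.

```python
import random


def make_script(load, total_power, mean_work, n_jobs, seed=0):
    """Return a list of (arrival_time, work) pairs with Poisson arrivals
    and exponentially distributed work, tuned to the requested load."""
    rng = random.Random(seed)
    arrival_rate = load * total_power / mean_work
    t = 0.0
    script = []
    for _ in range(n_jobs):
        t += rng.expovariate(arrival_rate)         # exponential inter-arrival gap
        work = rng.expovariate(1.0 / mean_work)    # exponential job size
        script.append((t, work))
    return script


script = make_script(load=0.5, total_power=8.0, mean_work=4.0, n_jobs=20000)
requested = sum(work for _, work in script)
observed_load = requested / (script[-1][0] * 8.0)
```

With a long enough script the observed load settles close to the requested value; short scripts can deviate noticeably, which is one reason to run several scripts per load level.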

12

addilion lo ihc

In

errors.

i'slinuiliiw

amomu

incliuicd an c*-limatcd

of how long ihe job would perfect

Nine

hhuhinc ainfi^umiions. all configur.itions, a total

achieved

in different

C'onvnunlciUions.

ways: a single machine of speed

communication

where

delays

to

arc

start

time

than

some

is

delayed at

all.

in

bumps were mad^.

the

In a real system, jobs

late

Replications.

generated.

often,

the

tliesc

same

of the two accuracies (0 and using

tlie

same

scripts for

all

fi\c scripts

if

the job

processing

In real

times,

this

In other cases, different assumptions

among

mechanisms.

scheduling

In keeping with the spirit

would

ordinarily have lo be

(After a job begins execution,

when

of simulating

bumped by more

it

is

not subject to

from

late bids are recci\ed

method might improve

is

if

if

fewer

jobs were sometimes

successfliUy assigned lo a contractor.

random method could

achieve;

if

Thus,

this

rebroadcasting

±

five

different

random

scripts

For each load average, scripts of job submissions were generated randomly. For each job, the scripts also included the actual amount of processing in the job and an estimate of the amount of processing (i.e., the estimate a user might have made of how long the job would take). In order to examine extreme cases, these estimates were either perfect (0 percent error) or very inaccurate (± 100 percent error). In the case of inaccurate estimates, the errors were uniformly distributed over the range. Then the same scripts were used for each of the three scheduling methods, each level of accuracy (0 or 100 percent), and each of the nine machine configurations. By using the same scripts for the different methods, accuracies, and configurations, we obtained a much more powerful comparison of the differences due to the factors in which we were interested than if job submissions had been generated randomly in each different case.

Nine different configurations of machines on the network were defined. In each configuration, a total of 8 units of processing power was available, but it was divided differently among the machines (for example, 1 machine of speed 8; a machine of speed 4 and 2 machines of speed 2; or 8 machines of speed 1).

In order to simulate "pure" cases of the different scheduling mechanisms, communication among machines was assumed to be perfectly reliable and instantaneous. [...] about communications delays might change the relative trade-offs [...]. In the lazy simulations, clients rebroadcast their requests for bids in every time interval of the simulation until the job [...]. (In the lazy simulations, jobs are never restarted after being bumped.) Similarly, in the eager simulations, jobs are rescheduled every time their scheduled [...] tolerance before being rescheduled. In other words, the [...] simulates the best scheduling performance [...]; the performance of the eager method could only get worse [...].
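The estimate-error scheme described above can be sketched as follows. This is only an illustration, assuming that a ± 100 percent error means the relative error is drawn uniformly between -1 and +1; the function name `noisy_estimate` and its parameters are hypothetical and are not taken from the Enterprise simulations.

```python
import random

def noisy_estimate(actual, max_error=1.0, rng=random):
    """Return an estimate of `actual` whose relative error is drawn
    uniformly from [-max_error, +max_error] (assumed interpretation)."""
    error = rng.uniform(-max_error, max_error)
    return actual * (1.0 + error)

# Perfect estimates (0 percent error) reproduce the actual amount.
perfect = noisy_estimate(10.0, max_error=0.0)

# Very inaccurate estimates (+/- 100 percent error) fall between
# 0 and twice the actual amount of processing.
rng = random.Random(0)  # fixed seed so the same "script" can be replayed
rough = [noisy_estimate(10.0, max_error=1.0, rng=rng) for _ in range(5)]
```

Seeding the generator mirrors the idea of reusing one script across methods and configurations, so that differences in results come from the factors under study rather than from different random job streams.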

Effect of scheduling method. Figure 3 shows that the "lazy" assignment method is at least as good as [...] in [...] each case. With perfect estimates of processing amounts, both eager assignment and lazy assignment are consistently better than random assignment [...] substantial (5%-50%) [...].

Effect of accuracy of processing time estimates. The lazy assignment method is quite robust in the face of poor processing time estimates, with performance degrading only a few percent even when the estimates are off by as much as 100 percent. The eager assignment method, on the other hand, often does much worse with bad estimates than with good ones, and in some cases it does worse with bad estimates than purely random assignment.

Effect of machine configuration. In terms of the effect of machine configuration, there are three important results: (1) At light loads, there is a sizable advantage of having one fast machine rather than several slower machines with the same total processing power. This last result can also be derived analytically (see [33]). [...] does not matter nearly as much. (2) In the case of multiple identical machines, there is [...] advantage for either of the two non-random assignment methods. (3) In the case of non-identical processors, there is a clear advantage [...] because the two non-random assignment methods consistently assign jobs to the faster machines while the random method does not. With light loads, fast machines are almost always available so this assignment preference is a significant advantage. With heavy loads, the faster machines are often busy [...]. As Figure 4 shows, the benefits of non-random scheduling (averaged over all loads) increase as the range of processor speeds in the network increases.

Note: The results presented are averages over several machine configurations in each category. There are three configurations of multiple identical machines, five configurations of multiple non-identical machines, and one configuration of a single machine. Strictly speaking, since the different configurations are not representative of any particular category, and since there are large differences between different configurations, one should be hesitant about averaging them. Therefore, we normalized the flow times for each configuration by dividing by the mean flow time for jobs over [...] in this way. The averages of these normalized values showed exactly the same patterns as the averages of the original flow times, so the original flow times are shown here.

[Figure 3. Performance of 3 scheduling methods (lazy assignment, eager assignment, random assignment) under various loads and accuracies of processing time estimates. Panels: Single Processor; Multiple Identical Processors; Multiple Non-identical Processors. Horizontal axis: Load Factor. Legend: Lazy, perfect estimates; Lazy, +/- 100% error; Eager, perfect estimates; Eager, +/- 100% error; Random.]

[Figure 4. Improvement in average flow time of non-random assignment methods compared to random assignment. "Range of processor speeds" is defined as the ratio of the speed of the fastest to the slowest contractor in the network. Horizontal axis: Range of Processor Speeds. Legend: Eager assignment, +/- 100% errors; Lazy assignment, +/- 100% errors.]
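The "range of processor speeds" used in Figure 4 is a simple ratio, which can be computed as in the sketch below. The function name `speed_range` and the example configurations are illustrative assumptions, not part of the original simulations.

```python
def speed_range(speeds):
    """Ratio of the fastest to the slowest contractor speed in a network."""
    if not speeds or min(speeds) <= 0:
        raise ValueError("speeds must be a non-empty list of positive numbers")
    return max(speeds) / min(speeds)

# Illustrative configurations, each with 8 units of total processing power.
mixed = speed_range([4, 2, 2])    # one speed-4 machine, two speed-2 machines
identical = speed_range([1] * 8)  # eight identical speed-1 machines
```

A network of identical machines has a range of 1, so under this definition larger values on the horizontal axis correspond to more heterogeneous networks.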


Amount of message traffic. In these "pure" simulations, communication was treated as instantaneous and free in order to maximize scheduling performance. In designing an actual system, one would presumably sacrifice some scheduling performance in order to reduce the number of messages. Figure 5 shows the number of broadcast messages and of directed messages used by the two non-random scheduling methods with perfect processing time estimates for jobs. Though they are not shown here, similar results were obtained for poor estimates of job processing times (± 100% error). The lazy method requires almost the same number of messages per job at both heavy and light loads, while the eager method requires many more messages as loads increase.

Discussion

The similarities of this scheduling problem to that of job shop scheduling might lead one to believe that obvious extensions of traditional scheduling theory algorithms such as the eager assignment method would provide the best performance for scheduling of distributed tasks. Also, early simulation results by Conway, Maxwell, and Miller [9] of a job shop environment showed that schedules based on very inaccurate estimates performed almost as well as schedules based on perfect estimates and much better than random schedules. Why, then, should the eager assignment method do so poorly in our examples when given inaccurate processing time estimates? We believe that two primary factors account for this result:

(1) "Stable world illusion." In the eager assignment method, each job is assigned to the machine that estimates the soonest completion time. In other words, jobs are assigned to machines on the assumption that no more jobs will arrive (i.e., that the world will remain stable). But if jobs of higher priority arrive later in the simulation, and are assigned to the same machine, then they will keep "bumping" the first job back to start later and later. Even though jobs are rescheduled every time any new jobs arrive that delay their estimated start time, by the time the job is rescheduled, it may already have missed a chance to start on another machine that could have completed it before it will now be completed.

In some of our simulations (not included here), the bids included an extra factor to correct for this effect, that is, bids included an estimate of how long the starting time of the job would be delayed by jobs that had not yet arrived, but could be expected to arrive before the job began execution. (See [28] for the derivation of this correction factor.) Even though the inclusion of this correction factor did improve the performance of the eager assignment method somewhat, the changes were not substantial.

(2) Unexpected availability. When a job takes longer than expected, or when higher priority jobs arrive at a processor, the clients who submitted jobs scheduled later on that processor are notified so that they can reschedule. Processors may also become available sooner than expected, but in these cases, the clients are never notified. There can thus be situations where fast processors are idle while high priority jobs wait in queues on slower processors. This appears to be a serious weakness of the eager assignment method. We have specified, but not implemented, an addition to the protocol that notifies clients of such situations and allows them to reschedule their tasks. The cost of this addition would be even greater message traffic and system complexity, and we believe it is unlikely that the resulting performance would be significantly better than the simpler lazy assignment method.

[Figure: plot with panels labeled "Directed Messages" and "Broadcast Messages"; axis data not recoverable.]

Implications. The most important and surprising result of these simulations is the superiority of the lazy assignment method. Even though the lazy method is much simpler to implement and often requires substantially less message traffic, it never performs much worse than the alternative methods, and it usually performs much better.

There are two important qualifications on the generality of these results: First, the introduction of communication delays might change the trade-offs among scheduling methods in a few cases. For example, in heavily loaded systems with very long communication delays, the eager assignment method transmits jobs to contractors in advance and thus removes one communication lag from the round trip [...] flow time. Also, these simulations were only attempting to minimize average flow time, not any of the other possible global objectives. Nevertheless, we believe that the superiority of deferring assignment of tasks to machines in multi-machine scheduling environments is a principle that is likely to be very widely applicable, and that is not yet widely known.
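The "stable world illusion" discussed above can be illustrated with a small sketch of the eager assignment rule. The `Machine` class, the `eager_assign` function, and the numbers are hypothetical and chosen only for illustration; they are not the simulation code used in the paper, and the model assumes each machine processes its queue strictly in order at a fixed speed.

```python
from dataclasses import dataclass

@dataclass
class Machine:
    name: str
    speed: float              # processing units per time unit
    queued_work: float = 0.0  # processing units already scheduled

    def estimated_completion(self, work):
        # Completion time if no further jobs arrive (the "stable world").
        return (self.queued_work + work) / self.speed

def eager_assign(work, machines):
    """Assign the job at once to the machine bidding the soonest completion."""
    best = min(machines, key=lambda m: m.estimated_completion(work))
    promised = best.estimated_completion(work)
    best.queued_work += work
    return best, promised

machines = [Machine("fast", 4.0, queued_work=8.0), Machine("slow", 1.0)]
chosen, promised = eager_assign(4.0, machines)  # fast bids 3.0, slow bids 4.0

# A higher-priority job of 8 units later arrives on the chosen machine and
# bumps our job, so the real completion time exceeds the slow machine's bid.
delayed = promised + 8.0 / chosen.speed
```

Here the job is promised completion at time 3.0 on the fast machine, but after being bumped it finishes at 5.0, later than the 4.0 it would have achieved on the slow machine that was idle when the assignment was made.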

DISTRIBUTED SCHEDULING PROTOCOL DEFINITION

In this section, we present a detailed definition of the Distributed Scheduling Protocol (DSP) that is used to implement the lazy assignment method. This protocol can be used by cooperating machines in a network even when the machines use different programming languages and operating systems. This section (and the Interlisp implementation of the protocols) describe the message formats as lists of fields with no field widths or data types specified. To use the protocol with other languages, field widths and data types should be specified as well.

The message definitions in Figures 6 and 7 use a modified form of BNF specification. Nonterminal symbols are enclosed by angle brackets ("< >"), terminal symbols are written without delimiters, ellipses ("...") indicate lists containing an arbitrary number of terms, and square brackets ("[ ]") are used to indicate comments.
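As a sketch of what specifying field widths and data types might look like when the protocol is used from another language, the example below encodes a message that is first written as an untyped list of fields. The message contents (a job identifier and an estimated completion time), the layout string, and the function names are assumptions made for illustration; they are not the DSP message formats defined in the figures.

```python
import struct

# Hypothetical bid message as an untyped list of fields:
# [job identifier, estimated completion time]
bid_fields = [17, 12.5]

# Assumed wire layout: network byte order, 4-byte unsigned integer
# followed by an 8-byte floating point number.
BID_LAYOUT = "!Id"

def encode_bid(fields):
    """Pack the list of fields into bytes with explicit widths and types."""
    job_id, estimated_completion = fields
    return struct.pack(BID_LAYOUT, job_id, estimated_completion)

def decode_bid(data):
    """Unpack bytes back into the list-of-fields form."""
    return list(struct.unpack(BID_LAYOUT, data))

wire = encode_bid(bid_fields)
roundtrip = decode_bid(wire)
```

Fixing the byte order and widths is what lets machines with different languages and operating systems agree on the same bytes, whereas the list-of-fields form alone leaves those choices to each implementation.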

[Figure: Message Formats. BNF definitions not recoverable.]
