Scheduling Parallelizable Tasks to Minimize Average Response Time

John Turek
IBM T.J. Watson Research Center, Yorktown Heights, NY

Walter Ludwig
Computer Science Department, University of Wisconsin-Madison, Madison, WI

Joel L. Wolf
IBM T.J. Watson Research Center, Yorktown Heights, NY

Lisa Fleischer
Operations Research Department, Cornell University, Ithaca, NY

Prasoon Tiwari
IBM T.J. Watson Research Center, Yorktown Heights, NY

Jason Glasgow
Technion, Israel Institute of Technology, Haifa, Israel

Uwe Schwiegelshohn
Institute for Information Technology Systems, University of Dortmund, Dortmund, Germany

Philip S. Yu
IBM T.J. Watson Research Center, Yorktown Heights, NY

Abstract

A parallelizable (or malleable) task is one which can be run on an arbitrary number of processors, with a task execution time that depends on the number of processors allotted to it. Consider a system of $M$ independent parallelizable tasks which are to be scheduled without preemption on a parallel computer consisting of $P$ identical processors. For each task, the execution time is a known function of the number of processors allotted to it. The goal is to find (1) for each task $i$, an allotment of processors $\beta_i$, and (2) overall, a non-preemptive schedule assigning the tasks to the processors which minimizes the average response time of the tasks. Equivalently, we can minimize the flow time, which is the sum of the completion times of each of the tasks.

In this paper we tackle the problem of finding a schedule with minimum average response time in the special case where each task in the system has sublinear speedup. This natural restriction on the task execution time means simply that the efficiency of a task decreases or remains constant as the number of processors allotted to it increases. The scheduling problem with sublinear speedups has been shown to be NP-complete in the strong sense. We therefore focus on finding a polynomial time algorithm whose solution comes within a fixed multiplicative constant of optimal. In particular, we give an algorithm which finds a schedule having a response time that is within 2 times that of the optimal schedule and which runs in $O(M(M^2 + P))$ time.

SPAA 94 - 6/94 Cape May, N.J., USA © 1994 ACM 0-89791-671-9/94/0006..$3.50

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association of Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

1 Introduction

Consider a multiprocessor system consisting of $P$ identical processors and a set of $M$ independent tasks to be scheduled. Assume that each task $i \in \{1, \ldots, M\}$ can be allotted an arbitrary number of processors $\beta_i \in \{1, \ldots, P\}$, and that its task execution time $t_i(\beta_i)$ is a known function of the number of processors allotted to it. All of the processors allotted to a task are required to execute that task in unison and without preemption. That is, the $\beta_i$ processors are all required to start task $i$ at the same starting time, say $\tau_i$. They will then complete task $i$ at its completion time $\tau_i + t_i(\beta_i)$. We refer to $\beta_i$ as the allotment of processors to task $i$, and we denote the allotment of processors to all tasks by $\beta = (\beta_1, \ldots, \beta_M)$. (We will sometimes refer to $\beta_i$ as the degree of parallelism of task $i$, and informally refer to such tasks interchangeably as either parallelizable or malleable.)

A schedule will consist, for each task $i$, of a processor allotment $\beta_i$ and a starting time $\tau_i$. A schedule is required to be legal in the sense that, at any time $t$, the number of active processors does not exceed the total number of processors $P$. The problem we are concerned with is that of finding a schedule that minimizes the average response time, given by

$$\frac{1}{M} \sum_{i=1}^{M} \left( \tau_i + t_i(\beta_i) \right).$$

(One can think of a task $i$ as taking place within a rectangle whose width $\beta_i$ stretches along a processor axis and whose height $t_i(\beta_i)$ stretches along a time axis, although the processors allotted to a task are not explicitly required to be contiguous.)
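As an illustration of the legality condition and of the average response time, the following minimal Python sketch checks a schedule given as (allotment, starting time, execution time) triples. The names `is_legal` and `average_response_time`, the triple representation, and the example data are assumptions made for illustration only; they are not taken from the paper.

```python
from typing import List, Tuple

# One scheduled task: (allotment beta_i, starting time tau_i, execution time t_i(beta_i)).
# This representation is an assumption made for the example.
ScheduledTask = Tuple[int, float, float]


def is_legal(schedule: List[ScheduledTask], P: int) -> bool:
    """Return True if at no time more than P processors are active."""
    events = []
    for beta, tau, t in schedule:
        events.append((tau, beta))       # task starts: beta processors become active
        events.append((tau + t, -beta))  # task completes: beta processors are released
    # At equal times, releases (negative deltas) sort before starts.
    events.sort()
    active = 0
    for _, delta in events:
        active += delta
        if active > P:
            return False
    return True


def average_response_time(schedule: List[ScheduledTask]) -> float:
    """Mean of the completion times tau_i + t_i(beta_i)."""
    return sum(tau + t for _, tau, t in schedule) / len(schedule)


# Hypothetical example with P = 4 processors.
example = [(2, 0.0, 3.0), (2, 0.0, 1.0), (2, 1.0, 2.0)]
print(is_legal(example, 4))
print(average_response_time(example))
```

In the example, the third task starts at the moment the second one completes, so the number of active processors never exceeds 4.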

Average response time is an important measure of computer system performance. Note that the completion times of all tasks count equally, not just the completion time of the last task (as is the case in the well-studied makespan problem). Modulo the factor $1/M$, which can obviously be removed without affecting the solution, we are attempting to minimize

$$\sum_{i=1}^{M} \left( \tau_i + t_i(\beta_i) \right). \qquad (1)$$

This quantity is typically called the flow time [Cof76]. For simplicity, we shall speak in terms of the flow time in this paper. We shall refer to this problem as MRTS (for malleable response time scheduling).

Suppose instead that each task $i$ can be run only on a fixed number of processors $\beta_i$, with an execution time, say $t_i$. In other words, the allotments are fixed parameters of the system, rather than part of the solution to the problem. We refer to this case as NMRTS (for non-malleable response time scheduling). NMRTS is indeed a special case of MRTS, because we can define an appropriate execution time function for each task $i$.
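To see why the flow time of Equation 1 differs from the makespan, consider the small Python sketch below. It runs the same tasks one after another in two different orders: the makespan is unchanged, while the flow time is not. The helper names and the execution times are hypothetical and serve only to illustrate the two objectives.

```python
from itertools import accumulate
from typing import List


def sequential_completions(times: List[int]) -> List[int]:
    """Completion times when tasks run back to back, in the given order."""
    return list(accumulate(times))


def flow_time(completions: List[int]) -> int:
    """Sum of the completion times (the quantity in Equation 1)."""
    return sum(completions)


def makespan(completions: List[int]) -> int:
    """Completion time of the last task only."""
    return max(completions)


# Hypothetical execution times, scheduled shortest-first and longest-first.
shortest_first = sequential_completions([1, 2, 3])  # completions 1, 3, 6
longest_first = sequential_completions([3, 2, 1])   # completions 3, 5, 6

print(flow_time(shortest_first), makespan(shortest_first))
print(flow_time(longest_first), makespan(longest_first))
```

Both orders finish at time 6, but running the short tasks first gives the smaller sum of completion times.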

For the sake of simplicity we shall assume that the task execution time function $t_i(\beta_i)$ of task $i$ is a non-increasing function of the number of processors $\beta_i$ allotted to it. However, this is not actually a restriction: any task execution time function gives rise to a surrogate task execution time function, defined by the minimum execution time over that number of processors or less. A schedule for the surrogate simply does not make use of the remaining processors.
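The surrogate execution time function can be sketched as a running minimum. In the Python fragment below, `times[k]` is assumed to hold the execution time on `k + 1` processors; the function name `surrogate` and the sample values are illustrative assumptions, not notation from the paper.

```python
from itertools import accumulate
from typing import List


def surrogate(times: List[float]) -> List[float]:
    """Running minimum of the execution times.

    Entry k of the result is the best execution time achievable with
    at most k + 1 processors, so the result is non-increasing.
    """
    return list(accumulate(times, min))


# Hypothetical execution times on 1, 2, 3, 4 and 5 processors.
raw = [10, 6, 7, 4, 5]
print(surrogate(raw))
```

With these values, allotting 3 processors is no better than using only 2 of them, and allotting 5 is no better than using only 4.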

In this paper we focus our attention on systems of tasks in which each task has sublinear speedup. Formally, this means that the work function $a_i(\beta_i) = \beta_i t_i(\beta_i)$ for task $i$ is a non-decreasing function of $\beta_i$, for all $1 \le i \le M$. (Geometrically, of course, the work is simply the area of the rectangle associated with the task.) In other words, the efficiency of a task decreases or remains constant as the number of processors allotted to it increases. We refer to this restricted version of MRTS as MRTS-SS.
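The sublinear speedup condition is easy to test for a tabulated execution time function. The Python sketch below computes the work $\beta \cdot t(\beta)$ for each allotment and checks that it never decreases. As before, `times[k]` is assumed to be the execution time on `k + 1` processors, and the function names and data are hypothetical.

```python
from typing import List


def work(times: List[int]) -> List[int]:
    """Work beta * t(beta) for beta = 1, 2, ..., len(times)."""
    return [(k + 1) * t for k, t in enumerate(times)]


def has_sublinear_speedup(times: List[int]) -> bool:
    """True if the work function is non-decreasing in the allotment."""
    w = work(times)
    return all(a <= b for a, b in zip(w, w[1:]))


# Hypothetical execution times on 1, 2, 3 and 4 processors.
sublinear = [12, 7, 5, 4]   # work: 12, 14, 15, 16
superlinear = [12, 5, 3]    # work: 12, 10, 9

print(has_sublinear_speedup(sublinear))
print(has_sublinear_speedup(superlinear))
```

The first task loses efficiency with every added processor and so satisfies the condition; the second gains efficiency and does not.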

We remark in passing that if, for each task $i$, the work function $a_i(\beta_i)$ is instead a non-increasing function of $\beta_i$, then the flow time is minimized trivially by allotting the entire system to each task and ordering the tasks in order of increasing execution time. Of course, such an assumption typically does not hold.

2 Previous work

The MRTS-SS problem has been shown to be NP-complete in the strong sense. In the current paper we present an algorithm for the MRTS-SS problem which finds a schedule with flow time within 2 times that of the optimal schedule and which runs in $O(M(M^2 + P))$ time.