Integration of Grid Cost Model into ISS/VIOLA Meta-Scheduler ...

10 downloads 322 Views 260KB Size Report
Managed by ... Integration of Grid Cost Model into ... Network performance. + .... service). • simulation of the real situation with UNICORE middeware and the.
Managed by Managed by

Integration of Grid Cost Model into ISS/VIOLA Meta-Scheduler environment

Ralf Gruber*, Vincent Keller*, Michela Thiémard*, EPFL Oliver Wäldrich*, Wolfgang Ziegler*, SCAI Philipp Wieder*, Forschungszentrum Jülich Pierre Manneback*, CETIC *CoreGRID members

Managed by

Plan • • • • • •

HPC Situation Application structure Gamma Model ISS Concept Cost Model UNICORE-VIOLA-ISS Testbed

Managed by

HPC : TODAY A3 A1 A2

B3

A4 R1

B1

A5

B2 C1 C2

B4 R2

B5

C3 C4 R3

C5

Resources chosen to satisfy needs of all local applications

Managed by

HPC : TOMORROW A3 A1

A4

A2

A5

C1

C3 C4

C2

C5

B3 B1

B4

B2

B5

UNICORE/MSS + ISS

R1

R3

R2

How to BEST distribute applications to improve overall performance ?

Managed by

HPC : TOMORROW EVENING A3 A1

A4

A2

A5

C1

C3 C4

C2

C5

B3 B1

B4

B2

B5

UNICORE/MSS + ISS

?

?

?

What resources needed to satisfy applications components WITHIN LOWEST COSTS ?

Managed by

HPC : AFTER TOMORROW A3 A1

A2

UNICORE/MSS + ISS

R1

R3

R2

What to do to BEST distribute application COMPONENTS to improve overall performance ?

Managed by

Application structure

Component 1: Embarassingly parallel

Component 2: Application

Point-to-point

Component 3: Multicast

Component 4: Client-server

Machine m NOW Machine l PC cluster

Machine i MPP Machine k NUMA

New feature to VIOLA/Meta-Scheduler environment: Submit to well-suited machines for application components

Γ model

Managed by

Characterizes application components and parallel machines

Γ=

E = CPU time over communication time 1− E

Γ parameters of application components for one machine: CPU time Communication time Size of messages

+ Parameters on one machine: Memory bandwidth Processor performance Network performance

+

Parameters on other machine: Memory bandwidth Processor performance Network performance

Γ parameters of application component on other machine

Managed by

Framework

Managed by

ISS concept

Managed by

Pre-execution

Resource discovery Prologue Decision Queing

Managed by

Prologue

Find eligible machines: Yes on Machine up? Access rights? Program exists? Enough memory?

Managed by

Decision

Collect data on their availabilities Evaluate cost function Submit the job

Cost function time costs

Managed by

HW+SW costs n

min( z ) = βK w ( C k , Ri , p k ) + ∑ ℑC ( Ri , p k ) k =1

such that ∀1 ≤ n

∑( K ( C k =1

e

k

k

k≤n

, Ri , p k ) + K l ( C k , Ri , pk )

+ K eco ( C k , Ri , p k ) + K d ( C k , Ri , p k )) ≤ KMAX max( t kd,i ) − min( t k0 ) ≤ TMAX i ,k

k

( Ri , pk ) ∈ ℜ( C k )

Managed by

Cost function

ℑCk ( Ri , p k ) = α k ( K e ( C k , Ri , p k ) + K l ( C k , Ri , p k )) + γ k ( K eco ( C k , Ri , p k )) + δ k ( K d ( C k , Ri , p k ))

α k , β ,γ k ,δ k ≥ 0 αk + β + γ k + δk > 0

Managed by

Free parameters αk, β, γk, δk

Minimize turn-around time: β= 1, KMAX=∞, αk γk, δk = 0 ∀k

Minimize hardware costs: β= 0, TMAX=∞, αk γk, δk ≥ 0

Managed by

CPU costs Ke t ie,k

K e ( C k , Ri , pk ) = ∫ k e ( C k , Ri , pk ,ϕ ,t )dt t is,k

priority

t ks,i

t ke,i

Managed by

License fees Kl

t ie,k

K l ( C k , Ri , p k ) = ∫ k l ( C k , Ri , p k ,t )dt t is,k

Managed by

Costs of turn-around time Kw max t kd,i

K w (Ck , Ri , pk ) =

k

∫k

w

(t )dt

min t k0 k

salary

t k0

time-to-market

t kd,i

t k0

t kd,i

Managed by

Energy costs t ke ,i

K eco (Ck , Ri , pk ) = ∫ keco (Ck , Ri , pk , t )dt t ks ,i

no frequency adaptation

t ks,i

t ke,i

with frequency adaptation

t ks,i

t ke,i

Managed by

Epilogue

Ganglia

VAMOS

Broker

SI dataWarehouse

Accounting

Managed by

Pleiades: Ganglia data

Γ50% =0.64 =0.82

I/O or MPI ?

To machine with faster communication

E

Managed by

VAMOS: single job profiles

CFD

Plasma physics

Managed by

First VIOLA/UNICORE/Meta-Scheduler/ISS testbeds

UNICORE VIOLA MSS

LIN-EPFL: Pleiades1 Pleiades2 Pleiades3

EPFL: SMP/NUMA High γm cluster

UNICORE VIOLA

Switch

ISS DIT-EPFL: Mizar Condor NOW BlueBrain

ETHZ: SMP/NUMA High γm cluster

MSS ISS

CERN: egee Grid EIF: NoW

CSCS: SMP/vector Low γm cluster

12.2006: EPFL HPCN installations 12.2007: Swiss HPCN Grid initiative

Managed by

Outlook: Simulator

Understanding the behavior of the simulated grid depending of the value of the free parameters.

• Usage of old executions data on 2 departemental clusters – each job was monitored. Data stored in a mysqlDB – Mapping of ganglia and localscheduler (torque) info (VAMOS service) • simulation of the real situation with UNICORE middeware and the VIOLA MSS simulated on top of it • Using the real job traces. • training of the system (broker service) . • Metric used to show the improvement of the grid is the utilization of the machines.

Managed by

Outlook: Component dependency • Each component of a workflow is executed on the well suited machine

• Hard problem : need new ideas • Co-allocation is now made manually. • With ISS in the future : automatically.

Managed by

Publications Pierre Manneback, Guy Bergère, Nahid Emad, Ralf Gruber, Vincent Keller, Pierre Kuonen, Sébastien Noël, and Serge Petiton (2005), "Towards a scheduling policy for hybrid methods on computational Grids", CoreGRID Meeting, Pisa (28-30 November, 2005), to appear in Lecture Notes in Computer Sciences (Springer) Ralf Gruber, Vincent Keller, Pierre Kuonen, Marie-Christine Sawley, Basile Schaeli, Ali Tolou, Marc Torruella, and Trach-Minh Tran (2005), "Intelligent GRID Scheduling System", PPAM 2005, Poznan, Poland, Lecture Notes in Computer Sciences (Springer) 3911, p. 751-757 Vincent Keller, Kevin Christiano, Ralf Gruber, Pierre Kuonen, Sergio Maffioletti, Nello Nellari, MarieChristine Sawley, Trach-Minh Tran, Philipp Wieder, and Wolfgang Ziegler, "Integration of ISS into the VIOLA Meta-Scheduling Environment", CoreGRID Meeting, Pisa (28-30 November, 2005), to appear in Lecture Notes in Computer Sciences (Springer) Ralf Gruber, Pieter Volgers, Alessandro De Vita, Massimiliano Stengel, and Trach-Minh Tran, "Parameterisation to tailor commodity clusters to applications", Future Generation Computer Systems 19 (2003) 111-120 Ralf Gruber, Vincent Keller, Michela Thiémard, Oliver Wäldrich, Philipp Wieder, Wolfgang Ziegler, and Pierre Manneback, "Integration of Grid Cost Model into ISS/VIOLA Meta-Scheduler environment", (2006) to appear.

Managed by

THANK YOU

Suggest Documents