High Performance Computing at CS@UPB

Dr. rer. nat. Emil Slusanschi ([email protected])
University Politehnica of Bucharest
Computer Science and Engineering Department
www.acs.pub.ro

IBM Workshop for Universities
Hotel Crowne Plaza, Bucharest, November 25th 2008


HPC Infrastructure @ UPB


Infrastructure – 1

• RO-03-UPB Site
• New node additions: 66 P4 HT + 32 dual Xeon + 32 dual quad-core Xeon = 386 physical CPU cores
• Storage of 4TB

Infrastructure – 2

• Two 42U Racks – IBM eServer BladeCenter H Chassis
• 32 HS21 Blades with dual Intel Xeon Quad Core E5420 – Penryn
  – Power: 80W
  – Frequency: 2.50GHz
  – FSB: 1333MHz
  – L2 Shared Cache: 12MB
• Main Memory: 16GB RAM DDR2 – FBDIMM 667MHz ECC CL5
• Interconnect: 2x1Gb
• Local Node Storage: 146GB SAS

[Rack layout diagram: two 42U racks holding the BladeCenter H chassis with BladeServer blades, GbE switch modules, power supplies and fans, plus an x3550 management node, an x3650 storage node with DS3400/EXP storage, a console switch with keyboard and monitor, and 6U UPS units]

Infrastructure – 3

• Under development:
  – Storage: 10TB – two Dell PowerVault MD1000 systems
  – Four IBM Blade Center QS22 – to be mounted in the H Chassis:
    • Processors: dual 3.2 GHz IBM PowerXCell 8i processors (one PPE and 8 SPEs / proc)
    • Main Memory: 8GB
    • L2 Cache: 512KB on the IBM PowerXCell 8i processor and 256KB Local Store on each eDP SPE
    • Networking: Dual Gigabit Ethernet

  – Funding is secured through a national CDI Grant: “GEEA – Sistem GRID multi-corE de înaltă pErformanţă pentru suportul Aplicaţiilor complexe” (a high-performance multi-core GRID system supporting complex applications) for a 10Gb-based cluster of 96+ dual quad-core Blades with 16GB RAM/Blade

Infrastructure – 4

• Only 32 nodes were used for Grid Activities
• Work in progress: all nodes will be allocated to grid-related activities
  – Gateway for each type of node:
    • gw01.seegrid.grid.pub.ro is up and running, gw02 is up, gw03 is up

  – RO-03-UPB supports different kinds of VOs
    • From WLCG: Alice – no jobs have been run yet
    • From SEEGRID:
      – seegrid (SEEGRID-SCI)
      – sgdemo – training VO
      – gridmosi – national research VO
      – environ – SEEGRID VO (after Aug 30th 2008)


Infrastructure – 5

• For the first time, students have access to a batch system
• GridMOSI CA – national VO
  – CA designed for fast access to the gridmosi VO for students
  – Limited life of certificates

• fep.grid.pub.ro – gateway to the NCIT Batch system but also to the grid (UI)
  – Experiments with the GridWay meta-scheduler for job-level parallelism (not in “production” yet)
  – A different kind of parallelism, not at function level but at the level of jobs (DAGs and so on)

Infrastructure – 6

• For the first time we can support a do-it-yourself gLite site
• Most central node types are also here at RO-03
  – CE, SE, MON, UI – classic node types
  – Top-Level BDII, VOBOX (Alice), WMS, MyProxy
• Allows students to add sites to the Top-Level BDII, to be seen only in the GridMOSI VO
• Experiment with different schedulers and middleware components without shutting down the site

• Planned HW upgrades:
  – IBM QS22 Dual CellBE Enhanced DP
  – Upgrade cluster interconnection network to 10Gb



Applications @ UPB

HPC Applications CS@UPB

• Porting and development of applications for IBM CellBE processors – CellGAF
• Numerical Simulation of Earthquakes in the Vrancea Region – INFP/NIPNE (Earth Sciences)
• Modeling and Simulation of Aerodynamic Phenomena – INCAS (Aerospace Research)
• Weather Prediction Models: COSMO & HRM – ANM (Meteorology)
• Atomic Scale Simulation and Magnetic Characterization of Finite Systems in Material Science – ICF (Physical Chemistry)
• Nanotubes and systems of nanotubes (Computational Physics – TU Iasi)


CellGAF – A Genetic Algorithms Framework for the Cell BE

• Create genetic algorithms using a simple yet efficient C++ programming interface
• Test & tune algorithm performance with a complete set of optimized functions
• Debug algorithms and collect statistics
• Control and monitor the state of execution through a remote interface on a PC
• Use multiple Cell machines on a local network or over the Internet to improve running time, or to solve larger problems
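For illustration only, the sketch below shows the kind of generational loop (fitness evaluation, combination, mutation, selection) that a framework of this sort runs for the user and, on the Cell, farms out to the SPUs. Every name and the toy fitness function are hypothetical and not part of the actual CellGAF interface:

    // Minimal, self-contained sketch of the generational loop that a GA
    // framework like CellGAF conceptually runs for the user. Every name and
    // the toy fitness function below are illustrative, not CellGAF code.
    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <vector>

    struct Individual {
        std::vector<double> genes;
        double fitness = 0.0;
    };

    // Toy fitness: maximize -(sum of squares); the optimum is all-zero genes.
    static double evaluate(const std::vector<double>& g) {
        double s = 0.0;
        for (double x : g) s += x * x;
        return -s;
    }

    static double randUnit() { return 2.0 * std::rand() / RAND_MAX - 1.0; }

    int main() {
        const int popSize = 64, nGenes = 8, generations = 200;
        std::vector<Individual> pop(popSize);
        for (auto& ind : pop) {                       // random initial population
            ind.genes.resize(nGenes);
            for (auto& g : ind.genes) g = randUnit();
            ind.fitness = evaluate(ind.genes);
        }

        auto byFitness = [](const Individual& a, const Individual& b) {
            return a.fitness > b.fitness;
        };

        for (int gen = 0; gen < generations; ++gen) {
            std::sort(pop.begin(), pop.end(), byFitness);   // truncation selection
            // Refill the bottom half by combining and mutating the survivors;
            // on the Cell, this evaluation work is what would be offloaded to SPUs.
            for (int i = popSize / 2; i < popSize; ++i) {
                const Individual& p1 = pop[std::rand() % (popSize / 2)];
                const Individual& p2 = pop[std::rand() % (popSize / 2)];
                for (int k = 0; k < nGenes; ++k) {
                    pop[i].genes[k] = (k % 2 ? p1 : p2).genes[k];              // combination
                    if (std::rand() % 10 == 0) pop[i].genes[k] += 0.1 * randUnit();  // mutation
                }
                pop[i].fitness = evaluate(pop[i].genes);
            }
        }
        std::sort(pop.begin(), pop.end(), byFitness);
        std::cout << "best fitness: " << pop.front().fitness << "\n";
        return 0;
    }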

CellGAF – Job Shop Scheduling

• NP-Hard generalization of the traveling salesman problem
• Job scheduling of N jobs on M machines to minimize time

Test                   SPU           PPU           PC
Compute Value          1.58 ns       2.21 ns       1.97 ns
Mutation               2.05 ns       3.99 ns       4.30 ns
Combination            6.87 ns       6.48 ns       10.53 ns
10 jobs, 5 machines    2080.50 ms    6945.14 ms    15836.92 ms
15 jobs, 10 machines   4918.17 ms    26480.31 ms   41250.64 ms

• Performance increase of 5 to 9 times
• Genetic operators up to 500% faster


Numerical Simulations of Earthquakes in the Vrancea Region

Modeling and Simulation of Aerodynamic Phenomena


• Preprocessing & grid generation – METIS/ParMETIS

• In-house (INCAS) developed solvers: Euler, Laplace (CS), Navier-Stokes
  – Porting to new (production) systems
  – Tuning & improving serial performance
  – Parallelization (MPI/OpenMP): 3-5x speedup (a generic sketch follows below)

• Postprocessing – ParaView
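To make the parallelization step above concrete, a typical shared-memory kernel in a Laplace-type solver is a Jacobi sweep over the grid. The sketch below is a generic OpenMP version of that pattern, with an assumed grid size and boundary condition; it is not taken from the INCAS solvers:

    // Generic OpenMP Jacobi sweep for a 2D Laplace problem, illustrating the
    // loop-level parallelization pattern mentioned above (not the actual
    // INCAS solver; grid size and boundary condition are assumed).
    #include <cmath>
    #include <cstdio>
    #include <vector>

    int main() {
        const int N = 512;                                   // grid size (assumed)
        std::vector<double> u(N * N, 0.0), unew(N * N, 0.0);
        for (int j = 0; j < N; ++j) u[j] = unew[j] = 1.0;    // fixed value on the top edge

        double diff = 1.0;
        for (int iter = 0; iter < 1000 && diff > 1e-12; ++iter) {
            diff = 0.0;
            // Rows are distributed over threads; the sum-reduction combines
            // each thread's local residual contribution into the global one.
            #pragma omp parallel for reduction(+ : diff)
            for (int i = 1; i < N - 1; ++i) {
                for (int j = 1; j < N - 1; ++j) {
                    unew[i * N + j] = 0.25 * (u[(i - 1) * N + j] + u[(i + 1) * N + j] +
                                              u[i * N + j - 1] + u[i * N + j + 1]);
                    const double d = unew[i * N + j] - u[i * N + j];
                    diff += d * d;
                }
            }
            u.swap(unew);
        }
        std::printf("final residual (L2 norm): %g\n", std::sqrt(diff));
        return 0;
    }

An MPI decomposition applies the same loop structure per subdomain, with halo exchanges of boundary rows between neighbouring processes.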


Weather Prediction Models: COSMO & HRM


• COSMO – Consortium for Small-scale Modeling (DWD), used in Romania (.ro) since 2005
  – compressible non-hydrostatic model
  – 81x73 grid, 14km resolution (can be refined to 2.8km)

• HRM – High Resolution Regional Model (DWD)
  – hydrostatic model
  – 301x201 grid, 32 vertical layers, 20km resolution

• Tasks
  – Port the models to new (production) systems – done
  – Profile and improve serial performance & memory management
  – Profile and improve existing MPI parallelization
  – Implement OpenMP / hybrid parallelization schemes where appropriate (see the skeleton below)
  – 2D and 3D visualization enhancements
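A hybrid scheme of the kind listed above usually means one MPI rank per node plus OpenMP threads within each rank. The skeleton below only illustrates that initialization pattern in generic C++; the COSMO and HRM models themselves are Fortran codes, so none of this is taken from them:

    // Generic hybrid MPI + OpenMP skeleton: one MPI rank per node with OpenMP
    // threads inside each rank. Illustrative only; not COSMO/HRM code.
    #include <mpi.h>
    #include <omp.h>
    #include <cstdio>

    int main(int argc, char** argv) {
        // Request a threading level where only the master thread makes MPI calls.
        int provided = 0;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

        int rank = 0, size = 1;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        // Each rank would own one horizontal subdomain of the model grid;
        // the OpenMP threads then share the loops over that subdomain.
        #pragma omp parallel
        {
            #pragma omp single
            std::printf("rank %d of %d running %d threads\n",
                        rank, size, omp_get_num_threads());

            // ... threaded physics/dynamics loops over the local subdomain ...
        }

        // Halo exchange with neighbouring ranks (master thread only under
        // MPI_THREAD_FUNNELED) would go here, e.g. MPI_Sendrecv of boundary rows.

        MPI_Finalize();
        return 0;
    }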

Magnetic Characterization of Finite Systems in Material Science

• Paramagnetic materials simulation
  – OpenMP parallelization (CS): speedup of 8.3x on 8 procs – superlinear because each thread's share of the data fits better in cache

• GAMESS – original MPI program
  – Speedup of 6.3x on 8 procs


Nanotubes and systems of nanotubes

• Hysteresis phenomenon, 1 nanotube
  – Serial run: 2165s
  – Optimized serial run: 9s
  – Optimized parallel* run: 3.8s
  – Total speedup: 569x

• Systems of nanotubes
  – 100x100 tubes
  – Serial run: 350s
  – Optimized serial run: 17s
  – Optimized parallel* run: 10s
  – Total speedup: 35x

*Dual Penryn quad-core → 8 cores


Training Events, Diploma Theses & Future Collaborations CS@UPB


Training Events for Students

• Grid Initiative Summer School: http://gridinitiative.ncit.pub.ro
• The first GridInit was held in 2004
• Usually focused on grid middleware tasks, but this year the main focus was HPC applications

Grid Workshops

• Hipergrid Workshop: http://hipergrid.grid.pub.ro
• 2nd Edition of the Grid Middleware Workshop
  – Topics: grid scheduling, fault tolerance and data replication, grid security, performance prediction and others
  – 21-22 November 2008 in Bucharest, Romania
  – 1st Edition: April 2007 in Sibiu, Romania


Diploma Theses in CS@UPB

• CellGAF – A Genetic Algorithms Framework for the Cell BE (UPB)
• Numerical methods for reducing the computational effort in numerically solving the eigenvalue/eigenvector problem for large symmetric matrices (ICF)
• Reducing computation time in optimization procedures by parallelizing subroutines (ICF)
• High-performance uniform random number generators – Monte Carlo computations (ICF)
• The wave equation for acoustics – 2D or 3D (INCAS)
• Heat transfer simulation with Laplace/Poisson-type equations (INCAS)
• Galaxy formation – exploring the Cold Dark Matter theory with N-body techniques (UPB)
• Feature extraction from satellite images (UPB)
• Parallelization methods for molecular dynamics programs (UPB)
• Optimization of the MOPAC and GAMESS quantum chemistry programs on parallel systems (ICF)
• GUI for building and visualizing molecular structures (ICF)
• Simulation, generation, verification and reconstruction of seismic events (IFIN/NIPNE)
• Determining the median plane passing through the center of mass in the Vrancea seismic zone (IFIN/NIPNE)
• The COSMO regional weather forecast model (ANM)
• The HRM regional weather forecast model (ANM)
• Elastodynamics simulation with simple 3D geometries (INCAS)
• The system of equations of compressible gas dynamics (INCAS)

Future CS@UPB Collaborations

• IBM TJ Watson Labs collaborations


  – PhD student & post-doc opportunities for CS@UPB personnel
  – Joint software development projects on Cell-related topics:
    • Image processing frameworks
    • Interactive body physics in immersive multi-user reality systems (e.g. MMOG – Massively Multiplayer Online Gaming)

• IBM Faculty Awards 2009 winner (unofficial announcement on Monday)
  – CellGAF – A Genetic Algorithms Framework for the Cell Broadband Engine

• Gedae & RapidMind development frameworks


Thank you for your attention
Q&A

www.acs.pub.ro
[email protected]
