High Performance Computing at CS@UPB
Dr. rer. nat. Emil Slusanschi ([email protected])
University Politehnica of Bucharest
Computer Science and Engineering Department
www.acs.pub.ro
IBM Workshop for Universities
Hotel Crowne Plaza, Bucharest, November 25th 2008
HPC Infrastructure @ UPB
Infrastructure – 1
• RO-03-UPB Site
• New node additions: 66 P4 HT + 32 dual Xeon + 32 dual quad-core Xeon = 386 physical CPU cores
• Storage: 4TB
Infrastructure – 2
[Rack layout diagram: two 42U racks holding BladeCenter chassis with 14 blade bays each, 6U UPS units, GbE switches, an x3550 management node, an x3650 storage node, and a DS3400 storage array]
• Two 42U Racks – IBM eServer BladeCenter H Chassis
• 32 HS21 Blades with dual Intel Xeon Quad Core E5420 – Penryn
  – Power: 80W
  – Frequency: 2.50GHz
  – FSB: 1333MHz
  – L2 Shared Cache: 12MB
• Main Memory: 16GB RAM DDR2 – FBDIMM 667MHz ECC CL5
• Interconnect: 2x1Gb
• Local Node Storage: 146GB SAS
Infrastructure – 3
• Under development:
  – Storage: 10TB – two Dell PowerVault MD1000 systems
  – Four IBM Blade Center QS22 – to be mounted in the H Chassis:
    • Processors: Dual 3.2 GHz IBM PowerXCell 8i Processors (one PPE and 8 SPEs / proc)
    • Main Memory: 8GB
    • L2 Cache: 512KB on the IBM PowerXCell 8i Processor and 256KB Local Store on each eDP SPE
    • Networking: Dual Gigabit Ethernet
  – Funding is secured through a national CDI Grant: “GEEA – Sistem GRID multi-corE de înaltă pErformanţă pentru suportul Aplicaţiilor complexe” (a high-performance multi-core GRID system supporting complex applications) for a 10Gb-based cluster of 96+ dual quad-core Blades with 16GB RAM/Blade
Infrastructure – 4
• Only 32 nodes were used for Grid Activities
• Work in progress: all nodes will be allocated to grid-related activities
  – Gateway for each type of node:
    • gw01.seegrid.grid.pub.ro is up and running; gw02 and gw03 are up
  – RO-03-UPB supports different kinds of VOs
    • From WLCG: Alice – no jobs have been run yet
    • From SEEGRID:
      – seegrid (SEEGRID-SCI)
      – sgdemo – training VO
      – gridmosi – national research VO
      – environ – SEEGRID VO (after Aug 30th 2008)
Infrastructure – 5
• For the first time, students have access to a batch system
• GridMOSI CA – national VO
  – CA designed for fast access to the gridmosi VO for students
  – Limited lifetime of certificates
• fep.grid.pub.ro – gateway to the NCIT batch system but also to the grid (UI)
  – Experiments with the GridWay meta-scheduler for job-level parallelism (not in “production” yet)
  – A different kind of parallelism: not at function level but at job level (DAGs and so on)
Infrastructure – 6
• For the first time we can support a do-it-yourself gLite site
• Most central node types are also here at RO-03
  – CE, SE, MON, UI – classic node types
  – Top-Level BDII, VOBOX (Alice), WMS, MyProxy
• Allows students to add sites to the Top-Level BDII, visible only in the GridMOSI VO
• Experiment with different schedulers and middleware components without shutting down the site
• Planned HW upgrades:
  – IBM QS22 Dual CellBE Enhanced DP
  – Upgrade the cluster interconnection network to 10Gb
Applications @ UPB
HPC Applications CS@UPB
• Porting and development of applications for IBM CellBE processors – CellGAF
• Numerical Simulation of Earthquakes in the Vrancea Region – INFP/NIPNE (Earth Sciences)
• Modeling and Simulation of Aerodynamic Phenomena – INCAS (Aerospace Research)
• Weather Prediction Models: COSMO & HRM – ANM (Meteorology)
• Atomic Scale Simulation and Magnetic Characterization of Finite Systems in Material Science – ICF (Physical Chemistry)
• Nanotubes and systems of nanotubes (Computational Physics – TU Iasi)
CellGAF – A Genetic Algorithms Framework for the Cell BE
• Create genetic algorithms using a simple, yet efficient C++ programming interface
• Test & tune algorithm performance using a complete set of optimized functions
• Debug algorithms and collect statistics
• Control and monitor the state of execution through a remote interface on a PC
• Use multiple Cell machines over a local network or the Internet to improve running time or to solve larger problems
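The slide describes CellGAF's interface only at a high level. As a rough illustration of the kind of loop such a framework drives, here is a hedged, single-threaded sketch built around the three operators benchmarked on the next slide (value computation, mutation, combination) — all names and the toy fitness function are invented for illustration and are not CellGAF's actual API:

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// A genome is just a vector of integer genes in this toy example.
using Genome = std::vector<int>;

// "Compute Value": toy fitness, the sum of the genes.
int computeValue(const Genome& g) {
    int v = 0;
    for (int x : g) v += x;
    return v;
}

// "Mutation": randomly replace one gene.
void mutate(Genome& g) {
    g[std::rand() % g.size()] = std::rand() % 100;
}

// "Combination": one-point crossover of two parents.
Genome combine(const Genome& a, const Genome& b) {
    Genome child(a.begin(), a.begin() + a.size() / 2);
    child.insert(child.end(), b.begin() + b.size() / 2, b.end());
    return child;
}

// One generation: keep the fitter half, refill by crossover + mutation.
void step(std::vector<Genome>& pop) {
    std::sort(pop.begin(), pop.end(),
              [](const Genome& a, const Genome& b) {
                  return computeValue(a) > computeValue(b);
              });
    size_t half = pop.size() / 2;
    for (size_t i = half; i < pop.size(); ++i) {
        pop[i] = combine(pop[i - half], pop[(i - half + 1) % half]);
        mutate(pop[i]);
    }
}
```

On the Cell, the point of a framework like CellGAF is that these per-individual operators are exactly the parts worth offloading to the SPEs; the sketch above only shows the algorithmic shape, not the offload.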
CellGAF – Job Shop Scheduling
• NP-hard generalization of the traveling salesman problem
• Schedule N jobs on M machines so as to minimize the total time

Genetic algorithm performance – Cell (SPU) vs. PPU vs. PC:

Test                   SPU          PPU          PC
Compute Value          1.58 ns      2.21 ns      1.97 ns
Mutation               2.05 ns      3.99 ns      4.30 ns
Combination            6.87 ns      6.48 ns      10.53 ns
10 jobs, 5 machines    2080.50 ms   6945.14 ms   15836.92 ms
15 jobs, 10 machines   4918.17 ms   26480.31 ms  41250.64 ms

• Overall performance increase: 5 to 9 times
• Genetic operators up to 500% faster
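For job shop, the "Compute Value" operator amounts to evaluating a schedule's makespan. Below is a minimal sketch of such an evaluation, assuming jobs are given as ordered (machine, duration) operation lists and the chromosome is a job-dispatch sequence — a common job-shop encoding, not necessarily the one CellGAF uses:

```cpp
#include <algorithm>
#include <vector>

// One operation of a job: which machine it needs and for how long.
struct Op { int machine; int duration; };

// Makespan of a schedule: dispatch each job's next operation in the
// order given by the chromosome; an operation starts as soon as both
// its job and its machine are free.
int makespan(const std::vector<std::vector<Op>>& jobs,
             const std::vector<int>& chromosome, int numMachines) {
    std::vector<int> jobReady(jobs.size(), 0);    // when each job can continue
    std::vector<int> machReady(numMachines, 0);   // when each machine frees up
    std::vector<size_t> nextOp(jobs.size(), 0);   // next operation per job
    int finish = 0;
    for (int j : chromosome) {
        const Op& op = jobs[j][nextOp[j]++];
        int start = std::max(jobReady[j], machReady[op.machine]);
        int end = start + op.duration;
        jobReady[j] = machReady[op.machine] = end;
        finish = std::max(finish, end);
    }
    return finish;
}
```

This evaluation runs once per individual per generation, which is why speeding it up on the SPUs dominates the overall runtime gains in the table.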
Numerical Simulations of Earthquakes in the Vrancea Region
Modeling and Simulation of Aerodynamic Phenomena
• Preprocessing & grid generation – METIS/ParMETIS
• In-house (INCAS) developed solvers: Euler, Laplace (CS), Navier-Stokes
  – Porting to new (production) systems
  – Tuning & improving serial performance
  – Parallelization (MPI/OpenMP): 3-5x
• Postprocessing – ParaView
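The INCAS solvers themselves are in-house codes not shown here; the sketch below only illustrates the general loop-level OpenMP pattern behind the quoted 3-5x parallelization gains, applied to a Jacobi relaxation for the 2D Laplace equation (the simplest of the three solver families listed):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

using Grid = std::vector<std::vector<double>>;

// One Jacobi sweep: each interior point relaxes toward the average of
// its four neighbors. Rows are independent, so the outer loop can be
// distributed across threads with a single pragma.
void jacobiStep(const Grid& u, Grid& v) {
    int n = (int)u.size();
#pragma omp parallel for
    for (int i = 1; i < n - 1; ++i)
        for (int j = 1; j < (int)u[i].size() - 1; ++j)
            v[i][j] = 0.25 * (u[i-1][j] + u[i+1][j] + u[i][j-1] + u[i][j+1]);
}

// Relax until consecutive iterates change by less than tol.
// Boundary values are never written, so they act as fixed conditions.
void solveLaplace(Grid& u, double tol, int maxIter) {
    Grid v = u;
    for (int it = 0; it < maxIter; ++it) {
        jacobiStep(u, v);
        double diff = 0.0;
        for (size_t i = 1; i + 1 < u.size(); ++i)
            for (size_t j = 1; j + 1 < u[i].size(); ++j)
                diff = std::max(diff, std::abs(v[i][j] - u[i][j]));
        u.swap(v);
        if (diff < tol) break;
    }
}
```

The same pattern — profile the hot sweep, then parallelize its independent outer loop — is what "Tuning & improving serial performance" followed by "Parallelization (MPI/OpenMP)" amounts to in practice.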
Weather Prediction Models: COSMO & HRM
• COSMO – Consortium for Small-scale Modeling (DWD), since 2005 in .ro
  – compressible non-hydrostatic model
  – 81x73 grid, 14km resolution (can be refined to 2.8km)
• HRM – High resolution Regional Model (DWD)
  – hydrostatic model
  – 301x201 grid, 32 vertical layers, 20km resolution
• Tasks:
  – Port the models to new (production) systems – done
  – Profile and improve serial performance & memory management
  – Profile and improve the existing MPI parallelization
  – Implement OpenMP / hybrid parallelization schemes where appropriate
  – 2D and 3D visualization enhancements
Magnetic Characterization of Finite Systems in Material Science
• Paramagnetic materials simulation
  – OpenMP parallelization (CS): speedup 8.3x on 8 procs – superlinear due to improved cache performance
• GAMESS – original MPI program
  – Speedup: 6.3x on 8 procs
Nanotubes and systems of nanotubes
• Hysteresis phenomenon – 1 nanotube
  – Serial run: 2165s
  – Optimized serial run: 9s
  – Optimized parallel* run: 3.8s
  – Total Speedup: 569x
• Systems of nanotubes
  – 100x100 tubes
  – Serial run: 350s
  – Optimized serial run: 17s
  – Optimized parallel* run: 10s
  – Total Speedup: 35x
*Dual Penryn quad-core → 8 cores
Training Events, Diploma Theses & Future Collaborations CS@UPB
Training Events for students
• Grid Initiative Summer School: http://gridinitiative.ncit.pub.ro
• The first GridInit was held in 2004
• Usually focused on grid middleware tasks, but this year the main focus was HPC Applications
Grid Workshops
• Hipergrid Workshop: http://hipergrid.grid.pub.ro
• 2nd Edition of the Grid Middleware Workshop
  – Topics: grid scheduling, fault-tolerance and data replication, grid security, performance prediction and others
  – 21-22nd November 2008 in Bucharest, Romania
  – 1st Edition: April 2007 in Sibiu, Romania
Diploma Theses in CS@UPB
• CellGAF – A Genetic Algorithms Framework for the Cell BE (UPB)
• Numerical methods for reducing the computational effort in the numerical solution of the eigenvalue and eigenvector problem for large symmetric matrices (ICF)
• Reducing computation time in optimization procedures by parallelizing subroutines (ICF)
• High-performance generators of uniform random numbers – Monte Carlo computations (ICF)
• The wave equation for acoustics – 2D or 3D (INCAS)
• Heat transfer simulation with Laplace/Poisson-type equations (INCAS)
• Galaxy Formation – exploring the Cold Dark Matter theory through the N-Body technique (UPB)
• Feature extraction from satellite images (UPB)
• Parallelization methods for molecular dynamics programs (UPB)
• Optimizing the MOPAC and GAMESS quantum chemistry programs on parallel systems (ICF)
• A GUI for building and representing molecular structures (ICF)
• Simulation, generation, verification and reconstruction of seismic events (IFIN/NIPNE)
• Determining the median plane through the center of mass in the Vrancea seismic zone (IFIN/NIPNE)
• The COSMO regional weather prediction model (ANM)
• The HRM regional weather prediction model (ANM)
• Elastodynamics simulation with simple 3D geometries (INCAS)
• The compressible gas dynamics equation system (INCAS)
(Thesis titles translated from Romanian.)
Future CS@UPB Collaborations
• IBM TJ Watson Labs collaborations
  – PhD student & Post Doc opportunities for CS@UPB personnel
  – Joint software development projects on Cell-related topics:
    • Image processing frameworks
    • Interactive-body-physics in immersive multi-user reality systems (i.e. MMOG – Massive Multiplayer Online Gaming)
• IBM Faculty Awards 2009 winner (unofficial announcement on Monday)
  – CellGAF – A Genetic Algorithms Framework for the Cell Broadband Engine
• Gedae & RapidMind development frameworks
Thank you for your attention
Q&A
www.acs.pub.ro
[email protected]