JaceP2P: an Environment for Asynchronous Computations on Peer-to-Peer Networks
J. Bahi, R. Couturier, P. Vuillemin

AND team (Distributed Numerical Algorithms) Laboratoire d’Informatique de l’université de Franche-Comté (LIFC) 28 September 2006

J. Bahi, R. Couturier, P. Vuillemin

HeteroPar’06, Barcelona (Spain)

28 September 2006

Introduction

Motivations
Scientific context: iterative methods
⇒ approximate results at each iteration
⇒ communications/synchronizations after each iteration
Execution context: Peer-to-Peer (P2P) computing
⇒ used for file sharing, possibility for distributed computing
⇒ dynamic infrastructures
⇒ decentralized organization
⇒ heterogeneity of processors and networks
⇒ communications between computing nodes
Because of synchronizations, disconnections cause a lot of idle time

Our solution

JaceP2P: the P2P version of JACE (Java Asynchronous Computation Environment)
⇒ programming and execution environment for iterative applications
⇒ based on the asynchronous iteration model
⇒ based on cycle stealing
⇒ allows node disconnections
⇒ allows communications between peers
⇒ decentralized organization

Outline

1. Parallel iterative algorithms
2. The JaceP2P environment
3. Experimentations with JaceP2P
Conclusion and future works

1. Parallel iterative algorithms

1.1. Classification

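Synchronous Iterations, Synchronous Communications (SISC)
[Figure: execution timeline of Processor 1 and Processor 2 under SISC]

Synchronous Iterations, Asynchronous Communications (SIAC)
[Figure: execution timeline of Processor 1 and Processor 2 under SIAC]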

Asynchronous Iterations, Asynchronous Communications (AIAC)
[Figure: execution timeline of Processor 1 and Processor 2 under AIAC]

Processors can compute different iterations at a given time t
No synchronization between two iterations ⇒ no idle time
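The difference between the synchronous and asynchronous schemes can be illustrated with a back-of-the-envelope model. The sketch below is not taken from the slides: the function names and the per-iteration compute times are hypothetical, and communication time is ignored. It only contrasts waiting at a barrier after each iteration with never waiting.

```python
def sisc_idle_time(compute_times, iterations):
    """Idle time per processor when every iteration ends with a barrier.

    Each iteration lasts as long as the slowest processor needs, so a
    faster processor waits for the difference at every iteration.
    """
    step = max(compute_times)
    return [iterations * (step - t) for t in compute_times]


def aiac_iterations(compute_times, wall_clock):
    """Iterations completed per processor when nobody ever waits."""
    return [int(wall_clock // t) for t in compute_times]


# Hypothetical per-iteration compute times (in seconds) of two processors.
times = [1.0, 2.5]

idle = sisc_idle_time(times, iterations=10)    # seconds spent waiting
done = aiac_iterations(times, wall_clock=25.0)  # iterations in 25 s
print(idle, done)
```

In this model the fast processor spends more time waiting than computing under SISC, whereas under AIAC it simply performs more iterations than its neighbor in the same wall-clock time.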

1.2. Conclusions about asynchronism
Number of iterations generally greater
Warning: convergence must be ensured!
BUT
All idle times suppressed
Communications overlapped by computations
Execution time considerably reduced, especially when nodes are geographically distant
Tolerant to long message delays
Tolerant to message loss
Tolerant to processor heterogeneity
Neighbors do not stop when disconnections occur
⇒ Adapted to the P2P computing context
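The trade-off above (more iterations, but no waiting) can be observed on a toy linear system. The following is a minimal sketch, not the JaceP2P implementation: the 2×2 system, the `periods` parameter, and all function names are assumptions made for illustration. The matrix is strictly diagonally dominant, a classical sufficient condition for asynchronous Jacobi-type iterations to converge. The global residual test is a simplification; detecting convergence in a real distributed setting is harder.

```python
def jacobi_update(A, b, x, i):
    """New value of component i computed from the (possibly stale) vector x."""
    off_diagonal = sum(A[i][j] * x[j] for j in range(len(x)) if j != i)
    return (b[i] - off_diagonal) / A[i][i]


def residual(A, b, x):
    """Largest absolute entry of b - A x."""
    return max(abs(b[i] - sum(A[i][j] * x[j] for j in range(len(x))))
               for i in range(len(b)))


def synchronous(A, b, tol=1e-10, max_sweeps=10_000):
    """Classic Jacobi: every component is refreshed at every sweep."""
    x = [0.0] * len(b)
    for sweep in range(1, max_sweeps + 1):
        x = [jacobi_update(A, b, x, i) for i in range(len(b))]
        if residual(A, b, x) < tol:
            return x, sweep
    raise RuntimeError("no convergence")


def asynchronous(A, b, periods, tol=1e-10, max_ticks=10_000):
    """Component i is refreshed only every periods[i] ticks.

    Between two refreshes the other components keep reading its old
    value, which mimics a slow processor or a delayed message.
    """
    x = [0.0] * len(b)
    updates = [0] * len(b)
    for tick in range(1, max_ticks + 1):
        for i, period in enumerate(periods):
            if tick % period == 0:
                x[i] = jacobi_update(A, b, x, i)
                updates[i] += 1
        if residual(A, b, x) < tol:
            return x, updates
    raise RuntimeError("no convergence")


# Strictly diagonally dominant system whose exact solution is (1, 1).
A = [[4.0, -1.0], [-2.0, 5.0]]
b = [3.0, 3.0]

x_sync, sweeps = synchronous(A, b)
x_async, updates = asynchronous(A, b, periods=[1, 3])  # second component 3x slower
print(sweeps, updates)
```

Both runs reach the same solution. In this toy setup the fast component performs more updates than the synchronous run needs sweeps, but it never waits for the slow one.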

2. The JaceP2P environment

2.1. General presentation

Programming and execution environment on P2P networks
Decentralized platform ("hybrid P2P" topology)
Designed for asynchronous iterative applications
Multithreaded environment: communications overlapped by computations
Developed in Java (portability), RMI for communications (message passing paradigm)
Fault-tolerant environment (checkpoint mechanisms)
Direct communications between peers

2.2. The JaceP2P architecture
3 types of entities: Daemons, Super-Nodes, Spawners
Daemons (the computing peers)
⇒ execute computation tasks in a parallel fashion
⇒ tolerate neighbor disconnections
⇒ use asynchronous communications to exchange dependencies
⇒ store the checkpoints (task clones) of neighbors
Super-Nodes (the entry points)
⇒ register the available Daemons of the system
⇒ allocate Daemons when applications are launched

Spawners (the application launchers)
⇒ launch a given application by specifying:
   URL of the application class file
   number of nodes
   parameters
⇒ reserve computing nodes on the Super-Nodes
⇒ distribute the computation tasks over the Daemons
⇒ detect Daemon disconnections
⇒ detect global convergence
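The registration and reservation roles described above can be sketched as follows. This is an illustrative model only: the class `SuperNode`, the function `reserve_nodes`, and the policy of asking the Super-Nodes one after another are assumptions, not the JaceP2P API (which relies on Java RMI).

```python
class SuperNode:
    """Hypothetical entry point that keeps a register of available Daemons."""

    def __init__(self, name):
        self.name = name
        self.register = []  # identifiers of available Daemons, in arrival order

    def register_daemon(self, daemon_id):
        """Record a Daemon as available (ignored if already registered)."""
        if daemon_id not in self.register:
            self.register.append(daemon_id)

    def reserve(self, count):
        """Hand over up to `count` Daemons and remove them from the register."""
        granted = self.register[:count]
        self.register = self.register[count:]
        return granted


def reserve_nodes(super_nodes, count):
    """Spawner side: ask each Super-Node in turn until `count` Daemons are reserved.

    Returns fewer than `count` identifiers if the whole system runs short.
    """
    reserved = []
    for super_node in super_nodes:
        if len(reserved) == count:
            break
        reserved += super_node.reserve(count - len(reserved))
    return reserved


sn1, sn2 = SuperNode("Super-node1"), SuperNode("Super-node2")
for daemon in ("N1", "N2", "N3"):
    sn1.register_daemon(daemon)
for daemon in ("N4", "N5"):
    sn2.register_daemon(daemon)

application_nodes = reserve_nodes([sn1, sn2], count=4)
print(application_nodes, sn2.register)
```

Reserved Daemons leave the register, so a Daemon that is already computing cannot be handed to a second application; the remaining Daemon stays available as a spare.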

2 types of JaceP2P users
The resource provider
⇒ runs a Daemon on his computer when it is unused ("cycle stealing")
⇒ contributes to executing applications launched on the JaceP2P infrastructure
The application programmer
⇒ implements his own specific application using the JaceP2P API
⇒ executes a Spawner on his computer
The Spawner is the only entity which must be stable

2.3. Interaction between peers
[Figure: animated sequence showing the Spawner, Super-node1 and Super-node2 (each holding a Register) and Daemons N1 to N5. The labels recoverable from the successive frames are listed below.]
Registration: the Daemons register with the Super-nodes (N1 N2 N3 in one Register, N4 N5 in the other)
4 processors: the Spawner asks for 4 processors; its RegApli holds N1 N2 N3 N4, and N5 stays in a Register
Send RegApli: each of N1 to N4 receives a copy of RegApli
Send Checkpoints: the Daemons iterate (ite1, ite2, ite3, ...), each at its own pace, and send checkpoints
1 processor: N3 no longer appears; the Spawner asks for 1 processor and obtains N5
Update RegApli: RegApli becomes N1 N2 N5 N4 and is sent to the Daemons
Reload last checkpoint: N5 reloads the last checkpoint
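The checkpoint and recovery sequence shown in this figure can be mimicked with a small in-memory model. The sketch below is a simplification under stated assumptions: the `Daemon` class, `replace_daemon`, the halving "computation" and the single backup holder are all hypothetical, and real JaceP2P Daemons exchange these objects over the network.

```python
import copy


class Daemon:
    """Hypothetical computing peer that also stores backups for its neighbors."""

    def __init__(self, name):
        self.name = name
        self.task_id = None
        self.iteration = 0
        self.state = None
        self.backups = {}  # task_id -> (iteration, state) kept for neighbors

    def store_backup(self, task_id, iteration, state):
        """Keep a private copy of a neighbor's task."""
        self.backups[task_id] = (iteration, copy.deepcopy(state))

    def iterate(self, neighbor, checkpoint_every):
        """Run one local iteration and checkpoint on the neighbor when due."""
        self.iteration += 1
        self.state = [value / 2 for value in self.state]  # stand-in computation
        if self.iteration % checkpoint_every == 0:
            neighbor.store_backup(self.task_id, self.iteration, self.state)


def replace_daemon(reg_apli, task_id, substitute, backup_holder):
    """Give the task of a lost Daemon to `substitute`, restarting from the backup."""
    reg_apli[task_id] = substitute.name  # the updated register is then broadcast
    substitute.task_id = task_id
    iteration, state = backup_holder.backups[task_id]
    substitute.iteration = iteration
    substitute.state = copy.deepcopy(state)


reg_apli = ["N1", "N2", "N3", "N4"]
n3, n4, n5 = Daemon("N3"), Daemon("N4"), Daemon("N5")
n3.task_id, n3.state = 2, [8.0]

for _ in range(7):  # N3 is lost after 7 iterations
    n3.iterate(neighbor=n4, checkpoint_every=5)

replace_daemon(reg_apli, task_id=2, substitute=n5, backup_holder=n4)
print(reg_apli, n5.iteration, n5.state)
```

The substitute restarts from iteration 5, the last checkpoint, so the two iterations computed after it are lost. With asynchronous iterations the neighbors keep computing in the meantime instead of waiting for the replacement.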

3. Experimentations with JaceP2P

3.1. Problem description
The Poisson equation: −∆u = f
Linear problem (PDE) in 2D
Finite Difference Method: space discretization into an n × n mesh (problem size: n²)
Block-Jacobi-like decomposition
Each block solved by a sequential sparse Conjugate Gradient method
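For readers unfamiliar with the numerical kernel, here is a minimal sequential sketch of a Conjugate Gradient solve on the standard 5-point finite-difference discretization of −∆u = f. It is not the code used in the experiments: it handles a single block on a small grid, assumes zero Dirichlet boundary values on the unit square, applies the operator without storing a matrix, and all function names are hypothetical. The block decomposition and the distribution over Daemons are left out.

```python
def apply_neg_laplacian(u, n, h):
    """5-point approximation of -Laplace(u) on an n x n interior grid (zero boundary)."""
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            up = u[i - 1][j] if i > 0 else 0.0
            down = u[i + 1][j] if i < n - 1 else 0.0
            left = u[i][j - 1] if j > 0 else 0.0
            right = u[i][j + 1] if j < n - 1 else 0.0
            out[i][j] = (4.0 * u[i][j] - up - down - left - right) / (h * h)
    return out


def dot(a, b):
    """Inner product of two grids stored as lists of rows."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def axpy(alpha, x, y):
    """Return alpha * x + y for two grids."""
    return [[alpha * xv + yv for xv, yv in zip(row_x, row_y)]
            for row_x, row_y in zip(x, y)]


def conjugate_gradient(f, n, h, tol=1e-10, max_iter=1000):
    """Solve the discretized -Laplace(u) = f; stops on the relative residual."""
    u = [[0.0] * n for _ in range(n)]
    r = [row[:] for row in f]  # residual f - A u with u = 0
    p = [row[:] for row in r]
    rr = dot(r, r)
    threshold = tol * dot(f, f) ** 0.5
    for k in range(max_iter):
        if rr ** 0.5 <= threshold:
            return u, k
        Ap = apply_neg_laplacian(p, n, h)
        alpha = rr / dot(p, Ap)
        u = axpy(alpha, p, u)
        r = axpy(-alpha, Ap, r)
        rr_new = dot(r, r)
        p = axpy(rr_new / rr, p, r)
        rr = rr_new
    raise RuntimeError("no convergence")


# Manufactured test: pick a grid function, build f from it, then recover it.
n = 8
h = 1.0 / (n + 1)
u_exact = [[(i + 1) * h * (1 - (i + 1) * h) * (j + 1) * h * (1 - (j + 1) * h)
            for j in range(n)] for i in range(n)]
f = apply_neg_laplacian(u_exact, n, h)
u, cg_iterations = conjugate_gradient(f, n, h)
max_error = max(abs(u[i][j] - u_exact[i][j]) for i in range(n) for j in range(n))
print(cg_iterations, max_error)
```

In a block-Jacobi-like scheme, each computing node would run such a solve on its own block of the mesh and exchange only the values its neighbors depend on.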

3.2. Execution context
n from 2000 up to 5000 (problem size = n²)
⇒ matrices from 4,000,000×4,000,000 up to 25,000,000×25,000,000
Heterogeneous processors and networks:
⇒ 3 Super-nodes (2.40 GHz CPU)
⇒ 100 Daemons (1266 MHz up to 3.00 GHz CPU)
⇒ 1 Spawner (2.40 GHz CPU)
⇒ Ethernet 100 Mbps up to 1 Gbps
Application launched on 80 Daemons, randomly disconnected/reconnected (from 0 up to 50 disconnections per execution)
Tasks checkpointed every 5 iterations

3.3. Results
[Figure: execution time (in s) according to n (problem size = n × n, n from 2000 to 5000), with one curve each for 0, 10, 20, 30, 40 and 50 disconnections]

Maximum slowdown (i.e. with 50 disconnections)
⇒ ≃ 2 for n = 2000
⇒ ≃ 2.5 for n = 5000
JaceP2P adapted to highly dynamic infrastructures

Conclusion / Future works

Conclusion

Presentation of JaceP2P: programming and execution environment for iterative applications on P2P infrastructures
Based on a hybrid P2P topology and the asynchronous iteration model
Checkpoint mechanism for fault tolerance
Experiments with a real scientific application (linear problem)
JaceP2P adapted to iterative algorithms on dynamic and heterogeneous processors/networks
⇒ 50 disconnections, slowdown ≤ 2.5

Future works

Evaluate the scalability of JaceP2P (thousands of peers: EGEE, Grid’5000...)
Implementation and experiments with other kinds of iterative problems (nonlinear, non-stationary, eigenvalues, ...)
Make the Spawner fault tolerant (decentralize convergence detection and the register...)
