Parallel Computing and Parallel Programming - LIP Lisboa
Parallel Computing and Parallel Programming

Miguel Afonso Oliveira
Laboratório de Instrumentação e Física Experimental de Partículas (LIP)

LNEC, April 2010
Outline

1 Parallel Computing
    What is Parallel Computing?
    Why do Parallel Computing?
    Limits of Parallel Computing
2 Parallel Programming Notions
    Scalability
    Speedup factor
    Efficiency
    Maximum speedup
    Amdahl's Law
    Practical Limits: Amdahl's Law versus Reality
    Networking
3 Parallel Computers
    Flynn's Taxonomy
    Memory Model Taxonomy
4 Parallel Programming
    The Two Extreme Models
    Parallel Programming: The Real World
Parallel Computing

What is Parallel Computing?
Parallel Computing: use of multiple processing units or computers for a common task. Each processing unit works on its section of the problem. Processing units can exchange information.
[Figure: the problem domain is divided into four areas; PU_1, PU_2, PU_3 and PU_4 each work on their own area of the problem.]
Why do Parallel Computing?
To compute beyond the limits of single-PU systems:
    achieve more performance;
    utilize more memory.
To be able to:
    solve problems that can't be solved in a reasonable time on single-PU systems;
    solve problems that don't fit on a single PU, or even on a single system.
So we can:
    solve larger problems;
    solve problems faster;
    solve more problems.
Limits of Parallel Computing
Other considerations:
    Time to develop/rewrite code.
    Time to debug and optimize code.
Parallel Programming Notions

Scalability
An imprecise term: it is used to indicate whether an algorithm or a system can be increased in size and, in doing so, deliver increased performance.
Speedup factor

S(p) = (execution time of the best sequential algorithm) / (execution time using p processors) = ts / tp
Efficiency

E = (execution time using one processor) / (execution time on the multiprocessor × p) = ts / (tp × p) = S(p) / p
Maximum speedup
All parallel codes contain:
    parallel sections;
    serial sections.

The maximum speedup is usually the linear speedup:

S(p) = ts / (ts / p) = p

Superlinear speedup, S(p) > p, is not theoretically excluded but is usually due to:
    a suboptimal sequential algorithm,
    a unique feature of the parallel architecture,
    the non-deterministic nature of the algorithm.
Amdahl's Law

Derivation: a fraction f of the execution time ts is serial and the remaining fraction (1 − f) is perfectly parallelizable, so the parallel time is f·ts + (1 − f)·ts / p:

S(p) = ts / (f·ts + (1 − f)·ts / p) = p / (1 + (p − 1)·f)

Corollary:

lim(p→∞) S(p) = 1 / f
Practical Limits: Amdahl's Law versus Reality
Amdahl's Law provides a theoretical upper limit on parallel speedup, but in reality the situation is even worse due to:
    Load balancing.
    Scheduling.
    Communications.
    I/O.
Networking
The purpose of the interconnecting network is to provide a physical path for memory accesses or for messages. There are several key issues when considering the network:
    Design (mesh, hypercube, crossbar, tree, ...).
    Bandwidth.
    Latency.
    Cost.
A "good" algorithm takes into account the underlying network characteristics.
Parallel Computers
Early days: Flynn's Taxonomy

[Figure: Flynn's taxonomy diagram.]
Recently: Flynn's Taxonomy

MIMD
    SPMD: Single Program, Multiple Data
    MPMD: Multiple Program, Multiple Data
Memory Model Taxonomy

[Figure: two diagrams. Shared Memory: CPU_0, CPU_1, ..., CPU_N connected through an interconnect to a single memory. Distributed Memory: each CPU_i paired with its own Mem_i, the CPU/memory pairs communicating through an interconnect.]
Parallel Programming

The Two Extreme Parallel Programming Models

    Shared Memory: Shared-Memory Programming with OpenMP.
    Distributed Memory: Message-Passing Programming with MPI.
The distributed-memory model can be used directly on a shared-memory system. Using the shared-memory model on a distributed-memory system is only possible indirectly. Both models can be combined to optimize performance.
Parallel Computers and Parallel Programming: The Real World