Superlinear Speedup in HPC Systems: why and when?
Sasko Ristov1,2, Radu Prodan1, Marjan Gusev2, Karolj Skala3
1 University of Innsbruck, Austria
2 Ss. “Cyril and Methodius” University, Skopje, Macedonia
3 Rugjer Boškovic Institute, Zagreb, Croatia
FEDCSIS 2016, WSC, Sep. 2016, Gdańsk, Poland
Abstract
Speedup can sometimes exceed the linear limit, that is, be greater than the number of processors used. This is known as superlinear speedup.
It is not a new concept.
Many authors have already reported it as a side effect, without explaining why and how it happens.
We analyze several different types of superlinear speedup and define a taxonomy for them.
We present several explanations and cases in which superlinearity exists.
A frequent explanation is that the parallel execution has more cache, but other effects also cause superlinearity.
Motivation
The most common explanation for superlinear speedup is the greater amount of cache memory available to the parallel execution compared to the sequential one.
However, why is superlinear speedup not achieved:
for each modern multi-core CPU?
for each algorithm?
for each problem size for the same algorithm?
for each number of threads?
Goal: a systematic overview of the reasons
Outline
Speedup limitations
Beyond the speedup limits. Why and when?
Superlinear speedup regions
Superlinear speedup in cloud virtual environment
Discussion
Conclusion and future work
Speedup limitations
Amdahl’s Law: max speedup saturates
Gustafson’s Law: max speedup is linear
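The slide's own formulas did not survive in the text, so the following is only a minimal sketch of the two laws in their standard textbook forms. The symbols `s` (serial fraction of the work) and `p` (number of processors) are assumptions chosen for this illustration.

```python
def amdahl_speedup(s: float, p: int) -> float:
    """Amdahl's Law (fixed problem size): S(p) = 1 / (s + (1 - s) / p)."""
    return 1.0 / (s + (1.0 - s) / p)


def gustafson_speedup(s: float, p: int) -> float:
    """Gustafson's Law (scaled problem size): S(p) = s + (1 - s) * p."""
    return s + (1.0 - s) * p


if __name__ == "__main__":
    s = 0.1  # assumed serial fraction, chosen only for illustration
    for p in (1, 2, 8, 64, 1024):
        print(p, round(amdahl_speedup(s, p), 2), round(gustafson_speedup(s, p), 2))
```

With any serial fraction between 0 and 1, Amdahl's value stays below 1/s and Gustafson's value stays at or below p, so neither law on its own admits a speedup greater than the number of processors.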
Scaled serial fraction
Karp and Flatt
Speedup depends on p
A broader speedup model
Amdahl’s and Gustafson’s laws are special cases
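As a hedged illustration of the Karp and Flatt reference, the sketch below computes the standard Karp-Flatt metric, the experimentally determined serial fraction, from a measured speedup. The function name and the sample values are assumptions made for this example and are not taken from the slides.

```python
def karp_flatt(speedup: float, p: int) -> float:
    """Experimentally determined serial fraction e = (1/S - 1/p) / (1 - 1/p)."""
    if p < 2:
        raise ValueError("the metric is defined for p >= 2")
    return (1.0 / speedup - 1.0 / p) / (1.0 - 1.0 / p)


if __name__ == "__main__":
    # Hypothetical measurements on p = 4 processors.
    for measured in (3.2, 4.0, 5.0):
        print(measured, round(karp_flatt(measured, 4), 4))
```

A measured speedup equal to p gives e = 0, and any speedup greater than p gives a negative e, so a negative value of the metric is one simple way to flag a superlinear measurement.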
Beyond the speedup limits. Why and when?
Speedup analysis
Clock cycles are spent on:
computation
memory accesses
Condition for superlinear speedup to exist: an epsilon must exist
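The split of clock cycles into computation and memory accesses can be turned into a small check. This is a sketch under stated assumptions: the tuple layout, the function names and the cycle counts are hypothetical, and the slide's exact condition on epsilon is not reproduced here.

```python
from typing import Tuple

# (computation cycles, memory-access cycles); this layout is an assumption.
Cycles = Tuple[float, float]


def total_cycles(cycles: Cycles) -> float:
    """Total clock cycles as the sum of computation and memory-access cycles."""
    compute, memory = cycles
    return compute + memory


def speedup(sequential: Cycles, parallel: Cycles) -> float:
    """Speedup as sequential cycles over the cycles of one parallel processor."""
    return total_cycles(sequential) / total_cycles(parallel)


def is_superlinear(sequential: Cycles, parallel: Cycles, p: int) -> bool:
    """True when the speedup exceeds the number of processors p."""
    return speedup(sequential, parallel) > p


if __name__ == "__main__":
    seq = (1000.0, 3000.0)          # hypothetical sequential run
    even_split = (250.0, 750.0)     # both parts divided exactly by p = 4
    cache_helped = (250.0, 300.0)   # memory cycles drop by more than a factor of p
    print(speedup(seq, even_split), is_superlinear(seq, even_split, 4))
    print(round(speedup(seq, cache_helped), 2), is_superlinear(seq, cache_helped, 4))
```

If the computation cycles divide exactly by p, the speedup exceeds p only when the memory-access cycles per processor fall below the sequential memory-access cycles divided by p.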
(Non-)persistent algorithms
Non-persistent algorithms (CCp < CCs):
Parallel searching algorithms
Parallel shortest path planning
Persistent algorithms (Ip = Is):
More cache for parallel execution
Shared cache for parallel execution
Superlinear speedup in a heterogeneous environment
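A parallel search is a common way to see why a non-persistent algorithm can exceed linear speedup: the parallel run may simply do less work than the sequential one. The sketch below counts comparisons rather than measuring time. The chunked lockstep scan and all names are assumptions made for this illustration, not the authors' experiment.

```python
def sequential_steps(data, target):
    """Comparisons made by a left-to-right scan until the target is found."""
    for steps, value in enumerate(data, start=1):
        if value == target:
            return steps
    return len(data)


def parallel_steps(data, target, p):
    """Lockstep rounds needed when p workers each scan one contiguous chunk.

    All workers stop as soon as any one of them finds the target.
    """
    chunk = -(-len(data) // p)  # ceiling division
    best = chunk                # rounds needed if nobody finds the target
    for worker in range(p):
        part = data[worker * chunk:(worker + 1) * chunk]
        for steps, value in enumerate(part, start=1):
            if value == target:
                best = min(best, steps)
                break
    return best


if __name__ == "__main__":
    data = list(range(1000))
    for target in (249, 750):
        seq = sequential_steps(data, target)
        par = parallel_steps(data, target, 4)
        print(target, seq, par, seq / par)
```

When the target sits at the start of the last chunk, the four workers finish after one round while the sequential scan needs 751 comparisons. When it sits at the end of the first chunk, both need 250 steps and there is no gain at all.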
More cache: superlinear speedup
[Figure: sequential versus parallel execution]
Superlinear speedup for loosely coupled processors, as well
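The "more cache" explanation can be sketched with a toy cost model in which every processor brings its own private cache. The cache capacity and the hit and miss costs below are hypothetical values chosen for illustration, not measurements from the talk.

```python
def run_time(n_elements: float, p: int, cache_elems: float,
             hit_cost: float = 1.0, miss_cost: float = 10.0) -> float:
    """Modelled time of one processor that owns n_elements / p elements.

    Elements that fit in the private cache cost hit_cost each,
    the remainder cost miss_cost each.
    """
    per_proc = n_elements / p
    in_cache = min(per_proc, cache_elems)
    out_of_cache = per_proc - in_cache
    return in_cache * hit_cost + out_of_cache * miss_cost


def speedup(n_elements: float, p: int, cache_elems: float) -> float:
    """Speedup of p processors over one processor under the model above."""
    return run_time(n_elements, 1, cache_elems) / run_time(n_elements, p, cache_elems)


if __name__ == "__main__":
    cache = 1000  # assumed private cache capacity, in elements
    for n in (400, 4000, 40000):
        print(n, round(speedup(n, 4, cache), 2))
```

In this model a problem that already fits in one cache gives exactly linear speedup. A problem that fits only in the four caches together gives a speedup far above 4, and the advantage shrinks again once the problem is much larger than the combined cache.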
Shared last-level cache for parallel execution
[Figure: private cache versus shared cache]
Heterogeneous environment
Perhaps it should be called non-persistent hardware
Better scheduling of tasks
Reduces the impact of Amdahl’s Law
Superlinear speedup is achieved
Superlinear speedup regions
Superlinear speedup in some range of the number of processors (fixed problem size)
Superlinear speedup in a particular range of problem sizes (fixed number of processors)
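Both kinds of region can be reproduced with a toy model that combines a private cache per processor with a parallel overhead. Everything here is an assumption made for illustration: the cache capacity, the hit and miss costs, and the synchronisation and communication terms are hypothetical and are not the authors' measurements.

```python
def run_time(n: float, p: int, cache_elems: float = 1000.0, hit: float = 1.0,
             miss: float = 10.0, sync: float = 500.0, comm: float = 0.5) -> float:
    """Modelled time on p processors, each with a private cache."""
    per_proc = n / p
    in_cache = min(per_proc, cache_elems)
    memory = in_cache * hit + (per_proc - in_cache) * miss
    # Overhead exists only in the parallel case: a per-processor
    # synchronisation cost plus a per-element communication cost.
    overhead = 0.0 if p == 1 else sync * (p - 1) + comm * per_proc
    return memory + overhead


def superlinear_processor_counts(n: float, max_p: int) -> list:
    """Processor counts in 2..max_p whose speedup exceeds p (fixed n)."""
    t1 = run_time(n, 1)
    return [p for p in range(2, max_p + 1) if t1 / run_time(n, p) > p]


def superlinear_problem_sizes(sizes, p: int) -> list:
    """Problem sizes from `sizes` whose speedup exceeds p (fixed p)."""
    return [n for n in sizes if run_time(n, 1) / run_time(n, p) > p]


if __name__ == "__main__":
    print(superlinear_processor_counts(8000, 16))
    print(superlinear_problem_sizes(
        [500, 1000, 2000, 4000, 8000, 16000, 32000, 64000], 4))
```

With these made-up parameters the speedup for 8000 elements is superlinear for 2 to 11 processors and drops below linear from 12 onwards. For 4 processors it is superlinear only for the sizes between 2000 and 32000 in the list: smaller problems are dominated by overhead, and in larger ones the cache advantage no longer outweighs the communication cost.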
Superlinear speedup in cloud virtual environment
Superlinearity under different types of scaling
Superlinear speedup is achieved with each type of scaling
Discussion
Superlinearity versus algorithm type
Matrix-vector multiplication: loosely coupled is better
Matrix-matrix multiplication: tightly coupled is better
More special cases
Using a multi-tiered memory is not the sine qua non for superlinearity
Sublinear speedup for i7, but superlinear for Cray XMT and AMD
How to scale
The L3 cache is shared, but not among all cores
A VM with one or two cores has 6 MB
A VM with three or four cores has 12 MB
Vertical scaling provides a better speedup; horizontal scaling offers more flexible scaling of resources
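The cache figures on this slide can be tabulated to show how much L3 cache each core of a VM can count on. This is only a small arithmetic sketch: the table holds the four VM sizes given on the slide, and the helper name is hypothetical.

```python
# L3 cache visible to a VM, in MB, keyed by the VM's number of cores.
# Only the four VM sizes mentioned on the slide are covered.
L3_MB_BY_VM_CORES = {1: 6, 2: 6, 3: 12, 4: 12}


def l3_mb_per_core(cores: int) -> float:
    """L3 cache in MB divided by the number of cores of the VM."""
    return L3_MB_BY_VM_CORES[cores] / cores


if __name__ == "__main__":
    for cores in sorted(L3_MB_BY_VM_CORES):
        print(cores, L3_MB_BY_VM_CORES[cores], l3_mb_per_core(cores))
```

The step from two to three cores is where a VM gains the second 6 MB of cache. The table says nothing about speedup by itself; it only makes the cache side of the scaling decision explicit.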
Conclusion
The paper summarizes and discusses several cases in which superlinearity appears
Superlinearity could have an impact on supercomputer architecture and design
Vendors are racing in parallel architectures, not in GHz
Conclusion
This paper will help in deciding:
how much to scale the resources?
how to scale?
Algorithms that can benefit from greater cache memory: scale vertically
Algorithms that need to finish more work in a given time: scale horizontally
Future work
Determine an analytical relation for a complex computer system that will enable the conditions for superlinearity
Model the multidimensional space of superlinearity
Quantify the value of superlinearity, not only whether it appears or not
Acknowledgment
EU H2020 research and innovation programme under the grant agreement 644179 ENTICE: dEcentralized repositories for traNsparent and efficienT vIrtual maChine opErations.
Networking support by the COST programme Action IC1305, Network for Sustainable Ultrascale Computing (NESUS).
Questions?