Replica Placement
Pipeline Workflows
In-network Stream Processing
Conclusion
Multi-criteria Mapping and Scheduling of Workflow Applications onto Heterogeneous Platforms
Veronika Sonigo
GRAAL team, LIP, École Normale Supérieure de Lyon
Supervisors: Anne Benoit, Harald Kosch, Yves Robert
[email protected]
July 7, 2009
Mapping and Scheduling of Workflow Applications
1/ 48
Multi-criteria Mapping and Scheduling
Mapping and scheduling for parallel machines
Makespan minimization → already a difficult problem
Mapping and scheduling for large-scale heterogeneous platforms
Need new objectives: period, reliability, latency, cost, QoS, energy, etc.
Multi-criteria optimization
Assess problem complexity (polynomial / NP-hard instances)
Design practical heuristics for important application problems
New results and solutions for algorithmically challenging problems
Thesis outline - Today’s Presentation
Replica Placement in Tree-Networks: Without Constraints, QoS Constraints, Bandwidth Constraints
Pipeline Workflow Applications: Mono-criterion Optimization, Bi-criteria Optimization, Example: JPEG-Encoder
In-network Stream Processing: Single Application - Platform Creation, Multiple Applications
Outline of the Talk
1 Replica Placement in Tree-Networks: Framework, Complexity, Heuristics for Replica Cost Problem, Experiments
2 Pipeline Workflow Applications: Bi-criteria Complexity Results
3 In-network Stream Processing: Heuristics and Experiments
4 Conclusion
Mimi alone at home
Internet
Problems and questions: Where to download from? How to deal with multiple users?
Where to place the replicas?
Heterogeneity
Introduction and motivation
Replica placement in tree networks
Set of clients (tree leaves): requests with QoS or bandwidth constraints, known in advance
Internal nodes may be provided with a replica; in this case they become servers and process requests (up to their capacity limit)
Research questions: How many replicas required? Which locations? Total replica cost? Quality of Service?
Rule of the game
Handle all client requests, and minimize cost of replicas → Replica Placement problem
Several policies to assign replicas: Closest, Upwards, Multiple
[Figure: example tree with server capacity W = 10 and numbered client requests, shown once for each of the Closest, Upwards and Multiple policies]
Major contributions
Theory
New access policies
Problem complexity
LP-based optimal solution to cost of Replica Placement
Practice
Heuristics for each policy
Experiments to assess impact of new policies
Experiments to assess impact of QoS on different policies
Definitions and notations
Distribution tree $T$, clients $C$ (leaf nodes), internal nodes $N$
Client $i \in C$:
Sends $r_i$ requests per time unit (number of accesses to a single object database)
Quality of service $q_i$ (response time)
Node $j \in N$:
Can contain the object database replica (server) or not
Processing capacity $W_j$
Storage cost $sc_j$
Tree edge $l \in L$ (communication link between nodes):
Communication time $comm_l$
Bandwidth limit $BW_l$
Problem instances
Minimize $\sum_{s \in R} sc_s$ under the constraints:
Server capacity: $\forall s \in R,\ \sum_{i \in C \mid s \in Servers(i)} r_{i,s} \le W_s$
QoS: $\forall i \in C,\ \forall s \in Servers(i),\ \sum_{l \in path[i \to s]} comm_l \le q_i$
Link capacity: $\forall l \in L,\ \sum_{i \in C,\, s \in Servers(i) \mid l \in path[i \to s]} r_{i,s} \le BW_l$
Restrict to case where $sc_s = W_s$: Replica Counting problem on homogeneous platforms, Replica Cost problem with heterogeneous servers.
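The three constraints can be checked mechanically for a candidate assignment. The following is a minimal sketch, not the formulation used in the thesis. The data layout is an assumption made for illustration: `assign[(i, s)]` stands for the number of requests of client i processed by server s, `path[(i, s)]` lists the links between i and s, and the function names are hypothetical. It does not verify that every client request is actually assigned.

```python
def replica_cost(assign, sc):
    """Total storage cost of the servers that process at least one request."""
    servers = {s for (_, s), r in assign.items() if r > 0}
    return sum(sc[s] for s in servers)


def is_feasible(assign, path, W, comm, q, BW):
    """Check server capacity, QoS and link capacity for a candidate assignment.

    assign[(i, s)]: requests of client i processed by server s (assumed layout)
    path[(i, s)]:   links traversed from client i to server s (assumed layout)
    """
    load = {}     # requests processed by each server
    traffic = {}  # requests crossing each link
    for (i, s), r in assign.items():
        if r == 0:
            continue
        load[s] = load.get(s, 0) + r
        # QoS: total communication time on the path must not exceed q_i.
        if sum(comm[l] for l in path[(i, s)]) > q[i]:
            return False
        for l in path[(i, s)]:
            traffic[l] = traffic.get(l, 0) + r
    # Server capacity, then link capacity.
    if any(load[s] > W[s] for s in load):
        return False
    return all(traffic[l] <= BW[l] for l in traffic)


# Hypothetical chain: client c below s1, s1 below s2, both servers with W = 1.
W = {"s1": 1, "s2": 1}
comm = {"c-s1": 1, "s1-s2": 1}
BW = {"c-s1": 2, "s1-s2": 2}
q = {"c": 2}
path = {("c", "s1"): ["c-s1"], ("c", "s2"): ["c-s1", "s1-s2"]}

split = {("c", "s1"): 1, ("c", "s2"): 1}   # requests split over two servers
single = {("c", "s1"): 2}                  # both requests on one server

print(is_feasible(split, path, W, comm, q, BW), replica_cost(split, W))
print(is_feasible(single, path, W, comm, q, BW))
```

With $sc_s = W_s$, passing `W` as the cost table gives the Replica Cost objective directly.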
Example: existence of a solution
[Figure: three example trees (a), (b), (c), each with internal nodes s1 and s2, W = 1, and client request labels 1 and 2]
(a): solution for all policies (Closest, Upwards, Multiple)
(b): no solution with Closest
(c): no solution with either Closest or Upwards
Complexity results
Homogeneous platform: Replica Counting problem, no bandwidth constraints

            Closest                       Upwards    Multiple
No QoS      polynomial [Cidon02,Liu06]    NP-hard    polynomial
With QoS    polynomial [Liu06]            NP-hard    NP-hard

Homogeneous platforms with bandwidth and QoS constraints: Closest remains polynomial
Heterogeneous platforms: all problems are NP-hard
Algo: Homogeneous Platform with QoS and Bandwidth
Basic idea: computation of the minimal necessary number of replicas in a subtree
Case 1: too many requests. Example with W = 10 and client requests r = 3, 5, 4: $\sum_i r(i) = 12$ → 1 replica
Case 2: QoS constraints. Example with r = 3, 5, 4 and q = 1, 3, 2: q(i) < hops → 1 replica
Case 3: bandwidth constraints. Example with link bandwidths b = 4, 5, 2, 4: b(l) < r(i) → 1 replica
[Figure: the cases combined on one tree with W = 10; 1 replica + 1 replica → 2 replicas]
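The three cases can be read as three independent tests on a subtree. The sketch below only detects that at least one replica is forced below a node; it does not compute the minimal count the way the algorithm on the slide does. The function name, the per-client link layout and the `hops` value are assumptions made for illustration.

```python
def forced_replica_reasons(requests, qos, link_bw, capacity, hops):
    """Return which of the three cases force a replica inside a subtree.

    requests[k], qos[k]: r and q of the k-th client of the subtree
    link_bw[k]:          bandwidth of the link above the k-th client (assumed layout)
    capacity:            server capacity W of the node above the subtree
    hops:                distance from the clients to that node (assumed uniform)
    """
    reasons = []
    # Case 1: the node above cannot absorb all requests of the subtree.
    if sum(requests) > capacity:
        reasons.append("capacity")
    # Case 2: some client tolerates fewer hops than the distance to the node.
    if any(q < hops for q in qos):
        reasons.append("qos")
    # Case 3: some link is too narrow for the requests that must cross it.
    if any(b < r for r, b in zip(requests, link_bw)):
        reasons.append("bandwidth")
    return reasons


# r and q are the values from the slide; link_bw and hops are hypothetical.
reasons = forced_replica_reasons(
    requests=[3, 5, 4], qos=[1, 3, 2], link_bw=[4, 5, 2], capacity=10, hops=2
)
print(reasons)
```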
ORP - Optimal Replica Placement Algorithm
Preparation: Tree transformation
Step 1: Bottom up computation of the contribution of client requests
C(v, i): the contribution of node v on its i-th ancestor
e(v, i): children of v that have to be equipped with a replica to minimize the contribution on the i-th ancestor of v (respecting some additional constraints)
[Figure: example tree annotated with link bandwidths b, client requests r and QoS values q]
Step 2: Top down replica placement
procedure Place-replica(v, i)
    if v ∈ C then
        return
    end
    place a replica at each node of e(v, i)
    forall c ∈ children(v) do
        if c ∈ e(v, i) then
            Place-replica(c, 0)
        else
            Place-replica(c, i+1)
        end
    end
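The top-down procedure translates almost line for line into Python once the table e(v, i) from Step 1 is available. This is a sketch under that assumption: the dictionary layout, the example tree and the hand-written `e` table are hypothetical, and the initial call on the root with i = 0 is a guess at how the procedure is started.

```python
def place_replica(v, i, children, clients, e, placed):
    """Top-down placement following the Place-replica procedure.

    children: dict mapping an internal node to its list of children
    clients:  set of leaf nodes (the set C)
    e:        dict mapping (v, i) to the set of children of v to equip
              (assumed to be precomputed by the bottom-up step)
    placed:   set collecting the nodes that receive a replica
    """
    if v in clients:
        return
    chosen = e.get((v, i), set())
    placed.update(chosen)
    for c in children.get(v, ()):
        if c in chosen:
            # A new replica resets the distance counter.
            place_replica(c, 0, children, clients, e, placed)
        else:
            # No replica at c: one more hop to the reference ancestor.
            place_replica(c, i + 1, children, clients, e, placed)


# Hypothetical tree and e table, for illustration only.
children = {"root": ["a", "b"], "a": ["c1", "c2"], "b": ["c3"]}
clients = {"c1", "c2", "c3"}
e = {("root", 0): {"a"}, ("a", 0): set(), ("b", 1): set()}

placed = set()
place_replica("root", 0, children, clients, e, placed)
print(placed)
```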
Linear programming
General instance of the problem: heterogeneous platform, QoS + bandwidth, Closest, Upwards and Multiple policies
Solving over the rationals: solution for all practical values of the problem size
Not very precise bound
Upwards/Closest equivalent to Multiple
Integer solving: limitation to s ≤ 50 nodes and clients
Mixed bound obtained by solving the Upwards formulation over the rationals and requiring only the $x_j$ to be integers
Resolution for problem sizes s ≤ 400
Improved bound: if a server is used only at 50% of its capacity, the cost of placing a replica at this node is not halved as it would be with $x_j = 0.5$ → optimal solution for Multiple
Heuristics
Polynomial heuristics for Replica Cost problem
Heterogeneous platforms
Heuristics with and without QoS
QoS constraints: QoS of client i represents the maximum distance (number of hops) between i and server(i)
Experimental assessment of relative performance of the three policies
Impact of QoS
No QoS: traversals of the tree, bottom-up or top-down
QoS: sorted lists
Worst case complexity $O(s^2)$, where $s = |C| + |N|$ is problem size
Heuristics for Closest - No QoS
Closest Top Down Largest First (CTDLF)
Traversal of the tree, treating subtrees that contain most requests first
When a node can process the requests of all the clients in its subtree, node chosen as a server and traversal stopped
Procedure called until no more servers are added
[Figure: example tree with internal nodes n1, n2, n3, n4 and numbered client requests; successive overlays report Cost: 18, then Solution cost: 17]
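The CTDLF description can be sketched in a few lines. This is one possible reading, not the thesis implementation: the traversal is written as a depth-first recursion, "requests in a subtree" is taken to mean the requests not yet absorbed by a server lower in that subtree, and the example tree and capacities are hypothetical.

```python
def ctdlf(root, children, requests, W):
    """Closest Top Down Largest First, sketched from the slide description.

    children: dict mapping an internal node to its list of children
    requests: dict mapping a client (leaf) to its number of requests
    W:        dict mapping an internal node to its capacity
    Returns the set of servers, or None if some requests stay unserved.
    """
    servers = set()

    def inflow(v):
        # Requests below v that no server in the subtree of v absorbs yet.
        if v in requests:
            return requests[v]
        if v in servers:
            return 0
        return sum(inflow(c) for c in children[v])

    def one_pass(v):
        # One top-down traversal; returns True if a server was added.
        if v in requests or v in servers:
            return False
        load = inflow(v)
        if load == 0:
            return False
        if W[v] >= load:
            servers.add(v)  # v can serve its whole subtree: stop here
            return True
        added = False
        # Visit the subtrees holding the most requests first.
        for c in sorted(children[v], key=inflow, reverse=True):
            added = one_pass(c) or added
        return added

    while one_pass(root):  # repeat until no more servers are added
        pass
    return servers if inflow(root) == 0 else None


# Hypothetical instance: n1 is the root, n2 and n3 are its children.
children = {"n1": ["n2", "n3"], "n2": ["a", "b"], "n3": ["c", "d"]}
requests = {"a": 2, "b": 2, "c": 3, "d": 4}
W = {"n1": 4, "n2": 3, "n3": 7}

servers = ctdlf("n1", children, requests, W)
cost = sum(W[s] for s in servers)  # with sc_s = W_s
print(servers, cost)
```

In this instance the first pass selects n3, and only the second pass can select n1 once the requests below n3 are absorbed, which is why the procedure is repeated.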
Heuristics for Closest- QoS Closest Big Subtree First CBSF n1
Traversal of the tree, treating subtrees that contain most requests first When a node can process the requests of all the clients in its subtree, node chosen as a server and traversal stopped Procedure called until no more servers are added
n2 n3
18
9
n4
2
1
2 5 2 3 1 q=3 q=1 q=1 q=2 q=3
Cost
[email protected]
July 7, 2009
Mapping and Scheduling of Workflow Applications
19/ 48
Replica Placement
Pipeline Workflows
In-network Stream Processing
Conclusion
Heuristics for Closest- QoS Closest Big Subtree First CBSF n1
Traversal of the tree, treating subtrees that contain most requests first When a node can process the requests of all the clients in its subtree, node chosen as a server and traversal stopped Procedure called until no more servers are added
n2 n3
18
9
n4
2
1
2 5 2 3 1 q=3 q=1 q=1 q=2 q=3
Cost
[email protected]
July 7, 2009
Mapping and Scheduling of Workflow Applications
19/ 48
Replica Placement
Pipeline Workflows
In-network Stream Processing
Conclusion
Heuristics for Closest - QoS
Closest Big Subtree First (CBSF)
Traversal of the tree, treating subtrees that contain most requests first
When a node can process the requests of all the clients in its subtree, node chosen as a server and traversal stopped
Procedure called until no more servers are added
[Figure: example tree with nodes n1 to n4 and clients annotated with request counts and QoS bounds q = 1 to 3]
Cost: 27
Results - Percentage of success - QoS
Number of solutions for each lambda and each heuristic
average(qos) = height/2
$\lambda = \frac{\sum_{i \in C} r_i}{\sum_{j \in N} W_j}$
[Figure: percentage of success (0 to 100) versus lambda (0.1 to 0.9). Curves: Closest_BigSubtreeFirst, Closest_SmallQoSFirst, Upwards_SQoS_Started, Upwards_SQoS_MinReq, Upwards_DistServer_Indisp, Multiple_SQoS_Close, Multiple_SQoS_MinReq, Multiple_MinQoS_Indisp, MixedBest, LP]
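As a small illustration of the load parameter lambda used on the horizontal axis, the sketch below computes the ratio of the total number of client requests to the total server capacity. The reading of r_i as the requests of client i and W_j as the capacity of node j is an assumption here, and the function name `load` and the numbers are made up for the example.

```python
from typing import Sequence


def load(requests: Sequence[float], capacities: Sequence[float]) -> float:
    """lambda = (sum of client requests) / (sum of node capacities)."""
    return sum(requests) / sum(capacities)


# Hypothetical instance: four clients, three candidate server nodes.
example_lambda = load([2, 3, 1, 4], [10, 5, 5])
```

With these values the total demand is 10 and the total capacity is 20, so the instance sits at lambda = 0.5, in the middle of the plotted range.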
Results - Relative Performance
Distance of the result (in terms of replica cost) of the heuristic to the optimal solution
$T_\lambda$: subset of trees with a solution
Relative performance:
$$rperf = \frac{1}{|T_\lambda|} \sum_{t \in T_\lambda} \frac{cost_{LP}(t)}{cost_h(t)}$$
$cost_{LP}(t)$: optimal cost on tree t
$cost_h(t)$: heuristic cost on tree t; $cost_h(t) = +\infty$ if h did not find any solution
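A short sketch of how this metric can be evaluated is given below. The function name `relative_performance` and the cost values are hypothetical. A tree on which the heuristic fails is given an infinite cost and therefore contributes 0 to the average, which is how the definition penalizes failures.

```python
import math
from typing import Dict


def relative_performance(cost_lp: Dict[str, float],
                         cost_h: Dict[str, float]) -> float:
    """Average of cost_LP(t) / cost_h(t) over the trees that have a solution.

    cost_lp: tree id -> optimal cost (only trees with a solution)
    cost_h:  tree id -> heuristic cost, math.inf when the heuristic failed
    """
    ratios = [cost_lp[t] / cost_h[t] for t in cost_lp]
    return sum(ratios) / len(ratios)


# Hypothetical costs on three trees; the heuristic fails on t3.
example_rperf = relative_performance(
    {"t1": 10.0, "t2": 8.0, "t3": 6.0},
    {"t1": 10.0, "t2": 16.0, "t3": math.inf},
)
```

Here the three ratios are 1, 0.5 and 0, so the relative performance is 0.5. A heuristic that always finds the optimal solution scores 1.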
Results - Relative Performance - No QoS
Heterogeneous results - similar to the homogeneous case
[Figure: relative performance (0 to 100) versus lambda (0.1 to 0.9). Curves: ClosestTopDownAll, ClosestTopDownLargestFirst, ClosestBottomUp, UpwardsTopDown, UpwardsBigClientFirst, MultipleGreedy, MultipleTopDown, MultipleBottomUp, MixedBest]
Results - Relative Performance - QoS
average(qos) = height/2
[Figure: relative performance (0 to 100) versus lambda (0.1 to 0.9). Curves: Closest_BigSubtreeFirst, Closest_SmallQoSFirst, Upwards_SQoS_Started, Upwards_SQoS_MinReq, Upwards_DistServer_Indisp, Multiple_SQoS_Close, Multiple_SQoS_MinReq, Multiple_MinQoS_Indisp, MixedBest]
Summary
No QoS:
  Striking effect of new policies: many more solutions to the Replica Placement problem
  Multiple ≥ Upwards ≥ Closest: hierarchy observed within our heuristics
  Best Multiple heuristic (MB) always at 85% of the optimal: satisfactory result
QoS:
  Hierarchy also under QoS constraints
  Performance compared to the optimal solution:
    qos ∈ {1, 2}: 95%
    average(qos) = height/2: 85%
    no qos: 85%
  Smaller trees: results slightly less good
Related work
Several papers on replica placement, but...
...all consider only the Closest policy
Replica Placement in a general graph is NP-complete
Wolfson and Milo: impact of the write cost, use of a minimum spanning tree for updates. Tree networks: polynomial solution
Cidon et al (multiple objects) and Liu et al (QoS constraints): polynomial algorithms for homogeneous networks.
Kalpakis et al: NP-completeness of a variant with bidirectional links (requests served by any node in the tree)
Karlsson et al: comparison of different objective functions and several heuristics. No QoS, but several other constraints.
Tang et al: real QoS constraints
Rodolakis et al: Multiple policy but in a very different context
Outline of the Talk
1 Replica Placement in Tree-Networks
  Framework
  Complexity
  Heuristics for Replica Cost Problem
  Experiments
2 Pipeline Workflow Applications
  Bi-criteria Complexity Results
3 In-network Stream Processing
  Heuristics and Experiments
4 Conclusion
Introduction and motivation
Mapping applications onto parallel platforms
  Difficult challenge
Heterogeneous clusters, fully heterogeneous platforms
  Even more difficult!
Structured programming approach
  Easier to program (deadlocks, process starvation)
  Range of well-known paradigms (pipeline, farm)
  Algorithmic skeleton: help for mapping
Focus on pipeline applications
Mapping the JPEG encoder pipeline onto a cluster of workstations.
Multi-criteria scheduling of workflows
[Figure: JPEG encoder pipeline: Source Image Data → Scaling → YUV Conversion → Block Storage → Subsampling → FDCT → Quantizer (with Quantization Table) → Entropy Encoder (with Huffman Table) → Compressed Image Data]
Workflow
Several consecutive data-sets enter the application graph.
Period P: time interval between the beginning of execution of two consecutive data-sets
Latency L: maximal time elapsed between beginning and end of execution of a data-set
Failure probability FP: the probability that a processor fails during execution
Rule of the game
The application
[Figure: pipeline of n stages S1, S2, ..., Sk, ..., Sn; stage Sk has weight wk, reads data of size δk−1 and writes data of size δk]
Cut pipeline into intervals
Map each interval on a single processor...
... or replicate it to improve reliability
The platform
P processors
Fully connected graph (i.e., a clique)
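To make the interval-mapping rule concrete, the sketch below evaluates the period and the latency of a given cut of the pipeline. It rests on assumptions that the slide does not state: a homogeneous platform (every processor has speed s, every link has bandwidth b), one processor per interval, no replication, and no overlap between communication and computation. The function name `period_and_latency` and the numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def period_and_latency(w: Sequence[float],
                       delta: Sequence[float],
                       intervals: List[Tuple[int, int]],
                       s: float,
                       b: float) -> Tuple[float, float]:
    """Period and latency of an interval mapping (homogeneous sketch).

    w[k-1]  : weight of stage S_k, for k = 1..n
    delta[k]: size of the data between S_k and S_{k+1}, for k = 0..n
    intervals: consecutive (first, last) stage indices, 1-based, inclusive
    """
    period = 0.0
    latency = delta[0] / b  # the initial input is received once
    for first, last in intervals:
        compute = sum(w[first - 1:last]) / s
        # Time one processor spends per data set: receive, compute, send.
        cycle = delta[first - 1] / b + compute + delta[last] / b
        period = max(period, cycle)
        latency += compute + delta[last] / b
    return period, latency


# Hypothetical two-stage pipeline, cut into two intervals or kept whole.
split = period_and_latency([2, 2], [1, 1, 1], [(1, 1), (2, 2)], s=1, b=1)
whole = period_and_latency([2, 2], [1, 1, 1], [(1, 2)], s=1, b=1)
```

Cutting the pipeline lowers the period (4 instead of 6) but raises the latency (7 instead of 6) because of the extra communication, which is the kind of trade-off the bi-criteria formulations address.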
Objective function?
Mono-criterion
  Minimize P
  Minimize L
  Minimize FP
Bi-criteria
  How to define it? Minimize α.P + β.L? Minimize α.L + β.FP?
  Values which are not comparable
  Minimize P for a fixed latency
  Minimize L for a fixed period
  Minimize FP for a fixed latency
  Minimize L for a fixed failure probability
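An Optimal Algorithm
Minimize L with fixed P - Homogeneous platform
L(i, q): min. latency with exactly q procs mapping stages 1 to i
L(n, q) for 1 ≤ q ≤ p
Init:
$$L(i, 1) = \begin{cases} \frac{\delta_0}{b} + \sum_{k=1}^{i} \frac{w_k}{s} + \frac{\delta_i}{b} & \text{if } \frac{\delta_0}{b} + \sum_{k=1}^{i} \frac{w_k}{s} + \frac{\delta_i}{b} \le P \\ \infty & \text{else} \end{cases}$$
$$L(1, q) = \infty \quad \text{if } q > 1$$
Recursion:
$$L(i, q) = \min_{1 \le j < i} \begin{cases} L(j, q-1) + \sum_{k=j+1}^{i} \frac{w_k}{s} + \frac{\delta_i}{b} & \text{if } \frac{\delta_j}{b} + \sum_{k=j+1}^{i} \frac{w_k}{s} + \frac{\delta_i}{b} \le P \\ \infty & \text{else} \end{cases}$$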
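A possible implementation of this dynamic program is sketched below, under the slide's setting of a homogeneous platform (speed s, bandwidth b, p processors). The recursion is taken to range over the last cut point j with 1 ≤ j < i, and an interval covering stages j+1 to i is accepted only if its cycle time δ_j/b + Σ w_k/s + δ_i/b does not exceed the period bound. That range for j, the function name `min_latency_fixed_period` and the example values are assumptions of this sketch.

```python
import math
from typing import Sequence


def min_latency_fixed_period(w: Sequence[float],
                             delta: Sequence[float],
                             s: float,
                             b: float,
                             period: float,
                             p: int) -> float:
    """Minimum latency under a period bound (homogeneous sketch).

    w[k-1]  : weight of stage k, for k = 1..n
    delta[k]: data size after stage k, for k = 0..n (delta[0] is the input)
    Returns math.inf when no interval mapping respects the period bound.
    """
    n = len(w)
    assert len(delta) == n + 1 and p >= 1

    # prefix[i] = w_1 + ... + w_i, so the work of stages j+1..i is a difference.
    prefix = [0.0] * (n + 1)
    for k in range(1, n + 1):
        prefix[k] = prefix[k - 1] + w[k - 1]

    def work(j: int, i: int) -> float:
        return (prefix[i] - prefix[j]) / s

    # table[i][q]: min latency mapping stages 1..i on exactly q processors.
    table = [[math.inf] * (p + 1) for _ in range(n + 1)]

    # Init: stages 1..i on a single processor; L(1, q) stays infinite for q > 1.
    for i in range(1, n + 1):
        cycle = delta[0] / b + work(0, i) + delta[i] / b
        if cycle <= period:
            table[i][1] = cycle

    # Recursion: the last interval covers stages j+1..i.
    for q in range(2, p + 1):
        for i in range(2, n + 1):
            for j in range(1, i):
                if table[j][q - 1] == math.inf:
                    continue
                cycle = delta[j] / b + work(j, i) + delta[i] / b
                if cycle <= period:
                    candidate = table[j][q - 1] + work(j, i) + delta[i] / b
                    table[i][q] = min(table[i][q], candidate)

    return min(table[n][1:])


# Hypothetical two-stage pipeline with unit speed and bandwidth.
loose = min_latency_fixed_period([2, 2], [1, 1, 1], 1, 1, period=6, p=2)
tight = min_latency_fixed_period([2, 2], [1, 1, 1], 1, 1, period=4, p=2)
```

With the loose bound the whole pipeline fits on one processor (latency 6). With the tight bound the pipeline must be cut, which adds one communication and gives latency 7. The table has O(n p) entries and each takes O(n) time, so the sketch runs in O(n² p).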
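Example: Minimizing FP with fixed latency
Minimize FP with fixed latency
Different speed processors - Failure heterogeneous
Fixed latency: 22
[Figure: two-stage pipeline S1 (w1 = 1) and S2 (w2 = 100) with data sizes 10 (input of S1), 1 (between S1 and S2) and 0 (output of S2); one processor with s = 1, fp = 0.1 and processors with s = 100, fp = 0.8]
Latency: 10 + 1/1 + 10 × 1 + 100/100 = 22
FP: 1 − (1 − 0.1) × (1 − 0.8^10) < 0.2
Open complexity!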
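The arithmetic of this example can be checked with a few lines. The interpretation below is an assumption drawn from the two formulas on the slide: S1 runs on the slow, reliable processor, S2 is replicated on ten fast, unreliable processors, the intermediate data is sent to the ten replicas one after the other, and the bandwidth is 1. The variable names are hypothetical.

```python
# Values read from the slide.
delta0, delta1 = 10, 1          # input size of S1, data size between S1 and S2
w1, w2 = 1, 100                 # stage weights
s_slow, s_fast = 1, 100         # processor speeds
fp_slow, fp_fast = 0.1, 0.8     # processor failure probabilities
replicas = 10                   # assumed number of fast processors running S2

# Input transfer, S1 on the slow processor, one transfer per replica, then S2.
latency = delta0 + w1 / s_slow + replicas * delta1 + w2 / s_fast

# The mapping succeeds if the slow processor survives and at least one
# of the replicas of S2 survives.
failure_probability = 1 - (1 - fp_slow) * (1 - fp_fast ** replicas)
```

The latency evaluates to exactly 22 and the failure probability to about 0.197, which matches the bound FP < 0.2 stated on the slide. Replicating S2 on unreliable but fast processors thus beats running both stages on the reliable processor alone when only the latency bound has to be met.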
Complexity Results
Bi-criteria interval mapping

Objective   Failure   Hom.         Com. Hom.    Het.
P & L       /         polynomial   NP-hard      NP-hard
FP & L      hom.      polynomial   polynomial   NP-hard
FP & L      het.      polynomial   open         NP-hard
Integer linear programming
Integer LP to solve Interval Mapping on Communication Homogeneous platforms
Many integer variables: no efficient algorithm to solve
Approach limited to small problem instances
Absolute performance of the heuristics for such instances
Bucket behavior of LP solutions
[Figure: (a) Fixed P.: optimal latency versus fixed period (curve P_fixed); (b) Fixed L.: optimal period versus fixed latency (curve L_fixed)]
Outline of the Talk
1 Replica Placement in Tree-Networks
  Framework
  Complexity
  Heuristics for Replica Cost Problem
  Experiments
2 Pipeline Workflow Applications
  Bi-criteria Complexity Results
3 In-network Stream Processing
  Heuristics and Experiments
4 Conclusion
Rule of the Game
[Figure: two application trees (application 1 and application 2) built from operators op1 to op5 over basic objects ob1 to ob3, to be mapped onto processors characterized by computation speed and network card capacity]
Goal
Minimize total processing power of the target platform while matching all application requirements.
Assess impact of reusing intermediate results.
The Application Model
Application tree
K applications
$OP = \{op_1, op_2, \dots\}$: set of operators
$OB = \{ob_1, ob_2, ob_3, \dots\}$: basic objects
Computation of operator $op_p$: $w_p$ operations, $\delta_p$ size of output
For application k: $\rho^{(k)}$ application throughput
Object $ob_j$: $d_j$ size of $ob_j$, $f_j^{(k)}$ download frequency
[Figure: application tree with operator nodes n1 to n5 and leaf objects o1, o2, o3]
The Platform Model
The platform
P processors
Fully connected graph (i.e., a clique)
Objective
Map operators onto processors such that processing power is minimized and all application throughputs are achieved.
Overview of Heuristics (1)
Server selection strategies:
(S1) Select the fastest processor (blocking);
(S2) Select the processor with the fastest network card (blocking);
(S3) Select the fastest processor (non-blocking);
(S4) Select the processor with the fastest network card (non-blocking).
Overview of Heuristics (2)
Heuristics:
(H1) RandomNoReuse
(H2) Random
(H3) TopDownBFS
(H4) TopDownDFS
(H5) BottomUpBFS
(H6) BottomUpDFS
Reuse of intermediate results
[Figure: application 1 and application 2 trees with operators op1 to op5 and basic objects ob1 to ob3]
Results
Number of processors increases. 50 runs. 5 applications. 50 operators.
Successful runs.
[Figure: number of solutions (0 to 50) versus number of processors (0 to 70), two panels: (S3) Fastest proc. and (S3) Fastest proc - no reuse. Curves: TopDownBFS, TopDownDFS, BottomUpBFS, BottomUpDFS, Random, Random NoReuse]
Results
Number of processors increases. 50 runs. 5 applications. 50 operators.
Relative performance.
[Figure: relative performance (cost) (0 to 1) versus number of processors (0 to 70), two panels: (S3) Fastest proc. and (S1) Fastest proc - blocking. Curves: TopDownBFS, TopDownDFS, BottomUpBFS, BottomUpDFS, Random, Random NoReuse]
Summary
Random approach performs dramatically badly
Neglecting reuse limits the success rate and the quality of the solution in terms of cost
Top Down approach turns out to be the best
BottomUp is competitive only with BFS
DFS unable to reuse results efficiently (bandwidth)
Solution quality depends strongly on the processor selection strategy
Solid combination: TopDownBFS with fastest proc non-blocking
Outline of the Talk
1 Replica Placement in Tree-Networks
  Framework
  Complexity
  Heuristics for Replica Cost Problem
  Experiments
2 Pipeline Workflow Applications
  Bi-criteria Complexity Results
3 In-network Stream Processing
  Heuristics and Experiments
4 Conclusion
Summary
Study of three mapping and scheduling problems
  Replica placement
  Pipeline workflows
  In-network stream processing
Complexity study:
  Influence of heterogeneity
  Multi-criteria optimization
Algorithms:
  Optimal algorithms and NP-completeness proofs
  Heuristics for NP-complete instances
Experiments:
  Absolute performance via comparison to LP
  MPI-application of JPEG encoder
Ongoing Work
Tri-criteria optimization on pipelines:
  Latency - reliability - period
  Heuristics
  More ambitious application: MPEG4
Perspectives
Replica Placement
  More simulations for the Replica Cost problem: shape of the trees, distribution law of the requests, degree of heterogeneity of the platforms
  Consider the problem with several object types
  Extension to more complex objective functions
Pipeline Workflow
  Extension to fork, fork-join and tree workflows
  Multi-criteria: new objectives like power consumption and rental cost
Stream Processing
  Mutable applications: operators can be rearranged based on operator associativity and commutativity rules
Journals
A. Benoit, H. Kosch, V. Rehn-Sonigo, and Y. Robert. Multi-criteria Scheduling of Pipeline Workflows (and Application to the JPEG Encoder). Int. Journal of High Performance Computing Applications, 23, 2009.
A. Benoit, V. Rehn-Sonigo, and Y. Robert. Replica Placement and Access Policies in Tree Networks. IEEE Transactions on Parallel and Distributed Systems, 19(12), 2008.
L. Marchal, V. Rehn, Y. Robert, and F. Vivien. Scheduling algorithms for data redistribution and load-balancing on master-slave platforms. Parallel Processing Letters, 17(1), 2007.
Conferences
Replica Placement
  CoreGRID'2007, Core GRID Symposium 2007.
  ICCS'07, the 2007 International Conference on Computational Science.
  HCW'07, the 16th Heterogeneity in Computing Workshop.
Pipeline Workflow Applications
  ICCS'08, the 2008 International Conference on Computational Science.
  HCW'08, the 17th Heterogeneity in Computing Workshop.
  HeteroPar'07, Algorithms, Models and Tools for Parallel Computing on Heterogeneous Networks (in conjunction with Cluster 2007).
In-Network Stream Processing
  APDCM'09, the 11th Workshop on Advances in Parallel and Distributed Computational Models.
Scheduling Strategies on Star Platforms
  PDP'2007, 15th Euromicro Workshop on Parallel, Distributed and Network-based Processing.
  HCW'06, the 15th Heterogeneous Computing Workshop.