Multi-criteria Mapping and Scheduling of Workflow Applications ... - graal

0 downloads 0 Views 2MB Size Report
Mapping and scheduling for parallel machines. Makespan minimization ... Period, reliability, latency, cost, QoS, energy, etc ... Problems and questions: Where to ...
Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Multi-criteria Mapping and Scheduling of Workflow Applications onto Heterogeneous Platforms Veronika Sonigo GRAAL team, LIP ´ Ecole Normale Sup´ erieure de Lyon

Supervisors: Anne Benoit, Harald Kosch, Yves Robert

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

1/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Multi-criteria Mapping and Scheduling

Mapping and scheduling for parallel machines Makespan minimization → already a difficult problem Mapping and scheduling for large-scale heterogeneous platforms Need new objectives Period, reliability, latency, cost, QoS, energy, etc Multi-criteria optimization Assess problem complexity (polynomial /NP-hard instances) Design practical heuristics for important application problems New results and solutions for algorithmically challenging problems

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

2/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Thesis outline - Today’s Presentation Replica Placement in Tree-Networks Without Constraints QoS Constraints Bandwidth Constraints Pipeline Workflow Applications Mono-criterion Optimization Bi-criteria Optimization Example: JPEG-Encoder In-network Stream Processing Single Application - Platform Creation Multiple Applications

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

3/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Thesis outline - Today’s Presentation Replica Placement in Tree-Networks Without Constraints QoS Constraints Bandwidth Constraints Pipeline Workflow Applications Mono-criterion Optimization Bi-criteria Optimization Example: JPEG-Encoder In-network Stream Processing Single Application - Platform Creation Multiple Applications

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

3/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Outline of the Talk 1

Replica Placement in Tree-Networks Framework Complexity Heuristics for Replica Cost Problem Experiments

2

Pipeline Workflow Applications Bi-criteria Complexity Results

3

In-network Stream Processing Heuristics and Experiments

4

Conclusion

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

4/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Mimi alone at home

Internet

Problems and questions: Where to download from? How to deal with multiple users?

Where to place the replicas?

Heterogeneity [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

5/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Mimi alone at home

Internet

?

Problems and questions: Where to download from? How to deal with multiple users?

Where to place the replicas?

Heterogeneity [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

5/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Mimi alone at home

Internet

Problems and questions: Where to download from? How to deal with multiple users?

Where to place the replicas?

Heterogeneity [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

5/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Mimi alone at home 01 1 0 00 11 0 1 0 1 00 11 0 1 0 0 1 111 000 01 1 0 1 0 1 00 11 000 111 0 1 0 1 0 1 00 11 0 1 0 1 0 1 000 111 000 111 0 1 0 1 0 1 000 111 000 111 0 1 0 1 0 1 0 1 0 1 000 111 0 1 0 1 0 1 0 1 0 1 11 00 000 111 0 1 0 1 0 1 00 11 0 1 0 1 000 111 0 0 1 0 1 0 1 01 1 00 11 000 111 00 11 0 1 0 1 0 1 0 1 0 1 00 11 000 111 00 11 0 1 0 1 0 1 0 1 0 1 000 111 00 11 1 1 0 0 000 111 1 1 0 0 0 1 000 111 00 11 1 1 0 0 000 111 1 0 0 01 1 000 111 0 1 0 0 1 0 1 000 111 01 1 0 1 0 1 0 1 00 11 0 1 0 1 00 11 1 1 0 0 00 11 ?

Problems and questions: Where to download from? How to deal with multiple users?

Where to place the replicas?

Heterogeneity [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

5/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Mimi alone at home 01 1 0 00 11 0 1 0 1 00 11 0 1 0 0 1 111 000 01 1 0 1 0 1 00 11 000 111 0 1 0 1 0 1 00 11 0 1 0 1 0 1 000 111 000 111 0 1 0 1 0 1 000 111 000 111 0 1 0 1 0 1 0 1 0 1 000 111 0 1 0 1 0 1 0 1 0 1 11 00 000 111 0 1 0 1 0 1 00 11 0 1 0 1 000 111 0 0 1 0 1 0 1 01 1 00 11 000 111 00 11 0 1 0 1 0 1 0 1 0 1 00 11 000 111 00 11 0 1 0 1 0 1 0 1 0 1 000 111 00 11 1 1 0 0 000 111 1 1 0 0 0 1 000 111 00 11 1 1 0 0 000 111 1 0 0 01 1 000 0 1 0 0 1 0 111 1 000 111 01 1 0 1 0 1 0 1 00 11 0 1 0 1 00 11 0 1 0 1 00 11 ?

?

?

?

Problems and questions: Where to download from? How to deal with multiple users?

Where to place the replicas?

Heterogeneity [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

5/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Mimi alone at home 01 1 0 00 11 0 1 0 1 00 11 0 1 0 0 1 111 000 01 1 0 1 0 1 00 11 000 111 0 1 0 1 0 1 00 11 0 1 0 1 0 1 000 111 000 111 0 1 0 1 0 1 000 111 000 111 0 1 0 1 0 1 0 1 0 1 000 111 0 1 0 1 0 1 0 1 0 1 11 00 000 111 0 1 0 1 0 1 00 11 0 1 0 1 000 111 0 0 1 0 1 0 1 01 1 00 11 000 111 00 11 0 1 0 1 0 1 0 1 0 1 00 11 000 111 00 11 0 1 0 1 0 1 0 1 0 1 000 111 00 11 1 1 0 0 000 111 1 1 0 0 0 1 000 111 00 11 1 1 0 0 000 111 1 0 0 01 1 000 0 1 0 0 1 0 111 1 000 111 01 1 0 1 0 1 0 1 00 11 0 1 0 1 00 11 0 1 0 1 00 11 ?

?

?

?

Problems and questions: Where to download from? How to deal with multiple users?

Where to place the replicas?

Heterogeneity [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

5/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Mimi alone at home 01 1 0 00 11 0 1 0 1 00 11 0 1 0 0 1 111 000 01 1 0 1 0 1 00 11 000 111 0 1 0 1 0 1 00 11 0 1 0 1 0 1 000 111 000 111 0 1 0 1 0 1 000 111 000 111 0 1 0 1 0 1 0 1 0 1 000 111 0 1 0 1 0 1 0 1 0 1 11 00 000 111 0 1 0 1 0 1 00 11 0 1 0 1 000 111 0 0 1 0 1 0 1 01 1 00 11 000 111 00 11 0 1 0 1 0 1 0 1 0 1 00 11 000 111 00 11 0 1 0 1 0 1 0 1 0 1 000 111 00 11 1 1 0 0 000 111 1 1 0 0 0 1 000 111 00 11 1 1 0 0 000 111 1 0 0 01 1 000 0 1 0 0 1 0 111 1 000 111 01 1 0 1 0 1 0 1 00 11 0 1 0 1 00 11 0 1 0 1 00 11 ?

?

?

?

Problems and questions: Where to download from? How to deal with multiple users?

Where to place the replicas?

Heterogeneity [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

5/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Introduction and motivation

Replica placement in tree networks Set of clients (tree leaves): requests with QoS or bandwidth constraints, known in advance Internal nodes may be provided with a replica; in this case they become servers and process requests (up to their capacity limit) Research questions: How many replicas required? Which locations?

[email protected]

July 7, 2009

Total replica cost? Quality of Service?

Mapping and Scheduling of Workflow Applications

6/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Introduction and motivation

Replica placement in tree networks Set of clients (tree leaves): requests with QoS or bandwidth constraints, known in advance Internal nodes may be provided with a replica; in this case they become servers and process requests (up to their capacity limit) Research questions: How many replicas required? Which locations?

[email protected]

July 7, 2009

Total replica cost? Quality of Service?

Mapping and Scheduling of Workflow Applications

6/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Introduction and motivation

Replica placement in tree networks Set of clients (tree leaves): requests with QoS or bandwidth constraints, known in advance Internal nodes may be provided with a replica; in this case they become servers and process requests (up to their capacity limit) Research questions: How many replicas required? Which locations?

[email protected]

July 7, 2009

Total replica cost? Quality of Service?

Mapping and Scheduling of Workflow Applications

6/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Introduction and motivation

Replica placement in tree networks Set of clients (tree leaves): requests with QoS or bandwidth constraints, known in advance Internal nodes may be provided with a replica; in this case they become servers and process requests (up to their capacity limit) Research questions: How many replicas required? Which locations?

[email protected]

July 7, 2009

Total replica cost? Quality of Service?

Mapping and Scheduling of Workflow Applications

6/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Introduction and motivation

Replica placement in tree networks Set of clients (tree leaves): requests with QoS or bandwidth constraints, known in advance Internal nodes may be provided with a replica; in this case they become servers and process requests (up to their capacity limit) Research questions: How many replicas required? Which locations?

[email protected]

July 7, 2009

Total replica cost? Quality of Service?

Mapping and Scheduling of Workflow Applications

6/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the game Handle all client requests, and minimize cost of replicas → Replica Placement problem Several policies to assign replicas W = 10

1 5

[email protected]

4

July 7, 2009

3

2

2

3

Mapping and Scheduling of Workflow Applications

7/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the game Handle all client requests, and minimize cost of replicas → Replica Placement problem Several policies to assign replicas W = 10

1 5

[email protected]

4

July 7, 2009

3

2

2

3

Mapping and Scheduling of Workflow Applications

7/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the game Handle all client requests, and minimize cost of replicas → Replica Placement problem Several policies to assign replicas W = 10

1 5

[email protected]

4

July 7, 2009

3

2

2

3

Mapping and Scheduling of Workflow Applications

7/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the game Handle all client requests, and minimize cost of replicas → Replica Placement problem Several policies to assign replicas W = 10

1 5

4

3

2

2

3

Closest

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

7/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the game Handle all client requests, and minimize cost of replicas → Replica Placement problem Several policies to assign replicas W = 10

1 5

4

3

2

2

3

Upwards

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

7/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the game Handle all client requests, and minimize cost of replicas → Replica Placement problem Several policies to assign replicas W = 10

3 2

1

5

4

3

2

2

3

Multiple

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

7/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Major contributions

Theory New access policies Problem complexity LP-based optimal solution to cost of Replica Placement Practice Heuristics for each policy Experiments to assess impact of new policies Experiments to assess impact of QoS on different policies

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

8/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Major contributions

Theory New access policies Problem complexity LP-based optimal solution to cost of Replica Placement Practice Heuristics for each policy Experiments to assess impact of new policies Experiments to assess impact of QoS on different policies

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

8/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Definitions and notations

Distribution tree T , clients C (leaf nodes), internal nodes N Client i ∈ C: Sends ri requests per time unit (number of accesses to a single object database) Quality of service qi (response time)

Node j ∈ N : Can contain the object database replica (server) or not Processing capacity Wj Storage cost scj

Tree edge: l ∈ L (communication link between nodes) Communication time comml Bandwidth limit BWl

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

9/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Definitions and notations

Distribution tree T , clients C (leaf nodes), internal nodes N Client i ∈ C: Sends ri requests per time unit (number of accesses to a single object database) Quality of service qi (response time)

Node j ∈ N : Can contain the object database replica (server) or not Processing capacity Wj Storage cost scj

Tree edge: l ∈ L (communication link between nodes) Communication time comml Bandwidth limit BWl

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

9/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Definitions and notations

Distribution tree T , clients C (leaf nodes), internal nodes N Client i ∈ C: Sends ri requests per time unit (number of accesses to a single object database) Quality of service qi (response time)

Node j ∈ N : Can contain the object database replica (server) or not Processing capacity Wj Storage cost scj

Tree edge: l ∈ L (communication link between nodes) Communication time comml Bandwidth limit BWl

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

9/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Definitions and notations

Distribution tree T , clients C (leaf nodes), internal nodes N Client i ∈ C: Sends ri requests per time unit (number of accesses to a single object database) Quality of service qi (response time)

Node j ∈ N : Can contain the object database replica (server) or not Processing capacity Wj Storage cost scj

Tree edge: l ∈ L (communication link between nodes) Communication time comml Bandwidth limit BWl

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

9/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Problem instances

Minimize

P

scs under the constraints: P Server capacity – ∀s ∈ R, i∈C|s∈Servers(i) ri,s ≤ Ws P QoS – ∀i ∈ C, ∀s ∈ Servers(i), l∈path[i→s] comml ≤ qi . P Link capacity – ∀l ∈ L i∈C,s∈Servers(i)|l∈path[i→s] ri,s ≤ BWl s∈R

Restrict to case where scs = Ws : Replica Counting problem on homogeneous platforms, Replica Cost problem with heterogeneous servers.

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

10/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Problem instances

Minimize

P

scs under the constraints: P Server capacity – ∀s ∈ R, i∈C|s∈Servers(i) ri,s ≤ Ws P QoS – ∀i ∈ C, ∀s ∈ Servers(i), l∈path[i→s] comml ≤ qi . P Link capacity – ∀l ∈ L i∈C,s∈Servers(i)|l∈path[i→s] ri,s ≤ BWl s∈R

Restrict to case where scs = Ws : Replica Counting problem on homogeneous platforms, Replica Cost problem with heterogeneous servers.

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

10/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Problem instances

Minimize

P

scs under the constraints: P Server capacity – ∀s ∈ R, i∈C|s∈Servers(i) ri,s ≤ Ws P QoS – ∀i ∈ C, ∀s ∈ Servers(i), l∈path[i→s] comml ≤ qi . P Link capacity – ∀l ∈ L i∈C,s∈Servers(i)|l∈path[i→s] ri,s ≤ BWl s∈R

Restrict to case where scs = Ws : Replica Counting problem on homogeneous platforms, Replica Cost problem with heterogeneous servers.

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

10/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Example: existence of a solution

s2 (a)

s1

1

s2 (b)

s1

1

s2 (c)

1

W =1

s1

2

(a): solution for all policies (Closest, Upwards, Multiple) (b): no solution with Closest (c): no solution with Closest nor Upwards

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

11/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Example: existence of a solution

s2 (a)

s1

1

s2 (b)

s1

1

s2 (c)

1

W =1

s1

2

(a): solution for all policies (Closest, Upwards, Multiple) (b): no solution with Closest (c): no solution with Closest nor Upwards

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

11/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Example: existence of a solution

s2 (a)

s1

1

s2 (b)

s1

1

s2 (c)

1

W =1

s1

2

(a): solution for all policies (Closest, Upwards, Multiple) (b): no solution with Closest (c): no solution with Closest nor Upwards

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

11/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Example: existence of a solution

s2 (a)

s1

1

s2 (b)

s1

1

s2 (c)

1

W =1

s1

2

(a): solution for all policies (Closest, Upwards, Multiple) (b): no solution with Closest (c): no solution with Closest nor Upwards

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

11/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Complexity results

Homogeneous platform: Replica Counting problem, no bandwidth constraints Closest Upwards Multiple

No QoS polynomial [Cidon02,Liu06] NP-hard polynomial

With QoS polynomial [Liu06] NP-hard NP-hard

Homogeneous platforms with bandwidth and QoS constraints: Closest remains polynomial Heterogeneous platforms: all problems are NP-hard

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

12/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Complexity results

Homogeneous platform: Replica Counting problem, no bandwidth constraints Closest Upwards Multiple

No QoS polynomial [Cidon02,Liu06] NP-hard polynomial

With QoS polynomial [Liu06] NP-hard NP-hard

Homogeneous platforms with bandwidth and QoS constraints: Closest remains polynomial Heterogeneous platforms: all problems are NP-hard

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

12/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Complexity results

Homogeneous platform: Replica Counting problem, no bandwidth constraints Closest Upwards Multiple

No QoS polynomial [Cidon02,Liu06] NP-hard polynomial

With QoS polynomial [Liu06] NP-hard NP-hard

Homogeneous platforms with bandwidth and QoS constraints: Closest remains polynomial Heterogeneous platforms: all problems are NP-hard

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

12/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Complexity results

Homogeneous platform: Replica Counting problem, no bandwidth constraints Closest Upwards Multiple

No QoS polynomial [Cidon02,Liu06] NP-hard polynomial

With QoS polynomial [Liu06] NP-hard NP-hard

Homogeneous platforms with bandwidth and QoS constraints: Closest remains polynomial Heterogeneous platforms: all problems are NP-hard

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

12/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Algo: Homogeneous Platform with QoS and Bandwidth Basic idea: computation of the minimal necessary number of replicas in a subtree

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

13/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Algo: Homogeneous Platform with QoS and Bandwidth Basic idea: computation of the minimal necessary number of replicas in a subtree Case 1: too many requests

W = 10

r:

[email protected]

3

5

July 7, 2009

4

Mapping and Scheduling of Workflow Applications

13/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Algo: Homogeneous Platform with QoS and Bandwidth Basic idea: computation of the minimal necessary number of replicas in a subtree Case 1: too many requests

W = 10 P

i

r:

[email protected]

3

5

July 7, 2009

r (i) = 12

4

Mapping and Scheduling of Workflow Applications

13/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Algo: Homogeneous Platform with QoS and Bandwidth Basic idea: computation of the minimal necessary number of replicas in a subtree Case 1: too many requests

W = 10 P

i

r (i) = 12

1 replica

r:

[email protected]

3

5

July 7, 2009

4

Mapping and Scheduling of Workflow Applications

13/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Algo: Homogeneous Platform with QoS and Bandwidth Basic idea: computation of the minimal necessary number of replicas in a subtree Case 2: QoS constraints

W = 10

r: q:

[email protected]

3 1

5 3

July 7, 2009

4 2

Mapping and Scheduling of Workflow Applications

13/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Algo: Homogeneous Platform with QoS and Bandwidth Basic idea: computation of the minimal necessary number of replicas in a subtree Case 2: QoS constraints

W = 10 q(i) < hops 1 replica

r: q:

[email protected]

3 1

5 3

July 7, 2009

4 2

Mapping and Scheduling of Workflow Applications

13/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Algo: Homogeneous Platform with QoS and Bandwidth Basic idea: computation of the minimal necessary number of replicas in a subtree Case 3: bandwidth constraints

W = 10 4

b:

5

2

4 r:

[email protected]

3

5

July 7, 2009

4

Mapping and Scheduling of Workflow Applications

13/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Algo: Homogeneous Platform with QoS and Bandwidth Basic idea: computation of the minimal necessary number of replicas in a subtree Case 3: bandwidth constraints

W = 10 4

b:

5

b(l) < r (i)

2

4 1 replica

r:

[email protected]

3

5

July 7, 2009

4

Mapping and Scheduling of Workflow Applications

13/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Algo: Homogeneous Platform with QoS and Bandwidth Basic idea: computation of the minimal necessary number of replicas in a subtree

W = 10 4

b:

5

2

1 replica

4 1 replica

r: q:

3 1

5 3

4 2 2 replicas

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

13/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

ORP - Optimal Replica Placement Algorithm Preparation Tree transformation Step 1 Bottom up computation of the contribution of client requests

6

C (v , i) : the contribution of node v on its i-th ancestor

6 4 b:

5

2

5 3

4 2

4 r: q:

3 1

[email protected]

2 2

e(v , i) : children of v that have to be equipped with a replica to minimize the contribution on the i-th ancestor of v (respecting some additional constraints).

July 7, 2009

Mapping and Scheduling of Workflow Applications

14/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

ORP - Optimal Replica Placement Algorithm Preparation Tree transformation Step 1 Bottom up computation of the contribution of client requests Step 2 Top down replica placement procedure Place-replica (v, i) if v ∈ C then return; end place a replica at each node of e(v , i); forall c ∈ children(v ) do if c ∈ e(v , i) then Place-replica(c,0); else Place-replica(c,i+1); end end [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

15/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Linear programming General instance of the problem: Heterogeneous platform, QoS+bandwidth, Closest, Upwards and Multiple policies Solving over the rationals: solution for all practical values of the problem size Not very precise bound Upwards/Closest equivalent to Multiple

Integer solving: limitation to s ≤ 50 nodes and clients Mixed bound obtained by solving the Upwards formulation over the rational and imposing only the xj being integers Resolution for problem sizes s ≤ 400 Improved bound: if a server is used only at 50% of its capacity, the cost of placing a replica at this node is not halved as it would be with xj = 0.5 → optimal solution for Multiple

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

16/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Linear programming General instance of the problem: Heterogeneous platform, QoS+bandwidth, Closest, Upwards and Multiple policies Solving over the rationals: solution for all practical values of the problem size Not very precise bound Upwards/Closest equivalent to Multiple

Integer solving: limitation to s ≤ 50 nodes and clients Mixed bound obtained by solving the Upwards formulation over the rational and imposing only the xj being integers Resolution for problem sizes s ≤ 400 Improved bound: if a server is used only at 50% of its capacity, the cost of placing a replica at this node is not halved as it would be with xj = 0.5 → optimal solution for Multiple

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

16/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Linear programming General instance of the problem: Heterogeneous platform, QoS+bandwidth, Closest, Upwards and Multiple policies Solving over the rationals: solution for all practical values of the problem size Not very precise bound Upwards/Closest equivalent to Multiple

Integer solving: limitation to s ≤ 50 nodes and clients Mixed bound obtained by solving the Upwards formulation over the rational and imposing only the xj being integers Resolution for problem sizes s ≤ 400 Improved bound: if a server is used only at 50% of its capacity, the cost of placing a replica at this node is not halved as it would be with xj = 0.5 → optimal solution for Multiple

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

16/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Linear programming General instance of the problem: Heterogeneous platform, QoS+bandwidth, Closest, Upwards and Multiple policies Solving over the rationals: solution for all practical values of the problem size Not very precise bound Upwards/Closest equivalent to Multiple

Integer solving: limitation to s ≤ 50 nodes and clients Mixed bound obtained by solving the Upwards formulation over the rational and imposing only the xj being integers Resolution for problem sizes s ≤ 400 Improved bound: if a server is used only at 50% of its capacity, the cost of placing a replica at this node is not halved as it would be with xj = 0.5 → optimal solution for Multiple

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

16/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Heuristics Polynomial heuristics for Replica Cost problem Heterogeneous platforms Heuristics with and without QoS QoS constraints: QoS of client i represents the maximum distance (number of hops) between i and server(i)

Experimental assessment of relative performance of the three policies Impact of QoS No QoS: Traversals of the tree, bottom-up or top-down QoS: Sorted lists Worst case complexity O(s 2 ), where s = |C| + |N | is problem size

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

17/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Heuristics Polynomial heuristics for Replica Cost problem Heterogeneous platforms Heuristics with and without QoS QoS constraints: QoS of client i represents the maximum distance (number of hops) between i and server(i)

Experimental assessment of relative performance of the three policies Impact of QoS No QoS: Traversals of the tree, bottom-up or top-down QoS: Sorted lists Worst case complexity O(s 2 ), where s = |C| + |N | is problem size

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

17/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Heuristics Polynomial heuristics for Replica Cost problem Heterogeneous platforms Heuristics with and without QoS QoS constraints: QoS of client i represents the maximum distance (number of hops) between i and server(i)

Experimental assessment of relative performance of the three policies Impact of QoS No QoS: Traversals of the tree, bottom-up or top-down QoS: Sorted lists Worst case complexity O(s 2 ), where s = |C| + |N | is problem size

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

17/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Heuristics for Closest- No QoS Closest Top Down Largest First CTDLF Traversal of the tree, treating subtrees that contains most requests first When a node can process the requests of all the clients in its subtree, node chosen as a server and traversal stopped Procedure called until no more servers are added

n1

n2 n3

18

9

n4

2

3

1

1

2

5

2

Cost

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

18/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Heuristics for Closest- No QoS Closest Top Down Largest First CTDLF Traversal of the tree, treating subtrees that contains most requests first When a node can process the requests of all the clients in its subtree, node chosen as a server and traversal stopped Procedure called until no more servers are added

n1

n2 n3

18

9

n4

2

3

1

1

2

5

2

Cost: 18

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

18/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Heuristics for Closest- No QoS Closest Top Down Largest First CTDLF Traversal of the tree, treating subtrees that contains most requests first When a node can process the requests of all the clients in its subtree, node chosen as a server and traversal stopped Procedure called until no more servers are added

n1

n2 n3

8

9

n4

2

3

1

1

2

5

2

Cost

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

18/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Heuristics for Closest- No QoS Closest Top Down Largest First CTDLF Traversal of the tree, treating subtrees that contains most requests first When a node can process the requests of all the clients in its subtree, node chosen as a server and traversal stopped Procedure called until no more servers are added

n1

n2 n3

8

9

n4

2

3

1

1

2

5

2

Solution cost: 17

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

18/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Heuristics for Closest- QoS Closest Big Subtree First CBSF n1

Traversal of the tree, treating subtrees that contain most requests first When a node can process the requests of all the clients in its subtree, node chosen as a server and traversal stopped Procedure called until no more servers are added

n2 n3

18

9

n4

2

1

2 5 2 3 1 q=3 q=1 q=1 q=2 q=3

Cost

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

19/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Heuristics for Closest- QoS Closest Big Subtree First CBSF n1

Traversal of the tree, treating subtrees that contain most requests first When a node can process the requests of all the clients in its subtree, node chosen as a server and traversal stopped Procedure called until no more servers are added

n2 n3

18

9

n4

2

1

2 5 2 3 1 q=3 q=1 q=1 q=2 q=3

Cost

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

19/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Heuristics for Closest- QoS Closest Big Subtree First CBSF n1

Traversal of the tree, treating subtrees that contain most requests first When a node can process the requests of all the clients in its subtree, node chosen as a server and traversal stopped Procedure called until no more servers are added

n2 n3

18

9

n4

2

1

2 5 2 3 1 q=3 q=1 q=1 q=2 q=3

Cost: 27

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

19/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Results - Percentage of success - QoS Number of solutions for each lambda and each heuristic average(qos) = height/2 100

P ri λ = P i∈C j∈N Wi

percentage of success

80

60

40

20

0 0.1

0.2

0.3

0.4

Closest_BigSubtreeFirst Closest_SmallQoSFirst Upwards_SQoS_Started Upwards_SQoS_MinReq Upwards_DistServer_Indisp

[email protected]

July 7, 2009

0.5 lambda

0.6

0.7

0.8

0.9

Multiple_SQoS_Close Multiple_SQoS_MinReq Multiple_MinQoS_Indisp MixedBest LP

Mapping and Scheduling of Workflow Applications

20/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Results - Relative Performance Distance of the result (in terms of replica cost) of the heuristic to the optimal solution Tλ : subset of trees with a solution Relative performance: rperf =

1 X costLP (t) |Tλ | costh (t) t∈Tλ

costLP(t) : optimal cost on tree t costh (t): heuristic cost on tree t; costh (t) = +∞ if h did not find any solution

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

21/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Results - Relative Performance Distance of the result (in terms of replica cost) of the heuristic to the optimal solution Tλ : subset of trees with a solution Relative performance: rperf =

1 X costLP (t) |Tλ | costh (t) t∈Tλ

costLP(t) : optimal cost on tree t costh (t): heuristic cost on tree t; costh (t) = +∞ if h did not find any solution

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

21/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Results - Relative Performance - No QoS Heterogeneous results - similar to the homogeneous case

100

relative performance

80

60

40

20

0 0.1

0.2

0.3

0.4

ClosestTopDownAll ClosestTopDownLargestFirst ClosestBottomUp UpwardsTopDown UpwardsBigClientFirst

[email protected]

July 7, 2009

0.5 lambda

0.6

0.7

0.8

0.9

MultipleGreedy MultipleTopDown MultipleBottomUp MixedBest

Mapping and Scheduling of Workflow Applications

22/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Results - Relative Performance - QoS average(qos) = height/2

100

relative performance

80

60

40

20

0 0.1

0.2

0.3

0.4

Closest_BigSubtreeFirst Closest_SmallQoSFirst Upwards_SQoS_Started Upwards_SQoS_MinReq Upwards_DistServer_Indisp

[email protected]

July 7, 2009

0.5 lambda

0.6

0.7

0.8

0.9

Multiple_SQoS_Close Multiple_SQoS_MinReq Multiple_MinQoS_Indisp MixedBest

Mapping and Scheduling of Workflow Applications

23/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Summary No QoS: Striking effect of new policies: many more solutions to the Replica Placement problem Multiple ≥ Upwards ≥ Closest: hierarchy observed within our heuristics Best Multiple heuristic (MB) always at 85% of the optimal: satisfactory result QoS: Hierarchy also under QoS constraints Performance compared to the optimal solution: qos ∈ {1, 2}: 95% average(qos) = height/2: 85% no qos: 85% Smaller trees: results slightly less good [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

24/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Summary No QoS: Striking effect of new policies: many more solutions to the Replica Placement problem Multiple ≥ Upwards ≥ Closest: hierarchy observed within our heuristics Best Multiple heuristic (MB) always at 85% of the optimal: satisfactory result QoS: Hierarchy also under QoS constraints Performance compared to the optimal solution: qos ∈ {1, 2}: 95% average(qos) = height/2: 85% no qos: 85% Smaller trees: results slightly less good [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

24/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Related work Several papers on replica placement, but... ...all consider only the Closest policy Replica Placement in a general graph is NP-complete Wolfson and Milo: impact of the write cost, use of a minimum spanning tree for updates. Tree networks: polynomial solution Cidon et al (multiple objects) and Liu et al (QoS constraints): polynomial algorithms for homogeneous networks. Kalpakis et al: NP-completeness of a variant with bidirectional links (requests served by any node in the tree) Karlsson et al: comparison of different objective functions and several heuristics. No QoS, but several other constraints. Tang et al: real QoS constraints Rodolakis et al: Multiple policy but in a very different context [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

25/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Related work Several papers on replica placement, but... ...all consider only the Closest policy Replica Placement in a general graph is NP-complete Wolfson and Milo: impact of the write cost, use of a minimum spanning tree for updates. Tree networks: polynomial solution Cidon et al (multiple objects) and Liu et al (QoS constraints): polynomial algorithms for homogeneous networks. Kalpakis et al: NP-completeness of a variant with bidirectional links (requests served by any node in the tree) Karlsson et al: comparison of different objective functions and several heuristics. No QoS, but several other constraints. Tang et al: real QoS constraints Rodolakis et al: Multiple policy but in a very different context [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

25/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Related work Several papers on replica placement, but... ...all consider only the Closest policy Replica Placement in a general graph is NP-complete Wolfson and Milo: impact of the write cost, use of a minimum spanning tree for updates. Tree networks: polynomial solution Cidon et al (multiple objects) and Liu et al (QoS constraints): polynomial algorithms for homogeneous networks. Kalpakis et al: NP-completeness of a variant with bidirectional links (requests served by any node in the tree) Karlsson et al: comparison of different objective functions and several heuristics. No QoS, but several other constraints. Tang et al: real QoS constraints Rodolakis et al: Multiple policy but in a very different context [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

25/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Outline of the Talk 1

Replica Placement in Tree-Networks Framework Complexity Heuristics for Replica Cost Problem Experiments

2

Pipeline Workflow Applications Bi-criteria Complexity Results

3

In-network Stream Processing Heuristics and Experiments

4

Conclusion

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

26/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Introduction and motivation Mapping applications onto parallel platforms Difficult challenge Heterogeneous clusters, fully heterogeneous platforms Even more difficult! Structured programming approach Easier to program (deadlocks, process starvation) Range of well-known paradigms (pipeline, farm) Algorithmic skeleton: help for mapping

Focus on pipeline applications Mapping the JPEG encoder pipeline onto a cluster of workstations.

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

27/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Introduction and motivation Mapping applications onto parallel platforms Difficult challenge Heterogeneous clusters, fully heterogeneous platforms Even more difficult! Structured programming approach Easier to program (deadlocks, process starvation) Range of well-known paradigms (pipeline, farm) Algorithmic skeleton: help for mapping

Focus on pipeline applications Mapping the JPEG encoder pipeline onto a cluster of workstations.

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

27/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Introduction and motivation Mapping applications onto parallel platforms Difficult challenge Heterogeneous clusters, fully heterogeneous platforms Even more difficult! Structured programming approach Easier to program (deadlocks, process starvation) Range of well-known paradigms (pipeline, farm) Algorithmic skeleton: help for mapping

Focus on pipeline applications Mapping the JPEG encoder pipeline onto a cluster of workstations.

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

27/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Multi-criteria scheduling of workflows Source Image Data

Scaling

YUV Conversion

Block Storage

Subsampling

FDCT

Quantizier

Entropy Encoder

Quantization Table

Huffman Table

Compressed Image Data

Workflow

Several consecutive data-sets enter the application graph. Period P: time interval between the beginning of execution of two consecutive data-sets Latency L: maximal time elapsed between beginning and end of execution of a data-set Failure probability FP: the probability that a processor fails during execution [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

28/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the game The application δ0

S1

δ1

w1

S2

...

w2

δ k−1

Sk

δk

wk

...

Sn

δn

wn

Cut pipeline into intervals Map each interval on a single processor... ... or replicate it to improve reliability The platform P processors Fully connected graph (i.e., a clique) [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

29/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the game The application δ0

S1

δ1

w1

S2

...

w2

δ k−1

Sk

δk

wk

...

Sn

δn

wn

Cut pipeline into intervals Map each interval on a single processor... ... or replicate it to improve reliability The platform P processors Fully connected graph (i.e., a clique) [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

29/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the game The application

S1

...

S2

Sk

P1

...

Sn P2

Cut pipeline into intervals Map each interval on a single processor... ... or replicate it to improve reliability The platform P processors Fully connected graph (i.e., a clique) [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

29/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the game The application

S1

...

S2

Sk

P1

...

Sn P2

P3

P4 P6

Cut pipeline into intervals Map each interval on a single processor... ... or replicate it to improve reliability The platform P processors Fully connected graph (i.e., a clique) [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

29/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the game The application

S1

...

S2

Sk

P1

...

Sn P2

P3

P4 P6

Cut pipeline into intervals Map each interval on a single processor... ... or replicate it to improve reliability The platform P processors Fully connected graph (i.e., a clique) [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

29/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Objective function? Mono-criterion

Bi-criteria

Minimize P Minimize L Minimize FP How to define it? Minimize α.P + β.L? Minimize α.L + β.FP? Values which are not comparable Minimize P for a fixed latency Minimize L for a fixed period Minimize FP for a fixed latency Minimize L for a fixed failure probability

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

30/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Objective function? Mono-criterion

Bi-criteria

Minimize P Minimize L Minimize FP How to define it? Minimize α.P + β.L? Minimize α.L + β.FP? Values which are not comparable Minimize P for a fixed latency Minimize L for a fixed period Minimize FP for a fixed latency Minimize L for a fixed failure probability

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

30/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Objective function? Mono-criterion

Bi-criteria

Minimize P Minimize L Minimize FP How to define it? Minimize α.P + β.L? Minimize α.L + β.FP? Values which are not comparable Minimize P for a fixed latency Minimize L for a fixed period Minimize FP for a fixed latency Minimize L for a fixed failure probability

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

30/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Objective function? Mono-criterion

Bi-criteria

Minimize P Minimize L Minimize FP How to define it? Minimize α.P + β.L? Minimize α.L + β.FP? Values which are not comparable Minimize P for a fixed latency Minimize L for a fixed period Minimize FP for a fixed latency Minimize L for a fixed failure probability

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

30/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

An Optimal Algorithm Minimize L with fixed P - Homogeneous platform L(i, q) : min. latency with exactly q procs mapping stages 1 to i L(n, q) for 1 ≤ q ≤ p Init:

( L(i, 1) =

δ0 b

+

Pi

wk k=1 s

+

δi b

if ≤ P , else



L(1, q) = ∞ if q > 1

Recursion:

  i i   X X wk wk δ i δ j δi + + ≤P L(i, q) = min L(j, q − 1) + +  j 22

Open complexity!

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

32/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Exemple: Minimizing FP with fixed latency Minimize FP with fixed latency Different speed processors - Failure heterogeneous s=1

Fixed latency: 22 10

S1 w1 = 1

1

S2

fp = 0.1

0

w2 = 100 s = 100 fp = 0.8

10 + 1/1 + 10 × 1 + 100/100 = 22 FP : 1−(1−0.1)×(1−0.810 ) < 0.2

Open complexity!

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

32/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Exemple: Minimizing FP with fixed latency Minimize FP with fixed latency Different speed processors - Failure heterogeneous s=1

Fixed latency: 22 10

S1 w1 = 1

1

S2

fp = 0.1

0

w2 = 100 s = 100 fp = 0.8

Open complexity!

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

32/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Complexity Results

Bi-criteria interval mapping Objective P&L FP & L FP & L

[email protected]

Failure / hom. het.

Hom. polynomial polynomial polynomial

July 7, 2009

Com. Hom. NP-hard polynomial open

Het. NP-hard NP-hard NP-hard

Mapping and Scheduling of Workflow Applications

33/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Integer linear programming Integer LP to solve Interval Mapping on Communication Homogeneous platforms Many integer variables: no efficient algorithm to solve Approach limited to small problem instances Absolute performance of the heuristics for such instances Bucket behavior of LP solutions 350

330 L_fixed 325

340 Optimal Period

Optimal Latency

P_fixed

330 320 310 300 300

320 315 310 305

305

310 315 320 Fixed Period

325

330

(a) Fixed P. [email protected]

300 320 330 340 350 360 370 380 390 400 Fixed Latency

(b) Fixed L. July 7, 2009

Mapping and Scheduling of Workflow Applications

34/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Outline of the Talk 1

Replica Placement in Tree-Networks Framework Complexity Heuristics for Replica Cost Problem Experiments

2

Pipeline Workflow Applications Bi-criteria Complexity Results

3

In-network Stream Processing Heuristics and Experiments

4

Conclusion

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

35/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the Game Processors

Applications op5

op3

op2

op4

op2 ob3

op1

op1

op1 ob1

ob1

ob2

ob1

ob1

ob2

ob1

ob2 application 2

application 1

computation speed network card capacity

Goal Minimize total processing power of the target platform while matching all application requirements. Assess impact of reusing intermediate results. [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

36/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the Game Processors

Applications op5

op3

op2

op4

op2 ob3

op1

op1

op1 ob1

ob1

ob2

ob1

ob1

ob2

ob1

ob2 application 2

application 1

computation speed network card capacity

Goal Minimize total processing power of the target platform while matching all application requirements. Assess impact of reusing intermediate results. [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

36/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the Game Processors

Applications op5

op3

op2

op4

op2 ob3

op1

op1

op1 ob1

ob1

ob2

ob1

ob1

ob2

ob1

ob2 application 2

application 1

computation speed network card capacity

Goal Minimize total processing power of the target platform while matching all application requirements. Assess impact of reusing intermediate results. [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

36/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the Game Processors

Applications op5

op3

op2

op4

op2 ob3

op1

op1

op1 ob1

ob1

ob2

ob1

ob1

ob2

ob1

ob2 application 2

application 1

computation speed network card capacity

Goal Minimize total processing power of the target platform while matching all application requirements. Assess impact of reusing intermediate results. [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

36/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Rule of the Game Processors

Applications op5

op3

op2

op4

op2 ob3

op1

op1

op1 ob1

ob1

ob2

ob1

ob1

ob2

ob1

ob2 application 2

application 1

computation speed network card capacity

Goal Minimize total processing power of the target platform while matching all application requirements. Assess impact of reusing intermediate results. [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

36/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

The Application Model Application tree K applications

n5

OP = {op1 , op2 , . . . } set of operators n4

n2

OB = {ob1 , ob2 , ob3 , . . . } basic objects

n1

n3

Computation of operator opp : wp operations, δ p size of output

o1

o1

o2

o2

o3

For application k: ρ(k) application throughput Object obj

[email protected]

dj size of obj (k) fj download frequency

July 7, 2009

Mapping and Scheduling of Workflow Applications

37/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

The Platform Model

The platform P processors Fully connected graph (i.e., a clique) Objective Map operators onto processors such that processing power is minimized and all application throughputs are achieved.

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

38/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Overview of Heuristics (1)

Server selection strategies: (S1) Select the fastest processor (blocking); (S2) Select the processor with the fastest network card (blocking); (S3) Select the fastest processor (non-blocking); (S4) Select the processor with the fastest network card (non-blocking).

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

39/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Overview of Heuristics (2) Heuristics: Reuse of intermediate results (H1) RandomNoReuse

(H2) Random

(H3) TopDownBFS

(H4) TopDownDFS

(H5) BottomUpBFS

(H6) BottomUpDFS

op5

op3

op2

op4

op2 ob3

op1

op1

op1 ob1

ob1

ob2

ob1

ob1

application 1

[email protected]

July 7, 2009

ob2

ob1

ob2 application 2

Mapping and Scheduling of Workflow Applications

40/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Results Number of processors increases. 50 runs. 5 applications. 50 operators. Successful runs. 50

number of solutions

number of solutions

50 40 30 20 10 0

40 30 20 10 0

0

10

20

30

40

50

60

70

10

20

30

(S3) Fastest proc.

50

60

70

(S3) Fastest proc - no reuse.

TopDownBFS

BottomUpBFS

Random

TopDownDFS

BottomUpDFS

Random NoReuse

[email protected]

40

number of processors

number of processors

July 7, 2009

Mapping and Scheduling of Workflow Applications

41/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Results Number of processors increases. 50 runs. 5 applications. 50 operators.

1

relative performance (cost)

relative performance (cost)

Relative performance.

0.8 0.6 0.4 0.2 0 0

10

20

30

40

50

60

70

1 0.8 0.6 0.4 0.2 0 0

10

20

30

(S3) Fastest proc.

50

60

70

(S1) Fastest proc - blocking.

TopDownBFS

BottomUpBFS

Random

TopDownDFS

BottomUpDFS

Random NoReuse

[email protected]

40

number of processors

number of processors

July 7, 2009

Mapping and Scheduling of Workflow Applications

42/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Summary

Random approach dramatically bad Neglecting reuse limits success rate and quality of solution in terms of cost Top Down approach turns out to be the best BottomUp only with BFS competitive DFS unable to reuse results efficiently (bandwidth) Strong dependency of processor selection strategy on solution quality Solid combination: TopDownBFS with fastest proc non-blocking

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

43/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Outline of the Talk 1

Replica Placement in Tree-Networks Framework Complexity Heuristics for Replica Cost Problem Experiments

2

Pipeline Workflow Applications Bi-criteria Complexity Results

3

In-network Stream Processing Heuristics and Experiments

4

Conclusion

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

44/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Summary Study of three mapping and scheduling problems Replica placement Pipeline workflows In-network stream processing

1 5

4

3

2

2

3

Complexity study: Influence of heterogeneity Multi-criteria optimization Algorithms: Optimal algorithms and NP-completeness proofs Heuristics for NP-complete instances Experiments: Absolute performance via comparison to LP MPI-application of JPEG encoder [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

45/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Summary Study of three mapping and scheduling problems Replica placement δ0

Pipeline workflows

S1 w1

δ1

S2 w2

...

δ k−1

Sk wk

δk

...

Sn

δn

wn

In-network stream processing Complexity study: Influence of heterogeneity Multi-criteria optimization Algorithms: Optimal algorithms and NP-completeness proofs Heuristics for NP-complete instances Experiments: Absolute performance via comparison to LP MPI-application of JPEG encoder [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

45/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Summary Study of three mapping and scheduling problems op5

Replica placement

op3

op2

op4

op2 ob3

Pipeline workflows In-network stream processing

op1

op1

op1

ob1

ob1

ob2

ob1

ob1

application 1

ob2

ob1

ob2 application 2

Complexity study: Influence of heterogeneity Multi-criteria optimization Algorithms: Optimal algorithms and NP-completeness proofs Heuristics for NP-complete instances Experiments: Absolute performance via comparison to LP MPI-application of JPEG encoder [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

45/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Summary Study of three mapping and scheduling problems Replica placement Pipeline workflows In-network stream processing Complexity study: Influence of heterogeneity Multi-criteria optimization Algorithms: Optimal algorithms and NP-completeness proofs Heuristics for NP-complete instances Experiments: Absolute performance via comparison to LP MPI-application of JPEG encoder [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

45/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Ongoing Work

Tri-criteria optimization on pipelines: Latency - reliability - period Heuristics More ambitious application: MPEG4

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

46/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Perspectives Replica Placement More simulations for the Replica Cost problem: shape of the trees, distribution law of the requests, degree of heterogeneity of the platforms Consider the problem with several object types Extension to more complex objective functions Pipeline Workflow Extension to fork, fork-join and tree workflows Multi-criteria: new objectives like power consumption and rental cost Stream Processing Mutable applications: Operators can be rearranged based on operator associativity and commutativity rules [email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

47/ 48

Replica Placement

Pipeline Workflows

In-network Stream Processing

Conclusion

Journals A. Benoit, H. Kosch, V. Rehn-Sonigo, and Y. Robert. Multi-criteria Scheduling of Pipeline Workflows (and Application to the JPEG Encoder) Int. Journal of High Performance Computing Applications, 23, 2009 A.Benoit, V. Rehn-Sonigo, and Y. Robert. Replica Placement and Access Policies in Tree Networks. IEEE Transactions on Parallel and Distributed Systems, 19(12), 2008. L. Marchal, V. Rehn, Y. Robert, and F. Vivien. Scheduling algorithms for data redistribution and load-balancing on master-slave platforms. Parallel Processing Letters, 17(1), 2007.

Conferences Replica Placement CoreGRID’2007, Core GRID Symposium 2007 ICCS’07, the 2007 International Conference on Computational Science. HCW’07, the 16th Heterogeneity in Computing Workshop. Pipeline Workflow Applications ICCS’08, the 2008 International Conference on Computational Science. HCW’08, the 17th Heterogeneity in Computing Workshop. HeteroPar’07, Algorithms, Models and Tools for Parallel Computing on Heterogeneous Networks (in conjunction with Cluster 2007). In-Network Stream Processing APDCM’09, the 11th Workshop on Advances in Parallel and Distributed Computational Models. Scheduling Strategies on Star Platforms PDP’2007, 15th Euromicro Workshop on Parallel, Distributed and Network-based Processing. HCW’06, the 15th Heterogeneous Computing Workshop.

[email protected]

July 7, 2009

Mapping and Scheduling of Workflow Applications

48/ 48

Suggest Documents