CT30A7001 Concurrent and Parallel Computing

Exercise 5, answers

Assignment 1



Maximum degree of concurrency: the largest number of tasks that can execute concurrently.

Critical path length: the sum of the weights of the nodes along the critical path.

Maximum available speedup: number of tasks / critical path length.

Minimum processes to achieve maximum speedup: the smallest number of processes that still reaches the maximum speedup. Check whether the problem can be solved optimally with fewer processes than the maximum degree of concurrency.

Maximum speedup if the number of processes is limited: number of tasks / number of steps needed to complete.
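The definitions above can be tried out on a small example. The sketch below uses a hypothetical 7-task reduction tree (not one of the graphs A to D) with unit weights; the task names and helper functions are made up for illustration.

```python
from fractions import Fraction
from functools import lru_cache

# Hypothetical task graph (not one of the graphs A-D): a 7-task reduction
# tree. Each task maps to the tasks it depends on; every task has weight 1.
DEPS = {
    "a": [], "b": [], "c": [], "d": [],   # four independent leaf tasks
    "ab": ["a", "b"], "cd": ["c", "d"],   # pairwise combinations
    "root": ["ab", "cd"],                 # final combination
}


@lru_cache(maxsize=None)
def level(task):
    """Length of the longest dependency chain ending at `task`."""
    return 1 + max((level(d) for d in DEPS[task]), default=0)


def critical_path_length():
    """Longest chain in the graph (all weights are 1)."""
    return max(level(t) for t in DEPS)


def max_speedup():
    """Number of tasks divided by the critical path length."""
    return Fraction(len(DEPS), critical_path_length())


def widest_level():
    """Most tasks sharing one level. For this tree it equals the maximum
    degree of concurrency; for an arbitrary graph it is only a lower bound."""
    counts = {}
    for t in DEPS:
        counts[level(t)] = counts.get(level(t), 0) + 1
    return max(counts.values())


print(critical_path_length(), max_speedup(), widest_level())
```

For this tree the critical path has length 3, the maximum speedup is 7/3, and four tasks can run concurrently.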

                                    A      B      C      D
Maximum degree of concurrency       8      8      8      8
Critical path length                4      4      7      8
Maximum speedup                     15/4   15/4   14/7   15/8
Minimum processes                   8      8      3      2
Maximum limited speedup:
  Limited to 2 processes            15/8   15/8   14/8   15/8
  Limited to 4 processes            15/5   15/5   14/7   15/8
  Limited to 8 processes            15/4   15/4   14/7   15/8

Table 1: Metrics for task graphs
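The limited-speedup rows can be reproduced by counting the steps of a schedule. The sketch below is illustrative only: the task graphs are not reproduced in this text, so it assumes that graph A is a 15-task binary reduction tree with 8 leaves (consistent with the values in column A). The functions `reduction_tree` and `steps_needed` are hypothetical helpers, and the scheduler is a simple greedy one that is not guaranteed to be optimal for every graph.

```python
from collections import deque
from fractions import Fraction


def reduction_tree(leaves):
    """Return {task: [dependencies]} for a binary reduction tree.

    Tasks are numbered heap-style: task 1 is the final result and task i
    depends on tasks 2i and 2i+1 when they exist.
    """
    n = 2 * leaves - 1
    return {i: [c for c in (2 * i, 2 * i + 1) if c <= n]
            for i in range(1, n + 1)}


def steps_needed(deps, processes):
    """Greedy FIFO schedule with unit-time tasks; returns the step count."""
    remaining = {t: len(d) for t, d in deps.items()}
    dependents = {t: [] for t in deps}
    for t, d in deps.items():
        for dep in d:
            dependents[dep].append(t)
    ready = deque(sorted(t for t, r in remaining.items() if r == 0))
    steps = 0
    while ready:
        steps += 1
        # Run at most `processes` ready tasks in this step.
        running = [ready.popleft() for _ in range(min(processes, len(ready)))]
        for t in running:
            for nxt in dependents[t]:
                remaining[nxt] -= 1
                if remaining[nxt] == 0:
                    ready.append(nxt)
    return steps


graph_a = reduction_tree(8)  # assumed shape of graph A
for p in (2, 4, 8):
    steps = steps_needed(graph_a, p)
    print(f"{p} processes: {steps} steps, speedup {Fraction(len(graph_a), steps)}")
```

Under this assumption the scheduler needs 8, 5 and 4 steps with 2, 4 and 8 processes, which matches the entries 15/8, 15/5 and 15/4 in column A.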

Assignment 2

The task graph is shown in Figure 1.

Figure 1: Task Graph for LU factorization

In both cases the problem can be solved in seven steps. With four processes this is simple, as the maximum degree of concurrency is four. With three processes, you need to look at the dependencies between the tasks and decide how to split the work between steps. Tasks 3, 7, and 9 can each be postponed by one step and the computation still finishes in seven steps. The exact orders are shown in Table 2.

Step   4 processes    3 processes
1      1              1
2      5, 2, 4, 3     5, 2, 4
3      8, 6, 7, 9     8, 6, 3
4      10             10, 7, 9
5      12, 11         12, 11
6      13             13
7      14             14

Table 2: Execution orders with 3 and 4 processes
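A schedule like the ones in Table 2 can be checked mechanically. The sketch below is a minimal checker. Figure 1 is not reproduced in this text, so the dependency list `DEPS` is an assumption based on the usual task graph of a 3x3 block LU factorization. Step 2 of the three-process schedule is taken to be tasks 5, 2, 4, so that every task is scheduled exactly once.

```python
# Assumed dependencies (Figure 1 is not reproduced here): task -> prerequisites.
DEPS = {
    1: [], 2: [1], 3: [1], 4: [1], 5: [1],
    6: [2, 4], 7: [3, 4], 8: [2, 5], 9: [3, 5],
    10: [6], 11: [7, 10], 12: [8, 10], 13: [9, 11, 12], 14: [13],
}

# One list of task numbers per step.
FOUR = [[1], [5, 2, 4, 3], [8, 6, 7, 9], [10], [12, 11], [13], [14]]
THREE = [[1], [5, 2, 4], [8, 6, 3], [10, 7, 9], [12, 11], [13], [14]]


def is_valid(schedule, processes):
    """True if every task runs exactly once, only after all of its
    prerequisites, and no step uses more than `processes` tasks."""
    done = set()
    for step in schedule:
        if len(step) > processes or len(set(step)) != len(step):
            return False
        if any(t in done for t in step):
            return False
        if any(dep not in done for t in step for dep in DEPS[t]):
            return False
        done.update(step)
    return done == set(DEPS)


print(is_valid(FOUR, 4), is_valid(THREE, 3), len(FOUR), len(THREE))
```

With the assumed dependencies both schedules pass the check and both take seven steps.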

Assignment 3

$$S = \frac{W}{T_p} = \frac{W}{W_s + \frac{W - W_s}{p}}$$

As $p$ increases, $\frac{W - W_s}{p}$ approaches zero. But no matter how large $p$ is, $S$ cannot exceed $\frac{W}{W_s}$.
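A quick numeric check makes the bound visible. The sketch below evaluates the formula for hypothetical values, reading $W$ as the total work and $W_s$ as its serial part (the usual interpretation, not spelled out above); the numbers 100 and 10 are made up for illustration.

```python
def speedup(W, W_s, p):
    """S = W / (W_s + (W - W_s) / p)."""
    return W / (W_s + (W - W_s) / p)


W, W_s = 100.0, 10.0  # hypothetical: one tenth of the work is serial
for p in (1, 2, 10, 100, 10_000):
    print(f"p = {p:>6}: S = {speedup(W, W_s, p):.3f}")
print(f"upper bound W / W_s = {W / W_s:.3f}")
```

The printed speedups grow with $p$ but stay below $W / W_s$, which is 10 for these values.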

Assignment 4



a) 12 steps are needed: the sequential version, which always searches the leftmost subtree first, opens 12 nodes.



b) Only 5 steps are needed, so $S = \frac{12}{5}$, when the root node is opened by one task and the two tasks then work on the two identical subtrees starting from the second level. The speedup is greater than 2 because the parallel algorithm performs less work: the sequential version opens 12 nodes, whereas the parallel version opens only 9.
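The counts above can be reproduced with a small simulation. The search tree itself is not shown in this text, so the sketch below uses a hypothetical tree chosen to match the numbers: a root with two identical 7-node subtrees, with the goal being the fourth node visited in the right subtree. All names (`subtree`, `GOAL`, and so on) are made up for illustration.

```python
def subtree(p):
    """A 7-node complete binary tree as (name, children) tuples."""
    return (p + "1", [
        (p + "2", [(p + "3", []), (p + "4", [])]),
        (p + "5", [(p + "6", []), (p + "7", [])]),
    ])


LEFT, RIGHT = subtree("L"), subtree("R")
ROOT = ("root", [LEFT, RIGHT])
GOAL = "R4"  # hypothetical goal: 4th node in preorder of the right subtree


def preorder(node):
    """Yield node names in leftmost-first depth-first order."""
    name, children = node
    yield name
    for child in children:
        yield from preorder(child)


def sequential_steps():
    """Nodes opened by a single task up to and including the goal."""
    for count, name in enumerate(preorder(ROOT), start=1):
        if name == GOAL:
            return count


def parallel_steps_and_work():
    """One task opens the root; then two tasks search one subtree each in
    lockstep and both stop after the step in which the goal is opened."""
    steps, opened = 1, 1
    for pair in zip(preorder(LEFT), preorder(RIGHT)):
        steps += 1
        opened += 2
        if GOAL in pair:
            return steps, opened


seq = sequential_steps()
par, work = parallel_steps_and_work()
print(seq, par, work, seq / par)
```

For this tree the sequential search opens 12 nodes, while the parallel search finishes in 5 steps after opening 9 nodes, giving the speedup 12/5 = 2.4.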
