A Precise Bandwidth Control Arbitration Algorithm for Hard ... - ASP-DAC

A Precise Bandwidth Control Arbitration Algorithm for Hard Real-Time SoC Buses
Bu-Ching Lin, Geeng-Wei Lee, Juinn-Dar Huang, Jing-Yang Jou
Department of Electronics Engineering, National Chiao Tung University, Hsinchu, Taiwan

2007/01/24


Outline
- Introduction
- Previous Works
  - fixed priority
  - time division multiple access (TDMA)
  - Lottery
  - RT_lottery
- Proposed Arbitration Architecture
- Experimental Results
- Conclusions


Introduction (1/2)
- A shared bus is widely used in current SoC designs
  - master: initiates communication transactions
  - slave: responds to transactions initiated by masters
  - arbiter: manages the usage of the bus


Introduction (2/2)
- Requirements in different applications
  - real-time applications: complete the transactions of all requests before their corresponding deadlines
  - multimedia applications: receive at least a fixed fraction of the total bandwidth
- It is difficult to satisfy both real-time and bandwidth requirements simultaneously
  - an innovative arbitration algorithm is required


Previous Works
- Existing arbitration algorithms
  - fixed priority
  - time division multiple access (TDMA)
  - Lottery
  - RT_lottery


Fixed Priority
- Among the requesting masters, the one with the highest priority is granted
- Pros
  - simple, low hardware cost, and easy to implement
- Cons
  - starvation problem: masters with lower priority are rarely served
  - lack of control over real-time and bandwidth requirements
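
A minimal sketch of fixed-priority arbitration, assuming that a lower master index means a higher priority (the slides do not say how priorities are encoded). The function name `fixed_priority_grant` is hypothetical. The short demo at the end illustrates the starvation problem: when master 0 requests in every cycle, master 1 never wins.

```python
from typing import Optional, Sequence


def fixed_priority_grant(requests: Sequence[bool]) -> Optional[int]:
    """Return the index of the highest-priority requesting master.

    Assumption: index 0 is the highest priority. Returns None when
    nobody is requesting (the bus stays idle).
    """
    for master, requesting in enumerate(requests):
        if requesting:
            return master
    return None


# Starvation demo: master 0 requests every cycle, so master 1 is never granted.
grants = [fixed_priority_grant([True, True]) for _ in range(1000)]
starved = grants.count(1) == 0
```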


TDMA (1/2)
- Execution time is divided into time slots, which are statically assigned to masters
- A second level of arbitration is usually added to reclaim time slots that would otherwise be wasted


TDMA (2/2)
- Pros
  - deterministic worst-case response latency
  - reserved bandwidth for each master
- Cons
  - difficult to design time slot sequences in an unpredictable system
  - more slots => more bandwidth and shorter latency
    - what if a master has a LOW bandwidth requirement but needs a SHORT response latency?
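
The two TDMA slides can be sketched as a slot table plus a fallback. This is only an illustration: the slides do not say which policy the second level uses, so the round-robin scan below is an assumption, and `tdma_grant` and `slot_table` are hypothetical names.

```python
from typing import Optional, Sequence


def tdma_grant(cycle: int,
               slot_table: Sequence[int],
               requests: Sequence[bool]) -> Optional[int]:
    """Grant the owner of the current time slot if it is requesting.

    If the owner is idle, a second level of arbitration hands the slot
    to another requester (assumed here: round-robin scan starting after
    the owner) so the slot is not wasted.
    """
    owner = slot_table[cycle % len(slot_table)]
    if requests[owner]:
        return owner
    masters = len(requests)
    for offset in range(1, masters):
        candidate = (owner + offset) % masters
        if requests[candidate]:
            return candidate
    return None  # nobody is requesting


# Master 0 owns two of four slots, so it is reserved half the bandwidth.
slot_table = [0, 0, 1, 2]
```

Because bandwidth and latency both follow from how many slots a master owns and how they are spaced, a master that needs short latency but little bandwidth has no natural encoding in `slot_table`, which is the drawback the slide points out.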


Lottery
- The arbiter grants a master stochastically from among the contending requests
  - i.e., a weighted random mechanism

Example (from the figure):

Master   Tickets (weight)   Request
M1       1                  Y
M2       2                  N
M3       3                  Y
M4       4                  Y

Lottery Manager: only the requesting masters enter the draw, so 1 + 3 + 4 = 8 ticket entries are valid:
T[0] = M1, T[1..3] = M3, T[4..7] = M4, T[8..9] = XX (unused).
Rand[0, 8) = 5 selects T[5] = M4, so M4 is granted.
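
The lottery draw in this example can be sketched as follows. The function names are hypothetical; `pick_winner` walks the cumulative ticket ranges of the requesting masters, which is equivalent to indexing the ticket table T[] shown on the slide.

```python
import random
from typing import Optional, Sequence


def pick_winner(tickets: Sequence[int],
                requests: Sequence[bool],
                draw: int) -> int:
    """Map a draw in [0, total valid tickets) to the winning master index."""
    upper = 0
    for master, (count, requesting) in enumerate(zip(tickets, requests)):
        if not requesting:
            continue  # non-requesting masters hold no valid entries
        upper += count
        if draw < upper:
            return master
    raise ValueError("draw is outside the valid ticket range")


def lottery_grant(tickets: Sequence[int],
                  requests: Sequence[bool],
                  rng: random.Random) -> Optional[int]:
    """Grant one requesting master with probability proportional to its tickets."""
    total = sum(c for c, r in zip(tickets, requests) if r)
    if total == 0:
        return None
    return pick_winner(tickets, requests, rng.randrange(total))


# Slide example: tickets 1, 2, 3, 4; M2 is not requesting; the draw is 5.
winner = pick_winner([1, 2, 3, 4], [True, False, True, True], 5)  # index 3, i.e. M4
```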


Why Lottery is not suitable in SoC?
- A similar requesting rate for each master is assumed
- What if the requesting rates are not similar?
  - e.g., 3 masters with the same ticket assignment; the requesting rate of one master varies from 80% to 40%, while the rate of the other 2 is fixed at 40%

Bandwidth share of each master:

Requesting rate of M1   80%   70%   60%   50%   40%
M1                      49%   46%   42%   38%   33%
M2                      24%   26%   28%   30%   33%
M3                      24%   26%   28%   30%   33%

(80% is the diverse requesting rate case; 40% is the similar requesting rate case.)

=> ticket ratio ≠ bandwidth ratio
=> weight tuning is required
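
The effect can be illustrated with a small calculation. The model below is a deliberate simplification and an assumption, not the authors' simulation: each master requests independently with a fixed probability in every cycle, transactions last one cycle, and unserved requests are not carried over. It therefore does not reproduce the percentages on the slide, only the trend that equal tickets do not give equal bandwidth when requesting rates differ. The name `expected_shares` is hypothetical.

```python
from itertools import product
from typing import List, Sequence


def expected_shares(tickets: Sequence[int], rates: Sequence[float]) -> List[float]:
    """Expected fraction of granted cycles per master under a lottery.

    Enumerates every request pattern, weights it by its probability, and
    splits that probability among requesters in proportion to tickets.
    """
    granted = [0.0] * len(tickets)
    for pattern in product([False, True], repeat=len(tickets)):
        prob = 1.0
        for requesting, rate in zip(pattern, rates):
            prob *= rate if requesting else 1.0 - rate
        total = sum(t for t, r in zip(tickets, pattern) if r)
        if total == 0:
            continue  # idle cycle, or only zero-ticket masters requesting
        for master, requesting in enumerate(pattern):
            if requesting:
                granted[master] += prob * tickets[master] / total
    busy = sum(granted)
    return [g / busy for g in granted] if busy > 0 else granted


similar = expected_shares([1, 1, 1], [0.4, 0.4, 0.4])  # equal shares
diverse = expected_shares([1, 1, 1], [0.8, 0.4, 0.4])  # M1 takes more than its ticket share
```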


Latency and Bandwidth in Lottery
- Response latency and bandwidth allocation are both controlled by the number of tickets
  - e.g., 3 masters have similar traffic behaviors

[Figure: two plots over the ticket ratios 1:1:1, 2:1:1, 3:1:1, 4:1:1, 5:1:1, and 10:1:1; one shows average latency (cycles), the other bandwidth (%)]

more tickets
=> shorter response latency
=> more bandwidth allocation


Summary of Lottery
- Pros
  - good control over bandwidth allocation in network switching applications
  - fair average response latency
- Cons
  - no hard real-time consideration
  - no independent controllability over response latency and bandwidth allocation


RT_lottery
- A 2-level arbitration algorithm that deals with real-time and bandwidth requirements simultaneously
- The proposed architecture
  - 1st level: real-time handler
    - handles the hard real-time requirements
  - 2nd level: Lottery with tuned weight
    - reserves the bandwidth allocation for each master


Real-Time Handler
- Similar to earliest deadline first (EDF) scheduling
  - the request with the earliest deadline that is below the warning line is granted
- Deadline
  - the time limit for a master to complete a request
  - missing the deadline is regarded as a real-time violation
- Warning line
  - the worst case of scheduling the contending requests
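
One possible reading of the real-time handler, sketched under explicit assumptions: the slide does not define how "below the warning line" is measured, so the code treats a request as urgent when its slack (deadline minus current cycle minus remaining bus cycles) is at most `warning_line`. The `Request` fields and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Request:
    master: int
    deadline: int  # absolute cycle by which the transaction must finish
    cycles: int    # bus cycles the transaction still needs


def real_time_grant(pending: Sequence[Request],
                    now: int,
                    warning_line: int) -> Optional[int]:
    """EDF-like choice among requests that have dropped below the warning line.

    Assumption: "below the warning line" means slack <= warning_line.
    Returns None when no request is urgent, so a lower arbitration
    level (the lottery) can decide instead.
    """
    urgent = [r for r in pending if r.deadline - now - r.cycles <= warning_line]
    if not urgent:
        return None
    return min(urgent, key=lambda r: r.deadline).master


pending = [Request(0, 100, 4), Request(1, 20, 4), Request(2, 15, 4)]
```

Returning None for non-urgent traffic is what lets the handler sit in front of the lottery without taking over bandwidth allocation.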


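Weight Tuning
- A ticket redistribution mechanism that uses simulation to meet the required bandwidth
- Masters are grouped into three sets: Smore, Smeet, and Sless
- Td tickets are moved from Mmost (holding Tm tickets) to Mleast (holding Tl tickets)
  - Tm' = Tm - Td
  - Tl' = Tl + Td
- Initially, Td = Tm / 2
- If any master in Smore or Smeet moves to Sless
  - Td = Td / 2

[Figure: masters M1 to M5 grouped into Smore, Smeet, and Sless, before and after one ticket transfer from Mmost to Mleast]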
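
A rough sketch of one weight-tuning iteration, under stated assumptions: the slide gives only the update rules, so the code assumes Mmost is the master with the largest surplus over its requirement, Mleast the one with the largest shortfall, and Sless the set of masters whose allocated bandwidth is below the requirement. The bandwidth figures would come from a simulation run, which is outside this sketch. All function names are hypothetical.

```python
from typing import List, Sequence


def under_served(allocated: Sequence[float], required: Sequence[float]) -> List[bool]:
    """Membership in S_less: allocated bandwidth below the requirement."""
    return [a < r for a, r in zip(allocated, required)]


def tune_step(tickets: Sequence[int],
              allocated: Sequence[float],
              required: Sequence[float],
              td: int) -> List[int]:
    """Move td tickets from M_most to M_least (Tm' = Tm - Td, Tl' = Tl + Td)."""
    surplus = [a - r for a, r in zip(allocated, required)]
    most = max(range(len(tickets)), key=lambda m: surplus[m])
    least = min(range(len(tickets)), key=lambda m: surplus[m])
    result = list(tickets)
    if surplus[least] >= 0 or most == least:
        return result  # every requirement is already met
    moved = min(td, result[most])  # never push a ticket count below zero
    result[most] -= moved
    result[least] += moved
    return result


def next_td(td: int, before: Sequence[bool], after: Sequence[bool]) -> int:
    """Halve Td when a master from S_more or S_meet has fallen into S_less."""
    regressed = any(now and not was for was, now in zip(before, after))
    return max(1, td // 2) if regressed else td
```

A caller would alternate `tune_step`, a fresh simulation, and `next_td` until all requirements hold or Td can no longer shrink.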


Fail Case of Weight Tuning
- Weight tuning can fail to meet bandwidth requirements when the masters' requesting rates are diverse
  - e.g., the requesting rates of the three masters are 80%, 30%, and 30%, respectively, and each of them requires at least 30% of the bandwidth

Allocated bandwidth (%) for each ticket assignment of M1, M2, and M3:

Ticket assignment   M1   M2   M3
100:100:100         60   19   19
60:120:120          58   20   20
44:128:128          57   21   21
34:133:133          56   21   21
28:136:136          55   21   22
16:142:142          55   22   22
8:146:146           54   22   22
2:149:149           53   23   23

M1 is over-allocated bandwidth even with few tickets.

=> weight tuning is not a panacea!


Summary of Previous Works

Algorithm        Pros                                 Cons
fixed priority   simplicity                           no real-time consideration
                 area efficiency                      no means for bandwidth control
TDMA             deterministic worst-case latency     no hard real-time guarantee
                 reserved bandwidth allocation        no precise bandwidth control
Lottery          reserved bandwidth allocation        no real-time consideration
                 fair average latency                 no precise bandwidth control
RT_lottery       hard real-time guarantee             limitation of Weight Tuning


RB_lottery Architecture
- 3-level arbitration algorithm
  - real-time handler: handles the hard real-time requirements
  - Lottery with tuned weight: reserves the bandwidth allocation for each master
  - bandwidth regulator: provides fine-grained control over bandwidth allocation


An Example of RB_lottery
- The bandwidth regulator monitors the bus traffic
  - records the transactions of each master
  - temporarily blocks the requests from masters that have already received their required bandwidth in a period

[Figure: the Real-Time Handler, Bandwidth Regulator, and Lottery Manager acting on masters M1 to M4 with tickets (weights) 1, 2, 3, and 4 and request map Y, N, Y, Y. The regulator's block signals mask the request map before the lottery.]

In the figure, the ticket table holds only T[0] = M1 and T[1..3] = M3 (T[4..9] = XX, unused), which indicates that M4's request is currently blocked.
Rand[0, 4) = 2 selects T[2] = M3, so M3 is granted.


The Implementation
- Observation window (w): the execution time is divided into windows of w cycles
  - block the requests of over-served masters temporarily
- Bandwidth register: the bandwidth allocated in the current window

[Figure: execution time divided into consecutive windows of size w; with required bandwidth = 30%, the allocated bandwidth is 30% in each window]
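
A minimal software model of the observation window and bandwidth registers, offered as a sketch rather than the authors' hardware design. It assumes that required bandwidth is given as an integer percentage, that bandwidth is counted in bus cycles, and that a master is blocked as soon as its count reaches its share of the window. The class and attribute names are hypothetical.

```python
from typing import List, Optional, Sequence


class BandwidthRegulator:
    """Counts granted cycles per master inside a window of `window` cycles."""

    def __init__(self, required_percent: Sequence[int], window: int) -> None:
        self.window = window
        # Cycle budget per window, e.g. 30% of a 10-cycle window is 3 cycles.
        self.budget = [p * window // 100 for p in required_percent]
        self.used = [0] * len(required_percent)  # the bandwidth registers
        self.elapsed = 0

    def blocked(self) -> List[bool]:
        """Block signals: True for masters that already got their share."""
        return [u >= b for u, b in zip(self.used, self.budget)]

    def tick(self, granted: Optional[int]) -> None:
        """Advance one bus cycle; `granted` is the master using the bus, if any."""
        if granted is not None:
            self.used[granted] += 1
        self.elapsed += 1
        if self.elapsed == self.window:
            # A new window starts: clear the registers, releasing all blocks.
            self.used = [0] * len(self.used)
            self.elapsed = 0


regulator = BandwidthRegulator([30, 50], window=10)
for _ in range(3):
    regulator.tick(0)
blocked_mid_window = regulator.blocked()  # master 0 has used its 3 cycles
```

Integer percentages are used instead of floating-point fractions so that the per-window budget does not suffer from rounding surprises.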


RB_lottery Algorithm Flow
1. check whether a new window starts
2. grant the most urgent master
3. stochastically grant an unblocked master
4. record the transaction cycles
5. check the allocated bandwidth
6. reset all the blocked signals and proceed to the next window

[Flowchart: Start -> (1) Proceed to the next window? If yes, (6) release all the blocked masters. -> (2) Real-Time Handler -> Any master gets granted? If no, (3) Lottery Manager. -> (4) Record the transaction amounts of the current granted master -> (5) Meet requirement? If yes, block the request of the current granted master.]
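
The six steps can be strung together in one per-cycle function. This is an illustrative sketch under several assumptions that the slides leave open: every grant lasts one bus cycle, the real-time handler's decision arrives as the argument `urgent` (a master index or None), and an urgent grant bypasses the block signals. The function and parameter names are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def rb_lottery_cycle(cycle: int, window: int,
                     tickets: Sequence[int], budget: Sequence[int],
                     used: List[int], blocked: List[bool],
                     requests: Sequence[bool], urgent: Optional[int],
                     rng: random.Random) -> Optional[int]:
    """Arbitrate one bus cycle; `used` and `blocked` are updated in place."""
    # Steps 1 and 6: at a window boundary, release all blocked masters.
    if cycle % window == 0:
        for master in range(len(used)):
            used[master] = 0
            blocked[master] = False
    # Step 2: the real-time handler's choice, if any, wins outright.
    winner = urgent
    # Step 3: otherwise hold a lottery among unblocked requesters.
    if winner is None:
        pool = [m for m, r in enumerate(requests) if r and not blocked[m]]
        weights = [tickets[m] for m in pool]
        if pool and sum(weights) > 0:
            winner = rng.choices(pool, weights=weights)[0]
    if winner is None:
        return None  # idle cycle
    # Step 4: record the transaction cycle of the granted master.
    used[winner] += 1
    # Step 5: block the master once it has met its requirement.
    if used[winner] >= budget[winner]:
        blocked[winner] = True
    return winner


rng = random.Random(0)
used, blocked = [0, 0], [False, False]
grants = [rb_lottery_cycle(c, 10, [1, 1], [2, 8], used, blocked,
                           [True, True], None, rng) for c in range(10)]
```

In the demo both masters hold equal tickets and request constantly, yet within the 10-cycle window master 0 receives exactly its 2-cycle budget and master 1 its 8-cycle budget, because each is blocked once its budget is reached.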


Experimental Environment
- Transaction-level model in SystemC
  - different arbitration algorithms are used in the experiments: fixed priority, Lottery, TDMA + Lottery, RT_lottery, and RB_lottery


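3 Types of Masters
Example: beat = 4, interval = 15, Rcycles = 10
- D type (D for Dependency)
  - request at cycle 2, grant at cycle 5, finish at cycle 9; next request at cycle 24 (consistent with finish + interval)
- D_R type (D for Dependency, R for Real-time)
  - request at cycle 2, grant at cycle 5, finish at cycle 9; next request at cycle 24
  - cycle 12 is also marked (consistent with request + Rcycles)
- ND_R type (ND for No Dependency, R for Real-time)
  - cycle 12 is marked; next request at cycle 17 (consistent with request + interval)

[Figure: one timing diagram per master type, each over a cycle axis]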
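
The cycle numbers in the example are consistent with the following reading, which is an interpretation of the timing diagrams rather than a definition given on the slide: a dependent master issues its next request `interval` cycles after the previous transaction finishes, a non-dependent master issues it `interval` cycles after the previous request, and a real-time master must finish within `rcycles` cycles of its request. The function names are hypothetical.

```python
from typing import Optional


def next_request(master_type: str, request: int, finish: int, interval: int) -> int:
    """Cycle of the next request under the assumed dependency semantics."""
    if master_type in ("D", "D_R"):
        return finish + interval   # waits for the previous transaction
    if master_type == "ND_R":
        return request + interval  # independent of when it was served
    raise ValueError(f"unknown master type: {master_type}")


def deadline(master_type: str, request: int, rcycles: int) -> Optional[int]:
    """Deadline cycle for real-time types; None for the plain D type."""
    if master_type in ("D_R", "ND_R"):
        return request + rcycles
    return None


# Slide example: request at 2, finish at 9, interval = 15, Rcycles = 10.
d_next = next_request("D", request=2, finish=9, interval=15)       # 24
nd_next = next_request("ND_R", request=2, finish=9, interval=15)   # 17
dr_deadline = deadline("D_R", request=2, rcycles=10)               # 12
```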


Experiment Setup
- Behavior of masters

Master     type   beat/prob.     interval/prob.
Master 1   D      8/50, 16/50    6/10, 7/20, 8/40, 9/20, 10/10
Master 2   D      1/50, 4/50     10/10, 11/20, 12/40, 13/20, 14/10
Master 3   D      8/50, 16/50    6/10, 7/20, 8/40, 9/20, 10/10
Master 4   D      1/50, 4/50     10/10, 11/20, 12/40, 13/20, 14/10
Master 5   D_R    8/50, 16/50    10/10, 11/20, 12/40, 13/20, 14/10
Master 6   D_R    1/50, 4/50     10/10, 11/20, 12/40, 13/20, 14/10
Master 7   ND_R   8/50, 16/50    65/10, 66/20, 67/40, 68/20, 69/10
Master 8   ND_R   1/50, 4/50     85/10, 86/20, 87/40, 88/20, 89/10

(The slide's legend marks each master as Heavy-Traffic or Light-Traffic.)

- 4 D type masters, 2 D_R type masters, and 2 ND_R type masters in the simulation system
- Half of the masters are heavy-traffic


Performance Comparisons (1/2)
- Fail cases of different arbitration algorithms
  - 100 random required-bandwidth combinations for each workload
  - 102400 simulation cycles for each combination

Workload (%)   Fixed Priority   Lottery   TDMA + Lottery   RT_lottery   RB_lottery
60             100              100       95               0            0
65             100              100       98               0            0
70             100              100       100              0            0
75             100              100       100              10           0
80             100              100       100              18           0
85             100              100       100              37           1
90             100              100       100              55           12
95             100              100       100              74           44


Performance Comparisons (2/2)
- Hardware comparisons

Algorithm        Gate counts
Fixed Priority   215
Lottery          4296
TDMA + Lottery   4917
RT_lottery       5134
RB_lottery       5814

- Summary

Algorithm        real-time capability   bandwidth capability
fixed priority   no consideration       poor
Lottery          no consideration       good, but weight tuning is required
TDMA + Lottery   no guarantee           good only on a lightly loaded bus (workload < 60%)
RT_lottery       always holds           good for workload < 75%, but still fails on a highly loaded bus
RB_lottery       always holds           good even on an extremely highly loaded bus


Observation Window Comparisons
- Fail cases of RB_lottery for different observation window sizes
  - 100 random required-bandwidth combinations for each workload
  - 102400 simulation cycles for each combination
  - the size of the observation window ranges from 128 to 2048

Number of fail patterns:

Observation window   Workload (%)
                     85   87   89   91   93   95
128                  4    11   25   37   42   57
256                  1    1    11   25   31   44
512                  0    0    4    10   24   33
1024                 0    0    2    7    20   32
2048                 0    0    0    7    14   28

[Figure: bar chart of the number of fail patterns against workload (%) and the size of the observation window]


Conclusions
- RB_lottery is proposed to provide
  - a hard real-time guarantee
  - fine-grained bandwidth control
- The observation window in the bandwidth regulator
  - the larger the observation window, the better the controllability over bandwidth requirements


Thank you!
