Pipelining II - Hazards, bypassing

60 downloads 6490 Views 157KB Size Report
CS352, S07. Lecture 12. 7. Memory hazards must stall: The “Hazard detection unit” inserts a NOP. 0. M. WB. WB. Data memory. Instruction memory. M u x. M u x .
Lecture 12: Pipelining Hazards • Administrative

– HW #3 due – HW #4 handed out, due Thursday after spring break – Exam #1 handed out

• Review pipelining

• 5 stages • Data Hazards

– Fixing them with forwarding

• Memory Hazards

• Today

– Fixing them with stalls

– Control hazards • Branch delay slot, branch prediction – Branch prediction Lecture 12

UTCS CS352, S07

1

Split machine into 5 pipeline stages

IF/ID

ID/EX

EX/MEM

MEM/WB

Add 4 Shift left 2

PC

Address

Instruction memory

Add Add result

Read register 1

Read data 1 Read register 2 Registers Read Write data 2 register

Zero ALU ALU result

Read data

Address Data memory

Write data Write data 16

Sign extend

32

It’s like attaching a note to the laundry basket used for a particular load UTCS CS352, S07

Lecture 12

2

1

Several instructions in progress at once

IF/ID

ID/EX

EX/MEM

MEM/WB

Add Add Add result

4 Shift left 2

PC

Read register 1

Address

Read data 1 Read register 2 Registers Read Write data 2 register

Instruction memory

Zero ALU ALU result

Read data

Address Data memory

Write data Write data 16

Sign extend

32

Lecture 12

UTCS CS352, S07

3

Control signals are carried with each instruction PCSrc

ID/EX

Control

IF/ID

WB

EX/MEM

M

WB

EX

M

MEM/WB WB

Add 4 Shift left 2

PC

Address

Instruction memory

Add Add result

Branch

ALUSrc

Read register 1

Read data 1 Read register 2 Registers Read Write data 2 register

Zero ALU ALU result

Read data

Address Data memory

Write data Write data Instruction [15–0]

16

Sign extend

32

Instruction [20–16]

LW R1, 4(R2)

6

ALU control

MemRead

ALUOp

Instruction [15–11] RegDst

UTCS CS352, S07

Lecture 12

4

2

Three kinds of Pipeline Hazards • Data hazards – an instruction uses the result of a previous instruction (RAW) ADD R1, R2, R3 or SW R1, 3(R2) ADD R4, R1, R5 LW R3, 3(R2)

• Control hazards – the location of an instruction depends on a previous instruction JMP LOOP … LOOP: ADD R1, R2, R3

• Structural hazards – two instructions need access to the same resource • e.g., single memory shared for instruction fetch and load/store Lecture 12

UTCS CS352, S07

5

Bypassing: send data directly to where it’s needed

Hazard detection unit

ID/EX.MemRead

ID/EX WB M u x

Control 0

IF/ID

EX/MEM

M

WB

EX

M

MEM/WB WB

M u x Registers M u x

ALU PC

Instruction memory

M u x

Data memory

IF/ID.RegisterRs IF/ID.RegisterRt IF/ID.RegisterRt

Rt

IF/ID.RegisterRd

Rd

M u x

ID/EX.RegisterRt Rs Rt

ADD ADD SUB XOR

R1, R4, R5, R7, UTCS CS352, S07

R2, R1, R1, R8,

Forwarding unit

R3 R5 R6 R1 Lecture 12

6

3

Memory hazards must stall: The “Hazard detection unit” inserts a NOP Hazard detection unit

ID/EX.MemRead

ID/EX WB M u x

Control 0

IF/ID

EX/MEM

M

WB

EX

M

MEM/WB WB

M u x Registers M u x

ALU PC

Instruction memory

M u x

Data memory

IF/ID.RegisterRs IF/ID.RegisterRt IF/ID.RegisterRt

Rt

IF/ID.RegisterRd

Rd

M u x

ID/EX.RegisterRt

LW ADD

Rs Rt

R1, 0(R2) R4, R1, R5 UTCS CS352, S07

Lecture 12

Forwarding unit

7

Control Hazards

UTCS CS352, S07

Lecture 12

8

4

Control Hazard: Branch on condition At what stage do we know if conditional is true? What are the options for dealing with this?

Hazard detection unit

ID/EX.MemRead

ID/EX WB M u x

Control 0

IF/ID

EX/MEM

M

WB

EX

M

MEM/WB WB

M u x Registers M u x

ALU Instruction memory

PC

M u x

Data memory

IF/ID.RegisterRs IF/ID.RegisterRt IF/ID.RegisterRt

Rt

IF/ID.RegisterRd

Rd

M u x

ID/EX.RegisterRt

XX:

Rs Rt

BGE R2, R3, ZZZ ... ADD ...

Forwarding unit

Lecture 12

UTCS CS352, S07

9

Move branch logic to ID stage How many cycles of stall now?

IF.Flush Hazard detection unit ID/EX WB Control 0 IF/ID

M u x

+

EX/MEM

M

WB

EX/MEM

EX

M

WB

+ 4

M u x

Shift left 2

Registers PC

=

M u x

Instruction memory

ALU M u x

Data memory

M u x

Sign extend

M u x Fowarding unit

XX:

BGE R2, R3, ZZZ ... ADD ... UTCS CS352, S07

Lecture 12

10

5

Branch Delay Slots • Since we need to have a dead cycle anyway, let’s put a useful instruction there ADD R2,R3,R4 BNEZ R5,_loop NOP

• Advantage: – Do more useful work – Potentially get rid of all stalls

BNEZ R5,_loop ADD R2,R3,R4

• Disadvantage: – Exposes microarchitecture to ISA – Deeper pipelines require more delay slots Lecture 12

UTCS CS352, S07

11

Speculating for control hazards • Conservatively, the pipeline waits until the branch target is computed before fetching the next instruction. • Alternatively, we can speculate which direction and to what address the branch will go.

F R X M W

F R X M W F R X M W

F R X M W

• Need to confirm speculation and back up later.

UTCS CS352, S07

F R X M W

F

Lecture 12

F R X M W

12

6

How can we predict branch direction? • Past history of this branch – Last time – Last two time (why is this useful?)

• Past history of last several branches – Why is this useful? for (i=0; i