CS352, S07. Lecture 12. 7. Memory hazards must stall: The “Hazard detection
unit” inserts a NOP. 0. M. WB. WB. Data memory. Instruction memory. M u x. M u x
.
Lecture 12: Pipelining Hazards • Administrative
– HW #3 due – HW #4 handed out, due Thursday after spring break – Exam #1 handed out
• Review pipelining
• 5 stages • Data Hazards
– Fixing them with forwarding
• Memory Hazards
• Today
– Fixing them with stalls
– Control hazards • Branch delay slot, branch prediction – Branch prediction Lecture 12
UTCS CS352, S07
1
Split machine into 5 pipeline stages
IF/ID
ID/EX
EX/MEM
MEM/WB
Add 4 Shift left 2
PC
Address
Instruction memory
Add Add result
Read register 1
Read data 1 Read register 2 Registers Read Write data 2 register
Zero ALU ALU result
Read data
Address Data memory
Write data Write data 16
Sign extend
32
It’s like attaching a note to the laundry basket used for a particular load UTCS CS352, S07
Lecture 12
2
1
Several instructions in progress at once
IF/ID
ID/EX
EX/MEM
MEM/WB
Add Add Add result
4 Shift left 2
PC
Read register 1
Address
Read data 1 Read register 2 Registers Read Write data 2 register
Instruction memory
Zero ALU ALU result
Read data
Address Data memory
Write data Write data 16
Sign extend
32
Lecture 12
UTCS CS352, S07
3
Control signals are carried with each instruction PCSrc
ID/EX
Control
IF/ID
WB
EX/MEM
M
WB
EX
M
MEM/WB WB
Add 4 Shift left 2
PC
Address
Instruction memory
Add Add result
Branch
ALUSrc
Read register 1
Read data 1 Read register 2 Registers Read Write data 2 register
Zero ALU ALU result
Read data
Address Data memory
Write data Write data Instruction [15–0]
16
Sign extend
32
Instruction [20–16]
LW R1, 4(R2)
6
ALU control
MemRead
ALUOp
Instruction [15–11] RegDst
UTCS CS352, S07
Lecture 12
4
2
Three kinds of Pipeline Hazards • Data hazards – an instruction uses the result of a previous instruction (RAW) ADD R1, R2, R3 or SW R1, 3(R2) ADD R4, R1, R5 LW R3, 3(R2)
• Control hazards – the location of an instruction depends on a previous instruction JMP LOOP … LOOP: ADD R1, R2, R3
• Structural hazards – two instructions need access to the same resource • e.g., single memory shared for instruction fetch and load/store Lecture 12
UTCS CS352, S07
5
Bypassing: send data directly to where it’s needed
Hazard detection unit
ID/EX.MemRead
ID/EX WB M u x
Control 0
IF/ID
EX/MEM
M
WB
EX
M
MEM/WB WB
M u x Registers M u x
ALU PC
Instruction memory
M u x
Data memory
IF/ID.RegisterRs IF/ID.RegisterRt IF/ID.RegisterRt
Rt
IF/ID.RegisterRd
Rd
M u x
ID/EX.RegisterRt Rs Rt
ADD ADD SUB XOR
R1, R4, R5, R7, UTCS CS352, S07
R2, R1, R1, R8,
Forwarding unit
R3 R5 R6 R1 Lecture 12
6
3
Memory hazards must stall: The “Hazard detection unit” inserts a NOP Hazard detection unit
ID/EX.MemRead
ID/EX WB M u x
Control 0
IF/ID
EX/MEM
M
WB
EX
M
MEM/WB WB
M u x Registers M u x
ALU PC
Instruction memory
M u x
Data memory
IF/ID.RegisterRs IF/ID.RegisterRt IF/ID.RegisterRt
Rt
IF/ID.RegisterRd
Rd
M u x
ID/EX.RegisterRt
LW ADD
Rs Rt
R1, 0(R2) R4, R1, R5 UTCS CS352, S07
Lecture 12
Forwarding unit
7
Control Hazards
UTCS CS352, S07
Lecture 12
8
4
Control Hazard: Branch on condition At what stage do we know if conditional is true? What are the options for dealing with this?
Hazard detection unit
ID/EX.MemRead
ID/EX WB M u x
Control 0
IF/ID
EX/MEM
M
WB
EX
M
MEM/WB WB
M u x Registers M u x
ALU Instruction memory
PC
M u x
Data memory
IF/ID.RegisterRs IF/ID.RegisterRt IF/ID.RegisterRt
Rt
IF/ID.RegisterRd
Rd
M u x
ID/EX.RegisterRt
XX:
Rs Rt
BGE R2, R3, ZZZ ... ADD ...
Forwarding unit
Lecture 12
UTCS CS352, S07
9
Move branch logic to ID stage How many cycles of stall now?
IF.Flush Hazard detection unit ID/EX WB Control 0 IF/ID
M u x
+
EX/MEM
M
WB
EX/MEM
EX
M
WB
+ 4
M u x
Shift left 2
Registers PC
=
M u x
Instruction memory
ALU M u x
Data memory
M u x
Sign extend
M u x Fowarding unit
XX:
BGE R2, R3, ZZZ ... ADD ... UTCS CS352, S07
Lecture 12
10
5
Branch Delay Slots • Since we need to have a dead cycle anyway, let’s put a useful instruction there ADD R2,R3,R4 BNEZ R5,_loop NOP
• Advantage: – Do more useful work – Potentially get rid of all stalls
BNEZ R5,_loop ADD R2,R3,R4
• Disadvantage: – Exposes microarchitecture to ISA – Deeper pipelines require more delay slots Lecture 12
UTCS CS352, S07
11
Speculating for control hazards • Conservatively, the pipeline waits until the branch target is computed before fetching the next instruction. • Alternatively, we can speculate which direction and to what address the branch will go.
F R X M W
F R X M W F R X M W
F R X M W
• Need to confirm speculation and back up later.
UTCS CS352, S07
F R X M W
F
Lecture 12
F R X M W
12
6
How can we predict branch direction? • Past history of this branch – Last time – Last two time (why is this useful?)
• Past history of last several branches – Why is this useful? for (i=0; i