Branch Prediction

Chapter 06: Instruction Pipelining and Parallel Processing Lesson 13: Dynamic Hardware Branch Prediction

Objective • Understand the advantage of branch prediction dynamically

Schaum’s Outline of Theory and Problems of Computer Architecture Copyright © The McGraw-Hill Companies Inc. Indian Special Edition 2009

2

Hardware for pre-calculation of Branch Address


3

Pre-calculation of Branch • Add hardware to allow the result of a branch instruction to be computed earlier in the pipeline


4

Example of Pre-calculation of Branch • If the pipeline computed the next value of the PC in the register read stage instead of the execute stage, the branch delay would be reduced to three cycles • Only two bubbles if PC value forwards to fetch stage


5

Four Cycles for New PC value and 2 bubbles Seven Clock Cycles Fetch an instruction Decode an instruction

BR I2

I3

I2

I3

I4

I5

BR I2

I3

I2

I3

I4

BR I2

I3

I2

I3

Read Inputs Perform Computations Write Outputs

Find PC

BR I1

I1

I2

NOP NOP

I1

BR BR


6

Branch Prediction


7

Condition Branch Instruction control hazard • Dependency of the PC on the result causes delay (branch penalty) because pipelined instruction has to wait till write-back of the PC finishes at the write-back stage


8

Branch Prediction • A powerful technique is there to predict the value of PC, which then proceeds to execute In, In+1, In+2, In+3, …., the new instruction path as per the condition at cond field • If the prediction is incorrect, then the path cancels and branch penalty imposes; else the prediction improves the performance


9

Branch Prediction • Add hardware that predicts the destination address of each branch before the branch completes • Allowing the processor to begin fetching instructions from that address earlier in the pipeline


10

SuperSparc processor • Uses the branch-prediction technique • Accurate predictions possible • A dynamic hardware predicts the PC and thus the instruction path


11

Seven Clock Cycles Incorrect Branch Prediction for I2 Seven Clock Cycles Fetch an instruction Decode an instruction Read Inputs Perform Computations Write Outputs

BR I2

I3

I4

I2

I3

I4

BR I2

I3

I4

I2

I3

BR I2

I3

I4

I2

Incorrect Branch Prediction

BR I2

I3

BR I1


I4 I1 12

Seven Clock Cycles without correct Branch Prediction for I2 Seven Clock Cycles Fetch an instruction Decode an instruction Read Inputs Perform Computations Write Outputs

BR I2

I3

I4

I5

I6

I7

BR I2

I3

I4

I5

I6

BR I2

I3

I4

I5

Correct Branch Prediction

BR I2

I3

BR I1


I4 I1 13

Branch Prediction • A dynamic hardware to predict the PC and thus the instruction path


14

Dynamic Branch Prediction and use of Ibit


15

Dynamic Prediction Hardware • Let us assign a control prediction bit p to the instruction In when it first executes • Processor hardware uses p bit with In to predict on subsequent use of In


16

Loops Handling with Dynamic Prediction Hardware • After the first assignment of p bit, the loop back address will be predicted and will not have any instruction that does not have the branch penalty • A correct prediction sets a bit q = 1 when p = 1


17

p-and q-bits Bit • As long as q remains 1, the prediction is correct and p remains 1 • When prediction is incorrect, the q resets to 0 and it forces p also 0 • The processor now follows the path before the prediction


18

Dynamic Branch Prediction and use of branch target buffer


19

Use of branch target look ahead buffer (BTB) • Let us assume that the hardware has a branch history table


20

Branch History Table Row • Instruction address • predicted branch target address • number of times prediction and number of times succeeded, made, and predicted correctly


21

Instruction In • Request to instruction In sent to instruction cache as well as to BTB


22

Use of branch target look ahead buffer (BTB) • If In matches an entry in BTB, then the predicted address is used and the field for the number of times prediction updates • After execution of the instruction, target address updates


23

Use of branch target look ahead buffer (BTB) • Also if prediction succeeds, the field for success is updated by increasing the value • If it fails, then by decreasing the value.


24

Summary


25

We Learnt • Advantage of branch prediction • Dynamic branch prediction • Use of instruction bits p and q, branch table low in branch target look ahead buffer


26

End of Lesson 13 on Dynamic Hardware Branch Prediction


27

Branch Prediction

Branch Prediction

Suggest Documents

Branch Prediction

Exploring Dynamic Branch Prediction Methods

Dynamic Branch Prediction - Google Groups

Branch Misprediction Prediction: Complementary ...

Exploring Dynamic Branch Prediction Methods

branch prediction for network processors - DORAS - DCU

Multiple Branch and Block Prediction - Semantic Scholar

practical multidimensional branch prediction - Joshua San Miguel

Evaluating Various Branch-Prediction Schemes for Biomedical

Dynamic Branch Prediction using Neural Networks - ULBS

Difficult-Path Branch Prediction Using Subordinate Microthreads ...

Countermeasures for the Simple Branch Prediction Analysis

Exploring Dynamic Branch Prediction Methods - Semantic Scholar

Performance Analysis of Branch Prediction Unit for

Exploring Dynamic Branch Prediction Methods - Semantic Scholar

Difficult-Path Branch Prediction Using Subordinate Microthreads ...

Compiler Controlled Value Prediction using Branch ... - CiteSeerX

Compiler Controlled Value Prediction using Branch ... - CiteSeerX

Prophet/critic hybrid branch prediction - CiteSeerX

Branch Prediction Techniques and Optimizations - Google Sites

On-Demand Branch Prediction - Google Sites

New Branch Prediction Vulnerabilities in OpenSSL and Necessary ...

Power-Aware Branch Prediction Techniques: A Compiler ... - CiteSeerX

Parallel Branch Prediction on GPU Platform - Semantic Scholar