Hazards, Forwarding, Branch Prediction

Hazards

Pipeline	Instruction	Stage
Immediate Memory	4	FETCH (IF)
Pipeline Register(PR1)	4	FETCH (IF)
Register (READ part)	3	DECODE (ID)
PR2	3	DECODE (ID)
ALU	2	EXECUTE (IDEX)
PR3	2	EXECUTE (IDEX)
DM	1	Memory (EX/MEM)
PR4	1	Memory (EX/MEM)
Register (WRITE part)	0	WRITE BACK (WB)

Example 1

Instruction	Assembly	Data Hazard
0	`add t0, t1, t2`
1	`sub t4, t0, t3`	RAW
2	`and t5, t0, t6`	RAW
3	`or t7, t0, t8`	RAW
4	`xor t9, t0, t10`

Data Hazards

RAW: (0,1), (0,2), (0,3)
WAR
WAW

Data hazard types

RAW (Read After Write)
WAR (Write After Read)
WAW (Write After Write)

Example 2

0: li t1, 1
1: li t2, 2
2: addi t1, t1, 1
3: beq t1, t2, crypt
4: and t3, t1, t2
5: j exit
crypt:
    6: xori t3, t1, 0xff # xor 0x2 0xff = 0xfd
exit:
    7: addi t3, t1, 0x00

Here there is a RAW hazard happening between instruction 0 and 2 as instruction 2 depends on t1, but as instruction 2 executes in EX/MEM instruction 0 is still in FETCH stage.

The solution to this hazard is to insert a NOP between instruction 0 and 2 so instruction 0 is decoded as instruction 2 is executed. (The instructions can execute in parallel)

0: li t1, 1
1: li t2, 2
2: nop # addi 0x, 0x, 0
3: addi t1, t1, 1
...

Another problem is the branching on 3 because as the instruction is in WB, instructions 4 and 5 has been predictably queued and are being FETCHED and DECODED. If these instructions were executed it would throw the program out of whack.

The solution here is pipeline flushing. The ripes simulator does this automatically, seemingly by replacing the flushed instructions with NOPs. Seems like it does this by clearing the pipeline registers containing the instructions of 4 and 5

Branch Prediction

How early can we know whether a branch has been taken?

In the DECODE stage you could have a mini BEQ check, thus moving the Branch Taken from the EXECUTE stage to the DECODE stage, 1 pipeline step earlier.

A loop in a program has low locality, it uses only part of the instructions in a program and only part of memory frequently.

Hardware can optimize for a loop with a 1 bit branch predictor and even optimize for a loop within a loop with a 2 bit branch predictor

Forwarding

Too little time for this

Exam

Starts with Potpurri section from all of pensum
3/4 bigger problems for focus ISA, microarchitecture, digital teknikk
OS paging

Begynn her!

Emner

EXPH0300

Forelesninger

øvingsoppgave

Seminar

IT2805

Forelesninger

IT2810

MA0001

eksamensøving

Forelesninger

ØKO1001

eksamen

Forelesninger

TDT4160

eksamensøving

Forelesninger

Kompendium

Calculus

Computer Science

Programming

Algorithms

Concepts

Design patterns

Languages

Paradigms

EXPH

Software Development

Hazards, Forwarding, Branch Prediction

Hazards

Example 1

Data hazard types

Example 2

Branch Prediction

Forwarding

Exam