5.1k views

Delayed branching can help in the handling of control hazards

The following code is to run on a pipelined processor with one branch delay slot:

I1: ADD $R2 \leftarrow R7 + R8$

I2: Sub $R4 \leftarrow R5 – R6$

I3: ADD $R1 \leftarrow R2 + R3$

I4: STORE Memory $[R4] \leftarrow R1$

BRANCH to Label if $R1 == 0$

Which of the instructions I1, I2, I3 or I4 can legitimately occupy the delay slot without any program modification?

1. I1
2. I2
3. I3
4. I4

retagged | 5.1k views
0
#plz give a link?
+4
Delayed Branching- this technique says that execute some meaningful instruction which is independent of other instructions before taking the branch to avoid stall cycle in the pipeline
Here I4 instruction says that "move the content of the register R1 into some main memory location whose address is stored in the register R4.
Here instruction I2 cannot be move after branch otherwise memory location stored in register R4 will get change for the instruction i4 and may throw some error while executing I4.
0
No re arrangement no mop substitute so I4 instruction occupy delay

What is Delayed Branching ?
One way to maximize the use of the pipeline, is to find an instruction that can be safely exeucted whether the branch is taken or not, and execute that instruction. So, when a branch instruction is encountered, the hardware puts the instruction following the branch into the pipe and begins executing it, just as in predict-not-taken. However, unlike in predict-not-taken, we do not need to worry about whether the branch is taken or not, we do not need to clear the pipe because no matter whether the branch is taken or not, we know the instruction is safe to execute.
More Read : https://www.cs.umd.edu/class/fall2001/cmsc411/projects/branches/delay.html
Moving $I_1$ after branch

• $I1$ is updating the value of  $R2$
• $R2$ which is used to determine branch condition $R1$
• Value of $R2$ is available after branch
$\Rightarrow$ Cannot be moved

Moving $I_3$ after branch

• value of $R1$ is computed in this instruction
• $R1$ is the branch condition
$\Rightarrow$ Cannot be moved

Moving $I_4$ after branch

• $I4$ is simple store instruction used to store R1 in memory

• program execution will have no effect if this is placed after conditional branch
$\Rightarrow$ Can be moved

Moving $I_2$ after branch

• It update the memory location to place the storing of conditional branch instruction $R1$
• If moved after branch , when compiler reaches $I4$ program execution will stop
$\Rightarrow$ Cannot be moved

Hence, option D is answer.

by
edited
0
Why do u think $I_{4}$ is better choice than $I_{2}$

in $I_{2}$ $R_{4}$ is a location of register

while in $I_{4}$ ,$R_{4}$ is memory location.

So, which one better to put in delay slot?
0
@srestha , just think what happens when you put I2 after braching ? ( when compiler reaches I4  ) will it be able to countinue ?

Well You can move I2 and I4 after brach . that option is not given so i choosed I4 .
If Anything wrong with my thinking  pls do  point it out
+13
The branch condition is dependent on R1 which is in turn dependent on R2. So moving instructions that modify R1 and R2 after the branch instruction is not the Option. So we can eliminate I1 and I3.

If we move I2 after branch, and keep all the I1, I3 and I4 before the branch, then a time will come when I4 is about to get executed. At that time the compiler would be clueless(or will get wrong/old value) as to what the value of R4 is. This is because R4 gets a value at I2 but that has been shifted after branch. So we cannot move I2 also.

The only remaining option left is I4.

Option D

@pC please check this and point out the mistakes (if any) in my understanding.
0
good explanation @PC boss
0

why I2 can’t be choice? We are using memory location of R2 in I4;

not the value of R2; so we should be able to move I2 too? @talha hashim

0
main objective here whatever instruction place after branch that should not affect the original result.......right?
0

$I_2$ must be executed before $I_4$

we have 5 inst. and let pipeline stages will be 5.so we can process all the inst. in (n+k-1)=(5+5-1)=9 clock cycle.

if in 9 clock cycle we are able to arrange in such a way that the dependancy will not violate.but the order may be different.  then it will be ok.

I1 CANT BE AFTER I3 (DUE TO DEPENDENCY ON R2)

I2 CANT BE AFTER I4 (COZ  I4 REQUIRE MEM[R4] OF I2)

I3 CANT BE AFTER I4 (DUE TO DEPENDENCY ON R1)

SO  I1 THEN I3

I2,I3 THEN I4

=>I4 CANT BE BEFORE I1,I2,I3.

=>LET I1 FIRST(5 CLOCK)

I2(1 CLOCK)

I3(1 CLOCK)

BRANCH(1 CLOCK)

I4 (1 CLOCK)

TOTAL 9 CLOCK AND THAT IS WHAT WE WANT
by

It can't be l1 or l3, because directly or indirectly they are taking part in the branching decision. Now we can have both l2 and l4 after the branching decision statement and  the order of I2 and I4 matters because in I2 we are getting the final value in register R4 and in instruction we are saving contents of R1 in memory whose address is stored in the register. So If we made I2 to be the instruction after branch then the value in the first loop itself the value stored in memory location whose address is stored in R4 would be wrong because it actually should have been updated first by R5-R6.So  I4 is correct. So (D) is correct option.

by