Delayed branching can help in the handling of control hazards.
The following code is to run on a pipelined processor with one branch delay slot:
I1: ADD $R2 \leftarrow R7 + R8$
I2: SUB $R4 \leftarrow R5 - R6$
I3: ADD $R1 \leftarrow R2 + R3$
I4: STORE Memory $[R4] \leftarrow R1$
BRANCH to Label if $R1 == 0$
Which of the instructions I1, I2, I3 or I4 can legitimately occupy the delay slot without any program modification?
What is delayed branching?
One way to make the most of the pipeline is to find an instruction that can safely be executed whether or not the branch is taken, and to execute that instruction. When a branch instruction is encountered, the hardware puts the instruction following the branch into the pipeline and begins executing it, just as in predict-not-taken. Unlike predict-not-taken, however, there is no need to worry about the branch outcome or to flush the pipeline: whichever way the branch goes, that instruction is safe to execute.
Further reading: https://www.cs.umd.edu/class/fall2001/cmsc411/projects/branches/delay.html
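To make the delay-slot behaviour concrete, here is a minimal Python sketch of the execution order. The program layout and the names `PROGRAM` and `trace` are hypothetical, chosen only for illustration, and the model assumes exactly one delay slot and no branch inside the slot.

```python
# Each entry is (name, branch_target). branch_target is None for ordinary
# instructions and an index into PROGRAM for a branch. Illustrative layout only.
PROGRAM = [
    ("BRANCH", 3),          # jumps to index 3 if taken
    ("SLOT", None),         # the delay slot: always executed
    ("FALLTHROUGH", None),  # executed only if the branch is not taken
    ("LABEL", None),        # branch target
]


def trace(program, taken):
    """Return the instruction names in the order they execute."""
    executed = []
    pc = 0
    while pc < len(program):
        name, target = program[pc]
        executed.append(name)
        if target is None:
            pc += 1
            continue
        # Assumed model: one delay slot, so the instruction right after the
        # branch runs regardless of the branch outcome.
        if pc + 1 < len(program):
            executed.append(program[pc + 1][0])
        pc = target if taken else pc + 2
    return executed


print(trace(PROGRAM, taken=True))   # ['BRANCH', 'SLOT', 'LABEL']
print(trace(PROGRAM, taken=False))  # ['BRANCH', 'SLOT', 'FALLTHROUGH', 'LABEL']
```

In both traces `SLOT` runs, which is why the instruction placed there must be correct on either path.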
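Moving $I_1$ after branch
$I_1$ writes $R2$, which $I_3$ reads to compute the branch condition $R1$
$\Rightarrow$ Cannot be moved
Moving $I_3$ after branch
$I_3$ computes $R1$, which the branch tests and $I_4$ stores
$\Rightarrow$ Cannot be moved
Moving $I_4$ after branch
$I_4$ is a simple store instruction that writes $R1$ to memory
Program execution is unaffected if it is placed after the conditional branch
$\Rightarrow$ Can be moved
Moving $I_2$ after branch
$I_2$ writes $R4$, which $I_4$ uses as the store address
$\Rightarrow$ Cannot be moved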
Hence, option D ($I_4$) is the answer.
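The case analysis above can be sanity-checked by brute force. The sketch below is only an illustration: the initial register values in `SAMPLE_REGS` are arbitrary assumptions, and the helper names `execute` and `with_slot` are hypothetical. It runs the original order, then each order with one candidate moved into the delay slot, and compares registers, memory, and the branch decision. Agreement on one sample input does not prove safety in general, but a mismatch is enough to rule a candidate out.

```python
ORIGINAL = ["I1", "I2", "I3", "I4", "BR"]

# Arbitrary starting values, assumed for illustration only.
SAMPLE_REGS = {"R1": 9, "R2": 10, "R3": 3, "R4": 100,
               "R5": 50, "R6": 20, "R7": 1, "R8": 2}


def execute(order, initial_regs):
    """Run the instructions in `order`; return (registers, memory, branch_taken)."""
    regs = dict(initial_regs)
    mem = {}
    taken = None
    for name in order:
        if name == "I1":
            regs["R2"] = regs["R7"] + regs["R8"]
        elif name == "I2":
            regs["R4"] = regs["R5"] - regs["R6"]
        elif name == "I3":
            regs["R1"] = regs["R2"] + regs["R3"]
        elif name == "I4":
            mem[regs["R4"]] = regs["R1"]
        elif name == "BR":
            # The branch decision is made here; the slot instruction runs after.
            taken = regs["R1"] == 0
    return regs, mem, taken


def with_slot(candidate):
    """Original order with `candidate` moved to just after the branch."""
    rest = [name for name in ORIGINAL if name != candidate]
    return rest + [candidate]


reference = execute(ORIGINAL, SAMPLE_REGS)
safe_candidates = [c for c in ["I1", "I2", "I3", "I4"]
                   if execute(with_slot(c), SAMPLE_REGS) == reference]
print(safe_candidates)  # ['I4']
```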
Why can't I2 be a choice? In I4 we use R4 only as a memory address,
not as a value, so shouldn't we be able to move I2 too? @talha hashim
It can't be I1 or I3 because, directly or indirectly, they take part in the branching decision. That leaves I2 and I4 as candidates for the slot after the branch, but the order of I2 and I4 matters: I2 produces the final value of register R4, and I4 stores the contents of R1 at the memory address held in R4. If I2 were placed after the branch, then even on the first pass I4 would store to the wrong memory location, because R4 would not yet have been updated to R5 - R6. So I4 is correct, and (D) is the correct option.
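The same reasoning can be phrased as a dependency check: an instruction may move past the branch only if it has no read-after-write, write-after-read, or write-after-write conflict with any instruction it would hop over, the branch included. The sketch below is a simplified model under stated assumptions: memory is treated as a single resource named `MEM`, and the names `INSTRS`, `conflicts`, and `can_fill_delay_slot` are hypothetical.

```python
# Read and write sets for each instruction in the question.
# Assumption: all of memory is modelled as one resource, "MEM".
INSTRS = {
    "I1": {"reads": {"R7", "R8"}, "writes": {"R2"}},
    "I2": {"reads": {"R5", "R6"}, "writes": {"R4"}},
    "I3": {"reads": {"R2", "R3"}, "writes": {"R1"}},
    "I4": {"reads": {"R4", "R1"}, "writes": {"MEM"}},
    "BR": {"reads": {"R1"}, "writes": set()},
}
ORDER = ["I1", "I2", "I3", "I4", "BR"]


def conflicts(earlier, later):
    """Resources that forbid swapping `earlier` with `later`."""
    raw = earlier["writes"] & later["reads"]    # read after write
    war = earlier["reads"] & later["writes"]    # write after read
    waw = earlier["writes"] & later["writes"]   # write after write
    return raw | war | waw


def can_fill_delay_slot(name):
    """True if `name` can move past every later instruction, including the branch."""
    index = ORDER.index(name)
    return all(not conflicts(INSTRS[name], INSTRS[later])
               for later in ORDER[index + 1:])


for name in ["I1", "I2", "I3", "I4"]:
    print(name, can_fill_delay_slot(name))
# I1 False  (I3 reads R2)
# I2 False  (I4 reads R4)
# I3 False  (I4 and the branch read R1)
# I4 True
```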
You should know the concept of delayed branching first for this type of problem. The delay slot is the instruction slot that follows the branch instruction; into it we insert an instruction that is always executed, whether the branch is taken or not.
Let the branch instruction be $I_j$ and the instruction that follows it be $I_{j+1}$. We can't place $I_{j+1}$ in the delay slot, since it should not execute when the branch is taken; instead we place an instruction that precedes the branch. For the program above, I4 should be placed in the delay slot.
So remember: in delayed branching we are not changing the logical order of the instructions; the branch simply takes effect one instruction later (it is delayed).