45,839 views
142 votes

A $5$ stage pipelined CPU has the following sequence of stages:

  • IF – instruction fetch from instruction memory
  • RD – Instruction decode and register read
  • EX – Execute: ALU operation for data and address computation
  • MA – Data memory access – for write access, the register read at RD stage is used.
  • WB – Register write back

Consider the following sequence of instructions:

  • $I_1$: $L$ $R0, loc1$; $R0 \Leftarrow M[loc1]$
  • $I_2$: $A$ $R0$, $R0$; $R0 \Leftarrow R0 +R0$
  • $I_3$: $S$ $R2$, $R0$; $R2 \Leftarrow R2-R0$

Let each stage take one clock cycle.

What is the number of clock cycles taken to complete the above sequence of instructions starting from the fetch of $I_1$?

  1. $8$
  2. $10$
  3. $12$
  4. $15$

4 Answers

Best answer
264 votes

The answer is option A ($8$ cycles).

Without data forwarding:

13 clock cycles when the WB and RD stages do not overlap:

$$\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|} \hline
 & \textbf{T1} & \textbf{T2} & \textbf{T3} & \textbf{T4} & \textbf{T5} & \textbf{T6} & \textbf{T7} & \textbf{T8} & \textbf{T9} & \textbf{T10} & \textbf{T11} & \textbf{T12} & \textbf{T13} \\\hline
I_1 & \text{IF} & \text{RD} & \text{EX} & \text{MA} & \text{WB} & & & & & & & & \\\hline
I_2 & & \text{IF} & & & & \text{RD} & \text{EX} & \text{MA} & \text{WB} & & & & \\\hline
I_3 & & & & & & \text{IF} & & & & \text{RD} & \text{EX} & \text{MA} & \text{WB} \\\hline
\end{array}$$

Here, the WB and RD stages operate in non-overlapping mode.

11 clock cycles when the WB and RD stages overlap:

$$\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|} \hline
 & \textbf{T1} & \textbf{T2} & \textbf{T3} & \textbf{T4} & \textbf{T5} & \textbf{T6} & \textbf{T7} & \textbf{T8} & \textbf{T9} & \textbf{T10} & \textbf{T11} \\\hline
I_1 & \text{IF} & \text{RD} & \text{EX} & \text{MA} & \text{WB} & & & & & & \\\hline
I_2 & & \text{IF} & & & \text{RD} & \text{EX} & \text{MA} & \text{WB} & & & \\\hline
I_3 & & & & & \text{IF} & & & \text{RD} & \text{EX} & \text{MA} & \text{WB} \\\hline
\end{array}$$
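The two counts above can be checked with a small sketch. It is only a model under stated assumptions: there is no forwarding, instructions pass through RD in order, and a consumer may read a register in the same cycle as the producer's WB only when overlap is allowed. The function name `cycles_without_forwarding` and the `deps` encoding are hypothetical, introduced just for this illustration.

```python
def cycles_without_forwarding(deps, overlap_wb_rd):
    """Return the cycle in which the last instruction finishes WB.

    deps[i] is the index of the earlier instruction whose result
    instruction i reads, or None if it has no dependency.
    Assumption: no operand forwarding; one cycle per stage.
    """
    rd = []  # cycle in which each instruction performs RD
    for i, producer in enumerate(deps):
        earliest = i + 2  # IF in cycle i + 1, RD one cycle later if nothing stalls
        if i > 0:
            # In-order pipeline: RD is busy until the previous instruction leaves it.
            earliest = max(earliest, rd[i - 1] + 1)
        if producer is not None:
            wb = rd[producer] + 3  # RD -> EX -> MA -> WB
            # With overlap, RD may share the producer's WB cycle; otherwise wait one more.
            earliest = max(earliest, wb if overlap_wb_rd else wb + 1)
        rd.append(earliest)
    return rd[-1] + 3


# I1 has no dependency, I2 reads I1's R0, I3 reads I2's R0.
program = [None, 0, 1]
print(cycles_without_forwarding(program, overlap_wb_rd=False))
print(cycles_without_forwarding(program, overlap_wb_rd=True))
```

The first call models the non-overlapping table and the second the overlapping one.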

Split-phase access between WB and RD means:

The WB stage writes its output on the rising edge of the clock, and the RD stage reads that output on the falling edge of the same cycle.

The question states:

for write access, the register read at RD stage is used.

This means that when an operand is written back to memory, the value read from the register in the RD stage is used (no operand forwarding for STORE instructions).

Note

  • As with any question in any subject, unless otherwise stated we always assume the best case. So assume the stages overlap unless the question says otherwise. This applies only to WB/RD.
  1. Why is there a stall for I2 in T3 and T4?
     RD is instruction decode and register read. If we execute RD of I2 in T3, the data from memory has not yet been written to R0, so the proper operands are not available at T3. I2 therefore has to wait until I1 writes the loaded value to R0.
  2. Why do WB of I1 and RD of I2 operate in the same clock cycle?
    If nothing is mentioned in the question, this scenario is assumed by default. After MA, the operand can be written to the register in the first half of the cycle and read in the second half, so RD and WB can overlap.

With data forwarding:

(This should be the case here, since the question rules out operand forwarding only for the register operand of STORE instructions.)

8 clock cycles

  1. Why is there a stall for I2 in T4?
    Data is forwarded from MA of I1 to EX of I2. The MA operation of I1 must complete before the correct data is available for forwarding.
  2. Why is RD of I2 in T3? Will it not fetch incorrect data if it executes before the operand is forwarded from MA of I1?
     Yes. RD of I2 will definitely fetch INCORRECT data at T3. That does not matter, because operand forwarding supplies the correct value before EX uses it.
  3. Why can't RD of I2 be placed in T4?
    We can place RD of I2 in T4 as well, but there is no benefit. Pipelining exists to reduce the execution time of instructions, so there is no reason to add an extra stall. There is also a further problem, discussed in the next point: consider what would happen if we had created a stall at T3.
  4. Why can't RD of I3 be placed at T4?
    This cannot be done. I3 cannot use RD because the previous instruction, I2, must start its next stage (EX) before the current instruction (I3) can use the RD stage. Until then, I2's data is still held in the RD stage buffers.
  5. Can an operand be forwarded from one clock cycle to the same clock cycle?
     No. The producing clock cycle must complete before the data is forwarded, unless the split-phase technique is used.
  6. Can't there be forwarding from the EX stage (T3) of I1 to the EX stage (T4) of I2?
    This is not possible. I1 is a memory read, so its data is available only after the memory access. The data therefore cannot be forwarded from EX of I1.
  7. In some cases data is forwarded from MA and in others from EX. Why?
    Data is forwarded as soon as it is ready, and that depends solely on the type of instruction.
  8. When should split-phase be used?
    We can use split-phase access if the data is readily available, as between WB/RD, and also when operand forwarding happens from the EX to the ID stage, but not from EX to EX. We cannot use split-phase access between EX and EX because the instruction may not finish executing in the first phase. (This is not mentioned in any standard resource but was stated by Arjun Suresh, considering practical implementation and how previous years' GATE questions have been framed.)

[Usually the question states that there is operand forwarding from stage A to stage B, e.g. https://gateoverflow.in/8218/gate2015-2_44 ]

Split-phase access can be used even when there is no operand forwarding, because the two are unrelated.
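The forwarding rules discussed in the points above can be sketched as a small scheduler. This is a minimal model, not a definitive implementation: it assumes a load's value can be forwarded once its MA cycle has finished, an ALU result once its EX cycle has finished, that the consumer needs the value at the start of EX, and that an instruction can enter a stage only in or after the cycle in which the previous instruction enters the next stage. The names `STAGES` and `schedule_with_forwarding` are hypothetical.

```python
STAGES = ["IF", "RD", "EX", "MA", "WB"]


def schedule_with_forwarding(program):
    """Return, per instruction, the cycle in which it enters each stage.

    program is a list of (kind, producer) pairs, where kind is "load" or
    "alu" and producer is the index of the earlier instruction that
    supplies an operand (or None).
    """
    table = []
    for i, (kind, producer) in enumerate(program):
        row = {}
        for s, stage in enumerate(STAGES):
            if s == 0:
                start = 1 if i == 0 else table[i - 1]["IF"] + 1
            else:
                start = row[STAGES[s - 1]] + 1
            if i > 0:
                if s + 1 < len(STAGES):
                    # The stage frees up when the previous instruction moves on.
                    start = max(start, table[i - 1][STAGES[s + 1]])
                else:
                    # WB is the last stage: wait until the previous WB is done.
                    start = max(start, table[i - 1]["WB"] + 1)
            if stage == "EX" and producer is not None:
                # Assumption: loads forward after MA, ALU instructions after EX.
                ready = "MA" if program[producer][0] == "load" else "EX"
                start = max(start, table[producer][ready] + 1)
            row[stage] = start
        table.append(row)
    return table


# I1: load R0; I2: ALU using I1's R0; I3: ALU using I2's R0.
for row in schedule_with_forwarding([("load", None), ("alu", 0), ("alu", 1)]):
    print(row)
```

Under these assumptions I2 enters RD in T3 and EX in T5, I3 enters RD in T5, and the last WB falls in T8.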

32 votes

"For write access, the register read at RD stage is used" means that a STORE instruction cannot receive a forwarded operand; it uses only the value read in the RD stage. So we can assume data forwarding is possible for all other instructions.

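$$\begin{array}{|c|c|c|c|c|c|c|c|c|} \hline
 & \textbf{T1} & \textbf{T2} & \textbf{T3} & \textbf{T4} & \textbf{T5} & \textbf{T6} & \textbf{T7} & \textbf{T8} \\\hline
I_1 & \text{IF} & \text{RD} & \text{EX} & \text{MA} & \text{WB} & & & \\\hline
I_2 & & \text{IF} & \text{RD} & & \text{EX} & \text{MA} & \text{WB} & \\\hline
I_3 & & & \text{IF} & & \text{RD} & \text{EX} & \text{MA} & \text{WB} \\\hline
\end{array}$$
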
MA -> EX forwarding done between I1 and I2
EX -> EX forwarding done between I2 and I3

Hence, answer will be 8.
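As a cross-check, the same total follows from the usual pipeline count. The symbols here are introduced only for this illustration: $n$ is the number of stages, $k$ the number of instructions, and $s$ the total number of stall cycles. The only stall is the single bubble between the load $I_1$ and its consumer $I_2$, since the value is forwarded from MA to EX.

$$\text{cycles} = n + (k - 1) + s = 5 + (3 - 1) + 1 = 8$$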

http://www.cs.iastate.edu/~prabhu/Tutorial/PIPELINE/forward.html

25 votes

Answer = option A

$8$ cycles are required with operand forwarding.

The question does not state that the RD and WB stages can overlap.


1 votes
The explanations given elsewhere for this question are wrong. The correct explanation is:

OPERAND FORWARDING:

(RAW):

1. In the case of LOAD instructions, data forwarding cannot remove the stall entirely: the operand becomes available only in the MA stage of the instruction. Here this applies to I1 (MA) and I2 (RD).

2. In the case of ALU-type instructions, the operand is available in the EX stage of the instruction. Here this applies to I2 (EX) and I3 (RD).

This is the correct way to approach such questions.

Statement 1 above is the limitation of operand forwarding: it cannot resolve every such dependency without a stall.
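The difference between the two cases can be expressed as a stall count for a single dependency. This is a sketch under assumptions: stages are numbered IF=1 to WB=5, the value is usable only in the cycle after the producing stage, the consumer is taken to need it at the start of EX when forwarding is available, and the dependency is considered in isolation. The names `STAGE_INDEX` and `stalls` are hypothetical.

```python
STAGE_INDEX = {"IF": 1, "RD": 2, "EX": 3, "MA": 4, "WB": 5}


def stalls(produced_in, consumed_in, distance=1):
    """Bubbles needed when a value produced in stage produced_in is
    consumed in stage consumed_in by an instruction issued `distance`
    slots later (assumption: no same-cycle forwarding)."""
    gap = STAGE_INDEX[produced_in] - STAGE_INDEX[consumed_in] + 1 - distance
    return max(0, gap)


print(stalls("MA", "EX"))  # load followed by a dependent ALU instruction
print(stalls("EX", "EX"))  # ALU instruction followed by a dependent ALU instruction
print(stalls("WB", "RD"))  # no forwarding, WB and RD not overlapping
```

With one stall for the load and none for the ALU instruction, three instructions on five stages take 5 + 2 + 1 = 8 cycles.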