GATE2015-2-44

Consider the sequence of machine instructions given below:
$$\begin{array}{ll} \text{MUL} & \text{R5, R0, R1} \\ \text{DIV} & \text{R6, R2, R3} \\ \text{ADD} & \text{R7, R5, R6} \\ \text{SUB} & \text{R8, R7, R4} \\ \end{array}$$
In the above sequence, $R0$ to $R8$ are general purpose registers. In the instructions shown, the first register stores the result of the operation performed on the second and the third registers. This sequence of instructions is to be executed in a pipelined instruction processor with the following $4$ stages: $(1)$ Instruction Fetch and Decode $(IF)$, $(2)$ Operand Fetch $(OF)$, $(3)$ Perform Operation $(PO)$ and $(4)$ Write back the result $(WB)$. The $IF$, $OF$ and $WB$ stages take $1$ clock cycle each for any instruction. The $PO$ stage takes $1$ clock cycle for ADD and SUB instructions, $3$ clock cycles for the MUL instruction and $5$ clock cycles for the DIV instruction. The pipelined processor uses operand forwarding from the PO stage to the OF stage. The number of clock cycles taken for the execution of the above sequence of instructions is _________.

0
Does the fact that forwarding is done from PO to OF stage (not PO to PO) incur a pipeline stall here?
–1
Because of the register dependencies here, we reduce the number of stall cycles by using operand forwarding from PO to OF. It will give the optimal result.
0
Note: two adjacent functional units can perform their operations in the same clock cycle.
0
@chhotu I don't think split phase is possible between PO and OF.

Operand forwarding is used only from:

1. the MA output register to the input register of PO

2. the PO output register to the PO input register

3. Split phase applies only between the WB and OF phases; we should not assume split phase anywhere else (this one we can assume by default).

In GATE 2015, what they meant by operand forwarding from the EX phase to the RD phase is the second case above. The question speaks only about forwarding, not about split phase, and the two are different things. Forwarding just bypasses results to the respective input registers; it has nothing to do with rising and falling clock edges. Even if we consider it this way, we get the same answer, 13.
0
@Venkat Sai  -- I am totally confused over this SPLIT PHASE thing. Please, can you provide any reference for reading about the SPLIT PHASE. It's not given in Hamacher or any of the NPTEL lectures (IISC, IIT KGP, IITD).
1

$$\small \begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|} \hline &\bf{t_1}&\bf{t_2}&\bf{t_3}&\bf{t_4}&\bf{t_5}&\bf{t_6}&\bf{t_7}&\bf{t_8}&\bf{t_9}&\bf{t_{10}}&\bf{t_{11}}&\bf{t_{12}}&\bf{t_{13}}&\bf{t_{14}}&\bf{t_{15}}\\ \hline \textbf{I1}&\text{IF}&\text{OF}&\text{PO}&\text{PO}&\text{PO}&\text{WB}\\ \textbf{I2}&&\text{IF}&\text{OF}&\color{red}{-}&\color{red}{-}&\text{PO}&\text{PO}&\text{PO}&\text{PO}&\color{green}{\boxed{\text{PO}}}&\text{WB}\\ \textbf{I3}&&&\text{IF}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{green} {\boxed{\text{OF}}}&\color{blue}{\boxed{\text{PO}}}&\text{WB}\\ \textbf{I4}&&&&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\text{IF}&\color{red}{-} &\color{blue}{\boxed{\text{OF}}} &\text{PO}&\text{WB}\\ \hline\end{array}$$

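The question states that operand forwarding takes place from the PO stage to the OF stage, not to the PO stage. Read literally, a dependent instruction can do its OF only in the cycle after the producer's last PO cycle, which gives $15$ clock cycles.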

But since operand forwarding is from PO to OF, we can have the PO stage produce its output on the rising edge of the clock and the OF stage fetch it on the falling edge. The final PO cycle of the producer and the OF of the dependent instruction then fit in one clock cycle, making the total number of cycles $13$. And $13$ is the answer given in the GATE key.

$$\small \begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|} \hline &\bf{t_1}&\bf{t_2}&\bf{t_3}&\bf{t_4}&\bf{t_5}&\bf{t_6}&\bf{t_7}&\bf{t_8}&\bf{t_9}&\bf{t_{10}}&\bf{t_{11}}&\bf{t_{12}}&\bf{t_{13}}\\ \hline \textbf{I1}&\text{IF}&\text{OF}&\text{PO}&\text{PO}&\text{PO}&\text{WB}\\ \textbf{I2}&&\text{IF}&\text{OF}&\color{red}{-}&\color{red}{-}&\text{PO}&\text{PO}&\text{PO}&\text{PO}&\color{green}{\boxed{\text{PO}}}&\text{WB}\\ \textbf{I3}&&&\text{IF}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{green} {\boxed{\text{OF}}}&\color{blue}{\boxed{\text{PO}}}&\text{WB}\\ \textbf{I4}&&&&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\color{red}{-}&\text{IF} &\color{blue}{\boxed{\text{OF}}} &\text{PO}&\text{WB}\\ \hline\end{array}$$

Reference: http://www.cs.iastate.edu/~prabhu/Tutorial/PIPELINE/forward.html
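The two timing tables above can be reproduced with a small scheduling sketch. This is only an illustration under stated assumptions, not an official model of the GATE pipeline: issue is in order, each stage holds one instruction at a time, an instruction keeps its stage until the next stage is free, and the names `schedule`, `PO_LATENCY` and `DEPS` are made up for this example. The `split_phase` flag decides whether a dependent OF may share a cycle with the producer's last PO cycle.

```python
# PO latencies for MUL, DIV, ADD, SUB, as given in the question.
PO_LATENCY = [3, 5, 1, 1]
# DEPS[i] lists the earlier instructions whose results instruction i reads
# (I3 reads R5 and R6 from I1 and I2; I4 reads R7 from I3).
DEPS = [[], [], [0, 1], [2]]


def schedule(po_latency, deps, split_phase):
    """Return (rows, total_cycles); rows hold the cycle each stage is entered."""
    rows = []
    for i, latency in enumerate(po_latency):
        prev = rows[i - 1] if i else None

        # IF becomes free in the cycle the previous instruction enters OF.
        fetch = prev["OF"] if prev else 1

        # Earliest cycle in which every forwarded operand can be picked up in OF.
        ready = 0
        for p in deps[i]:
            last_po = rows[p]["PO_end"]
            ready = max(ready, last_po if split_phase else last_po + 1)

        # OF also needs the previous instruction to have moved on to PO.
        operand_fetch = max(fetch + 1, ready, prev["PO"] if prev else 0)

        # PO needs the previous instruction to have finished its PO cycles.
        po_start = max(operand_fetch + 1, prev["PO_end"] + 1 if prev else 0)
        po_end = po_start + latency - 1

        # WB takes one cycle and stays in program order.
        write_back = max(po_end + 1, prev["WB"] + 1 if prev else 0)

        rows.append({"IF": fetch, "OF": operand_fetch, "PO": po_start,
                     "PO_end": po_end, "WB": write_back})
    return rows, rows[-1]["WB"]


if __name__ == "__main__":
    for flag in (False, True):
        rows, total = schedule(PO_LATENCY, DEPS, split_phase=flag)
        print("split phase" if flag else "no split phase", total)
        for n, row in enumerate(rows, start=1):
            print(f"  I{n}: {row}")
```

With `split_phase=False` the sketch gives the 15-cycle table, and with `split_phase=True` it gives the 13-cycle table that matches the key.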

0
Sir, please explain instruction 2: why are there two stalls there?
0
Sir, please explain why there are 2 stalls in I2 even though I1 and I2 are independent.
0

Can we always use split phase between the EX and ID phases, or only when operand forwarding from EX to ID is explicitly mentioned?

0

Yes, we can use split phase when the data is ready.

That is, if MUL takes 5 cycles in the EX stage, the data is ready in the 5th EX cycle, so we can use split phase.

____________________________________________________________________

But I think operand forwarding is used in every case; otherwise, why would operand forwarding be used in this question (though the question does not state it)? https://gateoverflow.in/1391/gate2005-68

1
Another point I am getting:

There is no relationship between operand forwarding and split phase.

Operand forwarding can be done from EX to EX or from EX to MEM.

(I think when the MEM stage is not ready, operand forwarding is from EX to EX directly; otherwise it is from MEM to EX or MEM to MEM.)

Whereas with split phase, an EX to EX split is never possible.
0
@Sretha The default operand forwarding paths we use are EX to EX, EX to MEM, and MEM to MEM. But in this question it is directly written that operand forwarding happens from the PO stage to the OF stage, so we used split phase here. But I don't know whether we should always use the OF and PO split or not.
0

@sachin Everything is fine, but you have written this line the opposite way round: "(Note that dependent instruction is not reading from buffer if it reads from buffer then NO need of operand forwarding, it is done smoothly and automatically)"

0
sachin sir thanks .....
0
nice explanation @Arjun sir now all the doubts removed from my chip
0

why IF of I4 is in t10 and not t4?

0
@user Because I3 is still present in the IF stage till the 9th cycle.
0

@talha hashim But the IF stage completes after 1 cycle, doesn't it? After that the instruction is stalled, but does that mean it remains in the IF stage even though IF is already complete? Why can't we use the IF stage for the next instruction? Will the data be overwritten and a wrong ID happen if we perform the next IF?

0
@user The OF stage is not free because I2 is already in it, so I3 stays in IF.
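The stage-occupancy argument in these replies can be checked mechanically. The sketch below is only an illustration: it copies the stage-entry cycles from the 13-cycle table in the selected answer, assumes (as the replies do) that an instruction holds a stage until it enters the next one, and reports every cycle in which two instructions would sit in the same stage. The names `ENTRY`, `STAGES` and `find_conflicts` are made up for this example.

```python
STAGES = ["IF", "OF", "PO", "WB"]

# Cycle in which each instruction enters each stage (13-cycle table).
ENTRY = {
    "I1": {"IF": 1, "OF": 2, "PO": 3, "WB": 6},
    "I2": {"IF": 2, "OF": 3, "PO": 6, "WB": 11},
    "I3": {"IF": 3, "OF": 10, "PO": 11, "WB": 12},
    "I4": {"IF": 10, "OF": 11, "PO": 12, "WB": 13},
}


def find_conflicts(entry):
    """Return (stage, cycle, holder, intruder) for every double-booked stage."""
    owner = {}       # (stage, cycle) -> instruction already holding the stage
    conflicts = []
    for name, cycles in entry.items():
        for i, stage in enumerate(STAGES):
            start = cycles[stage]
            # A stage is held until the next stage is entered; WB lasts one cycle.
            end = cycles[STAGES[i + 1]] if i + 1 < len(STAGES) else start + 1
            for t in range(start, end):
                if (stage, t) in owner:
                    conflicts.append((stage, t, owner[(stage, t)], name))
                else:
                    owner[(stage, t)] = name
    return conflicts


if __name__ == "__main__":
    print(find_conflicts(ENTRY))            # the table as given

    # Hypothetical variant: fetch I4 at t4 while I3 is still held in IF.
    early = {name: dict(cycles) for name, cycles in ENTRY.items()}
    early["I4"]["IF"] = 4
    print(find_conflicts(early))
```

Under that holding assumption the table as given has no double-booked stage, while fetching I4 at t4 collides with I3 in IF during t4 to t9.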
0

As this is a recent question, it may come up in the exam.

If the exam asks about operand forwarding from the Execute stage to the Instruction Decode stage, should the second solution of the selected answer be followed?

0

@Arjun Sir

Could you please tell me the standard book where I can read about split phase, operand forwarding from PO to PO, PO to OF ?

0

@Arjun Sir, here PO to OF operand forwarding is given, and in this case it can be done in split phase.

1) But what if operand forwarding is from MEM to EX or MEM to MEM? Can't we do that in split phase as well?

See the last example in the reference you gave in the selected answer.

0

@Arjun sir

Why can't OF for I3 be performed in cycle 6?

If it can't be performed, then why did we follow this concept in the following question?

https://gateoverflow.in/2207/gate2010-33?show=326629#c326629

0

@Lovejeet Singh

The execution takes place in the PO stage, and the question clearly mentions that operand forwarding is done only from PO to OF. So until the operand has been computed in the PO stage for I2, it can't be forwarded to the OF of I3. Hence the OF of I3 will be at t10 and not at t6.

I1 and I2 are independent instructions i.e. they have no registers in common, hence OF of I2 can appear at t4.
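As a worked check of this reasoning, take the PO start cycles from the 13-cycle table in the selected answer (PO of I1 starts at $t_3$, PO of I2 at $t_6$) and write $e_k$ for the last PO cycle of instruction $k$. The symbol $e_k$ is introduced here only for illustration, and split phase between PO and OF is assumed:

$$\begin{array}{rl} e_1 &= 3 + 3 - 1 = 5 \\ e_2 &= 6 + 5 - 1 = 10 \\ \text{OF}(I3) &\ge \max(e_1, e_2) = 10 \end{array}$$

At $t_6$ only R5 (from MUL) is available; R6 (from DIV) is not ready until $t_{10}$, which is what pins the OF of I3 there. Without split phase the bound becomes $e_2 + 1 = 11$, which is the 15-cycle reading.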

0

What should we assume: with split phase or without split phase?

0
split phase is assumed by default.

answer = $15$ cycles

2
I guess it was a typo in the question to say PO to OF. The answer in the key is 13 clock cycles, and there is no logical reason to do forwarding to the OF stage, right?

Also, regarding PO operating in the first half of the cycle: I'm still not sure if this is possible. Most probably it is not.
43

1) Operand Forwarding from PO to OF means:

2) Split-phase access between PO & OF means:

They tell us to apply the first one, but the answer in the answer key is obtained by applying the second one.

2
What's the consensus here? What should we finally do? We can't make so many assumptions in the exam.
0
Go with the split phase if nothing is mentioned explicitly.
2
Arjun sir, should we go with split phase by default? And in a numerical-answer-type question, how do we know which method to follow, since there won't be any options either?
1
Chaotic question.
0
Sir, let's say the OF stage takes 2 cycles. Would OF for I3 then still start in the 10th cycle, adding one more cycle, or can OF occupy the 9th and 10th cycles so that PO is in the 11th cycle?

Or would PO be in the 12th cycle then?
t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13
I1 IF OF PO PO PO WB
I2   IF OF - - PO PO PO PO PO WB
I3     IF - - - - - - OF PO WB
I4       - IF - - - - - OF PO WB

0
Is this a correct answer? If not, why?

t1 t2 t3 t4 t5 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15
I1 IF OF PO PO PO WB
I2   IF OF - - PO PO PO PO PO WB
I3     IF OF - - - - - -   PO WB
I4       IF OF - - - - - - - - PO WB
0
But they have asked us to use operand forwarding from PO to OF.
