10,018 views

35 votes

Consider a pipeline processor with $4$ stages $S1$ to $S4$. We want to execute the following loop:

for (i = 1; i <= 1000; i++) {I1, I2, I3, I4}

where the time taken (in ns) by instructions $I1$ to $I4$ for stages $S1$ to $S4$ are given below:

$$\begin{array}{|c|c|c|c|c|} \hline
 & \textbf{S1} & \textbf{S2} & \textbf{S3} & \textbf{S4} \\ \hline
\textbf{I1} & 1 & 2 & 1 & 2 \\ \hline
\textbf{I2} & 2 & 1 & 2 & 1 \\ \hline
\textbf{I3} & 1 & 1 & 2 & 1 \\ \hline
\textbf{I4} & 2 & 1 & 2 & 1 \\ \hline
\end{array}$$

The output of $I1$ for $i = 2$ will be available after

- $\text{11 ns}$
- $\text{12 ns}$
- $\text{13 ns}$
- $\text{28 ns}$

With respect to the question and the selected answer given here, why aren't we treating all four pipeline stages as having the same duration, i.e., the maximum stage time of 2 ns?

edited
Dec 4, 2021
by ankit3009

@the_bob

If your doubt has been cleared, please let me know the reason too. What I believe is that if terms such as synchronous transfer or constant clock rate are used, then we need to use max(d1, d2, d3), where d1, d2, and d3 are the delays of stages S1, S2, and S3, because every stage must then transfer at the same constant rate. If those terms aren't used, we take the time of each stage for each instruction and sum them using the required formula, as we have done in this problem. If I am wrong, please correct me.

49 votes

Best answer

$$\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|} \hline
 & \textbf{t1} & \textbf{t2} & \textbf{t3} & \textbf{t4} & \textbf{t5} & \textbf{t6} & \textbf{t7} & \textbf{t8} & \textbf{t9} & \textbf{t10} & \textbf{t11} & \textbf{t12} & \textbf{t13} \\ \hline
\textbf{I1} & s_1 & s_2 & s_2 & s_3 & s_4 & s_4 & & & & & & & \\ \hline
\textbf{I2} & & s_1 & s_1 & s_2 & s_3 & s_3 & s_4 & & & & & & \\ \hline
\textbf{I3} & & & & s_1 & s_2 & \text{--} & s_3 & s_3 & s_4 & & & & \\ \hline
\textbf{I4} & & & & & s_1 & s_1 & s_2 & \text{--} & s_3 & s_3 & s_4 & & \\ \hline
\textbf{I5} & & & & & & & s_1 & \text{--} & s_2 & s_2 & s_3 & s_4 & s_4 \\ \hline
\end{array}$$

So, total time would be $13\;ns$

Option (c).
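The timing table can be cross-checked with a short simulation. This is only a sketch under one assumption: there is no buffer between stages, so an instruction keeps its stage until the next stage is free. The names `TIMES` and `schedule_blocking` are made up for illustration; times are in ns and count from 0, so a start time of 8 means the stage occupies cycle 9 onward.

```python
# Stage times in ns, one row per instruction, columns S1..S4 (from the question).
TIMES = {
    "I1": [1, 2, 1, 2],
    "I2": [2, 1, 2, 1],
    "I3": [1, 1, 2, 1],
    "I4": [2, 1, 2, 1],
}


def schedule_blocking(sequence):
    """Return (starts, finishes): per-instruction lists of per-stage times.

    Assumption: no buffer between stages, so an instruction holds its
    stage until it can move into the next one.
    """
    starts, finishes = [], []
    for k, name in enumerate(sequence):
        durations = TIMES[name]
        last = len(durations) - 1
        start, finish = [], []
        for j, d in enumerate(durations):
            # The instruction must have finished its own previous stage.
            ready = finish[j - 1] if j > 0 else 0
            # The previous instruction must have left this stage.
            if k == 0:
                stage_free = 0
            elif j < last:
                stage_free = starts[k - 1][j + 1]
            else:
                stage_free = finishes[k - 1][j]
            s = max(ready, stage_free)
            start.append(s)
            finish.append(s + d)
        starts.append(start)
        finishes.append(finish)
    return starts, finishes


# I1..I4 of iteration 1, followed by I1 of iteration 2.
starts, finishes = schedule_blocking(["I1", "I2", "I3", "I4", "I1"])
print(finishes[-1][-1])  # 13
```

Under this assumption the fifth instruction (I1 for i = 2) finishes S4 at 13 ns, which agrees with the table.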

edited
May 29, 2016
by Shreya Roy

The answer is 13 ns, but for the second execution of I1 the S2 phase cannot start in the 8th cycle, because S3 for I4 has not started yet and I4 therefore still occupies S2. So S2 starts in the 9th cycle, and both the 9th and 10th cycles are used by S2 of I1 (second iteration); the total still remains 13 clock cycles.

|    | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
|----|---|---|---|---|---|---|---|---|---|----|----|----|----|
| I1 | s1 | s2 | s2 | s3 | s4 | s4 |  |  |  |  |  |  |  |
| I2 |  | s1 | s1 | s2 | s3 | s3 | s4 |  |  |  |  |  |  |
| I3 |  |  |  | s1 | s2 |  | s3 | s3 | s4 |  |  |  |  |
| I4 |  |  |  |  | s1 | s1 | s2 |  | s3 | s3 | s4 |  |  |
| I1 |  |  |  |  |  |  | s1 |  | s2 | s2 | s3 | s4 | s4 |

Can someone tell me why the loop unrolling concept is not used here, as it is in https://gateoverflow.in/1314/gate2009-28? Can someone explain when to use loop unrolling and when not to? The answer is the same either way, but with loop unrolling I1 of the second iteration can do its S2 work in the 8th and 9th clock cycles rather than the 9th and 10th. Please help.

@Arjun sir, in a pipelined architecture, don't we take the slowest stage time as the time for each stage?

Then why don't we assume in this question that, since the maximum time taken by any instruction in any stage is 2 units, every stage is synchronized with a clock so that each stage takes 2 units of time? I am quite confused about this subject. Can you please suggest a resource to study this from?
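For comparison, here is what the uniform-clock reading would give. This is a worked sketch under the commenter's premise (a fixed clock of 2 ns for every stage), which the question itself does not state; the symbols $k$, $n$ and $t_{clk}$ are introduced here only for illustration. With $k = 4$ stages, $n = 5$ instructions (I1 of $i = 2$ is the fifth instruction issued) and $t_{clk} = 2$ ns:

$$T_{uniform} = (k + n - 1) \times t_{clk} = (4 + 5 - 1) \times 2\ \text{ns} = 16\ \text{ns}$$

16 ns is not among the given options, whereas using the per-instruction stage times gives 13 ns, which is option (c).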

@Arjun sir, you mentioned:

"Because it doesn't make a difference here. As long as all stages are taking the same cycles, we don't need a queue."

But here, too, the stage delays differ. I am getting the same answer both with and without loop unrolling, but it will not always be the same, as in https://gateoverflow.in/1314/gate2009-28.

edited
Dec 22, 2018
by jatin khachane 1

@Arjun sir, @Shaik Masthan , @sushmita , @Ayush Upadhyaya

When should we treat the stage buffers as queues and when not? The answers can differ depending on this choice; here they are the same, but not for https://gateoverflow.in/1512/gate1999-13.

**By default**, we should assume a single buffer and not a queue, right? That is, unless the previous instruction leaves a stage and moves into the next one, the current instruction can't enter that stage.
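To see what changes when the buffers are treated as queues, here is a minimal sketch of that alternative. It assumes an unbounded queue in front of every stage, so a stage becomes free as soon as the previous instruction finishes it. The names `TIMES` and `schedule_queued` are hypothetical, and times are in ns counted from 0.

```python
# Stage times in ns, one row per instruction, columns S1..S4 (from the question).
TIMES = {
    "I1": [1, 2, 1, 2],
    "I2": [2, 1, 2, 1],
    "I3": [1, 1, 2, 1],
    "I4": [2, 1, 2, 1],
}


def schedule_queued(sequence):
    """Return, per instruction, a list of (start, finish) pairs per stage.

    Assumption: an unbounded queue sits in front of every stage, so a
    finished instruction never blocks the stage it has just used.
    """
    stage_free = [0, 0, 0, 0]  # when each stage last became free
    result = []
    for name in sequence:
        ready = 0  # when this instruction finished its previous stage
        row = []
        for j, d in enumerate(TIMES[name]):
            s = max(ready, stage_free[j])
            ready = s + d
            stage_free[j] = ready
            row.append((s, ready))
        result.append(row)
    return result


# I1..I4 of iteration 1, followed by I1 of iteration 2.
result = schedule_queued(["I1", "I2", "I3", "I4", "I1"])
print(result[-1][1])  # (7, 9): S2 of the second I1 runs in cycles 8 and 9
print(result[-1][3])  # (11, 13): S4 still finishes at 13 ns
```

With queues, S2 of the second I1 runs one cycle earlier (cycles 8 and 9 instead of 9 and 10, as noted in the comments above), but it then waits for S3, so the finish time for this particular problem is 13 ns under either assumption.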

I have the same doubt. Did you get an answer, @Rishabh Gupta 2? The slowest stage time is used for every stage in this question: https://gateoverflow.in/1063/gate2004-69. So why are we not doing the same here? Arjun sir also mentioned in a comment on that answer that if no information about the clock rate is given, we must assume a uniform clock rate. Therefore, the clock period for this question should be taken as 2 ns. Please correct me if I am wrong.
