Dark Mode

24,399 views

49 votes

Consider a $4$ stage pipeline processor. The number of cycles needed by the four instructions $I1, I2, I3, I4$ in stages $S1, S2, S3, S4$ is shown below:

$$\begin{array}{|c|c|c|c|c|} \hline \textbf{} & \textbf {S1} &\textbf {S2} & \textbf {S3} & \textbf{S4 } \\\hline \textbf{I1}& 2 & 1 & 1 & 1 \\\hline \textbf{I2} & 1 & 3 & 2 & 2\\\hline \textbf{I3}& 2 & 1 & 1 & 3 \\\hline \textbf{I4} & 1 & 2 & 2 & 2 \\\hline \end{array}$$

What is the number of cycles needed to execute the following loop?

For (i=1 to 2) {I1; I2; I3; I4;}

- $16$
- $23$
- $28$
- $30$

For people wondering why the debate between 23 cycles and 25 cycles, here is some visual help:

The stages in red are the ones leading to a different answer. Reason:

Take for example the 1st red in I4 at cycle 7. The **‘****1’** could’ve been scheduled in cycle 6 itself, under the assumption as given in the selected answer :

“buffers between the pipeline stages can store multiple results in the form of a queue”

But, some people are pointing out that it shouldn’t work like that, because I3 is already being stalled in stage 1 in cycle 6, and thus I4 can’t be scheduled there.

This is what @bikram points out:

“yes , answer would be 25 if multistage buffer not use and it is nearest to option B , so this option tells we need to use multistage buffer”

Hope this clears everything.

2

62 votes

Best answer

Here bound of the loop are constants, therefore compiler will do the loop unrolling(If compiler won't then prefetcher will do) to increase the instruction level parallelism. And after loop unrolling $23$ cycles are required for execution. Therefore, correct answer would be **(B).**

PS: We assume the buffers between the pipeline stages can store multiple results in the form of a queue.

$$\tiny \begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|c|} \hline &C_1&C_2&C_3&C_4&C_5&C_6&C_7&C_8&C_9&C_{10}&C_{11}&C_{12}&C_{13}&C_{14}&C_{15}&C_{16}&C_{17}&C_{18}&C_{19}&C_{20}

&C_{21}&C_{22}&C_{23}\\\hline

\bf{I_1}&S_1&S_1&S_2&S_3&S_4\\\hline

\bf{I_2}&&&S_1&S_2&S_2&S_2&S_3& S_3&S_4&S_4\\\hline

\bf{I_3}&&&&S_1&S_1&\color{red}{-}&S_2&\color{red}{-}&S_3&\color{red}{-}&S_4& S_4&S_4\\\hline

\bf{I_4}&&&&&&S_1&\color{red}{-}&S_2&S_2&S_3&S_3&\color{red}{-}&\color{red}{-}&S_4&S_4\\\hline

\bf{I_1}&&&&&&&S_1&S_1&\color{red}{-}&S_2&\color{red}{-}&S_3&\color{red}{-}&\color{red}{-}&\color{red}{-}&S_4\\\hline

\bf{I_2}&&&&&&&&&S_1&\color{red}{-}&S_2&S_2&S_2&S_3& S_3&\color{red}{-}&S_4&S_4\\\hline

\bf{I_3}&&&&&&&&&&S_1&S_1&\color{red}{-}&\color{red}{-}&S_2&\color{red}{-}&S_3&\color{red}{-}&\color{red}{-}&S_4& S_4&S_4\\\hline

\bf{I_4} &&&&&&&&&&&&S_1&\color{red}{-}&\color{red}{-}&S_2&S_2&S_3&S_3&\color{red}{-}&\color{red}{-}&\color{red}{-}&S_4&S_4\\\hline

\end{array} $$

16 votes

When using the following timeline we get **23** cycles:

In the 6th cycle, instruction 4 is entering stage 1 while instruction 3 has not started stage 2. This requires additional output buffer for stage 1 and we can assume this is available. (This additional buffer is not required if all stage delays are the same).

Hence , option (b) is true.

Ok, in above diag, in 6th cycle we are doing it introducing i4 while i3 hasn't gone to next stage

In this ques if we take that approach, we will get diff ans i think i4 will be introduced in 7th clock.

but why cant we use that stage if its free?cant we just use intermediate buffers to store prev instruction output. We will require those buffers only at end of that stage for new instruction.

Ok so i hv read discussion regarding this doubt in below ans.. So in which cases we can use that particular stage.. As we are using here

2

7 votes

Concept for solving this question:

Stage S_{i} is allocated to instruction I_{i }iff

1. Eligibility: I_{i} must have completed S_{i-1} ... X time

2. Availability: I_{i-1 }must have released S_{i }... Y time

Time of allocation = Max(X,Y)

In first iteration:

I1 completes after 5th cycle

I2 completes after 10th cycle

I3 completes after 13th cycle

I4 completes after 15th cycle

In second iteration

I1 completes after 16th cycle

I2 completes after 18th cycle

I3 completes after 21th cycle

I4 completes after 23th cycle

Answer : B