Consider a $6$-stage instruction pipeline, where all stages are perfectly balanced. Assume that there is no cycle-time overhead of pipelining. When an application is executing on this $6$-stage pipeline, the speedup achieved with respect to non-pipelined execution if $25$% of the instructions incur $2$ pipeline stall cycles is ____________

http://web.cs.iastate.edu/~prabhu/Tutorial/PIPELINE/pipe_title.html

But this site says:

If the stages are perfectly balanced, then the time per instruction on the pipelined machine is equal to $\dfrac{\text{Time per instruction on nonpipelined machine}}{\text{Number of pipe stages}}$.
Under these conditions, the speedup from pipelining equals the number of pipe stages.

Check this appendix taken from Hennessy and Patterson CO – https://www.cs.ucf.edu/~dcm/Teaching/CDA5106-Fall2015/Appendices/appendix_c.pdf.

Page C12 describes pipeline performance when stalls occur, and it also covers the perfectly balanced pipeline.

Time without pipeline $=6 \text{ stages}=6 \text{ cycles}$

Time with pipeline $=1+\text{stall frequency}\times \text{stall cycles}$

$=1+.25\times 2$
$=1.5$

Speed up $=\dfrac{6}{1.5}=4$
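
The arithmetic above can be checked with a short Python sketch. The function name `pipeline_speedup` and its parameters are illustrative, not from the question; it assumes an ideal pipelined CPI of 1 and the same clock period on both machines.

```python
def pipeline_speedup(stages, stall_fraction, stall_cycles):
    """Speedup of a balanced pipeline over non-pipelined execution.

    Assumes (as in the question) no cycle-time overhead, so the
    non-pipelined machine needs `stages` cycles per instruction and the
    pipelined machine needs 1 cycle plus the average stall cycles.
    """
    nonpipelined_cpi = stages
    pipelined_cpi = 1 + stall_fraction * stall_cycles
    return nonpipelined_cpi / pipelined_cpi


print(pipeline_speedup(stages=6, stall_fraction=0.25, stall_cycles=2))  # 4.0
```

Setting `stall_fraction=0` gives back the ideal speedup of 6, i.e. the number of stages.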

@neelesh_007

If there are n instructions and x of them stall for s cycles each, then the CPI for this otherwise ideal pipeline will be

1*(n-x)/n+(1+s)*x/n

If we simplify this:

1+x/n*s

1+stall freq*no. of stall cycles
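
Written out step by step, with the same symbols $n$, $x$ and $s$ as above, the simplification is:

$$\text{CPI} = \frac{(n-x)\cdot 1 + x\,(1+s)}{n} = \frac{n - x + x + xs}{n} = 1 + \frac{x}{n}\,s$$

With $x/n = 0.25$ and $s = 2$ this gives $1 + 0.25 \times 2 = 1.5$.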
Here, for the non-pipelined case they consider a single instruction, but for the pipelined case they consider many instructions. How can the two be compared?
Because in the non-pipelined case every instruction takes 6 cycles, so the average CPI is also 6 cycles.

So, the average will be 6 for non-pipelined and 1.5 for pipelined execution.

So, speed up = 6/1.5 = 4
Speed Up= 4

Speed up = Time without Pipeline /  Time with Pipeline

Time without pipeline = 6 clock cycle

25% of the instructions incur 2 pipeline stall cycles, so for those instructions CPI = 1 + 2 = 3

Time with pipeline = 0.25*( 3 ) + 0.75*1 = 1.5 clock cycles

speed up = 6 clock cycles / 1.5 clock cycles = 4

very good explanation

CPI = 1 + average stall cycles per instruction

### Correct Explanation But With One Flaw

This is clearly one of the best explanations, but I think it has one flaw: the ordering of the 25% stalling instructions also matters.

If all of the 25% stalling instructions come at the end, they will stabilize and their average CPI will also be 1, so the total time for n instructions is effectively 0.75n*1 + 2 (the transition from normal to stalled instructions) + 0.25n*1 = n+2

For non-pipelined execution = n*6

So the speedup in that case = 6*n/(n+2), which for very large n is effectively ~6

But if these 25% instructions are not together, and there is one such instruction after every 3 normal instructions, then the CPI for these 25% instructions will be 3.

In that case we can apply the formula above...

Please correct me if I am wrong somewhere.
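
One way to test the ordering argument is to count cycles directly. The sketch below assumes a simple model that is not stated in the question: every stalling instruction inserts its own 2 bubble cycles wherever it appears, and filling the pipeline costs `stages - 1` cycles. The names `total_cycles`, `grouped` and `interleaved` are made up for illustration.

```python
def total_cycles(stall_cycles_per_instruction, stages=6):
    """Cycles needed to finish a sequence on a `stages`-deep pipeline.

    Assumed model: (stages - 1) cycles to fill the pipeline, then each
    instruction completes 1 cycle after the previous one, plus its own
    stall cycles.
    """
    fill = stages - 1
    return fill + sum(1 + s for s in stall_cycles_per_instruction)


n = 1000
grouped = [0] * 750 + [2] * 250      # all stalling instructions at the end
interleaved = [0, 0, 0, 2] * 250     # one stalling instruction after every 3 normal ones

for name, seq in (("grouped", grouped), ("interleaved", interleaved)):
    cycles = total_cycles(seq)
    print(name, cycles, round(6 * n / cycles, 3))
```

Under this model both orderings take 1505 cycles, so the speedup is just under 4 and tends to 4 as n grows. The n+2 figure would hold only if the stall were paid once per group rather than once per stalling instruction, which is a different reading of the question.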

It was a numerical answer type question, so the answer must be 4.

With 6 stages, non-pipelined execution takes 6 cycles per instruction.

With pipelining, 25% of the instructions incur 2 stall cycles.

So pipeline time = (1+(25/100)*2) = 1.5

Speed up = Non pipeline time/Pipeline time = 6/1.5 = 4
The question has been answered before. However, when I looked at it, I couldn't get it. So I am providing an alternate explanation here.

By the basic performance equation, we know that $$\text{Execution Time} = \frac{N \times S}{ R}$$ where N is the number of instructions, S is the average clock cycles taken per instruction and R is the number of clock cycles per second.

Another useful metric is throughput, which is the number of instructions executed per second. It is defined as:

$$\text{Throughput} = \frac{R}{S}$$

In a non-pipelined processor, the number of cycles per instruction is equal to the number of stages, i.e. $S = \text{No. of stages}$

In a pipelined processor, the value of $S$ is $1$ because an instruction can enter the pipeline in every cycle.

However, this is the case for an ideal pipeline. For a pipeline with stalls, the general formula is:

$$S = 1 + \delta_{stall} + \delta_{branch} + \delta_{cache}$$ where each $\delta$ is the average penalty per instruction, in cycles, from that source.

Here, $\delta_{stall}$ is given as $0.25 \times 2$, hence the value of $S$ is $1+ 0.5 = 1.5$.

So according to the question, the throughput of the non-pipelined processor is $T_{np} = R/6$ and that of the pipelined processor is $T_{p} = R/1.5$.

From this, we can see that the speedup is $T_{p}/T_{np} = 6/1.5 = 4$ times.
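
As a quick check of the basic performance equation, the sketch below plugs in hypothetical values for $N$ and $R$ (neither is given in the question) and shows that they cancel in the ratio. The function name `execution_time` is illustrative.

```python
from fractions import Fraction


def execution_time(n_instructions, cpi, clock_rate_hz):
    """Basic performance equation: time = N * S / R (exact arithmetic)."""
    return Fraction(n_instructions) * Fraction(cpi) / clock_rate_hz


N = 1_000_000        # hypothetical instruction count
R = 2_000_000_000    # hypothetical clock rate (2 GHz)

t_nonpipelined = execution_time(N, 6, R)             # S = number of stages
t_pipelined = execution_time(N, Fraction(3, 2), R)   # S = 1 + 0.25 * 2

print(t_nonpipelined / t_pipelined)  # 4
```

Any other positive values of `N` and `R` give the same ratio, since only the two values of $S$ differ between the machines.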