The Gateway to Computer Science Excellence
First time here? Checkout the FAQ!
+28 votes
Consider a $6$-stage instruction pipeline, where all stages are perfectly balanced. Assume that there is no cycle-time overhead of pipelining. When an application is executing on this $6$-stage pipeline, the speedup achieved with respect to non-pipelined execution if $25$% of the instructions incur $2$ pipeline stall cycles is ____________
in CO and Architecture by Veteran (98.9k points)
edited by | 5.4k views

Assume that there is no cycle-time overhead of pipelining ? what is meaning for this

There is no cycle time overhead of pipelining means all stage delays are same ...Having different stage delays will cause cycle time overhead ...
What's a perfectly balanced pipeline?

6 Answers

+43 votes
Best answer
Time without pipeline $=6 \text{ stages}=6 \text{ cycles}$

Time with pipeline $=1+\text{stall freqency}\times \text{stall cycle}$

 $=1+.25\times 2$

Speed up $=\dfrac{6}{1.5}=4$
by (429 points)
edited by
Is this approach fine :

Suppose each instruction is taking n time without pipelining.

Time for 100 instructions : 100n

For pipelined So each stage will be n/6.

Time for 100 instructions : n+99*n/6+25*2*n/6  = 155n/6

So speed up is : 100*6/155 ~= 4
yes, but you can avoid adding "n" and use 100 for 99 as for average time wee can assume a very large sequence of instructions- not just 100.

2 stall for 25 instruction, lets take total instruction 100.

6 stage pipeline 100 instruction will complete in pipeline 6+(100-1) & for stall 2 cycle extra for 25 instructions so pipeline time 2+(25-1)

total pipeline time 6+99+2+24=131

Non pipeline = 6*100 =600

Speedup = Non pipeline / Pipeline = 600/131= 4.58 ≈ 4

It should be more accurate 4.58


not understood

Time with pipeline =1+stall freqency*stall cycle

when we take total 4 instructions, 2 stall for 1 instruction ( as per above calculation for 100 instructions!!)

Speedup = Non pipeline / Pipeline = 24/11= 2.18

Can somebody make this clear please?
@arjun is it correct to say those 25% instruction incurring stalls cycles should be contigous?Stalling of i leads to i+1 and so on?.Having 2 stall cycles doesn't neccesarily make CPI 3 for those 25% instruction, as pointed out in other answers.Please correct me if wrong.

Manoja Rajalakshmi A 

Non-pipeline does not suffer from a stall . here 2 is branch penalty.


Actually the question asked by manoja was that, suppose there are 4 instructions.

For non pipeline = 6 stages * 4 instructions = 24

For pipeline (if you draw the pipelining diagram considering 25% i.e 1 instruction containing 2 stall cycles, you get answer as) = 11

Speed up = non pipeline/ pipeline

I.e 24/11= 2.18

I was unable to understand your explanation on this question. So will you please elaborate it again for me?
+36 votes
Speed Up= 4

Speed up = Time without Pipeline /  Time with Pipeline

Time without pipeline = 6 clock cycle

 25% of the instructions incur 2 pipeline stall cycles means CPI = 3

Time with pipeline = 0.25*( 3 ) + 0.75*1 = 1.5 clock cycle

speed up = 6 clock cycle / 1.5 clock cycle   = 4
by Loyal (6.8k points)
how did you get CPI as 3?
B/C it have 2 stalls
For those who didn't get how CPI=3:

1    2    3   4   5    6   7    8    9

S1 S2 S3 S4 S5 S6

      X    X  S1 S2 S3 S4 S5 S6

The first instruction takes 6 clock cycles. S1 stage of next instruction should start at clock cycle 2. But because of 2 stall cycles S1 stage starts at clock cycle 4. Thus 1st instruction will get completed at 6 , second instruction will get completed at 9 so on...

So CPI=3
very good explanation

cpi=1+stall cycle

+4 votes
It was a numerical digit type question so answer must be 4.

As for 6 stages, non-pipelining takes 6 cycles.

There were 2 stall cycles for pipelining for 25% of the instructions

So pipe line time = (1+(25/100)*2) = 1.5

Speed up = Non pipeline time/Pipeline time = 6/1.5 = 4
by Loyal (9.4k points)
+2 votes

Suppose 'n' instructions are there and n is very large...and Tp time of each cycle. Thus we are having 0.25n branched and 0.75n non-branched instructions

now 0.25n instructions create 2 stalls and thus the time taken is:

           =    #f cycles * Tp

           =   (6+0.25n-1 + 2*0.25n)Tp

           =   (5+0.75n)Tp

Now similarly for non branch, time= (6+0.75n-1)Tp = (5+0.75n)Tp

thus total time taken in Pipeline =  (5+0.75n)Tp + (5+0.75n)Tp =(10 + 1.5n)Tp

For Non pipeline time taken for n instruction = #f cycles for per instruction * each cycle time * #f instructions

                                                                      = 6*Tp*n = 6n*Tp

now, speedup achieved =Time(non-pipe)/Time(pipe)

                                      = (6n Tp) / (10+1.5n)Tp 

                                      = 6n / (10+1.5n)

  as  n → , taking limits,  speedup= 6/1.5 =4



by Active (1.8k points)
+1 vote
The question has been answered before. However, when I looked at it, I couldn't get it. So I am providing an alternate explanation here.

By the basic performance equation, we know that $$ \text{Execution Time} = \frac{N \times S}{ R} $$ where N is the number of instructions, S is the average clock cycles taken per instruction and R is the number of clock cycles per second.

Another useful metric is throughput, which is the number of instructions executed per second. It is defined as:

$$\text{Throughput} = \frac{R}{S}$$.

Now in an non-pipelined processor, the number of stages is equal to the number of cycles i.e $S = \text{No. of stages}$

In a pipelined processor, the value of $S = 1$ because an instruction can enter the pipeline in every cycle.

However, this is the case for an ideal pipeline. In case of a pipeline with stall, the general formula is:

$$S = 1 + \delta_{stall} + \delta_{branch} + \delta_{cache}$$ where $\delta$ is the penalty associated with each of the misses.

Here, $\delta_{stall}$ is given as $0.25 \times 2$, hence the value of $S$ is $1+ 0.5 = 1.5$.

So according to the question, non-pipelined $T_{np} = R/6$ and for the pipelined, $T_{p} = R/1.5$.

From this, we can see that the speedup is $4$ times.
by Loyal (6.3k points)
0 votes
In non-pipelined , for 100 instructions, we need 6*100 = 600 cycles. As 6 stages are there and no cycle time overhead.

Now in pipelined architecture we need = 6+ (100-1) = 105 cycles and 25 instructions additionally take 2 cycle stalls so 25*2 = 50 cycles more, so 105+50 = 155 cycles in pipelined architecture.

So speedup = 600/155 = 3.87 ~ 4.

4 is the ans.
by Junior (863 points)

Related questions

Quick search syntax
tags tag:apple
author user:martin
title title:apple
content content:apple
exclude -tag:apple
force match +apple
views views:100
score score:10
answers answers:2
is accepted isaccepted:true
is closed isclosed:true
49,823 questions
54,818 answers
80,991 users