Assume that there is no cycle-time overhead of pipelining ? what is meaning for this

The Gateway to Computer Science Excellence

+38 votes

Consider a $6$-stage instruction pipeline, where all stages are perfectly balanced. Assume that there is no cycle-time overhead of pipelining. When an application is executing on this $6$-stage pipeline, the speedup achieved with respect to non-pipelined execution if $25$% of the instructions incur $2$ pipeline stall cycles is ____________

0

There is no cycle time overhead of pipelining means all stage delays are same ...Having different stage delays will cause cycle time overhead ...

0

http://web.cs.iastate.edu/~prabhu/Tutorial/PIPELINE/pipe_title.html

but in this site they mentioned

If the stages are perfectly balanced, then the time per instruction on the pipelined machine is equal to __Time per instruction on nonpipelined machine__

Under these conditions, the speedup from pipelining equals the number of pipe stages.

+54 votes

Best answer

+9

Is this approach fine :

Suppose each instruction is taking n time without pipelining.

Time for 100 instructions : 100n

For pipelined So each stage will be n/6.

Time for 100 instructions : n+99*n/6+25*2*n/6 = 155n/6

So speed up is : 100*6/155 ~= 4

Suppose each instruction is taking n time without pipelining.

Time for 100 instructions : 100n

For pipelined So each stage will be n/6.

Time for 100 instructions : n+99*n/6+25*2*n/6 = 155n/6

So speed up is : 100*6/155 ~= 4

+1

yes, but you can avoid adding "n" and use 100 for 99 as for average time wee can assume a very large sequence of instructions- not just 100.

+5

2 stall for 25 instruction, lets take total instruction 100.

6 stage pipeline 100 instruction will complete in pipeline 6+(100-1) & for stall 2 cycle extra for 25 instructions so pipeline time 2+(25-1)

total pipeline time 6+99+2+24=131

Non pipeline = 6*100 =600

Speedup = Non pipeline / Pipeline = 600/131= 4.58 ≈ 4

It should be more accurate 4.58

0

when we take total 4 instructions, 2 stall for 1 instruction ( as per above calculation for 100 instructions!!)

Speedup = Non pipeline / Pipeline = 24/11= 2.18

Can somebody make this clear please?

Speedup = Non pipeline / Pipeline = 24/11= 2.18

Can somebody make this clear please?

0

@arjun is it correct to say those 25% instruction incurring stalls cycles should be contigous?Stalling of i leads to i+1 and so on?.Having 2 stall cycles doesn't neccesarily make CPI 3 for those 25% instruction, as pointed out in other answers.Please correct me if wrong.

0

@Anu007

Actually the question asked by manoja was that, suppose there are 4 instructions.

For non pipeline = 6 stages * 4 instructions = 24

For pipeline (if you draw the pipelining diagram considering 25% i.e 1 instruction containing 2 stall cycles, you get answer as) = 11

Speed up = non pipeline/ pipeline

I.e 24/11= 2.18

I was unable to understand your explanation on this question. So will you please elaborate it again for me?

Actually the question asked by manoja was that, suppose there are 4 instructions.

For non pipeline = 6 stages * 4 instructions = 24

For pipeline (if you draw the pipelining diagram considering 25% i.e 1 instruction containing 2 stall cycles, you get answer as) = 11

Speed up = non pipeline/ pipeline

I.e 24/11= 2.18

I was unable to understand your explanation on this question. So will you please elaborate it again for me?

+49 votes

Speed Up= 4

Speed up = Time without Pipeline / Time with Pipeline

Time without pipeline = 6 clock cycle

25% of the instructions incur 2 pipeline stall cycles means CPI = 3

Time with pipeline = 0.25*( 3 ) + 0.75*1 = 1.5 clock cycle

speed up = 6 clock cycle / 1.5 clock cycle = 4

Speed up = Time without Pipeline / Time with Pipeline

Time without pipeline = 6 clock cycle

25% of the instructions incur 2 pipeline stall cycles means CPI = 3

Time with pipeline = 0.25*( 3 ) + 0.75*1 = 1.5 clock cycle

speed up = 6 clock cycle / 1.5 clock cycle = 4

+7

For those who didn't get how CPI=3:

1 2 3 4 5 6 7 8 9

S1 S2 S3 S4 S5 S6

X X S1 S2 S3 S4 S5 S6

The first instruction takes 6 clock cycles. S1 stage of next instruction should start at clock cycle 2. But because of 2 stall cycles S1 stage starts at clock cycle 4. Thus 1st instruction will get completed at 6 , second instruction will get completed at 9 so on...

So CPI=3

1 2 3 4 5 6 7 8 9

S1 S2 S3 S4 S5 S6

X X S1 S2 S3 S4 S5 S6

The first instruction takes 6 clock cycles. S1 stage of next instruction should start at clock cycle 2. But because of 2 stall cycles S1 stage starts at clock cycle 4. Thus 1st instruction will get completed at 6 , second instruction will get completed at 9 so on...

So CPI=3

+7 votes

It was a numerical digit type question so answer must be 4.

As for 6 stages, non-pipelining takes 6 cycles.

There were 2 stall cycles for pipelining for 25% of the instructions

So pipe line time = (1+(25/100)*2) = 1.5

Speed up = Non pipeline time/Pipeline time = 6/1.5 = 4

As for 6 stages, non-pipelining takes 6 cycles.

There were 2 stall cycles for pipelining for 25% of the instructions

So pipe line time = (1+(25/100)*2) = 1.5

Speed up = Non pipeline time/Pipeline time = 6/1.5 = 4

+3 votes

Suppose 'n' instructions are there and n is very large...and Tp time of each cycle. Thus we are having 0.25n branched and 0.75n non-branched instructions

now 0.25n instructions create 2 stalls and thus the time taken is:

= #f cycles * Tp

= (6+0.25n-1 + 2*0.25n)Tp

= (5+0.75n)Tp

Now similarly for non branch, time= (6+0.75n-1)Tp = (5+0.75n)Tp

thus total time taken in Pipeline = (5+0.75n)Tp + (5+0.75n)Tp =(10 + 1.5n)Tp

For Non pipeline time taken for n instruction = #f cycles for per instruction * each cycle time * #f instructions

= 6*Tp*n = 6n*Tp

now, speedup achieved =Time(non-pipe)/Time(pipe)

= (6n Tp) / (10+1.5n)Tp

= 6n / (10+1.5n)

as n → **∞**, taking limits, **speedup= 6/1.5 =4**

+3 votes

The question has been answered before. However, when I looked at it, I couldn't get it. So I am providing an alternate explanation here.

By the basic performance equation, we know that $$ \text{Execution Time} = \frac{N \times S}{ R} $$ where N is the number of instructions, S is the average clock cycles taken per instruction and R is the number of clock cycles per second.

Another useful metric is throughput, which is the number of instructions executed per second. It is defined as:

$$\text{Throughput} = \frac{R}{S}$$.

Now in an non-pipelined processor, the number of stages is equal to the number of cycles i.e $S = \text{No. of stages}$

In a pipelined processor, the value of $S = 1$ because an instruction can enter the pipeline in every cycle.

However, this is the case for an ideal pipeline. In case of a pipeline with stall, the general formula is:

$$S = 1 + \delta_{stall} + \delta_{branch} + \delta_{cache}$$ where $\delta$ is the penalty associated with each of the misses.

Here, $\delta_{stall}$ is given as $0.25 \times 2$, hence the value of $S$ is $1+ 0.5 = 1.5$.

So according to the question, non-pipelined $T_{np} = R/6$ and for the pipelined, $T_{p} = R/1.5$.

From this, we can see that the speedup is $4$ times.

By the basic performance equation, we know that $$ \text{Execution Time} = \frac{N \times S}{ R} $$ where N is the number of instructions, S is the average clock cycles taken per instruction and R is the number of clock cycles per second.

Another useful metric is throughput, which is the number of instructions executed per second. It is defined as:

$$\text{Throughput} = \frac{R}{S}$$.

Now in an non-pipelined processor, the number of stages is equal to the number of cycles i.e $S = \text{No. of stages}$

In a pipelined processor, the value of $S = 1$ because an instruction can enter the pipeline in every cycle.

However, this is the case for an ideal pipeline. In case of a pipeline with stall, the general formula is:

$$S = 1 + \delta_{stall} + \delta_{branch} + \delta_{cache}$$ where $\delta$ is the penalty associated with each of the misses.

Here, $\delta_{stall}$ is given as $0.25 \times 2$, hence the value of $S$ is $1+ 0.5 = 1.5$.

So according to the question, non-pipelined $T_{np} = R/6$ and for the pipelined, $T_{p} = R/1.5$.

From this, we can see that the speedup is $4$ times.

0 votes

In non-pipelined , for 100 instructions, we need 6*100 = 600 cycles. As 6 stages are there and no cycle time overhead.

Now in pipelined architecture we need = 6+ (100-1) = 105 cycles and 25 instructions additionally take 2 cycle stalls so 25*2 = 50 cycles more, so 105+50 = 155 cycles in pipelined architecture.

So speedup = 600/155 = 3.87 ~ 4.

4 is the ans.

Now in pipelined architecture we need = 6+ (100-1) = 105 cycles and 25 instructions additionally take 2 cycle stalls so 25*2 = 50 cycles more, so 105+50 = 155 cycles in pipelined architecture.

So speedup = 600/155 = 3.87 ~ 4.

4 is the ans.

- All categories
- General Aptitude 1.9k
- Engineering Mathematics 7.6k
- Digital Logic 2.9k
- Programming and DS 4.9k
- Algorithms 4.4k
- Theory of Computation 6.2k
- Compiler Design 2.1k
- Databases 4.1k
- CO and Architecture 3.4k
- Computer Networks 4.2k
- Non GATE 1.4k
- Others 1.5k
- Admissions 595
- Exam Queries 573
- Tier 1 Placement Questions 23
- Job Queries 72
- Projects 18

50,832 questions

57,686 answers

199,273 comments

107,208 users