3,822 views
A processor $\text{X}_{1}$ operating at $2 \; \text{GHz}$ has a standard $5-$stage $\text{RISC}$ instruction pipeline having a base $\text{CPI (cycles per instruction)}$ of one without any pipeline hazards. For a given program $\text{P}$ that has $30 \%$ branch instructions, control hazards incur $2$ cycles stall for every branch. A new version of the processor $\text{X}_{2}$ operating at same clock frequency has an additional branch predictor unit $\text{(BPU)}$ that completely eliminates stalls for correctly predicted branches. There is neither any savings nor any additional stalls for wrong predictions. There are no structural hazards and data hazards for $\text{X}_{1}$ and $\text{X}_{2}.$ If the $\text{BPU}$ has a prediction accuracy of $80 \%,$ the speed up $\textit{(rounded off to two decimal places)}$ obtained by $\text{X}_{2}$ over $\text{X}_{1}$ in executing $\text{P}$ is _______________.

## 4 Answers

Execution Time = No.of Instructions x Clocks per Instructions x Clock cylce time

$Speedup = \frac{\text{Old system execution time}}{\text{New system execution time}}$

When no.of Instructions and clock cycle time are same, then

$Speedup = \frac{\text{Old system CPI}}{\text{New system CPI}}$

Normal processor CPI = 1, without any pipeline hazards.

Given that, Program P has 30% branch instructions where each instruction will lead to 2 stall cycles.

Processor X1 has NO BPU, therefore CPI = $1+\overset{\text{branch instructions penalty}}{\overbrace{(0.30*2)}}$ = 1.60

Processor X2 has BPU, therefore CPI = $1 + \overset{\text{branch instructions penalty}}{\overbrace{(0.30* ({ \underset{\text{BPU correctly predicted}}{\underbrace{0.80* 0}} \;\;\;+ \underset{\text{BPU wrongly predicted}}{\underbrace{0.2*2}} }) )}}$ = 1.12

$Speedup = \frac{\text{1.60}}{\text{1.12}}=1.4285$

### 9 Comments

Question asked for x2 over x1

https://en.wikipedia.org/wiki/Division_(mathematics)#Notation

which can also be read out loud as "divide a by b" or "a over b".

So, x2/x1=1.12/1.6=0.7

The Question specifically states Speedup, which is something important and has a defined meaning over something trivial like $x$ over $y$.  I believe the answer is correct.

@Shaik Masthan sir,

Ans should range between 1.42 to 1.43

I think one of the answer possible is 1.41.

Assuming 100 instrn and stages = 5 given lets find total cycles required in each case

for X1 : total cycles =  (1*5 + 99)  CPI=1 cycles + (30 * 2 ) extra cycles bcz of stalls

= 164        ( avg CPI = 1.64)

for X2:  total cycles = (1*5 + 99) CPI=1 cycles  + 20% has 2 stalls therefore ( 6*2 ) extra                                                                                                                                    cycles

total cycles for X2  =  104 + 16 = 116(Avg CPI = 1.16)

speed up = Avg CPI X1/Avf CPI X2  ( clock is same for both)

1.64/1.16   =   1.41

I think one of the answer possible is 1.41.

Assuming 100 instrn and stages = 5

Your approach is not correct. Answer will NOT be 1.41.

Already CPI is given as 1 BUT when you take 100 instructions only, CPI doesn't become 1.

Mistake that you did was to take 100 Instructions only.

With only 100 instructions, Your CPI(ignoring any branch instructions) : (1*5+99)/100 isn't 1.

We must not by-heart a method, we must understand it.

## There is neither any savings nor any additional stalls for wrong predictions.

What is the significance of this line??

Shouldn't it mean that no penalty for wrong prediction.

This line had confused me, and I assumed that there is no penalty for incorrect prediction which is, by common seems absurd but still, I thought maybe the question is asking us to deliberately assume that, very very bad understanding :(, and it cost me 2 marks :(

I am regretting now for not believing my common sense

reshown ago by
The answer given by mahisingh is correct. In fact, it is more accurate than the answer given by the IIT.  He took base CPI as 5 i.e, each stage will run for 1 cycle.

But he missed one convention used to calculate execution times. The Very first instruction in both processors should be taken as only one stage execution i.e, the first instruction takes only 1 cycle(1/5 CPI)  rather than 5 Cycles(1 CPI). So there is an extra 4 in both numerator and denominator.

So instead of 164/116 which is more accurate by all means, by convention, the answer is (164-4/116-4) => 160/112 = 1.428
edited

There is neither any savings nor any additional stalls for wrong predictions.

Sir isn’t this statement a bit ambiguous? Like can't it be understood like this as well → In case of a BUP unit being installed it eliminates additional stalls be it a correct prediction or a misprediction?? Going by that 1.6 can as well be an acceptable solution right?

Speed up factor will always be greater than 1

Dealing with percentages can be counterintuitive so let the program has 100 instructions, then it has 30 branch instructions.

The effective time taken by each instruction in X1 and X2 due to pipelining is 1 Cycle, without any pipeline hazards. CPI is 5 without pipelining. But Pipeline hazards are created by branch instructions (only) in both processors.

In X1, 70 instructions take 1 Cycle since CPI = 1 and another 30 instructions take 3 Cycles (Here an extra 2 is added to 1 for each branch instruction because of the 2-cycle stall caused by every branch instruction)

So for executing the program P with 100 instructions, the time taken by X1 is  (70 * 1) Cycles + (30 * 3)  Cycles = 160 cycles.

Therefore effective CPI (cycles per instruction) with X1 = 160/100 = 1.6

Similarly,

In X2, 70 instructions take 1 cycle. 80 percent of 30 instructions, i.e, 24 instructions take 1 cycle because BPU of X2 has a prediction accuracy of 80%. And 20 percent of 30 instructions, i.e, 6 instructions take 3 cycles. (since BPU prediction goes wrong for 20% of the instructions).

So for executing the program P with 100 instructions, the time taken by X2 is  (70 * 1) Cycles + (24 * 1) Cycles + (6 * 3) Cycles = 112 cycles.

Therefore effective CPI (cycles per instruction) with X2 = 112/100 = 1.12

Speedup Factor =  (N1 * C1 * T1 / N2 * C2 * T2) = 1.60/1.12 = 1.428… = 1.43 (rounded off)

Here N1 = N2 = 100, T1 = T2 = 1/(2 GHZ)

C1 = CPI of X1 and C2 = CPI of X2.

by

5 Stage RISC has these stages

 IF ID EX MA WB

Processor has a frequency of 2GHz  means clock pulse = 0.5 nano Sec (tp = 0.5 nsec)

As per the question CPI is one(1) when there is no any hazard. Now 30% instructions cause 2 cycle stall.

let us suppose we have some x number of instructions.  0.3*x instructions suffer stall of 2 cycles each means total stall cycles is 0.6*x. So total number of cycles for x instructions = 1.6*x.

Avg CPI is 1.6 cc

Now this version is improved which has now a Branch predictor unit. It is mentioned that if prediction goes wrong there is no any penalty(extra stall) means 2 cc stall will still be there which was there in older version.

But when we have a correct branch prediction those extra 2 cc stall in the older version would be eliminated.

$\textup{80% of 30%}$ instructions has zero(0) stall but  $\textup{20% of 30%}$ instructions has 2 cycle stall.

Number of stall cycles in the new version is $\frac{20}{100} * \frac{30}{100}* 2 = 0.12$ cc and overall CPI is 1.12 cc

Now speedup of older version X1 :

$\frac{Non - pipeline \, execution\, time}{Pipelined \, execution\, time}$ = $\frac{5*x* 0.5}{1.6*x* 0.5} = \frac{5}{1.6} = 3.125$

Speedup of New version X2 :

$\frac{Non - pipeline \, execution\, time}{Pipelined \, execution\, time}$ = $\frac{5*x* 0.5}{1.12*x* 0.5} = \frac{5}{1.12} = 4.464$

Now Ratio of Speedup of X2 over X1 = $\frac{Speedup\, X2}{Speedup\, X1} = \frac{4.64}{3.125} = 1.4285$

Answer:

0 votes
0 answers
1
485 views
4 votes
0 answers
2
267 views
0 votes
0 answers
3
288 views
0 votes
1 answer
4