The Gateway to Computer Science Excellence
First time here? Checkout the FAQ!
x
+19 votes
4.7k views
An instruction pipeline has five stages, namely, instruction fetch (IF), instruction decode and register fetch (ID/RF), instruction execution (EX), memory access (MEM), and register writeback (WB) with stage latencies $1$ ns, $2.2 $ ns, $2$ ns, $1$ ns, and $0.75$ ns, respectively (ns stands for nanoseconds). To gain in terms of frequency, the designers have decided to split the ID/RF stage into three stages (ID, RF1, RF2) each of latency $2.2/3$ ns. Also, the EX stage is split into two stages (EX1, EX2) each of latency $1$ ns. The new design has a total of eight pipeline stages. A program has $20\%$ branch instructions which execute in the EX stage and produce the next instruction pointer at the end of the EX stage in the old design and at the end of the EX2 stage in the new design. The IF stage stalls after fetching a branch instruction until the next instruction pointer is computed. All instructions other than the branch instruction have an average CPI of one in both the designs. The execution times of this program on the old and the new design are $P$ and $Q$ nanoseconds, respectively. The value of $P/Q$ is __________.
asked in CO & Architecture by Veteran (106k points)
edited by | 4.7k views
+3

Is this solution from gateforum correct?

I couldn't understand how stall clocks are 2 and 5 they should be 1 and 4 instead. Please explain

0
2nd last line shud be  Q = (.8 + .12) * 1ns = 2 ns . else correct.
+1
Thanks
0

For explanation with diagrams see this thread

https://www.facebook.com/groups/core.cs/permalink/1378313795534203/

0
This is absolutely correct.You look at the fact that in first design pipeline will be stalled for 2 stages where as in second design pipeline will be stalled for 5 stages due to intermediate breakdown of stages.

4 Answers

+34 votes
Best answer

Five stages:

(IF), instruction decode and register fetch (ID/RF),

instruction execution (EX),

memory access (MEM), and register writeback (WB)


P old  design:

with stage latencies $\text{1 ns, 2.2 ns, 2 ns, 1 ns, and 0.75 ns}$

$\text{MAX( 1 ns, 2.2 ns, 2 ns, 1 ns, and 0.75 ns) = 2.2nsec}$

AVG instruction execution time is 

$\text{Tavg=(1+no of stalls$\times $branch penality)$\times $cycle time}$

$=(1+0.20\times 2)2.2$  { branch peanlity is $2$ because the next instruction
     pointer at the end of the EX stage in the old design.}

 $=3.08 \text{ nsec}$


Q :new DESIGN:

the designers decided to split the ID/RF stage into three stages $\text{(ID, RF1, RF2)}$
each of latency $\dfrac{2.2}{3}\text{ ns}$. Also, the $EX$ stage is split into two stages
$\text{(EX1, EX2)}$ each of latency $1\text{ ns}$.
The new design has a total of eight pipeline stages.

Time of stages in new design $=\text{{1 ns, 0.73ns, 0.73ns, 0.73ns , 1ns,1ns, 1 ns, and 0.75 ns}}$

(IF), instruction decode

register fetch (ID/RF) $\rightarrow$ further divided into $3$ ie with latency $0.73$ of each 

instruction execution (EX) $\rightarrow$ further divided int $1\text{ nsec}$ of each)

memory access (MEM)

 register writeback (WB)

$\text{MAX( 1 ns, 0.73ns, 0.73ns, 0.73ns , 1ns,1ns, 1 ns, and 0.75 ns) =1 nsec}$

AVG instruction execution time is 

$\text{Tavg=(1+no of stalls$\times $branch penality)$\times $cycle time}$

$=(1+0.20\times 5)1$ { branch penalty is $5$ because the next instruction pointer
    at the end of the $EX2$ stage in the new design.}

$=2 \text{nsec}$


final result 

$\dfrac{P}{Q}=\dfrac{3.08}{2}=1.54$

answered by Boss (20.6k points)
edited by
0
I think in both case,branch penalty is  .2...and ni of stalls in first case 2,and second case 5..
0

 in both case,no of stalls is same 0.2  ....and BRANCH PENALTY  in first case 2,and second case 5.. rt???

0
@Arjun  sir.

 

as stall is a stage in pipeline..that is waiting stage..so how it is in fraction??
0

Could anyone will tell how to calculate (the no. of stall) ?

+3

@ashiwa see this,

0
Their is a doubt, I just can't understand in the formula of Tavg as u used above why we are adding 1 in the no. Of stalls ...pls explain it coz it is confusing me a lot
0
if you see clearly he had take 2.2 as common. so, actually he is adding 2.2(first case) and 1(second case) not 1
+19 votes
cpi for first case $=2.2(1+2\times .2)$ as the stall required is $2$ and $2.2$
is the maximum stage delay.

cpi for second state $=1\times (1+5\times .2)$ as now stall increase to $5$ as
there are five stages before the address is calculated and the maximum stage
delay now is $1.$

$\dfrac{\text{cpu_time1}}{\text{cpu_time2}}=\dfrac{3.08}{2}=1.54$
answered by Active (3.6k points)
edited by
+1
plzz explain how to find stall cycles here ??
0
to calculate stalls in question you have to  find the branch instructions . because branch instruction  cause the non overlapping of the subsequent instruction due to which waiting is there hence stalls occurs .!
0

@Arpit : The approach is logically correct, but shouldn't the maximum stage delay for the new design be 3ns instead of 1ns since it has been explicitly mentioned in the new design as per the question :

"split the ID/RF stage into three stages (ID, RF1, RF2) each of latency 2.2/3 ns" ?

So we should get the ratio as : 2.2*(1 + 0.2*2 ) / [3*(1 + 0.2*5)] = 0.5133.  (which is not really an improvement w.r.t the old design.)

0
In the first design, total no. of stages before EX is two and branch penalty is also two. Similarly in the second design total no. of stages before EX2 is 5 and branch penalty is also 5. Is this a general identity?
0
How  to find s tall cycle ?? Im getting stall cycles as 3 and 6
+6

Video Solution:

(Explained very nicely!)

and he calculated 3 and 6 CPI for Branch instead of 2 and 5 CPI

0
@arpit

Hey i guess you have mentioned wrong formula for CPI

CPI= 1+stall frequency*no of stall cycle

avg execution time= CPI*cycle time   where cycle time =max(stage delay)+register delay

what you have calculated is average execution time.

I know this does not affect the solution but using wrong notation or formula causes confusion.
+1 vote
OLD DESIGN :

phase time = 2 ns branch penalty= 2 (since execution is 3rd stage penalty will be of 2 cycles) ,                                                 other than all branch instrucions CPI = 1

branch instrucion = 20% , other instruction = 80%

total execution time = CPI*Phase time

                            = (1+2*.2)*2ns

                          P = 3.08ns

NEW DESIGN :

phase time = 1ns branch penalty = 5 (since EX2 is 6th stage penalty will be of 5 cycles)

total execution time = (1+.2*5)*1ns

                           Q= 2 ns

therefor P/Q = 3.08/2 = 1.54ns
answered by Active (2k points)
0 votes

This might help.

answered by Active (4.2k points)
Answer:

Related questions



Quick search syntax
tags tag:apple
author user:martin
title title:apple
content content:apple
exclude -tag:apple
force match +apple
views views:100
score score:10
answers answers:2
is accepted isaccepted:true
is closed isclosed:true

42,599 questions
48,601 answers
155,674 comments
63,742 users