12,639 views

An instruction pipeline has five stages where each stage take 2 nanoseconds and all instruction use all five stages. Branch instructions are not overlapped. i.e., the instruction after the branch is not fetched till the branch instruction is completed. Under ideal conditions,

1. Calculate the average instruction execution time assuming that 20% of all instructions executed are branch instruction. Ignore the fact that some branch instructions may be conditional.
2. If a branch instruction is a conditional branch instruction, the branch need not be taken. If the branch is not taken, the following instructions can be overlapped. When 80% of all branch instructions are conditional branch instructions, and 50% of the conditional branch instructions are such that the branch is taken, calculate the average instruction execution time.

edited
@Kushagra Sir, I understood your approach and it’s one of the best ways to solve such problems. Number of stalls = number of pipeline stages – 1[general case formula for branch instructions which can’t be overlapped](Implement pipeline operation on 2 instructions in which 1st instruction is branched instruction. You will see that here, we have 5 stages so let’s say (IF, OF, EX, MEM, WB are the stages). So, 1st instruction(branched instruction) will execute with 1 CPI(by default in the pipeline) but then 2nd instruction (irrespective of whether branched or non branched) will have to wait till WB of 1st instruction completes as it was branched instruction (In question, it’s mentioned we can’t overlap the branch, so we can’t executing 2nd instruction until and unless we get confirmation from 1st instruction(branched) whether it was branched or not, hence 2nd instruction generates 4 stalls)). Also,default case for branch instruction that can’t be overlapped, CPI = 1 + no. of stalls generated because of branch instructions [for pipeline].

@Arjun sir

I understood the question,
but have doubt regarding one point

If a branch instruction is a conditional branch instruction, the branch need not be taken.

what does the highlighted part implies?

Based on the condition (some flag being set) only JUMP is taken. This link shows some example instructions on x86. https://stackoverflow.com/questions/28182827/useless-jp-jnp-assembly-instruction-on-x86-64

Each stage is $2$ns. So, after $5$ time units each of $2$ns, the first instruction finishes (i.e., after $10$ns), in every $2$ns after that a new instruction gets finished. This is assuming no branch instructions. Now, once the pipeline is full, we can assume that the initial fill time doesn't matter our calculations and average execution time for each instruction is $2$ns assuming no branch instructions.

1. Now, we are given that $20\%$ of instructions are branch (like JMP) and when a branch instruction is executed, no further instruction enters the pipeline. So, we can assume every $5$th instruction is a branch instruction. So, with this assumption, total time to finish $5$ instruction will be $5 * 2 + 8 = 18$ ns (as when a branch instruction enters the pipeline and before it finishes, $4$ pipeline stages will be empty totaling $4 * 2 = 8$ ns, as it is mentioned in question that the next instruction fetch starts only when branch instruction completes). And this is the same for every set of 5 instructions, and hence the average instruction execution time $= 18/5 = 3.6$ ns

2. This is just a complex statement. But what we need is to identify the $\%$ of branch instructions which cause a branch to be taken as others will have no effect on the pipeline flow.
$20\%$ of instructions are branch instructions. $80\%$ of branch instructions are conditional.
That means .$2*.8 = 16\%$ of instructions are conditional branch instructions and it is given that $50\%$ of those result in a branch being taken.
So, $8\%$ of instructions are conditional branches being taken and we also have $20\%$ of $20\% = 4\%$ of unconditional branch instructions which are always taken.

So, percentage of instructions where a branch is taken is $8+4 = 12\%$ instead of $20\%$ in (A) part.

So, in $100$ instructions there will be $12$ branch instructions. We can do a different calculation here as compared to (A) as $12$ is not a divisor of $100$. Each branch instruction causes a pipeline delay of $4*2 = 8$ ns. So, $12$ instructions will cause a delay of $12 * 8 = 96$ ns. For $100$ instructions, we need $100 * 2 = 200$ ns without any delay and with delay we require $200 + 96 = 296$ ns for $100$ instructions.

So, average instruction execution time $= 296/100 = 2.96$ ns

(We can also use this method for part (A) which will give $100 * 2 + 20*8 = 360$ ns for $100$ instructions)

by

@Arjun sir

Can you please explain why in case of unconditional branch instruction, next instruction is known @ the end of instruction only and not @the end of decode or execution stage??

In part b) "for 100 iinstructions, we need 100*2=200ns without any delay".  It should be (5+99*1)*2=204ns without any delay,isnt it?

@Vishal Rana,

As we don't know total number of instructions, so we can't find average time taking into account initial pipeline fill time,

But in both A and B we have assumed number of instructions to be 100 for calculation. So, shouldn't we use (k+n-1) method of including initial instruction's total time.

if an instruction branches then it takes $2ns \times 5 = 10ns$, coz if branch is taken then the instruction after that branch instruction is not fetched until entire current branch instruction is completed, this means it will go through all stages.

if an instruction is non-branch or branching does not happen then, it takes $2ns$ to get completed.

a) $\text{average time taken} = 0.8 \times 2ns + 0.2 \times 10ns = 3.6ns$

b)

$\text{Average time taken,}$

$= 0.8 \times 2ns + 0.2 \Big( 0.2 \times 10ns + 0.8 \big( 0.5\times 10ns + 0.5 \times 2ns\big)\Big)$

$=2.96ns$

@Abhisek Tiwari 4 ,I am also getting 3.68ns

Abhisek Tiwari 4  QUESTION SAYS UNDER IDEAL CONDITION SO  IN PIPELINE IDEAL CONDITION IS CPI=1

@amarVashishth

Kindly help me in this. I followed the approach given in your solution , got answer as 62.5 but that is wrong.

https://gateoverflow.in/285418/go2019-flt1-61

A))

Let total number of instruction is n

Tavg=n x 20% x 5 x 2ns + n x 80% x 1 x 2ns/n

Tavg=3.6ns

B))

Let us suppose we have 100 instruction

20% are branch instruction and 80% Not branch instruction

out of 20% ,80% of 20% are cnditional branch and 20% of 20% are unconditional

Now 50% of 80% of 20% are branch instruction where branch is taken and 50% of 80% of 20% branch not taken.

Clock average=148/100=1.48

Tavg=1.48 x 2ns=2.96ns

.........

### 1 comment

New CPI = OLD CPI + stall-rate*stall-cycle

execution time = new CPI * cycle time.