7.14  A pipelined processor uses the delayed branch technique. You are asked to recommend one of two possibilities for the design of this processor. In the first possibility, the processor has a four-stage pipeline and one delay slot; in the second, it has a six-stage pipeline with two delay slots. Compare the performance of these two alternatives, taking only the branch penalty into account. Assume that 20 percent of the instructions are branch instructions and that an optimizing compiler has an 80 percent success rate in filling the single delay slot. For the second alternative, the compiler is able to fill the second slot 25 percent of the time.
+2

T4 = CPI for the 4-stage pipeline, T6 = CPI for the 6-stage pipeline

T4 = 0.8*1 + 0.2*(0.8*1 + 0.2*2) = 1.04

T6 = 0.8*1 + 0.2*(0.8*(0.75*2 + 0.25*1) + 0.2*3) = 1.2

Clearly, the machine with the 4-stage pipeline and 1 delay slot is faster than the machine with the 6-stage pipeline and 2 delay slots.
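To check the arithmetic, here is a minimal Python sketch of the two CPI formulas above. It assumes my reading of the model: a branch costs 1 cycle when all of its delay slots are filled, plus 1 extra cycle for each unfilled slot, and the second slot can only be filled if the first one is. The function and parameter names are made up for illustration.

```python
def cpi_one_slot(branch_frac, fill1):
    """CPI with one delay slot (the T4 formula above)."""
    # Branch costs 1 cycle if the slot is filled, 2 cycles otherwise.
    branch_cost = fill1 * 1 + (1 - fill1) * 2
    return (1 - branch_frac) * 1 + branch_frac * branch_cost


def cpi_two_slots(branch_frac, fill1, fill2):
    """CPI with two delay slots (the T6 formula above)."""
    # First slot filled: 1 cycle if the second is also filled, else 2.
    first_filled_cost = fill2 * 1 + (1 - fill2) * 2
    # First slot not filled: both slots are wasted, so 3 cycles.
    branch_cost = fill1 * first_filled_cost + (1 - fill1) * 3
    return (1 - branch_frac) * 1 + branch_frac * branch_cost


# Numbers from the question: 20% branches, 80% and 25% fill rates.
t4 = cpi_one_slot(0.2, 0.8)
t6 = cpi_two_slots(0.2, 0.8, 0.25)
print(f"T4 = {t4:.2f}, T6 = {t6:.2f}")
```

With the numbers from the question this should give 1.04 and 1.20 (up to floating-point rounding), matching the values above.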

0
What logic are you applying for the stalls? I mean the branch penalties.
0
The throughput of the 4-stage pipeline will be 4/1.04 = 3.85 and that of the 6-stage pipeline will be 6/1.2 = 5, so won't the 6-stage pipeline be better?
0
@Tesla,

$\text{performance} \propto \frac{1}{\text{CPI}}$
0

If we assume that the number of instructions is infinite, can we say that the system with fewer delay slots will always give better performance, irrespective of the number of stages in the pipeline?  @joshi_nitish

+2
With all the other constraints remaining constant (optimizing compiler efficiency, percentage of branch instructions, etc.), the system with fewer delay slots will give a lower CPI and hence better performance, irrespective of the number of stages.
–1

For the four-stage pipeline,  Seff = 1 + 0.2 * (0.8 * 1 + 0.2 * 2) = 1.24

For the six-stage pipeline,  Seff = 1 + 0.2 * (0.8 * 2 + 0.8 * 0.25 * 1) = 1.36

The four-stage pipeline is faster than the six-stage pipeline.
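For comparison, a short Python sketch that just reproduces the two Seff expressions exactly as written in this answer. It does not try to justify the weights used for the six-stage case, and the function names are hypothetical.

```python
def seff_four_stage(branch_frac, fill1):
    # Penalty term as written above: 0.8 * 1 + 0.2 * 2
    penalty = fill1 * 1 + (1 - fill1) * 2
    return 1 + branch_frac * penalty


def seff_six_stage(branch_frac, fill1, fill2):
    # Penalty term as written above: 0.8 * 2 + 0.8 * 0.25 * 1
    penalty = fill1 * 2 + fill1 * fill2 * 1
    return 1 + branch_frac * penalty


s4 = seff_four_stage(0.2, 0.8)
s6 = seff_six_stage(0.2, 0.8, 0.25)
print(f"four-stage Seff = {s4:.2f}, six-stage Seff = {s6:.2f}")
```

The numbers differ from T4 and T6 in the other answer because this one adds the branch term on top of a base of 1 instead of weighting non-branch instructions by 0.8, but both models rank the four-stage design ahead.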

2