7.14  A pipelined processor uses the delayed branch technique. You are asked to recommend one of two possibilities for the design of this processor. In the first possibility, the processor has a four-stage pipeline and one delay slot; in the second possibility, it has a six-stage pipeline with two delay slots. Compare the performance of these two alternatives, taking only the branch penalty into account. Assume that 20 percent of the instructions are branch instructions and that an optimizing compiler has an 80 percent success rate in filling the single delay slot. For the second alternative, the compiler is able to fill the second slot 25 percent of the time.
asked in CO & Architecture by Boss (16.1k points) | 86 views

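Let T4 = CPI for the 4-stage pipeline and T6 = CPI for the 6-stage pipeline. A non-branch instruction (80% of instructions) takes 1 cycle; a branch (20%) takes 1 cycle plus 1 extra cycle for every delay slot the compiler could not fill.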

T4 = 0.8*1 + 0.2*(0.8*1 + 0.2*2) = 1.04

T6 = 0.8*1 + 0.2*(0.8*(0.75*2 + 0.25*1) + 0.2*3) = 1.2

Clearly, the machine with the 4-stage pipeline and 1 delay slot is faster than the machine with the 6-stage pipeline and 2 delay slots.
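The two CPI figures above can be reproduced with a short Python sketch. It assumes the same model as the answer: one extra cycle per unfilled delay slot, and a slot can only be filled if all earlier slots were filled. The function name `cpi_with_delay_slots` and its parameters are made up for this illustration.

```python
def cpi_with_delay_slots(branch_fraction, fill_rates):
    """CPI counting only branch penalties.

    fill_rates[i] is the (assumed) probability that slot i is filled,
    given that every earlier slot was filled.
    """
    slots = len(fill_rates)
    expected_filled = 0.0
    prob_filled_so_far = 1.0
    for rate in fill_rates:
        prob_filled_so_far *= rate          # P(slots 1..i all filled)
        expected_filled += prob_filled_so_far
    expected_wasted = slots - expected_filled  # unfilled slots per branch
    return 1.0 + branch_fraction * expected_wasted


t4 = cpi_with_delay_slots(0.2, [0.8])         # 4 stages, 1 delay slot
t6 = cpi_with_delay_slots(0.2, [0.8, 0.25])   # 6 stages, 2 delay slots
print(f"T4 = {t4:.2f}, T6 = {t6:.2f}")
```

Under these assumptions the script gives the same values as the hand calculation, 1.04 and 1.2.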

What logic are you applying for the stalls, i.e. the branch penalties?
The throughput of the 4-stage design will be 4/1.04 = 3.85 and that of the 6-stage design will be 6/1.2 = 5, so won't the 6-stage design be better?

$\text{performance} \propto \frac{1}{\text{CPI}}$

We assume that the number of instructions is infinite, so can we say that the system with fewer delay slots will always give better performance, irrespective of the number of pipeline stages?  @joshi_nitish

With all the other constraints held constant (optimizing compiler efficiency, percentage of branch instructions, etc.), the system with fewer delay slots will have a lower CPI and hence higher performance, irrespective of the number of stages.
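One way to see why this holds is to write the CPI for $k$ delay slots in general form. This is only a sketch under the model used in the answer above: the symbols $b$ (fraction of branch instructions) and $p_i$ (probability that slot $i$ is filled, given that slots $1, \dots, i-1$ were filled) are notation introduced here, and each unfilled slot is assumed to cost one cycle.

$$\text{CPI}_k = 1 + b\left(k - \sum_{i=1}^{k}\prod_{j=1}^{i} p_j\right)$$

$$\text{CPI}_{k+1} - \text{CPI}_k = b\left(1 - \prod_{j=1}^{k+1} p_j\right) \ge 0$$

With $b = 0.2$, $p_1 = 0.8$ and $p_2 = 0.25$ this gives $\text{CPI}_1 = 1 + 0.2(1 - 0.8) = 1.04$ and $\text{CPI}_2 = 1 + 0.2(2 - 0.8 - 0.2) = 1.2$, matching T4 and T6. The difference is never negative, so under these assumptions adding a delay slot cannot lower the CPI.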
