5 votes
535 views
Consider a pipelined processor with $5$ stages, $\text{Instruction Fetch} (\textsf{IF})$, $\text{Instruction Decode} \textsf{(ID)}$, $\text{Execute } \textsf{(EX)}$, $\text{Memory Access } \textsf{(MEM)}$, and $\text{Write Back } \textsf{(WB)}$. Each stage of the pipeline, except the $\textsf{EX}$ stage, takes one cycle. Assume that the $\textsf{ID}$ stage merely decodes the instruction and the register read is performed in the $\textsf{EX}$ stage. The $\textsf{EX}$ stage takes one cycle for $\textsf{ADD}$ instruction and two cycles for $\textsf{MUL}$ instruction. Ignore pipeline register latencies.

Consider the following sequence of $8$ instructions:

$$\textsf{ADD, MUL, ADD, MUL, ADD, MUL, ADD, MUL}$$

Assume that every $\textsf{MUL}$ instruction is data-dependent on the $\textsf{ADD}$ instruction just before it and every $\textsf{ADD}$ instruction (except the first $\textsf{ADD}$) is data-dependent on the $\textsf{MUL}$ instruction just before it. The $\textit{speedup}$ is defined as follows.

$$\textit{Speedup} = \dfrac{\text{Execution time without operand forwarding}}{\text{Execution time with operand forwarding}}$$

The $\textit{Speedup} $ achieved in executing the given instruction sequence on the pipelined processor (rounded to $2$ decimal places) is _____________
in CO and Architecture
0
Cycles with operand forwarding = 16

Cycles without operand forwarding = 37

S = 37/16 = 2.31…

2 Answers

4 votes
 
Best answer

1.875

$\text{Speedup(def in question)}=\cfrac{\text{Time without Operand Forwarding}}{\text{Time with Operand Forwarding}}$


Without Operand Forwarding:

\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}\hline
&1&2&3&4&5&6&7&8&9&10&11&12&13&14&15&16&17&18&19&20&21&22&23&24&25&26&27&28&29&30\\\hline
\text{ADD}&IF&ID&EX&MEM&WB\\\hline
\text{MUL}&&IF&ID&&&EX&EX&MEM&WB\\\hline
\text{ADD}&&&IF&&&ID&&&&EX&MEM&WB\\\hline
\text{MUL}&&&&&&IF&&&&ID&&&EX&EX&MEM&WB\\\hline
\text{ADD}&&&&&&&&&&IF&&&ID&&&&EX&MEM&WB\\\hline
\text{MUL}&&&&&&&&&&&&&IF&&&&ID&&&EX&EX&MEM&WB\\\hline
\text{ADD}&&&&&&&&&&&&&&&&&IF&&&ID&&&&EX&MEM&WB\\\hline
\text{MUL}&&&&&&&&&&&&&&&&&&&&IF&&&&ID&&&EX&EX&MEM&WB\\\hline
\end{array}

With Operand Forwarding:
\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}\hline
&1&2&3&4&5&6&7&8&9&10&11&12&13&14&15&16\\\hline
\text{ADD}&IF&ID&EX&MEM&WB\\\hline
\text{MUL}&&IF&ID&EX&EX&MEM&WB\\\hline
\text{ADD}&&&IF&ID&&EX&MEM&WB\\\hline
\text{MUL}&&&&IF&&ID&EX&EX&MEM&WB\\\hline
\text{ADD}&&&&&&IF&ID&&EX&MEM&WB\\\hline
\text{MUL}&&&&&&&IF&&ID&EX&EX&MEM&WB\\\hline
\text{ADD}&&&&&&&&&IF&ID&&EX&MEM&WB\\\hline
\text{MUL}&&&&&&&&&&IF&&ID&EX&EX&MEM&WB\\\hline
\end{array}
$\text{Time taken with Operand Forwarding }= 16$
$\text{Time taken without Operand Forwarding}=30$

$\text{Speedup}=\cfrac{30}{16}=1.875$
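
The two timing diagrams can be cross-checked with a small simulation. This is only a sketch under assumptions that the question does not state explicitly: in-order issue, one instruction per stage, a stalled instruction holds its stage, and without forwarding a dependent instruction starts EX in the cycle after its producer's WB. The function name `total_cycles` and its parameters are made up for illustration.

```python
# Hypothetical stage-level model of the 5-stage pipeline in the question.
EX_CYCLES = {"ADD": 1, "MUL": 2}  # EX latency per opcode, from the question


def total_cycles(ops, forwarding):
    """Return the cycle in which the last instruction finishes WB.

    Every instruction after the first is treated as data-dependent on
    the instruction just before it, as the question specifies.
    """
    # First cycle the previous instruction spent in each stage (0 = none yet).
    p_if = p_id = p_ex = p_ex_end = p_mem = p_wb = 0
    for k, op in enumerate(ops):
        c_if = max(p_if + 1, p_id)          # IF frees up when the predecessor enters ID
        c_id = max(c_if + 1, p_ex)          # ID frees up when the predecessor enters EX
        c_ex = max(c_id + 1, p_ex_end + 1)  # EX frees up after the predecessor's last EX cycle
        if k > 0:
            # Assumption: with forwarding the operand is usable right after the
            # producer's EX; without it, only in the cycle after the producer's WB.
            ready = p_ex_end + 1 if forwarding else p_wb + 1
            c_ex = max(c_ex, ready)
        c_ex_end = c_ex + EX_CYCLES[op] - 1
        c_mem = max(c_ex_end + 1, p_mem + 1)
        c_wb = max(c_mem + 1, p_wb + 1)
        p_if, p_id, p_ex, p_ex_end, p_mem, p_wb = c_if, c_id, c_ex, c_ex_end, c_mem, c_wb
    return p_wb


ops = ["ADD", "MUL"] * 4
without_fwd = total_cycles(ops, forwarding=False)
with_fwd = total_cycles(ops, forwarding=True)
print(without_fwd, with_fwd, without_fwd / with_fwd)  # 30 16 1.875
```

Under these assumptions the model gives the same stage start cycles as the two tables above, so the 30 and 16 follow from the stall rules rather than from how the diagram was drawn.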


0

Sir, I wrote 1.875 as my answer, which has 3 decimal places.

The question asks for the answer rounded to 2 decimal places. Will 1.875 be accepted in the final evaluation? Both the GO predictor and ME have awarded marks for it. Please clarify, sir.

      

 

2 votes

Speedup = $\frac{23}{16} = 1.4375 \approx 1.44$
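
The answer does not show where 23 comes from. One reading that reproduces it, stated here as an assumption and not as the intended interpretation: let $g$ be the number of cycles from the producer's last $\textsf{EX}$ cycle to the consumer's first $\textsf{EX}$ cycle. Since every instruction depends on the one before it, the $\textsf{EX}$ stages form a single chain, and the total cycle count $T(g)$ is

$$\begin{aligned}
T(g) &= \underbrace{2}_{\textsf{IF},\ \textsf{ID}} + \underbrace{(4 \cdot 1 + 4 \cdot 2)}_{\textsf{EX}\text{ cycles}} + \underbrace{7\,(g-1)}_{\text{stalls}} + \underbrace{2}_{\textsf{MEM},\ \textsf{WB}} = 16 + 7\,(g-1) \\
T(1) &= 16 \quad \text{(operand forwarding)} \\
T(2) &= 23 \quad \text{(consumer's EX in the same cycle as the producer's WB)}
\end{aligned}$$

With $g = 2$ for the no-forwarding case, the ratio is $23/16 = 1.4375$.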

0

The dependency is from ADD to MUL.

There is no dependency from MUL to ADD.

 

I think it should be 20/16 = 1.25.

Please check.

3
Answer should be 30/16 = 1.875 . Please check.
2
With operand forwarding: 16 cycles.

Without operand forwarding:

Case 1:

From I2 onward, if EX can start in the same cycle as the producer's WB, we get 23 cycles.

Speedup = 23/16

Case 2:

From I2 onward, if EX can start only in the cycle after the producer's WB, we get 30 cycles.

Speedup = 30/16

Which case should be preferred for the count without operand forwarding?
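
The difference between the two cases can be seen with a short recurrence over the EX stages. This is a rough sketch, not an answer to which case the examiners intend. It relies on the fact that every instruction depends on the one before it, so the EX stages form one chain and fetch/decode never become the bottleneck. The function `chain_cycles` and the parameter `gap` are hypothetical names; `gap` is the number of cycles from the producer's last EX cycle to the consumer's first EX cycle.

```python
EX_CYCLES = {"ADD": 1, "MUL": 2}  # EX latency per opcode, from the question


def chain_cycles(ops, gap):
    """Total cycles for a fully dependent chain of instructions.

    gap = 1: operand forwarding (EX right after the producer's EX)
    gap = 2: Case 1, EX in the same cycle as the producer's WB
    gap = 3: Case 2, EX in the cycle after the producer's WB
    """
    ex_end = 2 + EX_CYCLES[ops[0]]  # IF in cycle 1, ID in cycle 2, then EX
    for op in ops[1:]:
        ex_start = ex_end + gap
        ex_end = ex_start + EX_CYCLES[op] - 1
    return ex_end + 2  # MEM and WB of the last instruction


ops = ["ADD", "MUL"] * 4
for label, gap in [("forwarding", 1), ("case 1", 2), ("case 2", 3)]:
    print(label, chain_cycles(ops, gap))
# forwarding 16
# case 1 23
# case 2 30
```

Each extra cycle of gap adds one stall per dependent instruction, and there are 7 of them, which is why the counts step from 16 to 23 to 30.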
0
Let's wait for the answer key.