34,478 views
Consider the following reservation table for a pipeline having three stages $S_1, S_2 \text{ and } S_3$.
$$\begin{array}{|ccccc|} \hline \textbf{Time} \rightarrow \\\hline & \text{1}& \text{2} & \text{3} & \text{4} & \text{5} \\\hline \textbf{S _1} & \text{X} & & & & \text{X}\\\hline \textbf{S _2} & & \text{X} & & \text{X}\\\hline \textbf{S _3} & & & \text{X} & \\\hline \end{array}$$
The minimum average latency (MAL) is ______

I found this video to be really good. Cleared all concepts.

@AyushBhatt

Reference: Page 24 http://www2.cs.siu.edu/~cs401/Textbook/ch3.pdf

$S_1$ is needed at time $1$ and $5,$ so its forbidden latency is $5-1=4.$

$S_2$ is needed at time $2$ and $4,$ so its forbidden latency is $4-2=2.$

So, forbidden latency $= (2,4,0)$ ( $0$ by default is forbidden)

Allowed latency $= (1,3,5)$ (any value more than $5$ also).

Collision vector $(4,3,2,1,0) = 10101$ which is the initial state as well.

From initial state we can have a transition after $\text{“1"}$ or $\text{“3"}$ cycles and we reach new states with collision vectors $(10101 >> 1 + 10101 = 11111)$ and $(10101 >> 3 + 10101 = 10111)$ respectively.
These $2$ becomes states $2$ and $3$ respectively. For $\text{“5"}$ cycles we come back to state $1$ itself.

From state $2\ (11111),$ the new collision vector is $11111.$ We can have a transition only when we see the first $0$ from the right. So, here it happens on $5^{th}$ cycle only which goes to the initial state. (Any transition after $5$ or more cycles goes to initial state as we have $5$ time slices).

From state $3\ (10111),$ the new collision vector is $10111.$ So, we can have a transition on $3,$ which will give $(10111 >> 3 + 10101 = 10111)$ third state itself. For $5,$ we get the initial state. Thus all the transitions are complete.

$$\begin{array}{|c|c|c|} \hline \textbf {State\Time} & \textbf {1} & \textbf {3} & \textbf{5 } \\\hline \textbf{1(10101)} & \text{2}& \text{3} & \text{1} \\\hline \textbf{2(11111)} & \text{-} & \text{-}& \text{1}\\\hline \textbf{3(10111)}& \text{-}&\text{3} & \text{1}\\\hline \end{array}$$

So, minimum length cycle is of length 3 either from $\text{3-3}$ or from $\text{1-3,3-1}$.

Not asked in the question, still.

Pipeline throughput is the number of instructions initiated per unit time.
So, with $MAL = 3,$ we have $2$ initiations in $1+3 = 4$ units of time (one at time unit $1$ and another at time unit $4$ ). So, throughput $=\dfrac{2}{4}=0.5$.

Pipeline efficiency is the $\%$ of time every stage of the pipeline is being used.
For the given question we can extend the reservation table and taking $MAL = 3,$ we can initiate new tasks after every $3$ cycles. So, we can consider the time interval from $\text{4-6}$ in the below figure. (The red color shows a stage not being used- affects efficiency).$$\begin{array}{|ccccc|} \hline \textbf{Time} \rightarrow \\\hline & \text{1}& \text{2} & \text{3} & \text{4} & \text{5} & \text{6} & \text{7} & \text{8} & \text{9} & \text{10} & \text{11} \\\hline \textbf{S _{1}} & \text{X} & & & \text{Y} & \text{X} & \times & \text{Z} & \text{Y} & \times & \text{A} & \text{Z} \\\hline \textbf{S _{2}} & & \text{X} & & \text{X} & \text{Y} & \times & \text{Y} & \text{Z} & & \text{Z} & \text{A} \\\hline \textbf{S _{3}} & & & \text{X} & \times & \times & \text{Y} & & & \text{Z} & & & \\\hline \end{array}$$

Here (during cycles $4-6$ ), $\text{stage 1}$ is used $\dfrac{2}{3},$ $\text{stage 2}$ is used $\dfrac{2}{3}$ and stage $3$ is used $\dfrac{1}{3}.$
So, total stage utilization $=\dfrac{(2+2+1)}{9}=\dfrac{5}{9}$ and efficiency $=\dfrac{500}{9} \%=55.55\ %$.

For simulation, Reference: http://www.ecs.umass.edu/ece/koren/architecture/ResTable/SimpRes/

Similar Question

by

page number 25 has AB = (2,1,0), BA = (4,2,1,0) , so my question is how AB = (2,1,0), BA = (4,2,1,0) ?  arjun sir please help .

isn't the through put 1/3? we get an output every 3 cycle as the table drawn.
Thanks for this elaborated explanation 💜💜💜💜💜💜💜💜

first we find forbidden latency which is distance between each pair or X in the same row=(2,4) ...  FL indicate another task can not initiated after this

permissible latency =(1,3) we may initiate another task after this.

now we can find collision vector using FL ..

this collision vector will b known as initial state of the pipeline
collision vector = cycles (4,3,2,1) (one bit for each cycle )and bit at 2 and 4 will b one because of collision=(1010)
now we can construct state diagram using permissible latency .
at 1 ..we will shift right 1 bit to collision vector (it is initial state ) and perform OR operation to result with collision vector =
1010 after one right shift 0101
1010+0101=1111
same for at 3 ..1011
same for at >=5 ..1010

now we have consider these points to find MAL

from the state diagram we can determine optimal latency cycle which result in MAL.

we have to find simple cycle which is a latency cycle in which each state appears only once

some simple cycle are greedy cycle which is one whose edges are all made with minimum latencies  from their respective starting state
so here we find latency cycle (3) and (1,5)
so the answer will b min of (3 which have constant latency or avg latency of (1,5)=(1+5)/2=3)

http://www2.cs.siu.edu/~cs401/Textbook/ch3.pdf

@Anoop Sonkar Shouldn't 1111 on 5+ loop on 1111 also? Other than going back to 1010?

@Anoop Sonkar @Sambhrant Maurya I also have the same query because after shifting 5 bits to the left, we will have 0000. Doing OR bit-wise operation between 1111 and 0000 will result in 1111

@GoalSet1

We do bitwise-OR with Initial Vector.

Latency: It is time state difference between two successive initiation of tasks.

Set of forbidden Latencies: distance between each pair of reserved time slot(X) of each stage of pipeline.

Collision Vector : C = { $C_{n},C_{n-1},....,C_{2},C_{1}$}

Where, $C_{i} = 1$ if i ϵ F else $C{i} = 0$; n = number of bits in C = Max. forbidden latency.

Here, F = { 2,4 }  and collision vector, C = 1010.

Now, Put this collision vector C in a shift register. At the end of every time step, give a right shift to C.

if bit = 1 that comes out, then new task can’t be initiated.

if bit = 0 that comes out, initiate a new task.

q0: 1010 → 0101 (0) since bit 0 comes out, new task can be initiated. So bit-wise OR with CV. 0101 OR 1010 = 1111 ( new state q1 reached in 1 shift). From this state again shift till bit 0 comes out.

0101→ 0010(1) → 0001(0) => 0001 OR 1010 = 1011 ( new state q2 ). 0001→ 0000(1). (stop since all 0’s).

Repeat the procedure with all new states.

q1: 1111 → 0111(1) → 0011(1) → 0001(1) → 0000(1) → 0000(0) => 0000 OR 1010 = 1010 (initial state q0). Hence, it takes 5 or more shifts to reach from q1 to q0.

q2: 1011 → 0101(1) → 0010(1) → 0001(0) => 0001 OR 1010 = 1011 (q2). After, 3 shifts 1011 reaches to itself.

0001 → 0000(1) → 0000(0) => 0000 OR 1010 = 1010. (stop since all 0’s). Hence, it takes 5 or more steps to reach from q2 to q0.

Now, we can draw the state transition diagram:

• Cycles in this state diagram tells the sustainable way in which tasks can be initiated. [Stable Cycles]
• Average Latency possible in stable cycle = $\frac{Total \ latency\ in\ cycle}{ No.\ of\ states\ in\ that\ cycle}$.
• For Minimum Average Latency(MAL) take minimum of all average latencies.

There are three possible cycles.

1. q0 → q1 → q0.  Avg. latency = (1+5)/2 = 3.
2. q0 → q2 → q0.   Avg. latency = (3+5)/2 = 4.
3. q2 → q2.            Avg. latency = 3.

So, Minimum Average Latency (MAL) = Min{3,4,3} = 3.

### 1 comment

Good work Gaurav!

Appreciated!!

The Forbidden Latencies { 0, 2, 4 }
The Pipeline Collision Vector ( 1 0 1 0 1 )
The State Diagram
State 1 : 10101
Reaches State 2 ( 11111 ) after 1 cycle.
Reaches State 3 ( 11101 ) after 3 cycles.
Reaches State 1 ( 10101 ) after 5 or more cycles.
State 2 : 11111
Reaches State 1 ( 10101 ) after 5 or more cycles.
State 3 : 11101
Reaches State 3 ( 11101 ) after 3 cycles.
Reaches State 1 ( 10101 ) after 5 or more cycles.

There are 3 states in the state diagram:

1. State 1 represents 10101
2. State 2 represents 11111
3. State 3 represents 11101

The Greedy Cycle = (1, 5 )*

The Minimum Average Latency = (1+ 5)/2 = 3
The Throughput 1/3*100 = 0.33333 instruction per cycle

by