
A 4-stage pipeline has stage delays of $150$, $120$, $160$ and $140$ nanoseconds, respectively. Registers that are used between the stages have a delay of $5$ nanoseconds each. Assuming a constant clocking rate, the total time taken to process $1000$ data items on this pipeline will be:

1. $\text{120.4 microseconds}$

2. $\text{160.5 microseconds}$

3. $\text{165.5 microseconds}$

4. $\text{590.0 microseconds}$

In a pipelined processor, the same clock signal is applied to every segment at the same time (a synchronised clock). If the clock is positive-edge triggered, each segment starts its operation when the positive edge occurs. Because different segments take different times to complete their suboperations, the clock cycle must be chosen so that the data has reached the input of every segment before the next edge.

Here we cannot use a clock cycle time of less than 165 ns, because before 165 ns the data from segment 3 has not yet reached segment 4: it takes (160+5) ns to arrive at the input of segment 4. So take a clock cycle time of 165 ns. The 1st instruction then takes 165*4 ns and the remaining 999 instructions take 165 ns each, so the total time is 165*4 + 165*999 = 165495 ns.
$(k+n-1)\,t_p = (4+1000-1) \times 165 = 165495$ ns $\approx 165.5$ microseconds, where $t_p = 160+5 = 165$ ns.
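
The formula above can be checked with a few lines of Python. This is only a minimal sketch: the function name `pipeline_total_time_ns` and its parameters are made up for illustration, and it assumes every stage is clocked at the slowest stage delay plus one register delay.

```python
def pipeline_total_time_ns(stage_delays_ns, register_delay_ns, n_items):
    """Total time for n_items on a synchronous pipeline, via (k + n - 1) * tp."""
    k = len(stage_delays_ns)                       # number of stages
    tp = max(stage_delays_ns) + register_delay_ns  # clock period set by the slowest stage
    return (k + n_items - 1) * tp


total_ns = pipeline_total_time_ns([150, 120, 160, 140], 5, 1000)
print(total_ns, "ns")  # 165495 ns, i.e. about 165.5 microseconds
```

With `n_items = 1` the same function returns the latency of a single item, 4 * 165 = 660 ns.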
Is the rule this: “If a constant clocking rate is mentioned, we take the maximum of all the stage delays and compute the run time from it; otherwise we compute the delay of each instruction independently by adding up the individual stage delays”?

Pipelining requires all stages to be synchronized, meaning the delay of every stage is made equal to the maximum stage delay, which here is $160$ ns. We also have to add the intermediate register delay of $5$ ns, which makes the clock period $165$ ns.

Time for execution of the first instruction $= 165* 4 = 660$ ns.

Now, in every $165$ ns, an instruction can be completed. So,

Total time for $1000$ instructions $= 660 + 999*165 = 165495$ ns $= 165.495$ microseconds

Correct Answer: $C$ (the third option, $165.5$ microseconds)
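
To see where the $660$ ns and the extra $165$ ns per instruction come from, here is a small cycle-by-cycle sketch in Python. The function `simulate_pipeline` is hypothetical and assumes an ideal pipeline with no stalls, where instruction $i$ enters the first stage at the start of clock cycle $i$.

```python
def simulate_pipeline(stage_delays_ns, register_delay_ns, n_items):
    """Return the completion time (ns) of every item under one common clock."""
    k = len(stage_delays_ns)
    tp = max(stage_delays_ns) + register_delay_ns  # common clock period
    completion_ns = []
    for item in range(n_items):
        enter_cycle = item             # item i enters stage 1 at the start of cycle i
        leave_cycle = enter_cycle + k  # it spends exactly one cycle in each of the k stages
        completion_ns.append(leave_cycle * tp)
    return completion_ns


times = simulate_pipeline([150, 120, 160, 140], 5, 1000)
print(times[0], times[1], times[-1])  # first, second and last completion times in ns
```

The first completion time is the 4-cycle fill latency, consecutive completions are one clock period apart, and the last entry is the total run time.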

Even I believe this would be the correct way to do it, since the registers are placed between the stages: 3×165+160=655 ns.
Also, when should we take the maximum of all the delays, and when should we use the delay of each individual stage, while calculating the execution time?

What if the maximum stage delay is at the last stage, say the 160 delay is at the last stage instead of the second-last stage? Will we still add the register delay to 160? The question says “Registers that are used ‘between’ the stages”, not after the last stage.

In a test series solution it was not added but I think it will be added. Please clarify.

@ankit3009 Individual delays are considered when we are calculating EMAT for non-pipelined structure.

The first instruction takes all four stages (4 cycles), and the remaining 999 instructions complete at a rate of one per clock cycle.

TT (total time) = 1 instruction x number of cycles x duration of each cycle + 999 instructions x number of cycles x duration of each cycle

TT = 1 x 4 x (160+5) + 999 x 1 x 165 ns

TT = 165,495 ns

TT = 165.495 microseconds

// Clock period = max(150, 120, 160, 140) + register delay = 165 ns
The delay of each inter-stage register is 5 ns.
Total stage delay in the pipeline = 150 + 120 + 160 + 140 = 570 ns
Total delay for one data item = 570 + 5*3 = 585 ns (note that there are 3 intermediate registers)
For 1000 data items, the first data item takes 585 ns to complete, and each of the remaining 999 data items takes the maximum stage delay, 160 ns, plus the 5 ns register delay.

Total delay = 585 + 999*165 = 165,420 ns, which is roughly 165.4 microseconds and closest to the 165.5 microseconds option.
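
As a rough comparison, the Python sketch below puts this count next to the constant-clock count of $4 \times 165$ ns for the first item. The variable names are made up for illustration; the only inputs are the numbers from the question.

```python
stage_delays_ns = [150, 120, 160, 140]
register_delay_ns = 5
n_items = 1000

tp = max(stage_delays_ns) + register_delay_ns  # 165 ns clock period

# Count used in this answer: the first item pays the raw stage delays plus 3 registers.
first_item_raw = sum(stage_delays_ns) + register_delay_ns * (len(stage_delays_ns) - 1)
total_raw = first_item_raw + (n_items - 1) * tp

# Constant-clock count: every stage of the first item also takes one full clock period.
first_item_clocked = len(stage_delays_ns) * tp
total_clocked = first_item_clocked + (n_items - 1) * tp

print(total_raw, total_clocked, total_clocked - total_raw)
```

The two totals differ only in the first item (585 ns versus 660 ns), a gap of 75 ns that is too small to move either result to a different answer option.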

Wrong