The Gateway to Computer Science Excellence
+67 votes
Suppose the functions $F$ and $G$ can be computed in $5$ and $3$ nanoseconds by functional units $U_{F}$ and $U_{G}$, respectively. Given two instances of $U_{F}$ and two instances of $U_{G}$, it is required to implement the computation $F(G(X_{i}))$ for $1 \leq i \leq 10$. Ignoring all other delays, the minimum time required to complete this computation is ____________ nanoseconds.
in CO and Architecture by Boss (41.9k points)
retagged by | 8.3k views


 The tricky part of the question is that we need to understand that the processor is dual core means same instruction can run in parallel.Here UF and UG has two instance so instruction run in parallel in both. 
And the F and G are said  as functions we need to understand these function are like to stages in pipeline.

F(G(Xi)) seems like function composition which means we need to compute G(X) first  and then F(X) .

ie, stage G is followed by the stage F. 


Should be tagged difficult :D
It is a superscalar processor's concept in which multiple instruction units (E units), each of which is usually pipelined so that they constitute a set of independent instruction pipelines.
Refer COA by J.P hayes 3rd edition, Chapter 5.3.3 Page no 384-385
By reading the language in the question, how did  we get idea that this question is based on pipeline.

As they did not used a single term related to pipelining
Simple just see the timings of different functional units , when F functional units are computing G's it will take 5ns , withinn this 5ns , computation of two more G's can take place.

10 Answers

+73 votes
Best answer
The same concept is used in pipelining. Bottleneck here is $U_F$ as it takes 5 ns while $U_G$ takes 3ns only. We have to do 10 such calculations and we have 2 instances of $U_F$ and $U_G$ respectively. So, $U_F$ can be done in $50/2 = 25$ nano seconds. For the start $U_F$ needs to wait for $U_G$ output for 3 ns and rest all are pipelined and hence no more wait. So, answer is

$$3 + 25 = 28 ns.$$
by Veteran (431k points)
selected by
can u explain this question with the help of pipeline diagram sir

What does that "two instances" mean? How is F(G(Xi))  working?

Two instances mean 2 separate instances - Sorry I do not know a simpler word for instance.

G(Xi) is calculated for each i and on its result, say yi, we do F(yi).
sir my confusion is you are saying

1. g(xi) is being computed,and after that f(()).

lets say the above is line no 1. This above line is happening precisely 5 times from x1 to x5 in one instance(assuming both instances are running simultenously,I am considering this instance only).

so g(xi) 5 times and f(()) 5 times.That means i am considering g(xi) and f() as 2 phases in 1 instruction,one is 5ns and another is 3ns,so max taking 5ns,and so 5 instructions in 1 instance.Now i dont know the stages but i can say only at the execution stage of g(xi) i can get the Operand fetch phase of f(),so for 5 instruction it should be (1*2 + (5-1)*1)*5ns =30 ns?

so my confusion is one functional unit equivalent to one phase of an instruction? (the instruction being f(g(xi)))   ?
The question here is regarding the Functional Units meaning Hardware Instruction - there is no subparts to it like Operand Fetch and all as this is acually happening inside the EX stage of the Instruction Execution.
sir so if i make a timelin

|123||45678||9......28|...I assume the second G(x2) is being executed within 4-8th second itself? otherwise how will F(G(x2)) will start from 9 th second without the result of G(x2)?
Exactly - that is what is happening. And that is why we have 2 instances. In a one dimensional timeline we cannot represent this.
interesting....ok got it sir :)
Also i have come to this realization: a vague timeline diagram would be like this

note:any 3 under a 5 is 5 only because getting eclipsed.






also if in the question it was given 1 instance of f and 1 instance of g then answer would have been 53 ns,but if it was given 10 instances of f and 10 instances of g? then answer would be 8 ns :P

I still could not understand what's two instances of UF and Ug doing.

As given in ques Uf and Ug functional units take 5ns and 3ns to compute F and G. 

is it like functional unit means the instruction and 2 instances means 2 stage for both Uf and Ug? That's why Uf's two instances take 5ns in total and likewise Ug take 3ns. And ith Uf and (i+1)st  Ug can compute at the same time.

Two instance of each unit is given.If I calculate the same for 1<=i<=5 with one instances each,its giving same result.Is it conceptually right?
no where it is written that pipeline concept is to be used ...?   then cant we do without pipeline..since i was thinking w/o pipelining..!!
Very good explanation sir.

a simpler word for instances are parellel units

+135 votes

if somebody is trying to understand through a pipeline diagram,

by Boss (12k points)
edited by
your timig dig is wromg !  for i=3 and i=4 , they are not exceuting simulatenously
i dint get u @dexter..
i=3 and i=4 should execute at the same time right ? at clock 4 ?
:O i again updated the previous pic only! :/ will upload it
 yes those G phases can be done in parellel 2 at a time
really very well explained .........................super.................

@Anusha Motamarri

Why did you take clock of 1ns?

It is not mentioned in the question.

Much convincing solution
Make this one the best.Seriously the "best one" declared above is so complex to understand.
great explanation......
two instances of $U_{F}$ and two instances for $U_{G}$, only told to eyewash. If it is given one instances or three instances, answer would be same
my question is that for instruction 5 G phase should start at 9th cycle instead of 7th .right?

because the buffer for the Gth  stage will  basically be filled with the output of the Gth phase of the 3rd instruction and this might be overwritten by the Gth phase of the 5th instruction.
good explanation
+45 votes

Just assume we have two stages in pipeline. They are  G & F taking 3 & 5 seconds.
We have two instances of Uf & Ug.  It is like having dual core processor. We have two pipeline processors.

So we will do  i=1 to 5 in  one processor  &  i= 6 to 10 in another processor.
So they will be done in parallel.

So only focus on i=1 to 5.

In ideal pipeline processor CPI=1 (First instruction takes full time, after it in every cycle one  instruction gets completed unless there is any form of hazard)

So from 5 instructions, first one will take (5+3)= 8ns
For the rest 4 instructions in every 5 ns, 1 instruction will be completed. (Since max(5,3)=5)
5*4= 20ns

So total time taken = 28 ns (ans)

by Boss (12.4k points)
@ahwan how from 5 instructions one instruction will take 5+3=8 ns ?
In ideal pipeline, the 1st instruction takes full time. I.e the sum of time taken by all stages..
The rest instruction takes CPI 1.. i.e the maximum of time taken by all stages..
So 1st one will take time 5+3...  rest will take 5 each.
Sir if it is always first instruction takes full time so when we calculate formula for pipeline ( k+(n-1))tp why we take clock cycles for 1 instruction in every stage same.

Like ex.

S1 2ns

S2 3ns


So our ans is by formula( k+n-1)tp is (2+9)3=33

But by taking first instruction alone then



How ??
Thanks for making things clear!!
What is clock cycle here ? Is it 1ns ?
+9 votes

We can use two instances of both F and G at same time but F depends on G.So,first compute G then F.

by Active (4.8k points)
How G(1) and G(2) both in 3?
we have two instances of both G and F.

So we can both of  them in parallel.
+8 votes

Using concepts of pipelining, answer comes out to be 28.

by (169 points)
Where will the results of G3, G4, G7 and G8 will be stored, if they start executing for next input?

this is most correct diagram.

  F & G are functions and not instruction. we can assume here as last step of function , results are somewhere in memory ,mostly in array format

+5 votes
$10$ instructions using two pair of functional units will take time equivalent to $5$ instructions.

For a pipeline with $k$ stages & $n$ instructions.

$T_{pipeline} \space =\space \sum_{i=1}^{k} \delta_{i} \space + \space (n-1)*\delta_{max}$ where $\delta_{i}$ is delay of $i_{th}$ stage.

$\implies T_{pipeline} \space = 3+5 + \space (5-1)*5 = 28 $
by Loyal (6.3k points)
+5 votes

This might also work .

by Junior (625 points)
Cleared all the doubts.. Good explanation
+4 votes

This is pipeline concept

two unit says we have to pipelines with same functionality.we can start both of them at same time.

Lets distribute i's values 1 to 5 for pipeline1...6 to 10 for pipeline2...


T1=Time required for pipeline1

T2=Time required for pipeline2

Parallel running so answer should be 28.

by Boss (25.6k points)

Related questions

Quick search syntax
tags tag:apple
author user:martin
title title:apple
content content:apple
exclude -tag:apple
force match +apple
views views:100
score score:10
answers answers:2
is accepted isaccepted:true
is closed isclosed:true
50,737 questions
57,313 answers
105,053 users