The Gateway to Computer Science Excellence

+67 votes

Suppose the functions $F$ and $G$ can be computed in $5$ and $3$ nanoseconds by functional units $U_{F}$ and $U_{G}$, respectively. Given two instances of $U_{F}$ and two instances of $U_{G}$, it is required to implement the computation $F(G(X_{i}))$ for $1 \leq i \leq 10$. Ignoring all other delays, the minimum time required to complete this computation is ____________ nanoseconds.

+5

The tricky part of the question is that we need to understand that the processor is dual core means same instruction can run in parallel.Here **U****F** and **U****G** has two instance so instruction run in parallel in both.

And the **F** and **G** are said as functions we need to understand these function are like to stages in pipeline.

**F(G(Xi)) **seems like function composition which means we need to compute **G(X) **first and then **F(X) .**

ie, stage **G** is followed by the stage **F**.

0

It is a superscalar processor's concept in which multiple instruction units (E units), each of which is usually pipelined so that they constitute a set of independent instruction pipelines.

+73 votes

Best answer

The same concept is used in pipelining. Bottleneck here is $U_F$ as it takes 5 ns while $U_G$ takes 3ns only. We have to do 10 such calculations and we have 2 instances of $U_F$ and $U_G$ respectively. So, $U_F$ can be done in $50/2 = 25$ nano seconds. For the start $U_F$ needs to wait for $U_G$ output for 3 ns and rest all are pipelined and hence no more wait. So, answer is

$$3 + 25 = 28 ns.$$

$$3 + 25 = 28 ns.$$

+3

Two instances mean 2 separate instances - Sorry I do not know a simpler word for instance.

G(Xi) is calculated for each i and on its result, say yi, we do F(yi).

G(Xi) is calculated for each i and on its result, say yi, we do F(yi).

0

sir my confusion is you are saying

1. g(xi) is being computed,and after that f(()).

lets say the above is line no 1. This above line is happening precisely 5 times from x1 to x5 in one instance(assuming both instances are running simultenously,I am considering this instance only).

so g(xi) 5 times and f(()) 5 times.That means i am considering g(xi) and f() as 2 phases in 1 instruction,one is 5ns and another is 3ns,so max taking 5ns,and so 5 instructions in 1 instance.Now i dont know the stages but i can say only at the execution stage of g(xi) i can get the Operand fetch phase of f(),so for 5 instruction it should be (1*2 + (5-1)*1)*5ns =30 ns?

so my confusion is one functional unit equivalent to one phase of an instruction? (the instruction being f(g(xi))) ?

1. g(xi) is being computed,and after that f(()).

lets say the above is line no 1. This above line is happening precisely 5 times from x1 to x5 in one instance(assuming both instances are running simultenously,I am considering this instance only).

so g(xi) 5 times and f(()) 5 times.That means i am considering g(xi) and f() as 2 phases in 1 instruction,one is 5ns and another is 3ns,so max taking 5ns,and so 5 instructions in 1 instance.Now i dont know the stages but i can say only at the execution stage of g(xi) i can get the Operand fetch phase of f(),so for 5 instruction it should be (1*2 + (5-1)*1)*5ns =30 ns?

so my confusion is one functional unit equivalent to one phase of an instruction? (the instruction being f(g(xi))) ?

+1

The question here is regarding the Functional Units meaning Hardware Instruction - there is no subparts to it like Operand Fetch and all as this is acually happening inside the EX stage of the Instruction Execution.

+1

sir so if i make a timelin

|123||45678||9......28|...I assume the second G(x2) is being executed within 4-8th second itself? otherwise how will F(G(x2)) will start from 9 th second without the result of G(x2)?

|123||45678||9......28|...I assume the second G(x2) is being executed within 4-8th second itself? otherwise how will F(G(x2)) will start from 9 th second without the result of G(x2)?

+2

Exactly - that is what is happening. And that is why we have 2 instances. In a one dimensional timeline we cannot represent this.

+6

Also i have come to this realization: a vague timeline diagram would be like this

note:any 3 under a 5 is 5 only because getting eclipsed.

|3||5|

|3||5|

|3||5|

|3||5|

|3||5|

also if in the question it was given 1 instance of f and 1 instance of g then answer would have been 53 ns,but if it was given 10 instances of f and 10 instances of g? then answer would be 8 ns :P

note:any 3 under a 5 is 5 only because getting eclipsed.

|3||5|

|3||5|

|3||5|

|3||5|

|3||5|

also if in the question it was given 1 instance of f and 1 instance of g then answer would have been 53 ns,but if it was given 10 instances of f and 10 instances of g? then answer would be 8 ns :P

0

I still could not understand what's two instances of U_{F} and U_{g} doing.

As given in ques Uf and Ug functional units take 5ns and 3ns to compute F and G.

**is it like functional unit means the instruction and 2 instances means 2 stage for both Uf and Ug? That's why Uf's two instances take 5ns in total and likewise Ug take 3ns. And i ^{th }Uf and (i+1)^{st } Ug can compute at the same time.**

0

Two instance of each unit is given.If I calculate the same for 1<=i<=5 with one instances each,its giving same result.Is it conceptually right?

+135 votes

+1

:O i again updated the previous pic only! :/ will upload it

yes those G phases can be done in parellel 2 at a time

yes those G phases can be done in parellel 2 at a time

0

two instances of $U_{F}$ and two instances for $U_{G}$, only told to eyewash. If it is given one instances or three instances, answer would be same

+45 votes

Just assume we have two stages in pipeline. They are G & F taking 3 & 5 seconds.

We have two instances of Uf & Ug. It is like having **dual core processor**. We have two pipeline processors.

So we will do i=1 to 5 in one processor & i= 6 to 10 in another processor.

So they will be done in parallel.

So only focus on i=1 to 5.

In ideal pipeline processor CPI=1 *(First instruction takes full time, after it in every cycle one instruction gets completed unless there is any form of hazard)*

So from 5 instructions, first one will take (5+3)= 8ns

For the rest 4 instructions in every 5 ns, 1 instruction will be completed. (Since max(5,3)=5)

5*4= 20ns

So total time taken = **28 ns** (ans)

+1

In ideal pipeline, the 1st instruction takes full time. I.e the sum of time taken by all stages..

The rest instruction takes CPI 1.. i.e the maximum of time taken by all stages..

So 1st one will take time 5+3... rest will take 5 each.

The rest instruction takes CPI 1.. i.e the maximum of time taken by all stages..

So 1st one will take time 5+3... rest will take 5 each.

0

Sir if it is always first instruction takes full time so when we calculate formula for pipeline ( k+(n-1))tp why we take clock cycles for 1 instruction in every stage same.

Like ex.

S1 2ns

S2 3ns

N=10

So our ans is by formula( k+n-1)tp is (2+9)3=33

But by taking first instruction alone then

(2+3)+(9)3=32

How ??

Like ex.

S1 2ns

S2 3ns

N=10

So our ans is by formula( k+n-1)tp is (2+9)3=33

But by taking first instruction alone then

(2+3)+(9)3=32

How ??

+9 votes

+8 votes

0

0

this is most correct diagram.

manali.rajput22 F & G are functions and not instruction. we can assume here as last step of function , results are somewhere in memory ,mostly in array format

+5 votes

$10$ instructions using two pair of functional units will take time equivalent to $5$ instructions.

For a pipeline with $k$ stages & $n$ instructions.

$T_{pipeline} \space =\space \sum_{i=1}^{k} \delta_{i} \space + \space (n-1)*\delta_{max}$ where $\delta_{i}$ is delay of $i_{th}$ stage.

$\implies T_{pipeline} \space = 3+5 + \space (5-1)*5 = 28 $

For a pipeline with $k$ stages & $n$ instructions.

$T_{pipeline} \space =\space \sum_{i=1}^{k} \delta_{i} \space + \space (n-1)*\delta_{max}$ where $\delta_{i}$ is delay of $i_{th}$ stage.

$\implies T_{pipeline} \space = 3+5 + \space (5-1)*5 = 28 $

+4 votes

This is pipeline concept

two unit says we have to pipelines with same functionality.we can start both of them at same time.

Lets distribute i's values 1 to 5 for pipeline1...6 to 10 for pipeline2...

T1=(3+5)+(4*5)=28=T2

T1=Time required for pipeline1

T2=Time required for pipeline2

Parallel running so answer should be 28.

- All categories
- General Aptitude 1.9k
- Engineering Mathematics 7.5k
- Digital Logic 2.9k
- Programming and DS 4.9k
- Algorithms 4.4k
- Theory of Computation 6.2k
- Compiler Design 2.1k
- Databases 4.1k
- CO and Architecture 3.4k
- Computer Networks 4.2k
- Non GATE 1.4k
- Others 1.4k
- Admissions 595
- Exam Queries 573
- Tier 1 Placement Questions 23
- Job Queries 72
- Projects 18

50,737 questions

57,313 answers

198,349 comments

105,053 users