GATE IT 2007 | Question: 41

Question

GATE IT 2007 | Question: 41

Ishrat Jahan asked Oct 29, 2014 • edited Apr 12, 2019 by ajaysoni1924

25,050 views

Following table indicates the latencies of operations between the instruction producing the result and instruction using the result.
$$\begin{array}{|l|l|c|} \hline \textbf {Instruction producing the result} & \textbf{Instruction using the result }& \textbf{Latency} \\\hline \text{ALU Operation} & \text{ALU Operation} & 2 \\\hline \text{ALU Operation} & \text{Store} & \text{2}\\\hline \text{Load} & \text{ALU Operation} & \text{1}\\\hline \text{Load} & \text{Store} & \text{0} \\\hline \end{array}$$

Consider the following code segment:

 Load R1, Loc 1; Load R1 from memory location Loc1
 Load R2, Loc 2; Load R2 from memory location Loc 2
 Add R1, R2, R1; Add R1 and R2 and save result in R1
 Dec R2;         Decrement R2
 Dec R1;         Decrement R1
 Mpy R1, R2, R3; Multiply R1 and R2 and save result in R3
 Store R3, Loc 3; Store R3 in memory location Loc 3

What is the number of cycles needed to execute the above code segment assuming each instruction takes one cycle to execute?

$7$
$10$
$13$
$14$

Ishrat Jahan asked Oct 29, 2014 • edited Apr 12, 2019 by ajaysoni1924

Ishrat Jahan

25.1k views

See all

5 Answers

Best answer

In the given question there are $7$ instructions each of which takes $1$ clock cycle to complete. (Pipelining may be used)
If an instruction is in execution phase and any other instructions can not be in the execution phase. So, at least $7$ clock cycles will be taken.
Now, it is given that between two instructions latency or delay should be there based on their operation. Ex- $1^{st}$ line of the table says that between two operations in which first is producing the result of an ALU operation and the $2^{nd}$ is using the result there should be a delay of $2$ clock cycles.

$$\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c||} \hline \text {Clock cycle} & \text{T1} & \text{T2}& \text{T3 }& \text{T4} & \text{T5} & \text{T6} & \text{T7}& \text{T8 }& \text{T9} & \text{T10} & \text{T11} & \text{T12}& \text{T13 }\\\hline & \text{I1} & \text{I2} & & \text{I3}& \text{I4 } & & \text{I5} & & & \text{I6} & & & \text{I7}\\\hline \end{array}$$

Load R1, Loc 1; Load R1 from memory location Loc1
Takes 1 clock cycle, simply loading R1 on loc1.
Load R2, Loc 2; Load R2 from memory location Loc2
Takes 1 clock cycle, simply loading r2 on loc2.
(Add R1, R2, R1; Add R1 and R2 and save result in R1
R1=R1+R2;
Hence, this instruction is using the result of R1 and R2, i.e. result of Instruction 1 and Instruction 2.
As instruction 1 is load operation and instruction 3 is ALU operation. So, there should be a delay of 1 clock cycle between instruction 1 and instruction 3, which is already there due to I2.
As instruction 2 is load operation and instruction 3 is ALU operation. So, there should be a delay of 1 clock cycle between instruction 2 and instruction 3.
Dec R2; Decrement R2
This instruction is dependent on instruction 2 and there should be a delay of one clock cycle between Instruction 2 and
Instruction 4. As instruction 2 is load and 4 is ALU, which is already there due to Instruction 3.
Dec R1 Decrement R1
This instruction is dependent on Instruction 3
As Instruction I3 is ALU and I5 is also ALU so a delay of 2 clock cycles should be there between them of which 1 clock cycle delay is already there due to I4 so one clock cycle delay between I4 and I5.
MPY R1, R2, R3; Multiply R1 and R2 and save result in R3
R3=R1*R2;
This instruction uses the result of Instruction 5, as both instruction 5 and 6 are ALU so there should be a delay of 2 clock cycles.
Store R3, Loc 3 Store R3 in memory location Loc3
This instruction is dependent on instruction 6 which is ALU and instruction 7 is store so there should be a delay of 2 clock cycles between them.

Hence, a total of 13 clock cycles will be there.

Correct Answer: $C$

Madhab answered Dec 31, 2016 • edited May 16, 2019 by Naveen Kumar 3

Madhab

See all

Show 22 previous comments

See all

Page:

instruction i4 is decrementing i.e arithmatic operation it is betn LOAD and ALU operation so latency is 1

you havnt included that ... — Pravin Paikrao, Oct 26, 2016
No.The explanation given above is correct and answer is 13 only.

See-->

I3 is ALU operation which uses result of LOAD in I2 , so latency is of 1 cycle (after I2)

I4 is ALU operation which uses result of LOAD in I2 , so latency is of 1 cycle (after I2) but I3 is executing at cycle 4. Therefore. I4 will execute at cycle no. 5.

I5 is ALU operation using result of ALU in I3 therefore has to wait for 2 cycles after I3

I6 is ALU and uses result of ALU in I5 ,therefore waits 2 cycles — GateAnis, Oct 27, 2016
I am having the same doubt ....we don't consider it because of parallel execution ?? Means both the operands are fetched parallely so the maximum is 2 cc ?? @Arjun Sir please clarify ... — Manisha Jaishwal, May 6, 2022
totally not cool. For such a good answer to be at bottom! — Vandit Shah, Jun 18, 2021

Sandeep_Uniyal · Answer 1 · 2015-01-20T06:42:11+0000

Answer is (C)

Here each instruction takes $1$ cycle but apart from that we have to consider latencies b/w
instruction: If there are two ALU operations by $I1$ and $I2$ such that $I2$ uses the value
produced by $I1$ in some register then $I2$ will be executed ONLY after waiting TWO more
cycles after $I1$ has executed because latency b/w two ALU operations is $2$

See here:

Clock 1 2 3 4 5 6 7 8 9 10 11 12 13 14

Inst. I1 I2 - I3 I4 - I5 - - I6 - - I7

$I3$ is ALU operation which uses the result of LOAD in $I2,$ so latency is of $1$ cycle.

$I5$ is ALU operation using result of ALU in I3, therefore, has to wait for $2$ cycles after $I3$

$I6$ is ALU and uses result of ALU in I5, therefore waits for $2$ cycles

I3 is ALU operation which uses result of load in I2 hence latency should be two cycles

I4 is also ALU opn which uses result of load in I2 hence latency 2

I5 is ALU - ALU hence 2

I6 latency 2

I7 latency 1

In this way answer comes to be 16.

Can you please clear my doubt sir? — GateAnis, Oct 21, 2016
instruction i4 is decrementing i.e arithmatic operation it is betn LOAD and ALU operation so latency is 1

you havnt included that . — Pravin Paikrao, Oct 26, 2016

Ravi Thakur · Answer 2 · 2015-01-08T03:40:16+0000

As per the given table and the assumption that
The instruction of Type R1 <- R1 + R2 requires
2 x (load - alu op) type
1 x (alu op - store) type

Then the answer should be 14

Analysis:
Instr | Cost
==== ====
1. 0
2. 0
3. 1+1+2 = 4
4. 1+2 = 3
5. 1+2 = 3
6. 1+1+2 = 4
7. 0

AND
if, R1 <- R1+R2
is only an (alu-store) type.. then intrs. 3 and 6 take 2 time units each
resulting in the answer as 10..

Unless, somebody presents a different interpretation..

I m not sure between (A) and (B).. If go with table.. it will come out to be 10 clocks, assuming 1 latency is 1 clock.

Next it is clearly given that "Assuming each instruction takes one cycle to execute".. here there are 7 instructions.. So 7 cycle.. isn't it?.. So which should we follow the table or the statement that given..? — Vicky Bajoria, Jan 9, 2015

tags	tag:apple
author	user:martin
title	title:apple
content	content:apple
exclude	-tag:apple
force match	+apple
views	views:100
score	score:10
answers	answers:2
is accepted	isaccepted:true
is closed	isclosed:true

GATE IT 2007 | Question: 41

Please log in or register to add a comment.

Please log in or register to answer this question.

5 Answers

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Related questions

3 3 Comments reply

Please log in or register to add a comment.

Please log in or register to answer this question.

5 Answers

25 25 Comments reply

Please log in or register to add a comment.

4 4 Comments reply

Please log in or register to add a comment.

2 2 Comments reply

Please log in or register to add a comment.

2 2 Comments reply

Please log in or register to add a comment.

Related questions

3 3 Comments

25 25 Comments

4 4 Comments

2 2 Comments

2 2 Comments