edited by
2,271 views
0 votes
0 votes
The baseline execution time of a program on a $2 \mathrm{GHz}$ single core machine is $100$ nanoseconds ( $n s)$. The code corresponding to $90 \%$ of the execution time can be fully parallelized. The overhead for using an additional core is $10 ~ns$ when running on a multicore system. Assume that all cores in the multicore system run their share of the parallelized code for an equal amount of time.

The number of cores that minimize the execution time of the program is __________.
edited by

3 Answers

2 votes
2 votes
The code that cannot be parallelized runs for 10ns.

For every additional core, additional 10ns overhead is added to total execution time, overhead for n cores is $(n-1)10$ns.

The code that can be parallelized runs for 90ns on one core, it runs for $\frac{90}{n}$ns on n cores.

Let t(n) be the execution time, it is given by -

$t(n) = 10 + (n-1)10 + \frac{90}{n} = 10n + \frac{90}{n}$

$t(1) = 100, t(2) = 65, t(3)  = 60, t(4) = 62.5, t(5) = 68, ...$

Answer - 3
1 votes
1 votes

2 GHz Machine implies Cycle-Time of 0.5 ns or you can say for every 1 ns, there are 2 cycles

Now, Given Exec Time of 100 ns implies 200 cycles out of which 90% i.e. 180 cycles can be parallelized

Task is to distribute 180 cycles equally in 'x' number of cores (each 2 GHz) such that exec time gets minimized

Now 

Take x=2, which means 20ns(Overhead) + 90 cycles(each core i.e. 180 cycles/2) i.e. 45ns totaling to 65ns

Take x=3 which means 30ns(Overhead) + 60 cycles(each core i.e. 180 cycles/3) i.e. 30ns totaling to 60ns

Take x=4 which means 40ns(Overhead) + 45 cycles(each core i.e. 180 cycles/4) i.e. 22.5 ns totaling to 62.5ns

Take x=5 which means 50 ns(Overhead) + 36 cycles(each core i.e. 180 cycles/5) i.e. 18 ns totaling to 65 ns

Clearly, minima can be observed at x=3 i.e. 60 ns

0 votes
0 votes
Baseline execution time is 100 ns.

Program that can be parallelized is 90% i.e. 90 ns program.

Overhead cost of additional core is 10 ns.

 

Execution time (for n core) =    $\frac{Parallelized\,time}{n}$ + Fixed time + (overhead time $*$ (n - 1))

t(n) = $\frac{90}{n}$ + 10 + 10 $*$ (n-1)

t(n) = $\frac{90}{n}$ + 10n

 

To minimise t(n), derivative of t(n) = 0

\begin{align*}
&\Rightarrow -\frac{90}{n^2} + 10 = 0 \\
&\Rightarrow 10 = \frac{90}{n^2} \\
&\Rightarrow n^2 = 9 \\
&\Rightarrow n = \sqrt{9} \\
&\Rightarrow n = 3
\end{align*}

 

Hence, Answer = 3
Answer:

Related questions

3.3k
views
2 answers
2 votes
Arjun asked Feb 16
3,323 views
A given program has $25 \%$ load/store instructions. Suppose the ideal $\text{CPI}$ (cycles per instruction) without any memory stalls is $2$. The program ... a perfect cache (i.e., with NO data or instruction cache misses) is __________.
18.6k
views
7 answers
50 votes
go_editor asked Sep 28, 2014
18,589 views
Consider two processors $P_1$ and $P_2$ executing the same instruction set. Assume that under identical conditions, for the same input, a program running on $P_2$ takes ... $P_2$ (in GHz) is ______.
2.7k
views
3 answers
2 votes
Arjun asked Feb 16
2,711 views
Which one of the following statements is FALSE?In the cycle stealing mode of DMA, one word of data is transferred between an I/ ... executing an interrupt service routine faster with vectored interrupts than with non-vectored interrupts
4.8k
views
2 answers
5 votes
Arjun asked Feb 16
4,808 views
Consider a $5$-stage pipelined processor with Instruction Fetch (IF), Instruction Decode (ID), Execute (EX), Memory Access (MEM), and Register Writeback ... does not require any extra hardware to retrieve the data from the pipeline stages