# GATE2008-73

3.8k views

Consider a machine with a $2$-way set associative data cache of size $64$ $Kbytes$ and block size $16$ $bytes$. The cache is managed using $32$ $bit$ virtual addresses and the page size is $4$ $Kbytes$. A program to be run on this machine begins as follows:

double ARR[1024][1024];
int i, j;
/*Initialize array ARR to 0.0 */
for(i = 0; i < 1024; i++)
for(j = 0; j < 1024; j++)
ARR[i][j] = 0.0;

The size of double is 8 bytes. Array $ARR$ is located in memory starting at the beginning of virtual page $0xFF000$ and stored in row major order. The cache is initially empty and no pre-fetching is done. The only data memory references made by the program are those to array $ARR$.

The cache hit ratio for this initialization loop is:

1. $0$%
2. $25$%
3. $50$%
4. $75$%

edited

Block size $=16B$ and one element$=8B.$
So, in one block $2$ element will be stored.

For $1024\times 1024$ element num of block required $=\dfrac{1024\times 1024}{2} =2^{19}$ blocks required.

In one block the first element will be a miss and second one is hit(since we are transferring two unit at a time)

$\Rightarrow \text{hit ratio}=\dfrac{\text{Total hit}}{\text{Total reference}}$

$=\dfrac{2^{19}}{2^{20}}$

$=\dfrac{1}{2}=0.5$

$=0.5\times 100=50\%$

Correct Answer: $C$

edited
2
how to find the number of misses in this problem
1
(1-HIT RATIO) OR (TOTAL MISS/TOTAL REFERENCE)
0
i want miss rate??? is it (1024 x 1024)/2?? please explain
0
RATE MEANS PER UNIT TIME

IF U WANT MISS RATE THEN IT WILL BE NUMBER OF MISS PER UNIT ACCESS

THAT IS 1/2 HERE(IN ONE BLOCK )
0
ok.thanks so,total number of misses =(2^12)/2..=2^11...pls correct me if i am wrong,total number of misses=2048 pls verify.
0
JUST THINK

MISS RATE IS A TIME = AT WHAT RATE WE HAVE A MISS(THAT MEANS PER 100 INSTRUCTION,,1000 INSTRUCTION......)

= SIMPLY MISS RATIO*MEMORY ACCESS PER INSTRUCTION

AND HERE WE CAN'T CALCULATE MEMORY ACCESS PER INSTRUCTION(COZ TO ACCESS ONE BYTE THERE IS NO TIME BOUND)
0
Ok thanks buddy...
0
for each block reference we have 1 hit and 1 miss
total no of blocks required = 2^19
no of miss = no of hit = 2^19
0
yes
0
It is 2 way associative, why we are ignoring it on question?
0
we need 2^19 blocks for array but we have 2^12 blocks only...right ?? so how 2^20 comes and also please elaborate your answer ..?
1
we are not ignoring it in the question, but it doesn't matter. Since one block is 16 bytes and each array element takes 8 bytes, and we are fetching the elements row wise and default storage scheme is also row wise, we will get one hit out of 2 elements accessed. Hence 50% hit.
0

@  sushmita Boss

i understand 50% by thinking only but in answer given above how it is done that i want to know..?

0
Are we transfering 2 units at a time because its 2 way associative ?
0
doesnt the 2 way assosciativity play any role here?
0
@arjun sir

This explanation is awesome and I'm clear how it worked for this question.

However I have a general doubt regarding virtually addressed caches. When CPU generates the virtual address 0XFF000 000, cache has to check whether the MM block corresponding to this VA is present in cache or not. So to performs this check, is it the same way as it's done for physically addressed cache, by breaking the address into tag bit, block offset and word offset then comparing tag bits for corresponding block? Is this process same for all Pipt cache, Vivt cache, Vipt cache?

Help would be really appreciated. Thanks!

(P-physically, v- virtually, i- indexed, t- tagged)
0

I think, answer should be 75%.

BS = 16 bytes and Array element size is 8 bytes $\Rightarrow$ each block can hold 2 Array elements.

Given, the Array is stored in Row Major Order. Now, when the loop starts, we ask for ARR[0][0]. Since, it is not present in Cache, we move the required Block from Main Memory. Along with ARR[0][0], we move ARR[0][1] as well. There will be a miss for ARR[0][0] read operation and a hit for ARR[0][0] write operation. For the second element i.e. ARR[0][1], we have a Hit for both read and write operation. Therefore, we can say that for a pair, we have 3 hits out of 4 accesses.

We have such = $(1024\ast 1024)\div 2$ pairs (where each pair has 3 Hits) and a total of (1024 $\ast$ 1024 $\ast$ 2) accesses (1 for read operation and 1 for write operation).

Hit ratio = No. Hits / No. of accesses $\Rightarrow$ $(524288 \ast 3) \div (1024\ast 1024\ast 2)$ $\Rightarrow$75%.

2

There will be a miss for ARR[0][0] read operation and a hit for ARR[0][1] write operation. For the second element i.e. ARR[0][1], we have a Hit for both read and write operation.

there is only memory access operation here. where did u get  read and write operations?

0

@King Suleiman

Sorry, my bad. There is only Write operation.

0

@ayushsomani I think there is only read here? how can you write without read?

Given cache block size = 16B, 1 element size in array = 8B (size of Double =8B). Therefore 1Block contain 2 elements

Total elements = 1024 x1024 =2^20
each time 1 block ( 2 elements) stored in to  cache, so 1st element is compulsory miss and when we access  2nd element           that will be hit because it is already present in cache.

Therefore each time 1 miss 1 hit occur, For complete 2^20 elements 2^19 misses and 2^19 Hits will occur
Hit ratio = 2^19/2^20 = 0.5

= 0.5 x 100 = 50%

0
Two elements are contained in one line as each block is of 16 bytes and each element is of two byte . There will be 2048 sets and in one set there will be two lines and each line is of 16 bytes. One row will be accommodated in 512 lines and one row consists of 1024 elements.  For 1024 elements half will be miss and half will be hit. Therefore,

Hit rate = 512/1024=(1/2)*100=50%

CM Size$=64KB$

Block Size$=16B$

No. of lines$=2^{12}$

No. of sets$=2^{11}$

Element size$=8B$

No. of elements/Block$=2$

No. of elements in the Row$=1024$

No. of Blocks/Row$=\frac{1024}{2}=512$

 $0$ $a[0][0], a[0][1]$ $a[4][0], a[4][1]$ $1$ $a[0][2], a[0][3]$ $a[4][2], a[4][3]$ . . $511$ $a[0][1022], a[0][1023]$ $512$ $a[1][0], a[1][1]$ $5^{th} row$ . . . $1023$ $a[1][1022], a[1][1023]$ . . . . $2047$ $a[3][1022], a[3][1023]$ .

$a[0][0]=Miss$

$a[0][1]=Hit$

$a[0][2]=Miss$

$a[0][3]=Hit$

Miss follows Hit

The cache hit ratio for this initialization loop is:

$=50\%$

0

No. of elements in the Row=1024@Kushagra गुप्ता how?

0

@manisha11

Look at the array size : $ARR[1024][1024]=ARR[row][column]$

There are 1024 rows and each row consist of 1024 elements because there are 1024 columns.

0

$2^{16}$ -- cache size

$2^{3}$  -- element size (of array)

so at one time the cache will have $2^{13}$ 8Byte elements
0-1023 $2^{10}$ elements 3 times
so the cache will fill up from

0,1023 to 2,1023
and then begin from 4,0
is that correct interpretation? I haven't used set/block here because I thnk we can directly see the arragment here

With each cash block we will bring 2 bytes of data, So first we will have Miss then in next reference we will have it, this follow…

So Hit ratio will be (Hit/ Hit+Miss )*100 = 50%

It brings 16 bytes(1 block = 16 bytes) at once, which is equal to 2 array elements(each element 8 bytes). Thus every alternate access will be a hit. 50%

## Related questions

1
7.1k views
Consider a machine with a $2$-way set associative data cache of size $64$ Kbytes and block size $16$ bytes. The cache is managed using $32$ bit virtual addresses and the page size is $4$ Kbytes. A program to be run on this machine begins as follows: double ARR[1024][1024]; int i, j; /*Initialize ... elements have the same cache index as $ARR[0][0]$? $ARR[0][4]$ $ARR[4][0]$ $ARR[0][5]$ $ARR[5][0]$
Consider a machine with a $2$-way set associative data cache of size $64$ $Kbytes$ and block size $16$ $bytes$. The cache is managed using $32$ $bi$t virtual addresses and the page size is $4$ $Kbytes$. A program to be run on this machine begins as follows: double ARR[1024][ ... array $ARR$. The total size of the tags in the cache directory is: $32$ $Kbits$ $34$ $Kbits$ $64$ $Kbits$ $68$ $Kbits$
For inclusion to hold between two cache levels $L_1$ and $L_2$ in a multi-level cache hierarchy, which of the following are necessary? $L_1$ must be write-through cache $L_2$ must be a write-through cache The associativity of $L_2$ must be greater than that of $L_1$ The $L_2$ cache must be at least as large as the $L_1$ cache IV only I and IV only I, II and IV only I, II, III and IV
Delayed branching can help in the handling of control hazards The following code is to run on a pipelined processor with one branch delay slot: I1: ADD $R2 \leftarrow R7 + R8$ I2: Sub $R4 \leftarrow R5 &ndash; R6$ I3: ADD $R1 \leftarrow R2 + R3$ I4: STORE ... Which of the instructions I1, I2, I3 or I4 can legitimately occupy the delay slot without any program modification? I1 I2 I3 I4