4.1k views

Consider a machine with a $2$-way set associative data cache of size $64$ $Kbytes$ and block size $16$ $bytes$. The cache is managed using $32$ $bi$t virtual addresses and the page size is $4$ $Kbytes$. A program to be run on this machine begins as follows:

double ARR[1024][1024];
int i, j;
/*Initialize array ARR to 0.0 */
for(i = 0; i < 1024; i++)
for(j = 0; j < 1024; j++)
ARR[i][j] = 0.0;

The size of double is $8$ bytes. Array $ARR$ is located in memory starting at the beginning of virtual page $0xFF000$ and stored in row major order. The cache is initially empty and no pre-fetching is done. The only data memory references made by the program are those to array $ARR$.

The total size of the tags in the cache directory is:

1. $32$ $Kbits$
2. $34$ $Kbits$
3. $64$ $Kbits$
4. $68$ $Kbits$
edited | 4.1k views
0
Sir why is the cache block size and page size different
+5
:O Why should they be same?
0

so this is a virtually indexed virtually tagged cache @Arjun sir

Number of sets $=\dfrac{\text{cache size}}{\text{size of a set}}$

$=\dfrac{64\ KB}{(16\ B\times 2)}$ (two blocks per set)

$=2\ K=2^{11}$

So, we need $11\text{-bits}$ for set indexing.

Number of WORD bits required $=4$ as a cache block consists of $16\text{ bytes}$ and we need $4\text{-bits}$ to address each of them.

So, number of tag bits $=32-11-4=17$

Total size of the tag$= 17\times \text{Number of cache blocks}$

$= 17\times 2^{11}\times 2$ (since each set has $2$ blocks)

$=68\text{ Kbits}$

Answer is option D) 68 Kbits

We use the top $17\text{-bits}$ for tag and the next $11\text{-bits}$ for indexing and next $4$ for offset. So, for two addresses to have the same cache index, their $11$ address bits after the $4$ offset bits from right must be same.

$ARR[0][0]$ is located at virtual address $\text{0x FF000 000. (FF000 is page address and 000 is page offset).}$
So, index bits are $00000000000$

Address of $ARR[0][4]=\text{0xFF000}+4\times \text{sizeof (double)}$
$=\text{0xFF000 000}+4\times 8=\text{0xFF000 020 (32 = 20 in hex)}$ (index bits differ)

Address of $ARR[4][0] =\text{0xFF000}+4\times 1024\times \text{sizeof (double)}$
[since we use row major storage]
$=\text{0xFF000 000}+4096\times 8=\text{0xFF000 000 + 0x8000 = 0xFF008 000}$
( index bits matches that of $ARR [0][0]$ as both read $\text{000 0000 0000}$ )

Address of $ARR[0][5] =\text{0xFF000} + 5\times \text{sizeof (double) = 0xFF000 000}$$+ 5\times 8 =\text{0xFF000 028 (40 = 28 in hex)}$ (index bits differ)

Address of $ARR[5][0] =\text{0xFF000} + 5\times 1024\times \text{ sizeof (double)}$ [since we use row major storage]
$=\text{0xFF000 000}+5120\times 8=\text{0xFF000 000 + 0xA000 = 0xFF00A 000}$ (index bits differ)

So, only $ARR[4][0]$ and $ARR[0][0]$ have the same cache index.

The inner loop is iterating from $0$ to $1023,$ so consecutive memory locations are accessed in sequence. Since cache block size is only $16\text{ bytes}$ and our element being double is of size $8\text{ bytes},$ during a memory access only the next element gets filled in the cache. i.e.; every alternative memory access is a cache miss giving a hit ratio of $50%.$ (If loops $i$ and $j$ are reversed, all accesses will be misses and hit ratio will become $0$ ).

edited
+1

@Bikram sir, for the part B) please see my method and point out where I am doing wrong.

#set=211=2048

#elements/cache block=16/8=2

#elements/set=4

#array elements which can be in cache simultaneously without conflicting the cache block:

total set*#elements per set

2048*4=8192=8 rows

so after 8 rows cache will be full and second wrap up will start, so A[8][0] will conflict to cache index with  A[0][0].

But as this is not the answer,soemthing is wrong in this..

0
@reena_kandari , of course you are correct, a[8][0] will index into set 0. Even if you calculate virtual address of this element you will get it as 0xFF000 000 + 65536 and you will see that the index no or set no (that is 11 bits after the rightmost 4 bits) are same that is 0. Here bit with index 16 will be 1.
+1
but it means between A[0][0] and A[8][0] there must not be any conflicting index right? but answer is A[4][0]...how is it possible?
0
I am not sure if I understood your question properly. At index 0 or set 0 we have 4 elements a[0][0] , a[0][1], a[4][0],a[4][1] after first round . What conflict are you taking about ?
+1
see, element A[0][0] will be store at set 0 and  after that all elements will be stored  at set1,2,3..until cache is full and  A[7][1023] will be stored at set number 2047.between these elements  no two elements will have the same index number.then how the element A[4][0] will have the same index as A[0][0].(according to my method)

I hope it is clear now.
+1

we use row major storage ( means we store in row major order ), see  Arjun's answer again ..

ARR[0][0] is located at virtual address 0x FF000 000. (FF000 is page address and 000 is page offset). So, index bits are 000 0000 0000

Address of ARR[4][0] = 0xFF000 + 4 * 1024 * sizeof (double) [since we use row major storage]

= 0xFF000 000 + 4096*8

= 0xFF000 000 + 0x8000

= 0x FF008 000 ( index bits matches that of ARR [0][0] as both read 000 0000 0000)

ARR[0][0] is located at virtual address 0x FF000 000. (FF000 (17 bits) is cache block tag, next 11 bits are index bits and 4 least bits are offset bits)

for two addresses  have the same cache index, means their 11 address bits after the 4 offset bits from right must be same.

0

yes sir, I re-read arjun sir' answer and it is correct.If we calculate address of each element and then check for those 11 bits then answer is A[4][0] but sir my main doubt is the procedure I commented above why it fails although I feels all things are correct but conclusion is not true

#set=211=2048

#elements/cache block=16/8=2

#elements/set=4

#array elements which can be in cache simultaneously without conflicting the cache index  :

total set * #elements per set

2048*4=8192=8 rows

so after 8 rows cache will be full and second wrap up will start, so A[8][0] will conflict to cache index with  A[0][0].

0
@reena_kandari

what exactly is yours doubt ??
+1
+1
thank you sir now I got it,  I was taking individual block number as index..thank you so much :)

1
2