39 votes

Consider a machine with a $2$-way set associative data cache of size $64$ Kbytes and block size $16$ bytes. The cache is managed using $32$ bit virtual addresses and the page size is $4$ Kbytes. A program to be run on this machine begins as follows:

double ARR[1024][1024]; int i, j; /*Initialize array ARR to 0.0 */ for(i = 0; i < 1024; i++) for(j = 0; j < 1024; j++) ARR[i][j] = 0.0;

The size of double is $8$ bytes. Array $ARR$ is located in memory starting at the beginning of virtual page $0xFF000$ and stored in row major order. The cache is initially empty and no pre-fetching is done. The only data memory references made by the program are those to array $ARR$.

Which of the following array elements have the same cache index as $ARR[0][0]$?

- $ARR[0][4]$
- $ARR[4][0]$
- $ARR[0][5]$
- $ARR[5][0]$

8

Notice in this question 0xFF000 is virtual page address that is why it is of size 20 bits. page offset size is 12. So total is 32 bits.

Thanks @just_bhavana ji for correcting me.

0

It will help you to understand Address calculation for array.

http://www.guideforschool.com/625348-memory-address-calculation-in-an-array/

3

$A[0][0]$

$0xFF000\ 000=\underbrace{1111\ 1111\ 0000\ 0000\ 0}|\underbrace{000\ 0000\ 0000}|\ 0000$

$a)\ A[0][4]\times$

$0xFF000\ 000(Base\ address)+(0\times 1024+4)\times 8=0xFF000\ 000+0x20(32_{10}=20_{16})$

$0xFF000\ 020=\underbrace{1111\ 1111\ 0000\ 0000\ 0}|\underbrace{000\ 0000\ 0010}|\ 0000$

$b)\ A[4][0]\checkmark$

$0xFF000\ 000(Base\ address)+(4\times 1024+0)\times 8=0xFF000\ 000+0x8000(32768_{10}=8000_{16})$

$0xFF008\ 000=\underbrace{1111\ 1111\ 0000\ 0000\ 1}|\underbrace{000\ 0000\ 0000}|\ 0000$

$c)\ A[0][5]\times$

$0xFF000\ 000(Base\ address)+(0\times 1024+5)\times 8=0xFF000\ 000+0x28(40_{10}=28_{16})$

$0xFF000\ 028=\underbrace{1111\ 1111\ 0000\ 0000\ 0}|\underbrace{000\ 0000\ 0010}|\ 1000$

$d)\ A[5][0]\times$

$0xFF000\ 000(Base\ address)+(5\times 1024+0)\times 8=0xFF000\ 000+0xA000(40960_{10}=A000_{16})$

$0xFF00A\ 000=\underbrace{1111\ 1111\ 0000\ 0000\ 1}|\underbrace{010\ 0000\ 0000}|\ 0000$

$0xFF000\ 000=\underbrace{1111\ 1111\ 0000\ 0000\ 0}|\underbrace{000\ 0000\ 0000}|\ 0000$

$a)\ A[0][4]\times$

$0xFF000\ 000(Base\ address)+(0\times 1024+4)\times 8=0xFF000\ 000+0x20(32_{10}=20_{16})$

$0xFF000\ 020=\underbrace{1111\ 1111\ 0000\ 0000\ 0}|\underbrace{000\ 0000\ 0010}|\ 0000$

$b)\ A[4][0]\checkmark$

$0xFF000\ 000(Base\ address)+(4\times 1024+0)\times 8=0xFF000\ 000+0x8000(32768_{10}=8000_{16})$

$0xFF008\ 000=\underbrace{1111\ 1111\ 0000\ 0000\ 1}|\underbrace{000\ 0000\ 0000}|\ 0000$

$c)\ A[0][5]\times$

$0xFF000\ 000(Base\ address)+(0\times 1024+5)\times 8=0xFF000\ 000+0x28(40_{10}=28_{16})$

$0xFF000\ 028=\underbrace{1111\ 1111\ 0000\ 0000\ 0}|\underbrace{000\ 0000\ 0010}|\ 1000$

$d)\ A[5][0]\times$

$0xFF000\ 000(Base\ address)+(5\times 1024+0)\times 8=0xFF000\ 000+0xA000(40960_{10}=A000_{16})$

$0xFF00A\ 000=\underbrace{1111\ 1111\ 0000\ 0000\ 1}|\underbrace{010\ 0000\ 0000}|\ 0000$

45 votes

Best answer

Number of sets $=$ cache size/ size of a set

$= 64 \ \text{KB} / (16 \ \text{B} \times 2)$ (two blocks per set)

$= 2 \ K = 2^{11}$

So, we need 11 bits for set indexing.

Number of WORD bits required $= 4$ as a cache block consists of $16$ bytes and we need $4$ bits to address each of them.

So, number of tag bits $= 32 - 11 - 4 = 17$

Total size of the tag $= 17 \times$ Number of cache blocks

$=17 \times 2^{11} \times 2$ (since each set has $2$ blocks)

$= 68 \ KB$

We use the top $17$ bits for tag and the next $11$ bits for indexing and next $4$ for offset. So, for two addresses to have the same cache index, their $11$ address bits after the $4$ offset bits from right must be same.

$ARR[0][0]$ is located at virtual address $0x \ FF000 \ 000$. ($FF000$ is page address and $000$ is page offset). So, index bits are $00000000000$

Address of $ARR[0][4] = 0xFF000 + 4 \times$ sizeof (double) $= 0xFF000 000 + 4\times 8 = 0xFF000 \ 020 \ (32 = 20$ in hex) (index bits differ)

Address of $ARR[4][0] = 0xFF000 + 4 \times 1024 \times$ sizeof (double) [since we use row major storage] $= 0xFF000 \ 000 + 4096\times 8 = 0xFF000 \ 000 + 0x8000 = 0xFF008 \ 000$ ( index bits matches that of $ARR \ [0][0]$ as both read $000 \ 0000 \ 0000$)

Address of $ARR[0][5] = 0xFF000 + 5 \times$ sizeof (double) $= 0xFF000 \ 000+ 5\times 8 = 0xFF000 \ 028 (40 = 28$ in hex) (index bits differ)

Address of $ARR[5][0] = 0xFF000 + 5 \times 1024 \times$ sizeof (double) [since we use row major storage] $ = 0xFF000 \ 000 + 5120\times 8 = 0xFF000 \ 000 + 0xA000 = 0xFF00A \ 000$ (index bits differ)

So, only $ARR[4][0]$ and $ARR[0][0]$ have the same cache index.

The inner loop is iterating from $0$ to $1023$, so consecutive memory locations are accessed in sequence. Since cache block size is only $16$ bytes and our element being double is of size $8$ bytes, during a memory access only the next element gets filled in the cache. i.e.; every alternative memory access is a cache miss giving a hit ratio of $50\%$. (If loops i and j are reversed, all accesses will be misses and hit ratio will become $0$).

Correct Answer: $B$

$= 64 \ \text{KB} / (16 \ \text{B} \times 2)$ (two blocks per set)

$= 2 \ K = 2^{11}$

So, we need 11 bits for set indexing.

Number of WORD bits required $= 4$ as a cache block consists of $16$ bytes and we need $4$ bits to address each of them.

So, number of tag bits $= 32 - 11 - 4 = 17$

Total size of the tag $= 17 \times$ Number of cache blocks

$=17 \times 2^{11} \times 2$ (since each set has $2$ blocks)

$= 68 \ KB$

We use the top $17$ bits for tag and the next $11$ bits for indexing and next $4$ for offset. So, for two addresses to have the same cache index, their $11$ address bits after the $4$ offset bits from right must be same.

$ARR[0][0]$ is located at virtual address $0x \ FF000 \ 000$. ($FF000$ is page address and $000$ is page offset). So, index bits are $00000000000$

Address of $ARR[0][4] = 0xFF000 + 4 \times$ sizeof (double) $= 0xFF000 000 + 4\times 8 = 0xFF000 \ 020 \ (32 = 20$ in hex) (index bits differ)

Address of $ARR[4][0] = 0xFF000 + 4 \times 1024 \times$ sizeof (double) [since we use row major storage] $= 0xFF000 \ 000 + 4096\times 8 = 0xFF000 \ 000 + 0x8000 = 0xFF008 \ 000$ ( index bits matches that of $ARR \ [0][0]$ as both read $000 \ 0000 \ 0000$)

Address of $ARR[0][5] = 0xFF000 + 5 \times$ sizeof (double) $= 0xFF000 \ 000+ 5\times 8 = 0xFF000 \ 028 (40 = 28$ in hex) (index bits differ)

Address of $ARR[5][0] = 0xFF000 + 5 \times 1024 \times$ sizeof (double) [since we use row major storage] $ = 0xFF000 \ 000 + 5120\times 8 = 0xFF000 \ 000 + 0xA000 = 0xFF00A \ 000$ (index bits differ)

So, only $ARR[4][0]$ and $ARR[0][0]$ have the same cache index.

The inner loop is iterating from $0$ to $1023$, so consecutive memory locations are accessed in sequence. Since cache block size is only $16$ bytes and our element being double is of size $8$ bytes, during a memory access only the next element gets filled in the cache. i.e.; every alternative memory access is a cache miss giving a hit ratio of $50\%$. (If loops i and j are reversed, all accesses will be misses and hit ratio will become $0$).

Correct Answer: $B$

2

Arjun sir, so if my understanding is correct will [5][0] occupy 512 th set?[0][5],[0][4] will occupy 3rd set and sir i did not understand the 0xf000 this part,how did you break into 0xf000 into 000 and ff000 that part? :(

2

@Arjun Sir can you explain this part?

ARR[0][0] is located at virtual address 0x FF000 000. (FF000 is page address and 000 is page offset). So, index bits are 00000000000

I dont understand how are we adding the 3 zeros in the end. Also given that page size is 4KB, where are we using this information ? please help.

1

@Arjun sir

Sir , can you please explain this part again ,I still am not able to visualise it .

RR[0][0] is located at virtual address 0x FF000 000. (FF000 is page address and 000 is page offset). "

Thanx

5

It is given page size is 4KB and hence assuming byte addressing, 12 bits are required for page offset. That is why I took the lowermost 12 bits (000) for page offset and higher 20 bits for page address.

29

perfect explanation!

Summary:

Cache Address = |17 tag bits | 11 index bits | 4 word offset|

Address of arr[0][0] = 1111 1111 0000 0000 0000 0000 0000 0000

Cache controller interpretation

Address of arr[0][0]= |1111 1111 0000 0000 0 | **000 0000 0000** | 0000

Address of arr[0][4] = |1111 1111 0000 0000 0 | **000 0000 0010** | 0000

Address of arr[4][0] = |1111 1111 0000 0000 1 | **000 0000 0000** | 0000

Address of arr[5][0] = |1111 1111 0000 0000 0 | **101 0000 0000** | 0000

As it can be seen that 11 index bits are only same for the array elements arr[0][0] and arr[4][0].

0

from where you get that much 1's 1111 1111 1111 1111 0 | **000 0000 0000** | 0000

i think it will be

Address of arr[0][0] = 1111 1111 0000 0000 0 | **000 0000 0000** | 0000

0

@anu may be, i was too exhausted when i commented this, there are 2 FF, hence there will 1111 1111

as you can see it in the above part of the comment

"Address of arr[0][0] = 1111 1111 0000 0000 0000 0000 0000 0000 "

anyway, i will correct it.

as you can see it in the above part of the comment

"Address of arr[0][0] = 1111 1111 0000 0000 0000 0000 0000 0000 "

anyway, i will correct it.

0

@Manu thakur

can u tell me

why we take arr[0][0] and then arr[0][4] and then arr[4][0]

I mean why it not be arr[0][0], then arr[0][1], then arr[0][2]...........like that?

I mean why contiguous access not taken?

45 votes

[ Need @arjun sir to verify this solution ]

Every element = 8B

Total Size Required By the Array = 1024 *1024 * 8 = 2 ^{23} B

Every Block = 16 B

Number of blocks for Array =$\frac{2 ^{23}}{2^{4} }$ = 2 ^{19}

Elements in one block = $\frac{16}{8}$= 2

Number of blocks in cache =$\frac{2 ^{16}}{2^{4} }$ = 2 ^{12}

Number of Sets = $\frac{2 ^{12}}{2^{1} }$= 2^{11}

$\Rightarrow$ 2 elements in one block

MM block 0 $\varepsilon$ 0 mod 2^{11} = set 0

MM block 1 $\varepsilon$ 1 mod 2^{11} = set 1

..

..

MM block 2048 $\varepsilon$ 2048 mod 2^{11} = set 0 // **sharing same set **

Each Row of array has 1024 elements $\Rightarrow$ 512 blocks required

Row 0 : 0-511 blocks

Row1 : 512-1023

..

..

..

Row 'x' : 2048-2559

We need to find 'x'

x= $\frac{2048}{512}$= 4

$\Rightarrow$ A[4][0] shares same set that of A[0][0]

0

@akhilnadhpc

The cache is managed using 32 bit virtual addresses and the page size is 4 Kbytes.

What is the Significance of above statement here??
0

Does sharing same cache index means, belonging to the same set or the same block in which A[0][0] is present ?

0

plz tell me

each element of array contain , each element of block or each element of set?

Because if array contain each element of block, then it will contain 1 element for each element of array. means a[0][1]= contain 1 element

But if it contain each element of set then it will contain 2 elements for each element of array, means a[0][1]=contain 2 elements etc.

each element of array contain , each element of block or each element of set?

Because if array contain each element of block, then it will contain 1 element for each element of array. means a[0][1]= contain 1 element

But if it contain each element of set then it will contain 2 elements for each element of array, means a[0][1]=contain 2 elements etc.

1

@Arjun sir,@pc.please following doubts

1. Cache index means only same set number or same same and same word offset in that?Can we say a[4][1] will also have same cache index as a[0][0] or not?

2.0xFF000 is the starting location of the array.So we should decode this to get the set number and start from that start instead of starting from 0.Although answer will be same,but if they ask that which set number is that then we can say 0.Please correct me if i am getting this wong

1. Cache index means only same set number or same same and same word offset in that?Can we say a[4][1] will also have same cache index as a[0][0] or not?

2.0xFF000 is the starting location of the array.So we should decode this to get the set number and start from that start instead of starting from 0.Although answer will be same,but if they ask that which set number is that then we can say 0.Please correct me if i am getting this wong

0

Sir plz i have a doubt 0xFF000 since its a set associative so 11bits represent the starting set for the first block bt sir u mapped first block at 0 and so on...

1

PC calculated the next element which will get mapped to A[0][0] by finding cache size and all. He found out that 4 is the factor. A[0][0], A[4][0], A[8][0]......... will mapped to set 0.

But if the question is like this - *Find the next element's address which have the same cache index as ARR[0][0]*. Then we need to use Arjun Sir's approach.

5 votes

virtual address 32 bit

block size 16 B

Each block contain 2 elements

Data cache size 64 KB

Number of blocks=2^{12}

but each block contain 2 elements.

So, no. of blocks will be 2^{11}

So, for the machine total size of the blocks $2^{11}\times 2^{4}=2^{15}$

Now , each element size $2^{3}$ B

So, no. of elements could be present in total block=$2^{12}$

So, 1st time array contain 0 to 1024 elements.

2nd time array can contain 1 to 1024 elements

3rd time array can contain 2 to 1024 elements

4th time array can contain 3 to 1024 elements

So, now all blocks are full.

No, next time block access it will replace 1st element by Arr[4][0]

1 vote

0 votes

#sets=2^11

block size=16 bytes

element size=8 byte

#elements in one block=2

each set contain=2 blocks **set 0:: A[0][0] | A[0][1]**

#elements in one set=4 elements __A[0][2] | A[0][3]__

total no.of elements covered in 2^11 sets = 2^11*4=2^13 ELEMENTS **set 1:: A[0][4] | A[0][5]**

A[0][0]..........................A[0][1023]=2^10 ELEMENTS __ A[0][6] | A[0][7]__

A[1][0]..........................A[1][1023]=2^10 ELEMENTS** **

A[2][0]..........................A[2][1023]=2^10 ELEMENTS

A[3][0]..........................A[3][1023]=2^10 ELEMENTS

TOTAL ELEMENTS COVERED FROM A[0][0] ............................A[3][1023]=2^10 + 2^10 + 2^10 +2^10 = 2^12 elements ..

TOTAL sets COVERED FROM A[0][0] ............................A[3][1023] = (total elements covered)/(no.of elements in one set)= (2^12)/4= 2^10 sets covered...

NOW we have 2^11 sets and only 2^10 covered how a[4][0] is the answer ..............................

MY QUESTION IS --------------

HELP ME @Arjun SIR