edited by
546 views
6 votes
6 votes

For the C code below, determine the number of data cache misses assuming an $8$ KB direct-mapped data cache with $64$-byte blocks, and a write-back cache that does write allocate. The elements of $a$ and $b$ are $4$ bytes. Let’s also assume they are not in the cache at the start of the program and starting addresses of $a$ and $b$ are at cache line boundaries.

int a[4][96], b[97][4];
....
for (i = 0; i < 4; i++)
    for (j = 0; j < 96; j++)
        a[i][j] = b[j][0] * b[j+1][0];
edited by

1 Answer

Best answer
6 votes
6 votes
Since, cache size is $8\;KB$ and block size is $16$ bytes, and being direct mapped, number of cache blocks $ = \dfrac{8\;KB}{64 \;B} = 128. $

Lets consider the first iteration. The accesses are $a[0][0], b[0][0]$ and $b[1][0].$

Since, a cache line is $64$ bytes and an element being $4$ bytes, $16$ elements can go to a cache block.

For $a,$ this means, when $a[0][0]$ is brought to cache, $a[0][0] - a[0][15]$ come to cache.

For $b,$ this means, when $b[0][0]$ is brought to cache, $b[0][0] - b[0][3], b[1][0] - b[1][3], b[2][0] - b[2][3]$ and $b[3][0] - b[3][3]$ come to cache.

So, in the first iteration we have $2$ cache misses for the $3$ memory accesses. For the next $2$ iterations, we do not have any cache misses.

For iteration $4,$ we have an access $b[4][0],$ which causes a cache miss. For simplicity of calculation, we can linearize this as $b[16].$

For iteration $5-7$, we do not have any cache miss, and the next cache miss is for iteration $8,$ when $b[32]$ is accessed.

i.e., for $a$ we have a cache miss every $16$ iterations, and for $b$ we have a cache miss every fourth iteration starting from iteration number $4.$

So, total misses for $a = \left\lceil \dfrac{96}{16} \right\rceil = 6$

Total misses for $b = \left\lceil \dfrac{96-3}{4} \right\rceil + 1= 25 $

Lets calculate the total size of $a.$ This will be $4\times 4 \times 96 = 1536$ bytes means $\left\lceil\dfrac{1536}{64} \right\rceil = 24$ cache lines which will be mapping to consecutive cache blocks. After this we have cache blocks of $b.$ Size of $b = 97 \times 4 \times 4 = 1552$ bytes meaning $\left\lceil\dfrac{1552}{64}\right \rceil = 25$ cache lines.

Since, we have $128$ cache lines, the $24+25 = 49$ cache lines for $a$ and $b$ never conflicts and we do not have any conflict miss or compulsory misses here. i.e., after the first iteration of the outer loop we do not have any misses for $b.$

For $a,$ we have the same number of cache misses for the $4$ outer loop iterations which will be $4\times 6 = 24$ cache misses.

Thus, total cache misses $ = 25 + 24 = 49.$

If we identify that we have only compulsory misses here, we can also calculate the total misses as follows also:

Total misses for $a = \left\lceil \dfrac{1536}{64}\right \rceil = 24$

Total misses for $b =\left \lceil \dfrac{1552}{64} \right\rceil = 25 $
selected by
Answer:

Related questions

6 votes
6 votes
1 answer
1
gatecse asked Aug 3, 2020
535 views
Consider a direct mapped cache of size $16$ KB and block size $64$ bytes and using LRU replacement. Initially the cache is empty. The following sequence of access to memo...
4 votes
4 votes
1 answer
3
10 votes
10 votes
1 answer
4
gatecse asked Aug 3, 2020
802 views
Consider a CPU with an average CPI of $1.4$ when all memory accesses hit on the cache.Assume an instruction mix$$\begin{array}{|c | c|}\hline\text{ALU }& 45\%\\\text{LOAD...