(1) Cache size $=32 KB=2^{15}$
(2) Virtual address size > 15 bits (let's denote it as {TagBits + 15 bits})
(3) Block size $=128 B=2^7$
(4) Array dimension: $512=2^9$ rows, each of $512=2^9$ elements
(5) Array element size $=8B=2^3$
(6) $\frac{(3)}{(5)}=\frac{128B}{8B}=2^4=16$ elements per block
(7) Byte offset within a block = 7 bits …from (3)
(8) Bits required to address a byte within the cache = 15 …from (1)
(9) Bits to index a cache line $=(8)-(7)=15-7=8$
(10) Assume A[0][0] is stored at memory location 0 = {TagBits + 15 zeroes}
Dropping the 7 least significant (offset) bits of the 15 zeroes, i.e. dividing by $2^7$, leaves the most significant 8 zeroes, which are used to index cache lines.
So A[0][0] will go in line 0.
(11) Bytes/row = (4) × (5) = $2^{12}$
(12) Address of last byte in first row, i.e. of A[0][511] = {00..00 + twelve 1’s}
(13) Address of first byte in 2nd row = next address of (12) = {00..00 + 1 + twelve 0’s}
(14) Address of first byte in 3rd row = (13) + $2^{12}$ = {00..00 + 10 + twelve 0’s}
(15) So the cache line at which a row starts is incremented by $\frac{2^{12}}{2^7}=2^5=32$ for each row … from (11) and (3)
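The address arithmetic in steps (10) to (15) can be sketched in Python. This is only an illustration under stated assumptions: a direct-mapped cache, row-major storage, and A[0][0] at address 0 as in step (10). The names `address` and `line_index` are hypothetical helpers, not part of the original problem.

```python
CACHE_BYTES = 32 * 1024   # (1)
BLOCK_BYTES = 128         # (3)
N = 512                   # (4) elements per row
ELEM_BYTES = 8            # (5)

# Assumption: direct-mapped cache, so the line count is cache size / block size.
NUM_LINES = CACHE_BYTES // BLOCK_BYTES   # 256 lines, i.e. 8 index bits


def address(i, j, base=0):
    """Byte address of A[i][j], assuming row-major layout starting at `base`."""
    return base + (i * N + j) * ELEM_BYTES


def line_index(addr):
    """Drop the 7 offset bits, then keep the low 8 bits of the block number."""
    return (addr // BLOCK_BYTES) % NUM_LINES


print(address(0, 511) + ELEM_BYTES - 1)  # 4095, last byte of row 0, see (12)
print(line_index(address(0, 0)))         # 0
print(line_index(address(1, 0)))         # 32
print(line_index(address(2, 0)))         # 64
```

Each new row starts 32 lines further on, which matches step (15).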
There are $\frac{2^{15}}{2^7}=2^8=256$ cache lines … from (1) and (3)
$\frac{2^8}{2^5}=2^3=8$
Every 8th row maps to the same cache lines. The cache line holding the block of A[0][0] is replaced by the block of A[8][0] long before A[0][1] is accessed, and the same holds for every other element. Thus no access to any array element results in a hit.
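The no-hit conclusion can be checked with a small simulation. The sketch below assumes a direct-mapped cache that starts empty, row-major storage with A[0][0] at address 0, and a column-by-column traversal (A[0][0], A[1][0], …, then A[0][1], …), which is the access order the argument above relies on. The function `count_hits` and its parameters are hypothetical names introduced for illustration.

```python
def count_hits(n=512, elem_bytes=8, block_bytes=128,
               cache_bytes=32 * 1024, column_major=True):
    """Count hits in a direct-mapped cache while touching every A[i][j] once."""
    num_lines = cache_bytes // block_bytes
    tags = [None] * num_lines          # tag currently stored in each line
    hits = 0
    for outer in range(n):
        for inner in range(n):
            # Column-major traversal varies the row index fastest.
            i, j = (inner, outer) if column_major else (outer, inner)
            block = ((i * n + j) * elem_bytes) // block_bytes
            line = block % num_lines
            tag = block // num_lines
            if tags[line] == tag:
                hits += 1
            else:
                tags[line] = tag       # miss: load the block, evicting the old one
    return hits


print(count_hits(column_major=True))   # 0
```

For contrast, `count_hits(column_major=False)` walks the same array row by row, where each 128-byte block serves 16 consecutive elements and only the first access to each block misses.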