
A CPU has a cache with block size 64 bytes. The main memory has $k$ banks, each bank being $c$ bytes wide. Consecutive $c$-byte chunks are mapped on consecutive banks with wrap-around. All the $k$ banks can be accessed in parallel, but two accesses to the same bank must be serialized. A cache block access may involve multiple iterations of parallel bank accesses depending on the amount of data obtained by accessing all the $k$ banks in parallel. Each iteration requires decoding the bank numbers to be accessed in parallel, and this takes $\frac{k}{2}$ ns. The latency of one bank access is 80 ns. If $c = 2$ and $k = 24$, the latency of retrieving a cache block starting at address zero from main memory is:

1. 92 ns
2. 104 ns
3. 172 ns
4. 184 ns

retagged | 5.4k views
+4

The banks can be thought of as parallel RAM chips that are accessed in parallel and collectively called the main memory. In this type of system, more than one RAM chip is used to constitute main memory. The width of each bank = width of main memory / number of banks.

In one access, c*k bytes of data will be available.

0

"The latency of one bank access is 80 ns. "

The latency given is for one bank, not for all k banks, and to get 48 bytes in one iteration we have to span 24 banks.

Is 80 ns still counted only once per iteration?

+2
It is because of parallel access to the banks. We can access all the banks in an iteration at the same time, hence the time needed is only 80 ns, which is the time to access a single bank. Further, since we need to access banks 0-7 twice, those accesses must be serialized, hence the second access is done in the second iteration.
+2
Is this question not present in the GO 2019 book? I was not able to find it.

Cache block = 64 bytes.

Each bank in memory is 2 bytes wide and there are 24 such banks. So, in 1 iteration we can get 2*24 = 48 bytes, and getting 64 bytes requires 2 iterations.

So, latency = k/2 + 80 + k/2 + 80 (since in each iteration we need to select the banks and the bank decoding time (k/2) is independent of the number of banks we are going to access)

= 12 + 80 + 12 + 80 = 184 ns
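The calculation above can be sketched as a short Python check (the function name and parameters are just illustrative, not part of the question):

```python
import math

def cache_block_latency(block_size, c, k, bank_latency=80):
    """Latency (ns) to fetch one cache block from k interleaved banks,
    each c bytes wide, with a k/2 ns decode step per iteration."""
    bytes_per_iteration = c * k                      # data from one parallel access
    iterations = math.ceil(block_size / bytes_per_iteration)
    decode_time = k / 2                              # bank-number decoding, per iteration
    return iterations * (decode_time + bank_latency)

print(cache_block_latency(64, c=2, k=24))  # 2 * (12 + 80) = 184.0 ns
```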

But total memory capacity here is 2*24 = 48 bytes. What is the need of cache management and decoding mechanism?
by Veteran (431k points)
selected
0

@Arjun what's the meaning of this line: "but two accesses to the same bank must be serialized"?

+4
Two accesses to the same bank being serialized means one can happen only after the previous one has finished. (So, it is better to spread data across multiple memory banks than to put everything in a single bank.)
0
@Arjun sir, can you please tell me: the value of k is given as 24, so why are we using 12 in the solution?
0
@Sneha Sorry. It was a typo. I have corrected it :)
0
Sir, in the first iteration we are accessing only 48 bytes; the remaining 16 bytes have to be accessed again from main memory. For these 16 bytes we need 8 more bank accesses. So out of the 24 banks, which banks will be needed? How can we recognize which those 8 banks are?
0
Those will be the first 8 ones, as "wrap-around" is mentioned in the question.
+12
Since we need to access 64 bytes, we need to fetch 32 chunks (2 bytes each). Hence the chunk addresses generated will be 0 to 31.

But since wrap-around addressing is mentioned and the memory has only 24 banks, the bank numbers generated will be 0-23, then 0-7.

Since one memory bank cannot be accessed twice at the same time, we fetch banks 0-23 in the first iteration and banks 0-7 in the next iteration.

I hope this explains why we need two iterations.
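The wrap-around mapping of chunk addresses to banks can be sketched as follows (the helper name is hypothetical):

```python
def bank_schedule(block_size, c, k, start_chunk=0):
    """Split a block's c-byte chunk accesses into iterations so that
    no bank is touched twice within one iteration (wrap-around mapping)."""
    n_chunks = block_size // c                  # 64 / 2 = 32 chunk accesses
    chunks = list(range(start_chunk, start_chunk + n_chunks))
    # Each iteration can use at most k distinct banks in parallel.
    return [[chunk % k for chunk in chunks[i:i + k]]
            for i in range(0, n_chunks, k)]

schedule = bank_schedule(64, c=2, k=24)
print(len(schedule))   # 2 iterations
print(schedule[1])     # second iteration re-reads banks 0-7
```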
0
But 0-23 includes 0-7, right?
+1
Yes, but there will be two references to banks 0-7.
0
Is a memory bank not the same thing as a memory block?

How do I visualise this question? Can someone put up a diagram,

or point me to a good read on memory banks?
0
Can anyone help?
0
@Arjun are we accessing the same values from main memory in the second iteration? It is not clear to me: are we re-accessing the values from the first 8 banks that were accessed in the first iteration?
0
I am not yet clear on the concept of memory banking. Can someone please explain why we did 24/2?
0
^It is given in question $k/2$ ns for $k$ banks.
0
Cache size is bigger than main memory size!!! This question must be wrong, correct?
+1
After accessing 48 bytes in the 1st iteration, 16 more bytes remain to be fetched; for this, only 8 memory banks need to be accessed. Decoding these might need only 8/2 = 4 ns.

Therefore the total time should be

80 + 12 + 80 + 4 = 176 ns

+1

I think so. The given answers either access the same data twice or, worse, access data that doesn't even exist in the main memory. Think about it: the main memory size is 48 bytes, but a single cache block is 64 bytes. Let's assume we only fill a cache block with 48 bytes.

Then we have k/2 = 12 ns for selecting the 24 banks in parallel. After they have been selected, 80 ns is taken to access the first byte of each of the 24 banks, then another 80 ns to access the second byte of the 24 banks.

Total time taken = 12 + 80 + 80 = 172ns

Ans. (C)
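The disagreement between 184 ns and 172 ns comes down to whether the decode step for the next iteration can be overlapped with the current 80 ns bank access. A small sketch of the two timing models (function names are illustrative):

```python
def latency_serial(iterations, decode=12, access=80):
    """Decode and access strictly alternate: decode is on the critical path each time."""
    return iterations * (decode + access)

def latency_overlapped(iterations, decode=12, access=80):
    """Decode of iteration i+1 is hidden under the 80 ns access of iteration i."""
    return decode + iterations * access

print(latency_serial(2))      # 184 ns (the official answer's model)
print(latency_overlapped(2))  # 172 ns (the pipelined-decode reading)
```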

+1

@Arjun

Here, after the 1st iteration's 12 ns decoding, the decoder will be free during the 80 ns memory access.

So can't we decode the next bank numbers during this time?

Because if we can, then the answer would be 12 + 80 + 80 = 172 ns.

+1

@Divy Kala

By "c bytes wide" they mean something like a word being c bytes (2 bytes here), not a memory bank having a total capacity of c bytes.

0

@LiteYagami Yes, but then a queue of decoded addresses must be kept, and this makes the memory complex.

0
It's just a question, so you need not consider this as a real organization.
0
Sir, in the question it is given that the latency of one bank access is 80 ns. We are accessing 24 banks in the first iteration, so shouldn't it be 24*80? Please explain.

This question is based on the concept of MEMORY INTERLEAVING, which says that instead of accessing data from a single memory one word at a time, it is better to divide memory into modules or banks and distribute consecutive data across the modules so the data can be accessed in parallel, improving the data transfer rate. For this purpose an additional decoder is used to select the modules to be accessed in parallel, so we have to count the decoder latency along with the module latency.

Now I am going to explain the solution:

According to the question there are k banks with k = 24, and each bank is c bytes wide with c = 2. So in total we get 2*24 = 48 bytes in one iteration.

Now we have to calculate one iteration's latency:

decoding time for one iteration is k/2 ns: 24/2 = 12 ns

and each bank's latency is 80 ns.

Normally, when the decoder latency is given per bank, the total iteration time is calculated as k*(decoder latency) + bank latency,

but here we are given the total decoding latency of an iteration = 12 ns.

Therefore one iteration requires 12 + 80 = 92 ns.

Now, as we discussed above, in one iteration we can get 48 bytes of data, but the question asks for a cache block (64 bytes) transfer; therefore we require 2 iterations, that is 2*92 = 184 ns.

by Active (3.6k points)
0
But we only need 8 more banks to decode in the 2nd iteration, so the decoder latency = 8/2 = 4 ns, and then 80 ns more to fetch the data.

Hence shouldn't the total be 92 + 84 = 176 ns?
0
All k banks are accessed in parallel (given in the question), so in the next iteration too, all banks are accessed in parallel.
0

Nice explanation. Got all my concepts cleared.

Thanks @vamp_vaibhav

Explanation:

Size of cache block = 64 B
No. of main memory banks, k = 24
Width of each bank, c = 2 bytes

Time taken for one parallel access, T = decoding time + latency = (k/2) + 80 = 12 + 80 = 92 ns

Since one iteration fetches only c*k = 48 bytes, 2 iterations are needed: 2*92 = 184 ns

So (D) is the correct option.

by Boss (10.2k points)
0
What is the use of cache block size 64 bytes??