283 views

Consider a CPU that executes 2000 instructions, during which there are 80 misses in the L1 cache and 40 misses in the L2 cache. Assume the miss penalty from the L2 cache to memory is 200 clock cycles, the hit time of the L2 cache is 30 clock cycles, the hit time of the L1 cache is 5 clock cycles, and there are 1.8 memory references per instruction. Then the average number of stall cycles per instruction is ________.

But I think it should be 6.5.

My Calculations:-

1 inst --------- 1.8 Mem Ref

2000 inst ------- 2000 * 1.8 mem ref = 3600 mem ref.

Miss ratio of L1 = $\frac{80}{3600}$ = 0.05, Hit ratio of L1 = 0.95.

Miss ratio of L2 = $\frac{40}{80}$ = 0.5, so hit ratio of L2 = 0.5.

Average stalls per instruction = Hit ratio of L1 * Hit in L1 + Miss in L1 * (Hit in L2 + Miss in L2 * Miss Penalty of L2).

0.95 * 0 {as there are no stalls on a hit in the L1 cache} + 0.05(30 + 0.5 * 200)

6.5 stalls/instruction.

+2

It should be 5.2.

Total memory references = $2000 \times 1.8 = 3600$.

Of the 80 references that miss in L1, 40 hit in L2 and 40 miss in L2 (i.e., those 40 are served from main memory).

Total stall cycles = $40 \times 30 + 40 \times (200 + 30) = 10400$.

Stall cycles per memory reference = $\frac{10400}{3600}$.

Since there are 1.8 memory references per instruction,

average stall cycles per instruction = $\frac{10400}{3600} \times 1.8 = 5.2$.
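The count-based calculation above can be checked with a few lines of Python. This is only a sketch: the variable names are my own, and it assumes (as the answer does) that an L1 hit contributes no stall cycles. Exact fractions are used so that 1.8 does not pick up floating-point rounding.

```python
from fractions import Fraction

# Values given in the question
instructions = 2000
refs_per_instruction = Fraction(18, 10)  # 1.8 memory references per instruction
l1_misses = 80
l2_misses = 40
l2_hit_time = 30       # cycles to access L2 after an L1 miss
l2_miss_penalty = 200  # extra cycles to reach memory after an L2 miss

memory_refs = instructions * refs_per_instruction  # 3600
l2_hits = l1_misses - l2_misses                    # 40

# L2 hits stall for 30 cycles; L2 misses stall for 30 + 200 cycles
total_stall_cycles = (
    l2_hits * l2_hit_time + l2_misses * (l2_hit_time + l2_miss_penalty)
)

stalls_per_ref = Fraction(total_stall_cycles) / memory_refs
stalls_per_instruction = stalls_per_ref * refs_per_instruction

print(total_stall_cycles)             # 10400
print(float(stalls_per_instruction))  # 5.2
```

Note that dividing by 3600 references and then multiplying by 1.8 is the same as dividing the 10400 stall cycles by the 2000 instructions directly.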

0
what is wrong with my solution?
+1

Average stalls per instruction = Hit ratio of L1 * Hit in L1 + Miss in L1 * (Hit in L2 + Miss in L2 * Miss Penalty of L2).

0.95 * 0 {as there are no stalls on a hit in the L1 cache} + 0.05(30 + 0.5 * 200)

6.5 stalls/instruction.

There is a mistake in your calculation.

For the 80 misses in L1, we access L2.

In L2, 40 are hits -> these 40 take only the 30 stall cycles needed to access L2.

40 are misses -> they have already taken 30 stall cycles to reach L2 and then take 200 more stall cycles to access memory. Hence a total of 230 stall cycles each.

Total stall cycles = 40(30) + 40(30 + 200) = 10400.

Or you can break it down like this:

Total stall cycles = 80(30) + 40(200) = 40(30) + 40(30) + 40(200) = 10400.
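A quick way to convince yourself that the two breakdowns agree is to compute both. The sketch below uses hypothetical helper names of my own (`stalls_by_outcome`, `stalls_by_level`); the cycle counts are the ones from the question.

```python
L2_HIT_TIME = 30        # stall cycles to access L2 after an L1 miss
L2_MISS_PENALTY = 200   # additional stall cycles to access memory


def stalls_by_outcome(l2_hits, l2_misses):
    """Group references by where they are finally served (L2 or memory)."""
    return l2_hits * L2_HIT_TIME + l2_misses * (L2_HIT_TIME + L2_MISS_PENALTY)


def stalls_by_level(l1_misses, l2_misses):
    """Group stall cycles by the level that causes them."""
    return l1_misses * L2_HIT_TIME + l2_misses * L2_MISS_PENALTY


by_outcome = stalls_by_outcome(l2_hits=40, l2_misses=40)
by_level = stalls_by_level(l1_misses=80, l2_misses=40)
print(by_outcome, by_level)  # 10400 10400
```

Both views count the same cycles: every L1 miss pays 30 cycles for the L2 access, and only the L2 misses pay the extra 200.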

The rest is clear from @joshi_nitish's explanation.

Hope it helps :)

0

Thanks @gari

"now they have taken 30 stall cycles to come to L2 and will take 200 stalls to access memory. Hence a total of 230 stall cycles."

In 40(30 + 200), are we adding the 30 for copying the data into L2?

0
When there is a miss in L1, 30 stall cycles are spent accessing L2. We add the 30 because the required data was not found in L1, so L2 has to be accessed. If the data is not in L2 either, we access memory, which takes another 200 cycles.
+1
Shubhanshu, do it like this:
(80(30) + 40(200)) / 3600 ≈ 2.89 stall cycles per memory reference

Now, per instruction: 2.89 * 1.8 ≈ 5.2
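For what it's worth, the ratio-style formula from the question gives the same result when the ratios are left unrounded and the per-reference figure is scaled by 1.8. This is only a sketch with my own variable names; note that 80/3600 works out to about 0.0222.

```python
from fractions import Fraction

memory_refs = 3600
l1_miss_ratio = Fraction(80, memory_refs)  # 1/45, about 0.0222
l2_local_miss_ratio = Fraction(40, 80)     # 1/2 of the references reaching L2

# An L1 hit adds no stall; an L1 miss costs 30 cycles in L2,
# plus 200 more cycles for the half that also miss in L2.
stalls_per_ref = l1_miss_ratio * (30 + l2_local_miss_ratio * 200)

# Scale from "per memory reference" to "per instruction"
stalls_per_instruction = stalls_per_ref * Fraction(18, 10)

print(round(float(stalls_per_ref), 2))  # 2.89
print(float(stalls_per_instruction))    # 5.2
```

So the formula itself matches the count-based approach; the result depends on using the exact L1 miss ratio and on converting from per-reference to per-instruction at the end.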
0
And the cycles the processor takes to access data from the L1 cache are not counted as stalls, right?

Sorry for the silly doubt :)