2.2k views

Cache 1

Direct Mapping

 17 tag 10 block 5 word

Cache 2

Set Assosicative  2- way associative

 18 9 5

Cache 3

Associative Cache

 27 5

1. Calculate Hit Latenncy in Each of Three Cache Memory if
• 2 to 1 MUX / OR  has latency of 0.6ns
• k-bit comparator has katency of k/10 ns

2.How much number of MUX / OR ? camparators are used in each of three

retagged | 2.2k views
0

@Arjun sir is it 3 ns or 2.4 ns for the set associative one ?

First of all what has a multiplexer and comparator have to do in a cache? Lets see:

In a cache memory implementation we allow a set of main memory frames to be mapped to a particular cache entry. So, once we reach an entry in cache we also need to know which 1 of the $n$ possible entries is currently being present there. For this purpose we use tag bits. And we need $\log n$ bits for this and we also needs to compare all these bits using a comparator to ensure "cache hit". So, in a direct mapped cache, first we index to an entry using index bits and then compare the tag bits. So, we can solve case 1 here:

The second part - 10 bits for block is used for indexing. The final part is offset within a cache entry and is not relevant for this question. So, hit latency will be

Cache indexing time (lets assume negligible as no data given in question) + Tag comparison time

$= 17/10 = 1.7 ns$

PS: Actually this is the time till we identify a cache hit. After this we have to decide which word in the block is requested, which requires a 3-1 MUX here as word bits are 5. (this is why usually offset bits is power of 2)

For direct mapped entry we assumed that each cache entry can hold a single cache block. But this changes in a set-associative cache and with a 2-way set, each cache entry can hold 2 cache blocks. i.e., after indexing to a cache, we have to compare the tag bits of 2 entries. But this comparison can be done in parallel and result be multiplexed. We also need an OR gate to determine the output of tag comparison for 2 set entries. The tag comparison result of the entries serve as select lines to the MUX. So, here hit time

$= 18/10 + 0.6 + 0.6 = 3.0 ns$

A fully associative cache means we do not need to index and all entries goes to a single slot. So, we need to compare the tag bits of all the cache entries to determine a hit and then has to multiplex this output. Hit time

$= 27/10 + x$

I used $x$ because here we need 1024-1 multiplexer as we had used all 10 block bits - which gives 1024 entries in a set. Can we get this time from 2-1 multiplexer? We can but not doing the calculation here.

by Veteran (432k points)
selected by
0

Sir, In case of fully associative cache-

No. of "27 bit comparators"  we need  = number of cache lines/blocks = 210.

Since each of comparators are in parallel, so net delay because of all the comparators = delay by one 27 bit comparator = 27/10.

No. of mux needed = ??. ( I think in case of fully associative we don't need any mux, because we don't have to select any set( in case of set associative) / line(in case of direct)).

And after getting result of all the comparators, we need to pass them through an OR gate.

So OR gate delay should be added here.

I mean net delay in case of fully associative = 27/10 + 0.6 = 3.3 ns.

0
In that case same should happen for set-associative. But I don't know if MUX or OR Gate is used here. Someone good with circuits can tell that..
+2

To design a 1024 input multiplexer from 2 input multiplexer.

1st level ==> 2multiplexers and each one taking 2 inputs , so total 210 inputs are there.

we will get 2outputs 1st level multiplexers.

2nd level ==> 2MUX and each one taking 2 inputs , so total 29 inputs are there.

........

Similarly, adding all levels MUX , we have

29 + 2+ ..... + 21 + 1 = 1023 total multiplexers.

Hence, x = 0.6 * 1023 = 613.8 ns.

0
Sir, mux needs selection lines. How can we use mux here.
+1
we are designing an OR gate using 2 : 1 MUX.
+1

@Vijay you can see the figure here: We have to select all 'x' bytes where x is the data width. The select lines will be the output of tag comparison.

https://courses.cs.washington.edu/courses/cse378/09au/lectures/cse378au09-19.pdf

0
it seems to be extension of  gate question finally figured it out
https://gateoverflow.in/1851/gate2006-74
0

@arjun sir correct me if Im wrong  ,

In Direct Mapped Cache
Question 1
Hit Latency = 210to 1 MUX Delay + n-bit tag comparator  Delay
=  0 + 17/10 =1.7ns

( No information Regarding 210to 1 MUX delay . So let us assume it to be 0 )

Question 2

only 1 comparator and 17  210-to-1 MUX are required but  MUX are operating in parallel

2-way Set Associative Cache

Question 1

Hit Latency = 29to1 MUX Delay + Comparator Delay + OR gate delay

= 0 + 18/10 + 0.6 = 2.4 ns

Question 2

18(= 2 comparators per set )  324 (= 36 29to1 MUX  per set ) 9 (= 1 OR gate per set ) are required   . comparators are operating parallelly and MUX are operating parallelly . OR gate can be realised usuing 2to1 MUX

Fully Associative Cache

Question 1

Hit Latency = Comparator Delay + OR gate Delay

= 0.6 +0.6 = 1.2 n sec
Question 2

27 tag comparators are 1 OR gate are required . Comparators are operating in parallel

0
i have seen this question somewhere ....half of the question is made by u..akhil..of fully associative..

secondly...we don't use mux in direct-mapped...
thirdly in 2-way associative ...why u created confusion..by entering OR gate...when MUX is available ..then..why u are using OR gate ???
–1
@Tauhin_Gangwar . It is extension to previous gate question . I have said that in my previous comment . When i see part of this question in a book for first time i never knew it was GATE question . So i extended it to nurture my understanding in cache memory,

We do use MUX in direct mapped cache . See my diagram .

We do use OR gate in set associative see diagram for more clarification .
0
do u know why mux is used...?? tell me right now..

you are using that doesn't mean it is required...

ur diagram r wrong..
+1
@Tauhin Sir ..  mux is used to get the tag  bits of a particular  block/set.

Now you tell us please what should be correct diagram.?
0
@vijay what is the problem in the link provided by arjun sir....diagram is there

///and tag bits are used as select lines
+1
@Tauhin Sir
MUX is used to select one TAG bit from each block. ( see my diagram ) Depending on number of TAG bits that much number of MUX are required .  I dont have  got any reasons to believe my diagram is wrong , yet .
Thanks
0
akhil i never gave down .....to anybody....then why i will downvote u... check my profile ...u will come 2 knw...downvote gave out= 0.
0
sorry.. that was a general msg.. Neither i dont have any reasons to believe downvote is given by a particular user. No hurt feelings. I wann learn something
+2

@Akhil 1024:1 MUX selects the cache entry. So, only one out of 1024 entries is valid. So, for the comparator input we do not need a MUX- consider a BUS connecting all the 1024 outputs to the comparator. Similarly we do not need a MUX to select one of the 1024 entries- we can have 1024 signals going to them and at any point at most one will be active.

+1
thanks Sir. Nicely explained and cleared all the lacking concepts. :)
0
PS: Actually this is the time till we identify a cache hit. After this we have to decide which word in the block is requested, which requires a 3-1 MUX here as word bits are 5. (this is why usually offset bits is power of 2) . What is 3-1 mux here? Plz explain
+1

@arjun sir,

From the diagram what I interpreted is that :

When the comparator is doing the comparison, using the block offset, we can select the required byte within the block parallelly.

Now in case of 2-way set associative, we are doing the 2 comparisons in parallel. Now we need to check which among them got hit or not. So we need an OR gate here.

Therefore, time to get the comparison done will be = 18 / 10  + 0.6 = 2.4

After getting hit we need to pass it to 2 * 1 MUX as a select line to select among the 2 data we got.

Total time = 18 / 10 + 0.6 + 0.6 = 3.0

But why here, https://gateoverflow.in/1851/gate2006-74     you counted total time = 2.4.

0
What are the select lines for the MUX in the above figure?
0
Sir, in the diagram it is not there, but we should have it otherwise how we will know, from which block we should select the data.? And I think, select line should be from the block whose tag got a match.
0
yes, so in that case why do we need to know if a Hit occured for getting the selcted data? The OR gate is used in the diagram to determine a HIT/MISS so that a MISS operation can be started if required.
+1
Yes, for getting the select line we don't need to know whether it is hit or miss. After comparison, we can directly provide them to mux. Also, do parallel checking using OR gate for miss/hit. If it is a miss, we canceled otherwise our data will be ready in the data bus.Then total delay should be 1.8 + 0.6 = 2.4 ns. How you got 3.0 ns?
0
In the diagram above (for direct mapping) we use a 4:1 MUX to choose one word among the four words in the block. So in this example, the block size is 32 bytes. Why don't we assume that the block can hold 8 words and use a 8:1 MUX (which has seven 2:1 MUX)?
0

@Arjun

PS: Actually this is the time till we identify a cache hit. After this we have to decide which word in the block is requested, which requires a 3-1 MUX here as word bits are 5. (this is why usually offset bits is power of 2)

Did you mean to say $8-1\ MUX$? Assuming $word\ size = 4\ bytes$, lower $2\ bits$ of the block offset will denote a $byte$ in a word and higher $3\ bits$ will denote the word no. That means to select a word out of $8$ words in the block we need a $2^3-1 \ MUX$ with $3$ select lines provided by the higher $3 \ bits$ of the block offset.

Need verification from Arjun Sir,

For (A)Direct Mapped Cache:

So, hit latency comes to be Line decoding time(Assume 0 if not given) +Tag comparison time+AND GATE delay.

AND gate delay is not given so assume it to be zero, Hit time=$\frac{17}{10}=1.7ns$

Since other delays which were present, but assumed to be zero(either not given or told to be zero), the cache hit time must be atleast 1.7ns here should be the correct way to say this.

(B)Set Associative cache design:

So, here we can see that after the Tag comparison is done, and assuming AND gate delay to be zero, we see that the multiplexor uses the lines of the tag comparison result as it's input to decide which one of the block of the set must be selected. So, After TAG comparison is done, MUX and the or gate can produce the result in parallel.

Also, note that the set decoding time if given is to be taken into account while calculating hit time.

So, here Cache hit time=$1.8(tag)+0.6(\,or\,+MUX\,delay\,in\,parallel)=2.4ns$ (Remember cache hit time is atleast this).

(C)Associative Cache: So, here my cache has $2^{10}$ lines and each tag has 27 bits. So, for all $2^{10}$ lines, the tag comparison will be done in parallel and this will take $2.7ns$. After this, we will have 1024 lines of tag comparator output. Since now we need to select 1 block out of 1024, we need the functionality of a 1024X1 MUX. But we have only 2X1 MUX.

So, my 1024x1 MUX, can be implemented like below

So, we need 512 in count 2X1 Mux at first level to select out of 1024 lines,

256 at the second level,128 at third,64 at fourth,32 at fifth,16 at sixth,8 at seventh,4 at eight,2 at level 9, and finally 1(a single) 2x1 mux at 10th level.

So, total 10 level of multiplexing needed and this will take $10*0.6=6ns$

So, cache hit time here=$2.7+6=8.7ns.$

by Boss (29.2k points)
0

In set assoc cache here mux size is 4 : 1..as associativity is 4

m confused what actually input to mux it is shown block ..is it particular word from each block ??

0

@jatin khachane 1-Input to mux is all input lines of the blocks of the set which is decoded.Yes, it depends on the associativity.

0
Mux size is 4 :1 in pic you posted...it can take 4 inputs and give one output...for one block of set one input to mux..but how we decide which word of that block to give to mux as input obvi we cant give entire block to mux input??
0

@jatin khachane 1-I think the lines which come after tag comparison, they are acting as selection lines for MUX.

But a 4x1 MUX would need only 2 selection lines.

I have doubt here.

Now, how this 4 gets converted to 2.

0

Here lines of tag comparision (outout of 4 ANDs) given to MUX ..which are not select inputs right ..because MUX is 4X1 it will be having 2 select inputs

0

1) we need to say hit/miss, no need to select the word.

2) hit latency comes to be Line decoding time(Assume 0 if not given) +Tag comparison time+AND GATE delay.

line decoding done by MUX, they given 2 x 1 mux, then as you implemented in last case, we can implement it also...

that AND gate, is referring valid or invalid, but in the question, nothing says about block valid or invalid, no need to use AND gate.

0
That's why I assumed it to be zero.
0
did you mean for AND gate or for 2$^{10}$x1 MUX ?
0
For AND gate.For MUX, we need to consider it. And even Arjun SIr has mentioned in his answer as $x$.
0

Patterson said that the $4$ results from the comparators will serve as decoded select signal for the $4-to-1 \ MUX$.

That means the $4$ lines from the comparators will have to encoded at first before applying as select lines to the MUX.

I think the complete diagram will be like this :

–1 vote

Now we simply talk about the direct mapping

In direct mapping mapping number of multiplexer required depend upon the number of tag bits or size of comparator

Along with only only one k-bit comparator required;

In most of the cases time taken for the multiplexer taken as zero

Now

Number of Tag bits=17bits; So 17 bit comparator and 17 multiplexer

Total time taken=k/10+0.6=2.3ns

Now we talk about set associative mapping::

No multiplexer;

Only compartor;

Time taken=1.7ns

by Boss (10.2k points)