From: http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Memory/set.html
Introduction
A set-associative scheme is a hybrid between a fully associative
cache and a direct-mapped cache. It's considered a reasonable compromise
between the complex hardware needed for fully associative caches (which
require parallel searches of all slots) and the simplistic direct-mapped
scheme, which may cause collisions of addresses to the same slot (similar
to collisions in a hash table).
Let's assume, as we did for fully associative caches, that we have:
- 128 slots
- 32 bytes per slot
Furthermore, let's assume that we can group slots together into
sets. In particular, we will assume that we have 8 slots per set.
Parking lot analogy
Suppose we have 1000 parking spots. This time, instead of
using a 3-digit number for each parking spot, we use 2 digits.
Thus, the parking spots are numbered 00 up to 99.
However, instead of one parking spot per number, we have
10 for each number. Thus, there are ten parking spots numbered
00, ten numbered 01,..., and ten numbered 99.
Your parking spot is based on the first 2 digits of your student
ID number, and you have up to 10 different parking spots you can
park at. This gives you some flexibility about where to park.
In effect, the various parking permits on a large commuter campus
work just like that. There are separate lots, each with its own letter
or number. You are given a permit for a particular lot, but you can
park anywhere within this lot. The advantage is that you only have to
search for a spot in one large lot, as opposed to searching for a
parking spot across all of campus.
Set associative scheme
Like the direct mapped scheme, we still treat the slots like
an array. The slots are still numbered 0000000 up to 1111111 (there
are 128 slots).
However, we group the slots into sets, and the key is to
keep track of the sets, instead of the slots.
How many sets do we have? 128 slots divided by 8 slots per
set gives us 16 sets.
We need to specify the set number, instead of the slot number,
and that takes lg 16 = 4 bits.
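The arithmetic above can be checked with a short sketch. The function name `set_geometry` is hypothetical, and it assumes the number of sets is a power of 2, as in the example.

```python
NUM_SLOTS = 128      # total slots in the cache (from the text)
SLOTS_PER_SET = 8    # slots grouped into each set (from the text)

def set_geometry(num_slots, slots_per_set):
    """Return (number of sets, bits needed to specify a set number).

    Assumes num_slots / slots_per_set is a power of 2.
    """
    num_sets = num_slots // slots_per_set
    set_bits = num_sets.bit_length() - 1   # lg(num_sets) for a power of 2
    return num_sets, set_bits

num_sets, set_bits = set_geometry(NUM_SLOTS, SLOTS_PER_SET)
print(num_sets, set_bits)   # 16 sets, 4 bits
```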
Here's how the bits of the address break down. It's very
similar to direct mapped, except that we use 4 bits for the set, instead
of the slot.
Bits A4-0 are still the offset. The set number is the next 4 bits,
A8-5. The remaining bits, A31-9, are the tag.
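As an illustration of this breakdown, here is a minimal sketch that splits a 32-bit address into its three fields with shifts and masks. The name `split_address` and the sample address are assumptions made for the example.

```python
OFFSET_BITS = 5   # bits 4-0: byte within a 32-byte slot
SET_BITS = 4      # bits 8-5: one of 16 sets

def split_address(addr):
    """Split a 32-bit address into (tag, set number, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)                     # bits 4-0
    set_number = (addr >> OFFSET_BITS) & ((1 << SET_BITS) - 1)   # bits 8-5
    tag = addr >> (OFFSET_BITS + SET_BITS)                       # bits 31-9
    return tag, set_number, offset

tag, set_number, offset = split_address(0x12345678)
print(hex(tag), set_number, offset)   # 0x91a2b 3 24
```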
Finding the slot
Finding a slot is more complex than in direct-mapped caches.
Suppose you have address B31-0.
- Use bits B8-5 to find the set.
- This should specify 8 slots (since we said there were 8 slots
per set). The slots should have the following slot indexes:
  - B8-5 000
  - B8-5 001
  - B8-5 010
  - B8-5 011
  - B8-5 100
  - B8-5 101
  - B8-5 110
  - B8-5 111
In effect, the set number specifies the upper 4 bits of the slot
index, and the bottom 3 bits are all possible 3-bit bitstring
values.
- Search in all 8 slots to see if the tag B31-9
matches the tag in the slot.
- If it matches one of the slots, get the byte at
offset B4-0.
- If not, decide which slot should be used (possibly evicting a slot),
fetch the 32 bytes from memory, and place them in the slot, updating
the valid bit, dirty bit, and tag as needed.
This is called an 8-way set-associative cache, since each set contains
8 slots. You can have N-way set-associative caches, where each
set contains N slots (where N is a power of 2).
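The lookup steps above can be sketched in software as follows. This is only an illustration under stated assumptions: real hardware compares the 8 tags in parallel rather than in a loop, the class names `Slot` and `SetAssociativeCache` are hypothetical, memory is modeled as a bytes object, only reads are handled, and the eviction policy (fill an empty slot first, otherwise round-robin within the set) is an assumption, since the notes do not say which slot to evict.

```python
OFFSET_BITS, SET_BITS, WAY_BITS = 5, 4, 3
LINE_SIZE = 1 << OFFSET_BITS          # 32 bytes per slot
WAYS = 1 << WAY_BITS                  # 8 slots per set
NUM_SETS = 1 << SET_BITS              # 16 sets
NUM_SLOTS = NUM_SETS * WAYS           # 128 slots

class Slot:
    def __init__(self):
        self.valid = False
        self.dirty = False
        self.tag = 0
        self.data = bytes(LINE_SIZE)

class SetAssociativeCache:
    def __init__(self, memory):
        self.memory = memory                    # bytes-like backing store (assumed model)
        self.slots = [Slot() for _ in range(NUM_SLOTS)]
        self.next_victim = [0] * NUM_SETS       # assumed round-robin pointer per set

    def read_byte(self, addr):
        """Return (byte, hit) for the given address."""
        offset = addr & (LINE_SIZE - 1)                     # B4-0
        set_number = (addr >> OFFSET_BITS) & (NUM_SETS - 1) # B8-5
        tag = addr >> (OFFSET_BITS + SET_BITS)              # B31-9
        base = set_number << WAY_BITS           # set number is the upper 4 index bits
        candidates = self.slots[base:base + WAYS]   # slot indexes base+000 .. base+111

        # Search all 8 slots of the set for a valid slot with a matching tag.
        for slot in candidates:
            if slot.valid and slot.tag == tag:
                return slot.data[offset], True

        # Miss: use an empty slot if there is one, else evict round-robin (assumed).
        victim = next((s for s in candidates if not s.valid), None)
        if victim is None:
            way = self.next_victim[set_number]
            self.next_victim[set_number] = (way + 1) % WAYS
            victim = candidates[way]
            # A dirty victim would be written back here; this read-only sketch never sets dirty.
        line_start = addr - offset
        victim.data = bytes(self.memory[line_start:line_start + LINE_SIZE])
        victim.valid, victim.dirty, victim.tag = True, False, tag
        return victim.data[offset], False

memory = bytes(range(256)) * 64       # 16 KiB of sample memory
cache = SetAssociativeCache(memory)
print(cache.read_byte(0x123))         # first access misses: (35, False)
print(cache.read_byte(0x123))         # second access hits:  (35, True)
```

Addresses that differ by a multiple of 512 bytes (16 sets times 32 bytes) land in the same set, so reading nine such addresses in a row forces an eviction in this sketch.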
Compromises
This scheme is a compromise. You only have to use the complex
comparison hardware (to find the correct slot) on a small set of
slots, instead of over all the slots. Presumably, the cost of such
comparison hardware grows more than linearly in the number of slots, so
the fewer slots you need to search through, the less overall
hardware is needed.
Yet you gain the flexibility of holding up to N cache lines
per set in an N-way set-associative scheme.
Summary
A set-associative cache scheme is a combination of the
fully associative and direct mapped schemes. You group slots
into sets. You find the appropriate set for a given address (which
is like the direct mapped scheme),
and within the set you find the appropriate slot (which is like
the fully associative scheme).
This scheme has fewer collisions because you have more slots to
pick from, even when cache lines map to the same set.
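To make the collision claim concrete, the following rough sketch counts misses for the same access pattern under different associativities. The function `count_misses` is hypothetical, it tracks only tags (no data), and it assumes FIFO eviction within a set, which the notes do not specify.

```python
def count_misses(addresses, ways, num_slots=128, line_size=32):
    """Count misses for a cache with `ways` slots per set.

    ways=1 behaves like a direct-mapped cache and ways=num_slots like a
    fully associative one. Eviction within a set is FIFO (an assumption).
    """
    num_sets = num_slots // ways
    sets = [[] for _ in range(num_sets)]   # tags held by each set, oldest first
    misses = 0
    for addr in addresses:
        line = addr // line_size
        set_number, tag = line % num_sets, line // num_sets
        tags = sets[set_number]
        if tag in tags:
            continue                       # hit
        misses += 1
        if len(tags) == ways:
            tags.pop(0)                    # set is full: evict the oldest tag
        tags.append(tag)
    return misses

# Two addresses 4096 bytes apart collide in a 128-slot direct-mapped cache.
trace = [0, 4096] * 10
print(count_misses(trace, ways=1))     # 20: every access evicts the other line
print(count_misses(trace, ways=8))     # 2: both lines fit in the same set
print(count_misses(trace, ways=128))   # 2: fully associative
```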