(These are course notes from MIT's 6.004 Computation Structures class. The content is clear and easy to follow, giving a simple overview of basic computer organization. There are 55 pages in total, all of which I plan to translate into Chinese. Please credit the source when reposting.)
(Original notes: https://app.box.com/s/hj73i5cnek38kpy9yw22)
Memory hierarchy
The inefficiency of data exchange between the CPU and memory is one of the bottlenecks of a pipelined processor. We can address this problem with a memory hierarchy. Within this hierarchy are:
1. A small-capacity but very fast cache
2. A larger but somewhat slower main memory
3. A very large but very slow hard disk
So when the CPU requests a memory address A, it usually goes through these steps (sketched in code after the list):
1. Check the cache: if address A is stored in the cache (called a cache hit), return the data; otherwise go to step 2.
2. Check main memory (RAM): if A is in memory, return the data and also place it in the cache; otherwise go to step 3.
3. If A is neither in the cache nor in memory, it must be on the hard disk. Load it into main memory and the cache, and return the data.
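A minimal sketch of this three-level lookup in Python (the dictionaries standing in for the cache, main memory, and disk are hypothetical, as is pre-filling the disk; real hardware uses comparators and controllers, not dictionaries):

    # Hypothetical stand-ins for the three storage levels.
    disk = {a: 0 for a in range(1024)}   # pretend every address holds 0
    main_memory = {}
    cache = {}

    def read(address):
        # Step 1: check the cache.
        if address in cache:                        # cache hit
            return cache[address]
        # Step 2: check main memory.
        if address in main_memory:
            cache[address] = main_memory[address]   # copy into the cache
            return cache[address]
        # Step 3: the data must be on the hard disk.
        data = disk[address]
        main_memory[address] = data                 # copy into main memory...
        cache[address] = data                       # ...and into the cache
        return data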
To simplify this structure, let's assume A is never on the hard disk: if it is not in the cache, it is in main memory.
The cache is a small, fast storage unit that holds temporary copies of memory data. In the 6.004 lesson, you can assume the cache runs at the CPU's clock rate, so a cache hit returns the cached data within one cycle. After a failed cache lookup (a miss), the CPU stalls until the data is found in memory. If the probability of a cache miss is not very high, the cache can significantly reduce the average memory access time:
t(avg) = t(cache) + miss_rate * t(memory)
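Plugging in assumed numbers (a 1-cycle cache, a 20-cycle main memory, and a 5% miss rate; these figures are illustrative, not from the notes):

    t_cache = 1        # cycles per cache access (assumed)
    t_memory = 20      # cycles per main-memory access (assumed)
    miss_rate = 0.05   # fraction of accesses that miss (assumed)

    # Every access pays the cache lookup; only misses also pay for memory.
    t_avg = t_cache + miss_rate * t_memory
    print(t_avg)       # 2.0 cycles, versus 20 with no cache at all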
Locality
The cache is useful because it exploits locality:
If you access data at address A, chances are you will use it again soon. Whenever a cache miss occurs, the newly fetched data is placed into the cache, replacing data that has not been used for a long time.
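For instance, even a simple summation loop exhibits locality (a sketch; the array and its size are made up):

    data = list(range(1024))   # hypothetical array

    total = 0
    for i in range(len(data)):
        total += data[i]   # data[i], data[i+1], ... sit at adjacent addresses: spatial locality
        # 'total' and 'i' are reused on every iteration: temporal locality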
Fully associative cache
A cache consists of a number of cache lines. Each cache line contains (see the sketch after this list):
1. A valid bit (V), indicating whether the data in this line is valid
2. Optional status bits, e.g. whether the data has been modified (dirty) or is read-only
3. A tag, derived from the memory address
4. The data corresponding to that address
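One way to picture a cache line is as a small record. A sketch in Python (the field types are arbitrary here; in hardware these are just bit fields):

    from dataclasses import dataclass

    @dataclass
    class CacheLine:
        valid: bool = False   # valid bit: does this line hold meaningful data?
        dirty: bool = False   # status bit: modified since it was loaded?
        tag: int = 0          # derived from the memory address
        data: int = 0         # the data stored for that address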
We will discuss three kinds of caches; the first is the fully associative cache. It differs from the other structures in that a fully associative cache can place data in any line.
When a memory address is given to a fully associative cache, the address's tag is compared against every valid line in the cache. If some line's tag matches, it counts as a cache hit and that line's data is sent to the CPU; otherwise it counts as a cache miss.
If a miss occurs while every line in the cache is valid (the cache is full), we must perform a replacement: the newly fetched data takes the place of one of the lines. The replacement policy will be discussed later in these notes. If the victim line is valid and modified (V = 1, D = 1), we must first write its data back to main memory; if it has not been modified, we can simply overwrite the line.
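Putting the pieces together, here is a sketch of a fully associative read using the CacheLine record from above (the victim-selection rule, taking the first invalid line or else line 0, is a placeholder; real replacement policies come later):

    def fa_read(lines, memory, address):
        tag = address                  # fully associative: the whole address serves as the tag
        # Compare the tag against every valid line (hardware does this in parallel).
        for line in lines:
            if line.valid and line.tag == tag:
                return line.data       # cache hit
        # Cache miss: choose a victim line (placeholder policy).
        victim = next((l for l in lines if not l.valid), lines[0])
        if victim.valid and victim.dirty:
            memory[victim.tag] = victim.data   # V=1, D=1: write back before overwriting
        victim.valid, victim.dirty = True, False
        victim.tag = tag
        victim.data = memory[address]
        return victim.data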
There are some useful equations for the fully associative cache:
capacity = # lines * (1 valid bit + S status bits + T tag bits + D data bits)
In the 6.004 lesson these take specific values: T = ? bits, D = ? bits
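A worked example with assumed sizes (the line count and bit widths below are made up for illustration):

    num_lines = 64     # assumed
    status_bits = 1    # e.g. a single dirty bit (assumed)
    tag_bits = 30      # assumed
    data_bits = 32     # assumed: one 32-bit word per line

    capacity = num_lines * (1 + status_bits + tag_bits + data_bits)
    print(capacity)    # 64 * 64 = 4096 bits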
The following examples show three fetch operations, starting with an empty fully associative cache. Each image shows the state of the cache after the access described beneath it.
Direct-mapped cache
Direct-mapped caches are cheaper to build than fully associative caches, because the entire cache needs only one tag comparator, rather than one comparator per line as in the fully associative design.
In a direct-mapped cache, however, address conflicts can occur, which reduces the cache's effectiveness.
When a memory address is given, its index bits select one cache line. If the address's tag equals the tag stored in that line, it is a cache hit and the line's data is sent to the CPU.
Otherwise it is a miss: we fetch the data from main memory and replace the contents of the indexed line with it. If the line's old data is valid and modified (V = 1, D = 1), the cache must first write that data back to memory before discarding it.
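A sketch of the direct-mapped read, again using the CacheLine record from above (the 6-bit index is an assumption, chosen only so the arithmetic is concrete):

    INDEX_BITS = 6                                  # assumed: 2^6 = 64 lines

    def dm_read(lines, memory, address):
        index = address & ((1 << INDEX_BITS) - 1)   # low bits select exactly one line
        tag = address >> INDEX_BITS                 # remaining bits form the tag
        line = lines[index]
        if line.valid and line.tag == tag:
            return line.data                        # hit: a single comparator suffices
        # Miss: this line is the only possible home for the address.
        if line.valid and line.dirty:
            old_address = (line.tag << INDEX_BITS) | index
            memory[old_address] = line.data         # V=1, D=1: write back first
        line.valid, line.dirty = True, False
        line.tag = tag
        line.data = memory[address]
        return line.data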
Having seen both designs, a natural question is which is better: the fully associative cache or the direct-mapped cache? It depends on the memory access pattern. Different access patterns and replacement strategies (the rules that decide which data to evict after a cache miss) favor different cache structures.
Here are some useful equations for the direct-mapped cache:
capacity = # lines * (1 valid bit + S status bits + T tag bits + D data bits)
# lines = 2^(# index bits)
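Continuing the assumed 6-index-bit configuration from the sketch above (32-bit addresses and 32-bit words are also assumptions here):

    index_bits = 6
    num_lines = 2 ** index_bits          # 64 lines
    tag_bits = 32 - index_bits - 2       # assumed 32-bit byte addresses, 4-byte words
    capacity = num_lines * (1 + 1 + tag_bits + 32)   # valid + dirty + tag + data bits
    print(num_lines, capacity)           # 64 lines, 3712 bits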
The following examples show four fetch operations, starting with an empty direct-mapped cache. Each image shows the state of the cache after the access described beneath it.
N-way set-associative cache
To be written next time.
The three types of caches