Original source: http://gsqls.blog.163.com/blog/static/45971218201341310221675/
Function, Structure, and Working Principle of High-Speed Buffer Memory (Cache)
High-speed buffer memory (cache) is a memory that sits between main memory and the CPU. It is built from static memory chips (SRAM), has a much smaller capacity than main memory, and operates at a speed close to that of the CPU.
1. The introduction of cache
Note the following two facts:
① Large-capacity main memory is built from DRAM, which is slow compared with SRAM; SRAM is fast, but expensive.
② Programs and data exhibit locality: over a short period of time, the addresses referenced by a program tend to be concentrated in a small range of memory.
Therefore, a fast and relatively small memory can be placed between main memory and the CPU to hold the programs and data the CPU is using now or will use shortly. This greatly speeds up the CPU's access to memory and improves machine efficiency.
The basic working principle of the cache is as follows:
The cache stores the instructions and data that will be needed soon, in order to speed up the CPU's access to memory. Two technical problems must be solved:
The first is the mapping and translation between main memory addresses and cache addresses;
The second is replacing the contents of the cache according to certain principles.
The structure and working principle of the cache are shown in the following illustration.
A cache consists of three major parts:
Cache storage: stores the instruction and data blocks transferred from main memory.
Address translation part: maintains a directory table used to translate main memory addresses into cache addresses.
Replacement part: when the cache is full, replaces blocks according to a certain policy and updates the address translation part.
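To make these three parts concrete, the following is a minimal C sketch of the structures involved; the block size, block count, and all names here are illustrative assumptions, not from the original text:

    #include <stdint.h>
    #include <stdbool.h>

    #define BLOCK_BYTES 64        /* assumed block (line) size in bytes */
    #define NUM_BLOCKS  256       /* assumed number of cache blocks */

    /* One cache block: the data copy plus the bookkeeping described
       above -- a tag recording which main memory block this is a copy
       of, and a valid bit marking whether it holds a real copy. */
    typedef struct {
        bool     valid;
        uint32_t tag;
        uint8_t  data[BLOCK_BYTES];
    } cache_block_t;

    /* The cache storage part is then just an array of such blocks;
       the address translation and replacement parts are the lookup
       and eviction logic sketched in the sections below. */
    typedef struct {
        cache_block_t blocks[NUM_BLOCKS];
    } cache_t;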
2. The basic principle of cache
Both the cache and main memory are divided into blocks of equal size (a cache block is often called a cache line), each consisting of multiple bytes. At any moment, a cache block holds all the information of some main memory block; that is, a cache block is a copy (or image) of a main memory block, as shown in the following figure.
Besides the data portion, the cache must also record which main memory block each cache block is a copy of. Each cache block therefore carries a second component, the tag, which records the block address information of the corresponding main memory block.
With a cache, a memory access does not go to main memory first but to the cache, so the main memory (physical) address must be interpreted when the cache is accessed. Because cache blocks and main memory blocks are the same size, the low-order part of the main memory address (the address within the block) can be used directly as the address within the cache block.
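As a small illustration of this address split, the following C fragment (with an assumed 64-byte block size) extracts the block number and the within-block offset from a physical address:

    #include <stdint.h>
    #include <stdio.h>

    #define OFFSET_BITS 6                 /* assumed 2^6 = 64-byte blocks */

    int main(void) {
        uint32_t addr = 0x12345678;       /* an example physical address */

        uint32_t block_number = addr >> OFFSET_BITS;        /* high-order part */
        uint32_t offset = addr & ((1u << OFFSET_BITS) - 1); /* low-order part,
                                                               used unchanged as
                                                               the address within
                                                               the cache block */
        printf("block number = 0x%x, offset = 0x%x\n",
               (unsigned)block_number, (unsigned)offset);
        return 0;
    }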
II. Address mapping and translation
An address mapping is the correspondence between the main memory address of a piece of data and its address in the cache. Three mapping methods are described below.
1. Fully associative mapping
Fully associative mapping means that any block of main memory can be mapped to any block of the cache. When a main memory block needs to be brought into the cache, a block is chosen according to the cache's current occupancy or allocation, and the chosen block may be any block in the cache. For example, suppose the cache has 2^c blocks and main memory has 2^m blocks; when main memory block j needs to be brought into the cache, it may be stored in cache block 0, block 1, ..., block i, ..., or block 2^c − 1, as shown in the following figure.
Fully associative mapping
In fully associative mapping, the main memory address issued by the CPU has the following format:
    | m | w |
where m is the main memory block number and w is the word address within the block. The cache address used by the CPU to access the cache has the form:
    | c | w |
where c is the cache block number and w is the word address within the block.
Translation from a main memory address to a cache address is done by looking up a block table implemented in associative memory; the translation process is shown in the following figure.
Address translation for fully associative mapping
When a main memory block is brought into the cache, its main memory block number and the assigned cache block number are registered as a pair in the associative memory's mapping table. On a memory access, the CPU first uses the block number m of the main memory address to search the associative memory for a matching cache block number. If one is found, the access hits the cache: the corresponding cache block number is read out and placed in the block number field c of the cache address, the word address w of the main memory address is copied directly into the word field of the cache address, and the resulting cache address is used to complete the access.
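The following C sketch models this associative search in software; real hardware compares the block number against all stored tags in parallel, and the table size and names here are assumptions for illustration:

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_BLOCKS 8          /* assumed small fully associative cache */

    typedef struct {
        bool     valid;
        uint32_t tag;             /* stores the main memory block number m */
    } fa_entry_t;

    /* Returns the cache block number c on a hit, or -1 on a miss.
       On a hit, the cache address is formed as (c, w), with w copied
       directly from the main memory address. */
    int fa_lookup(const fa_entry_t table[NUM_BLOCKS], uint32_t block_number)
    {
        for (int c = 0; c < NUM_BLOCKS; c++) {
            if (table[c].valid && table[c].tag == block_number)
                return c;         /* hit */
        }
        return -1;                /* miss: the block must be brought in */
    }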
Advantages: high hit ratio and high utilization of the cache storage space.
Disadvantages: the associative memory is large and the comparison circuitry complex; every access must be compared against the entire contents of the associative memory, so it is slow and costly. The method is therefore suitable only for small caches and is rarely used.
2. Direct mapping
Direct mapping means that main memory block j can be mapped only to the cache block i that satisfies the following relation:
i = j mod 2^c
Direct mapping
In the figure above, main memory blocks 0, 2^c, 2^(c+1), ... can be mapped only to cache block 0; main memory blocks 1, 2^c + 1, 2^(c+1) + 1, ... can be mapped only to cache block 1; ...; and main memory blocks 2^c − 1, 2^(c+1) − 1, ..., 2^m − 1 can be mapped only to cache block 2^c − 1.
In other words, main memory blocks whose block numbers leave the same remainder when divided by 2^c all map to the same cache block. For example, with 2^c = 8 cache blocks, main memory blocks 3, 11, 19, ... all map to cache block 3.
In direct mapping, the main memory address issued by the CPU has the following format:
    | t | c | w |
where t is the tag, c is the cache block number, and w is the word address within the block. The original block number m is in effect split into two fields, t and c: the c field indicates which cache block a main memory block maps to, i.e. the remainder after dividing the block number by 2^c, while main memory blocks with the same remainder are distinguished by the quotient after dividing by 2^c, which is the t field above.
In general, the number of main memory blocks is an integer multiple of the number of cache blocks; that is, the number of main memory blocks 2^m and the number of cache blocks 2^c satisfy 2^m = n × 2^c.
In direct mapping, the tag t is stored along with each cache block; the address translation process is shown in the following illustration.
Address translation for direct mapping
When a main memory block is brought into the cache, the t field of its main memory address is written into the tag field of the assigned cache block. When the CPU issues a memory address, the c field of the address first selects the corresponding cache block; the tag stored in that block is then compared with the t field of the main memory address. If they match, the main memory block is already in the cache (a hit), and the w field of the main memory address is used to access the corresponding word within the cache block. If they do not match (a miss), main memory is accessed directly using the main memory address.
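A minimal C sketch of this lookup follows, with assumed field widths chosen for illustration:

    #include <stdint.h>
    #include <stdbool.h>

    #define OFFSET_BITS 6                    /* assumed 2^6-byte blocks */
    #define INDEX_BITS  8                    /* assumed 2^8 cache blocks */
    #define NUM_BLOCKS  (1u << INDEX_BITS)

    typedef struct {
        bool     valid;
        uint32_t tag;                        /* the t field of the address */
    } dm_entry_t;

    /* Direct-mapped lookup: the c field selects exactly one block, and
       only that block's tag is compared with the t field. Returns true
       on a hit, false on a miss (in which case main memory is accessed). */
    bool dm_lookup(const dm_entry_t cache[NUM_BLOCKS], uint32_t addr)
    {
        uint32_t c = (addr >> OFFSET_BITS) & (NUM_BLOCKS - 1); /* block index */
        uint32_t t = addr >> (OFFSET_BITS + INDEX_BITS);       /* tag */
        return cache[c].valid && cache[c].tag == t;
    }

Note that only a single tag comparison is needed per access, which is exactly why the comparison circuitry is so simple.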
Advantages: the comparison circuitry is the simplest and the address mapping is straightforward; a data access only needs to check whether one tag matches, so access is fast and the hardware is simple.
Disadvantages: the block conflict rate is high, because main memory blocks with the same remainder cannot reside in the cache at the same time, which lowers cache utilization. Since each main memory block can be mapped only to one specific cache block, if that block is already occupied when a new main memory block must be brought in, the new block must replace it even when other cache blocks are free; it cannot be placed anywhere else. Replacement is therefore frequent and the hit ratio relatively low.
3. Set-associative mapping
Each of the two methods above has strengths and weaknesses, and interestingly they are exactly complementary: the advantages of fully associative mapping are the disadvantages of direct mapping, and vice versa. Can a mapping method combine the advantages of both? Set-associative mapping does. The cache is divided into 2^u sets, each containing 2^v blocks. Main memory blocks are directly mapped to cache sets, while fully associative mapping is used among the blocks within a set; that is, a main memory block can be mapped to any block of one specific cache set. The relationship between main memory block j and cache set k is:
k = j mod 2^u
Suppose main memory has a total of 2^s × 2^u blocks (that is, m = s + u); the mapping relationship is shown in the following figure.
Set-associative mapping
In the figure, main memory blocks 0, 2^u, 2^(u+1), ... can be mapped only to set 0 of the cache, but may be placed in any of the 2^v blocks within that set.
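To round out the three methods, here is a C sketch of a set-associative lookup under assumed parameters (2^7 sets, 2^v = 4 blocks per set); the set field selects one set directly, and the tag is then compared associatively against every block in that set:

    #include <stdint.h>
    #include <stdbool.h>

    #define OFFSET_BITS 6                  /* assumed 2^6-byte blocks */
    #define SET_BITS    7                  /* assumed 2^7 sets (u = 7) */
    #define WAYS        4                  /* assumed 2^v = 4 blocks per set */
    #define NUM_SETS    (1u << SET_BITS)

    typedef struct {
        bool     valid;
        uint32_t tag;
    } sa_entry_t;

    /* Set-associative lookup: k = j mod 2^u selects the set (direct
       mapping between blocks and sets), then the tag is compared with
       all blocks of that set (fully associative within the set).
       Returns the way index on a hit, or -1 on a miss. */
    int sa_lookup(const sa_entry_t cache[NUM_SETS][WAYS], uint32_t addr)
    {
        uint32_t k   = (addr >> OFFSET_BITS) & (NUM_SETS - 1);
        uint32_t tag = addr >> (OFFSET_BITS + SET_BITS);

        for (int way = 0; way < WAYS; way++) {
            if (cache[k][way].valid && cache[k][way].tag == tag)
                return way;                /* hit within the selected set */
        }
        return -1;                         /* miss */
    }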