CPU cache [excerpt]

Source: Internet
Author: User

CPU Cache

Most CPUs have a level-1 cache and a level-2 cache, and some also have a level-3 cache.
The cache is temporary storage that sits between the CPU and memory. Its capacity is smaller than that of memory, but it is faster. The data in the cache is a small part of what is in memory, but it is the part the CPU is about to access in the near future. When the CPU requests large amounts of data, it can read from the cache instead of going to memory directly, which speeds up reads.

The CPU cache speeds up the CPU's processing of data that is used repeatedly from memory. The data the CPU computes on comes from memory, but memory is much slower than the CPU, so without help the CPU would spend much of its time waiting. A small, high-speed cache is therefore built into the CPU, and data the CPU is likely to use is fetched from memory into it in advance. Thanks to this prediction mechanism, in more than 90% of cases the data the CPU needs is already in the cache and can be obtained quickly, which significantly improves system efficiency.

A hard disk cache speeds up the path between the CPU and the hard disk in the same way. Most of the data in memory comes from the hard disk, and memory is at least dozens of times faster than the disk, so memory waiting on the disk is the same problem as the CPU waiting on memory. The hard disk is therefore given a cache whose speed is close to that of memory, and data that memory is likely to need is read from the disk into this cache in advance. When memory needs data, the hard disk cache is searched first; only if the data is not there is the disk itself read.

The cache is divided into level-1 cache (L1 cache) and level-2 cache (L2 cache). When running, the CPU first reads data from the level-1 cache, then from the level-2 cache, and only then from memory and virtual memory, so the capacity and speed of the cache directly affect CPU performance. The level-1 cache is built into the CPU and runs at the same speed as the CPU, which effectively improves CPU efficiency. The larger the level-1 cache, the more efficient the CPU; however, the internal structure of the CPU limits the level-1 cache to a very small capacity.

The level-2 cache also has a great impact on CPU efficiency. It is now generally integrated with the CPU in one of two ways: on the chip or off the chip. A level-2 cache integrated on the chip runs at the same frequency as the CPU (full-speed level-2 cache), while one outside the chip runs at half the CPU frequency (half-speed level-2 cache) and is therefore less efficient.

So what do these caches actually buy you? To explain it to a dealer you need the plainest possible words, so here is an analogy. Think of driving a car: the trunk is the large but distant storage, and the small box in the armrest is your level-2 cache. The benefit of the level-2 cache is that you can reach into it at any moment while driving. If it is too small, you have to stop the car and fetch things from the trunk.

 

Adding a cache to the CPU is therefore an efficient solution: the internal storage as a whole (cache + memory) becomes a storage system that has both the high speed of the cache and the large capacity of memory. The cache has a great impact on CPU performance, mainly because of the order in which the CPU exchanges data and the bandwidth between the CPU and the cache.

The principle of caching is as follows. When the CPU needs to read a piece of data, it looks in the cache first. If the data is there, it is read immediately and sent to the CPU for processing. If it is not, the data is read from memory at a relatively slow speed and sent to the CPU, and at the same time the whole block containing that data is copied into the cache, so that later reads of the same block can be served from the cache without going to memory.
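The lookup just described can be sketched as a toy simulation. This is only an illustration under stated assumptions: the block size, the number of cache lines, the direct-mapped placement, and the class name `ToyCache` are all invented for the example and do not describe any particular CPU.

```python
# Toy model of the lookup described above: check the cache first and,
# on a miss, read from memory and copy the whole surrounding block in.
BLOCK_SIZE = 4   # assumed: addresses per block
NUM_LINES = 8    # assumed: number of blocks the cache can hold


class ToyCache:
    def __init__(self, memory):
        self.memory = memory   # backing store: a list indexed by address
        self.lines = {}        # line index -> (block number, block data)
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block = address // BLOCK_SIZE
        index = block % NUM_LINES          # direct-mapped placement (assumption)
        cached = self.lines.get(index)
        if cached is not None and cached[0] == block:
            self.hits += 1                 # hit: serve the read from the cache
        else:
            self.misses += 1               # miss: fetch the whole block from memory
            start = block * BLOCK_SIZE
            cached = (block, self.memory[start:start + BLOCK_SIZE])
            self.lines[index] = cached
        return cached[1][address % BLOCK_SIZE]


memory = list(range(100, 164))               # 64 made-up values
cache = ToyCache(memory)
values = [cache.read(a) for a in range(16)]  # sixteen sequential reads
print(cache.hits, cache.misses)
```

Sixteen sequential reads touch four blocks, so only the first read of each block goes to memory and the other twelve are hits.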

This reading mechanism gives the CPU a very high cache hit rate (about 90% on most CPUs). In other words, about 90% of the data the CPU reads next is already in the cache, and only about 10% has to be read from memory. This greatly reduces the time the CPU spends reading memory directly, so the CPU rarely has to wait for data. In general, the CPU reads the cache first and memory second.
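To see why a 90% hit rate matters so much, a rough weighted average of the two latencies is enough. The sketch below is a simplification, and the latency figures (`CACHE_NS`, `MEMORY_NS`) are hypothetical values chosen only to make the arithmetic visible; they are not measurements of any real processor.

```python
def average_access_time(hit_rate, cache_ns, memory_ns):
    """Weighted average of cache and memory latency for a given hit rate."""
    return hit_rate * cache_ns + (1.0 - hit_rate) * memory_ns


CACHE_NS = 2.0     # hypothetical cache latency
MEMORY_NS = 60.0   # hypothetical memory latency

for rate in (0.0, 0.5, 0.9, 0.99):
    t = average_access_time(rate, CACHE_NS, MEMORY_NS)
    print(f"hit rate {rate:.0%}: about {t:.1f} ns per read")
```

With these made-up latencies, a 90% hit rate brings the average read to roughly 7.8 ns, against 60 ns with no cache at all.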

The earliest CPU caches were a single unit with very low capacity. Intel began dividing the cache into levels in the Pentium era. At that time, the cache integrated in the CPU core was no longer enough to meet the CPU's needs, and manufacturing constraints prevented its capacity from being increased much. The cache integrated with the CPU core was therefore called the level-1 cache, and the external cache was called the level-2 cache. The level-1 cache is further divided into a data cache (D-Cache) and an instruction cache (I-Cache). They store data and the instructions that operate on that data, respectively, and the CPU can access both at the same time, which reduces conflicts caused by contention for the cache and improves processor performance. When Intel launched the Pentium 4, it replaced the level-1 instruction cache with a new level-1 trace cache with a capacity of 12 K μops, meaning it can store 12 K micro-operations.

As CPU manufacturing processes have advanced, the level-2 cache can easily be integrated into the CPU core, and its capacity has grown year by year. It is therefore no longer accurate to define level-1 and level-2 caches by whether they are integrated in the CPU. Integration has also closed the large frequency gap between the level-2 cache and the CPU: the level-2 cache now runs at the same clock speed as the core and can supply data to the CPU at a higher rate.

The level-2 cache is one of the keys to CPU performance. Without any change to the CPU core, increasing the level-2 cache capacity can improve performance considerably. High-end and low-end CPUs built on the same core often differ mainly in their level-2 cache, which shows how important the level-2 cache is to a CPU.

When the CPU finds the data it needs in the cache, this is called a hit. When the cache does not contain the data the CPU needs (a miss), the CPU accesses memory. In theory, in a CPU with a level-2 cache, the hit rate for reads from the level-1 cache is 80%. That is, the useful data found in the level-1 cache accounts for 80% of all data read, and the remaining 20% is looked up in the level-2 cache. Because the data that will be needed cannot be predicted exactly, the hit rate of the level-2 cache is also around 80%, so about 16% of all data is read from the level-2 cache. The rest still has to be fetched from memory, but that is already a very small proportion. Current high-end CPUs also have a level-3 cache, which is designed to catch reads that miss in the level-2 cache. In a CPU with a level-3 cache, only about 5% of data needs to be fetched from memory, which further improves CPU efficiency.
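The arithmetic behind the 80% and 16% figures can be written out as a short calculation. The function name `served_fractions` is made up for this sketch, and the hit rates are the theoretical ones quoted in the text, not measured values.

```python
from fractions import Fraction


def served_fractions(hit_rates):
    """Share of all reads served by each cache level, followed by memory."""
    shares = []
    remaining = 1                      # fraction of reads not yet served
    for rate in hit_rates:
        shares.append(remaining * rate)
        remaining = remaining * (1 - rate)
    shares.append(remaining)           # whatever is left goes to memory
    return shares


eighty_percent = Fraction(4, 5)
l1, l2, mem = served_fractions([eighty_percent, eighty_percent])
print(f"L1 {float(l1):.0%}, L2 {float(l2):.0%}, memory {float(mem):.0%}")
```

Using exact fractions avoids rounding noise: with an 80% hit rate at both levels, the level-1 cache serves 4/5 of reads, the level-2 cache 4/25, and memory the remaining 1/25.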

To keep the hit rate high, the contents of the cache must be replaced according to some algorithm. A commonly used one is the least recently used (LRU) algorithm, which evicts the line that has gone longest without being accessed. This requires a counter for each line. On a hit, LRU resets the counter of the line that was hit to zero and adds 1 to the counters of all other lines. When a replacement is needed, the line with the largest counter value is evicted. This is an efficient and well-founded algorithm: because counters are reset only on access, data that was used heavily for a while but is no longer needed ages out of the cache, which improves cache utilization.
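The counter scheme can be sketched in a few lines. This is a minimal software illustration, not a description of real cache hardware: the class name `CounterLRU`, the use of a dictionary, and the choice to age the other lines on a miss as well as on a hit are assumptions made for the example.

```python
class CounterLRU:
    """Fully associative toy cache with one age counter per line."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.age = {}   # tag -> counter value

    def access(self, tag):
        """Touch `tag`; return the evicted tag, or None if nothing was evicted."""
        evicted = None
        if tag not in self.age and len(self.age) >= self.capacity:
            # Largest counter = least recently used line.
            evicted = max(self.age, key=self.age.get)
            del self.age[evicted]
        for other in self.age:
            self.age[other] += 1   # every line already present grows older
        self.age[tag] = 0          # the accessed line is reset to zero
        return evicted


cache = CounterLRU(capacity=2)
for tag in ["A", "B", "A", "C"]:
    print(tag, "-> evicted:", cache.access(tag))
```

In this run, "B" is the line evicted when "C" arrives, because "A" was touched more recently and so has the smaller counter.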

In CPU products, the level-1 cache capacity is generally between 4 KB and 64 KB, and the level-2 cache capacity comes in sizes such as 128 KB, 256 KB, 512 KB, 1 MB, and 2 MB. Level-1 cache capacity varies little between products, while level-2 cache capacity is the key to improving CPU performance. How far the level-2 cache can grow is determined by the CPU manufacturing process: more capacity inevitably means more transistors in the CPU, and fitting a larger cache into a limited die area demands a more advanced process.
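A back-of-the-envelope estimate shows how quickly the transistor count grows with capacity. The sketch assumes the classic six-transistor SRAM cell for every stored bit and counts only the data array, ignoring tags, error correction, and control logic, so the results are rough lower bounds rather than figures for any real product.

```python
TRANSISTORS_PER_BIT = 6   # assumption: classic six-transistor SRAM cell


def data_array_transistors(capacity_kb):
    """Transistors needed just to hold `capacity_kb` KB of cached data."""
    bits = capacity_kb * 1024 * 8
    return bits * TRANSISTORS_PER_BIT


for kb in (128, 256, 512, 1024, 2048):
    millions = data_array_transistors(kb) / 1e6
    print(f"{kb:>5} KB -> about {millions:.1f} million transistors")
```

Under this assumption, each doubling of cache capacity doubles the transistor budget, which is why capacity is tied so closely to the manufacturing process.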

The CPU's level-2 cache is enabled by default. The widely repeated advice that it has to be switched on by changing the corresponding key value in the registry is a misconception; if the level-2 cache really were disabled, the computer would be dramatically slower.
The tip usually goes as follows. It claims that in Windows XP the CPU's level-2 cache is not enabled by default and that, to improve system performance, it can be enabled by editing the registry or with a tool such as "Windows Optimization Master".
Run the Registry Editor, expand the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management branch, double-click "SecondLevelDataCache" in the right-hand pane, and in the dialog that appears enter the level-2 cache capacity of the CPU in the current computer.
The Celeron has 128 KB of level-2 cache, so the value should be set to 80 (hexadecimal, as are the values below). The P II, P III, and P4 have 512 KB of level-2 cache and should be set to 200; the P III E (EB) and P4 Willamette have only 256 KB and should be set to 100; the AMD Duron has only 64 KB and should be set to 40. For the K6-3, Athlon, Athlon XP, and Athlon XP (Barton core), enter the level-2 cache capacity in KB, converted to hexadecimal, in the same way.
Windows Optimization Master can also be used to set the value: start the program and select "system performance optimization"; under "file system optimization", the topmost item is the setting for the CPU's level-2 cache. Drag the slider to the corresponding position, save the settings, and restart the computer.
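The hexadecimal values quoted above are simply the cache capacity in KB written in base 16, which a few lines of code can confirm. The helper name `registry_hex` is invented for this sketch; it performs only the number conversion and does not touch the registry.

```python
def registry_hex(capacity_kb):
    """Capacity in KB written as the hexadecimal digits typed into the dialog."""
    return format(capacity_kb, "X")


for kb in (64, 128, 256, 512):
    print(f"{kb} KB -> {registry_hex(kb)}")
```

So 64 KB corresponds to 40, 128 KB to 80, 256 KB to 100, and 512 KB to 200.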

First, let's look at the level-1 cache. At present, most mainstream processors have a level-1 cache and a level-2 cache, and a few high-end processors also integrate a level-3 cache. The level-1 cache is divided into a level-1 instruction cache and a level-1 data cache. The level-1 instruction cache temporarily stores the instructions to be executed and delivers them to the CPU; the level-1 data cache temporarily stores the data needed for computation and delivers it to the CPU. That is the role of the level-1 cache.

So what is the role of the level-2 cache? In short, the level-2 cache is a buffer for the level-1 cache. The level-1 cache is very expensive to manufacture, so its capacity is limited, and the level-2 cache holds what the CPU needs during processing but the level-1 cache has no room for. In the same way, the level-3 cache and memory can be seen as buffers for the level-2 cache: their capacity increases step by step, while the manufacturing cost per bit decreases. Note that the split between instructions and data exists only at level 1: the instructions the processor is about to execute are fed to it from the level-1 instruction cache, while the level-2 cache, level-3 cache, and memory are not divided in this way.

According to their working principles, the level-1 data caches used by current mainstream processors can be divided into two types: the "real data read/write cache" and the "data code instruction tracking cache", used by AMD and Intel respectively. The two designs place different demands on level-2 cache capacity. Let's look at the differences between them.
I. AMD Level-1 Data Cache Design

AMD's level-1 cache uses the traditional "real data read/write cache" design. A level-1 data cache based on this architecture mainly stores the data the CPU reads first, while the data read after it is stored in the level-2 cache and system memory. As a simple example, suppose the processor needs to read the string "AMD Athlon 64 3000+ is good" (ignoring spaces). The part read first, "AMDAthl", is stored in the level-1 data cache, and the remaining "on643000+isgood" is stored in the level-2 cache and system memory.

Note that this example is only an abstract description of the level-1 data cache in AMD processors. The amount of data held in the level-1 and level-2 caches is determined by their capacities, not by the handful of bytes used in the example. The advantage of the "real data read/write cache" is that reads are direct and fast. Its drawback is that it needs a reasonably large level-1 data cache, which makes the processor harder to manufacture (the level-1 data cache costs more per unit of capacity than the level-2 cache).
II. Intel Level-1 Data Cache Design

Starting with the P4, Intel adopted a new "data code instruction tracking cache" design. A level-1 data cache based on this architecture no longer stores the actual data. Instead, it stores the instruction code for the data held in the level-2 cache (that is, the starting address of the data in the level-2 cache). Suppose the processor needs to read the string "Intel P4 is good" (ignoring spaces). All of the data is stored in the level-2 cache, and the level-1 tracking cache only needs to store its starting address.

Because the level-1 data cache does not store the actual data, the "data code instruction tracking cache" design greatly reduces the level-1 data cache capacity the CPU needs and makes the processor easier to manufacture. Its drawback is that reads are less efficient than with the "real data read/write cache" design, and it depends heavily on level-2 cache capacity.
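The two string examples can be put side by side in a toy model. This is only a sketch of the simplified picture given in this article, not of how real hardware behaves; the 7-byte level-1 capacity, the function names, and the starting address of 0 are all assumptions made for illustration.

```python
L1_BYTES = 7   # assumed toy capacity, matching the 7-character prefix in the example


def real_data_layout(data, l1_bytes=L1_BYTES):
    """'Real data read/write cache' as described: the first bytes sit in level 1."""
    return {"l1": data[:l1_bytes], "l2_and_memory": data[l1_bytes:]}


def tracking_layout(data, l2_start=0):
    """'Tracking cache' as described: level 1 keeps only the start address."""
    return {"l1": l2_start, "l2": data}


amd_text = "AMD Athlon 64 3000+ is good".replace(" ", "")
intel_text = "Intel P4 is good".replace(" ", "")

print(real_data_layout(amd_text))
print(tracking_layout(intel_text))
```

In the first layout the level-1 entry holds real bytes and can answer a read by itself; in the second it holds only a pointer, so every read depends on the level-2 cache, which is the dependence the text describes.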

Now that the general functions and types of level-1 and level-2 caches are clear, let's answer the questions novice hardware users tend to raise.
In theory, the larger the level-2 cache, the better the processor performs. However, doubling the level-2 cache capacity does not double the processor's performance. At present, the vast majority of data processed by the CPU is between 0 and kb, and the size of a small part of data is between kb and KB. Only a small amount of data exceeds kb. Therefore, as long as the level-1 and level-2 cache capacity available to the processor reaches kb or above, it can cope with normal applications. The level-2 cache of KB capacity is sufficient to meet the needs of most applications.

The AMD Athlon 64 and Sempron processors, which use the "real data read/write cache" design, already have a 64 KB level-1 instruction cache and a 64 KB level-1 data cache. As long as the processor's level-2 cache capacity is greater than or equal to kb, enough data and instructions can be stored, so these processors are not highly dependent on the level-2 cache. This is the main reason why the Socket 754 Sempron 3000 + (3100 kb second-level cache), Sempron 2800+ (kb second-level cache), and Athlon 64 + (kb second-level cache) perform very similarly in most reviews. For ordinary users, the Socket 754 Sempron 2600+ is therefore worth considering.

In contrast, Intel's current mainstream P4 and Celeron series processors all use the "data code instruction tracking cache" architecture. The Prescott core has only a 12 K μops level-1 instruction cache and a 16 KB level-1 data cache, while the Northwood core has only a 12 K μops level-1 instruction cache and an 8 KB level-1 data cache. The P4 and Celeron series processors are therefore very dependent on the level-2 cache. The huge performance gap between the Celeron D 320 (kb second-level cache) and the Celeron 2.4 GHz (kb second-level cache) demonstrates this well, and the performance gap between the Celeron D and the P4 E processor is also very obvious.

If you are an avid gamer or a professional user engaged in multimedia production, the P4 processor with 1 MB of level-2 cache and the Athlon 64 processor with KB/1 MB of level-2 cache are your ideal options. Under high-load computing, the CPU's level-1 and level-2 caches are almost always full, and a large level-2 cache can then improve processor performance by about 5% to 10%, which matters a great deal to demanding users.

 


 

The level-1 cache matters most, but level-1 caches are now nearly the same across CPUs, so it can be ignored when comparing them.
The level-2 cache is very important for Intel CPUs: the larger the level-2 cache of an Intel CPU, the higher its performance. For AMD CPUs the level-2 cache is also important, but a larger level-2 cache does not improve performance as noticeably.
The level-3 cache plays only a secondary role. Apart from servers, it does little for home machines, where memory remains very important.

Therefore, the level-2 cache is what is generally used to gauge CPU performance.
