Linux Kernel Study Notes: CPU cache line alignment

Source: Internet
Author: User
A CPU's cache is generally divided into a level-1 (L1) cache and a level-2 (L2) cache, and many CPUs now also provide a level-3 (L3) cache. At run time, the CPU first tries to read data from the L1 cache; on a miss it falls back to the L2 cache, and on a further miss it reads from main memory. The difference in access latency, measured in clock cycles, between the L1 cache, the L2 cache, and main memory is very large, so cache capacity and speed directly affect CPU performance. The L1 cache is built into the CPU and runs at the CPU's own clock speed, which makes it very effective at keeping the CPU fed with data; the larger the L1 cache, the higher the CPU's running efficiency.

The L1 cache is split into a data cache and an instruction cache, each composed of cache lines. On the x86 CPUs discussed here a cache line is 32 bytes (modern processors typically use 64-byte lines). Early CPUs had only about 512 cache lines, i.e. roughly 16 KB of L1 cache; current CPUs generally have 32 KB or more.

When the CPU reads a variable, the entire 32-byte block of memory containing that variable is read into a cache line together. For programs with strict performance requirements, it is therefore important to take full advantage of cache lines: align frequently accessed data on a 32-byte boundary so that it is brought into the cache in a single line fill, reducing data traffic between the CPU's caches and main memory.

However, on computers with multiple CPUs the situation is different. For example:

1. CPU1 reads a byte; that byte and its adjacent bytes are read into CPU1's cache line.

2. CPU2 does the same, so the caches of CPU1 and CPU2 now hold the same data.

3. CPU1 modifies the byte. The modified byte is put back into CPU1's cache line, but the change is not written to RAM.

4. CPU2 then accesses the byte; because CPU1 has not written the data back to RAM, the two copies are out of sync.

When a CPU modifies bytes in a cache line, the other CPUs in the machine are notified and must treat their copies of that line as invalid. So in the scenario above, CPU2 finds that the data in its cache is invalid; CPU1 immediately writes its line back to RAM, and CPU2 then re-reads the data. Cache lines can therefore become a performance liability on multiprocessor systems.

From this we can see that when designing a data structure, we should try to separate read-only data from read/write data, and to group data that is accessed together. That way the CPU can fetch all of the data it needs in a single line fill.

 

For example:

struct _
{
    int id;           // rarely changes
    int factor;       // changes frequently
    char name[64];    // rarely changes
    int value;        // changes frequently
};

Such a layout is very unfavorable: read-mostly and frequently written fields are interleaved, so they end up sharing cache lines.

On x86, it can be rearranged and padded as follows:

struct _
{
    int id;           // rarely changes
    char name[64];    // rarely changes
    char _align1[32 - (sizeof(int) + 64) % 32];
    int factor;       // changes frequently
    int value;        // changes frequently
    char _align2[32 - (2 * sizeof(int)) % 32];
};

 

In the padding expression 32 - (sizeof(int) + 64) % 32, the constant 32 is the cache-line size of the x86 architecture described here, and 64 is the size of the name array. The _align members perform explicit alignment, so that the read-mostly fields and the frequently written fields fall on separate cache lines.
