This article may be freely reprinted, but please retain this paragraph and keep it at the top of the article when doing so.
Lu Junyi (Cenalulu)
This text address: http://cenalulu.github.io/linux/all-about-cpu-cache/
Let's take a look at a mind map of all the concepts in this article.
Why do I need a CPU Cache
Over recent decades, CPU clock speeds have risen dramatically with advances in manufacturing processes, while main memory, which is mostly DRAM and constrained by process and cost, has seen no comparable breakthrough in access speed. As a result, the gap between CPU processing speed and memory access speed keeps widening, in some respects to tens of thousands of times. Under these conditions, the traditional approach of attaching memory directly over the FSB leaves the CPU stalled waiting on memory, wasting large amounts of compute capacity and lowering overall CPU throughput. At the same time, because memory accesses cluster around hot spots, inserting a layer of faster but more expensive SRAM between the CPU and memory as a cache turns out to be very cost-effective.
Why do I need a multi-level CPU Cache
As technology has developed, the volume of hot data has grown, and simply enlarging the first-level cache yields very poor value for money. So a second-level cache (L2 Cache) was added between the first-level cache (L1 Cache) and main memory, with access speed and cost in between the two. Here is an excerpt of the explanation from What Every Programmer Should Know About Memory:
Soon after the introduction of the cache, the system got more complicated. The speed difference between the cache and the main memory increased again, to a point that another level of cache was added, bigger and slower than the first-level cache. Only increasing the size of the first-level cache was not an option for economical reasons.
In addition, because program instructions and program data differ in access behavior and hotspot distribution, the L1 Cache is further split into two special-purpose caches: L1i (i for instruction) and L1d (d for data).
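On Linux, one way to see this split for yourself is the kernel's sysfs cache interface. The sketch below assumes the standard /sys/devices/system/cpu/.../cache layout, which may be absent in some VMs or containers:

```shell
# List every cache visible to CPU 0: level, type (Data/Instruction/Unified),
# total size, and cache line size. Paths follow the standard sysfs layout.
for d in /sys/devices/system/cpu/cpu0/cache/index*; do
    echo "L$(cat "$d/level") $(cat "$d/type"): $(cat "$d/size"), line size $(cat "$d/coherency_line_size")B"
done
```

On a typical x86 machine this prints separate L1 Data and L1 Instruction entries, plus unified L2 (and often L3) caches.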
The following diagram shows the response-time gap between the cache levels, and just how slow main memory is by comparison!
What is a Cache Line
A Cache Line can be understood simply as the smallest unit of caching in a CPU Cache. On current mainstream CPUs the Cache Line size is 64 bytes. Suppose we have a 512-byte first-level cache; with a 64B cache-unit size, the number of cache lines this first-level cache can hold is 512/64 = 8. For details, see the figure below:
To get a better feel for the Cache Line, we can run the following interesting experiment on our own computer.
The C code below takes a number N from the command line and creates an array of N longs. It then accesses the elements of this array sequentially, in order, looping one billion times in total. Finally, it prints the total execution time for that array size.
```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Elapsed CPU time between two clock() readings, in milliseconds. */
long timediff(clock_t t1, clock_t t2) {
    long elapsed;
    elapsed = ((double)t2 - t1) / CLOCKS_PER_SEC * 1000;
    return elapsed;
}

int main(int argc, char *argv[]) {
    int array_size = atoi(argv[1]);
    int repeat_times = 1000000000;
    long array[array_size];
    for (int i = 0; i < array_size; i++) {
        array[i] = 0;
    }
    int j = 0;
    int k = 0;
    int c = 0;
    clock_t start = clock();
    while (j++ < repeat_times) {
        if (k == array_size) {
            k = 0;
        }
        c = array[k++];
    }
    clock_t end = clock();
    printf("%ld\n", timediff(start, end));
    return 0;
}
```
If we plot these measurements as a line chart, we find a fairly clear inflection point in total execution time once the array grows past 64 bytes (there is some fluctuation, of course, since the author ran the test on a Mac notebook with interference from many other programs). The reason is that when the array is smaller than 64 bytes, it is very likely to fall entirely within a single Cache Line, and accessing any one element causes the whole Cache Line to be filled, so a number of subsequent elements benefit from the cache at no extra cost. Once the array exceeds 64 bytes, at least two Cache Lines are needed, and hence two cache fills per pass through the array; because a cache fill takes much longer than a cached data access, the cost of the extra fills is magnified over the many iterations, which shows up as the jump in total execution time.
How does the concept of the Cache Line help us programmers?
Let's look at a loop optimization commonly used in C. Of the two snippets below, the first always runs faster than the second in C. Having read the introduction to Cache Lines above carefully, the reason should be easy to see.
```c
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        int num;
        // code
        arr[i][j] = num;
    }
}
```
```c
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        int num;
        // code
        arr[j][i] = num;
    }
}
```
About the CPU Cache: what every programmer needs to know.