[Programming Skills] Using the CPU Cache to Optimize Code: Arrays vs. Linked Lists

Source: Internet
Author: User

A common programming question: which is faster to traverse, an array or a linked list of the same size? Algorithm analysis says the two are equally fast, since both traversals have O(n) time complexity. In practice, however, the two differ greatly. The analysis below shows that traversing an array is much faster than traversing a linked list.


First, we introduce a concept: the memory hierarchy (storage hierarchy). A computer has several kinds of memory, as listed below.

    • CPU registers - immediate access (0-1 CPU clock cycles)
    • CPU L1 cache - fast access (3 CPU clock cycles)
    • CPU L2 cache - slightly slower access (10 CPU clock cycles)
    • Memory (RAM) - slow access (100 CPU clock cycles)
    • Hard disk (file system) - very slow access (10,000,000 CPU clock cycles)

(Data from http://www.answers.com/topic/locality-of-reference)


Speed varies greatly between levels: by the figures above, CPU registers are about 100 times faster than main memory! This gap is why CPU vendors introduced the CPU cache, and the cache is the key to the difference between arrays and linked lists.


The CPU cache loads memory in contiguous blocks. Because an array occupies sequential memory addresses, all or part of the array ends up in the CPU cache while it is traversed, so reading each element takes only about 3 CPU clock cycles on average. Linked list nodes, by contrast, are scattered across the heap, so the cache helps little: most reads have to go to main memory, at about 100 CPU clock cycles each. In this case, array access is roughly 33 times faster than linked list access (100 / 3 ≈ 33)! (This only illustrates the concept. The specific numbers vary by CPU.)


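Therefore, programs should make full use of the CPU cache. Algorithms designed to use the cache well without being tuned to a particular cache size are called cache-oblivious algorithms; if you are interested, refer to the relevant literature. Here is another simple example, two ways of ordering the loops of a matrix multiplication: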



    for i in 0 .. n
        for j in 0 .. m
            for k in 0 .. p
                C[i][j] = C[i][j] + A[i][k] * B[k][j];

    for i in 0 .. n
        for k in 0 .. p
            for j in 0 .. m
                C[i][j] = C[i][j] + A[i][k] * B[k][j];


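Although both produce the same result and have the same algorithmic complexity, you will find that the second method is much faster. Assuming row-major storage, its inner loop walks C[i] and B[k] along consecutive addresses, whereas the inner loop of the first method jumps down a column of B on every step.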


To sum up, speeds differ enormously across the memory hierarchy, so it is well worth considering this factor when programming. For example, by the figures above main memory is about 100,000 times faster than the hard disk, so programs should avoid frequent disk reads and writes; the CPU cache is dozens of times faster than main memory, so programs should make as much use of the cache as possible.



> Original article. The copyright belongs to the author. For more information, see the source and author information (http://blog.csdn.net/WinGeek/). Thank you. <





