Code performance: data alignment and structure design

Source: Internet
Author: User

Each data type has an alignment associated with it, which is mandated by the processor architecture rather than by the language itself. Aligning data elements allows the processor to fetch data from memory efficiently and thereby improves performance. To provide the best performance, the compiler tries to maintain these alignments for data elements. On 32-bit and 64-bit Linux systems, the typical alignment requirements for data types as used by the Intel C++ compiler are as follows:

Data type      32-bit (bytes)   64-bit (bytes)
char           1                1
short          2                2
int            4                4
long           4                8
float          4                4
double         8                8
long long      8                8
long double    4                16
Any pointer    4                8
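As a rough check of the table above, the following C++ sketch asks the compiler for the alignment it actually uses, via the standard `alignof` operator. The names `TypeAlignment`, `kAlignments` and `print_alignment_table` are illustrative assumptions, and the reported values depend on the target ABI rather than on the language.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper type: pairs a type name with its alignment in bytes.
struct TypeAlignment {
    const char* name;
    std::size_t alignment;
};

// Alignment of each type from the table, as reported for the current target.
constexpr TypeAlignment kAlignments[] = {
    {"char", alignof(char)},
    {"short", alignof(short)},
    {"int", alignof(int)},
    {"long", alignof(long)},
    {"float", alignof(float)},
    {"double", alignof(double)},
    {"long long", alignof(long long)},
    {"long double", alignof(long double)},
    {"any pointer", alignof(void*)},
};

// Prints one "name alignment" row per type.
inline void print_alignment_table() {
    for (const TypeAlignment& row : kAlignments) {
        std::printf("%-12s %zu\n", row.name, row.alignment);
    }
}
```

Calling `print_alignment_table()` on a given machine shows whether that platform follows the typical values listed in the table.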

In general, the compiler meets these alignment requirements for data elements whenever possible. With the Intel C++ and Fortran compilers, you can use the -align compiler switch (C/C++, Fortran) to enforce or disable natural alignment. For structures, which typically contain data elements of different types, the compiler tries to keep each element properly aligned by inserting unused memory between elements. This technique is called "padding". In addition, the compiler aligns the entire structure to the alignment of its most strictly aligned member. The compiler may also increase the size of the structure, when necessary, to make it a multiple of that alignment by adding padding at the end of the structure. This is called "tail padding". As a result, performance improves at the cost of memory wasted on padding. On the Intel Xeon Phi coprocessor, where the amount of memory available to the application is itself limited, this can be a serious problem.
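Padding and tail padding can be observed directly with `offsetof` and `sizeof`. The sketch below uses a hypothetical structure `Sample` (its name and members are assumptions made for illustration) and assumes a typical ABI in which `int` is 4 bytes wide and 4-byte aligned.

```cpp
#include <cstddef>

// Hypothetical structure: a 1-byte member, a 4-byte member, a 1-byte member.
struct Sample {
    char tag;    // offset 0
    int value;   // must start on a 4-byte boundary
    char flag;   // placed directly after value
};

// Bytes of padding inserted between tag and value.
constexpr std::size_t internal_padding() {
    return offsetof(Sample, value) - (offsetof(Sample, tag) + sizeof(char));
}

// Bytes of tail padding added after flag so that the structure size
// is a multiple of the structure's alignment.
constexpr std::size_t tail_padding() {
    return sizeof(Sample) - (offsetof(Sample, flag) + sizeof(char));
}
```

Under that assumption both functions return 3: six of the twelve bytes in `Sample` carry no data.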

Best Design: Minimize memory waste

Developers can minimize this wasted memory by ordering the structure's elements so that the widest (largest) element comes first, followed by the second widest, and so on. The example structures S1 and S2 illustrate how the ordering of data elements affects the size of a structure.

With its members declared in no particular order, the structure S1 has 11 padding bytes.

Now consider the structure S2, which holds the same members ordered from widest to narrowest.

This structure contains only 3 bytes of tail padding.

This saves memory. Simply rearranging the data elements in the structure definition is therefore enough to avoid much of the wasted memory.
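The following sketch shows one possible pair of structures whose padding matches the counts quoted above (11 bytes and 3 bytes) on a typical 64-bit Linux ABI. The member names and their order are assumptions made for illustration, as are the helper names `kPayloadBytes`, `padding_bytes_s1` and `padding_bytes_s2`.

```cpp
#include <cstddef>

// Assumed layout: members declared in no particular order.
struct S1 {
    char a;     // 1 byte, then 1 byte of padding
    short a1;   // 2 bytes
    char b1;    // 1 byte, then 3 bytes of padding
    float b;    // 4 bytes
    int c;      // 4 bytes
    char e;     // 1 byte, then 7 bytes of padding
    double f;   // 8 bytes
};

// The same members ordered from widest to narrowest.
struct S2 {
    double f;        // 8 bytes
    float b;         // 4 bytes
    int c;           // 4 bytes
    short a1;        // 2 bytes
    char a, b1, e;   // 3 bytes, then 3 bytes of tail padding
};

// Sum of the member sizes, identical for both layouts.
constexpr std::size_t kPayloadBytes =
    sizeof(double) + sizeof(float) + sizeof(int) + sizeof(short) + 3 * sizeof(char);

// Padding is whatever the structure occupies beyond its members.
constexpr std::size_t padding_bytes_s1() { return sizeof(S1) - kPayloadBytes; }
constexpr std::size_t padding_bytes_s2() { return sizeof(S2) - kPayloadBytes; }
```

On such a platform `sizeof(S1)` is 32 and `sizeof(S2)` is 24, so reordering alone saves 8 bytes per instance.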

Best Design: Touch only a few elements at a time

One exception to this ordering rule applies when your structure is larger than a cache line (64 bytes on the Intel Xeon Phi coprocessor) and some loop or kernel touches only part of the structure. In this case, it may be beneficial to keep the parts of the structure that are touched together adjacent in memory, which may improve cache locality.

Best Design: Decompose larger structures

If your structure is larger than a cache line and some loop or kernel touches only part of it, consider splitting the large structure into several smaller structures that are stored as separate arrays. This potentially increases the density of the data that is touched and thereby improves cache locality.
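A minimal sketch of such a split is shown below. All names (`Particle`, `ParticlePosition`, `ParticleInfo`, `ParticleStore`, `add_particle`, `sum_x`) are hypothetical, and the 64-byte cache line is the figure quoted for the Intel Xeon Phi coprocessor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record that is larger than a 64-byte cache line.
struct Particle {
    double x, y, z;   // touched by the hot loop
    double mass;      // rarely touched
    char label[64];   // rarely touched
};

// The same data split into two smaller structures.
struct ParticlePosition { double x, y, z; };             // hot part
struct ParticleInfo { double mass; char label[64]; };    // cold part

// Each part is stored in its own array; index k in both arrays
// refers to the same logical particle.
struct ParticleStore {
    std::vector<ParticlePosition> positions;
    std::vector<ParticleInfo> info;
};

inline void add_particle(ParticleStore& s, double x, double y, double z, double mass) {
    s.positions.push_back(ParticlePosition{x, y, z});
    s.info.push_back(ParticleInfo{mass, {}});
    assert(s.positions.size() == s.info.size());  // arrays stay in step
}

// A loop that reads only positions now streams through densely packed
// 24-byte elements instead of striding over 96-byte records.
inline double sum_x(const ParticleStore& s) {
    double total = 0.0;
    for (const ParticlePosition& p : s.positions) total += p.x;
    return total;
}
```

The trade-off is that code which needs both parts of a particle must now index two arrays, so the split pays off mainly when the hot loops dominate.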

Best Design: Force alignment of specific elements

You can also use the __declspec(align(n)) attribute to instruct the compiler to align data more strictly than it otherwise would. The syntax of this extended attribute is as follows:

C++:

__declspec(align(n)) <data type declaration>

Fortran:

cDEC$ ATTRIBUTES ALIGN: n :: <data type declaration>

Here n is the requested alignment, which must be a power of 2, up to a maximum of 4096 for the Intel C++ compiler and 16384 for the Intel Fortran compiler. You can use this attribute to request alignment for individual variables, whether of static or automatic storage duration, and for structures. Note, however, that although the attribute can increase the alignment of a structure, it does not adjust the alignment of the elements within the structure. By placing __declspec(align(n)) in front of the keyword struct, you request the appropriate alignment only for objects of that type. As an example, consider a structure S2 that contains a char a2 and an int b2, is declared with __declspec(align(32)), and is used inside another structure S1.

In this example, the alignment of the char a2 and of the int b2 remains 1 byte and 4 bytes respectively, which is the default. However, each instance of struct S2 is aligned to a 32-byte boundary, as requested in the __declspec declaration. Therefore, each instance of S2 inside the structure S1 is aligned to a 32-byte boundary.
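The behaviour described above can be sketched with the standard C++11 `alignas` specifier, used here as a portable stand-in for `__declspec(align(32))`. The members `a1` and `items` of S1 are assumptions made for illustration.

```cpp
#include <cstddef>

// alignas(32) stands in for __declspec(align(32)) in this sketch.
struct alignas(32) S2 {
    char a2;   // still 1-byte aligned inside the structure
    int b2;    // still 4-byte aligned inside the structure
};

// Hypothetical enclosing structure.
struct S1 {
    int a1;
    S2 items[2];   // every element starts on a 32-byte boundary
};
```

Because the size of a type is always a multiple of its alignment, `sizeof(S2)` grows to 32 bytes, and S1 as a whole inherits the 32-byte alignment of its strictest member.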

Best Design: Dynamically allocate aligned memory

We can extend this example further by dynamically allocating an array, arr1, of structure S2.

In this case, you still need to use _mm_malloc or an equivalent Portable Operating System Interface (POSIX) function to assign aligned memory to the pointer, but by using __declspec(align(32)) on the struct you ensure that each element of the array arr1 is aligned to 32 bytes.
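As a sketch of the POSIX route, the code below allocates the array with `posix_memalign`. It again uses `alignas(32)` in place of `__declspec(align(32))`, and the helper names `allocate_s2_array` and `is_aligned` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // posix_memalign, free (POSIX)

struct alignas(32) S2 {
    char a2;
    int b2;
};

// True when p is a multiple of alignment.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Allocates storage for count S2 objects starting on a 32-byte boundary.
// Returns nullptr on failure; release the result with free().
inline S2* allocate_s2_array(std::size_t count) {
    void* raw = nullptr;
    if (posix_memalign(&raw, alignof(S2), count * sizeof(S2)) != 0) {
        return nullptr;
    }
    assert(is_aligned(raw, alignof(S2)));
    return static_cast<S2*>(raw);
}
```

The allocator only guarantees the alignment of the first element. Every later element is aligned because `sizeof(S2)` is itself a multiple of 32.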

Best Design: Use align(n) and structs to enforce cache locality for small data elements

You can also use this data alignment support to your advantage when optimizing for cache-line usage. By grouping small objects that are commonly used together into a struct, and forcing the struct to be allocated at the beginning of a cache line, you can effectively guarantee that all of the objects are loaded into the cache as soon as any one of them is accessed, which can yield a noticeable performance benefit. For example, consider two frequently accessed variables i and j, which might otherwise be allocated in different cache lines. You can declare them together as members of a single struct to which align(n) is applied.

Declared in this way, the compiler can ensure that both variables are allocated in the same cache line.
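A minimal sketch of this grouping follows. It uses `alignas` rather than `__declspec(align(n))`, assumes a 64-byte cache line, and the names `kCacheLineBytes`, `HotPair` and `fits_in_one_cache_line` are hypothetical.

```cpp
#include <cstddef>

// Assumed cache-line size in bytes; adjust for the target processor.
constexpr std::size_t kCacheLineBytes = 64;

// Two small, frequently used variables grouped into one structure
// that always starts at the beginning of a cache line.
struct alignas(kCacheLineBytes) HotPair {
    int i;
    int j;
};

// True when the structure starts on a cache-line boundary and its last
// member ends before the line does, so i and j share one cache line.
constexpr bool fits_in_one_cache_line() {
    return alignof(HotPair) == kCacheLineBytes &&
           offsetof(HotPair, j) + sizeof(int) <= kCacheLineBytes;
}
```

The cost is that each `HotPair` now occupies a full cache line, so this pattern suits a handful of hot variables rather than large arrays.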
