I. What is memory alignment, and why is it required?
Memory in modern computers is addressed byte by byte. In theory, a variable of any type could start at any address. In practice, a variable of a given type is usually accessed at specific memory addresses, which requires each type of data to be arranged in memory according to certain rules rather than simply placed one after another. That is alignment.
On x86 processors, words, doublewords, and quadwords do not need to be aligned in memory on natural boundaries. (The natural boundaries for words, doublewords, and quadwords are even addresses, addresses divisible by 4, and addresses divisible by 8, respectively.)
Nevertheless, to improve program performance, data structures (especially stacks) should be aligned on natural boundaries whenever possible. The reason is that the processor needs two memory accesses to read or write unaligned memory, whereas an aligned access requires only one.
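Here is a minimal sketch of how one might check whether an address sits on a natural boundary, and how to read a value from an address that may be misaligned. The helper names is_aligned and load_u32 are made up for illustration, and the check assumes that converting a pointer to uintptr_t yields the plain address, which holds on mainstream platforms.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if p is a multiple of alignment.
   alignment is assumed to be a power of two. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical helper: read a 32-bit value from any address.
   memcpy avoids dereferencing a misaligned uint32_t pointer. */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Going through memcpy leaves it to the compiler to pick whatever access sequence is safe on the target, so the same code works whether or not the hardware tolerates unaligned loads.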
Some instructions that operate on double quadwords require their memory operands to be aligned on a natural boundary. If an operand is not aligned, these instructions generate a general-protection exception. The natural boundary for a double quadword is any address divisible by 16. Other instructions that operate on double quadwords permit unaligned access (without a general-protection exception); however, additional memory bus cycles are required to access misaligned data in memory.
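As a rough illustration of a 16-byte natural boundary, the sketch below declares a buffer with the C11 _Alignas specifier and checks its address. It does not use any vector instructions; the names vec4 and on_16_byte_boundary are illustrative only.

```c
#include <stdint.h>

/* Hypothetical buffer of four floats placed on a 16-byte boundary.
   _Alignas is the C11 alignment specifier. */
_Alignas(16) float vec4[4] = { 1.0f, 2.0f, 3.0f, 4.0f };

/* Nonzero if p lies on a 16-byte boundary. */
int on_16_byte_boundary(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}
```

With 4-byte floats, vec4 itself passes the check while &vec4[1] does not, which is the kind of operand an alignment-requiring instruction would reject.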
In the jargon, the basic C types on x86 and ARM are self-aligned: a value of each type is stored at an address that is a multiple of the type's size. Pointers, whether 32-bit (4 bytes) or 64-bit (8 bytes), are self-aligned too.
Self-alignment makes access faster because a value of that type can be read or written with a single instruction. Without alignment constraints, on the other hand, the code might need two or more accesses when a value spans a machine-word boundary. Characters are a special case: wherever they sit inside a machine word, the access cost is the same, so they have no alignment requirement.
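To see what your own compiler considers the alignment of each basic type, a small table like the following can help. This is only a sketch: the struct type_info and the table name basic_types are invented here, and the values depend on the compiler and target.

```c
#include <stddef.h>

/* One row per basic type: its size and its alignment requirement,
   both as reported by the compiler in use (C11 _Alignof). */
struct type_info {
    const char *name;
    size_t size;
    size_t align;
};

const struct type_info basic_types[] = {
    { "char",   sizeof(char),   _Alignof(char)   },
    { "short",  sizeof(short),  _Alignof(short)  },
    { "int",    sizeof(int),    _Alignof(int)    },
    { "long",   sizeof(long),   _Alignof(long)   },
    { "double", sizeof(double), _Alignof(double) },
    { "void *", sizeof(void *), _Alignof(void *) },
};

const size_t basic_type_count = sizeof basic_types / sizeof basic_types[0];
```

Looping over the table and printing each row shows that char has alignment 1, and that every size is a multiple of the corresponding alignment.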
II. Alignment rules
The compiler on each platform has its own default "alignment factor" (also called the alignment modulus). Programmers can change this factor with the preprocessor directive #pragma pack(n), where n is the alignment factor you want to specify.
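A small sketch of the directive in use. The push and pop forms, which MSVC, GCC, and Clang all accept, limit the change to a single declaration; the struct names natural and packed1 are illustrative.

```c
#include <stddef.h>

/* Default packing: the compiler pads i to its natural alignment. */
struct natural {
    char c;
    int i;
};

/* Pack value 1: members are laid out back to back.
   push/pop restores the previous setting afterwards. */
#pragma pack(push, 1)
struct packed1 {
    char c;
    int i;
};
#pragma pack(pop)
```

Here offsetof(struct packed1, i) is 1, while in struct natural the same member starts at a multiple of the alignment of int. Accessing packed1.i may be slower, or on some processors require special handling by the compiler, which is the trade-off for the saved space.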
Rules:
(1) Data member alignment: in a struct (or union), the first data member is placed at offset 0. Each subsequent data member is aligned to the smaller of the value specified by #pragma pack and the size of the member itself.
(2) Overall alignment of the structure (or union): after the data members have been aligned, the structure (or union) itself is aligned to the smaller of the value specified by #pragma pack and the size of its largest data member.
(3) When the value n of #pragma pack(n) equals or exceeds the size of every data member, n has no effect.
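The three rules can be sketched as a small calculation. This is an illustration under a simplifying assumption, namely that every member is a scalar whose natural alignment equals its size; the function names are invented for this example.

```c
#include <stddef.h>

/* Rule (1): effective alignment = min(pack value, member size). */
size_t effective_align(size_t member_size, size_t pack)
{
    return member_size < pack ? member_size : pack;
}

/* Round offset up to the next multiple of align. */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/* Apply rules (1) and (2) to a list of scalar member sizes and
   return the resulting struct size. */
size_t struct_size(const size_t *sizes, size_t count, size_t pack)
{
    size_t offset = 0, widest = 1;
    for (size_t i = 0; i < count; i++) {
        size_t a = effective_align(sizes[i], pack);
        offset = align_up(offset, a) + sizes[i];
        if (a > widest)
            widest = a;     /* ends up as min(pack, largest member) */
    }
    return align_up(offset, widest);
}
```

For members of sizes 1, 4, 1 (a char, an int, a char), the function gives 12 with a pack value of 8 or 4, 8 with a pack value of 2, and 6 with a pack value of 1. The identical result for 8 and 4 is rule (3) at work.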
III. Padding
Now let's look at a simple example of how variables are laid out in memory.
    char *p;
    char c;
    int a;
If you did not know about data alignment, you might assume that the three variables occupy contiguous bytes in memory. That is, on a 32-bit machine a 4-byte pointer would be immediately followed by a 1-byte char, which would be immediately followed by a 4-byte int. On a 64-bit machine, the only difference would be that the pointer is 8 bytes.
Here is what actually happens (on x86, ARM, or any machine with self-aligned types): p is stored at a 4-byte or 8-byte aligned position, depending on the machine's word length. This is pointer alignment, the strictest possible.
The storage for c immediately follows p. However, the 4-byte alignment requirement of a creates a gap, as if a fourth variable had been inserted:
    char *p;      // 4 or 8 bytes
    char c;       // 1 byte
    char pad[3];  // 3 bytes
    int a;        // 4 bytes
The char pad[3] array represents 3 wasted bytes.
If a were a 2-byte short, the memory layout would be:
    char *p;      // 4 or 8 bytes
    char c;       // 1 byte
    char pad[1];  // 1 byte
    short a;      // 2 bytes
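The size of the gap in both layouts follows from simple arithmetic on the current offset and the alignment of the next object. A minimal sketch, with a made-up function name:

```c
#include <stddef.h>

/* Bytes of padding needed so that the next object, which requires
   `align`-byte alignment, can start at or after `offset`. */
size_t padding_before(size_t offset, size_t align)
{
    return (align - offset % align) % align;
}
```

After a 4-byte pointer and a 1-byte char the offset is 5, so padding_before(5, 4) is 3 for an int and padding_before(5, 2) is 1 for a short. With an 8-byte pointer the offset is 9 and the results are the same.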
If you want these variables to take up less space, you can swap the positions of a and c:
    char *p;  // 4 or 8 bytes
    int a;    // 4 bytes
    char c;   // 1 byte
IV. Alignment and padding of structures
As the rules above state, a struct is aligned to its widest member. Compilers do this because it is the easiest way to ensure that all members are self-aligned for fast access.
Look at this structure:
    struct user {
        char *name;
        char c;
        int age;
    };
On a 32-bit machine, the memory layout is:
    struct user {
        char *name;   // 4 bytes
        char c;       // 1 byte
        char pad[3];  // 3 bytes
        int age;      // 4 bytes
    };
In this case, sizeof(struct user) is 12 bytes.
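You can confirm the padding on your own machine with offsetof. A small sketch follows; the helper padding_after_c is invented for this example.

```c
#include <stddef.h>

struct user {
    char *name;
    char c;
    int age;
};

/* Padding the compiler inserted between c and age. */
size_t padding_after_c(void)
{
    return offsetof(struct user, age)
         - (offsetof(struct user, c) + sizeof(char));
}
```

With a 4-byte int the helper reports 3 bytes of padding. Note that on a typical 64-bit target the pointer is 8 bytes, so sizeof(struct user) is 16 there rather than 12.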
Now suppose we swap the positions of c and age:
    struct user {
        char *name;  // 4 bytes
        int age;     // 4 bytes
        char c;      // 1 byte
    };
You might think that sizeof(struct user) is now 9, but it is still 12 bytes.
Because the struct is aligned to its widest member, 3 bytes of unused padding are added at the end. Consider an array of these structs:
    struct user uu[4];
In the uu array, each element has 3 bytes of trailing padding, because the first member of the next struct must be aligned on a 4-byte boundary.
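The trailing padding and the array stride can both be measured directly. A sketch, with invented helper names:

```c
#include <stddef.h>

struct user {
    char *name;
    int age;
    char c;
};

/* Unused bytes after the last member, c. */
size_t trailing_padding(void)
{
    return sizeof(struct user)
         - (offsetof(struct user, c) + sizeof(char));
}

/* Distance in bytes between two neighbouring array elements. */
size_t array_stride(void)
{
    static struct user uu[4];
    return (size_t)((char *)&uu[1] - (char *)&uu[0]);
}
```

The stride always equals sizeof(struct user), which is why the padding has to be part of the struct itself. On common targets with a 4-byte int the trailing padding is 3 bytes, whether pointers are 4 or 8 bytes wide.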
Now let's consider bit-fields. They let you declare a member narrower than a byte, down to a single bit.
    struct st {
        short s;
        char c;
        int flip:1;
        int nybble:4;
        int septet:7;
    };
From the compiler's point of view, the bit-fields in struct st look like a 2-byte, 16-bit character array with only 12 bits in use. One more byte of padding follows so that the length of the struct is a multiple of the size of its widest member (that is, sizeof(short)). Bear in mind that bit-field layout is implementation-defined, so the size a particular compiler produces may differ.
    struct st {
        short s;       // 2 bytes
        char c;        // 1 byte
        int flip:1;    // total 1 bit
        int nybble:4;  // total 5 bits
        int septet:7;  // total 12 bits
        int pad1:4;    // total 16 bits
        char pad2;     // 1 byte
    };
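Since the C standard leaves the allocation of bit-fields to the implementation, it is safer to ask the compiler than to assume a layout. A minimal sketch, with invented helper names:

```c
#include <stddef.h>

struct st {
    short s;
    char c;
    int flip:1;
    int nybble:4;
    int septet:7;
};

/* Size and alignment chosen by the compiler in use. */
size_t st_size(void)  { return sizeof(struct st); }
size_t st_align(void) { return _Alignof(struct st); }
```

Printing st_size() on your target shows whether the compiler packed the bit-fields as described or rounded the struct up to the alignment of int instead. Note that offsetof cannot be applied to a bit-field member, so only the total size and alignment can be queried this way.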
If your structure contains another structure, the inner struct is aligned like its own widest scalar member.
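A short sketch of the nested case. The struct names inner and outer are illustrative.

```c
#include <stddef.h>

struct inner {
    char tag;
    int value;        /* widest scalar in inner */
};

struct outer {
    char flag;
    struct inner in;  /* aligned like inner's widest scalar */
};
```

Here _Alignof(struct inner) equals _Alignof(int), so the compiler inserts padding after flag and offsetof(struct outer, in) is a multiple of the alignment of int.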
V. Reordering structure members
Compare these two layouts on a 32-bit system:
    struct user {
        char a;        // 1 byte
        char pad1[3];  // 3 bytes
        int c;         // 4 bytes
        char b;        // 1 byte
        char pad2[3];  // 3 bytes
    };

    struct user {
        int c;         // 4 bytes
        char a;        // 1 byte
        char b;        // 1 byte
        char pad[2];   // 2 bytes
    };
The two structures have the same members in a different order, yet the first is 12 bytes and the second is 8 bytes.
The first thing to notice is that padding appears in only two places. One is where a larger data type (with a stricter alignment requirement) follows a smaller one. The other is where a struct naturally ends before its stride address, so padding is needed for the next identical struct to be aligned correctly. The simplest way to eliminate padding is to sort the members by decreasing alignment.
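The effect of sorting by decreasing alignment is easy to measure. In this sketch the two layouts get different names (user_unsorted and user_sorted, both invented here) so that they can live in the same file.

```c
#include <stddef.h>

/* Declaration order as in the first layout: small, large, small. */
struct user_unsorted {
    char a;
    int c;
    char b;
};

/* Members sorted by decreasing alignment. */
struct user_sorted {
    int c;
    char a;
    char b;
};

/* Bytes saved by reordering. */
size_t bytes_saved(void)
{
    return sizeof(struct user_unsorted) - sizeof(struct user_sorted);
}
```

With a 4-byte, 4-byte-aligned int, the sizes are 12 and 8, so reordering saves 4 bytes per instance.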
    union u {
        char a;
        int b;
        long double c;
    };  // size: 8 bytes on a 32-bit system (assuming an 8-byte long double)
The memory alignment principles for struct, class, and union are the same.
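For a union, the same principle means that all members start at offset 0, the size is at least that of the largest member, and the alignment is that of the most strictly aligned member. A sketch to check this on your own compiler (the size of long double varies between platforms, so no fixed number is assumed):

```c
#include <stddef.h>

union u {
    char a;
    int b;
    long double c;
};

/* Size of the union as laid out by the compiler in use. */
size_t union_size(void)
{
    return sizeof(union u);
}
```

Comparing union_size() with sizeof(long double) shows whether any tail padding was added to round the union up to a multiple of its alignment.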
Link to this article: http://www.blogfshare.com/memory-alignment.html