Let me start with the problem. Years ago, when I was learning C programming, I wrote a piece of code to parse the MBR partition table of a hard disk. The data looked right however I examined it in a disk editor, but as soon as the program ran, the results were wrong. My debugging skills were poor at the time and I had never heard of structure alignment, so I could not solve the problem and struggled with it for several days. In the end I had no choice but to ask a friend, who told me it might be structure alignment. I looked it up and changed the code.
That solved the problem, but most material on the Internet only describes how memory alignment is done and seldom explains why (and when it does, only briefly). I am extremely forgetful and find it hard to memorize such fiddly rules mechanically, so I thought the matter through until I understood the reasons. Understood this way, the alignment rules are not easily forgotten.
Memory alignment applies not only to structs but also to classes (objects), and in fact to the storage of every variable in memory (though for ordinary variables this is transparent to the programmer and needs no attention). Alignment is really a technique for balancing space against complexity. Simply put, it aims to waste as little space as possible while keeping each operation as short (fast) as possible. Here is an example:
Assume the machine word is 32 bits long (4 bytes), and that in the examples below every memory access reads or writes one whole machine word. Now there are two variables:
char a;
int b;
Assume the two variables are allocated starting at memory address 0. If alignment is not considered, they are stored like this (using little-endian Intel as the example; each line of a chart covers 16 bytes, here and below): a occupies 0x00 and b occupies 0x01-0x04.
Because the machine word is 4 bytes, the processing of variables a and b may go roughly as follows:
a: read the 32-bit word at 0x00-0x03 into a register, then shift it left by 24 bits and right by 24 bits to obtain the value of a (or AND it with 0x000000FF).
b: read the 32-bit word at 0x00-0x03 into a register and extract the low 24 bits of b with bit operations; then read the 32-bit word at 0x04-0x07 into a register and extract the high 8 bits of b with bit operations; finally combine these with the 24 bits obtained first to form the whole 32-bit value.
As you can see, the handling of a is already as simple as it can be, but b, which is itself a 32-bit number, has to be fetched in two parts and then merged, which is less efficient.
Solving this costs a few wasted bytes. The allocation is changed so that a stays at 0x00, the three bytes 0x01-0x03 are left unused, and b occupies 0x04-0x07.
With this allocation, the handling of a is unchanged, and b becomes much simpler: just read the 32-bit word at 0x04-0x07 into a register.
Now we can talk about the alignment of struct and class members:
Once a struct has been compiled to machine code, it no longer exists as an aggregate of its own. A class is essentially an enhanced struct: instantiating an object allocates space in memory for a set of variables (just like a struct; no function pointers are stored for its member functions). Every variable in such a set is accessed according to the principle described above, so a trade-off between efficiency and space naturally has to be made.
To make it convenient to handle many primitive variables of the same type, to simplify their addressing, and to follow the shortest-processing principle described above, the length of a primitive variable is usually used as the allocation unit for that variable. For example, if a primitive variable is 8 bytes long, memory is treated as a sequence of 64-bit units for it: even when the machine word is 4 bytes, the variable is still 8-byte aligned (the number of memory accesses appears to be the same either way). Addressing and allocation can then be done in steps of 8 bytes, which simplifies the operation and makes it more efficient.
The system's default alignment rules pursue at least two goals: 1. the most efficient processing of each variable; 2. the minimum space needed to achieve goal 1.
For example, a struct is as follows:
// By www.datahf.net zhangyu
typedef struct T
{
    char c;     // 1 byte long
    __int64 d;  // 8 bytes long
    int e;      // 4 bytes long
    short f;    // 2 bytes long
    char g;     // 1 byte long
    short h;    // 2 bytes long
} T;
Assume a struct variable C of this type is defined and allocated at address 0x00 in memory. Then:
C.c takes one register read wherever it is placed, so it simply occupies the first byte, 0x00.
C.d is a 64-bit variable. If it were stored immediately after C.c, it would have to be read into registers at least three times. To get down to the minimum of two reads, it must be at least 4-byte aligned. For an 8-byte primitive, however, 8-byte alignment is required in order to keep the addressing unit uniform. It is therefore allocated at 0x08-0x0F.
C.e is a 32-bit variable. It must start on a 32-bit boundary, so it is allocated at 0x10-0x13.
C.f is a 16-bit variable and is allocated directly at 0x14-0x15. It can be read into a register in one go and then processed, and it also sits on a 16-bit boundary.
C.g is an 8-bit variable. It is always read into a register and processed in one go, and a one-byte variable is aligned at any byte address. It is therefore allocated at 0x16.
C.h is a 16-bit variable. To keep it on a 16-bit boundary, byte 0x17 is skipped and it is allocated at 0x18-0x19.
The resulting layout (not yet correct, so bear with me) is: c at 0x00, padding at 0x01-0x07, d at 0x08-0x0F, e at 0x10-0x13, f at 0x14-0x15, g at 0x16, padding at 0x17, h at 0x18-0x19.
Can the space occupied by struct C end right after h? Consider an example: if we define an array of structs CA[2], the rules of array allocation require the two structs to be stored contiguously in memory, so CA[1] would start at 0x1A, immediately after CA[0].h.
Analysing this layout, we find that several members of CA[1] are no longer aligned (CA[1].d, for instance, would land at 0x22, which is not a multiple of 8). The reason is that the start of the second struct is not aligned.
What condition guarantees that the start of every struct is aligned? Think about it and it becomes clear: it is enough to make the length of the struct an integer multiple of the largest allocation unit among its primitive members.
The struct above must therefore be aligned to its longest member, d, that is, to 8 bytes. In the correct layout, the members keep the offsets listed above and 6 bytes of padding (0x1A-0x1F) are added after h.
The length of struct T is then sizeof(T) = 0x20.
Next, let's look at how a member that is itself a struct is aligned under the default alignment rules:
// By www.datahf.net zhangyu
typedef struct A
{
    char c;     // 1 byte, at byte 0
    int d;      // 4 bytes; must be 4-byte aligned, so it is allocated at byte 4 (bytes 4-7)
    short e;    // 2 bytes; byte 8, right after the two members above, is already 2-byte aligned, so no padding is needed before it
} A;            // the longest member of the struct is 4 bytes, so the total length must be a multiple of 4: sizeof(A) = 12
typedef struct B
{
    char c;     // 1 byte, at byte 0
    __int64 d;  // 8 bytes; must be 8-byte aligned, so it is allocated at byte 8 (bytes 8-15)
    int e;      // 4 bytes; d ends at byte 15 and byte 16 is 4-byte aligned, so it is allocated at bytes 16-19
    short f;    // 2 bytes; e ends at byte 19 and byte 20 is 2-byte aligned, so it is allocated at bytes 20-21
    A g;        // a 12-byte struct whose longest member is 4 bytes, so it is 4-byte aligned: 2 bytes are skipped
                // and it is allocated at bytes 24-35
    char h;     // 1 byte, allocated at byte 36
    int i;      // 4 bytes; must be 4-byte aligned, so 3 bytes are skipped and it is allocated at bytes 40-43
} B;            // the largest allocation unit in the struct is 8 bytes, so 4 bytes of padding bring it to 48 bytes:
                // sizeof(B) = 48
The specific distribution chart is as follows:
The test code for the examples above is as follows:
// By www.datahf.net zhangyu
#include <stdio.h>
typedef struct A
{
    char c;
    int d;
    short e;
} A;
typedef struct B
{
    char c;
    __int64 d;
    int e;
    short f;
    A g;
    char h;
    int i;
} B;
typedef struct C
{
    char c;
    __int64 d;
    int e;
    short f;
    char g;
    short h;
} C;
typedef struct D
{
    char a;
    short b;
    char c;
} D;
int main()
{
    B *b = new B;
    void *s[32];    // addresses of b and its members, kept so they can be inspected
    s[0] = b;
    s[1] = &b->c;
    s[2] = &b->d;
    s[3] = &b->e;
    s[4] = &b->f;
    s[5] = &b->g;
    s[6] = &b->h;
    s[7] = &b->g.c;
    s[8] = &b->g.d;
    s[9] = &b->g.e;
    s[10] = &b->i;
    // recognizable fill patterns, so each member is easy to find in a memory dump
    b->c = 0x11;
    b->d = 0x2222222222222222;
    b->e = 0x33333333;
    b->f = 0x4444;
    b->g.c = 0x50;
    b->g.d = 0x51515151;
    b->g.e = 0x5252;
    b->h = 0x66;
    int i1 = sizeof(A);
    int i2 = sizeof(B);
    int i3 = sizeof(C);
    int i4 = sizeof(D);
    printf("i1: %d\ni2: %d\ni3: %d\ni4: %d\n", i1, i2, i3, i4); // 12 48 32 6
    delete b;
    return 0;
}
The memory contents at run time are as follows:
Finally, the memory alignment rules can be summarized as follows.
First, four concepts:
1) Own alignment value of a data type: the alignment value of a basic data type, equal to sizeof(that type).
2) Specified alignment value: the value given by #pragma pack(value).
3) Own alignment value of a struct or class: the largest alignment value among its members.
4) Effective alignment value of a data member, struct or class: the smaller of its own alignment value and the specified alignment value.
The effective alignment value N is what finally determines where data is stored. Being effectively aligned on N means that the starting address of the data satisfies "address % N = 0". The data members of a structure are laid out in the order in which they are defined, and the starting address of the first member is the starting address of the structure. Each member of a struct must be placed at an aligned address, and the struct itself must be rounded up according to its own effective alignment value (that is, the total length occupied by the struct's members must be an integer multiple of the struct's effective alignment value).
#pragma pack(value) tells the compiler to use the specified alignment value in place of the default.
For example: #pragma pack(2) /* align on 2-byte boundaries */
#pragma pack() /* cancel the specified alignment and restore the default alignment */
Author: Zhang Yu (data recovery)