1. What is byte alignment, and why is it needed?
In modern computers, memory is divided into bytes. In theory, a variable of any type could be accessed starting from any address. In practice, variables of particular types are often accessed at particular memory addresses, which requires data of each type to be laid out in memory according to certain rules rather than simply placed one after another. This is alignment.
Purpose and cause of alignment: hardware platforms differ greatly in how they handle storage. Some platforms can only access certain types of data at certain addresses. For example, on some architectures the CPU raises an error when it accesses a misaligned variable, so on such architectures the program must guarantee byte alignment. Other platforms may not fault, but the most common problem is that failing to align data as the platform requires costs access efficiency. For example, some platforms begin each read at an even address. If an int (assuming a 32-bit system) is stored starting at an even address, the 32 bits can be read in one read cycle; if it is stored starting at an odd address, two read cycles are needed, and the high and low bytes of the two results must be pieced together to obtain the 32-bit value. Reading efficiency obviously drops considerably.
2. Effect of byte alignment on programs:
Let's look at a few examples (32-bit x86 environment, GCC compiler):
Suppose the structures are defined as follows:
struct A
{
    int a;
    char b;
    short c;
};
struct B
{
    char b;
    int a;
    short c;
};
The sizes of the data types on a 32-bit machine are known to be:
char: 1 (signed and unsigned)
short: 2 (signed and unsigned)
int: 4 (signed and unsigned)
long: 4 (signed and unsigned)
float: 4
double: 8
So what are the sizes of the two structures above?
The result is:
sizeof(struct A) is 8.
sizeof(struct B) is 12.
Struct A contains a 4-byte int, a 1-byte char, and a 2-byte short, and so does struct B. Simple addition suggests that both A and B should be 7 bytes in size.
The results above arise because the compiler aligns data members in memory, here according to the compiler's default settings. Can we change the compiler's default alignment setting? Of course. For example:
#pragma pack(2) /* align on 2 bytes */
struct C
{
    char b;
    int a;
    short c;
};
#pragma pack() /* cancel the specified alignment and restore the default */
sizeof(struct C) is 8.
Now change the alignment value to 1:
#pragma pack(1) /* align on 1 byte */
struct D
{
    char b;
    int a;
    short c;
};
#pragma pack() /* cancel the specified alignment and restore the default */
sizeof(struct D) is 7.
The role of #pragma pack is explained next.
3. What rules does the compiler follow when aligning?
Let's first look at four important basic concepts:
1. The own alignment value of a data type:
For char it is 1, for short it is 2, and for int, float, and double it is 4 (in bytes, in the environment described above).
2. The own alignment value of a struct or class: the largest own alignment value among its members.
3. The specified alignment value: the value given by #pragma pack(value).
4. The valid alignment value of a data member, struct, or class: the smaller of its own alignment value and the specified alignment value.
With these values we can discuss the members of a specific data structure and the alignment of the structure itself. The valid alignment value N is the value that finally determines where data is stored, so it is the most important one. Valid alignment N means "aligned on N", that is, the starting address of the data satisfies address % N = 0. The data members of a structure are laid out in the order in which they are defined, and the starting address of the first member is the starting address of the structure. Each member must be aligned when it is laid out, and the structure itself must be rounded up according to its own valid alignment value (that is, the total length occupied by the structure must be an integer multiple of the structure's valid alignment value; see the examples that follow). With this, the values in the examples above are not hard to understand.
Example analysis:
Analysis of example B:
struct B
{
    char b;
    int a;
    short c;
};
Assume B is laid out starting at address 0x0000. No alignment value is specified in this example; in the author's environment the default is 4. The first member, b, has an own alignment value of 1, which is smaller than the specified or default value of 4, so its valid alignment value is 1 and its address 0x0000 satisfies 0x0000 % 1 = 0. The second member, a, has an own alignment value of 4, so its valid alignment value is 4. It can therefore only be stored in the four consecutive bytes from 0x0004 to 0x0007, which satisfies 0x0004 % 4 = 0 and is as close to the first member as alignment allows. The third member, c, has an own alignment value of 2, so its valid alignment value is also 2, and it can be stored in the two bytes 0x0008 to 0x0009, which satisfies 0x0008 % 2 = 0. The contents of B therefore occupy 0x0000 to 0x0009. Next, the own alignment value of structure B is the largest alignment value among its members (here, that of a), which is 4, so the valid alignment value of the structure is also 4. Under the rounding requirement, 0x0000 to 0x0009 is 10 bytes, and (10 + 2) % 4 = 0, so 0x000A to 0x000B is also occupied by struct B. B therefore spans 0x0000 to 0x000B, 12 bytes in total, and sizeof(struct B) = 12.
In fact, this single structure would already satisfy byte alignment without the padding, because its starting address is 0 and is therefore certainly aligned. The two bytes are added at the end so that arrays of the structure can be accessed efficiently. Imagine we define an array of struct B: the first element starts at address 0, but what about the second? By the definition of an array, all elements are adjacent. If the size of the structure were not rounded up to a multiple of 4, the next element would start at 0x000A, which clearly does not satisfy the structure's address alignment, so the structure is padded to an integer multiple of its valid alignment value. The own alignment values of the built-in types (1 for char, 2 for short, 4 for int, float, and double) are also chosen with arrays in mind; because the lengths of these types are known, their own alignment values are known as well.
Similarly, analyze example C above:
#pragma pack(2) /* align on 2 bytes */
struct C
{
    char b;
    int a;
    short c;
};
#pragma pack() /* cancel the specified alignment and restore the default */
The first member, b, has an own alignment value of 1 and the specified alignment value is 2, so the valid alignment value of b is 1. Assume C starts at 0x0000; then b is stored at 0x0000, which satisfies 0x0000 % 1 = 0. The second member, a, has an own alignment value of 4 and the specified alignment value is 2, so its valid alignment value is 2, and it is stored in the four consecutive bytes 0x0002, 0x0003, 0x0004, and 0x0005, which satisfies 0x0002 % 2 = 0. The third member, c, has an own alignment value of 2, so its valid alignment value is 2, and it is stored in 0x0006 and 0x0007, which satisfies 0x0006 % 2 = 0. The members of C therefore occupy the eight bytes from 0x0000 to 0x0007. The own alignment value of C is 4, so the valid alignment value of C is 2. Since 8 % 2 = 0, C occupies exactly the eight bytes from 0x0000 to 0x0007, and sizeof(struct C) = 8.
4. How to modify the compiler's default alignment value?
1. In the VC IDE, go to [Project] | [Settings], open the C/C++ tab, choose the Code Generation category, and change the Struct Member Alignment option. The default is 8 bytes.
2. Change it in the code itself with #pragma pack. Note: it is pragma, not progma.
5. How should we take byte alignment into account when programming?
If we want to save space, we only need to assume that the structure starts at address 0 and then arrange the members according to the rules above. The basic principle is to declare the members of a structure in order of type size, from smallest to largest, to minimize the padding in between. The other approach trades space for time: we fill the space explicitly to achieve alignment. For example, one way to trade space for time is to insert reserved members explicitly:
struct A {
    char a;
    char reserved[3]; /* trade space for time */
    int b;
};
The reserved member means nothing to the program; it only fills space to achieve byte alignment. Even without it, the compiler usually inserts the padding automatically; adding it ourselves just serves as an explicit reminder.
6. Potential risks of byte alignment:
Many alignment hazards in code are implicit, for example in forced type conversions:
unsigned int i = 0x12345678;
unsigned char *p = NULL;
unsigned short *p1 = NULL;
p = (unsigned char *)&i;
*p = 0x00;
p1 = (unsigned short *)(p + 1);
*p1 = 0x0000;
The last two lines access an unsigned short through an odd address, which clearly violates the alignment rules.
On x86 such an operation only affects efficiency, but on MIPS or SPARC it may be an error, because those architectures require aligned access.
7. How to track down byte alignment problems:
If alignment or assignment problems occur, first check:
1. The compiler's big-endian/little-endian setting.
2. Whether the system itself supports unaligned access.
3. If it does, whether alignment has been configured; if not, the access may need special qualifiers to mark it as a special access operation.
8. Related articles: reposted from http://blog.csdn.net/goodluckyxl/archive/2005/10/17/506827.aspx
Alignment handling on ARM
From DUI0067D_ADS1_2_CompLib
3.13 Type qualifiers
Part of the following is taken from the alignment section of the ARM compiler documentation.
Alignment usage:
1. __align(num)
This qualifier changes the byte boundary of a top-level object. When LDRD or STRD is used in assembly, __align(8) is needed to ensure that the data objects are aligned accordingly.
The qualifier is limited to a maximum of 8 bytes. It can make a 2-byte object 4-byte aligned, but it cannot make a 4-byte object 2-byte aligned.
__align is a storage class modifier. It applies only to top-level objects and cannot be used on structures or function objects.
2. __packed
__packed selects one-byte alignment.
1. A packed object cannot be aligned.
2. All read and write accesses to the object are unaligned accesses.
3. float, structures and unions containing float, and objects not declared with __packed cannot be byte-aligned.
4. __packed has no effect on local integer variables.
5. Casting an unpacked object to a packed object is undefined. An integer pointer can legally be declared as packed:
__packed int *p; // __packed int itself is meaningless
6. Problems with aligned and unaligned read/write access
__packed struct struct_test
{
    char a;
    int b;
    char c;
}; // With this structure, the starting address of b is certainly unaligned.
// Accessing b on the stack may cause problems, because data on the stack must be accessed aligned [from CL]
// Define the following variables as global statics, not on the stack
static char *p;
static struct struct_test a;
void main()
{
    __packed int *q; // Declared __packed so that accesses through q work at an unaligned data address
    p = (char *)&a;
    q = (int *)(p + 1);
    *q = 0x87654321;
    /*
    The assembly generated for the assignment makes this clear:
    LDR R5, 0x20001590 ; = #0x12345678
    [0xe1a00005] mov r0, R5
    [0xeb1_b0] BL __rt_uwrite4 // call a helper function to write 4 bytes here
    [0xe5c10000] strb r0, [R1, #0] // the function performs four strb operations and then returns, ensuring correct data access
    [0xe1a02420] mov R2, R0, LSR #8
    [0xe5c12001] strb R2, [R1, #1]
    [0xe1a02820] mov R2, R0, LSR #16
    [0xe5c12002] strb R2, [R1, #2]
    [0xe1a02c20] mov R2, R0, LSR #24
    [0xe5c12003] strb R2, [R1, #3]
    [0xe1a0f00e] mov PC, R14
    */
    /*
    If q is not declared __packed, the compiled instructions are as follows, and the access to the odd address fails directly:
    [0xe59f2018] LDR R2, 0x20001594 ; = #0x87654321
    [0xe5812000] STR R2, [R1, #0]
    */
    // This clearly shows how unaligned access produces errors,
    // how to eliminate the problems caused by unaligned access,
    // and that the difference in instructions between unaligned and aligned access leads to an efficiency cost.
}
Source: http://blog.chinaunix.net/u1/36006/showart_569730.html