C-Language byte alignment

Source: Internet
Author: User
Tags: data structures, reserved

Original address: http://blog.csdn.net/21aspnet/article/details/6729724

I drew a picture at the end of the article; take a look. This problem is discussed a lot on the Internet, but it is rarely explained thoroughly.

First, the concept

Alignment concerns where data is located in memory. If the memory address of a variable is an exact integer multiple of its length, the variable is said to be naturally aligned. For example, on a 32-bit CPU, an integer variable at address 0x00000004 is naturally aligned.
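The definition above can be written as a one-line check. This is a minimal sketch; the helper name is_naturally_aligned is made up for illustration and is not from the original article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if address p is an integer multiple of size, else 0.
   (Hypothetical helper; size must be non-zero.) */
static int is_naturally_aligned(const void *p, size_t size)
{
    assert(size != 0);
    return ((uintptr_t)p % size) == 0;
}
```

For a 4-byte integer, address 0x00000004 passes the check and address 0x00000002 does not.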

Second, why byte alignment is needed

The root reason for byte alignment is the efficiency with which the CPU accesses data. Suppose the integer variable above is not naturally aligned, for example at address 0x00000002. To fetch its value the CPU then needs two memory accesses: first a short from 0x00000002-0x00000003, then a short from 0x00000004-0x00000005, after which the two halves are combined into the desired data. If the variable is at address 0x00000003, three accesses are needed: a char, then a short, then another char, which are then combined into the integer. If the variable is naturally aligned, a single access fetches the data. Some systems, such as SPARC, have strict alignment requirements, and fetching misaligned data causes an error, for example:

char ch[8];
char *p = &ch[1];
int i = *(int *)p;

  
On such a system this is reported as a segment error at run time. On x86 no error occurs, but efficiency drops.
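One common way to avoid the misaligned read altogether is to copy the bytes with memcpy instead of dereferencing a cast pointer. The sketch below is an illustration, not part of the original article; the function names are made up.

```c
#include <assert.h>
#include <string.h>

/* Reads an int from an arbitrary, possibly misaligned, address by
   copying its bytes. No int pointer to the odd address is ever formed. */
static int load_int_unaligned(const void *src)
{
    int value;
    memcpy(&value, src, sizeof value);
    return value;
}

/* Stores an int at an odd offset in a char buffer and reads it back. */
static void demo_unaligned_read(void)
{
    char ch[8] = {0};
    int expected = 0x11223344;

    memcpy(&ch[1], &expected, sizeof expected);
    assert(load_int_unaligned(&ch[1]) == expected);
}
```

Compilers usually turn such a small fixed-size memcpy into whatever load sequence is valid for the target, so the code stays portable across x86 and strict-alignment machines.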
  
Third, handling byte alignment correctly
  
For a standard data type, the address only needs to be an integer multiple of the type's length. Non-standard data types are aligned according to the following principles:
  
Arrays: aligned according to the basic element type; once the first element is aligned, the elements after it are aligned naturally.
Unions: aligned according to the member with the largest length.
Structures: each member of the structure is aligned.
For example, take the following structure:
  
struct stu {
    char sex;
    int length;
    char name[10];
};
struct stu my_stu;

  
Because GCC on x86 aligns to 4 bytes by default, it pads 3 bytes after sex so that length is aligned, and 2 bytes after name so that the whole structure is aligned. As a result, sizeof(my_stu) is 20 instead of 15.
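The padding described above can be observed with offsetof. This is a small sketch, assuming an ABI where int is 4 bytes with 4-byte alignment (as in the article's environment); the function name print_stu_layout is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct stu {
    char sex;
    int length;
    char name[10];
};

/* The first member always sits at offset 0, on any ABI. */
static_assert(offsetof(struct stu, sex) == 0, "first member is at offset 0");

/* Prints each member's offset and the total size of struct stu. */
static void print_stu_layout(void)
{
    printf("sex    at offset %zu\n", offsetof(struct stu, sex));
    printf("length at offset %zu\n", offsetof(struct stu, length));
    printf("name   at offset %zu\n", offsetof(struct stu, name));
    printf("sizeof(struct stu) = %zu\n", sizeof(struct stu));
}
```

Under the stated assumption the offsets are 0, 4, and 8, and the size is 20: three padding bytes follow sex and two follow name.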
  
Fourth, the __attribute__ option
  
We can tell the compiler which alignment to use. GNU C does this with the __attribute__ keyword. For example, to lay the structure out with 1-byte alignment, define it as
  
struct stu {
    char sex;
    int length;
    char name[10];
} __attribute__((packed));
  
struct stu my_stu;

  
Then sizeof(my_stu) is 15.
  
Note that __attribute__((aligned(1))) on its own does not have this effect. In GCC, the aligned attribute on a structure can only increase the alignment, so the definition
  
struct stu {
    char sex;
    int length;
    char name[10];
} __attribute__((aligned(1)));
struct stu my_stu;

  
still has a size of 20. To lower the alignment, packed must be specified as well.
  
__attribute__((packed)) gives a variable or structure member the smallest possible alignment: one byte for a variable, and one bit for a bit-field.
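The effect of packed can be checked directly. This sketch relies on the GCC/Clang __attribute__ extension; the tag stu_packed and the function name are made up so that the packed layout can sit next to the default one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same members as struct stu, but with all padding removed. */
struct stu_packed {
    char sex;
    int length;
    char name[10];
} __attribute__((packed));

static_assert(offsetof(struct stu_packed, sex) == 0,
              "first member is at offset 0");

/* Prints the member offsets and the total size of the packed layout. */
static void print_packed_layout(void)
{
    printf("length at offset %zu\n", offsetof(struct stu_packed, length));
    printf("name   at offset %zu\n", offsetof(struct stu_packed, name));
    printf("sizeof(struct stu_packed) = %zu\n", sizeof(struct stu_packed));
}
```

With a 4-byte int, length lands at offset 1 and name at offset 5, for a total of 15 bytes. Accessing length then means a misaligned access, which the compiler handles for direct member access but not for pointers taken to the member.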
  
When alignment needs to be set explicitly
  
When designing communication protocols between different CPUs, or describing hardware registers as structures in a driver, the structures need to be aligned to one byte. Even if a structure already appears to be naturally aligned, specify the alignment explicitly so that different compilers do not generate different layouts.
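As an illustration of the protocol case, the sketch below defines a made-up 7-byte message header and guards its size at compile time. The structure, its field names, and the function write_header are hypothetical; packed is the GCC/Clang extension, and byte order is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire header: field names and sizes are illustrative only. */
struct msg_header {
    uint8_t  type;
    uint32_t length;
    uint16_t checksum;
} __attribute__((packed));

/* Compilation fails if any compiler inserts padding into the header. */
static_assert(sizeof(struct msg_header) == 7, "msg_header must be 7 bytes");

/* Copies the header into a byte buffer and returns the bytes written.
   Byte order is a separate concern and is not handled here. */
static size_t write_header(unsigned char *buf, const struct msg_header *h)
{
    memcpy(buf, h, sizeof *h);
    return sizeof *h;
}
```

The static_assert is the useful part: it turns a silent layout difference between compilers into a build error.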

First, a quick understanding

1. What is byte alignment?

In C, a structure is a composite data type whose constituent elements can be variables of basic data types (such as int, long, float, and so on) or data units of some composite data types (such as arrays, structs, unions, etc.). In structs, the compiler allocates space for each member of the structure by its natural boundary (alignment). Each member is stored sequentially in memory in the order in which they are declared, with the address of the first member the same as the address of the entire structure.

In order for the CPU to be able to access the variable quickly, the starting address of the variable should have certain characteristics, that is, the so-called "alignment". For example, a 4-byte int, whose starting address should be located on a 4-byte boundary, that is, the starting address can be divisible by 4.

2. What is the effect of byte alignment?

Byte alignment not only lets the CPU access data quickly; using it sensibly can also save storage space.

On a 32-bit machine, 4-byte alignment increases CPU access speed. For example, if a long variable is stored across a 4-byte boundary, the CPU has to read twice to fetch it, which is inefficient. Conversely, using 1-byte or 2-byte alignment on a 32-bit machine can slow down variable access. So both the processor type and the compiler type have to be taken into account. The examples in this article assume a default of 4-byte alignment for both VC and GNU GCC; the actual default depends on the compiler and the target platform.

3. Change the default byte alignment of the C compiler

By default, the C compiler allocates space for each variable or data unit according to its natural boundary. In general, the default can be changed as follows:
· Use the directive #pragma pack(n); the C compiler then aligns to n bytes.
· Use the directive #pragma pack() to cancel the custom byte alignment.

In addition, there are the following ways:
· __attribute__((aligned(n))) aligns the structure member it is applied to on an n-byte natural boundary. If a member of the structure is longer than n, alignment follows the length of the largest member.
· __attribute__((packed)) cancels the alignment padding the compiler would otherwise add, so that members are laid out by the number of bytes they actually occupy.
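A short sketch of aligned(n) raising the alignment of a whole structure follows. It uses the GCC/Clang attribute syntax and C11 _Alignof; the structure tags plain and raised are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* A one-byte structure with its default alignment. */
struct plain {
    char c;
};

/* The same member, but the type is forced onto an 8-byte boundary. */
struct raised {
    char c;
} __attribute__((aligned(8)));

static_assert(_Alignof(struct raised) == 8, "aligned(8) raises the alignment");

/* Prints the size and alignment of both structures. */
static void print_aligned(void)
{
    printf("plain:  size %zu, align %zu\n",
           sizeof(struct plain), _Alignof(struct plain));
    printf("raised: size %zu, align %zu\n",
           sizeof(struct raised), _Alignof(struct raised));
}
```

Because a structure's size is always a multiple of its alignment, the one-byte structure grows to 8 bytes once it is aligned to 8.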

4. Examples

Example 1

struct test
{
    char x1;
    short x2;
    float x3;
    char x4;
};

Because the compiler aligns this structure on natural boundaries by default, the first member x1 has offset 0 and occupies the first byte. The second member x2 is of type short, so its starting address must be on a 2-byte boundary, and the compiler inserts one padding byte between x1 and x2. The third member x3 and the fourth member x4 already fall on their natural boundaries, so no padding is needed before them. Within the test structure, x3 requires 4-byte alignment, which is the largest requirement among all members, so the natural boundary of the whole structure is 4 bytes and the compiler inserts 3 padding bytes after x4. The entire structure occupies 12 bytes.
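The walk-through above can be confirmed member by member. This is a sketch assuming the usual sizes (2-byte short, 4-byte float, each naturally aligned); the function name print_test_layout is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct test {
    char x1;
    short x2;
    float x3;
    char x4;
};

static_assert(offsetof(struct test, x1) == 0, "first member is at offset 0");

/* Prints where each member of struct test starts, plus the total size. */
static void print_test_layout(void)
{
    printf("x1 at %zu\n", offsetof(struct test, x1));
    printf("x2 at %zu\n", offsetof(struct test, x2));
    printf("x3 at %zu\n", offsetof(struct test, x3));
    printf("x4 at %zu\n", offsetof(struct test, x4));
    printf("sizeof(struct test) = %zu\n", sizeof(struct test));
}
```

Under those assumptions the offsets are 0, 2, 4, and 8: one padding byte sits at offset 1 and three at offsets 9 to 11, giving 12 bytes.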

Example 2

#pragma pack(1) // make the compiler use 1-byte alignment for this structure
struct test
{
    char x1;
    short x2;
    float x3;
    char x4;
};
#pragma pack() // cancel 1-byte alignment and restore the default alignment

The value of sizeof (struct test) is 8.

Example 3

#define GNUC_PACKED __attribute__((packed))
struct test
{
    char x1;
    short x2;
    float x3;
    char x4;
} GNUC_PACKED;

The value of sizeof (struct test) is still 8.
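A variant worth knowing is the push/pop form of #pragma pack, which GCC, Clang, and MSVC accept. It saves the current packing and restores it afterwards, instead of resetting to the compiler default. The sketch below uses two made-up tags so that both layouts can be compared in one file.

```c
#include <assert.h>
#include <stdio.h>

#pragma pack(push, 1)   /* save the current packing, then pack to 1 byte */
struct test_packed {
    char x1;
    short x2;
    float x3;
    char x4;
};
#pragma pack(pop)       /* restore whatever packing was active before */

/* Declared after the pop, so it gets the normal layout again. */
struct test_default {
    char x1;
    short x2;
    float x3;
    char x4;
};

/* Packing can remove padding but never add any. */
static_assert(sizeof(struct test_packed) <= sizeof(struct test_default),
              "packed layout is never larger");

static void print_pack_sizes(void)
{
    printf("packed:  %zu\n", sizeof(struct test_packed));
    printf("default: %zu\n", sizeof(struct test_default));
}
```

With a 2-byte short and a 4-byte float, the packed structure is 8 bytes and the one declared after the pop is 12 bytes.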

Second, in-depth understanding

One. What is byte alignment, and why align?
Published by Tragicjun, 2006-9-18 9:41:00
Memory in modern computers is divided into bytes, and in theory a variable of any type could be accessed starting at any address. In practice, a variable of a particular type often has to be accessed at particular memory addresses. This requires data of each type to be arranged in memory according to certain rules, rather than simply placed one after another. That is alignment.
The role and cause of alignment: hardware platforms differ greatly in how they handle storage. Some platforms can access certain types of data only from certain addresses. For example, some CPU architectures raise an error when a variable that is not aligned is accessed, so programs for such architectures must guarantee byte alignment. Other platforms do not fail, but the most common consequence of not aligning data as the platform requires is a loss of access efficiency. For example, some platforms begin every read at an even address. If an int (assumed to be 32 bits) is stored starting at an even address, one read cycle fetches all 32 bits. If it is stored starting at an odd address, two read cycles are needed, and the high and low bytes of the two results must be pieced together to obtain the 32-bit value. Clearly, reading is much less efficient.
Two. The effect of byte alignment on programs:

Let's take a look at a few examples (32-bit x86 environment, GCC compiler):
Define the structures as follows:
struct A
{
    int a;
    char b;
    short c;
};
struct B
{
    char b;
    int a;
    short c;
};
The lengths of the data types on a 32-bit machine are known to be:
char: 1 (signed and unsigned)
short: 2 (signed and unsigned)
int: 4 (signed and unsigned)
long: 4 (signed and unsigned)
float: 4
double: 8
So what are the sizes of the two structures above?
The result is:
sizeof(struct A) is 8
sizeof(struct B) is 12

Structure A contains a 4-byte int, a 1-byte char, and a 2-byte short, and so does B, so by simple addition both A and B should be 7 bytes.
The sizes above come about because the compiler aligns the data members in memory. They are the result of the compiler's default alignment settings, and of course we can change those settings. For example:
#pragma pack(2) /* specify 2-byte alignment */
struct C
{
    char b;
    int a;
    short c;
};
#pragma pack() /* cancel the specified alignment and restore the default alignment */
The value of sizeof(struct C) is 8.
Changing the alignment value to 1:
#pragma pack(1) /* specify 1-byte alignment */
struct D
{
    char b;
    int a;
    short c;
};
#pragma pack() /* cancel the specified alignment and restore the default alignment */
The value of sizeof(struct D) is 7.
The role of #pragma pack() is explained below.
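The four structures can be put side by side in one file to compare their sizes. This sketch assumes a 4-byte int and a 2-byte short, as in the article's environment; the function name print_sizes is made up.

```c
#include <assert.h>
#include <stdio.h>

struct A { int a; char b; short c; };
struct B { char b; int a; short c; };

#pragma pack(2)   /* specify 2-byte alignment */
struct C { char b; int a; short c; };
#pragma pack()    /* restore the default alignment */

#pragma pack(1)   /* specify 1-byte alignment */
struct D { char b; int a; short c; };
#pragma pack()    /* restore the default alignment */

/* A smaller pack value can only remove padding, never add it. */
static_assert(sizeof(struct D) <= sizeof(struct C), "pack(1) <= pack(2)");
static_assert(sizeof(struct C) <= sizeof(struct B), "pack(2) <= default");

/* Prints the size of each of the four structures. */
static void print_sizes(void)
{
    printf("sizeof(struct A) = %zu\n", sizeof(struct A));
    printf("sizeof(struct B) = %zu\n", sizeof(struct B));
    printf("sizeof(struct C) = %zu\n", sizeof(struct C));
    printf("sizeof(struct D) = %zu\n", sizeof(struct D));
}
```

Under the stated assumptions the sizes are 8, 12, 8, and 7. A and B hold the same members, so the 4-byte difference between them comes purely from declaration order.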

Three. What principle does the compiler align with?

Let's first look at four important basic concepts:


1. The alignment value of a data type itself:
For char data the own alignment value is 1, for short it is 2, and for int, float, and double it is 4, in bytes.
2. The own alignment value of a struct or class: the largest of its members' own alignment values.
3. The specified alignment value: the value given in #pragma pack(value).
4. The valid alignment value of a data member, struct, or class: the smaller of its own alignment value and the specified alignment value.
With these values we can easily discuss how the members of a specific data structure, and the structure itself, are aligned. The valid alignment value N is the value that finally determines where data is stored, and it is the most important of the four. A valid alignment of N means "align to N", that is, the starting address of the data satisfies address % N = 0. The data members of a structure are laid out in the order in which they are defined, and the starting address of the first member is the starting address of the structure. Each member must be aligned when it is laid out, and the structure itself must be rounded according to its own valid alignment value (that is, the total length occupied by the structure must be an integer multiple of the structure's valid alignment value, as the following example shows). With this, the values in the examples above are not hard to understand.
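The four rules can be turned into a small layout calculator. This is a sketch of the rules as stated above, not of any particular compiler's implementation; the type member and the functions min_sz, round_up, and layout are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One member: its size and its own alignment value, both in bytes. */
struct member {
    size_t size;
    size_t self_align;
};

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Rounds n up to the next multiple of align (align must be non-zero). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/*
 * Places each member at an offset that is a multiple of its valid
 * alignment value, min(self_align, pack). The structure's own alignment
 * value is the largest member self_align, and the total size is rounded
 * up to min(that value, pack). Writes the offsets to offsets[] and
 * returns the total size.
 */
static size_t layout(const struct member *m, size_t n, size_t pack,
                     size_t *offsets)
{
    size_t pos = 0, struct_align = 1;

    assert(pack != 0);
    for (size_t i = 0; i < n; i++) {
        assert(m[i].self_align != 0);
        size_t valid = min_sz(m[i].self_align, pack);
        pos = round_up(pos, valid);
        offsets[i] = pos;
        pos += m[i].size;
        if (m[i].self_align > struct_align)
            struct_align = m[i].self_align;
    }
    return round_up(pos, min_sz(struct_align, pack));
}
```

Feeding it the members of B (char, int, short) with a specified value of 4, 2, and 1 gives 12, 8, and 7, matching sizeof for structures B, C, and D.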
Example analysis:
Analyzing example B:
struct B
{
    char b;
    int a;
    short c;
};
Suppose B is laid out starting at address 0x0000. No alignment value is specified in this example, so the default of the author's environment, 4, applies. The first member b has an own alignment value of 1, smaller than the specified value 4, so its valid alignment value is 1 and its address 0x0000 satisfies 0x0000 % 1 = 0. The second member a has an own alignment value of 4, so its valid alignment value is 4. It is therefore stored in the four contiguous bytes 0x0004 to 0x0007, which satisfies 0x0004 % 4 = 0 and is the closest such position after the first member. The third member c has an own alignment value of 2, so its valid alignment value is also 2, and it is stored in the two bytes 0x0008 to 0x0009, which satisfies 0x0008 % 2 = 0. So 0x0000 through 0x0009 hold the contents of B.

Next, structure B's own alignment value is the largest alignment value among its members (here, that of a), which is 4, so the valid alignment value of the structure is also 4. By the rounding requirement, 0x0000 to 0x0009 is 10 bytes, and (10 + 2) % 4 = 0, so 0x000A and 0x000B are also occupied by B. B therefore spans 0x0000 to 0x000B, 12 bytes in total, and sizeof(struct B) = 12.

In fact, a single B would already be aligned without the extra 2 bytes, because its starting address is 0. The 2 bytes are added so that arrays of the structure can be accessed efficiently. Imagine an array of B: the first element starts at address 0 without any problem, but what about the second? By the definition of an array, all elements are adjacent. If the structure size were not rounded up to a multiple of 4, the next element would start at 0x000A, which clearly does not satisfy the structure's alignment. That is why the size is rounded up to an integer multiple of the valid alignment value. The own alignment values of the built-in types (1 for char, 2 for short, 4 for int, float, and double) are likewise derived from arrays; because the lengths of these types are known, their own alignment values are known.
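The array argument can be checked in code: because the size of B is rounded up to a multiple of its alignment, the int member of every array element stays aligned. This is a sketch using C11 _Alignof; the function name all_int_members_aligned is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct B {
    char b;
    int a;
    short c;
};

/* The language guarantees this for every type, which is exactly why the
   tail padding exists. */
static_assert(sizeof(struct B) % _Alignof(struct B) == 0,
              "size is a multiple of alignment");

/* Returns 1 if the int member of every element in arr is aligned. */
static int all_int_members_aligned(const struct B *arr, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if ((uintptr_t)&arr[i].a % _Alignof(int) != 0)
            return 0;
    }
    return 1;
}
```

Without the two tail bytes, the second element would start at offset 10 and its int member at offset 14, which is not a multiple of 4.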
Similarly, analyze example C above:
#pragma pack(2) /* specify 2-byte alignment */
struct C
{
    char b;
    int a;
    short c;
};
#pragma pack() /* cancel the specified alignment and restore the default alignment */
The first member b has an own alignment value of 1 and the specified alignment value is 2, so its valid alignment value is 1. Assuming C starts at 0x0000, b is stored at 0x0000, which satisfies 0x0000 % 1 = 0. The second member a has an own alignment value of 4 and the specified alignment value is 2, so its valid alignment value is 2. It is stored in the four consecutive bytes 0x0002, 0x0003, 0x0004, and 0x0005, which satisfies 0x0002 % 2 = 0. The third member c has an own alignment value of 2, so its valid alignment value is 2, and it is stored in 0x0006 and 0x0007, which satisfies 0x0006 % 2 = 0. So the eight bytes from 0x0000 to 0x0007 hold the members of C. C's own alignment value is 4 and the specified value is 2, so C's valid alignment value is 2. Since 8 % 2 = 0, C occupies exactly the eight bytes 0x0000 to 0x0007, and sizeof(struct C) = 8.

Four. How do I modify the compiler's default alignment values?

1. In the VC IDE, it can be changed under [Project] | [Settings], on the C/C++ tab, in the Code Generation category, using the Struct member alignment option. The default is 8 bytes.
2. In code, it can be changed dynamically with #pragma pack. Note: it is spelled pragma, not progma.

Five. How should byte alignment be taken into account in programming?
If saving space matters, assume the structure starts at address 0 and arrange the members by the principles above. The basic rule is to declare the members of a structure in order of type size, from smallest to largest, to minimize the padding in between. The other approach trades space for time: we pad explicitly to achieve alignment. For example, one way to trade space for time is to insert reserved members explicitly:
struct a {
    char a;
    char reserved[3]; // trade space for time
    int b;
};

The reserved member means nothing to the program; it only fills space to achieve byte alignment. Even without it, the compiler would normally insert the padding automatically. Adding it ourselves only serves as an explicit reminder.
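The effect of declaration order on size can be seen with two structures that hold the same members. This sketch assumes a 4-byte int with 4-byte alignment; the tags loose and tight and the function name are made up.

```c
#include <assert.h>
#include <stdio.h>

/* The int sits between the two chars, so padding appears twice. */
struct loose {
    char a;   /* 1 byte, then 3 bytes of padding */
    int b;    /* 4 bytes */
    char c;   /* 1 byte, then 3 bytes of tail padding */
};

/* Smallest types first: the two chars share one padded slot. */
struct tight {
    char a;   /* 1 byte */
    char c;   /* 1 byte, then 2 bytes of padding */
    int b;    /* 4 bytes */
};

/* Reordering the same members never makes this layout larger. */
static_assert(sizeof(struct tight) <= sizeof(struct loose),
              "sorted layout is not larger");

static void print_order_sizes(void)
{
    printf("loose: %zu bytes\n", sizeof(struct loose));
    printf("tight: %zu bytes\n", sizeof(struct tight));
}
```

Under the stated assumption the first layout takes 12 bytes and the second 8, with no change in what the structure can store.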

Six. Potential risks of byte alignment:

Many alignment pitfalls in code are implicit, for example in forced type conversions:
unsigned int i = 0x12345678;
unsigned char *p = NULL;
unsigned short *p1 = NULL;

p = (unsigned char *)&i;
*p = 0x00;
p1 = (unsigned short *)(p + 1);
*p1 = 0x0000;
The last two lines access an unsigned short through an odd address, which clearly violates the alignment rules.
On x86, such operations only affect efficiency, but on MIPS or SPARC they may cause an error, because those architectures require aligned access.
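The same bytes can be written without ever forming a misaligned unsigned short pointer. This is a minimal sketch of one alternative, not the article's own fix; the function name clear_first_three_bytes is made up, and it assumes unsigned int is at least one byte larger than unsigned short.

```c
#include <assert.h>
#include <string.h>

/* Clears the byte at offset 0 and the two bytes at offsets 1 and 2 of
   *value, in memory order, using only byte-wise accesses. */
static void clear_first_three_bytes(unsigned int *value)
{
    unsigned char *p = (unsigned char *)value;
    unsigned short zero = 0;

    assert(sizeof *value >= 1 + sizeof zero);
    p[0] = 0x00;
    memcpy(p + 1, &zero, sizeof zero); /* byte copy, no alignment needed */
}
```

Which bytes of the numeric value are cleared depends on the machine's byte order, just as in the original snippet; only the misaligned access is removed.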

Seven. How to track down byte alignment problems:

If an alignment or assignment problem occurs, first check:
1. The compiler's big-endian/little-endian setting.
2. Whether the system itself supports unaligned access.
3. If it does, whether alignment has been set. If it has not, check whether the access needs special qualifiers to mark it as a special access operation.

Example:
#include <stdio.h>
int main()
{
    struct a {
        int a;
        char b;
        short c;
    };

    struct b {
        char b;
        int a;
        short c;
    };

#pragma pack(2) /* specifies 2-byte alignment */
    struct c {
        char b;
        int a;
        short c;
    };
#pragma pack() /* cancel the specified alignment, restore the default alignment */

#pragma pack(1) /* specifies 1-byte alignment */
