Data Type byte alignment

Source: Internet
Author: User
Tags modulus microsoft c

After reading an article today, I suddenly realized why sizeof() is used everywhere in source code, and why byte alignment is stressed so much in embedded C development: it all comes down to this. I am ashamed. Wangshi, your ability and level are really unsatisfactory if even this basic knowledge is unclear to you. It has been seven years since I started studying computers at school, and I am truly ashamed of myself. Come on! You can't let others look down on you. You need to make up for lost time. Come on! I hope to improve a little every day.

I am reposting this article written by someone else as a reminder to myself. I hope it helps everyone; good articles deserve to be shared. ^_^
The post follows:

When a structure type is defined in C, is its size equal to the sum of the sizes of its fields? How does the compiler place those fields in memory? What does ANSI C require of a structure's memory layout? Can our programs depend on that layout? These questions are a bit hazy for many people, so this article tries to explore what lies behind them. First, at least one thing is certain: ANSI C guarantees that the fields of a struct appear in memory at addresses that increase in declaration order, and that the address of the first field equals the address of the whole struct instance. For example, given a struct:

/* Requires <assert.h>; the statements below belong inside a function. */
struct vector { int x, y, z; } s;
int *p, *q, *r;
struct vector *ps;

p = &s.x;
q = &s.y;
r = &s.z;
ps = &s;
assert(p < q);
assert(p < r);
assert(q < r);
assert((int *)ps == p);
/* None of the assertions above will fail. */
At this point someone may ask: "Does the standard guarantee that adjacent fields are also adjacent in memory?" Sorry, ANSI C makes no such guarantee, and your program should never rely on that assumption. Does this mean we can never sketch a clearer, more precise picture of a struct's memory layout? Of course not. But let's step away from this question for a moment and look at another important one: memory alignment.
Many real computer systems restrict where basic data may be stored in memory. They require that the starting address of a data item be a multiple of some value k (usually 4 or 8). This is memory alignment, and k is called the alignment modulus of the data type. When the ratio of the alignment modulus of a type S to that of another type T is an integer greater than 1, we say that S has a stronger (stricter) alignment requirement than T, and that T's is weaker (looser) than S's. This requirement simplifies the design of the transfer path between processor and memory and speeds up data access. For example, suppose a processor always reads or writes 8 bytes at a time, starting from an address that is a multiple of 8. If the software ensures that every double starts at a multiple of 8, then reading or writing a double takes exactly one memory operation. Otherwise it may take two, because the data may straddle two aligned 8-byte memory blocks.
Some processors fault when data is misaligned, whereas Intel's IA-32 processors work correctly whether or not the data is aligned. Even so, Intel recommends aligning all program data wherever possible for better performance.
The Microsoft C compiler (cl.exe for 80x86) on the Win32 platform uses the following rule by default: the alignment modulus of any basic data type T is the size of T, that is, sizeof(T). For example, a double (8 bytes) must always sit at an address that is a multiple of 8, while a char (1 byte) may start at any address. GCC on Linux follows a different rule (I have not verified this against documentation; corrections are welcome): any 2-byte data type such as short (and perhaps single-byte types too?) has an alignment modulus of 2, while all data types larger than 2 bytes (such as long and double) have an alignment modulus of 4.
Now back to the struct we care about. ANSI C specifies that the size of a structure type is the sum of the sizes of all its fields plus the sizes of the padding areas between fields or at the end. Padding? Yes: this is space added to the struct so that its fields meet their memory alignment requirements. And what is the alignment requirement of the struct itself? The ANSI C standard specifies that the alignment requirement of a struct type cannot be looser than that of its strictest field. It may be stricter, but that is not required; VC7.1 makes it exactly as strict. Let's look at an example (the test environment below is Intel Celeron 2.4 GHz + Win2000 Pro + VC7.1, with the memory alignment compile option left at "default", that is, neither the /Zp nor the /pack option is specified):
typedef struct ms1
{
    char a;
    int b;
} MS1;
Suppose MS1 used the following memory layout (in this article, memory addresses increase from left to right):

+---+---------------+
| a |       b       |
+---+---------------+
  1         4          bytes

Because the field with the strongest alignment requirement in MS1 is b (int), the compiler's alignment rules and the ANSI C standard together require the starting address of an MS1 object to be a multiple of 4 (the alignment modulus of int). Can field b in the layout above then satisfy int's alignment requirement? Of course not. If you were the compiler, how would you arrange things to keep the CPU happy? After a millisecond of hard thinking, you would surely come up with this scheme:
+---+-----------+---------------+
| a |  padding  |       b       |
+---+-----------+---------------+
  1       3             4          bytes
This scheme inserts three padding bytes between a and b, so that whenever the starting address of the struct object is 4-byte aligned, field b is 4-byte aligned too, as int requires. Therefore sizeof(MS1) should be 8, and the offset of field b from the start of the struct is 4. Easy to follow, right? Now let's swap the order of the fields in MS1:
typedef struct ms2
{
    int a;
    char b;
} MS2;
You might think MS2 is simpler than MS1, and that its layout should be:
+---------------+---+
|       a       | b |
+---------------+---+
        4         1    bytes
Because an MS2 object must also obey the 4-byte alignment rule, a is 4-byte aligned: its address equals the starting address of the struct. The analysis is reasonable but incomplete. Consider what happens when we define an array of MS2. The C standard guarantees that the space occupied by an array of any type (including user-defined structure types) equals the size of one element multiplied by the number of elements. In other words, there are no gaps between array elements. Under the scheme above, an MS2 array would be laid out as:
|<- array[0] -->|<- array[1] -->|<- array[2] ...
+-----------+---+-----------+---+--------------
|     a     | b |     a     | b | ...
+-----------+---+-----------+---+--------------
      4       1       4       1      bytes
When the starting address of the array is 4-byte aligned, array[0].a is 4-byte aligned too, but what about array[1].a? And array[2].a? Clearly this scheme cannot make the fields of every element meet their alignment requirements when a struct array is defined, so it must be revised as follows:
+---------------+---+-----------+
|       a       | b |  padding  |
+---------------+---+-----------+
        4         1       3        bytes
Now, whether we define a single MS2 variable or an MS2 array, every field of every element meets its alignment requirement. So sizeof(MS2) is again 8, the offset of a is 0, and the offset of b is 4.
Good. You now have the basic principles of struct memory layout; try analyzing a slightly more complex type.
typedef struct ms3
{
    char a;
    short b;
    double c;
} MS3;
I think you can get the correct layout:

+---+---------+-------+-----------+---------------+
| a | padding |   b   |  padding  |       c       |
+---+---------+-------+-----------+---------------+
  1      1        2         4             8          bytes

sizeof(short) is 2, so field b must start at an even address, and one padding byte therefore follows a. sizeof(double) is 8, so field c must start at an address that is a multiple of 8. Fields a and b plus the padding byte already occupy 4 bytes, so adding four more padding bytes after b ensures that c is aligned. sizeof(MS3) is 16, the offset of b is 2, and the offset of c is 8. Next, let's look at a struct that has a field of structure type:
typedef struct ms4
{
    char a;
    MS3 b;
} MS4;
The field with the strictest alignment requirement in MS3 is c, so the alignment modulus of MS3 is the same as that of double (8). Field a must therefore be followed by 7 padding bytes, and the layout of MS4 should be:
+---+-----------+---------------+
| a |  padding  |       b       |
+---+-----------+---------------+
  1       7            16          bytes
Clearly, sizeof(MS4) is 24, and the offset of b is 8.
In real development, we can change the compiler's alignment rules with the /Zp compile option. For example, specifying /Zpn (in VC7.1, n can be 1, 2, 4, 8, or 16) tells the compiler that the maximum alignment modulus is n. In that case, basic data types of n bytes or fewer keep their default alignment rules, while types larger than n bytes have their alignment modulus capped at n. In fact, the default alignment of VC7.1 is equivalent to /Zp8. A careful read of MSDN's description of this option shows that it solemnly warns programmers not to use /Zp1 and /Zp2 on MIPS and Alpha platforms, and not to specify /Zp4 and /Zp8 on 16-bit platforms (think about why). Changing the compiler's alignment option and then re-analyzing the memory layouts of the four structs above against the program's actual output makes a good review exercise.
We can now answer the last question raised at the start of this article. The memory layout of a struct depends on the CPU, the operating system, the compiler, and the alignment options used at compile time. Your program may need to run on several platforms, and your source may be compiled by different people with different compilers (imagine you are publishing an open-source library). So unless it is absolutely necessary, your program should never depend on these quirky memory layouts. Incidentally, if two modules of one program are compiled with different alignment options, very subtle errors may result. If your program behaves in ways that are hard to explain, check the compile options of every module carefully.

This article is from the CSDN blog. When reposting, please credit the source: http://blog.csdn.net/qiuqiu173/archive/2007/12/12/1931283.aspx
