Memory alignment and bit fields in C structs

Source: Internet
Author: User

I. Memory alignment

Many computer systems restrict where basic data types may be stored in memory. They require the starting address of a data item to be a multiple of some value K (usually 4 or 8). This is memory alignment, and K is called the alignment modulus of the data type. When the ratio of the alignment modulus of a type S to that of another type T is an integer greater than 1, we say that S has a stronger (stricter) alignment requirement than T, and that T has a weaker (looser) one than S.

This requirement simplifies the design of the data path between the processor and memory and improves read speed. For example, suppose a processor always reads or writes 8 bytes at a time, starting at an address that is a multiple of 8. If the software guarantees that every double starts at a multiple of 8, one memory operation is enough to read or write a double. Otherwise two operations may be needed, because the value may straddle two aligned 8-byte blocks. Some processors fault when data is misaligned, whereas Intel's IA-32 processors work correctly whether or not the data is aligned. Intel nevertheless recommends aligning all program data as far as possible for best performance.

The ANSI C standard does not require that variables declared next to each other be adjacent in memory. For efficiency, the compiler handles alignment flexibly, which may leave padding bytes between adjacent variables. The basic data types (int, char, and so on) occupy a fixed amount of memory on a given hardware system, so here we consider only how the members of a struct are laid out.

Alignment policy of the Microsoft C compiler (cl.exe for 80x86) on the Win32 platform:
1) The starting address of a struct variable is divisible by the size of its widest basic-type member.
Note: When the compiler allocates space for a struct, it first finds the widest basic data type in the struct and then picks, as the starting address of the struct, an address divisible by the size of that type. The size of the widest basic type serves as the alignment modulus described above.
2) The offset of each member from the start of the struct is an integer multiple of the member's size. If necessary, the compiler inserts internal padding between members.
Note: Before allocating space for a member, the compiler checks whether the offset of the candidate address from the start of the struct is an integer multiple of the member's size. If so, the member is stored there. Otherwise, padding bytes are inserted between this member and the previous one until the offset is such a multiple; that is, the candidate address is moved forward by a few bytes.
3) The total size of the struct is an integer multiple of the size of its widest basic-type member. If necessary, the compiler adds trailing padding after the last member.
Note: The total size of the struct includes the padding bytes. Besides satisfying the first two rules, the last member must also satisfy the third; otherwise trailing bytes are added after it.

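According to the preceding rules, sizeof(T) is 8 bytes with the VC compiler on Windows. (The definition of the T this refers to is not reproduced here; the struct T shown further below, a char followed by a double, would occupy 16 bytes under the same rules.)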

The GNU GCC compiler follows somewhat different rules: the alignment modulus is not determined by the widest basic data type as described above.

In GCC, as described here for 32-bit x86 targets, the maximum alignment modulus is 4. That is, the alignment modulus can only be 1, 2, or 4, even if the struct contains a double. Rule 2 above requires each offset to be an integer multiple of the member's size; in GCC this holds for members of 4 bytes or less, while a member larger than 4 bytes only needs an offset that is an integer multiple of 4. (On 64-bit x86 targets GCC aligns double on 8 bytes, so the GCC sizes quoted below do not apply there.)
Take the following example:

struct T
{
    char ch;
    double d;
};

In GCC, under these rules, sizeof(struct T) is 12 bytes.

If the struct contains bit-fields, the VC rules are adjusted as follows:
1) If adjacent bit-fields have the same type and the sum of their bit widths does not exceed the size in bits of that type, the later field is stored immediately after the earlier one, until no more fields fit.
2) If adjacent bit-fields have the same type but the sum of their bit widths exceeds the size in bits of that type, the later field starts in a new storage unit, whose offset is an integer multiple of the size of the type.
3) If adjacent bit-fields have different types, the behavior depends on the compiler: VC6 does not pack them (fields of different types are stored in separate units of their own types), while Dev-C++ and GCC pack them.
Note: When the two fields are of different types, for example:

struct N
{
    char c : 2;
    int i : 4;
};

The alignment criteria for structs without bit-fields still apply: the offset of member i from the start of the struct must be an integer multiple of 4. So c is followed by three padding bytes, and then four bytes are allocated as an int unit, four bits of which store i. Struct N therefore occupies 8 bytes in VC. Compilers that pack also follow the alignment criteria for structs without bit-fields. The difference is that if the three padding bytes can hold the following field, it is packed into them; only if it does not fit is separate space allocated. Struct N therefore occupies 4 bytes in GCC or Dev-C++.

4) If non-bit-field members are interspersed among the bit-fields, the bit-fields on either side of them are not packed together.
5) The total size of the entire struct is an integer multiple of the size of its widest basic-type member.
Note (for rule 4): for example:

typedef struct
{
    char c : 2;
    double i;
    int c2 : 4;
} N3;

In GCC (with the 32-bit rules above) N3 occupies 16 bytes, and in VC it occupies 24 bytes.

PS:

  • The alignment modulus is chosen from basic data types only. For a struct nested inside another struct, therefore, break the nested struct down into its basic types when looking for the widest one. For rule 2 of the alignment criteria, however, the nested struct is treated as a single member whose size is determined by applying the alignment criteria to it.
  • Class objects are laid out in memory in much the same way as structs, so they are not described here. Note that the size of a class object includes only the space occupied by the class's non-static member variables, plus the space for one pointer if the class has virtual functions.
  • Memory alignment also depends on compiler settings:

    1. First, find out the compiler's default alignment value.

    2. If you do not want the compiler's default, use #pragma pack(n) to align on n bytes.

    3. Alignment of each struct member: if the alignment parameter n (the compiler default or the value set by the pragma) is greater than the number of bytes m the member occupies, the member is aligned on m and its offset is a multiple of m; otherwise it is aligned on n and its offset is a multiple of n. In other words, the smaller of the two applies.

    4. Total struct size: it must be an integer multiple of the largest member alignment, where each member's alignment is the one obtained in step 3.

    5. Supplement: if struct A contains a struct B as a member, B is aligned according to the alignment of its longest member.

II. Bit fields

Some information does not need a full byte of storage, only one or a few binary bits. A switch value, for example, has only two states, 0 and 1, so a single bit is enough. To save storage space and simplify processing, C provides a data structure called a "bit field" (or "bit segment"). A bit field divides the bits of a storage unit into several regions and specifies the number of bits in each. Each field has a name, and the program can operate on it by that name. In this way several different objects can be packed into the bits of a single byte.

1. Defining bit fields and declaring bit-field variables

A bit-field structure is defined in much the same way as an ordinary structure, in the form:
struct bit-field-structure-name
{ bit-field list };
Each entry in the bit-field list has the form: type-specifier field-name : width;

For example:
struct BS
{
    int a : 8;
    int b : 2;
    int c : 6;
};
Bit-field variables are declared in the same ways as structure variables: define the type first and declare variables later, define the type and declare variables at the same time, or declare variables directly without a tag. For example:
struct BS
{
    int a : 8;
    int b : 2;
    int c : 6;
} data;
This declares data as a variable of type struct BS whose fields use 16 bits (two bytes) in total: a occupies 8 bits, b occupies 2 bits, and c occupies 6 bits. (sizeof(data) may still be larger, because many compilers allocate a whole int for int bit-fields.) The following rules apply to bit-field definitions:

1. A bit field must be stored within a single storage unit and cannot span two units. If the space left in a unit is not enough for the next field, that field starts in the next unit. You can also deliberately start a field at the next unit. For example:
struct BS
{
    unsigned a : 4;
    unsigned   : 0;   /* empty field */
    unsigned b : 4;   /* stored from the next unit */
    unsigned c : 4;
};
In this definition, a occupies 4 bits of the first unit, and the remaining bits of that unit are left unused. b starts at the next unit and occupies 4 bits, and c occupies the 4 bits after it.

2. Because a bit field cannot span two storage units, its width cannot exceed the number of bits in its unit. With a one-byte unit, that means no more than 8 bits; in general, the width cannot exceed the number of bits in the declared type.

3. A bit field may have no name. It is then used only for padding or for adjusting position, and an unnamed field cannot be accessed. For example:
struct K
{
    int a : 1;
    int   : 2;   /* these two bits cannot be used */
    int b : 3;
    int c : 2;
};
From the above we can see that a bit-field structure is essentially a structure type whose members are allocated in units of bits.

2. Using bit fields

Bit fields are used in the same way as structure members. The general form is bit-field-variable-name.field-name, and bit fields can be printed in various formats.
#include <stdio.h>

int main(void)
{
    struct BS
    {
        unsigned a : 1;
        unsigned b : 3;
        unsigned c : 4;
    } bit, *pbit;

    bit.a = 1;
    bit.b = 7;
    bit.c = 15;
    printf("%d, %d, %d\n", bit.a, bit.b, bit.c);
    pbit = &bit;
    pbit->a = 0;
    pbit->b &= 3;
    pbit->c |= 1;
    printf("%d, %d, %d\n", pbit->a, pbit->b, pbit->c);
    return 0;
}

The example defines the bit-field structure BS with three fields a, b, and c, and declares a variable bit of type struct BS and a pointer pbit to struct BS. This shows that pointers can also be used with bit-field structures.
The program first assigns values to the three fields (note that an assigned value must not exceed the range the field can hold) and prints them in integer format. It then stores the address of bit in the pointer pbit and, through the pointer, reassigns field a to 0. Next it uses the compound bitwise operator "&=": pbit->b &= 3 is equivalent to pbit->b = pbit->b & 3. Field b originally holds 7, and 7 & 3 is 3 (111 & 011 = 011, which is 3 in decimal). Similarly, the compound operator "|=" in pbit->c |= 1 is equivalent to pbit->c = pbit->c | 1, and the result is 15. Finally the program prints the three fields through the pointer.
