Byte Alignment Considerations

Source: Internet
Author: User
Tags: data structures, pack
Alignment Criteria

Let's take a look at four important basic concepts:

1 The alignment value of the data type itself: on a typical 32-bit platform, char has an alignment value of 1 byte, short 2 bytes, int and float 4 bytes, and double 8 bytes.

2 The alignment value of a structure or class itself: the largest alignment value among its members.

3 The specified alignment value: the value given by #pragma pack (value).

4 The effective alignment value of a data member, structure, or class: the smaller of its own alignment value and the specified alignment value, that is, effective alignment value = min{own alignment value, currently specified pack value}.

With these values, it is easy to work out the alignment of a data structure and of each of its members.

The effective alignment value N is what ultimately determines where data is placed in memory. Aligning to N means that the data's start address satisfies "start address % N == 0". Data members are stored in the order in which they are defined, and the start address of the first member is the start address of the structure. Each member is aligned according to its own effective alignment value, and the structure itself is rounded up according to its effective alignment value (that is, the total size occupied by the structure's members must be an integer multiple of the structure's effective alignment value).

These concepts are easy to understand, but I personally prefer the following alignment rules.

The details of structure byte alignment depend on the specific compiler implementation, but they generally satisfy three rules:

1 The start address of a structure variable is divisible by the size of its widest basic-type member;

2 The offset of each member relative to the start address of the structure is an integer multiple of that member's size; if necessary, the compiler inserts padding bytes between members (internal padding);

3 The total size of the structure is an integer multiple of the size of its widest basic-type member; if necessary, the compiler adds padding bytes after the last member (trailing padding).

These rules are explained as follows:

First: When the compiler allocates space for a structure, it first finds the widest basic data type in the structure and then looks for a memory address divisible by the size of that type to use as the start address of the structure. The size of this widest basic data type serves as the alignment modulus mentioned above.

Second: Before allocating space for a member, the compiler checks whether the offset of the candidate address relative to the start address of the structure is an integer multiple of the member's size. If so, the member is stored there; otherwise, the compiler inserts padding bytes between this member and the previous one until the offset is an integer multiple, that is, the candidate address is moved back by a few bytes.

Third: The total size of the structure includes padding bytes. Besides satisfying the first two rules, the last member must also satisfy the third; otherwise, padding bytes are added after it until the requirement is met.

The pitfalls of alignment

1. Data type conversion

Many alignment pitfalls in code are implicit. For example, when a cast is used:

int main (void) {
    unsigned int i = 0x12345678;

    unsigned char *p = (unsigned char *) &i;
    *p = 0x00;
    unsigned short *p1 = (unsigned short *) (p + 1);
    *p1 = 0x0000;

    return 0;
}

The last two statements access an unsigned short at an odd address, which clearly violates the alignment rules. On x86 such an access only costs performance, but on MIPS or SPARC, which require aligned access, it may raise an error.

For struct B from section 3.1.1, define the following function:

void Func (struct B *p) {
    // code
}

If the function body accesses p->a directly, an exception is likely. MIPS treats a as an int, so its address should be a multiple of 4, but the address of p->a may well not be.

If the address in p is not on an alignment boundary, problems can arise, for example when p points into a data packet received from another CPU (where data of various types is laid out back to back), or when p is computed by pointer arithmetic. Pay special attention when processing interface data that arrives from another CPU, and when casting an offset pointer to a structure pointer before accessing it.

The solution is as follows:

1 Define a local variable of the structure type and copy the data into it with memmove.

void Func (struct B *p) {
    struct B tdata;
    memmove (&tdata, p, sizeof (struct B));
    // tdata.a can now be accessed safely, because the compiler places tdata at a correctly aligned start address
}

Note: If you are sure that the start address in p is properly aligned, this is unnecessary; if you are not sure (be especially careful with data received from another CPU or data reached through pointer arithmetic), do it this way.

2 Use #pragma pack (1) to define the structure type struct_t with 1-byte alignment.

2. Data communication between processors

When processors communicate through messages (which in C/C++ are structures), both byte alignment and byte order need attention.

Most compilers provide memory alignment options, which let the user choose a different byte alignment for each processor. For example, the #pragma pack (n) directive (n = 1, 2, 4) offered by C/C++ compilers tells the compiler, when it generates the object file, to place data at memory addresses divisible by 1, 2, or 4 bytes.

However, across different compilation platforms or processors, byte alignment can change the length of a message structure. The compiler may pad the message structure to satisfy alignment, and different platforms may pad it differently, which greatly increases the risk in data communication between processors.

The following takes a 32-bit processor as an example and proposes a memory alignment approach that addresses this problem.

For data structures used only locally, use four-byte alignment to improve memory access efficiency. To reduce memory overhead, arrange the members of the structure sensibly so that the gaps that four-byte alignment leaves between members are kept small.

For data structures exchanged between processors, the length of the message structure must not change across compilation platforms or processors, so pack the message structure with one-byte alignment. To keep access to the message data efficient, use explicit padding bytes so that the members of the message are themselves four-byte aligned.

The order of members in a data structure should take into account the relationships between members, access efficiency, and space utilization. The ordering principle is: four-byte members come first, two-byte members immediately follow the last four-byte member, one-byte members immediately follow the last two-byte member, and padding bytes go at the end.

Examples are as follows:

typedef struct tag_t_msg {
    long  Paraa;
    long  parab;
    short Parac;
    char  Parad;
    char  Pad;   /* padding byte */
} t_msg;

3. Troubleshoot alignment issues

If an alignment or assignment problem appears, check the following:

1 The compiler's byte-order (big-endian or little-endian) setting;

2 Whether the processor architecture itself supports unaligned access;

3 If it does, whether alignment has been configured; if it does not, whether the access needs a special qualifier to mark it as a special access operation.

4. Change the alignment

This section is mainly about changing the C compiler's default byte alignment.

By default, the C compiler allocates space for each variable or data unit according to its natural boundary. In general, the default can be changed as follows: with the directive #pragma pack (n), the C compiler aligns to n bytes; with the directive #pragma pack (), the custom byte alignment is cancelled.

In addition, there is the following GCC-specific syntax:

__attribute__ ((aligned (n))): aligns the structure member it is applied to on an n-byte natural boundary. If a member of the structure is longer than n, alignment follows the length of the largest member.

__attribute__ ((packed)): cancels the optimized alignment of the structure during compilation, so that members are aligned according to the number of bytes they actually occupy.

Note: The __attribute__ mechanism is a major feature of GCC; it can set function attributes, variable attributes, and type attributes.

Bit field alignment

1. Bit-field definition

Some information does not need a full byte when stored, only one or a few bits. For example, a switch value has only two states, 0 and 1, so a single bit is enough. To save storage space and simplify processing, the C language provides a data structure called a "bit field" or "bit segment".

A bit field is a special structure or union member (that is, it can only be used inside a struct or union) that specifies how many bits the member occupies in memory, so that data can be represented more compactly inside the machine. Each bit field has a field name, which lets the program operate on the corresponding bits by name. In this way, several different objects can be represented with the bits of a single byte.

A bit-field structure is defined much like an ordinary structure, in the form:

struct bit-field-structure-name

{ bit-field list };

where each entry in the bit-field list has the form:

type-specifier bit-field-name : bit-field-width;

A bit field is used in the same way as a structure member; the general form is:

bit-field-variable-name.bit-field-name

Bit fields allow output in various formats.

A bit field is essentially a kind of structure type whose members are allocated in bits. Bit-field variables are declared in the same way as structure variables: define the type first and declare the variable later, define and declare at the same time, or declare the variable directly.

Bit fields are mainly used in the following two scenarios:

1 When the machine has little available memory and bit fields can save a significant amount of it, for example when the structure is the element type of a large array.

2 When a structure or union must be mapped onto a predetermined layout, for example when specific bits within a byte need to be accessed.

2. Alignment Guidelines

sizeof cannot be applied to a bit-field member on its own. The following discusses the sizeof of a structure that contains bit fields.

C99 specifies that int, unsigned int, and bool (_Bool) can be used as bit-field types, but almost all compilers extend this to allow other types. Bit fields are a very common programming tool in embedded systems; their advantage is that they reduce the storage a program needs.

The alignment rules are roughly as follows:

1 If adjacent bit fields have the same type and the sum of their bit widths is less than the size of the type in bits, the later field is stored immediately after the earlier one until no more fields fit;

2 If adjacent bit fields have the same type but the sum of their bit widths is greater than the size of the type in bits, the later field starts in a new storage unit, at an offset that is an integer multiple of the size of its type;

3 If adjacent bit fields have different types, the behavior depends on the compiler: VC6 does not compress them, while Dev-C++ and GCC do;

4 If non-bit-field members are interspersed between bit fields, no compression is performed;

5 The total size of the whole structure is an integer multiple of the size of its widest basic-type member, and bit fields are aligned according to the byte size of their widest type.

"Example 5"

struct bitfield {
    char element1 : 1;
    char element2 : 4;
    char element3 : 5;
};

The bit-field type is char, and the first byte can only hold element1 and element2, so element1 and element2 are compressed into the first byte and element3 has to start in the next byte. The result of sizeof (struct bitfield) is therefore 2.

"Example 6"
