sizeof, the Ultimate Guide (Part 1)

Source: Internet
Author: User

0. Forward Declaration

sizeof, that deceptively simple fellow, has brought countless newbies to their knees. I made my fair share of mistakes with it at the beginning too, so in the spirit of "let one person do the hard work so that millions can be happy", I decided to summarize it in as much detail as I can.

As I wrote the summary, though, I found that this topic can be both simple and complex. Some parts of this article are therefore not suited to beginners, and some are arguably not worth writing about at all. But if you want to know not only the what but also the why, this article may help you.

My knowledge of C++ is not deep, so there are bound to be mistakes in here. Corrections are welcome.

1. Definition

What exactly is sizeof? sizeof is an operator in C/C++. In short, it yields the number of bytes of memory occupied by an object or a type.

The description on MSDN is as follows:

The sizeof keyword gives the amount of storage, in bytes, associated with a variable or a type (including aggregate types).

This keyword returns a value of type size_t.

The result has type size_t, which is defined in the header file stddef.h. It is an unsigned integer type whose exact definition depends on the compiler; on 32-bit systems it is typically defined as:

typedef unsigned int size_t;

There are many compilers in the world, but the standard guarantees that sizeof(char), sizeof(signed char), and sizeof(unsigned char) are all 1. After all, char is the smallest data type we can program with.

2. Syntax

sizeof has three syntax forms:

1) sizeof(object);     // sizeof applied to an object, with parentheses
2) sizeof(type_name);  // sizeof applied to a type
3) sizeof object;      // sizeof applied to an object, without parentheses

So,

int i;
sizeof(i);    // OK
sizeof i;     // OK
sizeof(int);  // OK
sizeof int;   // error

Since form 3 can always be replaced by form 1, just forget form 3: this keeps the style uniform and reduces the burden on our brains.

In fact, taking the sizeof of an object is converted into taking the sizeof of the object's type; that is, different objects of the same type have the same sizeof value. Here "object" can be extended to any expression: sizeof can be applied to an expression, in which case the compiler determines the size from the type of the expression's result and generally does not evaluate the expression. For example:

sizeof(2);         // the type of 2 is int, so this is equivalent to sizeof(int)
sizeof(2 + 3.14);  // 3.14 is a double and 2 is converted to double, so this is equivalent to sizeof(double)

sizeof can also be applied to a function call. The result is the size of the function's return type, and the function is not called. Here is a complete example:

#include <stdio.h>

char foo()
{
    printf("foo() has been called.\n");
    return 'a';
}

int main()
{
    size_t sz = sizeof(foo());  // foo() returns char, so sz = sizeof(char); foo() is not called
    printf("sizeof(foo()) = %u\n", (unsigned)sz);
    return 0;
}

According to the C99 standard, sizeof cannot be applied to functions, to expressions of incomplete type, or to bit-field members. That is, the following statements are all errors:

sizeof(foo);     // error: foo is a function

void foo2() {}
sizeof(foo2());  // error: the type of foo2() is void, an incomplete type

struct S
{
    unsigned int f1 : 1;
    unsigned int f2 : 5;
    unsigned int f3 : 12;
};
struct S s;
sizeof(s.f1);    // error: f1 is a bit-field

3. sizeof as a constant expression

sizeof is evaluated at compile time, so its result can be used as a constant expression, for example:

char ary[sizeof(int) * 10];  // OK

The newer C99 standard stipulates that sizeof may also be evaluated at run time. For example, the following program runs correctly in Dev-C++:

int n;
n = 10;                                 // n is assigned at run time
char ary[n];                            // C99 supports variable-length arrays
printf("%u\n", (unsigned)sizeof(ary));  // OK, prints 10

However, this does not work with compilers that do not fully implement C99; the code above fails to compile in VC6. So it is best to treat sizeof as a compile-time operation: that never leads to errors and makes the program more portable.

4. sizeof of basic data types

The basic data types here are the simple built-in types such as short, int, long, float, and double. Because their sizes depend on the system, the values may differ from one system to another. This deserves our attention: try not to let it cause trouble when porting programs.

In a typical 32-bit compilation environment, sizeof(int) is 4.

5. sizeof of pointer variables

If you have studied data structures, you know that the pointer is a very important concept: it records the address of another object. Since a pointer stores an address, its size is naturally equal to the width of the computer's address bus. So on a 32-bit machine the sizeof of a pointer variable is 4 (remember that the result is in bytes), and we can predict that on 64-bit systems it will be 8.

const char* pc = "abc";
int* pi;
string* ps;
const char** ppc = &pc;
void (*pf)();  // function pointer
sizeof(pc);    // the result is 4
sizeof(pi);    // the result is 4
sizeof(ps);    // the result is 4
sizeof(ppc);   // the result is 4
sizeof(pf);    // the result is 4

The sizeof of a pointer variable has nothing to do with the object it points to. Because all pointer variables occupy the same amount of memory, MFC message handlers can pass all kinds of complex message structures (through pointers to structs) using just two parameters, WPARAM and LPARAM.

6. sizeof of arrays

The sizeof of an array is the number of bytes of memory the array occupies, for example:

char a1[] = "abc";
int a2[3];
sizeof(a1);  // the result is 4: the string ends with a null terminator
sizeof(a2);  // the result is 3 * 4 = 12 (depends on the size of int)

Some people start out assuming that sizeof gives the number of array elements. By now you should know that this is wrong. So how do we get the number of elements? Easy; either of the following works:

int c1 = sizeof(a1) / sizeof(char);   // total size / size of one element
int c2 = sizeof(a1) / sizeof(a1[0]);  // total size / size of the first element

Having come this far, a question: what are the values of c3 and c4 below?

void foo3(char a3[3])
{
    int c3 = sizeof(a3);  // c3 == ?
}

void foo4(char a4[])
{
    int c4 = sizeof(a4);  // c4 == ?
}

Perhaps by the time you tried to answer c4 you realized that your answer for c3 was wrong. Indeed, c3 != 3. Here the function parameter a3 is no longer an array but a pointer, equivalent to char* a3. Why? Think about it: when we call foo3, does the program allocate a 3-element array on the stack? No! Arrays are passed "by address": the caller only passes the address of the actual array, so a3 is naturally of pointer type (char*), and both c3 and c4 equal sizeof(char*), which is 4 on a 32-bit system.

7. sizeof of structs

This is one of the questions beginners ask most often, so it deserves extra ink. Let's first look at a struct:

struct S1
{
    char c;
    int i;
};

What is sizeof(S1)? Being clever, you start thinking: char takes 1 byte and int takes 4 bytes, so the total should be 5. Is that so? Have you tried it on your machine? You may be right, but most likely you are wrong! In VC6 with default settings the result is 8.

Why? Why am I always the one who gets hurt?

Don't be discouraged. Let's go back to the definition of sizeof: the result of sizeof is the number of bytes of memory occupied by the object or type. So let's look at how S1 is laid out in memory:

S1 s1 = {'a', 0xFFFFFFFF};

After defining the variable above, set a breakpoint, run the program, and inspect the memory where s1 lives. What do you find?

Taking my VC6.0 as an example, the address of s1 is 0x0012FF78, and its contents are:

0012FF78: 61 CC CC CC FF FF FF FF

See it? Why are there three bytes of CC in the middle? Here is what MSDN says:

When applied to a structure type or variable, sizeof returns the actual size, which may include padding bytes inserted for alignment.

So this is the legendary byte alignment! An important topic has surfaced.

Why is byte alignment needed? Computer organization teaches us that alignment speeds up memory access; otherwise fetching data costs extra instruction cycles. For this reason the compiler by default lays out structs (and, in fact, variables elsewhere too) so that basic types of width 2 (short, etc.) sit at addresses divisible by 2, basic types of width 4 (int, etc.) sit at addresses divisible by 4, and so on. As a result, padding bytes may have to be inserted between two members, so the sizeof of the whole struct grows.

Let's swap the positions of char and int in S1:

struct S2
{
    int i;
    char c;
};

What is sizeof(S2) now? Why is it still 8? Look at the memory again: this time there are three padding bytes after member c. Why? Don't worry, the rules are summarized below.

The details of byte alignment depend on the compiler implementation, but in general three rules hold:

1) The start address of a struct variable is divisible by the size of its widest basic-type member;
2) The offset of each member from the start of the struct is an integer multiple of that member's size; if necessary, the compiler inserts padding bytes between members (internal padding);
3) The total size of the struct is an integer multiple of the size of its widest basic-type member; if necessary, the compiler adds padding bytes after the last member (trailing padding).

A few points about the rules above:

1) Didn't we say earlier that the address of a struct member is an integer multiple of its size? Why are we now talking about offsets? Because of rule 1, we only need to consider member offsets, which is simpler. Think about why.
The offset of a member from the start of the struct can be obtained with the macro offsetof(), which is also defined in stddef.h. A typical definition is:

#define offsetof(s, m) ((size_t)&(((s *)0)->m))

For example, to get the offset of c in S2:

size_t pos = offsetof(S2, c);  // pos equals 4

2) "Basic type" means a built-in data type such as char, short, int, float, or double, and "width" means its sizeof. Because a struct member can itself be a composite type, such as another struct, the search for the widest basic-type member must look into the sub-members of composite members rather than treating a composite member as a whole. When determining the offset of a composite member, however, the composite type is treated as a whole.

That is a bit of a mouthful and a bit confusing, so let's look at an example (the concrete values are again for VC6; this will not be repeated later):

struct S3
{
    char c1;
    S1 s;
    char c2;
};

The widest basic-type member of S1 is int. S3 "unpacks" S1 when looking for its widest basic-type member, so the widest basic type of S3 is also int. Consequently, the start address of a variable of type S3 must be divisible by 4, and sizeof(S3) must also be divisible by 4.

The offset of c1 is 0. What about the offset of s? Here s is treated as a whole. As a struct variable it also satisfies the three rules above, so its size is 8 and its offset is 4, which requires 3 padding bytes between c1 and s. No padding is needed between s and c2, so the offset of c2 is 12. Adding the size of c2 gives 13, which is not divisible by 4, so 3 padding bytes are appended at the end. In the end, sizeof(S3) is 16.

From the description above we get a formula:

The size of a struct equals the offset of its last member, plus the size of that member, plus the number of trailing padding bytes, that is:
sizeof(struct) = offsetof(last item) + sizeof(last item) + sizeof(trailing padding)

By now you should have a new understanding of the sizeof of a struct, but don't celebrate too soon: one important factor that affects sizeof has not been mentioned yet, namely the compiler's pack directive. It adjusts the alignment of structs; its name and usage differ slightly between compilers. In VC6 it is done with #pragma pack, or by changing the /Zp compiler switch directly. The basic usage is #pragma pack(n), where n is the alignment in bytes and may be 1, 2, 4, 8, or 16; the default is 8. If n is smaller than the sizeof of a struct member, the alignment used for that member's offset is the smaller of the two. The formula is:

alignment of offsetof(item) = min(n, sizeof(item))

Let's look at an example:

#pragma pack(push)  // save the current pack setting on the stack
#pragma pack(2)     // must come before the struct definitions
struct S1
{
    char c;
    int i;
};
struct S3
{
    char c1;
    S1 s;
    char c2;
};
#pragma pack(pop)   // restore the previous pack setting

When computing sizeof(S1), min(2, sizeof(i)) is 2, so the offset of i is 2; adding sizeof(i) gives 6, which is divisible by 2, so the size of S1 is 6.

Likewise for sizeof(S3): the offset of s is 2 and the offset of c2 is 8; adding sizeof(c2) gives 9, which is not divisible by 2, so one padding byte is added, and sizeof(S3) is 10.

Now you can breathe a sigh of relief. :)

One more thing: the size of an "empty struct" (one with no data members) is not 0 but 1. Just imagine: if a variable occupied no space at all, how could its address be taken, and how could two different "empty struct" variables be told apart? An "empty struct" variable therefore has to be stored somewhere, so the compiler allocates one byte for it. For example:

struct S5 {};
sizeof(S5);  // the result is 1

8. sizeof of structs containing bit-fields

As mentioned earlier, sizeof cannot be applied to a bit-field member on its own. What we discuss here is the sizeof of a struct that contains bit-fields; it gets its own section because of its peculiarities.

C99 specifies that int, unsigned int, and _Bool may be used as bit-field types, but almost all compilers extend this and allow other types as well.

The main purpose of bit-fields is to compress storage. The general rules are:

1) If adjacent bit-fields have the same type and the sum of their widths does not exceed the size of that type in bits, the later field is stored immediately after the earlier one, until no more fit;
2) If adjacent bit-fields have the same type but the sum of their widths exceeds the size of that type in bits, the later field starts in a new storage unit, at an offset that is an integer multiple of the type's size;
3) If adjacent bit-fields have different types, the behavior depends on the compiler: VC6 does not pack them together, while Dev-C++ does;
4) If non-bit-field members are interspersed between bit-fields, no packing takes place;
5) The total size of the struct is an integer multiple of the size of its widest basic-type member.

Let's look at some examples.

Example 1:

struct bf1
{
    char f1 : 3;
    char f2 : 4;
    char f3 : 5;
};

Its memory layout is:

| f1  |  f2   | |   f3    |     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | | | | | | | | | | | | | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
0     3       7 8         13    16 (bit)

The bit-field type is char. The first byte can hold only f1 and f2, so f2 is packed into the first byte, while f3 has to start in the next byte. Therefore sizeof(bf1) is 2.

Example 2:

struct bf2
{
    char f1 : 3;
    short f2 : 4;
    char f3 : 5;
};

Because adjacent bit-fields have different types, sizeof(bf2) is 6 in VC6 and 2 in Dev-C++.

Example 3:

struct bf3
{
    char f1 : 3;
    char f2;
    char f3 : 5;
};

A non-bit-field member sits between the bit-fields, so no packing takes place. The size is 3 in both VC6 and Dev-C++.

9. sizeof of unions

A struct is laid out sequentially in memory, whereas a union is overlapping: all members share one piece of memory, so the sizeof of the whole union is the maximum of the sizeof of its members (rounded up, if necessary, to satisfy alignment). The members of a union can also be composite types; here a composite member is considered as a whole.

So in the following example, sizeof(U) is equal to sizeof(s).

union U
{
    int i;
    char c;
    S1 s;
};

(For details about sizeof in C++, see Part 2 of this article.)
