C++ sizeof rules and trap analysis

Last Update:2018-12-04 Source: Internet

Author: User

1. What is sizeof

First, let's take a look at the definition of sizeof on MSDN:

The sizeof keyword gives the amount of storage, in bytes, associated with a variable or a type (including aggregate types). This keyword returns a value of type size_t.

Does the word "returns" make you think of a function? Wrong: sizeof is not a function. Have you ever seen arguments passed to a function without parentheses? sizeof allows that, so it cannot be a function. Formally, sizeof is a unary operator, but an unusual one: its operand is never evaluated, and the whole expression is replaced by a constant during compilation. In that respect it behaves more like a special compile-time macro than like an ordinary operator. For example:

cout << sizeof(int) << endl;    // an int is 4 bytes on a 32-bit machine
cout << sizeof(1 == 2) << endl; // == yields bool, so this is equivalent to cout << sizeof(bool) << endl;

At compile time this becomes:

cout << 4 << endl;
cout << 1 << endl;

　
Here is a trap. Look at the following program:

int a = 0;
cout << sizeof(a = 3) << endl;
cout << a << endl;

Why is the output 4 and 0 rather than the expected 4 and 3? The reason is the way sizeof is processed at compile time. The operand of sizeof, that is, the content inside (), is never compiled into machine code; only its type is used. The = operator yields the type of its left operand, so a = 3 is equivalent to int, and the code becomes:

int a = 0;
cout << 4 << endl;
cout << a << endl;

Therefore side effects inside sizeof never take place, which is where it differs from ordinary unary operators.

　
Conclusion: Do not treat sizeof as a function or as an ordinary unary operator; think of it as something that is resolved at compile time.
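
The trap above can be checked with a small sketch. It assumes a C++11 compiler, and the function name value_after_sizeof_assignment is invented for this illustration.

```cpp
#include <cstddef>

// sizeof(1 == 2) only looks at the type of the expression, which is bool.
static_assert(sizeof(1 == 2) == sizeof(bool), "a comparison yields bool");

// Hypothetical helper: returns a after it has appeared in an assignment
// that sits inside sizeof.
int value_after_sizeof_assignment()
{
    int a = 0;
    std::size_t n = sizeof(a = 3); // the operand is not evaluated, so a stays 0
    (void)n;                       // silence "unused variable" warnings
    return a;
}
```

Calling value_after_sizeof_assignment() yields 0, not 3. Some compilers warn about the assignment inside sizeof, which is a useful hint that the side effect is lost.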

2. sizeof usage

sizeof can be used in two ways:

(1) sizeof(object)

That is, sizeof applied to an object (more generally, an expression). It can also be written as sizeof object.

(2) sizeof(typename)

That is, sizeof applied to a type. Note that writing sizeof typename is invalid in this case. The following are examples:

int i = 2;
cout << sizeof(i) << endl;   // sizeof(object) form, valid
cout << sizeof i << endl;    // sizeof object form, valid
cout << sizeof 2 << endl;    // 2 is treated as an object of type int; sizeof object form, valid
cout << sizeof(2) << endl;   // 2 is treated as an object of type int; sizeof(object) form, valid
cout << sizeof(int) << endl; // sizeof(typename) form, valid
cout << sizeof int << endl;  // error! a type name must be enclosed in ()

We can see that adding () is always the right choice.

Conclusion: It is best to always add (), whatever sizeof is applied to.

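3. sizeof of data types

(1) Built-in data types of C++

On a typical 32-bit C++ implementation, the basic data types char, short int (short), int, long int (long), float, double and long double

have the sizes 1, 2, 4, 4, 4, 8 and 10 respectively. (The size of long double depends on the compiler; 8 and 12 are also common.)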

　
Consider the following code:

cout << (sizeof(unsigned int) == sizeof(int)) << endl; // equal, prints 1

unsigned only changes how the highest bit is interpreted; it does not change the length of the data.

Conclusion: unsigned does not affect the value of sizeof.
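
The concrete numbers above belong to one 32-bit platform. As a hedged sketch (assuming C++11), the relations below are the ones that hold regardless of platform; int_bits is a name invented for this example.

```cpp
#include <climits>

// sizeof(char) is 1 by definition; the other sizes are only ordered.
static_assert(sizeof(char) == 1, "char is the unit of sizeof");
static_assert(sizeof(short) <= sizeof(int), "short is no larger than int");
static_assert(sizeof(int) <= sizeof(long), "int is no larger than long");

// Signedness never changes the size.
static_assert(sizeof(unsigned int) == sizeof(int), "unsigned keeps the size");
static_assert(sizeof(unsigned long) == sizeof(long), "unsigned keeps the size");

// Hypothetical helper: width of int in bits on the current platform.
int int_bits()
{
    return static_cast<int>(sizeof(int) * CHAR_BIT);
}
```

On the 32-bit platform the article describes, int_bits() would be 32; the language itself only guarantees at least 16.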

(2) Custom data types

typedef can be used to define custom types in C++. Consider the following:

typedef short WORD;
typedef long DWORD;
cout << (sizeof(short) == sizeof(WORD)) << endl; // equal, prints 1
cout << (sizeof(long) == sizeof(DWORD)) << endl; // equal, prints 1

Conclusion: the sizeof value of a custom type is the same as that of its underlying type.

(3) Function types

Consider the following:

int f1() { return 0; }
double f2() { return 0.0; }
void f3() {}

cout << sizeof(f1()) << endl; // f1() returns int, so this is sizeof(int)
cout << sizeof(f2()) << endl; // f2() returns double, so this is sizeof(double)
cout << sizeof(f3()) << endl; // error! sizeof cannot be applied to void
cout << sizeof(f1) << endl;   // error! sizeof cannot be applied to a function
cout << sizeof *f2 << endl;   // error as well: *f2 is still the function itself, not a call to it; write sizeof f2() to get sizeof(double)

Conclusion: When sizeof is applied to a function call, it is replaced at compile time by the size of the function's return type; the function itself is never called.
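
A minimal sketch of this rule, assuming C++11. The counter call_count and the functions counted and size_of_result are invented for the illustration; they show that the function inside sizeof is never run.

```cpp
#include <cstddef>

int call_count = 0; // hypothetical counter, incremented on every real call

int counted()
{
    ++call_count;
    return 0;
}

double f2() { return 0.0; }

// Only the return type matters.
static_assert(sizeof(counted()) == sizeof(int), "size of the int result");
static_assert(sizeof(f2()) == sizeof(double), "size of the double result");

// A pointer to a function is an ordinary object, so sizeof accepts it.
static_assert(sizeof(&f2) == sizeof(double (*)()), "pointer to function");

// Hypothetical helper: uses sizeof on a call expression at run time.
std::size_t size_of_result()
{
    return sizeof(counted()); // counted() is not called here
}
```

After size_of_result() returns, call_count is still 0.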

4. Pointer problems

Consider the following:

cout << sizeof(string*) << endl; // 4
cout << sizeof(int*) << endl;    // 4
cout << sizeof(char***) << endl; // 4

We can see that whatever the pointer type, the size is 4, because on a 32-bit platform a pointer is a 32-bit address.

Conclusion: The size of a pointer is 4. (On a 64-bit machine it becomes 8.)
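
Because the value 4 is tied to 32-bit platforms, code is better off asking the compiler. The sketch below uses invented helper names; the claim that all object pointers share one size holds on mainstream platforms but is an assumption, not a language guarantee.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: number of bits in an object pointer on this platform
// (32 on the platform described in the article, 64 on most current desktops).
std::size_t pointer_bits()
{
    return sizeof(void*) * CHAR_BIT;
}

// Hypothetical helper: checks that differently typed object pointers have
// the same size, as the article observes.
bool object_pointers_share_size()
{
    return sizeof(int*) == sizeof(char***) && sizeof(int*) == sizeof(void*);
}
```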

By the way, a pointer on a 32-bit platform holds an actual memory address. Unlike the old 16-bit compilers, 32-bit C++ compilers no longer divide memory into models: small, medium, large and so on have been replaced by a single flat model. The flat model uses 32-bit linear addresses rather than the 16-bit segment:offset scheme. For example, take a pointer to the address f000:8888. As a 16-bit near pointer it is 8888 (16 bits, storing only the offset and omitting the segment). As a far pointer it is f0008888 (32 bits, with the segment in the high word and the offset in the low word). As a flat pointer it is f8888 (32 bits, equal to segment x 16 + offset, but with a much larger addressable range).

5. Array problems

Consider the following:

char a[] = "abcdef";
int b[20] = {3, 4};
char c[2][3] = {"aa", "bb"};

cout << sizeof(a) << endl; // 7
cout << sizeof(b) << endl; // 20 * 4
cout << sizeof(c) << endl; // 6

The size of array a is not specified in its definition. The space allocated to it at compile time is determined by its initializer, which is 7 bytes (six characters plus the terminating '\0'). c is a multi-dimensional array, and the space it occupies is the product of its dimensions, that is, 6. As you can see, the size of an array is the space allocated to it at compile time, that is, the product of its dimensions times the size of an element.

Conclusion: The size of an array is the product of its dimensions times the size of an array element.
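
The same three arrays can be checked at compile time. This is a sketch assuming C++11; element_count is a helper invented here, not part of the original article.

```cpp
#include <cstddef>

char a[] = "abcdef";
int b[20] = {3, 4};
char c[2][3] = {"aa", "bb"};

static_assert(sizeof(a) == 7, "six characters plus the terminating zero");
static_assert(sizeof(b) == 20 * sizeof(int), "20 elements of type int");
static_assert(sizeof(c) == 2 * 3 * sizeof(char), "product of the dimensions");

// Hypothetical helper: number of elements in the first dimension of an array.
// It only accepts real arrays, so passing a pointer fails to compile.
template <typename T, std::size_t N>
constexpr std::size_t element_count(T (&)[N])
{
    return N;
}
```

element_count(b) gives 20 without the sizeof(b) / sizeof(int) arithmetic.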

There is a trap:

int* d = new int[10];
cout << sizeof(d) << endl; // 4

d is what we often call a dynamic array, but it is actually a pointer, so sizeof(d) is 4.

Consider the following:

double* (*a)[3][6];
cout << sizeof(a) << endl;     // 4
cout << sizeof(*a) << endl;    // 72
cout << sizeof(**a) << endl;   // 24
cout << sizeof(***a) << endl;  // 4
cout << sizeof(****a) << endl; // 8

a has a rather strange definition: it is a pointer to an array of type double*[3][6]. Since it is a pointer, sizeof(a) is 4.

Since a is a pointer to double*[3][6], *a is a multi-dimensional array of type double*[3][6], so sizeof(*a) = 3 * 6 * sizeof(double*) = 72. Similarly, **a is an array of type double*[6], so sizeof(**a) = 6 * sizeof(double*) = 24. ***a is a single element, that is, a double*, so sizeof(***a) = 4. Finally, ****a is a double, so sizeof(****a) = sizeof(double) = 8.
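
The arithmetic above can be written so that it does not depend on the pointer size. This sketch assumes C++11; whole_array_size is an invented helper. The pointer is null and is never dereferenced at run time, because sizeof does not evaluate its operand.

```cpp
#include <cstddef>

double* (*a)[3][6] = nullptr; // pointer to a 3 x 6 array of double*

static_assert(sizeof(*a) == 3 * 6 * sizeof(double*), "the whole 3 x 6 array");
static_assert(sizeof(**a) == 6 * sizeof(double*), "one row of 6 pointers");
static_assert(sizeof(***a) == sizeof(double*), "a single double*");
static_assert(sizeof(****a) == sizeof(double), "the double pointed to");

// Hypothetical helper: size of the array that a points to.
std::size_t whole_array_size()
{
    return sizeof(*a); // unevaluated, so the null pointer is harmless
}
```

With 4-byte pointers this gives the article's 72, 24, 4 and 8; with 8-byte pointers the first three double.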

6. Passing arrays to functions

Consider the following:

#include <iostream>
#include <cstdlib>
using namespace std;

int sum(int i[])
{
    int sumofi = 0;
    for (int j = 0; j < sizeof(i) / sizeof(int); j++) // actually, sizeof(i) = 4
    {
        sumofi += i[j];
    }
    return sumofi;
}

int main()
{
    int allages[6] = {21, 22, 22, 19, 34, 12};
    cout << sum(allages) << endl;
    system("pause");
    return 0;
}

The intent of sum is to obtain the size of the array with sizeof and then add up its elements. In fact, what is passed to sum is only a pointer to int, so sizeof(i) = 4 rather than 24, and the result is wrong. One way around this is to pass a pointer or a reference to the array.
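
The difference between what the caller and the callee see can be made visible with two small functions. Both names are invented for this sketch.

```cpp
#include <cstddef>

// The parameter is written as an array, but the compiler adjusts it to int*.
std::size_t size_seen_by_callee(int i[])
{
    return sizeof(i); // same as sizeof(int*); many compilers warn here
}

// In the scope where the array is defined, sizeof sees the whole array.
std::size_t size_seen_by_caller()
{
    int allages[6] = {21, 22, 22, 19, 34, 12};
    (void)allages; // the array is only used as an operand of sizeof
    return sizeof(allages);
}
```

On the article's 32-bit platform these return 4 and 24 respectively.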

　
Using a pointer:

int sum(int (*i)[6])
{
    int sumofi = 0;
    for (int j = 0; j < sizeof(*i) / sizeof(int); j++) // sizeof(*i) = 24
    {
        sumofi += (*i)[j];
    }
    return sumofi;
}

int main()
{
    int allages[] = {21, 22, 22, 19, 34, 12};
    cout << sum(&allages) << endl;
    system("pause");
    return 0;
}

In this sum, i is a pointer to an array of type int[6]. Note that the function cannot be declared as int sum(int (*i)[]): the size of the array to be passed in must be specified, otherwise sizeof(*i) cannot be computed. In that case, however, using sizeof to compute the array size makes little sense, because the size is already fixed at 6.

A reference works much like the pointer:

int sum(int (&i)[6])
{
    int sumofi = 0;
    for (int j = 0; j < sizeof(i) / sizeof(int); j++)
    {
        sumofi += i[j];
    }
    return sumofi;
}

int main()
{
    int allages[] = {21, 22, 22, 19, 34, 12};
    cout << sum(allages) << endl;
    system("pause");
    return 0;
}

Here the sizeof calculation is equally meaningless. So when an array is passed as a parameter and needs to be traversed, the function should take an extra parameter giving the size of the array, and the size should be computed with sizeof in the scope where the array is defined. The correct form of the function above is therefore:

#include <iostream>
#include <cstdlib>
using namespace std;

int sum(int* i, unsigned int n)
{
    int sumofi = 0;
    for (unsigned int j = 0; j < n; j++)
    {
        sumofi += i[j];
    }
    return sumofi;
}

int main()
{
    int allages[] = {21, 22, 22, 19, 34, 12};
    cout << sum(allages, sizeof(allages) / sizeof(int)) << endl;
    system("pause");
    return 0;
}
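
As an alternative to passing the length by hand, a function template can let the compiler deduce it from a reference to the array. This is a sketch rather than the article's own solution, and sum_array is an invented name.

```cpp
#include <cstddef>

// N is deduced from the array type, so the size never has to be spelled out
// and the function cannot be called with a plain pointer by mistake.
template <std::size_t N>
int sum_array(const int (&values)[N])
{
    int total = 0;
    for (std::size_t j = 0; j < N; ++j)
    {
        total += values[j];
    }
    return total;
}
```

With int allages[] = {21, 22, 22, 19, 34, 12}, the call sum_array(allages) adds all six elements. The trade-off is that one instance of the function is generated per array size.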

7. sizeof and strlen for strings

Consider the following:

char a[] = "abcdef";
char b[20] = "abcdef";
string s = "abcdef";

cout << strlen(a) << endl; // 6, string length
cout << sizeof(a) << endl; // 7, string capacity
cout << strlen(b) << endl; // 6, string length
cout << sizeof(b) << endl; // 20, string capacity
cout << sizeof(s) << endl; // 12; this is not the length of the string but the size of the string class
cout << strlen(s) << endl; // error! s is not a character pointer

a[1] = '\0';
cout << strlen(a) << endl; // 1
cout << sizeof(a) << endl; // 7, sizeof is constant

　
strlen counts the characters from the given address up to the first zero byte, and it runs at run time. sizeof gives the size of the data, which here is the capacity of the string, so the sizeof value is constant for a given object. string is the C++ string type, which is a class, so sizeof(s) is not the length of the string but the size of the string class. strlen(s) is simply an error, because the parameter of strlen is a character pointer. If you insist on using strlen to get the length of s, you should write strlen(s.c_str()), because the string member function c_str() returns the address of the first character. In fact, the string class provides its own member functions for the capacity and the length of the string, namely capacity() and length(). string encapsulates the common string operations, so in C++ development it is best to use string instead of C-style strings.

Note: sizeof(string) appears to differ between implementations:

Dev-C++: 4
VS2005: 32
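
A short sketch of the points about string, with invented helper names. It deliberately avoids stating a value for sizeof(std::string), since that depends on the library.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: length of a std::string measured the C way.
std::size_t c_length(const std::string& s)
{
    return std::strlen(s.c_str());
}

// sizeof(std::string) is the size of the class object, so it is the same
// for a short string and for a long one.
bool sizeof_ignores_contents()
{
    std::string short_text = "abcdef";
    std::string long_text(1000, 'x');
    return sizeof(short_text) == sizeof(long_text);
}
```

For ordinary text, c_length(s) and s.length() agree; they differ only when the string contains an embedded '\0', where strlen stops early.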

8. CPU alignment as seen through sizeof of a union

Consider the following (default alignment):

union u
{
    double a;
    int b;
};

union U2
{
    char a[13];
    int b;
};

union U3
{
    char a[13];
    char b;
};

cout << sizeof(u) << endl;  // 8
cout << sizeof(U2) << endl; // 16
cout << sizeof(U3) << endl; // 13

　
We all know that the size of a union depends on the member that occupies the most space. For u, the largest member is a, of type double, so sizeof(u) = sizeof(double) = 8. For U2 and U3, the largest member is the char[13] array in both cases. Why, then, is the size of U3 13 while the size of U2 is 16? The key is the member int b in U2. Because of this int member, the alignment of U2 becomes 4; that is, the size of U2 must be a multiple of 4, so the space occupied is rounded up to 16 (the multiple of 4 closest to 13).

Conclusion: The alignment of a composite data type, such as a union, struct or class, is the alignment of its member with the largest alignment.
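
Since C++11 the alignment can be queried directly with alignof, which makes the rounding rule easy to check. The sketch below assumes a mainstream compiler; round_up is a helper invented for this example.

```cpp
#include <cstddef>

union U2 { char a[13]; int b; };
union U3 { char a[13]; char b; };

// Hypothetical helper: round size up to the next multiple of align.
constexpr std::size_t round_up(std::size_t size, std::size_t align)
{
    return (size + align - 1) / align * align;
}

// The union takes the alignment of its most strictly aligned member.
static_assert(alignof(U2) == alignof(int), "U2 is aligned like int");
static_assert(alignof(U3) == 1, "U3 contains only char members");

// The size is the largest member, rounded up to the union's alignment.
static_assert(sizeof(U2) == round_up(13, alignof(int)), "13 rounded up");
static_assert(sizeof(U3) == 13, "no rounding needed for alignment 1");
```

With a 4-byte int, round_up(13, 4) is 16, which matches the article's result.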

By the way, the 32-bit C++ compiler used here aligns data on 8-byte boundaries by default in order to speed up memory access, so the compiler tries its best to place data on alignment boundaries to improve the memory hit rate. This can be changed: the #pragma pack(x) directive changes the compiler's alignment setting, whose default value is 8. For a built-in type, C++ uses the smaller of the compiler's alignment setting and the type's own size as its alignment. For example, if the compiler is told to align on 2-byte boundaries and the size of int is 4, then the alignment of int is the smaller of 2 and 4, that is, 2. With the default setting, since almost all built-in types are no larger than the default 8 (long double being the exception), the alignment of every built-in type can be taken to be the size of the type itself. Let's change the program above:

#pragma pack(2)
union U2
{
    char a[13];
    int b;
};

union U3
{
    char a[13];
    char b;
};
#pragma pack(8)

cout << sizeof(U2) << endl; // 14
cout << sizeof(U3) << endl; // 13

　
Because the alignment setting has been manually changed to 2, the alignment of int also becomes 2, and so does the alignment of U2, whose most strictly aligned member is that int. So now sizeof(U2) = 14.

Conclusion: The alignment of a C++ built-in type is the smaller of the compiler's alignment setting and the type's own size.
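
A variant of the same experiment using the push/pop form of the directive, which restores whatever setting was active before instead of hard-coding 8. #pragma pack is a compiler extension rather than standard C++; the sketch assumes a compiler that supports it (GCC, Clang and Visual C++ do), and the type names are invented.

```cpp
#include <cstddef>

#pragma pack(push, 2) // save the current setting, then align to 2 bytes
union PackedU2
{
    char a[13];
    int b;
};
#pragma pack(pop)     // restore the previous setting

union DefaultU2
{
    char a[13];
    int b;
};

// With packing 2, the int member is aligned to 2, so 13 is rounded up to 14.
static_assert(sizeof(PackedU2) == 14, "13 rounded up to a multiple of 2");
static_assert(sizeof(DefaultU2) >= sizeof(PackedU2), "default is not smaller");
```

Packed members may be misaligned for the CPU, so this is usually reserved for file formats and wire protocols.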

9. sizeof of a struct

The sizeof of a struct is more complicated because of alignment. See the following example (default alignment):

struct S1
{
    char a;
    double b;
    int c;
    char d;
};

struct S2
{
    char a;
    char b;
    int c;
    double d;
};

cout << sizeof(S1) << endl; // 24
cout << sizeof(S2) << endl; // 16

Both structs contain two char members, one int and one double, but because of alignment their sizes differ. The size of a struct can be worked out by placing its elements one by one. First, the alignment of the struct is determined: according to the conclusion of the previous section, both S1 and S2 take the alignment of their largest element type, double, which is 8. Then the elements are placed in order.

For S1, first place a on an 8-byte boundary, assumed to be address 0. The next free address is 1, but the next element b is a double and must be placed on an 8-byte boundary; the closest such address after 1 is 8, so b is placed at 8 and the next free address becomes 16. The next element c has an alignment of 4, and 16 is already on a 4-byte boundary, so c is placed at 16 and the next free address is 20. The last element d has an alignment of 1, so it too is already on a boundary; d is placed at 20 and the struct ends at address 21. Because the size of S1 must be a multiple of 8, the bytes 21 to 23 are reserved as padding and the size of S1 becomes 24.

For S2, first place a on an 8-byte boundary, assumed to be address 0. The next free address is 1, and the next element b also has an alignment of 1, so b is placed at 1 and the next free address becomes 2. The next element c has an alignment of 4, so c is placed at 4, the closest such address after 2, and the next free address becomes 8. The last element d has an alignment of 8, so d is placed at 8. All elements have now been placed; the struct ends at address 15 and occupies 16 bytes in total, which is already a multiple of 8.
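
The placement method walked through above can be sketched as a small function. Member, round_up and layout_size are names invented for this illustration, and the sizes and alignments passed in are the ones the article assumes (char 1, int 4, double 8); a real compiler may use different values.

```cpp
#include <cstddef>
#include <initializer_list>

// Size and alignment of one struct member, in bytes.
struct Member
{
    std::size_t size;
    std::size_t align;
};

// Round offset up to the next multiple of align.
inline std::size_t round_up(std::size_t offset, std::size_t align)
{
    return (offset + align - 1) / align * align;
}

// Place the members in order and return the resulting struct size.
std::size_t layout_size(std::initializer_list<Member> members)
{
    std::size_t offset = 0;       // next free address
    std::size_t struct_align = 1; // largest member alignment seen so far
    for (const Member& m : members)
    {
        offset = round_up(offset, m.align); // skip padding before the member
        offset += m.size;                   // place the member
        if (m.align > struct_align)
        {
            struct_align = m.align;
        }
    }
    return round_up(offset, struct_align);  // tail padding
}
```

For S1 the call is layout_size({{1, 1}, {8, 8}, {4, 4}, {1, 1}}), and for S2 it is layout_size({{1, 1}, {1, 1}, {4, 4}, {8, 8}}); these reproduce the sizes 24 and 16 derived in the text.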

There is a trap here. For a struct member inside another struct, do not assume that its alignment equals its size. See the following example:

struct S1
{
    char a[8];
};

struct S2
{
    double d;
};

struct S3
{
    S1 s;
    char a;
};

struct S4
{
    S2 s;
    char a;
};

cout << sizeof(S1) << endl; // 8
cout << sizeof(S2) << endl; // 8
cout << sizeof(S3) << endl; // 9
cout << sizeof(S4) << endl; // 16

　
The size of both S1 and S2 is 8, but the alignment of S1 is 1 while that of S2 is 8 (double). That is why S3 and S4 differ.

Therefore, when you define a struct and space is tight, it is worth taking alignment into account when ordering its members.
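
The distinction between size and alignment can be checked with alignof. This is a sketch assuming C++11 and a mainstream compiler; only the relations that do not depend on the alignment of double are asserted as exact numbers.

```cpp
#include <cstddef>

struct S1 { char a[8]; };
struct S2 { double d; };
struct S3 { S1 s; char a; };
struct S4 { S2 s; char a; };

// S1 is 8 bytes large but, being made of chars, only needs alignment 1.
static_assert(sizeof(S1) == 8 && alignof(S1) == 1, "size 8, alignment 1");
static_assert(sizeof(S3) == 9, "no padding after the trailing char");

// S2 inherits the alignment of double, and S4 inherits it from S2,
// so S4 gets tail padding after its char member.
static_assert(alignof(S4) == alignof(double), "alignment comes from double");
static_assert(sizeof(S4) % alignof(double) == 0, "padded to the alignment");
static_assert(sizeof(S4) > sizeof(S3), "padding makes S4 larger than S3");
```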

10. Do not let double interfere with your bit fields

In structs and classes, you can use bit fields to specify the space occupied by a member, so bit fields can save some of the space a struct occupies. However, consider the following code:

struct S1
{
    int i : 8;
    int j : 4;
    double b;
    int a : 3;
};

struct S2
{
    int i;
    int j;
    double b;
    int a;
};

struct S3
{
    int i;
    int j;
    int a;
    double b;
};

struct S4
{
    int i : 8;
    int j : 4;
    int a : 3;
    double b;
};

cout << sizeof(S1) << endl; // 24
cout << sizeof(S2) << endl; // 24
cout << sizeof(S3) << endl; // 24
cout << sizeof(S4) << endl; // 16

　
As you can see, the presence of a double interferes with the bit fields (for the sizeof calculation, refer to the previous section). So when using bit fields, it is best to put float and double members at the beginning or at the end of the struct.
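
A compact way to compare the two orderings is shown below. The layout of bit fields is implementation-defined, so this sketch only compares the two sizes instead of asserting the article's 24 and 16; grouping_saves_space is an invented helper.

```cpp
#include <cstddef>

// Bit fields separated by a double: the last field needs a storage unit
// of its own after the double.
struct S1
{
    int i : 8;
    int j : 4;
    double b;
    int a : 3;
};

// All bit fields adjacent: they can share a single int-sized storage unit.
struct S4
{
    int i : 8;
    int j : 4;
    int a : 3;
    double b;
};

// Hypothetical helper: true when keeping the bit fields together made the
// struct smaller on the current compiler.
bool grouping_saves_space()
{
    return sizeof(S4) < sizeof(S1);
}
```

On the compiler the article used, grouping_saves_space() would be true (16 versus 24).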

Address: http://dev.yesky.com/143/2563643.shtml
