C++ knowledge points: an analysis of the usage rules and traps of sizeof

Last Update:2015-06-05 Source: Internet

Author: User


Original: http://blog.csdn.net/chenqi514/article/details/7245273

1. What is sizeof
First, look at the definition of sizeof on MSDN:
The sizeof keyword gives the amount of storage, in bytes, associated with a variable or a type (including aggregate types). This keyword returns a value of type size_t.
Seeing the word "returns", do you think of a function? Wrong: sizeof is not a function. Have you ever seen a function that takes its argument without parentheses? sizeof can, so sizeof is not a function. Some people online say that sizeof is a unary operator, and strictly speaking the language does define it as one, but it is better understood as something like a special macro, because it is evaluated at compile time. For example:

cout << sizeof(int) << endl;     // an int is 4 bytes on a typical 32-bit platform
cout << sizeof(1 == 2) << endl;  // == yields bool, so this is equivalent to cout << sizeof(bool) << endl;
At compile time these are translated to:

cout << 4 << endl;
cout << 1 << endl;
Here is a trap. Look at the following program:

int a = 0;
cout << sizeof(a = 3) << endl;
cout << a << endl;
Why is the output 4, 0 rather than the expected 4, 3? Because sizeof is handled at compile time. The operand of sizeof, that is, whatever is inside the (), is never evaluated and produces no machine code; only its type is used. The = operator yields the type of its left operand, so a = 3 is treated as int, and the code becomes:

int a = 0;
cout << 4 << endl;
cout << a << endl;
Therefore, sizeof cannot be used to carry out an embedded expression, which sets it apart from ordinary unary operators.

Conclusion: Do not think of sizeof as a function, nor as an ordinary unary operator that runs its operand; think of it as a special piece of compile-time processing.
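
The trap above can be checked by the compiler itself. The following is a minimal sketch, assuming a C++14 (or later) compiler; probe() is a hypothetical helper name that does not come from the original article.

```cpp
#include <cstddef>

// probe() is a hypothetical helper used only for this illustration.
constexpr int probe() {
    int a = 0;
    // Only the type of (a = 3) is inspected; the assignment never runs.
    constexpr std::size_t n = sizeof(a = 3);
    static_assert(n == sizeof(int), "a = 3 has type int");
    return a;  // still 0
}

static_assert(probe() == 0, "the operand of sizeof was not evaluated");
static_assert(sizeof(1 == 2) == sizeof(bool), "== yields bool");
```

If the file compiles, every claim held. Some compilers may warn that an expression with side effects appears in an unevaluated operand, which is exactly the point of the trap.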

2. How to use sizeof
There are two ways to use sizeof:
(1) sizeof(object)
That is, sizeof applied to an object (more precisely, to an expression). It can also be written as sizeof object.

(2) sizeof(TypeName)
That is, sizeof applied to a type. Note that writing sizeof TypeName without the parentheses is illegal in this case. Here are a few examples:

int i = 2;
cout << sizeof(i) << endl;    // sizeof(object), valid
cout << sizeof i << endl;     // sizeof object, valid
cout << sizeof 2 << endl;     // 2 is treated as an object of type int; sizeof object, valid
cout << sizeof(2) << endl;    // 2 is treated as an object of type int; sizeof(object), valid
cout << sizeof(int) << endl;  // sizeof(TypeName), valid
cout << sizeof int << endl;   // Error! A type name must be wrapped in ()
As you can see, adding () is always the right choice.

Conclusion: Whatever sizeof is applied to, it is best to add ().

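3. sizeof and data types
(1) C++ built-in data types
The basic data types in 32-bit C++ are char, short int (short), int, long int (long), float, double, and long double.
Their sizes are typically 1, 2, 4, 4, 4, 8, and 10 bytes respectively. Only sizeof(char) == 1 is guaranteed by the standard; the other values depend on the compiler and platform (for example, long double is 8 bytes with Visual C++ and usually 12 bytes with GCC on 32-bit x86).

Consider the following code:
cout << (sizeof(unsigned int) == sizeof(int)) << endl;  // equal, outputs 1
unsigned affects only the meaning of the highest bit; the length of the data does not change.

Conclusion: unsigned does not affect the value of sizeof.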
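
The exact sizes listed above are platform-dependent, but a few relations hold on every conforming compiler. A small sketch of those relations follows, written as compile-time checks; nothing here depends on a particular platform.

```cpp
#include <climits>

// Relations the C++ standard promises on every conforming platform.
static_assert(sizeof(char) == 1, "char is the unit sizeof counts in");
static_assert(sizeof(short) <= sizeof(int) && sizeof(int) <= sizeof(long),
              "each integer type is at least as large as the previous one");
static_assert(sizeof(unsigned int) == sizeof(int),
              "unsigned changes the meaning of the top bit, not the size");
static_assert(sizeof(unsigned long) == sizeof(long),
              "the same holds for every signed/unsigned pair");
static_assert(sizeof(int) * CHAR_BIT >= 16, "int has at least 16 bits");
static_assert(sizeof(long) * CHAR_BIT >= 32, "long has at least 32 bits");
```

Code that needs an exact width is usually better served by the fixed-width types in <cstdint> than by assumptions about sizeof(int).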

(2) Custom data types
typedef can be used to define custom type names in C++. Consider the following:
typedef short WORD;
typedef long DWORD;
cout << (sizeof(short) == sizeof(WORD)) << endl;  // equal, outputs 1
cout << (sizeof(long) == sizeof(DWORD)) << endl;  // equal, outputs 1
Conclusion: The sizeof value of a typedef name is the same as that of the type it aliases.

(3) Function types
Consider the following:
int f1() { return 0; }
double f2() { return 0.0; }
void f3() {}

cout << sizeof(f1()) << endl;  // f1() returns int, so this is sizeof(int)
cout << sizeof(f2()) << endl;  // f2() returns double, so this is sizeof(double)
cout << sizeof(f3()) << endl;  // Error! sizeof cannot be applied to void
cout << sizeof(f1) << endl;    // Error! sizeof cannot be applied to a function (sizeof(&f1) gives the size of a function pointer)
cout << sizeof f2() << endl;   // f2() is an expression and counts as an object, so the parentheses are not required; treated as double
Conclusion: sizeof applied to a function call is replaced at compile time by the size of the function's return type; the function is not called.
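
One way to convince yourself that the function is never called is to leave it undefined. This is a minimal sketch; make_int and make_double are hypothetical names that are declared but deliberately never defined.

```cpp
// Declared but never defined: sizeof only needs the return type,
// so no call is made and no definition is required.
int make_int();
double make_double();

static_assert(sizeof(make_int()) == sizeof(int),
              "sizeof of a call is the size of the return type");
static_assert(sizeof make_double() == sizeof(double),
              "parentheses around an expression operand are optional");
static_assert(sizeof(&make_int) == sizeof(int (*)()),
              "a pointer to a function has a size; the function itself does not");
```

A program containing these lines links without definitions for make_int and make_double, because an unevaluated operand does not use the function.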

4. Pointers
Consider the following:
cout << sizeof(string*) << endl;   // 4
cout << sizeof(int*) << endl;      // 4
cout << sizeof(char****) << endl;  // 4
As you can see, whatever the pointed-to type, the size of a pointer is 4, because a pointer is a 32-bit memory address.

Conclusion: On a 32-bit platform, anything that is a pointer has size 4. (On a 64-bit machine it may well become 8.)
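
To avoid hard-coding 4 or 8, the same observation can be written relative to sizeof(void*). This is a sketch under one stated assumption: only the first check is promised by the standard, while the others hold on mainstream flat-memory platforms (32-bit and 64-bit x86, ARM).

```cpp
#include <string>

// Guaranteed: char* and void* share one representation.
static_assert(sizeof(char*) == sizeof(void*), "char* and void* have the same size");

// Assumption: true on mainstream flat-memory platforms, not required by the standard.
static_assert(sizeof(int*) == sizeof(void*), "int* has the common pointer size");
static_assert(sizeof(std::string*) == sizeof(void*), "string* has the common pointer size");
static_assert(sizeof(char****) == sizeof(void*), "levels of indirection do not matter");
```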

By the way, a pointer in C++ holds an actual memory address. Unlike the old 16-bit C compilers, 32-bit C++ compilers have no memory models: there is no more small, medium, or large, only a single flat model. The flat model addresses memory with a 32-bit linear address instead of segment:offset. For example, for the address F000:8888, a near pointer in 16-bit C is 8888 (16 bits; only the offset is stored and the segment is omitted), a far pointer is F0008888 (32 bits; the high half holds the segment and the low half holds the offset), and a flat pointer is F8888 (32 bits; equal to segment * 16 + offset, but with a much larger addressing range).

5. Arrays
Consider the following:
char a[] = "abcdef";
int b[20] = {3, 4};
char c[2][3] = {"AA", "BB"};

cout << sizeof(a) << endl;  // 7
cout << sizeof(b) << endl;  // 20 * 4
cout << sizeof(c) << endl;  // 6
The size of array a is not specified in its definition; the space allocated to it at compile time is determined by the initializer, which is 7 bytes here (six characters plus the terminating '\0'). c is a multidimensional array, and the space it occupies is the product of its dimensions, that is, 6. As you can see, the size of an array is the space allocated to it at compile time: the product of its dimensions times the size of an element.

Conclusion: The size of an array is the product of its dimensions * the size of an array element.

Here is a trap:
int *d = new int[10];
cout << sizeof(d) << endl;  // 4
d is what we usually call a dynamic array, but it is essentially a pointer, so the value of sizeof(d) is 4 (the size of a pointer on a 32-bit platform).
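
The array rules and the pointer trap can be put side by side as compile-time checks. This is a small sketch; d is initialized to nullptr here only as a stand-in for the result of new int[10], so that nothing is allocated.

```cpp
char a[] = "abcdef";
int b[20] = {3, 4};
char c[2][3] = {"AA", "BB"};
int* d = nullptr;  // stand-in for new int[10]; the static type is what matters

static_assert(sizeof(a) == 7, "six characters plus the terminating '\\0'");
static_assert(sizeof(b) == 20 * sizeof(int), "element count * element size");
static_assert(sizeof(c) == 2 * 3 * sizeof(char), "product of the dimensions");
static_assert(sizeof(b) / sizeof(b[0]) == 20, "the usual element-count idiom");
static_assert(sizeof(d) == sizeof(int*), "d is a pointer, not an array of 10 ints");
```

The element-count idiom sizeof(b) / sizeof(b[0]) only works where b is still an array; applied to d it would silently divide the pointer size by sizeof(int).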

Consider the following:
double* (*a)[3][6];
cout << sizeof(a) << endl;      // 4
cout << sizeof(*a) << endl;     // 72
cout << sizeof(**a) << endl;    // 24
cout << sizeof(***a) << endl;   // 4
cout << sizeof(****a) << endl;  // 8
a has a rather strange definition: it is a pointer to an array of type double*[3][6]. Since it is a pointer, sizeof(a) is 4.

Since a is a pointer to double*[3][6], *a is a multidimensional array of type double*[3][6], so sizeof(*a) = 3 * 6 * sizeof(double*) = 72. Similarly, **a is an array of type double*[6], so sizeof(**a) = 6 * sizeof(double*) = 24. ***a is one of the elements, namely a double*, so sizeof(***a) = 4. As for ****a, it is a double, so sizeof(****a) = sizeof(double) = 8.
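
The same chain of reasoning can be expressed without assuming 4-byte pointers. In this sketch each level is checked against sizeof(double*) rather than a literal number, so it also holds on 64-bit platforms; a is never dereferenced at run time.

```cpp
#include <type_traits>

double* (*a)[3][6] = nullptr;  // only the type is used below

static_assert(std::is_same<decltype(*a), double* (&)[3][6]>::value,
              "*a is the whole 3 x 6 array of double*");
static_assert(sizeof(*a) == 3 * 6 * sizeof(double*), "18 pointers");
static_assert(sizeof(**a) == 6 * sizeof(double*), "one row of 6 pointers");
static_assert(sizeof(***a) == sizeof(double*), "a single double*");
static_assert(sizeof(****a) == sizeof(double), "the double being pointed to");
```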

6. Passing an array to a function
Consider the following:
#include <iostream>
#include <cstdlib>
using namespace std;
int Sum(int i[])
{
    int sumOfI = 0;
    for (int j = 0; j < sizeof(i) / sizeof(int); j++)  // in fact, sizeof(i) = 4
    {
        sumOfI += i[j];
    }
    return sumOfI;
}

int main()
{
    int allAges[6] = {21, 22, 22, 19, 34, 12};
    cout << Sum(allAges) << endl;
    system("pause");
    return 0;
}
Sum is meant to use sizeof to get the size of the array and then add up its elements. But what the function actually receives is just a pointer of type int*, so sizeof(i) = 4 instead of 24, and the result is wrong. One way to deal with this is to pass a pointer or a reference to the array.

When using a pointer:
int Sum(int (*i)[6])
{
    int sumOfI = 0;
    for (int j = 0; j < sizeof(*i) / sizeof(int); j++)  // sizeof(*i) = 24
    {
        sumOfI += (*i)[j];
    }
    return sumOfI;
}

int main()
{
    int allAges[] = {21, 22, 22, 19, 34, 12};
    cout << Sum(&allAges) << endl;
    system("pause");
    return 0;
}
In this Sum, i is a pointer to an int[6]. Note that the function cannot be declared as int Sum(int (*i)[]); the size of the array must be stated, otherwise sizeof(*i) cannot be evaluated. In this case, however, computing the array size with sizeof is pointless, because the size is already fixed at 6.

Using a reference is similar to using a pointer:
int Sum(int (&i)[6])
{
    int sumOfI = 0;
    for (int j = 0; j < sizeof(i) / sizeof(int); j++)
    {
        sumOfI += i[j];
    }
    return sumOfI;
}

int main()
{
    int allAges[] = {21, 22, 22, 19, 34, 12};
    cout << Sum(allAges) << endl;
    system("pause");
    return 0;
}
Here too the sizeof calculation is pointless. So when a function takes an array parameter and needs to traverse it, the function should take another parameter giving the size of the array, and that size should be computed with sizeof in the scope where the array is defined. The correct form of the function above is therefore:

#include <iostream>
#include <cstdlib>
using namespace std;

int Sum(int *i, unsigned int n)
{
    int sumOfI = 0;
    for (unsigned int j = 0; j < n; j++)
    {
        sumOfI += i[j];
    }
    return sumOfI;
}

int main()
{
    int allAges[] = {21, 22, 22, 19, 34, 12};
    cout << Sum(allAges, sizeof(allAges) / sizeof(int)) << endl;
    system("pause");
    return 0;
}
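
As an aside, the reference form stops being pointless once the array size is a template parameter, because the compiler then deduces it from the argument. This is a minimal sketch, assuming C++14; SumArray is a hypothetical name and this variant is not part of the original article.

```cpp
#include <cstddef>

// SumArray is a hypothetical name. N is deduced from the array argument,
// so the array does not decay to a pointer and no length parameter is needed.
template <std::size_t N>
constexpr int SumArray(const int (&values)[N]) {
    int total = 0;
    for (std::size_t j = 0; j < N; ++j) {
        total += values[j];
    }
    return total;
}

constexpr int allAges[] = {21, 22, 22, 19, 34, 12};
static_assert(sizeof(allAges) / sizeof(allAges[0]) == 6,
              "element count computed in the defining scope");
static_assert(SumArray(allAges) == 130, "21 + 22 + 22 + 19 + 34 + 12");
```

The trade-off is that one copy of the function is instantiated per array size, and a plain int* (such as the result of new) cannot be passed at all, which is why the pointer-plus-length form remains the general solution.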

7. sizeof and strlen for strings
Consider the following:
char a[] = "abcdef";
char b[20] = "abcdef";
string s = "abcdef";

cout << strlen(a) << endl;  // 6, the string length
cout << sizeof(a) << endl;  // 7, the string capacity
cout << strlen(b) << endl;  // 6, the string length
cout << sizeof(b) << endl;  // 20, the string capacity
cout << sizeof(s) << endl;  // 12; this is not the length of the string but the size of the type string
cout << strlen(s) << endl;  // Error! s is not a character pointer

a[1] = '\0';
cout << strlen(a) << endl;  // 1
cout << sizeof(a) << endl;  // 7, sizeof is constant
strlen counts the characters from the given address up to the first '\0' and is executed at run time, whereas sizeof gives the size of the data, which here is the capacity of the character array. So for a given object, the value of sizeof is constant. string is a C++ class type, so sizeof(s) is not the length of the string but the size of the string class. strlen(s) is simply an error, because the argument of strlen must be a character pointer. To get the length of s with strlen, use strlen(s.c_str()), because the member function c_str() returns the address of the first character of the string. In fact, the string class provides its own member functions for the capacity and length of the string: capacity() and length(). string encapsulates the usual string operations, so in C++ development it is best to use string instead of C-style strings.

As for sizeof(string), different implementations seem to return different results:
Dev-C++: 4
VS2005: 32
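
The difference between capacity and length can be collected into one function. This is a sketch; check_string_sizes is a hypothetical helper name, and sizeof(std::string) is deliberately not compared with any number because it varies between library implementations.

```cpp
#include <cstring>
#include <string>

// Returns true when every run-time observation from this section holds.
// check_string_sizes is a hypothetical helper name.
bool check_string_sizes() {
    char a[] = "abcdef";
    char b[20] = "abcdef";
    std::string s = "abcdef";

    static_assert(sizeof(a) == 7, "6 characters plus the terminating '\\0'");
    static_assert(sizeof(b) == 20, "capacity of the array, not the text length");

    bool ok = std::strlen(a) == 6 && std::strlen(b) == 6;
    ok = ok && s.length() == 6 && std::strlen(s.c_str()) == 6;
    ok = ok && s.capacity() >= s.length();

    a[1] = '\0';                     // cut the C string short
    ok = ok && std::strlen(a) == 1;  // strlen stops at the first '\0'
    ok = ok && sizeof(a) == 7;       // sizeof is unaffected
    return ok;
}
```

Calling check_string_sizes() from main should return true on any conforming implementation.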

8. From sizeof of a union to CPU alignment
Consider the following (default alignment):
union u
{
    double a;
    int b;
};

union u2
{
    char a[13];
    int b;
};

union u3
{
    char a[13];
    char b;
};

cout << sizeof(u) << endl;   // 8
cout << sizeof(u2) << endl;  // 16
cout << sizeof(u3) << endl;  // 13
Everyone knows that the size of a union depends on the member that occupies the most space. So for u, the largest member is the double a, and sizeof(u) = sizeof(double) = 8. But for u2 and u3 the largest member is the char[13] array in both cases, so why is the size of u3 13 while that of u2 is 16? The key is the member int b in u2. Because of this int member, the alignment of u2 becomes 4, which means the size of u2 must be a multiple of 4, so the space occupied becomes 16 (the nearest multiple of 4 above 13).

Conclusion: The alignment of a composite data type such as a union, struct, or class is the alignment of its most strictly aligned member.
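
Since C++11 the alignment itself can be queried with alignof, which makes the rule directly checkable. This is a minimal sketch under default packing; it assumes that int has a stricter alignment than char and that sizeof(double) is a multiple of its own alignment, both of which hold on mainstream platforms.

```cpp
union u  { double a; int b; };
union u2 { char a[13]; int b; };
union u3 { char a[13]; char b; };

static_assert(sizeof(u) == sizeof(double), "the largest member decides the size");
static_assert(alignof(u2) == alignof(int), "the strictest member decides the alignment");
static_assert(sizeof(u2) >= 13 && sizeof(u2) % alignof(u2) == 0,
              "13 is rounded up to a multiple of the alignment");
static_assert(alignof(u3) == 1 && sizeof(u3) == 13,
              "with only char members no rounding is needed");
```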

By the way, about CPU alignment: 32-bit C++ compilers align data to 8-byte boundaries by default to improve speed, so the compiler tries to place each datum on its own boundary to make memory access more efficient. The alignment can be changed: the #pragma pack(x) directive changes the compiler's packing value, which is 8 by default. The alignment of a C++ built-in type is the smaller of the compiler's packing value and the type's own size. For example, if the compiler is told to pack to 2 and int has size 4, then the alignment of int is the smaller of 2 and 4, namely 2. Under the default packing, almost no built-in type is larger than the default value 8 (long double excepted), so the alignment of every built-in type can be taken to be the size of the type itself. Now change the program above:

#pragma pack(2)
union u2
{
    char a[13];
    int b;
};

union u3
{
    char a[13];
    char b;
};
#pragma pack(8)

cout << sizeof(u2) << endl;  // 14
cout << sizeof(u3) << endl;  // 13
Since the packing value was manually changed to 2, the alignment of int also becomes 2, so the largest member alignment in u2 is 2 as well, and now sizeof(u2) = 14.

Conclusion: The alignment of a C++ built-in type is the smaller of the compiler's packing value and the type's own size.
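
A slightly safer way to write the same experiment is with the push/pop form of the directive, which restores whatever packing was in effect before instead of assuming it was 8. This is a sketch; #pragma pack is a compiler extension (accepted by MSVC, GCC, and Clang), and packed_u2 and default_u2 are hypothetical names.

```cpp
#pragma pack(push, 2)  // cap member alignment at 2 and remember the old setting
union packed_u2 { char a[13]; int b; };
#pragma pack(pop)      // restore the previous packing

union default_u2 { char a[13]; int b; };

static_assert(alignof(packed_u2) == 2, "min(packing value 2, alignment of int)");
static_assert(sizeof(packed_u2) == 14, "13 rounded up to a multiple of 2");
static_assert(sizeof(default_u2) % alignof(int) == 0,
              "without packing the size is a multiple of int's alignment");
```

Note that packed members may be misaligned for the CPU, so packing is usually reserved for matching an external binary layout rather than for saving space in general.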

9. sizeof of a struct
Alignment makes sizeof of a struct more complicated. Look at the following example (default alignment):
struct S1
{
    char a;
    double b;
    int c;
    char d;
};

struct S2
{
    char a;
    char b;
    int c;
    double d;
};

cout << sizeof(S1) << endl;  // 24
cout << sizeof(S2) << endl;  // 16
Both contain two chars, one int, and one double, but because of alignment their sizes differ. The size of a struct can be worked out by placing its members one by one. Here is an example. First, determine the alignment of the struct: according to the previous section, the alignment of S1 and S2 is that of their most strictly aligned member, the double, which is 8. Then start placing each member.

For S1, first place a on an 8-byte boundary, say at address 0. The next free address is 1, but the next member b is a double and must go on an 8-byte boundary; the nearest such address after 1 is 8, so b is placed at 8 and the next free address becomes 16. The next member c has alignment 4, which 16 satisfies, so c is placed at 16 and the next free address becomes 20. The next member d has alignment 1, so it also falls on a boundary; d is placed at 20 and the next free address becomes 21. Since the size of S1 must be a multiple of 8, addresses 21 to 23 are reserved as padding and the size of S1 becomes 24.

For S2, first place a on an 8-byte boundary, say at address 0. The next free address is 1, and the next member b also has alignment 1, so b is placed at 1 and the next free address becomes 2. The next member c has alignment 4, so c is placed at 4, the nearest suitable address after 2, and the next free address becomes 8. The next member d has alignment 8, so d is placed at 8. All members are now placed; the last occupied address is 15, for a total of 16 bytes, which is already a multiple of 8.
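
The placement method described above is mechanical enough to write down as code. The following is a minimal sketch, assuming C++14; align_up, place, layout_S1, and layout_S2 are hypothetical helper names. It replays the placement with the compiler's own sizeof and alignof values and compares the result with the real sizeof.

```cpp
#include <cstddef>

struct S1 { char a; double b; int c; char d; };
struct S2 { char a; char b; int c; double d; };

// Round offset up to the next multiple of align.
constexpr std::size_t align_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Place one member: move to its boundary, then skip past it.
constexpr std::size_t place(std::size_t offset, std::size_t size, std::size_t align) {
    return align_up(offset, align) + size;
}

constexpr std::size_t layout_S1() {
    std::size_t off = 0;
    off = place(off, sizeof(char), alignof(char));      // a
    off = place(off, sizeof(double), alignof(double));  // b
    off = place(off, sizeof(int), alignof(int));        // c
    off = place(off, sizeof(char), alignof(char));      // d
    return align_up(off, alignof(S1));                  // tail padding
}

constexpr std::size_t layout_S2() {
    std::size_t off = 0;
    off = place(off, sizeof(char), alignof(char));      // a
    off = place(off, sizeof(char), alignof(char));      // b
    off = place(off, sizeof(int), alignof(int));        // c
    off = place(off, sizeof(double), alignof(double));  // d
    return align_up(off, alignof(S2));                  // tail padding
}

static_assert(layout_S1() == sizeof(S1), "placement method reproduces sizeof(S1)");
static_assert(layout_S2() == sizeof(S2), "placement method reproduces sizeof(S2)");
static_assert(sizeof(S2) <= sizeof(S1), "the better ordering is never larger");
```

Because the checks use alignof instead of the literal 8, they are expected to hold under any default packing, not only on the platform the article describes.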

Here is a trap: for a struct member inside a struct, do not assume that its alignment is its size. Look at the following example:

struct S1
{
    char a[8];
};

struct S2
{
    double d;
};

struct S3
{
    S1 s;
    char a;
};

struct S4
{
    S2 s;
    char a;
};

cout << sizeof(S1) << endl;  // 8
cout << sizeof(S2) << endl;  // 8
cout << sizeof(S3) << endl;  // 9
cout << sizeof(S4) << endl;  // 16
S1 and S2 both have size 8, but the alignment of S1 is 1 while that of S2 is 8 (because of the double), which is why S3 and S4 differ.

Therefore, when you define a struct and space is tight, it is best to take alignment into account when ordering its members.
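
The difference between size and alignment in this trap can be made explicit with alignof. This is a small sketch; the exact value of sizeof(S4) is left out on purpose, since it depends on the platform's alignment of double.

```cpp
struct S1 { char a[8]; };
struct S2 { double d; };
struct S3 { S1 s; char a; };
struct S4 { S2 s; char a; };

static_assert(sizeof(S1) == 8 && alignof(S1) == 1,
              "eight chars: size 8 but alignment 1");
static_assert(sizeof(S2) == sizeof(double) && alignof(S2) == alignof(double),
              "one double: same size class, stricter alignment");
static_assert(sizeof(S3) == 9, "no padding is needed after the extra char");
static_assert(alignof(S4) == alignof(S2) && sizeof(S4) % alignof(S4) == 0,
              "S4 inherits the alignment of S2 and is padded to a multiple of it");
static_assert(sizeof(S4) > sizeof(S3), "so S4 ends up larger than S3");
```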

10. Do not let double interfere with your bit fields
In structs and classes, bit fields can be used to specify the number of bits a member occupies, so bit fields can save some of the space a struct takes up. But consider the following code:

struct S1
{
    int i : 8;
    int j : 4;
    double b;
    int a : 3;
};

struct S2
{
    int i;
    int j;
    double b;
    int a;
};

struct S3
{
    int i;
    int j;
    int a;
    double b;
};

struct S4
{
    int i : 8;
    int j : 4;
    int a : 3;
    double b;
};

cout << sizeof(S1) << endl;  // 24
cout << sizeof(S2) << endl;  // 24
cout << sizeof(S3) << endl;  // 24
cout << sizeof(S4) << endl;  // 16
As you can see, the presence of a double interferes with the bit fields (for how sizeof is computed, see the previous section). So when using bit fields, it is best to place float and double members at the beginning or the end of the struct.
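
The effect can be confirmed without relying on the exact numbers, which matters here because the layout of bit fields is implementation-defined. This is a minimal sketch reusing the names S1 and S4 from the example; only the relation between the two sizes is checked.

```cpp
// The double splits the bit fields into two separate storage units.
struct S1 { int i : 8; int j : 4; double b; int a : 3; };

// All bit fields are adjacent and can share one storage unit.
struct S4 { int i : 8; int j : 4; int a : 3; double b; };

static_assert(sizeof(S4) <= sizeof(S1),
              "grouping the bit fields never costs more space");
static_assert(sizeof(S4) >= sizeof(double),
              "the double member still needs its full size");
```

On the common 32-bit and 64-bit x86 compilers the first relation is a strict inequality, matching the 24 versus 16 reported above.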
