C/C++ in depth: sizeof in detail


Reprinted from w57w57w57's CSDN blog: http://blog.csdn.net/w57w57w57/article/details/6626840

This article is so well written that, for fear of not being able to find it again later, I have reprinted it below:

Abstract:

sizeof is very simple: it evaluates the size of an object or type. Yet sizeof is also very complex, because it involves many special cases. This article classifies those cases and summarizes ten features of sizeof:

(0) sizeof is an operator, not a function;

(1) sizeof cannot obtain the size of void;

(2) sizeof can obtain the size of a void pointer;

(3) sizeof can obtain the size of a statically allocated array;

(4) sizeof cannot calculate the size of dynamically allocated memory;

(5) sizeof cannot evaluate the size of an incomplete array;

(6) when an expression is the operand of sizeof, the result is the size of the type of the expression's result, and the expression is not evaluated;

(7) sizeof can be applied to a function call; the result equals the size of the return type, and the function body is not executed;

(8) the size of a struct (and of its objects) obtained with sizeof is not necessarily equal to the sum of the sizes of its data members;

(9) sizeof cannot be applied to a bit-field member of a struct, but it can obtain the size of a struct that contains bit-field members.

 

 

Overview:

sizeof is a keyword in C/C++. It is an operator that yields the size of a data type or data object, that is, the amount of memory it occupies, in bytes. The types include the basic data types (excluding void), user-defined types (struct, class), and the return types of function calls (see feature 7). Data objects are ordinary variables and pointer variables (including void pointers) defined with those types. The size of a given type varies between platforms, but the C standard requires every implementation to guarantee that sizeof(char) equals 1. For more information about sizeof, search for sizeof in MSDN.

After reading the above you may not feel you have learned much. That does not matter: below I describe the features of sizeof in detail, and they are the reason sizeof is a relatively tricky keyword:

The ten features:

Feature 0: sizeof is an operator, not a function

This is the most basic feature of sizeof, and many of the later features follow from it. Because sizeof is not a function, I do not call the object whose size is wanted a parameter; I am used to calling it the operand (which also helps me remember that sizeof is an operator).
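One visible consequence of sizeof being an operator is that the parentheses are optional when the operand is an expression. The following is a minimal sketch; the names value, size_from_expression, size_from_type and check_operator_forms are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

int value = 0;  // hypothetical variable, used only as a sizeof operand

// Parentheses are optional when the operand is an expression...
std::size_t size_from_expression() { return sizeof value; }

// ...but they are required when the operand is a type name.
std::size_t size_from_type() { return sizeof(int); }

// Both forms are resolved by the compiler; nothing is "called".
void check_operator_forms() {
    assert(size_from_expression() == size_from_type());
}
```

Calling check_operator_forms() from main should pass silently, since value has type int.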

Feature 1: sizeof cannot obtain the size of void

Indeed, you cannot use sizeof(void); it causes a compilation error such as: illegal sizeof operand. (Some compilers accept it as a non-standard extension; GNU C, for example, treats the size of void as 1.) In fact, you cannot declare a void variable at all. If you do not believe it, try void a; and the compiler will report an error such as: illegal use of type 'void'. You may ask why, and rightly so: it is not enough to learn the facts, you should also know the reasons. An important role of a variable declaration is to tell the compiler how much storage the variable needs. But void is an "empty type". What is an empty type? You can think of it as a type whose storage size is unknown. Since the compiler cannot determine the storage size of a void variable, it naturally does not let you declare one. A pointer to void, however, can be declared! That is the subject of feature 2.

Feature 2: sizeof can obtain the length of void pointer

As mentioned in feature 1, pointers to void can be declared, which means the compiler can determine the storage a void pointer occupies. In fact, on 32-bit platforms practically every compiler makes a pointer 4 bytes. If you do not believe it, try sizeof(int*), sizeof(void*), sizeof(double*), sizeof(person*) and so on; on such a platform they are all 4 (on a 64-bit platform they are all 8). Why? I will do my best to explain. A pointer is itself a variable, but a special one: it is a variable that stores the address of another variable. On a 32-bit platform the address space of a program is 4 GB and the smallest addressable unit is the byte. 4 GB equals 2^32 bytes, so 32 bits are enough to encode every address, and 32 bits = 32/8 = 4 bytes. In other words, 4 bytes are enough to store any of these addresses. Therefore, on a 32-bit platform, sizeof applied to a pointer variable of any type yields 4!
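The pointer sizes can be checked without hard-coding 4 or 8. The sketch below is only an illustration: Person is a hypothetical user-defined type, and the helper names are invented here. It assumes a mainstream platform on which all object pointers have the same size, which the language itself does not strictly require.

```cpp
#include <cassert>
#include <cstddef>

struct Person { char name[32]; int age; };  // hypothetical user-defined type

// Size of a pointer on the platform the code is compiled for.
std::size_t pointer_size() { return sizeof(void*); }

// On mainstream platforms all object pointers have the same size,
// whatever they point to. (The language does not strictly require this.)
bool object_pointers_share_size() {
    return sizeof(int*) == sizeof(void*) &&
           sizeof(double*) == sizeof(void*) &&
           sizeof(Person*) == sizeof(void*);
}

void check_pointer_sizes() {
    // The pointee can be large while the pointer stays small.
    assert(sizeof(Person) > sizeof(Person*));
    assert(object_pointers_share_size());
}
```

On a typical 32-bit build pointer_size() returns 4, and on a typical 64-bit build it returns 8.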

Feature 3: sizeof can obtain the size of a statically allocated array!

int a[10]; int n = sizeof(a); If sizeof(int) is 4, then n = 10 * 4 = 40. Note in particular: char ch[] = "ABC"; sizeof(ch); yields 4, because the character array ends with a terminating '\0'! We often use sizeof to calculate the number of elements in an array: int n = sizeof(a) / sizeof(a[0]);
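The statements above can be put into a small sketch. The arrays a and ch follow the text; element_count and check_array_sizes are names invented for this example. The checks are written against sizeof(int), so they do not assume that int is 4 bytes.

```cpp
#include <cassert>
#include <cstddef>

int a[10];
char ch[] = "ABC";  // four chars: 'A', 'B', 'C' and the terminating '\0'

// Number of elements = total bytes / bytes per element.
std::size_t element_count() { return sizeof(a) / sizeof(a[0]); }

void check_array_sizes() {
    assert(sizeof(a) == 10 * sizeof(int));
    assert(sizeof(ch) == 4);
    assert(element_count() == 10);
}
```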

Take particular care when sizeof is applied to an array that is a function parameter. For example, assume the following function:

void fun(int array[10]) { int n = sizeof(array); }

What do you think the value of n is inside fun? If you answer 40, I am sorry to say you are wrong again. n is 4 on a 32-bit platform. In fact, whether the parameter is an int array, a float array, or an array of a user-defined type, and however many elements the array has, n is the size of a pointer! Why? Because when an array is passed as a function argument it is converted to a pointer. You may ask why. The reason can be found in many books; put simply, passing the whole array would require copying every element from the argument to the parameter, and for a large array that would make the call very inefficient. Passing only the address of the array (a pointer) means copying just 4 bytes.
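The adjustment of an array parameter to a pointer can be observed through the parameter's type. The sketch below uses decltype rather than sizeof(array), because some compilers warn about sizeof on an array parameter. It also shows one common alternative, passing a reference to the array, which keeps the element count. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Declared as int[10], but the parameter type is adjusted to int*.
bool parameter_is_pointer(int array[10]) {
    return std::is_same<decltype(array), int*>::value;
}

// A reference to an array keeps the element count as part of the type.
template <std::size_t N>
std::size_t element_count(const int (&)[N]) {
    return N;
}

void check_decay() {
    int data[10] = {};
    assert(parameter_is_pointer(data));
    assert(element_count(data) == 10);
}
```

Because the parameter really is an int*, sizeof(array) inside such a function can only report the pointer size.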

Feature 4: sizeof cannot calculate the size of dynamically allocated memory!

Suppose we have the following statements: int* a = new int[10]; int n = sizeof(a); What is the value of n? Is it 40? No! n equals 4, because a is a pointer and, as feature 2 explained, on a 32-bit platform every pointer is 4 bytes. Remember that the a here is different from the a in feature 3! Many people (even some teachers) believe that an array name is a pointer. It is not; there are many differences between the two. For details, see Expert C Programming. Features 3 and 4 show that arrays and pointers are closely related, and that relationship is a major source of potential program errors. Another article in this series, on the secrets of pointers and arrays in C/C++, covers it in detail.
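A short sketch of the same point, written so that it does not assume a 32-bit platform. The std::vector part is not from the original article; it is added as one common way to keep the element count alongside dynamically allocated storage. The function names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// sizeof applied to the pointer returned by new[] measures the pointer.
std::size_t size_of_new_result() {
    int* a = new int[10];
    std::size_t n = sizeof(a);  // size of int*, not 10 * sizeof(int)
    delete[] a;
    return n;
}

// std::vector records its element count, so the byte size can be computed.
std::size_t bytes_in_vector() {
    std::vector<int> v(10);
    return v.size() * sizeof(int);
}

void check_dynamic() {
    assert(size_of_new_result() == sizeof(int*));
    assert(bytes_in_vector() == 10 * sizeof(int));
}
```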

Feature 3 says that sizeof can calculate the size of a statically allocated array, and feature 4 says that it cannot calculate the size of dynamically allocated memory. Some people therefore conclude that sizeof is evaluated at compile time, and give this reason: the statement int array[sizeof(int) * 10]; compiles, and many books say that an array size must be known at compile time; since the statement compiles, sizeof must be evaluated at compile time. After further testing I found this conclusion somewhat hasty, or at least not rigorous! With a compiler that supports C99 variable-length arrays (such as Dev C++), an array can be sized at run time; that is, the statements int num; cin >> num; int array[num]; are accepted (note that VC 6.0 rejects them). In Dev C++ I then used int n = sizeof(array); cout << n << endl; to calculate the size of the array. It compiled, and after I entered 10 for num at run time the output n was 40! The value of num is only entered at run time, so sizeof could not have obtained the array size during compilation. In this case sizeof is evaluated at run time.

So is sizeof evaluated at compile time or at run time? In the original C standard, sizeof could only be evaluated at compile time. C99 later added that sizeof may be evaluated at run time (for variable-length arrays). Even so, in Dev C++, which supports this C99 feature, sizeof still cannot calculate the size of dynamically allocated memory!
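Standard C++ has no variable-length arrays, so there sizeof is always a constant expression. The sketch below shows only that compile-time side, using the int array[sizeof(int) * 10] declaration from the text; array_bytes and check_compile_time_size are hypothetical helper names. The run-time case needs a compiler with variable-length array support and is not shown.

```cpp
#include <cassert>
#include <cstddef>

// An array bound must be a constant expression in standard C++,
// so this declaration only compiles if sizeof(int) is known at compile time.
int array[sizeof(int) * 10];

// static_assert is also checked entirely by the compiler.
static_assert(sizeof(array) == sizeof(int) * 10 * sizeof(int),
              "array has sizeof(int) * 10 elements of type int");

std::size_t array_bytes() { return sizeof(array); }

void check_compile_time_size() {
    assert(array_bytes() == sizeof(int) * 10 * sizeof(int));
}
```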

Feature 5: sizeof cannot evaluate the length of an incomplete array!

Before discussing this feature, assume there are two source files, file1.cpp and file2.cpp. file1.cpp contains the following definitions:

int arraya[10] = {1, 2, 4, 5, 6, 7, 8, 9, 10};
int arrayb[10] = {11, 12, 13, 14, 15};

file2.cpp contains the following statements:

extern int arraya[];
extern int arrayb[10];
cout << sizeof(arraya) << endl; // compilation error!
cout << sizeof(arrayb) << endl;

The third statement in file2.cpp fails to compile, while the fourth is correct and outputs 40! Why? Because sizeof(arraya) tries to calculate the size of an incomplete array, that is, an array whose size has not been specified. sizeof evaluates the size of an object, but the declaration extern int arraya[]; only tells the compiler that arraya is an array of int; it does not say how many elements the array has. sizeof in file2.cpp therefore cannot calculate the size of arraya, and the compiler refuses to compile it.

So why can sizeof(arrayb) obtain the size of arrayb? The key is that file2.cpp declares extern int arrayb[10];, which explicitly tells the compiler that arrayb is an array of 10 ints, so its size is known.
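The two declarations can be tried in a single file. This is a sketch under one simplifying assumption: the definitions that the text places in file1.cpp are moved to the bottom of the same translation unit so that the example links on its own. The helper function names are invented here.

```cpp
#include <cassert>
#include <cstddef>

extern int arraya[];    // bound unknown here: incomplete type
extern int arrayb[10];  // bound known: complete type

// sizeof(arraya) would not compile at this point.
std::size_t size_of_arrayb() { return sizeof(arrayb); }

// The definitions would normally live in another file (file1.cpp in the text);
// they are placed here only to keep the sketch in one translation unit.
int arraya[10] = {1, 2, 4, 5, 6, 7, 8, 9, 10};
int arrayb[10] = {11, 12, 13, 14, 15};

// After the definition the bound of arraya is known as well.
std::size_t size_of_arraya_after_definition() { return sizeof(arraya); }

void check_extern_arrays() {
    assert(size_of_arrayb() == 10 * sizeof(int));
    assert(size_of_arraya_after_definition() == 10 * sizeof(int));
}
```

In the two-file setup of the text the definition is never visible in file2.cpp, so arraya stays incomplete there.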

This topic is nearly finished. The question actually extends to linking and compilation, but I am not confident that I can explain those two subjects thoroughly; I will discuss them later in this series.

Feature 6: when the expression is used as the sizeof operand, it returns the type size of the calculation result of the expression, but it does not evaluate the expression!

To illustrate this problem, let's look at the following program statements:

 
char ch = 1; int num = 1; int n1 = sizeof(ch + num); int n2 = sizeof(ch = ch + num);

Assume char occupies 1 byte and int occupies 4 bytes. What are the values of n1, n2, and ch after these statements run? I suspect many people will say that n1 equals n2, and many will say that ch equals 2. They are all wrong. In fact n1 is 4, n2 is 1, and ch is 1. Why? Consider the analysis:

Because of the implicit type conversions, the type of the expression ch + num is int, so n1 is 4! The expression ch = ch + num has type char: ch + num is computed as an int, but when the result is assigned to ch it is converted again, so the final type of the expression is char and n2 is 1. n1 and n2 are 4 and 1 because sizeof returns the size of the type of the expression's result, not the size of the largest type that appears in the expression!

As for n2 = sizeof(ch = ch + num);, at first glance the statement seems to add num to ch and assign the result to ch. It does not! Because sizeof only cares about the size of the type, it does not evaluate the expression at all, which can be confusing. For this reason I advise you not to put expressions with side effects directly inside sizeof. You can rewrite sizeof(ch = ch + num); as ch = ch + num; sizeof(ch);. The extra statement may look redundant, but it has real advantages: first, it is clearer; second, it avoids errors such as ch still being 1 (assuming the program's logic really was meant to execute ch = ch + num;).
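The three values discussed above can be checked with a small sketch. The statements follow the text; the struct ExpressionSizes and the function names are hypothetical. Some compilers may warn that the assignment inside sizeof has no effect, which is exactly the behavior being demonstrated.

```cpp
#include <cassert>
#include <cstddef>

struct ExpressionSizes {
    std::size_t n1;
    std::size_t n2;
    char ch;
};

ExpressionSizes measure_expressions() {
    char ch = 1;
    int num = 1;
    std::size_t n1 = sizeof(ch + num);       // type int after promotion
    std::size_t n2 = sizeof(ch = ch + num);  // type char; no assignment happens
    return ExpressionSizes{n1, n2, ch};
}

void check_expressions() {
    ExpressionSizes r = measure_expressions();
    assert(r.n1 == sizeof(int));
    assert(r.n2 == sizeof(char));
    assert(r.ch == 1);  // unchanged
}
```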

Feature 7: sizeof can calculate the size of the function call, and the obtained size is equal to the size of the return type, but the function body is not executed!

Suppose we have the following function (it is not a well-written function, but it illustrates the point):

int fun(int& num, const int& inc) { float div = 2.0; double ret = 0; num = num + inc; ret = num / div; return ret; }

And the following statements:

int a = 3; int b = 5; cout << sizeof(fun(a, b)) << endl; cout << a << endl;

What is the output? Different people give different answers. I will discuss the values of sizeof(fun(a, b)) and a separately:

First, the value of sizeof(fun(a, b)): it is 4, because sizeof applied to a function call yields the size of the function's return type; the return type of fun(a, b) is int, and sizeof(int) is 4. Many people confuse the function's return type with the type of the returned value and think sizeof(fun(a, b)) is 8, because the function returns ret, ret is defined as double, and sizeof(double) is 8. Note that although the returned value is a double, it is converted to the return type int when the function returns (a conversion that can lose information). Others mistakenly think sizeof(fun(a, b)) is 12, reasoning that fun defines two local variables, a float and a double, and sizeof(float) + sizeof(double) = 4 + 8 = 12. The answer seems plausible, but they have wrongly assumed that sizeof measures the variables inside the function, which is of course incorrect.

Next, the value of a: the correct answer is 3! Do you remember feature 6? This is similar: when the operand of sizeof is a function call, the function body is not executed! I therefore recommend not placing function calls inside the parentheses of sizeof; it is easy to assume that the function runs when in fact it does not run at all.
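Both claims, the size of the return type and the untouched value of a, can be checked together. In this sketch fun has the same shape as in the text, with an explicit cast added on the return. CallOutcome, measure_call and check_call are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Same shape as fun in the text: it would modify num if it ever ran.
int fun(int& num, const int& inc) {
    float div = 2.0f;
    double ret = 0;
    num = num + inc;
    ret = num / div;
    return static_cast<int>(ret);
}

struct CallOutcome {
    std::size_t size;
    int a;
};

CallOutcome measure_call() {
    int a = 3;
    int b = 5;
    std::size_t size = sizeof(fun(a, b));  // return type is int; fun is not called
    return CallOutcome{size, a};
}

void check_call() {
    CallOutcome r = measure_call();
    assert(r.size == sizeof(int));
    assert(r.a == 3);  // a was never modified
}
```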

Since sizeof yields the size of the function's return type, a natural conclusion follows: sizeof cannot be applied to a call of a function whose return type is void! For the reason, see feature 1. Similarly, for a call of a function that returns a pointer of any type, sizeof yields the pointer size (4 on a 32-bit platform). For the reason, see feature 2.

Finally, consider this statement: cout << sizeof(fun); What is the answer? In fact there is no answer, because it does not compile! At first I expected the output to be 4: fun is a function name, a function name is the function's address, and an address is a pointer, so sizeof(fun) should calculate the size of a pointer, which according to feature 2 is 4. But when I tried it, the compiler would not accept it! Why? I could not work it out right away, so I welcome explanations from readers!
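One explanation comes from the language rules: sizeof may not be applied to an expression of function type, and the usual conversion of a function name to a pointer is not performed on the operand of sizeof. Taking the address explicitly does give a pointer, whose size can be measured. A minimal sketch, with a hypothetical one-argument fun:

```cpp
#include <cassert>
#include <cstddef>

int fun(int x) { return x; }  // hypothetical function

// sizeof(fun) is not valid standard C++: fun has a function type here.
// Taking the address explicitly yields a pointer, which does have a size.
std::size_t function_pointer_size() { return sizeof(&fun); }

void check_function_pointer() {
    assert(function_pointer_size() == sizeof(int (*)(int)));
}
```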

Feature 8: the size of the struct (and its objects) obtained by sizeof is not equal to the sum of the size of each data member object!

The size of a struct depends closely on the alignment of its members; it is not simply the sum of the member sizes! For example, sizeof yields 16 and 24 for struct A and struct B below. As you can see, sizeof(B) is not equal to sizeof(int) + sizeof(double) + sizeof(int) = 16.

struct A {
    int num1;
    int num2;
    double num3;
};

struct B {
    int num1;
    double num3;
    int num2;
};

If you do not know about struct member alignment you will be surprised: struct A and struct B have the same members, only in a different order, so why do their sizes differ? To answer this we need the rules of struct member alignment. Because member alignment is quite complex, I will cover it in a separate article in this series on bit-fields and member alignment. Here I only summarize the rules:

1. The size of a struct is an integer multiple of the size of its largest member.

2. The offset of each member from the start of the struct is an integer multiple of the size of the member's type; for example, the offset of a double member from the start of the struct must be a multiple of 8. (Strictly speaking, both rules are about each type's alignment requirement, which for the basic types on most platforms equals the type's size.)

3. To satisfy rules 1 and 2, the compiler inserts padding bytes after struct members!

 

Using these three rules, let us see why sizeof(B) is 24. Assume the struct starts at address 0. The first member, num1, then starts at offset 0 (rule 2 is satisfied; in fact a struct never has padding before its first data member). Its type is int, so it occupies bytes 0 to 3. The second member, num3, is a double and occupies 8 bytes. Because num1 occupies only 4 bytes, rule 2 requires, via rule 3, four bytes of padding (bytes 4 to 7) after num1, so that num3 starts at offset 8 and occupies bytes 8 to 15. The third member, num2, is an int of size 4. num1, the padding, and num3 together occupy 16 bytes, so rule 2 is satisfied without further padding, and num2 occupies bytes 16 to 19. Is the total size of the struct then the 20 bytes from 0 to 19? Do not forget rule 1! The largest member of the struct is the double, which occupies 8 bytes, so four more bytes of padding are added after num2, making the total size of the struct 24.
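The layout walked through above can be inspected with offsetof and alignof. The sketch below is an illustration, not a statement about every platform: the figures 16 and 24 assume a 4-byte int and an 8-byte-aligned double, so the checks are expressed through alignof instead of hard-coded numbers. The helper names are invented.

```cpp
#include <cassert>
#include <cstddef>

struct A { int num1; int num2; double num3; };
struct B { int num1; double num3; int num2; };

// Offsets inside B: padding after num1 pushes num3 to its alignment boundary.
std::size_t offset_of_num3_in_b() { return offsetof(B, num3); }
std::size_t offset_of_num2_in_b() { return offsetof(B, num2); }

void check_layout() {
    // Rule 2: each member starts at a multiple of its alignment.
    assert(offset_of_num3_in_b() % alignof(double) == 0);
    assert(offset_of_num2_in_b() % alignof(int) == 0);
    // Rule 1: the struct size is a multiple of the struct's alignment.
    assert(sizeof(B) % alignof(B) == 0);
    // With these members, the reordered B is never smaller than A.
    assert(sizeof(B) >= sizeof(A));
}
```

Printing sizeof(A), sizeof(B) and the two offsets on the target platform is a quick way to confirm the analysis there.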

With these three rules and the analysis above, you can easily see why sizeof(A) is 16. Note that I have given three rules as conclusions without explaining the reasons behind them. You may have many questions: why must struct members be aligned, and why does rule 1 exist? If you have such questions and try to answer them, I am sure you will achieve a great deal before long, at least in learning C++. As mentioned above, I will write a separate article in this series on bit-fields and member alignment to answer these questions in detail. If you cannot wait, consult other material, such as the High Quality C++ Programming Guide.

Finally, a reminder: when designing a struct, arrange the order of its members carefully. As you have seen, struct B contains the same members as struct A in a slightly different order, yet B takes 50% more space than A. If a project needs an array of such structs, the wasted space can be considerable. Even if memory one day becomes as cheap as cabbage, you should not ignore this problem. Thrift is a fine tradition of the Chinese people, and we should carry it on!

Feature 9: sizeof cannot be used to calculate the size of the bitfield Member of the struct, but can obtain the size of the struct containing the bitfield member!

First, what is a bit-field? The size of a type is measured in bytes; for example, sizeof(char) is 1 byte and sizeof(int) is 4 bytes. The size of a type determines the range of values its variables can hold. For example, sizeof(char) is 1 byte and 1 byte is 8 bits, so a char variable has the range -128 to 127, or 0 to 255 (unsigned char); either way it can take only 2^8 = 256 distinct values! A bool, however, can only be true or false, so 1 bit (1/8 of a byte) would suffice, yet sizeof(bool) is usually 1. We can therefore say that a bool variable wastes 87.5% of its storage! That is unsuitable for devices with limited memory (such as embedded devices), so a mechanism is needed for allocating a variable's storage bit by bit: the bit-field. Put simply, a struct member variable followed by a colon and an integer is a bit-field. Consider the following struct:

 
struct A { bool b : 1; char ch1 : 4; char ch2 : 4; } item;



Here b, ch1, and ch2 are all bit-field members. The intent is for the bool variable b to occupy only 1 bit and for ch1 and ch2 to occupy only 4 bits each, so that memory is used sparingly (in practice this bit-level economy sometimes works and sometimes does not; I will discuss it in the article on bit-fields and member alignment). Note the following: the C language specifies that a bit-field may only have type int, signed int, or unsigned int; C++ additionally allows other integral types such as char and long! You cannot write a bit-field such as float f : 8; it does not compile. Bit-field variables cannot be defined at function or global scope. They can only be members of a struct, class, or union!

With the struct above, statements such as sizeof(item.b) and sizeof(item.ch1), which try to calculate the size of a bit-field member, do not compile. The reason can be found in the overview: sizeof returns the size of its operand in bytes!

You may ask whether sizeof(A) compiles and, if so, how its value is determined. Good questions. I have not seen them discussed anywhere (perhaps I have not read enough). They did not occur to me until I saw that sizeof(item.b) failed to compile, and after some experiments I reached this conclusion: sizeof can be used to calculate the size of a struct that contains bit-fields. The rules for the result are complex, however: they involve not only member alignment but also the specific compiler! For now it is enough to know that sizeof can be applied to a struct containing bit-fields. The rules themselves are described in detail in the article in this series on bit-fields and member alignment.
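A minimal sketch of the struct from this section. Because the size of a struct with bit-fields depends on the compiler, the sketch only returns sizeof(A) and makes no claim about its value. The helper names struct_size, stored_ch1 and check_bit_fields are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>

struct A {
    bool b : 1;
    char ch1 : 4;
    char ch2 : 4;
};

// sizeof(item.b) would not compile: a bit-field has no size in bytes.
// The enclosing struct does have one, chosen by the compiler.
std::size_t struct_size() { return sizeof(A); }

// Bit-fields still hold values like ordinary members.
int stored_ch1() {
    A item = {};
    item.b = true;
    item.ch1 = 5;  // fits in 4 bits whether char is signed or unsigned
    item.ch2 = 3;
    return item.ch1;
}

void check_bit_fields() {
    assert(struct_size() >= 1);
    assert(stored_ch1() == 5);
}
```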

Postscript:

That nearly concludes this topic. It does not cover everything there is to know about sizeof, but it covers almost all of the error-prone features. The article took me three and a half days to write. As it is the first topic in this series I have been extremely careful, for fear that wrong information would mislead readers. Even so, mistakes and omissions are inevitable; please correct me!

I also have a few words for university students: textbooks usually cover only the basics. To study a subject in depth you also need to read other material, such as papers, online resources, and forum and blog posts. Most important of all is to summarize and take notes regularly as you learn; if you keep it up, you will benefit greatly.
