sizeof and sizeof(string) Problems

Source: Internet
Author: User

Today I was reading the "Programmer interview book" (to prepare for an upcoming Microsoft written examination) and came across the sizeof(string) problem. My test with Dev-C++ gave 4, which was hard to understand. Searching the Internet turned up the following:

string strarr1[] = {"trend", "micro", "soft"};

sizeof(strarr1) = 12


From: http://apps.hi.baidu.com/share/detail/30398570

About sizeof(string): I was a little surprised to see this expression in the interview book today. The book says sizeof(string) = 4. My first thought was: can a string really be given only 4 bytes of memory? After reading up on it, the conclusion is that string may be implemented differently in different libraries, but within one library the common point is that no matter how long the text in a string is, its sizeof() is fixed. The space occupied by the characters is allocated dynamically from the heap and has nothing to do with sizeof.
sizeof(string) = 4 may be one of the most typical implementations, but there are also library implementations where sizeof() is 12 or 32 bytes. In my test with VC6.0, sizeof(string) = 16, so the value depends on the compiler and its library.
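The claim above can be checked with a small sketch. It assumes only the standard std::string; the helper name object_size is made up for illustration. The absolute number is implementation-specific, so the sketch compares sizes instead of printing a particular value.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: size in bytes of the string object itself,
// not of the characters it manages.
std::size_t object_size(const std::string& s) {
    return sizeof(s);
}
```

Calling object_size on a one-character string and on std::string(1000, 'x') gives the same number, whatever that number happens to be for the library in use.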

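#include <iostream>
#include <string>
using namespace std;

int main()
{
    string a[] = {"aaaaa", "bbbb", "ccc"};
    int x = sizeof(a);       // size of the whole array: 3 * sizeof(string)
    int y = sizeof(string);  // size of one string object
    cout << x << endl;
    cout << y << endl;
    return 0;
}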

Running result:

For more usage of sizeof, see: http://hi.baidu.com/haijiaoshu/blog/item/a269f527706b910a908f9d5b.html

1. What is sizeof

First, let's take a look at the definition of sizeof on msdn:

The sizeof keyword gives the amount of storage, in bytes, associated with a variable or a type (including aggregate types). This keyword returns a value of Type size_t.

Did the word "returns" make you think of a function? Wrong: sizeof is not a function. Have you ever passed an argument to a function without parentheses? sizeof allows that, so it cannot be a function. The C++ standard classifies sizeof as a unary operator, but I find it more helpful to think of it as a special compile-time construct, because its value is computed during compilation and its operand is never evaluated. For example:

 

cout << sizeof(int) << endl;    // an int is 4 bytes on a 32-bit machine
cout << sizeof(1 == 2) << endl; // the == operator yields bool, so this is equivalent to cout << sizeof(bool) << endl;

At compile time these are translated into:

cout << 4 << endl;
cout << 1 << endl;

Here is a trap. Look at the following program:

int a = 0;
cout << sizeof(a = 3) << endl;
cout << a << endl;

Why is the output 4 and 0 rather than the expected 4 and 3? The reason is that sizeof is processed at compile time. The expression inside the parentheses is never compiled into machine code; only its type is used. The = operator yields the type of its left operand, so a = 3 is treated as int, and the code effectively becomes:

int a = 0;
cout << 4 << endl;
cout << a << endl;

Therefore sizeof does not support chained expressions: any side effect written inside it, such as an assignment or an increment, never takes place. This is what sets it apart from an ordinary unary operator.

Conclusion: do not treat sizeof as a function or as an ordinary unary operator; treat it as a special construct handled at compile time.
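The trap can be reproduced in isolation with a minimal sketch. The function name value_after_sizeof is hypothetical; some compilers warn about the unevaluated assignment, which is exactly the behavior being illustrated.

```cpp
#include <cstddef>

// Hypothetical helper: returns the value of a after "a = 3" has
// appeared inside sizeof.
int value_after_sizeof() {
    int a = 0;
    std::size_t n = sizeof(a = 3);  // operand is not evaluated; only its type (int) is used
    (void)n;                        // n equals sizeof(int); silence the unused-variable warning
    return a;                       // still 0
}
```

The assignment never runs, so the function returns 0 regardless of how large int is on the platform.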

2. sizeof usage

sizeof has two forms:

(1) sizeof(object)

That is, sizeof applied to an object (an expression). It can also be written without parentheses as sizeof object.

(2) sizeof(typename)

That is, sizeof applied to a type. In this case writing sizeof typename without parentheses is invalid. Some examples:

int i = 2;
cout << sizeof(i) << endl;   // sizeof(object) form, valid
cout << sizeof i << endl;    // sizeof object form, valid
cout << sizeof 2 << endl;    // 2 is treated as an object of type int; sizeof object form, valid
cout << sizeof(2) << endl;   // 2 is treated as an object of type int; sizeof(object) form, valid
cout << sizeof(int) << endl; // sizeof(typename) form, valid
cout << sizeof int << endl;  // error! A type name must be enclosed in ()

As we can see, adding () is always correct.

Conclusion: whatever the operand of sizeof is, it is best to add ().

3. sizeof of Data Types

(1) Built-in Data Types of C++

The basic data types in 32-bit C++ are char, short int (short), int, long int (long), float, double, and long double.

Their sizes are, respectively: 1, 2, 4, 4, 4, 8, and 10.

Consider the following code:

cout << (sizeof(unsigned int) == sizeof(int)) << endl; // equal, outputs 1

unsigned only changes the meaning of the highest bit; it does not change the length of the data.

Conclusion: unsigned does not affect the value of sizeof.
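The concrete numbers listed above belong to one 32-bit compiler. What the language itself guarantees is weaker, and can be written down as compile-time checks. The sketch below assumes a C++11 compiler; the helper int_bits is a made-up name.

```cpp
#include <climits>

// Relations that hold on every conforming implementation
static_assert(sizeof(char) == 1, "char is one byte by definition");
static_assert(sizeof(short) <= sizeof(int), "short is never larger than int");
static_assert(sizeof(int) <= sizeof(long), "int is never larger than long");

// unsigned changes the interpretation of the bits, not the storage size
static_assert(sizeof(unsigned int) == sizeof(int), "same size as int");
static_assert(sizeof(unsigned long) == sizeof(long), "same size as long");

// Hypothetical helper: width of int in bits on the platform being compiled for
int int_bits() {
    return static_cast<int>(sizeof(int) * CHAR_BIT);
}
```

If any of these relations failed, the file would not compile, so a successful build is itself the check.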

(2) Custom Data Types

typedef can be used to define custom type names in C++. Consider the following:

typedef short WORD;
typedef long DWORD;

cout << (sizeof(short) == sizeof(WORD)) << endl; // equal, outputs 1
cout << (sizeof(long) == sizeof(DWORD)) << endl; // equal, outputs 1

Conclusion: the sizeof of a custom type name is the same as that of the original type.

(3) Function Type

Consider the following questions:

int f1() { return 0; }
double f2() { return 0.0; }
void f3() {}

cout << sizeof(f1()) << endl; // f1() returns int, so this is sizeof(int)
cout << sizeof(f2()) << endl; // f2() returns double, so this is sizeof(double)
cout << sizeof(f3()) << endl; // error! sizeof cannot be applied to void
cout << sizeof(f1) << endl;   // error! sizeof cannot be applied to a function (sizeof(&f1), a function pointer, would be fine)
cout << sizeof *f2 << endl;   // error! *f2 still designates the function itself, not a call to it, so this is again sizeof applied to a function

Conclusion: when sizeof is applied to a function call, it is replaced at compile time by the size of the function's return type; the function itself is not called.
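That the function is not called can be made visible with a call counter. This is a minimal sketch; the names calls, counted and size_of_result are invented for the illustration.

```cpp
#include <cstddef>

int calls = 0;                        // counts real calls to counted()

int counted() { ++calls; return 0; }

// sizeof only inspects the return type; counted() is not executed here.
std::size_t size_of_result() {
    return sizeof(counted());
}
```

After calling size_of_result(), the counter calls is still 0 and the returned value equals sizeof(int).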

4. Pointer Problems

Consider the following:

cout << sizeof(string*) << endl; // 4
cout << sizeof(int*) << endl;    // 4
cout << sizeof(char***) << endl; // 4

We can see that whatever the pointer type, the size is 4, because on this platform a pointer is a 32-bit address.

Conclusion: on a 32-bit platform the pointer size is 4 (on a 64-bit platform it is typically 8).
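To avoid hard-coding 4 or 8, the pointer width can be expressed relative to sizeof(void*). The sketch below assumes a mainstream platform on which all object pointers have the same size (the standard does not strictly require this); pointer_bytes is a hypothetical helper.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: pointer width in bytes for the target being compiled for
constexpr std::size_t pointer_bytes() {
    return sizeof(void*);
}

// On mainstream platforms every object pointer has this same width,
// no matter what it points to.
static_assert(sizeof(int*) == pointer_bytes(), "int* has the common pointer width");
static_assert(sizeof(char***) == pointer_bytes(), "char*** has the common pointer width");
static_assert(sizeof(std::string*) == pointer_bytes(), "string* has the common pointer width");
```

Built as a 32-bit program, pointer_bytes() is 4; built as a 64-bit program on common desktop systems, it is 8.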

By the way, a pointer in 32-bit C++ holds the actual memory address. Unlike 16-bit C, it drops the memory models (small, medium, large) and replaces them with a single flat model. The flat model uses a 32-bit linear address instead of the segment:offset scheme of 16-bit C. For example, for a pointer to the address f000:8888, a near pointer in 16-bit C is 8888 (16 bits, storing only the offset and omitting the segment), a far pointer is f0008888 (32 bits, with the segment in the high half and the offset in the low half), and a flat-model pointer is f8888 (32 bits, equal to segment * 16 + offset, but with a much larger addressing range).

5. Array Problems

Consider the following:

char a[] = "abcdef";
int b[20] = {3, 4};
char c[2][3] = {"aa", "bb"};

cout << sizeof(a) << endl; // 7
cout << sizeof(b) << endl; // 20 * 4 = 80
cout << sizeof(c) << endl; // 6

The size of array a is not given in its definition; the space allocated at compile time is determined by the initializer, which is 7 bytes (six characters plus the terminating '\0'). c is a multi-dimensional array, and the space it occupies is the product of its dimensions, that is, 6. So the size of an array is the space allocated at compile time: the product of all dimensions multiplied by the size of one element.

Conclusion: the size of an array is the product of all its dimensions * the size of an array element.
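The same three arrays can be checked at compile time. The sketch writes the int array's size in terms of sizeof(int) so that it does not depend on int being 4 bytes; b_count is an illustrative name.

```cpp
#include <cstddef>

char a[] = "abcdef";
int b[20] = {3, 4};
char c[2][3] = {"aa", "bb"};

static_assert(sizeof(a) == 7, "six characters plus the terminating null");
static_assert(sizeof(b) == 20 * sizeof(int), "dimension times element size");
static_assert(sizeof(c) == 2 * 3 * sizeof(char), "product of both dimensions");

// Element count of b, computed in the scope where the array type is visible
const std::size_t b_count = sizeof(b) / sizeof(b[0]);
```

Dividing by sizeof(b[0]) rather than sizeof(int) keeps the element count correct even if the element type of b is changed later.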

There is a trap:

int* d = new int[10];
cout << sizeof(d) << endl; // 4

d is what we usually call a dynamic array, but it is actually a pointer, so sizeof(d) is 4 (the pointer size), not the size of the ten ints it points to.

Consider the following:

double* (*a)[3][6];

cout << sizeof(a) << endl;    // 4
cout << sizeof(*a) << endl;   // 72
cout << sizeof(**a) << endl;  // 24
cout << sizeof(***a) << endl; // 4
cout << sizeof(****a) << endl; // 8

a is a rather strange declaration: it is a pointer to an array of type double*[3][6]. Since it is a pointer, sizeof(a) is 4.

Since a is a pointer to double*[3][6], *a is a multi-dimensional array of type double*[3][6], so sizeof(*a) = 3 * 6 * sizeof(double*) = 72. Similarly, **a is an array of type double*[6], so sizeof(**a) = 6 * sizeof(double*) = 24. ***a is a single element, that is, a double*, so sizeof(***a) = 4. Finally, ****a is a double, so sizeof(****a) = sizeof(double) = 8.
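The numbers 72, 24 and 4 assume 4-byte pointers. The same relationships can be written relative to sizeof(double*) so that they also hold on a 64-bit build. This is a sketch under the assumption that a pointer to an array is as wide as void*, which is true on mainstream platforms.

```cpp
double* (*a)[3][6];   // pointer to a 3x6 array of double*

// sizeof never dereferences a, so the uninitialized pointer is harmless here.
static_assert(sizeof(a) == sizeof(void*), "a is just a pointer");
static_assert(sizeof(*a) == 3 * 6 * sizeof(double*), "the whole 3x6 array");
static_assert(sizeof(**a) == 6 * sizeof(double*), "one row of six pointers");
static_assert(sizeof(***a) == sizeof(double*), "a single element");
static_assert(sizeof(****a) == sizeof(double), "what an element points to");
```

Each extra * strips one level from the type, and sizeof reports the size of whatever type is left.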

6. The Problem of Passing Arrays to Functions

Consider the following:

#include <cstdlib>
#include <iostream>
using namespace std;

int sum(int i[])
{
    int sumofi = 0;
    for (int j = 0; j < sizeof(i) / sizeof(int); j++) // in fact sizeof(i) = 4 here
    {
        sumofi += i[j];
    }
    return sumofi;
}

int main()
{
    int allages[6] = {21, 22, 22, 19, 34, 12};
    cout << sum(allages) << endl;
    system("pause");
    return 0;
}

The intent of sum is to obtain the size of the array with sizeof and then add up the elements. But what is actually passed into sum is only a pointer to int, so sizeof(i) = 4 rather than 24, and the result is wrong. To solve this, pass a pointer or a reference to the array.

Using a pointer:

int sum(int (*i)[6])
{
    int sumofi = 0;
    for (int j = 0; j < sizeof(*i) / sizeof(int); j++) // sizeof(*i) = 24
    {
        sumofi += (*i)[j];
    }
    return sumofi;
}

int main()
{
    int allages[] = {21, 22, 22, 19, 34, 12};
    cout << sum(&allages) << endl;
    system("pause");
    return 0;
}

In this sum, i is a pointer to an int[6]. Note that the function cannot be declared as int sum(int (*i)[]); the size of the array must be specified, otherwise sizeof(*i) cannot be computed. But in that case using sizeof to compute the array size is pointless, because the size is already fixed at 6.

Using a reference is similar to using a pointer:

int sum(int (&i)[6])
{
    int sumofi = 0;
    for (int j = 0; j < sizeof(i) / sizeof(int); j++)
    {
        sumofi += i[j];
    }
    return sumofi;
}

int main()
{
    int allages[] = {21, 22, 22, 19, 34, 12};
    cout << sum(allages) << endl;
    system("pause");
    return 0;
}

Here too, computing the size with sizeof is pointless. So when a function takes an array as a parameter and needs to traverse it, the function should have another parameter that gives the array size, and that size should be computed with sizeof in the scope where the array is defined. The correct form of the above function is therefore:

#include <cstdlib>
#include <iostream>
using namespace std;

int sum(int* i, unsigned int n)
{
    int sumofi = 0;
    for (unsigned int j = 0; j < n; j++)
    {
        sumofi += i[j];
    }
    return sumofi;
}

int main()
{
    int allages[] = {21, 22, 22, 19, 34, 12};
    cout << sum(allages, sizeof(allages) / sizeof(int)) << endl; // size computed where the array is defined
    system("pause");
    return 0;
}
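As an alternative to passing the size by hand, a function template can let the compiler deduce it from the array type. This is a sketch of that different technique, not the form used above; the name sum_array is made up.

```cpp
#include <cstddef>

// N is deduced from the argument's array type, so the element count
// travels with the reference and no sizeof arithmetic is needed.
template <std::size_t N>
int sum_array(const int (&values)[N]) {
    int total = 0;
    for (std::size_t j = 0; j < N; ++j) {
        total += values[j];
    }
    return total;
}
```

With int allages[] = {21, 22, 22, 19, 34, 12}; the call sum_array(allages) deduces N = 6. Passing a plain int* does not compile, which turns the silent bug from the first version into a compile-time error.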

7. sizeof and strlen of Strings

Consider the following:

char a[] = "abcdef";
char b[20] = "abcdef";
string s = "abcdef";

cout << strlen(a) << endl; // 6, the string length
cout << sizeof(a) << endl; // 7, the string capacity
cout << strlen(b) << endl; // 6, the string length
cout << sizeof(b) << endl; // 20, the string capacity
cout << sizeof(s) << endl; // 12 here; this is not the length of the string but the size of the string class
cout << strlen(s) << endl; // error! s is not a character pointer

a[1] = '\0';
cout << strlen(a) << endl; // 1
cout << sizeof(a) << endl; // 7, sizeof is constant

strlen counts the characters from the given address up to the first zero byte and is executed at run time, whereas sizeof gives the size of the data, which here is the capacity of the array. The sizeof value of a given object is therefore constant. string is the C++ string type and is a class, so sizeof(s) is not the length of the string but the size of the string class. strlen(s) is simply wrong, because the parameter of strlen is a character pointer. If you want to use strlen to get the length of s, write strlen(s.c_str()), because the member function c_str() returns the address of the first character. In fact the string class provides its own member functions for the capacity and the length, namely capacity() and length(). string encapsulates the common string operations, so in C++ development it is best to use string in place of C-style strings.
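The run-time nature of strlen and the c_str() route for std::string can be sketched as follows. The helper names c_length and length_after_cut are hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: length of a std::string measured the C way.
// It agrees with s.length() as long as s holds no embedded '\0'.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Hypothetical helper: strlen follows the contents, sizeof does not.
std::size_t length_after_cut() {
    char a[] = "abcdef";
    a[1] = '\0';             // cut the string after the first character
    return std::strlen(a);   // 1 at run time, while sizeof(a) remains 7
}
```

In ordinary code s.length() is the more direct choice; c_length only shows what strlen needs in order to work on a string object.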

8. Understanding CPU Alignment Through sizeof of a union

Consider the following (default alignment):

union u
{
    double a;
    int b;
};

union u2
{
    char a[13];
    int b;
};

union u3
{
    char a[13];
    char b;
};

cout << sizeof(u) << endl;  // 8
cout << sizeof(u2) << endl; // 16
cout << sizeof(u3) << endl; // 13

We all know that the size of a union is determined by the member that occupies the most space. For u that is the double member a, so sizeof(u) = sizeof(double) = 8. For u2 and u3 the largest member is the char[13] array in both cases, so why is the size of u3 13 while that of u2 is 16? The key is the member int b in u2. Because of the int member, the alignment of u2 becomes 4, which means the size of u2 must be a multiple of 4, and the occupied space is rounded up to 16 (the nearest multiple of 4 at or above 13).

Conclusion: the alignment of a composite data type, such as a union, struct, or class, is the alignment of its member with the largest alignment.
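The rounding rule can be written as a small formula and checked with the C++11 alignof operator. The sketch assumes a mainstream compiler with default packing; round_up is an illustrative helper, not a standard function.

```cpp
#include <cstddef>

union U2 { char a[13]; int b; };
union U3 { char a[13]; char b; };

// Round n up to the next multiple of align
constexpr std::size_t round_up(std::size_t n, std::size_t align) {
    return (n + align - 1) / align * align;
}

// The union takes the strictest alignment among its members ...
static_assert(alignof(U2) == alignof(int), "U2 is aligned like int");
static_assert(alignof(U3) == alignof(char), "U3 is aligned like char");

// ... and its size is the largest member rounded up to that alignment.
static_assert(sizeof(U2) == round_up(13, alignof(U2)), "13 rounded up to int alignment");
static_assert(sizeof(U3) == 13, "no padding needed");
```

With a 4-byte int, round_up(13, 4) is 16, which matches the value in the text.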

By the way, 32-bit C++ uses 8-byte alignment by default to improve running speed: the compiler tries to place data on aligned boundaries so that memory can be accessed efficiently. The alignment can be changed with the #pragma pack(x) directive; the default value is 8. For a built-in type, C++ uses the smaller of the compiler's alignment setting and the type's own size. For example, if the compiler is told to align on 2 and int has size 4, then int is aligned on 2, the smaller of 2 and 4. Under the default setting, almost no built-in type is larger than 8 (long double being the exception), so the alignment of every built-in type can be taken to be its own size. Now change the program above:

#pragma pack(2)

union u2
{
    char a[13];
    int b;
};

union u3
{
    char a[13];
    char b;
};

#pragma pack(8)

cout << sizeof(u2) << endl; // 14
cout << sizeof(u3) << endl; // 13

Because the alignment was manually changed to 2, the alignment of int also becomes 2. The largest member alignment in u2 is therefore 2, and now sizeof(u2) = 14.

Conclusion: the alignment of a built-in C++ type is the smaller of the compiler's alignment setting and the type's own size.
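Instead of resetting the alignment to a hard-coded 8 afterwards, the push and pop forms of the pragma save and restore whatever setting was active. The sketch below assumes a compiler that supports #pragma pack(push, n) and #pragma pack(pop) (MSVC, GCC and Clang do) and a 4-byte int; the union names are made up.

```cpp
#pragma pack(push, 2)   // save the current setting, then cap member alignment at 2
union PackedU2 { char a[13]; int b; };
#pragma pack(pop)       // restore the setting that was active before the push

union DefaultU2 { char a[13]; int b; };

static_assert(alignof(PackedU2) == 2, "int member is capped at alignment 2");
static_assert(sizeof(PackedU2) == 14, "13 rounded up to a multiple of 2");
static_assert(sizeof(DefaultU2) == 16, "13 rounded up to a multiple of 4");
```

Because the pop restores the previous state, code that follows is unaffected even if the project was built with a non-default packing option.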

9. sizeof of a struct

Because of alignment, sizeof of a struct is more complicated. See the following example (default alignment):

struct s1
{
    char a;
    double b;
    int c;
    char d;
};

struct s2
{
    char a;
    char b;
    int c;
    double d;
};

cout << sizeof(s1) << endl; // 24
cout << sizeof(s2) << endl; // 16

Both structs contain two chars, one int and one double, yet their sizes differ because of alignment. The size of a struct can be worked out by placing its members one at a time. First determine the alignment of the struct: by the conclusion of the previous section, for both s1 and s2 it is that of the member with the largest alignment, the double, which is 8. Then place each member in turn.

For s1, first place a on an 8-aligned address, say 0. The next free address is then 1, but the next member b is a double with alignment 8, and the closest aligned address after 1 is 8, so b is placed at 8. The next free address becomes 16. The next member c has alignment 4, and 16 is a multiple of 4, so c is placed at 16. The next free address becomes 20. The next member d has alignment 1, so 20 is fine and d is placed at 20. The members end at address 21. Because the size of s1 must be a multiple of 8, addresses 21 to 23 are kept as padding, and the size of s1 becomes 24.

For s2, first place a on an 8-aligned address, say 0. The next free address is 1, and the next member b also has alignment 1, so b is placed at 1. The next free address becomes 2. The next member c has alignment 4, so take the closest aligned address after 2, which is 4, and place c there. The next free address becomes 8. The next member d has alignment 8, so d is placed at 8. After all members are placed, the struct ends at address 15 and occupies 16 bytes in total, which is exactly a multiple of 8.
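The placement walk-through can be compared against what the compiler actually does by using the standard offsetof macro. The sketch assumes a platform where double is 8-byte aligned inside a struct (typical for x86-64 and for 32-bit Windows); the array name s1_offsets is made up.

```cpp
#include <cstddef>

struct S1 { char a; double b; int c; char d; };
struct S2 { char a; char b; int c; double d; };

// Byte offsets of S1's members, in declaration order.
// Under the stated assumption these are 0, 8, 16, 20.
const std::size_t s1_offsets[] = {
    offsetof(S1, a), offsetof(S1, b), offsetof(S1, c), offsetof(S1, d)
};

// Byte offsets of S2's members: 0, 1, 4, 8 under the same assumption.
const std::size_t s2_offsets[] = {
    offsetof(S2, a), offsetof(S2, b), offsetof(S2, c), offsetof(S2, d)
};
```

Printing the two arrays next to sizeof(S1) and sizeof(S2) shows where the padding sits: after a and after d in S1, and only between b and c in S2.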

There is a trap here. For a member that is itself a struct, do not take its size to be its alignment. See the following example:

struct s1
{
    char a[8];
};

struct s2
{
    double d;
};

struct s3
{
    s1 s;
    char c;
};

struct s4
{
    s2 s;
    char c;
};

cout << sizeof(s1) << endl; // 8
cout << sizeof(s2) << endl; // 8
cout << sizeof(s3) << endl; // 9
cout << sizeof(s4) << endl; // 16

s1 and s2 both have size 8, but the alignment of s1 is 1 (char) while that of s2 is 8 (double). That is why s3 and s4 differ.
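The difference between size and alignment can be queried directly with alignof. The sketch assumes C++11 and an 8-byte double; the checks are written so that they do not depend on the exact alignment the platform gives to double.

```cpp
struct S1 { char a[8]; };
struct S2 { double d; };
struct S3 { S1 s; char c; };
struct S4 { S2 s; char c; };

// Same size, different alignment
static_assert(sizeof(S1) == sizeof(S2), "both occupy 8 bytes");
static_assert(alignof(S1) == 1, "an array of char needs no alignment");
static_assert(alignof(S2) > alignof(S1), "double is more strictly aligned");

// The enclosing struct inherits the alignment, not the size, of its member
static_assert(alignof(S3) == alignof(S1), "S3 is aligned like S1");
static_assert(alignof(S4) == alignof(S2), "S4 is aligned like S2");
static_assert(sizeof(S3) == 9, "no tail padding");
static_assert(sizeof(S4) > sizeof(S3), "tail padding up to the alignment of double");
```

The trailing char costs one byte in S3 but a whole alignment unit in S4.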

Therefore, when you define a struct and space is tight, take alignment into account when ordering its members.

10. Do Not Let double Interfere with Your Bit Fields

In structs and classes, a bit field can be used to specify how much space a member occupies, so bit fields can reduce the size of a struct to some extent. But consider the following code:

struct s1
{
    int i : 8;
    int j : 4;
    double b;
    int a : 3;
};

struct s2
{
    int i;
    int j;
    double b;
    int a;
};

struct s3
{
    int i;
    int j;
    int a;
    double b;
};

struct s4
{
    int i : 8;
    int j : 4;
    int a : 3;
    double b;
};

cout << sizeof(s1) << endl; // 24
cout << sizeof(s2) << endl; // 24
cout << sizeof(s3) << endl; // 24
cout << sizeof(s4) << endl; // 16

As you can see, a double in the middle interferes with the bit fields (for how sizeof is computed, see the previous section): it splits them into separate storage units and adds padding. So when using bit fields, it is best to put float and double members at the beginning or the end of the struct.
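The effect of member order can be checked without relying on exact numbers, since bit-field layout is implementation-defined. The sketch below only asserts that grouping the bit fields never costs more than interleaving them with the double; the struct names are made up.

```cpp
// Bit fields split by a double: they cannot share one storage unit
struct Interleaved {
    int i : 8;
    int j : 4;
    double b;
    int a : 3;
};

// Bit fields kept together: 15 bits fit into a single int-sized unit
struct Grouped {
    int i : 8;
    int j : 4;
    int a : 3;
    double b;
};

static_assert(sizeof(Grouped) <= sizeof(Interleaved),
              "keeping the bit fields adjacent is never worse");
static_assert(sizeof(Grouped) >= sizeof(double) + 2,
              "15 bits of fields need at least 2 bytes next to the double");
```

On a typical compiler with 4-byte int and 8-byte aligned double, the two sizes are 24 and 16, matching s1 and s4 in the text.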
