C++ Data Types

Source: Internet
Author: User

Compilation and runtime environment for the code: VS2012 + Win32 + Debug.

1. Introduction to C++ Data Types

C++ is a strongly typed language. Every variable (and function) in a C++ program must follow the "declare before use" principle. A data type serves two purposes: it determines how values of that type are stored in memory, and it determines which operations can legally be performed on them.

C++ data types are divided into basic data types and non-basic data types. Non-basic data types are usually called composite data types or constructed data types. To reflect how C++ differs from traditional C in this respect, this article calls the non-basic data types that embody object-oriented features constructed data types, and calls the remaining non-basic data types composite data types. The data types of C++ are shown in the following figure:

[Figure: classification of C++ data types]

Basic data types are predefined by C++ and are also known as built-in data types. Non-basic data types are created by the user as needed, following the C++ syntax rules. The difference between a constructed data type and a composite data type is that an instance of a constructed data type is called an object, which is a collection of attributes and methods. Constructed data types were introduced by C++ and reflect the object-oriented programming philosophy. A notable feature of a constructed data type is that when an instance is created, a constructor defined by the type is called automatically. In other words, a variable of a constructed data type is initialized by its constructor.
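As a minimal sketch of the automatic constructor call described above, consider the following. The class Point and the helper count_constructor_calls are hypothetical names introduced only for illustration; they do not come from the original article.

```cpp
#include <cassert>

// Hypothetical constructed type, used only for illustration.
class Point {
public:
    // The constructor runs automatically whenever a Point object is created.
    Point(int x, int y) : x_(x), y_(y) { ++instances_; }
    int x() const { return x_; }
    int y() const { return y_; }
    static int instances() { return instances_; }
private:
    int x_;
    int y_;
    static int instances_;  // counts how many times the constructor has run
};

int Point::instances_ = 0;

// Creates two objects and reports how many constructor calls that caused.
int count_constructor_calls() {
    const int before = Point::instances();
    Point a(1, 2);
    Point b(3, 4);
    assert(a.x() == 1 && b.y() == 4);  // members were initialized by the constructor
    return Point::instances() - before;
}
```

No explicit initialization call appears in count_constructor_calls: defining a and b is enough to run the constructor once for each object.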

Note: When a variable is defined with a basic data type, the type comes first and the variable name follows it directly. When a variable is defined with a composite data type, however, the name does not necessarily come after the type. For example, in the array definition int a[8], the type of the identifier a is int[8], yet a appears in the middle of the type. In addition, the type must not be wrapped in parentheses when defining or declaring a variable. For example, it is wrong to define a pointer as (int *) p; because that is the syntax of a forced type conversion (a cast) that converts p to int *.
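The note above can be checked with a short sketch. It assumes a compiler with C++11 support for decltype and static_assert; the function name cast_recovers_pointer is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

int a[8];     // the identifier a sits in the middle of its type, int[8]
int* p = a;   // declaration: the type int* comes first, then the variable p
// (int*) p;  // NOT a declaration: this would be a cast applied to p

static_assert(std::is_same<decltype(a), int[8]>::value, "a has type int[8]");
static_assert(std::is_same<decltype(p), int*>::value, "p has type int*");

// The parenthesized form converts an existing value to another type.
bool cast_recovers_pointer() {
    void* raw = a;        // any object pointer converts to void* implicitly
    int* q = (int*) raw;  // forced type conversion back to int*
    assert(q != nullptr);
    return q == a;
}
```

The two static_assert lines are evaluated at compile time, so the program only builds if the types are exactly as the note describes.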

2. Wide Characters and Single-Byte Characters

The traditional character type char is a single-byte type that stores a character's ASCII code and occupies one byte. A char can also be treated as a single-byte integer; in Visual C++, where plain char is signed by default, its value range is -128 to 127. A single-byte unsigned integer is written as unsigned char and has a value range of 0 to 255.
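The ranges above can be read from std::numeric_limits, as in this small sketch. It assumes C++11 (constexpr) and the usual 8-bit byte; the constant names and code_of are hypothetical.

```cpp
#include <limits>

// sizeof(char) is 1 by definition in C++.
static_assert(sizeof(char) == 1, "char is one byte");

// Ranges of the explicitly signed and unsigned single-byte integer types.
constexpr int kSignedCharMin   = std::numeric_limits<signed char>::min();
constexpr int kSignedCharMax   = std::numeric_limits<signed char>::max();
constexpr int kUnsignedCharMax = std::numeric_limits<unsigned char>::max();

// Whether plain char is signed is implementation-defined
// (it is signed by default with Visual C++).
constexpr bool kPlainCharIsSigned = std::numeric_limits<char>::is_signed;

// Returns the integer code stored in a char, for example the ASCII code of 'A'.
constexpr int code_of(char c) {
    return static_cast<unsigned char>(c);
}
```

Because the signedness of plain char varies between compilers, code that relies on a particular range is safer written with signed char or unsigned char explicitly.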

In VC++, if a char string contains Chinese characters, each Chinese character occupies two bytes, and the highest bit of each of these bytes is 1. The number of bytes occupied by a wide character depends on the compiler implementation, which chooses a width large enough for the characters it needs to store. VC++ implements wchar_t as two bytes. Two bytes obviously cannot represent every Unicode character; text is encoded and converted according to the current system locale. Two bytes can distinguish at most 65,536 characters, which is enough to represent the text of a single country.
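The arithmetic behind the paragraph above can be sketched as follows. The names are hypothetical; the only outside fact used is that Unicode code points run from U+0000 to U+10FFFF. The sketch assumes C++11.

```cpp
#include <cstddef>

// Bytes used for one wide character by the current compiler
// (2 with Visual C++; other compilers may choose a different width).
constexpr std::size_t kWideCharBytes = sizeof(wchar_t);

// Number of distinct values representable in the given number of bits.
constexpr unsigned long long distinct_values(unsigned bits) {
    return 1ULL << bits;
}

// Unicode code points range from U+0000 to U+10FFFF.
constexpr unsigned long kUnicodeCodePoints = 0x110000UL;

// True if a code point fits in a single two-byte unit.
constexpr bool fits_in_two_bytes(unsigned long code_point) {
    return code_point <= 0xFFFFUL;
}
```

Sixteen bits give 65,536 values, fewer than the number of Unicode code points, which is why a two-byte wchar_t cannot hold every Unicode character in a single unit.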

A single-byte character cannot hold a Chinese character. For example, initializing a char variable c with a character literal that contains one Chinese character produces a compiler warning, and only the low-order byte of the encoding is stored in c.

C++ provides the wchar_t type to represent Unicode characters. To support Unicode text processing, the library defines corresponding wide-character functions and declares them in header files; in the standard library these are <cwchar> and <cwctype>.
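A short sketch of these library functions follows. std::wcscmp comes from <cwchar> and std::towupper from <cwctype>; the helper names to_upper and same_text are hypothetical and only wrap those standard calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <cwctype>
#include <string>

// Hypothetical helper: upper-cases a wide string with std::towupper.
inline std::wstring to_upper(std::wstring text) {
    for (std::size_t i = 0; i < text.size(); ++i) {
        text[i] = static_cast<wchar_t>(
            std::towupper(static_cast<std::wint_t>(text[i])));
    }
    return text;
}

// std::wcscmp is the wide-character counterpart of std::strcmp.
inline bool same_text(const wchar_t* a, const wchar_t* b) {
    assert(a != nullptr && b != nullptr);
    return std::wcscmp(a, b) == 0;
}
```

Most narrow string functions have a wide counterpart with a similar name, such as std::wcslen for std::strlen, so code that handles char strings usually translates directly.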

In Visual C++, wchar_t and char are two different data types with different storage layouts and usage. See the following example.

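    #include <clocale>   // setlocale (needed if the call below is enabled)
    #include <cstdio>    // getchar
    #include <iostream>
    using namespace std;

    int main(int argc, char* argv[])
    {
        char* p;
        wchar_t s[] = L"ABC";
        char name[] = "张三";        // the Chinese name "Zhang San"
        wchar_t wname[] = L"张三";

        cout << sizeof(wchar_t) << " ";   // output 2
        cout << sizeof(s) << endl;        // output 8

        // Dump the bytes of the wide string s.
        p = (char*)s;
        for (size_t i = 0; i < sizeof(s); ++i)
            cout << (int)p[i] << " ";
        cout << endl;

        cout << s << " ";     // narrow stream: the text is not printed (VS2012 shows an address)
        wcout << s << endl;   // wide stream: prints ABC

        // Dump the bytes of the narrow string name.
        for (size_t i = 0; i < sizeof(name); ++i)
            cout << (int)name[i] << " ";
        cout << endl;

        // Dump the bytes of the wide string wname.
        p = (char*)wname;
        for (size_t i = 0; i < sizeof(wname); ++i)
            cout << (int)p[i] << " ";
        cout << endl;

        cout << name << endl;
        // setlocale(LC_ALL, "chs");   // enable this so that wname below is displayed
        wcout << wname << endl;

        getchar();
    }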

Program output:

[Figure: console output of the program above]

From the program above, we can draw the following conclusions:
(1) wchar_t and char are different data types with different widths. sizeof(char) is 1. The width of wchar_t depends on the compiler implementation; in Visual C++, wchar_t occupies two bytes, and text is encoded and converted according to the current system locale.

(2) A wchar_t string literal must start with the prefix L; otherwise, a compilation error occurs. A wchar_t character constant should also start with L, for example wchar_t wc = L'A';. If the L is omitted here, the compiler automatically converts the char value to wchar_t.

(3) For Western characters (such as 'A', 'B', and 'C'), the high byte of a wchar_t variable is 0x00 and the low byte stores the character's ASCII value.

(4) A char string ends with the single byte '\0'; a wchar_t string ends with the two-byte wide null character L'\0', that is, two zero bytes.

(5) In the Simplified Chinese edition of Windows 7, a Chinese character in a char string is GBK-encoded and occupies two bytes. For common characters in the GB2312 range, the highest bit of both bytes is 1, which is how they are distinguished from Western characters, so printing the two bytes as integers yields two negative numbers. In a wchar_t string, each Chinese character is stored as a two-byte UTF-16 code unit, so the stored code value of the same Chinese character is different. The UTF-16 byte layout is not compatible with ASCII, and cout does not interpret wide strings, so in the code above L"ABC" is not displayed correctly when written with cout. In addition, UTF-16 stores commonly used characters in two bytes and rarely used Chinese characters in four bytes, so a single two-byte wchar_t cannot hold a four-byte character; trying to store one in it loses data.

(6) In the program above, the statement cout << name << endl; prints the name "Zhang San" in Chinese characters, but the output of wcout << wname << endl; is not displayed. If wname contained only Western characters, its output would still be visible. This behavior is specific to console programs: it depends on the console's default locale, that is, on the encoding used for output. setlocale sets the locale used for this encoding conversion; see the commented-out call in the program.
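Conclusions (3) to (5) can be checked with a few small helpers. This is only a sketch: the function names are hypothetical, and it uses the C++11 type char16_t rather than wchar_t for the UTF-16 count so that the result does not depend on how wide wchar_t is on a given compiler.

```cpp
#include <cassert>
#include <cstddef>

// Conclusion (3): a Western character keeps its ASCII value in a wchar_t,
// so every bit above the low byte is zero.
inline bool high_bits_are_zero(wchar_t wc) {
    return (static_cast<unsigned long>(wc) >> 8) == 0;
}

// Conclusion (4): bytes occupied by a narrow and a wide literal,
// terminator included.
inline std::size_t narrow_literal_bytes() { return sizeof("ABC"); }   // 3 chars + '\0'
inline std::size_t wide_literal_bytes()   { return sizeof(L"ABC"); }  // 3 wchar_t + L'\0'

// Conclusion (5): counts the 16-bit code units of a UTF-16 string.
inline std::size_t utf16_units(const char16_t* s) {
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != u'\0') ++n;
    return n;
}
```

With these helpers, utf16_units(u"\u4E2D") counts one code unit for a commonly used Chinese character, while utf16_units(u"\U00020000") counts two, because a character outside the first 65,536 code points is stored as a pair of 16-bit units.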


Copyright Disclaimer: This article is an original article by the blogger and cannot be reproduced without the permission of the blogger.
