C++ Chinese and English strings (string, wstring)

Source: Internet
Author: User

In C++, the string class is an instantiation of the class template basic_string.

template <class _Elem, class _Traits = char_traits<_Elem>, class _Ax = allocator<_Elem> >
class basic_string {};

The first parameter, _Elem, is the character type. The second parameter, _Traits, defaults to char_traits<_Elem> and defines the character-related types and operations, such as comparison, equality, and assignment. The third parameter, _Ax, defaults to the allocator class and represents the memory model: how memory is obtained and how pointers into it behave can differ between memory models, such as stack, heap, or segmented memory.

The C++ standard defines two commonly used string types from this template: string and wstring.

typedef basic_string<char> string;
typedef basic_string<wchar_t> wstring;

The former, string, is the ordinary type and can be thought of as a char[]; it is exactly basic_string with _Elem = char. wstring uses the wchar_t type, a wide character type that covers non-ASCII needs, such as Unicode-encoded Chinese, Japanese, and Korean text. For most char-based facilities, C++ provides a wchar_t counterpart, because both are instantiated from the same templates in the way shown above. That is why wcout, wcin, wcerr, and similar wide stream objects are also available.

string can also hold Chinese text, but it stores each Chinese character in two chars (in a double-byte encoding such as GBK; UTF-8 typically needs three). If you treat a Chinese character as one wchar_t unit, it occupies only one unit in a wstring, and the same holds for most other non-English characters and encodings. Only then do string operations, especially in internationalized software, behave as expected.

The following program illustrates the difference between the two.

#include <iostream>
#include <locale>
#include <string>
using namespace std;

#define tab "\t"

int main()
{
    locale def;
    cout << def.name() << endl;
    locale current = cout.getloc();
    cout << current.name() << endl;

    float val = 1234.56f;
    cout << val << endl;

    // Change to Simplified Chinese; "chs" is a Windows (MSVC) locale name
    cout.imbue(locale("chs"));
    current = cout.getloc();
    cout << current.name() << endl;
    cout << val << endl;

    // The code above demonstrates locale usage. The main part of the example follows; it also relies on imbue.
    cout << "*********************************" << endl;

    // For localized output (text, time, currency, etc.), wcout must be imbued with a locale
    // so that it can encode wide characters; "chs" selects Simplified Chinese.
    wcout.imbue(std::locale("chs"));

    // string, English: reverses correctly, and the second character is displayed correctly
    string str1("Abcabc");
    string str11(str1.rbegin(), str1.rend());
    cout << "UK\ts1\t:" << str1 << tab << str1[1] << tab << str11 << endl;

    // wstring, English: reverses correctly, and the second character is displayed correctly
    wstring str2 = L"Abcabc";
    wstring str22(str2.rbegin(), str2.rend());
    wcout << "UK\tws2\t:" << str2 << tab << str2[1] << tab << str22 << endl;

    // string, Chinese: reversing produces garbled text, and the second character is read incorrectly
    string str3("你好吗?");  // "How are you?"
    string str33(str3.rbegin(), str3.rend());
    cout << "CHN\ts3\t:" << str3 << tab << str3[1] << tab << str33 << endl;

    // Correct way to print the second character: in a double-byte encoding it occupies bytes 2 and 3
    cout << "CHN\ts3\t:right\t" << str3[2] << str3[3] << endl;

    // wstring, Chinese: reverses correctly, and the second character is displayed correctly
    wstring str4 = L"你好吗?";
    wstring str44(str4.rbegin(), str4.rend());
    wcout << "CHN\tws4\t:" << str4 << tab << str4[1] << tab << str44 << endl;

    wstring str5(str1.begin(), str1.end());  // Only a char string of English (ASCII) text can be converted this way
    wstring str55(str5.rbegin(), str5.rend());
    wcout << "CHN\tws5\t:" << str5 << tab << str5[1] << tab << str55 << endl;

    wstring str6(str3.begin(), str3.end());  // Wrong for Chinese: each byte is widened separately
    wstring str66(str6.rbegin(), str6.rend());
    wcout << "CHN\tws6\t:" << str6 << tab << str6[1] << tab << str66 << endl;

    return 0;
}

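The result is as follows:

The first part of the output shows the effect of localization: once the stream is imbued with the Chinese locale, a comma is inserted after every three digits of the number. Localization also affects how time and text are formatted.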

The rest of the output shows correct and incorrect uses of string and wstring. The third line goes wrong because a string is used to hold Chinese characters, so indexing and reversing operate on individual bytes. The last line is also wrong, and it affects the output stream as well: the trailing tab and newline are not printed. (The last two cases are not really about Chinese versus English; they only show that constructing a wstring from the bytes of a Chinese string is wrong.)

Chapter 12, "Language Support", of "Mastering Standard C++" focuses on internationalization and localization in C++. C++ provides standard facilities for i18n, which software developers can refer to.

The C++ standard library is extensive and full-featured, and there is always more of it to learn.
