[C++] An in-depth investigation into why cout and wcout cannot output Chinese characters correctly (1): Tests with various compilers


Author: zyl910

The C++ standard defines a complete internationalized text processing mechanism for the C++ standard I/O library. In practice, however, compilers differ widely in how well they support it, and characters often cannot be output correctly. I therefore carried out an in-depth investigation.

1. Description

1.1 Test program

The following is a simple program that uses cout, wcout, and printf to output strings. The specific code is --

# Include <stdio. h> # include <locale. h> # include <wchar. h >#include <string >#include <iostream> using namespace STD; const char * PSA = "a Chinese character ABC"; const wchar_t * psw = l "W Chinese Character ABC "; int main (INT argc, char * argv []) {// init. // IOs: sync_with_stdio (false); // Linux GCC. locale: Global (locale (""); // setlocale (lc_ctype, ""); // mingw GCC. wcout. imbue (locale (""); // C ++ cout <PSA; cout. clear (); cout <Endl; wcout <psw; wcout. clear (); wcout <Endl; // C printf ("\ NC: \ n"); printf ("\ t % s \ n", PSA ); printf ("\ t % ls \ n", psw); Return 0 ;}

 

Can you guess what this program prints?

1.2 Theoretical results

First, let us work out what this program should print according to the C++ standard.

The main function begins with two lines of code that initialize the locale:

    locale::global(locale(""));
    wcout.imbue(locale(""));

 

Details:
1. locale(""): calls the constructor to create a locale. The empty string has a special meaning: use the default locale of the user's environment ("The C++ Standard Library: A Tutorial and Reference", Chinese edition, p. 697). For example, on a Simplified Chinese system it returns the Simplified Chinese locale.
2. locale::global(locale("")): sets the global locale of the C++ standard I/O library to the default locale of the user's environment. Note that it also sets the locale of the C standard library, with an effect similar to "setlocale(LC_ALL, "")" (same book, p. 698).
3. wcout.imbue(locale("")): makes wcout use the default locale of the user's environment.
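Point 2 in the list above can be checked with a small probe. The following is a minimal sketch, not part of the original test program; the helper name c_locale_name_after_global is hypothetical. It installs a locale as the global C++ locale, asks the C library which locale it is now using, and then restores the previous global locale.

```cpp
#include <clocale>
#include <locale>
#include <string>

// Hypothetical helper (an illustration, not code from the original article):
// install `loc` as the global C++ locale, read back the C library's locale
// name, then restore the previous global locale.
std::string c_locale_name_after_global(const std::locale& loc) {
    const std::locale previous = std::locale::global(loc);
    // Passing a null pointer to setlocale only queries the current C locale.
    const char* queried = std::setlocale(LC_ALL, nullptr);
    const std::string c_name = queried ? queried : "";
    std::locale::global(previous);
    return c_name;
}
```

Calling it with std::locale("") on an implementation that follows the standard should return the name of the environment's locale. The MinGW results later in this article suggest that not every toolchain behaves this way, so treat the return value as a diagnostic rather than a guarantee.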

In this way, both the C standard library and the C++ standard I/O library (wcout in particular) have their locale set correctly, matching the default of the user's environment.

Next, cout and wcout from the C++ standard I/O library output the narrow string and the wide string respectively:

    // C++
    cout << psa;
    cout.clear();
    cout << endl;
    wcout << psw;
    wcout.clear();
    wcout << endl;

 

Details:
1. The clear member function of cout and wcout is called to reset any error state, so that subsequent output proceeds normally.
2. "cout << endl" and "wcout << endl" not only output a newline but also call the flush member function to commit the buffered data, so that the text written by cout and by wcout does not get mixed up.
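The role of clear() described in point 1 can be shown in isolation. This is a minimal sketch with a hypothetical function name; it uses a std::ostringstream and sets failbit by hand to simulate a failed write, whereas in the test program the failure would come from the console stream itself.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical demonstration: once a stream is in an error state, later
// insertions are discarded until clear() resets the state.
std::string write_after_simulated_failure(bool call_clear) {
    std::ostringstream out;
    out << "first ";
    out.setstate(std::ios_base::failbit);  // simulate a failed write
    out << "lost ";                        // discarded: stream is in an error state
    if (call_clear) {
        out.clear();                       // reset the error state
    }
    out << "last";                         // written only if clear() was called
    return out.str();
}
```

With call_clear set to true the function returns "first last"; with false it returns only "first ". This is why the test program calls clear() before writing endl.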

Finally, the printf function of the C standard library outputs the narrow string and the wide string:

    // C
    printf("\nC:\n");
    printf("\t%s\n", psa);
    printf("\t%ls\n", psw);

 

Therefore, the running result of the test program should be --

A Chinese Character abcw Chinese Character ABCC: a Chinese character abc w Chinese Character ABC

 

Note: to make the output of the C++ standard I/O library easier to tell apart from that of the C standard library, a tab character is added to the printf output.

Ii. Test vc2005

Vc2005 is the first compiler in the VC series that has good support for the c ++ 03 standard. We will test it first.

2.1 debug

Compile the test program in debug mode in vc2005. The execution result is --

AWC: a Chinese character abc w Chinese Character ABC

 

As you can see, neither cout nor wcout can output the Chinese characters correctly.
C's printf, by contrast, correctly outputs both the narrow string and the wide string containing Chinese characters.

2.2 release

Change the compilation configuration to the "release" mode and then compile and run the program. This is a magic thing. The execution result is --

A Chinese Character abcw Chinese Character ABCC: a Chinese character abc w Chinese Character ABC

 

Everything passes in the Release build: cout, wcout, and printf all produce correct output.

Iii. Test vc2008 and later versions of VC

Compile the test program in vc2008 and the execution result is --

A Chinese Character abcw Chinese Character ABCC: a Chinese character abc w Chinese Character ABC

 

Everything passes: cout, wcout, and printf all produce correct output. The Release build was tested next and also passed. It seems that the VC2005 bug has been fixed.
VC2010 and VC2012 were then tested, and both passed as well.

4. Test mingw4.1 in Windows

Use GCC 4.6.2 (mingw (20120426) to compile the test program. The execution result is --

A Chinese Character abcwc: a Chinese character ABC W

 

The narrow string is output correctly, but the wide string is not.

4.2 Modify the code so that MinGW displays correctly

Add one line of initialization code, the setlocale call:

    // init.
    locale::global(locale(""));
    setlocale(LC_CTYPE, "");    // MinGW gcc.
    wcout.imbue(locale(""));

 

Use mingw to compile and run the command. The execution result is --

A Chinese Character abcw Chinese Character ABCC: a Chinese character abc w Chinese Character ABC

 

Everything passes: cout, wcout, and printf all produce correct output. It seems that "locale::global(locale(""))" in MinGW does not perform the equivalent of "setlocale(LC_ALL, "")", so setlocale must be called manually.
Compiling the modified code with VC2008 also passes. One extra call to setlocale does no harm.

5. Test gcc5.1 in Linux

Use GCC in linxu to compile the test program. The execution result is --

A Chinese Character abcwiwabcc: a Chinese character abc w Chinese Character ABC

 

Both cout and printf produce correct output, but wcout does not.

5.2 Modify the code so that it displays correctly on Linux

Add one line of initialization code, the sync_with_stdio call:

    // init.
    ios::sync_with_stdio(false);    // Linux gcc.
    locale::global(locale(""));
    wcout.imbue(locale(""));

 

 

Use GCC to compile and run. The execution result is --

A Chinese Character abcw Chinese Character ABCC: a Chinese character abc w Chinese Character ABC

 

All passed, cout, wcout, and printf can be output normally.

5.3 modify the code 2nd times so that mingw can be properly displayed

Switch back to Windows and use mingw to compile the modified Code. The execution result is --

A Chinese Character ABCC: a Chinese character ABC W

 

The wide string is not displayed correctly.
Based on the earlier experience, add the setlocale call to the initialization code:

    // init.
    ios::sync_with_stdio(false);    // Linux gcc.
    locale::global(locale(""));
    setlocale(LC_CTYPE, "");    // MinGW gcc.
    wcout.imbue(locale(""));

 

Use mingw to compile and run the command. The execution result is --

A Chinese Character abcw Chinese Character ABCC: a Chinese character abc w Chinese Character ABC

 

At last, everything passes.

5.4 Test the second modification on Linux

On Linux, the twice-modified code passes.

Compiled with VC2008, the twice-modified code passes as well.

It seems that an initialization sequence that works under VC, MinGW, and Linux has finally been found. Unfortunately, once synchronization is turned off with "ios::sync_with_stdio(false)", output through the C++ streams and output through C stdio must be synchronized manually. This may cause some old code to misbehave, so the method is not very practical.
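The manual synchronization mentioned above might look like the following. This is a minimal sketch under the assumption that "synchronizing manually" means flushing each library's buffer before the other library writes; both helper names are hypothetical and do not come from the original code.

```cpp
#include <cstdio>
#include <iostream>

// Hypothetical helper: turn off synchronization with C stdio.
// This should be called before the program performs any I/O.
void use_unsynchronized_streams() {
    std::ios::sync_with_stdio(false);
}

// Hypothetical helper: write `text` with printf after flushing the C++
// streams, then flush stdout so that later C++ output appears after it.
// Returns the value returned by printf (number of characters written).
int c_print_after_cpp(const char* text) {
    std::cout.flush();
    std::wcout.flush();
    const int written = std::printf("%s", text);
    std::fflush(stdout);
    return written;
}
```

Every place where old code mixes printf with cout or wcout would need this kind of explicit flushing, which illustrates why the approach is hard to retrofit.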

6. Test GCC on Mac OS X

Compile the test program with GCC on Mac OS X and run it.

Even such a simple program reports an error at run time. Why?
Debug the program with GDB, using "r" to run it, "where" to display the call stack, and "list" to display the source code.

The error turns out to be raised when "locale("")" is executed.
But isn't "locale("")" specified by the C++ standard? How can it fail?
Searching the Internet, I found that someone had checked the GCC source code on the Mac, which states explicitly: "Currently, the generic model only supports the "C" locale." See:
http://stackoverflow.com/questions/1745045/stdlocale-breakage-on-macos-10-6-with-lang-en-us-utf-8
std::locale breakage on MacOS 10.6 with LANG=en_US.UTF-8
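Since the locale constructor reports an unsupported name by throwing std::runtime_error, a program can guard against the failure described above. The following is a minimal sketch with a hypothetical helper name, assuming that falling back to the classic "C" locale is acceptable; it avoids the crash but does not make Chinese output work on such a platform.

```cpp
#include <iostream>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: try to install the user's default locale and fall
// back to the classic "C" locale if the implementation rejects locale("").
// Returns the name of the locale that was installed.
std::string install_user_locale() {
    std::locale loc = std::locale::classic();
    try {
        loc = std::locale("");  // may throw std::runtime_error
    } catch (const std::runtime_error&) {
        // Keep the classic locale; the platform does not support locale("").
    }
    std::locale::global(loc);
    std::wcout.imbue(loc);
    return loc.name();
}
```

A caller can compare the returned name with "C" to find out whether the fallback was taken, keeping in mind that "C" is also the result when the environment itself selects the "C" locale.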

7. Summary

Although the design in the C++ standard is sound, compilers unfortunately differ widely in how completely they implement it. Some platforms do not even support "locale("")".
To ensure cross-platform portability, use the C++ standard I/O library with caution. Where possible, it is best to rely on the C standard library, which has much better compatibility.
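One way to follow this recommendation is to convert wide strings with the C library and print the result with printf. The sketch below is an illustration rather than the author's method; the helper name to_multibyte is hypothetical. It relies on wcstombs, which converts according to the current C locale, so setlocale must have been called first for non-ASCII text to convert.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: convert a wide string to a multibyte string using the
// current C locale. Returns false if some character cannot be converted.
bool to_multibyte(const wchar_t* wide, std::string& out) {
    // Each wide character needs at most MB_CUR_MAX bytes; add one for '\0'.
    const std::size_t max_bytes = std::wcslen(wide) * MB_CUR_MAX + 1;
    std::vector<char> buffer(max_bytes);
    const std::size_t written = std::wcstombs(buffer.data(), wide, buffer.size());
    if (written == static_cast<std::size_t>(-1)) {
        return false;  // conversion failed in the current locale
    }
    out.assign(buffer.data(), written);
    return true;
}
```

The converted string can then be passed to printf with "%s". Checking the return value makes a conversion failure visible instead of silently truncating the output.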

 

References:
ISO/IEC 9899:1999 (C99). ISO/IEC, 1999. www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf
C++ International Standard, ISO/IEC 14882, Second Edition, 2003 (C++03). ISO/IEC, 2003-10-15.
"The C++ Standard Library: A Tutorial and Reference" (Chinese edition). By Nicolai M. Josuttis, translated by Hou Jie and Meng Yan. Huazhong University of Science and Technology Press, 2002-09.
std::locale breakage on MacOS 10.6 with LANG=en_US.UTF-8. http://stackoverflow.com/questions/1745045/stdlocale-breakage-on-macos-10-6-with-lang-en-us-utf-8
[C] Cross-platform use of TCHAR: making Linux and other platforms support tchar.h as well, solving the cross-platform format specifier problem and displaying multiple languages at the same time. http://www.cnblogs.com/zyl910/archive/2013/01/17/tcharall.html

 

Download source code:
http://files.cnblogs.com/zyl910/wchar_crtbug.rar
