C++11 added source-level support for UTF-8 and UCS-4 (Windows uses Unicode internally because the NT kernel uses UCS-2, which dates from 1989; UTF-8 was not invented until 1992)

Source: Internet
Author: User

In C++ programming we mostly deal with two tools: the editor and the compiler. With the editor, the usual problem is garbled text, for example Chinese comments that display incorrectly or cannot be saved; the fix is to save the file as Unicode (UTF-8). With the compiler, the available encodings depend on which C++ standard it supports. Before C++11, a string literal could only be one of two kinds: a multibyte (MBCS) string, such as const char* p = "abc haha", or a wide string, such as const wchar_t* p = L"abc haha" (UCS-2, and UTF-16 in current versions, on Windows). The prefix tells the compiler which type of string you mean. C++11 added UTF-8, UTF-16, and UCS-4 literals: const char* p = u8"abc haha" is UTF-8, const char16_t* p = u"abc haha" is UTF-16 (on Windows, the same encoding as L"xxxx"), and const char32_t* p = U"abc haha" is UCS-4.

Note the distinction between compile time and run time. Before C++11 we could not tell the compiler that a string literal was in UTF-8, but a running program could still use any encoding (MBCS/UTF-8/UCS-2/UCS-4), because in memory they are all just sequences of bytes. C++11 also added support for converting between UTF-8, UCS-2/UTF-16, and UCS-4:
std::codecvt_utf8 converts between UTF-8 and UCS-2 or UCS-4
std::codecvt_utf16 converts between UTF-16 byte sequences and UCS-2 or UCS-4
std::codecvt_utf8_utf16 converts between UTF-8 and UTF-16
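
The literal prefixes and conversion facets described above can be sketched roughly as follows. This is only an illustration, not the original author's code: the helper names (utf8_to_utf16, utf16_to_utf8, utf8_to_ucs4, check_round_trip) and the sample text ("abc" followed by U+54C8 twice) are made up for this example. It pairs the facets with std::wstring_convert; both are part of C++11 but have been deprecated since C++17, so newer compilers may emit warnings.

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <string>

// The sample text is written with universal character names (\u54c8)
// so the result does not depend on how this source file is saved.

// Element counts of each literal form, including the terminating null:
//   u8"" -> UTF-8 code units (1 byte each; U+54C8 needs 3 of them)
//   u""  -> UTF-16 code units (char16_t)
//   U""  -> UCS-4 code units (char32_t)
static_assert(sizeof(u8"abc\u54c8\u54c8") == 10, "3 + 3 + 3 bytes + null");
static_assert(sizeof(u"abc\u54c8\u54c8") / sizeof(char16_t) == 6, "5 units + null");
static_assert(sizeof(U"abc\u54c8\u54c8") / sizeof(char32_t) == 6, "5 units + null");

// UTF-8 -> UTF-16 using std::codecvt_utf8_utf16.
std::u16string utf8_to_utf16(const std::string& utf8) {
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    return conv.from_bytes(utf8);
}

// UTF-16 -> UTF-8 using the same facet in the other direction.
std::string utf16_to_utf8(const std::u16string& utf16) {
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    return conv.to_bytes(utf16);
}

// UTF-8 -> UCS-4 using std::codecvt_utf8 with a 32-bit element type.
std::u32string utf8_to_ucs4(const std::string& utf8) {
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> conv;
    return conv.from_bytes(utf8);
}

// Round-trip check on the sample text.
void check_round_trip() {
    const std::string utf8 = "abc\xe5\x93\x88\xe5\x93\x88";  // raw UTF-8 bytes
    const std::u16string utf16 = utf8_to_utf16(utf8);
    assert(utf16 == u"abc\u54c8\u54c8");
    assert(utf16_to_utf8(utf16) == utf8);
    assert(utf8_to_ucs4(utf8) == U"abc\u54c8\u54c8");
}
```

Calling check_round_trip() from main exercises all three conversions. The size checks use sizeof on the literals rather than assigning them to const char*, because since C++20 a u8 literal has element type char8_t instead of char.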
In cross-platform C++ development, the recurring question is which encoding to use by default. The UCS-2 approach on Windows is the odd one out compared with other platforms, and there are generally two ways to handle this. The first is to use UTF-8 everywhere. This is somewhat inconvenient on Windows, because the Windows API takes UCS-2, so every string has to be converted from UTF-8 to UCS-2 before it is passed to the Windows API. The second is to define macros: on Windows the string-related macros are all defined as UCS-2, and on other platforms they are all defined as UTF-8. This approach demands a clear head when writing code, because the same code uses a different encoding on each platform.

I was always curious why Windows does not simply use UTF-8 like the other platforms. The reason is that the NT kernel uses UCS-2, which dates from 1989, while UTF-8 was only invented in 1992.

http://www.cppblog.com/weiym/archive/2015/07/25/211370.html
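
The two cross-platform approaches described above might look something like the sketch below. The names native_string, to_native, and NATIVE_TEXT are hypothetical and chosen only for this illustration. The Windows branch relies on the Win32 function MultiByteToWideChar with CP_UTF8; the other branch assumes the compiler stores narrow string literals as UTF-8.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Approach 1: keep UTF-8 in std::string everywhere and convert only
// at the boundary where a string is handed to the platform API.
#ifdef _WIN32
typedef std::wstring native_string;

native_string to_native(const std::string& utf8) {
    if (utf8.empty()) return native_string();
    const int bytes = static_cast<int>(utf8.size());
    // First call: ask how many wide characters the result needs.
    const int len = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), bytes, nullptr, 0);
    if (len <= 0) return native_string();  // conversion failed
    native_string wide(static_cast<std::size_t>(len), L'\0');
    // Second call: perform the conversion into the sized buffer.
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(), bytes, &wide[0], len);
    return wide;
}
#else
typedef std::string native_string;

// Other platforms already take UTF-8, so nothing needs converting.
native_string to_native(const std::string& utf8) { return utf8; }
#endif

// Approach 2: a macro selects the literal encoding per platform, so the
// same source line produces a wide literal on Windows and a narrow one
// elsewhere (assumed here to be UTF-8).
#ifdef _WIN32
#define NATIVE_TEXT(s) L##s
#else
#define NATIVE_TEXT(s) s
#endif
```

With these definitions, to_native("abc") and NATIVE_TEXT("abc") yield the same native text on either platform. The trade-off is the one noted above: the first approach pays for a conversion at every Windows API call, while the second means the same line of code holds a different encoding on each platform.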
