Multi-byte characters and wide characters

Source: Internet
Author: User

I often run into character-encoding problems when developing multilingual software. After reading a lot of material, I finally found a good article that is worth sharing. (For lack of time, only the key parts are translated.)


char and wchar_t

On the Japanese version of Windows, the default character encoding is Shift-JIS. It uses 1 byte for ASCII letters and digits and 2 bytes for Japanese characters such as kana and kanji. Characters in this kind of encoding are called multi-byte characters, and they are stored in char arrays. (On the Chinese version of Windows, the corresponding encoding is GB2312.)




[Figure: characters in a char array]

The mainstream international standard for character encoding is Unicode. On Windows, alphanumerics, Japanese characters, Chinese characters, and characters of other languages are, in principle, each represented in 2 bytes (UTF-16; characters outside the Basic Multilingual Plane take two 2-byte units). Characters in this representation are called wide characters, and they are stored in wchar_t arrays.






[Figure: characters in a wchar_t array]



Character declaration, definition, and size

To show the difference, a char string and a wchar_t string are defined below. The memory size of each is given in the comments.

    #include <locale.h>
    #include <stdio.h>
    #include <wchar.h>

    int main(void)
    {
        // char (multi-byte string): declaration and display
        char strM[] = "test character";
        printf("%3d: %s\n", (int)sizeof(strM), strM);      // 15: test character

        // wchar_t (wide string): declaration and display
        setlocale(LC_ALL, "Japanese");
        wchar_t strW[] = L"test character";
        wprintf(L"%3d: %ls\n", (int)sizeof(strW), strW);   // 30: test character (2-byte wchar_t on Windows)
        return 0;
    }



TCHAR: automatic selection of the character type

In Visual Studio, strings can also be declared with TCHAR. TCHAR is a typedef: when the project is compiled with the multi-byte character set it becomes char, and when it is compiled with the Unicode character set it becomes wchar_t. This makes it a convenient type.

    #include <stdio.h>
    #include <tchar.h>

    // TCHAR (automatically selected character type): declaration and display
    TCHAR strT[] = _T("test character");
    _tprintf(_T("%3d: %s\n"), (int)sizeof(strT), strT);

Specifying the character set in Visual Studio

In Visual Studio you are free to choose whether a project uses Unicode or a multi-byte encoding such as Shift-JIS.

Right-click the project in Visual Studio's Solution Explorer and select Properties from the pop-up menu. The following dialog is displayed.






[Figure: the Character Set property in the project properties dialog, marked with a red box]

Not Set: Neither the _UNICODE nor the _MBCS macro is defined. Suitable only for single-byte (English ANSI) text.
Use Unicode Character Set: The _UNICODE macro is defined. TCHAR becomes wchar_t, and the generic-text functions (such as _tprintf) map to the corresponding wide-character functions.
Use Multi-Byte Character Set: The _MBCS macro is defined. TCHAR becomes char, and the generic-text functions map to the corresponding multi-byte functions.


Converting between multi-byte and wide characters
mbstowcs: converts a multi-byte string to a wide-character string

    // mbstowcs: multi-byte string -> wide-character string
    char strM[] = "test character";
    wchar_t strWfM[32];
    setlocale(LC_ALL, "Japanese");
    // The third argument is the capacity of the destination in wide characters.
    mbstowcs(strWfM, strM, sizeof(strWfM) / sizeof(strWfM[0]));
    wprintf(L"%ls (number of characters = %d)\n", strWfM, (int)wcslen(strWfM));

wcstombs: converts a wide-character string to a multi-byte string

    // wcstombs: wide-character string -> multi-byte string
    wchar_t strW[] = L"test character";
    char strMfW[32];
    setlocale(LC_ALL, "Japanese");
    wcstombs(strMfW, strW, sizeof(strMfW));
    // strlen counts bytes, not characters, in a multi-byte string.
    printf("%s (number of bytes = %d)\n", strMfW, (int)strlen(strMfW));


P.S. Reference: http://mkubara.com/index.php/%E3%83%9E%E3%83%AB%E3%83%81%E3%83%90%E3%82%A4%E3%83%88%E6%96%87%E5%AD%97%E5%88%97%E3%81%A8%E3%83%AF%E3%82%A4%E3%83%89%E6%96%87%E5%AD%97%E5%88%97
