wchar_t wide characters and COM BSTR / VARIANT

Last Update:2018-12-07 Source: Internet

Author: User


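Like char, wchar_t is a built-in type in C++: it is not introduced by a typedef. (In C, and in MSVC builds that use /Zc:wchar_t-, it is a typedef.)

In short, on Windows it occupies two bytes, the same as a short. On most Linux and macOS toolchains it is four bytes.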
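The size claim is easy to check on any compiler. The following is a minimal sketch; the helper names wide_char_size and utf16_unit_size are made up for this illustration, and the static_assert assumes a standard-conforming build (MSVC with /Zc:wchar_t- would reject it).

```cpp
#include <cstddef>
#include <type_traits>

// In standard C++, wchar_t is a distinct built-in type, not an alias of an integer type.
static_assert(!std::is_same<wchar_t, unsigned short>::value,
              "wchar_t is expected to be its own type");

// Size of one wide character on the current platform:
// 2 with MSVC on Windows, commonly 4 with GCC/Clang on Linux and macOS.
inline std::size_t wide_char_size() { return sizeof(wchar_t); }

// char16_t is a portable way to hold one 16-bit UTF-16 code unit.
inline std::size_t utf16_unit_size() { return sizeof(char16_t); }
```

Code that must store 2-byte units on every platform can use char16_t instead of relying on the size of wchar_t.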

For example, take the string "我们\n".

ANSI (GBK) hexadecimal: CE D2 C3 C7 0A 00

Six bytes, including the terminating \0.

Unicode (UTF-16LE) hexadecimal: 11 62 EC 4E 0A 00 00 00

Eight bytes: every character takes 2 bytes, letters and digits included, so the line feed \n becomes 0A 00 and the terminator becomes 00 00.
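The Unicode byte sequence above can be reproduced with a few lines of code. This is a sketch only: utf16le_hex is a hypothetical helper, and it writes the low byte of each 16-bit unit first so that the result does not depend on the host byte order.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Serialize UTF-16 code units, including the terminating 0,
// as space-separated little-endian hex bytes.
std::string utf16le_hex(const char16_t* s) {
    std::string out;
    char buf[8];
    for (;;) {
        const std::uint16_t unit = static_cast<std::uint16_t>(*s);
        std::snprintf(buf, sizeof buf, "%02x %02x ",
                      static_cast<unsigned>(unit & 0xFF),   // low byte first
                      static_cast<unsigned>(unit >> 8));    // then high byte
        out += buf;
        if (unit == 0) break;  // the terminator has been written, stop
        ++s;
    }
    out.pop_back();  // drop the trailing space
    return out;
}
```

Calling utf16le_hex(u"\u6211\u4eec\n") yields the eight bytes listed above; the literal uses universal character names so the source file encoding does not matter.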

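In ordinary programs, prefixing a string literal with L, as in L"", marks it as a wide (Unicode) string.

On Windows, the macro _T("") does the same thing when the project is built with _UNICODE defined; otherwise it leaves the literal as a narrow string.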
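The idea behind such a macro can be sketched in a few lines. MY_UNICODE_BUILD, MY_T and my_tchar are hypothetical stand-ins chosen for this illustration; they are not the definitions from the Windows headers, which key off _UNICODE.

```cpp
#include <cstddef>

// Hypothetical switch: define MY_UNICODE_BUILD to get wide literals.
#ifdef MY_UNICODE_BUILD
typedef wchar_t my_tchar;
#define MY_T(x) L##x   // pastes the L prefix onto the literal
#else
typedef char my_tchar;
#define MY_T(x) x      // leaves the literal unchanged
#endif

// Size of one character for the selected build flavor.
inline std::size_t my_tchar_size() { return sizeof(my_tchar); }
```

The same source line, for example MY_T("ab"), then compiles to either a narrow or a wide literal depending on a single build setting.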

1. The first simple question: how do you print Unicode text?

Each character is 2 bytes, so it can be printed as a number, but an ordinary printf cannot print it as a character.

Use wprintf instead, that is, printf with a w for "wide" in front. The other printf-family functions have wide counterparts too, such as swprintf. (The Windows API function wsprintf, despite its name, is a separate formatting routine from user32.)
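As a small, portable illustration of the wide printf family, the sketch below formats into a wide buffer with swprintf. The helper name format_wide and the buffer size are assumptions made for this example; %ls is used because it means "wide string" in both MSVC and standard C.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Format "<label> = <value>" into a wide string.
// Returns an empty string if formatting fails or the buffer is too small.
std::wstring format_wide(const wchar_t* label, int value) {
    wchar_t buf[64];
    const int n = std::swprintf(buf, sizeof buf / sizeof buf[0],
                                L"%ls = %d", label, value);
    if (n < 0) return std::wstring();
    return std::wstring(buf, static_cast<std::size_t>(n));
}
```

The second argument of swprintf is a count of wide characters, not bytes, which is why the buffer size is divided by the element size.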

    const char* lpszText = "我们\r\n";
    // ANSI:    CE D2 C3 C7
    // Unicode: 11 62 EC 4E
    // Carriage return \r = 0D, line feed \n = 0A
    // Note: printing a pointer with %08x assumes a 32-bit build.
    printf("char* text: %s 0x%08x 0x%08x\nANSI encoding: ", lpszText, lpszText, *lpszText);
    // print_hex_to_file is the author's own helper, listed at the end of this article.
    print_hex_to_file(stdout, (const uint8_t*)lpszText, (int)strlen(lpszText) + 1, 16);

    BSTR bstrText = _com_util::ConvertStringToBSTR(lpszText);
    wprintf(L"BSTR text: %s 0x%08x 0x%08x\nUnicode encoding: ", bstrText, bstrText, *bstrText);
    print_hex_to_file(stdout, (const uint8_t*)bstrText, (int)wcslen(bstrText) * 2 + 2, 16);

The result is:

    char* text: 我们
     0x013fbd80 0xffffffce
    ANSI encoding: 0xce d2 c3 c7 0d 0a 00
    BSTR text: 我们
     0x007be5b4 0x00006211
    Unicode encoding: 0x11 62 ec 4e 0d 00 0a 00 00 00
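The 0xffffffce in the output is worth a note: the first byte is CE, but a plain char is signed with MSVC's default settings, so it is sign-extended when it is promoted to int for printf. A minimal sketch of that promotion follows; the function names are hypothetical.

```cpp
#include <cstdint>

// What "%08x" effectively sees when the byte travels as a signed char:
// integer promotion sign-extends it to a negative int.
inline std::uint32_t promoted_signed(unsigned char byte) {
    const int promoted = static_cast<signed char>(byte);
    return static_cast<std::uint32_t>(promoted);
}

// Casting to unsigned char first keeps only the byte value.
inline std::uint32_t promoted_unsigned(unsigned char byte) {
    const int promoted = byte;
    return static_cast<std::uint32_t>(promoted);
}
```

Passing (unsigned char)*lpszText to printf would therefore print 0x000000ce instead.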

By the way, at first wprintf would not print the Chinese characters. It worked once I added the following two lines:

    #include <locale.h>
    setlocale(LC_CTYPE, "chs");

Whether the source file is saved as ANSI or Unicode has no effect on the result.

By the way, how do you print a single wchar_t? Everything above used pointers, that is, strings. A single character works as well:

    setlocale(LC_CTYPE, "chs");
    WCHAR wstr1;
    wchar_t wstr2;
    wstr1 = L'我';
    wstr2 = L'们';
    wprintf(L"Each character of our wide character set (%c, %c) is %d bytes\n",
            wstr1, wstr2, (int)sizeof(wstr1));

Always remember the L prefix when assigning, and the result is as expected:

    Each character of our wide character set (我, 们) is 2 bytes

If you assign '我' to a char, only a single byte of its two-byte encoding CE D2 survives, so the output is garbled:

    char ss;
    ss = '我';
    printf("ss = %c\n", ss);

The result is: ss = ?

There is also no line feed after the ?, perhaps because the \n was consumed together with the byte printed by %c. Printing half of a two-byte character is bound to look odd.
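The underlying reason is simply that one char holds one byte, while the ANSI encoding of this character needs two. A small sketch, using the bytes CE D2 quoted above and two hypothetical helpers:

```cpp
#include <cstddef>
#include <cstring>

static_assert(sizeof(char) == 1, "a char is exactly one byte");

// Number of bytes in a narrow string, not counting the terminator.
inline std::size_t byte_length(const char* s) { return std::strlen(s); }

// Value of the first byte, read as unsigned to avoid sign extension.
inline unsigned first_byte(const char* s) {
    return static_cast<unsigned char>(s[0]);
}
```

byte_length("\xCE\xD2") reports 2, so the character only fits in a char array (or in a single wchar_t), never in a single char.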

2. The second simple question: how do you convert to and from char strings?

    // Requires <stdio.h> and <comutil.h>; link with comsuppw.lib.
    int ConvertStringToBSTRDemo()
    {
        const char* lpszText = "Test";
        printf("char* text: %s\n", lpszText);

        BSTR bstrText = _com_util::ConvertStringToBSTR(lpszText);
        wprintf(L"BSTR text: %s\n", bstrText);

        ::SysFreeString(bstrText);   // the BSTR was allocated for us; free it
        return 0;
    }

    int ConvertBSTRToStringDemo()
    {
        BSTR bstrText = ::SysAllocString(L"Test");
        wprintf(L"BSTR text: %s\n", bstrText);

        char* lpszText2 = _com_util::ConvertBSTRToString(bstrText);
        printf("char* text: %s\n", lpszText2);

        ::SysFreeString(bstrText);
        delete[] lpszText2;          // ConvertBSTRToString allocates with new[]
        return 0;
    }

At first it seemed that leaving out the global function SysFreeString() caused no memory leak (checked with VLD, Visual Leak Detector).

Then I understood why: VLD probably does not hook the memory allocation used inside COM. So I uncommented the SysFreeString calls.

In an experiment, 10,000 iterations leaked memory on the order of megabytes, yet VLD did not detect it. So be careful!

In COM programming, a BSTR is in fact a wchar_t* (a typedef of OLECHAR*) that points into a block of allocated memory. You must release that memory yourself with SysFreeString!
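To make the "pointer into allocated memory" point concrete, here is a portable toy model of the documented BSTR layout: a 4-byte length prefix (in bytes), the 16-bit characters, then a terminator, with the pointer aimed at the first character. Everything named Model* or model_* is invented for this sketch; it is not the real allocator behind SysAllocString.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

typedef char16_t* ModelBstr;  // hypothetical stand-in for BSTR

// Allocate [4-byte byte length][characters][terminator]; return a pointer to the characters.
ModelBstr model_alloc(const char16_t* text) {
    std::size_t chars = 0;
    while (text[chars] != 0) ++chars;
    const std::uint32_t bytes = static_cast<std::uint32_t>(chars * sizeof(char16_t));
    unsigned char* block = static_cast<unsigned char*>(
        std::malloc(sizeof bytes + bytes + sizeof(char16_t)));
    if (!block) return nullptr;
    std::memcpy(block, &bytes, sizeof bytes);                           // length prefix
    std::memcpy(block + sizeof bytes, text, bytes + sizeof(char16_t));  // data and terminator
    return reinterpret_cast<ModelBstr>(block + sizeof bytes);
}

// Read the byte length stored just in front of the characters.
std::uint32_t model_byte_len(ModelBstr s) {
    std::uint32_t bytes = 0;
    if (s) std::memcpy(&bytes, reinterpret_cast<unsigned char*>(s) - sizeof bytes, sizeof bytes);
    return bytes;
}

// Free the whole block; the caller owns it, exactly as with a real BSTR.
void model_free(ModelBstr s) {
    if (s) std::free(reinterpret_cast<unsigned char*>(s) - sizeof(std::uint32_t));
}
```

Because the pointer does not address the start of the block, a generic free or delete would be wrong; only the matching release function knows how to step back to the real allocation, which is why SysFreeString is mandatory for real BSTRs.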

Converting between BSTR and string (char*) is really a conversion between wchar_t* and char*. The functions above are the COM way of doing it.

You can also use the functions in stdlib:

wcstombs and mbstowcs. Here wcs stands for wide-character string and mbs for multibyte string, which is the ordinary ANSI character encoding.

    // Assumes setlocale(LC_CTYPE, "chs") has been called as shown earlier.
    wchar_t ws[10];                 // sizeof(ws) = 20 bytes
    swprintf(ws, 10, L"我们");
    char cs[50];
    cs[0] = '\0';                   // clear data and initialize

    // wchar_t* to char*
    int ret = 0;
    printf("Before wcstombs: cs = %4s ws = %ls\n", cs, ws);
    ret = (int)wcstombs(cs, ws, sizeof(cs));   // count is the size of the destination in bytes
    printf("After wcstombs: ret = %d, cs = %4s ws = %ls\n", ret, cs, ws);

    ws[0] = L'\0';                  // clear data and initialize

    // char* to wchar_t*
    wprintf(L"Before mbstowcs: ws = %4ls cs = %hs\n", ws, cs);
    ret = (int)mbstowcs(ws, cs, sizeof(ws) / sizeof(ws[0]));   // count is in wide characters
    wprintf(L"After mbstowcs: ret = %d, ws = %2ls cs = %hs\n", ret, ws, cs);

Running result:

    Before wcstombs: cs =  ws = %s =
    After wcstombs: ret = 4, cs = 我们 ws = %s =
    Before mbstowcs: ws =  cs = %s =
    After mbstowcs: ret = 2, ws = 我们 cs = %s = 我们
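The return values match the encodings: 4 bytes for the narrow string and 2 wide characters for the wide one. A portable sketch of the same round trip is given here, with counts derived from the destination buffers; the wrapper names narrow_copy and wide_copy and the buffer sizes are assumptions for this example, and in the default "C" locale only ASCII text is guaranteed to convert.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// wchar_t* to char*: the count passed to wcstombs is in bytes of the destination.
std::string narrow_copy(const wchar_t* ws) {
    char cs[50];
    const std::size_t n = std::wcstombs(cs, ws, sizeof cs - 1);
    if (n == static_cast<std::size_t>(-1)) return std::string();  // unconvertible character
    cs[n] = '\0';  // wcstombs does not terminate when the buffer is full
    return std::string(cs, n);
}

// char* to wchar_t*: the count passed to mbstowcs is in wide characters.
std::wstring wide_copy(const char* cs) {
    wchar_t ws[10];
    const std::size_t cap = sizeof ws / sizeof ws[0];
    const std::size_t n = std::mbstowcs(ws, cs, cap - 1);
    if (n == static_cast<std::size_t>(-1)) return std::wstring();  // invalid multibyte sequence
    ws[n] = L'\0';
    return std::wstring(ws, n);
}
```

Input longer than the buffers is silently truncated in this sketch; real code would size the buffers from the input or check the return value against the capacity.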

Windows also provides APIs that serve the same purpose:

// MultiByteToWideChar (and its counterpart WideCharToMultiByte)
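A common way to call MultiByteToWideChar is in two passes: once with no output buffer to get the required length, then again to convert. The sketch below assumes the system ANSI code page (CP_ACP); the wrapper name ansi_to_wide is made up, and the non-Windows branch, which falls back to mbstowcs, exists only so that the example compiles elsewhere.

```cpp
#include <cstddef>
#include <string>
#ifdef _WIN32
#include <windows.h>
#else
#include <cstdlib>
#endif

// Convert a narrow string in the current ANSI code page / locale to a wide string.
std::wstring ansi_to_wide(const std::string& s) {
    if (s.empty()) return std::wstring();
#ifdef _WIN32
    const int len = static_cast<int>(s.size());
    // First pass: no output buffer, returns the number of wide characters needed.
    const int n = MultiByteToWideChar(CP_ACP, 0, s.data(), len, NULL, 0);
    if (n <= 0) return std::wstring();
    std::wstring out(static_cast<std::size_t>(n), L'\0');
    // Second pass: perform the conversion into the sized buffer.
    MultiByteToWideChar(CP_ACP, 0, s.data(), len, &out[0], n);
    return out;
#else
    // Fallback (POSIX behavior): a null destination makes mbstowcs report the length.
    const std::size_t n = std::mbstowcs(NULL, s.c_str(), 0);
    if (n == static_cast<std::size_t>(-1)) return std::wstring();
    std::wstring out(n, L'\0');
    std::mbstowcs(&out[0], s.c_str(), n);
    return out;
#endif
}
```

Because the result lives in a std::wstring, there is nothing to free by hand, unlike the BSTR returned by the COM helpers.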

Finally, here is my favorite helper, print_hex_to_file.

    // Requires <stdio.h> and <stdint.h>.
    void print_hex_to_file(FILE* fp, const uint8_t* array,
                           int count      /* size of array in bytes */,
                           int linecount  /* bytes per line, normally 16 */)
    {
        int i;
        fprintf(fp, "0x");
        for (i = 0; i < count;) {
            fprintf(fp, "%02x ", array[i]);
            i++;
            /* start a new line after every linecount bytes, except at the very end */
            if (!(i % linecount) && i < count) {
                fprintf(fp, "\n0x");
            }
        }
        fprintf(fp, "\n");
    }
