wchar_t wide character research, with COM BSTR and VARIANT

Source: Internet
Author: User

 

wchar_t stands on the same footing as char: in C++ it is a native built-in type, not something defined through a typedef.

In short, on Windows it occupies two bytes, the same amount of space as a short.
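Both claims can be checked on a given compiler with a small sketch like the one below. The function names are made up for illustration. The two-byte size is what MSVC uses; other toolchains commonly use four bytes, so the size is reported rather than assumed.

```cpp
#include <cstddef>
#include <type_traits>

// True when wchar_t is its own built-in type rather than an alias for one of
// the ordinary integer types (this is what standard C++ requires).
constexpr bool wchar_is_distinct_type() {
    return !std::is_same<wchar_t, unsigned short>::value &&
           !std::is_same<wchar_t, short>::value &&
           !std::is_same<wchar_t, int>::value &&
           !std::is_same<wchar_t, unsigned int>::value;
}

// Size of one wide character in bytes: 2 with MSVC, usually 4 elsewhere.
constexpr std::size_t wchar_size_bytes() {
    return sizeof(wchar_t);
}
```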

For example:

String "我们\n"

ANSI (GBK) hexadecimal: CE D2 C3 C7 0A 00

Six bytes, including the \0 that ends the string

Unicode (UTF-16, little-endian) hexadecimal: 11 62 EC 4E 0A 00 00 00

Eight bytes: every character takes 2 bytes, letters and digits included, so the line feed \n is 0A 00 and the terminator is 00 00.
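The Unicode byte sequence above can be reproduced with a short sketch. It assumes the string is 我们 (U+6211, U+4EEC) followed by a line feed, and the helper name utf16le_bytes is invented for this example.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Serialize UTF-16 code units as little-endian bytes, followed by a two-byte
// terminator, which matches how a wide string sits in memory on Windows.
std::vector<std::uint8_t> utf16le_bytes(const std::u16string& text) {
    std::vector<std::uint8_t> out;
    for (char16_t unit : text) {
        out.push_back(static_cast<std::uint8_t>(unit & 0xFF));         // low byte first
        out.push_back(static_cast<std::uint8_t>((unit >> 8) & 0xFF));  // then high byte
    }
    out.push_back(0x00);  // terminating L'\0' is two zero bytes
    out.push_back(0x00);
    return out;
}
```

Calling utf16le_bytes(u"\u6211\u4EEC\n") gives the eight bytes 11 62 EC 4E 0A 00 00 00.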

 

In an ordinary program, putting L in front of a string literal, as in L"", marks it as a wide (Unicode) string.

On Windows, the macro _T("") does the same thing when the project is built with _UNICODE defined.
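How such a macro can switch between narrow and wide literals is sketched below. DEMO_UNICODE, DEMO_T and demo_tchar are hypothetical stand-ins, not the real names from <tchar.h>; only the token-pasting idea is the point.

```cpp
#include <cstddef>

// Hypothetical stand-in for the _UNICODE build switch.
#define DEMO_UNICODE 1

#if DEMO_UNICODE
typedef wchar_t demo_tchar;
#define DEMO_T(x) L##x   // paste an L prefix onto the literal
#else
typedef char demo_tchar;
#define DEMO_T(x) x      // leave the literal as a narrow string
#endif

// Number of bytes occupied by the literal "We", terminator included.
inline std::size_t demo_literal_size() {
    return sizeof(DEMO_T("We"));
}
```

With DEMO_UNICODE set to 1 the literal is three wide characters long; set it to 0 and the same source line yields three plain bytes.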

 

1. The first simple question: how do you print Unicode?

Each character is two bytes, so it can always be printed as a number, but plain printf cannot print it as a character.

Use wprintf instead, that is, the ordinary printf with a w (for wide) in front. Related functions such as wsprintf follow the same naming.
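The same family includes swprintf, which formats into a wide buffer instead of writing to the console. A minimal portable sketch, with describe_wide as an invented helper name, could look like this. It uses %ls, the standard conversion for a wchar_t* argument.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Format a wide string into a std::wstring with swprintf.
std::wstring describe_wide(const wchar_t* text) {
    wchar_t buffer[128];
    int written = std::swprintf(buffer, sizeof(buffer) / sizeof(buffer[0]),
                                L"%ls has %d wide characters",
                                text, static_cast<int>(std::wcslen(text)));
    if (written < 0) {
        return std::wstring();  // formatting failed or the buffer was too small
    }
    return std::wstring(buffer, static_cast<std::size_t>(written));
}
```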

    // Needs <stdio.h>, <string.h>, <stdint.h> and <comutil.h> (link with comsuppw.lib)
    const char *lpszText = "我们\r\n";
    // ANSI:    CE D2 C3 C7
    // Unicode: 11 62 EC 4E
    // Carriage return \r is 0D, line feed \n is 0A
    printf("char* text: %s 0x%08x 0x%08x\nANSI encoding: ", lpszText, lpszText, *lpszText);
    print_hex_to_file(stdout, (const uint8_t *)lpszText, (int)strlen(lpszText) + 1, 16); // my own helper, listed at the end
    BSTR bstrText = _com_util::ConvertStringToBSTR(lpszText);
    // With MSVC, %s inside wprintf expects a wide string
    wprintf(L"BSTR text: %s 0x%08x 0x%08x\nUnicode encoding is: ", bstrText, bstrText, *bstrText);
    print_hex_to_file(stdout, (const uint8_t *)bstrText, (int)wcslen(bstrText) * 2 + 2, 16);
    ::SysFreeString(bstrText);

The result is:

 
char* text: 我们
 0x013fbd80 0xffffffce
ANSI encoding: 0xce d2 c3 c7 0d 0a 00
BSTR text: 我们
 0x007be5b4 0x00006211
Unicode encoding is: 0x11 62 ec 4e 0d 00 0a 00 00 00
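The 0xffffffce in this output comes from sign extension: the first byte CE has its top bit set, and a signed char is widened to int before printf sees it. The sketch below assumes a signed char type (the MSVC default for plain char) and a 32-bit int. Both function names are invented.

```cpp
// Widen one byte the way a signed char is promoted when passed to a
// variadic function such as printf: the sign bit fills the upper bits.
unsigned int promote_signed(unsigned char byte) {
    signed char as_signed = static_cast<signed char>(byte);
    return static_cast<unsigned int>(static_cast<int>(as_signed));
}

// Going through unsigned char keeps only the low eight bits.
unsigned int promote_unsigned(unsigned char byte) {
    return static_cast<unsigned int>(byte);
}
```

Casting to unsigned char before printing, as in (unsigned char)*lpszText, is therefore one way to see just ce.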

By the way, at first wprintf would not print the Chinese characters; it worked after I added the following two lines.

    #include <locale.h>
    setlocale(LC_CTYPE, "chs");

Whether the source file is saved as ANSI or Unicode has no effect on the result.

 

By the way, how do you print a single wchar_t? Everything above used pointers, that is, strings, and those work fine. For a single character:

    setlocale(LC_CTYPE, "chs");
    WCHAR wstr1;
    wchar_t wstr2;
    wstr1 = L'我';
    wstr2 = L'们';
    wprintf(L"Each character of the wide character set (%c, %c) is %d bytes\n", wstr1, wstr2, (int)sizeof(wstr1));

Always remember the L prefix when assigning; then the result is correct.

 
Each character of the wide character set (我, 们) is 2 bytes

If you assign '我' to a plain char instead, only a single byte of its two-byte encoding CE D2 fits, so the output is garbled.

    char ss;
    ss = '我';
    printf("ss = %c\n", ss);

The result is: ss = ?

Note that no line break follows the ?. Possibly the byte printed by %c and the \n after it were rendered together as one double-byte character; printing a lone CE byte is bound to look odd.

2. The second simple question: how do you convert to and from char strings?

    int ConvertStringToBSTRDemo()
    {
        const char *lpszText = "Test";
        printf("char* text: %s\n", lpszText);
        BSTR bstrText = _com_util::ConvertStringToBSTR(lpszText);
        wprintf(L"BSTR text: %s\n", bstrText);
        ::SysFreeString(bstrText);
        return 0;
    }

    int ConvertBSTRToStringDemo()
    {
        BSTR bstrText = ::SysAllocString(L"Test");
        wprintf(L"BSTR text: %s\n", bstrText);
        char *lpszText2 = _com_util::ConvertBSTRToString(bstrText);
        printf("char* text: %s\n", lpszText2);
        ::SysFreeString(bstrText);
        delete[] lpszText2;
        return 0;
    }

At first, leaving out the global function SysFreeString() did not seem to cause a memory leak (according to VLD).

Then it dawned on me: VLD probably does not hook the allocation and release of COM memory, so I removed the comment markers in front of the SysFreeString calls.

In an experiment, 10,000 iterations without it leaked memory on the order of MB, yet VLD detected nothing. So be careful!

 

In COM programming, BSTR is really just a wchar_t* type. The memory a BSTR points to is allocated for you, and you must free it yourself!

Converting between BSTR and a string (char*) is therefore really a conversion between wchar_t* and char*. What is shown above is the COM way of doing it.
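One way to make the release automatic is to tie it to scope. The sketch below is portable and does not touch COM at all: demo_alloc_string and demo_free_string are hypothetical stand-ins for SysAllocString and SysFreeString, and g_live is a counter added only to show that nothing is left allocated.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwchar>
#include <memory>

static int g_live = 0;  // number of strings currently allocated (demo only)

// Hypothetical stand-in for SysAllocString: copies src into new memory.
wchar_t* demo_alloc_string(const wchar_t* src) {
    std::size_t n = std::wcslen(src) + 1;
    wchar_t* p = static_cast<wchar_t*>(std::malloc(n * sizeof(wchar_t)));
    if (!p) return nullptr;
    std::wmemcpy(p, src, n);
    ++g_live;
    return p;
}

// Hypothetical stand-in for SysFreeString.
void demo_free_string(wchar_t* p) {
    if (p) { std::free(p); --g_live; }
}

struct DemoStringDeleter {
    void operator()(wchar_t* p) const { demo_free_string(p); }
};
using DemoStringPtr = std::unique_ptr<wchar_t, DemoStringDeleter>;

// Allocate in a loop; each string is freed when its owner goes out of scope.
int live_after_loop(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        DemoStringPtr s(demo_alloc_string(L"Test"));
    }
    return g_live;
}
```

In real COM code the same idea is usually covered by wrapper classes such as _bstr_t from <comutil.h>.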

 

You can also use the functions from stdlib:

wcstombs and mbstowcs. The wcs part is the wide-character string; mbs stands for multibyte string, the C library's name for ordinary locale-encoded (ANSI) char text, which was not obvious to me at first.

    // Assumes setlocale(LC_CTYPE, "chs") has been called as shown earlier
    wchar_t ws[10];                 // sizeof(ws) = 20 bytes
    wsprintf(ws, L"我们");          // resolves to wsprintfW in a Unicode build
    char cs[50];
    sprintf(cs, "");                // clear and initialize

    // wchar_t* to char*
    int ret = 0;
    printf("Before wcstombs: cs = %4s ws = %ls\n", cs, ws);    // %ls: wide string
    ret = (int)wcstombs(cs, ws, sizeof(cs));                   // count is the size of cs in bytes
    printf("After wcstombs: ret = %d, cs = %4s ws = %ls\n", ret, cs, ws);

    wsprintf(ws, L"");              // clear and initialize
    // char* to wchar_t*
    wprintf(L"Before mbstowcs: ws = %4ls cs = %hs\n", ws, cs); // %hs: narrow string
    ret = (int)mbstowcs(ws, cs, sizeof(ws) / sizeof(ws[0]));   // count is in wide characters
    wprintf(L"After mbstowcs: ret = %d, ws = %2ls cs = %hs\n", ret, ws, cs);

Running result

 
Before wcstombs: cs =      ws = %s =
After wcstombs: ret = 4, cs = 我们 ws = %s =
Before mbstowcs: ws =      cs = %s =
After mbstowcs: ret = 2, ws = 我们 cs = %s = 我们
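The return values 4 and 2 reflect the units involved: wcstombs counts bytes written, mbstowcs counts wide characters written. The same distinction applies to the size argument, which the following portable sketch makes explicit. The helper names narrow and widen are invented, and the conversion follows whatever locale is currently set.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// wchar_t* to char*: the count passed to wcstombs is the capacity of the
// destination in bytes.
std::string narrow(const std::wstring& ws) {
    std::vector<char> buf(ws.size() * MB_CUR_MAX + 1);
    std::size_t n = std::wcstombs(buf.data(), ws.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) {
        return std::string();  // a character cannot be represented in this locale
    }
    return std::string(buf.data(), n);
}

// char* to wchar_t*: the count passed to mbstowcs is the capacity of the
// destination in wide characters, not bytes.
std::wstring widen(const std::string& s) {
    std::vector<wchar_t> buf(s.size() + 1);
    std::size_t n = std::mbstowcs(buf.data(), s.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) {
        return std::wstring();  // invalid multibyte sequence for this locale
    }
    return std::wstring(buf.data(), n);
}
```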

Windows also provides API functions for the same purpose, such as MultiByteToWideChar and its counterpart WideCharToMultiByte.

// MultiByteToWideChar

 

Finally, here is my favorite helper function, print_hex_to_file.

    void print_hex_to_file(FILE *fp, const uint8_t *array, int count /* size of array */, int linecount /* usually 16 */)
    {
        int i;
        fprintf(fp, "0x");
        for (i = 0; i < count;) {
            fprintf(fp, "%02x ", array[i]);
            i++;
            // Start a new line after every linecount bytes, except at the very end
            if (!(i % linecount) && i < count) {
                fprintf(fp, "\n0x");
            }
        }
        fprintf(fp, "\n");
    }
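For checking the layout without a console, the same loop can collect its output into a string. This is only a sketch: hex_dump_string is an invented name, and it assumes each byte is written as two hex digits followed by a space.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Same layout as print_hex_to_file, but collected into a std::string so the
// result can be compared in a test. Assumes line_count > 0.
std::string hex_dump_string(const std::uint8_t* data, int count, int line_count = 16) {
    std::string out = "0x";
    char cell[4];  // two hex digits, one space, terminator
    for (int i = 0; i < count;) {
        std::snprintf(cell, sizeof(cell), "%02x ", static_cast<unsigned>(data[i]));
        out += cell;
        ++i;
        if (i % line_count == 0 && i < count) {
            out += "\n0x";  // new line every line_count bytes
        }
    }
    out += "\n";
    return out;
}
```

Fed the seven ANSI bytes of the first example (CE D2 C3 C7 0D 0A 00), it produces the single line 0xce d2 c3 c7 0d 0a 00.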
