Like char, wchar_t is a built-in type (in C++, and in Visual C++ by default), not something defined through typedef.
On Windows it occupies two bytes, the same size as a short.
For example:
String: "我们\n"
ANSI (GBK) hexadecimal: CE D2 C3 C7 0A 00
Six bytes, including the \0 at the end of the string.
Unicode (UTF-16LE) hexadecimal: 11 62 EC 4E 0A 00 00 00
Eight bytes: every character takes 2 bytes, letters and digits included, so the line feed \n becomes 0A 00 and the terminator becomes 00 00.
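The byte layout above can be checked with a small portable sketch. It uses the C++11 type char16_t instead of wchar_t so that the result does not depend on the platform's wchar_t size; the function name utf16le_bytes is made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Serialize a NUL-terminated UTF-16 string to little-endian bytes,
// including the two-byte terminator.
// (utf16le_bytes is a hypothetical helper, not a library function.)
std::vector<std::uint8_t> utf16le_bytes(const char16_t *s)
{
    std::vector<std::uint8_t> out;
    for (;; ++s) {
        out.push_back(static_cast<std::uint8_t>(*s & 0xFF));        // low byte first
        out.push_back(static_cast<std::uint8_t>((*s >> 8) & 0xFF)); // then high byte
        if (*s == 0) break;                                         // terminator was emitted
    }
    return out;
}
```

Calling utf16le_bytes(u"\u6211\u4eec\n") (the code points of 我 and 们, then a line feed) yields the eight bytes 11 62 EC 4E 0A 00 00 00.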
In ordinary programs, putting L in front of a string literal, as in L"", marks it as a Unicode string.
On Windows, the _T("") macro has the same effect when the program is built with _UNICODE defined.
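Roughly how such a macro can work is sketched below. The names MY_UNICODE, MY_T and my_tchar are invented for this sketch; the real Windows headers use _UNICODE, _T and TCHAR, and their exact definitions are not reproduced here.

```cpp
#include <cstddef>

// Made-up switch for this sketch; remove this line to get narrow literals.
#define MY_UNICODE 1

#ifdef MY_UNICODE
typedef wchar_t my_tchar;
#define MY_T(x) L##x   // paste an L prefix onto the literal
#else
typedef char my_tchar;
#define MY_T(x) x      // leave the literal narrow
#endif

// Number of characters in a literal, counting the terminator.
template <std::size_t N>
std::size_t literal_length(const my_tchar (&)[N]) { return N; }
```

With MY_UNICODE defined, MY_T("ab") expands to L"ab", an array of three wchar_t elements; without it, the same source line produces a plain char literal.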
1. The first simple question: how do you print Unicode?
Each character is 2 bytes, so it can be printed as a number, but to print it as a character the ordinary printf will not do.
Use wprintf instead, which is the ordinary printf with a w (for wide) in front. Similar functions exist, such as wsprintf.
    const char *lpszText = "我们\r\n";
    // ANSI:    CE D2 C3 C7
    // UNICODE: 11 62 EC 4E
    // Carriage return \r is 0D, line feed \n is 0A
    printf("char* text: %s 0x%08x 0x%08x\nANSI encoding: ", lpszText, lpszText, *lpszText);
    print_hex_to_file(stdout, (const uint8_t *)lpszText, strlen(lpszText) + 1, 16);

    // The BSTR is allocated by this function
    BSTR bstrText = _com_util::ConvertStringToBSTR(lpszText);
    wprintf(L"BSTR text: %s 0x%08x 0x%08x\nUnicode encoding: ", bstrText, bstrText, *bstrText);
    // Two bytes per character, plus two for the terminator
    print_hex_to_file(stdout, (const uint8_t *)bstrText, wcslen(bstrText) * 2 + 2, 16);
The result is:
    char* text: 我们
     0x013fbd80 0xffffffce
    ANSI encoding: 0xce d2 c3 c7 0d 0a 00
    BSTR text: 我们
     0x007be5b4 0x00006211
    Unicode encoding: 0x11 62 ec 4e 0d 00 0a 00 00 00
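The 0xffffffce in the output is the first byte CE after sign extension: a char passed to printf is promoted to int, and a negative char keeps its sign. The sketch below isolates that step, assuming char is signed (the Visual C++ default); both function names are made up for the illustration.

```cpp
#include <cstdint>

// What a %08x conversion ends up showing for a signed char argument:
// the value is promoted to int first, so a negative byte is sign-extended.
std::uint32_t promoted_bits(signed char c)
{
    int promoted = c;                            // default argument promotion
    return static_cast<std::uint32_t>(promoted);
}

// The same byte read as unsigned, so no sign extension takes place.
std::uint32_t unsigned_bits(signed char c)
{
    return static_cast<unsigned char>(c);
}
```

For the byte CE, promoted_bits gives 0xffffffce while unsigned_bits gives 0xce, so casting to unsigned char before printing avoids the leading ff digits. The wide character does not show this effect in the output above because its value 0x6211 is not negative.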
By the way, at first wprintf would not print the Chinese characters; it worked once I added the following two lines.
    #include <locale.h>
    setlocale(LC_CTYPE, "chs");
Whether the source file is saved as ANSI or Unicode has no effect on the result.
Next, how do you print a single wchar_t? Everything above used pointers, that is, strings. Here is the single-character case.
    setlocale(LC_CTYPE, "chs");
    WCHAR wstr1;
    wchar_t wstr2;
    wstr1 = L'我';
    wstr2 = L'们';
    // %lc prints one wide character; sizeof yields a size_t, so cast it for %d
    wprintf(L"Each character in the wide character set (%lc, %lc) is %d bytes\n", wstr1, wstr2, (int)sizeof(wstr1));
Always remember the L prefix when assigning; then the result is correct.
    Each character in the wide character set (我, 们) is 2 bytes
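A portable variant of the single-character case is sketched below. It formats into a wide buffer with swprintf instead of printing, so it does not depend on the console locale; the helper name describe_wide_chars is made up. Note that sizeof(wchar_t) is 2 on Windows but is commonly 4 on Linux, so the reported size varies by platform.

```cpp
#include <cstddef>
#include <cwchar>

// Format two wide characters and the size of wchar_t into buf.
// Returns the number of wide characters written, or a negative value on error.
// (describe_wide_chars is a hypothetical helper for this sketch.)
int describe_wide_chars(wchar_t *buf, std::size_t n, wchar_t a, wchar_t b)
{
    return std::swprintf(buf, n, L"(%lc, %lc) is %d bytes",
                         static_cast<std::wint_t>(a),
                         static_cast<std::wint_t>(b),
                         static_cast<int>(sizeof(wchar_t)));
}
```

The explicit %lc conversion is used here because it means a wide character in both the standard and the Visual C++ runtime.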
If you assign '我' to a char, the char can hold only one byte of the two-byte code CE D2. Does the output come out garbled?
    char ss;
    ss = '我';
    printf("ss = %c\n", ss);
The result is: ss = ?
Note that there is no line break after the ?: the \n appears to have been swallowed together with the byte printed by %c. Printing a lone byte such as CE can behave strangely.
2. The second simple question: how to convert to and from the char type
    #include <stdio.h>
    #include <comutil.h>   // _com_util; link with comsuppw.lib

    int ConvertStringToBSTRDemo()
    {
        const char *lpszText = "Test";
        printf("char* text: %s\n", lpszText);
        BSTR bstrText = _com_util::ConvertStringToBSTR(lpszText);
        wprintf(L"BSTR text: %s\n", bstrText);
        ::SysFreeString(bstrText);   // release the BSTR allocated by the conversion
        return 0;
    }

    int ConvertBSTRToStringDemo()
    {
        BSTR bstrText = ::SysAllocString(L"Test");
        wprintf(L"BSTR text: %s\n", bstrText);
        char *lpszText2 = _com_util::ConvertBSTRToString(bstrText);
        printf("char* text: %s\n", lpszText2);
        ::SysFreeString(bstrText);
        delete[] lpszText2;          // the converted char* was allocated with new[]
        return 0;
    }
If the global function SysFreeString() is not called, there seems to be no memory leak? (Checked with VLD.)
Ah, now I get it: VLD probably does not hook the memory allocated inside COM, so I put back the SysFreeString calls that I had commented out.
In an experiment, leaving them out leaked memory on the order of MB after 10,000 iterations, yet VLD did not detect it. So be careful!
In COM programming, a BSTR is really a wchar_t* whose memory is allocated for you. You must release that memory yourself!
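One way to make the release hard to forget is to tie it to scope with a smart pointer and a custom deleter. The sketch below is only an illustration of that idea: demo_alloc_string, demo_free_string and the counter are hypothetical stand-ins for SysAllocString and SysFreeString so that the code builds on any platform, and none of them are part of COM.

```cpp
#include <cstddef>
#include <cwchar>
#include <memory>

// Counts strings that are currently allocated (for demonstration only).
int g_live_strings = 0;

// Hypothetical stand-in for SysAllocString.
wchar_t *demo_alloc_string(const wchar_t *src)
{
    std::size_t n = std::wcslen(src) + 1;
    wchar_t *p = new wchar_t[n];
    std::wmemcpy(p, src, n);
    ++g_live_strings;
    return p;
}

// Hypothetical stand-in for SysFreeString.
void demo_free_string(wchar_t *p)
{
    if (p) { delete[] p; --g_live_strings; }
}

// Deleter that releases the string when its owner goes out of scope.
struct DemoStringDeleter {
    void operator()(wchar_t *p) const { demo_free_string(p); }
};
typedef std::unique_ptr<wchar_t, DemoStringDeleter> DemoString;

// Allocates, uses, and releases a string; the release is automatic.
std::size_t length_via_owner(const wchar_t *text)
{
    DemoString s(demo_alloc_string(text));
    return std::wcslen(s.get());
}   // s is destroyed here, which calls demo_free_string
```

In real COM code the _bstr_t class from comutil.h plays a similar owning role for BSTR values.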
Converting between a BSTR and a string (char*) is really converting between wchar_t* and char*. That is the COM way.
You can also use the functions in stdlib:
wcstombs and mbstowcs. Here wcs stands for wide-character string and mbs for multibyte string, which is what an ordinary ANSI-encoded char string is.
    wchar_t ws[10];                  // sizeof(ws) = 20 bytes
    wsprintf(ws, L"我们");           // Win32 wsprintf; takes wchar_t* in a UNICODE build
    char cs[50];
    sprintf(cs, "");                 // clear data and initialize

    // wchar_t* to char*
    int ret = 0;
    printf("Before wcstombs: cs = %4s ws = %ls\n", cs, ws);
    // The third argument is the capacity of the destination in bytes
    ret = (int)wcstombs(cs, ws, sizeof(cs));
    printf("After wcstombs: ret = %d, cs = %4s ws = %ls\n", ret, cs, ws);

    wsprintf(ws, L"");               // clear data and initialize

    // char* to wchar_t*
    wprintf(L"Before mbstowcs: ws = %4ls cs = %hs\n", ws, cs);
    // The third argument is the capacity of the destination in wchar_t elements
    ret = (int)mbstowcs(ws, cs, sizeof(ws) / sizeof(ws[0]));
    wprintf(L"After mbstowcs: ret = %d, ws = %2ls cs = %hs\n", ret, ws, cs);
Output:
    Before wcstombs: cs =      ws = %s =
    After wcstombs: ret = 4, cs = 我们 ws = %s =
    Before mbstowcs: ws =      cs = %s =
    After mbstowcs: ret = 2, ws = 我们 cs = %s = 我们
Windows has APIs that serve the same purpose.
// MultiByteToWideChar
Finally, here is my favorite print_hex_to_file function.
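    #include <stdio.h>
    #include <stdint.h>

    // Dump count bytes of array to fp as hex, linecount bytes per line,
    // with each line prefixed by 0x.
    void print_hex_to_file(FILE *fp, const uint8_t *array,
                           int count      /* size of array */,
                           int linecount  /* the usual value is 16 */)
    {
        int i;
        fprintf(fp, "0x");
        for (i = 0; i < count;) {
            fprintf(fp, "%02x ", array[i]);
            i++;
            // Start a new line after every linecount bytes, unless we are done
            if (!(i % linecount) && i < count) {
                fprintf(fp, "\n0x");
            }
        }
        fprintf(fp, "\n");
    }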