http://blog.programfan.com/article.asp?id=34551
http://blog.programfan.com/article.asp?id=34552

I. ANSI and Unicode

2. ANSI characters and Unicode characters

The ANSI character type is CHAR. The pointer to an ANSI string is PSTR (LPSTR), and the pointer to a constant ANSI string is PCSTR (LPCSTR).

The corresponding Windows-defined Unicode character type is WCHAR (typedef wchar_t WCHAR). The pointer to a Unicode string is PWSTR, and the pointer to a constant Unicode string is PCWSTR.

String literals are written as follows:

    ANSI:          "ANSI"
    Unicode:       L"Unicode"
    ANSI/Unicode:  _T("string") or _TEXT("string")

3. ANSI and Unicode string operations

In a double-byte character set (DBCS), each character of a string can occupy one or two bytes. Calling strlen() alone therefore cannot tell you how many characters the string contains; it only returns the number of bytes before the terminating 0.

In standard C, strcpy, strchr, and strcat work only on ANSI strings and cannot process Unicode strings correctly. A complementary set of functions is therefore provided, equivalent in functionality but operating on Unicode strings. Here is how the string.h header declares the char * and wchar_t * versions:

    char *strcat(char *, const char *);
    wchar_t *wcscat(wchar_t *, const wchar_t *);

The same pairing applies to strchr/wcschr, strcmp/wcscmp, strlen/wcslen, and so on.

    ANSI functions start with str, e.g. strcpy
    Unicode functions start with wcs, e.g. wcscpy
    MBCS functions start with _mbs, e.g. _mbscpy
    ANSI/Unicode functions start with _tcs, e.g. _tcscpy (C runtime library)
    ANSI/Unicode functions start with lstr, e.g. lstrcpy (Windows function)

II. ANSI/Unicode generic character and string types: TCHAR/LPTSTR/LPCTSTR

Neutral ANSI/Unicode types

1. Generic character TCHAR

TCHAR: if UNICODE is defined, it is wchar_t (WCHAR), for Unicode platforms; otherwise it is char, for ANSI and DBCS platforms.

2.
Generic string pointer LPTSTR

LPTSTR: if UNICODE is defined, it is LPWSTR (wchar_t *), for Unicode platforms; otherwise it is LPSTR (char *), for ANSI and DBCS platforms.

3. Generic constant string pointer LPCTSTR

LPCTSTR: if UNICODE is defined, it is LPCWSTR (const wchar_t *), for Unicode platforms; otherwise it is LPCSTR (const char *), for ANSI and DBCS platforms.

    typedef LPWSTR PTSTR, LPTSTR;
    #define __TEXT(quote) L##quote // r_winnt

<1> The _UNICODE macro is used by the C runtime header files, while the UNICODE macro is used by the Windows header files. When compiling a code module, define both macros together.

<2> To generate a Unicode string literal, put the L prefix in front of the string; it tells the compiler to compile the string as a Unicode string. The catch is that code written this way with the generic types fails to compile when _UNICODE is not defined, because TCHAR is then char. To solve this, use the _TEXT macro, which is also defined in tchar.h. With the macro, the code compiles whether or not the source file defines _UNICODE.

<3> Unicode and ANSI string conversion: the Windows function MultiByteToWideChar converts a multi-byte string to a wide string, and WideCharToMultiByte converts a wide string to an equivalent multi-byte string.

Some people love the standard ANSI functions such as strcpy, and some love the _txxxx functions. It is worth clarifying how the two relate.

To understand these functions, you first need to know several character types. char needs no introduction, so let's start with wchar_t. wchar_t is the data type for Unicode characters. In the Microsoft C headers it is defined (for example in <string.h>) as:

    typedef unsigned short wchar_t;

You cannot use ANSI C string functions such as strcpy to process wchar_t strings. You must use the functions prefixed with wcs, such as wcscpy. To make the compiler treat a literal as a Unicode string, you must put an "L" in front of it. For example:

    wchar_t *szTest = L"This is a Unicode string.";
Next, let's look at TCHAR. If you want the same source code to compile for both ANSI and Unicode, include tchar.h. TCHAR is a macro defined there: it expands to char or wchar_t depending on whether the _UNICODE macro is defined. If you use TCHAR, do not use the ANSI strxxx functions or the Unicode wcsxxx functions; use the _tcsxxx functions defined in tchar.h instead. In addition, to solve the problem with "L" mentioned earlier, tchar.h defines the macro "_TEXT".

Taking strcpy as an example, to summarize:

. To use an ANSI string:

    char szString[100];
    strcpy(szString, "test");

. To use a Unicode string:

    wchar_t szString[100];
    wcscpy(szString, L"test");

. To compile the same code as either ANSI or Unicode, depending on whether _UNICODE is defined:

    TCHAR szString[100];
    _tcscpy(szString, _TEXT("test"));

2, ANSI and Unicode

Unicode strings are wide-character strings, and COM uses Unicode strings throughout.

Convert ANSI to Unicode

(1) Use the L prefix, for example:

    CLSIDFromProgID(L"MAPI.Folder", &clsid);

(2) Convert with the MultiByteToWideChar function, for example:

    char *szProgID = "MAPI.Folder";
    WCHAR szWideProgID[128];
    CLSID clsid;
    // The last argument is a count of wide characters, not bytes; one slot is kept for the terminator.
    long lLen = MultiByteToWideChar(CP_ACP, 0, szProgID, (int)strlen(szProgID), szWideProgID, sizeof(szWideProgID) / sizeof(szWideProgID[0]) - 1);
    szWideProgID[lLen] = L'\0';

(3) Use the ATL macro A2W, for example:

    USES_CONVERSION;
    CLSIDFromProgID(A2W(szProgID), &clsid);

Convert Unicode to ANSI

(1) Use WideCharToMultiByte, for example:

    // Assume that you already have a Unicode string wszSomeString...
    char szANSIString[MAX_PATH];
    WideCharToMultiByte(CP_ACP, WC_COMPOSITECHECK, wszSomeString, -1, szANSIString, sizeof(szANSIString), NULL, NULL);

(2) Use the ATL macro W2A, for example:

    USES_CONVERSION;
    pTemp = W2A(wszSomeString);

This article is from the CSDN blog. If you reproduce it, please cite the source: http://blog.csdn.net/dongyonghui_1017/archive/2009/06/18/4280205.aspx