I. ANSI and Unicode
ANSI characters and Unicode characters
The ANSI character type is CHAR (char); a pointer to an ANSI string is PSTR (LPSTR), and a pointer to a constant ANSI string is PCSTR (LPCSTR). The corresponding Windows-defined Unicode character type is WCHAR (typedef wchar_t WCHAR); a pointer to a Unicode string is PWSTR, and a pointer to a constant Unicode string is PCWSTR.
ANSI            "ANSI"
Unicode         L"Unicode"
ANSI/Unicode    _T("string") or _TEXT("string")
ANSI and Unicode string operations
In a double-byte character set (DBCS), each character of a string occupies one or two bytes. Calling strlen() alone therefore cannot tell you the number of characters in the string; it only tells you the number of bytes before the terminating 0. The standard C functions strcpy, strchr, and strcat can only be used with ANSI strings and cannot process Unicode strings correctly. For that reason, a complementary set of functions is provided that is equivalent in functionality but operates on Unicode strings. Let's look at how the string.h header declares the char * and wchar_t * versions:
char *strcat(char *, const char *);
wchar_t *wcscat(wchar_t *, const wchar_t *);
The same pairing applies to strchr/wcschr, strcmp/wcscmp, strlen/wcslen, and so on.
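To make the byte-versus-character point concrete, here is a minimal sketch that counts characters in a DBCS string. The lead-byte rule (0x81 to 0xFE starts a two-byte character) is an assumption modeled on GBK-like code pages, and is_lead_byte and dbcs_char_count are hypothetical helper names. On Windows you would normally rely on IsDBCSLeadByte or _mbslen instead.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical lead-byte test: treats 0x81..0xFE as the first byte of a
// two-byte character (a GBK-like assumption, not a Windows API).
static bool is_lead_byte(unsigned char b) {
    return b >= 0x81 && b <= 0xFE;
}

// Counts characters (not bytes) in a null-terminated DBCS string.
std::size_t dbcs_char_count(const char *s) {
    std::size_t count = 0;
    while (*s != '\0') {
        const unsigned char b = static_cast<unsigned char>(*s);
        // Skip two bytes for a double-byte character, unless the trail
        // byte is missing because the string ends early.
        if (is_lead_byte(b) && s[1] != '\0') {
            s += 2;
        } else {
            s += 1;
        }
        ++count;
    }
    return count;
}
```

For the four-byte string "A\xD6\xD0" "B", strlen reports 4 bytes, while this function reports 3 characters.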
ANSI functions start with str: strcpy
Unicode functions start with wcs: wcscpy
MBCS functions start with _mbs: _mbscpy
Generic ANSI/Unicode functions start with _tcs: _tcscpy (C runtime library)
Generic ANSI/Unicode functions start with lstr: lstrcpy (Windows function)
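As an illustration of how the str and wcs families mirror each other, the sketch below performs the same copy-and-append steps on a char buffer and on a wchar_t buffer. The function names build_narrow and build_wide are hypothetical, and both assume the caller passes a buffer of at least 13 elements.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Builds "Hello, world" in a char buffer with the str* family.
// Returns the length in characters, excluding the terminator.
std::size_t build_narrow(char *dst) {
    std::strcpy(dst, "Hello");
    std::strcat(dst, ", world");
    return std::strlen(dst);
}

// The same steps with the wcs* family on a wchar_t buffer.
std::size_t build_wide(wchar_t *dst) {
    std::wcscpy(dst, L"Hello");
    std::wcscat(dst, L", world");
    return std::wcslen(dst);
}
```

Both functions return 12, but the wide buffer occupies 12 * sizeof(wchar_t) bytes of text rather than 12.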
II. Generic ANSI/Unicode character and string types: TCHAR/LPTSTR/LPCTSTR
Neutral ANSI/Unicode types
1. Generic character TCHAR
TCHAR
#ifdef UNICODE: it is wchar_t (WCHAR), for Unicode platforms;
#else: it is char, for ANSI and DBCS platforms.
2. Generic string pointer LPTSTR
LPTSTR
#ifdef UNICODE: it is LPWSTR (wchar_t *), for Unicode platforms;
#else: it is LPSTR (char *), for ANSI and DBCS platforms.
3. Generic constant string pointer LPCTSTR
LPCTSTR
#ifdef UNICODE: it is LPCWSTR (const wchar_t *), for Unicode platforms;
#else: it is LPCSTR (const char *), for ANSI and DBCS platforms.
typedef LPWSTR LPTSTR;
#define __TEXT(quote) L##quote // r_winnt
<1> The _UNICODE macro is used by the C runtime header files, while the UNICODE macro is used by the Windows header files. When compiling code, define both macros together.
<2> To generate a Unicode string, add the L prefix before the string literal to tell the compiler that it should be compiled as a Unicode string. The problem is that code written this way fails to compile if _UNICODE is not defined, because the generic functions then expect char strings. To solve this, use the _TEXT macro, which is also defined in tchar.h. With the macro, the code compiles whether or not the source file defines _UNICODE.
<3> Converting between Unicode and ANSI strings: the Windows function MultiByteToWideChar converts a multibyte string to a wide string, and WideCharToMultiByte converts a wide string to the equivalent multibyte string.
In addition, some people prefer the standard ANSI functions such as strcpy, and others prefer the _tXXXX functions, so it is worth clarifying where each comes from. To understand these functions, you first need to understand several character types. char needs no introduction, so let's start with wchar_t. wchar_t is the data type for Unicode characters; in the Microsoft C headers such as <string.h> it is actually defined as:
typedef unsigned short wchar_t;
You cannot use ANSI C string functions such as strcpy to process wchar_t strings. You must use the functions prefixed with wcs, such as wcscpy. To make the compiler treat a literal as a Unicode string, you must put an "L" in front of it. For example:
const wchar_t *szTest = L"This is a Unicode string.";
Now let's look at TCHAR. If you want the same source code to compile for both ANSI and Unicode, include tchar.h. TCHAR is a macro defined there; it expands to char or wchar_t depending on whether you have defined the _UNICODE macro. If you use TCHAR, you should not use the ANSI strXXX functions or the Unicode wcsXXX functions. Instead, use the _tcsXXX functions defined in tchar.h. In addition, to solve the problem with "L" mentioned earlier, tchar.h defines a macro: "_TEXT".
Taking the strcpy function as an example, to summarize:
. If you want to use an ANSI string, use this set:
char szString[100];
strcpy(szString, "test");
. If you want to use a Unicode string, use this set:
wchar_t szString[100];
wcscpy(szString, L"test");
. If you want the same code to compile as ANSI or Unicode depending on whether the _UNICODE macro is defined, use this set:
TCHAR szString[100];
_tcscpy(szString, _TEXT("test"));
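The three variants above differ only in their names, which is what makes a macro-based mapping possible. The following is a simplified sketch of that mechanism, not the actual contents of tchar.h: MY_UNICODE, my_tchar, MY_TEXT, my_tcscpy, and my_tcslen are hypothetical stand-ins for _UNICODE, TCHAR, _TEXT, _tcscpy, and _tcslen.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical switch; remove this line to build the char version.
#define MY_UNICODE

#ifdef MY_UNICODE
typedef wchar_t my_tchar;
#define MY_TEXT(quote) L##quote   // paste the L prefix onto the literal
#define my_tcscpy std::wcscpy
#define my_tcslen std::wcslen
#else
typedef char my_tchar;
#define MY_TEXT(quote) quote      // leave the literal unchanged
#define my_tcscpy std::strcpy
#define my_tcslen std::strlen
#endif

// Written once; compiles as either the char or the wchar_t version.
std::size_t copy_test(my_tchar *dst) {
    my_tcscpy(dst, MY_TEXT("test"));
    return my_tcslen(dst);
}
```

Because the choice is made by the preprocessor, there is no runtime cost; the price is that one binary supports only one of the two character types.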
ANSI and Unicode
A Unicode string is a string of wide characters, and COM uses Unicode strings throughout.
Convert ANSI to Unicode
(1) Use the L prefix, for example: CLSIDFromProgID(L"MAPI.Folder", &clsid);
(2) Use the MultiByteToWideChar function, for example:
const char *szProgID = "MAPI.Folder";
WCHAR szWideProgID[128];
CLSID clsid;
// The last argument is a count of wide characters, not bytes; one element is kept for the terminator.
long lLen = MultiByteToWideChar(CP_ACP, 0, szProgID, (int)strlen(szProgID), szWideProgID, sizeof(szWideProgID) / sizeof(szWideProgID[0]) - 1);
szWideProgID[lLen] = L'\0';
(3) Use the A2W macro, for example:
USES_CONVERSION;
CLSIDFromProgID(A2W(szProgID), &clsid);
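For comparison, the same narrow-to-wide conversion can be sketched with the standard C function mbstowcs, which plays a role similar to MultiByteToWideChar but uses the current C locale instead of a code-page argument. The helper name to_wide is hypothetical, and returning an empty string on failure is a choice made for this sketch only.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Converts a narrow string to a wide string using the current C locale.
// Returns an empty string if the input contains an invalid sequence.
std::wstring to_wide(const char *src) {
    // A wide string never has more characters than the narrow one has bytes.
    std::vector<wchar_t> buf(std::strlen(src) + 1);
    // The third argument is a count of wide characters, not bytes.
    const std::size_t n = std::mbstowcs(buf.data(), src, buf.size());
    if (n == static_cast<std::size_t>(-1)) {
        return std::wstring();
    }
    return std::wstring(buf.data(), n);
}
```

Sizing the buffer in elements rather than bytes avoids the mistake of passing sizeof of a wide array where a character count is expected.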
Convert Unicode to ANSI
(1) Use WideCharToMultiByte, for example:
// Assume that you already have a Unicode string wszSomeString...
char szANSIString[MAX_PATH];
WideCharToMultiByte(CP_ACP, WC_COMPOSITECHECK, wszSomeString, -1, szANSIString, sizeof(szANSIString), NULL, NULL);
(2) Use the W2A macro, for example:
USES_CONVERSION;
pTemp = W2A(wszSomeString);
This article is from the CSDN blog: http://blog.csdn.net/dongyonghui_1017/archive/2009/06/18/4280205.aspx
Converting char * and TCHAR * to CString
CString str(****);
The following describes other conversions in detail.
***********************************************************************
* Function: CString2TCHAR
* Description: Convert CString to TCHAR *
* Date:
***********************************************************************
TCHAR *CPublic::CString2TCHAR(CString &str)
{
    int iLen = str.GetLength();
    TCHAR *szRs = new TCHAR[iLen + 1]; // +1 for the terminating null character
    lstrcpy(szRs, str.GetBuffer(iLen));
    str.ReleaseBuffer();
    return szRs; // the caller must release it with delete[]
}
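The key detail in a copy like this is that the length reported by the string does not include the terminator, so the buffer needs one extra element. Below is a portable sketch of the same idea, assuming std::wstring as a stand-in for CString and using std::unique_ptr so that the caller does not have to remember delete[]. The name copy_to_buffer is hypothetical.

```cpp
#include <cstddef>
#include <cwchar>
#include <memory>
#include <string>

// Copies a string into a newly allocated, null-terminated buffer.
// std::wstring stands in for CString here (an assumption for portability).
std::unique_ptr<wchar_t[]> copy_to_buffer(const std::wstring &str) {
    const std::size_t len = str.length();                  // excludes the terminator
    std::unique_ptr<wchar_t[]> buf(new wchar_t[len + 1]);  // +1 for L'\0'
    std::wmemcpy(buf.get(), str.c_str(), len + 1);         // copies the terminator too
    return buf;                                            // freed automatically
}
```

Allocating only len elements would make the copy write one element past the end of the buffer.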
***********************************************************************
* Function: THCAR2Char
* Description: Convert TCHAR * to char *
***********************************************************************
char *CPublic::THCAR2Char(TCHAR *tchStr)
{
    // Assumes a Unicode build, where TCHAR is wchar_t.
    int iLen = 2 * wcslen(tchStr); // one wide character may need up to two bytes in a DBCS code page
    char *chRtn = new char[iLen + 1];
    wcstombs(chRtn, tchStr, iLen + 1); // returns (size_t)-1 if the conversion fails
    return chRtn; // the caller must release it with delete[]
}
***********************************************************************
* Function: CString2Char
* Description: Convert CString to char *
***********************************************************************
char *CPublic::CString2Char(CString &str)
{
    int len = str.GetLength();
    char *chRtn = (char *)malloc((len * 2 + 1) * sizeof(char)); // a Chinese character in the CString may need two bytes
    memset(chRtn, 0, 2 * len + 1);
    USES_CONVERSION;
    strcpy((LPSTR)chRtn, OLE2A(str.LockBuffer()));
    str.UnlockBuffer();
    return chRtn; // the caller must release it with free()
}
***********************************************************************
* Function: GetAnsiString
* Description: Convert a CString (Unicode) to a char * (ANSI)
* Parameter: CString &s, the CString to be converted
* Return value: the conversion result
***********************************************************************
char *GetAnsiString(const CString &s)
{
    int nSize = 2 * s.GetLength(); // up to two bytes per character
    char *pAnsiString = new char[nSize + 1];
    wcstombs(pAnsiString, s, nSize + 1);
    return pAnsiString; // the caller must release it with delete[]
}
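The helpers above all return raw buffers that the caller must free, and none of them check the result of wcstombs. As an alternative, here is a sketch that returns the result in a std::string and reports failure explicitly. It assumes std::wstring in place of CString, and the name to_narrow is hypothetical. The conversion depends on the current C locale, just as wcstombs does in the original functions.

```cpp
#include <climits>
#include <cstdlib>
#include <string>
#include <vector>

// Converts a wide string to a multibyte string in the current C locale.
// Returns false if a character cannot be represented.
bool to_narrow(const std::wstring &src, std::string &out) {
    // MB_LEN_MAX is the largest number of bytes any character can need.
    std::vector<char> buf(src.size() * MB_LEN_MAX + 1);
    const std::size_t n = std::wcstombs(buf.data(), src.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) {
        return false;   // wcstombs reports failure as (size_t)-1
    }
    out.assign(buf.data(), n);
    return true;
}
```

Since size_t is unsigned, the failure value has to be compared against (size_t)-1 rather than tested with a "less than zero" check.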