The meanings of, and differences between, LPSTR, LPCSTR, LPTSTR, LPCTSTR, LPWSTR and LPCWSTR
1, ANSI (that is, MBCS): a multibyte character set, i.e. a variable-length encoding for the world's scripts. ANSI represents the English alphabet the same way ASCII does, with one byte per character, but needs multiple bytes to represent characters of other scripts.
2, Unicode: an encoding that uses two bytes per character. For example, the character 'A' takes one byte in ASCII but two bytes in Unicode, with the high byte filled with 0; a Chinese character takes two bytes in ANSI (MBCS) and also two bytes in Unicode. The purpose of Unicode is to give the world's text a fixed-length encoding. Strictly speaking, "Unicode" on Windows means UTF-16: two bytes cover every character in the Basic Multilingual Plane, which includes almost all text in everyday use, while characters outside that plane take a surrogate pair (four bytes).
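The byte counts in point 2 can be checked with a small portable sketch. It uses the standard C++ type char16_t as a stand-in for a Windows UTF-16 code unit; the helper names narrow_bytes, utf16_bytes and high_byte are made up for this illustration and are not part of any Windows API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption of this sketch: char16_t is exactly 2 bytes, as on mainstream platforms.
static_assert(sizeof(char16_t) == 2, "sketch assumes a 2-byte char16_t");

// Bytes occupied by the characters of a narrow (ANSI-style) string,
// not counting the terminating null.
std::size_t narrow_bytes(const std::string& s) {
    return s.size() * sizeof(char);
}

// Bytes occupied by a UTF-16 string; each element is one 16-bit code unit.
std::size_t utf16_bytes(const std::u16string& s) {
    return s.size() * sizeof(char16_t);
}

// High byte of a single UTF-16 code unit.
unsigned high_byte(char16_t unit) {
    return (static_cast<unsigned>(unit) >> 8) & 0xFFu;
}
```

With these helpers, "A" occupies one byte as a narrow string and two bytes as UTF-16 with a zero high byte, a Chinese character such as U+4E2D occupies two bytes in UTF-16, and a character outside the Basic Multilingual Plane such as U+1F600 occupies four.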
3, Windows programs can handle strings in either encoding, ANSI or Unicode; which one is used depends on whether the _MBCS macro or the UNICODE (_UNICODE) macro is defined. The string pointer for MBCS is LPSTR (that is, char*), and the one for Unicode is LPWSTR (that is, wchar_t*). To make programs easier to write, Microsoft defines the type LPTSTR: under MBCS it means char*, and under Unicode it means wchar_t*, so switching character sets only requires changing one macro definition.
4. Relationship
LPSTR: a pointer to a string of 1-byte (8-bit) characters. Equivalent to char *
LPCSTR: a pointer to a constant string of 1-byte (8-bit) characters. Equivalent to const char *
LPTSTR: a pointer to a string whose characters occupy 1 byte or 2 bytes each, depending on whether UNICODE is defined
LPCTSTR: a pointer to a constant string whose characters occupy 1 byte or 2 bytes each, depending on whether UNICODE is defined
LPWSTR: a pointer to a Unicode string, in which each character occupies 2 bytes.
LPCWSTR: a pointer to a constant Unicode string, in which each character occupies 2 bytes.
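In the type names above, L means long, P means pointer, C means constant, T means that the character width depends on whether UNICODE is defined, W means wide, and STR means string. The L is a leftover from 16-bit Windows; on Win32 these are ordinary 32-bit pointers, and in 64-bit builds they are 64 bits wide.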
LPSTR = char *
LPCSTR = const char *
LPTSTR = _TCHAR * (or TCHAR *)
LPCTSTR = const _TCHAR * (or const TCHAR *)
LPWSTR = wchar_t *
LPCWSTR = const wchar_t *
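The practical effect of the C in these names is that a function taking LPCSTR promises not to modify the string, while one taking LPSTR may write to it. The sketch below illustrates this with stand-in typedefs in a namespace called sketch, so that it compiles without the Windows headers; the real typedefs come from <windows.h>, and the functions length and capitalize_first are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

namespace sketch {
// Stand-ins for the typedefs that <windows.h> provides (assumption of this sketch).
typedef char*          LPSTR;
typedef const char*    LPCSTR;
typedef wchar_t*       LPWSTR;
typedef const wchar_t* LPCWSTR;

// Read-only parameter: accepts string literals and writable buffers alike.
std::size_t length(LPCSTR s) {
    return std::strlen(s);
}

// Writable parameter: needs a modifiable buffer, because it changes s[0].
void capitalize_first(LPSTR s) {
    if (s[0] >= 'a' && s[0] <= 'z') {
        s[0] = static_cast<char>(s[0] - 'a' + 'A');
    }
}
}  // namespace sketch

// The C variants differ from the plain ones only by const on the pointee.
static_assert(std::is_same<sketch::LPCSTR, const char*>::value, "LPCSTR is const char*");
static_assert(std::is_same<sketch::LPCWSTR, const wchar_t*>::value, "LPCWSTR is const wchar_t*");
```

Passing a string literal to capitalize_first would be rejected by a conforming C++ compiler, which is exactly the kind of mistake the C variants are meant to catch.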
5. Windows uses two character sets, ANSI and Unicode. The former uses one byte for English characters and two bytes for Chinese characters. The latter uses two bytes for both English and Chinese characters. On Windows NT, every character-related function is available in both forms, while Windows 9x supports only the ANSI form. _T is generally used with a string literal, such as _T("Hello!"). If the program is compiled as ANSI, _T has no effect; if it is compiled as Unicode, the compiler stores "Hello!" as a Unicode string. The difference between _T and L is that a literal with the L prefix is stored as Unicode no matter how the program is compiled.
6, The L prefix indicates that the string literal is Unicode (wide-character) encoded, for example:
wchar_t str[] = L"Hello world!"; // on Windows, each character is stored in 2 bytes
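A quick way to see what the L prefix does is to measure the array it produces. The sketch below is portable C++; the names kGreeting, greeting_elements and greeting_bytes are invented for the example. Note that the 2-bytes-per-character figure is specific to Windows: sizeof(wchar_t) is 2 there but commonly 4 on Linux and macOS, so the byte count is expressed in terms of sizeof(wchar_t).

```cpp
#include <cassert>
#include <cstddef>

// The L prefix makes the literal an array of wchar_t, including the
// terminating null character.
const wchar_t kGreeting[] = L"Hello world!";

// Number of wchar_t elements in the array (12 characters plus 1 null).
std::size_t greeting_elements() {
    return sizeof(kGreeting) / sizeof(kGreeting[0]);
}

// Total storage in bytes; depends on sizeof(wchar_t) on the target platform.
std::size_t greeting_bytes() {
    return sizeof(kGreeting);
}
```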
7, _T is an adaptive macro. When _UNICODE is defined, _T has the same effect as L; otherwise, the _T string is ANSI encoded. For example:
LPTSTR lpStr = new TCHAR[32];
const TCHAR* szBuf = _T("Hello"); // const, because standard C++ does not allow a literal to bind to a non-const pointer
Both statements above are correct in ANSI mode and in Unicode mode.
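How a single macro can serve both modes is easier to see in a stripped-down imitation. The sketch below is loosely modelled on the token-pasting approach used by <tchar.h>, but SKETCH_UNICODE, SKETCH_TCHAR and SKETCH_T are hypothetical names chosen so the code compiles without the Windows headers; it is an illustration, not the actual header contents.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical switch standing in for _UNICODE; set it to 0 for the ANSI variant.
#define SKETCH_UNICODE 1

#if SKETCH_UNICODE
typedef wchar_t SKETCH_TCHAR;
#define SKETCH_T(x) L##x   // paste the L prefix onto the literal
#else
typedef char SKETCH_TCHAR;
#define SKETCH_T(x) x      // leave the literal unchanged
#endif

// Compiles unchanged in either mode, because the pointer type and the
// literal both follow the same switch.
const SKETCH_TCHAR* greeting() {
    return SKETCH_T("Hello");
}

// Size in bytes of one character under the selected mode.
std::size_t char_size() {
    return sizeof(SKETCH_TCHAR);
}
```

Flipping SKETCH_UNICODE to 0 changes both the typedef and the macro together, which is why code written only in terms of the T names does not need editing when the character set changes.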
8, Microsoft recommends using the matching string functions. For example, when handling LPTSTR or LPCTSTR, use _tcslen instead of strlen. strlen takes a const char* and cannot process a wchar_t* string in a Unicode build.
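_tcslen is itself a macro that resolves to strlen or wcslen depending on the build. A portable way to illustrate the same idea, without the Windows headers, is a pair of overloads that select the matching C library function from the argument type. The name sketch_tcslen is hypothetical and exists only in this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical helper, not part of any Windows header: overload resolution
// picks the C library function that matches the character type.
std::size_t sketch_tcslen(const char* s) {
    return std::strlen(s);   // counts char elements
}

std::size_t sketch_tcslen(const wchar_t* s) {
    return std::wcslen(s);   // counts wchar_t elements
}
```

Both calls below return the number of characters, not the number of bytes, which is what code working with TCHAR buffers usually wants:
sketch_tcslen("Hello") and sketch_tcslen(L"Hello") each give 5.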
9. T is a very important symbol (TCHAR, LPTSTR, LPCTSTR, _T() and _TEXT(), etc.). It indicates an intermediate type that commits explicitly to neither MBCS nor Unicode. Which character set is actually used is decided only at compile time.
10, Note how L and _T are used.
LPTSTR, LPCTSTR, and _T (string literals) are all affected by whether UNICODE (_UNICODE) is defined; L is not.
The character types involved are: char, wchar_t, TCHAR, CHAR, WCHAR.
Definition of TCHAR:
#ifdef UNICODE
typedef wchar_t TCHAR;
#else
typedef char TCHAR;
#endif
typedef char CHAR;
typedef wchar_t WCHAR;
As you can see, char and wchar_t are the basic data types, CHAR is defined as char, WCHAR is defined as wchar_t, and TCHAR differs depending on whether UNICODE is defined.
Five type names can therefore be used in a program: char (CHAR), wchar_t (WCHAR), and TCHAR. TCHAR is recommended for extensibility and compatibility.
Reference: http://weihe6666.iteye.com/blog/1300698