1. ANSI (that is, MBCS) is a multibyte character set: a variable-length encoding for the world's scripts. For the English alphabet, ANSI is identical to ASCII (one byte per character), but other scripts need more than one byte per character.
2. Unicode (on Windows this means UTF-16) uses two bytes to encode a character. For example, the character 'A' takes one byte under ASCII and two bytes under Unicode, with the high byte filled with 0. A Chinese character such as '程' takes two bytes under ANSI (for example, the GBK code page) and still takes two bytes under Unicode. The point of Unicode is to represent the world's text with fixed-length units. Two bytes cover the characters in common use without ambiguity; characters outside the Basic Multilingual Plane are encoded as a pair of two-byte units (a surrogate pair).
3. Windows programs can use either ANSI or Unicode strings. Which one is used depends on whether the _MBCS macro or the _UNICODE macro is defined (the Windows headers check UNICODE, and the C runtime headers such as tchar.h check _UNICODE). The string pointer type for MBCS is LPSTR (that is, char*), and the one for Unicode is LPWSTR (that is, wchar_t*). To make programs easier to write, Microsoft defines the type LPTSTR: under MBCS it is char*, and under Unicode it is wchar_t*, so switching character sets only requires redefining one macro.
4. Relationship between the pointer types (pointers are 32 bits wide on Win32):
LPSTR: a pointer to a string that occupies 1 byte per character.
LPCSTR: a pointer to a constant string that occupies 1 byte per character.
LPTSTR: a pointer to a string that occupies 1 byte or 2 bytes per character, depending on the build.
LPCTSTR: a pointer to a constant string that occupies 1 byte or 2 bytes per character, depending on the build.
5. Windows uses two character sets, ANSI and Unicode. The former uses one byte for English characters and two bytes for Chinese characters. The latter uses two bytes for both English and Chinese characters.
On Windows NT, all character-related functions come in both versions, while Windows 9x supports only the ANSI versions. _T is generally used with a string constant, such as _T("Hello!"). In an ANSI build _T has no effect; in a Unicode build the compiler stores "Hello!" as Unicode. The difference between _T and L is that L always stores the string as Unicode, regardless of how you compile.
6. L indicates that the string literal is Unicode encoded, for example: wchar_t str[] = L"Hello world!"; Each character is stored in 2 bytes.
7. _T is an adaptive macro. When _UNICODE is defined, _T is the same as L; otherwise a _T string is ANSI encoded. For example: LPTSTR lpStr = new TCHAR[32]; const TCHAR* szBuf = _T("Hello"); Both statements are correct in ANSI builds and in Unicode builds.
8. Microsoft recommends using the matching string functions. For example, when dealing with LPTSTR or LPCTSTR, use _tcslen instead of strlen, because strlen cannot handle wchar_t* strings in a Unicode build.
9. T is a very important symbol (TCHAR, LPTSTR, LPCTSTR, _T() and _TEXT(), etc.). It denotes an intermediate type that names neither MBCS nor Unicode explicitly; which character set is used is decided only at compile time.
10. Converting a CString to LPTSTR: CString path1; LPTSTR path2 = path1.GetBuffer(path1.GetLength()); Call path1.ReleaseBuffer() when you have finished with path2 and before calling other CString methods on path1.
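The compile-time switch described in items 7 to 9 can be illustrated with a small portable sketch. It does not use the real <tchar.h>; the names DemoTChar, DEMO_T, demo_tcslen, demo_byte_size, demo_self_check and the DEMO_UNICODE macro are hypothetical stand-ins invented for this example, assumed to behave roughly like TCHAR, _T, _tcslen and _UNICODE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for TCHAR, _T() and _tcslen from <tchar.h>.
// Define DEMO_UNICODE (for example with -DDEMO_UNICODE) to select the
// wide build, much as _UNICODE does on Windows.
#ifdef DEMO_UNICODE
typedef wchar_t DemoTChar;                 // wide character type
#define DEMO_T(s) L##s                     // pastes the L prefix onto the literal
inline std::size_t demo_tcslen(const DemoTChar* s) { return std::wcslen(s); }
#else
typedef char DemoTChar;                    // narrow (ANSI/MBCS) character type
#define DEMO_T(s) s                        // leaves the literal unchanged
inline std::size_t demo_tcslen(const DemoTChar* s) { return std::strlen(s); }
#endif

// Bytes occupied by the characters of s, excluding the terminator.
// sizeof(wchar_t) is 2 on Windows; other platforms commonly use 4.
inline std::size_t demo_byte_size(const DemoTChar* s) {
    return demo_tcslen(s) * sizeof(DemoTChar);
}

// The same source lines are valid in both builds; only the types change.
inline void demo_self_check() {
    const DemoTChar* text = DEMO_T("Hello");
    assert(demo_tcslen(text) == 5);                         // counts characters, not bytes
    assert(demo_byte_size(text) == 5 * sizeof(DemoTChar));  // bytes depend on the build
}
```

Calling demo_self_check() from a program built with and without DEMO_UNICODE exercises both paths. Real Windows code would include <tchar.h> and use TCHAR, _T and _tcslen directly rather than these stand-ins.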