Chapter 2: Unicode Character Set learning notes

Source: Internet
Author: User
Tags: uppercase letter

Notes on studying the Unicode Character Set in Windows programming:

1: The C language supports Unicode through its support for wide characters.

2: Wide characters in C are based on the wchar_t data type, which is defined in several header files, including wchar.h, as follows: typedef unsigned short wchar_t; The wchar_t data type is therefore the same as an unsigned short integer, which is 16 bits wide.

3: To define a variable that contains a single wide character, use the following statement: wchar_t c = 'A';
The variable c is the two-byte value 0x0041, which is the Unicode representation of the capital letter A. (However, because Intel microprocessors store multibyte values with the least significant byte first, the bytes are actually stored in memory in the order 0x41, 0x00. Keep this in mind if you examine how Unicode text is stored in memory.)
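The byte order described in note 3 can be checked with a short sketch. The function names here are made up for illustration, and unsigned short stands in for the 16-bit wchar_t from note 2 (this assumes unsigned short is 2 bytes wide, which is common but not guaranteed by the C standard).

```c
#include <string.h>

/* Copy the in-memory bytes of a 16-bit value into out[0] and out[1].
   unsigned short stands in for the 16-bit wchar_t described in note 2;
   the sketch assumes it is 2 bytes wide. */
void bytes_of_u16(unsigned short value, unsigned char out[2])
{
    memcpy(out, &value, 2);
}

/* Returns 1 if the machine stores the low byte first (little-endian,
   as Intel processors do), and 0 otherwise. */
int stores_low_byte_first(void)
{
    unsigned char b[2];
    bytes_of_u16(0x0041, b);   /* 0x0041 is the Unicode code for 'A' */
    return b[0] == 0x41 && b[1] == 0x00;
}
```

On an Intel machine, stores_low_byte_first() should return 1, which matches the 0x41, 0x00 order mentioned above.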

4: You can define a pointer to a wide string: wchar_t * p = L"Hello!";
Note the uppercase letter L (for "long") immediately before the first quotation mark.
If you forget to include the L, the C compiler usually issues a warning or an error message.

5: The wide-character version of the strlen function is wcslen (wide-character string length). It is declared in both string.h (where strlen is also declared) and wchar.h. The strlen function is declared as follows:
size_t __cdecl strlen (const char *);
The wcslen function is declared as follows:
size_t __cdecl wcslen (const wchar_t *);
To obtain the length of a wide string, where pw points to the wide string "Hello!", call
iLength = wcslen (pw);
The function returns 6, the number of characters in the string. Switching to wide characters does not change the character length of the string; only its length in bytes changes.
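The difference between character length and byte length can be sketched as follows. The helper names are hypothetical. The byte count is written in terms of sizeof(wchar_t) because wchar_t is 2 bytes under Windows but is commonly 4 bytes on other platforms.

```c
#include <string.h>
#include <wchar.h>

/* Number of characters in a wide string (terminator excluded). */
size_t wide_char_count(const wchar_t *s)
{
    return wcslen(s);
}

/* Number of bytes the same characters occupy (terminator excluded). */
size_t wide_byte_count(const wchar_t *s)
{
    return wcslen(s) * sizeof(wchar_t);
}
```

For L"Hello!", wide_char_count returns 6, the same value strlen gives for "Hello!", while wide_byte_count returns 6 * sizeof(wchar_t), which is 12 where wchar_t is 16 bits wide.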

6: All the C run-time library functions that take string arguments have wide-character versions.

7: If the identifier named _UNICODE is defined and the program includes the tchar.h header file, _tcslen is defined as wcslen:
#define _tcslen wcslen
If _UNICODE is not defined, _tcslen is defined as strlen:
#define _tcslen strlen

8: If the _UNICODE identifier is defined, a macro called __T is defined as follows:
#define __T(x) L ## x
This is rather obscure syntax, but it conforms to the ANSI C standard for the preprocessor. The pair of number signs (##) is called "token paste"; it joins the letter L to the macro argument. Therefore, if the macro argument is "Hello!", then L ## x is L"Hello!".
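The mechanism in notes 7 and 8 can be imitated without tchar.h. In the sketch below, MY_UNICODE, MY_T, my_tchar and my_tcslen are hypothetical stand-ins, not the real tchar.h names. The non-Unicode branch, where the macro leaves its argument unchanged, is an assumption that the notes do not spell out.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for the tchar.h names. Define MY_UNICODE
   before this point to select the wide-character versions. */
#ifdef MY_UNICODE
typedef wchar_t my_tchar;
#define MY_T(x)     L ## x      /* token paste: L ## "Hi" becomes L"Hi" */
#define my_tcslen   wcslen
#else
typedef char my_tchar;
#define MY_T(x)     x           /* assumed: argument passes through unchanged */
#define my_tcslen   strlen
#endif

/* The same source line compiles for either character width. */
size_t greeting_length(void)
{
    const my_tchar *greeting = MY_T("Hello!");
    return my_tcslen(greeting);
}
```

greeting_length() returns 6 whether or not MY_UNICODE is defined; only the storage width of the string changes.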

9: Windows defines the following character data types:
typedef char CHAR;
typedef wchar_t WCHAR; // wc
When you need to define 8-bit or 16-bit characters in a Windows program, CHAR and WCHAR are the recommended data types.
The comment following the WCHAR definition is the suggested Hungarian notation: a variable based on the WCHAR data type can be prefixed with the letters wc to indicate a wide character.

10: The winnt.h header file defines six data types that can be used as pointers to 8-bit character strings and four data types that can be used as pointers to const 8-bit character strings. The following statements are condensed from the header file to show these data types:
typedef CHAR * PCHAR, * LPCH, * PCH, * NPSTR, * LPSTR, * PSTR;
typedef CONST CHAR * LPCCH, * PCCH, * LPCSTR, * PCSTR;
The prefixes N and L stand for "near" and "long" and refer to the two different pointer sizes in 16-bit Windows. In Win32 there is no difference between near and long pointers.
Similarly, winnt.h defines six data types that can be used as pointers to 16-bit character strings and four data types that can be used as pointers to const 16-bit character strings:
typedef WCHAR * PWCHAR, * LPWCH, * PWCH, * NWPSTR, * LPWSTR, * PWSTR;
typedef CONST WCHAR * LPCWCH, * PCWCH, * LPCWSTR, * PCWSTR;

11: In user32.dll, there is no entry point for a 32-bit function named MessageBox. Instead, there are two entry points: MessageBoxA (the ASCII version) and MessageBoxW (the wide-character version).

Windows also provides its own set of string functions:
iLength = lstrlen (pString);
pString = lstrcpy (pString1, pString2);
pString = lstrcpyn (pString1, pString2, iCount);
pString = lstrcat (pString1, pString2);
iComp = lstrcmp (pString1, pString2);
iComp = lstrcmpi (pString1, pString2);

These functions work the same as their C library equivalents. If the UNICODE identifier is defined, they accept wide-character strings; otherwise, they accept only regular strings. The wide-character version of the lstrlen function, lstrlenW, is functional under Windows 98.
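The way a generic name such as MessageBox or lstrlen resolves to an A or a W entry point can be sketched with a pair of ordinary functions and a macro. Everything named below (text_length_a, text_length_w, text_length, MY_TEXT, MY_UNICODE) is hypothetical and is not part of the Windows API. The point is only that both versions exist side by side and that the generic name is chosen at compile time.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical pair of functions standing in for an "A" and a "W"
   entry point. Both always exist and can be called by explicit name. */
size_t text_length_a(const char *s)    { return strlen(s); }
size_t text_length_w(const wchar_t *s) { return wcslen(s); }

/* The generic name is only a macro. MY_UNICODE (hypothetical) selects
   which of the two real functions it refers to. */
#ifdef MY_UNICODE
#define text_length text_length_w
#define MY_TEXT(x)  L ## x
#else
#define text_length text_length_a
#define MY_TEXT(x)  x
#endif

/* A call through the generic name compiles for either setting. */
size_t generic_call(void)
{
    return text_length(MY_TEXT("Hello!"));
}
```

A program can still call text_length_w(L"Hello!") directly without defining MY_UNICODE, in the same way that lstrlenW can be called by its explicit name.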

12: You can use sprintf and the other functions in the sprintf family to display text. These functions work just like printf, except that they write the formatted output to the string buffer you supply as the first argument. You can then do whatever you want with that string (for example, pass it to MessageBox).
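A minimal sketch of formatting into a buffer follows. It uses snprintf, the bounded member of the sprintf family, rather than sprintf itself so that the buffer size is checked. The function name format_sum is made up for illustration.

```c
#include <stdio.h>

/* Format a message into buf (capacity n) instead of printing it.
   Returns snprintf's result: the length the complete text requires. */
int format_sum(char *buf, size_t n, int a, int b)
{
    return snprintf(buf, n, "The sum of %d and %d is %d", a, b, a + b);
}
```

After format_sum(buf, sizeof buf, 5, 3) with a sufficiently large buffer, buf holds "The sum of 5 and 3 is 8" and can be passed on to any function that expects a string.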

13: The Windows wsprintf and wvsprintf functions are functionally the same as sprintf and vsprintf, except that they do not handle floating-point formatting.
