C language: wide-character operation functions (Unicode encoding)

Source: Internet
Author: User

Each entry below gives the wide-character function, the corresponding ordinary C function (where one exists), and a description.

Character classification:
iswalnum() / isalnum(): tests whether the character is a letter or a digit
iswalpha() / isalpha(): tests whether the character is a letter
iswcntrl() / iscntrl(): tests whether the character is a control character
iswdigit() / isdigit(): tests whether the character is a decimal digit
iswgraph() / isgraph(): tests whether the character is a visible character
iswlower() / islower(): tests whether the character is a lowercase character
iswprint() / isprint(): tests whether the character is a printable character
iswpunct() / ispunct(): tests whether the character is a punctuation mark
iswspace() / isspace(): tests whether the character is a whitespace character
iswupper() / isupper(): tests whether the character is an uppercase character
iswxdigit() / isxdigit(): tests whether the character is a hexadecimal digit

Case conversion:
towlower() / tolower(): converts a character to lower case
towupper() / toupper(): converts a character to upper case

Character comparison:
wcscoll() / strcoll(): compares two strings according to the locale's collation order

Date and time conversion:
strftime(): formats a date and time according to the specified format string and the locale
wcsftime(): formats a date and time according to the specified format string and the locale, and returns a wide string
strptime(): converts a string to a time value according to the specified format; the inverse of strftime()

Formatted printing and scanning:
fprintf() / fwprintf(): formatted output to a stream, using a variable argument list
fscanf() / fwscanf(): formatted read from a stream
printf() / wprintf(): formatted output to the standard output, using a variable argument list
scanf() / wscanf(): formatted read from the standard input
sprintf() / swprintf(): formatted output into a string, using a variable argument list
sscanf() / swscanf(): formatted read from a string
vfprintf() / vfwprintf(): formatted output to a stream, using a stdarg argument list
vprintf() / vwprintf(): formatted output to the standard output, using a stdarg argument list
vsprintf() / vswprintf(): formatted output into a string, using a stdarg argument list

Numeric conversion:
wcstod() / strtod(): converts the initial part of a wide string to a double-precision floating-point number
wcstol() / strtol(): converts the initial part of a wide string to a long integer
wcstoul() / strtoul(): converts the initial part of a wide string to an unsigned long integer

Multi-byte and wide-character conversion:
mblen(): determines the number of bytes in a multi-byte character, according to the locale
mbstowcs(): converts a multi-byte string to a wide string
mbtowc() / btowc(): converts a multi-byte character (btowc(): a single byte) to a wide character
wcstombs(): converts a wide string to a multi-byte string
wctomb() / wctob(): converts a wide character to a multi-byte character (wctob(): a single byte)

Input and output:
fgetwc() / fgetc(): reads a character from a stream and converts it to a wide character
fgetws() / fgets(): reads a string from a stream and converts it to a wide string
fputwc() / fputc(): converts a wide character to a multi-byte character and writes it to a stream
fputws() / fputs(): converts a wide string to multi-byte characters and writes it to a stream
getwc() / getc(): reads a character from a stream and converts it to a wide character
getwchar() / getchar(): reads a character from the standard input and converts it to a wide character
(none) / gets(): use fgetws() instead
putwc() / putc(): converts a wide character to a multi-byte character and writes it to a stream
putwchar() / putchar(): converts a wide character to a multi-byte character and writes it to the standard output
(none) / puts(): use fputws() instead
ungetwc() / ungetc(): pushes a wide character back onto the input stream

String operations:
wcscat() / strcat(): appends one string to the end of another
wcsncat() / strncat(): like wcscat(), but with a limit on the number of characters appended
wcschr() / strchr(): finds the first occurrence of a character
wcsrchr() / strrchr(): finds the last occurrence of a character
wcspbrk() / strpbrk(): finds the first occurrence in one string of any character from another string
wcswcs(), wcsstr() / strstr(): finds the first occurrence of one string inside another
wcscspn() / strcspn(): returns the length of the initial segment that contains no character from the second string
wcsspn() / strspn(): returns the length of the initial segment that consists only of characters from the second string
wcscpy() / strcpy(): copies a string
wcsncpy() / strncpy(): like wcscpy(), but with a limit on the number of characters copied
wcscmp() / strcmp(): compares two wide strings
wcsncmp() / strncmp(): like wcscmp(), but with a limit on the number of characters compared
wcslen() / strlen(): returns the number of wide characters in a string
wcstok() / strtok(): splits a wide string into a series of tokens according to the delimiters
wcswidth() / (none): returns the display width of a wide string
wcwidth() / (none): returns the display width of a wide character

Memory operations:
wmemcpy(), wmemchr(), wmemcmp(), wmemmove(), and wmemset() correspond to memcpy(), memchr(), memcmp(), memmove(), and memset().
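As a rough illustration of how a few of the functions listed above fit together, here is a minimal sketch in standard C. The helper names wide_upper, wide_count_digits, and wide_parse_long are made up for this example; only towupper(), iswdigit(), and wcstol() come from the list. It assumes the default "C" locale and ASCII input.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: upper-case a wide string in place with towupper(). */
wchar_t *wide_upper(wchar_t *s)
{
    assert(s != NULL);
    for (wchar_t *p = s; *p != L'\0'; ++p)
        *p = (wchar_t)towupper((wint_t)*p);
    return s;
}

/* Hypothetical helper: count the decimal digits in a wide string with iswdigit(). */
size_t wide_count_digits(const wchar_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != L'\0'; ++s)
        if (iswdigit((wint_t)*s))
            ++n;
    return n;
}

/* Hypothetical helper: parse a leading base-10 integer with wcstol().
 * If rest is not NULL, *rest receives the unparsed tail of the string. */
long wide_parse_long(const wchar_t *s, const wchar_t **rest)
{
    assert(s != NULL);
    wchar_t *end = NULL;
    long value = wcstol(s, &end, 10);
    if (rest != NULL)
        *rest = end;
    return value;
}
```

For example, wide_parse_long(L"123xyz", &rest) yields 123 and leaves rest pointing at L"xyz", in the same way strtol() would for a char string.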

 


Unicode and multi-byte character sets

In a Unicode build, each CString character occupies 16 bits; in an ANSI (multi-byte) build, it occupies 8 bits.
A char array stays the same whichever build environment is used.
When you do not know the environment at the other end, it is best to transmit data using a specific, fixed variable type.
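One way to read the advice above is to serialize text into fixed-width units before sending it, because the size of wchar_t differs between platforms. The sketch below is only an illustration under stated assumptions: pack_u16be is a hypothetical helper, it accepts only characters in the range 0 to 0xFFFF (excluding surrogates), and it picks big-endian byte order arbitrarily. A real protocol would have to agree on the encoding and byte order with the receiving end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper: write a wide string as 16-bit big-endian units.
 * Assumption: every character is <= 0xFFFF and is not a surrogate;
 * anything else is rejected rather than encoded.
 * Returns the number of bytes written, or (size_t)-1 on failure. */
size_t pack_u16be(const wchar_t *src, uint8_t *dst, size_t cap)
{
    assert(src != NULL && dst != NULL);
    size_t n = 0;                           /* bytes written so far */
    for (; *src != L'\0'; ++src) {
        unsigned long c = (unsigned long)*src;
        if (c > 0xFFFFul || (c >= 0xD800ul && c <= 0xDFFFul))
            return (size_t)-1;              /* outside the assumed range */
        if (cap - n < 2)
            return (size_t)-1;              /* output buffer too small */
        dst[n++] = (uint8_t)(c >> 8);       /* high byte first */
        dst[n++] = (uint8_t)(c & 0xFFu);    /* then low byte */
    }
    return n;
}
```

The output is a plain array of uint8_t, so its layout does not depend on whether the sender was built with an 8-bit, 16-bit, or 32-bit character type.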

In mrp, what is the difference between Unicode encoding and GB encoding?

UNICODE, GB2312, and ANSI
What is ANSI and what is UNICODE? They are two different ways of encoding characters. ANSI code pages are built on 8-bit units, while UNICODE as Windows uses it (UTF-16) is built on 16-bit units. An ANSI code page such as GB2312 stores an English character in one byte and a Chinese character in two bytes, whereas in UTF-16 English and Chinese characters alike normally take two bytes. UNICODE is an international standard, and its two-byte encoding is not compatible with ANSI encodings. It is now used in networking, in Windows, and in much large software. A single 8-bit byte can represent only 256 characters, which is more than enough for the 26 English letters but far too few for scripts with thousands of characters, such as Chinese characters and Korean Hangul. That is why the UNICODE standard was introduced.
In software development, especially when using the string-processing functions of the C language, ANSI and UNICODE have to be distinguished. How are ANSI and UNICODE strings defined, how are they used, and how do we convert between them?
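For the conversion question, standard C offers mbstowcs() and wcstombs(). The following is a minimal sketch, not a complete solution: to_wide and to_multibyte are hypothetical wrapper names, and the result depends on the current locale. In the default "C" locale only ASCII is guaranteed to convert; for other text a program would normally call setlocale(LC_ALL, "") first. On Windows, MultiByteToWideChar() and WideCharToMultiByte() are the usual counterparts, which are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical wrapper: multi-byte string -> wide string.
 * cap is the capacity of dst in wide characters.
 * Returns the number of wide characters stored (terminator excluded),
 * or (size_t)-1 if the input is invalid or does not fit. */
size_t to_wide(const char *src, wchar_t *dst, size_t cap)
{
    assert(src != NULL && dst != NULL && cap > 0);
    size_t n = mbstowcs(dst, src, cap);
    if (n == (size_t)-1)
        return n;               /* invalid sequence for the current locale */
    if (n == cap)
        return (size_t)-1;      /* no room left for the terminator */
    return n;
}

/* Hypothetical wrapper: wide string -> multi-byte string.
 * cap is the capacity of dst in bytes. Same return convention. */
size_t to_multibyte(const wchar_t *src, char *dst, size_t cap)
{
    assert(src != NULL && dst != NULL && cap > 0);
    size_t n = wcstombs(dst, src, cap);
    if (n == (size_t)-1)
        return n;               /* character not representable in the locale */
    if (n == cap)
        return (size_t)-1;      /* no room left for the terminator */
    return n;
}
```

Both wrappers treat a full buffer as an error, because mbstowcs() and wcstombs() do not write a terminator when the output exactly fills the buffer.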
I. Definition:
ANSI: char str[1024];
UNICODE: wchar_t str[1024];
II. Available functions:
ANSI (char): strcat(), strcpy(), strlen(), and the other functions whose names begin with str.
UNICODE (wchar_t): wcscat(), wcscpy(), wcslen(), and the other functions whose names begin with wcs.
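The two families can be compared side by side. The sketch below performs the same copy-and-append once with the str functions on char and once with the wcs functions on wchar_t. The helpers join_narrow and join_wide are hypothetical names introduced for this illustration. Note that the capacity is counted in elements (char or wchar_t), not in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: dst = a followed by b, using the str family.
 * cap is the capacity of dst in chars.
 * Returns the resulting length, or (size_t)-1 if dst is too small. */
size_t join_narrow(char *dst, size_t cap, const char *a, const char *b)
{
    assert(dst != NULL && a != NULL && b != NULL);
    if (strlen(a) + strlen(b) + 1 > cap)
        return (size_t)-1;
    strcpy(dst, a);
    strcat(dst, b);
    return strlen(dst);
}

/* Hypothetical helper: the same operation using the wcs family.
 * cap is the capacity of dst in wide characters. */
size_t join_wide(wchar_t *dst, size_t cap, const wchar_t *a, const wchar_t *b)
{
    assert(dst != NULL && a != NULL && b != NULL);
    if (wcslen(a) + wcslen(b) + 1 > cap)
        return (size_t)-1;
    wcscpy(dst, a);
    wcscat(dst, b);
    return wcslen(dst);
}
```

Apart from the type and the L prefix on literals, the two versions are line-for-line identical, which is what makes the macro-based switching described later in this article possible.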
III. System support
Windows 98: only ANSI is supported.
Windows 2000: both ANSI and UNICODE are supported.
Windows CE: only UNICODE is supported.
Notes:
1. COM supports only UNICODE.
2. Windows 2000 is UNICODE-based throughout, so using ANSI on Windows 2000 comes at a price: although you write no conversion code yourself, the system converts the encoding behind the scenes, and this hidden conversion consumes CPU time and memory.
3. To use UNICODE on Windows 98, you must convert the encoding manually.
4. How to differentiate:
In software development we often need to support both ANSI and UNICODE. Rewriting every string type and every string function call each time the character type changes is impractical, so the standard C runtime library and Windows provide a macro-based mechanism.
The C runtime library uses the _UNICODE macro (with an underscore), and the Windows headers use the UNICODE macro (without an underscore). If both _UNICODE and UNICODE are defined, the code is compiled as the UNICODE version; otherwise it is compiled as the ANSI version.
Defining the macros alone does not convert anything automatically; a set of character type definitions is also needed to support them.
1. TCHAR
If the UNICODE macro is defined, TCHAR is defined as wchar_t:
typedef wchar_t TCHAR;
Otherwise, TCHAR is defined as char:
typedef char TCHAR;
2. LPTSTR
If the UNICODE macro is defined, LPTSTR is defined as LPWSTR, a pointer to a wide-character string. (I never knew what LPWSTR was before, and I finally understand it.)
typedef LPWSTR LPTSTR;
Otherwise, LPTSTR is defined as LPSTR:
typedef LPSTR LPTSTR;
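The macro switch described above can be imitated in portable C. Since tchar.h is specific to Windows compilers, the sketch below uses hypothetical stand-ins: DEMO_UNICODE plays the role of _UNICODE, demo_tchar the role of TCHAR, DEMO_T() the role of _T(), and demo_tcslen the role of _tcslen. It is meant only to show the pattern, not the actual contents of the Windows headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* All DEMO_ and demo_ names are hypothetical; they imitate the
 * _UNICODE / TCHAR / _T() / _tcslen pattern for illustration only. */
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;         /* wide build: characters are wchar_t */
#define DEMO_T(s) L##s              /* turn "abc" into L"abc" */
#define demo_tcslen wcslen          /* map the generic name to the wcs function */
#else
typedef char demo_tchar;            /* narrow build: characters are char */
#define DEMO_T(s) s                 /* leave the literal unchanged */
#define demo_tcslen strlen          /* map the generic name to the str function */
#endif

/* The same source line compiles against char or wchar_t,
 * depending only on whether DEMO_UNICODE is defined. */
size_t demo_length(const demo_tchar *s)
{
    assert(s != NULL);
    return demo_tcslen(s);
}
```

Compiling with -DDEMO_UNICODE (or the equivalent option of your compiler) switches every use of demo_tchar, DEMO_T(), and demo_tcslen to the wide version without touching the calling code, for example demo_length(DEMO_T("TCHAR")).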
Addendum:
UTF-8 can be used for actual byte-stream transmission, while Unicode is the encoding scheme itself.
In my understanding, UTF-8 is one specific implementation of Unicode. Similar to ......
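To make the relationship between Unicode and UTF-8 concrete, here is a small sketch that turns one Unicode code point into its UTF-8 byte sequence. The function name utf8_encode is made up for this example; the bit layout follows the standard UTF-8 definition (1 to 4 bytes, with surrogates and values above 0x10FFFF rejected).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
 * out must have room for 4 bytes.
 * Returns the number of bytes written, or 0 for an invalid code point. */
size_t utf8_encode(uint32_t cp, uint8_t out[4])
{
    assert(out != NULL);
    if (cp <= 0x7Fu) {                          /* 0xxxxxxx */
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp <= 0x7FFu) {                         /* 110xxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xC0u | (cp >> 6));
        out[1] = (uint8_t)(0x80u | (cp & 0x3Fu));
        return 2;
    }
    if (cp >= 0xD800u && cp <= 0xDFFFu)
        return 0;                               /* surrogates are not characters */
    if (cp <= 0xFFFFu) {                        /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xE0u | (cp >> 12));
        out[1] = (uint8_t)(0x80u | ((cp >> 6) & 0x3Fu));
        out[2] = (uint8_t)(0x80u | (cp & 0x3Fu));
        return 3;
    }
    if (cp <= 0x10FFFFu) {                      /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xF0u | (cp >> 18));
        out[1] = (uint8_t)(0x80u | ((cp >> 12) & 0x3Fu));
        out[2] = (uint8_t)(0x80u | ((cp >> 6) & 0x3Fu));
        out[3] = (uint8_t)(0x80u | (cp & 0x3Fu));
        return 4;
    }
    return 0;                                   /* beyond the Unicode range */
}
```

With this function, the letter A (U+0041) stays a single byte 0x41, while the Chinese character U+4E2D becomes the three bytes 0xE4 0xB8 0xAD. That is why English text has the same bytes in UTF-8 as in ASCII, while Chinese text takes more bytes per character.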
