Unicode wide-byte character set

Source: Internet
Author: User

1. How do I get the number of characters in a string that contains both single-byte and double-byte characters?

You can call the _mbslen function in the Microsoft Visual C++ run-time library to work with multibyte (mixed single-byte and double-byte) strings.

The strlen function does not tell you how many characters a string contains; it only tells you how many bytes come before the terminating 0.
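
The difference between bytes and characters can be illustrated with a small portable sketch. It does not call _mbslen; it assumes the Shift-JIS lead-byte ranges (0x81-0x9F and 0xE0-0xFC) instead of querying the current code page, and the names is_lead_byte_sjis and dbcs_char_count are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>   /* strlen, used for comparison in the usage note */

/* Assumption: Shift-JIS lead bytes fall in 0x81-0x9F or 0xE0-0xFC. */
static int is_lead_byte_sjis(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Hypothetical helper: counts characters, not bytes, in a 0-terminated string. */
size_t dbcs_char_count(const char *s)
{
    size_t count = 0;
    assert(s != NULL);
    while (*s != '\0') {
        /* A lead byte followed by a non-zero byte forms one two-byte character. */
        if (is_lead_byte_sjis((unsigned char)*s) && s[1] != '\0')
            s += 2;
        else
            s += 1;
        ++count;
    }
    return count;
}
```

For the four-byte string "A\x82\xA0" "B" (the letter A, one double-byte character, the letter B), strlen reports 4 bytes while dbcs_char_count reports 3 characters.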

2. How do I manipulate DBCS (double-byte character set) strings?

Function    Description

PTSTR CharNext(LPCTSTR pszCurrentChar);    Returns the address of the next character in a string

PTSTR CharPrev(LPCTSTR pszStart, LPCTSTR pszCurrentChar);    Returns the address of the previous character in a string

BOOL IsDBCSLeadByte(BYTE bTestChar);    Returns a nonzero value if the byte is the first byte of a DBCS character
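
To show why stepping backward needs the start of the string, here is a portable sketch of the same idea. It is not the Windows implementation: dbcs_next, dbcs_prev, and sjis_is_lead are hypothetical names, and the lead-byte test assumes Shift-JIS ranges rather than the system code page.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: Shift-JIS lead bytes fall in 0x81-0x9F or 0xE0-0xFC. */
static int sjis_is_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Steps forward by one character; stays put on the terminating 0. */
const char *dbcs_next(const char *p)
{
    assert(p != NULL);
    if (*p == '\0')
        return p;
    if (sjis_is_lead((unsigned char)*p) && p[1] != '\0')
        return p + 2;
    return p + 1;
}

/* Steps backward by one character. A trail byte can look like an ordinary
   byte, so the scan has to restart from the beginning of the string. */
const char *dbcs_prev(const char *start, const char *current)
{
    const char *p = start;
    assert(start != NULL && current != NULL && start <= current);
    while (p < current) {
        const char *next = dbcs_next(p);
        if (next >= current || next == p)
            break;          /* p is the last character before current */
        p = next;
    }
    return p;
}
```

With s pointing at "A\x82\xA0" "B", dbcs_next(s + 1) skips the whole double-byte character and returns s + 3, and dbcs_prev(s, s + 3) returns s + 1 rather than s + 2.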

3. Why use Unicode?

(1) Data can be exchanged easily between different languages.

(2) You can distribute a single binary .exe or DLL file that supports all languages.

(3) It improves the execution efficiency of your application.

Windows 2000 was developed from scratch using Unicode. If you call a Windows function and pass it an ANSI string, the system first converts the string to Unicode and then passes the Unicode string to the operating system. If you expect the function to return an ANSI string, the system converts the Unicode string to ANSI before returning the result to your application. These conversions cost time and memory. By developing your application with Unicode from the start, you can make it run more efficiently.
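
The cost of that round trip can be pictured with a rough sketch of an ANSI entry point that forwards to a wide-character implementation. The functions text_length_w and text_length_a are hypothetical, and the standard mbstowcs call stands in for whatever conversion routine Windows uses internally; the point is only the extra allocation and conversion on every call.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical wide-character implementation: the real work happens here. */
size_t text_length_w(const wchar_t *text)
{
    return wcslen(text);
}

/* Hypothetical ANSI entry point: allocate, convert, forward, free.
   Returns (size_t)-1 if allocation or conversion fails. */
size_t text_length_a(const char *text)
{
    size_t capacity = strlen(text) + 1;   /* wide chars never outnumber bytes */
    wchar_t *wide = malloc(capacity * sizeof *wide);
    size_t converted, result;

    if (wide == NULL)
        return (size_t)-1;

    converted = mbstowcs(wide, text, capacity);
    if (converted == (size_t)-1) {        /* invalid multibyte sequence */
        free(wide);
        return (size_t)-1;
    }
    wide[capacity - 1] = L'\0';           /* defensive termination */

    result = text_length_w(wide);
    free(wide);
    return result;
}
```

A caller that already holds wide text calls text_length_w directly and skips the allocation and conversion entirely.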

Windows CE is itself a Unicode operating system and does not support the ANSI Windows functions at all.

Windows 98 supports only ANSI, so applications for it must be developed for ANSI.

When Microsoft ported COM from 16-bit Windows to Win32, it decided that all COM interface methods that require a string would accept only Unicode strings.

4. How do I write Unicode source code?

Microsoft designed the Windows API for Unicode in a way that minimizes the impact on your code. In fact, you can write a single source code file that compiles either with or without Unicode. You only need to define two macros (UNICODE and _UNICODE) and recompile the source file.

The _UNICODE macro is used by the C run-time header files, while the UNICODE macro is used by the Windows header files. When compiling a source code module, you usually define both macros.
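
The mechanism behind this can be sketched in portable C. The names below (DEMO_UNICODE, demo_tchar, DEMO_TEXT, demo_tcslen) are hypothetical stand-ins for what the real headers provide through UNICODE/_UNICODE, TCHAR, TEXT, and _tcslen; the sketch only illustrates how one macro can switch the character type, the literal prefix, and the library function together.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical switch, playing the role of UNICODE/_UNICODE. */
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;
#define DEMO_TEXT(s) L##s        /* wide string literal */
#define demo_tcslen  wcslen
#else
typedef char demo_tchar;
#define DEMO_TEXT(s) s           /* narrow string literal */
#define demo_tcslen  strlen
#endif

/* The same source line compiles to either char or wchar_t code. */
size_t greeting_length(void)
{
    const demo_tchar *greeting = DEMO_TEXT("hello");
    return demo_tcslen(greeting);
}
```

Compiling with -DDEMO_UNICODE (or /DDEMO_UNICODE) switches every use of demo_tchar to wchar_t without touching greeting_length; the function returns 5 in both builds.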

5. What are the Unicode data types defined by Windows?

Data type    Description

WCHAR    Unicode character

PWSTR    Pointer to a Unicode string

PCWSTR    Pointer to a constant Unicode string

The corresponding ANSI data types are CHAR, LPSTR, and LPCSTR.

The generic ANSI/Unicode data types are TCHAR, PTSTR, and LPCTSTR.
