Windows Core Programming: Chapter 2, Unicode

Source: Internet
Author: User
Next, let us clarify Microsoft's support for Unicode:
• Windows 2000 supports both Unicode and ANSI, so you can develop applications for either.
• Windows 98 supports only ANSI, so you can develop applications only for ANSI.
• Windows CE supports only Unicode, so you can develop applications only for Unicode.

How to compile Unicode source code
Microsoft designed the Windows API for Unicode so as to minimize the impact on your code. In fact,
you can write a single source code file that compiles with or without Unicode. You only need
to define two macros (UNICODE and _UNICODE) and recompile the source file.

To use Unicode strings, some data types are defined. The standard C header file string.h has been modified
to define a data type named wchar_t, which is the data type of a Unicode character:
typedef unsigned short wchar_t;
The standard C runtime string functions, such as
strcpy, strchr, and strcat, operate only on ANSI strings and cannot process Unicode strings correctly. Therefore,
ANSI C also has a complementary set of functions. All the Unicode functions start with wcs, which stands for wide character string. To call the Unicode function, simply replace the str prefix of the ANSI string function with the wcs prefix.
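As a minimal sketch of the prefix swap described above, the two functions below perform the same join, once with the str functions and once with their wcs counterparts. The function names join_narrow and join_wide are hypothetical and exist only for illustration; the library calls are the standard ones from string.h and wchar.h.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: joins two narrow strings with the str functions
   and returns the length of the result in characters. */
size_t join_narrow(char *dest, size_t capacity, const char *a, const char *b)
{
    assert(strlen(a) + strlen(b) < capacity); /* leave room for the terminator */
    strcpy(dest, a);
    strcat(dest, b);
    return strlen(dest);
}

/* The same logic for wide strings: only the prefix changes from str to wcs,
   and the character type changes from char to wchar_t. */
size_t join_wide(wchar_t *dest, size_t capacity, const wchar_t *a, const wchar_t *b)
{
    assert(wcslen(a) + wcslen(b) < capacity);
    wcscpy(dest, a);
    wcscat(dest, b);
    return wcslen(dest);
}
```

Note that both capacities are counted in characters, not bytes, which matters once the character type is wider than one byte.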

Code that includes explicit calls to the str functions or the wcs functions cannot easily be compiled for both ANSI and
Unicode. As mentioned earlier in this chapter, you can create a single source code file that compiles for both ANSI and Unicode.
To get this dual capability, include the tchar.h file instead of the string.h file.

The only purpose of the tchar.h file is to help you create source code files that compile for both ANSI and Unicode. It contains
a set of macros that you use in your code instead of calling the str functions or the wcs functions directly. If you define
_UNICODE when compiling, these macros reference the wcs functions. If _UNICODE is not defined, these macros reference the str
functions.
For example, tchar.h has a macro named _tcscpy. If _UNICODE is not defined when the header file is included,
_tcscpy expands to the ANSI strcpy function. However, if _UNICODE is defined, _tcscpy expands to the Unicode
wcscpy function. Every C runtime function that takes string parameters has a generic macro defined in tchar.h.
If you use the generic macros instead of the ANSI-specific or Unicode-specific function names, your source code can be
compiled for either ANSI or Unicode.
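The mechanism can be sketched in portable C. This is not the contents of tchar.h: MY_UNICODE, my_tchar, my_tcscpy, and my_tcslen are hypothetical stand-ins for _UNICODE, TCHAR, _tcscpy, and _tcslen, chosen so the example compiles without the Windows headers.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-ins for the generic names in tchar.h.
   Define MY_UNICODE on the compiler command line to select the wide versions. */
#ifdef MY_UNICODE
typedef wchar_t my_tchar;
#define my_tcscpy wcscpy
#define my_tcslen wcslen
#else
typedef char my_tchar;
#define my_tcscpy strcpy
#define my_tcslen strlen
#endif

/* Copies src into dest and returns the number of characters copied.
   The body never names a str or wcs function directly, so the same
   source compiles for either character type. */
size_t copy_text(my_tchar *dest, size_t capacity, const my_tchar *src)
{
    assert(my_tcslen(src) < capacity); /* capacity is in characters */
    my_tcscpy(dest, src);
    return my_tcslen(dest);
}
```

Recompiling the same file with the switch defined changes which runtime functions are called without touching the body of copy_text.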

By default, Microsoft's C++ compiler compiles all string literals as ANSI strings, not Unicode strings. Therefore, if _UNICODE is not defined, the compiler compiles this line of code correctly:
TCHAR *szError = "Error";
However, if _UNICODE is defined, an error occurs. To generate a Unicode string instead of an ANSI string, you must rewrite the line as follows:
TCHAR *szError = L"Error";
The uppercase L before the literal string tells the compiler that the string should be compiled as a Unicode string. When the compiler places the string in the program's data section, it inserts a zero byte between each character. The problem with this change is that the program now compiles successfully only when _UNICODE is defined. We need another macro that selectively adds the uppercase L before a literal string. This is the job of the _TEXT macro, which is also defined in tchar.h. Using this macro, you can rewrite the line as:
TCHAR *szError = _TEXT("Error");
This version compiles correctly whether or not the _UNICODE macro is defined.
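A macro of this kind can be built with the preprocessor's token-pasting operator. The sketch below is an illustration under assumed names, not the actual definition in tchar.h: MY_UNICODE, my_tchar, MY_TEXT, kError, and error_text_chars are all hypothetical.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical stand-ins for _UNICODE, TCHAR and _TEXT. */
#ifdef MY_UNICODE
typedef wchar_t my_tchar;
#define MY_TEXT_IMPL(quote) L##quote   /* pastes L onto the literal: L"Error" */
#else
typedef char my_tchar;
#define MY_TEXT_IMPL(quote) quote      /* leaves the literal unchanged */
#endif
/* Extra level of indirection so that a macro argument is expanded first. */
#define MY_TEXT(quote) MY_TEXT_IMPL(quote)

/* Compiles as a char array or a wchar_t array depending on the switch. */
const my_tchar kError[] = MY_TEXT("Error");

/* Returns the length of kError in characters, excluding the terminator. */
size_t error_text_chars(void)
{
    size_t count = sizeof(kError) / sizeof(kError[0]) - 1;
    assert(kError[count] == 0); /* the literal is zero-terminated */
    return count;
}
```

The character count is the same in both builds; only the number of bytes occupied by the array differs.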

Unicode data types defined by Windows
The Windows header files also define the generic ANSI/Unicode data types PTSTR and PCTSTR. These data types point to either an ANSI string or a Unicode string, depending on whether the UNICODE macro is defined when the program module is compiled. Note that this UNICODE macro has no leading underscore. The _UNICODE macro is used by the C runtime header files, while the UNICODE macro is used by the Windows header files. When compiling a source code module, you usually need to define both macros.

CreateWindowExW is the version of the function that accepts Unicode strings. The uppercase W at the end of the function name stands for wide.
Each Unicode character is 16 bits long, so Unicode characters are often called wide characters. The uppercase A at the end of
CreateWindowExA indicates that the function accepts ANSI strings. However, in our code we usually just call CreateWindowEx rather than calling CreateWindowExW or CreateWindowExA directly. In the WinUser.h file, CreateWindowEx is actually defined as a macro that expands to one version or the other, depending on whether UNICODE is defined.

We recommend that you use the operating system's string functions instead of the C runtime string functions. This helps your application's performance slightly, because the operating system string functions are often used by heavyweight processes such as the operating system shell process, Explorer.exe. Because these functions are used so heavily, they are probably already loaded into RAM while your application is running. In classic operating system function style, the names of these functions contain both uppercase and lowercase letters: StrCat, StrChr, StrCmp, and StrCpy. To use these functions, you must include the ShlWApi.h header file. In addition, as mentioned above, these string functions come in both ANSI and Unicode versions, for example StrCatA and StrCatW. Because they are operating system functions, their symbols expand to the wide-character versions when UNICODE (without the leading underscore) is defined as you build your application.

Making your application ANSI- and Unicode-ready

The following are some basic principles to be followed:
• Treat text strings as arrays of characters, not as arrays of chars or arrays of bytes.
• Use generic data types (such as TCHAR and PTSTR) for text characters and strings.
• Use explicit data types (such as BYTE and PBYTE) for bytes, byte pointers, and data buffers.
• Use the TEXT macro for literal characters and strings.
• Perform global replacements (for example, replace PSTR with PTSTR).
• Modify string arithmetic. For example, functions usually expect you to pass a buffer's size in characters, not bytes. This means that you should not pass sizeof(szBuffer) but should instead pass (sizeof(szBuffer) / sizeof(TCHAR)). Also, if you need to allocate a block of memory for a string and you have the number of characters in the string, remember that memory is allocated in bytes. This means that you should call malloc(nCharacters * sizeof(TCHAR)) rather than malloc(nCharacters). Of all the guidelines listed above, this is the hardest one to remember, and the compiler issues no warning if you get it wrong.
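The last guideline is easy to get wrong, so here is a minimal sketch of both halves of it using wchar_t directly. The names CHAR_COUNT, alloc_wide_string, and format_id are hypothetical, and adding one extra character for the terminator is a choice made for this example rather than something the guideline states.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Number of characters in an array. Works only on real arrays, not pointers. */
#define CHAR_COUNT(buffer) (sizeof(buffer) / sizeof((buffer)[0]))

/* malloc takes a size in bytes, so a character count must be multiplied
   by the size of one character. One extra slot holds the terminator. */
wchar_t *alloc_wide_string(size_t n_characters)
{
    wchar_t *text = malloc((n_characters + 1) * sizeof(wchar_t));
    if (text != NULL)
        text[0] = L'\0';
    return text;
}

/* swprintf expects the buffer capacity in characters, not bytes.
   Returns the number of characters written, or a negative value on error. */
int format_id(wchar_t *dest, size_t capacity_in_chars, int id)
{
    assert(capacity_in_chars > 0);
    return swprintf(dest, capacity_in_chars, L"ID-%d", id);
}
```

A caller with a local array wchar_t buf[16] would pass CHAR_COUNT(buf) as the capacity; passing sizeof(buf) would overstate the capacity by a factor of sizeof(wchar_t), and the compiler would accept it silently.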

Resources
When the resource compiler compiles all your resources, the output file is a binary representation of the resources. String
values in your resources (string tables, dialog box templates, menus, and so on) are always written as Unicode strings. On Windows 98 and Windows
2000, the system performs internal conversions if your application does not define the UNICODE macro.
For example, if UNICODE is not defined when you compile your source code module, a call to LoadString actually calls
the LoadStringA function. LoadStringA then reads the string from your resources and converts it to an ANSI
string. The ANSI form of the string is returned from the function to your application.

Converting strings between Unicode and ANSI
The Windows function MultiByteToWideChar converts a multibyte-character string to a wide-character string. Its parameters
are described here.
The uCodePage parameter identifies a code page number associated with the multibyte string. The dwFlags parameter lets you specify
additional control that affects characters with diacritical marks such as accents. Usually these flags are not used, and 0 is
passed in the dwFlags parameter. The pMultiByteStr parameter specifies the string to be converted, and the cchMultiByte parameter
indicates the length of that string (in bytes). If you pass -1 for the cchMultiByte parameter, the function determines the length
of the source string itself.
The Unicode version of the string that results from the conversion is written to a buffer in memory whose address is specified by
the pWideCharStr parameter. You must specify the maximum size of this buffer (in characters) in the cchWideChar parameter. If you call
MultiByteToWideChar and pass 0 for the cchWideChar parameter, the function does not perform the conversion and instead
returns the buffer size required for the conversion to succeed. Typically, you convert a multibyte-character string to its
Unicode equivalent by performing the following steps:
1) Call MultiByteToWideChar, passing NULL for the pWideCharStr parameter and 0 for the cchWideChar parameter.
2) Allocate a block of memory large enough to hold the converted Unicode string. This size is returned by the previous call to MultiByteToWideChar.
3) Call MultiByteToWideChar again, this time passing the address of the buffer as the pWideCharStr parameter and the size returned by the first call to MultiByteToWideChar as the cchWideChar parameter.
4) Use the converted string.
5) Free the memory block occupied by the Unicode string.
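The same measure, allocate, convert pattern can be sketched in portable C. This example swaps in the standard mbstowcs function as a stand-in for MultiByteToWideChar so that it compiles without the Windows headers; it is not the Windows API call itself. It assumes the POSIX behavior in which a NULL destination makes mbstowcs return the required length, and the helper name to_wide is hypothetical. One difference to keep in mind: mbstowcs reports the length without the terminator, whereas MultiByteToWideChar called with -1 for the source length counts the terminator.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: converts a multibyte string to a newly allocated
   wide string. Returns NULL on failure. The caller frees the result. */
wchar_t *to_wide(const char *multibyte)
{
    /* Step 1: ask only for the required length (no destination buffer). */
    size_t needed = mbstowcs(NULL, multibyte, 0);
    if (needed == (size_t)-1)
        return NULL; /* invalid multibyte sequence */

    /* Step 2: allocate in bytes, with one extra character for the terminator. */
    wchar_t *wide = malloc((needed + 1) * sizeof(wchar_t));
    if (wide == NULL)
        return NULL;

    /* Step 3: convert for real, passing the capacity in characters. */
    size_t written = mbstowcs(wide, multibyte, needed + 1);
    if (written == (size_t)-1) {
        free(wide);
        return NULL;
    }
    assert(written == needed);
    wide[needed] = L'\0';
    return wide;
}
```

Steps 4 and 5 belong to the caller: use the returned string, then release it with free.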
