Write ANSI and Unicode-compliant applications

Source: Internet
Author: User

The way of the world is remarkable: what has long been divided tends to unite, and what has long been united tends to divide.

Today, as computing has developed, communication between countries has become increasingly common, and software localization is a major trend. It is worth thinking about how to reduce the work that localization requires.

The real problem in software localization is how to handle different character sets. A single byte holds 8 bits, so it can represent at most 256 distinct characters. How could 256 characters be enough for all the languages of the world? The double-byte character set (DBCS) was proposed to solve this problem.

 

Single-byte and double-byte character sets (multibyte character sets)

 

In a DBCS, English letters and some symbols are represented with one byte, while Japanese, Chinese, and other characters are represented with two bytes. As you can imagine, we can no longer step through a string character by character with pchar++, as we would with single-byte characters.

Therefore, Microsoft provides CharNext and CharPrev for traversal. However, having to use these functions everywhere is a headache.
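The traversal problem can be illustrated with a small portable sketch. It does not call the Windows functions; is_lead_byte, dbcs_next, and dbcs_char_count are hypothetical helpers, and the lead-byte ranges are an assumption modeled on a Shift-JIS style code page.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: Shift-JIS style lead-byte ranges (0x81-0x9F, 0xE0-0xFC).
   A real program would ask the system about the active code page. */
int is_lead_byte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Hypothetical helper: advance to the next character of a DBCS string.
   A lead byte followed by a trail byte counts as one character. */
const char *dbcs_next(const char *p)
{
    assert(p != NULL);
    if (*p == '\0')
        return p;                       /* stay on the terminator */
    if (is_lead_byte((unsigned char)*p) && p[1] != '\0')
        return p + 2;                   /* two-byte character */
    return p + 1;                       /* one-byte character */
}

/* Count characters (not bytes) by walking with dbcs_next. */
size_t dbcs_char_count(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (*s != '\0') {
        s = dbcs_next(s);
        n++;
    }
    return n;
}
```

For the four bytes 'A', 0x83, 0x41, 'B', dbcs_char_count reports three characters, whereas a plain pchar++ loop would visit four positions and would mistake the trail byte 0x41 for the letter 'A'.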

 

Then Unicode came into being. As originally designed, it uses two bytes to represent every character, whether Chinese or English, in one unified encoding. Two bytes are 16 bits, which can represent 65,536 distinct values. The characters of all the world's writing systems were counted at about 35,000, so this was enough.

 

Why use Unicode

When developing applications, consider the advantages of Unicode. Even if you do not plan to localize your program, you should still develop with Unicode in mind, which will simplify any later code conversion. In addition, Unicode offers the following benefits:

1. It makes it easy to exchange data between different languages.

2. It lets you distribute a single binary (.exe) file or DLL file that supports all languages.

3. It improves the running efficiency of applications.

 

 

Unicode on Windows 2000

 

Windows 2000 was built on Unicode from the ground up, and all of its internal string handling uses Unicode. Windows 2000 API functions that take strings come in two versions, one accepting ANSI (multibyte) strings and one accepting Unicode strings, but only the Unicode version contains the real implementation. The ANSI version converts its string arguments to Unicode and then calls the Unicode version. Likewise, API functions that return strings convert the Unicode result back to ANSI before returning. Calling the Unicode version directly therefore avoids the conversion overhead.

 

Take CreateFile as an example. The system actually provides two functions, CreateFileA and CreateFileW, and CreateFile itself is a macro defined roughly as follows:

#ifdef UNICODE
#define CreateFile CreateFileW
#else
#define CreateFile CreateFileA
#endif

 

When we call CreateFile, the preprocessor selects the matching function depending on whether UNICODE is defined.

When you call CreateFileA, the sequence is:

CreateFileA ---> convert the multibyte string parameters to Unicode ---> call CreateFileW

This conversion is wasted work, so programming directly in Unicode can improve efficiency.
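The forwarding pattern can be sketched in portable C. This is only an illustration under stated assumptions: name_length_w and name_length_a are hypothetical functions, not Windows APIs, and the standard mbstowcs stands in for the conversion the system performs internally.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical "W" function: the only one that does real work.
   Here the work is simply measuring the name. */
size_t name_length_w(const wchar_t *name)
{
    assert(name != NULL);
    return wcslen(name);
}

/* Hypothetical "A" function: converts its argument to wide characters,
   forwards to the W function, then frees the temporary buffer.
   Returns (size_t)-1 if allocation or conversion fails. */
size_t name_length_a(const char *name)
{
    assert(name != NULL);
    /* The wide string never has more elements than the narrow one has bytes. */
    size_t cap = strlen(name) + 1;
    wchar_t *wide = malloc(cap * sizeof *wide);
    size_t result = (size_t)-1;

    if (wide == NULL)
        return result;
    if (mbstowcs(wide, name, cap) != (size_t)-1) {
        wide[cap - 1] = L'\0';          /* defensive termination */
        result = name_length_w(wide);
    }
    free(wide);
    return result;
}

/* Same selection trick as the system headers, with a generic name. */
#ifdef UNICODE
#define name_length name_length_w
#else
#define name_length name_length_a
#endif
```

Every call through the narrow entry point pays for an allocation and a conversion before the wide function runs, which is the overhead a Unicode build avoids.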

 

We will not discuss Windows 98 in detail. It is enough to know that Windows 98 does not support Unicode: if you force a call to a function ending in W, the call fails, and GetLastError() reports that the function is not implemented.

Windows CE is a fully Unicode operating system and does not support ANSI.

 

 

How to use Unicode

Data Type

Unicode text uses a different character type from ANSI text:

char        wchar_t

In the Microsoft headers, wchar_t is defined as

typedef unsigned short wchar_t;

so it is 16 bits wide.

The commonly used string functions have wide-character counterparts:

strcpy      wcscpy
strcat      wcscat
...

 

The "str" prefix is replaced with "wcs", which stands for "wide character string".

 

These functions belong to the C runtime library, and the C runtime library provided by Microsoft follows the ANSI C standard. The wide-character functions above therefore also work on Windows 98.
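As a minimal sketch of the parallel between the two families, the following pair of functions does the same job with narrow and wide strings. The names join_narrow and join_wide are hypothetical, and the caller is assumed to supply a destination buffer large enough for both parts plus the terminator.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Narrow version: uses the str* functions on char. */
void join_narrow(char *dst, const char *a, const char *b)
{
    assert(dst != NULL && a != NULL && b != NULL);
    strcpy(dst, a);
    strcat(dst, b);
}

/* Wide version: identical logic, but wcs* functions on wchar_t. */
void join_wide(wchar_t *dst, const wchar_t *a, const wchar_t *b)
{
    assert(dst != NULL && a != NULL && b != NULL);
    wcscpy(dst, a);
    wcscat(dst, b);
}
```

The two bodies differ only in the character type and the function prefix, which is exactly the difference a generic macro layer can hide.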

 

However, we should not call either family directly: code that hard-codes strcpy or wcscpy is painful to convert between ANSI and Unicode builds.

Instead, we use a macro that maps one generic name to the right function:

#ifdef _UNICODE
#define _tcscpy wcscpy
#else
#define _tcscpy strcpy
#endif

A macro like this is needed for every string function, and the tchar.h header file already defines them for us (it tests _UNICODE, whereas the Windows headers test UNICODE). You only need to include it and use the generic, macro-controlled function names and types.
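The same idea can be tried without the Windows headers. The sketch below is an assumption-laden imitation of what tchar.h does: DEMO_UNICODE, demo_char, demo_cpy, demo_len, and demo_copy are all hypothetical names invented for illustration.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* DEMO_UNICODE is a hypothetical switch standing in for _UNICODE. */
#ifdef DEMO_UNICODE
typedef wchar_t demo_char;
#define demo_cpy wcscpy
#define demo_len wcslen
#else
typedef char demo_char;
#define demo_cpy strcpy
#define demo_len strlen
#endif

/* Written once against the generic names; compiles in either mode.
   Copies src into dst and returns the number of characters copied.
   The caller guarantees that dst is large enough. */
size_t demo_copy(demo_char *dst, const demo_char *src)
{
    assert(dst != NULL && src != NULL);
    demo_cpy(dst, src);
    return demo_len(dst);
}
```

Compiling with -DDEMO_UNICODE (or the equivalent compiler option) switches demo_copy to wchar_t and the wcs functions without touching its source.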

 

String literals

char *p = "Ook";
wchar_t *p = "Ook";    // Error

The second line should instead be

wchar_t *p = L"Ook";   // L marks a wide-character literal

Of course, we should not write L directly either. Instead, use the TEXT macro:

TCHAR *p = TEXT("Ook");

 

Its definition is similar to the following:

#ifdef UNICODE
typedef wchar_t TCHAR;
#define TEXT(x) L##x
#else
typedef char TCHAR;
#define TEXT(x) x
#endif

 

In this way, the literal always matches the character type selected for the build.
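A portable imitation of this macro can make the mechanism concrete. In the sketch below, DEMO_UNICODE, demo_char, DEMO_TEXT, and the helper functions are hypothetical names; the extra macro level is a common preprocessor idiom that lets a macro argument expand before the L prefix is pasted on.

```c
#include <assert.h>
#include <stddef.h>

/* DEMO_UNICODE is a hypothetical switch standing in for UNICODE. */
#ifdef DEMO_UNICODE
typedef wchar_t demo_char;
#define DEMO_TEXT_(q) L##q      /* paste the L prefix onto the literal */
#else
typedef char demo_char;
#define DEMO_TEXT_(q) q         /* leave the literal unchanged */
#endif
#define DEMO_TEXT(q) DEMO_TEXT_(q)

/* The literal has the right type in either build, so this compiles
   without a cast whether demo_char is char or wchar_t. */
const demo_char *demo_greeting(void)
{
    return DEMO_TEXT("Ook");
}

/* Size in bytes of the literal, including its terminator. */
size_t demo_literal_bytes(void)
{
    return sizeof(DEMO_TEXT("Ook"));
}

/* First character of a generic string. */
demo_char demo_first(const demo_char *s)
{
    assert(s != NULL);
    return s[0];
}
```

The literal always holds four characters (three letters plus the terminator), so demo_literal_bytes returns four times the size of demo_char in whichever mode is compiled.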

 

To sum up, here are the rules for writing source code that compiles as either ANSI or Unicode:

# Treat a text string as an array of characters, not as an array of char or of bytes (the size of TCHAR is not fixed).

# Use the generic data types (TCHAR, PTSTR) for text characters and strings.

# Use explicit data types (BYTE, PBYTE) for bytes, byte pointers, and data buffers.

# Use the TEXT macro for literal characters and strings.

# Perform global replacements (for example, replace PSTR with PTSTR).

# Revise string arithmetic. For example, to get the number of characters in a buffer, use sizeof(szBuffer) / sizeof(szBuffer[0]) rather than sizeof(szBuffer).
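The last rule is the one most easily overlooked, so here is a small sketch of it. COUNT_OF and bounded_copy_w are hypothetical names; the function takes its capacity in characters, which is what the sizeof division yields.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Number of elements in an array (not its size in bytes).
   Only valid for true arrays, not for pointers. */
#define COUNT_OF(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Copy at most cap - 1 characters from src and always terminate dst.
   cap is a count of wchar_t elements. Returns the characters copied. */
size_t bounded_copy_w(wchar_t *dst, size_t cap, const wchar_t *src)
{
    size_t i = 0;
    assert(dst != NULL && src != NULL && cap > 0);
    while (i + 1 < cap && src[i] != L'\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = L'\0';
    return i;
}
```

For a buffer declared as wchar_t buf[4], COUNT_OF(buf) is 4 while sizeof(buf) is larger, so passing sizeof(buf) as the capacity would let the copy run past the end of the array.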

 

#include <windows.h>
#include <tchar.h>
#include <shlwapi.h>
#include <stdio.h>
#include <string.h>

// Wide-character version
BOOL StringReversW(PWSTR pwChar)
{
    PWSTR pEndStr;
    WCHAR pChar;

    if (*pwChar == L'\0') return TRUE;      // empty string: nothing to reverse
    pEndStr = pwChar + wcslen(pwChar) - 1;
    while (pwChar < pEndStr)
    {
        pChar = *pwChar;
        *pwChar = *pEndStr;
        *pEndStr = pChar;
        pwChar++;
        pEndStr--;
    }
    return TRUE;
}

// Multibyte version:
// convert to wide characters, reverse, then convert the result back to multibyte
BOOL StringReversA(PSTR pChar)
{
    PWSTR pwChar;
    int nLenOfWideChar;
    int nLenOfMultiByte = (int)strlen(pChar) + 1;   // size in bytes, including the terminator
    BOOL ok = FALSE;

    nLenOfWideChar = MultiByteToWideChar(CP_ACP, 0, pChar, -1, NULL, 0);

    pwChar = (WCHAR *)HeapAlloc(GetProcessHeap(), 0, nLenOfWideChar * sizeof(WCHAR));
    if (!pwChar) return FALSE;

    MultiByteToWideChar(CP_ACP, 0, pChar, -1, pwChar, nLenOfWideChar);
    ok = StringReversW(pwChar);
    if (ok)
    {
        // The output buffer must have room for the terminator as well
        WideCharToMultiByte(CP_ACP, 0, pwChar, -1, pChar, nLenOfMultiByte, NULL, NULL);
    }

    HeapFree(GetProcessHeap(), 0, (LPVOID)pwChar);
    return ok;
}

// Generic version: no conversion, reverses TCHAR units in place
BOOL StringRevers_(TCHAR *pwChar)
{
    TCHAR *pEndStr;
    TCHAR pChar;

    if (*pwChar == TEXT('\0')) return TRUE; // empty string: nothing to reverse
    pEndStr = pwChar + _tcslen(pwChar) - 1;
    while (pwChar < pEndStr)
    {
        pChar = *pwChar;
        *pwChar = *pEndStr;
        *pEndStr = pChar;
        pwChar++;
        pEndStr--;
    }
    return TRUE;
}

#ifdef UNICODE
#define StringRevers StringReversW
#else
#define StringRevers StringReversA
#endif

int _tmain()
{
    TCHAR pstr[] = TEXT("Haha, this is a good thing, OK?");
    StringRevers_(pstr);
    // StringRevers(pstr);
    printf("%d", (int)sizeof(TEXT("Haha, this is good, OK?")));
    MessageBox(NULL, pstr, NULL, MB_OK);

    return 0;
}

 

This program compiles as either ANSI or Unicode, and the output is correct in both builds. To compare, comment out StringRevers_(pstr); and enable // StringRevers(pstr); instead, then build in both ANSI and Unicode mode to see the effect. The value printed to the console also shows that the size of the string differs between the two builds.

For details of the two conversion functions used (MultiByteToWideChar and WideCharToMultiByte), check MSDN.

That completes the summary.
