Clarifying the character sets of Windows Mobile and Wince (Windows Embedded CE)

Background

Most developers who have worked on Windows Mobile and Windows Embedded CE, especially in native C++, have run into conversions between the ANSI and Unicode character sets at some point. This article tries to sort out the character set issues in Windows Mobile and Windows Embedded CE development. Admittedly, that is a rather ambitious goal.

 

Introduction

This article attempts to clarify the character set conversion problems in native C++ development for Windows Mobile and Windows Embedded CE. Starting from the concept of a character set, it describes the string types supported by Wince and the ways to convert between them, and finally gives some suggestions for use.

 

What Is Character Set

A character set is a mapping that defines the relationship between characters and their encodings. An encoding here is ultimately a pattern of 1 and 0 bits, because all data in today's computer systems is stored as 1s and 0s. The same pattern of 1s and 0s maps to different characters in different character sets.

In the early days of computing, storage was very expensive, and scientists did everything they could to save space. The most common character sets were therefore single-byte character sets, which use one byte to represent one character. The typical example is ASCII (American Standard Code for Information Interchange). Its designers did not have our needs in mind: Chinese writing goes back to oracle bone script, but the Americans saw only A to Z.

Anyone who has learned C has used the ASCII table, and has probably been tested on it in exams.

ANSI (named after the American National Standards Institute) is an extension of the 7-bit ASCII coding standard (ASA X3.4-1963) that adds European letters.

However, the biggest drawback of a single-byte character set is that a byte has only eight bits, so it can represent at most 256 (2^8) visible and invisible characters. That may be enough for English-speaking countries, but Chinese characters cannot be expressed with 256 values. The international-standard Unicode character set was therefore gradually developed.
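
The 256-value limit can be made concrete with a small sketch. It uses only standard C++ and is not specific to Wince; the character U+4E2D and all the names in it (valuesPerByte, kZhongUtf8, kZhongUtf16, demoByteLimit) are chosen purely for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstring>

// Number of distinct values one byte can hold: 2^CHAR_BIT (256 for 8-bit bytes).
inline unsigned long valuesPerByte() {
    return 1UL << CHAR_BIT;
}

// The Chinese character U+4E2D does not fit in one byte.
// Encoded as UTF-8 it takes three bytes; as UTF-16 it is one 16-bit code unit.
static const char           kZhongUtf8[] = "\xE4\xB8\xAD";
static const unsigned short kZhongUtf16  = 0x4E2D;

inline void demoByteLimit() {
    assert(valuesPerByte() == 256);        // assumes the usual 8-bit byte
    assert(std::strlen(kZhongUtf8) == 3);  // three bytes, one character
    assert(kZhongUtf16 > 0xFF);            // too large for a single byte
}
```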

 

Wince and Unicode

People who are new to native C++ development on Windows Mobile and Windows Embedded CE often believe things like: "Windows Mobile and Wince only support Unicode and do not support ANSI", or "TinyXML uses ANSI strings but Wince uses Unicode, so TinyXML cannot be used on Wince and Windows Mobile". These ideas are not quite right. Wince is a Unicode system, which means that all of the system's own string processing is based on Unicode encoding, but that does not mean Wince cannot handle ANSI. We can still use ANSI types such as std::string and char on Wince.

So where do these misunderstandings come from? Let's first look at the compilation errors below.

error C2664: 'wprintf' : cannot convert parameter 1 from 'const char [21]' to 'const wchar_t *'

 

error C2664: 'DeleteFileW' : cannot convert parameter 1 from 'const char [21]' to 'LPCWSTR'

I am fairly sure that nine, if not all ten, out of ten people who are new to native C++ development on Windows Mobile and Windows Embedded CE have run into the errors above. They appear when calling Win32 APIs or using the MFC and WTL interfaces: we are used to char * and std::string, but the string parameters of the Win32 API, MFC, and WTL are Unicode, so the compiler rejects the call. That is probably where the mistaken idea comes from that Wince only supports Unicode and not ANSI. In fact, Wince still supports ANSI. We can define a char array and even print the ANSI string to the console through the C Runtime.

char ansiStr[] = "I am ANSI string";
printf("%s", ansiStr); // pass the text as an argument, not as the format string

 

Strings supported by Wince

Since Wince supports both ANSI and Unicode, when should each be used? The following sections go through the string types, the conversions between them, and usage suggestions for Wince development.

char *

char * and char [] are essentially the same here: both refer to a block of memory that holds characters. All strings in the ANSI Win32 API are char *. Because the Win32 API is language-independent, what these parameters actually pass is a pointer to the memory that stores the string (a bit of a mouthful, but that is how it works). In an ANSI environment with pure C development, a program cannot do without char.

 

wchar_t *

Type names such as LPWSTR and PWSTR are really wchar_t *, and the most common one, LPCWSTR, is const wchar_t *. On Windows, wchar_t is a 16-bit Unicode character. wchar_t * and wchar_t [] are used to define Unicode strings. In a Unicode build, all Win32 API string parameters change from char * to wchar_t *.

You can see this in the preprocessor conditionals of the header files. Take the DeleteFile API in winbase.h as an example.
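
As a small illustration of the difference between narrow and wide literals, here is a sketch in standard C++. The names kNarrow, kWide, wideByteCount, and demoWideLiteral are made up for this example. Note that the size of wchar_t depends on the platform: it is 16 bits on Windows, but other systems commonly use 32 bits, so the sketch does not hard-code it.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>

static const char    kNarrow[] = "Unicode";   // one char per character
static const wchar_t kWide[]   = L"Unicode";  // one wchar_t per character

// Bytes occupied by the characters of a wide string, excluding the terminator.
inline std::size_t wideByteCount(const wchar_t* s) {
    return std::wcslen(s) * sizeof(wchar_t);
}

inline void demoWideLiteral() {
    assert(std::strlen(kNarrow) == 7);
    assert(std::wcslen(kWide) == 7);
    // 14 bytes where wchar_t is 16 bits, 28 bytes where it is 32 bits.
    assert(wideByteCount(kWide) == 7 * sizeof(wchar_t));
}
```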

WINBASEAPI
BOOL
WINAPI
DeleteFileA(
LPCSTR lpFileName
);
WINBASEAPI
BOOL
WINAPI
DeleteFileW(
LPCWSTR lpFileName
);
#ifdef UNICODE
#define DeleteFile DeleteFileW
#else
#define DeleteFile DeleteFileA
#endif // !UNICODE

In an ANSI build, DeleteFile is DeleteFileA and its string parameter is an LPCSTR, that is, const char *. In a Unicode build, DeleteFile is DeleteFileW and its string parameter is an LPCWSTR, that is, const wchar_t *.

 

Whether DeleteFile is DeleteFileA or DeleteFileW is determined by the preprocessor macro UNICODE.

This macro can be configured in the project properties, through the Character Set option.

When you select Use Unicode Character Set, the macros UNICODE and _UNICODE are added to the preprocessor definitions.

However, if the target platform is Windows Mobile or Wince, UNICODE and _UNICODE are defined whether or not you select Use Unicode Character Set. In other words, all Win32 APIs on Wince are the Unicode versions.
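
The A/W selection mechanism can be imitated in a few lines of portable C++. This is only a sketch of the idea, not the real Windows headers: NameLengthA, NameLengthW, NameLength, MYLIB_TEXT, and the MYLIB_UNICODE switch are all hypothetical names standing in for DeleteFileA, DeleteFileW, DeleteFile, _T, and UNICODE.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>

// Hypothetical API with a narrow ("A") and a wide ("W") entry point.
inline std::size_t NameLengthA(const char* name)    { return std::strlen(name); }
inline std::size_t NameLengthW(const wchar_t* name) { return std::wcslen(name); }

// Stand-in for the UNICODE switch: define MYLIB_UNICODE to get the wide version.
#ifdef MYLIB_UNICODE
#define NameLength NameLengthW
#define MYLIB_TEXT(s) L##s
#else
#define NameLength NameLengthA
#define MYLIB_TEXT(s) s
#endif

inline void demoMacroSelection() {
    // The call site is identical in both builds; only the macros change.
    assert(NameLength(MYLIB_TEXT("file.txt")) == 8);
}
```

Passing a plain "file.txt" to NameLength in a build with MYLIB_UNICODE defined would fail to compile in the same way as the C2664 errors shown earlier.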

 

CString

CString was originally provided by MFC. ATL and WTL each provide their own CString as well. The three CString classes are semantically compatible, that is, they offer the same interface. The advantage of CString is that it works in both ANSI and Unicode builds, as shown in the following example:

CString str = "Independent String";
m_wndPic.SetWindowText(str);

m_wndPic is a CStatic control. The code above compiles unchanged in both ANSI and Unicode builds.

 

The following uses the ATL CString as an example to show how CString supports both ANSI and Unicode.

typedef CAtlString CString;
typedef CStringT< TCHAR, StrTraitATL< TCHAR > > CAtlString;

The character type stored in a CString is TCHAR, and TCHAR is in turn decided by the UNICODE preprocessor macro. See the definition below.

#ifdef  UNICODE                     // r_winnt
typedef WCHAR TCHAR, *PTCHAR;
#else /* UNICODE */ // r_winnt
typedef char TCHAR, *PTCHAR;
#endif /* UNICODE */

In this way the character type used by CString is chosen automatically from the preprocessor definitions.
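
The same trick can be applied to a standard string class. The sketch below is an illustration only: my_tchar, MY_T, my_tstring, and the MYLIB_UNICODE switch are hypothetical stand-ins for TCHAR, _T, CString, and UNICODE, and std::basic_string is used in place of CStringT.

```cpp
#include <cassert>
#include <string>

// Stand-in for TCHAR: the character type follows a build switch.
#ifdef MYLIB_UNICODE
typedef wchar_t my_tchar;
#define MY_T(s) L##s
#else
typedef char my_tchar;
#define MY_T(s) s
#endif

// A string class whose storage type follows the switch, much as CString does.
typedef std::basic_string<my_tchar> my_tstring;

inline void demoGenericString() {
    my_tstring s = MY_T("Independent String");
    assert(s.size() == 18);
    assert(sizeof(my_tstring::value_type) == sizeof(my_tchar));
}
```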

 

std::string

std::string in STL encapsulates single-byte characters. Because it is cross-platform, std::string is used widely in the code I write, as is STL in general. For example, I usually use std::string when working with TinyXML. Leaving Wince aside, std::string has no real disadvantages and can be used on any platform that supports standard C++. Development on Windows Mobile and Wince is a little different: because std::string holds single-byte characters, a conversion is needed before calling the Win32 API or the functions of MFC, ATL, and WTL. That is a drawback, but once the conversion is done there is no problem with using std::string.
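
A minimal sketch of such a conversion helper follows. The function name widen is hypothetical. On Windows it relies on MultiByteToWideChar with the CP_ACP code page; I am assuming here that this is the appropriate code page for the input, which may not hold for every application. On other platforms it falls back to the standard mbstowcs, which depends on the current C locale.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Convert a narrow string to a wide string; returns an empty string on failure.
inline std::wstring widen(const std::string& narrow) {
    if (narrow.empty()) return std::wstring();
#ifdef _WIN32
    // The first call asks for the required length, including the terminator.
    int count = MultiByteToWideChar(CP_ACP, 0, narrow.c_str(), -1, NULL, 0);
    if (count <= 0) return std::wstring();
    std::vector<wchar_t> buffer(count);
    MultiByteToWideChar(CP_ACP, 0, narrow.c_str(), -1, &buffer[0], count);
    return std::wstring(&buffer[0]);
#else
    // Portable fallback: uses the multibyte encoding of the current C locale.
    std::size_t count = std::mbstowcs(NULL, narrow.c_str(), 0);
    if (count == static_cast<std::size_t>(-1)) return std::wstring();
    std::vector<wchar_t> buffer(count + 1);
    std::mbstowcs(&buffer[0], narrow.c_str(), count + 1);
    return std::wstring(&buffer[0]);
#endif
}

inline void demoWiden() {
    std::wstring w = widen("config.xml");
    assert(w == L"config.xml");
}
```

The result's c_str() can then be handed to a Unicode API such as DeleteFileW.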

 

std::wstring

std::wstring is the Unicode counterpart of std::string in STL. It works the same way but encapsulates wide (Unicode) characters. In my experience std::wstring is not used much; std::string is usually enough.

 

How to convert strings supported by Wince

Since Windows Mobile and Wince (Windows Embedded CE) support all of the string types above, converting between them is a recurring problem during development. The following examples demonstrate how to do it.

 

For these conversions, we recommend the ATL macros. For details, refer to ATL and MFC String Conversion Macros.

These macros are named as follows:

CSourceType2[C]DestinationType[EX]

SourceType/DestinationType   Description
A                            ANSI character string.
W                            Unicode character string.
T                            Generic character string (equivalent to W when _UNICODE is defined, equivalent to A otherwise).
OLE                          OLE character string (equivalent to W).

 

A indicates an ANSI string, W a Unicode string, and T a generic string whose type is decided by the preprocessor definitions. OLE is the same as W and is rarely used.

For example, CT2CA converts a generic string to a const ANSI string.

 

Convert to char *
#include <atlbase.h>  // CComBSTR
#include <atlstr.h>   // CString
#include <atlconv.h>  // CW2A, CT2CA
#include <string>
#include <stdio.h>
#include <string.h>

void ConvertToCharArray()
{
    char ansiStr[255] = "ANSI string";
    wchar_t unicodeStr[255] = L"Unicode string"; // the L prefix makes a wchar_t literal

    CString cstr("ATL CString");

    std::string stlStr("STL string");
    std::wstring stlWStr(L"STL wstring");

    printf("All string convert to char*\n");
    strcpy(ansiStr, CW2A(unicodeStr)); // wchar_t* to char*
    printf("Convert from wchar_t*, %s\n", ansiStr);

    strcpy(ansiStr, CT2CA(cstr)); // CString (generic) to const char*
    printf("Convert from CString, %s\n", ansiStr);

    strcpy(ansiStr, stlStr.c_str()); // already ANSI, a plain copy is enough
    printf("Convert from std::string, %s\n", ansiStr);

    strcpy(ansiStr, CW2A(stlWStr.c_str())); // std::wstring to char*
    printf("Convert from std::wstring, %s\n", ansiStr);
}

This example uses the ATL CString. To add ATL support to a new Win32 project, see Add ATL support to the Win32 project in Windows Mobile and Windows Embedded CE.

As mentioned above, the ATL, WTL, and MFC CString classes have the same semantics, so all the CString code in this article is equally valid with MFC.

 

Convert to wchar_t *
void ConvertToWCharArray()
{
    char ansiStr[255] = "ANSI string";
    wchar_t unicodeStr[255] = L"Unicode string"; // the L prefix makes a wchar_t literal

    CString cstr("ATL CString");

    std::string stlStr("STL string");
    std::wstring stlWStr(L"STL wstring");

    printf("All string convert to wchar_t*\n");
    wcscpy(unicodeStr, CComBSTR(ansiStr)); // char* to wchar_t* through a temporary BSTR
    wprintf(L"Convert from char*, %s\n", unicodeStr);

    wcscpy(unicodeStr, cstr); // CString converts implicitly to LPCTSTR, which is const wchar_t* on Wince
    wprintf(L"Convert from CString, %s\n", unicodeStr);

    wcscpy(unicodeStr, CComBSTR(stlStr.c_str())); // std::string to wchar_t*
    wprintf(L"Convert from std::string, %s\n", unicodeStr);

    wcscpy(unicodeStr, stlWStr.c_str()); // already wide, a plain copy is enough
    wprintf(L"Convert from std::wstring, %s\n", unicodeStr);
}

Here we use CComBSTR(), which Microsoft recommends, instead of CA2W().

Convert to CString
void ConvertToCString()
{
    char ansiStr[255] = "ANSI string";
    wchar_t unicodeStr[255] = L"Unicode string"; // the L prefix makes a wchar_t literal

    CString cstr("ATL CString");

    std::string stlStr("STL string");
    std::wstring stlWStr(L"STL wstring");

    printf("All string convert to CString\n");
    // CString accepts both char* and wchar_t* and converts as needed.
    // The cast passes a plain pointer to wprintf instead of the CString object;
    // LPCTSTR is const wchar_t* on Wince.
    cstr = ansiStr;
    wprintf(L"Convert from char*, %s\n", static_cast<LPCTSTR>(cstr));

    cstr = unicodeStr;
    wprintf(L"Convert from wchar_t*, %s\n", static_cast<LPCTSTR>(cstr));

    cstr = stlStr.c_str();
    wprintf(L"Convert from std::string, %s\n", static_cast<LPCTSTR>(cstr));

    cstr = stlWStr.c_str();
    wprintf(L"Convert from std::wstring, %s\n", static_cast<LPCTSTR>(cstr));
}

Convert to std::string
void ConvertToStlString()
{
    char ansiStr[255] = "ANSI string";
    wchar_t unicodeStr[255] = L"Unicode string"; // the L prefix makes a wchar_t literal

    CString cstr("ATL CString");

    std::string stlStr("STL string");
    std::wstring stlWStr(L"STL wstring");

    printf("All string convert to STL string\n");
    stlStr = ansiStr; // already ANSI, direct assignment
    printf("Convert from char*, %s\n", stlStr.c_str());

    stlStr = CW2A(unicodeStr); // wchar_t* to char*
    printf("Convert from wchar_t*, %s\n", stlStr.c_str());

    stlStr = CT2CA(cstr); // CString (generic) to const char*
    printf("Convert from CString, %s\n", stlStr.c_str());

    stlStr = CW2A(stlWStr.c_str()); // std::wstring to char*
    printf("Convert from std::wstring, %s\n", stlStr.c_str());
}

 

Convert to std::wstring
void ConvertToStlWstring()
{
    char ansiStr[255] = "ANSI string";
    wchar_t unicodeStr[255] = L"Unicode string"; // the L prefix makes a wchar_t literal

    CString cstr("ATL CString");

    std::string stlStr("STL string");
    std::wstring stlWStr(L"STL wstring");

    printf("All string convert to STL wstring\n");
    stlWStr = CComBSTR(ansiStr); // char* to wchar_t* through a temporary BSTR
    wprintf(L"Convert from char*, %s\n", stlWStr.c_str());

    stlWStr = unicodeStr; // already wide, direct assignment
    wprintf(L"Convert from wchar_t*, %s\n", stlWStr.c_str());

    stlWStr = cstr; // CString converts implicitly to const wchar_t* on Wince
    wprintf(L"Convert from CString, %s\n", stlWStr.c_str());

    stlWStr = CComBSTR(stlStr.c_str()); // std::string to wchar_t*
    wprintf(L"Convert from std::string, %s\n", stlWStr.c_str());
}

 

 

Pure C Runtime Library Conversion

Sometimes development is done in pure C on Win32, for example for Today screen plug-ins, without ATL, WTL, MFC, or STL. char * and wchar_t * still need to be converted in that case, but the ATL macros are not available. The following shows how to use the C Runtime Library for the conversion.

void ConvertToCharArrayUsingCRuntime()
{
    char ansiStr[255] = "ANSI string";
    wchar_t unicodeStr[255] = L"Unicode string"; // the L prefix makes a wchar_t literal

    printf("Convert to char* from wchar_t* using C Runtime library.\n");
    sprintf(ansiStr, "%S", unicodeStr); // in a narrow format string, %S takes a wide string (Microsoft extension)
    printf("Convert from wchar_t*, %s\n", ansiStr);
}

void ConvertToWCharArrayUsingCRuntime()
{
    char ansiStr[255] = "ANSI string";
    wchar_t unicodeStr[255] = L"Unicode string"; // the L prefix makes a wchar_t literal

    printf("Convert to wchar_t* from char* using C Runtime library.\n");
    swprintf(unicodeStr, L"%S", ansiStr); // in a wide format string, %S takes a narrow string (Microsoft extension)
    wprintf(L"Convert from char*, %s\n", unicodeStr);
}
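
As an alternative to the %S format specifier, the standard C functions wcstombs and mbstowcs can do the same job and make it easy to check the buffer size first. The sketch below is not taken from the original examples: the helper names wideToNarrow and narrowToWide are hypothetical, and I am assuming that the C Runtime on the target device provides both functions and that passing NULL as the destination returns the required length, as it does in the desktop runtimes I know.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <cwchar>

// Wide -> narrow into a caller-supplied buffer of dstSize bytes.
// Returns false if the text cannot be converted or does not fit.
inline bool wideToNarrow(const wchar_t* src, char* dst, std::size_t dstSize) {
    std::size_t needed = std::wcstombs(NULL, src, 0);  // length without terminator
    if (needed == static_cast<std::size_t>(-1) || needed + 1 > dstSize) return false;
    std::wcstombs(dst, src, dstSize);
    return true;
}

// Narrow -> wide into a caller-supplied buffer of dstCount wchar_t elements.
inline bool narrowToWide(const char* src, wchar_t* dst, std::size_t dstCount) {
    std::size_t needed = std::mbstowcs(NULL, src, 0);  // length without terminator
    if (needed == static_cast<std::size_t>(-1) || needed + 1 > dstCount) return false;
    std::mbstowcs(dst, src, dstCount);
    return true;
}

inline void demoCrtConversion() {
    char narrow[32];
    wchar_t wide[32];
    char tiny[4];

    bool ok1 = wideToNarrow(L"Unicode string", narrow, sizeof(narrow));
    assert(ok1 && std::strcmp(narrow, "Unicode string") == 0);

    bool ok2 = narrowToWide("ANSI string", wide, sizeof(wide) / sizeof(wide[0]));
    assert(ok2 && std::wcscmp(wide, L"ANSI string") == 0);

    bool ok3 = wideToNarrow(L"too long", tiny, sizeof(tiny));
    assert(!ok3);  // rejected instead of overflowing the buffer
}
```

Both functions convert according to the current C locale, so text outside the ASCII range may need a suitable locale to be set first.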

 

Suggestions

The sections above described the strings that Windows Mobile and Windows Embedded CE support. How should you choose among them? There is no fixed rule. What follows is my own experience, offered for reference only, not as a standard.

1. Avoid using char * and wchar_t * whenever possible

Apart from the following cases, where char * and wchar_t * have to be used, avoid them in most situations.

Case 1

Today screen component development that uses only Win32. If the code must not depend on ATL, WTL, MFC, or STL, you have no choice but to use char * and wchar_t *.

For more information about Today screen component development, see:

About how to use WTL in Windows Mobile Today

Case 2

Packaging a DLL or a general static library for third-party use, such as the TinyXML and CppUnitLite class libraries. Both implement their own string handling classes on top of char *, so that the libraries do not depend on ATL, WTL, MFC, or STL.

For more information about TinyXML and CppUnitLite, see:

Use TinyXML for Native C++ development in Windows Mobile and Wince

Unit Testing of native C ++ in Windows Mobile and Windows Mobile

Use CppUnitLite on Windows Mobile to output test results

 

Case 3

Wrapping a DLL for use from the .NET Compact Framework. The interface functions can only use char * and wchar_t *; they cannot use CString or std::string.

For DLL wrapping, refer to:

How to encapsulate Native DLL provided to .NET Compact Framework in Windows Mobile and Wince (Windows Embedded CE) for calling

Encapsulation of Native DLL in Windows Mobile and Wince (Windows Embedded CE)
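
A sketch of what such an interface function might look like is given below. BuildGreeting is a hypothetical function invented for this example; in a real DLL it would also need an export declaration such as __declspec(dllexport) or a .def file entry. Only plain pointers and integers cross the boundary, while std::wstring is free to be used inside.

```cpp
#include <cassert>
#include <cwchar>
#include <string>

// Hypothetical exported function. The caller supplies the output buffer and its
// size in characters. Returns the number of characters written (excluding the
// terminator), or -1 if an argument is invalid or the buffer is too small.
extern "C" int BuildGreeting(const wchar_t* name, wchar_t* out, int outCount) {
    if (name == NULL || out == NULL || outCount <= 0) return -1;

    std::wstring text = L"Hello, ";  // string classes stay inside the DLL
    text += name;

    if (text.size() + 1 > static_cast<std::size_t>(outCount)) return -1;
    std::wcscpy(out, text.c_str());
    return static_cast<int>(text.size());
}
```

Letting the caller own the buffer avoids returning memory that was allocated inside the DLL, which the managed side would have no safe way to free.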

 

Case 4

String constants. You can define constants of type char * and wchar_t * to replace macro definitions.

 

Apart from the cases above, avoid char * and wchar_t * as much as possible. Use a string class such as CString or std::string instead.
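
For instance, a typed constant can take the place of a #define. The names below (APP_NAME_MACRO, kAppNameA, kAppNameW, demoStringConstants) are made up for this sketch.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>

// Macro version: no type and no scope, replaced textually by the preprocessor.
#define APP_NAME_MACRO "Demo"

// Typed constants: checked by the compiler and limited to this file.
static const char* const    kAppNameA = "Demo";
static const wchar_t* const kAppNameW = L"Demo";

inline void demoStringConstants() {
    assert(std::strcmp(kAppNameA, APP_NAME_MACRO) == 0);
    assert(std::wcslen(kAppNameW) == 4);
}
```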

 

2. Use CString to support both PC and Windows Mobile versions

If you develop in C++ with ATL, WTL, or MFC and the program must support the Windows desktop as well as Windows Mobile and Wince, you can use CString. CString is compatible with both ANSI and Unicode builds.

For example, a database access class for SQL Server Compact that I wrote uses CString and supports both PC and Windows Mobile. Refer to:

Encapsulation of access to SqlCe by Native C ++ in Windows Mobile

 

3. Use std::string for cross-platform applications

If the program runs not only on Windows but also on Linux, Unix, and BSD platforms, consider using std::string. I generally do not use std::wstring because I do not think it is necessary: use std::string and convert when needed. If you want even greater portability, however, you can only use char * and wchar_t *, with no dependency on STL.

 

I personally like to use std::string because I use STL a lot. In my designs the user interface and the processing logic are separated. The processing logic uses std::string and STL containers throughout, and strings are converted only where the code interacts with the user interface or calls Win32.

 

Articles for further reference

http://www.tenouk.com/ModuleG.html

http://www.codeproject.com/KB/string/cppstringguide1.aspx
