Learning MFC by Writing a Serial Port Helper Tool (III): Converting Between Wide-Character and Multibyte Strings Under the Unicode Character Set

I have done MFC programming several times. By the end of each project the basic MFC operations are clear to me, but after a long stretch without touching MFC I have to get familiar with it from scratch on the next project. This time I am getting reacquainted with MFC by writing a serial port assistant, and I am keeping notes for later reference. Most of the problems I ran into were solved by searching Baidu and Google, so much of my understanding is superficial: I know what works without always knowing why. Also, the tool was written only for practice, so many features are incomplete. (Development tool: VS2008)

(III) Converting between wide-character and multibyte strings under the Unicode character set

The previous section, "(II) The 'Open serial port' button", introduced the basic technique for converting between wide-character and multibyte strings. This section explains it briefly.

In Visual C++ .NET 2005 the default project character set is Unicode, whereas in older environments such as VC6.0 the default is the multibyte character set (MBCS: Multi-Byte Character Set). As a result, many simple and handy string operations and functions from VC6.0 code report assorted errors when built in VS2005. What follows summarizes several ways to convert between CString and char * under the Unicode character set in Visual C++ .NET 2005; in effect, this is conversion between the Unicode and MBCS character sets.

1. Under Unicode: converting CString / WCHAR * to char *

Method one: use the API WideCharToMultiByte

CString str = _T("D:\\in-school project\\qq.bmp");

// Note: n and len below differ: n is counted in characters, len in bytes
int n = str.GetLength(); // with the author's original path, whose folder name was four Chinese characters: n = 14, len = 18

// Get the size of the converted string, in bytes
int len = WideCharToMultiByte(CP_ACP, 0, str, str.GetLength(), NULL, 0, NULL, NULL);

// Allocate the multibyte character array: len bytes plus one for the terminator
char *pFileName = new char[len + 1]; // in bytes

// Convert the wide-character encoding to the multibyte encoding
WideCharToMultiByte(CP_ACP, 0, str, str.GetLength(), pFileName, len, NULL, NULL);

pFileName[len] = '\0'; // a multibyte string ends with '\0'; the last valid index is len, not len + 1
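
The measure, allocate, convert, terminate sequence above can also be tried outside Windows. The following is only a portable sketch of the same pattern, assuming the standard C++ function std::wcsrtombs in place of WideCharToMultiByte and the default "C" locale in place of CP_ACP (so only ASCII input is expected to convert). The name narrow_two_pass is made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: convert a wide string to a multibyte string in the
// current C locale, using the same two-pass shape as the Win32 code above.
std::string narrow_two_pass(const std::wstring& wide)
{
    const std::size_t kError = static_cast<std::size_t>(-1);

    // Pass 1: a null destination only measures. The result is the number of
    // bytes needed, NOT counting the terminating '\0'.
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = wide.c_str();
    const std::size_t len = std::wcsrtombs(nullptr, &src, 0, &state);
    if (len == kError)
        throw std::runtime_error("a character has no multibyte form in this locale");

    // Allocate len + 1 bytes, so the valid indices are 0..len.
    std::vector<char> buf(len + 1);

    // Pass 2: convert into the buffer.
    state = std::mbstate_t();
    src = wide.c_str();
    const std::size_t written = std::wcsrtombs(buf.data(), &src, buf.size(), &state);
    assert(written == len);
    (void)written;  // silence the unused warning when asserts are disabled

    buf[len] = '\0';  // index len; index len + 1 would be past the end
    return std::string(buf.data(), len);
}
```

Calling narrow_two_pass(L"qq.bmp") should give back the same six characters as a narrow string. Using std::vector for the buffer also means there is no delete[] to forget.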

Method two: use the macros T2A and W2A

CString str = _T("D:\\in-school project\\qq.bmp");

// Declare the local variables the conversion macros rely on
USES_CONVERSION;

// Call the macro; T2A and W2A are both available in ATL and MFC
char *pFileName = T2A(str);
// char *pFileName = W2A(str); // alternative: this also performs the conversion

Note: you may sometimes need to add #include <afxpriv.h>

2. Under Unicode: converting char * to CString / WCHAR *

Method one: use the API MultiByteToWideChar

const char *pFileName = "D:\\in-school project\\qq.bmp";

// Size of the char * string in bytes; each Chinese character takes two bytes
int charLen = (int)strlen(pFileName);

// Size of the converted string, counted in wide characters
int len = MultiByteToWideChar(CP_ACP, 0, pFileName, charLen, NULL, 0);

// Allocate the wide-character array: len characters plus one for the terminator
TCHAR *buf = new TCHAR[len + 1];

// Convert the multibyte encoding to the wide-character encoding
MultiByteToWideChar(CP_ACP, 0, pFileName, charLen, buf, len);

buf[len] = '\0'; // terminate the string; note the index is len, not len + 1
// Convert the TCHAR array to a CString
CString pWideChar;
pWideChar.Append(buf);

// Free the buffer
delete[] buf;

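Method two: use the macros A2T and A2W

const char *pFileName = "D:\\in-school project\\qq.bmp";

USES_CONVERSION;
CString s = A2T(pFileName);

// CString s = A2W(pFileName); // alternative: this also performs the conversion

// or, to obtain a raw pointer:

WCHAR *bt = A2T(pFileName);

Here the pointer bt takes its initial value directly from the result of the A2T macro. These macros allocate the converted string on the stack, so the pointer is valid only until the enclosing function returns.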

Method three: use the _T macro to write a string literal as wide characters

// Multibyte character set: this statement compiles in VC6 and VC7, but not in VS2005, where the default is the Unicode character set
AfxMessageBox("Failed to load data", 0);

// Write string literals with TEXT("") or _T(""); this form works in both Unicode and non-Unicode builds
AfxMessageBox(_T("Failed to load data"), 0);

Note: a direct cast works in an MBCS-based project but not in a Unicode-based one. There CString stores its data as Unicode, so forcing a cast to char * yields only the first character.
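
Why only the first character? In UTF-16 every ASCII character is stored as two bytes, one of which is zero, and char-based functions treat that zero byte as the end of the string. The following is a small portable sketch of the effect, assuming char16_t as a stand-in for WCHAR; the function name is made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: the length a char-based function would see if a
// UTF-16 buffer were merely cast to const char* instead of being converted.
std::size_t length_seen_through_cast(const char16_t* wide)
{
    assert(wide != nullptr);
    // Inspecting any object through a char pointer is allowed, so the cast
    // compiles and runs. It just does not convert anything.
    const char* bytes = reinterpret_cast<const char*>(wide);
    return std::strlen(bytes);  // stops at the first zero byte
}
```

For a string that starts with an ASCII letter, such as u"Failed to load data", a little-endian machine sees the letter followed by a zero byte and reports a length of 1. A big-endian machine sees the zero byte first and reports 0. Either way the rest of the text is lost, which is why a real conversion is needed.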

Citation: http://blog.csdn.net/neverup_/article/details/5664733

Notes:

Using A2T() or A2W() to convert a char array to a WCHAR array produced errors in my tests. The reason is unclear.

MultiByteToWideChar() behaved the same way as A2T() and A2W(). In the end I converted the char array to a WCHAR array by assigning each element directly.

In my tests MultiByteToWideChar() and WideCharToMultiByte() gave the same results as A2T() and T2A(). The API functions are more involved; the macros are simpler to use.

The following article, "Correct usage of MultiByteToWideChar and WideCharToMultiByte, with a detailed explanation of the parameters", is quoted from: http://www.cnblogs.com/ziwuge/archive/2011/11/05/2236968.html

The methods in this article also use the two functions discussed there. In case the blogger deletes the post, I have copied it here directly for later reference.

Correct usage of MultiByteToWideChar and WideCharToMultiByte, with a detailed explanation of the parameters

The content of this article is excerpted from Windows Core Programming (Fifth Edition), page 26.

The book covers the use of these two functions in detail; this is only a memo for myself. For the function parameters, see the Baidu Encyclopedia entries on MultiByteToWideChar and WideCharToMultiByte.

Function prototypes:

int MultiByteToWideChar(
    UINT CodePage,
    DWORD dwFlags,
    LPCSTR lpMultiByteStr,
    int cbMultiByte,
    LPWSTR lpWideCharStr,
    int cchWideChar
);
int WideCharToMultiByte(
    UINT CodePage,
    DWORD dwFlags,
    LPCWSTR lpWideCharStr,
    int cchWideChar,
    LPSTR lpMultiByteStr,
    int cbMultiByte,
    LPCSTR lpDefaultChar,
    LPBOOL lpUsedDefaultChar
);

For safe use, follow these general steps.
MultiByteToWideChar:
1) Call MultiByteToWideChar, passing NULL for the lpWideCharStr parameter, 0 for the cchWideChar parameter, and -1 for the cbMultiByte parameter;
2) Allocate a block of memory large enough to hold the converted Unicode string; its size in bytes is the return value of the previous MultiByteToWideChar call multiplied by sizeof(wchar_t);
3) Call MultiByteToWideChar again, this time passing the buffer address as the lpWideCharStr parameter and the size of the buffer as the cchWideChar parameter. cchWideChar is counted in characters, so the return value of the first call is the value to pass; the multiplication by sizeof(wchar_t) is needed only for the byte size in step 2;
4) Use the converted string;
5) Release the block of memory occupied by the Unicode string.
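
The five steps can be sketched in portable C++ as follows. This is not the Win32 call itself: it assumes the standard function std::mbsrtowcs as a stand-in for MultiByteToWideChar and the default "C" locale as the source encoding, so only ASCII input is expected to work. The name widen_two_pass is made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper that follows steps 1) to 5) with std::mbsrtowcs.
std::wstring widen_two_pass(const char* narrow)
{
    const std::size_t kError = static_cast<std::size_t>(-1);

    // Step 1: a null destination only measures. The result is a count of
    // wide characters, without the terminator.
    std::mbstate_t state = std::mbstate_t();
    const char* src = narrow;
    const std::size_t count = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (count == kError)
        throw std::runtime_error("invalid multibyte sequence");

    // Step 2: allocate count + 1 elements. In raw bytes that would be
    // (count + 1) * sizeof(wchar_t); the vector does the multiplication.
    std::vector<wchar_t> buffer(count + 1);

    // Step 3: convert. The size argument is in characters, not bytes.
    state = std::mbstate_t();
    src = narrow;
    const std::size_t written =
        std::mbsrtowcs(buffer.data(), &src, buffer.size(), &state);
    assert(written == count);
    (void)written;  // silence the unused warning when asserts are disabled

    // Step 4: use the converted string.
    return std::wstring(buffer.data(), count);
    // Step 5: the vector releases the memory when it goes out of scope.
}
```

With a raw new[] buffer, step 5 would be an explicit delete[] on every return path; letting a container own the buffer is one way to make that step hard to miss.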

WideCharToMultiByte:
The steps are similar to those above. The only difference is that the return value is already the number of bytes needed for a successful conversion, so no multiplication is required.

Chapter 2 of Windows Core Programming (character and string processing) covers many recommended practices for handling characters and strings, such as whether to use the C library string functions or Microsoft's own versions with the _s suffix.

[Attachment] Chapter 2 of Windows Core Programming, PDF download: HTTP://DL.DBANK.COM/C0PARCJXSV

Parameters of MultiByteToWideChar and WideCharToMultiByte
The following section is excerpted from: http://www.cnblogs.com/wanghao111/archive/2009/05/25/1489021.html#2270293

WideCharToMultiByte: this function converts a wide string to a new string in a specified encoding, such as ANSI or UTF-8. The new string does not have to be in a multibyte character set.
Parameters:

CodePage: the code page to convert to. It can be any code page that is installed or that ships with the system, or one of the values below.
CP_ACP: the current system ANSI code page
CP_MACCP: the current system Macintosh code page
CP_OEMCP: the current system OEM code page (an original equipment manufacturer hardware scan code)
CP_SYMBOL: the Symbol code page, for Windows 2000 and later (I do not know what it is for)
CP_THREAD_ACP: the current thread's ANSI code page, for Windows 2000 and later (I do not know what it is for)
CP_UTF7: UTF-7; lpDefaultChar and lpUsedDefaultChar must be NULL when this value is used
CP_UTF8: UTF-8; lpDefaultChar and lpUsedDefaultChar must be NULL when this value is used
/* I think the most commonly used values are CP_ACP and CP_UTF8: the former converts wide characters to ANSI, the latter to UTF-8. */

dwFlags: specifies how to handle characters that cannot be converted. The function runs faster when this parameter is not set; I set it to 0. The possible values are listed below:
WC_NO_BEST_FIT_CHARS: converts any Unicode character that has no direct multibyte equivalent to the default character specified by lpDefaultChar. In other words, if you convert Unicode to multibyte and back again, you do not necessarily get the same Unicode characters, because default characters may have been substituted along the way. This option can be used alone or together with other options.
WC_COMPOSITECHECK: converts composite characters to precomposed characters. It can be combined with any one of the last three options; if none of them is given, it behaves as with WC_SEPCHARS.
WC_ERR_INVALID_CHARS: makes the function fail when it encounters an invalid character, with GetLastError returning ERROR_NO_UNICODE_TRANSLATION. Without it, the function silently drops invalid characters (Windows Vista and later replace them with U+FFFD instead). This option applies only to UTF-8.
WC_DISCARDNS: discards nonspacing characters during conversion; used with WC_COMPOSITECHECK
WC_SEPCHARS: generates separate characters during conversion; this is the default conversion option; used with WC_COMPOSITECHECK
WC_DEFAULTCHAR: replaces exception characters with the default character during conversion (most commonly '?'); used with WC_COMPOSITECHECK
/* When WC_COMPOSITECHECK is specified, the function converts composite characters to precomposed characters. A composite character consists of a base character and a nonspacing character (such as the diacritics of European languages or the tone marks of Hanyu Pinyin), each with its own character value. A precomposed character has a single character value that represents the combination of a base character and a nonspacing character. When WC_COMPOSITECHECK is specified, the last three options in the list above can also be used to customize the conversion rules for precomposed characters. These options determine what the function does when it meets a composite character in the wide string that has no corresponding precomposed character. They are used together with WC_COMPOSITECHECK; if none of them is specified, the function defaults to WC_SEPCHARS. For the following code pages dwFlags must be 0, otherwise the function fails with the error code ERROR_INVALID_FLAGS: 50220, 50221, 50222, 50225, 50227, 50229, 52936, 54936, 57002 to 57011, 65000 (UTF-7), 42 (Symbol).
For UTF-8, dwFlags must be 0 or WC_ERR_INVALID_CHARS; otherwise the function fails and sets the error code ERROR_INVALID_FLAGS, which you can retrieve by calling GetLastError. */

lpWideCharStr: the wide string to convert.
cchWideChar: the length of the wide string to convert, in characters; -1 means the string is null-terminated and is converted up to and including its terminator.
lpMultiByteStr: the buffer that receives the converted string.
cbMultiByte: the size of the output buffer, in bytes. If it is 0, lpMultiByteStr is ignored and the function returns the required buffer size without writing to lpMultiByteStr.
lpDefaultChar: pointer to the character used in place of any character that has no representation in the specified code page. If NULL, the system default character is used. For code pages that require this parameter to be NULL (CP_UTF7 and CP_UTF8), passing anything else makes the function fail and set the error code ERROR_INVALID_PARAMETER.
lpUsedDefaultChar: pointer to a flag variable that indicates whether the default character was used. For code pages that require this parameter to be NULL (CP_UTF7 and CP_UTF8), passing anything else makes the function fail and set the error code ERROR_INVALID_PARAMETER. The function is faster when both lpDefaultChar and lpUsedDefaultChar are NULL.
/* Note: using WideCharToMultiByte incorrectly can compromise the security of the program. Calling it can easily cause a buffer overrun, because the size of the input buffer that lpWideCharStr points to is counted in wide characters, whereas the size of the output buffer that lpMultiByteStr points to is counted in bytes. To avoid an overrun, be sure to specify an appropriate size for the output buffer. My approach is to call WideCharToMultiByte once with cbMultiByte set to 0 to get the required buffer size, allocate the buffer, and then call WideCharToMultiByte again to fill it, as in the example code later in this article. In addition, converting from Unicode (UTF-16) to a non-Unicode character set can lose data, because that character set may have no character to represent a particular piece of Unicode data. */

Return value: if the function succeeds and cbMultiByte is nonzero, it returns the number of bytes written to lpMultiByteStr (including the terminating null when the input includes it, for example when cchWideChar is -1); if cbMultiByte is 0, it returns the number of bytes required for the conversion. If the function fails, it returns 0.
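
The return value is a byte count, which in general differs from the character count passed in cchWideChar. To show where the difference comes from, here is a sketch of the UTF-8 encoding rule, the target encoding when CodePage is CP_UTF8. It is an illustration of the encoding only, not of the Windows API: it assumes the input is already a sequence of Unicode code points (char32_t) rather than UTF-16, it does not reject surrogate values, and the name to_utf8 is made up for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode Unicode code points as UTF-8 bytes.
std::string to_utf8(const std::u32string& text)
{
    std::string out;
    for (char32_t cp : text) {
        assert(cp <= 0x10FFFF);  // outside the Unicode range is a caller error
        if (cp < 0x80) {                       // ASCII: 1 byte
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes, includes common CJK
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Six ASCII letters come out as six bytes, while a single code point in the range U+0800 to U+FFFF comes out as three. A double-byte ANSI code page such as GBK would use two bytes for a Chinese character, so the required buffer size depends on the code page as well as on the text. That is the reason to ask the function for the size instead of guessing it.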

MultiByteToWideChar is the function that converts multibyte characters to wide characters.
It converts a multibyte string to a wide (Unicode) string; the string to be converted is not necessarily multibyte.
For the parameters, return value, and caveats of this function, refer to the description of WideCharToMultiByte above. Only dwFlags is explained here.

dwFlags: specifies whether to convert to precomposed or composite wide characters, whether to use glyph characters in place of control characters, and how to handle invalid characters.
MB_PRECOMPOSED: always use precomposed characters; that is, when a single precomposed character exists, the decomposed base character plus nonspacing character is not used. This is the default option and cannot be combined with MB_COMPOSITE
MB_COMPOSITE: always use decomposed characters, that is, always the base character plus nonspacing character form
MB_ERR_INVALID_CHARS: with this option set, the function fails when it encounters an invalid character and sets the error code ERROR_NO_UNICODE_TRANSLATION; otherwise invalid characters are dropped (Windows Vista and later replace them with U+FFFD instead)
MB_USEGLYPHCHARS: use glyph characters instead of control characters
/* dwFlags must be 0 for the following code pages, otherwise the function fails with the error code ERROR_INVALID_FLAGS: 50220, 50221, 50222, 50225, 50227, 50229, 52936, 54936, 57002 to 57011, 65000 (UTF-7), 42 (Symbol). For UTF-8, dwFlags must be 0 or MB_ERR_INVALID_CHARS, otherwise the function fails with the error code ERROR_INVALID_FLAGS. */
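
The difference between precomposed and composite (decomposed) characters is easier to see with a concrete pair: the letter e with an acute accent can be stored as the single code point U+00E9, or as the base letter e followed by the nonspacing mark U+0301. The following toy sketch folds the second form into the first. It only illustrates the idea behind MB_PRECOMPOSED and WC_COMPOSITECHECK and is not how Windows implements them; the two-entry table and the name compose_acute are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: fold "base letter + U+0301 (combining acute accent)"
// into one precomposed code point. Only 'e' and 'a' are handled here.
std::u32string compose_acute(const std::u32string& in)
{
    const char32_t kCombiningAcute = 0x0301;
    std::u32string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const bool accent_follows =
            i + 1 < in.size() && in[i + 1] == kCombiningAcute;
        if (accent_follows && in[i] == U'e') {
            out += static_cast<char32_t>(0x00E9);  // precomposed e with acute
            ++i;                                    // consume the accent as well
        } else if (accent_follows && in[i] == U'a') {
            out += static_cast<char32_t>(0x00E1);  // precomposed a with acute
            ++i;
        } else {
            out += in[i];  // no precomposed form known: keep the characters separate
        }
    }
    assert(out.size() <= in.size());  // composing never lengthens the text
    return out;
}
```

The last branch, which leaves a base character and its mark as two separate characters, loosely corresponds to what the text above describes as the WC_SEPCHARS default.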

Add an example for your reference, operating environment (VC 6.0, 32 pirate Win7 flagship edition)

#include <windows.h>
int Apientry WinMain (hinstance hinstance,
HInstance hPrevInstance,
LPSTR lpCmdLine,
int nCmdShow)
{
Todo:place code here.
wchar_t wsztest[] = L "Ziwuge";
wchar_t wsztestnew[] = L "Ziwuge Blog Park";
int nwsztestlen = LSTRLENW (wsztest); 6
int nwsztestnewlen = LSTRLENW (wsztestnew); 9
int nwsztestsize = sizeof (wsztest); 14
int nwsztestnewsize = sizeof (wsztestnew); 20
int NChar = WideCharToMultiByte (CP_ACP, 0, Wsztestnew,-1, NULL, 0, NULL, NULL); 13, The returned result contains the memory to be occupied by ' \ '
NChar = NChar * sizeof (char); 13, in fact, this step is not required, please see this article explained earlier
char* Szresult = new Char[nchar];
ZeroMemory (Szresult, NChar);
int i = WideCharToMultiByte (CP_ACP, 0, Wsztestnew,-1, Szresult, nChar, NULL, NULL); 13
int nszresultlen = Lstrlena (Szresult); 12
int nszresultsize = sizeof (Szresult); 4
Char sztest[] = "Ziwuge";
Char sztestnew[] = "Ziwuge Blog Park";
int nsztestlen = Lstrlena (sztest); 6
int nsztestnewlen = Lstrlena (sztestnew); 12
int nsztestsize = sizeof (sztest); 7
int nsztestnewsize = sizeof (sztestnew); 13
int Nwchar = MultiByteToWideChar (CP_ACP, 0, Sztestnew,-1, NULL, 0); 10, the returned result contains the memory to be occupied by ' \ '
Nwchar = Nwchar * sizeof (wchar_t); 20
wchar_t* Wszresult = new Wchar_t[nwchar];
ZeroMemory (Wszresult, Nwchar);
Int J = MultiByteToWideChar (CP_ACP, 0, Sztestnew,-1, Wszresult, Nwchar); 10
int nwszresultlen = LSTRLENW (Wszresult); 9
int nwszresultsize = sizeof (Wszresult); 4
return 0;
}
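
Two of the values in the example are easy to misread: sizeof applied to szResult and wszResult gives 4 because both are pointers, not arrays. The following portable sketch puts the three measurements side by side. The struct and function names are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical record of three ways to "measure" the same text.
struct Sizes {
    std::size_t array_bytes;    // sizeof applied to the array itself
    std::size_t pointer_bytes;  // sizeof applied to a pointer to the same data
    std::size_t string_length;  // characters before the terminator
};

Sizes measure()
{
    char text[] = "ziwuge";      // 6 characters plus '\0'
    const char* alias = text;    // points at the same bytes

    Sizes s;
    s.array_bytes = sizeof(text);           // the compiler knows the array extent
    s.pointer_bytes = sizeof(alias);        // size of a pointer on this platform
    s.string_length = std::strlen(alias);   // counts up to the terminator
    assert(s.array_bytes == s.string_length + 1);
    return s;
}
```

On a 32-bit build pointer_bytes is 4, as in the example above; on a 64-bit build it is typically 8. For a buffer obtained from new[], the only reliable size is the one used to allocate it, so it is worth keeping that value in a variable.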

References, with thanks to the authors:

http://www.cnblogs.com/wanghao111/tag/%E5%AE%BD%E5%AD%97%E7%AC%A6%E5%BA%93%E5%87%BD%E6%95%B0/
