wchar_t*, wchar_t, wchar_t array, char, char*, char array, std::string, std::wstring, CString, and system("comman

Source: Internet
Author: User
Tags: control characters

About wchar_t

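In C++, wchar_t is the wide character type. The C++ standard does not fix its size: with Microsoft's compiler on Windows, which the code in this article targets, a wchar_t occupies 2 bytes (16 bits), while on most Linux and macOS toolchains it occupies 4 bytes. A char occupies 1 byte (8 bits), which is too small to hold a Chinese character, so Chinese text is represented either with wchar_t or with a multi-byte char encoding.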
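
Because the width of wchar_t depends on the compiler (16 bits with Microsoft's compiler, typically 32 bits with GCC or Clang on Linux and macOS), it can help to check it before relying on it. The following is a minimal sketch; the helper names wchar_bits and print_char_sizes are made up for this illustration and are not part of any library.

```cpp
#include <climits>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: width of wchar_t in bits on the current compiler.
constexpr std::size_t wchar_bits()
{
    return sizeof(wchar_t) * CHAR_BIT;
}

// Print both sizes so the difference between platforms is visible.
void print_char_sizes()
{
    std::printf("char:    %zu bits\n", sizeof(char) * CHAR_BIT);
    std::printf("wchar_t: %zu bits\n", wchar_bits());
}
```

Calling print_char_sizes() from main shows which case applies on the machine at hand. Code that assumes a 2-byte wchar_t, as the examples in this article do, should only be built where wchar_bits() is 16.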

The C library function wprintf() and the wide stream objects of the iostream library (such as wcout) operate on the wchar_t wide character type.

locale loc("chs");          // set the locale to Simplified Chinese ("chs" is an MSVC locale name)
wcout.imbue(loc);           // make wcout use that locale so that Chinese characters are displayed
wchar_t str[] = L"China";   // define a wide character array; note that the L prefix is uppercase
wcout << str << endl;       // display the wide character array
wprintf(str);               // display it again through the C library

system("pause");

The code to convert a wchar_t to char is as follows.

Given the following wchar_t and char variables:

wchar_t w_cn = L'中';
char c_cn[3] = {0};   // 2 bytes for the character plus a terminating '\0'

char *c2w(wchar_t w_cn, char c_cn[3])
{
    // The following code converts the wchar_t to char
    c_cn[0] = (char)(w_cn >> 8);   // high byte
    c_cn[1] = (char)w_cn;          // low byte
    c_cn[2] = '\0';

    return c_cn;
}

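Note that a 16-bit wchar_t has to be stored in two 8-bit chars. One more point to watch: the code above stores the high byte of the wchar_t at the lower index of the char array (c_cn[0]) and the low byte at the higher index (c_cn[1]). The two bytes are the raw halves of the 16-bit value, not its ANSI code-page encoding; for a real encoding conversion, use WideCharToMultiByte, which is described later in this article.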
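
The byte split described above can be sketched without depending on the platform's wchar_t size by using a fixed-width 16-bit type. This is only an illustration: split_code_unit is a hypothetical helper, and std::uint16_t stands in for a 16-bit wchar_t.

```cpp
#include <cstdint>

// Hypothetical helper (not from the article): split one 16-bit code unit
// into two bytes, storing the high byte at index 0 and the low byte at
// index 1.
void split_code_unit(std::uint16_t unit, unsigned char out[2])
{
    out[0] = static_cast<unsigned char>(unit >> 8);    // high byte
    out[1] = static_cast<unsigned char>(unit & 0xFF);  // low byte
}
```

For the character U+4E2D, whose 16-bit value is 0x4E2D, this stores 0x4E in out[0] and 0x2D in out[1]. Using unsigned char avoids the sign problems that plain char can cause for byte values above 0x7F.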

Convert wchar_t * type to char * type

CString strname("listen");   // assumes a Unicode build, where CString stores wchar_t
char *pcstr = new char[2 * strname.GetLength() + 1];

WideCharToMultiByte(CP_ACP,
    0,
    strname,                      // the wchar_t* to be converted
    -1,
    pcstr,                        // pointer to the char* buffer that receives the result
    2 * strname.GetLength() + 1,  // size of the pcstr buffer in bytes
    NULL,
    NULL);

About system("command")

system("command") passes a command to the system command interpreter (cmd.exe on Windows, historically the DOS prompt) and executes it. system("pause") runs the console command pause, which waits for the user to press a key.
Difference between system("pause") and getchar()

system("pause") runs the pause command of the Windows console.
system(const char *) can run any command that the Windows console accepts.
For example, system("exit");
system("ping") and so on.

getchar() is just a C standard library function that waits for and reads one character from standard input. The two are very different: system() starts a separate command interpreter process and works only where the command exists, whereas getchar() is portable and runs inside the program itself.

How can I stop system("pause") from printing "Press any key to continue"?
Use system("pause > nul").

wchar_t*, wchar_t, wchar_t array, char, char*, char array, std::string, std::wstring, CString

#include <string>
#include <iostream>
#include <clocale>
#include <cstdio>
#include <cwchar>
#include <cstring>
// CString requires MFC; when using MFC, do not include <windows.h> yourself
#define _AFXDLL
#include <afx.h>
using namespace std;
// Convert a single-byte char* to a wide wchar_t*; the caller must delete[] the result
inline wchar_t *ansitounicode(const char *szstr)
{
    int nlen = MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, szstr, -1, NULL, 0);
    if (nlen == 0)
        return NULL;
    wchar_t *presult = new wchar_t[nlen];
    MultiByteToWideChar(CP_ACP, MB_PRECOMPOSED, szstr, -1, presult, nlen);
    return presult;
}
// Convert a wide wchar_t* to a single-byte char*; the caller must delete[] the result
inline char *unicodetoansi(const wchar_t *szstr)
{
    int nlen = WideCharToMultiByte(CP_ACP, 0, szstr, -1, NULL, 0, NULL, NULL);
    if (nlen == 0)
        return NULL;
    char *presult = new char[nlen];
    WideCharToMultiByte(CP_ACP, 0, szstr, -1, presult, nlen, NULL, NULL);
    return presult;
}
// Convert a single-byte std::string to a wide std::wstring
inline void ascii2widestring(const std::string &szstr, std::wstring &wszstr)
{
    int nlength = MultiByteToWideChar(CP_ACP, 0, szstr.c_str(), -1, NULL, 0);
    LPWSTR lpwszstr = new wchar_t[nlength];
    MultiByteToWideChar(CP_ACP, 0, szstr.c_str(), -1, lpwszstr, nlength);
    wszstr = lpwszstr;
    delete[] lpwszstr;
}
int _tmain(int argc, _TCHAR *argv[])
{
    // String literals are const in standard C++, hence the const_cast
    char *pchar = const_cast<char *>("I like char");
    wchar_t *pwidechar = const_cast<wchar_t *>(L"I hate wchar_t");
    wchar_t tagwidecharlist[100];
    char ch = 'a';
    char tagchar[100] = {0};
    CString cstr;   // the CString conversions below assume a multi-byte (non-Unicode) build
    std::string str;

    // Note: set the locale so that wide characters can be output
    setlocale(LC_ALL, "chs");

    // Note: convert char* to wchar_t*
    // Note: cout has no operator<< for wide strings, so it cannot output a wchar_t*
    pwidechar = ansitounicode(pchar);
    // Note: in MSVC, printf("%ls") and wprintf(L"%s") both print a wide string
    printf("%ls\n", pwidechar);

    // Note: convert wchar_t* to wchar_t[]
    wcscpy(tagwidecharlist, pwidechar);
    wprintf(L"%s\n", tagwidecharlist);

    // Note: convert wchar_t[] to wchar_t*
    delete[] pwidechar;   // free the buffer returned by ansitounicode first
    pwidechar = tagwidecharlist;
    wprintf(L"%s\n", pwidechar);

    // Note: convert char to string
    str.insert(str.begin(), ch);
    cout << str << endl;

    // Note: convert string to wchar_t*
    pwidechar = new wchar_t[str.length() + 1];   // + 1 for the terminating L'\0'
    swprintf(pwidechar, str.length() + 1, L"%hs", str.c_str());   // %hs: narrow string argument
    wprintf(L"%s\n", pwidechar);
    delete[] pwidechar;

    // Note: convert string to char* (valid only until str is next modified)
    pchar = const_cast<char *>(str.c_str());
    cout << pchar << endl;

    // Note: convert char* to string
    str = std::string(pchar);
    // Note: cout overloads << for std::string. With printf, use printf("%s", str.c_str());
    // printf("%s", str) is not allowed, because str is a class object
    cout << str << endl;

    // Note: convert string to char[]
    str = "Bored";
    strcpy(tagchar, str.c_str());
    printf("%s\n", tagchar);

    // Note: convert string to CString
    cstr = str.c_str();

    // Note: convert CString to string
    str = string(cstr.GetBuffer(cstr.GetLength()));

    // Note: convert char* to CString
    pchar = tagchar;   // point pchar at a live buffer again
    cstr = pchar;

    // Note: convert CString to char*
    pchar = cstr.GetBuffer(cstr.GetLength());

    // Note: convert CString to char[]
    strncpy(tagchar, (LPCTSTR)cstr, sizeof(tagchar) - 1);

    // Note: convert CString to wchar_t*
    pwidechar = cstr.AllocSysString();   // returns a BSTR
    printf("%ls\n", pwidechar);
    ::SysFreeString(pwidechar);          // a BSTR is freed with SysFreeString

    return 0;
}
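
The helper functions in this listing query the required length first and then convert into a buffer allocated with new[], which the caller has to delete[]. The same two-call pattern can return a std::wstring so that no manual cleanup is needed. The sketch below is an assumption-laden illustration, not the article's code: to_wide is a hypothetical name, and the standard function std::mbsrtowcs is used as a stand-in for MultiByteToWideChar so that the example builds without Windows headers. It converts according to the C locale currently in effect rather than a Windows code page.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: convert a multi-byte string to std::wstring using
// the C locale currently in effect. Returns an empty string if the input
// is not valid in that locale.
std::wstring to_wide(const std::string &narrow)
{
    std::mbstate_t state = std::mbstate_t();
    const char *src = narrow.c_str();

    // First call: a null destination only asks for the required length
    // (in wide characters, not counting the terminator).
    std::size_t len = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1))
        return std::wstring();

    // Second call: convert into a buffer of that length plus the terminator.
    std::vector<wchar_t> buffer(len + 1);
    state = std::mbstate_t();
    src = narrow.c_str();
    std::mbsrtowcs(buffer.data(), &src, buffer.size(), &state);
    return std::wstring(buffer.data(), len);
}
```

On Windows the same shape applies with MultiByteToWideChar in place of std::mbsrtowcs: call once with a zero-sized destination to learn the size, then call again with a buffer of that size.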

WideCharToMultiByte() function

Function: maps a Unicode string to a multi-byte string.

Function prototype: int WideCharToMultiByte(UINT CodePage, DWORD dwFlags, LPCWSTR lpWideCharStr, int cchWideChar, LPSTR lpMultiByteStr, int cchMultiByte, LPCSTR lpDefaultChar, PBOOL pfUsedDefaultChar);


CodePage: specifies the code page used for the conversion. This parameter can be set to any code page that is installed or available on the system. It can also be one of the following values:

CP_ACP: the ANSI code page; CP_MACCP: the Macintosh code page; CP_OEMCP: the OEM code page;

CP_SYMBOL: the symbol code page (42); CP_THREAD_ACP: the ANSI code page of the current thread;

CP_UTF7: convert using UTF-7; CP_UTF8: convert using UTF-8.

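dwFlags: a set of bit flags that indicate whether to translate to precomposed or composite wide characters (if a composite form exists), whether to use glyph characters in place of control characters, and how to deal with invalid characters. The MB_* constants described below are the ones accepted by the companion function MultiByteToWideChar; WideCharToMultiByte itself takes the corresponding WC_* constants (for example WC_COMPOSITECHECK and WC_NO_BEST_FIT_CHARS), and 0 is the usual value. You can specify a combination of the following flag constants: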

MB_PRECOMPOSED: always use precomposed characters, that is, characters in which a base character and a nonspacing character have a single character value. This is the default translation option. It cannot be used together with MB_COMPOSITE.

MB_COMPOSITE: always use composite characters, that is, characters in which a base character and a nonspacing character have different character values. It cannot be used together with MB_PRECOMPOSED.

MB_ERR_INVALID_CHARS: if the function encounters an invalid input character, it fails and GetLastError returns ERROR_NO_UNICODE_TRANSLATION.

MB_USEGLYPHCHARS: use glyph characters in place of control characters.

A composite character consists of a base character and a nonspacing character, each of which has its own character value. A precomposed character has a single character value for the combination of base and nonspacing character. In the character è, the e is the base character and the grave accent is the nonspacing character.

The default behavior of the function is to translate to the precomposed form. If no precomposed form exists, the function attempts to translate to a composite form.

The flags MB_PRECOMPOSED and MB_COMPOSITE are mutually exclusive, while MB_USEGLYPHCHARS and MB_ERR_INVALID_CHARS can be set regardless of the other flags.

lpWideCharStr: points to the Unicode string to be converted.

cchWideChar: specifies the number of characters in the buffer pointed to by lpWideCharStr. If this value is -1, the string is assumed to be null-terminated and its length is calculated automatically; the converted result then includes the terminating null character.

lpMultiByteStr: points to the buffer that receives the converted string.

cchMultiByte: specifies the maximum size, in bytes, of the buffer pointed to by lpMultiByteStr. If this value is zero, the function returns the number of bytes required for the destination buffer and does not write to lpMultiByteStr; in this case lpMultiByteStr is usually NULL.

lpDefaultChar and pfUsedDefaultChar: WideCharToMultiByte uses these two parameters only when it encounters a wide character that has no representation in the code page identified by the CodePage parameter. If a wide character cannot be converted, the function uses the character pointed to by lpDefaultChar. If this parameter is NULL (the usual case), the function uses the system default character, which is normally a question mark. This is dangerous for file names, because the question mark is a wildcard character. pfUsedDefaultChar points to a Boolean variable. The function sets this variable to TRUE if at least one character in the Unicode string could not be converted to an equivalent multi-byte character, and to FALSE if all characters were converted successfully. After the function returns, you can test this variable to check whether the wide string was converted without loss. Both parameters must be NULL when CodePage is CP_UTF7 or CP_UTF8.

Return value: if the function succeeds and cchMultiByte is nonzero, the return value is the number of bytes written to the buffer pointed to by lpMultiByteStr. If the function succeeds and cchMultiByte is zero, the return value is the number of bytes required for a buffer that can receive the converted string. If the function fails, the return value is zero. To get extended error information, call GetLastError, which can return the following error codes:

ERROR_INSUFFICIENT_BUFFER; ERROR_INVALID_FLAGS;

ERROR_INVALID_PARAMETER; ERROR_NO_UNICODE_TRANSLATION.

Note: the pointers lpMultiByteStr and lpWideCharStr must not be the same. If they are, the function fails and GetLastError returns ERROR_INVALID_PARAMETER.
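
The role of the default character and of the "used default character" flag can be imitated with the standard library, which may make the description above easier to follow. The sketch below is not the Windows API: to_narrow is a hypothetical helper, std::wcrtomb is used as a stand-in for the per-character conversion, and which characters count as unrepresentable depends on the C locale in effect (the "C" locale at program start, unless setlocale has been called).

```cpp
#include <climits>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical helper that imitates lpDefaultChar and pfUsedDefaultChar
// with the standard library; it is not the Windows API itself.
std::string to_narrow(const std::wstring &wide, char default_char,
                      bool &used_default)
{
    std::string result;
    used_default = false;
    std::mbstate_t state = std::mbstate_t();
    char bytes[MB_LEN_MAX];

    for (wchar_t wc : wide) {
        std::size_t n = std::wcrtomb(bytes, wc, &state);
        if (n == static_cast<std::size_t>(-1)) {
            // Not representable in the current locale: substitute.
            result += default_char;
            used_default = true;
            state = std::mbstate_t();  // the state is unspecified after an error
        } else {
            result.append(bytes, n);
        }
    }
    return result;
}
```

A caller would pass '?' (or a safer character when building file names) as default_char and check used_default afterwards, in the same way that pfUsedDefaultChar is tested after WideCharToMultiByte returns.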

ANSI and Unicode encoding

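Both are forms of character encoding.

ANSI encoding: characters outside 7-bit ASCII are represented by 2 bytes, the first of which lies in the 0x80 to 0xFF range, so that 2 bytes stand for 1 character (a Chinese character in GBK, for example).

Unicode encoding is a character encoding scheme defined by an international organization to accommodate all the characters in the world. It maps these characters to the code points 0 to 0x10FFFF.

My understanding: to put it bluntly, on Windows an ANSI string is a char string in the system code page (1 or 2 bytes per character), and a Unicode string is a wide-character (wchar_t) string.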
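
A 16-bit wide character cannot hold every code point in the 0 to 0x10FFFF range directly. Windows wide strings use UTF-16, in which code points above 0xFFFF occupy two 16-bit units (a surrogate pair). The following sketch shows that mapping; encode_utf16 is a hypothetical helper written for this illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: encode one Unicode code point (0 to 0x10FFFF) as
// UTF-16 code units. Returns an empty vector for values that are not
// valid code points.
std::vector<std::uint16_t> encode_utf16(std::uint32_t code_point)
{
    std::vector<std::uint16_t> units;
    if (code_point > 0x10FFFF ||
        (code_point >= 0xD800 && code_point <= 0xDFFF)) {
        return units;  // out of range, or a lone surrogate value
    }
    if (code_point < 0x10000) {
        // Fits in a single 16-bit unit.
        units.push_back(static_cast<std::uint16_t>(code_point));
    } else {
        // Split the remaining 20 bits across a high and a low surrogate.
        std::uint32_t offset = code_point - 0x10000;
        units.push_back(static_cast<std::uint16_t>(0xD800 + (offset >> 10)));
        units.push_back(static_cast<std::uint16_t>(0xDC00 + (offset & 0x3FF)));
    }
    return units;
}
```

For U+4E2D the result is the single unit 0x4E2D, while U+1F600 becomes the pair 0xD83D, 0xDE00. This is why the number of wchar_t elements in a Windows string is not always the number of characters.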

This article comes from Xuebuyuan. When reprinting, please cite the source: http://www.xuebuyuan.com/199513.html
