Generating UTF-8 encoded files in C++ with the WideCharToMultiByte function

Source: Internet
Author: User

The WideCharToMultiByte function maps a Unicode (wide-character) string to a multi-byte string in a given code page.

Function prototype:

int WideCharToMultiByte(
    UINT    CodePage,          // code page used to perform the conversion
    DWORD   dwFlags,           // additional control over the conversion; affects characters with diacritical marks (such as accents)
    LPCWSTR lpWideCharStr,     // points to the wide-character string to convert
    int     cchWideChar,       // number of characters in the buffer pointed to by lpWideCharStr
    LPSTR   lpMultiByteStr,    // points to the buffer that receives the converted string
    int     cchMultiByte,      // size, in bytes, of the buffer pointed to by lpMultiByteStr
    LPCSTR  lpDefaultChar,     // character used in place of any wide character that cannot be converted
    LPBOOL  pfUsedDefaultChar  // set to TRUE if at least one character could not be converted to its multi-byte form
);

Parameters:
CodePage: Specifies the code page used for the conversion. This can be the value of any code page that is installed or available on the system, or one of the following values:

    • CP_ACP: ANSI code page
    • CP_MACCP: Macintosh code page
    • CP_OEMCP: OEM code page
    • CP_SYMBOL: symbol code page (42)
    • CP_THREAD_ACP: ANSI code page of the current thread
    • CP_UTF7: convert using UTF-7
    • CP_UTF8: convert using UTF-8

Other parameters

  • lpWideCharStr: Points to the Unicode string to convert.
  • cchWideChar: Specifies the number of characters in the buffer pointed to by lpWideCharStr. If this value is -1, the string is assumed to be null-terminated and its length is computed automatically.
  • lpMultiByteStr: Points to the buffer that receives the converted string.
  • cchMultiByte: Specifies the size, in bytes, of the buffer pointed to by lpMultiByteStr. If this value is zero, the function writes nothing and returns the number of bytes the destination buffer needs; in that case lpMultiByteStr is usually NULL.
  • lpDefaultChar and pfUsedDefaultChar: The function uses these two parameters only when it encounters a wide character that has no representation in the code page identified by CodePage. If a wide character cannot be converted, the function substitutes the character pointed to by lpDefaultChar. If that parameter is NULL (the usual case), the function uses the system default character, which is usually a question mark. This is dangerous for file names because the question mark is a wildcard character. pfUsedDefaultChar points to a Boolean variable: the function sets it to TRUE if at least one character in the Unicode string could not be converted to an equivalent multi-byte character, and to FALSE if every character was converted. Test this variable after the call to check whether the whole string was converted successfully. When CodePage is CP_UTF7 or CP_UTF8, both parameters must be NULL.
  • Return value: If the function succeeds and cchMultiByte is nonzero, the return value is the number of bytes written to the buffer pointed to by lpMultiByteStr. If the function succeeds and cchMultiByte is zero, the return value is the number of bytes required for a buffer that can receive the converted string. If the function fails, the return value is zero. To get extended error information, call GetLastError, which can return one of the following error codes:
  • ERROR_INSUFFICIENT_BUFFER; ERROR_INVALID_FLAGS;
  • ERROR_INVALID_PARAMETER; ERROR_NO_UNICODE_TRANSLATION.
  • Note: The pointers lpMultiByteStr and lpWideCharStr must not be the same. If they are, the function fails and GetLastError returns ERROR_INVALID_PARAMETER.
  • Windows CE: The CP_UTF7 and CP_UTF8 values for the CodePage parameter are not supported, nor is the WC_NO_BEST_FIT_CHARS value for the dwFlags parameter.
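The size query described above (pass zero for cchMultiByte, then call again with a buffer of the returned size) is often wrapped in a small helper. The following is a minimal sketch, not a definitive implementation. The name ToUtf8 is hypothetical, and the #else branch is an assumption added only so that the sketch also compiles off Windows, where each wchar_t is taken to hold one whole code point.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: converts a wide string to UTF-8 bytes.
std::string ToUtf8(const std::wstring& wide)
{
    std::string out;
    if (wide.empty())
        return out;
#ifdef _WIN32
    const int count = static_cast<int>(wide.size());
    // First call: cchMultiByte == 0 only asks for the required size in bytes.
    const int needed = ::WideCharToMultiByte(CP_UTF8, 0, wide.data(), count,
                                             NULL, 0, NULL, NULL);
    if (needed <= 0)
        return out;  // conversion failed; GetLastError() gives the reason
    out.resize(static_cast<size_t>(needed));
    // Second call: convert into the buffer. For CP_UTF8 the last two
    // arguments must be NULL.
    const int written = ::WideCharToMultiByte(CP_UTF8, 0, wide.data(), count,
                                              &out[0], needed, NULL, NULL);
    out.resize(written > 0 ? static_cast<size_t>(written) : 0);
#else
    // Assumption for non-Windows builds only: one wchar_t is one code point.
    for (wchar_t wc : wide) {
        const unsigned long cp = static_cast<unsigned long>(wc);
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
#endif
    return out;
}
```

Because the length is passed explicitly instead of -1, the returned string contains no terminating null byte. ASCII input passes through unchanged, so ToUtf8(L"abc") yields the three bytes "abc".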

Generating UTF-8 encoded files
The steps are as follows:
1. First, write a BOM header. UTF-8 files usually start with one, although it is optional.
2. Build the string to be written in wide-character format, convert it to UTF-8 with WideCharToMultiByte, and write the result to the file.
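The file-writing half of these steps can be sketched with standard C++ file I/O alone. This is a minimal sketch under stated assumptions: the helper name WriteUtf8File and its withBom parameter are hypothetical, and the text passed in is assumed to be already encoded as UTF-8 (for example by WideCharToMultiByte with CP_UTF8).

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: writes an optional UTF-8 BOM followed by bytes that
// are already UTF-8 encoded. Returns true if everything was written.
bool WriteUtf8File(const char* path, const std::string& utf8, bool withBom = true)
{
    // "wb" opens in binary mode so no newline translation alters the bytes.
    std::FILE* file = std::fopen(path, "wb");
    if (file == nullptr)
        return false;

    bool ok = true;
    if (withBom) {
        // The UTF-8 byte order mark is the three bytes EF BB BF.
        const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };
        ok = std::fwrite(bom, 1, sizeof(bom), file) == sizeof(bom);
    }
    if (ok && !utf8.empty())
        ok = std::fwrite(utf8.data(), 1, utf8.size(), file) == utf8.size();

    // fclose flushes the stream, so its result is part of the success check.
    return std::fclose(file) == 0 && ok;
}
```

Using fwrite with an explicit length, instead of fprintf with "%s", means the output does not depend on a terminating null byte in the buffer.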

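An example follows:

  #include <windows.h>
  #include <stdio.h>
  #include <string.h>
  #include <wchar.h>

  int main()
  {
    // Binary mode keeps the C runtime from altering the UTF-8 bytes.
    FILE* pFile = fopen("D:\\a.txt", "wb");
    if (pFile == NULL)
      return 1;

    // Step 1: write the UTF-8 BOM (EF BB BF).
    char szBOM[4] = { (char)0xEF, (char)0xBB, (char)0xBF, 0 };
    fprintf(pFile, "%s", szBOM);

    // Digit names 0-9 and place-value units. The source literals were garbled;
    // these Chinese financial numerals are an assumed reconstruction.
    wchar_t chNum[11] = L"零壹贰叁肆伍陆柒捌玖";
    wchar_t chNum2[10] = L"亿仟佰拾万仟佰拾个";
    char sz[10] = "112304823";

    wchar_t result[32] = L"";

    // Step 2: build the text as wide characters, one digit name plus one unit per digit.
    int offset = 0;
    for (size_t i = 0; i < strlen(sz); ++i)
    {
      char c = sz[i];

      wchar_t w1 = chNum[c - '0'];
      wchar_t w2 = chNum2[i];
      swprintf(result + offset, 32 - offset, L"%lc%lc", w1, w2);
      offset += 2;
    }

    // Convert to UTF-8. Passing -1 converts the terminating null as well.
    char szChar[64] = "";
    int len = ::WideCharToMultiByte(CP_UTF8, 0, result, -1, szChar, sizeof(szChar), NULL, NULL);
    if (len > 0)
      fprintf(pFile, "%s", szChar);
    fclose(pFile);
    return len > 0 ? 0 : 1;
  }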

Note that when you use wchar_t instead of char, every string function must be the wide-character (w) version, such as wcslen and swprintf.
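As a small illustration of the wide-character functions, the sketch below formats two wide characters with swprintf and measures the result with wcslen. The helper name JoinPair is hypothetical. It uses the standard form of swprintf, which takes the buffer size as its second argument.

```cpp
#include <cwchar>
#include <string>

// Hypothetical helper: formats two wide characters into one wide string.
std::wstring JoinPair(wchar_t first, wchar_t second)
{
    wchar_t buffer[3] = L"";
    // Standard swprintf takes the buffer size in wide characters, and
    // %lc formats a single wide character.
    const int n = std::swprintf(buffer, 3, L"%lc%lc",
                                static_cast<std::wint_t>(first),
                                static_cast<std::wint_t>(second));
    if (n < 0)
        return std::wstring();
    // wcslen counts wide characters, not bytes.
    return std::wstring(buffer, std::wcslen(buffer));
}
```

The narrow counterparts (strlen, sprintf) operate on char and would misread a wchar_t buffer, which is why the wide versions are required here.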
