C++: Converting between UTF-8 and GB2312

Source: Internet
Author: User

A few days ago I ran into a character-encoding problem: garbled characters appeared on a web page. I also wrote a Baidu Experience post about it.

Now it is time to solve the problem properly. You could simply convert the file with the "Save As" option of a text editor, but here we discuss how to do the conversion with functions in C++.

The following describes two WinAPI functions: WideCharToMultiByte and MultiByteToWideChar.

Function prototypes:

int WideCharToMultiByte(
    UINT    CodePage,          // code page
    DWORD   dwFlags,           // performance and mapping flags
    LPCWSTR lpWideCharStr,     // wide-character string
    int     cchWideChar,       // number of wide characters in the string
    LPSTR   lpMultiByteStr,    // buffer for the new string
    int     cbMultiByte,       // size of the buffer, in bytes
    LPCSTR  lpDefaultChar,     // default for unmappable characters
    LPBOOL  lpUsedDefaultChar  // set when the default character is used
); // converts a wide-character string to a multibyte (narrow) string

int MultiByteToWideChar(
    UINT    CodePage,          // code page
    DWORD   dwFlags,           // character-type options
    LPCSTR  lpMultiByteStr,    // string to map
    int     cbMultiByte,       // number of bytes in the string
    LPWSTR  lpWideCharStr,     // wide-character buffer
    int     cchWideChar        // size of the buffer, in wide characters
); // converts a multibyte (narrow) string to a wide-character string
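
These two functions are commonly used in a two-call pattern: the first call passes a zero-sized output buffer to learn the required length, and the second call performs the conversion. The sketch below chains them to convert a whole UTF-8 string to GB2312. It is only an illustration under stated assumptions: the function name Utf8ToGb2312 is hypothetical, code page 936 (the Windows GBK code page, which covers GB2312) is assumed as the target, and the _WIN32 guard is there only so the file still compiles on other platforms.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <string>

// Hypothetical helper: converts a UTF-8 string to GB2312 (code page 936).
// Returns an empty string if either conversion step fails.
std::string Utf8ToGb2312(const std::string& utf8)
{
    if (utf8.empty()) return std::string();

    // Step 1: UTF-8 -> UTF-16. The first call only asks for the required length.
    int wideLen = MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                                      static_cast<int>(utf8.size()), NULL, 0);
    if (wideLen <= 0) return std::string();
    std::wstring wide(wideLen, L'\0');
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                        static_cast<int>(utf8.size()), &wide[0], wideLen);

    // Step 2: UTF-16 -> GB2312, again sizing the buffer first.
    const UINT kGb2312CodePage = 936; // assumption: GBK code page as the target
    int gbLen = WideCharToMultiByte(kGb2312CodePage, 0, wide.data(), wideLen,
                                    NULL, 0, NULL, NULL);
    if (gbLen <= 0) return std::string();
    std::string gb(gbLen, '\0');
    WideCharToMultiByte(kGb2312CodePage, 0, wide.data(), wideLen,
                        &gb[0], gbLen, NULL, NULL);
    return gb;
}
#endif
```

Because explicit lengths are passed instead of -1, neither call counts or writes a terminating null character, so the std::string sizes match the converted text exactly.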

The following helper functions are required:

CString CTest::HexToBin(CString string) // convert one hexadecimal digit to a 4-bit binary string
{
    // Only lowercase digits are handled, which matches the output of Format("%x").
    if (string == "0") return "0000";
    if (string == "1") return "0001";
    if (string == "2") return "0010";
    if (string == "3") return "0011";
    if (string == "4") return "0100";
    if (string == "5") return "0101";
    if (string == "6") return "0110";
    if (string == "7") return "0111";
    if (string == "8") return "1000";
    if (string == "9") return "1001";
    if (string == "a") return "1010";
    if (string == "b") return "1011";
    if (string == "c") return "1100";
    if (string == "d") return "1101";
    if (string == "e") return "1110";
    if (string == "f") return "1111";
    return "";
}

CString CTest::BinToHex(CString BinString) // convert a 4-bit binary string to one hexadecimal digit
{
    if (BinString == "0000") return "0";
    if (BinString == "0001") return "1";
    if (BinString == "0010") return "2";
    if (BinString == "0011") return "3";
    if (BinString == "0100") return "4";
    if (BinString == "0101") return "5";
    if (BinString == "0110") return "6";
    if (BinString == "0111") return "7";
    if (BinString == "1000") return "8";
    if (BinString == "1001") return "9";
    if (BinString == "1010") return "a";
    if (BinString == "1011") return "b";
    if (BinString == "1100") return "c";
    if (BinString == "1101") return "d";
    if (BinString == "1110") return "e";
    if (BinString == "1111") return "f";
    return "";
}

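int CTest::BinToInt(CString string) // convert an 8-character binary string to a decimal integer
{
    int len = 0;     // accumulated result
    int tempInt = 0; // weight (power of two) of the current bit
    int strInt = 0;  // value of the current bit (0 or 1)
    for (int i = 0; i < string.GetLength(); i++)
    {
        tempInt = 1;
        strInt = (int)string.GetAt(i) - 48; // '0' is ASCII 48
        for (int k = 0; k < 7 - i; k++)     // the leftmost of the 8 bits has weight 2^7
        {
            tempInt = 2 * tempInt;
        }
        len += tempInt * strInt;
    }
    return len;
}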
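
The lookup-table helpers above can also be expressed with the C++ standard library. The following is a minimal portable sketch, not the author's code: the names ByteToBinary and BinaryToInt are hypothetical, and std::bitset and std::stoi are used in place of CString.

```cpp
#include <bitset>
#include <string>

// Hypothetical helper: formats one byte as an 8-character binary string.
std::string ByteToBinary(unsigned char byte)
{
    return std::bitset<8>(byte).to_string();
}

// Hypothetical helper: parses a binary string such as "01001110" as an integer.
// Assumes the input holds only '0' and '1' characters.
int BinaryToInt(const std::string& bits)
{
    return std::stoi(bits, nullptr, 2);
}
```

Working on whole bytes avoids the intermediate hexadecimal strings, so no per-digit lookup is needed.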

UTF-8 to GB2312: first convert the UTF-8 bytes to Unicode (UTF-16), then convert the Unicode to GB2312 with the WideCharToMultiByte function.

WCHAR* CTest::UTF_8ToUnicode(char* ustart) // convert one 3-byte UTF-8 sequence to a Unicode (UTF-16) code unit
{
    // Static so that the returned pointer stays valid after the function returns
    // (not thread-safe; the next call overwrites the value).
    static WCHAR unicode;
    char char_one;
    char char_two;
    char char_three;
    int Hchar;
    int Lchar;
    CString string_one;
    CString string_two;
    CString string_three;
    CString combiString;
    char_one = *ustart;
    char_two = *(ustart + 1);
    char_three = *(ustart + 2);
    // Format each byte as hexadecimal (assumes an MBCS, non-Unicode build).
    // A negative char prints as ffffffxx, so keep only the last two digits.
    string_one.Format("%x", char_one);
    string_two.Format("%x", char_two);
    string_three.Format("%x", char_three);
    string_three = string_three.Right(2);
    string_two = string_two.Right(2);
    string_one = string_one.Right(2);
    // Expand each hexadecimal digit into four binary digits.
    string_three = HexToBin(string_three.Left(1)) + HexToBin(string_three.Right(1));
    string_two = HexToBin(string_two.Left(1)) + HexToBin(string_two.Right(1));
    string_one = HexToBin(string_one.Left(1)) + HexToBin(string_one.Right(1));
    combiString = string_one + string_two + string_three; // 24 bits: 1110xxxx 10xxxxxx 10xxxxxx
    combiString = combiString.Right(20); // drop the leading "1110"
    combiString.Delete(4, 2);            // drop the "10" prefix of the second byte
    combiString.Delete(10, 2);           // drop the "10" prefix of the third byte
    Hchar = BinToInt(combiString.Left(8));  // high byte
    Lchar = BinToInt(combiString.Right(8)); // low byte
    unicode = (WCHAR)((Hchar << 8) | Lchar);
    return &unicode;
}
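
The same three-byte decoding can be written with bit masks and shifts instead of string manipulation. This is a portable sketch rather than the author's method; the function name DecodeUtf8ThreeBytes is hypothetical, and it checks only the byte prefixes (overlong forms and surrogate values are not rejected).

```cpp
#include <cstdint>

// Hypothetical helper: decodes one 3-byte UTF-8 sequence (1110xxxx 10yyyyyy 10zzzzzz)
// into a 16-bit code unit. Returns 0 if the bytes do not have that shape.
std::uint16_t DecodeUtf8ThreeBytes(const unsigned char* p)
{
    if ((p[0] & 0xF0) != 0xE0 || (p[1] & 0xC0) != 0x80 || (p[2] & 0xC0) != 0x80)
        return 0;
    // Keep 4 payload bits from the first byte and 6 from each continuation byte.
    return static_cast<std::uint16_t>(((p[0] & 0x0F) << 12) |
                                      ((p[1] & 0x3F) << 6) |
                                      (p[2] & 0x3F));
}
```

For example, the bytes E4 B8 AD decode to 0x4E2D, the code point of the character 中.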

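char* CTest::UnicodeToGB2312(unsigned short uData) // convert one Unicode (UTF-16) code unit to GB2312
{
    // A GB2312 character takes at most 2 bytes; the third byte keeps the result
    // null-terminated. The caller must release the buffer with delete[].
    char* buffer = new char[3]();
    WCHAR wch = (WCHAR)uData;
    // 936 is the Windows code page for GB2312/GBK. CP_ACP gives the same result
    // only when the system ANSI code page is 936.
    WideCharToMultiByte(936, 0, &wch, 1, buffer, 2, NULL, NULL);
    return buffer;
}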

GB2312 to UTF-8: first convert the GB2312 to Unicode with the MultiByteToWideChar function, then split the Unicode value into its bits and reassemble them as UTF-8.
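
The second step, reassembling a Unicode value as UTF-8, can be sketched portably as follows. The function name EncodeUtf8 is hypothetical, and only 16-bit values (U+0000 to U+FFFF) are covered; the first step, GB2312 to Unicode, would still go through MultiByteToWideChar (assuming code page 936 for GB2312) and is not shown here.

```cpp
#include <string>

// Hypothetical helper: encodes one 16-bit Unicode value (U+0000..U+FFFF) as UTF-8.
// Surrogate pairs are not handled.
std::string EncodeUtf8(unsigned int codeUnit)
{
    std::string out;
    if (codeUnit < 0x80) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(codeUnit);
    } else if (codeUnit < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (codeUnit >> 6));
        out += static_cast<char>(0x80 | (codeUnit & 0x3F));
    } else {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | ((codeUnit >> 12) & 0x0F));
        out += static_cast<char>(0x80 | ((codeUnit >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (codeUnit & 0x3F));
    }
    return out;
}
```

Every character in GB2312 lies in the Basic Multilingual Plane, so the 1-byte to 3-byte forms are enough for this conversion.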

Author: Li Mu Space
