Keywords: JavaScript, converting Chinese characters to Unicode escape codes, converting Unicode escape codes to Chinese characters
Converting between Chinese characters and Unicode escape codes in JavaScript.
JavaScript library
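-JavaScript
var GB2312UnicodeConverter = {
    // Chinese characters -> \uXXXX escapes: escape() yields %uXXXX, which is rewritten to \uXXXX
    ToUnicode: function (str) {
        return escape(str).toLocaleLowerCase().replace(/%u/gi, '\\u');
    },
    // \uXXXX escapes -> Chinese characters: rewrite \u back to %u and let unescape() decode it
    ToGB2312: function (str) {
        return unescape(str.replace(/\\u/gi, '%u'));
    }
};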
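Since `escape` and `unescape` are deprecated in the ECMAScript standard, the same idea can also be sketched with `charCodeAt` and `String.fromCharCode`. The helper names `toUnicodeEscapes` and `fromUnicodeEscapes` are hypothetical and not part of the library above. This sketch assumes that only non-ASCII UTF-16 code units need escaping, so its output can differ from `escape` for characters such as spaces.

```javascript
// Hypothetical helpers, not part of the original library.
// Escape every non-ASCII UTF-16 code unit as \uXXXX (lowercase hex).
function toUnicodeEscapes(str) {
    var out = '';
    for (var i = 0; i < str.length; i++) {
        var code = str.charCodeAt(i);
        if (code > 0x7f) {
            // Left-pad the hex value to four digits
            out += '\\u' + ('0000' + code.toString(16)).slice(-4);
        } else {
            out += str.charAt(i);
        }
    }
    return out;
}

// Turn every \uXXXX sequence back into the character it encodes.
function fromUnicodeEscapes(str) {
    return str.replace(/\\u([0-9a-f]{4})/gi, function (match, hex) {
        return String.fromCharCode(parseInt(hex, 16));
    });
}

console.log(toUnicodeEscapes('上海'));                     // \u4e0a\u6d77
console.log(fromUnicodeEscapes('\\u4e0a\\u6d77') === '上海'); // true
```

Unlike the `escape`-based version, ASCII text passes through unchanged in both directions.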
Test code
-HTML
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
</head>
<body>
<script>
var GB2312UnicodeConverter = {
    ToUnicode: function (str) {
        return escape(str).toLocaleLowerCase().replace(/%u/gi, '\\u');
    },
    ToGB2312: function (str) {
        return unescape(str.replace(/\\u/gi, '%u'));
    }
};
// ================= Test code
var str = '上海', unicode; // "Shanghai" written in Chinese characters
document.write(str + '<br/>');
unicode = GB2312UnicodeConverter.ToUnicode(str);
document.write('Chinese characters converted to Unicode: ' + unicode + '<br/>');
document.write('Unicode converted back to Chinese characters: ' + GB2312UnicodeConverter.ToGB2312(unicode));
</script>
</body>
</html>
Keywords: C#, converting Chinese characters to Unicode escape codes, converting Unicode escape codes to Chinese characters
Background: Unicode and Chinese character encoding
A Chinese character such as "王" (wáng, "king") is written as "\u738b" in Unicode escape form. The escape starts with "\u" and is followed by four hexadecimal digits. Every two digits represent one byte, that is, a number less than 256. A Chinese character in the Basic Multilingual Plane occupies two bytes in UTF-16, so "738b" splits into the two bytes "73" and "8b". However, when converting the escape back into a Chinese character, the bytes are processed from back to front: .NET's Encoding.Unicode is UTF-16 little-endian, which stores the low byte first, so the bytes must be supplied in the order "8b", then "73", to obtain the Chinese character.
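The byte order described above can be illustrated with a short sketch using the character "王" (U+738B, "king"). The variable names are hypothetical and the snippet only mimics the little-endian layout by hand; it does not call any .NET API.

```javascript
// Illustrative sketch of the byte order (names are hypothetical).
var ch = '王';
var code = ch.charCodeAt(0);          // UTF-16 code unit, 0x738b
var high = code >> 8;                 // high byte, 0x73
var low = code & 0xff;                // low byte, 0x8b

// UTF-16 little-endian stores the low byte first.
var littleEndianBytes = [low, high];  // [0x8b, 0x73]

// Recombine: the second byte is the high half of the code unit.
var restored = String.fromCharCode((littleEndianBytes[1] << 8) | littleEndianBytes[0]);

console.log(code.toString(16));       // 738b
console.log(restored === ch);         // true
```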
The following C# code converts between Chinese characters and Unicode escape codes.
-C#
using System;
using System.Text;
using System.Text.RegularExpressions;
using System.Globalization;

public class GB2312UnicodeConverter
{
    /// <summary>
    /// Convert Chinese characters to Unicode escape codes
    /// </summary>
    /// <param name="str">Chinese character string to be encoded</param>
    /// <returns>Unicode-encoded string</returns>
    public static string ToUnicode(string str)
    {
        // Encoding.Unicode is UTF-16 little-endian: two bytes per code unit, low byte first
        byte[] bts = Encoding.Unicode.GetBytes(str);
        string r = "";
        for (int i = 0; i < bts.Length; i += 2)
        {
            // Emit the high byte (bts[i + 1]) before the low byte (bts[i])
            r += "\\u" + bts[i + 1].ToString("x").PadLeft(2, '0') + bts[i].ToString("x").PadLeft(2, '0');
        }
        return r;
    }

    /// <summary>
    /// Convert Unicode escape codes to a Chinese character string
    /// </summary>
    /// <param name="str">Unicode-encoded string</param>
    /// <returns>Chinese character string</returns>
    public static string ToGB2312(string str)
    {
        string r = "";
        // Only \uXXXX sequences are decoded; any other text in the input is ignored.
        // Hex digits are matched explicitly so that int.Parse cannot fail on non-hex input.
        MatchCollection mc = Regex.Matches(str, @"\\u([0-9a-f]{2})([0-9a-f]{2})", RegexOptions.Compiled | RegexOptions.IgnoreCase);
        byte[] bts = new byte[2];
        foreach (Match m in mc)
        {
            // Swap the byte order back to little-endian: low byte first, then high byte
            bts[0] = (byte)int.Parse(m.Groups[2].Value, NumberStyles.HexNumber);
            bts[1] = (byte)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber);
            r += Encoding.Unicode.GetString(bts);
        }
        return r;
    }
}
Tip: In .NET 4.0, it seems that the compiler already converts Unicode escapes in string literals to Chinese characters directly.