/*from:http://blog.joycode.com/hopeq/archive/2005/09/26/64146.aspx*/
A Web project has requestEncoding and responseEncoding set to gb2312 in web.config, and the profile data read from the database may mix Chinese with Korean or Japanese. If that data is written directly to the page, the page shows garbled text, and the Korean content in particular cannot be displayed correctly. If the project used UTF-8 this would not be a problem, but it is an old project, and to avoid affecting the existing code the encoding cannot be changed to UTF-8, so the fix has to be confined to this one page.
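For orientation, the setting described above would typically look like the fragment below. This is a minimal sketch of the standard ASP.NET globalization element, not a copy of the project's actual web.config, which the post does not show.

```xml
<configuration>
  <system.web>
    <!-- Requests are decoded and responses are encoded as gb2312 -->
    <globalization requestEncoding="gb2312" responseEncoding="gb2312" />
  </system.web>
</configuration>
```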
After some research, I found that this problem can be solved with HTML entities (numeric character references).
For background on HTML entities, see:
Character entity references in HTML 4
HTML Document Representation
Test code:
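// Requires: using System.Text;
// Sample text mixing Chinese, Japanese, and Korean (the Korean word is an illustrative stand-in).
byte[] bComments = Encoding.UTF8.GetBytes("一ンブル한국어中文");
char[] cComments = Encoding.UTF8.GetChars(bComments);
StringBuilder charBuilder = new StringBuilder();
foreach (char c in cComments)
{
    // Characters above U+0800 are written as numeric character references.
    if (c > '\u0800')
    {
        charBuilder.Append("&#");
        charBuilder.Append((int)c);
        charBuilder.Append(";");
    }
    else
    {
        charBuilder.Append(c);
    }
}
Response.Write(charBuilder.ToString());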
This code converts every Chinese, Korean, and Japanese character into an HTML numeric character reference by brute force. Because the references consist only of ASCII characters, they are not affected by responseEncoding or by the page's character set.
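The same idea can be packaged as a reusable helper. The sketch below is only an illustration: the names EntityEncoder and ToNumericEntities are hypothetical, and the surrogate-pair branch is an addition that the original test code does not have. It also shows that the encoded text is pure ASCII, which is why a legacy response encoding cannot damage it.

```csharp
using System;
using System.Text;

public static class EntityEncoder
{
    // Hypothetical helper (not from the original post): replaces every
    // character above U+0800 with a decimal numeric character reference.
    public static string ToNumericEntities(string text)
    {
        var builder = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (char.IsHighSurrogate(c) && i + 1 < text.Length
                && char.IsLowSurrogate(text[i + 1]))
            {
                // Addition to the original approach: a supplementary character
                // gets one reference for the whole code point.
                builder.Append("&#")
                       .Append(char.ConvertToUtf32(c, text[i + 1]))
                       .Append(';');
                i++; // skip the low surrogate that was just consumed
            }
            else if (c > '\u0800')
            {
                builder.Append("&#").Append((int)c).Append(';');
            }
            else
            {
                builder.Append(c);
            }
        }
        return builder.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        // 'a', U+4E2D (Chinese), U+30F3 (katakana), U+D55C (Hangul)
        string encoded = EntityEncoder.ToNumericEntities("a\u4e2d\u30f3\ud55c");
        Console.WriteLine(encoded); // a&#20013;&#12531;&#54620;

        // The result is pure ASCII, so an ASCII round trip leaves it unchanged.
        byte[] bytes = Encoding.ASCII.GetBytes(encoded);
        Console.WriteLine(Encoding.ASCII.GetString(bytes) == encoded); // True
    }
}
```

In a page, the call would replace the loop, for example Response.Write(EntityEncoder.ToNumericEntities(comments)), where comments is whatever string was read from the database.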
Description
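The \u0800 threshold above is meant to catch Chinese, Korean, and Japanese characters.
Chinese characters fall in \u4e00-\u9fa5, Japanese kana lie between \u0800 and \u4e00 (\u3040-\u30ff), and Korean Hangul syllables lie above \u9fa5 (\uac00-\ud7a3).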
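To make the ranges concrete, the sketch below sorts a few sample characters into the bands described above. ScriptRanges and Classify are hypothetical names, and the classification is deliberately coarse: it uses only the post's thresholds, so each band also contains characters from other scripts.

```csharp
using System;

public static class ScriptRanges
{
    // Hypothetical helper: coarse classification using the post's thresholds.
    public static string Classify(char c)
    {
        if (c <= '\u0800') return "left as-is";
        if (c < '\u4e00') return "below the Chinese range (includes kana)";
        if (c <= '\u9fa5') return "Chinese range";
        return "above the Chinese range (includes Hangul syllables)";
    }

    public static void Main()
    {
        // 'A', U+30F3 (katakana), U+4E2D (Chinese), U+D55C (Hangul)
        foreach (char c in new[] { 'A', '\u30f3', '\u4e2d', '\ud55c' })
        {
            Console.WriteLine("U+{0:X4}: {1}", (int)c, Classify(c));
        }
    }
}
```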
This method only solves the problem within a limited scope. If you have a better approach, please share it.