Conversion between GB2312, Unicode, and UTF-8

Source: Internet
Author: User

We often have to work with files in different encodings, and a mismatch shows up as garbled characters. The usual fix is to re-save the file in the required encoding, but doing that by hand for many pages is a headache. Is there a way to convert and save files in a batch from code? The answer is yes.

First, the conversion method:
    // Requires: using System.IO; using System.Text;
    public void ConvertEncoding(string sourcePath, string destPath, Encoding destEncoding)
    {
        // Author: Truly 2006-3-31
        // Encoding.Default is the system ANSI code page on .NET Framework
        // (GB2312 on Simplified Chinese Windows); it is the fallback when no BOM is found.
        Encoding srcEncoding = Encoding.Default;
        int bomLength = 0; // number of BOM bytes to skip before converting
        FileStream fs = new FileStream(sourcePath, FileMode.Open);
        byte[] inBuff = new byte[fs.Length];

        if (fs.CanRead)
        {
            // Principle: the encoding is identified by the first two or three bytes of the file (the BOM)
            byte[] bt = new byte[3];
            fs.Read(bt, 0, 3);
            if (bt[0] == 255 && bt[1] == 254)
            {
                srcEncoding = Encoding.Unicode;          // FF FE: UTF-16 little-endian
                bomLength = 2;
            }
            else if (bt[0] == 254 && bt[1] == 255)
            {
                srcEncoding = Encoding.BigEndianUnicode; // FE FF: UTF-16 big-endian
                bomLength = 2;
            }
            else if (bt[0] == 239 && bt[1] == 187 && bt[2] == 191)
            {
                srcEncoding = Encoding.UTF8;             // EF BB BF: UTF-8
                bomLength = 3;
            }
        }

        fs.Seek(0, SeekOrigin.Begin); // return to the start of the file
        fs.Read(inBuff, 0, (int)fs.Length);
        fs.Close();

        // Convert only the bytes after the BOM. Otherwise the BOM is converted as a
        // character and ends up duplicated, or as a stray '?', in the output.
        byte[] outBuff = Encoding.Convert(srcEncoding, destEncoding, inBuff, bomLength, inBuff.Length - bomLength);
        byte[] markUnicode = { 255, 254 };
        byte[] markBigEndianUnicode = { 254, 255 };
        byte[] markUTF8 = { 239, 187, 191 };

        FileStream outFile = new FileStream(destPath, FileMode.Create);

        // Write the byte order mark of the destination encoding through the file stream
        if (destEncoding == Encoding.Unicode)
            outFile.Write(markUnicode, 0, markUnicode.Length);
        else if (destEncoding == Encoding.BigEndianUnicode)
            outFile.Write(markBigEndianUnicode, 0, markBigEndianUnicode.Length);
        else if (destEncoding == Encoding.UTF8)
            outFile.Write(markUTF8, 0, markUTF8.Length);

        outFile.Write(outBuff, 0, outBuff.Length);
        outFile.Close();
    }

Example:
    // Whatever the original encoding is, the file can be converted to the required encoding
    ConvertEncoding("D:\\2.aspx", "D:\\unicode.aspx", Encoding.Unicode);
    ConvertEncoding("D:\\2.aspx", "D:\\utf8.aspx", Encoding.UTF8);
    ConvertEncoding("D:\\2.aspx", "D:\\gb2312.aspx", Encoding.Default);
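
For comparison, the same conversion can be sketched by letting the framework handle the byte order mark: StreamReader can detect a BOM on its own, and StreamWriter writes the preamble of the destination encoding. This is a minimal sketch under stated assumptions, not the author's method. The class name EncodingConverterSketch, the method ConvertWithReader, and the fallbackEncoding parameter are hypothetical names introduced only for illustration.

```csharp
using System.IO;
using System.Text;

public static class EncodingConverterSketch
{
    // Hypothetical helper: re-encodes a text file, relying on StreamReader's BOM detection.
    // fallbackEncoding is used when the source file has no byte order mark.
    public static void ConvertWithReader(string sourcePath, string destPath,
                                         Encoding destEncoding, Encoding fallbackEncoding)
    {
        string text;
        // The third argument (detectEncodingFromByteOrderMarks) makes the reader
        // honor a BOM and consume it, so it does not appear in the decoded text.
        using (var reader = new StreamReader(sourcePath, fallbackEncoding, true))
        {
            text = reader.ReadToEnd();
        }

        // append: false overwrites the destination. The writer emits the preamble
        // of destEncoding at the start of the file when the encoding defines one.
        using (var writer = new StreamWriter(destPath, false, destEncoding))
        {
            writer.Write(text);
        }
    }
}
```

Because the reader is closed before the writer is opened, the source and destination paths may be the same file, as in the batch case. Passing Encoding.Default as the fallback mirrors the article's assumption that files without a BOM are in the system code page.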

Next, here is a complete batch conversion of all .aspx files in a directory and its subdirectories.

    // Convert all .aspx files under the test directory to Unicode (UTF-16 little-endian)
    DirectoryInfo di = new DirectoryInfo("D:\\test");
    Process(di);

    private void Process(DirectoryInfo di)
    {
        foreach (FileInfo fi in di.GetFiles("*.aspx"))
        {
            // Response.Write assumes this runs inside an ASP.NET page; it lists each file as it is converted
            Response.Write(fi.FullName + "<br>");
            // Source and destination are the same path, so the file is converted in place
            ConvertEncoding(fi.FullName, fi.FullName, Encoding.Unicode);
        }

        // Recurse into subdirectories
        foreach (DirectoryInfo dic in di.GetDirectories())
        {
            Process(dic);
        }
    }
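
The recursive walk above can also be sketched without explicit recursion, using Directory.EnumerateFiles with SearchOption.AllDirectories. The names BatchSketch and ConvertAll, and the convert callback, are hypothetical and introduced here only for illustration. The callback stands in for whatever per-file conversion is wanted.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class BatchSketch
{
    // Hypothetical helper: applies convert to every file under root that matches
    // pattern, including subdirectories, and returns the paths it processed.
    public static List<string> ConvertAll(string root, string pattern, Action<string> convert)
    {
        // Materialize the list first so that rewriting files does not
        // interfere with the directory enumeration.
        var files = new List<string>(
            Directory.EnumerateFiles(root, pattern, SearchOption.AllDirectories));

        foreach (string path in files)
        {
            convert(path);
        }
        return files;
    }
}
```

With the article's conversion method in scope, a call might look like BatchSketch.ConvertAll("D:\\test", "*.aspx", path => ConvertEncoding(path, path, Encoding.Unicode)). Returning the list of paths lets the caller report progress in whatever way suits it, instead of tying the walk to Response.Write.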

The code above has been tested repeatedly, so feel free to use it.
