Reading text files without garbled characters: detect the file encoding automatically

Source: Internet
Author: User

Here is an excerpt:

using System;
using System.IO;
using System.Text;

// The namespace matches the call shown under "How to use".
namespace FileEncoding
{
    /// <summary>
    /// Gets the encoding of a file.
    /// </summary>
    public class EncodingType
    {
        /// <summary>
        /// Opens the file at the given path, reads its binary data, and determines the file's encoding.
        /// </summary>
        /// <param name="FILE_NAME">File path</param>
        /// <returns>The file's encoding</returns>
        public static System.Text.Encoding GetType(string FILE_NAME)
        {
            using (FileStream fs = new FileStream(FILE_NAME, FileMode.Open, FileAccess.Read))
            {
                return GetType(fs);
            }
        }

        /// <summary>
        /// Determines the encoding of a file from the given file stream.
        /// </summary>
        /// <param name="fs">File stream</param>
        /// <returns>The file's encoding</returns>
        public static System.Text.Encoding GetType(FileStream fs)
        {
            // Fallback when neither a BOM nor valid UTF-8 is found.
            Encoding reVal = Encoding.Default;

            byte[] ss;
            using (BinaryReader r = new BinaryReader(fs, System.Text.Encoding.Default))
            {
                // Reads the whole file into memory; files over 2 GB are not supported.
                ss = r.ReadBytes((int)fs.Length);
            }

            // Check the byte order marks first. The length guards keep files shorter
            // than the BOM from causing an IndexOutOfRangeException. The UTF-16 marks
            // are two bytes long, so no third byte is tested.
            if (ss.Length >= 3 && ss[0] == 0xEF && ss[1] == 0xBB && ss[2] == 0xBF)
            {
                reVal = Encoding.UTF8; // UTF-8 with BOM
            }
            else if (ss.Length >= 2 && ss[0] == 0xFE && ss[1] == 0xFF)
            {
                reVal = Encoding.BigEndianUnicode; // UTF-16, big-endian
            }
            else if (ss.Length >= 2 && ss[0] == 0xFF && ss[1] == 0xFE)
            {
                reVal = Encoding.Unicode; // UTF-16, little-endian
            }
            else if (IsUTF8Bytes(ss))
            {
                reVal = Encoding.UTF8; // UTF-8 without BOM (also matches pure ASCII)
            }
            return reVal;
        }

        /// <summary>
        /// Determines whether the data is UTF-8 without a BOM.
        /// </summary>
        /// <param name="data">File contents</param>
        /// <returns>true if every byte sequence is well-formed UTF-8</returns>
        private static bool IsUTF8Bytes(byte[] data)
        {
            int charByteCounter = 1; // number of bytes still expected for the current character
            byte curByte;            // byte currently being examined
            for (int i = 0; i < data.Length; i++)
            {
                curByte = data[i];
                if (charByteCounter == 1)
                {
                    if (curByte >= 0x80)
                    {
                        // Count the leading 1 bits of the lead byte.
                        while (((curByte <<= 1) & 0x80) != 0)
                        {
                            charByteCounter++;
                        }
                        // A lead byte starts with at least two 1 bits: 110xxxxx ... 1111110x
                        if (charByteCounter == 1 || charByteCounter > 6)
                        {
                            return false;
                        }
                    }
                }
                else
                {
                    // A UTF-8 continuation byte must have the form 10xxxxxx.
                    if ((curByte & 0xC0) != 0x80)
                    {
                        return false;
                    }
                    charByteCounter--;
                }
            }
            // The data ended in the middle of a multi-byte character, so it is not
            // valid UTF-8. Return false instead of throwing so the caller can fall back.
            if (charByteCounter > 1)
            {
                return false;
            }
            return true;
        }
    }
}

How to use

string text = System.IO.File.ReadAllText(fName, FileEncoding.EncodingType.GetType(fName));

To support other encodings, update the GetType method. This code is reposted from elsewhere, and I do not know how to add other encodings myself.
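One possible way to add further encodings is to move the byte order marks into a lookup table, so that supporting another BOM means adding one row. The sketch below is only an illustration, not part of the original code: the class name BomSniffer and the method DetectFromBom are hypothetical, and it assumes C# 7 or later for the tuple syntax. It only recognizes encodings that announce themselves with a BOM; encodings without one (for example GBK or Shift_JIS) would still need a content-based heuristic like the UTF-8 check.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original EncodingType class.
public static class BomSniffer
{
    // Known byte order marks, longest first, so that the UTF-32 LE mark
    // (FF FE 00 00) is tested before the UTF-16 LE mark (FF FE).
    // Caveat: a UTF-16 LE file whose first character is U+0000 would be
    // reported as UTF-32 LE; such files are rare in practice.
    private static readonly (byte[] Bom, Encoding Encoding)[] Boms =
    {
        (new byte[] { 0x00, 0x00, 0xFE, 0xFF }, new UTF32Encoding(true, true)), // UTF-32 BE
        (new byte[] { 0xFF, 0xFE, 0x00, 0x00 }, Encoding.UTF32),                // UTF-32 LE
        (new byte[] { 0xEF, 0xBB, 0xBF }, Encoding.UTF8),
        (new byte[] { 0xFE, 0xFF }, Encoding.BigEndianUnicode),                 // UTF-16 BE
        (new byte[] { 0xFF, 0xFE }, Encoding.Unicode),                          // UTF-16 LE
    };

    // Returns the encoding announced by a BOM at the start of the data,
    // or null when no known BOM is present.
    public static Encoding DetectFromBom(byte[] data)
    {
        foreach (var (bom, encoding) in Boms)
        {
            if (StartsWith(data, bom))
            {
                return encoding;
            }
        }
        return null;
    }

    // True when data begins with exactly the bytes in prefix.
    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length)
        {
            return false;
        }
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

GetType could call such a helper first and run the BOM-less UTF-8 check only when it returns null.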
