Java: how to avoid garbled text when reading and writing files

Source: Internet
Author: User

Garbled text often shows up when reading a file stream. Garbling has more than one possible cause; this article deals mainly with garbling caused by the file's encoding. First, let us be clear about the concepts of text files and binary files and the differences between them.

Text files are character-based files; common encodings include ASCII, Unicode, and ANSI. A binary file is a value-encoded file: you decide for yourself what each value means (which can be regarded as a custom encoding).

It follows that text files mostly use fixed-width encodings (although variable-width encodings such as UTF-8 also exist). A binary file can be regarded as variable-width: because it is value-encoded, how many bits represent one value is entirely up to you.
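To make the fixed-width versus variable-width distinction concrete, here is a minimal sketch that counts how many bytes the same character occupies in two encodings. The class and method names (`EncodingLengths`, `utf8Length`, `utf16Length`) are made up for illustration and are not part of the original article.

```java
import java.nio.charset.StandardCharsets;

class EncodingLengths {
    // Bytes needed to store the text in UTF-8 (variable width: 1 to 4 bytes per code point).
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes needed to store the text in UTF-16LE (2 bytes per char, no byte-order mark).
    static int utf16Length(String text) {
        return text.getBytes(StandardCharsets.UTF_16LE).length;
    }

    public static void main(String[] args) {
        // "\u4e2d" is a single CJK character; "A" is plain ASCII.
        for (String s : new String[] {"A", "\u4e2d"}) {
            System.out.println("UTF-8=" + utf8Length(s) + " bytes, UTF-16LE=" + utf16Length(s) + " bytes");
        }
    }
}
```

The ASCII letter takes one byte in UTF-8 and two in UTF-16LE, while the CJK character takes three and two respectively, which is why the encoding must be known before bytes can be turned back into characters.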

Binary files should therefore never be handled through strings. When a string is created from bytes without an explicit charset, the system default encoding is used, and a fixed text encoding naturally conflicts with the file's custom encoding. A binary file can only be read, processed, and written back as bytes.
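The following sketch illustrates why routing binary data through a string is unsafe: bytes that are not valid in the chosen text encoding are replaced during decoding, so the original bytes cannot be recovered. UTF-8 is picked here only as an example charset, and the names `BinaryRoundTrip` and `throughString` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BinaryRoundTrip {
    // Decodes the bytes as UTF-8 text, then encodes that text back to bytes.
    static byte[] throughString(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, 0x00, (byte) 0x80}; // not valid UTF-8
        byte[] text = "abc".getBytes(StandardCharsets.UTF_8);          // valid UTF-8
        System.out.println("binary survives: " + Arrays.equals(binary, throughString(binary)));
        System.out.println("text survives:   " + Arrays.equals(text, throughString(text)));
    }
}
```

The text bytes come back unchanged, while the binary bytes do not, because each malformed byte was replaced by the replacement character U+FFFD when the string was built.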

For a text file the encoding is fixed, so it is enough to determine the file's own encoding, read the bytes, and create the string with that encoding specified explicitly; the resulting text is then not garbled. An encoding can also be "detected" for a binary file, but the result is not reliable, so the two cases cannot be treated alike.
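As a small illustration of the point above, the sketch below decodes the same bytes once with the charset they were written in and once with a different one. The class name `DecodeDemo` and the choice of ISO-8859-1 as the mismatched charset are assumptions made for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Decodes bytes with the given charset; picking the right charset is the caller's job.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two CJK characters
        byte[] stored = original.getBytes(StandardCharsets.UTF_8); // what ends up in the file
        // Matching charset: the text is recovered.
        System.out.println(original.equals(decode(stored, StandardCharsets.UTF_8)));
        // Mismatched charset: every byte becomes its own character, so the text is garbled.
        System.out.println(original.equals(decode(stored, StandardCharsets.ISO_8859_1)));
    }
}
```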

The specific steps are as follows:

1) Detect the encoding of the text file

import java.io.BufferedInputStream;
import java.io.FileInputStream;

public static String getFileEncode(String path) {
    String charset = "asci";
    byte[] first3bytes = new byte[3];
    try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(path))) {
        boolean checked = false;
        bis.mark(3); // allow reset() after peeking at the first three bytes
        int read = bis.read(first3bytes, 0, 3);
        if (read == -1) {
            return charset; // empty file
        }
        if (first3bytes[0] == (byte) 0xFF && first3bytes[1] == (byte) 0xFE) {
            charset = "Unicode"; // UTF-16LE
            checked = true;
        } else if (first3bytes[0] == (byte) 0xFE && first3bytes[1] == (byte) 0xFF) {
            charset = "Unicode"; // UTF-16BE
            checked = true;
        } else if (first3bytes[0] == (byte) 0xEF && first3bytes[1] == (byte) 0xBB
                && first3bytes[2] == (byte) 0xBF) {
            charset = "UTF8";
            checked = true;
        }
        bis.reset();
        if (!checked) {
            // No byte-order mark: scan the content for UTF-8 byte patterns
            int loc = 0;
            while ((read = bis.read()) != -1) {
                loc++;
                if (read >= 0xF0) break;
                if (0x80 <= read && read <= 0xBF) break; // a lone byte in 0x80-0xBF is also treated as GBK
                if (0xC0 <= read && read <= 0xDF) {
                    read = bis.read();
                    // two bytes (0xC0-0xDF)(0x80-0xBF): may also be a GB encoding
                    if (0x80 <= read && read <= 0xBF) continue;
                    else break;
                } else if (0xE0 <= read && read <= 0xEF) { // may also be wrong, but less likely
                    read = bis.read();
                    if (0x80 <= read && read <= 0xBF) {
                        read = bis.read();
                        if (0x80 <= read && read <= 0xBF) {
                            charset = "UTF-8";
                            break;
                        } else break;
                    } else break;
                }
            }
            // TextLogger.getLogger().info(loc + " " + Integer.toHexString(read));
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    return charset;
}
// A text file may start with a few extra bytes: FF FE (Unicode), FE FF (Unicode big endian), or EF BB BF (UTF-8).
private static String getEncode(int flag1, int flag2, int flag3) {
    if (flag1 == 255 && flag2 == 254) {
        return "Unicode";
    } else if (flag1 == 254 && flag2 == 255) {
        return "UTF-16";
    } else if (flag1 == 239 && flag2 == 187 && flag3 == 191) {
        return "UTF8";
    }
    return "asci"; // ASCII
}
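The detection code above guesses UTF-8 from the first multi-byte sequence it finds. As a complementary check, and not as the article's own method, the bytes can be run through a strict UTF-8 decoder from `java.nio.charset` that reports malformed input instead of silently replacing it. The class name `Utf8Check` and the method name `isValidUtf8` are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Check {
    // Returns true if the bytes are well-formed UTF-8, false otherwise.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        byte[] gbk = {(byte) 0xD6, (byte) 0xD0}; // the GBK bytes of the character U+4E2D
        System.out.println("UTF-8 bytes valid: " + isValidUtf8(utf8));
        System.out.println("GBK bytes valid:   " + isValidUtf8(gbk));
    }
}
```

Pure ASCII content passes this check as well, so a positive result only means that the file can be decoded as UTF-8, not that it was necessarily written as UTF-8.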

2) Read the file stream using the file's detected encoding

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;

/**
 * Reads the contents of the file at the given path. Because a String carries the result,
 * this method is safe (no garbling) only for text files.
 */
public static String readFile(String path) {
    String data = null;
    // Check whether the file exists
    File file = new File(path);
    if (!file.exists()) {
        return data;
    }
    // Get the file's encoding
    String code = FileEncode.getFileEncode(path);
    if ("asci".equals(code)) {
        // Use GBK rather than the environment's encoding, because the environment's
        // default encoding is not necessarily the operating system's encoding.
        // code = System.getProperty("file.encoding");
        code = "GBK";
    }
    // Decode the file according to its encoding
    try (InputStreamReader isr = new InputStreamReader(new FileInputStream(file), code)) {
        // Read the file contents
        int length;
        char[] buffer = new char[1024];
        StringBuffer sb = new StringBuffer();
        while ((length = isr.read(buffer, 0, 1024)) != -1) {
            sb.append(buffer, 0, length);
        }
        data = sb.toString();
    } catch (Exception e) {
        e.printStackTrace();
        log.info("getFile IO Exception:" + e.getMessage()); // log: the class's logger, defined elsewhere
    }
    return data;
}
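When the encoding is already known, the same idea can be sketched more compactly with `java.nio.file.Files`. This is an alternative under that assumption rather than the article's method; `TextReader` and `readText` are illustrative names.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class TextReader {
    // Reads the whole file and decodes it with the charset supplied by the caller.
    static String readText(Path path, Charset charset) throws IOException {
        return new String(Files.readAllBytes(path), charset);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("read-demo", ".txt");
        try {
            Files.write(tmp, "hello \u4e2d\u6587".getBytes(StandardCharsets.UTF_8));
            String text = readText(tmp, StandardCharsets.UTF_8);
            System.out.println("characters read: " + text.length());
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Passing the charset explicitly keeps the result independent of the JVM's default encoding, which is the same concern the `readFile` method addresses by never relying on `file.encoding`.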

3) Write the file in the specified encoding

import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;

/**
 * Saves content to the given path in the given encoding. Because a String carries the
 * content, this method is safe (no garbling) only for text content.
 *
 * @param data the bytes to write to the file
 * @param path the file path, including the file name
 * @param code the encoding name ("asci" is treated as GBK)
 * @return true when writing completes; false if an exception occurs
 */
public static boolean writeFile(byte[] data, String path, String code) {
    boolean flag = true;
    if ("asci".equals(code)) {
        code = "GBK";
    }
    File file = new File(path);
    if (!file.exists()) {
        // Create the parent directory if it does not exist yet
        File parent = file.getParentFile();
        if (parent != null && !parent.exists()) {
            parent.mkdirs();
        }
    }
    try (OutputStreamWriter osw = new OutputStreamWriter(new FileOutputStream(path), code)) {
        osw.write(new String(data, code));
        osw.flush();
    } catch (Exception e) {
        e.printStackTrace();
        log.info("toFile IO Exception:" + e.getMessage()); // log: the class's logger, defined elsewhere
        flag = false;
    }
    return flag;
}
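If the content is already available as a String, it can be encoded directly when writing. The sketch below shows one way to do that with `java.nio.file.Files`, assuming the caller knows the target charset; `TextWriter` and `writeText` are hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class TextWriter {
    // Encodes the text with the given charset and writes it, creating parent directories first.
    static void writeText(Path path, String text, Charset charset) throws IOException {
        Path parent = path.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent); // does nothing if the directory already exists
        }
        Files.write(path, text.getBytes(charset));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("write-demo");
        Path target = dir.resolve("nested").resolve("out.txt");
        writeText(target, "A\u4e2d", StandardCharsets.UTF_8);
        System.out.println("bytes on disk: " + Files.size(target));
    }
}
```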

4) For binary files with little content, such as Word documents, you can read and write the file with the following methods

/**
 * Reads the file at the given path into a byte array. Use this method for content
 * that is not in text format.
 *
 * @param path the file path, including the file name
 * @return the file's contents as a byte array
 */
public static byte[] getFile(String path) throws IOException {
    try (FileInputStream stream = new FileInputStream(path)) {
        // available() is only an estimate of the size, so this suits small local files
        int size = stream.available();
        byte[] data = new byte[size];
        int offset = 0;
        while (offset < size) { // read() may return fewer bytes than requested
            int n = stream.read(data, offset, size - offset);
            if (n == -1) {
                break;
            }
            offset += n;
        }
        return data;
    }
}
 
 
 
/**
 * Writes bytes to the given file. This method can be used for non-text files.
 *
 * @param data the bytes to write to the file
 * @param path the path of the file to write, including the file name
 * @return true when the write completes
 * @throws Exception if the file cannot be opened or written
 */
public static boolean toFile(byte[] data, String path) throws Exception {
    try (FileOutputStream out = new FileOutputStream(path)) {
        out.write(data);
        out.flush();
    }
    return true;
}
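For comparison, the same byte-level read and write can be sketched with `Files.readAllBytes` and `Files.write`, which do not depend on `available()`. This is an alternative offered for illustration, not the article's code, and `BinaryFiles`, `readBytes`, and `writeBytes` are made-up names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class BinaryFiles {
    // Reads the entire file as raw bytes; no charset is involved.
    static byte[] readBytes(Path path) throws IOException {
        return Files.readAllBytes(path);
    }

    // Writes raw bytes to the file, replacing any existing content.
    static void writeBytes(Path path, byte[] data) throws IOException {
        Files.write(path, data);
    }

    public static void main(String[] args) throws IOException {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i; // every possible byte value
        }
        Path tmp = Files.createTempFile("binary-demo", ".bin");
        try {
            writeBytes(tmp, all);
            System.out.println("round trip intact: " + Arrays.equals(all, readBytes(tmp)));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Because the data never passes through a String, every byte value survives the round trip unchanged.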

That is all for this article. I hope it helps you in your studies.
