When reading a file stream you will often run into garbled text. Garbling has more than one cause; this article covers the case caused by the file's encoding format. First, be clear about what text files and binary files are and how they differ.
Text files are character-encoded files; common encodings include ASCII, Unicode, and ANSI. Binary files are value-encoded files: you decide what each value means (in effect a custom encoding that depends on your application).
It follows that a text file mostly uses a fixed number of bytes per character (ASCII, for example, always uses one byte), although variable-length encodings such as UTF-8 also exist. A binary file can be thought of as variable-length: because it is value-encoded, how many bits represent a value is entirely up to you.
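To make the fixed-length versus variable-length point concrete, here is a minimal sketch that counts the bytes one character occupies under different encodings. The class and method names (EncodingLengthDemo, byteLength) are made up for illustration and are not part of the original code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingLengthDemo {
    // Number of bytes the text occupies once encoded with the given charset.
    public static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String latin = "A";
        String chinese = "\u4E2D"; // the character U+4E2D

        // UTF-8 is variable-length: 1 byte for 'A', 3 bytes for U+4E2D.
        System.out.println(byteLength(latin, StandardCharsets.UTF_8));
        System.out.println(byteLength(chinese, StandardCharsets.UTF_8));

        // UTF-16LE uses 2 bytes for each of these two characters.
        System.out.println(byteLength(latin, StandardCharsets.UTF_16LE));
        System.out.println(byteLength(chinese, StandardCharsets.UTF_16LE));
    }
}
```

The same character can therefore take a different number of bytes depending on the encoding, which is why the reader must know the encoding before decoding.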
A binary file must not be handled through a String. Initializing a String from bytes decodes them with a character encoding (the system default unless you specify one), and the binary file's custom encoding will naturally conflict with any fixed character encoding. Binary files should therefore be read, processed, and written only as byte streams.
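The following sketch shows why a String is a poor carrier for binary data: bytes that are not valid in the chosen charset are replaced during decoding, so encoding the String again does not give the original bytes back. BinaryRoundTripDemo and survivesStringRoundTrip are hypothetical names used only for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BinaryRoundTripDemo {
    // True if decoding the bytes into a String and encoding them again
    // with the same charset reproduces the original bytes exactly.
    public static boolean survivesStringRoundTrip(byte[] data, Charset charset) {
        String asText = new String(data, charset);
        byte[] back = asText.getBytes(charset);
        return Arrays.equals(data, back);
    }

    public static void main(String[] args) {
        // Arbitrary binary values; 0xFF, 0xFE and a lone 0x80 are not valid UTF-8.
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, 0x00, (byte) 0x80};
        byte[] text = "hello".getBytes(StandardCharsets.UTF_8);

        System.out.println(survivesStringRoundTrip(binary, StandardCharsets.UTF_8)); // false
        System.out.println(survivesStringRoundTrip(text, StandardCharsets.UTF_8));   // true
    }
}
```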
For a text file the encoding is fixed. So before reading, detect the file's own encoding, read the bytes, and then initialize the String with that encoding; the resulting text is not garbled. You can run the same detection on a binary file, but the result is unreliable, so the two kinds of file cannot be treated the same way.
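As a small illustration of "initialize the String with the file's encoding", the sketch below decodes the same bytes twice, once with the charset they were written in and once with a different one. DecodeDemo and decode are names invented for this example; ISO-8859-1 stands in for "some wrong charset".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Decode raw bytes into text using an explicitly named charset.
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String original = "\u4E2D\u6587"; // two Chinese characters
        byte[] stored = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the bytes were written in restores the text.
        System.out.println(original.equals(decode(stored, StandardCharsets.UTF_8)));      // true
        // Decoding with another charset yields garbled text.
        System.out.println(original.equals(decode(stored, StandardCharsets.ISO_8859_1))); // false
    }
}
```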
Here's how:
1) Get the encoding of the text file
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class FileEncode {

    public static String getFileEncode(String path) {
        // "ASci" is the sentinel returned when no BOM and no UTF-8 sequence is found
        String charset = "ASci";
        byte[] first3Bytes = new byte[3];
        BufferedInputStream bis = null;
        try {
            boolean checked = false;
            bis = new BufferedInputStream(new FileInputStream(path));
            bis.mark(3); // allow reset() after reading the 3 BOM bytes
            int read = bis.read(first3Bytes, 0, 3);
            if (read == -1)
                return charset;
            if (first3Bytes[0] == (byte) 0xFF && first3Bytes[1] == (byte) 0xFE) {
                charset = "Unicode"; // UTF-16LE
                checked = true;
            } else if (first3Bytes[0] == (byte) 0xFE && first3Bytes[1] == (byte) 0xFF) {
                charset = "Unicode"; // UTF-16BE
                checked = true;
            } else if (first3Bytes[0] == (byte) 0xEF && first3Bytes[1] == (byte) 0xBB
                    && first3Bytes[2] == (byte) 0xBF) {
                charset = "UTF8";
                checked = true;
            }
            bis.reset();
            if (!checked) {
                // No BOM: scan the bytes for a UTF-8 multi-byte sequence
                int loc = 0;
                while ((read = bis.read()) != -1) {
                    loc++;
                    if (read >= 0xF0)
                        break;
                    if (0x80 <= read && read <= 0xBF) // a lone byte in 0x80-0xBF is also treated as GBK
                        break;
                    if (0xC0 <= read && read <= 0xDF) {
                        read = bis.read();
                        if (0x80 <= read && read <= 0xBF)
                            // two bytes (0xC0-0xDF)(0x80-0xBF) may also be valid GB encoding
                            continue;
                        else
                            break;
                    } else if (0xE0 <= read && read <= 0xEF) {
                        // this can also be wrong, but the odds are small
                        read = bis.read();
                        if (0x80 <= read && read <= 0xBF) {
                            read = bis.read();
                            if (0x80 <= read && read <= 0xBF) {
                                charset = "UTF-8";
                                break;
                            } else
                                break;
                        } else
                            break;
                    }
                }
                // TextLogger.getLogger().info(loc + " " + Integer.toHexString(read));
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (bis != null) {
                try {
                    bis.close();
                } catch (IOException ex) {
                    // ignored: nothing useful can be done if close fails
                }
            }
        }
        return charset;
    }

    private static String getEncode(int flag1, int flag2, int flag3) {
        String encode = "";
        // A txt file may start with a few extra bytes: FF FE (Unicode),
        // FE FF (Unicode big endian), or EF BB BF (UTF-8)
        if (flag1 == 255 && flag2 == 254) {
            encode = "Unicode";
        } else if (flag1 == 254 && flag2 == 255) {
            encode = "UTF-16";
        } else if (flag1 == 239 && flag2 == 187 && flag3 == 191) {
            encode = "UTF8";
        } else {
            encode = "ASci"; // ASCII code
        }
        return encode;
    }
}
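The byte-range scan above is a heuristic, and by the code's own comments it can guess wrong. As a possible stricter check (a different technique, not the author's method), the standard CharsetDecoder can be asked to report malformed input instead of silently replacing it. A minimal sketch, with Utf8Check and isValidUtf8 as hypothetical names:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Check {
    // True if the whole byte array is well-formed UTF-8.
    public static boolean isValidUtf8(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            // The decoder hit a byte sequence that is not valid UTF-8.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "\u4E2D\u6587".getBytes(StandardCharsets.UTF_8);
        byte[] gbkLike = {(byte) 0xD6, (byte) 0xD0}; // U+4E2D as encoded in GBK

        System.out.println(isValidUtf8(utf8));    // true
        System.out.println(isValidUtf8(gbkLike)); // false
    }
}
```

Note that pure ASCII content is also valid UTF-8, so a check like this cannot by itself tell ASCII apart from UTF-8; it only rules UTF-8 out.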
2) Read the file using its detected encoding
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

/**
 * Returns the content of the file at the given path. Because a String is used
 * as the carrier, this method reads correctly (without garbling) only for
 * text files. A safe method.
 */
public static String readFile(String path) {
    String data = null;
    // Check that the file exists
    File file = new File(path);
    if (!file.exists()) {
        return data;
    }
    // Get the file's encoding
    String code = FileEncode.getFileEncode(path);
    InputStreamReader isr = null;
    try {
        // Decode the file according to its encoding
        if ("ASci".equals(code)) {
            // GBK is used here instead of the environment's encoding, because the
            // environment's default encoding is not necessarily the operating system's
            // code = System.getProperty("file.encoding");
            code = "GBK";
        }
        // Note: Java's UTF-8 decoder does not strip a leading BOM, so for a file
        // with a UTF-8 BOM the result starts with the character '\uFEFF'
        isr = new InputStreamReader(new FileInputStream(file), code);
        // Read the file content
        int length = -1;
        char[] buffer = new char[1024];
        StringBuffer sb = new StringBuffer();
        while ((length = isr.read(buffer, 0, 1024)) != -1) {
            sb.append(buffer, 0, length);
        }
        data = sb.toString();
    } catch (Exception e) {
        e.printStackTrace();
        Log.info("getFile IO exception: " + e.getMessage());
    } finally {
        try {
            if (isr != null) {
                isr.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
            Log.info("getFile IO exception: " + e.getMessage());
        }
    }
    return data;
}
3) Write the file in the specified encoding
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;

/**
 * Saves content to the given path in the given encoding. Because a String is
 * used as the carrier, this method writes correctly (without garbling) only
 * for text content. A safe method.
 *
 * @param data the byte data to write to the file
 * @param path the file path, including the file name
 * @param code the name of the encoding to write with
 * @return true when writing completes, false if an error occurred
 */
public static boolean writeFile(byte[] data, String path, String code) {
    boolean flag = true;
    OutputStreamWriter osw = null;
    try {
        File file = new File(path);
        if (!file.exists()) {
            // Create the parent directory if it is missing
            File parent = file.getParentFile();
            if (parent != null && !parent.exists()) {
                parent.mkdirs();
            }
        }
        if ("ASci".equals(code)) {
            code = "GBK";
        }
        osw = new OutputStreamWriter(new FileOutputStream(path), code);
        osw.write(new String(data, code));
        osw.flush();
    } catch (Exception e) {
        e.printStackTrace();
        Log.info("toFile IO exception: " + e.getMessage());
        flag = false;
    } finally {
        try {
            if (osw != null) {
                osw.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
            Log.info("toFile IO exception: " + e.getMessage());
            flag = false;
        }
    }
    return flag;
}
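The read and write steps only work together when both sides use the same encoding. As a rough check of that idea, the sketch below writes text to a temporary file and reads it back with the same charset, using java.nio.file.Files rather than the stream classes above. TextRoundTrip, writeText, readText, and roundTrips are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class TextRoundTrip {
    // Encode the text with the given charset and write the bytes to the file.
    public static void writeText(Path path, String text, Charset charset) throws IOException {
        Files.write(path, text.getBytes(charset));
    }

    // Read all bytes of the file and decode them with the given charset.
    public static String readText(Path path, Charset charset) throws IOException {
        return new String(Files.readAllBytes(path), charset);
    }

    // Write to a temporary file, read it back, and compare with the input.
    public static boolean roundTrips(String text, Charset charset) {
        try {
            Path tmp = Files.createTempFile("roundtrip", ".txt");
            try {
                writeText(tmp, text, charset);
                return text.equals(readText(tmp, charset));
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrips("\u4E2D\u6587 text", StandardCharsets.UTF_8)); // true
    }
}
```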
4) For small binary files, such as Word documents, you can read and write the file using the following methods
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

/**
 * Reads the file at the given path into a byte array. This method can be used
 * for non-text content.
 *
 * @param path the file path, including the file name
 * @return the file's bytes
 */
public static byte[] getFile(String path) throws IOException {
    File file = new File(path);
    // available() is only an estimate, so size the array from the file length
    byte[] data = new byte[(int) file.length()];
    FileInputStream stream = new FileInputStream(file);
    try {
        int offset = 0;
        // read() may return fewer bytes than requested, so loop until the array is full
        while (offset < data.length) {
            int n = stream.read(data, offset, data.length - offset);
            if (n == -1) {
                break;
            }
            offset += n;
        }
    } finally {
        stream.close();
    }
    return data;
}

/**
 * Writes the bytes to the given file. This method can be used for non-text files.
 *
 * @param data the byte data to write to the file
 * @param path the file path, including the file name
 * @return true when writing is complete
 * @throws Exception if the file cannot be written
 */
public static boolean toFile(byte[] data, String path) throws Exception {
    FileOutputStream out = new FileOutputStream(path);
    try {
        out.write(data);
        out.flush();
    } finally {
        out.close();
    }
    return true;
}
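InputStream.available() is documented as an estimate of the bytes that can be read without blocking, so it is not a reliable way to learn how much data a stream holds. When the length is not known up front, one common approach is to read until the end of the stream into a growing buffer. A minimal sketch, with StreamBytes and readAll as hypothetical names:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class StreamBytes {
    // Read every remaining byte of the stream, whatever its length.
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        // read() returns -1 only at end of stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = {(byte) 0xFF, 0x00, 0x7F, (byte) 0x80};
        byte[] copy = readAll(new ByteArrayInputStream(original));
        System.out.println(Arrays.equals(original, copy)); // true
    }
}
```

Because the data never passes through a String, arbitrary byte values survive unchanged.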
Reading and writing files in Java: solving the garbled-text problem