Garbled first line (question mark "?") when Java reads a UTF-8 file: why UTF-8 files with a BOM come out garbled, and how to fix it

Source: Internet
Author: User

Test examples:

Java reads a UTF-8 TXT file and the first line comes out garbled as "?": problem and fix


Contents of Test.txt:
1
00:00:06,000-00:00:06,010
<b>Allerleirauh</b> (2012)
<i>dtv-das Erste-20. Januar 2013</i>

2
00:00:10,280-00:00:12,680
Was gehört zu einer guten Suppe?

3
00:00:14,200-00:00:15,839
Eine Gute Suppe ...

Test.txt was saved in WordPad as UTF-8 (which here means a UTF-8 file with a BOM).
After saving and closing the file, reopening the UTF-8 document in WordPad displays both the Chinese text and the letters normally.
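
To confirm that a file really starts with a BOM, its first three bytes can be compared with EF BB BF, the UTF-8 encoding of U+FEFF. The following is only a minimal sketch: the class name BomCheck and both method names are hypothetical helpers introduced for illustration and are not part of the original test.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public final class BomCheck {
    private BomCheck() {}

    /** Returns true if the byte array begins with the UTF-8 BOM (EF BB BF). */
    public static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    /** Reads at most the first three bytes of the file and checks them for the BOM. */
    public static boolean hasUtf8Bom(Path file) throws IOException {
        byte[] head = new byte[3];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (filled < head.length) {
                int n = in.read(head, filled, head.length - filled);
                if (n < 0) {
                    break; // file is shorter than three bytes
                }
                filled += n;
            }
        }
        return filled == head.length && startsWithUtf8Bom(head);
    }
}
```

Assuming the file is in the working directory, a call such as BomCheck.hasUtf8Bom(Paths.get("Test.txt")) (with java.nio.file.Paths imported) would be expected to return true for a file saved as described above.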

Test code:

    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;

    public static String srt2txt(String filename) {
        File infile = new File(filename);
        String realfile = filename.substring(0, filename.lastIndexOf(".srt")) + ".txt";
        String tempfile = realfile.replace('/', '\\'); // Windows file path format
        File outfile = new File(tempfile);
        BufferedReader bufferedReader = null;
        BufferedWriter bufferedWriter = null;
        try {
            // FileReader and FileWriter use the default charset, which is not necessarily UTF-8
            bufferedReader = new BufferedReader(new FileReader(infile));
            bufferedWriter = new BufferedWriter(new FileWriter(outfile));
            String line; // holds the content of each line as it is read
            while ((line = bufferedReader.readLine()) != null) {
                line = new String(line.getBytes("ISO-8859-1"), "ISO-8859-1");
                bufferedWriter.write(line);
                bufferedWriter.newLine(); // write a line separator
                bufferedWriter.flush();
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (null != bufferedReader) {
                try {
                    bufferedReader.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            if (null != bufferedWriter) {
                try {
                    bufferedWriter.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
        return realfile;
    }

Test results:

??
00:00:06,000-00:00:06,010
<b>Allerleirauh</b> (2012)
<i>dtv-das Erste-20. Januar 2013</i>

2
00:00:10,280-00:00:12,680
Was Geh?rt zu einer guten Suppe?

3
00:00:14,200-00:00:15,839
Eine Gute Suppe ...

Workaround:

Either use UltraEdit to save the TXT file above as UTF-8 without a BOM;

or open the TXT file above in Notepad++, choose "Format > Encode in UTF-8 without BOM", and then save the modified file.
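
If re-saving the file is not an option, the BOM can also be dropped in code. The sketch below is not the method used in the test above; it assumes the file is UTF-8, reads it with an explicit charset instead of the platform default, and removes a leading U+FEFF from the first line. The class name BomAwareReader and its methods are hypothetical names chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public final class BomAwareReader {
    // Java's UTF-8 decoder keeps the BOM, so it shows up as this character.
    private static final char BOM = '\uFEFF';

    private BomAwareReader() {}

    /** Removes a single leading U+FEFF, if present. */
    public static String stripBom(String s) {
        return (!s.isEmpty() && s.charAt(0) == BOM) ? s.substring(1) : s;
    }

    /** Reads all lines as UTF-8 and strips the BOM from the first line only. */
    public static List<String> readLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                if (first) {
                    line = stripBom(line);
                    first = false;
                }
                lines.add(line);
            }
        }
        return lines;
    }
}
```

When writing the result back out, opening the writer with the same explicit charset, for example Files.newBufferedWriter(path, StandardCharsets.UTF_8), avoids depending on the platform default encoding.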

Copyright notice: This article is the blogger's original work and may not be reproduced without the blogger's permission.
