I. INTRODUCTION
BOM stands for byte order mark.
When a program such as Windows Notepad saves a text file as UTF-8, it writes three invisible bytes (0xEF 0xBB 0xBF) at the very start of the file. These bytes are the UTF-8 encoding of the BOM character U+FEFF, and they let editors such as Notepad recognize that the file is UTF-8 encoded. In other words, Windows uses the BOM to mark how a text file is encoded.
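To make the byte sequence concrete, here is a minimal sketch that checks whether a byte array starts with the UTF-8 BOM. The class and method names (BomCheck, startsWithUtf8Bom) are hypothetical and chosen only for illustration.

```java
final class BomCheck {
    // The three bytes 0xEF 0xBB 0xBF that Notepad writes at the start of a UTF-8 file.
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Hypothetical helper: true if the data begins with the UTF-8 BOM.
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= UTF8_BOM.length
                && data[0] == UTF8_BOM[0]
                && data[1] == UTF8_BOM[1]
                && data[2] == UTF8_BOM[2];
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '{', '}'};
        byte[] withoutBom = {'{', '}'};
        System.out.println(startsWithUtf8Bom(withBom));    // prints "true"
        System.out.println(startsWithUtf8Bom(withoutBom)); // prints "false"
    }
}
```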
II. PROBLEM ENCOUNTERED
When Java reads a JSON file that the other party saved as UTF-8 with a BOM, the content comes back with extra invisible characters at the front, because the BOM is decoded as part of the text instead of being skipped.
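The symptom can be reproduced without a file. The sketch below, which assumes the content is decoded with Java's standard UTF-8 charset, decodes the bytes of a tiny JSON document preceded by a BOM; the class and method names (BomProblemDemo, decodeUtf8) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class BomProblemDemo {
    // Hypothetical helper: decode raw bytes as UTF-8, as a plain reader would.
    static String decodeUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // BOM followed by the JSON text "{}".
        byte[] json = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '{', '}'};
        String text = decodeUtf8(json);

        // The decoder keeps the BOM as the character U+FEFF in front of the JSON.
        System.out.println(text.length());              // prints "3", not 2
        System.out.println(text.charAt(0) == '\uFEFF'); // prints "true"
        System.out.println(text.startsWith("{"));       // prints "false"
    }
}
```

Because the first character is no longer the opening brace, a JSON parser that receives this string may reject it.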
III. SOLUTION
Use a utility class.
Reference: http://koti.mbnet.fi/akini/java/unicodereader/. Download two files from that page: UnicodeInputStream and UnicodeReader.
Taking UnicodeReader as an example:
// "UTF-8" is the default encoding passed to UnicodeReader
FileInputStream fis = new FileInputStream(file);
UnicodeReader ur = new UnicodeReader(fis, "UTF-8");
BufferedReader br = new BufferedReader(ur);
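If adding the downloaded classes is not an option, the same idea can be sketched with the standard library alone. The example below is not the code of the referenced classes. It is a minimal illustration, under the assumption that the input is UTF-8, that peeks at the first three bytes with a PushbackInputStream and pushes them back when they are not a BOM. The names BomSkippingReader, open and readAll are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class BomSkippingReader {
    // Hypothetical helper: returns a UTF-8 reader positioned after the BOM, if present.
    static Reader open(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // Read up to three bytes; the stream may be shorter than that.
        while (n < head.length) {
            int r = pushback.read(head, n, head.length - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        boolean hasBom = n == 3
                && head[0] == (byte) 0xEF
                && head[1] == (byte) 0xBB
                && head[2] == (byte) 0xBF;
        if (!hasBom && n > 0) {
            // Not a BOM: give the bytes back so the reader sees them.
            pushback.unread(head, 0, n);
        }
        return new InputStreamReader(pushback, StandardCharsets.UTF_8);
    }

    // Convenience wrapper for in-memory data, used here only to demonstrate open().
    static String readAll(byte[] data) {
        try (Reader reader = open(new ByteArrayInputStream(data))) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int r;
            while ((r = reader.read(buf)) != -1) {
                sb.append(buf, 0, r);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] json = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '{', '}'};
        System.out.println(readAll(json)); // prints "{}"
    }
}
```

For a real file, the result of open(new FileInputStream(file)) can be wrapped in a BufferedReader in the same way as the UnicodeReader above.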