BOM: Byte Order Mark
The UTF-8 BOM is also called the UTF-8 signature. The BOM has no effect on UTF-8 itself, because UTF-8 has no byte-order ambiguity;
it exists to support UTF-16 and UTF-32. The BOM signature tells the editor which encoding the current file uses,
so the editor can recognize the encoding easily. The BOM is not displayed in the editor, but it is still part of the file and can produce output, such as what looks like a blank line.
Removal of BOM
The BOM marks a text file as using a Unicode encoding. It is itself a Unicode character ("\ufeff") located at the very start of the text file. Under different Unicode encodings, the BOM character corresponds to the following bytes:
Bytes       Encoding
--------------------
FE FF       UTF16BE
FF FE       UTF16LE
EF BB BF    UTF8
Therefore, by checking the first few bytes of a text file, we can determine whether it contains a BOM and which Unicode encoding it uses. However, although the BOM marks the file's encoding, it is not part of the file's content, and reading a text file without removing the BOM causes problems in some scenarios. For example, when we combine several JS files into a single file, a BOM character that ends up in the middle of the combined file can cause a JS syntax error in the browser. Therefore, it is generally necessary to remove the BOM when reading a text file with Node.js. For example, the following code identifies and removes a UTF8 BOM.
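var fs = require('fs');

// Read a text file as UTF-8, dropping a leading UTF-8 BOM (EF BB BF) if present.
function readText(pathname) {
    var bin = fs.readFileSync(pathname);
    if (bin[0] === 0xEF && bin[1] === 0xBB && bin[2] === 0xBF) {
        bin = bin.slice(3);
    }
    return bin.toString('utf-8');
}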
With this function, the BOM is removed whenever Node reads a text file.
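The same byte check can be extended to the other signatures in the table earlier. The following is only a minimal sketch: the names detectBom and stripBomFromString are hypothetical helpers, not part of Node.js, and the encoding labels are plain strings copied from the table rather than Node encoding names. It also ignores UTF-32, whose little-endian BOM begins with the same two bytes as UTF16LE.

```javascript
// Hypothetical helper: inspect the first bytes of a Buffer and report
// which BOM (if any) it starts with. Signatures are taken from the table.
function detectBom(buf) {
    if (buf.length >= 3 && buf[0] === 0xEF && buf[1] === 0xBB && buf[2] === 0xBF) {
        return { encoding: 'UTF8', length: 3 };
    }
    if (buf.length >= 2 && buf[0] === 0xFF && buf[1] === 0xFE) {
        return { encoding: 'UTF16LE', length: 2 };
    }
    if (buf.length >= 2 && buf[0] === 0xFE && buf[1] === 0xFF) {
        return { encoding: 'UTF16BE', length: 2 };
    }
    return null; // no BOM recognized
}

// Hypothetical helper: for text that is already decoded, the BOM shows up
// as a leading "\ufeff" character, so it can be dropped from the string.
function stripBomFromString(text) {
    return text.charCodeAt(0) === 0xFEFF ? text.slice(1) : text;
}
```

When combining several JS files, applying something like stripBomFromString to each file's text before joining would keep BOM characters out of the middle of the merged file.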