Original article: http://my.oschina.net/xianggao/blog/79694
What is the BOM header?
In a UTF-8 encoded file, the BOM (byte order mark) occupies three bytes at the very start of the file and marks the file as UTF-8. Many programs now recognize the BOM, but some do not. PHP is one of them, which is why a UTF-8 file edited in Notepad can cause errors. In fact, the BOM serves no purpose for UTF-8 itself; it was added for consistency with UTF-16 and UTF-32, which use a BOM. The BOM signature tells an editor which encoding the file uses, which makes detection easier. However, although the editor does not display the BOM, the BOM still produces output when the file is processed, much like a blank line.
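As an illustration of the three-byte signature, here is a minimal Python sketch that checks whether a byte string starts with the UTF-8 BOM. The function name `has_utf8_bom` and the sample PHP snippet are made up for this example; `codecs.BOM_UTF8` is the standard-library constant for the bytes 0xEF 0xBB 0xBF.

```python
import codecs

# The UTF-8 BOM as raw bytes: b"\xef\xbb\xbf"
UTF8_BOM = codecs.BOM_UTF8


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM."""
    return data.startswith(UTF8_BOM)


if __name__ == "__main__":
    # Hypothetical file contents, with and without the signature.
    without_bom = "<?php echo 'hi';".encode("utf-8")
    with_bom = UTF8_BOM + without_bom

    print(has_utf8_bom(with_bom))
    print(has_utf8_bom(without_bom))
    # The BOM adds exactly three bytes in front of the real content.
    print(len(with_bom) - len(without_bom))
```

The check works on raw bytes on purpose: once a file has been decoded to text, the BOM may already have been consumed or turned into an invisible character, depending on the decoder used.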
Windows Notepad and similar programs insert three invisible bytes (0xEF 0xBB 0xBF, the BOM) at the beginning of a file when saving it as UTF-8. These hidden bytes let the editor identify the file as UTF-8. For ordinary files this causes no trouble, but the BOM is a big headache for PHP. PHP does not ignore the BOM, so when it reads, includes, or references such a file, it treats the BOM as part of the beginning of the file's body. Because PHP is an embedded language, anything outside the PHP tags is sent straight to the output, so these bytes are output (displayed) as well. As a result, even if the top padding of the page is set to 0, the page cannot sit flush against the top of the browser window, because three extra bytes precede the HTML.
This is not the biggest problem. Because of how cookies are sent, a file that begins with a BOM cannot set cookies: by the time the script tries to set the cookie, PHP has already sent the HTTP headers in order to output the BOM. Login and logout therefore stop working, as does every feature that depends on cookies or sessions.
Therefore, whenever you edit or change text files, use an editor that does not add a BOM. Editors on Linux should not have this problem. On Windows, do not use Notepad or similar editors. The recommended editors are:
EditPlus version 2.12 or later; EmEditor; UltraEdit (the 'add BOM' option must be turned off); Dreamweaver (the 'add BOM' option must be turned off). To remove the BOM from a file that already has one, open the file in one of the editors above and save it again. (In EditPlus, save the file as GB first, then as UTF-8.)
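Re-saving a file in an editor is not the only option; the same result can be scripted. The following Python sketch is not from the original article, and the function name `strip_utf8_bom` is hypothetical. It rewrites a file in place without its leading BOM and leaves files that have no BOM untouched.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path: str) -> bool:
    """Remove a leading UTF-8 BOM from the file at `path`, in place.

    Returns True if a BOM was found and removed, False otherwise.
    """
    file_path = Path(path)
    data = file_path.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        # Nothing to do: the file does not begin with 0xEF 0xBB 0xBF.
        return False
    # Write back everything after the three BOM bytes.
    file_path.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

Since the function overwrites the file, it is worth running it on a copy or under version control first.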
To remove the BOM header, there are two simple methods:
1. How to remove the BOM header in EditPlus
When the editor is set to the UTF-8 encoding, it adds hidden bytes (the BOM) to the start of each saved file, which it uses to identify the file as UTF-8.
Run EditPlus, click Tools, select Preferences, select Files, and set the UTF-8 ID option to "Always delete signature".
After that, PHP files edited and saved in EditPlus no longer contain a BOM.
2. How to remove the BOM header in UltraEdit
Open the file, choose Save As, select the encoding format "UTF-8 without BOM header", and click OK.
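Before fixing files one by one in an editor, it can help to know which files are affected. The sketch below is an addition to the article, not part of it: it walks a directory tree and lists the files that start with the UTF-8 BOM. The function name `find_bom_files` and the default `*.php` pattern are assumptions chosen for this example.

```python
import codecs
from pathlib import Path
from typing import List


def find_bom_files(root: str, pattern: str = "*.php") -> List[Path]:
    """Return files under `root` matching `pattern` that begin with a UTF-8 BOM."""
    found = []
    for file_path in sorted(Path(root).rglob(pattern)):
        if not file_path.is_file():
            continue
        # Only the first three bytes matter, so avoid reading the whole file.
        with file_path.open("rb") as handle:
            if handle.read(3) == codecs.BOM_UTF8:
                found.append(file_path)
    return found


if __name__ == "__main__":
    # Scan the current directory; adjust the path for a real project.
    for path in find_bom_files("."):
        print(path)
```

Each reported file can then be re-saved without a BOM using either of the two editor methods above.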