In a UTF-8 encoded file, the BOM occupies three bytes at the start of the file and signals that the file is UTF-8 encoded. Many programs now recognize the BOM header, but some do not. PHP, for example, does not recognize it, which is why errors can appear after a UTF-8 file has been edited in Notepad. This article explains how to add the BOM header automatically when PHP serves a file download, and how to remove it.
First, what is the BOM header? When you save a text file as UTF-8 with a program such as Notepad on Windows, Notepad adds a few invisible bytes (EF BB BF) at the start of the file. This is the so-called BOM (Byte Order Mark).
This is not limited to files saved with Notepad: it applies to any file that begins with the invisible bytes EF BB BF (written as \xEF\xBB\xBF in escaped hexadecimal form, and visible when the file is opened in a hex editor). It works like a convention: when a program sees these bytes, it assumes the file is UTF-8 encoded.
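As a rough illustration of that convention, the check below looks at the first three bytes of a string. The constant UTF8_BOM and the function has_utf8_bom() are hypothetical names introduced here, not part of the original code.

```php
<?php
// Hypothetical helper (not from the original article): reports whether
// a byte string begins with the UTF-8 BOM, i.e. the bytes EF BB BF.
const UTF8_BOM = "\xEF\xBB\xBF";

function has_utf8_bom(string $bytes): bool
{
    // strncmp() compares at most the first 3 bytes of both strings.
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

// Usage sketch: read only the first three bytes of a file and test them.
// 'page.php' is a placeholder path.
// $head = file_get_contents('page.php', false, null, 0, 3);
// var_dump(has_utf8_bom($head));
```

Reading only the first three bytes keeps the check cheap even for large files.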
Suppose your application uses UTF-8 and needs to force the download of a file such as a CSV. Excel assumes by default that a CSV file is GBK-encoded (on Chinese systems), so if the file has no BOM header, the file you present to the user may appear garbled.
How do you add the BOM header?
Write the BOM before the rest of the file output.
The code is as follows:
// File name
$filename = "www.bitsCN.com.csv";
// $extra and $table are assumed to be defined earlier in the script.
header('Expires: ' . gmdate('D, d M Y H:i:s', $_SERVER['REQUEST_TIME'] + 10) . ' GMT');
header('Cache-Control: max-age=10');
// header('Content-Type: application/vnd.ms-excel; charset=utf-8');
header('Content-Type: text/csv; charset=utf-8');
header("Content-Disposition: attachment; filename={$filename}");
// Start with the BOM so that the file is treated as UTF-8 by default
$out = "\xEF\xBB\xBF";
// If the result carries a notice, make it the first line of output
if (!empty($extra['notice'])) {
    $out .= "{$extra['notice']}\r\n";
}
// Output one CSV line per row
foreach ($table as $row) {
    $out .= implode(",", $row) . "\r\n";
}
/*
if (mb_detect_encoding($out) == 'UTF-8') {
    $out = iconv("UTF-8", "GBK//IGNORE", $out);
}
*/
echo $out;
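The code above joins fields with implode(), which does not quote a field that itself contains a comma or a quotation mark. A minimal sketch of one alternative, assuming the rows are plain arrays of strings, is to let PHP's built-in fputcsv() do the quoting while still writing the BOM first. The function name build_csv_with_bom() is hypothetical, and note that fputcsv() ends each line with "\n" rather than "\r\n".

```php
<?php
// Hypothetical helper (not from the original article): builds the CSV body
// in memory, starting with the UTF-8 BOM, and lets fputcsv() quote fields.
function build_csv_with_bom(array $rows): string
{
    $stream = fopen('php://temp', 'r+');
    fwrite($stream, "\xEF\xBB\xBF"); // BOM first, before any data
    foreach ($rows as $row) {
        // Delimiter, enclosure and escape character are passed explicitly.
        fputcsv($stream, $row, ',', '"', '\\');
    }
    rewind($stream);
    $csv = stream_get_contents($stream);
    fclose($stream);
    return $csv;
}

// Usage sketch: send the same headers as in the article, then
// echo build_csv_with_bom($table);
```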
The following describes how to remove the BOM header.
There are two simple ways to remove it:
1. Removing the BOM header in EditPlus
When an editor is set to UTF-8, it may add a hidden character (the BOM) to the start of each saved file; editors use it to identify whether a file is UTF-8 encoded.
Run EditPlus, open the Tools menu, choose Preferences, select Files, and set the UTF-8 signature option to always remove the signature. PHP files edited and saved after that no longer contain a BOM.
2. Removing the BOM header in UltraEdit
Open the file, choose Save As, select the encoding format UTF-8 without BOM header, and confirm.
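The two editor methods above can also be approximated in PHP itself, which may be convenient when many files are affected. This is only a sketch: strip_utf8_bom() and remove_bom_from_file() are hypothetical names, and the file is rewritten in place, so it is worth trying on a copy first.

```php
<?php
// Hypothetical helper: returns the input without a leading UTF-8 BOM.
function strip_utf8_bom(string $bytes): string
{
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return substr($bytes, 3); // drop the three BOM bytes
    }
    return $bytes;
}

// Hypothetical helper: rewrites a file in place if it begins with a BOM.
// Returns true when the file was changed, false when it had no BOM.
function remove_bom_from_file(string $path): bool
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read {$path}");
    }
    $clean = strip_utf8_bom($bytes);
    if ($clean === $bytes) {
        return false;
    }
    if (file_put_contents($path, $clean) === false) {
        throw new RuntimeException("Cannot write {$path}");
    }
    return true;
}
```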
When does the BOM header need to be removed?
Let's look at the UTF-8 BOM in more detail.
Here the BOM refers to a PHP file that is itself saved as UTF-8 with BOM. Garbled Chinese text on an ordinary page is usually not caused by this.
header("Content-type: text/html; charset=utf-8");
This statement controls the encoding declared for the HTML page output. A BOM typically appears when a file is saved as UTF-8 with Notepad on Windows; it can be removed with WinHex by deleting the first 3 bytes of the file (EF BB BF).
In Dreamweaver, the encoding settings let you choose whether to include a BOM. In general, the BOM does not cause problems as long as the PHP output is not an image (GDI stream).
If there are extra characters at the start of the GDI stream, the browser shows a broken image (a red cross) instead.
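To illustrate why extra leading bytes break an image, the sketch below assumes the image is a PNG, which must begin with a fixed 8-byte signature. The function starts_with_png_signature() is a hypothetical name, and the choice of PNG is an assumption made for this example; the article does not say which image format is involved.

```php
<?php
// Hypothetical check (PNG chosen as an assumption for illustration):
// a valid PNG stream begins with the bytes 89 50 4E 47 0D 0A 1A 0A.
function starts_with_png_signature(string $bytes): bool
{
    return strncmp($bytes, "\x89PNG\r\n\x1A\n", 8) === 0;
}

// A BOM emitted before the image data pushes the signature three bytes
// to the right, so the stream no longer starts like a PNG.
$png   = "\x89PNG\r\n\x1A\n" . "...image data...";
$upset = "\xEF\xBB\xBF" . $png;

var_dump(starts_with_png_signature($png));
var_dump(starts_with_png_signature($upset));
```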
The above covers what the BOM header is, how to add it automatically when PHP serves a file download, and how to remove it.