How browsers handle HTML output by PHP files saved as UTF-8 with a BOM

Source: Internet
Author: User
Tags php template ultraedit

While writing a web program today I ran into a bug, as shown in the following two figures:

The structure of the source code looks fine, yet Firefox's source view marks lines 1, 2, 3, 25, and 26 in red, which means something is wrong. Looking in Firebug, I found that all the meta, title, script, and style tags that belong in the head had ended up under the body. What is going on? Checking the page with validator.w3.org reports this error:

Character "" not allowed in prolog, with the <!DOCTYPE declaration marked in red. What does this mean?

After some research I traced the root cause to an included PHP file. The file was UTF-8 and had been saved directly from Notepad, which adds the notorious BOM (byte order mark). This BOM header makes HTML parsing go wrong in various browsers!

 

 

The following text is copied from http://blog.sina.com.cn/s/blog_669fb0c30100vvf4.html.

 

While writing PHP recently, the browser inexplicably showed an extra blank line at the top of the page. Searching online, nearly every answer says to save all files as UTF-8 without a BOM.

That solved the problem locally, but it came back once I uploaded the files to the server. After a whole morning of this I was close to giving up. In the end I decided to solve it myself, and after several hours of experimenting I found a solution that worked.

I used PHP's output buffering to get rid of the stray output.
Add a line with ob_start(); at the top of the PHP file, then call ob_end_clean(); just before the template is displayed.
Call ob_end_flush(); after the template has been displayed.
That solved the problem. The overall structure of the example code is:

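<?php
ob_start();     // start buffering; the PHP logic (including any includes) goes here
ob_end_clean(); // discard the buffered stray output (such as BOMs) and stop buffering; the PHP template is displayed here
// ob_end_clean() has already closed the buffer opened above, so only flush if another buffer is still active
if (ob_get_level() > 0) {
    ob_end_flush();
}
?>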
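A related way to use output buffering is to capture the output and strip the BOM bytes from it, instead of discarding everything that was buffered. The following is only a minimal sketch of that idea: the helper names strip_leading_boms and render_without_bom are hypothetical and do not come from the original post, and the callable stands in for whatever code includes the templates.

```php
<?php
// Hypothetical helper: remove any UTF-8 BOMs (bytes EF BB BF) from the start of a string.
function strip_leading_boms(string $text): string
{
    $bom = "\xEF\xBB\xBF";
    while (strncmp($text, $bom, 3) === 0) {
        // The cast covers older PHP versions, where substr() may return false.
        $text = (string) substr($text, 3);
    }
    return $text;
}

// Hypothetical helper: run a callable that echoes output, capture that output
// in a buffer, and return it with the leading BOMs removed.
function render_without_bom(callable $render): string
{
    ob_start();                       // start capturing everything that is echoed
    $render();                        // e.g. include the template files here
    $output = (string) ob_get_clean(); // fetch the buffer contents and stop buffering
    return strip_leading_boms($output);
}
```

Note that this sketch only strips BOMs at the very start of the output. A BOM emitted in the middle of the page by a later include would have to be handled separately, and saving the files without a BOM remains the cleaner fix.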

Additional notes from other users:
A problem that seemed impossible to solve during development.
The page was UTF-8 encoded, and the header and footer were pulled in as included template files. As a result, a blank line about 10 px high appeared at both the header and the footer, with nothing in it.
The reason is that all the files were UTF-8 encoded with a BOM. When they are included, the final byte stream contains multiple UTF-8 BOM markers, and IE cannot handle multiple BOMs properly.
It renders each BOM as an actual line break, which produces the blank lines; Firefox does not.
Therefore, if a template includes multiple UTF-8 files, use UltraEdit's Save As function and choose the UTF-8 without BOM format when saving each file.
In addition, if a Chinese page places the title tag before the <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> tag, the page is rendered blank.
Therefore, UTF-8 pages should use the standard order:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="content-language" content="zh-CN" />
<meta name="robots" content="index, follow" />
<meta name="keywords" content="" />
<meta name="description" content="" />
<meta name="rating" content="general" />
<meta name="author" content="" />
<meta name="copyright" content="" />
<meta name="generator" content="" />
<title></title>

The BOM header is \xEF\xBB\xBF. PHP 4 and PHP 5 both pay no attention to the BOM, so it is output directly, before the script itself is parsed.
The W3C FAQ covers this issue specifically:

http://www.w3.org/International/questions/qa-utf8-bom

The details are as follows:

UCS contains a character called ZERO WIDTH NO-BREAK SPACE, encoded as FEFF. FFFE is not a character in UCS, so it should never appear in actual transmission. The UCS specification recommends sending the character ZERO WIDTH NO-BREAK SPACE before the byte stream itself. That way, if the receiver gets FEFF, the byte stream is big-endian; if it gets FFFE, the byte stream is little-endian. For this reason the character ZERO WIDTH NO-BREAK SPACE is also called the BOM.
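The byte-order check described above can be sketched in a few lines. This is only an illustration under the assumption that the input is the raw start of a UCS-2/UTF-16 byte stream; the function name detect_byte_order is hypothetical.

```php
<?php
// Hypothetical helper: guess the byte order of a UCS-2/UTF-16 byte stream
// from its first two bytes.
function detect_byte_order(string $bytes): string
{
    $head = substr($bytes, 0, 2);
    if ($head === "\xFE\xFF") {
        return 'big-endian';    // FEFF received in order
    }
    if ($head === "\xFF\xFE") {
        return 'little-endian'; // FEFF received with its bytes swapped
    }
    return 'unknown';           // no BOM at the start of the stream
}
```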

UTF-8 does not need a BOM to indicate byte order, but a BOM can be used to indicate the encoding. The UTF-8 encoding of the character ZERO WIDTH NO-BREAK SPACE is EF BB BF. So if the receiver gets a byte stream that starts with EF BB BF, it knows the stream is UTF-8 encoded.
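To see where EF BB BF comes from, the sketch below encodes the code point FEFF by hand using the standard three-byte UTF-8 layout (1110xxxx 10xxxxxx 10xxxxxx) and then uses the result to test a byte string. Both function names are hypothetical, and the encoder deliberately handles only code points in the range U+0800 to U+FFFF.

```php
<?php
// Hypothetical helper: encode a code point in the range U+0800..U+FFFF
// as three UTF-8 bytes (1110xxxx 10xxxxxx 10xxxxxx).
function utf8_encode_3byte(int $cp): string
{
    return chr(0xE0 | ($cp >> 12))          // top 4 bits
         . chr(0x80 | (($cp >> 6) & 0x3F))  // middle 6 bits
         . chr(0x80 | ($cp & 0x3F));        // low 6 bits
}

// Hypothetical helper: true if the byte string starts with the UTF-8 BOM.
function starts_with_utf8_bom(string $bytes): bool
{
    return strncmp($bytes, utf8_encode_3byte(0xFEFF), 3) === 0;
}
```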

Windows is an operating system that uses the BOM to mark the encoding of text files. The notes below apply to Windows XP Professional with Chinese as the default character set:

1) Notepad: automatically recognizes UTF-8 files that have no BOM, but gives you no control over whether a BOM is written when saving. Whenever it saves a file as UTF-8, the BOM is added.

2) EditPlus: cannot automatically recognize UTF-8 files that have no BOM. When you save a file in UTF-8 format, it does not write a BOM header.

3) UltraEdit: has the most capable character encoding support. It can automatically recognize UTF-8 files with and without a BOM (configurable), and when saving you can choose through the configuration whether a BOM is added.

(Note that when saving a new file, you must use Save As and choose UTF-8 without BOM.)

Later we found that Notepad++ also handles the UTF-8 BOM well, and we recommend it.

To remove the BOM in Dreamweaver: press Ctrl+J, choose Title/Encoding, and deselect the Unicode signature (BOM) option.

The BOM is a hidden sequence of bytes at the start of a file that some editors use to identify the file as UTF-8 encoded. PHP, however, does not skip these bytes when it reads the file, so they end up as unrecognized characters at the beginning of the output.

For example, if a PHP file that generates an image is saved as UTF-8 with a BOM, the hidden BOM at the start of the file is sent along with the output. The resulting image data is corrupted and the browser cannot display it.

To detect whether a UTF-8 file contains a BOM, check whether its first three bytes are 0xEF, 0xBB, 0xBF. The following small program goes through all files in a directory and checks each one for a BOM.

Save the following code as del_bom.php, set the directory to check, and run it. It can help you find which files contain a BOM and are causing the blank area at the top of every page.

<?php
// Quickly checks whether UTF-8 encoded files start with a BOM and can remove it automatically.
// By Bob Shen
$basedir = "."; // directory to check; change this line as needed. The dot means the current directory.
$auto = 1;      // whether to automatically remove a detected BOM: 1 is yes, 0 is no.
// Do not change anything below
if ($dh = opendir($basedir)) {
    while (($file = readdir($dh)) !== false) {
        // Skip the "." and ".." entries and any subdirectories
        if ($file != '.' && $file != '..' && !is_dir($basedir . "/" . $file)) {
            echo "filename: $file " . checkbom("$basedir/$file") . "<br>";
        }
    }
    closedir($dh);
}

// Checks the first three bytes of the file for EF BB BF (239 187 191).
function checkbom($filename) {
    global $auto;
    $contents = file_get_contents($filename);
    $charset[1] = substr($contents, 0, 1);
    $charset[2] = substr($contents, 1, 1);
    $charset[3] = substr($contents, 2, 1);
    if (ord($charset[1]) == 239 &&
        ord($charset[2]) == 187 &&
        ord($charset[3]) == 191) {
        if ($auto == 1) {
            // Write the file back without its first three bytes
            $rest = substr($contents, 3);
            rewrite($filename, $rest);
            return "<font color=red>BOM found, automatically removed.</font>";
        } else {
            return "<font color=red>BOM found.</font>";
        }
    }
    return "BOM not found.";
}

// Overwrites the file with the given data under an exclusive lock.
function rewrite($filename, $data) {
    $filenum = fopen($filename, "w");
    flock($filenum, LOCK_EX);
    fwrite($filenum, $data);
    fclose($filenum);
}
?>
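The script above checks a single directory and loads each file in full. As a variation, the sketch below walks subdirectories as well, reads only the first three bytes of each file, and reports matches without modifying anything. It is not part of the original script; the function name find_bom_files is hypothetical, and it relies on PHP's built-in RecursiveDirectoryIterator and RecursiveIteratorIterator classes.

```php
<?php
// Hypothetical helper: return the paths of all files under $dir (including
// subdirectories) that start with a UTF-8 BOM. Files are only read, never changed.
function find_bom_files(string $dir): array
{
    $found = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if (!$file->isFile()) {
            continue;
        }
        $handle = fopen($file->getPathname(), 'rb');
        if ($handle === false) {
            continue; // unreadable file, skip it
        }
        $head = fread($handle, 3); // the BOM, if present, is the first three bytes
        fclose($handle);
        if ($head === "\xEF\xBB\xBF") {
            $found[] = $file->getPathname();
        }
    }
    sort($found); // stable, predictable order for reporting
    return $found;
}
```

A call such as find_bom_files(".") would list the affected files under the current directory, so they can be reviewed before anything is rewritten.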
