Fixing the page display problem caused by the UTF-8 BOM after PHP reads a CSV file

Source: Internet
Author: User
date.csv (fields are separated by tabs):
"ID" "NAME" "EMAIL"
"1" "Xiao Ming" "xm@163.com"
"2" "Xiao Dong" "xd@sina.com"
"3" "Little Less" "shaozi@hotmai.com"

Read this CSV file with the following code:
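<?php
// Open the tab-separated file for reading
$handle = fopen('date.csv', 'r');
// Read one row at a time; fgetcsv() returns false at end of file
while (($data = fgetcsv($handle, 10000, "\t")) !== false)
{
    echo "$data[0] $data[1] $data[2]<br>";
}
fclose($handle);
?>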

Displayed on the page, the result looks like this:
"ID" NAME EMAIL
1 Xiao Ming xm@163.com
2 Xiao Dong xd@sina.com
3 Little Less shaozi@hotmai.com
The default field enclosure character of fgetcsv() is the double quote.
So why are all the other fields read correctly, while ID is still wrapped in double quotes?

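After some searching online, the cause turned out to be that PHP does not recognize the UTF-8 BOM. The three BOM bytes sit in front of the first double quote, so fgetcsv() does not see that quote as the start of an enclosed field and returns the quotes as part of the value.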
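One way to work around this in code is to skip the BOM before handing the stream to fgetcsv(). The following is only a minimal sketch: the helper name open_csv_skipping_bom() and the temporary demo file are assumptions made for illustration, not part of the original article.

```php
<?php
// Hypothetical helper (not from the original article): open a file and
// leave the stream positioned just past a UTF-8 BOM if one is present.
function open_csv_skipping_bom(string $path)
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return false;
    }
    // The UTF-8 BOM is the three bytes EF BB BF.
    if (fread($handle, 3) !== "\xEF\xBB\xBF") {
        rewind($handle); // no BOM: start again from the first byte
    }
    return $handle;
}

// Demo data: a temporary tab-separated file that starts with a BOM.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents(
    $path,
    "\xEF\xBB\xBF\"ID\"\t\"NAME\"\t\"EMAIL\"\n\"1\"\t\"Xiao Ming\"\t\"xm@163.com\"\n"
);

$rows = [];
$handle = open_csv_skipping_bom($path);
// Enclosure and escape are passed explicitly to avoid relying on defaults.
while (($data = fgetcsv($handle, 10000, "\t", '"', '\\')) !== false) {
    $rows[] = $data;
}
fclose($handle);
unlink($path);

echo $rows[0][0], "\n"; // ID, without the surrounding quotes
```

With the BOM out of the way, the first field starts with the enclosure character again, so it is parsed like every other field.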
Here is what we found:
The Unicode specification has the concept of a BOM, short for byte order mark.
One description of the BOM reads:
The UCS encoding contains a character called "ZERO WIDTH NO-BREAK SPACE", whose code is FEFF. FFFE is not a character in UCS, so it should never appear in an actual transmission. The UCS specification recommends transmitting the character "ZERO WIDTH NO-BREAK SPACE" before the byte stream itself. If the receiver gets FEFF, the byte stream is big-endian; if it gets FFFE, the byte stream is little-endian. That is why the character "ZERO WIDTH NO-BREAK SPACE" is also called the BOM.

UTF-8 does not need a BOM to indicate byte order, but a BOM can still be used to mark the encoding. The UTF-8 encoding of the character "ZERO WIDTH NO-BREAK SPACE" is EF BB BF, so a receiver that gets a byte stream beginning with EF BB BF knows that it is UTF-8 encoded.
Windows uses the BOM to mark the encoding of a text file.
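The byte patterns described above can be checked in PHP. This is a rough sketch only: detect_bom() is a hypothetical helper name, and it looks at just the three BOMs mentioned here (it does not try to tell UTF-32 apart from UTF-16).

```php
<?php
// Hypothetical helper: name the encoding signalled by a leading BOM,
// or return null when the bytes start with none of these three BOMs.
function detect_bom(string $bytes): ?string
{
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8';
    }
    if (strncmp($bytes, "\xFE\xFF", 2) === 0) {
        return 'UTF-16 big-endian';
    }
    if (strncmp($bytes, "\xFF\xFE", 2) === 0) {
        return 'UTF-16 little-endian';
    }
    return null;
}

// Example: the first bytes of a UTF-8 file saved by Notepad.
var_dump(detect_bom("\xEF\xBB\xBF\"ID\""));
```

In practice the argument would be the first few bytes read from the file, for example with fread().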

In addition, the BOM FAQ on the Unicode website describes the BOM in detail. Being official it is naturally authoritative, but it is only available in English and takes some effort to read.
In a UTF-8 encoded file, the BOM occupies three bytes. If you save a text file as UTF-8 in Notepad, open it in UE (UltraEdit), and switch to hex edit mode, you will see FF FE at the beginning, which is how UE displays the BOM; on disk the three bytes are EF BB BF. This is a handy way to identify a UTF-8 encoded file: software uses the BOM to recognize that a file is UTF-8, and many programs even require the files they read to carry a BOM. However, a lot of software still does not recognize the BOM. When I was looking into Firefox, I learned that early versions of Firefox did not support a BOM in extensions, while versions after 1.5 do. Now it turns out that PHP does not support the BOM either.

PHP was not designed with the BOM in mind; that is, it does not skip the three BOM bytes at the beginning of a UTF-8 encoded file. You therefore have to use your editor's "Convert -> UTF-8 to ASCII" command, or choose ASCII encoding under Save As. If the file uses DOS-format line endings, you can open it in Notepad, click Save As, and select ASCII encoding. If the file contains Chinese characters, use the Save As function of UE and select "UTF-8 no BOM".


According to the Bo-Blog wiki, in EditPlus you need to save the file as GB first and then save it again as UTF-8. Be careful, though: every character that is not covered by the GBK encoding is lost. If the file contains such characters (for example, text in other non-Chinese scripts), do not use this method. (On this small point, UE (UltraEdit-32) really is much better than EditPlus; EditPlus is too lightweight.)

In addition, I found another way, which uses the file editor that WordPress provides. This method has no such restriction and needs no special editor; after all, everyone here is using WordPress. First give the file write permission over FTP, then go to the WordPress admin area, open Manage > File Editor, enter the path of the file, and click Edit File. In the editing interface you cannot see the three leading characters, but that does not matter: place the cursor before the first character of the file and press Backspace. Then click Update File. After refreshing in FTP you can see that the file is 3 bytes smaller, and you are done.
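The same three bytes can also be removed with a few lines of PHP instead of an editor. The sketch below is an illustration under stated assumptions: strip_utf8_bom() is a hypothetical helper, it loads the whole file into memory, and the temporary file only stands in for a real script or template.

```php
<?php
// Hypothetical helper: rewrite a file without its leading UTF-8 BOM.
// Returns true if a BOM was removed; false if there was no BOM or if
// the file could not be read or written.
function strip_utf8_bom(string $path): bool
{
    $contents = file_get_contents($path);
    if ($contents === false || strncmp($contents, "\xEF\xBB\xBF", 3) !== 0) {
        return false;
    }
    return file_put_contents($path, substr($contents, 3)) !== false;
}

// Demo: a temporary file that starts with a BOM followed by 5 bytes.
$path = tempnam(sys_get_temp_dir(), 'bom');
file_put_contents($path, "\xEF\xBB\xBFhello");

$before = filesize($path);
$removed = strip_utf8_bom($path);
clearstatcache(); // filesize() results are cached per path
$after = filesize($path);
unlink($path);

echo $before - $after, "\n"; // 3
```

As with the WordPress editor, the file ends up exactly 3 bytes smaller; keep a backup before rewriting real files.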

Finally, this issue matters to anyone who writes their own plugins, edits plugins for their own use, or modifies templates (which probably everyone needs to do). It is best to understand the points above so that you are not at a loss when the problem occurs.
