Has anyone run into a similar situation?
Because the problem did not appear every time, and even the simplest code was affected, I never thought to look at the code itself and kept wondering whether a PHP extension was at fault.
Later a colleague removed the BOM and everything worked. I was surprised, and even more frustrated... how embarrassing.
The problem is solved, but what exactly was the cause?
Reply content:
Unicode itself is a single standard, but there is more than one way to encode it. Even within one encoding, the byte order (big-endian or little-endian) is not fixed in advance. So simply saying "open a Unicode file" is harder than it sounds.
A BOM is the reserved Unicode character U+FEFF, encoded with whatever encoding the file is stored in and placed at the very front of the file content. By trying the different Unicode encodings against the first bytes of the file, a reader can work out both the encoding and the byte order. The cost is a few extra bytes at the start of the file: two for UTF-16, three for UTF-8, four for UTF-32.
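To make the byte sequences concrete, here is a minimal Python sketch of BOM sniffing. The function name `detect_bom` and the `BOMS` table are made up for illustration; the BOM constants come from Python's standard `codecs` module. It only recognizes a BOM and does not try to guess the encoding of files that have none.

```python
import codecs

# Known BOM byte sequences. UTF-32 is checked before UTF-16 because the
# UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (encoding, bom_length) if data starts with a BOM, else (None, 0)."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


# U+FEFF encoded in each encoding gives exactly the BOM bytes:
# utf-8 -> efbbbf, utf-16-le -> fffe, utf-16-be -> feff,
# utf-32-le -> fffe0000, utf-32-be -> 0000feff
for encoding in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
    print(encoding, "\ufeff".encode(encoding).hex())
```

For example, `detect_bom(open("file.php", "rb").read(4))` would report `("utf-8", 3)` for a file saved by an editor that adds the UTF-8 signature (`file.php` is a placeholder name).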
With a BOM, every program that reads the file has to be adapted to handle it, which is a lot of churn. For that reason the BOM is generally not considered a good idea. I can think of two problems a BOM causes:
- PHP can no longer send headers (the BOM counts as output that has already started)
- The shebang line (#!) of a UNIX executable script cannot be read
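Both problems come down to the same thing: the first bytes of the file are no longer what the consumer expects. The Python sketch below illustrates this on in-memory byte strings. The helper names `starts_with_shebang` and `leading_output` are hypothetical, and the sketch only inspects bytes; it does not run PHP or a shell.

```python
import codecs


def starts_with_shebang(data: bytes) -> bool:
    """True if the very first two bytes are '#!', as a script loader expects."""
    return data[:2] == b"#!"


def leading_output(data: bytes) -> bytes:
    """Bytes before the first '<?php' tag. PHP emits these verbatim as output."""
    pos = data.find(b"<?php")
    return data if pos == -1 else data[:pos]


shell_script = b"#!/bin/sh\necho hello\n"
php_script = b"<?php header('Location: /'); ?>"

print(starts_with_shebang(shell_script))                    # True
print(starts_with_shebang(codecs.BOM_UTF8 + shell_script))  # False

print(leading_output(php_script))                    # b''
print(leading_output(codecs.BOM_UTF8 + php_script))  # b'\xef\xbb\xbf'
```

In the PHP case those three bytes are sent to the client before `header()` runs, which is what triggers the "headers already sent" failure.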
Using UTF-8 without a BOM everywhere is the most practical strategy and causes the least hassle. UTF-8 is the best practice for Unicode, bar none.
Note that Microsoft software often adds a BOM when you do not want one. The most typical example is Notepad, which saves with a BOM. So never get lazy and edit PHP with Notepad. Notepad++, a point of pride for Chinese developers, is the perfect choice on Windows.
Never forget that Microsoft is technologically backward, can only sneer at the open source community that has overtaken it, and is a cancer on the industry that refuses to truly fix its own problems. Those are harsh words, but even if things are not quite that bad, they are not far off. Make life easier by staying away from Microsoft.
BOM: Byte Order Mark
The UTF-8 BOM is also called the UTF-8 signature. The BOM actually serves no purpose for UTF-8 itself; it exists for the sake of UTF-16 and UTF-32. The signature just tells weak editors such as Notepad which encoding the file uses, so that they can identify it.
PHP was not designed with the BOM in mind, so a BOM easily causes strange problems such as failed encoding conversion or broken page styles. The problem is also well hidden, and it is hard to tell which file is responsible (imagine searching thousands of project files for the one with a BOM, without any tools).
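Finding the offending files does not have to be done by hand. Here is a minimal Python sketch that walks a project tree and reports files starting with the UTF-8 BOM. The function name `find_bom_files` and the default `.php` filter are assumptions for illustration, not part of any tool mentioned above.

```python
import codecs
import os


def find_bom_files(root: str, extensions=(".php",)):
    """Yield paths under root whose content starts with the UTF-8 BOM."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Skip files whose extension is not in the filter.
            if not name.endswith(tuple(extensions)):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                # Only the first three bytes matter: EF BB BF.
                if handle.read(3) == codecs.BOM_UTF8:
                    yield path
```

Calling `list(find_bom_files("."))` from the project root returns the affected paths. Pass a different `extensions` tuple to include templates or config files.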
The BOM consists of hidden, non-editable characters, much like ordinary whitespace. When we write
{BOM header}
and file.php is then included by another file, the BOM sits outside the PHP tags, so it is sent to the browser as output, and that causes problems.
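Young man, cherish your life and stay away from the BOM.
Ordinary editors such as Notepad and WordPad cannot detect UTF-8 with a BOM. You need a text editor such as EditPlus or UltraEdit to detect it, and then save the file as UTF-8 without BOM.
The problem was probably caused by editing the file with a weak text editor such as Notepad.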
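If no suitable editor is at hand, the "save as UTF-8 without BOM" step can also be scripted. Here is a minimal Python sketch; the function name `strip_utf8_bom` is hypothetical. It rewrites the file in place, so try it on a copy or under version control first.

```python
import codecs


def strip_utf8_bom(path: str) -> bool:
    """Rewrite the file without its leading UTF-8 BOM.

    Returns True if a BOM was removed, False if the file had none.
    """
    with open(path, "rb") as handle:
        data = handle.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as handle:
        # Drop only the three signature bytes; the rest is left untouched.
        handle.write(data[len(codecs.BOM_UTF8):])
    return True
```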