I'm a complete beginner, so this question may be a bit basic; experts, please bear with me...
When writing PHP, I need to echo special characters such as the " symbol. Is it better to write it with the PHP escape \" or with the HTML code &quot;?
Any advice is appreciated, thank you!
Thank you, everyone! I understand now; thank you very much for the patient answers!
P.S. The PHP experts really are something else; the answers are so detailed. Thanks again!
Reply content:
To decide which is better, we first need to understand the difference between the two:
Using an escape is equivalent to outputting the original character. Because the special character is output as the literal character, the result depends on the page encoding and on the encoding the browser uses. If the page encoding (such as GBK) does not contain the special character, or the browser's encoding does not contain it, the character will be garbled.
The proper name for the "HTML code" you mention is the HTML character entity (in English, HTML entities). With character entities, the browser converts the code into the correct character, so there are fewer requirements on the encoding. For more about character entities, see these pages: HTML Character Entities | HTML Entities
In summary, escapes are easy to write and read, but they place requirements on how the page is encoded. HTML character entities place fewer requirements on the encoding for display, but make the source harder to write and read. So my personal advice is: for a fairly common character such as ", escaped output is better, while special characters that may not be covered by the page encoding are better represented by character entities. As for the entities themselves, writing the entity number gives greater compatibility, but I personally prefer the entity name because it is easier to read.
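As a small illustration of the entity name versus entity number point, the PHP sketch below decodes both forms of one symbol and checks that they give the same character. The copyright sign is only a sample chosen for this example, and the variable names are hypothetical.

```php
<?php
// Decode a named entity and its numeric equivalent.
$fromName   = html_entity_decode('&copy;', ENT_QUOTES, 'UTF-8');
$fromNumber = html_entity_decode('&#169;', ENT_QUOTES, 'UTF-8');

// Both forms describe the same character, U+00A9.
var_dump($fromName === $fromNumber);
var_dump($fromName === "\u{00A9}");
```

Either spelling reaches the browser as plain ASCII, which is why entities do not depend on the page encoding.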
Not all escaping is the same
Intuitively, when a string itself contains the delimiter (or any other character that has a special meaning), that character must be escaped. But although escaping serves the same purpose in every language, what gets escaped and how it is written are often completely different.
"Is it better to escape with \" or to use the HTML code &quot;?" The former is PHP string escaping; the latter is HTML escaping. They are used in different situations, so do not confuse them: they are not equivalent concepts that can be compared at all.
In the question, the former uses PHP string escaping: inside a double-quoted string, escapes such as \" = " and \n = LF (0x0A) are allowed. The latter involves HTML escaping, which uses HTML entities to represent characters, such as &amp; = &.
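A minimal sketch of the difference, with hypothetical variable names: the PHP escape disappears as soon as the source is parsed, while the HTML entity is just six ordinary characters until an HTML parser (simulated here with `html_entity_decode`) reads it.

```php
<?php
// PHP string escape: the backslash exists only in the source code.
$phpEscaped = "\"";   // a one-character string containing "
$newline    = "\n";   // a one-character string containing LF

// HTML escape: to PHP, the entity is ordinary text.
$htmlEscaped = '&quot;';

var_dump(strlen($phpEscaped));      // length of the PHP-escaped quote
var_dump(ord($newline) === 0x0A);   // the escape produced a line feed
var_dump(strlen($htmlEscaped));     // length of the entity text

// Only an HTML parser turns the entity back into a quote.
var_dump(html_entity_decode($htmlEscaped, ENT_QUOTES, 'UTF-8') === $phpEscaped);
```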
Neither escape can be omitted
PHP escaping cannot be omitted, needless to say (most of the time, leaving it out is an obvious parse error!).
But here is the catch: PHP = Hypertext Preprocessor, and what PHP outputs is an HTML page. String constants written in PHP are often sent on to the browser's HTML parser. That is also the case in the question.
So it is important to be clear: the asker's requirement most likely involves two escapes. First write a valid PHP string constant, and then make sure that when the browser parses the HTML, what is displayed is still identical to the string, rather than:
- Special characters losing their effect
- The HTML DOM structure being broken
- Code being injected, creating security issues
Neither of the two escapes, PHP and HTML, can be omitted. The conclusion for the question is: the former only does the PHP escape, which works when the text is placed in the HTML body but not when it is placed inside a tag's attribute value. The latter is HTML-escaped and happens to need no PHP escaping (so it is as if it had been escaped), so it works. But the important thing is to know why it works, rather than trying things until it works by coincidence.
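The body versus attribute point can be sketched as follows. The `$title` value and the markup are made up for illustration, and `htmlspecialchars` is used here as one way to do the HTML escape.

```php
<?php
// Hypothetical data: a title that itself contains double quotes.
$title = "say \"hi\"";   // PHP escape only; the string holds: say "hi"

// In the HTML body the raw quotes are harmless.
$body = '<p>' . $title . '</p>';

// In an attribute value the raw quotes end the attribute early.
$broken = '<a title="' . $title . '">link</a>';

// HTML-escaping at output time keeps the attribute intact.
$safe = '<a title="' . htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . '">link</a>';

echo $body, "\n", $broken, "\n", $safe, "\n";
```

In `$broken` the browser sees the attribute end after `say `, which is the "destroyed structure" case from the list above.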
The practice I recommend
Although the second approach works, I think echo '&quot;'; is definitely a very bad practice. If there were a ranking of "worst PHP programming habits", this one would certainly be strong enough to make the list.
The reason is that PHP first constructs a string and then processes it into HTML for output, and these steps have an order. Writing the HTML entity directly in the PHP string constant mixes the logic of the two escaping stages together.
My preferred method is to keep the string contents in PHP as they are, and wrap them with the htmlentities or htmlspecialchars function just before output. (Note that the two functions differ slightly; there is plenty of material about this online.)
header("Content-Type: text/html; charset=utf-8");
echo htmlentities("\"M&M\""); # Displays: "M&M"  View source: &quot;M&amp;M&quot;
This way, when writing the source code, you only need to care about PHP's own logic. In the end, the machine guarantees that what is displayed on the front end is consistent with the string itself.
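To make the "slightly different" remark concrete, here is a small comparison of the two functions on one sample string. The sample text is an assumption for illustration, and the flags and charset are passed explicitly because the defaults have changed between PHP versions.

```php
<?php
// Sample string with a quote, an ampersand and a non-ASCII symbol (U+00A9).
$text = "\u{00A9} \"M&M\"";

// htmlspecialchars() only touches the characters special to HTML syntax.
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// htmlentities() also converts characters that have a named entity.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $special, "\n", $entities, "\n";
```

With a UTF-8 page, `htmlspecialchars` is usually enough; `htmlentities` additionally turns the copyright sign into `&copy;`.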
You may often come across a literal "&amp;" showing up inexplicably on web pages. This is not just a matter of escaping at the wrong point; it comes from scattering the escaping all through the process, which ends up HTML-escaping the same text twice.
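The double-escape mistake is easy to reproduce. In this sketch (variable names are hypothetical) the same text is escaped once and twice, and a single `html_entity_decode` call stands in for what the browser would display.

```php
<?php
$raw = 'M&M';

$once  = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');   // escaped once, at output
$twice = htmlspecialchars($once, ENT_QUOTES, 'UTF-8');  // escaped again somewhere else

// What a browser would display for each version (one decode pass).
$shownOnce  = html_entity_decode($once, ENT_QUOTES, 'UTF-8');
$shownTwice = html_entity_decode($twice, ENT_QUOTES, 'UTF-8');

echo $once, ' | ', $twice, "\n";
echo $shownOnce, ' | ', $shownTwice, "\n";
```

Escaping exactly once, at the moment of output, avoids the stray entity in `$shownTwice`.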
Data may be output to many places, but there is only one copy of the data. It is also important not to confuse the original data with the form you happen to see it in.
How to deal with special symbols?
On how to handle special symbols such as ©, as well as Chinese and other encoding-related matters, my view is not quite the same as the previous answerer's. My idea is:
- Write any special symbols literally in the string. The same goes for Chinese.
- Escaping is still done with htmlentities.
- Store PHP files as UTF-8 without BOM wherever possible.
- If the output encoding is not UTF-8, add a layer of iconv conversion after the HTML escaping.
This is because I think escaping and encoding are matters of two separate stages and should not be mixed together. The model is like Russian nesting dolls: when assembling, you must fit the small one first and then put the large one around it; when taking them apart, you must remove the outer one first and then the inner one. Taking GBK as an example, the logic is:
1. First, there is the original data:
©公司"2014"
2. Make it into a PHP string constant:
"©公司\"2014\""
Write it like this in the PHP script file.
3. Escape to HTML entities:
&copy;公司&quot;2014&quot;
Here PHP knows that these characters are in the string, and that is enough; internally it should be Unicode, but we do not care.
4. Convert to GBK encoding:
&copy;+b9abcbbe+&quot;2014&quot;
The +b9abcbbe+ in the middle stands for the raw 4-byte GBK encoding.
5. The browser decodes GBK:
&copy;公司&quot;2014&quot;
6. The browser then decodes the HTML entities:
©公司"2014"
7. Output as is.
Removing the encoding conversion is nothing more than removing steps 4 and 5; it has nothing whatsoever to do with escaping.
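The walk-through above can be sketched in PHP roughly as follows. This is only an illustration: it assumes the file is saved as UTF-8 without BOM and that the iconv extension on the system supports GBK, and the browser side is simulated with `iconv` and `html_entity_decode`.

```php
<?php
// Assumes this file is saved as UTF-8 without BOM and that iconv supports GBK.
$raw = "©公司\"2014\"";                                // steps 1-2: the data as a PHP string

$escaped = htmlentities($raw, ENT_QUOTES, 'UTF-8');    // step 3: HTML entities
$gbk     = iconv('UTF-8', 'GBK', $escaped);            // step 4: bytes sent to the browser

// Browser side, simulated: undo the layers in reverse order.
$decoded = iconv('GBK', 'UTF-8', $gbk);                // step 5: decode GBK
$shown   = html_entity_decode($decoded, ENT_QUOTES, 'UTF-8'); // step 6: decode entities

echo bin2hex(iconv('UTF-8', 'GBK', '公司')), "\n";      // raw GBK bytes of the two Chinese characters
var_dump($shown === $raw);                             // step 7: displayed text equals the original
```

Dropping the two `iconv` calls leaves the escaping steps untouched, which is the point of keeping the stages separate.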
In HTML, what must be escaped has to be escaped (for example, spaces: &nbsp;).
Never trust anything.