PHP htmlspecialchars function
Definition and usage
The htmlspecialchars() function converts certain predefined characters into HTML entities.
Certain characters have special significance in HTML and should be represented by HTML entities if they are to preserve their meanings. This function returns a string with some of these conversions made; the translations made are those most useful for everyday web programming. If you require all HTML character entities to be translated, use htmlentities() instead.
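As a rough illustration of the difference between the two functions, the sketch below runs the same input through both. The sample string is made up for this example, and the flags and charset are passed explicitly so the result does not depend on version-specific defaults.

```php
<?php
// "caf\xC3\xA9" is "café" written out as explicit UTF-8 bytes.
$input = "caf\xC3\xA9 & <b>";

// htmlspecialchars() touches only the HTML-significant characters.
$special = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// htmlentities() also converts characters that have a named entity, such as "é".
$entities = htmlentities($input, ENT_QUOTES, 'UTF-8');

echo $special, "\n";  // the "é" stays as it is
echo $entities, "\n"; // caf&eacute; &amp; &lt;b&gt;
```

For output that is already served as UTF-8, the smaller set of conversions done by htmlspecialchars() is usually enough.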
This function is useful for preventing user-supplied text from containing HTML markup, for example in a message board or guest book application.
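A minimal sketch of the guest book case might look like the following. The $comment value is a hypothetical visitor submission, not something taken from a real application.

```php
<?php
// Hypothetical guest book entry submitted by a visitor.
$comment = '<script>alert("hi")</script> Nice site!';

// Escape the entry before placing it in the page, so the browser
// displays the markup as text instead of interpreting it.
$safe = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

echo '<p>', $safe, '</p>', "\n";
```

The script tag arrives in the browser as harmless text, which is the behaviour a message board normally wants.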
The translations performed are:
'&' (ampersand) becomes '&amp;'
'"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
''' (single quote) becomes '&#039;' only when ENT_QUOTES is set.
'<' (less than) becomes '&lt;'
'>' (greater than) becomes '&gt;'
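The list above can be checked with a short loop. This is only a sketch: it passes ENT_QUOTES explicitly so that the single quote is included, and the variable names are chosen for this example.

```php
<?php
// The five characters that htmlspecialchars() can translate.
$chars = array('&', '"', "'", '<', '>');

$map = array();
foreach ($chars as $c) {
    // ENT_QUOTES makes both quote characters eligible for translation.
    $map[$c] = htmlspecialchars($c, ENT_QUOTES, 'UTF-8');
    echo $c, ' => ', $map[$c], "\n";
}
```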
string
The string to be converted.
quote_style
The optional second argument, quote_style, tells the function how to handle single and double quotes. The default mode, ENT_COMPAT, is the backward-compatible mode: it translates only double quotes and leaves single quotes untranslated (as of PHP 8.1 the default is ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401). If ENT_QUOTES is set, both single and double quotes are translated; if ENT_NOQUOTES is set, neither single nor double quotes are translated.
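To make the three modes concrete, here is a small sketch that applies each one to the same made-up sentence. Each mode is passed explicitly, since the default differs between PHP versions.

```php
<?php
$input = "He said \"hi\" and 'bye'";

// ENT_COMPAT: double quotes only.
$compat = htmlspecialchars($input, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES: both single and double quotes.
$quotes = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// ENT_NOQUOTES: neither kind of quote.
$noquotes = htmlspecialchars($input, ENT_NOQUOTES, 'UTF-8');

echo $compat, "\n";   // He said &quot;hi&quot; and 'bye'
echo $quotes, "\n";   // He said &quot;hi&quot; and &#039;bye&#039;
echo $noquotes, "\n"; // He said "hi" and 'bye'
```

ENT_QUOTES is generally the safer choice when the result may end up inside an attribute value delimited by single quotes.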
charset
Defines the character set used in the conversion. The default character set is ISO-8859-1 (UTF-8 as of PHP 5.4).
For the purposes of this function, the character sets ISO-8859-1, ISO-8859-15, UTF-8, cp866, cp1251, cp1252, and KOI8-R are effectively equivalent, because the characters affected by htmlspecialchars() occupy the same positions in all of these character sets.
The following character sets are supported in PHP 4.3.0 and later versions.
Supported character sets

Charset       Aliases                        Description
ISO-8859-1    ISO8859-1                      Western European, Latin-1.
ISO-8859-15   ISO8859-15                     Western European, Latin-9. Adds the Euro sign and the French and Finnish letters missing in Latin-1 (ISO-8859-1).
UTF-8         (none)                         ASCII-compatible multibyte 8-bit Unicode.
cp866         ibm866, 866                    DOS-specific Cyrillic character set. Supported as of 4.3.2.
cp1251        Windows-1251, win-1251, 1251   Windows-specific Cyrillic character set. Supported as of 4.3.2.
cp1252        Windows-1252, 1252             Windows-specific character set for Western European.
KOI8-R        koi8-ru, koi8r                 Russian. Supported as of 4.3.2.
BIG5          950                            Traditional Chinese, mainly used in Taiwan.
GB2312        936                            Simplified Chinese, national standard character set.
BIG5-HKSCS    (none)                         Big5 with Hong Kong extensions, Traditional Chinese.
Shift_JIS     SJIS, 932                      Japanese.
EUC-JP        EUCJP                          Japanese.
Note: Any other character set is not recognized, and ISO-8859-1 will be used instead.
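The effect of the charset argument can be sketched as follows. The sample bytes are assumptions chosen for illustration, and the empty-string result for invalid input reflects the behaviour of PHP 5.4 and later when neither ENT_IGNORE nor ENT_SUBSTITUTE is set.

```php
<?php
// "é" as valid UTF-8 (two bytes) followed by a tag.
$utf8 = htmlspecialchars("caf\xC3\xA9 <b>", ENT_QUOTES, 'UTF-8');

// "é" as a single ISO-8859-1 byte, converted with the matching charset.
$latin1 = htmlspecialchars("\xE9 <b>", ENT_QUOTES, 'ISO-8859-1');

// The same single byte is not valid UTF-8, so declaring the wrong
// charset makes the function return an empty string.
$invalid = htmlspecialchars("\xE9 <b>", ENT_QUOTES, 'UTF-8');

var_dump($utf8 === "caf\xC3\xA9 &lt;b&gt;");
var_dump($latin1 === "\xE9 &lt;b&gt;");
var_dump($invalid === '');
```

An unexpectedly empty result is therefore often a sign that the declared charset does not match the actual encoding of the input.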
double_encode
When double_encode is turned off, PHP will not encode existing HTML entities; the default is to convert everything.
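The following sketch shows the difference on a made-up string that already contains one entity. It assumes PHP 5.2.3 or later, where the fourth argument is available.

```php
<?php
// Input that is already partly escaped: "&amp;" is an existing entity.
$input = 'Fish &amp; chips & <peas>';

// Default (double_encode = true): the existing entity is encoded again.
$twice = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// double_encode = false: existing entities are left alone,
// while the bare "&" and the angle brackets are still converted.
$once = htmlspecialchars($input, ENT_QUOTES, 'UTF-8', false);

echo $twice, "\n"; // Fish &amp;amp; chips &amp; &lt;peas&gt;
echo $once, "\n";  // Fish &amp; chips &amp; &lt;peas&gt;
```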
<?php
$new = htmlspecialchars("<a href='test'>Test</a>", ENT_QUOTES);
echo $new; // &lt;a href=&#039;test&#039;&gt;Test&lt;/a&gt;
?>
Reference example
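<?php
// Fetch a page with cURL and return the raw response (headers and body).
function get_page($url)
{
    $header = array(); // assumption: no extra request headers ($header was undefined in the original)
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_USERAGENT, 'some bot');
    curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
    curl_setopt($curl, CURLOPT_REFERER, '-');
    curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
    // ...
    // Assumption: needed so that curl_exec() returns the response instead of printing it.
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_HEADER, 1);
    curl_setopt($curl, CURLOPT_NOBODY, 0);
    curl_setopt($curl, CURLOPT_TIMEOUT, 10);
    $html = curl_exec($curl);
    curl_close($curl);
    return $html;
}
$url = 'http://www.example.com/'; // placeholder URL (assumption: $url was undefined in the original)
$text = get_page($url);
$new = htmlspecialchars($text, ENT_QUOTES); // This is the magic :)
echo '<pre>' . $new . '</pre>';
?>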