Filtering and processing special characters in PHP form submissions

Source: Internet
Author: User

The day before yesterday, tianyuan made a batch modification to the content of his blog posts. Because of a bug in the program, many backslashes in paths and code were removed. The problem was discovered only yesterday, when bankw3000 left a message. Some corrections have been made, but some paths are still broken. If you find a path on this blog that is missing its backslash (\), please leave a message and tianyuan will fix it. In this article, tianyuan summarizes how PHP handles special characters in form submissions, mainly by combining htmlspecialchars, addslashes, stripslashes, strip_tags, mysql_real_escape_string, and related functions.
I. Several PHP functions related to special character processing


Each entry gives the function name, its purpose, and details.

htmlspecialchars()

Converts the ampersand, single and double quotation marks, and the less-than and greater-than signs into HTML entities

& becomes &amp;
" becomes &quot;
' becomes &#039; (only when ENT_QUOTES is set, which is the default since PHP 8.1)
< becomes &lt;
> becomes &gt;

htmlentities()

Converts all applicable characters into HTML entities

In addition to the characters handled by htmlspecialchars, every other character that has a named HTML entity is converted. If the character set is not specified correctly, double-byte characters are also turned into entity codes.

addslashes()

Escapes single quotation marks, double quotation marks, backslashes, and NUL with a backslash

The escaped characters are the single quotation mark ('), the double quotation mark ("), the backslash (\), and NUL (the NULL byte).

stripslashes()

Removes backslash characters

Removes the backslashes from the string. Two consecutive backslashes are reduced to one; a single backslash is removed outright.

quotemeta()

Quotes meta characters

Adds a backslash (\) before each of the characters . \ + * ? [ ^ ] ( $ ) in the string.

nl2br()

Converts line breaks to <br />

Inserts <br /> before each line break in the string; the line break itself is kept.

strip_tags()

Removes HTML and PHP tags

Removes all HTML and PHP tags from the string. PHP blocks and HTML comments are removed together with their content, while the text between HTML tags is kept. Note: the function does not validate the markup, so incomplete or broken tags can cause more text to be removed than expected.

mysql_real_escape_string()

Escapes special characters in SQL strings

Escapes \x00, \n, \r, \, ', " and \x1a. mysql_real_escape_string takes the character set of the current connection into account, which makes it effective for multi-byte characters; mysql_escape_string does not consider the character set.



For other string processing functions, see the article "Regular string replacement in PHP and comparison of split functions".
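To make the function summary concrete, here is a minimal sketch that runs several of the functions on small inputs. The sample strings and variable names are illustrative assumptions and do not come from the original article.

```php
<?php
// Illustrative input (an assumption, not the article's test string).
$raw = "O'Reilly said \"1+1=2?\"\nin d:\\test";

// addslashes() escapes ' " \ and NUL; stripslashes() reverses it.
$slashed  = addslashes($raw);
$restored = stripslashes($slashed);

// quotemeta() puts a backslash before . \ + * ? [ ^ ] ( $ )
$quoted = quotemeta('1+1=2?');

// nl2br() inserts <br /> before each line break and keeps the break.
$withBreak = nl2br("line one\nline two");

// strip_tags() removes the tags but keeps the text between them.
$noTags = strip_tags('<b>bold</b> text');

echo $slashed, "\n", $quoted, "\n", $withBreak, "\n", $noTags, "\n";
```

The round trip through addslashes and stripslashes returns the original string only because the data was escaped exactly once.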
The following is a summary of Special Character Processing in common forms:
Test string:
$dbstr = 'd:\test
<a href=""></a>, tianyuan blog
\'!=\'1\' OR \'1\'
</DIV>
<script language="javascript" type="text/javascript">alert("Fail");</script>


<?php echo "<br/>php output"; ?>';
Test code:
header("Content-Type: text/html; charset=UTF-8");
echo "------------------------------------------------------<br/>\r\n";
echo $dbstr . "<br/>\r\n------------------------------------------------------<br/>\r\n";
$str = fnAddSlashes($_POST['dd']);
echo $str . "<br/>\r\n------------------------------------------------------<br/>\r\n";

// Remove every whitespace character that is followed by another one, so only one of each run remains
$str = preg_replace('/\s(?=\s)/', '', $str);
$str = str_replace("\r", "<br/>", $str);
$str = str_replace("\n", "<br/>", $str);
$str = preg_replace('/((<br\/?>)+)/i', "<br/>", $str); // keep only one of multiple consecutive <br/> tags

$str = stripslashes($str);
echo strip_tags($str) . "<br/>\r\n------------------------------------------------------<br/>\r\n";
echo htmlspecialchars($str) . "<br/>\r\n------------------------------------------------------<br/>\r\n";
echo htmlentities($str) . "<br/>\r\n------------------------------------------------------<br/>\r\n";
echo mysql_escape_string($str) . "<br/>\r\n------------------------------------------------------<br/>\r\n";
The test string contains: a path with backslashes, single and double quotation marks, HTML tags, a link, an unclosed HTML tag, SQL injection syntax, a JS execution check, a PHP execution check, and multiple consecutive line breaks and spaces. Some of these categories overlap; the same applies below.
When the source string is output directly, the JS script is executed.

II. Data processing for form submission
1. Forcing backslash escaping
Some hosts enable magic quotes (magic_quotes_gpc) by default and others disable it, so it is best to add the backslashes in the program itself. That way the input is handled uniformly. The escaped characters are the single quotation mark, the double quotation mark, and the backslash.
function fnAddSlashes($data)
{
    // Magic quotes only escape POST/GET/cookie data.
    // Note: get_magic_quotes_gpc() always returns false since PHP 5.4 and was removed in PHP 8.0.
    if (!get_magic_quotes_gpc())
        return is_array($data) ? array_map('addslashes', $data) : addslashes($data);
    else
        return $data;
}
With the fnAddSlashes($data) function applied, JavaScript scripts are no longer executed, but the HTML, JS, and PHP tags still need to be handled.
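The fnAddSlashes function handles only flat arrays and calls get_magic_quotes_gpc(), which no longer exists in PHP 8. The following is a minimal sketch of a more tolerant variant. The name fnAddSlashesDeep is hypothetical and not from the original article.

```php
<?php
// Hypothetical variant of fnAddSlashes; the name is an assumption.
function fnAddSlashesDeep($data)
{
    // Only call get_magic_quotes_gpc() where it still exists (before PHP 8).
    // The @ hides the deprecation notice that PHP 7.4 emits for this call.
    $magic = function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc();
    if ($magic) {
        return $data; // PHP has already escaped the input
    }
    // Recurse so nested arrays (for example, form fields named tags[]) are escaped too.
    return is_array($data)
        ? array_map('fnAddSlashesDeep', $data)
        : addslashes($data);
}

$escaped = fnAddSlashesDeep(["a'b", ['c"d', 'e\\f']]);
print_r($escaped);
```

Checking function_exists first keeps the same code usable on old hosts and on current PHP versions.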

After stripslashes, line-break replacement, and space replacement are applied, the added backslashes are removed, line breaks become <br/>, and runs of spaces are merged.

2. Special Character Processing
The following are several common string processing methods; choose among them according to the situation. Because the submitted form data has already been escaped once, any replacement or filtering must take into account the effect of addslashes on the affected characters: when searching for or replacing a quotation mark or backslash, include the added backslash in the pattern. Replacement of other characters, such as \r\n, is not affected.
A. Collapse consecutive spaces into one
$data = preg_replace('/\s(?=\s)/', '', $data); // keep only one of multiple consecutive spaces
B. Replace line breaks with <br/>
$data = str_replace("\r", "<br/>", $data);
$data = str_replace("\n", "<br/>", $data);
// In HTML, <br> is not closed by default; in XHTML, <br/> is self-closing. Using <br/> is recommended.
C. Collapse multiple consecutive <br/> tags into one
$data = preg_replace('/((<br\/?>)+)/i', "<br/>", $data); // keep only one of multiple consecutive <br/> tags
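Steps A to C can also be combined into a single helper. The sketch below is one possible arrangement, not the author's code. The function name fnNormalizeWhitespace is hypothetical. It first normalizes \r\n to \n so that one line break produces one <br/>, and it collapses only spaces and tabs in step A.

```php
<?php
// Hypothetical helper combining steps A to C; the name is an assumption.
function fnNormalizeWhitespace(string $data): string
{
    // Treat \r\n and a lone \r as \n so one line break yields one <br/>.
    $data = str_replace(["\r\n", "\r"], "\n", $data);
    // A. Collapse each run of spaces or tabs into a single space.
    $data = preg_replace('/[ \t]+/', ' ', $data);
    // B. Replace each line break with <br/>.
    $data = str_replace("\n", '<br/>', $data);
    // C. Collapse consecutive <br>, <br/> or <br /> tags into one <br/>.
    return preg_replace('/(?:<br\s*\/?>)+/i', '<br/>', $data);
}

$normalized = fnNormalizeWhitespace("a  b\r\n\r\n\r\nc");
echo $normalized, "\n";
```

Unlike the \s-based pattern in step A, this variant leaves line breaks alone until step B, so the order of the steps is easier to reason about.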
D. Filter all HTML tags
This method filters all potentially dangerous tags, including HTML tags, links, unclosed HTML tags, JS, and PHP.
Use the strip_tags($data) function.
After this function is applied, all HTML tags (including links), script tags, and PHP tags are filtered out. For a link, only the <a> tag and its href value are removed, and the link text is kept. PHP blocks are removed as a whole, including their content. Note that the text between <script> and </script> is kept as plain text, because strip_tags removes only the tags themselves.
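The behavior of strip_tags on links, scripts, and PHP blocks can be checked with a short sketch. The input string is an illustrative assumption modeled loosely on the article's test string.

```php
<?php
// Illustrative input containing a link, a script, and a PHP block.
$input = '<a href="http://example.com/">tianyuan blog</a> <b>bold</b> '
       . '<script>alert("Fail");</script> '
       . '<?php echo "php output"; ?>';

$stripped = strip_tags($input);

// The link text and the script text survive as plain text;
// the tags and the whole PHP block are gone.
echo $stripped, "\n";
```

Because the script body remains as text, it is harmless when displayed but still visible to the reader, so additional filtering may be wanted.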

E. Do not filter tags; convert them to HTML entities
This method treats all of the originally submitted content as plain text.
Use the htmlspecialchars($data) function. After it is executed, all submitted data is displayed as plain text.

With the htmlentities function called without a character-set argument, as in the test code, Chinese text is displayed as garbled characters on PHP versions before 5.4, where the default character set is ISO-8859-1.
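The difference between the two functions, and the effect of passing the character set explicitly, can be seen in this minimal sketch. The sample string is an assumption, and the file is assumed to be saved as UTF-8.

```php
<?php
// Illustrative input with markup, quotes, an accented letter, and Chinese text.
$data = '<b>"A&B"</b> it\'s café 天缘';

// ENT_QUOTES also converts single quotes; 'UTF-8' names the input encoding.
$special  = htmlspecialchars($data, ENT_QUOTES, 'UTF-8');
$entities = htmlentities($data, ENT_QUOTES, 'UTF-8');

echo $special, "\n";   // only & " ' < > are converted
echo $entities, "\n";  // é additionally becomes &eacute;
```

With the encoding stated, the Chinese characters pass through both functions unchanged, since they have no named HTML entities.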

3. Write data to the database
For highly trusted users, data processed with addslashes($data) can be written directly to the database. However, addslashes cannot intercept a single quotation mark introduced through the multi-byte sequence 0xbf27, so it is better to escape with mysql_real_escape_string or mysql_escape_string. Before escaping, remove the backslashes that were already added (assuming addslashes is applied by default).
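function fnEscapeStr($data)
{
    // Remove the backslashes added by magic quotes so the value is not escaped twice
    if (get_magic_quotes_gpc())
    {
        $data = stripslashes($data);
    }
    // mysql_escape_string() belongs to the old mysql extension, which was removed in PHP 7.0
    $data = "'" . mysql_escape_string($data) . "'";
    return $data;
}

$data = fnEscapeStr($data);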
After execution, the value is wrapped in single quotation marks and its special characters are escaped.
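Since the mysql_* functions are no longer available in PHP 7 and later, one possible alternative is a prepared statement, which sends the value separately from the SQL text. This is a swapped-in technique, not the author's method. The sketch below assumes the pdo_sqlite extension and uses an in-memory SQLite database as a stand-in for MySQL; the table and column names are hypothetical.

```php
<?php
// In-memory SQLite stands in for MySQL so the sketch needs no server (assumption).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)'); // hypothetical table

// Raw, unescaped input with a backslash path and injection-style quotes.
$body = "d:\\test '!='1' OR '1'";

// The placeholder keeps the value out of the SQL text, so no manual escaping is needed.
$stmt = $pdo->prepare('INSERT INTO posts (body) VALUES (:body)');
$stmt->execute([':body' => $body]);

$stored = $pdo->query('SELECT body FROM posts')->fetchColumn();
var_dump($stored === $body);
```

With this approach the data is stored exactly as submitted, so no stripslashes call is needed when reading it back.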

4. Displaying data immediately after submission
1. If addslashes was used as above, the backslashes must be removed before the data is displayed.
Use the stripslashes($data) function.
Note that this function is intended only for data that has been processed by addslashes($data), so use it with caution. Otherwise, legitimate backslashes in the content (for example, in folder and drive paths) will be lost. The error tianyuan ran into a few days ago happened because this function was applied when reading from the database (it was old code that had not been updated), so the backslashes in many paths were lost when the posts were written back to the database. Without that mistake, this article would not exist.
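The pitfall described above is easy to reproduce. In this minimal sketch, the path value is an illustrative assumption.

```php
<?php
// A raw path that was never passed through addslashes().
$path = 'd:\test\new';

// Correct use: stripslashes() exactly undoes one addslashes() call.
$roundTrip = stripslashes(addslashes($path));

// Incorrect use: applying stripslashes() to unescaped data drops the backslashes.
$damaged = stripslashes($path);

echo $roundTrip, "\n"; // d:\test\new
echo $damaged, "\n";   // d:testnew
```

This is why stripslashes should be paired one-to-one with addslashes and never applied to data read back from storage that was saved unescaped.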
2. Use the htmlspecialchars($data) function. After it is executed, all submitted data is displayed as text. Unless links or other markup must be allowed through special handling, output through htmlspecialchars can be used as a rule. This matters especially for unclosed HTML tags: if they are neither converted nor filtered, the output can break the page layout.
htmlentities is not recommended. It makes the output source code much harder to read, and when it is called without a matching character set, double-byte characters such as Chinese are displayed as garbled text, while other characters are displayed normally.
Alternatively, the data can be output directly if needed, provided it has been confirmed to contain no illegal tags and no potentially executable content.
