PHP string encoding and escaping

Source: Internet
Author: User
Because PHP programs often interact with HTML pages, web addresses (URLs), and databases, PHP provides functions to help you work with those types of data. HTML, web addresses, and database commands are all strings, but each requires different characters to be escaped in different ways. For example, a space in a web address must be written as %20, while a literal less-than sign (<) in an HTML document must be written as &lt;. PHP has a number of built-in functions to convert to and from these encodings.
HTML: Special characters in HTML are represented by entities such as &amp; and &lt;. PHP has two functions that turn special characters in a string into their entities, one function that removes HTML tags, and one that extracts only meta tags.
Entity-quoting all special characters: The htmlentities() function converts every character that has an HTML entity equivalent into that entity (except the space character). This includes the less-than sign (<), the greater-than sign (>), the ampersand (&), and accented characters.
Entity-quoting only HTML syntax characters: The htmlspecialchars() function converts the smallest set of entities needed to generate valid HTML. The following characters are converted:
The ampersand (&) is converted to &amp;
Double quotes (") are converted to &quot;
Single quotes (') are converted to &#039; (only when ENT_QUOTES is set, as with htmlentities(); it is part of the default flags since PHP 8.1)
The less-than sign (<) is converted to &lt;
The greater-than sign (>) is converted to &gt;
If an application displays data that a user has entered, run that data through htmlspecialchars() before displaying or saving it. If you do not, and the user submits a string such as "angle < 30" or "sturm & drang", the browser treats the special characters as HTML and the page comes out garbled.
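The difference between the two functions can be sketched as follows. The input string is a made-up example, and ENT_QUOTES and the UTF-8 charset are passed explicitly as an assumption so that the result does not depend on the PHP version's defaults.

```php
<?php
// Hypothetical user input, used only for illustration.
$comment = "It's angle < 30 & \"sturm & drang\" café";

// Escape only the HTML syntax characters.
// ENT_QUOTES is passed explicitly so single quotes are escaped on every PHP version.
$safe = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

// Escape every character that has a named entity, including accented letters.
$allEntities = htmlentities($comment, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";        // It&#039;s angle &lt; 30 &amp; &quot;sturm &amp; drang&quot; café
echo $allEntities, "\n"; // same as above, but the final word becomes caf&eacute;
```

For ordinary page output, htmlspecialchars() is usually enough as long as the page declares the same character set as the data.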

Delete HTML tags
The strip_tags() function removes HTML tags from a string:
$input = '<p>Howdy, "Cowboy"</p>';
$output = strip_tags($input); // $output is 'Howdy, "Cowboy"'
The function accepts a second argument that lists the tags to leave in the string. List only the opening form of each tag; the closing forms of the listed tags are preserved as well:
$input = 'The <b>bold</b> tags will <i>stay</i><p>';
$output = strip_tags($input, '<b>'); // $output is 'The <b>bold</b> tags will stay'
strip_tags() does not change the attributes of the tags it preserves. Because attributes such as style and onmouseover can affect the appearance and behavior of a web page, preserving some tags with strip_tags() does not necessarily remove all potentially harmful content.
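A minimal sketch of the attribute problem described above. The input markup is hypothetical; the point is only that an event-handler attribute survives on a preserved tag.

```php
<?php
// Hypothetical input containing an event-handler attribute.
$input = '<b onmouseover="alert(1)">bold</b> and <i>italic</i>';

// Keep <b> tags; every other tag is stripped.
$output = strip_tags($input, '<b>');
echo $output, "\n"; // <b onmouseover="alert(1)">bold</b> and italic

// If no markup is needed at all, strip every tag and escape what remains.
$plain = htmlspecialchars(strip_tags($input), ENT_QUOTES, 'UTF-8');
echo $plain, "\n";  // bold and italic
```

When user-supplied markup must be kept, a dedicated HTML sanitizer is generally a safer choice than a strip_tags() allow-list.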

Extract meta tags
Given the filename or URL of an HTML page, the get_meta_tags() function returns an array of the meta tags in that page. The name of each meta tag (keywords, author, description, and so on) becomes a key in the array, and the content of the meta tag becomes the corresponding value:
$meta_tags = get_meta_tags('http://www.example.com/');
echo "Web page made by {$meta_tags['author']}";
The general form of the function is:
$array = get_meta_tags(filename [, use_include_path]);
Pass true for use_include_path to let PHP try to open the file using the standard include path.
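The following sketch exercises get_meta_tags() without network access. The page content, the author name, and the temporary file are all assumptions made for the example.

```php
<?php
// Write a small hypothetical page to a temporary file so the example is self-contained.
$html = <<<HTML
<html>
<head>
<meta name="author" content="Jane Doe">
<meta name="Keywords" content="php, escaping">
<title>Example</title>
</head>
<body></body>
</html>
HTML;

$path = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents($path, $html);

$metaTags = get_meta_tags($path);
unlink($path);

// Keys are the lowercased tag names; tags that are missing are simply absent.
$author = $metaTags['author'] ?? 'unknown';
echo "Web page made by {$author}\n"; // Web page made by Jane Doe
```

Using the ?? operator avoids a warning when a page has no author tag.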

URL: PHP provides functions for converting to and from URL encoding. There are two ways to encode a URL, and they differ in how they treat spaces. The first (specified by RFC 1738) treats a space as just another illegal URL character and encodes it as %20. The second (the application/x-www-form-urlencoded scheme) encodes a space as a plus sign (+) and is used when building query strings.
Note that you should not apply these functions to a complete URL such as http://www.example.com/hello, because they will escape the colon and slashes:
http%3A%2F%2Fwww.example.com%2Fhello
Encode only the part of the URL that follows http://www.example.com/, and add the protocol and domain name afterward.

RFC1738 encoding and decoding
To encode a string according to the URL conventions, use rawurlencode():
$output = rawurlencode($input);
This function takes a string and returns a copy in which illegal URL characters are encoded in the %dd convention, where dd is the character's two-digit hexadecimal code.
If you are dynamically generating hypertext references for links in a page, convert them with rawurlencode():
$name = "Programming PHP";
$output = rawurlencode($name);
echo "http://localhost/$output";
http://localhost/Programming%20PHP
The rawurldecode() function decodes URL-encoded strings:
$encoded = 'Programming%20PHP';
echo rawurldecode($encoded);
Programming PHP
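One way to apply the advice about encoding only part of a URL is to encode each path segment separately and then join the segments. This is a sketch; the base URL and the segment values are hypothetical.

```php
<?php
// Hypothetical base URL and path segments, for illustration only.
$base = 'http://localhost';
$segments = ['books', 'Programming PHP', 'C/C++ & more'];

// Encode each segment on its own so the separators between them stay literal slashes.
$path = implode('/', array_map('rawurlencode', $segments));
$url = $base . '/' . $path;

echo $url, "\n";
// http://localhost/books/Programming%20PHP/C%2FC%2B%2B%20%26%20more

echo rawurldecode('Programming%20PHP'), "\n"; // Programming PHP
```

The slash inside the last segment becomes %2F, so it cannot be mistaken for a path separator.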

Query string encoding
The urlencode() and urldecode() functions differ from their raw counterparts (rawurlencode() and rawurldecode()) only in that they encode spaces as plus signs (+) instead of %20. This is the format used for building query strings and cookie values. Because PHP decodes these values automatically when they arrive through a form or cookie, you do not need these functions to read the current page's query string or cookies. They are useful for generating query strings:
$base_url = 'http://www.google.com/q=';
$query = 'PHP sessions -cookies';
$url = $base_url . urlencode($query);
echo $url;
http://www.google.com/q=PHP+sessions+-cookies
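The two encodings can be compared side by side. The sketch below also uses http_build_query(), a built-in that encodes a whole array of parameters; the base URL and parameter names are assumptions for the example.

```php
<?php
$query = 'PHP sessions -cookies';

echo urlencode($query), "\n";    // PHP+sessions+-cookies
echo rawurlencode($query), "\n"; // PHP%20sessions%20-cookies

// http_build_query() encodes each key and value and joins the pairs.
// The separator is passed explicitly so the result does not depend on php.ini.
$params = ['q' => $query, 'lang' => 'en'];
$url = 'http://www.example.com/search?' . http_build_query($params, '', '&');
echo $url, "\n"; // http://www.example.com/search?q=PHP+sessions+-cookies&lang=en

// urldecode() reverses the encoding, turning "+" back into a space.
echo urldecode('PHP+sessions+-cookies'), "\n"; // PHP sessions -cookies
```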

SQL: Most database systems require string literals in SQL queries to be escaped. SQL's escaping scheme is simple: single quotes, double quotes, NUL bytes, and backslashes must each be preceded by a backslash (\). The addslashes() function adds these backslashes, and the stripslashes() function removes them:
$string = <<<The_End
"It's never going to work," she cried,
as she hit the backslash (\) key.
The_End;
$escaped = addslashes($string);
echo $escaped;
\"It\'s never going to work,\" she cried,
as she hit the backslash (\\) key.
echo stripslashes($escaped);
"It's never going to work," she cried,
as she hit the backslash (\) key.
Tip: some databases (such as Sybase) escape a single quote with another single quote instead of a backslash. For those databases, older PHP versions let you enable magic_quotes_sybase in the php.ini file; the setting was removed in PHP 5.4.

Executing data as code is the root cause of the vulnerability, so the fix is to make sure the SQL layer recognizes the data as data and carries it into the query as data.
Without backslash escaping, an input such as 111' goes straight into the query: select * from table where id = '$id' becomes select * from table where id = '111''. The statement now contains an unclosed single quote, so the query returns an error.
With addslashes(), the statement becomes select * from table where id = '111\''. The single quote in 111' is passed as \', so SQL treats it as part of the id value rather than as syntax. The query may or may not return rows (it would, for example, if the table held a row whose id is the literal string Wang Erxiao's and that string were submitted), but the essential point is that the single quote reaches the database as a value.
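The two statements discussed above can be reproduced with plain string operations, without a database connection. This is only a sketch of how the query text changes; the input value is hypothetical.

```php
<?php
// Hypothetical input: the number 111 followed by a stray single quote.
$id = "111'";

// Without escaping, the quote ends the string literal early and leaves one unmatched.
$unescaped = "select * from table where id = '$id'";

// With addslashes(), the quote is preceded by a backslash and stays inside the literal.
$escaped = "select * from table where id = '" . addslashes($id) . "'";

echo $unescaped, "\n"; // select * from table where id = '111''
echo $escaped, "\n";   // select * from table where id = '111\''
```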

1. With magic_quotes_gpc = on (a setting that exists only in PHP versions before 5.4), request data is escaped automatically, so you do not need to call addslashes() on data going into the database or stripslashes() on data coming out, and the data displays normally. If you call addslashes() on the input anyway, you must call stripslashes() on output to remove the extra backslashes.

2. With magic_quotes_gpc = off, you must call addslashes() on the input data but do not need stripslashes() on output, because the backslashes added by addslashes() are not written to the database; they only help MySQL parse the SQL statement.

Some websites let you enter a tag such as "programming's" but report an error when you click Save, and the same error then appears whenever any article loads that tag. The cause is that the user input was not escaped correctly on its way from the PHP code into the SQL statement.
When you build up knowledge, you understand nothing at first. After learning a little, you may think you have mastered the subject. Then one day something confuses you again, and you realize that the topic is not so mysterious but that you had stopped asking where the gaps might be. You may stay confused for a long time until you find the missing piece, and that moment is satisfying, because by the time you find it you are nearly able to fill it in. Everything falls into place.

This can be discouraging. People are growing more impatient, while the security field is still young and security issues have only recently begun to get attention. Many practitioners gain industry recognition while still at the peak of their early confidence, which makes that confidence blind. Many have worked in the field for a long time without ever asking what might still be worth questioning. Looking at your own knowledge with humility tends to reveal uncomfortable facts.

Sometimes the id value is passed through intval($id). In that case the variable can only be a number, so no injection is possible: whatever is entered, it reaches the query as a number, that is, as a value. But if the id is handled as a string and only addslashes() is applied, is it safe? Not necessarily. If the statement is "select * from table where id = '$_POST[id]'", the id value is enclosed in single quotes. To make a malicious value execute as code, an attacker has to close that single quote, and addslashes() escapes the quote so that it reaches the query as an ordinary value. That case holds. Some statements, however, do not quote the value, for example "select * from table where id = $_POST[id]". Even though addslashes() escapes single quotes and other special characters, the attacker does not need to close any quote in this statement, so injection is still possible.
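The three cases above can be compared by building the query text only, with no database involved. The payload is a hypothetical attacker input chosen because it contains no quote characters.

```php
<?php
// Hypothetical attacker-controlled value; it contains no quote characters at all.
$input = '1 or 1=1';

// addslashes() has nothing to escape here, so the payload passes through unchanged.
$slashed = addslashes($input);
$unquotedQuery = "select * from table where id = $slashed";

// Inside quotes, the same payload is only a string value.
$quotedQuery = "select * from table where id = '$slashed'";

// intval() keeps only the leading integer, so the condition cannot be extended.
$numericQuery = 'select * from table where id = ' . intval($input);

echo $unquotedQuery, "\n"; // select * from table where id = 1 or 1=1
echo $quotedQuery, "\n";   // select * from table where id = '1 or 1=1'
echo $numericQuery, "\n";  // select * from table where id = 1
```

Parameterized queries, such as PDO prepared statements, are another way to keep data separate from the SQL text.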

 
