Safe handling of forms in PHP

Last Update:2015-08-26 Source: Internet

Author: User

Rule 1: Never trust external data or input

The first thing you must realize about Web application security is that you should not trust external data. External data includes any data that the programmer did not enter directly in the PHP code. Any data from any other source (such as GET variables, form POST, databases, configuration files, session variables, or cookies) is untrusted until steps have been taken to make it safe.

For example, the following data elements can be considered safe because they are set in PHP.

Listing 1. Safe and flawless code

<?php
$myUsername = 'tmyer';
$arrayUsers = array('tmyer', 'tom', 'tommy');
define("GREETING", 'hello there ' . $myUsername);
?>

However, the following data elements are flawed.

Listing 2. Unsafe, flawed code

<?php
$myUsername = $_POST['username']; // tainted!
$arrayUsers = array($myUsername, 'tom', 'tommy'); // tainted!
define("GREETING", 'hello there ' . $myUsername); // tainted!
?>

Why is the first variable, $myUsername, flawed? Because it comes directly from a form POST. Users can enter any string into this input field, including malicious commands to delete files or run previously uploaded files. You might ask, "Can't you avoid this danger with a client-side JavaScript form validation script that accepts only the letters A-Z?" Yes, that is always a good step, but as you will see later, anyone can download any form to their own machine, modify it, and resubmit whatever they want.

The solution is simple: you must run cleanup code on $_POST['username']. If you don't, you risk tainting these objects any other time you use $myUsername, such as in an array or a constant.

A simple way to clean up user input is to use a regular expression to handle it. In this example, you only want to accept letters. It may also be a good idea to limit a string to a specific number of characters, or to require all letters to be lowercase.

Listing 3. Making user input safe

<?php
$myUsername = cleanInput($_POST['username']); // clean!
$arrayUsers = array($myUsername, 'tom', 'tommy'); // clean!
define("GREETING", 'hello there ' . $myUsername); // clean!

function cleanInput($input) {
    $clean = strtolower($input);
    $clean = preg_replace("/[^a-z]/", "", $clean);
    $clean = substr($clean, 0, 12);
    return $clean;
}
?>
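As a variation on Listing 3, an application may prefer to reject input that does not match the expected pattern rather than silently rewrite it. The following is a minimal sketch under that assumption; the function name isValidUsername() and the 12-character limit are illustrative choices, not part of the original listing.

```php
<?php
// Hypothetical helper: accept only 1-12 lowercase ASCII letters.
// The D modifier stops "$" from matching before a trailing newline.
function isValidUsername($input)
{
    return is_string($input) && preg_match('/^[a-z]{1,12}$/D', $input) === 1;
}

// Example: decide whether to continue or to show the form again.
$candidate = isset($_POST['username']) ? $_POST['username'] : '';
$usernameOk = isValidUsername($candidate);
```

Rejecting keeps the stored value identical to what the user typed, whereas cleaning may quietly turn two different inputs into the same username.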

Rule 2: Disable PHP settings that make security difficult to implement

Now that you know you can't trust user input, you should also know that you shouldn't trust the way PHP is configured on your machine. For example, make sure that register_globals is disabled. (It was deprecated in PHP 5.3 and removed in PHP 5.4, so this matters mainly on older installations.) If register_globals is enabled, you can do careless things such as using a $variable in place of a GET or POST value of the same name. With the setting disabled, PHP forces you to reference the correct variable in the correct namespace. To use a variable from a form POST, you must refer to $_POST['variable']. That way, this particular variable cannot be mistaken for a cookie, session, or GET variable.
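To illustrate reading each value from the one superglobal it is expected in, here is a small sketch. The helper name postString() is hypothetical; it returns a string from the POST array it is given, or a default, and never consults GET, cookie, or session data.

```php
<?php
// Hypothetical helper: fetch a value only from the POST data that is passed in.
function postString(array $post, $key, $default = '')
{
    // Ignore missing keys and non-string values such as arrays (name[]=...).
    if (!isset($post[$key]) || !is_string($post[$key])) {
        return $default;
    }
    return $post[$key];
}

$variable = postString($_POST, 'variable');
```

The value returned is still untrusted and has to be cleaned as described under Rule 1; the helper only guarantees where it came from.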

Rule 3: If you can't understand it, you can't protect it

Some developers use strange syntax, or organize statements in a compact form, with short but ambiguous code. This approach can be efficient, but if you don't understand what the code is doing, you can't decide how to protect it.

For example, which of the following two sections of code do you like?

Listing 4. Making code easy to protect

<?php
// obfuscated code
$input = (isset($_POST['username']) ? $_POST['username'] : '');

// unobfuscated code
$input = '';

if (isset($_POST['username'])) {
    $input = $_POST['username'];
} else {
    $input = '';
}
?>

In the second, clearer code snippet, it is easy to see that $input is tainted and needs to be cleaned before it can be handled safely.

Rule 4: "Defense in depth" is your new mantra

This tutorial uses examples to illustrate how to protect an online form while also taking the necessary steps in the PHP code that handles the form. Likewise, even if you use a PHP regex to make sure that a GET variable is entirely numeric, you can still take steps to make sure that the SQL query uses escaped user input.

Defense in depth is not just a good idea; it helps keep you out of serious trouble.

Now that the basic rules have been covered, consider the first threat: SQL injection attacks.

Preventing SQL injection attacks

In a SQL injection attack, the user adds information to a database query by manipulating a form or the GET query string. For example, suppose you have a simple login database. Each record in this database has a user name field and a password field. You build a login form to let users log in.

Listing 5. Simple login form

<title>Login</title>
<body>
<form action="verify.php" method="post">
<p><label for='user'>Username</label>
<input type='text' name='user' id='user'/>
</p>
<p><label for='pw'>Password</label>
<input type='password' name='pw' id='pw'/>
</p>
<p><input type='submit' value='login'/></p>
</form>
</body>

This form accepts the user name and password entered by the user and submits the user input to a file named verify.php. In this file, PHP processes the data from the login form as follows:

Listing 6. Unsafe PHP form processing code

<?php
$okay = 0;
$username = $_POST['user'];
$pw = $_POST['pw'];

$sql = "select count(*) as ctr from users where username='" . $username . "' and password='" . $pw . "' limit 1";

$result = mysql_query($sql);

while ($data = mysql_fetch_object($result)) {
    if ($data->ctr == 1) {
        // they're okay to enter the application!
        $okay = 1;
    }
}

if ($okay) {
    $_SESSION['loginokay'] = true;
    header("Location: index.php");
} else {
    header("Location: login.php");
}
?>

This code looks fine, doesn't it? Code like this is used by hundreds (if not thousands) of PHP/MySQL sites around the world. What is wrong with it? Well, remember that you cannot trust user input. Nothing coming from the user is escaped here, which leaves the application vulnerable; specifically, any type of SQL injection attack is possible.

For example, if the user enters foo as the user name and ' or '1'='1 as the password, the following string is actually passed to PHP, which then passes the query to MySQL:

<?php
$sql = "select count(*) as ctr from users where username='foo' and password='' or '1'='1' limit 1";
?>

Because '1'='1' is always true, the WHERE clause matches every row regardless of the password, so the count no longer says anything about whether the credentials are valid. With a single record in the users table the count is 1, and PHP allows access. By injecting some malicious SQL at the end of the password string, the attacker can masquerade as a legitimate user.

The solution to this problem is to wrap all user input in PHP's built-in mysql_real_escape_string() function. This function escapes characters in a string so that special characters such as apostrophes are treated as literal data and cannot change the SQL that MySQL executes. (The mysql_* functions belong to the legacy mysql extension, which was removed in PHP 7.0; mysqli_real_escape_string() is the mysqli counterpart.) Listing 7 shows the code with escaping applied.

Listing 7. Secure PHP form processing code

<?php
$okay = 0;
$username = $_POST['user'];
$pw = $_POST['pw'];

$sql = "select count(*) as ctr from users where username='" . mysql_real_escape_string($username) . "' and password='" . mysql_real_escape_string($pw) . "' limit 1";

$result = mysql_query($sql);

while ($data = mysql_fetch_object($result)) {
    if ($data->ctr == 1) {
        // they're okay to enter the application!
        $okay = 1;
    }
}

if ($okay) {
    $_SESSION['loginokay'] = true;
    header("Location: index.php");
} else {
    header("Location: login.php");
}
?>

By using mysql_real_escape_string() as a wrapper around user input, you can keep malicious SQL in that input from altering the query. If a user tries to pass a malformed password via SQL injection, the following query is passed to the database:

select count(*) as ctr from users where username='foo' and password='\' or \'1\'=\'1' limit 1

Nothing in the database matches such a password. With one simple step, you have plugged a major hole in the Web application. The lesson here is that user input to SQL queries should always be escaped.
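Escaping is one option; parameterized queries are another, and they keep user data out of the SQL text entirely. The sketch below is not the method used in the listings above: it assumes the PDO extension is available, and countMatchingUsers() is a hypothetical helper that queries the same users table.

```php
<?php
// Hypothetical helper: count rows matching the submitted credentials.
// The "?" placeholders are bound as data, so quotes in the input
// cannot change the structure of the query.
function countMatchingUsers(PDO $db, $username, $pw)
{
    $stmt = $db->prepare(
        'select count(*) as ctr from users where username = ? and password = ?'
    );
    $stmt->execute(array($username, $pw));
    return (int) $stmt->fetchColumn();
}
```

Given an existing PDO connection in $db (an assumption here), the login check becomes countMatchingUsers($db, $_POST['user'], $_POST['pw']) === 1.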

However, several other security holes still need to be plugged. The next one is manipulation of GET variables.

Prevent users from manipulating GET variables

In the previous section, you prevented users from logging in with malformed passwords. If you are smart, you will apply what you learned to make sure that all user input to SQL statements is escaped.

However, the user is now safely logged in. Having a valid password does not mean the user will play by the rules; there are many opportunities to cause damage. For example, an application might allow users to view special content, with all links pointing to locations such as template.php?pid=33 or template.php?pid=321. The part of the URL after the question mark is called the query string. Because the query string is placed directly in the URL, it is also called a GET query string.

In PHP, if register_globals is disabled, this string can be accessed as $_GET['pid']. In the template.php page, you might do something similar to Listing 8.

Listing 8. Example template.php

<?php
$pid = $_GET['pid'];

// we create an object of a fictional class Page
$obj = new Page;
$content = $obj->fetchPage($pid);
// and now we have a bunch of PHP that displays the page
?>

Is anything wrong here? First, the code implicitly trusts that the GET variable pid from the browser is safe. What could happen? Most users are not savvy enough to construct semantic attacks. However, if they notice pid=33 in the browser's URL field, they may start causing mischief. If they enter another number, that may be fine; but what if they enter something else, such as a SQL command or the name of a file (such as /etc/passwd), or try other mischief, such as entering a value up to 3,000 characters long?

In this case, remember the basic rule: don't trust user input. The application developer knows that the page identifier (PID) accepted by template.php should be a number, so you can use PHP's is_numeric() function to make sure that a non-numeric PID is not accepted, as follows:

Listing 9. Using is_numeric() to restrict GET variables

<?php
$pid = $_GET['pid'];

if (is_numeric($pid)) {
    // we create an object of a fictional class Page
    $obj = new Page;
    $content = $obj->fetchPage($pid);
    // and now we have a bunch of PHP that displays the page
} else {
    // didn't pass the is_numeric() test, do something else!
}
?>

This method appears to work, but the following inputs all pass the is_numeric() check easily:

100 (valid)
100.1 (should not have decimal places)
+0123.45e6 (scientific notation: bad)
0xff33669f (hexadecimal: dangerous!)

Since PHP 7.0, is_numeric() no longer treats hexadecimal strings such as the last one as numeric; the other three still pass.
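The difference can be checked directly. This short sketch compares is_numeric() with a digits-only regular expression on the first three inputs listed above; the helper name isDigitsOnly() is illustrative. The hexadecimal case is left out of the loop because its is_numeric() result depends on the PHP version.

```php
<?php
// Hypothetical helper: true only for strings made up entirely of the digits 0-9.
function isDigitsOnly($value)
{
    return is_string($value) && preg_match('/^[0-9]+$/D', $value) === 1;
}

foreach (array('100', '100.1', '+0123.45e6') as $candidate) {
    printf(
        "%-12s is_numeric=%s digits_only=%s\n",
        $candidate,
        var_export(is_numeric($candidate), true),
        var_export(isDigitsOnly($candidate), true)
    );
}
```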

So what should a security-conscious PHP developer do? Years of experience have shown that the best practice is to use a regular expression to make sure that the entire GET variable consists of digits, as follows:

Listing 10. Restricting GET variables with regular expressions

<?php
$pid = $_GET['pid'];

if (strlen($pid)) {
    // preg_match() is used because ereg() was removed in PHP 7;
    // the D modifier keeps "$" from matching before a trailing newline
    if (!preg_match('/^[0-9]+$/D', $pid)) {
        // do something appropriate, like maybe logging them out or sending them back to home page
    }
} else {
    // empty $pid, so send them back to the home page
}

// we create an object of a fictional class Page, which is now
// moderately protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
// and now we have a bunch of PHP that displays the page
?>

All you need to do is use strlen() to check whether the variable's length is nonzero; if so, use an all-digit regular expression to make sure the data element is valid. If the PID contains letters, slashes, periods, or anything resembling hexadecimal, this routine catches it and shields the page from user activity. If you look behind the scenes of the Page class, you will see that a security-conscious PHP developer has escaped the user input $pid, protecting the fetchPage() method as follows:

Listing 11. Escaping in the fetchPage() method

<?php
class Page {
    function fetchPage($pid) {
        // desc is a reserved word in MySQL, so it is quoted with backticks
        $sql = "select pid,title,`desc`,kw,content,status from page where pid='" . mysql_real_escape_string($pid) . "'";
    }
}
?>

You might ask, "Why escape when you have already made sure the PID is a number?" Because you don't know in how many different contexts and situations the fetchPage() method will be used. Protection must be in place everywhere the method is called, and escaping inside the method reflects the spirit of defense in depth.

What happens if a user tries to enter a very long value, say up to 1,000 characters, in an attempt to launch a buffer overflow attack? The next section discusses this in more detail, but for now you can add another check to make sure the input PID has the correct length. You know that the maximum length of the pid field in the database is 5 digits, so you can add the following check.

Listing 12. Using regular expressions and length checks to restrict GET variables

<?php
$pid = $_GET['pid'];

if (strlen($pid)) {
    // reject the value if it is not all digits OR if it is longer than 5 digits
    if (!preg_match('/^[0-9]+$/D', $pid) || strlen($pid) > 5) {
        // do something appropriate, like maybe logging them out or sending them back to home page
    }
} else {
    // empty $pid, so send them back to the home page
}
// we create an object of a fictional class Page, which is now
// even more protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
// and now we have a bunch of PHP that displays the page
?>

Now nobody can cram a 5,000-digit value into your database application, at least not where GET strings are involved. Imagine a hacker gnashing his teeth as he tries to break into your application. And because error reporting is turned off, it is harder for attackers to do reconnaissance.
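The format and length checks can also be folded into a single pattern. The following sketch assumes, as above, that a valid PID is one to five digits; parsePid() is a hypothetical helper that returns the PID as an integer, or null when the input should be rejected.

```php
<?php
// Hypothetical helper: return the PID as an int, or null if it is not 1-5 digits.
function parsePid($raw)
{
    if (!is_string($raw) || preg_match('/^[0-9]{1,5}$/D', $raw) !== 1) {
        return null;
    }
    return (int) $raw;
}

$pid = parsePid(isset($_GET['pid']) ? $_GET['pid'] : null);
if ($pid === null) {
    // Invalid or missing PID: send the user back to the home page here.
}
```

Returning an integer rather than the original string means later code works with a value that cannot contain anything but a number.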

Buffer overflow attack

A buffer overflow attack attempts to overflow a memory allocation buffer in a PHP application (or, more precisely, in Apache or the underlying operating system). Remember that although you may be writing your Web application in a high-level language like PHP, you are ultimately calling into C (in the case of Apache). Like most low-level languages, C has strict rules for memory allocation.

A buffer overflow attack sends a large amount of data to a buffer, causing some of the data to overflow into adjacent memory buffers, thereby corrupting the buffer or rewriting logic. This can cause denial of service, corrupt data, or lead to execution of malicious code on a remote server.

At the application level, the main defense against a buffer overflow attack is to check the length of all user input. For example, if you have a form element that asks for the user's name, add a maxlength attribute with a value of 40 to the field and check the length with substr() on the back end. Listing 13 gives a short example of the form and the PHP code.

Listing 13. Checking the length of user input

<?php
if ($_POST['submit'] == "Go") {
    $name = substr($_POST['name'], 0, 40);
}
?>

<form action="<?php echo $_SERVER['PHP_SELF'];?>" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40"/></p>
<p><input type="submit" name="submit" value="Go"/></p>
</form>

Why provide both the maxlength attribute and the substr() check on the back end? Because defense in depth is always good. The browser prevents users from entering extra-long strings that PHP or MySQL might not handle safely (imagine someone trying to enter a name 1,000 characters long), and the back-end PHP check ensures that nobody is manipulating the form data remotely or in the browser.

As you can see, this approach is similar to using strlen() in the previous section to check the length of the GET variable pid. In that example, any input value longer than 5 digits is rejected, but it would be just as easy to truncate the value to the appropriate length, as shown here:

Listing 14. Changing the length of the input GET variable

<?php
$pid = $_GET['pid'];

if (strlen($pid)) {
    // preg_match() is used because ereg() was removed in PHP 7
    if (!preg_match('/^[0-9]+$/D', $pid)) {
        // if non numeric $pid, send them back to home page
    }
} else {
    // empty $pid, so send them back to the home page
}

// we have a numeric pid, but it may be too long, so let's check
if (strlen($pid) > 5) {
    $pid = substr($pid, 0, 5);
}

// we create an object of a fictional class Page, which is now
// even more protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
// and now we have a bunch of PHP that displays the page
?>

Note that buffer overflow attacks are not limited to long strings of digits or letters. You may also see long hexadecimal strings (often looking like \xA3 or \xFF). Remember that the goal of any buffer overflow attack is to flood a particular buffer and place malicious code or instructions in the next buffer, thereby corrupting data or executing malicious code. The simplest way to deal with a hexadecimal buffer overflow is, again, to not allow input to exceed a specific length.

If you are working with a form text area that feeds a longer entry in the database, you cannot easily limit the length of the data on the client. Once the data reaches PHP, you can use a regular expression to strip out any hexadecimal-like strings.

Listing 15. Preventing hexadecimal strings

<?php
if ($_POST['submit'] == "Go") {
    $name = substr($_POST['name'], 0, 40);
    // clean out any potential hexadecimal characters
    // (cleanHex() is defined in Listing 17)
    $name = cleanHex($name);
    // continue processing....
}
?>

You may find this series of operations a bit too strict. After all, hexadecimal strings have legitimate uses, such as outputting characters in a foreign language. How you deploy the hexadecimal regex is up to you. A better strategy is to delete hexadecimal sequences only when too many appear in a row, or when the string exceeds a certain number of characters (such as 128 or 255).
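The "too many in a row" strategy might look like the sketch below. It assumes that hexadecimal sequences arrive as literal text of the form \xNN, and the threshold of three is an arbitrary illustration; stripHexRuns() is a hypothetical helper, not part of the listings.

```php
<?php
// Hypothetical helper: remove only long runs of \xNN-style sequences.
// A run of more than $maxRun consecutive sequences is deleted; shorter runs stay.
function stripHexRuns($input, $maxRun = 3)
{
    // '\\\\' in a single-quoted PHP string is a regex-escaped literal backslash.
    $pattern = '/(?:\\\\[xX][A-Fa-f0-9]{1,3}){' . ((int) $maxRun + 1) . ',}/';
    return preg_replace($pattern, '', $input);
}
```

A single sequence inside ordinary text is left alone, while a long unbroken run is removed as a whole.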

Cross-site scripting attacks

In a cross-site scripting (XSS) attack, a malicious user typically enters information in a form (or through other user input) that inserts malicious client-side markup into the process or database. For example, suppose you have a simple guestbook program on your site that lets visitors leave a name, e-mail address, and brief message. A malicious user could take advantage of this to insert something other than a brief message, such as an image that is inappropriate for other users, or JavaScript that redirects users to another site or steals cookie information.

Fortunately, PHP provides the strip_tags() function, which removes HTML and PHP tags from a string. The strip_tags() function also lets you supply a list of allowed tags, such as <b> or <i>.
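A brief sketch of how this behaves follows. It shows strip_tags() with and without an allow-list, plus htmlspecialchars() to encode whatever is left when it is written into a page. That last step is an addition not mentioned above, and the variable names and sample message are made up for illustration.

```php
<?php
$message = '<b>Hello</b> <script>alert(1)</script>';

// Remove every tag; the text between the tags is kept.
$noTags = strip_tags($message);

// Remove every tag except <b> and <i>.
$someTags = strip_tags($message, '<b><i>');

// Encode special characters at output time so the browser shows them as text.
$encoded = htmlspecialchars('<i>hi</i>', ENT_QUOTES, 'UTF-8');
```

Note that strip_tags() leaves the attributes of allowed tags untouched, so an allow-list should be kept as short as possible.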

Data manipulation within the browser

There is a class of browser plug-ins that let users tamper with header elements and form elements on a page. With Tamper Data (a Mozilla plug-in), it is easy to manipulate simple forms that contain many hidden text fields and so send instructions to PHP and MySQL.

The user can launch Tamper Data before clicking Submit on the form. When the form is submitted, the user sees a list of form data fields. Tamper Data lets the user alter this information before the browser completes the form submission.

Let's go back to the example we built earlier. The string length has been checked, HTML tags have been cleared, and hexadecimal characters have been removed. However, some hidden text fields are added, as follows:

Listing 17. Hidden variables

<?php
if ($_POST['submit'] == "Go") {
    // strip_tags
    $name = strip_tags($_POST['name']);
    $name = substr($name, 0, 40);
    // clean out any potential hexadecimal characters
    $name = cleanHex($name);
    // continue processing....
}

function cleanHex($input) {
    // remove a backslash, x or X, and one to three hexadecimal digits
    $clean = preg_replace('!\\\\[xX]([A-Fa-f0-9]{1,3})!', '', $input);
    return $clean;
}
?>
<form action="<?php echo $_SERVER['PHP_SELF'];?>" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40"/></p>
<input type="hidden" name="table" value="Users"/>
<input type="hidden" name="action" value="create"/>
<input type="hidden" name="status" value="Live"/>
<p><input type="submit" name="submit" value="Go"/></p>
</form>

Notice that one of the hidden variables exposes the table name: Users. You will also see an action field with a value of create. Anyone with basic SQL experience can see that these commands probably control a SQL engine in the middleware. Someone who wants to cause serious damage need only change the table name or supply a different action, such as delete.
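One way to limit the damage is to compare hidden-field values against a fixed allow-list on the server, or better, to keep such values out of the form entirely. The sketch below assumes that this form only ever needs to create a record in the users table; the helper name and the lists are hypothetical.

```php
<?php
// Hypothetical helper: return the submitted value only if it is on the allow-list.
function allowedValue(array $post, $key, array $allowed)
{
    if (isset($post[$key]) && is_string($post[$key])
        && in_array($post[$key], $allowed, true)) {
        return $post[$key];
    }
    return null;
}

// Values this particular form is permitted to send (illustrative).
$table  = allowedValue($_POST, 'table', array('Users'));
$action = allowedValue($_POST, 'action', array('create'));

if ($table === null || $action === null) {
    // A hidden field was altered or is missing: stop processing here.
}
```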

What problem is left? Remote form submission.

Remote form submission

The benefit of the Web is the ability to share information and services. The downside is also the ability to share information and services, because some people act without scruples.

Take forms as an example. Anyone can visit a Web site and use File > Save As in the browser to create a local copy of a form. They can then modify the action parameter to point to a fully qualified URL (not formHandler.php but http://www.yoursite.com/formHandler.php, because the handler lives on that site), make any changes they want, and click Submit; the server receives the form data as if it were legitimate traffic.

You might consider checking $_SERVER['HTTP_REFERER'] to see whether the request came from your own server. This blocks most malicious users but not the most sophisticated attackers, who are smart enough to tamper with the referrer information in the header so that the remote copy of the form looks as if it was submitted from your server.

A better way to handle remote form submission is to generate a token based on a unique string or timestamp and place the token in both a session variable and the form. When the form is submitted, check that the two tokens match. If they don't, you know that someone is trying to send data from a remote copy of the form.

To create a random token, you can use PHP's built-in md5(), uniqid(), and rand() functions, as follows:

Listing 18. Defending against remote form submission

<?php
session_start();

if ($_POST['submit'] == "Go") {
    // check token
    if ($_POST['token'] == $_SESSION['token']) {
        // strip_tags
        $name = strip_tags($_POST['name']);
        $name = substr($name, 0, 40);
        // clean out any potential hexadecimal characters
        $name = cleanHex($name);
        // continue processing....
    } else {
        // stop all processing! remote form posting attempt!
    }
}
$token = md5(uniqid(rand(), true));
$_SESSION['token'] = $token;

function cleanHex($input) {
    // remove a backslash, x or X, and one to three hexadecimal digits
    $clean = preg_replace('!\\\\[xX]([A-Fa-f0-9]{1,3})!', '', $input);
    return $clean;
}
?>
<form action="<?php echo $_SERVER['PHP_SELF'];?>" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40"/></p>
<input type="hidden" name="token" value="<?php echo $token;?>"/>
<p><input type="submit" name="submit" value="Go"/></p>
</form>

This technique works because session data cannot be migrated between servers in PHP. Even if someone obtains your PHP source code, moves it to their own server, and submits information to your server, your server sees only an empty or mismatched session token alongside the form token that was originally provided. The two do not match, and the remote form submission fails.
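On current PHP versions the same idea can be written with a cryptographically secure random source and a timing-safe comparison. This is a sketch, not the code from Listing 18: it assumes PHP 7 or later for random_bytes(), and the helper names are hypothetical. The session array is passed in explicitly so that the functions can be exercised without a live session.

```php
<?php
// Hypothetical helper: create a token, remember it in the session, and return it
// so it can be written into a hidden form field.
function issueFormToken(array &$session)
{
    $session['token'] = bin2hex(random_bytes(32));
    return $session['token'];
}

// Hypothetical helper: true only if the posted token equals the stored one.
function isValidFormToken(array $session, array $post)
{
    if (!isset($session['token'], $post['token'])
        || !is_string($session['token']) || !is_string($post['token'])) {
        return false;
    }
    // hash_equals() compares in constant time to avoid leaking partial matches.
    return hash_equals($session['token'], $post['token']);
}
```

In a page, $_SESSION would be passed as $session after session_start(), and $_POST as $post.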

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
