We all know that security is important, but the trend in the industry is to put off adding security until the last minute. Since it's not possible to completely protect a Web application, why bother? Wrong. There are a few simple steps you can take to make your PHP Web application much more secure.
Before you start
In this tutorial, you will learn how to add security to your PHP Web application. This tutorial assumes that you have at least one year of experience writing PHP Web applications, so it does not cover the basics of the PHP language (conventions or syntax). The goal is to give you an idea of how you should protect your own Web applications.
Goal
This tutorial explains how to defend against the most common security threats: SQL injection, manipulation of GET and POST variables, buffer overflow attacks, cross-site scripting attacks, data manipulation within browsers, and remote form submission.
Prerequisites
This tutorial is written for PHP developers who have at least one year of programming experience. You should know the syntax and conventions of PHP, which are not explained here. Developers with experience in other languages, such as Ruby, Python, and Perl, can also benefit from this tutorial, as many of the rules discussed here also apply to other languages and environments.
System Requirements
You need an environment that is running PHP V4 or V5 and MySQL. You can use Linux, OS X, or Microsoft Windows. If you are on Windows, download the WampServer binaries, which install Apache, MySQL, and PHP on the machine.
Quick Introduction to Security
What is the most important part of a Web application? The answer varies depending on who you ask. Business people need reliability and scalability. The IT support team needs robust, maintainable code. End users need a nice user interface and high performance when performing tasks. However, if you answer "security," everyone will agree that it is important for Web applications.
But that is where most of the discussion ends. Although security is on the project's checklist, security issues often are not considered until just before the project is delivered. The number of Web application projects run this way is staggering. Developers work for months, only to add security features at the very end so that the Web application can be opened to the public.
The result is often a mess and may even require rework, because the code has already been validated, unit tested, and integrated into a larger framework before the security features are added. After security is added, major components may stop working. Integrating security late adds extra burdens or steps to a process that was otherwise smooth (but unsafe).
This tutorial provides a good way to integrate security into a PHP Web application. It discusses several general security topics, and then delves into major security vulnerabilities and how to block them. After completing this tutorial, you will have a better understanding of security.
Topics include:
SQL Injection Attack
Manipulating GET strings
Buffer overflow attack
Cross-site scripting attacks (XSS)
Data manipulation in the browser
Remote form submission
Web Security 101
Before discussing the details of implementing security, it is a good idea to look at Web application security from a fairly high level. This section describes some basic tenets of security philosophy that you should keep in mind no matter what Web application you are creating. Some of these ideas come from Chris Shiflett (whose book on PHP security is priceless), some from Simson Garfinkel (see Resources), and some from years of accumulated experience.
Rule 1: Never trust external data or input
The first thing you must realize about Web application security is that you should not trust external data. External data is any data that is not entered directly by the programmer in the PHP code. Any data from any other source, such as a GET variable, form POST, database, configuration file, session variable, or cookie, is not to be trusted until you have taken steps to make it safe.
For example, the following data elements can be considered safe because they are set in PHP.
Listing 1. Safe and flawless code
<?php
$myUsername = 'tmyer';
$arrayUsers = array('tmyer', 'tom', 'tommy');
define("GREETING", 'hello there ' . $myUsername);
?>
However, the following data elements are flawed.
Listing 2. Unsafe, flawed code
<?php
$myUsername = $_POST['username']; // tainted!
$arrayUsers = array($myUsername, 'tom', 'tommy'); // tainted!
define("GREETING", 'hello there ' . $myUsername); // tainted!
?>
Why is the first variable, $myUsername, flawed? Because it comes directly from the form POST. Users can enter any string in this input field, including malicious commands for purging files or running previously uploaded files. You might ask, "Can't you avoid this danger by using a client-side (JavaScript) form validation script that accepts only the letters A-Z?" Yes, that is always a good step, but as you'll see later, anyone can download any form to their machine, modify it, and resubmit whatever they want.
The solution is simple: You must run cleanup code on $_POST['username']. If you do not, you risk contaminating every other object in which you use $myUsername (for example, arrays or constants).
An easy way to clean up user input is to use a regular expression to handle it. In this example, you only want to accept letters. It may also be a good idea to limit the string to a specific number of characters, or to require all letters to be lowercase.
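Listing 3. Making user input safe
<?php
$myUsername = cleanInput($_POST['username']); // clean!
$arrayUsers = array($myUsername, 'tom', 'tommy'); // clean!
define("GREETING", 'hello there ' . $myUsername); // clean!
function cleanInput($input) {
    $clean = strtolower($input);                    // force lowercase
    $clean = preg_replace("/[^a-z]/", "", $clean);  // keep letters only
    $clean = substr($clean, 0, 12);                 // limit to 12 characters
    return $clean;
}
?>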
Rule 2: Disable PHP settings that make security difficult to implement
You know you can't trust user input, and you should also know that you can't trust the way PHP is configured on your machine. For example, be sure to disable register_globals. If register_globals is enabled, you may do something careless, such as using a $variable in place of a GET or POST string of the same name. With this setting disabled, PHP forces you to reference the correct variable in the correct namespace. To use a variable from the form POST, you refer to $_POST['variable']. That way, this particular variable cannot be mistaken for a cookie, session, or GET variable.
The second setting to check is the error reporting level. During development, you want as many error reports as possible, but when you deliver your project, you want errors written to a log file instead of displayed on the screen. Why? Because malicious hackers use error reporting information (such as SQL errors) to guess what the application is doing. This reconnaissance can help hackers break into the application. To plug this hole, edit the php.ini file, provide an appropriate destination for the error_log entry, and set display_errors to Off.
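Where editing php.ini is not an option, the same logging settings can also be applied at run time. The following is a minimal sketch rather than a recommended configuration: the log path is an assumption, and register_globals is absent because it cannot be changed with ini_set() (and no longer exists as of PHP 5.4).

```php
<?php
// Hypothetical log location; point this at a file outside the Web root
// that the Web server user is allowed to write to.
$logFile = sys_get_temp_dir() . '/php_app_errors.log';

ini_set('display_errors', '0');  // never show errors to visitors
ini_set('log_errors', '1');      // record them instead
ini_set('error_log', $logFile);  // where the records go
error_reporting(E_ALL);          // still capture everything in the log
```

Setting these values in php.ini remains preferable, because run-time settings do not cover errors that occur before the script starts executing.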
Rule 3: If you can't understand it, you can't protect it.
Some developers use strange syntax, or pack statements into terse, cryptic code. This approach can be efficient, but if you don't understand what the code is doing, you can't decide how to protect it.
For example, which of the following two code snippets do you prefer?
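As an illustration of the kind of contrast meant here, consider this sketch. It assumes a form field named username and is not necessarily the pair of snippets the author had in mind.

```php
<?php
// Compact version: the ternary hides the fact that $input is raw POST data.
$input = (isset($_POST['username']) ? $_POST['username'] : '');

// Explicit version: same behavior, but the tainted assignment stands out.
$input = '';
if (isset($_POST['username'])) {
    $input = $_POST['username']; // tainted: still needs cleaning
}
```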
In the second, clearer code snippet, it is easy to see that $input is tainted and needs to be cleaned before it can be safely processed.
Rule 4: "Defense in Depth" is a new weapon
This tutorial uses examples to illustrate how to protect online forms while also taking the necessary steps in the PHP code that processes the form. Similarly, even if you use a PHP regex to ensure that a GET variable is entirely numeric, you can still take steps to ensure that the SQL query uses escaped user input.
Defense in depth is not just a good idea; it can keep you out of serious trouble.
Now that the basic rules have been discussed, let's look at the first threat: SQL injection attacks.
Preventing SQL injection attacks
In a SQL injection attack, a user adds information to a database query by manipulating a form or GET query string. For example, suppose you have a simple login database. Each record in this database has a username field and a password field. You build a login form that allows users to log in.
Listing 5. Simple login form
<html>
<head>
<title>Login</title>
</head>
<body>
<form action="verify.php" method="post">
<p><label for='user'>Username</label>
<input type='text' name='user' id='user'/>
</p>
<p><label for='pw'>Password</label>
<input type='password' name='pw' id='pw'/>
</p>
<p><input type='submit' value='login'/></p>
</form>
</body>
</html>
This form accepts the user name and password entered by the user and submits the user input to the file named verify.php. In this file, PHP processes data from the login form, as follows:
Listing 6. Unsafe PHP form handling code
<?php
$okay = 0;
$username = $_POST['user'];
$pw = $_POST['pw'];
$sql = "select count(*) as ctr from users where username='" . $username . "' and password='" . $pw . "' limit 1";
$result = mysql_query($sql);
while ($data = mysql_fetch_object($result)) {
    if ($data->ctr == 1) {
        // they're okay to enter the application!
        $okay = 1;
    }
}
if ($okay) {
    $_SESSION['loginokay'] = true;
    header("Location: index.php");
} else {
    header("Location: login.php");
}
?>
This code looks fine, right? Code like this is used by hundreds (even thousands) of PHP/MySQL sites around the world. What is wrong with it? Well, remember: "You can't trust user input." None of the information from the user is escaped, so the application is vulnerable to attack. Specifically, any type of SQL injection attack may occur.
For example, if the user enters foo as the user name and ' or '1'='1 as the password, the following string is actually passed to PHP, which passes the query on to MySQL:
<?php
$sql = "select count(*) as ctr from users where username='foo' and password='' or '1'='1' limit 1";
?>
Because '1'='1' is always true, the WHERE clause of this query matches regardless of the password, so the query no longer verifies the password at all. Whenever the resulting count satisfies the check, PHP allows access. By injecting some malicious SQL at the end of the password string, the hacker can masquerade as a legitimate user.
The solution to this problem is to use PHP's built-in mysql_real_escape_string() function as a wrapper around any user input. This function escapes characters in a string, making it impossible for the string to pass special characters such as apostrophes and have MySQL act on them. Listing 7 shows the code with escape processing.
Listing 7. Secure PHP form handling code
<?php
$okay = 0;
$username = $_POST['user'];
$pw = $_POST['pw'];
$sql = "select count(*) as ctr from users where username='" . mysql_real_escape_string($username) . "' and password='" . mysql_real_escape_string($pw) . "' limit 1";
$result = mysql_query($sql);
while ($data = mysql_fetch_object($result)) {
    if ($data->ctr == 1) {
        // they're okay to enter the application!
        $okay = 1;
    }
}
if ($okay) {
    $_SESSION['loginokay'] = true;
    header("Location: index.php");
} else {
    header("Location: login.php");
}
?>
By using mysql_real_escape_string() as a wrapper around user input, you can avoid malicious SQL injection through that input. If a user attempts to pass a malformed password through SQL injection, the following query is passed to the database:
select count(*) as ctr from users where username='foo' and password='\' or \'1\'=\'1' limit 1
There is nothing in the database that matches this password. With just one simple step, a big hole in the Web application is plugged. The lesson here is that you should always escape user input for SQL queries.
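Escaping is the technique this tutorial uses, but the mysql_* functions were removed in PHP 7, so on current PHP versions the same goal is usually reached with prepared statements. The sketch below swaps in PDO, which is not part of the original discussion. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite table as a stand-in for the MySQL users table; loginOkay() is a hypothetical helper name.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the MySQL
// users table so that the sketch is self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (username TEXT, password TEXT)');
$db->exec("INSERT INTO users VALUES ('foo', 'secret')");

// Hypothetical helper: true only for a matching username/password pair.
function loginOkay(PDO $db, $username, $pw)
{
    // Placeholders keep user input out of the SQL text entirely.
    $stmt = $db->prepare(
        'SELECT COUNT(*) FROM users WHERE username = ? AND password = ?'
    );
    $stmt->execute(array($username, $pw));
    return (int) $stmt->fetchColumn() === 1;
}

var_dump(loginOkay($db, 'foo', 'secret'));       // genuine login: bool(true)
var_dump(loginOkay($db, 'foo', "' or '1'='1"));  // injection attempt: bool(false)
```

The plain-text password column only mirrors the tutorial's example; a real application would store password hashes instead.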
However, there are several more security holes that need to be plugged. The next item is manipulation of GET variables.
Preventing users from manipulating GET variables
In the previous section, you prevented users from logging in with malformed passwords. If you are smart, you will apply what you learned to ensure that all user input to SQL statements is escaped.
The user is now securely logged in. However, a user with a valid password will not necessarily follow the rules; he has many opportunities to do damage. For example, an application might allow users to view special content. All links point to locations such as template.php?pid=33 or template.php?pid=321. The part of the URL that follows the question mark is called the query string. Because the query string is placed directly in the URL, it is also called a GET query string.
In PHP, if register_globals is disabled, you can access this string with $_GET['pid']. In the template.php page, you might do something similar to Listing 8.
Listing 8. Sample template.php
<?php
$pid = $_GET['pid'];
// we create an object of a fictional class Page
$obj = new Page;
$content = $obj->fetchPage($pid);
// and now we have a bunch of PHP that displays the page
?>
Is there anything wrong here? First, the code implicitly trusts that the GET variable pid from the browser is safe. What could happen? Most users are not sophisticated enough to construct semantic attacks. However, if they notice pid=33 in the browser's URL location field, they may start messing around. If they enter another number, that is probably fine, but what happens if they enter something else, such as a SQL command, a file name (such as /etc/passwd), or some other mischief, such as a value that is 3,000 characters long?
In this case, remember the basic rule: Do not trust user input. The application developer knows that the personal identifier (PID) that template.php accepts should be a number, so you can use PHP's is_numeric() function to make sure that no non-numeric PID is accepted, as follows:
Listing 9. Using is_numeric() to restrict GET variables
<?php
$pid = $_GET['pid'];
if (is_numeric($pid)) {
    // we create an object of a fictional class Page
    $obj = new Page;
    $content = $obj->fetchPage($pid);
    // and now we have a bunch of PHP that displays the page
} else {
    // didn't pass the is_numeric() test, do something else!
}
?>
This method seems to be valid, but the following inputs all pass the is_numeric() check easily:
100 (valid)
100.1 (there should be no decimal places)
+0123.45e6 (scientific notation, not good)
0xff33669f (hexadecimal, which is_numeric() accepts before PHP 7; dangerous!)
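To see the difference for yourself, the following sketch prints what is_numeric() reports for each of these values next to ctype_digit(), a stricter built-in that is not part of the original discussion and accepts only strings made up entirely of digits. It assumes the ctype extension is enabled, which is the default.

```php
<?php
$candidates = array('100', '100.1', '+0123.45e6', '0xff33669f');

foreach ($candidates as $value) {
    printf(
        "%-12s is_numeric=%s ctype_digit=%s\n",
        $value,
        var_export(is_numeric($value), true),
        var_export(ctype_digit($value), true)
    );
}
```

Only the first value passes ctype_digit(). The is_numeric() result for the hexadecimal string depends on the PHP version you run.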
So what should a security-conscious PHP developer do? Years of experience have shown that the best practice is to use a regular expression to make sure that the entire GET variable consists of digits, as follows:
Listing 10. Using regular expressions to restrict GET variables
<?php
$pid = $_GET['pid'];
if (strlen($pid)) {
    // preg_match() is used here because ereg() was removed in PHP 7
    if (!preg_match('/^[0-9]+\z/', $pid)) {
        // do something appropriate, like maybe logging them out or sending them back to the home page
    }
} else {
    // empty $pid, so send them back to the home page
}
// we create an object of a fictional class Page, which is now
// moderately protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
// and now we have a bunch of PHP that displays the page
?>
All you need to do is use strlen() to check whether the variable's length is nonzero and, if so, use an all-digits regular expression to ensure that the data element is valid. If the PID contains letters, slashes, dots, or anything resembling hexadecimal, this routine catches it and shields the page from the user's activity. If you look behind the scenes of the Page class, you'll see that a security-conscious PHP developer has escaped the user input $pid, protecting the fetchPage() method, as follows:
Listing 11. Escaping in the fetchPage() method
<?php
class Page {
    function fetchPage($pid) {
        // desc is a reserved word in MySQL, so it is quoted with backticks
        $sql = "select pid,title,`desc`,kw,content,status from page where pid='" . mysql_real_escape_string($pid) . "'";
    }
}
?>
You might ask, "Now that you've made sure that the PID is a number, why escape it?" Because you do not know how many different contexts and circumstances will use the fetchPage() method. It must be protected everywhere it is invoked, and escaping inside the method embodies the meaning of defense in depth.
What happens if a user tries to enter a very long value, such as 1,000 characters, in an attempt to initiate a buffer overflow attack? The next section discusses this in more detail, but for now you can add another check to make sure the input PID has the correct length. You know that the maximum length of the database's pid field is 5 digits, so you can add the following check.
Listing 12. Using regular expressions and length checks to restrict GET variables
<?php
$pid = $_GET['pid'];
if (strlen($pid)) {
    // reject the value if it is non-numeric OR longer than 5 digits
    if (!preg_match('/^[0-9]+\z/', $pid) || strlen($pid) > 5) {
        // do something appropriate, like maybe logging them out or sending them back to the home page
    }
} else {
    // empty $pid, so send them back to the home page
}
// we create an object of a fictional class Page, which is now
// even more protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
// and now we have a bunch of PHP that displays the page
?>
Now, no one can cram a 5,000-digit value into the database application, at least not wherever the GET string is involved. Imagine a hacker gnashing his teeth while trying to break into your application. And because error reporting is turned off, it's harder for hackers to do reconnaissance.
Buffer overflow attack
A buffer overflow attack attempts to overflow a memory allocation buffer in a PHP application (or, more precisely, in Apache or the underlying operating system). Keep in mind that although you may be writing your Web application in a high-level language such as PHP, you end up calling C (in the case of Apache). Like most low-level languages, C has strict rules for memory allocation.
A buffer overflow attack sends a large amount of data to a buffer, causing some of the data to overflow into an adjacent memory buffer, thereby damaging that buffer or overriding its logic. This can result in denial of service, corrupted data, or execution of malicious code on a remote server.
The only way to prevent buffer overflow attacks is to check the length of all user input. For example, if you have a form element that asks for the user's name, add a maxlength attribute with a value of 40 to the field and check the value with substr() at the back end. Listing 13 gives a brief example of the form and PHP code.
Why provide both the maxlength attribute and the substr() check at the back end? Because defense in depth is always good. The browser prevents users from entering overly long strings that PHP or MySQL cannot safely handle (imagine someone trying to enter a 1,000-character name), and the back-end PHP check ensures that no one has manipulated the form data remotely or in the browser.
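A minimal sketch of that pairing follows. The field name, the constant, and both helper functions are hypothetical; only the maxlength attribute and the substr() call come from the discussion above.

```php
<?php
// Hypothetical limit, mirroring the 40-character example in the text.
define('NAME_MAX_LENGTH', 40);

// Front end: the browser refuses to accept more than 40 characters.
function renderNameField()
{
    return '<input type="text" name="name" id="name" maxlength="'
        . NAME_MAX_LENGTH . '"/>';
}

// Back end: enforce the same limit again, because the form can be bypassed.
function limitName($raw)
{
    return substr((string) $raw, 0, NAME_MAX_LENGTH);
}

$name = limitName(isset($_POST['name']) ? $_POST['name'] : '');
```

Keeping the limit in one constant means the front-end attribute and the back-end check cannot drift apart.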
As you can see, this approach is similar to using strlen() in the previous section to check the length of the GET variable pid. In that example, any input value longer than 5 digits is rejected, but you could just as easily truncate the value to the appropriate length, as follows:
Listing 14. Changing the length of the input GET variable
<?php
$pid = $_GET['pid'];
if (strlen($pid)) {
    if (!preg_match('/^[0-9]+\z/', $pid)) {
        // if non-numeric $pid, send them back to the home page
    }
} else {
    // empty $pid, so send them back to the home page
}
// we have a numeric pid, but it may be too long, so let's check
if (strlen($pid) > 5) {
    $pid = substr($pid, 0, 5);
}
// we create an object of a fictional class Page, which is now
// even more protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
// and now we have a bunch of PHP that displays the page
?>
Note that buffer overflow attacks are not limited to long strings of digits or letters. You may also see long hexadecimal strings (which often look like \xa3 or \xff). Remember, the purpose of any buffer overflow attack is to flood a particular buffer and place malicious code or instructions in the next buffer, thereby corrupting data or executing malicious code. The simplest way to deal with a hexadecimal buffer overflow is, again, to not allow input to exceed a specific length.
If you are dealing with a form text area that allows longer entries in the database, you cannot easily limit the length of the data on the client. After the data arrives in PHP, you can use regular expressions to clean out anything that looks like a hexadecimal string.
You may find this series of operations a bit too strict. After all, hexadecimal strings have legitimate uses, such as outputting characters in a foreign language. How you deploy the hexadecimal regex is up to you. A better strategy is to delete hexadecimal strings only if there are too many of them in a row, or if the string exceeds a specific number of characters (such as 128 or 255).
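The "too many in a row" strategy might be sketched as follows. The function name and the threshold of four are assumptions chosen for illustration, not values from the tutorial.

```php
<?php
// Hypothetical helper: removes runs of literal "\xNN" sequences, but only
// when at least $minRun of them appear back to back.
function stripHexRuns($input, $minRun = 4)
{
    $pattern = '/(?:\\\\x[0-9A-Fa-f]{2}){' . (int) $minRun . ',}/';
    return preg_replace($pattern, '', $input);
}

// Single quotes keep the backslashes literal, as they would arrive in a form.
echo stripHexRuns('hello\xa3\xff\x00\x41world'), "\n"; // prints helloworld
echo stripHexRuns('caf\xe9'), "\n";                    // prints caf\xe9
```

A single escaped character survives, so legitimate uses such as foreign-language characters are left alone, while long runs are removed.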
Cross-site scripting attacks
In a cross-site scripting (XSS) attack, a malicious user typically enters information in a form (or through other user input) that inserts malicious client-side markup into a process or database. For example, suppose you have a simple guestbook program on your site that allows visitors to leave their names, e-mail addresses, and short messages. A malicious user can use this opportunity to insert something other than a short message, such as a picture that is inappropriate for other users, or JavaScript that redirects users to another site or steals cookies.
Fortunately, PHP provides the strip_tags() function, which strips HTML and PHP tags from a string. The strip_tags() function also lets you provide a list of allowed tags, such as <b> or <i>.
Listing 16 shows an example that builds on the previous example.
From a security standpoint, it is necessary to use strip_tags() on public user input. If the form is in a protected area (such as a content management system) and you trust users to perform their tasks correctly (such as creating HTML content for a Web site), then using strip_tags() may be unnecessary and can hurt productivity.
One more point: If you accept user input, such as comments on posts or guestbook entries, and need to display that input to other users, be sure to pass the output through PHP's htmlspecialchars() function. This function converts the ampersand (&), <, and > symbols to HTML entities. For example, the ampersand (&) becomes &amp;. In this way, even if malicious content slips past strip_tags() on the way in, htmlspecialchars() takes care of it on the way out.
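The two functions can be combined roughly as in this sketch. The message is a hypothetical guestbook entry, and the ENT_QUOTES flag and UTF-8 charset passed to htmlspecialchars() are choices made here, not requirements stated in the text.

```php
<?php
// Hypothetical guestbook message containing a script tag.
$message = '<script>alert("gotcha")</script>Nice site, Tom & Jerry';

// On the way in: remove every tag (the text between the tags remains).
$stored = strip_tags($message);

// On the way out: encode whatever is left before it reaches the page.
$shown = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

echo $stored, "\n"; // alert("gotcha")Nice site, Tom & Jerry
echo $shown, "\n";  // alert(&quot;gotcha&quot;)Nice site, Tom &amp; Jerry
```

Note that strip_tags() removes the tags themselves but keeps the text between them, which is one reason the output encoding step still matters.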
Data manipulation in the browser
A class of browser plug-ins allows users to tamper with the header elements and form elements on a page. With Tamper Data (a Mozilla plug-in), it is easy to manipulate simple forms that contain many hidden text fields in order to send instructions to PHP and MySQL.
The user can start Tamper Data before clicking Submit on the form. When he submits the form, he sees a list of form data fields. Tamper Data lets the user alter that data before the browser completes the form submission.
Let's go back to the example built earlier. The string length has been checked, the HTML tags have been stripped, and the hexadecimal characters have been removed. However, some hidden text fields have been added, as follows:
Note that one of the hidden variables exposes the table name: users. You will also see an action field with a value of create. Anyone with basic SQL experience can see that these commands probably control a SQL engine in the middleware. Someone who wants to do serious damage only has to change the table name or supply another option, such as delete.
Figure 1 illustrates the scope of the damage that Tamper Data can do. Note that Tamper Data gives users access not only to form data elements, but also to HTTP headers and cookies.
Figure 1. Tamper Data window
The easiest way to defend against this tool is to assume that any user may be using Tamper Data (or a similar tool). Provide only the minimum amount of information the system needs to process a form, and submit the form to dedicated logic. For example, the registration form should be submitted only to the registration logic.
What if you already have a generic form handler and many pages that use this generic logic? What if you use hidden variables to control the flow? For example, you might specify which database table to write to or which file repository to use in hidden form variables. You have four options:
Change nothing, and secretly pray that there are no malicious users on the system.
Rewrite the functionality to use safer, dedicated form-handling functions and avoid hidden form variables.
Use md5() or a similar mechanism to hide table names or other sensitive information in the hidden form variables. Note that md5() is a one-way hash, so on the PHP side you compare the submitted value against hashes of the values you expect rather than decrypting it.
Obscure the values by using abbreviations or nicknames, and then convert them back in the PHP form-handling function. For example, if you want to refer to the users table, you can refer to it as u or as any arbitrary string, such as U8Y90X0JKL.
The latter two options are not perfect, but they are much better than letting users easily guess the middleware logic or data model.
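The nickname approach could be sketched like this. The function name, the second alias, and the decision to return null for unknown values are all assumptions made for illustration.

```php
<?php
// Hypothetical alias map: the form only ever sees the opaque keys.
function resolveTable($alias)
{
    $tables = array(
        'U8Y90X0JKL' => 'users',
        'P3K71MZQ0A' => 'page',
    );
    // Anything not on the list is rejected instead of being passed to SQL.
    if (is_string($alias) && isset($tables[$alias])) {
        return $tables[$alias];
    }
    return null;
}

$posted = isset($_POST['t']) ? $_POST['t'] : '';
$table = resolveTable($posted);
if ($table === null) {
    // unknown alias: stop processing instead of guessing
}
```

Because the real table name is looked up on the server, a tampered value can at worst select another entry that you deliberately put in the list.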
What's left? Remote form submission.
Remote form submission
The benefit of the Web is that you can share information and services. The downside is also that you can share information and services, because some people do things without any scruples.
Take forms as an example. Anyone can visit a Web site and create a local copy of a form using File > Save As in the browser. He can then modify the action attribute to point to a fully qualified URL (not formhandler.php, but http://www.yoursite.com/formhandler.php, because the form is on this site), make any changes he wants, click Submit, and the server receives this form data as legitimate traffic.
At first you might consider checking $_SERVER['HTTP_REFERER'] to determine whether the request came from your own server. This blocks most malicious users, but not the most sophisticated hackers. These people are smart enough to tamper with the referrer information in the header so that the remote copy of the form looks as if it was submitted from your server.
A better way to handle remote form submission is to generate a token based on a unique string or timestamp and place the token both in a session variable and in the form. After the form is submitted, check whether the two tokens match. If they don't match, you know that someone is trying to send data from a remote copy of the form.
To create a random token, you can use the md5(), uniqid(), and rand() functions built into PHP, as follows:
This technique is effective because session data cannot be migrated between servers in PHP. Even if someone gets your PHP source code, transfers it to his own server, and submits information to your server, your server receives only an empty or malformed session token alongside the form token that was originally provided. They do not match, and the remote form submission fails.
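A minimal sketch of the token exchange follows. Both helper names are hypothetical, the session array is passed in explicitly so the functions can be tried outside a Web request (in a real page you would pass $_SESSION after calling session_start()), and hash_equals() assumes PHP 5.6 or later.

```php
<?php
// Hypothetical helper: creates a token and remembers it in the session.
function issueFormToken(array &$session)
{
    // md5(), uniqid(), and rand() as described in the text; on PHP 7+,
    // bin2hex(random_bytes(16)) is a stronger source of randomness.
    $token = md5(uniqid((string) rand(), true));
    $session['token'] = $token;
    return $token; // embed this value in a hidden form field
}

// Hypothetical helper: compares the submitted token with the stored one.
function isFormTokenValid(array $session, $submitted)
{
    return isset($session['token'])
        && is_string($submitted)
        && hash_equals($session['token'], $submitted);
}

$session = array();                 // stand-in for $_SESSION
$token = issueFormToken($session);  // goes into the form
var_dump(isFormTokenValid($session, $token));   // bool(true)
var_dump(isFormTokenValid($session, 'forged')); // bool(false)
```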
Conclusion
This tutorial discusses a number of issues:
Use mysql_real_escape_string() to prevent SQL injection problems.
Use regular expressions and strlen() to ensure that GET data has not been tampered with.
Use regular expressions and strlen() to ensure that data submitted by the user does not overflow a memory buffer.
Use strip_tags() and htmlspecialchars() to prevent users from submitting potentially harmful HTML tags.
Avoid the system being broken by tools such as Tamper Data.
Use a unique token to prevent users from submitting a form to the server remotely.
This tutorial does not cover more advanced topics, such as file injection, HTTP header spoofing, and other vulnerabilities. However, the knowledge you have learned can help you immediately add enough security to make your current project more secure.
Resources
Learn
Find a useful PHP 101 tutorial on zend.com.
Get a copy of Chris Shiflett's Essential PHP Security. It goes much deeper than this tutorial.
Get a copy of Web Security, Privacy & Commerce by Simson Garfinkel.
Learn more about the PHP Security Consortium.
Read "Top 7 PHP Security Blunders".
Check out the developerWorks "Recommended PHP reading list."
Read the developerWorks article "Auditing PHP, Part 1: Understanding register_globals".
View the PHP Security HOWTO webcast.
Visit the IBM developerWorks PHP project resources to learn more about PHP.
Keep an eye on developerWorks technical events and webcasts.
Learn about upcoming conferences, trade shows, webcasts, and other events around the world of interest to IBM open source developers.
Visit the developerWorks Open source zone, which has a wealth of how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM products.
To listen to interesting interviews and discussions for software developers, be sure to check out developerWorks podcasts.
Get products and technologies
Windows users can download wampserver.
Build your next development project with PHP.
Use the IBM trial software to improve your next open source development project, which can be downloaded or obtained by DVD.
Discuss
Join the developerWorks community by participating in developerWorks blogs.
About the author
Thomas Myer is the founder and principal of Triple Dog Dare Media, a Web consulting firm based in Austin, Texas, with expertise in information architecture, Web application development, and XML consulting. He is the author of No Nonsense XML Web Development with PHP (published by SitePoint).