Security needs to be thoroughly understood and mastered, both in development and during interviews or technical discussions.
Goal
The goal of this tutorial is to show you how to protect the Web applications you build. It explains how to defend against the most common security threats: SQL injection, manipulation of GET and POST variables, buffer overflow attacks, cross-site scripting attacks, data manipulation within the browser, and remote form submission.
Quick Introduction to Security
What is the most important part of a Web application? The answer varies depending on who you ask. Business people need reliability and scalability. The IT support team needs robust, maintainable code. End users need a pleasant user interface and good performance when carrying out tasks. However, if you answer "security," everyone will agree that it matters for a Web application.
But that is where most of the discussion ends. Although security is on the checklist for a project, it is often not considered until just before the project is delivered. A surprising number of Web application projects work this way. Developers work for months and add security features only at the very end, right before the Web application is opened to the public.
The result is often chaos and even rework, because the code has already been tested, unit tested, and integrated into a larger framework before security features are added to it. After you add security, major components may stop working. Integrating security late adds an extra burden or step to an otherwise smooth (but insecure) process.
This tutorial provides a good way to integrate security into a PHP Web application. It discusses a few general security topics and then delves into key security vulnerabilities and how to block them. After completing this tutorial, you will have a better understanding of security.
Topics include:
SQL injection attacks
Manipulating GET strings
Buffer overflow attack
Cross-site scripting attacks (XSS)
Data manipulation within the browser
Remote form submission
Web Security 101
Before discussing the details of implementing security, it is a good idea to look at Web application security from a higher level. This section describes some basic tenets of security philosophy that you should keep in mind no matter what Web application you are creating. Some of these ideas come from Chris Shiflett (whose book on PHP security is a priceless treasure trove), some from Simson Garfinkel (see Resources), and some from years of accumulated experience.
Rule 1: Never trust external data or input
The first thing you must realize about Web application security is that you should not trust external data. External data includes any data that the programmer did not type directly into the PHP code. Any data from any other source (such as GET variables, form POST data, databases, configuration files, session variables, or cookies) is untrusted until you take steps to make it safe.
For example, the following data elements can be considered safe because they are set in PHP.
Listing 1. Safe, untainted code
[php]
$myUsername = 'tmyer';
$arrayUsers = array('tmyer', 'tom', 'tommy');
define("GREETING", 'hello there' . $myUsername);
[/php]
However, the following data elements are tainted.
Listing 2. Unsafe, tainted code
[php]
$myUsername = $_POST['username']; // tainted!
$arrayUsers = array($myUsername, 'tom', 'tommy'); // tainted!
define("GREETING", 'hello there' . $myUsername); // tainted!
[/php]
Why is the first variable, $myUsername, tainted? Because it comes directly from a form POST. Users can enter any string in this input field, including malicious commands to delete files or run previously uploaded files. You might ask, "Can't you avoid this danger with a client-side (JavaScript) form validation script that accepts only the letters A-Z?" Yes, that is always a good step, but as you'll see later, anyone can download any form onto their own machine, modify it, and resubmit whatever they want.
The solution is simple: you must run cleanup code on $_POST['username']. If you do not, you risk tainting every other place where $myUsername is used, such as in arrays or constants.
A simple way to clean up user input is to use a regular expression to handle it. In this example, you only want to accept letters. It may also be a good idea to limit a string to a specific number of characters, or to require all letters to be lowercase.
Listing 3. Making user input safe
[php]
$myUsername = cleanInput($_POST['username']); // clean!
$arrayUsers = array($myUsername, 'tom', 'tommy'); // clean!
define("GREETING", 'hello there' . $myUsername); // clean!

function cleanInput($input) {
    $clean = strtolower($input);
    $clean = preg_replace("/[^a-z]/", "", $clean);
    $clean = substr($clean, 0, 12);
    return $clean;
}
[/php]
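Cleaning silently changes what the user typed. Where rejecting bad input is preferable, a validation check can sit alongside the cleanup. The following is a minimal sketch, assuming the same rules as Listing 3 (lowercase letters only, at most 12 characters); the function name isValidUsername() is hypothetical and not part of the original listings.

```php
<?php
// Hypothetical helper: returns true only if $input already satisfies
// the rules that Listing 3 enforces (lowercase a-z, 1 to 12 characters).
function isValidUsername($input)
{
    // \z anchors at the very end of the string, so a trailing newline fails.
    return is_string($input) && preg_match('/^[a-z]{1,12}\z/', $input) === 1;
}

var_dump(isValidUsername('tmyer'));       // a plain lowercase name
var_dump(isValidUsername("tmyer'; --"));  // contains quote and punctuation
```

Whether to reject or to clean is a design choice: rejecting tells the user something was wrong, while cleaning quietly stores a different value from the one submitted.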
Rule 2: Disable PHP settings that make security difficult to implement
You know you can't trust user input, and you should also know that you can't trust the way PHP is configured on the machine. For example, make sure that register_globals is disabled. If register_globals is enabled, you might do something careless, such as using a $variable that is silently replaced by a GET or POST string of the same name. With this setting disabled, PHP forces you to reference the correct variable in the correct namespace. To use a variable from a form POST, you refer to $_POST['variable']. That way you cannot mistake this particular variable for a cookie, session, or GET variable. (register_globals was removed entirely in PHP 5.4, so this check applies to older installations.)
The second setting to check is the error reporting level. During development, you want as many error reports as possible, but when you deliver the project, you want errors written to a log file instead of displayed on screen. Why? Because malicious hackers use error messages, such as SQL errors, to guess what the application is doing. This kind of reconnaissance helps hackers break into the application. To plug this hole, edit the php.ini file, provide an appropriate destination for the error_log entry, and set display_errors to Off.
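The same error settings can also be applied at run time. The following is a minimal sketch, assuming the application is allowed to call ini_set() and that the log path shown (a hypothetical file in the system temp directory) is writable; setting these values in php.ini remains the more dependable choice.

```php
<?php
// Hypothetical log location; in practice, pick a path outside the web root.
$logFile = sys_get_temp_dir() . '/php_app_errors.log';

error_reporting(E_ALL);           // still record every class of error
ini_set('display_errors', '0');   // never show errors to the visitor
ini_set('log_errors', '1');       // write them to the log instead
ini_set('error_log', $logFile);

// This message goes to the log file, not to the screen.
error_log('example: something went wrong');
```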
Rule 3: If you can't understand it, you can't protect it.
Some developers use strange syntax, or organize statements in a compact form, with short but ambiguous code. This approach can be efficient, but if you don't understand what the code is doing, you can't decide how to protect it.
For example, which of the following two sections of code do you like?
Listing 4. Make code easy to protect
[php]
// obfuscated code
$input = (isset($_POST['username']) ? $_POST['username'] : '');

// unobfuscated code
$input = '';
if (isset($_POST['username'])) {
    $input = $_POST['username'];
} else {
    $input = '';
}
[/php]
In the second, clearer code snippet, it is easy to see that $input is tainted and needs to be cleaned before it can be handled safely.
Rule 4: "Defense in depth" is the new mantra
This tutorial uses examples to show how to protect an online form while also taking the necessary steps in the PHP code that handles the form. Likewise, even if you use a PHP regular expression to make sure a GET variable is entirely numeric, you can still take steps to make sure the SQL query uses escaped user input.
Defense in depth is not just a good idea; it keeps you out of serious trouble.
Now that the basic rules have been covered, consider the first threat: SQL injection attacks.
Preventing SQL injection attacks
In a SQL injection attack, the user adds information to a database query by manipulating a form or a GET query string. For example, suppose you have a simple login database. Each record in this database has a user name field and a password field. You build a login form that allows users to log in.
Listing 5. Simple sign-in form
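[PHP]
<title>Login</title>
<form action="verify.php" method="post">
<input type="text" name="user"/> <input type="password" name="pw"/> <input type="submit" value="login"/>
</form>
[/php]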
This form accepts the user name and password entered by the user and submits the user input to a file named verify.php. In this file, PHP processes the data from the login form as follows:
Listing 6. Unsafe PHP Form processing code
[PHP]
<?php
$okay = 0;
$username = $_POST['user'];
$pw = $_POST['pw'];
$sql = "select count(*) as ctr from users where
    username='" . $username . "' and password='" . $pw . "' limit 1";
$result = mysql_query($sql);
while ($data = mysql_fetch_object($result)) {
    if ($data->ctr == 1) {
        // they're okay to enter the application!
        $okay = 1;
    }
}
if ($okay) {
    $_SESSION['loginokay'] = true;
    header("Location: index.php");
} else {
    header("Location: login.php");
}
?>
[/php]
This code looks fine, doesn't it? Code like this is used by hundreds (or even thousands) of PHP/MySQL sites around the world. What is wrong with it? Remember: you cannot trust user input. Nothing from the user is escaped here, so the application is open to attack. Specifically, it is vulnerable to any kind of SQL injection attack.
For example, if the user enters foo as the user name and ' or '1'='1 as the password, PHP actually builds the following string and passes the query to MySQL:
$sql = "select count(*) as ctr from users where
username='foo' and password='' or '1'='1' limit 1";
The where clause of this query is true for every row, because '1'='1' always holds, so the count no longer depends on the password being correct. By injecting some malicious SQL at the end of the password string, the hacker can pose as a legitimate user.
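To see exactly how the injected password reshapes the query, the concatenation can be reproduced without a database. This is a small sketch; buildLoginQuery() is a hypothetical helper that mirrors the string building in Listing 6, written on one line.

```php
<?php
// Hypothetical helper mirroring the vulnerable concatenation in Listing 6.
function buildLoginQuery($username, $pw)
{
    return "select count(*) as ctr from users where username='" . $username
        . "' and password='" . $pw . "' limit 1";
}

// The attacker's password closes the quote and appends an always-true test.
$sql = buildLoginQuery('foo', "' or '1'='1");
echo $sql, PHP_EOL;
```

The printed query shows the quotes from the input becoming part of the SQL syntax, which is the whole attack.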
The solution to this problem is to wrap all user input in PHP's built-in mysql_real_escape_string() function. This function escapes special characters in a string, such as apostrophes, so that MySQL treats them as data rather than as part of the SQL statement. (The mysql extension was removed in PHP 7; on current versions, mysqli_real_escape_string() in the mysqli extension does the same job.) Listing 7 shows the code with escaping applied.
Listing 7. Secure PHP Form processing code
[PHP]
<?php
$okay = 0;
$username = $_POST['user'];
$pw = $_POST['pw'];
$sql = "select count(*) as ctr from users where
    username='" . mysql_real_escape_string($username) . "'
    and password='" . mysql_real_escape_string($pw) . "' limit 1";
$result = mysql_query($sql);
while ($data = mysql_fetch_object($result)) {
    if ($data->ctr == 1) {
        // they're okay to enter the application!
        $okay = 1;
    }
}
if ($okay) {
    $_SESSION['loginokay'] = true;
    header("Location: index.php");
} else {
    header("Location: login.php");
}
?>
[/php]
By wrapping user input in mysql_real_escape_string(), you keep malicious SQL in user input from being injected. If a user tries to pass a malformed password through SQL injection, the following query is passed to the database:
select count(*) as ctr from users where
username='foo' and password='\' or \'1\'=\'1' limit 1
Nothing in the database matches such a password. One simple step plugs a big hole in the Web application. The lesson here is that user input to a SQL query should always be escaped.
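Escaping works, but prepared statements keep user input out of the SQL text altogether. The sketch below is not part of the original listings: it assumes the PDO extension with the SQLite driver is available, and it uses an in-memory database with a hypothetical one-row users table so that it can run on its own. The plain-text password column only mirrors the tutorial's example.

```php
<?php
// Throwaway in-memory database standing in for the real MySQL server.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("create table users (username text, password text)");
$db->exec("insert into users values ('foo', 'secret')");

// Hypothetical helper: the ? placeholders are filled in by the driver,
// so quotes in the input can never change the structure of the query.
function loginOkay(PDO $db, $username, $pw)
{
    $stmt = $db->prepare(
        'select count(*) as ctr from users where username = ? and password = ?'
    );
    $stmt->execute(array($username, $pw));
    return (int) $stmt->fetchColumn() === 1;
}

var_dump(loginOkay($db, 'foo', 'secret'));       // correct password
var_dump(loginOkay($db, 'foo', "' or '1'='1"));  // injection attempt
```

With placeholders, the injection string is compared as an ordinary (wrong) password, so the second call fails to log in.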
However, several more security holes need to be plugged. The next one is manipulation of GET variables.
Prevent user manipulation of variables
In the previous section, you prevented users from logging in with a malformed password. If you are smart, you will apply the same method to make sure that all user input to SQL statements is escaped.
However, the user is now safely logged in. Having a valid password does not mean the user will play by the rules; there are many opportunities to cause damage. For example, an application might allow users to view special content. All links point to locations such as template.php?pid=33 or template.php?pid=321. The part of the URL after the question mark is called the query string. Because the query string is placed directly in the URL, it is also called a GET query string.
In PHP, if register_globals is disabled, this string can be accessed with $_GET['pid']. In the template.php page, you might do something similar to Listing 8.
Listing 8. Example template.php
[PHP]
<?php
$pid = $_GET['pid'];
// we create an object of a fictional class Page
$obj = new Page;
$content = $obj->fetchPage($pid);
// and now we have a bunch of PHP that displays the page
// ......
// ......
?>
[/php]
Is there anything wrong here? First, the code implicitly trusts that the GET variable pid from the browser is safe. What could happen? Most users are not savvy enough to construct semantic attacks. However, if they notice pid=33 in the browser's URL location field, they may start making mischief. If they enter another number, that is probably fine, but what happens if they enter something else, such as a SQL command or the name of a file (such as /etc/passwd), or play other pranks, such as entering a value 3,000 characters long?
In this case, remember the basic rule: do not trust user input. The application developer knows that the personal identifier (PID) accepted by template.php should be a number, so you can use PHP's is_numeric() function to make sure that a non-numeric PID is not accepted, as follows:
Listing 9. Using is_numeric() to restrict GET variables
[PHP]
<?php
$pid = $_GET['pid'];
if (is_numeric($pid)) {
    // we create an object of a fictional class Page
    $obj = new Page;
    $content = $obj->fetchPage($pid);
    // and now we have a bunch of PHP that displays the page
    // ......
    // ......
} else {
    // didn't pass the is_numeric() test, do something else!
}
?>
[/php]
This method looks effective, but all of the following inputs pass the is_numeric() check:
100 (valid)
100.1 (should not have decimal places)
+0123.45e6 (scientific notation: bad)
0xff33669f (hexadecimal: dangerous! Accepted by is_numeric() before PHP 7)
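A quick way to see the problem is to compare is_numeric() with a digits-only check. The sketch below uses ctype_digit() from PHP's ctype extension as the stricter test; that choice is an assumption made for illustration, since the tutorial itself goes on to use a regular expression.

```php
<?php
// The four inputs listed above. On PHP 7 and later, is_numeric() rejects
// the hexadecimal string; older versions accepted it.
$candidates = array('100', '100.1', '+0123.45e6', '0xff33669f');

foreach ($candidates as $value) {
    printf(
        "%-12s is_numeric: %-6s digits only: %s\n",
        $value,
        var_export(is_numeric($value), true),
        var_export(ctype_digit($value), true)
    );
}
```

Only the first input passes the digits-only test, which is the behavior a page identifier needs.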
So what should a security-conscious PHP developer do? Years of experience have shown that the best practice is to use regular expressions to ensure that the entire GET variable is made up of numbers, as follows:
Listing 10. Restricting GET variables with regular expressions
[PHP]
<?php
$pid = $_GET['pid'];
if (strlen($pid)) {
    // preg_match() replaces ereg(), which was removed in PHP 7
    if (!preg_match('/^[0-9]+\z/', $pid)) {
        // do something appropriate, like maybe logging
        // them out or sending them back to home page
    }
} else {
    // empty $pid, so send them back to the home page
}
// we create an object of a fictional class Page, which is now
// moderately protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
// and now we have a bunch of PHP that displays the page
// ......
// ......
?>
[/php]
All you need to do is use strlen() to check that the variable's length is not 0; if it is not, use an all-digit regular expression to make sure the data element is valid. If the PID contains letters, slashes, dots, or anything resembling hexadecimal, this routine catches it and keeps the page from acting on it. If you look behind the scenes of the Page class, you'll see that a security-conscious PHP developer has escaped the user input $pid, protecting the fetchPage() method as follows:
Listing 11. Escaping in the fetchPage() method
[PHP]
<?php
class Page {
    function fetchPage($pid) {
        // desc is a reserved word in MySQL, so it is quoted with backticks
        $sql = "select pid,title,`desc`,kw,content,
            status from page where pid='"
            . mysql_real_escape_string($pid) . "'";
        // etc, etc....
    }
}
?>
[/php]
You might ask, "If you've already made sure the PID is a number, why escape it as well?" Because you don't know in how many different contexts and situations the fetchPage() method will be used. You must protect every place the method is called, and escaping inside the method is what defense in depth means.
What happens if a user tries to enter a very long value, say 1,000 characters, in an attempt to launch a buffer overflow attack? The next section discusses this in more detail, but for now you can add another check to make sure the PID has the correct length. You know that the pid field in the database holds at most 5 digits, so you can add the following check.
Listing 12. Use regular expressions and length checks to restrict GET variables
[PHP]
<?php
$pid = $_GET['pid'];
if (strlen($pid)) {
    // reject the value if it is not all digits OR if it is too long
    if (!preg_match('/^[0-9]+\z/', $pid) || strlen($pid) > 5) {
        // do something appropriate, like maybe logging
        // them out or sending them back to home page
    }
} else {
    // empty $pid, so send them back to the home page
}
// we create an object of a fictional class Page, which is now
// even more protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
// and now we have a bunch of PHP that displays the page
// ......
// ......
?>
[/php]
Now no one can cram a 5,000-digit value into your database application, at least not where a GET string is involved. Imagine hackers gnashing their teeth as they try to break into your application. And because error display is turned off, reconnaissance is harder for them.
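The digits-only and length checks can also be folded into a single pattern. The following is a minimal sketch, assuming a PID of 1 to 5 digits as described above; isValidPid() is a hypothetical helper name, not part of the original listings.

```php
<?php
// Hypothetical helper: true only for a string of 1 to 5 ASCII digits.
// \z anchors at the absolute end, so "33\n" is rejected as well.
function isValidPid($pid)
{
    return is_string($pid) && preg_match('/^[0-9]{1,5}\z/', $pid) === 1;
}

$pid = isset($_GET['pid']) ? $_GET['pid'] : '';
if (!isValidPid($pid)) {
    // In a real page: send the user back to the home page and stop here.
    $pid = null;
}
```

Keeping the rule in one function means every page that accepts a pid applies exactly the same test.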
Buffer overflow attack
A buffer overflow attack attempts to overflow a memory allocation buffer in a PHP application (or, more precisely, in Apache or the underlying operating system). Keep in mind that although you may be writing your Web application in a high-level language like PHP, you are ultimately calling C (in the case of Apache). Like most low-level languages, C has strict rules for memory allocation.
A buffer overflow attack sends a large amount of data to a buffer so that part of the data spills into adjacent memory buffers, corrupting them or overwriting program logic. This can cause a denial of service, corrupt data, or run malicious code on the remote server.
The only way to prevent a buffer overflow attack is to check the length of all user input. For example, if you have a form element that asks for the user's name, add a maxlength attribute with a value of 40 to the field and check the value with substr() on the backend. Listing 13 gives a short example of the form and the PHP code.
Listing 13. Check the length of user input
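[PHP]
<?php
if ($_POST['submit'] == "Go") {
    $name = substr($_POST['name'], 0, 40);
    // continue processing....
}
?>
<form method="post">
<input type="text" name="name" maxlength="40"/>
<input type="submit" name="submit" value="Go"/>
</form>
[/php]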
Why provide both the maxlength attribute and the substr() check on the backend? Because defense in depth is always good. The browser keeps users from entering extra-long strings that PHP or MySQL cannot safely handle (imagine someone trying to enter a 1,000-character name), and the backend PHP check makes sure that no one has manipulated the form data, remotely or in the browser.
As you can see, this approach is similar to the previous section's use of strlen() to check the length of the GET variable pid. In that example, any input value longer than 5 digits is rejected, but it is just as easy to truncate the value to an appropriate length, as shown here:
Listing 14. Change the length of the input GET variable
[PHP]
<?php
$pid = $_GET['pid'];
if (strlen($pid)) {
    if (!preg_match('/^[0-9]+\z/', $pid)) {
        // if non numeric $pid, send them back to home page
    }
} else {
    // empty $pid, so send them back to the home page
}
// we have a numeric pid, but it may be too long, so let's check
if (strlen($pid) > 5) {
    $pid = substr($pid, 0, 5);
}
// we create an object of a fictional class Page, which is now
// even more protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
// and now we have a bunch of PHP that displays the page
// ......
// ......
?>
[/php]
Note that a buffer overflow attack is not limited to long numeric strings or strings of letters. You may also see long hexadecimal strings (often looking like \xa3 or \xff). Remember that any buffer overflow attack is intended to overwhelm a particular buffer and place malicious code or instructions in the next buffer, destroying the data or executing malicious code. The simplest way to counter a hex buffer overflow is to not allow the input to exceed a specific length.
If you are working with a form text area that allows longer entries to be stored in the database, you cannot easily limit the length of the data on the client. After the data arrives in PHP, you can use a regular expression to strip out anything that looks like a hexadecimal string.
Listing 15. Preventing hexadecimal strings
[PHP]
<?php
if ($_POST['submit'] == "Go") {
    $name = substr($_POST['name'], 0, 40);
    // clean out any potential hexadecimal characters
    $name = cleanHex($name);
    // continue processing....
}

function cleanHex($input) {
    // remove a literal backslash, x or X, and 1 to 3 hex digits
    $clean = preg_replace('/\\\\[xX][A-Fa-f0-9]{1,3}/', '', $input);
    return $clean;
}
?>
[/php]
You may find this series of operations a bit too strict. After all, hexadecimal strings have legitimate uses, such as printing characters in a foreign language. How you deploy the hexadecimal regular expression is up to you. A good strategy is to delete hexadecimal sequences only when too many of them appear in a row, or when the string exceeds a certain number of characters (such as 128 or 255).
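The "too many in a row" strategy could look like the sketch below. It assumes that a hexadecimal sequence means a literal backslash, an x, and two hex digits, and that runs of more than three are suspicious; both the threshold and the function name stripHexRuns() are hypothetical choices for illustration.

```php
<?php
// Hypothetical helper: removes only runs of more than $maxRun consecutive
// \xNN sequences (a literal backslash, x or X, and two hex digits).
function stripHexRuns($input, $maxRun = 3)
{
    $min = (int) $maxRun + 1;
    $pattern = '/(?:\\\\[xX][A-Fa-f0-9]{2}){' . $min . ',}/';
    return preg_replace($pattern, '', $input);
}

// Single-quoted strings keep the backslashes literal.
echo stripHexRuns('caf\xc3\xa9'), PHP_EOL;                // short run is kept
echo stripHexRuns('name\x90\x90\x90\x90\x90!'), PHP_EOL;  // long run is removed
```

A short run, such as the two bytes of an accented character, passes through unchanged, while a long run that resembles injected machine code is dropped.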
Cross-site scripting attacks
In a cross-site scripting (XSS) attack, a malicious user typically enters information in a form (or through other user input) that inserts malicious client-side markup into the process or database. For example, suppose your site has a simple guestbook program that lets visitors leave a name, an e-mail address, and a short message. A malicious user can take advantage of this to insert something other than a short message, such as an image that is inappropriate for other users, or JavaScript that redirects users to another site or steals cookie information.
Fortunately, PHP provides the strip_tags() function, which removes HTML tags from a string. The strip_tags() function also lets you provide a list of allowed tags, such as <b> or <i>.
Listing 16 shows an example that was built on the basis of the previous example.
Listing 16. Clear HTML markup from user input
[PHP]
<?php
if ($_POST['submit'] == "Go") {
    // strip_tags
    $name = strip_tags($_POST['name']);
    $name = substr($name, 0, 40);
    // clean out any potential hexadecimal characters
    $name = cleanHex($name);
    // continue processing....
}

function cleanHex($input) {
    // remove a literal backslash, x or X, and 1 to 3 hex digits
    $clean = preg_replace('/\\\\[xX][A-Fa-f0-9]{1,3}/', '', $input);
    return $clean;
}
?>
[/php]
From a security point of view, using strip_tags() on public user input is a necessity. If the form is in a protected area (such as a content management system) and you trust users to perform their tasks correctly (such as creating HTML content for a Web site), using strip_tags() may be unnecessary and may hurt productivity.
One more point: if you accept user input, such as comments on posts or guestbook entries, and need to display this input to other users, be sure to pass the output through PHP's htmlspecialchars() function. This function converts the ampersand, <, and > characters into HTML entities. For example, an ampersand (&) becomes &amp;. That way, even if malicious content slips past strip_tags() on the way in, htmlspecialchars() neutralizes it on the way out.
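The two functions can be tried side by side. The sketch below uses made-up guestbook input; the ENT_QUOTES flag and the UTF-8 charset argument are assumptions about how the page is served, not something the tutorial specifies.

```php
<?php
// On input: keep <b>, drop every other tag. Note that strip_tags() removes
// the tags themselves; the text that was between them stays behind.
$message = '<b>Nice site!</b><script>alert(1)</script>';
$stored = strip_tags($message, '<b>');
echo $stored, PHP_EOL;

// On output: escape special characters so the browser shows them as text.
$comment = 'Tom & Jerry <script>alert(1)</script>';
$safe = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');
echo $safe, PHP_EOL;
```

Because strip_tags() leaves the inner text in place, escaping on output is still needed as the second layer of defense.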
Data manipulation within the browser
There is a class of browser plug-ins that allow users to tamper with header elements and form elements on a page. With Tamper Data (a Mozilla plug-in), it is easy to manipulate simple forms that contain many hidden text fields, and so to send instructions to PHP and MySQL.
The user can launch Tamper Data before clicking Submit on the form. When the form is submitted, the user sees a list of form data fields. Tamper Data lets the user alter this information before the browser completes the form submission.
Let's go back to the example we built earlier. The string length has been checked, HTML tags have been cleared, and hexadecimal characters have been removed. However, some hidden text fields are added, as follows:
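Listing 17. Hidden variables
[PHP]
<?php
if ($_POST['submit'] == "Go") {
    // strip_tags
    $name = strip_tags($_POST['name']);
    $name = substr($name, 0, 40);
    // clean out any potential hexadecimal characters
    $name = cleanHex($name);
    // continue processing....
}

function cleanHex($input) {
    // remove a literal backslash, x or X, and 1 to 3 hex digits
    $clean = preg_replace('/\\\\[xX][A-Fa-f0-9]{1,3}/', '', $input);
    return $clean;
}
?>
<form method="post">
<input type="text" name="name" maxlength="40"/>
<input type="hidden" name="table" value="users"/>
<input type="hidden" name="action" value="create"/>
<input type="submit" name="submit" value="Go"/>
</form>
[/php]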
Notice that one of the hidden variables exposes the table name: users. You will also see an action field with a value of create. Anyone with basic SQL experience can see that these commands probably drive a SQL engine in the middleware. Someone who wants to do serious damage only has to change the table name or supply a different action, such as delete.
Figure 1 illustrates the scope of damage that Tamper Data makes possible. Note that Tamper Data gives users access not only to form data elements but also to HTTP headers and cookies.
Figure 1. Tamper Data window
The simplest way to defend against such a tool is to assume that any user may use Tamper Data (or similar tools). Provide only the minimal amount of information that the system needs to process the form, and submit the form to some specialized logic. For example, the registration form should only be submitted to the registration logic.
What if you have built one common form handler and many pages use this shared logic? What if you use hidden variables to control the flow? For example, you might specify in a hidden form variable which database table to write to or which file repository to use. You have four options:
Change nothing, and secretly pray that there are no malicious users on the system.
Rewrite the functionality to use more secure, specialized form handlers and avoid hidden form variables.
Use md5() or another hashing or encryption mechanism to obscure table names or other sensitive information in hidden form variables. Note that md5() is a one-way hash and cannot be decrypted; on the PHP side, compare the submitted value against the hashes of the values you expect, or decrypt it if you used reversible encryption.
Obscure the meaning of the values by using abbreviations or nicknames, and translate them back in the PHP form handler. For example, to refer to the users table, you can use u or any arbitrary string, such as u8y90x0jkl.
The latter two options are not perfect, but they are much better than letting users easily guess the middleware logic or data model.
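The nickname approach works best when the handler accepts only values it already knows. The following is a minimal sketch; the alias strings, the table names other than users, and the function name resolveTable() are all hypothetical.

```php
<?php
// Hypothetical map from the opaque alias sent in the form to the real table.
$tableAliases = array(
    'u8y90x0jkl' => 'users',
    'p3k2m9q1zz' => 'pages',
);

// Returns the real table name, or null when the alias is not on the list.
function resolveTable(array $aliases, $alias)
{
    if (is_string($alias) && isset($aliases[$alias])) {
        return $aliases[$alias];
    }
    return null;
}

var_dump(resolveTable($tableAliases, 'u8y90x0jkl'));  // known alias
var_dump(resolveTable($tableAliases, 'users'));       // tampered value
```

Because the lookup is an allow list, a tampered hidden field can at worst select another table the form was already permitted to use, never an arbitrary one.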
What questions are left? Remote form submission.
Remote form submission
The benefit of the Web is that you can share information and services. The downside is also that you can share information and services, because some people act without scruples.
Take forms as an example. Anyone can visit a Web site and use File > Save As in the browser to make a local copy of a form. They can then change the action parameter to point to a fully qualified URL (not formhandler.php but http://www.yoursite.com/formhandler.php, because the form lives on that site), make any changes they want, and click Submit; the server receives the form data as legitimate traffic.
You might consider checking $_SERVER['HTTP_REFERER'] to see whether the request came from your own server. This blocks most malicious users, but not the most sophisticated hackers, who are smart enough to tamper with the referrer information in the header so that the remote copy of the form appears to have been submitted from your server.
A better way to handle remote form submission is to generate a token based on a unique string or timestamp and place the token in both a session variable and the form. After the form is submitted, check whether the two tokens match. If they do not, you know someone is trying to send data from a remote copy of the form.
To create a random token, you can use PHP's built-in md5(), uniqid(), and rand() functions, as follows:
Listing 18. Defend against remote form submissions
[PHP]
<?php
session_start();
if ($_POST['submit'] == "Go") {
    // check token
    if ($_POST['token'] == $_SESSION['token']) {
        // strip_tags
        $name = strip_tags($_POST['name']);
        $name = substr($name, 0, 40);
        // clean out any potential hexadecimal characters
        $name = cleanHex($name);
        // continue processing....
    } else {
        // stop all processing! remote form posting attempt!
    }
}
$token = md5(uniqid(rand(), true));
$_SESSION['token'] = $token;

function cleanHex($input) {
    // remove a literal backslash, x or X, and 1 to 3 hex digits
    $clean = preg_replace('/\\\\[xX][A-Fa-f0-9]{1,3}/', '', $input);
    return $clean;
}
?>
<form method="post">
<input type="text" name="name" maxlength="40"/>
<input type="hidden" name="token" value="<?php echo $token; ?>"/>
<input type="submit" name="submit" value="Go"/>
</form>
[/php]
This technique works because PHP session data does not migrate between servers. Even if someone obtains your PHP source code, moves it to their own server, and submits information to your server, your server receives only an empty or malformed session token alongside the form token that was originally issued. They do not match, and the remote form submission fails.
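On PHP 7 and later, the same idea can be written with random_bytes() and hash_equals(), which give an unpredictable token and a timing-safe comparison. The following sketch is an alternative to Listing 18, not the original code; the function names are hypothetical, and the session is passed in as a plain array so that the logic can run outside a Web request.

```php
<?php
// Hypothetical helper: creates a token and remembers it in the session data.
function issueToken(array &$session)
{
    $session['token'] = bin2hex(random_bytes(32));  // 64 hex characters
    return $session['token'];
}

// Hypothetical helper: true only if the posted token matches the stored one.
function tokenIsValid(array $session, array $post)
{
    if (empty($session['token']) || !isset($post['token']) || !is_string($post['token'])) {
        return false;
    }
    return hash_equals($session['token'], $post['token']);
}

$session = array();                 // stands in for $_SESSION
$formToken = issueToken($session);  // would go into a hidden form field

var_dump(tokenIsValid($session, array('token' => $formToken)));  // genuine form
var_dump(tokenIsValid($session, array('token' => 'forged')));    // remote copy
```

Rejecting an empty session token up front also closes the case where both the session and the posted value are empty and would otherwise compare as equal.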
Conclusion
This tutorial discusses a number of issues:
Use mysql_real_escape_string() to prevent SQL injection problems.
Use regular expressions and strlen () to ensure that the GET data is not tampered with.
Use regular expressions and strlen () to ensure that the data submitted by the user does not overflow the memory buffer.
Use strip_tags() and htmlspecialchars() to prevent users from submitting potentially harmful HTML tags.
Avoid the system being broken by tools such as Tamper Data.
Use unique tokens to prevent users from submitting forms remotely to the server.
This tutorial does not cover more advanced topics such as file injection, HTTP header spoofing, and other vulnerabilities. However, the knowledge you have learned can help you immediately add enough security to make your current project more secure.