Rule 1: Never trust external data or input
The first thing to realize about Web application security is that external data should not be trusted. External data is any data that the programmer did not enter directly in the PHP code. Until you take measures to make it safe, data from any other source (such as GET variables, form POST data, databases, configuration files, session variables, or cookies) cannot be trusted.
For example, the following data elements can be considered safe because they are set in PHP.
Listing 1. Safe and flawless code
<?php
$myUsername = 'tmyer';
$arrayUsers = array('tmyer', 'Tom', 'Tommy');
define("GREETING", 'Hello There' . $myUsername);
?>
However, the following data elements are flawed.
Listing 2. Insecure and defective code
<?php
$myUsername = $_POST['username']; // tainted!
$arrayUsers = array($myUsername, 'Tom', 'Tommy'); // tainted!
define("GREETING", 'Hello There' . $myUsername); // tainted!
?>
Why is the first variable, $myUsername, tainted? Because it comes directly from the form POST. A user can enter any string in this input field, including malicious commands that delete files or run previously uploaded files. You may ask, "Can't I avoid this risk with a client-side form validation script that accepts only the letters A-Z?" Yes, that is always a good step, but as you will see later, anyone can download any form to their own machine, modify it, and resubmit whatever content they like.
The solution is simple: you must run cleanup code on $_POST['username']. If you do not, you risk tainting these objects any other time you use $myUsername (such as in an array or a constant).
A simple way to clean user input is to process it with a regular expression. In this example, only letters are accepted. It may also be a good idea to limit the string to a specific number of characters, or to require that all letters be lowercase.
Listing 3. Making user input secure
<?php
$myUsername = cleanInput($_POST['username']); // clean!
$arrayUsers = array($myUsername, 'Tom', 'Tommy'); // clean!
define("GREETING", 'Hello There' . $myUsername); // clean!

function cleanInput($input) {
    $clean = strtolower($input);
    $clean = preg_replace("/[^a-z]/", "", $clean);
    $clean = substr($clean, 0, 12);
    return $clean;
}
?>
Rule 2: Disable PHP settings that make security difficult
You already know that you cannot trust user input. You should also know that you cannot trust the way PHP is configured on the machine. For example, make sure that register_globals is disabled. If register_globals is enabled, careless things can happen, such as a $variable being replaced by a GET or POST string of the same name. With this setting disabled, PHP forces you to reference the correct variables in the correct namespace. To use a variable from a form POST, you reference $_POST['variable']. That way you will not mistake this specific variable for a cookie, session, or GET variable.
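As a small illustration of referencing the correct namespace, the sketch below reads a value only from one explicitly named source array. The helper name postValue() and the sample data are hypothetical and not part of the original article. (register_globals itself was removed in PHP 5.4, so on current PHP versions the superglobal arrays are the only route.)

```php
<?php
// Hypothetical helper: fetch a string from one explicit source array
// (for example $_POST), never from a same-named GET, cookie, or session value.
function postValue(array $post, $key, $default = '')
{
    // Accept only plain strings; array input such as username[]=x is rejected.
    if (isset($post[$key]) && is_string($post[$key])) {
        return $post[$key];
    }
    return $default;
}

// Simulated request data, standing in for $_POST and $_GET.
$post = array('username' => 'tmyer');
$get  = array('username' => 'attacker');

// Only the POST value is consulted; the GET value of the same name is ignored.
$username = postValue($post, 'username');
```

In a real handler the first argument would be $_POST itself; passing the array in explicitly keeps the function easy to test.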
Rule 3: If you cannot understand it, you cannot protect it
Some developers use strange syntax or pack statements together very tightly, producing short but ambiguous code. This approach may be efficient, but if you do not understand what the code is doing, you cannot decide how to protect it.
For example, which of the following two code segments do you prefer?
Listing 4. Making code easy to protect
<?php
// Obfuscated code
$input = (isset($_POST['username']) ? $_POST['username'] : '');

// Unobfuscated code
$input = '';
if (isset($_POST['username'])) {
    $input = $_POST['username'];
} else {
    $input = '';
}
?>
In the second, clearer code segment, it is easy to see that $input is tainted and needs to be cleaned before it can be processed safely.
Rule 4: "Defense in depth" is your new mantra
This tutorial uses examples to show how to protect online forms while also taking the necessary measures in the PHP code that processes them. Likewise, even if you use a PHP regular expression to ensure that a GET variable is entirely numeric, you can still take steps to ensure that SQL queries use escaped user input.
Defense in depth is not just a good idea; it keeps you out of serious trouble.
Now that the basic rules have been covered, let's look at the first threat: SQL injection attacks.
Prevent SQL injection attacks
In an SQL injection attack, a user adds information to a database query by manipulating a form or a GET query string. For example, assume there is a simple login database. Each record in this database has a username field and a password field. You create a logon form to allow users to log on.
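Listing 5. Simple logon form
<form action="verify.php" method="post">
<label for="user">Username</label> <input type="text" name="user" id="user" />
<label for="pw">Password</label> <input type="password" name="pw" id="pw" />
<input type="submit" value="login" />
</form>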
This form accepts the username and password entered by the user and submits them to a file named verify.php. In this file, PHP processes the data from the logon form as follows:
Listing 6. Insecure PHP form processing code
<?php
$okay = 0;
$username = $_POST['user'];
$pw = $_POST['pw'];
$sql = "select count(*) as ctr from users where username='" . $username . "'
        and password='" . $pw . "' limit 1";
$result = mysql_query($sql);
while ($data = mysql_fetch_object($result)) {
    if ($data->ctr == 1) {
        // They're okay to enter the application!
        $okay = 1;
    }
}
if ($okay) {
    $_SESSION['loginokay'] = true;
    header("Location: index.php");
} else {
    header("Location: login.php");
}
?>
This code looks okay, right? Hundreds or even thousands of PHP/MySQL sites around the world use code like this. So what is wrong with it? Remember: "user input cannot be trusted." No information from the user is escaped, so the application is vulnerable to attack. Specifically, any type of SQL injection attack is possible.
For example, if the user enters foo as the username and ' or '1'='1 as the password, the following string is actually passed to PHP, which then passes the query to MySQL:
<?php
$sql = "select count(*) as ctr from users where username='foo' and
        password='' or '1'='1' limit 1";
?>
Because '1'='1' is always true, the where clause matches every row, so the query returns a nonzero count even though no valid password was supplied. Whenever the application treats that count as a successful login, PHP allows access. By injecting malicious SQL at the end of the password string, an attacker can masquerade as a legitimate user.
The solution to this problem is to use PHP's built-in mysql_real_escape_string() function as a wrapper around any user input. This function escapes the characters in a string so that the string cannot pass special characters such as apostrophes and cause MySQL to act on them. Listing 7 shows the code with escaping applied.
Listing 7. Safe PHP form processing code
<?php
$okay = 0;
$username = $_POST['user'];
$pw = $_POST['pw'];
$sql = "select count(*) as ctr from users where
        username='" . mysql_real_escape_string($username) . "' and password='" .
        mysql_real_escape_string($pw) . "' limit 1";
$result = mysql_query($sql);
while ($data = mysql_fetch_object($result)) {
    if ($data->ctr == 1) {
        // They're okay to enter the application!
        $okay = 1;
    }
}
if ($okay) {
    $_SESSION['loginokay'] = true;
    header("Location: index.php");
} else {
    header("Location: login.php");
}
?>
Using mysql_real_escape_string() as a wrapper around user input prevents malicious SQL injection through that input. If a user tries to pass a malformed password by means of SQL injection, the following query is passed to the database:
select count(*) as ctr from users where username='foo' and password='\' or \'1\'=\'1' limit 1
Nothing in the database matches this password. One simple step closes a major hole in the Web application. The lesson here is that user input to an SQL query should always be escaped.
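The mysql_* functions used in these listings were removed in PHP 7. On current PHP versions, a common alternative to manual escaping is a prepared statement, where user input is bound as data and never concatenated into the SQL text. The following is a minimal sketch of that swapped-in technique using PDO. The function name countMatchingUsers(), the in-memory SQLite database, and the sample row are assumptions made to keep the example self-contained, and the plain-text password column simply mirrors the listings above (a real application would store password hashes).

```php
<?php
// Hypothetical helper: count users matching the supplied credentials.
// The ? placeholders are bound as data, so quotes in the input cannot
// change the structure of the query.
function countMatchingUsers(PDO $db, $username, $pw)
{
    $stmt = $db->prepare(
        'select count(*) as ctr from users where username = ? and password = ?'
    );
    $stmt->execute(array($username, $pw));
    return (int) $stmt->fetchColumn();
}

// Demo setup (assumption): an in-memory SQLite table instead of MySQL.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('create table users (username text, password text)');
$db->exec("insert into users values ('foo', 'secret')");

// A correct password matches one row; the injection string matches none,
// because it is compared literally against the password column.
$legit    = countMatchingUsers($db, 'foo', 'secret');
$injected = countMatchingUsers($db, 'foo', "' or '1'='1");
```

With a MySQL server, only the PDO connection string would change; the prepared statement itself stays the same.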
However, several security holes still need to be closed. The next one is manipulation of GET variables.
Prevent users from manipulating GET variables
In the previous section, you prevented users from logging on with malformed passwords. If you are smart, you will apply what you learned there to make sure that all user input to SQL statements is escaped.
However, the user is now safely logged on. Having a valid password does not mean that a user will play by the rules; he has many opportunities to cause damage. For example, an application may allow users to view special content, with all links pointing to locations such as template.php?pid=33 or template.php?pid=321. The part of the URL after the question mark is called the query string. Because the query string is placed directly in the URL, it is also called a GET query string.
In PHP, if register_globals is disabled, you can access this string with $_GET['pid']. In the template.php page, you might do something similar to Listing 8.
Listing 8. Sample template.php
<?php
$pid = $_GET['pid'];
// We create an object of a fictional class Page
$obj = new Page;
$content = $obj->fetchPage($pid);
// And now we have a bunch of PHP that displays the page
?>
What's wrong here? First, the code implicitly trusts that the GET variable pid from the browser is safe. What could happen? Most users are not clever enough to construct semantic attacks. However, if they notice pid=33 in the browser's URL field, they may start to make trouble. If they enter another number, that is probably fine; but what happens if they enter something else, such as an SQL command or the name of a file (such as /etc/passwd), or play other pranks, such as entering a value 3,000 characters long?
In this case, remember the basic rule: do not trust user input. The application developer knows that the personal identifier (PID) accepted by template.php should be a number, so he can use PHP's is_numeric() function to make sure that non-numeric PIDs are not accepted, as shown below:
Listing 9. Using is_numeric() to restrict GET variables
<?php
$pid = $_GET['pid'];
if (is_numeric($pid)) {
    // We create an object of a fictional class Page
    $obj = new Page;
    $content = $obj->fetchPage($pid);
    // And now we have a bunch of PHP that displays the page
} else {
    // Didn't pass the is_numeric() test, do something else!
}
?>
This method seems valid, but all of the following inputs easily pass the is_numeric() check:
100 (valid)
100.1 (should not have decimal places)
+0123.45e6 (scientific notation: not good)
0xff33669f (hexadecimal: dangerous! dangerous! PHP 7 and later no longer treat such strings as numeric, but older versions do)
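To see the difference in practice, the sketch below compares is_numeric() with ctype_digit(), which accepts only strings made up entirely of the digits 0-9. The sample values are the first three from the list above. The hexadecimal case is left out because its result depends on the PHP version.

```php
<?php
// Inputs from the list above (hexadecimal omitted: version dependent).
$samples = array('100', '100.1', '+0123.45e6');

$results = array();
foreach ($samples as $value) {
    $results[$value] = array(
        'is_numeric'  => is_numeric($value),  // accepts signs, decimals, exponents
        'ctype_digit' => ctype_digit($value), // accepts the digits 0-9 only
    );
}

// Print one line per input showing which check accepted it.
foreach ($results as $value => $checks) {
    echo $value, ': is_numeric=', var_export($checks['is_numeric'], true),
        ' ctype_digit=', var_export($checks['ctype_digit'], true), "\n";
}
```

Note that ctype_digit() should be given a string, which is what $_GET values are when they are not arrays.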
So what should a security-conscious PHP developer do? Years of experience show that the best practice is to use a regular expression to ensure that the entire GET variable consists of digits, as shown below:
Listing 10. Using regular expressions to restrict GET variables
<?php
$pid = $_GET['pid'];
if (strlen($pid)) {
    // preg_match() is used here because ereg() was removed in PHP 7
    if (!preg_match('/^[0-9]+\z/', $pid)) {
        // Do something appropriate, like maybe logging them out or sending them
        // back to home page
    }
} else {
    // Empty $pid, so send them back to the home page
}
// We create an object of a fictional class Page, which is now
// moderately protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
// And now we have a bunch of PHP that displays the page
?>
All you need to do is use strlen() to check whether the variable's length is nonzero; if it is, use an all-digit regular expression to make sure the data element is valid. If the PID contains letters, slashes, periods, or anything resembling hexadecimal, this routine catches it and shields the page from user activity. If you look behind the scenes of the Page class, you will see that a security-conscious PHP developer has escaped the user input $pid to protect the fetchPage() method, as shown below:
Listing 11. Escaping in the fetchPage() method
<?php
class Page {
    function fetchPage($pid) {
        // desc is a reserved word in MySQL, so it is quoted with backticks
        $sql = "select pid, title, `desc`, kw, content, status from page where
                pid='" . mysql_real_escape_string($pid) . "'";
    }
}
?>
You may ask, "Since you have already ensured that the PID is a number, why escape it as well?" Because you do not know in how many different contexts the fetchPage() method will be used. It must be protected everywhere it is called, and escaping inside the method embodies defense in depth.
What happens if a user tries to enter a very long value, say up to 1,000 characters, in an attempt to launch a buffer overflow attack? The next section discusses this in more detail, but for now you can add another check to make sure that the entered PID has the correct length. You know that the maximum length of the database's pid field is 5 digits, so you can add the following check.
Listing 12. Using regular expressions and length checks to restrict GET variables
<?php
$pid = $_GET['pid'];
if (strlen($pid)) {
    // Reject the value if it is not all digits OR if it is too long
    // (preg_match() is used because ereg() was removed in PHP 7)
    if (!preg_match('/^[0-9]+\z/', $pid) || strlen($pid) > 5) {
        // Do something appropriate, like maybe logging them out or sending them
        // back to home page
    }
} else {
    // Empty $pid, so send them back to the home page
}
// We create an object of a fictional class Page, which is now
// even more protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
// And now we have a bunch of PHP that displays the page
?>
Now no one can cram a 5,000-digit value into your database application, at least not where GET strings are involved. Imagine a hacker gritting his teeth in frustration while trying to break into your application! And because error reporting is turned off, reconnaissance is harder for the hacker.
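The digit check and the length check can also be folded into one function that rejects bad input before any page is fetched. This is a minimal sketch: the function name validatePid() and the choice to return null for invalid input are assumptions, while the five-digit limit comes from the text above. It uses preg_match() because ereg() was removed in PHP 7.

```php
<?php
// Hypothetical helper: return the pid as an integer, or null when the input
// is not a string of one to five digits. \z anchors at the true end of the
// string, so a trailing newline is rejected as well.
function validatePid($pid)
{
    if (!is_string($pid) || !preg_match('/^[0-9]{1,5}\z/', $pid)) {
        return null;
    }
    return (int) $pid;
}

$ok       = validatePid('321');          // accepted
$tooLong  = validatePid('123456');       // rejected: six digits
$notDigit = validatePid('/etc/passwd');  // rejected: not digits
$empty    = validatePid('');             // rejected: empty
```

A caller would fetch the page only when the result is not null, and otherwise send the user back to the home page.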
Buffer overflow attacks
A buffer overflow attack attempts to overflow a memory allocation buffer in the PHP application (or, more precisely, in Apache or the underlying operating system). Remember that you may be writing your Web application in a high-level language such as PHP, but in the end it still calls C (in the case of Apache). Like most low-level languages, C has strict rules for memory allocation.
A buffer overflow attack sends so much data to a buffer that part of it spills into the adjacent memory buffer, corrupting that buffer or rewriting logic. This can cause denial of service, corrupt data, or execute malicious code on the remote server.
The only way to prevent buffer overflow attacks is to check the length of all user input. For example, if a form element asks for the user's name, add a maxlength attribute with a value of 40 to the field, and check the value on the back end with substr(). Listing 13 gives a brief example of the form and the PHP code.
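Listing 13. Checking the length of user input
<?php
if ($_POST['submit'] == "go") {
    $name = substr($_POST['name'], 0, 40);
}
?>
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40" /></p>
<p><input type="submit" name="submit" value="go" /></p>
</form>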
Why provide both the maxlength attribute and a substr() check on the back end? Because defense in depth is always good. The browser prevents users from entering overly long strings that PHP or MySQL cannot handle safely (imagine someone trying to enter a name 1,000 characters long), while the back-end PHP check ensures that no one is manipulating the form data remotely or in the browser.
As you can see, this approach is similar to using strlen() in the previous section to check the length of the GET variable pid. In that example, any input value longer than 5 digits was ignored, but the value could just as easily be truncated to the appropriate length, as shown below:
Listing 14. Changing the length of the input GET variable
<?php
$pid = $_GET['pid'];
if (strlen($pid)) {
    // preg_match() is used here because ereg() was removed in PHP 7
    if (!preg_match('/^[0-9]+\z/', $pid)) {
        // If non numeric $pid, send them back to home page
    }
} else {
    // Empty $pid, so send them back to the home page
}
// We have a numeric pid, but it may be too long, so let's check
if (strlen($pid) > 5) {
    $pid = substr($pid, 0, 5);
}
// We create an object of a fictional class Page, which is now
// even more protected from evil user input
$obj = new Page;
$content = $obj->fetchPage($pid);
// And now we have a bunch of PHP that displays the page
?>
Note that buffer overflow attacks are not limited to long strings of digits or letters. You may also see long hexadecimal strings (which often look like \xA3 or \xFF). Remember, the aim of any buffer overflow attack is to flood a specific buffer and place malicious code or instructions in the next buffer, thereby corrupting data or executing malicious code. The simplest way to deal with hex buffer overflow is, again, not to allow input to exceed a specific length.
If you are dealing with a form text area that allows longer entries in the database, you cannot easily limit the length of the data on the client side. Once the data reaches PHP, you can use a regular expression to clean out anything that looks like a hexadecimal string.
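Listing 15. Preventing hexadecimal strings
<?php
if ($_POST['submit'] == "go") {
    $name = substr($_POST['name'], 0, 40);
    // Clean out any potential hexadecimal characters
    $name = cleanHex($name);
    // Continue processing....
}

function cleanHex($input) {
    // Remove a literal backslash, x or X, and the hex digits that follow
    $clean = preg_replace('!\\\\[xX][A-Fa-f0-9]{1,3}!', '', $input);
    return $clean;
}
?>
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40" /></p>
<p><input type="submit" name="submit" value="go" /></p>
</form>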
You may find this series of operations a little too strict. After all, hexadecimal strings have legitimate uses, such as outputting characters in foreign languages. How you deploy the hex regular expression is up to you. A better strategy is to delete hexadecimal strings only when there are too many of them in a row, or when the string exceeds a certain number of characters (such as 128 or 255).
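One way to read that strategy is sketched below: count the hexadecimal escape sequences first, and strip them only when there are more than a chosen number. The function name cleanHexIfExcessive(), the default threshold of 3, and the pattern limited to two hex digits are illustrative assumptions, not values from the article.

```php
<?php
// Hypothetical helper: strip \xNN-style sequences only when the input
// contains more of them than $maxAllowed.
function cleanHexIfExcessive($input, $maxAllowed = 3)
{
    // Matches a literal backslash, x or X, then one or two hex digits.
    $pattern = '/\\\\[xX][A-Fa-f0-9]{1,2}/';
    $count = preg_match_all($pattern, $input);
    if ($count > $maxAllowed) {
        return preg_replace($pattern, '', $input);
    }
    return $input;
}

// Single-quoted strings keep the backslashes literal, as form input would.
$few  = cleanHexIfExcessive('caf\xE9');                 // one sequence: kept
$many = cleanHexIfExcessive('a\x90\x90\x90\x90\x90b');  // five sequences: stripped
```

The threshold trades strictness for tolerance of legitimate input, so it would need tuning for the kind of text a given field is expected to hold.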
Cross-site scripting
In a cross-site scripting (XSS) attack, a malicious user typically enters information in a form (or through some other user input) that inserts malicious client-side markup into the process or the database. For example, suppose the site has a simple guestbook program that lets visitors leave a name, an e-mail address, and a short message. A malicious user can take advantage of this to insert something other than a short message, such as images that are inappropriate for other users, or JavaScript that redirects users to another site or steals cookie information.
Fortunately, PHP provides the strip_tags() function, which strips HTML and PHP tags from a string. The strip_tags() function also accepts a list of allowed tags, such as <b> or <i>.
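As a minimal sketch of this idea, the first function below strips tags from a guestbook message while allowing bold and italic markup. The function names cleanMessage() and escapeForHtml() and the sample input are assumptions. Because strip_tags() leaves the attributes of allowed tags in place, the second helper shows a stricter option, escaping with htmlspecialchars() at output time; that is a complementary technique rather than something the article itself describes.

```php
<?php
// Remove all tags except <b> and <i>. The text between removed tags remains.
function cleanMessage($message)
{
    return strip_tags($message, '<b><i>');
}

// Escape markup characters so the browser displays the text literally.
function escapeForHtml($text)
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$input   = 'Hi <b>there</b><script>alert(1)</script>';
$cleaned = cleanMessage($input);       // script tags removed, <b> kept
$escaped = escapeForHtml('<script>');  // angle brackets become entities
```

Note that the script's text content survives tag stripping as plain text; only the tags themselves are removed.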
Data manipulation in the browser
There is a class of browser plug-ins that allow users to tamper with header and form elements on a page. With Tamper Data (a Mozilla plug-in), it is easy to manipulate a simple form containing many hidden text fields and so send instructions to PHP and MySQL.
Before clicking Submit on the form, the user can start Tamper Data. When he submits the form, he sees a list of the form's data fields. Tamper Data lets the user tamper with this data before the browser completes the form submission.
Let's return to the example built earlier. The string length has been checked, HTML tags have been stripped, and hexadecimal characters have been removed. However, some hidden text fields have been added, as shown below:
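Listing 17. Hidden variables
<?php
if ($_POST['submit'] == "go") {
    // strip_tags
    $name = strip_tags($_POST['name']);
    $name = substr($name, 0, 40);
    // Clean out any potential hexadecimal characters
    $name = cleanHex($name);
    // Continue processing....
}

function cleanHex($input) {
    // Remove a literal backslash, x or X, and the hex digits that follow
    $clean = preg_replace('!\\\\[xX][A-Fa-f0-9]{1,3}!', '', $input);
    return $clean;
}
?>
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40" /></p>
<input type="hidden" name="table" value="users" />
<input type="hidden" name="action" value="create" />
<p><input type="submit" name="submit" value="go" /></p>
</form>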
Notice that one of the hidden variables exposes the table name: users. You can also see an action field with the value create. Anyone with basic SQL experience can tell that these commands probably control an SQL engine in the middleware. Someone who wants to do serious damage need only change the table name or supply another option, such as delete.
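One hedged way to blunt this kind of tampering is to keep table names out of the form entirely and to accept only actions from a fixed server-side allowlist. The sketch below uses hypothetical names (resolveAction() and the list of permitted actions); it is an illustration, not the article's own code.

```php
<?php
// Hypothetical helper: return the requested action only if the server
// explicitly permits it; anything else (such as 'delete') yields null.
function resolveAction($requested, array $allowed)
{
    return in_array($requested, $allowed, true) ? $requested : null;
}

// The table name is fixed on the server and never read from the form.
$table   = 'users';
$allowed = array('create');

$good = resolveAction('create', $allowed);  // permitted
$bad  = resolveAction('delete', $allowed);  // tampered value is refused
```

The strict comparison in in_array() means that a non-string value, such as an array submitted in place of the field, is refused as well.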
What problems remain? Remote form submission.
Remote form submission
The advantage of the Web is that information and services can be shared. The downside is also that information and services can be shared, because some people act without scruples.
Take forms as an example. Anyone can visit a Web site and use File > Save As in the browser to make a local copy of a form. He can then modify the action parameter to point to a fully qualified URL (not formHandler.php but http://www.yoursite.com/formhandler.php, because the form is on that site), make any changes he wants, and click Submit. The server receives the form data as a legitimate communication stream.
You might first consider checking $_SERVER['HTTP_REFERER'] to determine whether the request comes from your own server. This blocks most malicious users but not the most sophisticated hackers, who are clever enough to tamper with the referrer information in the header so that the remote copy of the form appears to have been submitted from your server.
A better way to handle remote form submission is to generate a token based on a unique string or timestamp and to place this token in both a session variable and the form. After the form is submitted, check whether the two tokens match. If they do not, you know that someone is trying to send data from a remote copy of the form.
To create a random token, you can use PHP's built-in md5(), uniqid(), and rand() functions, as shown below:
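Listing 18. Defending against remote form submission
<?php
session_start();
if ($_POST['submit'] == "go") {
    // Check token
    if ($_POST['token'] == $_SESSION['token']) {
        // strip_tags
        $name = strip_tags($_POST['name']);
        $name = substr($name, 0, 40);
        // Clean out any potential hexadecimal characters
        $name = cleanHex($name);
        // Continue processing....
    } else {
        // Stop all processing! Remote form posting attempt!
    }
}
$token = md5(uniqid(rand(), true));
$_SESSION['token'] = $token;

function cleanHex($input) {
    // Remove a literal backslash, x or X, and the hex digits that follow
    $clean = preg_replace('!\\\\[xX][A-Fa-f0-9]{1,3}!', '', $input);
    return $clean;
}
?>
<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="post">
<p><label for="name">Name</label>
<input type="text" name="name" id="name" size="20" maxlength="40" /></p>
<input type="hidden" name="token" value="<?php echo $token; ?>" />
<p><input type="submit" name="submit" value="go" /></p>
</form>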
This technique works because session data in PHP cannot be migrated between servers. Even if someone obtains your PHP source code, moves it to their own server, and submits information to your server, all your server receives is an empty or malformed session token and the original form token. Because they do not match, the remote form submission fails.
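On PHP 7 and later, a token can also be drawn from a cryptographically secure source and compared in constant time. The following is a minimal sketch of that variation. The function names makeToken() and tokenMatches() are hypothetical, and random_bytes() and hash_equals() are substituted here for the md5(uniqid(rand(), true)) approach used in Listing 18.

```php
<?php
// Hypothetical helper: create a 32-character hexadecimal token.
function makeToken()
{
    return bin2hex(random_bytes(16));
}

// Hypothetical helper: compare the session token with the posted token.
// Missing, empty, or non-string values never match.
function tokenMatches($sessionToken, $postedToken)
{
    if (!is_string($sessionToken) || !is_string($postedToken) || $sessionToken === '') {
        return false;
    }
    // hash_equals() compares in constant time to avoid timing leaks.
    return hash_equals($sessionToken, $postedToken);
}

$token   = makeToken();
$same    = tokenMatches($token, $token);  // form token equals session token
$missing = tokenMatches($token, null);    // remote copy without a token
$forged  = tokenMatches($token, 'abc');   // wrong token
```

In a form handler, $sessionToken would come from $_SESSION['token'] and $postedToken from $_POST['token'], exactly as in the listing above.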