1.3. Methods
As with the principles in the previous section, there are many methods you can apply when developing secure applications. The ones described below are those I consider most important.
Some of these methods are abstract, but each comes with examples that illustrate how to apply it and what it is for.
1.3.1. Balancing Risk and Usability
User-friendliness and security measures are often in conflict: as security improves, usability usually declines. When you write code, you must keep the legitimate users of your logic in mind. Striking the right balance is genuinely difficult, but you have to do it well, and no one can do it for you because it is your software.
Try to make security measures transparent to users so that they do not notice them. Where that is not possible, use an approach that is common and familiar. For example, asking users to enter a user name and password before they access restricted information or services is something they already expect.
When you suspect foul play, keep in mind that you may be mistaken. For example, if the system comes to doubt a user's identity in the middle of a session, the usual response is to ask the user to enter the password again. This is a minor inconvenience for legitimate users but a substantial obstacle for attackers. Technically, it is almost the same as asking the user to log in again, but the user experience is vastly different.
There is no need to throw users out of the system and accuse them of being attackers. Mistakes are inevitable, and when you are wrong, such heavy-handed responses greatly reduce the usability of the system.
In this book, I focus on security measures that are transparent or familiar, and I suggest that you respond to suspected attacks carefully and sensibly.
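The re-prompting approach described in this section can be sketched as follows. This is only an illustration: the function name nextStepForRequest(), the session key user_agent, and the choice of a changed User-Agent header as the trigger for suspicion are all assumptions made for the example, not something the text prescribes.

```php
<?php
// Hypothetical sketch: decide how to respond when a request looks suspicious.
// Assumption: a User-Agent header that differs from the one recorded at
// login is treated as a reason to doubt the user's identity.
function nextStepForRequest(array $session, string $currentUserAgent): string
{
    if (!isset($session['user_agent'])) {
        // No login recorded yet: the ordinary login form applies.
        return 'login';
    }

    if ($session['user_agent'] !== $currentUserAgent) {
        // Suspicious, but we may be wrong: ask for the password again
        // instead of ending the session or accusing the user.
        return 'reprompt_password';
    }

    return 'continue';
}
```

A legitimate user who triggers the check only has to retype a password, while the session itself is left intact.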
1.3.2. Tracking Data
As a security-conscious developer, the most important thing you can do is keep track of data at all times. You need to know not only what it is and where it is, but also where it came from and where it is going. This can be difficult, especially if you do not have a solid understanding of how the Web works. This is why developers who are new to the Web often make mistakes and create security vulnerabilities, even when they are experienced in other development environments.
When reading email, most people are not fooled by spam with a subject such as "Re: Hello", because they know that a subject line that looks like a reply can be forged. Such a message is therefore not necessarily a reply to an earlier message with the subject "Hello". In short, people know not to place much trust in the subject line. Far fewer people realize that the sender address can also be forged, and they mistakenly assume that it reliably identifies the source of the message.
The Web is very similar. One thing I want to teach you is how to distinguish between trusted and untrusted data. This is often difficult, and blind guessing is not a solution.
PHP uses superglobal arrays such as $_GET, $_POST, and $_COOKIE to clearly identify where user data comes from. A strict naming convention ensures that the origin of all data is known in every part of your program, which is something I demonstrate and emphasize throughout.
Knowing where data enters your program is extremely important, and so is knowing where it leaves. For example, when you use echo, you are sending data to the client; when you use mysql_query(), you are sending data to the MySQL database (even though your intent may be to retrieve data).
When I audit PHP code for security vulnerabilities, I concentrate on the code that interacts with external systems. This is the code most likely to contain vulnerabilities, so it deserves special attention during both development and code review.
1.3.3. Filtering Input
Filtering is the foundation of web application security. It is the process of verifying that data is valid. By filtering all data on input, you can prevent tainted (unfiltered) data from being mistakenly trusted or misused in your program. Most vulnerabilities in popular PHP applications are caused by a failure to filter input properly.
By filtering input, I mean three distinct steps:
- Identify the input
- Filter the input
- Distinguish between filtered and tainted data
Identifying input comes first because you cannot filter it properly if you do not know what it is. Input is any data that originates externally. For example, everything sent by the client is input, but the client is not the only external data source. Others, such as databases and RSS feeds, are external data sources as well.
Data entered by users is very easy to identify: PHP stores it in the two superglobal arrays $_GET and $_POST. Other input is much harder to identify. For example, many elements of the $_SERVER array can be manipulated by the client, and it is often difficult to tell which ones constitute input. The best approach is therefore to treat the entire array as input.
In some cases, what counts as input depends on your point of view. For example, if session data is stored on the server, you may not regard it as coming from an external data source. If you take this view, you are treating the session store as part of your software, and it is wise to recognize that the security of your software is then tied to the security of the session store. The same reasoning extends to the database, which you can also regard as part of your software.
In general, it is safer to treat both the session store and the database as input, and this is the approach I recommend for every important PHP application.
Once input has been identified, you can filter it. Filtering is a somewhat formal term with many everyday synonyms, such as validating, sanitizing, and cleaning. Although these terms differ slightly, they all refer to the same process: preventing invalid data from entering your application.
There are many ways to filter data, and some are more secure than others. The best approach is to treat filtering as an inspection process. Do not try to be helpful by correcting invalid data so that it conforms to your rules; make your users follow the rules instead. History shows that attempts to correct invalid data often result in security vulnerabilities. For example, consider the following attempt to prevent directory traversal (access to parent directories):
Code:
<?php
$filename = str_replace('..', '.', $_POST['filename']);
?>
Can you think of a value for $_POST['filename'] that makes $filename the path to the user password file on Linux, ../../etc/passwd?
The answer is simple:
.../.../etc/passwd
This particular flaw can be addressed by repeating the replacement until no match is found:
Code:
<?php
$filename = $_POST['filename'];
// Test $filename, not $_POST['filename'], so that the loop can terminate.
while (strpos($filename, '..') !== false)
{
    $filename = str_replace('..', '.', $filename);
}
?>
Of course, the basename() function can replace all of the logic above and achieves the goal more safely. The important point, however, is that any attempt to correct invalid data can contain a mistake that lets invalid data through. Inspecting only is the safer choice.
Annotation: I know this from experience. In a real project, I had to modify a user registration and login system. The customer wanted leading and trailing spaces in the user name to be ignored at login, so the login code was changed to strip them with trim() (a typical case of well-meant correction). Registration, however, still allowed user names with leading and trailing spaces! You can imagine the result.
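Returning to the file name example, the difference between correcting and inspecting can be sketched as follows. The function name checkFilename() and the set of permitted characters are assumptions made for this illustration; a real application would choose its own rules.

```php
<?php
// "Correcting" the input produces the very sequence it was meant to remove.
$corrected = str_replace('..', '.', '.../.../etc/passwd');
// $corrected is now '../../etc/passwd'

// Hypothetical check-only filter: return the value unchanged or reject it.
// Assumption: valid names contain only letters, digits, '_', '-', and '.',
// with no directory components and no '..' sequence.
function checkFilename(string $input): ?string
{
    if ($input !== basename($input)) {
        return null; // contains a directory component
    }
    if (strpos($input, '..') !== false) {
        return null; // reject rather than repair
    }
    if (preg_match('/\A[A-Za-z0-9_.-]+\z/', $input) !== 1) {
        return null; // empty, or contains a character outside the whitelist
    }
    return $input;
}
```

Because the function never modifies its input, a mistake in its rules can only cause a valid name to be rejected, not an invalid one to be rewritten into something dangerous.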
In addition to treating filtering as an inspection process, use a whitelist approach whenever possible. This means assuming that the data you are inspecting is invalid unless you can prove that it is valid. In other words, you err on the side of caution. With this approach, a mistake only causes you to treat valid data as invalid. Although you never want to make a mistake, this is far safer than treating invalid data as valid. By limiting the damage a mistake can cause, you improve the security of your applications. The idea may seem obvious in theory, but history has shown it to be a very valuable practice.
If you can correctly and reliably identify and filter input, your work is almost done. The last step is to use a naming convention or some other means of correctly and reliably distinguishing filtered data from tainted data. I recommend a simple naming convention because it works in both procedural and object-oriented programming. The convention I use is to store all filtered data in an array named $clean. Two important steps keep tainted data from being injected into it:
- Always initialize $clean as an empty array.
- Add checks that prevent variables from external data sources from being named clean.
Strictly speaking, only the initialization is crucial, but it is a good habit to regard any variable named clean as one thing only: your array of filtered data. These steps give reasonable assurance that $clean contains only the data you deliberately stored there, which leaves you with the single responsibility of never storing tainted data in $clean.
To reinforce these concepts, consider the following form, which lets users choose one of three colors:
Code:
<form action="process.php" method="POST">
Please select a color:
<select name="color">
    <option value="red">red</option>
    <option value="green">green</option>
    <option value="blue">blue</option>
</select>
<input type="submit" />
</form>
In the code that processes this form, it is easy to make the mistake of assuming that only one of the three choices can be submitted. As you will learn in Chapter 2, the client can submit any data as the value of $_POST['color']. To filter this data properly, you can use a switch statement:
Code:
<?php
$clean = array();
switch ($_POST['color'])
{
    case 'red':
    case 'green':
    case 'blue':
        $clean['color'] = $_POST['color'];
        break;
}
?>
In this example, $clean is initialized as an empty array so that it cannot already contain tainted data. Once $_POST['color'] has been shown to be one of red, green, or blue, it is stored in $clean['color']. You can therefore use $clean['color'] elsewhere in your code with confidence that it is valid. Of course, you could add a default case to the switch statement to handle invalid data. One possibility is to display the form again with an error message. Just take care not to output tainted data in an attempt to be friendly.
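The same whitelist check, together with handling for invalid data, might be sketched as follows. The function name filterColor(), the $input parameter (standing in for $_POST so the logic can be exercised directly), and the wording of the error message are assumptions made for this illustration.

```php
<?php
// Hypothetical helper: returns array($clean, $errors).
function filterColor(array $input): array
{
    $clean   = array(); // always start with an empty array
    $errors  = array();
    $allowed = array('red', 'green', 'blue');

    // Strict comparison (third argument true) avoids loose type matches.
    if (isset($input['color']) && in_array($input['color'], $allowed, true)) {
        $clean['color'] = $input['color'];
    } else {
        // Plays the role of a default case: report the problem without
        // echoing the rejected value back to the user.
        $errors[] = 'Please select one of the listed colors.';
    }

    return array($clean, $errors);
}
```

A caller could redisplay the form whenever $errors is not empty and otherwise continue with $clean['color'].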
This approach works well for filtering data against a known set of valid values, but it does not help when data must be filtered against a known set of valid characters. For example, you may require user names to consist only of letters and numbers:
Code:
<?php
$clean = array();
if (ctype_alnum($_POST['username']))
{
    $clean['username'] = $_POST['username'];
}
?>
Although a regular expression could be used here, PHP's built-in functions are preferable. They are far less likely to contain errors than code you write yourself, and an error in your filtering logic almost always means a security vulnerability.
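A slightly more defensive version of the user name check might look like the sketch below. The function name filterUsername(), the $input parameter (standing in for $_POST), and the 32-character limit are assumptions made for this illustration.

```php
<?php
// Hypothetical helper: returns a $clean array that contains 'username'
// only if the submitted value passes every check.
function filterUsername(array $input): array
{
    $clean = array();

    if (isset($input['username'])
        && is_string($input['username'])      // a client can submit an array instead
        && strlen($input['username']) <= 32   // assumed maximum length
        && ctype_alnum($input['username'])) { // false for '' and for any other character
        $clean['username'] = $input['username'];
    }

    return $clean;
}
```

The value is either copied unchanged into $clean or left out entirely; nothing is trimmed or corrected along the way.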
1.3.4. Escaping Output
The other foundation of web application security is escaping output, that is, encoding special characters so that their original meaning is preserved. For example, O'Reilly must be represented as O\'Reilly before it is sent to a MySQL database. The backslash before the single quotation mark indicates that the quotation mark is part of the data rather than a character with special meaning.
By escaping output, I mean three distinct steps:
- Identify the output
- Escape the output
- Distinguish between escaped and unescaped data
Only filtered data should be escaped. Although escaping prevents many common security vulnerabilities, it is not a substitute for input filtering. Tainted data must first be filtered and then escaped.
To escape output, you must first identify it. This is generally much easier than identifying input because it depends on actions that you take. For example, to identify output sent to the client, you can search your code for statements such as the following:
echo
print
printf
<?=
As the developer of an application, you should know everywhere that data is sent to an external system. All of it constitutes output.
Like filtering, escaping varies with the situation. Just as filtering differs depending on the type of data being handled, escaping differs depending on the system to which you are sending the data.
For the most common output destinations (the client, databases, and URLs), PHP provides built-in escaping functions. If you must write your own algorithm, it is essential that it be foolproof. You need a reliable and complete list of the characters that are special to the external system, together with the way each is represented so that the data is preserved rather than interpreted.
The most common output destination is the client, and the best way to escape data before sending it is htmlentities(). Like other string functions, it takes a string as input, processes it, and returns the result. The best way to use htmlentities(), however, is to specify its two optional arguments: the quote style (second argument) and the character set (third argument). The quote style should be ENT_QUOTES so that both single and double quotation marks are escaped, which is the most thorough option. The character set must match the character set used by the page.
I recommend defining a naming convention to identify data that has been escaped. For data escaped for output to the client, I use an array named $html. It is first initialized as an empty array and then holds only data that has been both filtered and escaped.
Code:
<?php
$html = array();
$html['username'] = htmlentities($clean['username'], ENT_QUOTES, 'UTF-8');
echo "<p>Welcome back, {$html['username']}.</p>";
?>
Tips
The htmlspecialchars() function is nearly identical to htmlentities(). They accept exactly the same arguments, but htmlentities() escapes more thoroughly.
By sending the user name to the client through $html['username'], you can be sure that the browser will not misinterpret any special characters. If the user name contains only letters and numbers, escaping is not strictly necessary, but doing it anyway reflects the principle of defense in depth. Escaping all output is a good habit that can dramatically improve the security of your software.
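To make the effect of escaping concrete, the sketch below applies htmlentities() and htmlspecialchars() to two sample values. The array key comment and both sample strings are invented for this illustration; the first value is assumed to have already passed a filter that permits free-form text.

```php
<?php
$clean = array();
$clean['comment'] = "Tom & Jerry's <b>best</b>"; // assumed to be filtered already

$html = array();
$html['comment'] = htmlentities($clean['comment'], ENT_QUOTES, 'UTF-8');
// Markup characters and the single quotation mark are now entities:
// Tom &amp; Jerry&#039;s &lt;b&gt;best&lt;/b&gt;

// htmlspecialchars() converts only the characters with special meaning in
// HTML, while htmlentities() also converts characters that have named
// entities, such as the accented e below.
$word     = "caf\u{e9}";
$special  = htmlspecialchars($word, ENT_QUOTES, 'UTF-8'); // unchanged
$entities = htmlentities($word, ENT_QUOTES, 'UTF-8');     // caf&eacute;
```

In both cases the browser displays the original characters instead of interpreting them as markup.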
Another common output destination is the database. Whenever possible, escape data used in SQL statements with the escaping function that PHP provides for your database. For MySQL users, the best escaping function is mysql_real_escape_string(). If PHP has no built-in escaping function for the database you are using, addslashes() is the last resort.
The following example shows the proper escaping technique for a MySQL database:
Code:
<?php
$mysql = array();
$mysql['username'] = mysql_real_escape_string($clean['username']);
$sql = "SELECT *
        FROM   profile
        WHERE  username = '{$mysql['username']}'";
$result = mysql_query($sql);
?>
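The mysql_* functions used above were removed in PHP 7.0. On current PHP versions, a comparable query might be written with PDO, as in the sketch below. Note that this swaps in a different technique: a prepared statement with a bound parameter rather than manual escaping. The function name findProfile() is hypothetical; the table and column names follow the example above.

```php
<?php
// Hypothetical helper: look up a profile row by an already filtered user name.
// The value is bound as a parameter rather than interpolated into the SQL
// string by hand, so quoting is left to the database driver.
function findProfile(PDO $db, string $cleanUsername): ?array
{
    $statement = $db->prepare('SELECT * FROM profile WHERE username = :username');
    $statement->execute(array(':username' => $cleanUsername));

    $row = $statement->fetch(PDO::FETCH_ASSOC);

    // fetch() returns false when no row matches.
    return $row === false ? null : $row;
}
```

The input should still be filtered first; binding parameters takes over only the escaping step.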