At the beginning of the holiday, I took the time to read "White Hat Speaking Web Security". In it, Wu Hanqing covers basically every Web security problem you are likely to encounter and sums up the solutions very clearly; the book is also the cornerstone of the code security work I did this time.
I would like to share my experience in the following points.
Grasp the structure of the whole site and avoid leaking sensitive directories
When I first started writing code, I did what many old codebases do: I put index.php, register.php and login.php in the root directory, and when a user clicked the registration link the browser jumped to http://localhost/register.php. I had no real structure in mind. The biggest problem with a layout like this is not security but how hard the code is to extend and migrate.
While writing code we often have to modify it, and if the code has no unified entry point we may have to change it in many places. Later I read some of the emlog code and found that the site's real front-end code lives in the template directory, while the root directory holds only the entry point file and the configuration file. With that insight, I reworked the structure of the entire site.
I put a single entry point file in the site root and let it manage every page of the site. The registration page now becomes http://localhost/?act=register; any page is just a value of the act argument, and after reading this argument the entry point uses a switch to choose which file to include. The entry point file can also hold some constant definitions, such as the absolute path of the site, the address of the site, and the database user name and password. In later scripts we use absolute paths instead of relative paths wherever possible (otherwise the code has to change whenever a script is moved), and those absolute paths are built from the definitions in the entry point file.
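A minimal sketch of such an entry point might look like the following. The page names and the template/ paths are only assumptions for illustration, not the layout of my site or of emlog; the point is that the act argument is matched against a fixed list and never becomes part of an include path itself.

```php
<?php
// index.php: the single entry point (a sketch; page names and paths are hypothetical).

// Absolute path of the site root; other scripts can build their paths from it.
define('WWW_ROOT', __DIR__);

/**
 * Map the "act" argument to a template file. Unknown values fall back to
 * the home page, so user input is never used as a file name.
 */
function resolve_page($act)
{
    switch ($act) {
        case 'register':
            return WWW_ROOT . '/template/register.php';
        case 'login':
            return WWW_ROOT . '/template/login.php';
        default:
            return WWW_ROOT . '/template/index.php';
    }
}

$act  = isset($_GET['act']) ? $_GET['act'] : '';
$page = resolve_page($act);
if (is_file($page)) {
    include $page;
}
```

Adding a page then means adding one case to the switch, and moving the template directory means changing one place.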
Of course, on the security side, an entry point file can also hide the back-end address. An address like http://localhost/?act=xxx does not expose the real path of the back end, and you can even change it frequently without changing much code. An entry point file can also verify the identity of the visitor: for a site's admin back end, for example, nobody who is not an administrator should be allowed to view any page. You can check the identity in the entry point file and output a 404 page if the visitor is not logged in.
With the entry point file in place, I added these lines to the top of every non-entry-point file:
<?php
// WWW_ROOT is only defined by the entry point file
if (!defined('WWW_ROOT'))
{
    header("HTTP/1.1 404 Not Found");
    exit;
}
?>
WWW_ROOT is a constant that I define in the entry point file. If a user requests the page by its direct path (http://localhost/register.php), I output a 404 error; only access through the entry point (http://localhost/?act=register) gets to execute the code that follows.
Use prepared statements to avoid SQL injection
Injection used to be a big problem, but in recent years people have paid more attention to it, so things have slowly become a lot better.
Wu Hanqing puts it very well in the white hat book: many vulnerabilities, such as SQL injection or XSS, come from failing to separate "data" from "code". "Code" is what the programmer writes, and "data" is what the user can change. Suppose we write the SQL statement select * from admin where username='admin' and password='xxxxx'. Here admin and xxxxx are data, the user name and password entered by the user. If they are not handled in any way, the user's input may become "code", such as ' or ''=', and that creates a vulnerability. The user must never be able to touch the "code".
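To make the problem concrete, here is a small sketch of what happens when the statement is built by string concatenation. The function name build_login_query is made up for this illustration, and the code only prints the resulting SQL; it never touches a database.

```php
<?php
// Illustration only: never build queries this way.
function build_login_query($username, $password)
{
    return "select * from admin where username='" . $username
         . "' and password='" . $password . "'";
}

// Normal input: the password stays inside the quotes as data.
echo build_login_query('admin', 'xxxxx'), "\n";

// Malicious input: the quote closes the string early, and the rest
// of the input is parsed by the database as SQL.
echo build_login_query('admin', "' or ''='"), "\n";
```

The second statement ends in password='' or ''='', and since the added condition ''='' is always true, the WHERE clause matches regardless of the real password.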
For MySQL databases PHP has two modules, mysql and mysqli; mysqli means "MySQL Improved", the improved version of the mysql module, and it introduces the concept of "precompilation" (prepared statements). Take the SQL statement above and change it to select * from admin where username=? and password=?. This is no longer a complete SQL statement, but mysqli's prepare feature can compile it into a stmt object. After the user submits the account name and password, bind the user-entered "data" to the positions of the two question marks with $stmt->bind_param. This way, the user's input can only ever be "data" and cannot become "code".
The two question marks fix both the positions of the data and the structure of the SQL statement. We can encapsulate all of our database operations in one class and prepare every SQL statement. As long as every user-supplied value goes through a bound parameter, this avoids SQL injection, and it is also the solution Wu Hanqing recommends most.
Here is part of the code that uses mysqli (the code that checks whether each function call succeeded or failed is omitted, but that does not mean it is unimportant):
<?php
// User-supplied data
$name = 'admin';
$pass = '123456';
// First, create a new mysqli object; the constructor parameters hold the database connection details.
$conn = new mysqli(DB_HOST, DB_USER, DB_PASS, DB_NAME, DB_PORT);
// Set the default character set of the connection
$conn->set_charset("utf8");
// Create an SQL statement that uses placeholders
$sql = 'SELECT user_id FROM admin WHERE username=? AND password=?';
// Prepare the statement to get a stmt object.
$stmt = $conn->prepare($sql);
/******************** Everything below can be reused without preparing again ********************/
// Call bind_param to bind the data. I left two ? placeholders, so two values are bound: the first argument gives the types of the bound data (s = string, i = integer), and the remaining arguments are the data to bind
$stmt->bind_param('ss', $name, $pass);
// Execute the statement
$stmt->execute();
// Call bind_result to bind the result (if you only check that the user and password exist, or run a DML statement, there is no result to bind)
// The results are the fields I selected; bind as many variables as there are fields
$stmt->bind_result($user_id);
// Fetch the result
if ($stmt->fetch()) {
    echo 'login succeeded';
    // Be sure to free the result resource, otherwise later statements will raise an error
    $stmt->free_result();
    return $user_id; // Return the value that was just selected
} else {
    echo 'login failed';
}
?>
Prevent XSS, and do not use cookies if you do not need them
I don't use cookies on my site, and because I keep permissions locked down tightly, the risk from XSS is relatively small.
Defending against XSS again comes down to handling the relationship between "code" and "data". Here, of course, the code is JavaScript or HTML. Anything the user can control must be passed through a function such as htmlspecialchars before it is output, and JavaScript that writes content into the page must do so carefully.
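As a rough sketch of what this looks like in practice, the two helpers below escape a value for HTML output and for an inline script. The helper names escape_html and js_value are my own for this example; htmlspecialchars and json_encode with the JSON_HEX_* flags are standard PHP.

```php
<?php
// Escape a value before writing it into HTML.
function escape_html($value)
{
    // ENT_QUOTES also escapes single quotes, which matters inside attributes.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Encode a value before writing it into an inline <script> block.
function js_value($value)
{
    // The JSON_HEX_* flags replace <, >, &, ' and " with \uXXXX escapes,
    // so the value cannot close the script tag or the string literal.
    return json_encode($value, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
}

$comment = '<script>alert(1)</script>';
echo '<p>', escape_html($comment), '</p>', "\n";
echo '<script>var comment = ', js_value($comment), ';</script>', "\n";
```

In both cases the user's text reaches the browser as data: the paragraph shows the tag as literal text, and the script receives it as an ordinary string.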
Restrict user permissions and prevent CSRF
Nowadays script vulnerabilities are more often about unauthorized actions: many important operations are performed with a GET request, or with a POST request that never verifies that the person performing it knows about it.
CSRF may be unfamiliar to many readers, but a small example is enough to explain it:
A and B are both users of a forum that lets users "praise" an article. When a user clicks "praise", the browser actually visits this page: http://localhost/?act=support&articleid=12. If B sends this URL to A and A opens it without knowing what it does, A has praised the article with articleid=12 once.
So the forum changes its approach and praises an article by POST instead.
<form action="http://localhost/?act=support" method="POST">
<input type="hidden" value="12" name="articleid">
<input type="submit" value="Praise">
</form>
You can see a hidden input holding the ID of the article, so a praise can no longer be triggered just by visiting a URL. But B can build a "very tempting" page in which a button is backed by this form, and lure A into clicking it. One click from A, and the article is praised anyway.
Finally, the forum has to add a verification code to the form. Only after A enters the verification code can the praise go through. That puts an end to B's hopes for good.
But have you ever seen a forum that makes you enter a verification code just to click "praise"?
So the best way, which Wu Hanqing also recommends in the white hat book, is to add a random string, a token, to the form (generated by PHP and saved in the session). The praise is accepted only if the random string the user submits matches the one saved in the session.
Because B does not know A's random string, B cannot perform the operation on A's behalf.
I have used tokens in many places on my site, for both GET and POST requests, and I estimate they stop 99% of CSRF attacks.
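The token check can be sketched roughly as follows. This is only an illustration, not the code from my site: the function names and the csrf_token session key are made up, and it assumes PHP 7 or later for random_bytes (hash_equals needs PHP 5.6). The session array is passed in as a parameter so the functions are easy to try out; in a real page you would call session_start() and pass $_SESSION.

```php
<?php
// Return the token for this session, creating it on first use.
function csrf_token(array &$session)
{
    if (empty($session['csrf_token'])) {
        // 16 random bytes, written as 32 hex characters.
        $session['csrf_token'] = bin2hex(random_bytes(16));
    }
    return $session['csrf_token'];
}

// Check a submitted token against the one stored in the session.
function csrf_check(array $session, $submitted)
{
    return isset($session['csrf_token'])
        && is_string($submitted)
        // hash_equals compares in constant time.
        && hash_equals($session['csrf_token'], $submitted);
}

// In the form:    <input type="hidden" name="token" value="(output of csrf_token)">
// In the handler: refuse the praise unless csrf_check($_SESSION, $_POST['token']) is true.
```

B's page can still submit the form from A's browser, but it has no way to read the token stored in A's session, so the check fails.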
Strictly control the types of uploaded files
An upload vulnerability is a fatal one: wherever an arbitrary file upload vulnerability exists, an attacker can execute arbitrary code and get a webshell.
For the upload part I wrote a PHP class that uses whitelist validation to stop users from uploading malicious files. On the client side I first check the type of the selected file with JavaScript, but that is only a friendly reminder to the user; the final validation still happens on the server.
A whitelist is necessary. If you only allow images, set it to array('jpg', 'gif', 'png', 'bmp'); when a file arrives from the user, take the suffix of its file name and use in_array to check whether it is in the whitelist.
The array describing an uploaded file contains a MIME type that tells the server what kind of file was uploaded, but it is unreliable because the client can modify it. Many sites with upload vulnerabilities validate only the MIME type and never check the suffix of the file name, which allows arbitrary files to be uploaded.
So in the class we can ignore the MIME type completely and look only at the suffix of the file name: if it is in the whitelist, the upload is allowed.
Of course, server parsing vulnerabilities are also the breakthrough point for many upload attacks, so we rename each uploaded file to "date + random number + suffix from the whitelist" to avoid arbitrary code execution caused by parsing vulnerabilities.
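The whitelist check and the renaming can be combined in one small function. This is a sketch rather than the class I actually wrote; the name safe_upload_name and the exact format of the generated name are assumptions for the example.

```php
<?php
/**
 * Return a new, safe file name for an upload, or null if the suffix of the
 * original name is not in the whitelist.
 */
function safe_upload_name($originalName, array $allowed = array('jpg', 'gif', 'png', 'bmp'))
{
    // Only the last suffix counts; compare it in lower case.
    $ext = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
    if (!in_array($ext, $allowed, true)) {
        return null;
    }
    // date + random number + suffix from the whitelist;
    // nothing from the original name survives.
    return date('YmdHis') . mt_rand(100000, 999999) . '.' . $ext;
}

// Typical use with a form field named "file" (hypothetical):
// $name = safe_upload_name($_FILES['file']['name']);
// if ($name !== null) {
//     move_uploaded_file($_FILES['file']['tmp_name'], $uploadDir . '/' . $name);
// }
```

A name such as shell.php.jpg passes the whitelist, but it is stored under a name like 20240101120000123456.jpg, so the .php part that a parsing vulnerability might act on is gone.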
Obfuscate JavaScript code to raise the attack threshold
Many XSS vulnerabilities are discovered by hackers reading JavaScript code. If we obfuscate and encrypt all of the JavaScript code, so that it stays confusing even after it has been decrypted (for example, by replacing every variable name with its MD5 hash value), the code becomes harder to read.
Store important information in the database with a more advanced hash algorithm
Now that disk capacity has grown so much, many people own large rainbow tables, and with sites like CMD5 around, plain MD5 is as good as nothing. We urgently need a more advanced hash algorithm for the passwords in our database.
So salted MD5 appeared; Discuz passwords, for example, are salted. A salt is really just an "add-on" to the password: if the password is 123456 and the salt we set is ABC, what gets saved to the database might be md5('123456ABC'), which makes cracking harder.
But as long as hackers know the user's salt, they can still brute-force the MD5. Computers are now very fast, able to calculate a billion MD5 values per second, so weaker passwords can be cracked in minutes.
So cryptography later improved on the hash by introducing a concept: key stretching. Put simply, it raises the cost of calculating the hash (for example, by running the password through md5() in a loop 1000 times) to deliberately slow the calculation down. Where a billion hashes could be calculated per second before, only a million per second can be calculated after the improvement; the speed is 1000 times slower, so the time needed for cracking is also 1000 times longer.
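As one way to try key stretching without writing the loop by hand, here is a sketch that uses PHP's built-in hash_pbkdf2 (available since PHP 5.5) instead of the repeated md5() described above. PBKDF2 is a swapped-in standard technique, not the method used by Discuz or emlog, and the function name stretched_hash and the iteration count are only assumptions for the example.

```php
<?php
// Derive a slow, salted hash of a password.
function stretched_hash($password, $salt, $iterations = 1000)
{
    // PBKDF2 repeats an HMAC-SHA256 computation $iterations times;
    // the last argument asks for 64 hex characters of output.
    return hash_pbkdf2('sha256', $password, $salt, $iterations, 64);
}

echo stretched_hash('123456', 'ABC'), "\n";
```

Raising $iterations makes every guess proportionally more expensive for an attacker, at the cost of a slightly slower login.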
So how do we get a secure hash algorithm working for us? Read the emlog source code: in the include directory you can find a hashpaaword.php file. It is actually a class, and emlog uses it to calculate password hashes.
This class has one notable feature: the hash value it calculates is different every time, so hackers cannot crack the password with rainbow tables and the like; the only way to check whether a password entered by the user is correct is the class's CheckPassword method. The class also deliberately increases the time needed to compute a hash, so hackers can hardly crack any hash values they obtain.
In the latest PHP 5.5, this kind of hashing has become an official function, password_hash, which we can use from now on to hash our passwords.
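A short sketch of the built-in functions, password_hash and password_verify, both part of PHP since 5.5. The password values are just sample data.

```php
<?php
// Hash the same password twice; each call uses a fresh random salt.
$hash  = password_hash('123456', PASSWORD_DEFAULT);
$again = password_hash('123456', PASSWORD_DEFAULT);

var_dump($hash === $again);                  // bool(false): the two hashes differ
var_dump(password_verify('123456', $hash));  // bool(true): correct password
var_dump(password_verify('654321', $hash));  // bool(false): wrong password
```

The salt and the cost are stored inside the hash string itself, so only that one string has to be saved in the database, and password_verify is the only way to check a login attempt against it.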
Verification code security
This is a point I just thought of, so I am adding it here.
A verification code is usually a random string generated by a PHP script and turned into a picture with the GD library. The real verification code string is saved in the session, and the generated picture is shown to the user. When the user fills in the verification code and submits it, the server compares it with the code in the session.
This reminds me of a mistake I made before. I did not clear the session after the verification code had been checked, whether the answer was right or wrong. That creates a problem: once a user has submitted a correct verification code, they can simply stop requesting the script that generates verification codes. The code in the session is then neither updated nor deleted, so the same verification code can be reused and it no longer verifies anything.
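The fix can be sketched as a check that always deletes the stored code. The function name check_captcha and the captcha session key are made up for this example, and the session array is passed in as a parameter so the function can be tried without a real session; in a page you would pass $_SESSION.

```php
<?php
// Compare the submitted verification code with the stored one, exactly once.
function check_captcha(array &$session, $submitted)
{
    $expected = isset($session['captcha']) ? $session['captcha'] : null;
    // Delete the stored code whether the answer is right or wrong,
    // so every attempt needs a freshly generated picture.
    unset($session['captcha']);

    return is_string($expected) && $expected !== ''
        && is_string($submitted)
        // Case-insensitive comparison, as most verification codes allow.
        && strcasecmp($expected, $submitted) === 0;
}
```

After one call the session holds no code at all, so replaying the same form without requesting a new picture always fails.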
As for verification codes being recognized by machines: I often learn from programs such as WordPress and emlog, but I cannot say much for the verification codes they use. Many spam comments are posted after the code has been recognized by a machine, so I later switched to a more complex verification code, which is said to be recommended by the consortium.
If you need it, you can download it here: http://www.jb51.net/codes/191862.html
Well, that is everything I can think of and have actually used in practice. These are just some code security insights I have accumulated while writing my own code; if you have better ideas, please get in touch with me. I hope you will be able to write more secure code.