I have spent the past while writing an entire website, and it was basically finished a few days ago, so I took some time to write up a summary of PHP security. It doesn't matter if the technical level is not high; I hope it can give a little guidance to friends who are preparing to write a website. At the beginning of the holiday, I took the time to read Wu Hanqing's white-hat book on web security. It summarizes basically all the problems you can run into in web security along with their solutions, and it is the cornerstone of the code security of this whole site.
I hope to share my experience in the following aspects:
Grasp the entire site structure to avoid leaking sensitive site directories
When I first started writing code, like many old codebases I put index.php, register.php, and login.php in the root directory, so clicking through to the registration page jumped to http://localhost/register.php. There was not much thought about structure. The biggest problem with a layout like this is not security but extending and porting the code.
While writing code we often need to modify it. If the code has no unified entry point, we may have to change many places. Later I read a little of the emlog code and found that the real front-end code of the website lives in the template directory, while the root directory contains only the entry point file and the configuration file. So I reworked the entire website structure.
I placed a single entry point file in the website root and let it manage all the pages of the website. The registration page now becomes http://localhost/?act=register, and every page is just a value of the act parameter. After reading this parameter, a switch selects the file to include. The entry point file can also hold constant definitions, such as the absolute path of the website, the website address, and the database user name and password. From then on, scripts should use absolute paths rather than relative paths wherever possible (otherwise the code has to change whenever a script moves), and the absolute path comes from the definition in the entry point file.
Of course, in terms of security, an entry point file can also hide the back-end address. An address such as http://localhost/?act=xxx does not expose the real path of the back end, and it can be changed often without touching much code. An entry point file can also verify the identity of a visitor. For example, in a site's back end, nobody but an administrator should be able to view any page. You can verify identity in the entry point file and show a 404 page to anyone who is not logged in.
With the entry point file in place, I added this line at the top of every non-entry-point file:
<?php if(!defined('WWW_ROOT')) {header("HTTP/1.1 404 Not Found"); exit; } ?>
WWW_ROOT is a constant defined in the entry point. If a user requests the page directly by its own path (http://localhost/register.php), the constant is undefined and I output a 404 error. Only access through the entry point (http://localhost/?act=register) runs the code that follows.
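The entry point described in this section could be sketched roughly as follows. The page names in the switch, the template/ file layout, and the resolve_page helper are assumptions made for illustration; they are not taken from the site's real code.

```php
<?php
// index.php: the single entry point (illustrative sketch)

// Absolute path of the site; every other script checks that this constant exists.
if (!defined('WWW_ROOT')) {
    define('WWW_ROOT', __DIR__ . '/');
}

/**
 * Map the act parameter to a template file.
 * Only the values listed here can ever be included (hypothetical page names).
 */
function resolve_page($act)
{
    // $_GET values may be arrays (?act[]=x); reject anything that is not a string.
    if (!is_string($act)) {
        return null;
    }

    switch ($act) {
        case 'register':
            return 'template/register.php';
        case 'login':
            return 'template/login.php';
        case 'index':
            return 'template/index.php';
        default:
            return null; // unknown page
    }
}

// In the real entry point the dispatch would look something like this:
//
//   $page = resolve_page(isset($_GET['act']) ? $_GET['act'] : 'index');
//   if ($page === null) { header('HTTP/1.1 404 Not Found'); exit; }
//   require WWW_ROOT . $page;
```

Because the user-supplied value is only compared against fixed cases and never concatenated into a path, a request such as ?act=../../etc/passwd simply falls through to the 404 branch.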
Use prepared statements to avoid SQL injection
Injection used to be a big problem, but in recent years things have become much better because everyone takes it more seriously.
Wu Hanqing puts it very well in his white-hat book: many vulnerabilities, such as SQL injection or XSS, come down to failing to separate "data" from "code". "Code" is what the programmer writes, and "data" is what the user can change. Suppose we write the SQL statement select * from admin where username = 'admin' and password = 'xxxxx'. Here admin and xxxxx are data, the user name and password entered by the user. Without any processing, however, the user's input may turn into "code", for example ' or ''=', and that causes a vulnerability. Users must never be able to touch the "code".
In PHP there are two modules for MySQL databases: mysql and mysqli. mysqli stands for "MySQL Improved", an enhanced version of the mysql module, and it includes the concept of "precompilation" (prepared statements). Take the SQL statement above and change it to select * from admin where username = ? and password = ?. This is not yet a complete SQL statement, but mysqli's prepare function can compile it into a stmt object first. After the user enters the account name and password, stmt->bind_param binds the "data" entered by the user to the positions of the two question marks. This way the user's input can only ever be "data", never "code".
These two question marks fix both the location of the "data" and the structure of the SQL statement. We can encapsulate all our database operations in a class and prepare every SQL statement. As long as every user-supplied value goes through a placeholder, this avoids SQL injection, and it is also the solution Wu Hanqing recommends most.
Below is some code using mysqli (I have omitted all the code that checks whether each call succeeded or failed, which does not mean it is unimportant):
<?php
// User-input data
$name = 'admin';
$pass = '000000';

// Create a new mysqli object. The constructor parameters hold the database connection settings.
$conn = new mysqli(DB_HOST, DB_USER, DB_PASS, DB_NAME, DB_PORT);

// Set the default character encoding for the connection
$conn->set_charset("utf8");

// Create an SQL statement using placeholders (no quotes around ?, no trailing semicolon)
$sql = 'SELECT user_id FROM admin WHERE username = ? AND password = ?';

// Compile the statement to obtain a stmt object
$stmt = $conn->prepare($sql);

/********** Everything above can be reused; it does not need to be compiled again **********/

// Call bind_param to bind the data.
// Because I left two ?, two values must be bound. The first parameter gives the types
// of the bound data (s = string, i = integer); the remaining parameters are the data to bind.
$stmt->bind_param('ss', $name, $pass);

// Execute the statement
$stmt->execute();

// Call bind_result to bind the result (not needed if you only check whether the
// user and password exist, or if the statement is DML).
// The result is the fields I selected; bind one variable per selected field.
$stmt->bind_result($user_id);

// Fetch the result
if ($stmt->fetch()) {
    echo 'login successful';
    // Be sure to release the result resource, otherwise later statements will fail
    $stmt->free_result();
    return $user_id; // return the content just selected
} else {
    echo 'logon failed';
}
?>
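The idea of wrapping every database operation so that all SQL goes through prepare could look roughly like the sketch below. It uses two plain functions rather than a full class to stay short. The names bind_types and run_query are hypothetical helpers, not part of mysqli, and the sketch assumes PHP 7 or later.

```php
<?php
// Sketch of a tiny wrapper that forces every query through a prepared statement.

/**
 * Build the type string for bind_param from the values themselves:
 * i = integer, d = double, s = string (used for everything else).
 */
function bind_types(array $params): string
{
    $types = '';
    foreach ($params as $value) {
        if (is_int($value)) {
            $types .= 'i';
        } elseif (is_float($value)) {
            $types .= 'd';
        } else {
            $types .= 's';
        }
    }
    return $types;
}

/**
 * Prepare $sql, bind $params to its ? placeholders, execute, and return the statement.
 * Callers never concatenate user input into $sql.
 */
function run_query(mysqli $conn, string $sql, array $params = [])
{
    $stmt = $conn->prepare($sql);
    if ($stmt === false) {
        throw new RuntimeException('prepare failed: ' . $conn->error);
    }

    if ($params !== []) {
        // bind_param takes its values by reference, so unpack a real variable.
        $values = array_values($params);
        $stmt->bind_param(bind_types($values), ...$values);
    }

    if (!$stmt->execute()) {
        throw new RuntimeException('execute failed: ' . $stmt->error);
    }
    return $stmt;
}
```

With such a helper the login check becomes run_query($conn, 'SELECT user_id FROM admin WHERE username = ? AND password = ?', [$name, $pass]), and the error handling omitted from the example above lives in one place.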
Prevent XSS, and do not use cookies if you do not need them
My website does not use cookies, and because I am very strict about permission restrictions, the risk from XSS is relatively low.
Defending against XSS follows the same principle: handle the relationship between "code" and "data" properly. Here, of course, "code" means JavaScript or HTML code. For any content the user can control, we must process it with functions such as htmlspecialchars before output, and be especially careful when writing it into the page from JavaScript.
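As a minimal sketch of that rule, the two helpers below escape user-controlled values for an HTML context and for a JavaScript context. The function names html_escape and js_value are made up for this example; htmlspecialchars and json_encode are the real PHP functions doing the work.

```php
<?php
// Escape a user-controlled value before printing it inside HTML text or attributes.
function html_escape($value)
{
    // ENT_QUOTES also escapes single quotes; ENT_SUBSTITUTE replaces invalid UTF-8.
    return htmlspecialchars((string) $value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Encode a user-controlled value before printing it inside a <script> block.
function js_value($value)
{
    // The JSON_HEX_* flags turn <, >, &, ' and " into \uXXXX escapes,
    // so the value cannot close the script tag or break out of a string.
    return json_encode($value, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
}

// Typical use in a template:
//
//   echo '<p>' . html_escape($comment) . '</p>';
//   echo '<script>var userName = ' . js_value($name) . ';</script>';
```

The point of having two helpers is that the correct escaping depends on where the data lands: HTML escaping alone is not enough for a value written into JavaScript.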
Restrict user permissions to prevent CSRF attacks
These days, script vulnerabilities most often show up as unauthorized operations: many important operations are executed through GET or POST without verifying that the person performing them actually knows about it.
Many readers may be unfamiliar with CSRF, so here is a small example:
A and B are both users of a forum that lets users "like" an article. When a user clicks "like", the browser actually requests this page: http://localhost/?act=support&articleid=12. If B sends this URL to A and A opens it without realizing what it does, A has in effect liked the article with articleid=12.
So the forum switches to the POST method for liking an article.
The form carries the article ID in a hidden input box, so A can no longer be tricked into liking through a URL alone. However, B can build a "tempting" page whose button is actually this form and lure A into clicking it. A clicks, and the article gets liked anyway.
Finally, the forum has to add a verification code to the form. Only A can read and enter the verification code, so only A can like. That shuts B down completely.
But what forum makes you enter a verification code just to "like" something?
So the approach Wu Hanqing recommends as best in his book is to add a random string token to the form (generated by PHP and saved in the SESSION). Only if the random string submitted by the user matches the one saved in the SESSION is the like accepted.
Since B does not know A's random string, B cannot perform the operation on A's behalf.
I use tokens in many places on my website, for both GET and POST requests, and by my estimate they withstand 99% of CSRF attacks.
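A token check of this kind might be sketched as follows. The helper names csrf_token and csrf_check and the csrf_token session key are assumptions for illustration, and the sketch assumes PHP 7 or later for random_bytes. The session array is passed in explicitly so the functions can be tried without a running session; in a real page you would pass $_SESSION after session_start().

```php
<?php
// Return the token for this session, generating it on first use.
function csrf_token(array &$session): string
{
    if (empty($session['csrf_token'])) {
        // 32 random bytes, hex-encoded to a 64-character string
        $session['csrf_token'] = bin2hex(random_bytes(32));
    }
    return $session['csrf_token'];
}

// Compare the submitted token with the one stored in the session.
function csrf_check(array $session, $submitted): bool
{
    return is_string($submitted)
        && isset($session['csrf_token'])
        && is_string($session['csrf_token'])
        && hash_equals($session['csrf_token'], $submitted); // timing-safe comparison
}

// In the form:
//   <input type="hidden" name="token" value="<?php echo csrf_token($_SESSION); ?>">
//
// When handling the request:
//   if (!csrf_check($_SESSION, isset($_POST['token']) ? $_POST['token'] : null)) {
//       header('HTTP/1.1 403 Forbidden');
//       exit;
//   }
```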
Strictly control the types of uploaded files
An upload vulnerability is critical: an arbitrary file upload vulnerability can let an attacker execute arbitrary code and obtain a webshell.
For the upload part, I wrote a PHP class that uses whitelist verification to stop users from uploading malicious files. On the client side I check the type of the selected file with JavaScript, but that is only a friendly reminder to the user; the verification that counts still happens on the server.
The whitelist is essential. If you only allow image uploads, set it to array('jpg', 'gif', 'png', 'bmp'). After a user uploads a file, take the suffix of the file name and use in_array to check whether it is in the whitelist.
The array describing an uploaded file contains a MIME type that tells the server what kind of file it is, but this value is unreliable because the client can modify it. Many websites with upload vulnerabilities verify only the MIME type and never check the file name suffix, which allows arbitrary files to be uploaded.
Therefore the class can ignore the MIME type entirely and look only at the file name suffix. If the suffix is in the whitelist, the upload is allowed.
Of course, server parsing vulnerabilities are also the way in for many upload attacks. So we rename every uploaded file, using "date and time + random number + whitelisted suffix", to avoid arbitrary code execution caused by parsing vulnerabilities.
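The whitelist check and the renaming rule can be combined in one small function. This is only a sketch of the idea, not the upload class mentioned above: the name safe_upload_name, the underscore separator, and the six-digit random number are assumptions, and random_int requires PHP 7 or later.

```php
<?php
/**
 * Return a safe new file name for an upload, or null if the suffix is not whitelisted.
 * The client-supplied MIME type is deliberately ignored.
 */
function safe_upload_name(string $originalName, array $whitelist = ['jpg', 'gif', 'png', 'bmp'])
{
    // Take the suffix after the last dot, lower-cased for comparison.
    $suffix = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));

    if (!in_array($suffix, $whitelist, true)) {
        return null;
    }

    // date and time + random number + whitelisted suffix
    return date('YmdHis') . '_' . random_int(100000, 999999) . '.' . $suffix;
}

// Typical use ('file' is a hypothetical form field name, $uploadDir a hypothetical directory):
//
//   $name = safe_upload_name($_FILES['file']['name']);
//   if ($name === null) { exit('file type not allowed'); }
//   move_uploaded_file($_FILES['file']['tmp_name'], $uploadDir . $name);
```

Because the stored name is built only from the date, a random number, and a suffix taken from the whitelist, nothing the user typed in the original file name survives on disk.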
Encrypt and obfuscate JavaScript code to raise the attack threshold
Hackers discover many XSS vulnerabilities by reading JavaScript code. If we obfuscate and encrypt all our JavaScript so that it stays messy even after decryption (for example, by replacing every variable name with its MD5 hash), reading it becomes much harder.
Use a more advanced hash algorithm to save important information in the database
Now that hard disks keep getting bigger, many people have large rainbow tables, and with lookup websites like iis5 so popular, md5 on its own is as good as nothing. We therefore urgently need stronger hash algorithms for storing passwords in our databases.
That is why salted md5 appeared; discuz, for example, salts its passwords. A salt is simply an extra value appended to A's password. For example, if A's password is 123456 and the salt we set is abc, the value saved to the database might be md5('123456abc'), which makes cracking harder.
However, once a hacker knows the user's salt, the md5 can still be brute-forced. Computers are already very fast: 1 billion md5 values can be calculated in one second, so a weaker password is cracked in no time.
Later, cryptography improved on hashing and introduced a concept: key stretching. Put simply, it makes the hash more expensive to compute (for example, by running the password through the md5() function 1000 times in a loop), deliberately slowing the calculation down. Where 1 billion hashes could be computed per second before, only 1 million can be computed per second afterwards, which is 1000 times slower, so the time needed to crack a password is multiplied by 1000.
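To make the looping idea concrete, here is a toy version of the "md5 1000 times" example. It only illustrates the concept of key stretching; the function name stretched_md5 and the exact way the salt is mixed in on each round are assumptions, and a hand-rolled loop like this is not what you should store real passwords with.

```php
<?php
/**
 * Toy key stretching: hash the salted password, then re-hash the result
 * until md5() has been applied $rounds times in total.
 */
function stretched_md5(string $password, string $salt, int $rounds = 1000): string
{
    $hash = md5($password . $salt);          // round 1: plain salted md5
    for ($i = 1; $i < $rounds; $i++) {
        $hash = md5($hash . $salt);          // rounds 2..$rounds
    }
    return $hash;
}
```

With $rounds set to 1 this is exactly the salted md5 from the earlier example; every additional round costs an attacker one more md5 call per guess, which is where the 1000-fold slowdown comes from.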
So how do we get a secure hash calculation? You can read the emlog source code and find the HashPaaword.php file in the include directory. It is actually a class, and emlog uses it to calculate password hashes.
A feature of this class is that the hash value it calculates is different every time, so hackers cannot crack passwords with rainbow tables and similar methods; the only way to learn whether a password entered by the user is correct is the class's checkpassword method. The class also increases the hash computing time, so a hacker who obtains the hash values will find them hard to crack.
In the latest PHP 5.5, this kind of hashing has become an official built-in function, password_hash(), which we can use to hash our passwords.
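A minimal sketch of using the built-in functions, assuming PHP 5.5 or later. password_hash and password_verify are the real PHP functions; the wrapper names hash_password and check_password are only for this example.

```php
<?php
// Hash a password for storage. A random salt and the cost factor are chosen
// by PHP and embedded in the returned string, so no separate salt column is needed.
function hash_password($password)
{
    return password_hash($password, PASSWORD_DEFAULT);
}

// Check a login attempt against the stored hash.
function check_password($password, $storedHash)
{
    return password_verify($password, $storedHash);
}

// Typical use:
//
//   $stored = hash_password($_POST['password']);      // at registration
//   if (check_password($_POST['password'], $stored))  // at login
//       { /* password is correct */ }
```

Hashing the same password twice gives two different strings because of the random salt, which matches the behaviour described for the emlog class: the only way to test a password is to call the verify function.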
Verification code security
This is something I only just thought of.
The verification code is usually a random string generated by a PHP script, drawn with the GD library, and turned into an image. The real verification code string is saved in the SESSION, and the generated image is shown to the user. After the user fills in and submits the verification code, the server compares it with the one in the SESSION.
This reminds me of a mistake I made before. After the comparison, whether it succeeded or failed, I did not clear the SESSION. That creates a problem: once a user has submitted the verification code successfully the first time, they can skip the script that generates the verification code from the second request on. The code in the SESSION is then never updated or deleted, so the same verification code can be reused and no longer verifies anything.
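The fix is to make the stored code single-use: read it, delete it, and only then compare. The sketch below assumes the code is kept under a captcha session key and compared case-insensitively; both are illustrative choices, as is the function name check_captcha. hash_equals requires PHP 5.6 or later.

```php
<?php
/**
 * Compare the submitted verification code with the one in the session.
 * The stored code is removed before comparing, so it can never be reused,
 * whether the comparison succeeds or fails.
 */
function check_captcha(array &$session, $input): bool
{
    $expected = isset($session['captcha']) ? $session['captcha'] : null;
    unset($session['captcha']); // single use: always clear

    if (!is_string($expected) || $expected === '' || !is_string($input)) {
        return false;
    }
    return hash_equals(strtolower($expected), strtolower($input));
}

// In a real page: check_captcha($_SESSION, $_POST['captcha'])
```

After one call the session no longer holds a code, so a second submission fails until the user loads a fresh image from the generating script.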
Then there is the problem of verification codes being recognized by machines. I often learn from programs such as wordpress and emlog, but I cannot praise the verification codes they use. Many spam comments get posted after a machine recognizes the verification code, so I later switched to a more complicated verification code, one said to be recommended by w3c.
If you need it, you can download it here: http://www.jb51.net/codes/191862.html
Well, that is everything I can think of that I have used in practice. It is only my own accumulated understanding of code security. If you have better ideas, feel free to talk to me. I hope everyone can write more secure code.