1. Disclaimer
Cryptography is a complex subject, and I am not an expert in it. Many universities and research institutions have long-running research programs in this area. In this article, I will try to show you, as plainly as I can, a secure way to store passwords for a web application.
2. What does a hash do?
"Hashing converts a piece of data (small or large) into a relatively short piece of data, such as a string or an integer."
This is done with a one-way hash function. "One-way" means that it is very difficult, or practically impossible, to reverse. A common example of a hash function is md5(), which is available in many languages and systems.
$data = "Hello World";
$hash = md5($data);
echo $hash; // b10a8db164e0754105b7a99be72e3fe5
The result of md5() is always a 32-character string. It contains only hexadecimal digits (16 possible characters), so technically it can also be represented as a 128-bit (16-byte) integer. You can feed md5() very long strings and data, and you will still get a fixed-length hash back. This may also help you see why the function is "one-way": inputs of any size map onto the same fixed-size output, so information is necessarily lost.
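The fixed output length is easy to check. The following is a minimal sketch; the three inputs are illustrative values chosen here, not taken from the article:

```php
<?php
// Inputs of very different sizes (illustrative values).
$inputs = ['', 'Hello World', str_repeat('x', 1000000)];

$lengths = [];
foreach ($inputs as $input) {
    $hash = md5($input);
    // Record the digest length; md5() always returns 32 hexadecimal digits.
    $lengths[] = strlen($hash);
    echo strlen($input), ' bytes in -> ', $hash, PHP_EOL;
}
```

Whatever the input size, each line of output ends with a digest of the same length.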
3. Using a hash function to store passwords
A typical user registration process:
The user fills in the registration form, which includes a password field;
The program stores all the information the user entered in the database;
The password, however, is run through a hash function before it is stored;
The original password is not stored anywhere; it is discarded.
User login process:
The user enters a username and password;
The program hashes the password with the same hash function used at registration;
The program looks up the user in the database and reads the stored password hash;
The program compares the two hashes and grants access if they match.
How to choose an appropriate hash function for the password is discussed later in this article.
4. Problem 1: Hash collisions
A hash collision occurs when two different inputs produce the same hash value. How likely a collision is depends on the hash algorithm used.
How can this be exploited?
For example, some older programs hash passwords with crc32(). This algorithm produces a 32-bit integer, which means there are only 2^32 (4,294,967,296) possible outputs.
Let's hash a password:
echo crc32('supersecretpassword');
// outputs: 323322056
Now let's assume that someone has stolen the database and obtained the password hash. They may not be able to turn 323322056 back into 'supersecretpassword', but they can find another password that hashes to the same value. That takes only a very simple program:
set_time_limit(0);
$i = 0;
while (true) {
    // try base64-encoded counters until one hashes to the stolen value
    if (crc32(base64_encode($i)) == 323322056) {
        echo base64_encode($i);
        exit;
    }
    $i++;
}
The program may need to run for a while, but eventually it will return a string. We can use that string in place of 'supersecretpassword' and log in to the account that uses this password.
For example, after running the program on my computer for a while, I got the string 'MTIxMjY5MTAwNg=='. Let's test it:
echo crc32('supersecretpassword');
// outputs: 323322056
echo crc32('MTIxMjY5MTAwNg==');
// outputs: 323322056
How can this be prevented?
Nowadays a powerful home PC can run a hash function around a billion times per second, so we need a hash function with a much larger output range. md5() is more suitable, for example: it produces a 128-bit hash, which means 340,282,366,920,938,463,463,374,607,431,768,211,456 possible outputs. Nobody can run through that many iterations looking for a collision. Even so, ways of producing MD5 collisions have been found.
sha1() is a better alternative because it produces a longer, 160-bit hash value.
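To put the output sizes in perspective, here is a rough calculation. It assumes the rate of 1 billion hashes per second quoted above, and it treats "trying every possible output once" as a stand-in for the attacker's worst case, which is a simplification:

```php
<?php
// Assumed attacker speed, taken from the figure quoted in the text.
$hashesPerSecond = 1e9;
$secondsPerYear  = 365 * 24 * 3600;

// Time to run through as many candidates as there are possible outputs.
$crc32Seconds = (2 ** 32) / $hashesPerSecond;
$md5Years     = (2 ** 128) / $hashesPerSecond / $secondsPerYear;
$sha1Years    = (2 ** 160) / $hashesPerSecond / $secondsPerYear;

printf("CRC32 (32 bits):  about %.1f seconds\n", $crc32Seconds);
printf("MD5   (128 bits): about %.2e years\n", $md5Years);
printf("SHA-1 (160 bits): about %.2e years\n", $sha1Years);
```

The 32-bit case finishes in seconds, while the 128-bit and 160-bit cases are far out of reach for this kind of exhaustive loop.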
5. Problem 2: Rainbow tables
Even with the collision problem solved, we are still not safe enough.
"A rainbow table is built by calculating the hash values of commonly used words and their combinations."
Such a table may hold millions or even billions of rows. Storage is cheap these days, so very large rainbow tables can be built.
Now let's assume that someone steals a database and obtains millions of hashed passwords. The thief can easily look up these hash values in a rainbow table one by one and recover the original passwords. Not every hash will be in the rainbow table, but some certainly will be.
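The lookup described above can be sketched with a plain array. This is a simplification: a real rainbow table uses a more compact structure and far more entries, and the word list and the second "stolen" password here are made up for illustration:

```php
<?php
// A tiny stand-in for a rainbow table: precomputed hash => word pairs.
$words = ['password', 'letmein', 'easypassword', 'qwerty'];
$table = [];
foreach ($words as $word) {
    $table[sha1($word)] = $word;
}

// Hashes as they might appear in a stolen database (unsalted SHA-1).
$stolen = [sha1('easypassword'), sha1('kX9#vT2!pQ')];

$recovered = [];
foreach ($stolen as $hash) {
    // One array lookup per hash; the hashing work was done in advance.
    $recovered[] = $table[$hash] ?? null;
}
var_dump($recovered);
```

The common password is recovered immediately, while the one that is not in the word list comes back as null.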
How can this be prevented?
We can add a "salt" to the password, that is, some extra characters mixed in before hashing, as in the following example:
$password = "easypassword";
// this may be found in a rainbow table
// because the password contains 2 common words
echo sha1($password); // 6c94d3b42518febd4ad747801d50a8972022f956
// use a bunch of random characters, and it can be longer than this
$salt = "f#@V)Hu^%Hgfds";
// this will NOT be found in any pre-built rainbow table
echo sha1($salt . $password); // cd56a16759623378628c0d9336af69b74d9d71a5
All we do here is prepend the salt string to the password before hashing. As long as the salt is complex enough, the resulting hash will not be found in any pre-built rainbow table. But this is still not safe enough.
6. Problem 3: Rainbow tables (again)
Remember that a rainbow table can also be built from scratch after the database has been stolen. The salt may be stolen along with the database, and the attacker can then use it to generate a new rainbow table. For example, the hash of "easypassword" may exist in a generic rainbow table, but in the new table the hash of "f#@V)Hu^%Hgfdseasypassword" will exist as well.
How can this be prevented?
We can use a unique salt for each user. One option is to use the user's ID from the database:
$hash = sha1($user_id . $password);
This assumes that a user's ID never changes, which is typically the case.
We can also generate a random, unique salt for each user, but then we need to store that string as well:
// generates a 22 character long random string
function unique_salt() {
    return substr(sha1(mt_rand()), 0, 22);
}
$unique_salt = unique_salt();
$hash = sha1($unique_salt . $password);
// and save the $unique_salt with the user record
// ...
This approach protects us against rainbow tables, because every password is salted with a different value. The attacker would have to build a separate rainbow table for every password, which is impractical.
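A short sketch of why per-user salts defeat a shared table. The helper random_salt() is hypothetical and uses random_bytes() (PHP 7 or later), which is an assumption of this example rather than the article's own salt generator:

```php
<?php
// Hypothetical helper: 16 random bytes, hex-encoded to 32 characters.
function random_salt(): string
{
    return bin2hex(random_bytes(16));
}

$password = 'easypassword';

// Two users who happen to choose the same password.
$saltA = random_salt();
$saltB = random_salt();
$hashA = sha1($saltA . $password);
$hashB = sha1($saltB . $password);

// Different salts give unrelated hashes, so a table built for one
// salt does not help against the other user.
var_dump($hashA === $hashB);
```

Even though both users picked the same weak password, the stored hashes differ, so each one would need its own table.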
7. Problem 4: Hash speed
Most hash functions are designed with speed in mind, because they are often used to compute checksums of large data sets or files in order to verify data integrity.
How can this be exploited?
As mentioned earlier, a powerful PC can now compute around a billion hashes per second, which makes it easy to try every possible password by brute force. You might think that requiring passwords of 8 or more characters keeps them safe from brute force, but let's see whether that is true:
If the password can contain lowercase letters, uppercase letters and digits, there are 62 (26+26+10) possible characters;
An 8-character password has 62^8 possible combinations, a little over 218 trillion;
At a rate of 1 billion hashes per second, that can be exhausted in about 60 hours.
A 6-character password, which is also very common, can be cracked in under 1 minute. Requiring 9 or 10 characters would be safer, but some users may find that annoying.
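The figures above can be reproduced with a few lines of arithmetic. The function name seconds_to_exhaust() is made up for this sketch, and the rate is the assumed 1 billion hashes per second:

```php
<?php
$alphabet = 26 + 26 + 10;   // lowercase + uppercase + digits = 62
$rate     = 1000000000;     // assumed: 1 billion hashes per second

// Worst-case seconds to try every password of a given length.
function seconds_to_exhaust(int $alphabet, int $length, int $rate): float
{
    return ($alphabet ** $length) / $rate;
}

$eight = seconds_to_exhaust($alphabet, 8, $rate);
$six   = seconds_to_exhaust($alphabet, 6, $rate);

printf("8 characters: %d combinations, about %.1f hours\n", $alphabet ** 8, $eight / 3600);
printf("6 characters: %d combinations, about %.1f seconds\n", $alphabet ** 6, $six);
```

Each extra character multiplies the work by 62, which is why 9 or 10 characters already make a large difference.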
How can this be prevented?
Use a slower hash function.
"Suppose you use an algorithm that can run only 1 million times per second on the same hardware, instead of 1 billion times per second. The attacker then needs 1,000 times longer to brute force a hash, and 60 hours turns into nearly 7 years!"
You can implement this yourself:
function myhash($password, $unique_salt) {
    $hash = sha1($unique_salt . $password);
    // make it take 1000 times longer
    for ($i = 0; $i < 1000; $i++) {
        $hash = sha1($hash);
    }
    return $hash;
}
You can also use an algorithm that supports a "cost parameter", such as Blowfish. In PHP, this can be done with the crypt() function:
function myhash($password, $unique_salt) {
    // the salt for blowfish should be 22 characters long
    return crypt($password, '$2a$10$' . $unique_salt);
}
The second parameter of crypt() contains several values separated by the "$" character. The first value is "$2a", which indicates that the Blowfish algorithm should be used. The second value, "$10" here, is the cost parameter. It is the base-2 logarithm of the number of iterations to run (10 => 2^10 = 1024 iterations), and it can range from 04 to 31.
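To make the format concrete, the sketch below splits such a string into its parts and derives the iteration count from the cost. The 22-character salt value is made up for illustration:

```php
<?php
// Hypothetical salt string in the format described above.
$setting = '$2a$10$' . 'abcdefghijklmnopqrstuv';

// Splitting on "$" yields: an empty string, the algorithm, the cost, the salt.
list(, $algo, $cost, $salt) = explode('$', $setting);

// The cost is the base-2 logarithm of the iteration count.
$iterations = 1 << (int) $cost;

echo "algorithm: $algo, cost: $cost, iterations: $iterations", PHP_EOL;
```

Raising the cost by 1 doubles the iteration count, so the cost can be increased over time as hardware gets faster.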
As an example:
function myhash($password, $unique_salt) {
    return crypt($password, '$2a$10$' . $unique_salt);
}
function unique_salt() {
    return substr(sha1(mt_rand()), 0, 22);
}
$password = "verysecret";
echo myhash($password, unique_salt());
// result: $2a$10$dfda807d832b094184faeu1elwhtR2Xhtuvs3R9J1nfRGBCudCCzC
The resulting hash contains the algorithm ($2a), the cost parameter ($10), and the 22-character salt that was used. The rest is the computed hash. Let's run a test:
// assume this was pulled from the database
$hash = '$2a$10$dfda807d832b094184faeu1elwhtR2Xhtuvs3R9J1nfRGBCudCCzC';
// assume this is the password the user entered to log back in
$password = "verysecret";
if (check_password($hash, $password)) {
    echo "Access granted!";
} else {
    echo "Access denied!";
}
function check_password($hash, $password) {
    // first 29 characters include algorithm, cost and salt
    // let's call it $full_salt
    $full_salt = substr($hash, 0, 29);
    // run the hash function on $password
    $new_hash = crypt($password, $full_salt);
    // returns true or false
    return ($hash === $new_hash);
}
Run it and we will see "Access granted!"
8. Putting it together
Based on everything discussed above, here is a utility class:
class PassHash {
    // blowfish
    private static $algo = '$2a';
    // cost parameter
    private static $cost = '$10';
    // mainly for internal use
    public static function unique_salt() {
        return substr(sha1(mt_rand()), 0, 22);
    }
    // this will be used to generate a hash
    public static function hash($password) {
        return crypt($password,
            self::$algo .
            self::$cost .
            '$' . self::unique_salt());
    }
    // this will be used to compare a password against a hash
    public static function check_password($hash, $password) {
        $full_salt = substr($hash, 0, 29);
        $new_hash = crypt($password, $full_salt);
        return ($hash === $new_hash);
    }
}
Here is how to use it during registration:
// include the class
require("PassHash.php");
// read all form input from $_POST
// ...
// do your regular form validation stuff
// ...
// hash the password
$pass_hash = PassHash::hash($_POST['password']);
// store all user info in the DB, excluding $_POST['password']
// store $pass_hash instead
// ...
And here is how to use it during login:
// include the class
require("PassHash.php");
// read all form input from $_POST
// ...
// fetch the user record based on $_POST['username'] or similar
// ...
// check the password the user tried to log in with
if (PassHash::check_password($user['pass_hash'], $_POST['password'])) {
    // grant access
    // ...
} else {
    // deny access
    // ...
}
9. Blowfish availability
The Blowfish algorithm may not be available on every system, even though it is quite common by now. You can check whether your system supports it with the following code:
if (CRYPT_BLOWFISH == 1) {
    echo "Yes";
} else {
    echo "No";
}
As of PHP 5.3, however, you don't need to worry about this, because PHP ships with its own implementation of the algorithm.
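On PHP 5.5 and later, the built-in password_hash() and password_verify() functions cover the same steps as the utility class in section 8: salt generation, a Blowfish-based hash with a cost parameter, and comparison. The following is a minimal sketch of that alternative, not the method used elsewhere in this article; the cost of 10 simply mirrors the value used above:

```php
<?php
// password_hash() generates a random salt and embeds it, together with
// the algorithm and cost, in the returned string.
$hash = password_hash('verysecret', PASSWORD_BCRYPT, ['cost' => 10]);

// password_verify() reads the salt and cost back out of the stored hash,
// recomputes it for the supplied password, and compares the results.
$ok  = password_verify('verysecret', $hash);
$bad = password_verify('wrongguess', $hash);

var_dump($ok, $bad);
```

Because the salt and cost travel inside the hash string, a single database column is enough to store everything needed for later verification.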
Conclusion
Passwords hashed this way are secure enough for most web applications. But don't forget that you can also ask users to choose stronger passwords, for example by requiring a minimum length and a mix of letters, digits and special characters.