Password-field encryption strategy in light of database-dump attacks: reflections on the password leak incidents, part one


Reposted from: http://blog.sina.com.cn/s/blog_61efbd3c01012wmx.html (original author: Jianghaike, a well-known network security expert and chief technical architect of Antiy Labs.)

Let me say it up front: this is an era of security meltdown. Over the past year, companies such as Sony and Sega, financial institutions such as Citigroup, and even a security vendor, RSA, have been confirmed as compromised, with critical data stolen or disclosed.

The most startling of these incidents was the intrusion at RSA, which led to a chain of attacks on several industrial giants; many security companies themselves use RSA tokens. DigiNotar, the Dutch certificate authority and a far weaker company than RSA, was declared bankrupt after it was breached.

In the first half of the year we were still discussing these things from the sidelines. Then came the leaks of the CSDN, Duowan and Tianya databases. The most sensitive items in them are the user information and, of course, the users' passwords. Because of real-name registration, password reuse and similar factors, the impact was immediate, and every affected site was caught up in a storm of criticism.

By all indications, though, these intrusions actually happened some time ago: the databases have long been circulating underground. This round of leaks may be little more than a collective psychological effect.

Some attackers call this theft of database records "dragging the library" (dumping the database), which in Chinese has a natural homophone nickname, "taking off the pants". But attackers are becoming less and less merciful: they used to just steal the trousers, but now they also hang them in the street with a notice saying, "Look, there are even patches on these pants."

If database dumps are hard to prevent, then a sound encryption strategy is needed so that the impact of an attacker obtaining the database is kept as small as possible.

The era of storing passwords in plaintext must end, but is encryption by itself enough to be secure?

Wrong encryption strategies:

Plaintext passwords are unacceptable, but a wrong encryption strategy is just as bad. Let's look at the following scenarios.

Using a standard hash on its own

I am reminded of a hacker joke from the 1990s. Someone broke into a UNIX host and grabbed a shadow file but could not crack it. So he set up a decoy on his own machine, deliberately left this shadow file there, and watched which passwords other people tried against it; in the end he used those passwords to get into the original host. Unfortunately, at the time we all treated this as a joke, at best replying "I'm impressed!", without rethinking the question of using standard algorithms.

Currently, the most widely used algorithm for storing passwords is standard MD5. But for a long time we have overlooked the fact that hashes were designed for verification, not for encryption. System designers "borrowed" hash algorithms to store passwords because they are irreversible. That irreversibility, however, rests on the assumption that the set of possible plaintexts is infinitely large. Passwords are different: their length is limited, and so is the set of usable characters. We can treat the set of all passwords as a de facto finite set (it is hard to imagine anyone using 100 characters as a password).

For example, if a person's password is "123456", then every site database that uses standard MD5 stores exactly the same MD5 value for it.
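A minimal Python sketch of this point, using only the standard library. The two "sites" and the helper name `md5_hex` are hypothetical; the sketch only shows that an unsalted digest depends on nothing but the password.

```python
import hashlib


def md5_hex(password: str) -> str:
    """Return the unsalted MD5 digest of a password as a hex string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Two hypothetical, unrelated sites store the same user's password.
site_a_stored = md5_hex("123456")
site_b_stored = md5_hex("123456")

print(site_a_stored)
# Nothing site-specific or user-specific goes into the digest,
# so both databases end up holding the identical value.
print(site_a_stored == site_b_stored)
```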


Because the ciphertexts are identical and the hash algorithm is one-way, the method attackers used early on was a ciphertext dictionary attack built from ciphertext comparison plus high-frequency statistics. In the vast majority of websites and systems, the same plaintext password produces the same ciphertext, so the ciphertexts that occur most often very likely belong to users with the most common plaintext passwords. On the one hand, for standard algorithms the attacker can precompute the ciphertexts of high-frequency plaintexts; on the other hand, the high-frequency statistical attack works generally, even against non-standard algorithms.
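The high-frequency statistics idea can be sketched roughly as follows. The dumped rows, user names and guess list are made up for illustration; the only assumption about the target is that it stores unsalted MD5.

```python
import hashlib
from collections import Counter


def md5_hex(password: str) -> str:
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Hypothetical dumped table: (username, stored unsalted hash).
dumped = [
    ("alice", md5_hex("123456")),
    ("bob", md5_hex("123456")),
    ("carol", md5_hex("qwerty")),
    ("dave", md5_hex("123456")),
    ("erin", md5_hex("Tr0ub4dor&3")),
]

# Step 1: count how often each ciphertext occurs. No cracking is needed yet.
frequency = Counter(stored for _, stored in dumped)
most_common_hash, count = frequency.most_common(1)[0]

# Step 2: hash a handful of popular guesses and compare with the top ciphertext.
guesses = ["password", "123456", "111111"]
recovered = next((g for g in guesses if md5_hex(g) == most_common_hash), None)

# Step 3: every account sharing that ciphertext falls at once.
exposed = [user for user, stored in dumped if stored == most_common_hash]
print(count, recovered, exposed)
```

Note that step 1 works even when the algorithm is unknown, which is why the statistical attack also applies to non-standard schemes.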

But the lookup-table attack quickly eclipsed high-frequency statistics. Large-scale plaintext password leaks from websites began around 2000. After every such leak, attackers ran the leaked passwords through MD5, SHA1 and other common hash algorithms to build tables mapping hash values back to passwords, and then used those tables to look up the hash values found in dumped databases.

With cheap hardware, the spread of GPUs and growing storage capacity, a threat that cannot be ignored has surfaced: these huge hash tables are no longer built only from leaked passwords and common string dictionaries. Through long-term division of labor and collaboration, many attackers have exhaustively enumerated every character combination up to a certain length and computed the results under several algorithms. These result sets range from hundreds of gigabytes to tens of terabytes; this is the legendary rainbow table.
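To make the table attack concrete, here is a small sketch of a plain precomputed lookup table. Strictly speaking, a rainbow table compresses such a mapping with chains of hash and reduction functions to save space; this sketch shows only the simpler full table. The word list and the name `build_lookup_table` are illustrative assumptions.

```python
import hashlib

ALGORITHMS = ("md5", "sha1")


def build_lookup_table(wordlist):
    """Map hex digest -> plaintext for every word under every algorithm."""
    table = {}
    for word in wordlist:
        data = word.encode("utf-8")
        for name in ALGORITHMS:
            table[hashlib.new(name, data).hexdigest()] = word
    return table


# Plaintexts harvested from earlier leaks (made up here).
leaked_plaintexts = ["123456", "password", "iloveyou", "qwerty"]
table = build_lookup_table(leaked_plaintexts)

# A hash taken from a newly dumped database is reversed by a dictionary lookup.
stolen_hash = hashlib.sha1(b"iloveyou").hexdigest()
print(table.get(stolen_hash))

# A password that was never in the word list is simply not found.
unknown_hash = hashlib.md5(b"not-in-the-wordlist").hexdigest()
print(table.get(unknown_hash))
```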

The one-way advantage of hashing is now only theoretical. The one-way property is guaranteed by the algorithm's design: mapping an infinite set onto a finite one is inevitably irreversible. But the attacker recovers the plaintext password from the hash by table lookup, so the one-way property of the algorithm loses its meaning.

Combining hashes

Some people mistakenly think hashing is unsafe because the hash algorithm is too weak, so they use MD5 and SHA1 together. This is in fact worthless (it only consumes storage). As mentioned in the previous section, the weakness of hashing is that the correspondence between a huge number of passwords and their hash values has already been made into rainbow tables. As long as any one of the hash algorithms you combine is covered by a rainbow table, the password will be found.

Similarly, using the head of the MD5 plus the tail of the SHA1, or mixing the two values in some other way, is meaningless. The attacker can easily observe the pattern of the combination, split it apart, and carry on cracking by table lookup.
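A rough sketch of why the head-plus-tail trick buys nothing. The mixing rule in `mixed_digest` (first 16 hex characters of MD5, last 24 of SHA1) is a hypothetical example of such a scheme, not one taken from any real site.

```python
import hashlib


def mixed_digest(password: str) -> str:
    """Hypothetical home-made scheme: MD5 head (16 hex chars) + SHA1 tail (24)."""
    data = password.encode("utf-8")
    return hashlib.md5(data).hexdigest()[:16] + hashlib.sha1(data).hexdigest()[-24:]


# The attacker's existing table of ordinary, full MD5 digests.
md5_table = {
    hashlib.md5(word.encode("utf-8")).hexdigest(): word
    for word in ["123456", "password", "letmein"]
}


def crack_mixed(stored: str):
    """Split off the MD5 head and match it against the ordinary MD5 table."""
    head = stored[:16]
    for digest, word in md5_table.items():
        if digest.startswith(head):
            return word
    return None


print(crack_mixed(mixed_digest("letmein")))
```

Sixteen hex characters are still 64 bits of the MD5 value, far more than enough to identify the table entry.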

Designing your own algorithm

I have always thought that since we are not cryptographers but engineers and programmers, it is rather foolish to invent an encryption algorithm instead of using what is readily available. I believe many programmers have come up with a "new algorithm" only to find that it was proposed long ago in a mathematics paper from the 1980s.

Moreover, in the open source era, many algorithms have not only been implemented and published but have also withstood long-term use and scrutiny. A self-designed, self-implemented algorithm cannot match that.

One episode about the insecurity of self-designed algorithms is etched in my mind. When I worked on securities systems, we had just taken over an acquired brokerage branch and needed to migrate a counter system compiled with Clipper, but the original developer could not be reached. We pursued two approaches. A master, Mr. Li, was responsible for cracking the data to see whether the plaintext could be restored. I was responsible for working out the algorithm: if Mr. Li got nowhere, I would need to reproduce the algorithm, encrypt every number from 000000 to 999999, and match the results against the stored ciphertext (securities trading was all done at the counter then, with no online trading, and passwords were entered on the counter's numeric keypad).

Because the original developer had added a few flourishes, I was making no progress. Meanwhile the engineers watching Mr. Li let out cries of amazement. I ran over and saw that Mr. Li, working from the encrypted results of a few specially constructed passwords, had drawn on paper something that looked very much like Yang Hui's triangle (Pascal's triangle). In less than half an hour, Mr. Li had even finished the decryption program.

I seem to have gone off topic ... The point is that however good your own algorithm feels to you, one look at the selection competitions for official U.S. standard algorithms shows that we cannot compete with the collective wisdom of the world's mathematicians.

So designing and implementing your own algorithm is not a good idea. That also covers implementation bugs, such as an overflow on an overly long input string.
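The exhaustive fallback from the anecdote above (encrypt every number from 000000 to 999999 and compare) can be sketched as follows. The real counter-system algorithm is unknown, so `legacy_encrypt` is a purely hypothetical stand-in; the point is only that a six-digit keyspace is small enough to enumerate whatever the algorithm is.

```python
import hashlib


def legacy_encrypt(pin: str) -> str:
    """Stand-in for the unknown counter-system algorithm (an assumption)."""
    return hashlib.sha256(("flourish:" + pin).encode("ascii")).hexdigest()


def recover_pin(stored: str):
    """Try all one million six-digit PINs and return the one that matches."""
    for n in range(1_000_000):
        pin = f"{n:06d}"
        if legacy_encrypt(pin) == stored:
            return pin
    return None


print(recover_pin(legacy_encrypt("271828")))
```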

Using a symmetric algorithm alone:

After the insecurity of standard hashes was exposed, we saw some people calling for AES instead, which is not a good idea. Symmetric algorithms such as AES are not one-way. When a site is attacked, the situation is complex: sometimes only the database is dumped, and sometimes part of the environment falls as well. In the latter case, once the AES key is taken, every password can be restored, which is even worse than table lookup.

Of course, we have also seen the idea of using AES as a hash: keep only part of the AES output so that it can verify but not restore. In practice, AES used this way has no advantage over a hash. For example, even if the attacker has only dumped the database and has not obtained the key, the attacker can register enough accounts beforehand with a large number of different short passwords, and so obtain a set of short plaintexts and their corresponding ciphertexts. At that point the attacker holds known plaintext-ciphertext pairs, which opens the way to analyzing the key.

And whether you use DES or AES, a standard hash, or a self-designed algorithm, if you do not fix the statistical defect that identical passwords of different users produce identical ciphertexts, then even an attacker who cannot get the key can register accounts with some high-frequency passwords and, after dumping the database, compare ciphertexts. That pins down a large number of users with common passwords.

Adding "one grain of salt":

In fact, many colleagues have pointed out that salted hashing (hash + salt) is the way to solve the problem. The so-called salt is very simple: it is a perturbation added when generating the hash so that the resulting value differs from the standard hash result, which defeats rainbow tables.

For example, suppose the user's password is 123456. Add a salt, say the random string "1cd73466fdc24040b5", join the two together, and compute the MD5; the result is 6c9055e7cc9b1bd9b48475aaab59358e. With this step, even if the user's password is weak, the string actually hashed is long, which defends to some extent against brute-force and rainbow table attacks.
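A hedged sketch of the salting step. The article does not say in which order password and salt are joined, so appending the salt here is an assumption, and the sketch makes no attempt to reproduce the digest quoted above. MD5 is kept only to mirror the example.

```python
import hashlib
import secrets


def salted_md5(password: str, salt: str) -> str:
    # Assumption: the salt is appended to the password before hashing.
    return hashlib.md5((password + salt).encode("utf-8")).hexdigest()


# 9 random bytes give 18 hex characters, the same length as the example salt.
salt = secrets.token_hex(9)

plain = hashlib.md5(b"123456").hexdigest()
salted = salted_md5("123456", salt)

# The salted value no longer matches the standard MD5 of "123456",
# so a table of standard digests does not contain it.
print(salt, salted, salted != plain)
```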

But in the implementations we have audited, many people add only "one grain of salt". In other words, within one site, different users with the same password still get the same ciphertext. That brings back high-frequency statistical attacks, pre-registration attacks, and so on.
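The "one grain of salt" defect and the pre-registration attack can be illustrated like this. The site-wide constant reuses the salt string from the earlier example; the user names, including the attacker's own account `mallory`, are hypothetical.

```python
import hashlib

# A single salt shared by every user of the site (the defect being shown).
SITE_SALT = "1cd73466fdc24040b5"


def site_hash(password: str) -> str:
    return hashlib.md5((password + SITE_SALT).encode("utf-8")).hexdigest()


# "mallory" is the attacker's own account, registered before the dump
# with a deliberately common password.
users = {
    "alice": site_hash("123456"),
    "bob": site_hash("123456"),
    "carol": site_hash("a much better passphrase"),
    "mallory": site_hash("123456"),
}

# After dumping the table, the attacker looks for rows equal to the known one.
# Neither the salt nor the algorithm has to be known for this to work.
known = users["mallory"]
victims = [name for name, stored in users.items()
           if stored == known and name != "mallory"]
print(victims)
```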

A security strategy for passwords:

In traditional cryptography, only one kind of encryption is considered ideal, the one-time pad ("one key per message"), which is of course impractical. Borrowing that phrasing, the ideal state of a password security strategy could be described as: one-way, one secret per person, one secret per site.

One-way: although the value of standard hash algorithms has been undermined in this scenario, the idea of one-wayness is still correct. If a password can be restored, the attacker can restore it too, and the protection loses its meaning. So a one-way algorithm is necessary.

One secret per person: when different users of the same site set the same password, the ciphertexts generated are different. This effectively counters result collisions and statistical attacks. A dictionary attack basically fails to converge.

One secret per site: one secret per person is not enough. Users who register at different sites with the same information and the same password must also get different ciphertexts at each site. Since a great many users do register at different sites with the same information and password, achieving this further discounts the value of a dumped database. An attacker would essentially abandon the attempt to build a ciphertext dictionary.

Achieving this is very simple: it is still hash + salt. The key is that each site must have a different salt, and each user must have a different salt.
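One possible way to realize "one secret per person, one secret per site" with the Python standard library is sketched below. It is not the article's reference implementation. The function names, the 16-byte salt sizes and the iteration count are all assumptions, and the sketch swaps the single hash pass for PBKDF2-HMAC-SHA256 (`hashlib.pbkdf2_hmac`), which is my substitution rather than something the article prescribes.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor, chosen only for illustration


def new_salt() -> bytes:
    """Generate a random 16-byte salt (used for both site and user salts)."""
    return secrets.token_bytes(16)


def hash_password(password: str, site_salt: bytes, user_salt: bytes) -> bytes:
    """One-way digest that depends on the site, the user and the password."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), site_salt + user_salt, ITERATIONS
    )


def verify_password(password: str, site_salt: bytes, user_salt: bytes,
                    expected: bytes) -> bool:
    candidate = hash_password(password, site_salt, user_salt)
    return hmac.compare_digest(candidate, expected)


# Registration on a hypothetical site: the site salt is created once at
# initialization, the user salt once per account and stored with the digest.
site_salt = new_salt()
alice_salt, bob_salt = new_salt(), new_salt()
alice_digest = hash_password("123456", site_salt, alice_salt)
bob_digest = hash_password("123456", site_salt, bob_salt)

print(alice_digest != bob_digest)  # same password, different users
print(verify_password("123456", site_salt, alice_salt, alice_digest))
```

Keeping the site salt outside the database (for example in configuration) means a database-only dump does not reveal it, though the next paragraph's caveat still applies once the attacker has everything.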

But if the attacker obtains not just the database but also the relevant encryption parameters and keys, the attacker can still run the algorithm with those parameters and keys, generate ciphertexts of common passwords for each user, and check for a match. Of course, because of the "one salt per person" strategy, the attacker's computing cost changes: where each candidate password previously had to be hashed only once for the whole database, trying 100 common passwords now takes up to 100 hash operations for every single user until one matches. It nevertheless remains a threat that must not be underestimated, because too many users like those common passwords.
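The cost argument can be checked with a small counting sketch. The toy database, the salts and the candidate list are invented, and a single SHA-256 pass stands in for whatever salted scheme the site uses. The attacker is assumed to know every salt.

```python
import hashlib


def salted(password: str, salt: str) -> str:
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


COMMON = ["123456", "password", "qwerty"]

# Hypothetical dumped rows: (user, per-user salt, stored digest).
db = [
    ("alice", "s1", salted("qwerty", "s1")),
    ("bob", "s2", salted("correct horse", "s2")),
    ("carol", "s3", salted("123456", "s3")),
]


def dictionary_attack(rows, candidates):
    """Try each candidate for each user; count the hash operations spent."""
    cracked, operations = {}, 0
    for user, salt, stored in rows:
        for guess in candidates:
            operations += 1
            if salted(guess, salt) == stored:
                cracked[user] = guess
                break
    return cracked, operations


cracked, operations = dictionary_attack(db, COMMON)
# Without per-user salts, len(COMMON) hashes would cover the whole database;
# with them, the work is bounded by len(db) * len(COMMON) instead.
print(cracked, operations, len(db) * len(COMMON))
```

The work grows with the number of users, but users of common passwords still fall, which is the threat the paragraph above warns about.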

Therefore, setting up a table of banned passwords, so that users avoid common passwords, makes the cracker pay an even higher price, until the computation fails to converge and the cracker gives up. It is a strategy worth considering. But web developers should be reminded that it also increases the risk that users will forget their passwords.
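A banned-password table can be as small as a set lookup at registration time. The entries below and the case-insensitive comparison are illustrative assumptions; a real table would be much larger and loaded from a file.

```python
# Hypothetical banned-password table.
BANNED_PASSWORDS = frozenset({
    "123456", "password", "qwerty", "111111", "abc123",
})


def is_allowed(password: str) -> bool:
    """Reject a password if it appears in the banned table (case-insensitive)."""
    return password.casefold() not in BANNED_PASSWORDS


for candidate in ["123456", "PASSWORD", "correct horse battery staple"]:
    print(candidate, is_allowed(candidate))
```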

As for whether users should be free to set their password to 123456: I think that outside national defense, aerospace, classified systems and enterprise environments with security requirements, on a site where people merely lurk and trade insults, the site may warn users but need not enforce a mandatory policy.

The specific implementation

Having said all this, how do we actually implement the one-way, one-secret-per-person, one-secret-per-site strategy? On December 23 we decided that, rather than preaching algorithm principles and strategy in the abstract, it would be better to provide some very direct sample programs and documentation.

So our colleagues wrote an open-source package called Antiy Password Mixer (security mixer). Of course, there is no real technical content in it, nor is it a "domestic algorithm with our own intellectual property"; it simply wraps popular open-source algorithms in a better usage pattern. The current version is written in Python, with only about 300 lines of code. It encapsulates the use of RSA and hash + salt, and gives specific examples of how to use it at initialization, registration and authentication.

It can be found here.


Of course, just as we regret that many application developers do not take security seriously enough, we ourselves do not understand application development, so the code and documentation may look very ugly to many application developers. Even at the risk of being looked down on, we have to open the door and show that the security team is not conservative.

At the same time, we have to move closer to applications, because we too use applications that we think violate certain security principles, but since we are not their developers, we cannot modify them.

For more than 10 years, China's Web applications have been racing ahead of security. Developers have built today's landscape through diligence and drive, but in the rush some things were left behind, security among them. Maybe it is time to pick them up again.

The security community in China, because of its conservatism, sensitivity and many reasons of its own, has drifted far from applications. While we were still dreaming of some perfect security picture, we found we could no longer even see the applications' backs. Perhaps, now that applications are turning around to wait for us, it is time for us to speed up and pick up the security they left behind.

