Defending Against Database Theft: Slow Encryption in the Web Front End

Source: Internet
Author: User
Tags emscripten

0x00 Preface

In martial arts, the saying goes, only speed is unbeatable. Password encryption is different: the faster the algorithm, the easier it is to break.

 

0x01 Brute-force cracking

Password cracking means recovering the plaintext password from its encrypted form. There seem to be many methods, but in the end they all come down to one: brute force. You might say that a lookup table gives the result in an instant. But even if you do not build the table yourself, someone still has to build it. A lookup table merely does the exhaustive work in advance.

Passwords are encrypted with a one-way hash. Since a one-way function cannot be reversed, the only option is exhaustive search. The principle is simple: once we know which algorithm produced the ciphertext, we run common phrases through the same algorithm. If a result equals the ciphertext, the password has been guessed.

How fast is exhaustive search? That depends on the encryption algorithm: guessing is exactly as fast as encrypting. Suppose, for example, that one MD5 computation takes 1 microsecond. Then the attacker also needs only 1 microsecond per guess (assuming similar machine performance and phrase length). That is 1 million guesses per second, and this is only the speed of a single thread.

Therefore, the faster the encryption algorithm, the easier it will be to crack.
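As a minimal sketch of the dictionary attack described above, the following Python snippet hashes each candidate phrase with MD5 and compares it with a leaked hash. The word list and the helper names `md5_hex` and `dictionary_attack` are made up for illustration.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 digest of a string as hex (a fast, unsalted hash)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def dictionary_attack(target_hash: str, wordlist):
    """Hash every candidate with the same algorithm and compare."""
    for word in wordlist:
        if md5_hex(word) == target_hash:
            return word
    return None  # not in the dictionary

# Hypothetical leaked hash and a tiny word list.
leaked = md5_hex("letmein")
wordlist = ["123456", "password", "letmein", "qwerty"]
print(dictionary_attack(leaked, wordlist))  # prints letmein
```

Each guess costs the attacker exactly one hash computation, which is why the speed of the hash sets the speed of the attack.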


 

0x02 Slow encryption

If encryption takes longer, cracking obviously takes longer too. If one encryption is raised to 10 ms, the attacker can only guess 100 times per second, which is 10,000 times slower than before. How can we make encryption slow? The simplest way is to encrypt the encrypted result again, and repeat this many times.

For example, repeating a 1-microsecond encryption 10,000 times makes it 10,000 times slower:

  for i = 1 to 10000
      x = md5(x)
  end

A little extra time per encryption costs the attacker a great deal of cracking time.

In fact, such "slow encryption" algorithms already exist, for example bcrypt and PBKDF2. Each of them has a difficulty factor that controls how long the encryption takes. You can make it as slow as you like.
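To illustrate the difficulty factor, here is a small sketch using PBKDF2 from Python's standard `hashlib`. The salt, password, and iteration counts are arbitrary example values, not recommendations, and `slow_hash` is a name chosen for this sketch.

```python
import hashlib
import time

def slow_hash(password: str, salt: bytes, iterations: int) -> str:
    """PBKDF2-HMAC-SHA256; `iterations` is the difficulty factor."""
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return derived.hex()

# Compare the cost of two difficulty factors (timings depend on the machine).
for iterations in (1_000, 100_000):
    start = time.perf_counter()
    digest = slow_hash("letmein", b"example-salt", iterations)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{iterations:>7} iterations: {elapsed_ms:.1f} ms  {digest[:16]}...")
```

Raising the iteration count by a factor of 100 should raise both the defender's cost per login and the attacker's cost per guess by roughly the same factor.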

The slower the encryption, the longer the cracking time.

 

0x03 Applications of slow encryption

What most needs slow encryption is the passwords in a website's database.

In recent years, we have often heard news of website databases being stolen. User data is stored in plain text, so once leaked it cannot be taken back. Only the passwords can still put up a fight against the attacker. However, many websites use fast encryption algorithms, so a batch of weak-password accounts can easily be cracked. Sometimes, of course, the attacker only wants to crack the account of one specific person. Unless the password is a very complex phrase, it is likely to be cracked within a few days.

However, if the website uses slow encryption, the outcome may be different. If the encryption time is increased 100-fold, cracking takes months and becomes impractical. Even if the data leaks, the last piece of privacy, the password, is still protected.

 

0x04 Disadvantages of slow encryption

However, slow encryption also has an obvious drawback: it consumes a lot of computing resources. If many users log in at the same time to a website that uses slow encryption, the server CPU may not keep up. If a malicious user fires off a large number of login requests, resources may even be exhausted. Performance and security are always hard to reconcile, so in practice the strength is not set very high.

Some large websites have even invested in clusters dedicated to this encryption workload. However, this is costly.

Is there a way to get both strong computing power and free computing resources?

 

0x05 Front-end encryption

In the past, there was a big gap in speed between personal computers and servers. But as hardware development runs into bottlenecks, the gap is narrowing. For single-threaded tasks the two may even be on par. Since the client has such computing power, can it take over some of the server's work? In particular, for tasks whose algorithm is public but whose computation is heavy, such as slow encryption, why not hand them to the client?

Previously, the plaintext password was submitted. Now, what is submitted is the "slow encryption result" of the plaintext password. This applies to both registration and login. The server needs no changes: it treats the received "slow encryption result" as if it were the original plaintext password and stores it exactly as before. In this way, even if the database is stolen, the attacker can only crack back to the "slow encryption result" and must crack once more to recover the "plaintext password".

In fact, the intermediate value, the "slow encryption result", cannot be cracked with a dictionary. It is a hash value, a random-looking string such as 32 hexadecimal characters, whereas a dictionary contains meaningful phrases, so a dictionary run will almost never hit it. The only alternative is to enumerate it character by character, but there are 16^32 combinations, an astronomical number. Therefore, the "slow encryption result" cannot be recovered by working backwards from the leaked ciphertext in the database.
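To make the size of that search space concrete, here is the arithmetic, assuming a purely illustrative rate of 10^12 guesses per second:

```latex
\begin{align*}
16^{32} &= (2^{4})^{32} = 2^{128} \approx 3.4 \times 10^{38} \\
\frac{3.4 \times 10^{38}\ \text{guesses}}{10^{12}\ \text{guesses/s}}
  &= 3.4 \times 10^{26}\ \text{s} \approx 1.1 \times 10^{19}\ \text{years}
\end{align*}
```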

You may be thinking: even without the plaintext password, an attacker could log in directly with the "slow encryption result". In fact, the back end encrypts once more before storing, and that stored hash cannot be reversed to obtain it.

It cannot be reversed, but it can be computed forwards. Run each phrase in the dictionary through the front-end and back-end algorithms in turn:

  back_fast_hash(front_slow_hash(password))

Then compare the result with the ciphertext to see whether the guess is right. A dictionary attack is therefore still possible. However, with front_slow_hash in the way, the cracking speed is greatly reduced.
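The two-stage scheme can be sketched as follows. This is only an illustration under assumptions: PBKDF2 stands in for `front_slow_hash`, SHA-256 for `back_fast_hash`, and the iteration count and the constant `SITE_LABEL` are invented for the example (no per-user salt yet, as at this point in the article).

```python
import hashlib

FRONT_ITERATIONS = 20_000        # assumed difficulty factor, for illustration only
SITE_LABEL = b"example-site"     # fixed constant, NOT a per-user salt

def front_slow_hash(password: str) -> str:
    """Runs in the browser in the article's scheme: slow by design."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), SITE_LABEL, FRONT_ITERATIONS
    ).hex()

def back_fast_hash(front_result: str) -> str:
    """Runs on the server: a single fast hash of whatever was submitted."""
    return hashlib.sha256(front_result.encode("utf-8")).hexdigest()

def stored_value(password: str) -> str:
    """What ends up in the database."""
    return back_fast_hash(front_slow_hash(password))

def offline_guess(leaked: str, wordlist):
    """An attacker with the database must pay the slow hash for every guess."""
    for word in wordlist:
        if stored_value(word) == leaked:
            return word
    return None

leaked = stored_value("letmein")
print(offline_guess(leaked, ["123456", "letmein"]))  # prints letmein
```

The server itself only ever runs `back_fast_hash`, so its cost per login stays small, while the attacker cannot skip `front_slow_hash`.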

 

0x06 Countering precomputation

However, everything in the front end is public, so everyone knows the front_slow_hash algorithm. An attacker can use it to compute the slow encryption results of common phrases in advance and build a "new dictionary". When a database is stolen later, the new dictionary can be run against it directly.

To counter this, we use the classic method: salting. The simplest way is to use the user name as the salt:

  front_slow_hash(password + username)

In this way, even with the same password, the "slow encryption result" differs from user to user. You may object that the user name is a poor salt because it is public: an attacker can still build a dictionary for one important account.

Can a hidden salt value be provided? The answer is: no.

Because this happens in the front end, before the user has logged in. Whose salt should be returned? If an account's salt can be obtained before login, it is effectively public. Therefore, a salt used for front-end encryption cannot be hidden and can only be public. Even so, a separate salt parameter is better than the user name, because the user name never changes, while an independent salt can be changed regularly.

The salt value can be generated by the front end. For example, when registering:

  # Generate a salt in the front end
  salt = rand()
  password = front_slow_hash(password + salt)
  # Submit the salt together with the result
  submit(..., password, salt)

The back end stores the user's salt. At login, the front end first queries the salt that belongs to the entered user name.

Note that this interface can be used to test whether a user exists, so it needs some access control.
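A compact sketch of the salted registration and login flow might look like this. The in-memory `users` dictionary stands in for the database, and `register`, `get_salt`, and `login` are hypothetical names; PBKDF2 and SHA-256 are stand-ins for the article's front_slow_hash and back-end hash.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 20_000   # assumed difficulty factor, for illustration only
users = {}            # username -> (salt_hex, back_end_hash); stands in for the database

def front_slow_hash(password: str, salt_hex: str) -> str:
    """Front end: slow hash of the password with the user's public salt."""
    salt = bytes.fromhex(salt_hex)
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS).hex()

def back_fast_hash(front_result: str) -> str:
    """Back end: one fast hash before storing."""
    return hashlib.sha256(front_result.encode("utf-8")).hexdigest()

def register(username: str, password: str) -> None:
    salt_hex = secrets.token_hex(16)                # generated in the front end
    submitted = front_slow_hash(password, salt_hex) # computed in the front end
    users[username] = (salt_hex, back_fast_hash(submitted))  # stored by the back end

def get_salt(username: str) -> str:
    """Public lookup; a real service would need to limit user enumeration here."""
    return users[username][0]

def login(username: str, password: str) -> bool:
    submitted = front_slow_hash(password, get_salt(username))
    return hmac.compare_digest(back_fast_hash(submitted), users[username][1])

register("alice", "letmein")
print(login("alice", "letmein"), login("alice", "wrong"))  # prints True False
```

Because each account gets its own random salt, two users with the same password end up with different stored values.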

The replacement of the salt value is also very simple, and can even be completed automatically:

While the front end encrypts the current password, it also starts another thread that generates a new salt and computes the corresponding new password. Both are included in the login request. If the current password is verified successfully, the back end overwrites the old values with the new password and new salt. In this way, the salt is replaced using only front-end computing power.

All of this is automatic. In effect, the stored "password" is changed periodically without the user noticing!

Once the ciphertext changes, a dictionary built for a "specific salt value" becomes useless, and the attacker has to build it again.
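The automatic salt replacement described above could be sketched like this. The account record is a plain dictionary and the function names are hypothetical; in a real page, the second hash would run in a separate worker thread rather than sequentially.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 20_000  # assumed difficulty factor, for illustration only

def front_slow_hash(password: str, salt_hex: str) -> str:
    salt = bytes.fromhex(salt_hex)
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS).hex()

def back_fast_hash(front_result: str) -> str:
    return hashlib.sha256(front_result.encode("utf-8")).hexdigest()

def new_account(password: str) -> dict:
    salt_hex = secrets.token_hex(16)
    return {"salt": salt_hex, "hash": back_fast_hash(front_slow_hash(password, salt_hex))}

def login_and_rotate(account: dict, password: str) -> bool:
    # Front end: hash with the current salt, and also with a fresh salt.
    current = front_slow_hash(password, account["salt"])
    new_salt = secrets.token_hex(16)
    replacement = front_slow_hash(password, new_salt)

    # Back end: verify the current value first, then overwrite salt and hash.
    if not hmac.compare_digest(back_fast_hash(current), account["hash"]):
        return False
    account["salt"] = new_salt
    account["hash"] = back_fast_hash(replacement)
    return True

account = new_account("letmein")
old_salt = account["salt"]
print(login_and_rotate(account, "letmein"), account["salt"] != old_salt)  # prints True True
```

The back end never does slow work: it performs one fast hash to verify and one to store the replacement.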

 

0x07 Strength policy

That concludes the cryptography part. Implementation issues are discussed below.

In reality, users' computing power varies widely. Some have top-end machines, while others are still on antique hardware. This makes it hard to choose the encryption strength. If logging in on an antique machine takes tens of seconds, that is clearly unacceptable. The options are:

  • Fixed strength

  • Variable Strength

 

1. Fixed strength

A moderate strength is chosen based on typical hardware, one that is acceptable to most users. If a client has not finished within a time limit, it submits the partially computed hash and the number of steps completed, and the server finishes the rest.

  [front end] completes 70% ----> [back end] computes the remaining 30%

However, this requires a "serializable" algorithm, so that the server can restore the progress. If the computation holds a large amount of temporary memory, this approach is not feasible. Since only a small share of users need help, this still saves a lot of server resources compared with doing 100% of the slow encryption on the back end.

Users who request assistance must also be subject to limits, to prevent malicious exploitation of server resources.
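The hand-off between front end and back end can be sketched with an iterated hash, whose entire state is just the current digest and the step count. `TOTAL_STEPS` and the function names are assumptions for this sketch, and the time limit is simulated by passing a step budget instead of using a real timer.

```python
import hashlib

TOTAL_STEPS = 10_000  # assumed fixed strength, for illustration only

def iterate(x: bytes, steps: int) -> bytes:
    """Apply SHA-256 repeatedly; the state after any step is just the digest."""
    for _ in range(steps):
        x = hashlib.sha256(x).digest()
    return x

def front_end(password: str, salt: bytes, step_budget: int):
    """Do as many steps as the (simulated) time limit allows, then stop."""
    steps_done = min(step_budget, TOTAL_STEPS)
    partial = iterate(password.encode("utf-8") + salt, steps_done)
    return partial, steps_done

def back_end_finish(partial: bytes, steps_done: int) -> bytes:
    """Server completes the remaining steps (and should rate-limit such requests)."""
    if not 0 <= steps_done <= TOTAL_STEPS:
        raise ValueError("invalid step count")
    return iterate(partial, TOTAL_STEPS - steps_done)

full = iterate(b"letmein" + b"salt", TOTAL_STEPS)
partial, done = front_end("letmein", b"salt", step_budget=7_000)  # 70% on the client
print(back_end_finish(partial, done) == full)  # prints True
```

A memory-heavy algorithm would have to ship its whole working table to the server, which is why the article calls that case infeasible.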

 

2. Variable Strength

If the back end offers no assistance, the strength has to depend on each user's own hardware. Users with weaker machines get weaker encryption. At registration, the number of iterations is not fixed in advance; it is however many steps can be completed within a given time:

  # [Registration phase] measure computing power (thread is terminated after 1 second)
  while true
      x = hash(x)
      step = step + 1
  end

This step count is the encryption strength and is saved in the account information. Like the salt, the strength is public, because the front end needs to know it in order to encrypt at login.

  # [Login phase] fetch step first
  for i = 1 to step
      x = hash(x)
  end
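The two phases can be sketched in Python as follows. The time budget here is much shorter than the article's 1 second so that the example runs quickly, and `calibrate` and `slow_hash` are names invented for the sketch.

```python
import hashlib
import time

def calibrate(seconds: float) -> int:
    """Registration: count how many hash steps this device completes in the time budget."""
    deadline = time.perf_counter() + seconds
    x = b"calibration"
    steps = 0
    while time.perf_counter() < deadline:
        x = hashlib.sha256(x).digest()
        steps += 1
    return steps

def slow_hash(password: str, salt: bytes, steps: int) -> str:
    """Login: repeat the hash exactly `steps` times (steps is stored, and public)."""
    x = password.encode("utf-8") + salt
    for _ in range(steps):
        x = hashlib.sha256(x).digest()
    return x.hex()

steps = calibrate(0.05)  # 50 ms budget for the demo; the article uses 1 second
print("strength for this device:", steps)
print(slow_hash("letmein", b"salt", steps)[:16] + "...")
```

The measured step count depends on the device and its current load, which is exactly the weakness discussed next.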

This scheme gives users with powerful machines higher security, while users with weak machines can still use the site. (A better computer even buys better security. What a sense of superiority ~)

But there is an important precondition: registration and login must happen on devices of similar performance. If an account registered on a high-end computer is one day used on an antique machine, the result is a tragedy. The login may not finish for ages...

 

3. Dynamic Adjustment Plan

The situation above is common in practice. For example, an account registered on a PC is used on a mobile device, where computing power is insufficient. Without back-end assistance, the user can only wait. But if I often log in on a low-end device, do I have to wait every time? After one or two slow logins, the device's capability can be estimated and the encryption strength lowered to suit the current environment. Later, if low-end devices are no longer used, the strength is raised back automatically. The encryption strength thus adapts dynamically to the computing power of the devices the user commonly uses.

The implementation principle is similar to the automatic replacement of salt values in the previous section.

 

4. A whimsical solution

What follows is a far-fetched, daydream solution, and it assumes that the website has enough traffic. If there are many users online, aren't they a pool of free computing nodes? Heavy computations can be handed to them.

But there are concerns. What if a task is handed to a bad actor? Clearly, sensitive data must not be sent out in full. Nodes only carry out the computation and need not know the ultimate purpose of the task.

And what if a prankster node deliberately returns wrong results? A task therefore cannot be given to only one node. Send it to several nodes and accept the result only if they agree. This reduces the risk.

Compared with P2P computing, a website is centralized and its users are identifiable, which makes management easier. Pranksters can be penalized, and users who help can be rewarded.

So much for imagination. Back to practical matters.

 

0x08 Performance Optimization

 

1. Why optimization?

You may ask: isn't the whole point of "slow encryption" to make the computation slower? Why optimize it?

If this were a home-made algorithm that outsiders did not know, leaving it unoptimized would be fine. You could even put empty loops in it to burn time deliberately. In practice, however, we must choose public algorithms recommended by cryptographers. Every operation in them is mathematical. If an operation that needs only one CPU instruction takes two because of poor optimization, the extra time is pure overhead. Encryption takes longer, but the strength has not increased.

 

2. Weakness of front-end computing

For a native program there is no need to worry about this: just leave it to the compiler. But in the Web environment, we can only compute in the browser! Scripts are much slower than native programs, so the overhead is large.

Why is the script slow? The main points are as follows:

  • Weak type

  • Interpreted type

  • Sandbox

 

3. Weak type

Scripts were designed for simple logic, not intensive computing, so strong typing was never needed. But now there is a clever piece of technology: asm.js. Through syntax conventions (annotations such as x|0), it gives JS real static types. This greatly improves computing speed, which can approach the performance of native programs!

But what if the browser does not support asm.js? For example, many users in China still use IE, whose computing power is very low. Fortunately, there is a fallback: Flash, which has features of high-performance languages, such as types.

Flash is slower than asm.js, but faster than IE.

 

4. interpreted type

An interpreted language not only has to be parsed at run time, it also misses out on the performance gained from deep optimization at compile time. Fortunately, Mozilla provides a tool that compiles C/C++ into asm.js: emscripten. With it, there is no need to write asm.js by hand. In addition, because LLVM optimizes during compilation, the generated code is of higher quality.

In fact, Flash had this concept long ago. A tool named Alchemy could cross-compile C/C++ into Flash virtual machine instructions, which ran much faster than ActionScript.

Alchemy has since been renamed FlasCC, and there is an open-source version, CrossBridge.

 

5. Sandbox

Some operations that look simple in a native language are not necessarily simple in a sandbox. For example, an array write:

  vector[k] = v

The virtual machine must first check whether the index is out of bounds; otherwise there would be serious problems. If the "front-end slow encryption" algorithm performs a lot of random memory access, a lot of time is lost to this overhead, so it has to be considered carefully. But in some special cases, a script can even outrun a native program! One example is the repeated MD5 computation mentioned at the beginning.

This is not difficult to explain:

First, the MD5 algorithm is simple. Without memory operations such as table lookups, it uses only local variables. The locations of local variables are fixed, which avoids the overhead of bounds checks. Second, emscripten's optimization is not inferior to that of native compilers. Finally, the machine code of a native program is fixed once compiled, whereas a script engine has a JIT compiler that generates better-optimized machine code at run time, based on actual behavior.

Therefore, when choosing an encryption algorithm, take the actual run-time environment into account, so as to play to its strengths and avoid its weaknesses.

 

0x09 Fighting the GPU

As is well known, cracking passwords on a GPU can be much faster. A GPU can be thought of as a processor with hundreds or thousands of cores that can execute only simple instructions. Each core is slower than a CPU core, but the GPU wins by sheer numbers. In brute-force cracking, thousands of words can be taken from the dictionary and run at the same time, which greatly improves cracking efficiency.

Can we add some features in the algorithm to hit the weak points of the GPU?

 

1. Memory bottleneck

You may have heard of Litecoin. Unlike Bitcoin, Litecoin mining uses the scrypt algorithm. This algorithm depends heavily on memory and frequently reads and writes a large table. Although each GPU thread can compute independently, all threads share the same video memory. They have to compete for memory access, so many of them end up waiting. In this way, the concurrency advantage is greatly curbed.
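For a feel of what a memory-dependent hash looks like in practice, here is a small sketch using `hashlib.scrypt` from the Python standard library (available when Python is built against OpenSSL 1.1 or later). The parameters are example values chosen for this sketch, not the ones Litecoin uses.

```python
import hashlib

def memory_hard_hash(password: str, salt: bytes) -> str:
    """scrypt with n=2**14, r=8: the working table is about 128 * n * r = 16 MiB."""
    derived = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,                  # CPU/memory cost (number of table entries)
        r=8,                      # block size
        p=1,                      # parallelization factor
        maxmem=64 * 1024 * 1024,  # allow up to 64 MiB so the call is not rejected
        dklen=32,
    )
    return derived.hex()

print(memory_hard_hash("letmein", b"example-salt")[:16] + "...")
```

Every concurrent guess needs its own table of this size, so running thousands of guesses in parallel requires thousands of times the memory.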

 

2. Porting difficulty

When new coins were springing up everywhere, one named X11Coin appeared, which was said to resist ASICs. Its principle is very simple: it combines 11 different encryption algorithms, which greatly increases the complexity of a corresponding ASIC. Although this is not a long-term defense, the idea is worth borrowing. If something is too complex, many attackers will be put off and go after easier targets instead.

 

3. Other Ideas

GPUs are popular for cracking because current encryption algorithms consist of simple arithmetic, for which the CPU has no particular advantage. Could an algorithm be designed to rely fully on the strengths of the CPU? CPUs have many hidden strengths, such as pipelining. If an algorithm has a large number of conditional branches, a GPU may not handle it well.

Of course, this is only a thought. Designing an encryption algorithm yourself is very difficult and is not recommended.

 

0x0A Additional benefits

Besides slowing down password cracking, front-end slow encryption has some other benefits:

 

1. Reduced risk of leakage

The plaintext password entered by the user is encrypted in front-end memory. Once the data leaves the browser, the plaintext can no longer leak. Even if the communication is eavesdropped on, or there is malicious middleware on the server, the plaintext password cannot be obtained, unless the webpage itself contains malicious code or the user's system has malware.

 

2. Plaintext cannot be stored secretly

Most websites claim that they do not store users' plaintext passwords. But there is no proof, and they may be storing them quietly. If encryption is done in the front end, the website never receives the user's plaintext password. Of course, many websites are reluctant to adopt front-end encryption.

In fact, it does not matter whether the website is willing or not. We can build a standalone slow encryption plug-in. When a password box on a webpage is focused, the plug-in pops up. The user enters the password in the plug-in, which runs the slow encryption and then fills the result into the password box on the page. This works on any website. Of course, accounts that are already registered cannot use it directly; their passwords must be changed manually.

 

3. Higher cost of credential stuffing

Slow front-end encryption consumes the user's computing power. This disadvantage is sometimes a good thing.

For a normal user, waiting one more second at login matters little. For someone who logs in very frequently, however, it is an obstacle. Who logs in that frequently? Possibly a credential-stuffing attacker. Unable to steal the website's database, they keep testing accounts for weak passwords through online login. If the site limits the frequency by IP address, the attacker can use a large number of proxies. The attack is then as fast as the network allows.

With front-end slow encryption, however, every password attempt costs the attacker a large amount of computation, so the bottleneck moves to hardware: the attack is only as fast as the attacker can compute. This is somewhat similar in spirit to PoW (Proof-of-Work). PoW will be introduced in detail later.

 

0x0B What it cannot do

Although "front-end slow encryption" has many advantages, it is not omnipotent. As mentioned in the previous section, it reduces risk rather than eliminating it. If the local environment is compromised, entering any password is risky.

Consider this scenario: a website uses "front-end slow encryption" but not HTTPS, so the connection can be eavesdropped on. Recalling section 0x05, an attacker who obtains the "slow encryption result" can log in to the account directly, even without knowing the password. That is true. But think carefully: does this still reduce the loss?

Previously, not only was the account stolen, the plaintext password leaked as well. Now only the account is stolen, and the attacker cannot obtain the plaintext password. So what front-end slow encryption really protects is the "password" rather than the "account". The account may be stolen, but the password still cannot be obtained!

If the attacker can not only eavesdrop but also tamper with traffic, they can inject an attack script into the page to capture the plaintext password. That, of course, is the same class of problem as malware and keyloggers and is beyond the scope of this article, which deals with database leaks.

 

0x0C Multi-threaded slow encryption

Users' hardware keeps getting better, and many machines have quad-core or eight-core processors. Can multithreading be used to split up the slow encryption computation? If each step depends on the previous result, it cannot be split. For example:

  for i = 1 to 10000
      x = hash(x)
  end

This is a serial computation, and only parallelizable work can be divided into smaller tasks. However, multithreading can still be used in another way. For example, with four threads:

  # Thread 1
  x1 = hash(password + "salt1")
  for i = 1 to 2500
      x1 = hash(x1)
  end
  # Thread 2
  x2 = hash(password + "salt2")
  for i = 1 to 2500
      x2 = hash(x2)
  end
  # ...

Finally, the four results are combined and encrypted once more to give the slow encryption result. But does this make cracking easier? That is left for the reader to think about.
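The four-lane construction can be sketched with a thread pool. This only illustrates the structure: SHA-256 is an assumed stand-in for `hash`, and in CPython the global interpreter lock means these small hash calls will not actually run in parallel (in a browser, the lanes would be Web Workers).

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def chain(password: str, lane_salt: str, steps: int) -> bytes:
    """One lane: seed with password + lane salt, then iterate serially."""
    x = hashlib.sha256((password + lane_salt).encode("utf-8")).digest()
    for _ in range(steps):
        x = hashlib.sha256(x).digest()
    return x

def parallel_slow_hash(password: str, lanes: int = 4, steps_per_lane: int = 2500) -> str:
    """Run independent lanes, then hash the concatenated lane results once more."""
    salts = [f"salt{i}" for i in range(1, lanes + 1)]
    with ThreadPoolExecutor(max_workers=lanes) as pool:
        # pool.map returns results in the same order as `salts`
        results = list(pool.map(lambda s: chain(password, s, steps_per_lane), salts))
    return hashlib.sha256(b"".join(results)).hexdigest()

print(parallel_slow_hash("letmein")[:16] + "...")
```

Note that the lanes are independent, so an attacker with enough cores can parallelize each guess in the same way.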

 

0x0D Summary

Front-end slow encryption lets every user contribute a small amount of computing power to make the stored passwords stronger. Even if the data leaks, the attacker is up against the combined computing power of all the website's users, which greatly increases the cracking cost.

 

0xFF Postscript

When Bitcoin became popular a few years ago, I had a sudden whim to mine with a browser. It never came to anything, but I picked up some cryptography knowledge along the way. I recently tidied up those notes and added some new ideas, so I wrote this detailed article to share them. Cryptography is a traditional field, and combining it with popular Web technology can yield fresh ideas.

If you have any questions about the scheme, reread section 0x05 first.

If you have read this far patiently, I hope you got something out of it :)
