Web Front-End Slow Encryption
0x00 Preface
In martial arts, the saying goes, speed alone is unbeatable. Password encryption is different: the faster the algorithm, the easier it is to break.
0x01 Brute-force cracking
Password cracking means recovering the plaintext password from its encrypted form. There seem to be many ways to do it, but in the end they all come down to one: brute force.
You may object that a lookup table returns the result in an instant. The lookup itself needs no exhaustive search, but building the table does. A lookup table only moves the brute-force work earlier.
As long as the algorithm is irreversible, exhaustive search is the only option.
The principle is very simple. As long as we know which algorithm produced the ciphertext, we run common phrases through the same algorithm. If a result matches the ciphertext, we have guessed the password.
How fast is an exhaustive search? That depends on the encryption algorithm: guessing is exactly as fast as encrypting.
For example, MD5 is very fast. If one encryption takes 1 microsecond, then guessing one phrase during an attack also takes only 1 microsecond (assuming similar machine performance and phrase length). The attacker can make 1 million guesses per second, and that is on a single thread.
Therefore, the faster the encryption algorithm, the easier it is to crack.
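The guessing loop described above can be sketched in a few lines of Python. This is only an illustration: the word list and the "leaked" hash are made up, and md5_hex and dictionary_attack are hypothetical helper names.

```python
import hashlib
from typing import Optional, Sequence


def md5_hex(text: str) -> str:
    """Return the MD5 digest of `text` as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def dictionary_attack(target_hash: str, candidates: Sequence[str]) -> Optional[str]:
    """Hash every candidate with the same algorithm and compare with the target."""
    for word in candidates:
        if md5_hex(word) == target_hash:
            return word  # the guess matched the leaked ciphertext
    return None  # nothing in the dictionary matched


# Made-up data for illustration only.
leaked = md5_hex("letmein")
wordlist = ["123456", "password", "letmein", "qwerty"]
print(dictionary_attack(leaked, wordlist))
```

Each guess costs exactly one hash computation, which is why the speed of the algorithm sets the speed of the attack.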
0x02 Slow encryption
If encryption takes longer, cracking obviously takes longer too.
If each encryption is stretched to 10 ms, the attacker can only make 100 guesses per second, and cracking becomes 10 thousand times slower.
How can we make encryption slow? The simplest way is to encrypt the result again, and repeat this many times.
For example, repeating the original 1-microsecond encryption 10 thousand times makes it 10 thousand times slower:
for i = 0 ~ 10000
    x = md5(x)
end
A little extra encryption time for us costs the attacker a great deal of cracking time.
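As a rough sketch, the loop above might look like this in Python. The function name repeated_md5 is hypothetical, and the measured time depends entirely on the machine, so no particular figure should be expected.

```python
import hashlib
import time


def repeated_md5(data: bytes, rounds: int) -> bytes:
    """Feed the digest back into MD5 `rounds` times."""
    x = data
    for _ in range(rounds):
        x = hashlib.md5(x).digest()
    return x


start = time.perf_counter()
digest = repeated_md5(b"password", 10_000)
elapsed_ms = (time.perf_counter() - start) * 1000
print(digest.hex(), f"{elapsed_ms:.2f} ms for 10,000 rounds")
```

An attacker has to pay the same 10,000 rounds for every single guess.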
In fact, such "slow encryption" algorithms already exist, for example bcrypt and scrypt. Each has a difficulty factor that controls the encryption time, so it can be made as slow as you want.
The slower the encryption, the longer the cracking time.
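To see a difficulty factor in action, here is a small sketch using the scrypt function from Python's standard hashlib module (available when Python is built against a recent OpenSSL). The salt and the cost values are arbitrary examples, and slow_hash is a hypothetical wrapper name.

```python
import hashlib
import time


def slow_hash(password: bytes, salt: bytes, n: int) -> bytes:
    """Derive a 32-byte key with scrypt; n is the CPU/memory cost (a power of two)."""
    return hashlib.scrypt(password, salt=salt, n=n, r=8, p=1, dklen=32)


# Raising n makes every call, and therefore every guess, more expensive.
for n in (2**10, 2**12, 2**14):
    start = time.perf_counter()
    slow_hash(b"password", b"example-salt", n)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"n={n}: {elapsed_ms:.1f} ms")
```

The timings vary from machine to machine; the point is only that they grow with n.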
0x03 Applications of slow encryption
What needs slow encryption most is the passwords in a website's database.
In recent years we have often heard news of website databases being stolen. User data is stored in plaintext, so once leaked it cannot be taken back. Only the passwords, being encrypted, can still put up a fight against the attacker.
However, many websites use fast encryption algorithms, so attackers can easily crack a whole batch of weak-password accounts.
Of course, sometimes the attacker only wants one specific person's account. As long as the password is not complex, a few days of running is very likely to crack it.
If the website uses slow encryption, the outcome may be different. If the encryption time is increased 100 times, cracking takes several months, which becomes unacceptable.
Even if the data leaks, this last piece of privacy, the password, is still protected.
0x04 Disadvantages of slow encryption
However, slow encryption also has an obvious disadvantage: it consumes a lot of computing resources.
If many users log in at the same time on a website that uses slow encryption, the server CPU may not keep up. If a malicious user sends a large number of login requests, resources may even be exhausted.
Performance and security are therefore hard to have at the same time.
Some large websites have even invested in clusters dedicated to the heavy encryption workload. However, this is costly.
Is there a source of computing power that is both plentiful and free?
0x05 Front-end encryption
In the past, there was a big gap between the performance of personal computers and servers. However, as hardware development runs into bottlenecks, this gap is narrowing. For single-threaded tasks, the two are even comparable.
The client has plenty of computing power. Can it take over some of the server's work?
For a task like slow encryption, where the algorithm is public but the computation is heavy, why not hand it to the client?
In the past, the plaintext password was submitted. Now, what is submitted is the slow encryption result of the plaintext password, for both registration and login.
The server needs no changes. It simply treats the slow encryption result it receives as if it were the original plaintext password. However it stored passwords before, it stores them the same way now.
In this way, even if the database is stolen, cracking it only yields the slow encryption result. The attacker still has to crack that once more to recover the plaintext password.
In fact, this intermediate value, the slow encryption result, cannot be cracked!
That is because it is a hash value: a random-looking string, such as a 32-character hexadecimal string. Dictionaries consist of meaningful phrases, so running a dictionary will almost never hit it!
The only alternative is to enumerate it byte by byte. However, there are 16 ^ 32 combinations (about 3.4 × 10^38), which is an astronomical number.
Therefore, the "slow encryption result" cannot be recovered from the leaked ciphertext in the database.
Maybe you are thinking: even without knowing the plaintext password, an attacker could log in directly with the "slow encryption result". But the backend encrypts it once more before storing it, so that value cannot be recovered from the database either.
Of course, although the hash cannot be reversed, it can be computed forward. Run the phrases in the dictionary one by one through the front-end and back-end algorithms:
back_fast_hash( front_slow_hash(password) )
Then compare the result with the ciphertext to see whether the guess is right. In this way, a dictionary attack is still possible.
However, with front_slow_hash in the way, the cracking speed is greatly reduced.
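The division of labor in this section can be sketched as follows. Everything here is an assumption made for illustration: PBKDF2 stands in for whatever slow algorithm the front end would really run, SHA-256 stands in for the back end's existing fast hash, the round count and the fixed placeholder salt are arbitrary, and a dictionary plays the role of the database.

```python
import hashlib
import hmac

FRONT_ROUNDS = 50_000  # assumed cost; a real deployment would tune this
database = {}          # user name -> stored hash (stands in for the real database)


def front_slow_hash(password: str) -> str:
    """Runs in the browser in the real scheme: slow on purpose."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), b"placeholder-salt", FRONT_ROUNDS)
    return key.hex()


def back_fast_hash(submitted: str) -> str:
    """Runs on the server: the same cheap hash it always used."""
    return hashlib.sha256(submitted.encode()).hexdigest()


def server_register(user: str, submitted: str) -> None:
    # The server treats the submitted value exactly like a plaintext password.
    database[user] = back_fast_hash(submitted)


def server_login(user: str, submitted: str) -> bool:
    stored = database.get(user)
    if stored is None:
        return False
    return hmac.compare_digest(stored, back_fast_hash(submitted))


server_register("alice", front_slow_hash("hunter2"))
print(server_login("alice", front_slow_hash("hunter2")))  # correct password
print(server_login("alice", database["alice"]))           # replaying the leaked hash
```

The server only ever runs the fast hash, while an attacker holding the database must run both functions for every dictionary entry.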
0x06 Countering precomputation
However, everything on the front end is public, so everyone knows the front_slow_hash algorithm.
Attackers can use it to compute the slow encryption results of common phrases in advance and build a "new dictionary". When a database is stolen later, they can run the new dictionary directly.
To counter this, we need the classic remedy: adding salt. The simplest way is to use the user name as the salt value:
front_slow_hash(password + username)
In this way, even with the same password, the "slow encryption result" differs from user to user.
You may say that the user name is a poor salt because it is public. Attackers can build a dictionary in advance for one important account.
Can a hidden salt value be provided instead? The answer is no.
Because this is the front end. The user has not logged in yet, so whose salt value should be returned? And if an account's salt value can be obtained before login, isn't it public anyway?
Therefore, a salt used for front-end encryption cannot be hidden; it can only be public.
Of course, even though it is public, a dedicated salt parameter is still better than the user name, because the user name never changes, while an independent salt value can be replaced regularly.
The salt value can be generated by the front end. For example, at registration:
# Generate a salt value
salt = rand()
password = front_slow_hash(password + salt)

# Include the salt value in the submission
submit(..., password, salt)
The backend stores the user's salt value.
When logging in, the front end first uses the entered user name to query that user's salt value.
Of course, note that this interface can be used to test whether a user exists, so it needs some controls.
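A minimal sketch of the salted flow, under several assumptions: PBKDF2 and SHA-256 stand in for the slow and fast hashes, a dictionary stands in for the account table, and all function names are hypothetical. The fake salt returned for unknown users is just one possible control against user probing, not something the scheme prescribes.

```python
import hashlib
import hmac
import secrets

ROUNDS = 50_000                          # assumed front-end cost
SERVER_SECRET = secrets.token_bytes(32)  # used only to derive fake salts
accounts = {}                            # user -> {"salt": ..., "hash": ...}


def front_slow_hash(password: str, salt: str) -> str:
    """Browser side: slow hash of the password with the public per-user salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt.encode(), ROUNDS).hex()


def back_fast_hash(submitted: str) -> str:
    return hashlib.sha256(submitted.encode()).hexdigest()


def client_register(user: str, password: str) -> None:
    salt = secrets.token_hex(16)  # the front end generates the salt
    accounts[user] = {"salt": salt, "hash": back_fast_hash(front_slow_hash(password, salt))}


def server_get_salt(user: str) -> str:
    """Salt query used before login. Unknown users get a stable fake salt."""
    account = accounts.get(user)
    if account is not None:
        return account["salt"]
    return hmac.new(SERVER_SECRET, user.encode(), hashlib.sha256).hexdigest()[:32]


def server_login(user: str, submitted: str) -> bool:
    account = accounts.get(user)
    if account is None:
        return False
    return hmac.compare_digest(account["hash"], back_fast_hash(submitted))


client_register("alice", "hunter2")
client_register("bob", "hunter2")
print(accounts["alice"]["hash"] != accounts["bob"]["hash"])  # same password, different result
```

At login the client would call server_get_salt first, then submit front_slow_hash(password, salt).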
Replacing the salt value is also very simple, and can even be done automatically:
While the front end encrypts with the current salt, it also starts a new thread to compute a new salt value and the corresponding new password. Both are included when the request is submitted.
If the current password verifies successfully, the new password and new salt value overwrite the old ones.
In this way, the salt value is replaced using front-end computing power alone.
All of this is automatic. It is equivalent to changing the password regularly without the user noticing!
Once the ciphertext changes, any dictionary built for that specific salt value becomes useless and has to be rebuilt.
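The automatic replacement might be sketched like this. The assumptions are the same as before (PBKDF2 and SHA-256 as stand-ins, a dictionary as the account table, hypothetical function names), and the two slow hashes are computed one after the other here, whereas a browser could compute them in parallel threads.

```python
import hashlib
import hmac
import secrets

ROUNDS = 50_000  # assumed front-end cost
accounts = {}    # user -> {"salt": ..., "hash": ...}


def front_slow_hash(password: str, salt: str) -> str:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt.encode(), ROUNDS).hex()


def back_fast_hash(submitted: str) -> str:
    return hashlib.sha256(submitted.encode()).hexdigest()


def register(user: str, password: str) -> None:
    salt = secrets.token_hex(16)
    accounts[user] = {"salt": salt, "hash": back_fast_hash(front_slow_hash(password, salt))}


def server_login_and_rotate(user: str, submitted: str, new_submitted: str, new_salt: str) -> bool:
    """Verify the current value; only on success overwrite it with the new pair."""
    account = accounts.get(user)
    if account is None or not hmac.compare_digest(account["hash"], back_fast_hash(submitted)):
        return False
    accounts[user] = {"salt": new_salt, "hash": back_fast_hash(new_submitted)}
    return True


def client_login(user: str, password: str) -> bool:
    old_salt = accounts[user]["salt"]  # stands in for the salt query request
    new_salt = secrets.token_hex(16)
    current = front_slow_hash(password, old_salt)
    replacement = front_slow_hash(password, new_salt)
    return server_login_and_rotate(user, current, replacement, new_salt)


register("alice", "hunter2")
before = accounts["alice"]["hash"]
print(client_login("alice", "hunter2"), accounts["alice"]["hash"] != before)
```

Every successful login leaves a different salt and ciphertext in the database, even though the user's password never changed.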
0x07 Strength policy
That concludes the cryptography part. The rest discusses implementation issues.
In reality, users' computing power is uneven. Some have top-of-the-line machines, while others still use antiques. This makes it difficult to choose an encryption strength.
A login that takes dozens of seconds on an antique machine is definitely unacceptable. The options are as follows:
1. Fixed strength
A moderate strength is chosen according to the typical user's machine, one that most users can accept.
However, if the computation has not finished within the specified time, the intermediate hash and the number of steps completed are submitted, and the server completes the rest.
[Frontend] 70% completed ----> [backend] computing 30%
However, this requires a "serializable" algorithm, so that the progress can be restored on the server. If the computation keeps a large amount of temporary memory, this solution is not feasible.
Compared with doing 100% of the slow encryption on the back end, helping only this small minority of users still saves a lot of server resources.
Users who request assistance must also be subject to certain restrictions, to prevent malicious consumption of server resources.
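Here is a sketch of the hand-over, assuming the slow algorithm is a plain iterated SHA-256, whose entire state is the current digest and is therefore trivially serializable. The step counts, the minimum-work rule, and the function names are all assumptions for illustration.

```python
import hashlib
import time

TOTAL_STEPS = 200_000                 # assumed fixed strength
MIN_CLIENT_STEPS = TOTAL_STEPS // 2   # assumed restriction: the client must do at least half


def advance(state: bytes, steps: int) -> bytes:
    """Run `steps` more rounds starting from an intermediate state."""
    for _ in range(steps):
        state = hashlib.sha256(state).digest()
    return state


def client_compute(password: bytes, deadline_seconds: float):
    """Hash in chunks until finished or out of time; return (state, steps_done)."""
    state, done = password, 0
    start = time.perf_counter()
    while done < TOTAL_STEPS and time.perf_counter() - start < deadline_seconds:
        chunk = min(1_000, TOTAL_STEPS - done)
        state = advance(state, chunk)
        done += chunk
    return state, done


def server_finish(state: bytes, steps_done: int) -> bytes:
    """Complete the remaining rounds, refusing clients that did too little."""
    if not MIN_CLIENT_STEPS <= steps_done <= TOTAL_STEPS:
        raise ValueError("client did not complete enough steps")
    return advance(state, TOTAL_STEPS - steps_done)


state, done = client_compute(b"password", deadline_seconds=60.0)
print(done, server_finish(state, done).hex())
```

A memory-hard algorithm would have far more state than a single digest to ship to the server, which is the feasibility problem mentioned above.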
2. Variable Strength
If the back end provides no assistance, each user can only choose a strength that suits their own hardware. Users with weaker machines get weaker encryption.
When a user registers, the number of steps is not fixed in advance. Instead, we see how many steps can be computed within a given time:
# [Registration phase] evaluate computing power (thread terminated after 1 second)
while true
    x = hash(x)
    step = step + 1
end
This step count is the encryption strength, and it is saved in the user's account information.
Like the salt value, the strength is public, because the front end needs to know it in order to encrypt at login.
# [Login phase] obtain step first
for i = 0 ~ step
    x = hash(x)
end
With this solution, users with powerful machines enjoy higher security, and users with weak machines can still use the site normally. (Owning a good computer now even improves your security. What a sense of superiority ~)
But there is an important premise: registration and login must happen on devices with similar performance.
If an account registered on a high-end computer is one day used on an antique machine, it will be a tragedy. The result may not come out for ages...
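The registration and login loops above can be sketched as follows, assuming iterated SHA-256 as the hash and checking the clock once per chunk of rounds instead of terminating a thread. The names register and login and the chunk size are hypothetical.

```python
import hashlib
import time


def register(password: bytes, budget_seconds: float, chunk: int = 1_000):
    """Hash for roughly `budget_seconds`; return (digest, steps actually performed)."""
    x, steps = password, 0
    start = time.perf_counter()
    while time.perf_counter() - start < budget_seconds:
        for _ in range(chunk):
            x = hashlib.sha256(x).digest()
        steps += chunk
    return x, steps


def login(password: bytes, steps: int) -> bytes:
    """Repeat exactly the number of steps recorded at registration."""
    x = password
    for _ in range(steps):
        x = hashlib.sha256(x).digest()
    return x


digest, steps = register(b"password", budget_seconds=0.05)
print(steps, login(b"password", steps) == digest)
```

The step count differs from machine to machine, which is exactly why it has to be stored with the account and handed back at login.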
3. Dynamic adjustment
The situation above is common in practice. For example, an account registered on a PC is later used on a mobile device that lacks the computing power.
Without back-end assistance, the user can only wait. But if I often log in from a low-end device, do I have to wait every time?
After waiting once or twice, the device's capability can be estimated, and the encryption strength can be lowered dynamically to suit the current environment.
Later, if the low-end device is no longer used, the strength is automatically raised again. The encryption strength thus adapts to the computing power of the devices the user commonly uses.
The implementation works much like the automatic replacement of salt values in the previous section.
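One possible adjustment rule, purely as an assumption: after each login, scale the stored step count toward a target duration, but never by more than a factor of two at a time and never below a floor. The function name and every threshold here are made up for illustration.

```python
def adjust_steps(current_steps: int, elapsed_seconds: float,
                 target_seconds: float = 1.0, min_steps: int = 10_000,
                 max_change: float = 2.0) -> int:
    """Suggest the step count for the next login from how long this one took."""
    if elapsed_seconds <= 0:
        return current_steps
    # How far off the target this device was, limited to avoid wild swings.
    ratio = target_seconds / elapsed_seconds
    ratio = max(1.0 / max_change, min(max_change, ratio))
    return max(min_steps, int(current_steps * ratio))


print(adjust_steps(100_000, 4.0))  # slow device: strength is halved
print(adjust_steps(100_000, 0.5))  # fast device: strength is doubled
```

The client would then compute the password under the new step count alongside the current one and submit both, in the same way a new salt value is submitted.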
4. A whimsical solution
The following is a wild, purely imaginary idea, and it presupposes that the website has enough traffic.
If there are many users online, aren't they a pool of free computing nodes? Heavy computations could simply be handed to them.
But this raises some doubts. What if a task is pushed to a bad actor?
Obviously, sensitive data cannot be handed out. Nodes only perform the computation and need not know the ultimate purpose of the task.
But what if a prankster node deliberately returns a wrong result?
Therefore, a task cannot be pushed to only one node. Send it to several and accept the final result only if they agree. This reduces the risk.
Compared with P2P computing, a website has a central, real-name structure, which makes management easier. Pranksters can be penalized, and users who help can be rewarded.
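The agreement check could be as simple as the following sketch. The quorum rule and the function name agreed_result are assumptions; the text only requires that the results from several nodes be consistent.

```python
from collections import Counter
from typing import Optional, Sequence


def agreed_result(results: Sequence[bytes], quorum: int) -> Optional[bytes]:
    """Return the most common result if at least `quorum` nodes reported it."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None


# Three nodes were asked; one of them returned garbage.
print(agreed_result([b"\x01\x02", b"\x01\x02", b"oops"], quorum=2))
```

Nodes whose answer differs from the agreed result are natural candidates for the penalties mentioned above.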
Enough imagining. Let's get back to practical matters.
0x08 Performance Optimization
1. Why optimize?
Maybe you will ask: isn't the whole point of "slow encryption" to make the computation slower? Why optimize it?
If this were a home-made algorithm that outsiders could not understand, leaving it unoptimized would be fine. You could even add some empty loops to burn time on purpose.
But in practice we must choose public algorithms recommended by cryptographers, and every operation in them has a mathematical purpose.
Suppose an operation needs only one CPU instruction, but poor optimization makes it take two. The extra time is pure overhead: the computation takes longer, but the strength has not increased.
2. Weaknesses of front-end computing
For a native program there would be little to worry about: just leave it to the compiler.
But on the Web, we can only compute in the browser! Scripts are much slower than native programs, so the overhead is large.
Why are scripts slow? Mainly for three reasons: weak typing, interpretation, and the sandbox.
3. Weak typing
Scripts are meant for simple logic, not intensive computing, so there was no need for strong types.
But now we have a piece of black magic: asm.js. Through special syntax conventions, it gives JS genuine strong types.
This greatly increases the computing speed, bringing it close to the performance of native programs!
But what if the browser does not support asm.js? For example, China has a large number of Internet Explorer (IE) users, whose computing power is very low.
Fortunately, there is a fallback: Flash, which has the features of a high-performance language, such as types.
Flash is slower than asm.js, but faster than IE.
4. Interpretation
An interpreted language not only pays for parsing at run time, it also misses out on the performance gains of deep compile-time optimization.
Fortunately, Mozilla provides emscripten, a tool that compiles C/C++ into asm.js.
With it, there is no need to write asm.js by hand. In addition, the generated code is of higher quality, because LLVM optimizes it during compilation.
In fact, the same concept already existed for Flash.
A tool named Alchemy could cross-compile C/C++ into Flash virtual machine instructions, which ran much faster than ActionScript.
Alchemy was later renamed FlasCC, and there is also an open-source version, CrossBridge.
5. Sandbox
Some operations that look simple in a native language are not necessarily simple in a sandbox. Take an array write, for example:
vector[k] = v
The virtual machine must first check whether the index is out of bounds; otherwise there would be serious problems.
If the "front-end slow encryption" algorithm involves a large amount of random memory access, a lot of time is wasted on this overhead, so the choice has to be considered carefully.
But in some special cases, scripts can even outrun native programs! One example is the repeated MD5 computation mentioned at the beginning.
This is not hard to explain:
First, the MD5 algorithm is simple. It needs no memory operations such as table lookups; local variables are enough, and since their locations are fixed, there is no bounds-checking overhead. Second, emscripten's optimizations are not inferior to those of native compilers. Finally, once a native program is compiled, its machine instructions never change, whereas today's script engines have JIT compilers that generate better-optimized machine instructions at run time based on actual running conditions.
Therefore, when selecting an encryption algorithm, take the actual running environment into account, so as to play to its strengths and avoid its weaknesses.
0x09 Countering GPUs
As we all know, cracking passwords on a GPU can be much faster.
A GPU can be thought of as a processor with hundreds or even thousands of cores, each of which can only execute fairly simple instructions. A single core is slower than a CPU, but the GPU wins on numbers.
In a brute-force attack, thousands of phrases can be taken from the dictionary and run at the same time, which greatly improves cracking efficiency.
Can we add some features to the algorithm that hit the GPU where it is weak?
1. Memory bottleneck
You have probably heard of Litecoin. Unlike Bitcoin, Litecoin uses the scrypt algorithm.
This algorithm relies heavily on memory and frequently reads and writes a large table. Although each GPU thread can compute independently, there is only one video memory, and all threads share it.
This means the threads contend for the same video memory: while some are accessing it, the others have to wait. The concurrency advantage is greatly curbed.
2. Porting difficulty
Back when altcoins were springing up everywhere, a coin named X11Coin appeared, which was said to resist ASICs.
Its principle is very simple: it combines 11 different encryption algorithms. This greatly increases the complexity of a corresponding ASIC.
Although this is not a long-term defense, the idea is worth borrowing. If something is too complicated, many attackers are put off and would rather go after an easier target.
3. Other Ideas
The reason GPUs do so well is that current encryption algorithms consist of simple formula operations, which give the CPU little advantage.
Can an algorithm be designed that fully relies on the strengths of the CPU?
CPUs have many hidden strengths, such as pipelines. If the algorithm has a large number of conditional branches, a GPU may not handle it well.
Of course, this is just speculation. Creating an encryption algorithm yourself is very difficult, and it is not recommended.
0x0A Additional benefits of "front-end slow encryption"
Besides slowing down password cracking, front-end slow encryption has some other benefits:
1. Reducing the risk of leakage
The plaintext password entered by the user is encrypted in the front end's memory. Once the data leaves the browser, there is no longer any risk of the plaintext leaking.
Even if the link is eavesdropped on, or there is malicious middleware on the server, the plaintext password cannot be obtained.
The only exceptions are malicious code in the webpage or malware on the user's system.
2. Plaintext cannot be stored in secret
Most websites claim that they do not store users' plaintext passwords. But there is no proof, and they may be storing them quietly.
If the password is encrypted on the front end, the website has no way of obtaining the user's plaintext password.
However, many websites are reluctant to use front-end encryption.
In fact, it does not matter whether the website is willing or not. We can build a standalone version of slow encryption as a plug-in.
When a password box on a webpage is selected, the plug-in pops up. The user enters the password in the plug-in, which performs the slow encryption and finally fills the result into the password box on the page.
In this way, it works with any website. Of course, accounts that are already registered will not work as they are; they have to be adjusted manually.
3. Raising the cost of credential stuffing
Front-end slow encryption consumes the user's computing power. This disadvantage is sometimes a good thing.
For normal users, waiting one extra second at login matters little. For someone who logs in very frequently, however, it is an obstacle.
Who logs in very frequently? Possibly a credential-stuffing attacker. They cannot steal the website's database, so they keep testing weak-password accounts through online login.
If the site throttles by IP address, attackers can find a large number of proxies, and then they can try as fast as the network allows.
However, with front-end slow encryption, every password attempt costs the attacker a large amount of computation, so the bottleneck moves to the hardware: they can only try as fast as they can compute.
So this is somewhat similar to PoW (Proof-of-Work). We will introduce PoW in detail later.
0x0B Can slow encryption be computed in parallel?
Users' machines keep getting better, and many have quad-core or eight-core processors. Can we use multithreading to split up the slow encryption computation?
If each step depends on the result of the previous one, the computation cannot be split. For example:
for i = 0 ~ 10000
    x = hash(x)
end
This is a serial computation. Only a parallel computation can be divided into multiple small tasks.
However, multithreading can still be used in a different way. For example, with four threads:
# Thread 1
x1 = password
for i = 0 ~ 2500
    x1 = hash(x1 + "salt1")
end

# Thread 2
x2 = password
for i = 0 ~ 2500
    x2 = hash(x2 + "salt2")
end

# ...
Finally, the four results are combined and encrypted once more to give the slow encryption result.
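The four-lane structure might be sketched as follows, assuming SHA-256 as the hash. The threads here only show the shape of the computation: in CPython the global interpreter lock means they may not actually run faster, whereas a browser would use real worker threads. The names chain and parallel_slow_hash are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def chain(password: bytes, lane_salt: bytes, rounds: int) -> bytes:
    """One independent serial chain, distinguished by its own salt."""
    x = password
    for _ in range(rounds):
        x = hashlib.sha256(x + lane_salt).digest()
    return x


def parallel_slow_hash(password: bytes, lanes: int = 4, rounds_per_lane: int = 2500) -> bytes:
    salts = [f"salt{i + 1}".encode() for i in range(lanes)]
    with ThreadPoolExecutor(max_workers=lanes) as pool:
        # map() returns the lane results in the same order as the salts.
        results = list(pool.map(lambda s: chain(password, s, rounds_per_lane), salts))
    # Combine the lane results and hash once more.
    return hashlib.sha256(b"".join(results)).digest()


print(parallel_slow_hash(b"password").hex())
```

The total work is still 10,000 hash rounds, but the longest chain that must run serially is only 2,500 rounds.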
But does this reduce the strength? That is left for the reader to think about.
0x0C Summary
Front-end slow encryption lets every user contribute a small amount of computing resources, making the encryption stronger.
Even if the data leaks, the attacker is up against the pooled computing power of all the website's users, which greatly increases the cracking cost.
0xFF Postscript
When Bitcoin became popular a few years ago, I had a sudden whim to mine with a browser. It never got done, but I picked up some cryptography know-how along the way.
I recently went back over that work, added some new ideas, and wrote this detailed article to share them.
Cryptography is a traditional field, so it takes a combination with popular Web technology to produce something new.
If you have questions about the algorithm, read section 0x05 first.
If you have patiently read this far, I hope you got something out of it :)