Adversarial Robots: Building a WAF that Combines Front and Back Ends
Earlier we introduced several man-in-the-middle attack schemes that combine front and back ends. Because of the peculiar nature of Web programs, bringing front-end script into play can compensate greatly for the backend's shortcomings and achieve effects that traditional approaches find hard to reach.
If these ideas serve attack, they can serve defense just as well. What improvements could we gain by folding front-end technology into a traditional WAF?
Robot threats
Note: the author's original article used the term "dummy"; the editor has changed it to "robot" here.
Ease of use is the Web's greatest advantage. It is also a fatal weakness.
The simple format and uniform standards allow attackers to run off-the-shelf security tools for large-scale, general-purpose scanning and intrusion, without even needing to understand how any of it works.
Imagine a website built on a private binary protocol: even after finding a vulnerability, an attacker would first have to solve the communication problem, and counting on ready-made tools would be harder still. In reality such sites scarcely exist; universality and low cost always come first.
A protocol this simple is trivially imitated, so robots turn up wherever repetitive work is needed, and they are indispensable to security work, which depends on repeated probing.
Traditional WAF
A traditional WAF concentrates on traffic monitoring: recording and intercepting abnormal input and output. Verifying that the user is genuine has never been its focus.
In reality, though, most abnormal requests are not initiated by human users at all. Who would idle away hours typing account after account to run credential-stuffing tests, or keep probing the intranet from a browser by hand? Without tool assistance, security testing would be painfully difficult.
Unfortunately, on the surface a WAF can hardly tell genuine users from fake ones; it has to treat everyone alike and can only decide after comprehensively analyzing finer-grained rules. Decisions made this way tend to be "reasonable, but without proof":
Reasonable: a normal user issues only a few requests per second, while an attacker running a vulnerability scanner fires hundreds in a short time, which plainly defies common sense.
Without proof: although judging such a user is not hard, producing conclusive evidence is.
Fuzzy rules like these inevitably misfire. If someone on the same intranet attacks the website, normal users sharing that exit IP may be blocked along with them; conversely, an attacker can slow the scan down to slip under the radar.
Of course, a good set of rules and models can make interception more accurate, but that takes a great deal of analysis and accumulated data. For a special medium like the Web, can we find a simpler, more reliable solution?
As the earlier traffic-hijacking series pointed out, pure backend analysis is passive. Fortunately, the WAF holds power over the traffic, and Web traffic has a special property: it is executable. That capability lets the WAF take the offensive and opens up a whole new way of fighting.
So we turn to front-end technology to reach the ultimate goal: a rule system that is not merely reasonable but well-founded.
Convincing evidence is not born out of thin air; it has to be planted by us in advance and carried along at the right moment, so that a verdict is truly "reasonable and backed by proof".
Existing solutions
Before unveiling our "front-end WAF" concept, it is worth surveying the existing solutions.
Faced with this situation, a security engineer can offer a fix without much thought: for example, generate a unique nonce and timestamp on the page, encrypt them, and let the backend verify the result.
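A minimal sketch of such a hand-rolled, per-business scheme. The parameter names and the toy sign() scrambler are illustrative assumptions, not anything prescribed by the article:

    // Hypothetical per-business scheme: each request carries a one-time
    // nonce, a timestamp, and a signature computed on the page.
    function sign(nonce, ts) {
      var s = nonce + '|' + ts, h = 0;   // toy scrambler, illustration only
      for (var i = 0; i < s.length; i++) h = (h * 31 + s.charCodeAt(i)) >>> 0;
      return h.toString(16);
    }
    var nonce = Math.random().toString(36).slice(2);
    var ts = Date.now();
    var url = '/api/login?nonce=' + nonce + '&ts=' + ts + '&sig=' + sign(nonce, ts);
    // the backend re-computes the signature, checks the timestamp window,
    // and rejects any nonce it has already seen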
As a one-off remedy that is understandable. In practice, though, such requirements are anything but rare, and hand-building a scheme like this for every business would drive up front-end and backend development and maintenance costs enormously.
It is therefore unreasonable to hand developers a pile of case-by-case solutions. Developers should pour their energy into the product's business logic; everything unrelated belongs in an adaptation layer, applied automatically without their needing to grasp any of the details.
What we need is a front-plus-back-end aspect system that handles these chores transparently, allowing large-scale deployment and unified updates and maintenance afterwards.
Combining front and back ends in the WAF
As mentioned, the Web "middleware", our WAF, is the natural starting point. Just as in a man-in-the-middle hijack, we inject a script into every page to bring the front end into play.
With a sentry now posted on the front line, intelligence that used to be out of reach can finally be gathered. Riding the fast-evolving front-end technology stack, we set out to build a brand-new system.
As noted above, if a request carries evidence that its sender is no outsider, the backend's job becomes far easier: robots that do not know the rules give themselves away at once.
So we build an aspect over the page's I/O: at the instant before a request is sent, it is stamped with a covert token full of private information for the backend to verify.
Using the technique from the earlier post "The Future of SSLStrip: HTTPS Front-End Hijacking", front-end requests can be intercepted with only slight modification.
This time the goal is simpler still: attach one extra parameter. For same-site requests we can store the token in a cookie, so it travels with every request automatically and the target URL never changes.
Developers thus gain a stronger defense without modifying anything.
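A minimal sketch of the injected aspect, assuming our script runs before any page script; makeToken() is a placeholder stub, fleshed out in the next section:

    (function () {
      var send = XMLHttpRequest.prototype.send;
      XMLHttpRequest.prototype.send = function (body) {
        // refresh the covert token in a cookie just before the request
        // leaves, so the target URL never needs modification
        document.cookie = '__waf_token=' + makeToken() + '; path=/';
        return send.call(this, body);
      };
      function makeToken() {
        return btoa(Date.now() + '|' + Math.random()); // stand-in for the private algorithm
      }
    })();

fetch(), form submissions and JSONP script tags would need hooks of their own; the XHR case is shown only as the simplest instance.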
Token policy
Into this token we can pack all kinds of environmental context, for example:
A one-time random number (nonce), to prevent request replay.
A checksum of the form data, to prevent tampering in transit.
The current timestamp, so the backend can gauge the interval between packets more accurately.
A check that the browser's BOM is consistent with the browser claimed by the User-Agent.
......
All of it is then encoded by a private algorithm into the final token (a sketch follows below).
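To make the list concrete, here is a minimal sketch of token assembly. The field names, the checksum, the Chrome-only BOM probe and the btoa() stand-in for the private encoding are all illustrative assumptions:

    function checksum(s) {
      var h = 0;
      for (var i = 0; i < s.length; i++) h = (h * 31 + s.charCodeAt(i)) >>> 0;
      return h;
    }
    function makeToken(formData) {
      var ctx = {
        nonce: Math.random().toString(36).slice(2),  // single use, blocks replay
        ts: Date.now(),                              // lets the backend measure intervals
        sum: checksum(formData || ''),               // detects tampering in transit
        // one crude BOM-vs-UserAgent consistency probe: a UA claiming
        // Chrome should come with window.chrome present
        uaOk: /Chrome\//.test(navigator.userAgent) === !!window.chrome
      };
      return btoa(JSON.stringify(ctx));              // stand-in for the private encoding
    }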
Of course, the token need not be strictly verified on every request. The very first visit carries no token at all, and requests that escape our hooks, such as images and other resource files, cannot be guaranteed to carry a unique token every time.
In the era of rich Web applications, AJAX and JSONP carry the overwhelming majority of interface traffic, so that is where we defend strictly. Ordinary static resources pose far less risk and can be handled loosely.
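On the backend, a sketch of strictness keyed to request type; the /api/ prefix, the callback-parameter test for JSONP, the Express-style req fields, and the verifyToken() stub are all assumptions for illustration:

    function verifyToken(t) {               // placeholder for real verification
      return typeof t === 'string' && t.length > 0;
    }
    function checkRequest(req) {
      // interface traffic: API endpoints, or JSONP marked by a callback parameter
      var isInterface = /^\/api\//.test(req.path) || 'callback' in req.query;
      if (isInterface) return verifyToken(req.cookies['__waf_token']); // strict
      return true;                          // static resources: let them pass loosely
    }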
Subsequent confrontation
Similar systems have been tried before, and they have always drawn controversy. The reason is simple: the token is generated on the front end, so the secret sits in plain view in the page source. Once the algorithm is cracked, robots can impersonate real users and the whole system loses its meaning.
This is also why the "front-end middleware" was drawn with a black background: the front-end script must be obfuscated, so that attackers cannot crack the algorithm any time soon.
The network-level confrontation is thereby converted into a contest of reverse-engineering skill, demanding more of attackers and raising the intrusion threshold.
Black box confrontation
Even without cracking the system, attackers can still make the most of it as a black box. Let's run a few rounds of simulated attack and defense.
No. 1
An attacker can ignore the encryption details altogether and proxy the robot's requests through a real page, so each one looks as if normal page script initiated it.
Against this there is no absolute defense, but a simple policy of limiting request frequency can at least slow the attack down.
We set a fairly loose request threshold. Once it is hit within a given time window, the front-end hook holds further requests pending before releasing them, and the deeper the backlog, the longer the delay.
A normal user who briefly clicks too fast and racks up excess requests is penalized for just a few seconds (things merely feel a little sluggish). But if the request count stays high without pause, that is suspicious; perhaps that "user" should take a good long rest.
The request count lives in global storage shared across pages and survives page closes, so the penalty carries over to the next visit; neither hasty refreshes nor reopened pages can reset it.
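A minimal sketch of this penalty mechanism, using localStorage as the shared, persistent counter; the thresholds are made-up numbers, not a tuned policy:

    var LIMIT = 30, WINDOW = 10000;                  // 30 requests per 10 s, illustrative
    function recordAndDelay() {
      var now = Date.now();
      var log = JSON.parse(localStorage.getItem('__req_log') || '[]')
                    .filter(function (t) { return now - t < WINDOW; });
      log.push(now);
      localStorage.setItem('__req_log', JSON.stringify(log)); // survives refresh and close
      var excess = log.length - LIMIT;
      return excess > 0 ? excess * 1000 : 0;         // deeper backlog, longer wait
    }
    var send = XMLHttpRequest.prototype.send;
    XMLHttpRequest.prototype.send = function (body) {
      var xhr = this, wait = recordAndDelay();
      if (!wait) return send.call(xhr, body);
      setTimeout(function () { send.call(xhr, body); }, wait); // hold the request pending
    };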
No. 2
Even so, attackers can devise workarounds, for example opening several different browsers or even virtual machines as fully isolated environments and driving each of them slowly.
To deal with this kind of robot we need a more reliable discriminator: user behavior analysis.
Normal browsing is always accompanied by scrolling, mouse movement, clicks, touches and other events, and most network requests are driven by exactly these events. A page showing no activity that nevertheless keeps sending requests is very likely a robot at work.
We can even fold the collected behavior data into the token and ship it to the backend for analysis, building a far more detailed behavior model.
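A sketch of the simplest behavior signal: remember when the user last produced a gesture, and treat requests with no recent interaction as suspect. The five-second window is an arbitrary assumption:

    var lastGesture = 0;
    ['scroll', 'mousemove', 'click', 'touchstart', 'keydown'].forEach(function (type) {
      addEventListener(type, function () { lastGesture = Date.now(); }, true);
    });
    function looksHuman() {
      return Date.now() - lastGesture < 5000;   // a gesture within the last 5 s
    }
    // looksHuman(), or the raw event trail, can be folded into the token
    // so the backend sees it too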
No. 3
Of course, no secret stays hidden forever, and the hurdle of behavior collection will not stop attackers for good. Plenty of today's robots can simulate user behavior, imitating all kinds of events quite realistically, and for now we can only keep studying them.
Still, forcing attackers to go this far already fulfils our goal. We never aimed to block 100% of robots, only to fight them and thin their numbers.
End
In a black-box confrontation, attackers never learn the inner truth and can only probe by guesswork, which leaves them thoroughly passive. From time to time we update the script, adjust the policies, or slip in some clever new trick to keep tormenting them.
Unwilling to wrestle with a black box forever, attackers will sooner or later try to crack the script itself.
Reverse confrontation
Fortunately, JavaScript has not been popular for nearly as long as traditional languages, and mature reverse-engineering tools for it are scarce. Moreover, code running in a browser is entangled with all manner of DOM and BOM details, so an attacker first has to pick up a good deal of front-end knowledge.
Writing a good obfuscator takes deep theory. But we need not be that sophisticated; we only have to think one step beyond what attackers expect.
In a real confrontation there is no need to obsess over "technique"; what counts more is "strategy". Some tricks are very simple, just not the first thing anyone thinks of. We will not dwell on obfuscation techniques here, but instead share a few non-technical cases.
Packing and obfuscation
A genuine script obfuscator disrupts the original code structure and weaves in redundant statements to drive up debugging complexity.
Quite a few so-called obfuscators today, however, merely add a packer: the original code is encrypted, then decrypted at runtime and handed to eval. Breaking this kind of "obfuscation" is trivial: replace eval with alert and the code reverts to its original form. I believe everyone has tried it.
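A toy demonstration of why this "protection" is so weak; btoa/atob stand in for whatever encoding the packer uses:

    var source = "console.log('secret logic runs')";
    var packed = btoa(source);        // what the naive 'obfuscator' ships
    eval(atob(packed));               // normal execution path
    console.log(atob(packed));        // the attacker's swap: prints the raw source
                                      // (alert would do the same, in a popup)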
Of course, we can play this naivety back at them and mix the true with the false.
Obviously we will not pack the real code. Instead we prepare a set of decoy code and make a show of decrypting it and feeding it to eval. The decoy looks every bit the part, but the functions inside are never triggered; its sole purpose is to mislead attackers down the wrong path and waste their time.
The real code hides inside the decryption routine itself and is already running before the "unpacking" ever happens.
The decoy handed out at "decryption" time can also be stuffed with useless filler, such as dazzling special characters and thousands of line breaks, to obstruct normal reading.
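A sketch of this decoy packer; the fake checkToken and the filler are invented for illustration:

    function decrypt(blob) {
      realInit();                        // the genuine logic hides here and runs first
      return atob(blob);                 // hand back the decoy, purely for show
    }
    function realInit() { /* hook requests, build tokens, ... */ }

    var fakeSource = "function checkToken(t){return t.length===32}" +
                     Array(1000).join('\n');          // filler to impede reading
    eval(decrypt(btoa(fakeSource)));     // swapping eval for alert reveals only the decoy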
None of this solves the fundamental problem, but it drains the attacker's energy, and that is precisely what non-technical confrontation is for.
Honeypot phishing
Whoever wants to reverse the script must first skim through it. Amid the dazzling code, a stretch of readable text stands out like a splash of red in the woods; we can use one to catch the attacker's eye and bait the hook.
For example, we plant a string near the top of the code: "... Compressed by xxxtool". Attackers who see it grow curious about what this xxxtool is and go searching for it online.
In advance, we have set up a simple web page offering online JavaScript packing and unpacking, named xxxtool, a deliberately distinctive name. We never promote the page, so under normal circumstances nobody visits it.
Attackers who find the site naturally wander in, and on discovering that it even offers online unpacking, they feel they have found exactly the remedy and paste their code straight into it.
Whoever falls into this honeypot is bound to be fooled. For an ordinary script the page returns what any common tool would. But the moment it detects our own code being "decrypted", it serves up the pre-prepared decoy, leaving the attacker convinced the unpacking succeeded.
Meanwhile, the instant anyone opens the page, a report goes up to our cloud, telling us promptly who is studying our script. We can even log the visitor's IP address and have the WAF block it for a while.
It need not even be that elaborate. We can simply request a special page from a branch that can never be reached. If that page ever shows traffic, someone curious is clearly picking the script apart.
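A sketch of such a dead-branch canary; the impossible condition and the /debug/unlock path are invented for illustration:

    var t = Date.now();
    if ((t & 1) === 2) {                           // can never be true, yet not obviously so
      // the tempting URL sits in plain sight; forcing the branch in a
      // debugger, or visiting the path out of curiosity, reports the snoop
      new Image().src = '/debug/unlock?k=' + t;
    }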
End
There are many more ideas like these for confrontation, to be shared another time. Only by pairing technique with strategy can we fight well.
Back to the point: even if the system is cracked, little is lost. We swap in a new token algorithm and a new obfuscation scheme and carry on defending. Automated deployment makes such updates and maintenance easy, underwriting a sustained confrontation.
Postscript
Stepping across fields broadens the thinking space and multiplies the solutions on offer.
For attackers, it means more skills to master and a higher threshold for intrusion.