This afternoon a friend asked me a few PHP questions, which reminded me that QZ has recently been writing a series of articles on PHP variable security.
So I went and read them. The addresses are as follows:
Talking about PHP variable security: http://www.bkjia.com/Article/201110/108389.html
PHP variable safety continued: http://www.bkjia.com/Article/201110/108536.html
Talking about PHP variable security Plugin: http://www.bkjia.com/Article/201110/108551.html
What QZ mainly wants to convey in these articles is:
In what scenarios can a variable end up being executed as code? You can understand PHP security by looking at the underlying principles.
Having picked up a little of the basics of compiler principles,
let me extend this with something interesting :)
---------------------------------------------------------------------------------------------------------------
Contents
0x01 Language Features
0x02 The current level of web defense
0x03 Language security: the next battlefield
---------------------------------------------------------------------------------------------------------------
[*] 0x01 Language Features
So-called "abnormal" (deliberately obfuscated) ways of writing various languages circulate on the Internet.
Well-known examples:
Six abnormal C Hello World programs
http://www.bkjia.com/kf/201110/108778.html
Abnormal JavaScript code
http://utf-8.jp/public/jjencode.html
Non-alphanumeric code in PHP
http://www.thespanner.co.uk/2011/09/22/non-alphanumeric-code-in-php/
The ancestor of all of these seems to be Brainfuck, which I ran into earlier while playing wargames.
http://zh.wikipedia.org/zh/Brainfuck
Even more unusual is metaprogramming
http://zh.wikipedia.org/zh-cn/%E5%85%83%E7%BC%96%E7%A8%8B
where a program can generate code in its own language or in another language.
When working with webshells in PHP, we may run into the following situation:
There is a page, vul.php, with a code injection vulnerability.
However, it may have a malicious code detection function
that checks whether dangerous function names, such as eval or system, are passed in.
In that case we can use vul.php to write a new PHP file:
use fopen, fwrite and other unfiltered functions
to write a new webshell file, shell.php.
As for eval detection,
we can also use string concatenation to bypass signature detection:
$a = 'ev';
$b = 'al';
$c = $a . $b; // $c now holds the string 'eval'
You can come up with many other combination techniques.
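To make the bypass concrete, here is a minimal sketch in Python of a signature scanner and of why splitting a name defeats it. The blacklist `SIGNATURES` and the helpers `naive_scan` and `fold_concatenations` are hypothetical names invented for this illustration; they do not come from any real detection tool.

```python
import re

# Hypothetical blacklist of dangerous names; real scanners use far larger rule sets.
SIGNATURES = ("eval", "system")

def naive_scan(source):
    """Return the blacklisted names that appear literally in the source text."""
    lowered = source.lower()
    return [name for name in SIGNATURES if name in lowered]

def fold_concatenations(source):
    """Join adjacent single-quoted literals, so 'ev' . 'al' becomes 'eval'.

    Only literal-to-literal concatenation is handled; names assembled
    through variables are left untouched and still slip past the scan.
    """
    pattern = re.compile(r"'([^']*)'\s*\.\s*'([^']*)'")
    previous = None
    while previous != source:
        previous = source
        source = pattern.sub(lambda m: "'" + m.group(1) + m.group(2) + "'", source)
    return source

plain = "<?php eval($code); ?>"
split_literals = "<?php $f = 'ev' . 'al'; ?>"
split_variables = "<?php $a = 'ev'; $b = 'al'; $c = $a . $b; ?>"

print(naive_scan(plain))                                  # literal name is found
print(naive_scan(split_literals))                         # split name is missed
print(naive_scan(fold_concatenations(split_literals)))    # found after folding
print(naive_scan(fold_concatenations(split_variables)))   # still missed
```

The last case is the point: every extra rule the scanner learns only covers one more way of writing the same thing, and the language offers many more.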
From these examples we can see that
we do not know the features of a language as well as we think we do.
We generally do not pay attention to them, though,
for the following reasons:
These ways of writing code are rarely used in everyday development.
Code written this way is hard to read and maintain,
and it goes against software engineering principles.
But note that we usually look at these language features from the perspective of programmers/developers.
Looked at from another angle, that of security personnel, what do we see?
1. Special ways of writing code bypass detection (abnormal JavaScript bypasses pattern detection, explicitly encoded characters in shellcode bypass an IDS's dangerous-character detection, and so on)
2. Special ways of writing code hide from signature scanning tools (encodings like those of Tiny PHP Webshell can bypass many webshell detection tools)
3. Metaprogramming bypasses detection (the code itself generates new code to get past signature detection)
4. Security problems caused by language features themselves (for example, the problems with variable variables that QZ wrote about)
---------------------------------------------------------------------------------------------------------------
[*] 0x02 The current level of web defense
Going from fuzzing to "I know all of it" is a hacker's ultimate goal in terms of knowledge.
I once had an idea for testing based on compiler principles (I first discussed it with Dr. Shi)
http://www.bkjia.com/ebook/201110/29961.html
The idea at the time was aimed at black-box testing of SQL injection (for attack as well as defense).
The general idea is this:
Currently, web defense is generally based on detection patterns written from security experience.
If a brilliant hacker comes up with a special technique that bypasses these patterns,
the web defense becomes ineffective.
When a malicious user crafts input to perform SQL injection,
that input must be a subset of the SQL language:
it is part of the language and can be recognized as the language.
A script receives the user input and passes it to the database application to be interpreted and executed.
In this process the input must go through the database application's lexical scanning and semantic recognition.
By the same token, if I have a complete map of the lexical scanning paths and the semantic tree,
I can fully understand how user input is recognized as a word, or even as a complete statement.
Note that this covers every case, whatever the encoding or format:
I know how a piece of data comes to be recognized as language.
Conversely, I can identify every piece of data that would be recognized as language, and use a complete set of signatures to shut out injection attacks for good.
For example, write a plug-in for the database software
that sits between the scripting language and the database's command interpretation and execution component,
is able to fully distinguish data from code,
and blocks malicious user input.
The key to this class of security problems lies in how user input is recognized:
as data or as code.
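One way to picture such a data-versus-code check is to tokenize the query twice, once with a harmless placeholder and once with the real user input, and compare the resulting token structure. The Python sketch below is only an illustration under strong assumptions: `TOKEN_SPEC` is a tiny made-up grammar, not MySQL's real lexer, and `token_shape` and `is_injection` are hypothetical helper names.

```python
import re

# A deliberately tiny token grammar (an assumption for illustration);
# a real defense would reuse the database's own lexical rules.
TOKEN_SPEC = [
    ("STRING", r"'(?:[^'\\]|\\.|'')*'"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("COMMENT", r"--[^\n]*|#[^\n]*|/\*.*?\*/"),
    ("WORD", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("SPACE", r"\s+"),
    ("OTHER", r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC), re.S)

def token_shape(sql):
    """Return the token sequence with literal values abstracted away."""
    shape = []
    for m in TOKEN_RE.finditer(sql):
        kind = m.lastgroup
        if kind == "SPACE":
            continue
        if kind == "WORD":
            shape.append(m.group().upper())   # keywords and identifiers
        elif kind == "OTHER":
            shape.append(m.group())           # operators and punctuation
        else:
            shape.append(kind)                # STRING, NUMBER, COMMENT
    return shape

def is_injection(template, user_input, benign="x"):
    """True if user_input changes the token structure of the query."""
    return token_shape(template.format(user_input)) != token_shape(template.format(benign))

by_name = "SELECT * FROM users WHERE name = '{}'"
by_id = "SELECT * FROM t WHERE id = {}"

print(is_injection(by_name, "alice"))                 # stays a single string
print(is_injection(by_name, "select"))                # a keyword used as data
print(is_injection(by_name, "' OR '1'='1"))           # input becomes code
print(is_injection(by_id, "1 OR 1=1", benign="1"))    # numeric context
```

Note that the word select is accepted when it stays inside a string literal, while a keyword-free payload is rejected because it changes the structure. The check is about how the input is recognized, not about which characters it contains.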
In fact, today's network interaction and communication are built from many protocols, environments and languages combined into one system.
Language recognition is everywhere.
Whether a language is compiled or interpreted,
its underlying recognition process is one of lexical scanning and semantic recognition.
The following figure shows a simple lexical recognition path.
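As a rough textual companion to such a path, here is a minimal sketch of a hand-written lexical scanner in Python. It is not taken from any real database or interpreter; the function name `scan` and the four token kinds are assumptions chosen to keep the example short.

```python
def scan(text):
    """Split text into (kind, lexeme) pairs with a tiny hand-written scanner."""
    tokens = []
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1                      # whitespace separates tokens
            continue
        start = i
        if ch.isalpha() or ch == "_":
            # letter path: keep reading letters, digits and underscores
            while i < n and (text[i].isalnum() or text[i] == "_"):
                i += 1
            kind = "word"
        elif ch.isdigit():
            # digit path: keep reading digits
            while i < n and text[i].isdigit():
                i += 1
            kind = "number"
        elif ch == "'":
            # quote path: read up to the closing quote, if there is one
            i += 1
            while i < n and text[i] != "'":
                i += 1
            i = min(i + 1, n)
            kind = "string"
        else:
            i += 1                      # any other single character
            kind = "symbol"
        tokens.append((kind, text[start:i]))
    return tokens

for token in scan("id = 1 OR 'a'='a'"):
    print(token)
```

Each branch corresponds to one path through the recognizer: the first character chooses the path, and the following characters decide where the token ends.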
The mainstream web security detection we see on the market is pattern detection.
(It usually recognizes language keywords or malicious code, for example checking whether user input contains the SQL keyword "select".)
Detection has not yet risen to the level of dynamically distinguishing data from code,
so code injection attacks remain possible.
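The weakness of keyword-based pattern detection can be shown with a short Python sketch. The keyword list `KEYWORDS` and the function `pattern_detect` are hypothetical stand-ins for an experience-based rule set, not the rules of any actual product.

```python
import re

# Hypothetical keyword blacklist standing in for experience-based rules.
KEYWORDS = re.compile(r"\b(select|union|insert|update|delete|drop)\b", re.IGNORECASE)

def pattern_detect(user_input):
    """Flag input that contains a blacklisted SQL keyword."""
    return KEYWORDS.search(user_input) is not None

samples = [
    "1 UNION SELECT password FROM users",   # real attack, detected
    "Please select a size",                 # harmless text, flagged anyway
    "' OR '1'='1",                          # real attack, no keyword at all
]
for sample in samples:
    print(repr(sample), pattern_detect(sample))
```

The second sample is a false positive and the third a false negative: the rule looks at which words appear, not at whether the input would be recognized as data or as code.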
Typical code injection attacks:
SQL injection (for example, MySQL injection)
Command injection (for example, OS command injection)
Script injection (for example, PHP code injection)
Causes of this class of security issues:
1. The language/environment allows code to be passed in dynamically and then interpreted and executed (for ease of use, this is understandable and unavoidable)
2. There is no mechanism for telling data from code to defend against user code injection (nothing of the kind is on the market yet)
If we were to reclassify the levels of security,
we could classify them in a new way:
Protocol security (security of protocols: for example, DDoS caused by ICMP)
Environment security (operating system security: overflows, privilege escalation, system file format features, and other special features of the system)
(Application software's own security: the junction between system security and language security, where both kinds of security problems may occur)
Language security (user input security: malicious code injection)
---------------------------------------------------------------------------------------------------------------
[*] 0x03 Language security: the next battlefield
At present there seems to be no dedicated security research in the field of language recognition.
Of course, the cost of studying it is indeed very high.
But once someone does study it,
it will set off a new kind of security storm.
Let's talk about the cost first.
Take, again, my idea of testing based on compiler principles.
At first, I discussed the feasibility of implementing it with Dr. Shi.
As a defense component, though, it is still unknown whether the load it adds is acceptable under heavy traffic,
and the cost is unacceptable for a company or department that does not do research.
What would it take to complete it?
Collect the complete lexical scanning path table and semantic tree.
Take MySQL injection defense as an example.
MySQL is open source, so this is relatively feasible:
you can find its lexer files and so on.
However, it takes time to rebuild them into a very large and complete lexical scanning path table and semantic tree.
It would take at least 1-2 months (for me, at any rate).
People with a background in compiler principles may be more efficient at this.
What about closed-source applications like MSSQL?
Restoring the complete lexical scanning path table and semantic tree
is, I'm afraid, not that simple.
Of course, we can solve MySQL first and then work by analogy;
MSSQL should at least not be too far off.
For open-source software such as PHP and MySQL, it takes someone willing to bear the cost, or an efficient team, to do this.
The irony is that the creators of this open-source software are actually the best candidates to restore these resources ......
After all, they are the ones who designed and implemented the original lexical scanning path tables and semantic trees.
Once these are restored, as I mentioned earlier, both attack and defense will be upgraded across the board,
because we will understand every way in which data is recognized as code.
From these rules, all kinds of variants can be derived to bypass the experience-based pattern detection on the market.
Security attack and defense across the whole Internet will enter a brand new field.
Finally, speaking personally, I am very optimistic about language security.
Fellow enthusiasts are welcome to join the discussion and offer corrections :)
From: hi.baidu.com/hackercasper