Huffy: shellcode (1)
When I first saw the word "shellcode", it sounded very sophisticated. In fact, after working with it for a while, you will find that it is just a piece of code (or filler data) sent to a server to exploit a specific vulnerability, generally in order to obtain certain privileges. Today we will learn about a new shellcode encoding method, Huffy, that is, shellcode based on Huffman coding. This method uses the properties of the Huffman tree to compress shellcode, with the goal of being "short and refined".
Huffman tree
This method is called Huffy, and I had recently solved a problem related to Huffman trees, so the first thing I thought of was the Huffman tree.
If you do not know what a Huffman tree is, here is a brief description. A Huffman tree is a simple data structure that can be used to compress data. It is built by reading the input, counting how often each character occurs, and then creating a tree in which the highest-frequency characters sit near the top and the lowest-frequency characters sit near the bottom.
To compress data, you walk the tree from the root down to each character to generate its encoded bits (a left branch encodes 0, and a right branch encodes 1). The closer a character is to the top of the tree, the fewer bits its encoding uses. The result is called a "prefix code", which has a very neat property: no encoded bit string is a prefix of any other encoded bit string (in other words, when you read a binary stream, you know immediately when a character has been fully decoded).
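The construction described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used by Huffy itself: the function name `build_codes` and the tie-breaking counter are assumptions made for this example, and it follows the common approach of repeatedly merging the two lowest-weight subtrees. The exact bits assigned to each character depend on tie-breaking and on which child goes left, so they may differ from a hand-drawn tree, but the code lengths and the prefix property stay the same.

```python
import heapq
from collections import Counter
from itertools import count


def build_codes(text):
    """Return a dict mapping each character of `text` to its bit string."""
    freq = Counter(text)
    tiebreak = count()  # unique counter so the heap never has to compare trees
    # Heap entry: (weight, tiebreak, tree). A tree is a character (leaf)
    # or a (left, right) tuple (internal node).
    heap = [(w, next(tiebreak), ch) for ch, w in freq.items()]
    heapq.heapify(heap)
    if not heap:
        return {}
    if len(heap) == 1:
        # A single distinct character still needs one bit.
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        # Merge the two lowest-weight subtrees into one.
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (t1, t2)))

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")  # left branch encodes 0
            walk(tree[1], prefix + "1")  # right branch encodes 1
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes
```

Frequent characters end up with short codes because they are merged last and therefore stay close to the root.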
Example:
From the Huffman tree for this example, we can tell that it was built from a 9-character text in which five characters are the letter "o", three are the letter "d", and one is the letter "g".
Therefore, when you use this tree to compress data, you can encode the word "dog" as follows:
d = 00 (left, left), o = 1 (right), g = 01 (left, right)
Therefore, "dog" will be encoded as the bitstream "00101".
If you see the bitstream "01100", you can decode it with the same Huffman tree: left, right (g); right (o); left, left (d). The decoded string is therefore "god".
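The encoding and decoding steps of this example can be sketched as follows. The code table is taken directly from the "dog" example; the function names `encode` and `decode` are just illustrative choices. The decoder relies on the prefix property: it reads one bit at a time and emits a character as soon as the bits read so far match a code.

```python
# Code table from the "dog" example in the text.
CODES = {"d": "00", "o": "1", "g": "01"}


def encode(text, codes=CODES):
    """Concatenate the bit string of every character."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes=CODES):
    """Read bits one at a time and emit a character on every complete code."""
    reverse = {b: ch for ch, b in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:  # prefix property: the first match is the only match
            out.append(reverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)


print(encode("dog"))    # 00101
print(decode("01100"))  # god
```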
If every character in a string occurs the same number of times, and the number of distinct characters is an integer power of 2 (for example, "aabbccdd" has 4 distinct characters, which is 2 squared), the tree is balanced and every character gets a code of the same length. For example, the string "aaabbbcccddd" yields a balanced tree in which a = 00, b = 01, c = 10, and d = 11.
With this Huffman tree, the string "abcd" will be encoded as "00011011". This property is very important.
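To make the balanced case concrete, here is a small sketch. It does not build a Huffman tree; it simply assigns fixed-width codes in left-to-right order, which is what a balanced tree amounts to. The helper name `balanced_codes` and the assumption that the leaves are ordered a, b, c, d (matching the "00011011" result above) are mine.

```python
def balanced_codes(symbols):
    """Fixed-width codes for a balanced tree.

    Assumes len(symbols) is a power of 2 (at least 2) and that leaves
    appear left to right in the order given.
    """
    n = len(symbols)
    if n < 2 or n & (n - 1):
        raise ValueError("number of symbols must be a power of 2")
    width = n.bit_length() - 1  # depth of every leaf in the balanced tree
    return {s: format(i, f"0{width}b") for i, s in enumerate(symbols)}


codes = balanced_codes("abcd")
bits = "".join(codes[ch] for ch in "abcd")
print(codes)  # {'a': '00', 'b': '01', 'c': '10', 'd': '11'}
print(bits)   # 00011011
```

In this situation every symbol costs exactly the same number of bits, so the encoded stream is just the symbol indices written out in binary.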