Binary Vulnerability Mining

Last Update:2015-02-15 Source: Internet

Author: User

0x00 Preface: Binary vulnerability research can be divided into vulnerability analysis and exploitation on the one hand and vulnerability mining on the other. Many articles on vulnerability analysis can be found on the Internet, but few cover vulnerability mining. This article therefore focuses on how to mine vulnerabilities in binary software.
0x01 Vulnerability mining methods: Vulnerability mining is a task judged by its results rather than its process. There are many methods for vulnerability mining. Here we mainly discuss the following three:
1. Manual vulnerability mining
2. General fuzz vulnerability mining
3. Smart fuzz vulnerability mining
0x02 Manual vulnerability mining: What is manual vulnerability mining:

Manual mining means analyzing by hand where the software may go wrong, without using automated mining tools. The mining points are found manually, and the malformed data is also constructed manually. This way of discovering software vulnerabilities is generally called the manual method.

Advantages of manual vulnerability mining:

Manual testing requires no specialized fuzzing tools, and the main class of vulnerability it targets is the stack overflow, whose principle is simple. Manual mining is therefore fast and efficient: a buffer overflow can often be found within tens of minutes.

Disadvantages of manual vulnerability mining:

It is difficult to mine file format vulnerabilities. Because the processing logic of file formats is generally complicated, the manual mining method is not very effective.

How to perform manual mining:

Manual mining generally follows these steps:

1. Determine a mining point: a mining point is a place where malformed input can be supplied. What can serve as a mining point? Any user-controllable data, including the program path, input messages, configuration information in files, and so on. Because the mining is manual, the chosen mining point should not be too complicated. Complicated file formats such as the Office series and the various image and audio formats are clearly unsuitable for manual mining.

2. Fill the mining point with malformed data: after finding the mining point, fill it with various kinds of malformed data. These include over-long strings, malformed characters, and boundary values. In our long-term mining experience, over-long strings work best. An over-long string usually triggers a stack overflow, which is generally exploitable.

3. Watch for program exceptions: program crashes, unexpected exits, and so on.

4. Analysis: if an exception occurs, use a disassembler and an assembly-level debugger (such as WinDbg, IDA, or OllyDbg) for in-depth analysis to find the cause of the exception and determine the vulnerability type and severity.
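Step 2 above can be sketched in a few lines of Python. This is only an illustration of the three kinds of malformed data the text names; the function name `malformed_candidates`, the fill byte, the length schedule, and the specific malformed characters are all assumptions made for the example, not something the article prescribes.

```python
def malformed_candidates(max_len=4096):
    """Return a list of (label, bytes) test inputs for one mining point."""
    candidates = []
    # Over-long strings: double the length each round, up to max_len bytes.
    length = 64
    while length <= max_len:
        candidates.append((f"long_{length}", b"C" * length))
        length *= 2
    # Boundary values for 8-, 16- and 32-bit integers, stored little-endian.
    for bits in (8, 16, 32):
        top = (1 << bits) - 1
        for value in (0, 1, top >> 1, (top >> 1) + 1, top):
            raw = value.to_bytes(bits // 8, "little")
            candidates.append((f"int{bits}_{value:#x}", raw))
    # A few malformed character sequences (hypothetical choices).
    for label, raw in (("fmt", b"%s%n%x" * 8),
                       ("nul", b"\x00" * 16),
                       ("path", b"..\\" * 32)):
        candidates.append((label, raw))
    return candidates
```

Each candidate is pasted into the mining point by hand (or with a hex editor), one at a time, while watching the program for the exceptions described in step 3.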

Examples of manual mining methods:

The following two examples illustrate the manual mining method.

Example 1: sdemo2.0 Buffer Overflow Vulnerability

Vulnerability Number: WooYun-2013-43556

Link: http://www.wooyun.org/bugs/wooyun-2010-043556

Software Introduction: S-Demo is the screen-recording software that most hackers use to record their cracking tutorials as animations. It records every action on the screen, including mouse movement, and uses a high compression rate. The compression ratio is selectable. Normally, the size of the file generated every minute is about KB.

Vulnerability Description: The recorded source file uses a custom file format. During password verification, the program places the encrypted password (hereinafter referred to as the ciphertext) on the stack. When the strcpy function is executed, the length of the ciphertext is not checked, which leads to a stack overflow. If a user opens a specially crafted smv file built by an attacker in the sdemo player, arbitrary code execution can result.

Mining ideas:

1. Select a mining point: here we select the ciphertext processing area.

2. Locate the ciphertext location

Reverse analysis: use reverse engineering to find where in the file the ciphertext-handling function reads the ciphertext.

Comparison method: generate two samples with different passwords and compare them (Figure 1). The data at offset 1A1H differs, and testing confirms that the ciphertext is stored there.

(Figure 1)
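The comparison method is easy to automate. The following is a minimal sketch, assuming the two samples have already been read into memory as bytes; the helper name `differing_ranges` is made up for this example.

```python
def differing_ranges(a: bytes, b: bytes):
    """Return half-open (start, end) offset ranges where a and b differ."""
    ranges = []
    start = None
    limit = min(len(a), len(b))
    for i in range(limit):
        if a[i] != b[i]:
            if start is None:
                start = i          # a new run of differing bytes begins
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, limit))
    if len(a) != len(b):
        # A trailing length difference is reported as its own range.
        ranges.append((limit, max(len(a), len(b))))
    return ranges
```

For two recordings that differ only in the password, the reported ranges point directly at candidate ciphertext locations, which can then be confirmed by testing as described above.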

3. Construct malformed data and test: in our experience, an over-long string is the first malformed data to try in manual fuzzing. The ciphertext string ends with 0. We use WinHex to manually lengthen this string and test whether the program crashes. (The byte at 1A0H is a flag. If it is 0, the sample has no password and the subsequent data is not parsed. If it is 1, the sample has a password.) Figure 2 shows the data before modification.

(Figure 2)

After we lengthened the data, the program crashed. Figure 3 shows the modified data.

(Figure 3)
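The WinHex edit can also be expressed as a small script. This sketch assumes the string is overwritten in place (later bytes are replaced, not shifted), which the article does not specify; the function name `lengthen_cstring` and the fill byte are illustrative.

```python
def lengthen_cstring(data: bytes, offset: int, new_len: int,
                     fill: bytes = b"C") -> bytes:
    """Overwrite the NUL-terminated string at `offset` with `new_len`
    fill bytes followed by a terminating NUL."""
    if len(fill) != 1:
        raise ValueError("fill must be a single byte")
    if not 0 <= offset <= len(data):
        raise ValueError("offset is outside the file")
    payload = fill * new_len + b"\x00"
    end = offset + len(payload)
    # Bytes after the string are overwritten, not shifted; the file grows
    # only if the new string runs past the original end of the file.
    return data[:offset] + payload + data[end:]
```

Calling it with the ciphertext offset found earlier (1A1H) and increasing values of `new_len` reproduces the manual test: open each result in the player and note the shortest length that crashes it.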

4. Analyze the crash information: the exception has been triggered successfully. Next we use a debugger to analyze the cause of the exception in detail. (4) Load sdemo s-s-player.exe in OllyDbg, then open the malformed file we constructed and look at the information OllyDbg shows (Figure 4). EIP is 43434343H, and 43H is the ASCII code for "C". In other words, EIP is under our control. Vulnerabilities that give control of EIP are generally exploitable.

(Figure 4)

(5) Analysis: the vulnerability is caused by the strcpy function (Figure 5). The copy does not limit the size of the data pointed to by esi. If that data is too large, it overflows the buffer and overwrites the memory that follows it.

(Figure 5)

(6) Debug: observe the SEH chain after the strcpy call (Figure 6). After the overwrite, the SE handler is 43434343H. This value comes from the malformed data we supplied, so we control SEH.

(Figure 6)

5. Vulnerability exploitation: based on the analysis above, the vulnerability is exploitable. Since we control SEH, we can exploit it with a pop/pop/ret sequence. First we need to find the offset that controls the SE handler. Debugging shows that offset 4CH in the ciphertext holds the address of the SE handler. Now we modify the PoC data (Figure 7).

(Figure 7)
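The article finds the 4CH offset by stepping through the program in a debugger. A common alternative, shown here only as a sketch and not as the author's method, is to fill the string with a non-repeating pattern and look up the four bytes that end up in the SE handler field. The function names are made up for this example, and the pattern deliberately contains no zero bytes because strcpy stops at the first NUL.

```python
import string


def make_pattern(length: int) -> bytes:
    """Build a non-repeating pattern of the form Aa0Aa1Aa2...Ab0Ab1..."""
    out = bytearray()
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out += f"{upper}{lower}{digit}".encode("ascii")
                if len(out) >= length:
                    return bytes(out[:length])
    raise ValueError("requested pattern is too long")


def pattern_offset(field_value: int, length: int = 1024) -> int:
    """Return the pattern offset of the 4 bytes seen in a 32-bit field,
    or -1 if they do not occur in the pattern."""
    needle = field_value.to_bytes(4, "little")  # x86 is little-endian
    return make_pattern(length).find(needle)
```

After the crash, the value shown for the SE handler in the debugger is passed to `pattern_offset`, which returns the position inside the string that controls it.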

To verify, we change the data to eeeeeeeeeeh (and the end of the chain, at 48H, to FFFFFFFFH). The SEH chain is now overwritten (Figure 8).

(Figure 8)

The stack layout is:

1 2 3 xxxxxxxx FFFFFFFF

FFFFFFFF is the pointer to the next SEH record. It comes from the data in our file, so we control it. The goal is to execute the bytes stored there (the data FFFFFFFFH). To do that, we only need to find a pop/pop/ret-like instruction sequence in memory: it pops the first two values off the stack and then returns to the address where the data FFFFFFFFH is located, so execution continues there. My system is Windows XP SP3, and on it I found such a sequence at address 7FFA1571H (Figure 9):

(Figure 9)

In other words, we set the SE handler address to 7FFA1571H. When the program raises an exception, this code runs, and EIP then lands on the data FFFFFFFFH, which is executed as instructions. We can therefore replace the data FFFFFFFFH with the code we want to execute. Here we change it to 9090909090h and test (Figure 10).

(Figure 10)
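The layout described above can be written down as a short sketch. It only packs the two 32-bit fields at the offsets given in the article (48H for the next-record field, 4CH for the handler) and contains no shellcode; the names `SEH_RECORD_OFFSET`, `POP_POP_RET`, and `build_ciphertext` are invented for the example, and the four NOP bytes are an assumed 32-bit placeholder.

```python
import struct

SEH_RECORD_OFFSET = 0x48   # "next record" field; the handler follows at 0x4C
POP_POP_RET = 0x7FFA1571   # address the article found on Windows XP SP3


def build_ciphertext(next_field: bytes = b"\x90" * 4,
                     handler: int = POP_POP_RET) -> bytes:
    """Lay out padding, the next-record field and the SE handler address."""
    if len(next_field) != 4:
        raise ValueError("the next-record field is exactly 4 bytes")
    padding = b"C" * SEH_RECORD_OFFSET
    # x86 stores 32-bit values little-endian, hence "<I".
    return padding + next_field + struct.pack("<I", handler)
```

Note that none of the bytes produced here is zero, which matters because the data is copied with strcpy.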

EIP has now reached this location, which means the vulnerability is exploitable. We only need to place our shellcode here.

6. EXP development: we developed a small exploit generation tool that accepts custom shellcode. Its demo button generates a sample whose code brings up the calculator (Figure 11).

(Figure 11)

When the program loads the exp.smv file, the calculator pops up (Figure 12). What if this code launched a virus instead of the calculator?

(Figure 12)

Example 2: Easy language 5.11 Buffer Overflow Vulnerability

Vulnerability Number: WooYun-2013-44371

Link: http://www.wooyun.org/bugs/wooyun-2010-044371
Vulnerability analysis:
When a function processes its input, the length is not effectively checked, which can cause a buffer overflow. An attacker can construct a special source file, and opening it in the program triggers the vulnerability, which can lead to arbitrary code execution.
Mining ideas:
1. After installing Easy Language, analyze its mining points. The various edit boxes, input fields, and so on can all serve as mining points, but the one with the best effect and the greatest impact is the Easy Language project file (.e file).
2. An over-long string is the first choice of malformed data. Here I demonstrate manual mining on the "information box" control. Create a new window and drag a button onto it. Double-click the button to bring up the code editing area. Enter information box ("xxxxxxxxxxxxxxxxxxx", 0,) and check the effect (Figure 13).

(Figure 13)

Next we manually lengthen the variable string, which is obviously xxxxxxxx. We enter a long string (about h) and save the file. Then we open Easy Language and load the .e source file (or open the .e source file directly), and the program exits unexpectedly. We open the Easy Language program in the OllyDbg debugger and load the .e source file we constructed (Figure 14).

(Figure 14)

EIP is now 78787878 (78H is the ASCII code for "x"), so it is under our control. Looking at the SEH chain, we find that it is under our control as well (Figure 15).

(Figure 15)

Next we can use the pop/pop/ret method; the principle is the same as above. The program performs some self-verification, for which there are two solutions. The first is to encode the shellcode and add the encoded shellcode directly in the .e source code. The second is to reverse the self-verification algorithm, which is easy to reverse, then use a tool such as WinHex to modify the shellcode in the .e file directly and fix up the verification code. In most cases the first is easier.

0x03 General fuzz vulnerability mining: What is general fuzz:

Without studying the file format, use automated testing tools to perform fuzz testing on the target program.

General FUZZ process:

The general fuzz procedure is as follows:

1. Select the target file.

2. Use our tools to mutate the format of the target file and generate a large number of malformed samples.

3. Let the program load and parse each of these malformed samples, and check whether any of them triggers an exception.

4. Reverse-engineer the samples that triggered exceptions to confirm whether they expose a vulnerability and to determine its severity.
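Step 2 of the procedure can be sketched as follows. This is a generic byte-mutation generator written for illustration, not EasyFuzzer's algorithm; the function names, the set of replacement values, and the number of changes per sample are assumptions.

```python
import random


def mutate(sample: bytes, rng: random.Random, max_changes: int = 4) -> bytes:
    """Return a copy of `sample` with a few bytes replaced."""
    if not sample:
        raise ValueError("the sample must not be empty")
    data = bytearray(sample)
    for _ in range(rng.randint(1, max_changes)):
        pos = rng.randrange(len(data))
        # Prefer boundary bytes, with one random byte mixed in.
        data[pos] = rng.choice((0x00, 0x7F, 0x80, 0xFF, rng.randrange(256)))
    return bytes(data)


def generate_samples(sample: bytes, count: int, seed: int = 0):
    """Generate `count` malformed samples; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    return [mutate(sample, rng) for _ in range(count)]
```

Each generated sample is written to disk with the original file extension and handed to the target program in step 3. Keeping the seed lets a crash be reproduced later from the sample number alone.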

General advantages of FUZZ:

Easy to use. You do not need to know the file format to mine vulnerabilities. Full automation and high efficiency.

General FUZZ disadvantages:

The test depth is limited. It is only suitable for file formats with relatively simple structures and is of little use against complicated file formats.

General FUZZ focus:

The quality of a general fuzz tool is usually judged on the following two aspects:

1. Whether the generated samples are malformed enough: only when the samples are sufficiently malformed is the coverage large enough to discover more security risks.

2. Whether the monitoring function is powerful and accurate enough: the monitoring module of a general fuzz tool must accurately detect the exceptions that occur while the program is running, with no false positives and no missed exceptions.
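As a rough illustration of the monitoring side, the sketch below launches a target on one sample and classifies the outcome from the process exit code. A real fuzzer typically attaches as a debugger to capture exception details, so this is a simplification; the function names are hypothetical.

```python
import subprocess


def classify_returncode(code: int) -> str:
    """Map a process return code to 'crash' or 'exit'."""
    # POSIX: a negative return code means the process was killed by a signal.
    # Windows: fatal exception codes such as 0xC0000005 (access violation)
    # have the two highest bits set.
    if code < 0 or (code & 0xC0000000) == 0xC0000000:
        return "crash"
    return "exit"


def run_sample(command, timeout_s: float) -> str:
    """Run `command` (program plus sample path) and classify the result."""
    try:
        done = subprocess.run(command,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The program was still running when the time ran out; it is killed.
        return "timeout"
    return classify_returncode(done.returncode)
```

For GUI programs such as media players, "timeout" is the normal outcome of a sample that did not crash, so only "crash" results need to be logged.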

EASYFUZZER introduction:

EasyFuzzer is a streamlined and efficient fuzz testing tool. Currently it supports only file format fuzzing, in both general fuzz and smart fuzz modes. The latest version is 1.5. It must run on 32-bit Windows XP SP3.

Easyfuzzer features:

Easy: no configuration is required, and it is very easy to use.

Simplified: for small size and high speed, the software is written 100% in assembly language and drops fuzzer functions that proved useless in the past. It is green (portable) software with no malicious components.

High efficiency: because it is written in assembly language and supports multi-threaded fuzz and distributed fuzz, it is extremely fast and efficient.

Flexibility: other fuzzing tools often report normal behavior as abnormal because the fuzzer captures "exceptions" that are not real faults. EasyFuzzer can ignore specified exceptions during debugging, which greatly reduces the false positive rate of fuzzing.

Advanced: supports smart fuzz and user-defined file format samples, which makes it possible to mine vulnerabilities in complex file formats.

Program features:

(Figure 16)

Main window (Figure 16):

Sample file: the normal sample from which the various malformed samples are generated during general fuzz testing. To improve the testing speed, the sample should contain as many data structures as possible while being as small as possible.

Target path: the path in which the malformed samples are generated.

Suffix: the file suffix of the samples.

Host program path: the path of the program to be mined. If the program requires parameters, enter them in the edit box.

For general-purpose fuzz, two sample generation engines are provided.

Engine 1: developed mainly for integer overflow fuzz. It suits small sample files. If the file is large, select a mining scope; otherwise too many test samples are generated, which hurts the test.

Engine 2: developed mainly for stack buffer overflows. If you have enough time, you can also select the x2, x4, and x16 options. They increase the number of samples and therefore the test depth. The options can be combined: if you select all of them, the number of samples is multiplied by 128. In version 1.5, about 250,000 samples are generated.

Generate a file: (for general fuzz) generate the malformed files from the selected sample.
Fuzzing: start the fuzz test.
Stop: end the fuzz test ahead of schedule.
Always on top: if it bothers you that the windows newly started during fuzzing cover the fuzzer, select this option.

(Figure 17)

Fuzz option window (Figure 17):

Run time: how long each sample is monitored. If no exception is detected within this time, the program is closed. Choose the value according to your hardware configuration and the target program. If it is too short, the test cannot run properly; if it is too long, resources are wasted.

Enabling rate: the sample launch rate, that is, the interval at which the program is started. This time is also chosen according to your machine configuration and the target program. For example, if the run time is 2000 ms and the enabling rate is 500 ms, this is equivalent to running four threads.
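The relationship between the two settings is simple arithmetic, sketched below. The function name is made up; the only assumption is that a new instance is launched every enabling-rate interval and each instance lives for the run time.

```python
import math


def concurrent_instances(run_time_ms: int, enabling_rate_ms: int) -> int:
    """Approximate number of target instances alive at the same time."""
    if run_time_ms <= 0 or enabling_rate_ms <= 0:
        raise ValueError("both times must be positive")
    return max(1, math.ceil(run_time_ms / enabling_rate_ms))
```

With a run time of 2000 ms and an enabling rate of 500 ms the result is 4, matching the example in the text. Choosing an enabling rate longer than the run time yields 1, which is the setting to use for targets that cannot run several instances at once.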

Valid exception log path: the path for storing exception logs. When an exception occurs in the program, the exception log is stored in the selected path.

Ignore the following exceptions: some target programs throw certain exceptions by design. This function ignores those exceptions instead of intercepting them, which reduces false positives and improves efficiency.

Enable the custom sample function: accepts malformed samples generated by third-party programs. It can also be used to resume a fuzz run that was not completed last time. You only need to fill in the total number of samples and the starting sample number.

Enable distributed mining: used to mine one program on multiple machines (or virtual machines). For example, if a program needs to be tested with 10000 samples and you run five virtual machines, install the software on all five, enter 5 as the total number of machines, and enter 1, 2, 3, 4, and 5 as the respective machine numbers. Each virtual machine then only has to test 2000 samples.
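The split in the example can be sketched as follows. How EasyFuzzer actually assigns samples to machines is not described in the article, so the contiguous blocks used here are an assumption, and `sample_range` is a hypothetical helper.

```python
def sample_range(total: int, machines: int, machine_no: int) -> range:
    """Return the sample indices handled by one machine.

    machine_no is 1-based, as in the dialog described above.
    """
    if machines < 1 or not 1 <= machine_no <= machines:
        raise ValueError("machine_no must be between 1 and machines")
    per_machine, extra = divmod(total, machines)
    # The first `extra` machines each take one additional sample.
    start = (machine_no - 1) * per_machine + min(machine_no - 1, extra)
    stop = start + per_machine + (1 if machine_no <= extra else 0)
    return range(start, stop)
```

For 10000 samples on five machines, machine 1 gets samples 0 to 1999 and machine 5 gets samples 8000 to 9999, so each one tests 2000 samples.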

Smart fuzz window (Figure 18): this window is relatively simple, just a code editing area. Enter the code and the suffix, then click OK to generate the samples.

(Figure 18)

Examples of General FUZZ vulnerability mining:

The following example shows how to use easyfuzzer to mine software vulnerabilities. The target we selected is sdemo 2.0.

Mining process: sample generation: first we generate a very small normal sample file and name it sample.smv. Then we start EasyFuzzer and fill in the parameters for sample generation, which are described in detail above (Figure 19).

(Figure 19)

Here we select Engine 2 to generate the samples. Figure 20 shows the generated samples.

(Figure 20)

Sample test: the default configuration is sufficient. Click Fuzzing to fuzz the samples. Figure 21 shows the fuzz process.

(Figure 21)

For reasons of time, the test was ended ahead of schedule. The main window now shows some monitoring results (Figure 22). Opening an exception log shows the ID of the failing sample and the error information. We then located the sample with that ID and verified that it crashes the program.

(Figure 22)

0x04 Smart fuzz vulnerability mining: Introduction to smart fuzz

What is smart fuzz: smart fuzz is defined relative to general fuzz. On simple file formats, general fuzz is fast, hard-hitting, and accurate. Against complicated file formats, however, it is of little use, because complex formats contain a lot of structure validation, and most samples generated by general fuzz fail that validation and are therefore invalid. This is where smart fuzz comes in. With smart fuzz, you analyze the structure of the file and write code that expresses it. The fuzzer then generates malformed samples within the constraints of that code, and the rest is the same as general fuzz: run the malformed samples and monitor for exceptions.

Advantages of smart fuzz: high execution efficiency and good results. It can mine software vulnerabilities that other methods cannot find.

Disadvantages of smart fuzz: you must study the file format thoroughly and write the corresponding format script, which takes a long time.

Steps for smart fuzz:

1. Study the file formats to be processed by the program, including the various data structures and constraints of the format.

2. Following the code rules defined by the fuzzer, write code that describes the structure of the file format.

3. Use the fuzzer to generate a large number of malformed samples from the code we wrote.

4. Let the fuzzer run the target program on the malformed samples and monitor it for exceptions. If the program raises an exception, the offending sample and the related information are kept.

5. Use OllyDbg, WinDbg, IDA, and other tools to analyze the crash information, confirm whether it is a vulnerability, and determine its severity.

EASYFUZZER:

The following is an example of using smart fuzz to mine software vulnerabilities. First, the code rules. In easyfuzzer 1.5, a maximum of codes are supported, and each code can be up to bytes in length. Each piece of code must end with ";", and parameters are separated by ",". As of version 1.5, easyfuzzer supports four commands.

First: _num, parameter 1, parameter 2, parameter 3, parameter 4; for example: _num, 1,; _num has four parameters:

Parameter 1: a numeric value. It can be decimal (such as 100,) or hexadecimal (such as H or deaddeadh). Hexadecimal letters are case-insensitive, and hexadecimal values must end with h.

Parameter 2: whether the value mutates. 0 means the value mutates, and 1 means it does not. Mutation means that the value changes across the samples generated later. If this parameter is 0 (mutable), the content of parameter 1 makes no difference (parameter 1 is not parsed).

Parameter 3: endianness. 0 means little-endian (the low-order byte is stored at the low address), and 1 means big-endian (the low-order byte is stored at the high address). For example, 12345678H is stored in memory as 12 34 56 78 in big-endian form and as 78 56 34 12 in little-endian form.
Parameter 4: the size of the value. Currently, 8-bit, 16-bit, and 32-bit data types are supported, that is, 1 byte, 2 bytes, and 4 bytes. 8 bits, such as AAH. 16 bits, such as AABBH. 32 bits, such as AABBCCDDH.
The following statements are all valid:

_num,11111111h,1,0,32; _num,2222h,1,0,16;_num,ffh,1,0,8;_num,254,1,0,8;_num,12345678h,1,0,32;_num,AABBh,1,1,16;
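To make the _num rules concrete, here is a small Python sketch that renders the non-mutated value of a _num statement into bytes. It is an interpretation of the rules above, not EasyFuzzer's code; `parse_value` and `encode_num` are names invented for the example, and mutation (parameter 2) is not modeled.

```python
def parse_value(text: str) -> int:
    """Parse a decimal value, or a hexadecimal value with a trailing h."""
    text = text.strip().lower()
    if text.endswith("h"):
        return int(text[:-1], 16)
    return int(text, 10)


def encode_num(value_text: str, big_endian: int, bits: int) -> bytes:
    """Render parameter 1 using parameter 3 (endianness) and 4 (size)."""
    if bits not in (8, 16, 32):
        raise ValueError("only 8-, 16- and 32-bit values are supported")
    byteorder = "big" if big_endian else "little"
    return parse_value(value_text).to_bytes(bits // 8, byteorder)
```

For instance, `encode_num("12345678h", 1, 32)` yields the bytes 12 34 56 78, while `encode_num("12345678h", 0, 32)` yields 78 56 34 12, as in the endianness example above.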

Second: _str, parameter 1, parameter 2, parameter 3, parameter 4, parameter 5, parameter 6; for example: _str, helloworld, 1; _str has six parameters:

Parameter 1: string value. Enter a string, such as helloworld.

Parameter 2: whether the value mutates. 0 means it mutates, and 1 means it does not.

Parameter 3: the length of the string, in bytes.

Parameter 4: string type. 0 indicates STR type, and 1 indicates HEX type. For example, 123456. If it is 0, the data in the output memory is 313233343536. If it is 1, the data in the output memory is 123456.

Parameter 5: the size of the prefix, in bits. Valid values: 0, 8, 16, and 32. The prefix holds the length of the string. If you do not need a prefix, set this to 0.

Parameter 6: the byte order of the prefix. 0 means little-endian, and 1 means big-endian.
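The _str rules can be sketched the same way. Two points are assumptions rather than documented behavior: a length of 0 is taken to mean "use the value as written", and a value shorter than the declared length is padded with zero bytes (this matches the MThd example later in the article, where a 4-character tag is declared with length 6). The function name is hypothetical and mutation is not modeled.

```python
def encode_str(value: str, length: int, hex_type: int,
               prefix_bits: int, prefix_big_endian: int) -> bytes:
    """Render a non-mutated _str statement (parameters 1 and 3 to 6)."""
    # Parameter 4: 0 = STR (characters), 1 = HEX (pairs of hex digits).
    body = bytes.fromhex(value) if hex_type else value.encode("ascii")
    if length:
        # Assumed behavior: truncate or zero-pad to the declared length.
        body = body[:length].ljust(length, b"\x00")
    if prefix_bits not in (0, 8, 16, 32):
        raise ValueError("prefix size must be 0, 8, 16 or 32")
    prefix = b""
    if prefix_bits:
        byteorder = "big" if prefix_big_endian else "little"
        prefix = len(body).to_bytes(prefix_bits // 8, byteorder)
    return prefix + body
```

With the value 123456, the STR type produces the bytes 31 32 33 34 35 36 and the HEX type produces 12 34 56, as described for parameter 4.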

Third: _cal, addr, parameter 1, parameter 2, parameter 3; for example: _cal, addr, 32, 0, 3; this command has three parameters:

Parameter 1: the width of the calculated result. Valid values: 8, 16, and 32, meaning 8-bit, 16-bit, and 32-bit (1 byte, 2 bytes, 4 bytes).

Parameter 2: the byte order of the result. 0 means little-endian, and 1 means big-endian.

Parameter 3: the target of the calculation (function serial number). In the example above, 3 means the offset address of the third function is calculated.

Fourth: _cal, size, parameter 1, parameter 2, parameter 3, parameter 4; for example: _cal, size, 32, 0, 4, 6; this command has four parameters:

Parameter 1: the width of the calculated result. Valid values: 8, 16, and 32, meaning 8-bit, 16-bit, and 32-bit (1 byte, 2 bytes, 4 bytes).

Parameter 2: the byte order of the result. 0 means little-endian, and 1 means big-endian.

Parameter 3: the start function (function serial number).

Parameter 4: the end function (function serial number). In the example above, the start value is 4 and the end value is 6, so the total size of the three functions 4 through 6 is calculated. Note: the start value must not be greater than the end value.
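The two _cal forms reduce to sums over the rendered pieces of the file. The sketch below assumes that serial numbers are 1-based and that offsets are measured from the start of the file, neither of which the article states explicitly; `cal_addr` and `cal_size` are illustrative names.

```python
def cal_addr(fields, target: int) -> int:
    """Offset of the 1-based field `target` from the start of the file."""
    if not 1 <= target <= len(fields):
        raise ValueError("target is not a valid serial number")
    return sum(len(field) for field in fields[:target - 1])


def cal_size(fields, start: int, end: int) -> int:
    """Total size of the 1-based fields `start` through `end`, inclusive."""
    if not 1 <= start <= end <= len(fields):
        raise ValueError("need 1 <= start <= end <= number of fields")
    return sum(len(field) for field in fields[start - 1:end])
```

Here `fields` is the list of byte strings produced by the statements in order, with each _cal statement occupying its own fixed-width slot. The result is then written out with the width and byte order given by parameters 1 and 2.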

Vulnerability mining example: mining a memory corruption vulnerability in the Windows XP SP3 system player

Our target is the Windows XP SP3 media player. Here we select a simple file format: the mid file format. The following briefly describes it.

1. Introduction to the mid file format: a MIDI file consists of two parts, the header block and the track block. For more information, search the Internet.

2. Header block: the header block appears at the beginning of the file and always looks like this: 4D5468640000 0006 ffff nnnn dddd. 4D5468640000 is the header block tag, 0006 is the length of the header data, ffff is the file format, nnnn is the number of tracks in the MIDI file, and dddd is the number of ticks per quarter note.

3. Track block: 4D54726B xxxxxxxx aaaaaaaaaaaaaa. 4D54726B is the track block tag, xxxxxxxx is the size of the track data, and aaaaaaaa is the track data. Apart from the tags of the header block and the track block, every structure should be fuzzed, that is, its content is allowed to mutate. Below is the code I wrote based on my understanding of the mid file structure:

_str,MThd,1,6,0,0,0; _cal,size,16,1,1,1;_num,ffffh,0,0,16;  _num,ffffh,0,0,16;_num,ffffh,0,0,16;_str,MTrk,1,4,0,0,0; _cal,size,16,1,8,8;_str,fffffffffffffffffff,0,0,0,0,0;

Code description:

Statement 1: a string type holding the header tag MThd, with a length of 6;

Statement 2: the length of the string produced by statement 1, in big-endian form, 16 bits wide;

Statement 3: a 16-bit value that mutates. Because the value is mutable, the first parameter (FFFFH) is not parsed and any value would do;

Statements 4 and 5 are similar to statement 3;

Statement 6 is similar to statement 1;

Statement 7: the size of the data generated by statement 8;

Statement 8: a string type that mutates. It represents the track block.
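As a cross-check of the description, the eight statements can be mirrored in Python. The sketch follows the script as written rather than the MIDI specification (for example, it uses the 16-bit track size and the little-endian header fields that the script declares); the function name and its parameters are hypothetical, and the mutable values are simply passed in by the caller.

```python
import struct


def build_mid_sample(fmt: int, tracks: int, division: int,
                     track_data: bytes) -> bytes:
    """Assemble one sample the way the eight statements describe it."""
    header_tag = b"MThd".ljust(6, b"\x00")             # statement 1
    header_size = struct.pack(">H", len(header_tag))   # statement 2
    header_fields = struct.pack("<HHH", fmt, tracks, division)  # 3 to 5
    track_tag = b"MTrk"                                # statement 6
    track_size = struct.pack(">H", len(track_data))    # statement 7
    return (header_tag + header_size + header_fields
            + track_tag + track_size + track_data)     # statement 8 last
```

Feeding different values for the three header fields and the track data into this function corresponds to the mutation the fuzzer performs on statements 3, 4, 5, and 8, while the two size fields are recomputed for every sample.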

After writing the code, click OK to generate the samples, which takes only a moment (Figures 23 and 24). A total of 11137 malformed samples were generated.

(Figure 23)

(Figure 24)

Next we can start fuzzing. If the system player has never been run, run it once first to complete its initial configuration; otherwise the fuzz run fails. In addition, the Windows XP system player does not support multi-threaded fuzzing (several instances running at once), so the enabling rate must be set to a time longer than the run time. Here we set the enabling rate to 2200 milliseconds and the run time to 2000 milliseconds. After setting, click OK (Figure 25).

(Figure 25)

Because we rely on code to generate the samples, no template file is needed. Click the FUZZING button (Figure 26).

(Figure 26)

To make the process easier to watch, select the Always on top option. The fuzz process takes a long time: with my options, the full test takes six hours. To shorten it, you can start several virtual machines and select distributed mining. On a desktop-class i7 CPU, six virtual machines (which need plenty of memory) finish in about an hour. For reasons of time, I ended the fuzz run ahead of schedule. The good news is that it had already found a malformed sample that triggers an exception (Figure 27):

(Figure 27)

Now let's look at the log (Figure 28).

(Figure 28)

The log shows that a division by 0 caused the crash. We now test manually. Find the samples 2.16.mid and 00000043. mid and verify whether the crash is caused by a division by 0 (Figures 29 and 30).

(Figure 29)

(Figure 30)

As the log says, a division by 0 causes the crash.

0x05 Summary:

This article describes only the three most popular methods for binary vulnerability mining. There are many others, and any method that discovers vulnerabilities is a good method. For this reason, vulnerability mining is less a technique than an art. More mining methods remain to be explored and discovered.

Finally, some suggestions for newcomers. Learn development, so that you can understand a program from the programmer's point of view. Learn reverse engineering, become good at debugging programs, and understand how code works at the lowest level. Analyze many vulnerabilities to spark ideas. Practice mining vulnerabilities hands-on rather than only on paper. Experience is the most important thing.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
