Libc heap management mechanism and vulnerability exploitation techniques (I)
Some time ago, out of boredom, I took part in a competition called RCTF. The stronger players crushed me and I did not make the finals. I did not take it too seriously, but it still stung a little. As the saying goes, knowing shame breeds courage, so as a Windows virus analyst I resolved to dig into the Linux libc heap management mechanism. Overall, libc's heap management still has plenty of problems, and it takes security into account far less than Windows does. The research draws on existing material and also includes some methods I came up with myself, although I am certainly not the first to find them. Because I do not like reading source code, most of the research was done in a debugger. If you spot any errors, please point them out ~
0x01 Libc heap analysis
1.1 heap management structure
struct malloc_state {
    mutex_t mutex;                      /* Serialize access. */
    int flags;                          /* Flags (formerly in max_fast). */
#if THREAD_STATS
    /* Statistics for locking. Only used if THREAD_STATS is defined. */
    long stat_lock_direct, stat_lock_loop, stat_lock_wait;
#endif
    mfastbinptr fastbins[NFASTBINS];    /* Fastbins */
    mchunkptr top;
    mchunkptr last_remainder;
    mchunkptr bins[NBINS * 2];
    unsigned int binmap[BINMAPSIZE];    /* Bitmap of bins */
    struct malloc_state *next;          /* Linked list */
    INTERNAL_SIZE_T system_mem;
    INTERNAL_SIZE_T max_system_mem;
};
The malloc_state structure is the one we deal with most often. Its important fields are as follows:
fastbins: holds several linked lists. Each list is made up of free fastbin chunks and serves as a fastbin freelist.
top: the top chunk, which points to the remaining space in the arena. If all freelists are empty, heap blocks are carved from the top chunk.
bins: holds several doubly linked lists. Each bins entry has the same fd/bk layout as a heap block header and joins the freed blocks to form a circular doubly linked free list (freelist). The bins entry acts as the head of the freelist, and its backward pointer (bk) points to the logically first node in the freelist. When a chunk is allocated, the search for a suitable heap block starts from that logically first node.
The entire Heap Structure is roughly as follows:
1.2 heap Block Structure
struct malloc_chunk {
    INTERNAL_SIZE_T prev_size;
    INTERNAL_SIZE_T size;
    struct malloc_chunk *fd;
    struct malloc_chunk *bk;
};
prev_size: the size of the adjacent previous heap block. This field is meaningful only when the previous heap block is free (and is a normal chunk rather than a fastbin chunk). Its most important (perhaps only) purpose is to let a heap block that is being freed merge quickly with the adjacent free block before it. The field is not counted in the size of the current heap block. While the previous heap block is in use, the field holds data written by the user of that previous block. libc does this mainly to save 4 bytes of memory per block (on 32-bit systems), but this space optimization has caused many security problems.
size: the length of the heap block, computed as: length of the size field + length requested by the user + alignment padding. libc aligns to size_t x 2, so 32-bit systems align to 4 x 2 = 8 bytes and 64-bit systems to 8 x 2 = 0x10 bytes. Because the size is always at least 8-byte aligned, it is a multiple of 8 and the lowest three bits of the size field would always be 0, so libc uses these three bits as flags. The key one is the lowest bit (prev_inuse), which records whether the adjacent previous heap block is allocated or free; if it is in use, the bit is 1. To decide whether the current heap block is free, libc checks whether the prev_inuse bit of the next heap block is 1. This is also the key to exploiting vulnerabilities such as double free and null byte overflow.
fd & bk: the forward and backward pointers that link the block into a doubly linked free list. They are meaningful only after the heap block has been freed; while the block is allocated, these two fields hold user data. The two fields can be abused to leak memory addresses (the libc bss address) and to build a DW shoot (DWORD shoot, i.e. an arbitrary memory write).
It is worth mentioning that libc distinguishes logical structures such as fastbin and normal chunk by heap block size. They all share the malloc_chunk storage layout, but some fields are used slightly differently. For example, a fastbin chunk does not use the bk pointer, because the fastbin freelist is a singly linked list.
1.3 Heap allocation process
I have not analyzed all of the malloc source code, so I list only a few key points that are useful for exploitation.
malloc handles a request differently depending on the requested size. Fastbin and normal chunk are the common cases. The overall order is as follows: if the block is small enough to be a fastbin chunk, malloc looks for a suitable block in the fastbin lists. If the size falls in the normal chunk range, it looks in the normal bins (unsorted, small, large). If those bins are empty or no block fits, the block is carved from the region the top chunk points to.
Bin: like other heap managers, libc does not immediately return freed memory to the system. It keeps the memory itself so that it can be reused by a later allocation, which reduces the number of calls into the kernel and improves efficiency. The place where libc keeps freed memory is called a bin. A bin is a pointer to a linked list (singly or doubly linked), and the list is made up of freed blocks.
The bin in Libc has the following types:
Fast bin
Unsorted bin
Small bin
Large bin
Fast bin:
Small heap blocks use fastbins. The list that holds freed fastbin chunks is singly linked, and fastbin chunks are not merged with other heap blocks, so they are faster. The fastbins array in the malloc_state structure has 10 members, i.e. 10 singly linked fastbin lists. All free blocks on the same list have the same size. On a 32-bit system, the 10 lists hold blocks of 16 to 88 bytes in steps of 8 bytes (the alignment granularity). In my own tests, however, a freed block larger than 64 (0x40) bytes no longer goes into a fastbin, so only the first seven fastbin lists are actually used. I have not looked into the reason in detail; it is presumably the default fastbin size limit, which mallopt() can change through M_MXFAST.
The figure shows the fastbin list structure. For any of the lists, when a fastbin chunk is freed, it is linked in through the fd pointer that malloc_state keeps for that list, which places it at the end of the queue. Because the list is singly linked and has no bk pointer, malloc, for efficiency, also allocates from that same end by following the fd pointer, and the block at that end is the one freed most recently. The fastbin list therefore works in LIFO order: the block freed last is allocated first.
Normal bin:
Heap blocks other than fastbin chunks are normal chunks. When they are freed, they are placed in the bins array of the malloc_state structure. The bins array contains 126 elements, assigned as follows:
Bin 1 – Unsorted bin
Bin 2 to Bin 63 – Small bin
Bin 64 to Bin 126 – Large bin
When a normal chunk is freed, it is first inserted into the unsorted bin in order of release. On malloc, libc follows the bk pointer to the first member of the list and starts looking for a suitable block from there. The order is therefore FIFO: blocks freed first are allocated first. Once a suitable block is found in the unsorted bin, the list members that were examined before it are removed from the unsorted bin and placed in their corresponding small or large bins. After that, whenever any of these blocks is freed again, it goes back into the unsorted bin.
Security issues during heap allocation:
A) When malloc searches the bin lists for a suitable block, it judges a free block solely by that block's size field. It does not check whether the prev_size field of the next heap block agrees with the size of the current block. This is typical lazy design: security is ignored entirely for the sake of efficiency. Because of this, a single-byte overflow can lead to further memory overflow and memory overlap.
B) In addition, when a block is removed from the unsorted bin list, a quirk of how the heap management structure is laid out means that a DW shoot built through an overflow writes the address of the top chunk field to an arbitrary memory address. If the program can then edit memory through the pointer stored at that address, the content of the top chunk field can be tampered with, which opens the way to further exploitation. malloc does not apply the safe unlink check during this removal. Examples will be given in later chapters.
C) Finally, when the block malloc finds in a list is larger than the requested size, the block has to be split: a piece is cut from the larger free block and handed to the requester. This operation must update the prev_size field of the block that follows the free block. malloc locates that prev_size field from the size field of the current free block, but it does not validate the field's original value (that is, whether it matches the size field) before overwriting it. So if a single-byte overflow has corrupted the size field of the free block, the update lands in the wrong place and the prev_size field of the real next heap block is left stale.
1.4 Heap block release process
The free () process can be roughly divided into the following processes:
1) check some basic heap block length fields (for example, size> = min_size and size <= max_size). 2) locate the header of the next adjacent heap Block Based on the length field of the current heap block. The next adjacent heap block must be a valid heap block, and the pre_inuse bit of the header must be 1 (that is, the current heap block is in use to prevent double free ): next_chunk-> size & 0x1 = 13) check whether the current heap block is in the freelist header, mainly to detect double free. However, this detection is very imperfect, because libc does not traverse the entire freelist for efficiency, so as long as the current heap block is in other locations of freelist, free () the function will still release the heap block. 4) check whether the adjacent heap blocks before and after the current heap block are released. If yes, merge the idle heap blocks. The operations here involve many opportunities for exploits of vulnerabilities.
First, the free () function checks whether the previous heap block is released, mainly based on the pre_inuse bit and pre_size field. If the pre_inuse bit is 0, the previous adjacent heap block is merged. Specifically, find the header of the previous heap Block Based on the pre_size field, and then remove the heap block from the free list (unlink) according to the fd and bk pointers of the header ), and add the newly merged heap block to the free list.
The vulnerability exploitation opportunities involved in this process are as follows:
A) After free() decides from the prev_inuse bit that the previous heap block is free, it finds that block's header from prev_size. Having found the header, it does not compare prev_size against the header's size field; it goes straight on to the list operations and the merge. So if the prev_size field has been tampered with (overflow, single-byte overflow, double free), or was updated incorrectly when a block was freed (null byte overflow), there is a lot of room for exploitation, such as creating memory overlaps.
B) The unlink operation can lead to a DW shoot. This is a classic exploitation technique. The logic of unlink is:
fd->bk = bk
bk->fd = fd
If an overflow or another vulnerability is used to tamper with the fd and bk pointers of a freed heap block, an arbitrary memory write can be achieved. To prevent DW shoot, libc uses a mechanism called safe unlinking. Put simply, before unlink performs its writes, it validates the fd and bk pointers using the properties of a doubly linked list: it checks that the bk pointer of the block that fd points to leads back to the block itself (and likewise for the fd pointer of the block that bk points to). The code is as follows:
if (__builtin_expect (FD->bk != P || BK->fd != P, 0))
    malloc_printerr (check_action, "corrupted double-linked list", P, AV);
Because of safe unlinking, DW shoot is restricted: it is hard to write arbitrary data to an arbitrary address. In practice we usually have to rely on some data management structure of the vulnerable program. Put simply, a pointer to the current heap block must exist near the memory address we want to write. This limits exploitation but does not make it impossible. Following the "Security Island" principle, we can use such data management structures to get around the restriction. The Double free exploitation examples later on show how this is applied in practice.
0x02 Vulnerability exploitation examples
2.1 Buffer Overflow
Overflow vulnerabilities come in different forms. Some overflow many bytes and can reach the fd and bk pointers; others can only reach the size field. The exploitation method depends on the actual situation.
A) Null byte overflow (off by one): plaidDB (550 points, PlaidCTF)
Method 1:
Use the shrink free chunk size technique: prev_size is updated in the wrong place when a later allocation splits the block, which produces memory overlap.
This method is someone else's. I read it carefully and learned a lot about libc from it; many thanks to the author. Link: http://blog.frizn.fr/pctf-2015/pwn-550-plaiddb
However, I personally find the heap layout for this method hard to construct. The same result can be reached with simpler methods.
Method 2:
Clear the prev_inuse bit, free the block to trigger a merge, and obtain overlapping memory.
The method above is fairly complicated, and for plaidDB the heap layout is troublesome to build. A few days ago, while sitting on the toilet, I thought of a simpler method that applies more widely. The prev_size field of a heap block header lies inside the user space of the previous adjacent block, and the prev_inuse bit sits right after the end of that user space. With a null byte overflow we can therefore not only make prev_size larger but also clear the prev_inuse bit. When the block is then freed, a merge with the previous block is triggered. Because prev_size has been enlarged, other blocks that are still in use are wrongly merged as well, so the next block we allocate overlaps some blocks that were never freed. The details are as follows:
The figure shows the initial memory state.
Then a null byte overflow occurs in heap block A. First, the prev_size field of B, which lies inside heap block A, is set to the sum of the lengths of x + fast + A. Then the null byte overflow overwrites the lowest byte of the size field in heap block B's header, clearing the prev_inuse bit to 0. This makes libc wrongly believe that the block before heap block B is free.
If heap block B is now freed, then according to the release process described earlier, libc first uses prev_inuse to decide whether the adjacent previous block is free. Since it appears free, libc finds the header of the previous block from the prev_size field. After the previous step this leads to the header of heap block x, which is then removed from the freelist through the safe unlink operation. The easiest way to make sure safe unlink does not fail is for x to really be free, so we free x before freeing heap block B. The whole region from heap block x to heap block B is then treated as freed.
It is worth noting that heap block x and heap block B must be separated by two heap blocks. Without heap block fast, when x is freed libc needs to know whether the next adjacent block A is free. It gets this by using A's size field to locate heap block B and then reading B's prev_inuse bit. The earlier step has already cleared B's prev_inuse bit, so libc would consider A free and try to unlink it, which causes an error. (Of course, if x is freed before the null byte overflow happens, this problem does not arise.)
Finally, if we request a new heap block y, it overlaps heap block fast and heap block A, which lets us leak and tamper with memory.
The verification code is as follows:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumes 64-bit glibc behaving as described in this article
   (no tcache, no extra checks on prev_size). */
int main(void)
{
    char *x, *fast, *A, *B, *C, *y;

    x = malloc(0x100 - 8);
    memset(x, 'x', 0x100 - 8);
    fast = malloc(1);
    memset(fast, 'F', 3);
    A = malloc(0x100 - 8);
    memset(A, 'A', 0x100 - 8);
    B = malloc(0x100 - 8);
    memset(B, 'B', 0x100 - 8);
    C = malloc(0x80 - 8);
    memset(C, 'C', 0x80 - 8);

    /* Layout: x | fast | A | B | C
     * Why is fast needed? Without it, freeing x makes libc check whether the
     * adjacent next block (A) is free by reading prev_inuse in B's header.
     * B's header has been tampered with, so an error would occur. */

    /* A has a null byte overflow: it clears the prev_inuse bit of B. */
    A[0x100 - 8] = 0x00;

    /* Change B's prev_size (stored in A's own memory) to 0x220 = x + fast + A. */
    A[0xF0] = 0x20; A[0xF1] = 0x02; A[0xF2] = 0x00; A[0xF3] = 0x00;
    A[0xF4] = 0x00; A[0xF5] = 0x00; A[0xF6] = 0x00; A[0xF7] = 0x00;

    printf("before trigger vul, A: %s\n", A);
    printf("before trigger vul, fast: %s\n", fast);

    free(x);    /* avoid the safe unlinking failure when merging x -> B */
    free(B);    /* merge from x to B, which now overlaps fast and A */

    y = malloc(0x150 - 8);
    memset(y, 'w', 0x150 - 8);

    printf("after trigger vul, A: %s\n", A);
    printf("after trigger vul, fast: %s\n", fast);
    return 0;
}
Method 3:
All of the methods above produce memory overlap, but a null byte overflow can also be used to build a DW shoot. Below I describe how to build a DW shoot through a forward merge and a backward merge.
Forward merge:
The idea is basically the same as above. By modifying prev_size and prev_inuse, we make libc believe the previous adjacent heap block is free. When the block is then freed, libc removes that previous block from the freelist, which performs an unlink, and the unlink gives us the DW shoot.
As shown in, by modifying the pre_size in heap block A, libc can mistakenly locate our forged "previous" heap block header when releasing heap Block B. Then we can construct the DW shoot through the fd and bk pointers. However, a safe unlink issue needs to be considered. We have introduced the safe unlink mechanism in the "Release Process of heap blocks" section in the previous chapter. Therefore, we cannot specify pre_size at will, because the wrong heap block address found based on pre_size must be stored in a variable location of the program. Generally, a program has a management structure for managing some data.
As shown in, when a user applies for a heap block, libc returns the alloc ptr position instead of the starting position (free ptr) of the entire heap block to the user, you can define the data on the stack from the address of this pointer. After the heap block is released, the fd or bk pointer on the idle linked list stores the starting position of the entire heap block, that is, the free ptr position (so safe unlink determines the free ptr position ).
Generally, the program stores user data, so it stores the alloc ptr location. Therefore, our pre_size should be the length of heap Block A minus size * 2, that is, it is located at alloc ptr.
The verification code is as follows:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

long gl[0x40];

/* Assumes 64-bit glibc behaving as described in this article
   (no tcache, no extra checks on prev_size). */
int main(void)
{
    char *A, *B, *C;

    /* set global var */
    memset(gl, 'i', 0x3F);

    A = malloc(0x100 - 8);
    B = malloc(0x100 - 8);
    C = malloc(0x200 - 8);      /* for stability */
    memset(C, 'c', 0x200 - 8);

    /* size of the fake header (0x111); its prev_inuse bit must be 1 */
    *(unsigned long *)(A + 0x08) = 0x111;
    /* fd, so that A->fd->bk == A (0x6010E8 in the author's build) */
    *(unsigned long *)(A + 0x10) = (unsigned long)&gl[0x10] - 0x18;
    /* bk, so that A->bk->fd == A (0x6010F0 in the author's build) */
    *(unsigned long *)(A + 0x18) = (unsigned long)&gl[0x10] - 0x10;
    /* change B's prev_size (in A's own memory) to point to A's fake header */
    *(unsigned long *)(A + 0xF0) = 0xF0;
    /* null byte overflow, the vulnerability: clear B's prev_inuse bit
       so that free(B) causes a forward merge */
    A[0x100 - 8] = 0x00;

    gl[0x10] = (long)A;         /* avoid the safe unlinking failure */
    printf("Before DW, global[0x10] is: %p\n", (void *)gl[0x10]);
    free(B);                    /* trigger the merge, which causes the DW shoot */
    printf("After DW, global[0x10] is: %p\n", (void *)gl[0x10]);
    printf("Done\n");
    return 0;
}
As the code shows, the global variable initially stores the address of heap block A. After the DW shoot it holds the address 0x18 bytes before the global variable itself. If the program can edit the memory the global variable points to, many things become possible, for example overwriting the global variable with the address of a GOT entry (because the global variable now points to a location 0x18 bytes before itself).
0x03 Conclusion
Recently I wrote a PWN challenge for a competition that turns the null byte overflow above into a DW shoot. The challenge and its exploit will be released when the time is right.
In fact, whenever a DW shoot can be built through a forward merge, memory overlap can be built as well. The only difference is that a DW shoot requires carefully chosen fd and bk pointers, whereas memory overlap requires freeing a heap block first so that its fd and bk are genuine, then letting the merge proceed normally, and then allocating again to tamper with the memory.
Because of the length limit, exploitation of traditional heap overflows and Double free vulnerabilities is left for the next article ~