1. Overview
A buffer overflow, often called a stack overflow when it happens on the stack (it goes by many other names as well), is a vulnerability that programs cannot fully avoid unless a new design replaces the stack on which programs run.
The purpose of the overflow is to overwrite the program's run-time stack so that, when a function returns, execution jumps to code placed on the stack. That preset code is usually called shellcode. It typically spawns a shell, and if the target program runs with elevated privileges, that shell may be a root shell.
2. Principle of buffer overflow
Every running program has the same logical memory layout. On Linux/Unix the layout generally looks as follows:
A buffer overflow exploits the stack segment in this layout.
Heap: typically used for dynamic storage allocation. For example, the C standard library function malloc allocates memory from the heap.
Stack: holds automatic variables and the information saved each time a function is called. The stack grows from high addresses toward low addresses, and it is last in, first out: putting data on the stack is like pushing things into a hole, and you can only take out whatever is nearest the opening.
Most importantly: the return address of a function call is saved on the stack.
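The growth direction of the stack can be observed from C. The following is a minimal sketch, not part of the original article: the function names are hypothetical, and `__attribute__((noinline))` is a GCC/Clang extension used here so that the callee really gets its own stack frame.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compares the address of one of its own locals with
 * an address taken in the caller's frame. */
__attribute__((noinline))
static int callee_is_below(uintptr_t caller_addr)
{
    volatile char marker = 0;           /* lives in the callee's frame */
    assert(caller_addr != 0);
    return (uintptr_t)&marker < caller_addr;
}

/* Returns 1 if a newly called function's locals sit at a lower address
 * than the caller's locals, i.e. the stack grows toward low addresses. */
__attribute__((noinline))
int stack_grows_down(void)
{
    volatile char marker = 0;           /* lives in the caller's frame */
    return callee_is_below((uintptr_t)&marker);
}
```

On the x86-64 Linux systems discussed in this article, one would expect `stack_grows_down()` to return 1 when called from `main`. Other architectures may differ.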
The purpose of a buffer overflow is to overwrite the return address stored on the stack with overflow data. When the function returns, it then jumps to a preset address and executes the implanted code.
3. Background knowledge needed for buffer overflows
The stack layout of the program at run time; see the figure.
C language basics: see an introductory C book such as "The C Programming Language".
Assembly basics: see an introductory book such as "Assembly Language Step-by-Step". A brief introduction follows.
Registers: there are eight general-purpose registers, AX, BX, CX, DX, DI, SI, BP, and SP. x86_64 CPUs add eight more, R8 through R15. Among the special-purpose registers, the most important is IP, which always points to the address of the next instruction the CPU will execute. This matters because the purpose of a buffer overflow is to change the value of IP.
Register names usually carry a prefix that indicates their width: 32-bit registers are prefixed with E (EAX, EBX) and 64-bit registers with R (RAX, RBX).
Although BP and SP are general-purpose registers, they are dedicated to the stack: BP points to the base (bottom) of the current stack frame and SP points to the top of the stack.
Because a 64-bit CPU has more general-purpose registers than a 32-bit one, function parameters are passed in DI, SI, DX, CX, R8, and R9. AX typically holds the function's return value.
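The width prefixes can be pictured with plain integers. On x86-64, EAX names the low 32 bits of RAX and AX names the low 16 bits. The sketch below only models that relationship with masks; it does not read any real register, and the function name `low_bits` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Models how a narrower register name aliases the low bits of a 64-bit
 * register. low_bits(rax, 32) plays the role of EAX and
 * low_bits(rax, 16) plays the role of AX. */
uint64_t low_bits(uint64_t reg, unsigned width)
{
    assert(width >= 1 && width <= 64);
    if (width == 64)
        return reg;                      /* avoid an undefined 64-bit shift */
    return reg & ((UINT64_C(1) << width) - 1);
}
```

For example, if the modelled RAX holds 0x1122334455667788, then `low_bits(rax, 32)` gives 0x55667788 and `low_bits(rax, 16)` gives 0x7788.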
4. Witnessing a buffer overflow
4.1 Test code
Let's take a look at the buffer overflow with a small program below.
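#include <stdio.h>
#include <string.h>
void foo(void) { printf("Exploit\n"); }   /* never called in normal execution */
int main(int argc, char *argv[])
{
    char buf[10];                         /* 10-byte buffer on the stack */
    strcpy(buf, argv[1]);                 /* no bounds check: the overflow happens here */
    printf("buf: %s\n", buf);
    return 0;
}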
Comparing with the memory layout diagram above: the argv string array holds the command-line arguments, and the environment variables (environ) are passed in by default, so they do not need to be written out explicitly.
Next, we compile the code into a binary executable with the following command.
gcc -g -o stack1 stack1.c
The -g option makes gcc include debugging information, which makes debugging with GDB easier.
4.2 Run test
We defined the size of buf as 10. The following figure shows a test that copies 10, 20, 24, and 23 bytes into buf:
The test in the figure above was run on a 64-bit CentOS system. When 24 bytes were passed to buf, the program crashed with a segmentation fault.
This raises three questions. buf is defined as 10 bytes, so why can more than 10 bytes be passed in without an error? Why does the program get a segmentation fault at 24 bytes? And does this have anything to do with buffer overflows?
To answer the third question first: the error in the figure above is a buffer overflow, and we have successfully created one.
4.3 Estimating the buffer space
For the first question: data in memory follows a convention called memory alignment. Objects are placed at addresses that are multiples of their alignment (typically 4, 8, or 16 bytes), so the compiler often reserves more stack space than the declared size.
Why align?
It comes down to how efficiently the CPU accesses data. Aligned data is easier to access, in the same way that neatly arranged things are easier to find.
So although buf is defined as 10 bytes, more than 10 bytes can be written without an immediate error, as long as the data stays within a certain tolerance.
Why is the tolerance 24, and not 30 or 20? Because 24 is the boundary here. 24 bytes can be tolerated, but a string ends with a null character '\0', so the 24 a's in the example actually store 25 bytes. That crosses the 24-byte boundary, and the problem appears.
A picture is more convincing than text, so please look at the figure below:
The figure above is the disassembly of the main function; only a small part of it is shown.
The red box marks the setup of the current stack space and the saving of main's parameters.
mov %rsp,%rbp          set the current stack pointer address as the base
sub $0x20,%rsp         set the new stack pointer
These two instructions set up a new stack space for the function. Its size is 0x20 (32 bytes).
mov %edi,-0x14(%rbp)   parameter 1, stored 0x14 (20 bytes) below the stack base
mov %rsi,-0x20(%rbp)   parameter 2
Parameter 1 is main's argc, and parameter 2 is the argv string array (argv[0] is /root/stack/stack1, and argv[1] is the data we are going to store in buf). Let's verify that this is correct:
Above, we started the stack1 program and passed it a 20-character string printed by Perl as its argument. We can see that %rdi = 2, which is argc, and that %rsi is a double pointer, consistent with the definition *argv[]. Let's look at the data this pointer refers to.
The figure above shows exactly what we expected. This is also a good moment to review double pointers: with definitions such as **ptr or an array of pointers, the data is reached through two levels of indirection. Similarly, with a definition such as ***ptr, you have to go through three levels of indirection to reach the final data.
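The two levels of indirection behind argv can be written out explicitly. This is a small illustrative sketch; the helper name `first_char_of_arg` is hypothetical and is not part of stack1.c.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the first character of argument `index`, reached through two
 * levels of indirection from a char ** such as argv. */
char first_char_of_arg(char **argv, int index)
{
    assert(argv != NULL && argv[index] != NULL);
    char *arg = *(argv + index);   /* first dereference:  char ** -> char * */
    return *arg;                   /* second dereference: char *  -> char   */
}
```

With an argv like the one in the GDB session, `first_char_of_arg(argv, 0)` would yield the first character of the program path and `first_char_of_arg(argv, 1)` the first character of the string passed on the command line.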
Back to the main topic: why can buf hold only 24 bytes?
1. In the current stack space, the address of buf is only 0x10 (16 bytes) below the stack base RBP, which means buf can hold at least 16 bytes. This is the result of aligning to 8 bytes: buf is defined as 10 bytes, and to keep the alignment, 16 bytes are allocated.
2. Next, look at the contents above RBP.
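You can see eight more bytes at the address RBP points to, directly above the 16 bytes of buf. With the usual function prologue (push %rbp followed by mov %rsp,%rbp), these eight bytes hold the caller's saved RBP rather than alignment padding.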
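The arithmetic in the two steps above can be sketched in C. This is only a model of the layout read off the disassembly in this article (buffer padded to a multiple of 8, followed by the 8 bytes stored at RBP). Real compilers are free to lay out a frame differently, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round `size` up to the next multiple of `alignment` (a power of two). */
size_t align_up(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Bytes from the start of buf to the saved return address, ASSUMING the
 * layout observed here: padded buffer, then the 8 bytes stored at RBP. */
size_t offset_to_return_address(size_t buf_size)
{
    const size_t bytes_at_rbp = 8;
    return align_up(buf_size, 8) + bytes_at_rbp;
}
```

Under these assumptions, `align_up(10, 8)` is 16 and `offset_to_return_address(10)` is 24, which matches the tolerance observed in the test.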
Together that is exactly 24 bytes, and the data at 0x7fffffffe328 is the return address of the main function.
4.4 The return address of main
Let's look at the figure below, which shows the stack data while the main function is executing.
The red box above marks the return address of main saved on the stack. When main finishes, the CPU jumps to this address and continues executing from there.
So where is this address (0x7ffff7a3ab15) saved?
According to the program memory layout, we can be sure that it is stored in the stack segment.
Now let's take a look at the current stack space:
The red box marks the extent of the current stack space. We then enter a command to see what data this space contains.
Looking at the data in the red box, 0x7ffff7a3ab15 is not there. In other words, the return address of main is not saved in the current stack space. Was our assertion too hasty?
Not at all. Let's look again:
It turns out that 0x7ffff7a3ab15 was hiding at a higher address, which is still within the stack segment, so the assertion above is not wrong.
BP and SP delimit the current stack space (the stack frame). Over its lifetime, the program uses different stack frames within the agreed stack segment.
4.5 Buffer overflow
From the steps above, it is easy to see that our ultimate goal is to overflow buf so that the data overwrites the return address of main.
As analyzed above, only by writing more than 24 bytes to buf can we reach the space where the return address of main is saved, and the test confirms this.
Why does the program fail with segmentation fault (core dumped) when more than 24 bytes are written?
Because the data that overwrote the return address of main is not a valid return address.
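The effect of the extra byte can be reproduced safely with a model of the frame. The sketch below is not the real stack: it copies the input into a local 32-byte array laid out as 16 bytes of buffer space, 8 bytes assumed to hold the saved RBP, and 8 bytes for the return address, then reports what the return address has become. The layout constants and the function name are assumptions for illustration, and the resulting values depend on a little-endian machine such as x86-64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { BUF_SPACE = 16, SAVED_RBP = 8, RET_ADDR = 8,
       FRAME_SIZE = BUF_SPACE + SAVED_RBP + RET_ADDR };

/* Copies `input` plus its terminating NUL into a model frame laid out as
 * [buf: 16][saved RBP: 8][return address: 8] and returns the resulting
 * return address. Nothing outside the local array is written. */
uint64_t return_address_after_copy(const char *input, uint64_t ret_addr)
{
    unsigned char frame[FRAME_SIZE] = {0};
    size_t n = strlen(input) + 1;               /* strcpy also copies '\0' */
    assert(n <= FRAME_SIZE);                    /* stay inside the model   */

    memcpy(frame + BUF_SPACE + SAVED_RBP, &ret_addr, sizeof ret_addr);
    memcpy(frame, input, n);                    /* the "overflowing" copy  */

    uint64_t result;
    memcpy(&result, frame + BUF_SPACE + SAVED_RBP, sizeof result);
    return result;
}
```

Starting from the return address 0x7ffff7a3ab15 seen earlier, a 23-character input leaves the value intact in this model, while a 24-character input lets the terminating NUL zero its lowest byte, giving 0x7ffff7a3ab00. That address no longer points where main was supposed to return, which is consistent with the crash at 24 bytes.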
To achieve the real purpose of the overflow (running the shell, getting root privileges), we need to construct the overflow data carefully.
First of all, we must learn to build shellcode. So how is shellcode constructed? Please see the next part.