For the past two days I have been parsing 3D model data. Today, in my spare time, I wrote some test code to analyze the principles behind the buffer overflow attacks and shellcode that hackers use. OK, on to the topic. I hope you will correct anything I got wrong.
Let's start with a small piece of test code:
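#include <iostream>
using namespace std;

void test(void)
{
    cout << "success!" << endl;
}

int main(void)
{
    int a[1];
    // Out-of-bounds write (undefined behavior). On the 32-bit x86 build
    // discussed here it lands on main's saved return address.
    a[3] = (int)test;
    return 0;
}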
The code above briefly illustrates the principle of buffer overflow. It defines an integer array a with a single element, and the assignment to a[3] writes past the end of the array. The result is that "success!" is printed.
This raises a question: why is the code in test executed when the program never calls test?
This is the result of the buffer overflow. We can explain it from the way function calls work. When a function is called, a stack frame is set up for it, and the saved EBP and EIP are stored on the stack. The order is:
High address -> low address
[EIP] [EBP] [a[0]]
At the assembly level, a function call is a call [address] instruction, and call does two things. First it pushes EIP onto the stack, where it is saved as the function's return address; the EIP value at that moment is the address of the instruction that follows the call. Then it jumps to the target address. When the function ends, the ret instruction pops the saved address and jumps to it, that is, back into the calling function, which completes the call.
With that in mind, the write to a[3] overwrites the slot where the return address is stored with the address of test. When main executes its ret, it jumps into the code of test, so "success!" is printed. Of course, the program then crashes: when test executes its own ret, the stack is unbalanced and the value popped into EIP is unknown, so execution jumps to an unpredictable place.
The example above shows that a buffer overflow can be used to change the function return address stored on the stack, which changes the flow of the whole program and sends it wherever we want it to go. This gives hackers an opportunity.
The most common method is to embed a piece of code in a long string. The string overflows the buffer and overwrites the function's return address with the address of the embedded code, so when the function returns, the program starts executing that code. In general, the code runs a shell program (such as /bin/sh): if the program we break into has a buffer overflow defect and the SUID-root attribute, we get a shell with root permissions, in which we can do anything. This is why such code is generally called shellcode.
The following example illustrates the principle of shellcode:
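#include <windows.h>   // BYTE, DWORD
#include <cstdlib>     // system
#include <iostream>
using namespace std;

// Assumes a 32-bit x86 MSVC build in which code is reached through a
// jmp thunk, and in which the stack is executable (no DEP).
int code(int a, int b)
{
    return a + b;
}

void testshell(void)
{
    int result = 0;
    BYTE funcbyte[512];
    BYTE* jmpaddr = (BYTE*)code;                             // the jmp thunk
    DWORD ofsfuncaddr = *(DWORD*)(jmpaddr + 1) + 5;          // rel32 + length of the jmp
    BYTE* funcaddr = (BYTE*)((DWORD)jmpaddr + ofsfuncaddr);  // real start of code
    BYTE* pfuncbuff = funcaddr;
    BYTE* pinput = funcbyte;
    // Copy bytes up to and including the first 0xC3 (ret).
    while (true)
    {
        if ((*pinput++ = *pfuncbuff++) == 0xC3)
            break;
    }
    __asm
    {
        lea eax, funcbyte
        push 100
        push 200
        mov ecx, 1
        call _label        ; pushes the address of _label as the return address
    _label:
        cmp ecx, 0
        je _ret
        sub ecx, 1
        jmp eax            ; run the copied bytecode
    _ret:
        mov result, eax
        add esp, 8         ; remove the two arguments
    }
    cout << result << endl;
    system("pause");
}

int main(void)
{
    testshell();
    return 0;
}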
The funcbyte array above holds the bytecode of the code function. jmpaddr points to the jmp instruction that leads to code: a call [target function] first lands on a jmp [function address] instruction, and only that jmp goes to the first address of the target function. ofsfuncaddr holds the offset stored in the last 4 bytes of that jmp instruction plus the 5 bytes of the instruction itself (this is an unconditional jump, so whatever the distance, its operand can be treated as a 4-byte offset).
Instruction address   Bytecode         Instruction     Target function address
0041954B              E9 10 1D 00 00   jmp testshell   (41B260h)
In the jmp instruction above, E9 is the opcode of jmp, and the four bytes after it hold: the address of the target function - the address of the jmp instruction - the 5 bytes of the jmp instruction. That is, 0x41B260 - 0x41954B - 5 = 0x1D10, stored in little-endian order as 10 1D 00 00. funcaddr holds the first address of the target function. The while loop that follows copies the bytecode of the code function into funcbyte. A function ends by executing ret, whose opcode is 0xC3, so we use that byte to terminate the loop and stop copying.
The assembly code that follows executes the copied bytecode, keeps the stack balanced, and makes sure execution ends up at the mov result, eax statement after _ret. But how do we get back to the place we want after the copied bytecode has run? Here the call instruction does the work. call _label pushes the address of the next assembly statement onto the stack as a return address. Because funcbyte holds the bytecode of the code function, executing the bytes in funcbyte has the same effect as calling code, and we start them directly with jmp eax. When the copied code reaches 0xC3 (ret), it jumps back to the cmp ecx, 0 statement. To keep this from looping, I use ECX as a counter: it starts at 1, so on the first pass sub ecx, 1 brings it to zero and jmp eax runs the copied code; when ret brings execution back to cmp ecx, 0, ECX is zero and je _ret is taken.
The call _label sequence can also be replaced by pushing the address of _ret, which likewise puts a return address on the stack; after the copied bytecode has run, it returns straight to _ret. I used call only to illustrate how the call instruction works. After that, the return value is assigned to result, and add esp, 8 removes the two parameters, 100 and 200, to keep the stack balanced. The value of result is 300. That is a prototype of shellcode.
Okay, that is basically it! The code function here is just a single simple statement. If it does anything more complicated, the bytecode in funcbyte needs further processing. For example, if code contains a function call, relative jumps are involved. Such a jump uses an offset from the address of the current instruction. funcbyte is a temporary byte array, so the instructions being executed are located in that temporary address space. The bytecode stays the same, but the address of the jump instruction changes, so the same offset naturally no longer reaches the correct target function address. My initial idea is to copy the bytecode and then recompute the operand of every instruction that uses such an offset, so that the jumps made from the temporary address space land correctly. I will leave it at that for now. Any advice is welcome!
If the code function contains a function call:
int code(int a, int b)
{
    cout << a + b << endl;
    return a + b;
}
The following compares the copied bytecode with the bytecode of the code function itself.
Copied bytecode in funcbyte:
0013FBF8  55                 push ebp
0013FBF9  8B EC              mov ebp, esp
0013FBFB  81 EC C0 00 00 00  sub esp, 0C0h
0013FC01  53                 push ebx
0013FC02  56                 push esi
0013FC03  57                 push edi
0013FC04  8D BD 40 FF FF FF  lea edi, [ebp-0C0h]
0013FC0A  B9 30 00 00 00     mov ecx, 30h
0013FC0F  B8 CC CC CC CC     mov eax, 0CCCCCCCCh
0013FC14  F3 AB              rep stos dword ptr [edi]
0013FC16  68 D8 94 41 00     push 4194D8h
0013FC1B  8B 45 08           mov eax, dword ptr [ebp+8]
0013FC1E  03 45 0C           add eax, dword ptr [ebp+0Ch]
0013FC21  50                 push eax
0013FC22  B9 88 86 45 00     mov ecx, 458688h
0013FC27  E8 2B E2 FF FF     call 0013DE57
0013FC2C  8B C8              mov ecx, eax
0013FC2E  E8 10 E7 FF FF     call 0013E343
0013FC33  8B 45 08           mov eax, dword ptr [ebp+8]
0013FC36  03 45 0C           add eax, dword ptr [ebp+0Ch]
0013FC39  5F                 pop edi
0013FC3A  5E                 pop esi
0013FC3B  5B                 pop ebx
0013FC3C  81 C4 C0 00 00 00  add esp, 0C0h
0013FC42  3B EC              cmp ebp, esp
0013FC44  E8 D0 E8 FF FF     call 0013E519
0013FC49  8B E5              mov esp, ebp
0013FC4B  5D                 pop ebp
0013FC4C  C3                 ret
Bytecode of the code function itself:
0041B770  55                 push ebp
0041B771  8B EC              mov ebp, esp
0041B773  81 EC C0 00 00 00  sub esp, 0C0h
0041B779  53                 push ebx
0041B77A  56                 push esi
0041B77B  57                 push edi
0041B77C  8D BD 40 FF FF FF  lea edi, [ebp-0C0h]
0041B782  B9 30 00 00 00     mov ecx, 30h
0041B787  B8 CC CC CC CC     mov eax, 0CCCCCCCCh
0041B78C  F3 AB              rep stos dword ptr [edi]
0041B78E  68 D8 94 41 00     push offset std::endl (4194D8h)
0041B793  8B 45 08           mov eax, dword ptr [a]
0041B796  03 45 0C           add eax, dword ptr [b]
0041B799  50                 push eax
0041B79A  B9 88 86 45 00     mov ecx, offset std::cout (458688h)
0041B79F  E8 56 DE FF FF     call operator<< (4195FAh)
0041B7A4  8B C8              mov ecx, eax
0041B7A6  E8 40 E3 FF FF     call operator<< (419AEBh)
0041B7AB  8B 45 08           mov eax, dword ptr [a]
0041B7AE  03 45 0C           add eax, dword ptr [b]
0041B7B1  5F                 pop edi
0041B7B2  5E                 pop esi
0041B7B3  5B                 pop ebx
0041B7B4  81 C4 C0 00 00 00  add esp, 0C0h
0041B7BA  3B EC              cmp ebp, esp
0041B7BC  E8 00 E5 FF FF     call _RTC_CheckEsp (419CC1h)
0041B7C1  8B E5              mov esp, ebp
0041B7C3  5D                 pop ebp
0041B7C4  C3                 ret
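Look at the three call instructions. The copy loop duplicates the bytes as they are, so each call keeps the relative offset it had in the original function. The target of a call, however, is computed from the address of the call instruction itself, so the calculated call addresses are different. The calls in the copy land in the 0x0013... address space, while the correct targets are in the 0x0041... space.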