Call compiled functions (1): Call compiled Functions

Source: Internet
Author: User

Writing an article like this is both worth doing and a chore.

Still, if you want something that jogs your own memory and that others can use as a reference, you have to set aside the time to write it.

I touched on this topic briefly in my earlier blog posts about cracking. Now let's take a closer look.


In the narrow sense, compilation means translating source code into machine code that the CPU executes directly, as C++ (VC++) does.

The main program of a VB6 application is compiled as well, but for the most part it works like Java: the compiler emits intermediate code, which a virtual machine turns into machine code at run time.

That is close to how a script works, except that intermediate code is binary and hard to read, while a script is plain text.

.NET languages (VB, C#, and so on) compile purely to intermediate code (Microsoft Intermediate Language). That is why programs written in them are easy to "decompile", and the result can be rendered in any of those languages.

Generating intermediate code also counts as compilation in the broad sense.


Today's topic is compilation in the narrow sense, using VC6 to illustrate how function calls work. As usual, it is the details I care about most.

Common function calling conventions in VC include the following:

1. __stdcall
2. __cdecl (default)
3. __fastcall
4. thiscall (implicit)
5. naked (bare function)

Strictly speaking, naked is not a calling convention but a function modifier aimed at the compiler: the compiler leaves out the prolog and epilog, so the programmer controls the function's stack frame directly.

Once compiled, a naked function can behave like any of the conventions above except thiscall. Let's write a small demo to see how these functions are called.

// call.h
#ifndef __CALL_H_
#define __CALL_H_

#if _MSC_VER > 1000
#pragma once
#endif // _MSC_VER > 1000

//#ifdef __cplusplus
//extern "C" {
//#endif

class CCall
{
public:
    CCall();
    ~CCall();
    int Call(int arg1, short arg2, char arg3, void *arg4);
protected:
    int m_Var1;
};

//#ifdef __cplusplus
//}
//#endif

#endif

For several reasons, I still define the member functions outside the class:

// call.cpp
#include "call.h"

CCall::CCall()
{
    m_Var1 = 18;
}

CCall::~CCall()
{
}

int CCall::Call(int arg1, short arg2, char arg3, void *arg4)
{
    int var1;
    short var2;
    char var3;
    int *p;

    var1 = arg1;
    var2 = arg2;
    var3 = arg3;
    p = (int *)arg4;
    *p = m_Var1;
    return 0;
}

Then there are the entry point and the global functions:

// main.cpp
#include <windows.h>
#include "call.h"

int g_var1;

void fnVoid(int arg1, short arg2, char arg3)
{
    int var1;
    short var2;
    char var3;

    var1 = arg1;
    var2 = arg2;
    var3 = arg3;

    arg1 = -1;
    g_var1 = 111;
    return;
}

int fnDefaultCall(int arg1, short arg2, char arg3, void *arg4)
{
    int var1;
    short var2;
    char var3;
    int *p;

    var1 = arg1;
    var2 = arg2;
    var3 = arg3;
    p = (int *)arg4;
    *p = 7;
    return 0;
}

int __stdcall fnStandardCall(int arg1, short arg2, char arg3, void *arg4)
{
    int var1;
    short var2;
    char var3;
    int *p;

    var1 = arg1;
    var2 = arg2;
    var3 = arg3;
    p = (int *)arg4;
    *p = 11;
    return 0;
}

int __fastcall fnFastCall(int arg1, short arg2, char arg3, void *arg4)
{
    int var1;
    short var2;
    char var3;
    int *p;

    var1 = arg1;
    var2 = arg2;
    var3 = arg3;
    p = (int *)arg4;
    *p = 14;
    return 0;
}

__declspec(naked) int __cdecl fnNakedCall(int arg1, short arg2, char arg3, void *arg4)
{
    // 1. On entry, every register still holds the value it had before the call.
    // 2. No frame has been set up yet, so referencing a local variable by name
    //    would actually reference a variable or parameter of the calling function.
    // 3. Registers must be saved and restored by hand. Here the function follows __cdecl.
    __asm
    {
        push ebp                        ; prolog begin
        mov  ebp, esp
        sub  esp, 50h
        push ebx
        push esi
        push edi
        lea  edi, [ebp-50h]
        mov  ecx, 14h
        mov  eax, 0CCCCCCCCh
        rep stos dword ptr [edi]        ; prolog end

        // var1 = arg1;
        mov  eax, dword ptr [ebp+8]     ; [esp+8]
        mov  dword ptr [ebp-4], eax     ; [esp-4]
        // var2 = arg2;
        mov  cx, word ptr [ebp+0Ch]
        mov  word ptr [ebp-8], cx
        // var3 = arg3;
        mov  dl, byte ptr [ebp+10h]
        mov  byte ptr [ebp-0Ch], dl
        // p = (int *)arg4;
        mov  eax, dword ptr [ebp+14h]
        mov  dword ptr [ebp-10h], eax
        // *p = -1;
        mov  ecx, dword ptr [ebp-10h]
        mov  dword ptr [ecx], 0FFFFFFFFh
        // return 22;
        mov  eax, 16h                   ; 0x16 = 22

        pop  edi                        ; epilog begin
        pop  esi
        pop  ebx
        mov  esp, ebp
        pop  ebp                        ; epilog end

        // return to the caller (do not use ret 10h)
        ret
    }
}

int main(int argc, char **argv)
{
    CCall *pCall;
    int var1;
    int ret;

    fnVoid(1, 2, 3);
    ret = fnDefaultCall(4, 5, 6, &var1);
    ret = fnStandardCall(8, 9, 10, &var1);
    ret = fnFastCall(11, 12, 13, &var1);

    pCall = new CCall();
    ret = pCall->Call(15, 16, 17, &var1);
    delete pCall;
    // pCall = NULL;

    ret = fnNakedCall(19, 20, 21, &var1);

    return 0;
}

Next we step through the calls in a DEBUG build. Note that with VS.NET the compiler adds a DWORD before and after each variable to detect buffer overflows.

First, the call to the void function, which has no return value. It uses the default convention, __cdecl:

120:      fnVoid(1, 2, 3);
0040135D   push        3
0040135F   push        2
00401361   push        1
00401363   call        @ILT+5(fnVoid) (0040100a)
00401368   add         esp,0Ch

The arguments are pushed onto the stack from right to left, then call transfers control to the function's address, and finally add esp cleans up the stack.

Note:

The stack grows from high addresses toward low addresses. For example, if esp (the stack-top pointer) is 0x0012FF04 before the first push, then esp = 0x0012FF00 after push 3.

Likewise, after push 2, esp = 0x0012FEFC; after push 1, esp = 0x0012FEF8.

Next comes the call instruction. It pushes the return address, that is, the address of the next instruction (eip is the instruction pointer), onto the stack:

Before the call: eip = 0x00401363 (next eip = 0x00401368)

After the call: eip = 0x0040100A, esp = 0x0012FEF4

Then the function runs to its end. Under __cdecl the function's final ret instruction only pops the top of the stack into eip and leaves the arguments where they are:

eip = 0x00401368, esp = 0x0012FEF8

Then comes add esp, 0Ch. Here 0xC = 12 bytes, that is, the three DWORDs pushed earlier. (A pop would have to pop into a register; add adjusts the stack-top pointer directly and shrinks the stack in one step.)

At this point esp is back to its value before the arguments were pushed, and execution continues with the instruction after the call.
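
The esp and eip values in this walkthrough can be replayed with a tiny model. This is only a sketch, assuming a C++11 compiler: MiniCpu and replay_fnVoid_call are names made up for illustration, and the model tracks nothing but esp, eip and the pushed DWORDs, seeded with the addresses from the listing.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a 32-bit x86 stack: only ESP, EIP and the pushed DWORDs.
struct MiniCpu {
    std::uint32_t esp;
    std::uint32_t eip;
    std::vector<std::uint32_t> stack;   // back() is the top of the stack

    // push: the stack grows downward, so ESP drops by 4.
    void push(std::uint32_t value) { esp -= 4; stack.push_back(value); }

    // pop: returns the top DWORD and raises ESP by 4.
    std::uint32_t pop() {
        assert(!stack.empty());
        std::uint32_t value = stack.back();
        stack.pop_back();
        esp += 4;
        return value;
    }

    // call: push the address of the next instruction, then jump to the target.
    void call(std::uint32_t target, std::uint32_t next) { push(next); eip = target; }

    // ret: pop the return address into EIP.
    void ret() { eip = pop(); }

    // add esp, n: the caller discards n bytes of arguments (__cdecl cleanup).
    void add_esp(std::uint32_t bytes) {
        assert(bytes % 4 == 0);
        assert(bytes / 4 <= stack.size());
        esp += bytes;
        stack.resize(stack.size() - bytes / 4);
    }
};

// Replays the fnVoid(1, 2, 3) call site and returns the final state.
inline MiniCpu replay_fnVoid_call() {
    MiniCpu cpu{0x0012FF04u, 0x0040135Du, {}};
    cpu.push(3);
    cpu.push(2);
    cpu.push(1);
    cpu.call(0x0040100Au, 0x00401368u);  // ESP is now 0x0012FEF4
    cpu.ret();                           // back at 0x00401368, ESP = 0x0012FEF8
    cpu.add_esp(0x0C);                   // caller drops the three arguments
    return cpu;
}
```

The function body itself is not modeled; replay_fnVoid_call() ends with esp back at 0x0012FF04 and an empty model stack, which is the balance the caller's add esp is there to restore.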


Next, let's go into the function to see what the function has done:

7:    void fnVoid(int arg1, short arg2, char arg3)
8:    {
00401140   push        ebp
00401141   mov         ebp,esp
00401143   sub         esp,4Ch
00401146   push        ebx
00401147   push        esi
00401148   push        edi
00401149   lea         edi,[ebp-4Ch]
0040114C   mov         ecx,13h
00401151   mov         eax,0CCCCCCCCh
00401156   rep stos    dword ptr [edi]
9:        int var1;
10:       short var2;
11:       char var3;
12:       var1 = arg1;
00401158   mov         eax,dword ptr [ebp+8]
0040115B   mov         dword ptr [ebp-4],eax
13:       var2 = arg2;
0040115E   mov         cx,word ptr [ebp+0Ch]
00401162   mov         word ptr [ebp-8],cx
14:       var3 = arg3;
00401166   mov         dl,byte ptr [ebp+10h]
00401169   mov         byte ptr [ebp-0Ch],dl
15:
16:       arg1 = -1;
0040116C   mov         dword ptr [ebp+8],0FFFFFFFFh
17:       g_var1 = 111;
00401173   mov         dword ptr [g_var1 (0042ae74)],6Fh
18:       return;
19:   }
0040117D   pop         edi
0040117E   pop         esi
0040117F   pop         ebx
00401180   mov         esp,ebp
00401182   pop         ebp
00401183   ret
--- No source file  --------------------------------------------------------------
00401184   int         3

First, ebp is the stack-bottom pointer, which holds a high address (higher than esp). The function's own stack lies between esp and ebp, and apart from its own arguments the function should not read or write stack memory at or above ebp.

Note that I said "should not", not "cannot". This is exactly what the buffer overflow attacks used by hackers exploit: when your program accidentally writes data to these places, they can execute arbitrary code,

including adding an administrator account. The usual culprit is a function such as strcpy writing into a buffer like char szText[256] when the source string is longer than 256 bytes.
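
One common way to avoid that overflow is to pass the destination size along with the buffer. A minimal sketch, assuming a C++11 compiler; copy_bounded is a hypothetical helper written for this illustration, not a standard or Windows API function:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dst without ever writing more than
// dst_size bytes, and always terminates the result.
// Returns true if the whole string fit, false if it had to be truncated.
inline bool copy_bounded(char *dst, std::size_t dst_size, const char *src) {
    assert(dst != nullptr);
    assert(src != nullptr);
    assert(dst_size > 0);
    std::size_t len = std::strlen(src);
    bool fits = len < dst_size;                  // leave room for the '\0'
    std::size_t n = fits ? len : dst_size - 1;   // bytes actually copied
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return fits;
}
```

With char szText[256], a call such as copy_bounded(szText, sizeof szText, source) can never write past the buffer, so the saved ebp and the return address above it stay intact. The price is that an over-long source is silently truncated unless the caller checks the return value.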

push ebp saves the stack-bottom value as it was before the call, and then

mov ebp, esp copies the stack top into the stack bottom, so the stack top at entry becomes the current function's stack bottom, and then

sub esp, 4Ch lowers the stack top by 0x4C = 76 bytes (19 DWORDs), which creates a 76-byte stack for the current function to use.

Next,

push ebx saves the base register on the stack. The compiler is very mechanical here: so far the function has not used ebx at all, so there is nothing to preserve, but the compiler is not a human and does not care.

Then push esi and push edi save the source and destination index registers used by string instructions, as anyone who knows assembly will recognize. Now this fellow starts a bulk operation:

lea edi, [ebp-4Ch]: ebp-4Ch is where esp pointed right after the sub, that is, the low end of the frame. It becomes the destination address.

mov ecx, 13h: the count, 0x13 = 19. Remember the 19 DWORDs from a moment ago?

mov eax, 0CCCCCCCCh: the value to store, 0xCCCCCCCC.

rep stos dword ptr [edi]: writes the value in eax (0xCCCCCCCC) to the DWORD that edi points at, advances edi by one DWORD, decrements ecx, and repeats until ecx reaches zero.

Now we know why an uninitialized local variable in VC always reads as 0xCC: local variables live on the stack, and the whole frame has just been filled with that value.
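
The fill loop is easy to mimic in C++. The sketch below assumes a C++11 compiler; rep_stosd and byte_at are hypothetical names, and an ordinary array stands in for the 0x4C-byte frame:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mimics "rep stos dword ptr [edi]" with edi = dst, ecx = count, eax = value:
// store one DWORD, advance the destination, decrement the count, repeat.
inline void rep_stosd(std::uint32_t *dst, std::size_t count, std::uint32_t value) {
    assert(dst != nullptr || count == 0);
    while (count != 0) {
        *dst = value;
        ++dst;
        --count;
    }
}

// Reads a single byte of the filled frame, the way a char local would see it.
inline unsigned char byte_at(const std::uint32_t *frame, std::size_t offset) {
    return reinterpret_cast<const unsigned char *>(frame)[offset];
}
```

Filling an array of 0x13 DWORDs with rep_stosd(frame, 0x13, 0xCCCCCCCC) covers exactly 0x4C bytes, and every byte read back through byte_at is 0xCC, which is what an uninitialized char local shows in the debugger.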

The 0xCC fill has another use as well. We will get to it when the function returns.

Now the main show officially begins:


// var1 = arg1;

mov eax, dword ptr [ebp+8]

mov dword ptr [ebp-4], eax

ebp is the new stack bottom, that is, the stack top at the moment of entry. As mentioned at the call site, call pushes the return address (the address of the instruction after the call, not of the call instruction itself).

So does ebp point at the return address? Wrong! Remember that the initial push ebp pushed one more DWORD, so ebp points at the saved original ebp.

Because the stack grows toward lower addresses, ebp+4 is the function's return address, and counting back in reverse push order, ebp+8 is the last argument pushed, which is the first parameter!

For the same reason, ebp-4 is the first local variable. Someone will ask: why mov into eax first and then from eax into the local variable? "Isn't that one step too many?" said Di Chun.

Yuan Fang said: the two operands of a mov instruction cannot both be memory; one of them has to be a register or an immediate value. That is what registers are for.


With that understood, the next two assignments are easy to follow: they work the same way, only they move a 16-bit word (cx) and an 8-bit byte (dl).
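
The offsets seen so far follow a simple pattern. The sketch below encodes it under one loud assumption taken from this listing only: in this unoptimized VC6 debug build every argument and every local occupies its own DWORD slot. arg_offset and local_offset are hypothetical helpers, not anything the compiler provides.

```cpp
#include <cassert>

// Offset from EBP of the zero-based argument `index`.
// [ebp] holds the saved EBP and [ebp+4] the return address,
// so the first argument sits at [ebp+8].
inline int arg_offset(int index) {
    assert(index >= 0);
    return 8 + 4 * index;
}

// Offset from EBP of the zero-based local `index`, assuming one DWORD slot
// per local as in the debug listing (var1 at -4, var2 at -8, var3 at -0Ch).
inline int local_offset(int index) {
    assert(index >= 0);
    return -4 * (index + 1);
}
```

For fnVoid this gives arg1, arg2, arg3 at ebp+8, ebp+0Ch, ebp+10h and var1, var2, var3 at ebp-4, ebp-8, ebp-0Ch, matching the disassembly. An optimized build is free to lay things out differently.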

Next, the parameter value is modified. This is easy to understand too: after the call returns, the caller does add esp and simply throws the argument slots away, so the write changes nothing except stack memory that is about to be discarded.
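
The same thing can be observed from the C++ side. In this small sketch (overwrite_argument is a hypothetical stand-in for fnVoid), the callee overwrites its by-value parameter and the caller's variable is untouched:

```cpp
#include <cassert>

// Hypothetical stand-in for fnVoid: overwrites its own copy of the argument,
// like "mov dword ptr [ebp+8],0FFFFFFFFh", and reports what it now holds.
inline int overwrite_argument(int arg1) {
    arg1 = -1;      // only the callee's stack slot changes
    return arg1;
}
```

If the caller has int x = 1 and calls overwrite_argument(x), the function returns -1 while x is still 1 afterwards, because only a copy of x was pushed.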

The next statement assigns a global variable: the immediate value is written straight to the global variable's memory address.

The function has no return value, and a return at the very end of the function generates no code. If the return were earlier in the body, it would generate a jmp instruction that jumps to this point.


Finally, the cleanup: the registers pushed last are popped first to restore their previous values, mov esp, ebp restores the original stack top, and pop ebp restores the original stack bottom.

The final ret instruction was already covered at the call site. The question here is: what happens if the return address on the stack (the DWORD that the restored esp points at) has been modified?

For example, suppose it points at the ShellExecute API, and the parameter is cmd /c net user admin1 123456 /add.

I leave that for everyone to think about. Do you remember that I promised another use of 0xCC? If execution ever runs on without reaching a ret, it hits machine code 0xCC, which is the int 3 interrupt.

Debuggers such as OllyDbg insert 0xCC at a breakpoint. When execution resumes, the debugger restores the original value of that byte and then continues.
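
The patch-and-restore idea can be sketched on a plain byte buffer. This is an illustration only, assuming a C++11 compiler: Breakpoint, set_breakpoint and clear_breakpoint are hypothetical names, and the code edits an in-memory copy of some instruction bytes rather than a live process, which a real debugger would reach through operating system APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Remembers where a breakpoint was placed and which byte it replaced.
struct Breakpoint {
    std::size_t offset;
    std::uint8_t original;
};

// Overwrites one byte of `code` with 0xCC (int 3) and returns the saved byte.
inline Breakpoint set_breakpoint(std::vector<std::uint8_t> &code, std::size_t offset) {
    assert(offset < code.size());
    Breakpoint bp{offset, code[offset]};
    code[offset] = 0xCC;
    return bp;
}

// Puts the original byte back so the real instruction can execute again.
inline void clear_breakpoint(std::vector<std::uint8_t> &code, const Breakpoint &bp) {
    assert(bp.offset < code.size());
    code[bp.offset] = bp.original;
}
```

For the bytes 55 8B EC C3 (push ebp; mov ebp,esp; ret), setting a breakpoint at offset 0 turns the first byte into CC and remembers 55; clearing it restores the original instruction.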

So an accidental buffer overflow usually just ends in an access violation or a breakpoint interrupt. Some people, however, are very alert to exactly these moments, much as they would be when a pretty girl's skirt is caught by the wind.

Sin, sin!


This article is getting long. I'll stop at int 3 for now and continue later.


Can some programming expert explain what "interpreted language" and "compiled language" mean?

An interpreter reads one statement, parses it, executes it, then reads the next statement and executes that.
A compiled language has its entire source translated into lower-level code before it is executed.

If you know programming languages well, you will find that the boundary between the two is blurry, so honestly it is hard to state precisely. The distinction is not sharply definable; it shows up more in actual use.
For example, C is a compiled language: the C compiler translates C code into machine instructions, which are then run.
JavaScript, as an interpreted language, is in the classic model executed by the browser statement by statement, with no intermediate step in which the browser compiles the JavaScript code into lower-level code.

Compiled languages usually perform many static checks, for example ensuring that every variable you use is defined. Interpreted languages are flexible and let you write freely, but errors tend to surface only at run time and are harder to find.
 
How does the compiler translate functions and function calls, seen from the assembly language level?

A lot is involved in this question. Put simply, the function call process is as follows:
1. Push the arguments; the order depends on the calling convention.
2. call the function, which pushes the return address onto the stack.
3. The function body runs.
4. When the function finishes, the ret instruction jumps back to the return address pushed in step 2, and execution continues from there.
