Call Stack
The stack is covered in detail in any data structures course.
Here are the key points:
1. Last in, first out (LIFO).
2. Data is always stored to or retrieved from the top of the stack.
On x86 processors, the PUSH instruction places an item on top of the stack and decreases the stack-top pointer by four bytes. The stack-top pointer is kept in the ESP register; the register name is an abbreviation of "stack pointer".
Push
When a value is pushed onto the stack, the following happens in order:
1. The stack-top pointer ESP is decreased by 4 bytes.
2. The data to be pushed is copied to the address that ESP now points to.
As we can see, the stack grows toward lower addresses: the more content the stack holds, the smaller the stack-top pointer.
Pop
A pop performs the following steps:
1. The data that the stack-top pointer (ESP) points to is read from the stack.
2. Four bytes are added to the stack-top pointer.
In this way, ESP always points to the next available item on the stack.
As an example, we push the data items A, B, and C onto the stack in that order.
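The push and pop steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: the array sim_memory, the variable sim_esp, and the functions sim_push, sim_pop, and demo_push_abc are hypothetical names standing in for real stack memory and the ESP register, and the sketch does no overflow or underflow checking.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simulation, not real CPU state: a byte array stands in for
   the stack memory and sim_esp for the ESP register. */
#define SIM_STACK_SIZE 64u

static uint8_t  sim_memory[SIM_STACK_SIZE];
static uint32_t sim_esp = SIM_STACK_SIZE;  /* empty stack: ESP at the highest address */

/* Push: first lower ESP by 4, then copy the value to where ESP points. */
static void sim_push(uint32_t value)
{
    sim_esp -= 4;
    memcpy(&sim_memory[sim_esp], &value, sizeof value);
}

/* Pop: first read the value ESP points to, then raise ESP by 4. */
static uint32_t sim_pop(void)
{
    uint32_t value;
    memcpy(&value, &sim_memory[sim_esp], sizeof value);
    sim_esp += 4;
    return value;
}

/* Push A, B, C in that order; they come back in reverse order. */
static void demo_push_abc(void)
{
    sim_push('A');
    sim_push('B');
    sim_push('C');
    assert(sim_esp == SIM_STACK_SIZE - 12);  /* three pushes lowered ESP by 12 */
    assert(sim_pop() == 'C');
    assert(sim_pop() == 'B');
    assert(sim_pop() == 'A');
    assert(sim_esp == SIM_STACK_SIZE);       /* the stack is empty again */
}
```

Calling demo_push_abc() walks through the A, B, C example: ESP shrinks with every push and grows back with every pop.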
Thread Stack
In Win32, every time a thread is created, 1 MB of virtual memory is reserved by default as the stack space for that thread. The ESP register points to the top of the reserved memory (the end with the largest address), and with that the stack is initialized. To check the current value of the stack pointer, you can use the 'r' command in the debugger. The value of the ESP register is the stack pointer value.
Nested function calls
A program is rarely one very long function. Generally, different functions handle different tasks and are called from the main function as needed. The first five examples in the assembly language basics illustrate this well. So how does a program call a function and then return to continue executing? This article gives a detailed analysis with annotated code.
Calling conventions
If all of your code is written in the same language and built with the same compiler, you rarely need to care about calling conventions. Once you mix several languages and different compilers, the question of how each of them handles function calls arises. To make mixed-language programming workable, experts established rules that specify how functions are called and how parameters are passed. These rules are called calling conventions.
For example, a programmer writes a function named SQRT in Pascal. The function takes a floating-point parameter and returns its square root.
function SQRT(numargument: Real): Real;
C programmers need to know how the Pascal programmer expects the numargument parameter to be passed to the SQRT function. To handle such mixed-language programming, the Pascal calling convention was defined. Its descendant on Win32 is the stdcall convention we often see (stdcall keeps the rule that the called function cleans up the stack, but pushes parameters from right to left); the Pascal convention itself is obsolete and no longer used.
The prologue and epilogue code is added automatically by the compiler to save registers, set up the stack frame, and restore the stack state at the end of the call. Both pieces of code depend on the CPU architecture and the compiler. Understanding them is very important for understanding how to debug. The following example shows the prologue and epilogue generated automatically for a stdcall function.
Typical prologue of the called function
Instruction | Description
PUSH EBP | Save EBP (frame pointer)
MOV EBP, ESP | Set the frame base to the current top of the stack
SUB ESP, 8 | Lower the top of the stack by eight bytes to reserve space for local variables
PUSH EBX, PUSH ECX, PUSH EDX | Save the values of the registers
Typical epilogue of the called function
Instruction | Description
POP EDX, POP ECX, POP EBX | Restore the register values
MOV ESP, EBP | Reset the stack-top pointer to the frame base, which releases the space occupied by local variables
POP EBP | Restore EBP (frame pointer)
RET 0x8 | Return, then raise the top of the stack by 8 bytes to release the function parameters on the stack (see the description of the RET instruction below)
Some functions need to preserve certain register values across a call to another function so that the values can still be used after the call returns. The stack is a good place to save registers temporarily, so the code performs some extra stack operations to save register values that will be needed later.
The listing above is fairly clear. Next, let's take a closer look at the RET instruction so that each of its forms is better understood.
RET: the return instruction. The description below uses the 16-bit registers; in 32-bit code a near return works the same way with EIP, ESP, and 4 bytes.
1. Return within a segment (near return). The word at the top of the stack is popped into IP, so SP increases by 2. If there is an immediate operand, it is then added to SP, which discards parameters that were pushed before the call.
2. Return between segments (far return). The word at the top of the stack is popped into IP (SP increases by 2), then the next word is popped into CS (SP increases by 2 again). If there is an immediate operand, it is then added to SP.
The first article of this series explained that Win32 uses flat addressing and does not rely on segment registers. For completeness, however, the legacy segment registers are listed below.
1. Code segment register CS: holds the base of the segment that contains the currently running code. Instructions are fetched from the segment this register designates, at the offset given by IP.
2. Data segment register DS: specifies the lowest address, that is, the base, of the data segment used by the current program.
3. Stack segment register SS: specifies the bottom address of the current stack, that is, the base of the stack segment.
4. Extra segment register ES: specifies the base of the extra data segment used by the current program; this is the segment that holds the destination string in string instructions.
The stdcall calling convention
This calling convention has the following features:
* Parameters are pushed from right to left.
* The called function is responsible for cleaning up the stack.
* The function name is prefixed with an underscore.
* An @ symbol is appended to the function name, followed by the size of the parameters in bytes.
* No case conversion is performed on the function name.
* Functions with a variable number of parameters cannot be handled.
Caller's responsibilities
Push the rightmost parameter onto the stack
Push the next parameter
...
Push the leftmost parameter
Call functionx
Responsibilities of the called function (functionx)
PUSH EBP
MOV EBP, ESP
SUB ESP, local_size
...
MOV ESP, EBP
POP EBP
RET <parameter size in bytes>
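The division of work under stdcall can be traced with a small simulation. This is a sketch under stated assumptions, not compiler output: mem, esp, ebp, push, pop, load, stdcall_add2, and call_stdcall_add2 are hypothetical names, the return address 0x1234 is made up, and each parameter is assumed to be 4 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simulation of the stack and of the ESP and EBP registers. */
static uint8_t  mem[256];
static uint32_t esp = sizeof mem, ebp = 0;

static void push(uint32_t v) { esp -= 4; memcpy(&mem[esp], &v, 4); }
static uint32_t pop(void) { uint32_t v; memcpy(&v, &mem[esp], 4); esp += 4; return v; }
static uint32_t load(uint32_t addr) { uint32_t v; memcpy(&v, &mem[addr], 4); return v; }

/* Called function: adds two parameters, simulated stdcall. */
static uint32_t stdcall_add2(void)
{
    push(ebp);                    /* PUSH EBP                    */
    ebp = esp;                    /* MOV EBP, ESP                */
    esp -= 8;                     /* SUB ESP, 8   (local_size)   */

    uint32_t a = load(ebp + 8);   /* [EBP+4] holds the return address, */
    uint32_t b = load(ebp + 12);  /* parameters start at [EBP+8]       */
    uint32_t result = a + b;

    esp = ebp;                    /* MOV ESP, EBP                */
    ebp = pop();                  /* POP EBP                     */
    (void)pop();                  /* RET 8: pop the return address ... */
    esp += 8;                     /* ... then release 8 parameter bytes */
    return result;
}

/* Caller: pushes right to left, calls, and does no cleanup afterwards. */
static uint32_t call_stdcall_add2(uint32_t a, uint32_t b)
{
    uint32_t esp_before = esp;
    push(b);                      /* rightmost parameter first   */
    push(a);                      /* leftmost parameter last     */
    push(0x1234u);                /* CALL pushes a (made-up) return address */
    uint32_t result = stdcall_add2();
    assert(esp == esp_before);    /* the called function already cleaned up */
    return result;
}
```

The assertion in call_stdcall_add2 is the point of the sketch: after the simulated RET 8, ESP is back where it was before the parameters were pushed, without any action by the caller.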
The cdecl calling convention
This calling convention has the following features:
* Parameters are pushed from right to left.
* The caller is responsible for cleaning up the stack.
* The function name is prefixed with an underscore.
* No case conversion is performed, so a function named functioncall is recorded as _functioncall in the symbol table.
* A variable number of parameters can be handled.
* It is the default calling convention for C and C++ programs.
Is the following statement correct?
Because the parameters are pushed from right to left, the first parameter ends up closest to the top of the stack. Therefore, when a variable number of parameters is used, the position of the first parameter on the stack is always known. As long as the number of remaining parameters can be determined from the first explicit parameter (or the first few), the variable parameters can be used.
The analysis is as follows:
If stdcall also pushes parameters from right to left, why can't stdcall handle variable parameters? The most important difference between stdcall and cdecl is who cleans up the parameters on the stack. Under stdcall the called function does it, while under cdecl the caller does. A calling convention is an agreement between the caller and the called function about who is responsible for what on the call stack.
MSDN explains in one sentence why cdecl can handle calls to functions with variable parameters: "Because the stack is cleaned up by the caller, it can do vararg functions." So it seems that cdecl supports variable parameters for a reason other than the order in which parameters are pushed.
Let's first settle one question: where does the difficulty of variable parameters lie? Is it in obtaining the type and number of the parameters?
The caller certainly knows how many parameters it passes; more precisely, it knows the size of the parameters, and it pushes them onto the stack one by one. But how does the called function learn the size and number of the parameters once it takes control? As mentioned above, the first parameter can always be located, and its type is known. In a function like printf, the first parameter contains indicators for the following parameters, such as %d and %f. From these, the type and number of the following parameters can be derived, and once the type and number are known, so is the total size. How the type and number are encoded depends on the function's implementation. Under both stdcall and cdecl the called function can obtain the size and number of its parameters this way, so this has nothing to do with who cleans up the stack. Note, however, that the compiled function contains no code that calculates the size of the parameters passed to it.
That still does not answer the question: where does the difficulty of variable parameters lie?
Suppose that under stdcall a called function with variable parameters successfully reads all of its parameters and finishes its work. When it is time to return, the size of the stack space it received is not recorded anywhere: the immediate operand of RET is a constant fixed at compile time, while the real size changes from call to call. The function therefore does not know how to clean up the stack, and the caller is not responsible for cleaning it up either. That is the problem.
Under cdecl the caller is responsible for cleaning up the stack. The caller knows how many parameters it passed, so it can clean up the stack without trouble and the program can continue to run.
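The way a called function derives the type and number of its parameters from the first parameter can be sketched with the standard C stdarg.h facilities. The function sum_by_format and its mini format (each 'd' announces an int, each 'f' a double) are hypothetical and only loosely modeled on printf's %d and %f.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: fmt describes the parameters that follow it.
   'd' means the next parameter is an int, 'f' means it is a double. */
double sum_by_format(const char *fmt, ...)
{
    va_list args;
    double total = 0.0;

    va_start(args, fmt);
    for (const char *p = fmt; *p != '\0'; ++p) {
        if (*p == 'd')
            total += va_arg(args, int);
        else if (*p == 'f')
            total += va_arg(args, double);
        /* any other character is ignored */
    }
    va_end(args);
    return total;
}

/* Two calls with different numbers and types of parameters. */
void demo_sum_by_format(void)
{
    assert(sum_by_format("dd", 1, 2) == 3.0);
    assert(sum_by_format("dfd", 1, 2.5, 3) == 6.5);
}
```

Note that sum_by_format learns the parameter count only while it runs, from the contents of fmt. Nothing in the compiled function is a fixed parameter size, which is why the cleanup has to be left to the caller.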
This section references: http://blog.csdn.net/ZhouHM/archive/2004/04/07/14721.aspx
Http://hi.baidu.com/dtzw/blog/item/cc17ba119eb39374cb80c4eb.html
Caller's responsibilities
Push the rightmost parameter onto the stack
Push the next parameter
...
Push the leftmost parameter
Call functionx
ADD ESP, <parameter size in bytes>
Responsibilities of the called function (functionx)
PUSH EBP
MOV EBP, ESP
SUB ESP, local_size
...
MOV ESP, EBP
POP EBP
RET
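The cdecl sequence above can also be traced in a small simulation that passes a different number of parameters on each call. As before, this is a sketch under assumptions: mem, esp, ebp, push, pop, load, cdecl_sum, and call_cdecl_sum are hypothetical names, the return address is made up, every parameter is 4 bytes, and the SUB ESP, local_size step is left out because the called function keeps no locals on the simulated stack.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simulation of the stack and of the ESP and EBP registers. */
static uint8_t  mem[256];
static uint32_t esp = sizeof mem, ebp = 0;

static void push(uint32_t v) { esp -= 4; memcpy(&mem[esp], &v, 4); }
static uint32_t pop(void) { uint32_t v; memcpy(&v, &mem[esp], 4); esp += 4; return v; }
static uint32_t load(uint32_t addr) { uint32_t v; memcpy(&v, &mem[addr], 4); return v; }

/* Called function: the leftmost parameter is a count, followed by that
   many values to add up. */
static uint32_t cdecl_sum(void)
{
    push(ebp);                          /* PUSH EBP     */
    ebp = esp;                          /* MOV EBP, ESP */

    uint32_t count = load(ebp + 8);     /* first parameter, just above the return address */
    uint32_t total = 0;
    for (uint32_t i = 0; i < count; ++i)
        total += load(ebp + 12 + 4 * i);

    esp = ebp;                          /* MOV ESP, EBP */
    ebp = pop();                        /* POP EBP      */
    (void)pop();                        /* RET: pops only the return address */
    return total;
}

/* Caller: pushes right to left, calls, then releases the parameters itself. */
static uint32_t call_cdecl_sum(const uint32_t *values, uint32_t count)
{
    uint32_t esp_before = esp;
    for (uint32_t i = count; i > 0; --i)
        push(values[i - 1]);            /* rightmost value first      */
    push(count);                        /* leftmost parameter last    */
    push(0x1234u);                      /* CALL pushes a (made-up) return address */
    uint32_t total = cdecl_sum();
    esp += 4 * (count + 1);             /* ADD ESP, <parameter size>  */
    assert(esp == esp_before);
    return total;
}
```

Only call_cdecl_sum knows how many bytes it pushed for a particular call, so it is the natural place for the cleanup. The called function ends with a plain RET regardless of the count.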
The fastcall calling convention
With this calling convention, parameters are placed in registers whenever possible. Because not every parameter requires a stack operation, it is faster than stdcall and cdecl, hence the name fastcall.
This calling convention has the following features:
* Parameters are passed from right to left.
* The called function cleans up the parameters on the stack.
* The function name is prefixed with @ and followed by another @ and the number of parameter bytes, in the form @name@number.
* No case conversion is performed on the function name.
* Functions with a variable number of parameters cannot be handled.
Caller's responsibilities
Push the rightmost parameter onto the stack
Place the next parameter in EDX
Place the first parameter in ECX
Call functionx
Responsibilities of the called function (functionx)
PUSH EBP
MOV EBP, ESP
SUB ESP, local_size
Do the function's work...
MOV ESP, EBP
POP EBP
RET <size_of_pushed_arguments> ; note: if there are only two parameters, RET carries no immediate operand here, because no parameter was pushed onto the stack.
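The note on the RET operand can be expressed as a small calculation. This is a sketch under the assumption that every parameter is a 4-byte value; the names fastcall_ret_bytes, fastcall_param_location, and demo_fastcall are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Assumptions for this sketch: every parameter is 4 bytes wide, and the
   first two parameters travel in ECX and EDX. */
#define PARAM_SIZE      4u
#define REGISTER_PARAMS 2u

/* Number of bytes the called function releases with RET. */
unsigned fastcall_ret_bytes(unsigned param_count)
{
    if (param_count <= REGISTER_PARAMS)
        return 0;   /* plain RET: nothing was pushed onto the stack */
    return (param_count - REGISTER_PARAMS) * PARAM_SIZE;
}

/* Where the parameter with the given index (0 = leftmost) is passed. */
const char *fastcall_param_location(unsigned index)
{
    if (index == 0) return "ECX";
    if (index == 1) return "EDX";
    return "stack";
}

void demo_fastcall(void)
{
    assert(fastcall_ret_bytes(2) == 0);   /* two parameters: RET without operand */
    assert(fastcall_ret_bytes(3) == 4);   /* three parameters: RET 4             */
    assert(strcmp(fastcall_param_location(0), "ECX") == 0);
    assert(strcmp(fastcall_param_location(2), "stack") == 0);
}
```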
The thiscall calling convention
This is the default calling convention for C++ member functions with a fixed number of parameters. According to the original text, this convention cannot be specified explicitly because thiscall is not a keyword like stdcall or cdecl (newer versions of Visual C++ do provide a __thiscall keyword). Under the thiscall convention, the this pointer is passed to the called function in the ECX register.
Naked functions
For a function declared with the naked attribute, the compiler inserts no prologue or epilogue code. You can then use the inline assembler to write your own prologue and epilogue instruction sequences. Naked functions are an advanced feature. They let you declare a function that is called from a context other than C or C++, where there is no agreement on where the parameters are placed or which registers are preserved. In short, everything is left to the programmer.
They are suitable for writing C code that communicates with existing systems written in assembly language.
What is an inline assembler? From Wikipedia:
In computer programming, the inline assembler is a feature of some compilers that allows very low level code written in assembly to be embedded in a high level language like C or Ada.
Comparison of the calling conventions
Item | __stdcall (Win32) | __cdecl | __fastcall | thiscall (native C++) | COM | __declspec(naked)
Parameter push order | Right to left | Right to left | Right to left; arg1 in ECX, arg2 in EDX | Right to left; this pointer in ECX | Right to left; this pointer pushed last | Programmer-defined
Parameter location | Stack | Stack | Stack + registers | Stack; register ECX | | Programmer-defined
Who clears the parameters from the stack | Called function | Caller | Called function | Called function | Called function | Programmer-defined
Variable parameters supported | No | Yes | No | No | No | Programmer-defined
Function name format | _name@number | _name | @name@number | | | Programmer-defined
Function name example | _NtWaitForSingleObject@12 | _printf | @Afpfsddispclosevol@4 | | | Programmer-defined
Case conversion | No | No | No | No | | Programmer-defined
C++ name decoration | "@@YG" after the function name marks the start of the parameter table, which follows it | The parameter table starts with "@@YA" | The parameter table starts with "@@YI" | | | Programmer-defined
Note: __declspec is a Microsoft keyword and may not exist on other systems.
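The C name formats in the comparison (_name, _name@number, @name@number) can be reproduced with a short helper. This is a sketch only: the enum convention, the function decorate_name, and demo_decorate_name are hypothetical and cover just the three C formats listed, not the C++ decoration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef enum { CONV_CDECL, CONV_STDCALL, CONV_FASTCALL } convention;

/* Writes the decorated name into out and returns snprintf's result
   (the length the full name needs, not counting the terminator). */
int decorate_name(convention conv, const char *name, unsigned param_bytes,
                  char *out, size_t out_size)
{
    switch (conv) {
    case CONV_STDCALL:   /* _name@number */
        return snprintf(out, out_size, "_%s@%u", name, param_bytes);
    case CONV_FASTCALL:  /* @name@number */
        return snprintf(out, out_size, "@%s@%u", name, param_bytes);
    case CONV_CDECL:     /* _name; the parameter size is not encoded */
    default:
        return snprintf(out, out_size, "_%s", name);
    }
}

void demo_decorate_name(void)
{
    char buf[64];

    decorate_name(CONV_STDCALL, "NtWaitForSingleObject", 12, buf, sizeof buf);
    assert(strcmp(buf, "_NtWaitForSingleObject@12") == 0);

    decorate_name(CONV_CDECL, "printf", 0, buf, sizeof buf);
    assert(strcmp(buf, "_printf") == 0);
}
```

The cdecl name carries no size because the called function never needs it: the caller, which knows what it pushed, does the cleanup.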
Exercise:
1. Q: What do prologue and epilogue code do?
A: The prologue and epilogue code is generated automatically by the compiler. It saves registers, sets up the stack frame, and restores the stack after the function call.
2. Q: What types of calling convention are there?
A: stdcall, cdecl, fastcall, thiscall, and naked functions.
3. Q: Which calling conventions rely on the called function to clean up the parameters on the stack?
A: stdcall, fastcall, thiscall.
4. Q: Which calling conventions are generally used by applications written in C++?
A: cdecl and thiscall.
5. Q: Which register is generally used to pass the this pointer on Intel?
A: ECX.