[Reprint] C/C++ Stack Guide
Reprint: http://www.cnblogs.com/Binhua-Liu/archive/2010/08/24/1803095.html
Preface
We often discuss when data is stored on the stack and when it is stored on the heap. We know that local variables live on the stack; that when debugging, the call stack view shows the order in which functions were called; and that arguments passed to a function are in fact pushed onto the stack. The stack sounds like a bit of a hodgepodge. So how does it actually work? This article explains the working mechanism of the C++ stack. Please keep the following points in mind while reading:
1) The compilation environment discussed in this article is Visual C++. Because the stack mechanisms of high-level languages are broadly similar, the discussion is also useful for other compilers and for other high-level languages such as C#.
2) The stack discussed here is the default stack that the program allocates for each thread to support execution, not a stack data structure that a programmer defines to implement an algorithm.
3) The platform discussed in this article is Intel x86.
4) The main part of the article avoids assembly as far as possible; the optional final section gives the disassembly of the code from the earlier sections, with comments.
5) Structured exception handling is also implemented through the stack (when you use a try...catch statement, C++ builds on Windows structured exception handling), but that topic is too complex to cover in this article.
Some basic knowledge and concepts
1) The program stack is supported directly by the processor. On Intel x86, the stack grows in memory from high addresses toward low addresses (a hand-written stack data structure, by contrast, usually grows from low addresses to high addresses).
Therefore the address of the top of the stack keeps decreasing: the later a piece of data is pushed, the lower its address.
2) On a 32-bit system, each data unit on the stack is 4 bytes. Data of 4 bytes or less, such as bytes, words, double words, and Booleans, occupies 4 bytes on the stack; data larger than 4 bytes occupies a whole multiple of 4 bytes.
3) Two registers are involved in stack operations: EBP and ESP. For this article you only need to think of EBP and ESP as two pointers. ESP always points to the top of the stack. When a push instruction pushes data onto the stack, ESP is first decreased by 4 and the data is then copied to the address ESP points to; when a pop instruction executes, the data ESP points to is first copied to a memory address or register, and ESP is then increased by 4. EBP is used to access data in the stack. It points to a position in the middle of the stack (explained in detail later): the addresses of the function's parameters are higher than the value of EBP, and the addresses of its local variables are lower. Parameters and local variables are therefore always accessed as EBP plus or minus an offset; for example, the first parameter of a function is accessed at ebp+8.
4) What data is stored in the stack? Function arguments, function local variables, saved register values (so that registers can be restored), function return addresses, and data for structured exception handling (present when the function contains try...catch statements; not discussed in this article). This data is organized in a fixed order, which we call a stack frame. One stack frame corresponds to one call of a function. By the time the function's own code starts executing, its stack frame is fully established (space for all local variables is allocated when the frame is built, not gradually as the function executes); when the function exits, the whole stack frame is destroyed.
5) In this article the function that makes a call is referred to as caller, and the function being called is referred to as callee. The distinction matters because part of the work of creating and cleaning up a stack frame is done by caller and part by callee.
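Points 2) and 3) above can be sketched in code. The following is a minimal model rather than real machine behavior: `SimStack`, `read`, and `slot_size` are names made up for this illustration, and addresses are just offsets into a byte buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a 32-bit stack: a flat byte buffer plus an ESP value.
// Addresses here are offsets into the buffer, not real machine addresses.
struct SimStack {
    std::vector<std::uint8_t> mem;
    std::uint32_t esp;  // always "points" at the top of the stack

    explicit SimStack(std::uint32_t size) : mem(size, 0), esp(size) {}

    // push: ESP is decreased by 4 first, then the value is stored at [ESP].
    void push(std::uint32_t value) {
        assert(esp >= 4);
        esp -= 4;
        for (int i = 0; i < 4; ++i)
            mem[esp + i] = static_cast<std::uint8_t>(value >> (8 * i));
    }

    // pop: the value at [ESP] is read first, then ESP is increased by 4.
    std::uint32_t pop() {
        assert(esp + 4 <= mem.size());
        std::uint32_t value = read(esp);
        esp += 4;
        return value;
    }

    // Reads the little-endian 32-bit word stored at addr.
    std::uint32_t read(std::uint32_t addr) const {
        std::uint32_t value = 0;
        for (int i = 0; i < 4; ++i)
            value |= static_cast<std::uint32_t>(mem[addr + i]) << (8 * i);
        return value;
    }
};

// Bytes occupied on the stack by a value of `size` bytes:
// rounded up to the next multiple of 4.
constexpr std::uint32_t slot_size(std::uint32_t size) {
    return (size + 3u) & ~3u;
}
```

Pushing 4 and then 3 onto a `SimStack` lowers `esp` by 4 each time, and popping returns 3 first, matching the "later data sits at a lower address" rule.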
How the stack works
Let's look at the working mechanism of the stack. The stack exists to support function calls and their execution, so we start from a small example with a chain of function calls:
int foo1(int m, int n)
{
    int p = m * n;
    return p;
}

int foo(int a, int b)
{
    int c = a + 1;
    int d = b + 1;
    int e = foo1(c, d);
    return e;
}

int main()
{
    int result = foo(3, 4);
    return 0;
}
The code itself does nothing meaningful; we only use it to trace the stack. The following sections follow how a stack frame is created, used, and destroyed.
Creating the stack frame
We start tracing at the first line of the main function, int result=foo(3,4). The stack frames of main and of the functions that called it already exist on the stack, as shown in:
Figure 1
Parameters pushed onto the stack
When foo is called, caller (at this point, the main function) first pushes the two parameters of foo, a=3 and b=4, onto the stack. The order in which arguments are pushed is determined by the function's calling convention, which is explained in its own section later. In general, parameters are pushed from right to left, so b=4 is pushed first, followed by a=3:
Figure 2
Return address pushed onto the stack
When a function ends, execution goes back to the calling function and continues there. How does the program know which instruction to continue with? When the function is called, the address of the instruction that follows the call is automatically pushed onto the stack; when the function ends, that address is read back from the stack and execution jumps to it. If the address of the current "call foo" instruction is 0x00171482, then, since the call instruction occupies 5 bytes, the address of the next instruction is 0x00171487, and 0x00171487 is pushed onto the stack:
Figure 3
Code jumps to the called function
Once the return address is on the stack, execution jumps to the called function foo. Up to this point the stack frame has been built by caller; the rest of it is built by callee.
EBP pushed onto the stack
Inside foo, the value of the EBP register is pushed onto the stack first. At this moment EBP still holds the value that main uses to access its own parameters and local variables, so it must be saved on the stack and restored when foo exits. EBP is then given a new value:
1) Push EBP onto the stack
2) Copy the value of ESP into EBP
Figure 4
As a result, the stack address that EBP now points to holds the previous value of EBP. You will also see that ebp+4 is the address of the function's return address and ebp+8 is the address of the first parameter (the first parameter is not necessarily at ebp+8, as discussed later). Through EBP it is therefore easy to locate the return address, the arguments the function was called with, and the local variables.
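The steps from Figure 1 to Figure 4 can be replayed in a small model. This is only a sketch: `Machine` and `enter_foo` are hypothetical names, memory is a map from address to 32-bit word, and the starting ESP and EBP values are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical model: memory is a map from address to 32-bit word.
struct Machine {
    std::map<std::uint32_t, std::uint32_t> mem;
    std::uint32_t esp;
    std::uint32_t ebp;

    // push: decrease ESP by 4, then store the value at [ESP].
    void push(std::uint32_t v) { esp -= 4; mem[esp] = v; }
};

// Replays main's call to foo(3, 4) up to the state of Figure 4.
// The initial ESP/EBP values are made up for illustration.
inline Machine enter_foo() {
    Machine m{{}, 0x0012FF00u, 0x0012FF40u};
    const std::uint32_t main_ebp = m.ebp;

    m.push(4);                // caller: b, the rightmost argument, goes first
    m.push(3);                // caller: a
    m.push(0x00171482u + 5);  // call: address of the instruction after the call
    m.push(main_ebp);         // callee: push ebp
    m.ebp = m.esp;            // callee: mov ebp, esp
    return m;
}
```

In the resulting state, the word at `ebp` is main's saved EBP, `ebp + 4` holds 0x00171487, `ebp + 8` holds 3, and `ebp + 12` holds 4.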
Allocating space for local variables
Next, foo allocates space for its local variables. Rather than pushing each local variable, the program subtracts a value from ESP and so reserves space for all local variables at once; in foo this is esp=esp-0x00e4. (According to tests by the reader "candle fall" in other build environments, push instructions may also be used to allocate the space; there is no essential difference, and it is noted here for completeness.)
Figure 5
Oddly, in debug mode the compiler allocates more space for local variables than is actually needed, and the local variables are not contiguous (from what I have observed, they are always 8 bytes apart), as shown in:
Figure 6
I don't know why the compiler is designed this way, perhaps to leave room for debug data on the stack, but that is not today's topic.
General-purpose registers pushed onto the stack
Finally, the general-purpose registers that the function uses are pushed onto the stack, so that they can be restored when the function ends. The general-purpose registers used by foo are EBX, ESI, and EDI, so these are pushed:
Figure 7
At this point, a complete stack frame is set up.
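To summarize the finished frame, the position of each item can be written as an offset from EBP. The sketch below is only bookkeeping under the assumptions of this article: `FrameLayout` and `foo_layout` are hypothetical names, and 0xE4 is the debug-build local area size quoted above, which will differ in other builds.

```cpp
#include <cassert>
#include <cstdint>

// Offsets relative to EBP (in bytes) for foo's frame, following the steps above.
struct FrameLayout {
    std::int32_t first_arg;       // ebp+8
    std::int32_t return_address;  // ebp+4
    std::int32_t saved_ebp;       // ebp+0
    std::int32_t locals_low;      // lowest address of the local variable area
    std::int32_t saved_ebx;
    std::int32_t saved_esi;
    std::int32_t saved_edi;       // also where ESP ends up
};

// locals_size is the value subtracted from ESP (0xE4 in the article's example).
constexpr FrameLayout foo_layout(std::int32_t locals_size) {
    return FrameLayout{
        8, 4, 0,
        -locals_size,
        -locals_size - 4,
        -locals_size - 8,
        -locals_size - 12,
    };
}
```

With `foo_layout(0xE4)`, the saved EDI, and therefore ESP, ends up at ebp-0xF0.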
Stack frame characteristics
In the previous section a complete stack frame was set up, and the function can now start executing its own code. In this section we look at some characteristics of the stack frame, to help you understand how a function depends on it.
1) Once a complete stack frame has been established, its structure and size stay the same for the whole execution of the function; no matter when or by whom the function is called, the structure of its stack frame is the same.
2) When function A calls function B, the stack frame of B is built "below" the stack frame of A. For example, foo calls foo1, so the stack frame of foo1 is built below the stack frame of foo, as shown in:
Figure 8
3) The function uses the EBP register to access its parameters and local variables. The address of a parameter is always higher than the value of EBP, and the address of a local variable is always lower. Within a given stack frame, the offset of each parameter or local variable from EBP is fixed, so the function accesses them as EBP plus or minus an offset. For example, in foo, ebp+8 is the address of the first parameter and ebp-8 is the address of the first local variable.
4) If you think about it, the EBP register has another very important property. Look at the following figure:
Figure 9
We see that the EBP register always points to the previous EBP, which in turn points to the EBP before it, forming a linked list on the stack. What is this good for? The address ebp+4 stores the function's return address, from which we can identify the calling function (by looking up, in the symbol file, the function whose address is closest to the return address). Repeating this step gives the whole sequence of function calls for the current thread. This is exactly what the debugger does, and it is why, when we inspect the sequence of function calls while debugging, we say we are "viewing the stack".
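Walking this linked list can be sketched as follows. The memory model, the addresses in `sample_stack`, and the use of a saved EBP of 0 to mark the outermost frame are all assumptions made for this illustration; a real debugger would additionally map each return address to a function name through the symbol file.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical memory model: address -> 32-bit word.
using Memory = std::map<std::uint32_t, std::uint32_t>;

// Follows the saved-EBP links and collects the return address stored at
// [EBP+4] of every frame. A saved EBP of 0 marks the outermost frame here;
// that terminator is an assumption of this sketch.
inline std::vector<std::uint32_t> walk_frames(const Memory& mem, std::uint32_t ebp) {
    std::vector<std::uint32_t> return_addresses;
    while (ebp != 0) {
        return_addresses.push_back(mem.at(ebp + 4));
        ebp = mem.at(ebp);  // move to the previous frame's EBP
    }
    return return_addresses;
}

// Builds three linked frames (main -> foo -> foo1) with made-up addresses.
inline Memory sample_stack() {
    Memory mem;
    mem[0x1000] = 0;       mem[0x1004] = 0x00400100;  // main's frame
    mem[0x0F00] = 0x1000;  mem[0x0F04] = 0x00171487;  // foo's frame
    mem[0x0E00] = 0x0F00;  mem[0x0E04] = 0x00171420;  // foo1's frame
    return mem;
}
```

Starting from foo1's EBP (0x0E00 in this sample), `walk_frames` yields the return addresses innermost first, which is the order a call stack window lists them in.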
How the return value is passed
Once the stack frame is established, the function's own code starts executing: it works with the parameters on the stack, works with the local variables on the stack, perhaps creates objects on the heap, and so on. Eventually the function finishes its work, and some functions need to return a result to the function that called them. How is this done?
First, caller and callee need an agreement on this. Caller does not know how callee works internally, so it relies on callee's function declaration to know where to fetch the return value. Likewise, callee cannot put the return value in an arbitrary register or memory location and expect caller to find it; it must put the return value in the right place according to the function declaration and the agreed convention. The convention is as follows:
1) If the return value is 4 bytes or smaller, the function places it in the EAX register, and it is returned through EAX. For example, return values of type byte, word, double word, Boolean, or pointer are all returned through EAX.
2) If the return value is 8 bytes, the function places it in the EAX and EDX registers: EDX holds the high 4 bytes and EAX holds the low 4 bytes. For example, a return value of type __int64, or a struct of 8 bytes, is returned through EAX and EDX.
3) If the return value is a double or float, the function places it in the floating-point register (ST(0) on x86), and it is returned through that register.
4) If the return value is larger than 8 bytes, how is it passed? This case is more troublesome, so we explain it in detail.
We change the definition of foo as follows and adjust its code accordingly:
MyStruct foo(int a, int b)
{
    ...
}
MyStruct is defined as:
struct MyStruct
{
    int value1;
    __int64 value2;
    bool value3;
};
Now the arguments are pushed differently when foo is called, as shown in:
Figure 10
After pushing the leftmost parameter, caller pushes one more pointer, which we will call ReturnValuePointer. ReturnValuePointer points to an unnamed area in caller's local variable area, which is used to store callee's return value. When the function returns, callee copies the return value to the address that ReturnValuePointer points to and places that address in the EAX register. After the function returns, caller reads the address from EAX, finds the return value through it, and finally copies the return value to the local variable that receives it (if the return value is received at all).
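Expressed at source level, the mechanism is roughly equivalent to rewriting foo so that it takes the pointer as an extra argument. The sketch below is an illustration only: `foo_lowered` is a hypothetical name, the function bodies are invented (the article elides them), and `std::int64_t` stands in for `__int64` so that the code is not tied to Visual C++.

```cpp
#include <cassert>
#include <cstdint>

struct MyStruct {
    int value1;
    std::int64_t value2;  // stands in for __int64
    bool value3;
};

// What the programmer writes. The body is invented for this example.
inline MyStruct foo(int a, int b) {
    MyStruct s;
    s.value1 = a + 1;
    s.value2 = static_cast<std::int64_t>(a) * b;
    s.value3 = true;
    return s;
}

// Roughly what the compiler arranges: the caller passes the address of a buffer
// in its own frame, and the callee fills it in and hands the same address back
// (on x86, in EAX).
inline MyStruct* foo_lowered(MyStruct* return_value_pointer, int a, int b) {
    return_value_pointer->value1 = a + 1;
    return_value_pointer->value2 = static_cast<std::int64_t>(a) * b;
    return_value_pointer->value3 = true;
    return return_value_pointer;
}
```

A caller of `foo_lowered` declares a `MyStruct` buffer among its own locals and passes its address, which mirrors the unnamed area in caller's frame described above.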
You may wonder: when the function returns, its stack frame has been destroyed, and ReturnValuePointer was in that stack frame, so hasn't it been destroyed too? The stack frame is indeed destroyed, but the program does not wipe its contents, and the address has already been copied into EAX; the memory it points to lies in caller's frame, which still exists. So the return value is still valid.
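Rule 2) above, the 8-byte case, comes down to splitting a 64-bit value into two 32-bit halves. A minimal sketch, in which `RegPair`, `split_return`, and `join_return` are hypothetical names that merely stand in for the two registers:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair standing in for the two registers.
struct RegPair {
    std::uint32_t eax;  // low 4 bytes
    std::uint32_t edx;  // high 4 bytes
};

// Callee side: place the low half in EAX and the high half in EDX.
inline RegPair split_return(std::int64_t value) {
    const std::uint64_t bits = static_cast<std::uint64_t>(value);
    return RegPair{static_cast<std::uint32_t>(bits & 0xFFFFFFFFu),
                   static_cast<std::uint32_t>(bits >> 32)};
}

// Caller side: reassemble the 64-bit value from the two halves.
inline std::int64_t join_return(RegPair r) {
    const std::uint64_t bits = (static_cast<std::uint64_t>(r.edx) << 32) | r.eax;
    return static_cast<std::int64_t>(bits);
}
```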
Destroying the stack frame
Once the function has placed the return value in a register or somewhere on the stack, it starts cleaning up the stack frame and prepares to exit. The stack frame is cleaned up in the reverse of the order in which it was built (the destruction process is not illustrated with figures):
1) If objects are stored in the stack frame, the function calls their destructors.
2) The saved general-purpose register values are popped from the stack, restoring the general-purpose registers.
3) A value is added to ESP to reclaim the space of the local variables (the same amount that was subtracted when the stack frame was built).
4) The saved EBP value is popped from the stack, restoring the EBP register.
5) The return address is popped from the stack, and execution jumps to it.
6) A value is added to ESP to reclaim the space of all the parameters.
Steps 1 through 5 are all done by callee. Whether step 6, reclaiming the parameter space, is done by caller or by callee is determined by the calling convention that the function uses. The next section explains calling conventions.
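Step 1 can be observed from C++ itself. In the sketch below, `Tracer` and `work` are hypothetical names; the example relies only on the language rule that local objects are destroyed in the reverse order of their construction, before control returns to the caller.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Records its name when destroyed, so the order can be inspected afterwards.
struct Tracer {
    std::string name;
    std::vector<std::string>* log;
    Tracer(std::string n, std::vector<std::string>* l) : name(std::move(n)), log(l) {}
    ~Tracer() { log->push_back(name); }
};

inline int work(std::vector<std::string>* log) {
    Tracer first("first", log);
    Tracer second("second", log);
    return 42;  // both destructors run here, before control returns to the caller
}
```

After `work` returns, the log contains "second" followed by "first": the objects in the frame are torn down before the rest of the frame is released.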
Calling conventions
A function's calling convention determines the order in which its arguments are pushed onto the stack, and who (caller or callee) removes the parameters from the stack when the function exits. There are 2 ways to specify the calling convention of a function:
1) Add a modifier when the function is defined, for example:
void __thiscall mymethod()
{
    ...
}
2) Set the default calling convention for all functions defined in the project in the Visual Studio project settings: from the main menu, open Project | Properties | Configuration Properties | C/C++ | Advanced | Calling Convention and select a calling convention (note: this setting does not apply to class member functions).
There are 3 common calling conventions:
1) __cdecl. This is the default calling convention of the VC compiler. Parameters are pushed onto the stack from right to left, and caller removes them from the stack after the function exits. Its distinguishing feature is support for a variable number of parameters, as in printf. Callee does not know how many arguments caller pushed, so it cannot clean up the stack itself; the stack can only be cleaned up by caller after the function exits, because caller always knows how many arguments it passed.
2) __stdcall. Almost all Windows API functions use __stdcall. Parameters are pushed onto the stack from right to left, and callee removes them from the stack itself when the function exits. Because callee does the cleanup, __stdcall does not support a variable number of parameters.
3) __thiscall. This is the default calling convention for class member functions. Parameters are pushed onto the stack from right to left, callee removes them from the stack when the function exits, and on x86 the this pointer is passed in the ECX register. A variable number of parameters is not supported either. If you explicitly declare a class member function as __cdecl or __stdcall, pushing and cleanup follow the rules of __cdecl or __stdcall, and the this pointer is pushed onto the stack as the first argument of the function instead of being passed in ECX.
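The reason a variable number of parameters needs caller cleanup is easy to see in a variadic function. In the sketch below, `sum_ints` is a hypothetical example: it learns how many arguments were passed only at run time, from its first parameter, so the amount of stack to release cannot be fixed when the function is compiled.

```cpp
#include <cassert>
#include <cstdarg>

// A variadic function: the callee learns the argument count only at run time,
// from its first parameter.
inline int sum_ints(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i)
        total += va_arg(args, int);  // fetch the next int argument
    va_end(args);
    return total;
}
```

Each call site, such as `sum_ints(3, 1, 2, 3)`, knows exactly how many arguments it passed, which is why the cleanup is left to caller under __cdecl.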
Walking through the disassembly (skip this if you are not familiar with assembly)
The following is the disassembly of the code that builds the stack frame for foo. I comment it line by line so that it can be compared with the description of the stack in the previous sections.
Disassembly of int result=foo(3,4); in the main function:
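008A147E push 4                        // push b=4 onto the stack
008A1480 push 3                        // push a=3 onto the stack; state of Figure 2
008A1482 call foo (8A10F5h)            // push the return address and jump into foo; state of Figure 3
008A1487 add esp,8                     // foo has returned; under __cdecl, caller cleans up the parameters
008A148A mov dword ptr [result],eax    // the return value is in EAX; store it in the variable result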
The following is the assembly code at the beginning and end of foo, before and after its own code executes:
008A13F0 push ebp                      // push ebp onto the stack
008A13F1 mov ebp,esp                   // ebp now points to the previous ebp; state of Figure 4
008A13F3 sub esp,0E4h                  // allocate 0E4h bytes for local variables; state of Figure 5
008A13F9 push ebx                      // push EBX
008A13FA push esi                      // push ESI
008A13FB push edi                      // push EDI; state of Figure 7
008A13FC lea edi,[ebp-0E4h]            // this line and the next 3 fill the local variable area so that every byte is 0CCh
008A1402 mov ecx,39h
008A1407 mov eax,0CCCCCCCCh
008A140C rep stos dword ptr es:[edi]
......
// N lines of the function's own code omitted
......
008A1436 pop edi                       // restore EDI
008A1437 pop esi                       // restore ESI
008A1438 pop ebx                       // restore EBX
008A1439 add esp,0E4h                  // reclaim the space of the local variables
008A143F cmp ebp,esp                   // this line and the next 2 are runtime checking: verify that ESP and EBP agree
008A1441 call @ILT+330(__RTC_CheckEsp) (8A114Fh)
008A1446 mov esp,ebp
008A1448 pop ebp                       // restore EBP
008A1449 ret                           // pop the return address and jump to it (__cdecl: callee does not clean up the parameters)
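The fill loop in the listing (lea, two mov instructions, rep stos) can be checked with a little arithmetic: 39h dwords of 4 bytes each cover exactly the 0E4h bytes reserved for local variables. A minimal sketch, in which `rep_stos` is a hypothetical helper that models only the store loop:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// The dword count and the local area size from the listing must agree.
static_assert(0x39 * 4 == 0xE4, "0x39 dwords cover the 0xE4-byte local area");

// Models `rep stos dword ptr es:[edi]`: stores EAX into ECX consecutive dwords.
inline std::vector<std::uint32_t> rep_stos(std::uint32_t eax, std::uint32_t ecx) {
    std::vector<std::uint32_t> area;
    for (; ecx != 0; --ecx)
        area.push_back(eax);
    return area;
}
```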