In-depth analysis of the C++ function call process


 


Liu Bing QQ: 44452114

E-mail: liubing2000@foxmail.com

0. Introduction

  A function call is essentially an interruption of the caller's execution. How does C++ implement it? How are parameters pushed onto the stack, how does control jump to the function, and how is the caller's context saved and restored? This article analyzes the function call process in depth and demonstrates it in the VC 6.0 environment. If the analysis falls short or contains errors, corrections are welcome; please contact the author.

EIP is the instruction pointer; it holds the address of the next instruction to be executed. EBP is the base pointer, which usually points to the bottom of the current function's stack region. ESP is the stack pointer, which points to the top of the stack.

Consider the following simple program (Figure 1). We will view and analyze its assembly code in VC 6.0.

Figure 1
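Figure 1 is not reproduced here. As a rough guide, a program consistent with the rest of the walkthrough might look like the sketch below. The function name g_func, the three arguments 1, 2, 3, and the two locals holding 0x11111111 and 0x22222222 come from the text; the parameter names, the return value, and the lowercase spelling of the locals are assumptions.

```cpp
// Hypothetical reconstruction of the Figure 1 program.
// Parameter names a, b, c and the return value are assumptions.
int g_func(int a, int b, int c)
{
    int x = 0x11111111;  // the local the text calls X, later found at [ebp-4]
    int y = 0x22222222;  // the local the text calls Y, later found at [ebp-8]
    (void)x;             // silence unused-variable warnings
    (void)y;
    return a + b + c;    // assumed; the original return value is not stated
}
```

The main function discussed in the article would then contain a call such as g_func(1, 2, 3).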

1. Function call

The assembly code for the call to g_func is shown in Figure 2:

Figure 2

First come three push instructions, which push the three arguments onto the stack. Note that the arguments are pushed from right to left. We can check the stack contents to verify this (Figure 3). The register window on the right shows that ESP (the stack-top pointer) is 0x0012fef0. Looking up address 0x0012fef0 in the memory window in the middle, we see 0x00000001 (argument 1), 0x00000002 (argument 2), and 0x00000003 (argument 3) stored there. The three argument values are at the top of the stack, so the pushes succeeded.

Figure 3
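The push order can be mimicked with a toy model of a downward-growing stack. This is only a sketch: ToyStack and push_args_right_to_left are hypothetical names, and the starting ESP of 0x0012fefc is an assumption chosen so that the final ESP matches the 0x0012fef0 reported above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy model of a 32-bit stack that grows toward lower addresses.
struct ToyStack {
    std::uint32_t esp;                              // simulated stack pointer
    std::map<std::uint32_t, std::uint32_t> memory;  // address -> stored dword

    // push: decrement ESP by 4, then store the value at the new top.
    void push(std::uint32_t value) {
        assert(esp >= 4);
        esp -= 4;
        memory[esp] = value;
    }
};

// Pushes the arguments of g_func(1, 2, 3) from right to left.
inline ToyStack push_args_right_to_left() {
    ToyStack s{0x0012fefcu, {}};  // assumed ESP before the first push
    s.push(3);  // rightmost argument first
    s.push(2);
    s.push(1);  // leftmost argument last, so it ends up on top
    return s;
}
```

Because the leftmost argument is pushed last, it sits at the lowest address, which is why the memory window lists 1, 2, 3 in ascending address order starting at ESP.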

 

Next, the call instruction transfers control to address 0x00401005. What is at that address? Following it, Figure 4 shows that it holds another jump instruction, which jumps to 0x00401030. Figure 5 shows that 0x00401030 is the start address of the real g_func function, so this is how control reaches g_func.

Figure 4

 

Figure 5

 

2. Saving the context

Now look at the stack contents in Figure 6. ESP (the stack top) is 0x0012feec, and the memory window shows that the top of the stack holds 0x00401093, followed by the arguments 1, 2, 3 pushed earlier. So executing the call instruction pushed one more value (0x00401093) onto the stack. What is it? As Figure 3 shows, 0x00401093 is the address of the instruction immediately after the call, that is, the instruction to execute once the function call completes. When the function returns, execution jumps to this address. This is what is commonly called "saving the context" before the interruption. The push is performed implicitly by the call instruction itself rather than by a separate instruction in the program: in effect EIP (the instruction pointer) is pushed, as if a push eip had been executed. When the function returns, this value is popped from the stack back into EIP and the program continues running.

Figure 6
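The implicit push and pop of the return address can be illustrated with a small model. This is a sketch rather than a description of real hardware: ToyCpu is a hypothetical type, and only the addresses 0x00401005, 0x00401093, 0x0012fef0 and 0x0012feec are taken from the text.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy model of the two registers involved in call and ret.
struct ToyCpu {
    std::uint32_t eip;                              // simulated instruction pointer
    std::uint32_t esp;                              // simulated stack pointer
    std::map<std::uint32_t, std::uint32_t> memory;  // address -> stored dword

    void push(std::uint32_t value) {
        esp -= 4;
        memory[esp] = value;
    }

    std::uint32_t pop() {
        assert(memory.count(esp) == 1);  // only pop what was pushed
        std::uint32_t value = memory[esp];
        esp += 4;
        return value;
    }

    // call: implicitly push the address of the following instruction, then jump.
    void call(std::uint32_t target, std::uint32_t return_address) {
        push(return_address);
        eip = target;
    }

    // ret: implicitly pop the saved address back into EIP.
    void ret() { eip = pop(); }
};
```

Starting from ESP = 0x0012fef0, calling cpu.call(0x00401005, 0x00401093) leaves ESP at 0x0012feec with 0x00401093 on top of the stack, matching Figure 6. A later cpu.ret() puts 0x00401093 into EIP and returns ESP to 0x0012fef0.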

Continuing, the first instruction inside g_func is push ebp, which pushes EBP onto the stack. Each function has its own stack region, so each has a different stack base address. The called function also needs the EBP register while it runs, but on entry EBP still holds the main function's value. To keep that value from being overwritten, it is pushed onto the stack and saved.

Next, mov ebp, esp makes the current stack-top address the stack base address of this function, which fixes the stack region of g_func (EBP is its bottom, ESP is its top).

The next instruction is sub esp, 48h. Literally, it moves the stack-top pointer up (toward lower addresses) by 48h bytes. Why move it, and what is the memory in between used for? Part of it is a gap that separates the stack regions of the two functions, as shown in Figure 7. Here the gap has a fixed size of 40h, that is, 64 bytes. The rest is the memory reserved for local variables. The g_func function has two local variables, X and Y, which together take 8 bytes, so ESP must be moved by 40h + 8 = 48h.

Figure 7
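The arithmetic of the three prologue instructions can be written out as a small sketch. The 40h gap and the 8 bytes of locals are taken from the text; Frame and enter_frame are hypothetical names, and the 4-byte size of each local is an assumption consistent with the stated total of 8 bytes.

```cpp
#include <cstdint>

constexpr std::uint32_t kGapBytes    = 0x40;   // fixed gap described in the text
constexpr std::uint32_t kLocalsBytes = 2 * 4;  // two locals, assumed 4 bytes each
constexpr std::uint32_t kFrameBytes  = kGapBytes + kLocalsBytes;

// Register values after the prologue has run.
struct Frame {
    std::uint32_t ebp;
    std::uint32_t esp;
};

// Models: push ebp / mov ebp, esp / sub esp, 48h
constexpr Frame enter_frame(std::uint32_t esp_on_entry) {
    return Frame{
        esp_on_entry - 4,               // push ebp lowers ESP by 4; EBP takes that value
        esp_on_entry - 4 - kFrameBytes  // sub esp, 48h reserves the gap and the locals
    };
}
```

With the ESP value 0x0012feec seen on entry to g_func, this gives EBP = 0x0012fee8 and ESP = 0x0012fea0 after the sub instruction.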

The following instructions fill the 48h-byte area just reserved with the value 0CCCCCCCCh. ECX holds the repeat count: 12h doublewords of 4 bytes each is exactly 48h bytes.

00401039 lea edi, [ebp-48h]

0040103c mov ecx, 12h

00401041 mov eax, 0CCCCCCCCh

00401046 rep stos dword ptr [edi]
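In C++ terms, the four instructions above amount to a simple fill loop. The sketch below is an illustration only; fill_frame is a hypothetical name, and the array stands in for the memory between [ebp-48h] and EBP.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t   kDwordCount = 0x12;         // the count loaded into ECX
constexpr std::uint32_t kFillValue  = 0xCCCCCCCCu;  // the value loaded into EAX

// Models: lea edi, [ebp-48h] / mov ecx, 12h / mov eax, 0CCCCCCCCh / rep stos
inline std::array<std::uint32_t, kDwordCount> fill_frame() {
    std::array<std::uint32_t, kDwordCount> frame{};
    // 12h dwords of 4 bytes each cover exactly the 48h reserved bytes.
    assert(frame.size() * sizeof(std::uint32_t) == 0x48);
    for (std::size_t i = 0; i < frame.size(); ++i) {
        frame[i] = kFillValue;  // one "stos dword ptr [edi]" per iteration
    }
    return frame;
}
```

A recognizable fill pattern like this makes reads of uninitialized locals easy to spot in the debugger's memory window.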

Three push instructions in the function's entry code push EBX, ESI, and EDI onto the stack. This is also part of "saving the context", since these registers hold data belonging to the main function's execution. EBX, ESI, and EDI are the base register, the source index register, and the destination index register respectively.

 

3. Executing the called function

Next, let's look at the assignments to the local variables X and Y. How do the assembly instructions compute the memory addresses of X and Y? As Figure 8 shows, they are computed relative to EBP, as [ebp-4] and [ebp-8] respectively. We can see that the corresponding memory locations now hold 0x11111111 and 0x22222222.

Figure 8
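EBP-relative addressing can be summarized as a table of offsets. The offsets for X and Y are from the text. The positive offsets for the saved EBP, the return address, and the arguments are not stated in the article but follow from the push sequence it describes. FrameOffset and frame_address are hypothetical names, and the EBP value 0x0012fee8 used in the note after the code is derived from the ESP of 0x0012feec on entry, not read from a figure.

```cpp
#include <cstdint>

// Offsets relative to EBP inside g_func.
enum FrameOffset : std::int32_t {
    kLocalY     = -8,  // [ebp-8],  the local Y
    kLocalX     = -4,  // [ebp-4],  the local X
    kSavedEbp   = 0,   // [ebp],    the caller's EBP saved by push ebp
    kReturnAddr = 4,   // [ebp+4],  the address pushed by call
    kArg1       = 8,   // [ebp+8],  first argument
    kArg2       = 12,  // [ebp+0Ch], second argument
    kArg3       = 16   // [ebp+10h], third argument
};

// Computes the address of a frame slot; unsigned wraparound handles negative offsets.
constexpr std::uint32_t frame_address(std::uint32_t ebp, FrameOffset offset) {
    return ebp + static_cast<std::uint32_t>(static_cast<std::int32_t>(offset));
}
```

With EBP = 0x0012fee8, X lives at 0x0012fee4 and Y at 0x0012fee0, while the return address and the first argument sit at 0x0012feec and 0x0012fef0, the same addresses seen earlier in the memory window.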

At this point, the contents of the entire stack region should be clear (Figure 9).

Figure 9

 

4. Restoring the context

At this point the body of the called function has finished executing, and the compiler-generated code does some cleanup work (Figure 10). The first step is to pop the values of EDI, ESI, and EBX from the top of the stack. From the memory layout in Figure 9, we know that the data at the top of the stack is indeed EDI, ESI, and EBX, so this restores the values these registers had before the call. This is part of "restoring the context".

Figure 10

The fourth instruction is mov esp, ebp, which assigns the value of EBP to ESP. What does this mean? Looking at the memory layout in Figure 9, we can see that this instruction makes ESP point to the memory location that EBP points to. In other words, it skips a region: the gap and the local variable area. Because the function is exiting, both areas are no longer needed. In effect, this instruction is the inverse of the sub esp, 48h that created those areas on entry to the function.

Next is pop ebp. The memory layout in Figure 9 shows that the top of the stack now holds the EBP value saved earlier, so this restores the EBP value from before the call. This is also part of "restoring the context". After this instruction executes, the memory layout is as shown in Figure 11.

Figure 11

Next is a ret instruction, that is, the return instruction. What does it do? Note the ESP and EIP values before ret executes (Figure 12): ESP points to the top of the stack, which holds 0x00401093, and EIP is 0x0040105c (the address of the ret instruction).

Figure 12

After ret executes, look at ESP and EIP again (Figure 13). ESP is now 0x0012fef0; it has moved down by 4 bytes, toward higher addresses. Evidently the ret instruction implicitly performs a pop. EIP is now 0x00401093, a familiar value: it is the 4 bytes that were at the top of the stack, so the implicit operation is in effect a pop eip. This value is the address of the instruction following the call, saved when the call instruction executed. With EIP set to 0x00401093, the program jumps to the instruction after the call and resumes where it was interrupted. This is the so-called restoring of the breakpoint.

Figure 13

We are not quite finished. There is one last instruction, add esp, 0Ch. Figure 13 shows that the data at the top of the stack is 1, 2, 3, the three actual arguments pushed before the function call. Now that the function has finished, these three arguments are no longer needed, so add esp, 0Ch moves the stack-top pointer down by 12 bytes. Why 12 bytes? Because three int values were pushed onto the stack. With everything that the function call added to the stack now removed, ESP is back where it was before the call, and the saved registers hold the values they had before the call.
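The whole sequence can be replayed in a toy model to confirm that the stack is balanced at the end. This is a sketch under stated assumptions: ToyMachine and run_call_sequence are hypothetical names, the initial register values are made up by whoever calls the function, and only the code addresses and the 48h and 0Ch constants come from the text.

```cpp
#include <cstdint>
#include <map>

// Toy model of the registers and stack memory touched by the call sequence.
struct ToyMachine {
    std::uint32_t eip, esp, ebp, ebx, esi, edi;
    std::map<std::uint32_t, std::uint32_t> memory;  // address -> stored dword

    void push(std::uint32_t value) { esp -= 4; memory[esp] = value; }
    std::uint32_t pop() {
        std::uint32_t value = memory.at(esp);  // throws if nothing was stored here
        esp += 4;
        return value;
    }
};

// Replays the instruction sequence described in the article on a copy of m.
inline ToyMachine run_call_sequence(ToyMachine m) {
    // Caller: push arguments right to left, then call.
    m.push(3); m.push(2); m.push(1);
    m.push(0x00401093u);   // call implicitly pushes the return address
    m.eip = 0x00401005u;   // target of the call
    m.eip = 0x00401030u;   // the jump found there leads to g_func

    // Entry code: save the context and set up the stack region.
    m.push(m.ebp);         // push ebp
    m.ebp = m.esp;         // mov ebp, esp
    m.esp -= 0x48;         // sub esp, 48h
    m.push(m.ebx); m.push(m.esi); m.push(m.edi);
    m.edi = m.ebp;         // the rep stos fill advances EDI up to EBP

    // Body: assign the two locals.
    m.memory[m.ebp - 4] = 0x11111111u;
    m.memory[m.ebp - 8] = 0x22222222u;

    // Exit code: restore the context.
    m.edi = m.pop(); m.esi = m.pop(); m.ebx = m.pop();
    m.esp = m.ebp;         // mov esp, ebp
    m.ebp = m.pop();       // pop ebp
    m.eip = m.pop();       // ret implicitly pops the return address into EIP

    // Caller: discard the three arguments.
    m.esp += 0x0C;         // add esp, 0Ch
    return m;
}
```

Whatever ESP, EBP, EBX, ESI, and EDI values the machine starts with, the same values come back at the end, and EIP ends at 0x00401093, the instruction after the call.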

 

End!

 

 

 

 

 
