National embedded Talent Training Base
Chapter 2 Relationship Between assembly and C
--------------------------------------------------------------------------------
1. Function Calls
We use the following code to study the function call process.
Example 19.1. Studying the function call process
int bar(int c, int d)
{
    int e = c + d;
    return e;
}

int foo(int a, int b)
{
    return bar(a, b);
}

int main(void)
{
    foo(2, 3);
    return 0;
}
If the -g option is added at compile time (the -g option was introduced with gdb in Chapter 10), the objdump disassembler can interleave the C source with the assembly code, which makes the correspondence between the two clearer. The disassembly output is long, so only the parts we care about are listed below.
$ gcc main.c -g
$ objdump -dS a.out
...
08048394 <bar>:
int bar(int c, int d)
{
 8048394:   55                      push   %ebp
 8048395:   89 e5                   mov    %esp,%ebp
 8048397:   83 ec 10                sub    $0x10,%esp
    int e = c + d;
 804839a:   8b 55 0c                mov    0xc(%ebp),%edx
 804839d:   8b 45 08                mov    0x8(%ebp),%eax
 80483a0:   01 d0                   add    %edx,%eax
 80483a2:   89 45 fc                mov    %eax,-0x4(%ebp)
    return e;
 80483a5:   8b 45 fc                mov    -0x4(%ebp),%eax
}
 80483a8:   c9                      leave
 80483a9:   c3                      ret

080483aa <foo>:
int foo(int a, int b)
{
 80483aa:   55                      push   %ebp
 80483ab:   89 e5                   mov    %esp,%ebp
 80483ad:   83 ec 08                sub    $0x8,%esp
    return bar(a, b);
 80483b0:   8b 45 0c                mov    0xc(%ebp),%eax
 80483b3:   89 44 24 04             mov    %eax,0x4(%esp)
 80483b7:   8b 45 08                mov    0x8(%ebp),%eax
 80483ba:   89 04 24                mov    %eax,(%esp)
 80483bd:   e8 d2 ff ff ff          call   8048394 <bar>
}
 80483c2:   c9                      leave
 80483c3:   c3                      ret

080483c4 <main>:
int main(void)
{
 80483c4:   8d 4c 24 04             lea    0x4(%esp),%ecx
 80483c8:   83 e4 f0                and    $0xfffffff0,%esp
 80483cb:   ff 71 fc                pushl  -0x4(%ecx)
 80483ce:   55                      push   %ebp
 80483cf:   89 e5                   mov    %esp,%ebp
 80483d1:   51                      push   %ecx
 80483d2:   83 ec 08                sub    $0x8,%esp
    foo(2, 3);
 80483d5:   c7 44 24 04 03 00 00    movl   $0x3,0x4(%esp)
 80483dc:   00
 80483dd:   c7 04 24 02 00 00 00    movl   $0x2,(%esp)
 80483e4:   e8 c1 ff ff ff          call   80483aa <foo>
    return 0;
 80483e9:   b8 00 00 00 00          mov    $0x0,%eax
}
 80483ee:   83 c4 08                add    $0x8,%esp
 80483f1:   59                      pop    %ecx
 80483f2:   5d                      pop    %ebp
 80483f3:   8d 61 fc                lea    -0x4(%ecx),%esp
 80483f6:   c3                      ret
...
Another way to view the compiled assembly code is gcc -S main.c, which generates only the assembly file main.s and no binary object file.
The whole program simply has main call foo and foo call bar. We trace the program in gdb until the statement int e = c + d; in bar has finished executing and the function is about to return, and then examine the function stack frames in gdb.
(gdb) start
...
main () at main.c:14
14          foo(2, 3);
(gdb) s
foo (a=2, b=3) at main.c:9
9           return bar(a, b);
(gdb) s
bar (c=2, d=3) at main.c:3
3           int e = c + d;
(gdb) disassemble
Dump of assembler code for function bar:
0x08048394 <bar+0>:     push   %ebp
0x08048395 <bar+1>:     mov    %esp,%ebp
0x08048397 <bar+3>:     sub    $0x10,%esp
0x0804839a <bar+6>:     mov    0xc(%ebp),%edx
0x0804839d <bar+9>:     mov    0x8(%ebp),%eax
0x080483a0 <bar+12>:    add    %edx,%eax
0x080483a2 <bar+14>:    mov    %eax,-0x4(%ebp)
0x080483a5 <bar+17>:    mov    -0x4(%ebp),%eax
0x080483a8 <bar+20>:    leave
0x080483a9 <bar+21>:    ret
End of assembler dump.
(gdb) si
0x0804839d  3           int e = c + d;
(gdb) si
0x080483a0  3           int e = c + d;
(gdb) si
0x080483a2  3           int e = c + d;
(gdb) si
4           return e;
(gdb) si
5       }
(gdb) bt
#0  bar (c=2, d=3) at main.c:5
#1  0x080483c2 in foo (a=2, b=3) at main.c:9
#2  0x080483e9 in main () at main.c:14
(gdb) info registers
eax            0x5          5
ecx            0xbff1c440   -1074674624
edx            0x3          3
ebx            0xb7fe6ff4   -1208061964
esp            0xbff1c3f4   0xbff1c3f4
ebp            0xbff1c404   0xbff1c404
esi            0x8048410    134513680
edi            0x80482e0    134513376
eip            0x80483a8    0x80483a8 <bar+20>
eflags         0x200206     [ PF IF ID ]
cs             0x73         115
ss             0x7b         123
ds             0x7b         123
es             0x7b         123
fs             0x0          0
gs             0x33         51
(gdb) x/20 $esp
0xbff1c3f4:     0x00000000      0xbff1c6f7      0xb7efbdae      0x00000005
0xbff1c404:     0xbff1c414      0x080483c2      0x00000002      0x00000003
0xbff1c414:     0xbff1c428      0x080483e9      0x00000002      0x00000003
0xbff1c424:     0xbff1c440      0xbff1c498      0xb7ea3685      0x08048410
0xbff1c434:     0x080482e0      0xbff1c498      0xb7ea3685      0x00000001
(gdb)
Several new gdb commands are used here. The disassemble command disassembles the current function; if it is followed by a function name or an address, it disassembles the specified function instead. We saw earlier that the step command steps through the program one line of source code at a time; the si command used here steps one machine instruction at a time. info registers displays the current values of all registers. In gdb a register name is written with a $ prefix: for example, p $esp prints the value of the esp register. In this example esp holds 0xbff1c3f4, so x/20 $esp displays the 20 32-bit words in memory starting at address 0xbff1c3f4.
When a program runs, the operating system allocates a stack for the process to hold function stack frames. The esp register always points to the top of the stack, and on the x86 platform the stack grows from high addresses toward low addresses. We know that each function call allocates a stack frame to hold its parameters and local variables; now we analyze in detail how that data is laid out on the stack. Based on the gdb output above, the layout is shown in the following figure [29]:
Figure 19.1. Function stack frames
In the figure, each small box represents a 4-byte memory cell. For example, the box b: 3 occupies addresses 0xbff1c420 to 0xbff1c423. The address is written on the lower edge of each box to emphasize that it is the starting address of that memory cell. We start the analysis from the main function:
    foo(2, 3);
 80483d5:   c7 44 24 04 03 00 00    movl   $0x3,0x4(%esp)
 80483dc:   00
 80483dd:   c7 04 24 02 00 00 00    movl   $0x2,(%esp)
 80483e4:   e8 c1 ff ff ff          call   80483aa <foo>
    return 0;
 80483e9:   b8 00 00 00 00          mov    $0x0,%eax
To call foo, the arguments must be prepared first. The second argument is stored at the memory location esp+4 and the first at the location esp points to, so the arguments are placed on the stack from right to left. Then the call instruction is executed, which does two things:
1. When foo returns, execution must continue with the instruction that follows the call, so call pushes the address of that instruction, 0x80483e9, onto the stack; esp is decreased by 4 and becomes 0xbff1c418.
2. It modifies the program counter eip so that execution jumps to the beginning of foo.
Now look at the assembly code of the foo function:
int foo(int a, int b)
{
 80483aa:   55                      push   %ebp
 80483ab:   89 e5                   mov    %esp,%ebp
 80483ad:   83 ec 08                sub    $0x8,%esp
The push %ebp instruction pushes the value of the ebp register onto the stack and decreases esp by 4, so esp becomes 0xbff1c414; the next instruction copies this value into ebp. Together, the two instructions save the old ebp value on the stack and then give ebp a new value. The third instruction, sub $0x8,%esp, lowers esp by another 8 bytes to reserve space in the frame. Within each function's stack frame, ebp points to the bottom of the frame and esp points to the top. During execution esp moves with every push and pop, while ebp stays fixed, so parameters and local variables are accessed at fixed offsets from ebp; for example, the parameters a and b of foo are accessed at ebp+8 and ebp+12. The following instructions therefore copy a and b onto the stack again as arguments for bar, and then push the return address and call bar:
    return bar(a, b);
 80483b0:   8b 45 0c                mov    0xc(%ebp),%eax
 80483b3:   89 44 24 04             mov    %eax,0x4(%esp)
 80483b7:   8b 45 08                mov    0x8(%ebp),%eax
 80483ba:   89 04 24                mov    %eax,(%esp)
 80483bd:   e8 d2 ff ff ff          call   8048394 <bar>
Now look at the instructions of the bar function:
int bar(int c, int d)
{
 8048394:   55                      push   %ebp
 8048395:   89 e5                   mov    %esp,%ebp
 8048397:   83 ec 10                sub    $0x10,%esp
    int e = c + d;
 804839a:   8b 55 0c                mov    0xc(%ebp),%edx
 804839d:   8b 45 08                mov    0x8(%ebp),%eax
 80483a0:   01 d0                   add    %edx,%eax
 80483a2:   89 45 fc                mov    %eax,-0x4(%ebp)
Again the function first pushes the ebp of foo onto the stack to save it and then assigns ebp a new value, which points to the bottom of bar's stack frame. The parameters c and d are accessed at ebp+8 and ebp+12, and bar also has a local variable e, accessed at ebp-4. The instructions after the prologue therefore load c and d into registers and add them, leaving the result in eax, and then store eax into the memory cell of the local variable e.
In gdb, the bt and frame commands can show the parameters and local variables of every stack frame, and now we can explain how that works. If execution is currently in bar, gdb can find bar's parameters and local variables through ebp, and can also find the saved ebp value of foo on the stack. With foo's ebp it can find foo's parameters and local variables, and also the saved ebp value of main. The stack frames of all the functions are thus linked together by the ebp values saved on the stack.
Now look at the return instructions of the bar function:
    return e;
 80483a5:   8b 45 fc                mov    -0x4(%ebp),%eax
}
 80483a8:   c9                      leave
 80483a9:   c3                      ret
bar returns an int, and the return value is passed in the eax register, so the value of e is first loaded into eax. Then the leave instruction is executed; it is the inverse of the push %ebp and mov %esp,%ebp instructions at the start of the function:
1. It assigns the value of ebp to esp, so esp becomes 0xbff1c404.
2. The top of the stack that esp now points to holds the saved ebp of foo's stack frame; this value is restored into ebp, and esp is increased by 4 to 0xbff1c408.
Last comes the ret instruction, which is the inverse of the call instruction:
1. The top of the stack that esp now points to holds the return address; this value is restored into eip, and esp is increased by 4 to 0xbff1c40c.
2. Because the program counter eip has been modified, execution jumps to the return address 0x80483c2 and continues there.
Address 0x80483c2 holds the return instructions of the foo function:
 80483c2:   c9                      leave
 80483c3:   c3                      ret
The same process repeats, and control returns to main. Note the following rules for function call and return:
1. Arguments are passed on the stack and are pushed from right to left.
2. ebp always points to the bottom of the current stack frame.
3. The return value is passed in the eax register.
These rules are not imposed by the architecture: the ebp register does not have to be used this way, and arguments and return values do not have to be passed this way. The operating system and the compiler have simply chosen to implement C function calls like this, and the choice is called the calling convention. The calling convention is part of the operating system's binary interface specification (ABI, Application Binary Interface).
Exercises
1. As described in Section 2, "User-defined functions", an old-style C function declaration does not specify the number or types of the parameters, so the compiler does not check function calls against it. What happens if a call passes arguments of the wrong type or the wrong number of arguments? For example, change the example in this section to the following:
int foo();
int bar();

int main(void)
{
    foo(2, 3, 4);
    return 0;
}

int foo(int a, int b)
{
    return bar(a);
}

int bar(int c, int d)
{
    int e = c + d;
    return e;
}
When main calls foo it passes one extra argument; what values do the parameters a and b take? What if even more arguments are passed? When foo calls bar it passes one argument too few; where does the value of the parameter d come from? Use disassembly and gdb to analyze this yourself. Here is another example, this time with mismatched argument types:
#include <stdio.h>

int main(void)
{
    void foo();
    char c = 60;
    foo(c);
    return 0;
}

void foo(double d)
{
    printf("%f\n", d);
}
What is printed? If the declaration void foo(); is changed to void foo(double);, what is printed then?
--------------------------------------------------------------------------------
[29] The Linux kernel gives each new process a slightly different starting address for its stack, so the addresses you see each time you run this program will differ, but they usually have the form 0xbf??????.
--------------------------------------------------------------------------------
This article is from a CSDN blog; please credit the source when reposting: http://blog.csdn.net/unbutun/archive/2010/12/02/6051184.aspx