http://learn.akae.cn/media/ch19s02.html
2. The main Function and the Startup Routine
Why is the entry point of an assembly program _start, while the entry point of a C program is the main function? This section answers that question. In Example 18.1 "The simplest assembly program", our assemble and link steps were:
$ as hello.s -o hello.o
$ ld hello.o -o hello
Until now we have compiled C programs with the single command gcc main.c -o main. The same job can be done in three steps: the first generates assembly code, the second generates an object file, and the third generates the executable:
$ gcc -S main.c
$ gcc -c main.s
$ gcc main.o
The -S option generates assembly code and the -c option generates an object file. There is also the -E option, described in Section 2 "Array application example: counting random numbers", which only preprocesses the source without compiling it. If none of these options is given, gcc carries out every step through the final link and produces the executable, as shown in Figure 19.2.
Figure 19.2. GCC Command Options
All of these options can be combined with -o to rename the output file instead of using gcc's default file names (xxx.c, xxx.s, xxx.o, and a.out). For example, gcc main.o -o main links main.o into an executable named main. Earlier we used ld to link hello.o, the object file generated from the assembly code of Example 18.1 "The simplest assembly program". Could we link it with gcc instead? Let's try.
$ gcc hello.o -o hello
hello.o: In function `_start':
(.text+0x0): multiple definition of `_start'
/usr/lib/gcc/i486-linux-gnu/4.3.2/../../../../lib/crt1.o:(.text+0x0): first defined here
/usr/lib/gcc/i486-linux-gnu/4.3.2/../../../../lib/crt1.o: In function `_start':
(.text+0x18): undefined reference to `main'
collect2: ld returned 1 exit status
Two errors are reported. First, _start has multiple definitions: one is provided by our assembly code and the other comes from /usr/lib/crt1.o. Second, the _start function in crt1.o calls main, but our assembly code does not define main. The last line shows that these error messages come from ld. So when we link with gcc, gcc actually invokes ld to link the object file crt1.o together with our hello.o. crt1.o already provides the _start entry point, and our assembly program implements another _start, so the symbol is defined more than once. The linker does not know which one to use and reports an error. In addition, the _start provided by crt1.o needs to call main, which our assembly program does not implement, so a second error is reported.
If the object file was compiled from C code, linking it with gcc works. The entry point of the whole program is the _start provided by crt1.o, which first does some initialization (referred to below as the startup routine) and then calls main. So our earlier statement that main is the entry point of the program was not accurate: _start is the real entry point, and main is called by _start.
Let us continue with Example 19.1 "Studying the function call process" from the previous section. If we compile in two steps, the second step, gcc main.o -o main, actually invokes ld to do the linking and is equivalent to the following command:
$ ld /usr/lib/crt1.o /usr/lib/crti.o main.o -o main -lc -dynamic-linker /lib/ld-linux.so.2
That is, besides crt1.o there is also crti.o, and these two object files are linked with our main.o to produce the executable main. -lc says that the libc library must be linked. As mentioned in Section 1 "Mathematical functions", -lc is a default option of gcc and need not be written, but it is not a default for ld and must be given explicitly. -dynamic-linker /lib/ld-linux.so.2 specifies /lib/ld-linux.so.2 as the dynamic linker, which is explained later.
So what is in crt1.o and crti.o? We can look with the readelf command. Here we only care about the symbol table. To view just the symbol table, use readelf with the -s option, or use the nm command.
$ nm /usr/lib/crt1.o
00000000 R _IO_stdin_used
00000000 D __data_start
U __libc_csu_fini
U __libc_csu_init
U __libc_start_main
00000000 R _fp_hw
00000000 T _start
00000000 W data_start
U main
$ nm /usr/lib/crti.o
U _GLOBAL_OFFSET_TABLE_
w __gmon_start__
00000000 T _fini
00000000 T _init
The line U main indicates that the symbol main is used in crt1.o but not defined there (U stands for undefined), so another object file must supply the definition and be linked with crt1.o. Specifically, crt1.o needs the address that the symbol main represents. For example, it contains an instruction of the form push $(address of the symbol main), but the address is not yet known, so in crt1.o the instruction is written for now as push $0x0. When crt1.o is linked with main.o into an executable, the address becomes known, say 0x80483c4, and the linker changes this instruction in the executable main to push $0x80483c4. Here the linker performs symbol resolution. In Section 5.2 "Executable files" we saw that the linker also performs relocation. Both jobs are done by modifying addresses in instructions, so the linker is also a kind of editor: vi and emacs edit source files, while the linker edits object files. That is why the linker is also called the link editor. The line T _start indicates that the symbol _start is defined in crt1.o and that its type is code (T stands for text). We pick a few symbols from the output above and illustrate their relationships in a diagram:
Figure 19.3. Linking process of a C program
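The symbol-resolution step described above can be sketched in C. This is only an illustration, not how ld is implemented: the helper name patch_push_imm32 is hypothetical, and it assumes the instruction is the 5-byte x86 push imm32 form (opcode 0x68 followed by a little-endian 32-bit immediate), which matches the bytes 68 c4 83 04 08 seen in the disassembly of _start.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: overwrite the 32-bit immediate of an x86
 * "push imm32" instruction (opcode 0x68) with a resolved address,
 * stored little-endian. Returns 0 on success, or -1 if the bytes
 * at insn are not a complete push imm32 instruction. */
int patch_push_imm32(uint8_t *insn, size_t len, uint32_t addr)
{
    if (len < 5 || insn[0] != 0x68)
        return -1;
    insn[1] = (uint8_t)(addr & 0xff);
    insn[2] = (uint8_t)((addr >> 8) & 0xff);
    insn[3] = (uint8_t)((addr >> 16) & 0xff);
    insn[4] = (uint8_t)((addr >> 24) & 0xff);
    return 0;
}
```

Starting from the placeholder bytes 68 00 00 00 00 (push $0x0) and patching in 0x80483c4 yields 68 c4 83 04 08, that is, push $0x80483c4.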
In fact, the ld command we wrote above is much simplified, and gcc uses several other object files when linking. That is why the figure draws the executable main with an extra box: besides main.o, crt1.o, and crti.o there are other object files, which this book does not discuss further. The -v option of gcc shows the details of the compilation process:
$ gcc -v main.c -o main
Using built-in specs.
Target: i486-linux-gnu
...
/usr/lib/gcc/i486-linux-gnu/4.3.2/cc1 -quiet -v main.c -D_FORTIFY_SOURCE=2 -quiet -dumpbase main.c -mtune=generic -auxbase main -version -fstack-protector -o /tmp/ccRGDpua.s
...
as -V -Qy -o /tmp/ccidnZ1d.o /tmp/ccRGDpua.s
...
/usr/lib/gcc/i486-linux-gnu/4.3.2/collect2 --eh-frame-hdr -m elf_i386 --hash-style=both -dynamic-linker /lib/ld-linux.so.2 -o main -z relro /usr/lib/gcc/i486-linux-gnu/4.3.2/../../../../lib/crt1.o /usr/lib/gcc/i486-linux-gnu/4.3.2/../../../../lib/crti.o /usr/lib/gcc/i486-linux-gnu/4.3.2/crtbegin.o -L/usr/lib/gcc/i486-linux-gnu/4.3.2 -L/usr/lib/gcc/i486-linux-gnu/4.3.2 -L/usr/lib/gcc/i486-linux-gnu/4.3.2/../../../../lib -L/lib/../lib -L/usr/lib/../lib -L/usr/lib/gcc/i486-linux-gnu/4.3.2/../../.. /tmp/ccidnZ1d.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/i486-linux-gnu/4.3.2/crtend.o /usr/lib/gcc/i486-linux-gnu/4.3.2/../../../../lib/crtn.o
The executable main produced by the link contains the symbols defined in each of the object files. The definitions of these symbols can be seen by disassembling it:
$ objdump -d main
main: file format elf32-i386
Disassembly of section .init:
08048274 <_init>:
8048274:55 push %ebp
8048275:89 e5 mov %esp,%ebp
8048277:53 push %ebx
...
Disassembly of section .text:
080482e0 <_start>:
80482e0:31 ed xor %ebp,%ebp
80482e2:5e pop %esi
80482e3:89 e1 mov %esp,%ecx
...
08048394 <bar>:
8048394:55 push %ebp
8048395:89 e5 mov %esp,%ebp
8048397:83 ec 10 sub $0x10,%esp
...
080483aa <foo>:
80483aa:55 push %ebp
80483ab:89 e5 mov %esp,%ebp
80483ad:83 ec 08 sub $0x8,%esp
...
080483c4 <main>:
80483c4:8d 4c 24 04 lea 0x4(%esp),%ecx
80483c8:83 e4 f0 and $0xfffffff0,%esp
80483cb:ff 71 fc pushl -0x4(%ecx)
...
Disassembly of section .fini:
0804849c <_fini>:
804849c:55 push %ebp
804849d:89 e5 mov %esp,%ebp
804849f:53 push %ebx
The symbol main, which is undefined in crt1.o, is defined in main.o, so the link succeeds. crt1.o has another undefined symbol, __libc_start_main, which is not defined in any of the other object files, so it remains an undefined symbol in the executable main. This symbol is defined in libc, and libc is not linked into the executable main the way the other object files are. Instead, it is linked dynamically at run time:
When the operating system loads main for execution, it first checks whether the program has undefined symbols that need dynamic linking.
If dynamic linking is needed, it looks up which shared libraries the program specifies (we specified libc with -lc) and which dynamic linker to use (we specified it with -dynamic-linker /lib/ld-linux.so.2).
The dynamic linker looks up the definitions of these symbols in the shared libraries and completes the linking.
With these principles in mind, let us look at the disassembly of _start:
...
Disassembly of section .text:
080482e0 <_start>:
80482e0: 31 ed xor %ebp,%ebp
80482e2: 5e pop %esi
80482e3: 89 e1 mov %esp,%ecx
80482e5: 83 e4 f0 and $0xfffffff0,%esp
80482e8: 50 push %eax
80482e9: 54 push %esp
80482ea: 52 push %edx
80482eb: 68 00 84 04 08 push $0x8048400
80482f0: 68 10 84 04 08 push $0x8048410
80482f5: 51 push %ecx
80482f6: 56 push %esi
80482f7: 68 c4 83 04 08 push $0x80483c4
80482fc: e8 c3 ff ff ff call 80482c4 <__libc_start_main@plt>
...
It first pushes a series of arguments onto the stack and then calls the libc function __libc_start_main to do the initialization. The last argument pushed, push $0x80483c4, is the address of main. After __libc_start_main finishes initializing, it calls main. Because __libc_start_main must be linked dynamically, the instructions of this library function cannot be found in the disassembly of the executable main, but we do find this:
Disassembly of section .plt:
...
080482c4 <__libc_start_main@plt>:
80482c4: ff 25 04 a0 04 08 jmp *0x804a004
80482ca: 68 08 00 00 00 push $0x8
80482cf: e9 d0 ff ff ff jmp 80482a4 <_init+0x30>
These three instructions are in the .plt section rather than the .text section. The .plt section assists in dynamic linking. The next chapter describes the dynamic linking process in detail.
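As a side note on reading the listings above: the operands of call 80482c4 and jmp 80482a4 are not stored as absolute addresses. On x86, the e8 and e9 opcodes carry a 32-bit displacement relative to the address of the next instruction. The following sketch shows that arithmetic. The function name rel32_target is hypothetical and the code only decodes this one 5-byte instruction form.

```c
#include <stdint.h>

/* Target of an x86 near call/jmp with a 32-bit relative displacement
 * (opcode 0xe8 or 0xe9). The displacement is relative to the address
 * of the NEXT instruction, which is insn_addr + 5 for this encoding.
 * rel holds the four displacement bytes in little-endian order. */
uint32_t rel32_target(uint32_t insn_addr, const uint8_t rel[4])
{
    uint32_t disp = (uint32_t)rel[0]
                  | ((uint32_t)rel[1] << 8)
                  | ((uint32_t)rel[2] << 16)
                  | ((uint32_t)rel[3] << 24);
    /* Unsigned wraparound gives the same result as adding the
     * displacement as a signed 32-bit value. */
    return insn_addr + 5u + disp;
}
```

For the call at 0x80482fc with displacement bytes c3 ff ff ff (that is, -0x3d), the result is 0x8048301 - 0x3d = 0x80482c4, the address of __libc_start_main@plt.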
The most standard prototype of main is int main(int argc, char *argv[]), which means the startup routine passes two arguments to main. We will explain the meaning of these two parameters after covering pointers. The prototype we have used for main so far, int main(void), is also allowed by the C standard. If you worked carefully through the exercises in the previous section, you should know why: passing extra arguments that go unused does no harm, whereas passing too few causes problems.
Because main is called by the startup routine, a return from main returns to the startup routine, which receives main's return value. Expressed as equivalent C code (in practice the startup routine is usually written directly in assembly), the startup routine calls main like this:
exit(main(argc, argv));
That is, as soon as the startup routine gets main's return value, it passes that value as the argument to exit. exit is a libc function that first does some cleanup and then invokes the _exit system call to terminate the process. main's return value is ultimately passed to _exit and becomes the exit status of the process. We can also call exit directly inside main to terminate the process without returning to the startup routine. For example:
#include <stdlib.h>
int main(void)
{
    exit(4);
}
This has the same effect as int main(void) { return 4; }. Run the program in a shell and check its exit status:
$ ./a.out
$ echo $?
4
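The relationship between main and its caller can be sketched in plain C. This is a simplified stand-in, not the real startup code: the names entry_fn, startup_sketch, and sample_main are hypothetical, and the sketch returns the status to its caller where the real startup routine would pass it to exit.

```c
/* Type of an entry function with the standard main signature. */
typedef int (*entry_fn)(int argc, char *argv[]);

/* Hypothetical stand-in for the startup routine: it calls the entry
 * function the way the startup code eventually calls main, and returns
 * the value that the real code would hand to exit(). */
int startup_sketch(entry_fn entry, int argc, char *argv[])
{
    int status = entry(argc, argv); /* main's return value comes back here */
    return status;                  /* real startup code: exit(status); */
}

/* Stand-in for a user's main that returns 4. */
int sample_main(int argc, char *argv[])
{
    (void)argc; /* unused in this sketch */
    (void)argv;
    return 4;
}
```

Calling startup_sketch(sample_main, argc, argv) yields 4, the same value the shell reports through $? for the program above.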
By convention, an exit status of 0 means the program ran successfully and a nonzero exit status means an error occurred. Note that the exit status is only 8 bits wide and the shell interprets it as an unsigned number. If you change the code above to exit(-1); or return -1;, the result is:
$ ./a.out
$ echo $?
255
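The value 255 follows from keeping only the low 8 bits of the status and reading them as unsigned. A minimal sketch of that arithmetic, with the hypothetical helper name status_seen_by_shell (the real truncation happens in the kernel and the shell, not in user code):

```c
/* Low 8 bits of the value passed to exit() or returned from main,
 * interpreted as an unsigned number. Converting a negative int to
 * unsigned int is well defined in C (reduction modulo 2^N). */
unsigned int status_seen_by_shell(int code)
{
    return (unsigned int)code & 0xffu;
}
```

With this helper, -1 maps to 255 and 4 maps to 4. A status of 256 would map to 0, so it would be indistinguishable from success.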
Note: if a function is declared with return type int, every control-flow branch in the function must use return to specify a return value. If a return is missing, the return value is indeterminate (think about why), and the compiler usually issues a warning. However, if a branch calls exit or _exit without a following return, the compiler accepts it: that branch never gets a chance to return, so it does not matter that no return value is specified. Using exit requires including the header stdlib.h, and using _exit requires including unistd.h. Both are explained in detail later.