Main function and startup routine

Source: http://learn.akae.cn/media/ch19s02.html
Author: User

2. The main function and the startup routine

Why is the entry point of an assembly program _start, while the entry point of a C program is the main function? This section answers that question. In Example 18.1 "The simplest assembly program", our assemble and link steps were:

$ as hello.s -o hello.o
$ ld hello.o -o hello

Until now we have compiled programs with the single command gcc main.c -o main. The same work can also be done in three steps: the first generates assembly code, the second generates an object file, and the third generates the executable:

$ gcc -S main.c
$ gcc -c main.s
$ gcc main.o

The -S option generates assembly code and the -c option generates an object file. The -E option, described in Section 2 "Array application example: counting random numbers", only preprocesses the source without compiling it. If none of these options is given, gcc performs all the compilation steps through the final link that produces the executable, as shown in Figure 19.2.

Figure 19.2. GCC Command Options


All of these options can be combined with -o to name the output file instead of using gcc's default file names (xxx.s, xxx.o, and a.out). For example, gcc main.o -o main links main.o into the executable main. Earlier we used ld to link hello.o, the object file generated from the assembly code of Example 18.1 "The simplest assembly program". Can it be linked with gcc instead? Let's try.

$ gcc hello.o -o hello
hello.o: In function `_start':
(.text+0x0): multiple definition of `_start'
/usr/lib/gcc/i486-linux-gnu/4.3.2/../../../../lib/crt1.o:(.text+0x0): first defined here
/usr/lib/gcc/i486-linux-gnu/4.3.2/../../../../lib/crt1.o: In function `_start':
(.text+0x18): undefined reference to `main'
collect2: ld returned 1 exit status

Two errors are reported. First, _start has multiple definitions: one is provided by our assembly code, and the other comes from /usr/lib/crt1.o. Second, the _start function in crt1.o calls main, but our assembly code does not define main. The last line shows that these errors were reported by ld. So when we link with gcc, gcc actually invokes ld to link the object file crt1.o together with our hello.o. Since crt1.o already provides the _start entry point, implementing another _start in our assembly program creates a multiple definition; the linker cannot tell which one to use, so it reports an error. Likewise, the _start provided by crt1.o needs to call main, and our assembly program does not implement main, so the linker reports an undefined reference.

If the object file was compiled from C code, linking with gcc works. The entry point of the whole program is the _start provided by crt1.o. It first does some initialization work (called the startup routine from here on) and then calls the main function. So our earlier statement that main is the entry point of the program was not accurate: _start is the real entry point, and main is called by _start.

Let's continue with Example 19.1 "Studying the function call process" from the previous section. If it is compiled in two steps, the second step, gcc main.o -o main, actually invokes ld to do the linking and is equivalent to the following command:

$ ld /usr/lib/crt1.o /usr/lib/crti.o main.o -o main -lc -dynamic-linker /lib/ld-linux.so.2

In other words, besides crt1.o there is also crti.o, and these two object files are linked together with our main.o to produce the executable main. The -lc option says that the libc library must be linked. As mentioned in Section 1 "Mathematical functions", gcc supplies -lc by default so it need not be written, but ld does not, so here it must be given explicitly. -dynamic-linker /lib/ld-linux.so.2 specifies /lib/ld-linux.so.2 as the dynamic linker; dynamic linking is explained later.

So what is in crt1.o and crti.o? We can look with the readelf command. Here we only care about the symbol table, which can be shown with the -s option of readelf or with the nm command.

$ nm /usr/lib/crt1.o
00000000 R _IO_stdin_used
00000000 D __data_start
         U __libc_csu_fini
         U __libc_csu_init
         U __libc_start_main
00000000 R _fp_hw
00000000 T _start
00000000 W data_start
         U main
$ nm /usr/lib/crti.o
         U _GLOBAL_OFFSET_TABLE_
         w __gmon_start__
00000000 T _fini
00000000 T _init

The line U main means that the symbol main is used in crt1.o but not defined there (U stands for undefined), so another object file must provide a definition and be linked with crt1.o. Specifically, crt1.o needs the address that the symbol main represents. For example, it contains an instruction that pushes the address of main, but that address is not yet known, so in crt1.o the instruction is provisionally written as push $0x0. Only when crt1.o is linked with main.o into an executable does the address become known, for example 0x80483c4, and the linker then changes this instruction in the executable main to push $0x80483c4.

Here the linker performs symbol resolution. In Section 5.2 "Executable files" we saw the linker perform relocation. Both are implemented by modifying addresses in instructions, so the linker is also a kind of editor: vi and emacs edit source files, while the linker edits object files. This is why the linker is also called the link editor.

The line T _start means that crt1.o provides a definition of the symbol _start and that the symbol's type is code (T stands for text). The following figure picks a few symbols from the output above to illustrate the relationships between them:

Figure 19.3. Linking process of a C program


In fact, the ld command we wrote above is greatly simplified. gcc uses several other object files when linking, which is why the figure above has an extra box: besides main.o, crt1.o, and crti.o, other object files also go into the executable main. This book does not discuss them in depth. The -v option of gcc shows the details of the compilation process:

$ gcc -v main.c -o main
Using built-in specs.
Target: i486-linux-gnu
...
/usr/lib/gcc/i486-linux-gnu/4.3.2/cc1 -quiet -v main.c -D_FORTIFY_SOURCE=2 -quiet -dumpbase main.c -mtune=generic -auxbase main -version -fstack-protector -o /tmp/ccRGDpua.s
...
as -V -Qy -o /tmp/ccidnZ1d.o /tmp/ccRGDpua.s
...
/usr/lib/gcc/i486-linux-gnu/4.3.2/collect2 --eh-frame-hdr -m elf_i386 --hash-style=both -dynamic-linker /lib/ld-linux.so.2 -o main -z relro /usr/lib/gcc/i486-linux-gnu/4.3.2/../../../../lib/crt1.o /usr/lib/gcc/i486-linux-gnu/4.3.2/../../../../lib/crti.o /usr/lib/gcc/i486-linux-gnu/4.3.2/crtbegin.o -L/usr/lib/gcc/i486-linux-gnu/4.3.2 -L/usr/lib/gcc/i486-linux-gnu/4.3.2 -L/usr/lib/gcc/i486-linux-gnu/4.3.2/../../../../lib -L/lib/../lib -L/usr/lib/../lib -L/usr/lib/gcc/i486-linux-gnu/4.3.2/../../.. /tmp/ccidnZ1d.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/i486-linux-gnu/4.3.2/crtend.o /usr/lib/gcc/i486-linux-gnu/4.3.2/../../../../lib/crtn.o

The executable main produced by the link contains the symbols defined in each of the object files. Their definitions can be seen by disassembling it:

$ objdump -d main
main: file format elf32-i386


Disassembly of section .init:

08048274 <_init>:
8048274: 55 push %ebp
8048275: 89 e5 mov %esp,%ebp
8048277: 53 push %ebx
...
Disassembly of section .text:

080482e0 <_start>:
80482e0: 31 ed xor %ebp,%ebp
80482e2: 5e pop %esi
80482e3: 89 e1 mov %esp,%ecx
...
08048394 <bar>:
8048394: 55 push %ebp
8048395: 89 e5 mov %esp,%ebp
8048397: 83 ec 10 sub $0x10,%esp
...
080483aa <foo>:
80483aa: 55 push %ebp
80483ab: 89 e5 mov %esp,%ebp
80483ad: 83 ec 08 sub $0x8,%esp
...
080483c4 <main>:
80483c4: 8d 4c 24 04 lea 0x4(%esp),%ecx
80483c8: 83 e4 f0 and $0xfffffff0,%esp
80483cb: ff 71 fc pushl -0x4(%ecx)
...
Disassembly of section .fini:

0804849c <_fini>:
804849c: 55 push %ebp
804849d: 89 e5 mov %esp,%ebp
804849f: 53 push %ebx

The symbol main, undefined in crt1.o, is defined in main.o, so linking them together resolves it. crt1.o has another undefined symbol, __libc_start_main, which none of the other object files define, so it remains an undefined symbol in the executable main. This symbol is defined in libc. Unlike the other object files, libc is not linked into the executable main; instead it is dynamically linked at run time:

  1. When the operating system loads and executes the program main, it first checks whether the program has undefined symbols that require dynamic linking.

  2. If dynamic linking is needed, it checks which shared libraries the program specifies (we specified libc with -lc) and which dynamic linker to use (we specified it with -dynamic-linker /lib/ld-linux.so.2).

  3. The dynamic linker looks up the definitions of these symbols in the shared libraries and completes the linking process.

With these principles in mind, let's look at the disassembly of _start:

...
Disassembly of section .text:

080482e0 <_start>:
80482e0: 31 ed xor %ebp,%ebp
80482e2: 5e pop %esi
80482e3: 89 e1 mov %esp,%ecx
80482e5: 83 e4 f0 and $0xfffffff0,%esp
80482e8: 50 push %eax
80482e9: 54 push %esp
80482ea: 52 push %edx
80482eb: 68 00 84 04 08 push $0x8048400
80482f0: 68 10 84 04 08 push $0x8048410
80482f5: 51 push %ecx
80482f6: 56 push %esi
80482f7: 68 c4 83 04 08 push $0x80483c4
80482fc: e8 c3 ff ff ff call 80482c4 <__libc_start_main@plt>
...

The code first pushes a series of arguments onto the stack and then calls the libc function __libc_start_main to do the initialization. The last argument pushed, push $0x80483c4, is the address of main; after finishing initialization, __libc_start_main calls main. Since __libc_start_main has to be dynamically linked, the instructions of this library function cannot be in the executable main itself. Yet the disassembly does contain this:

Disassembly of section .plt:
...
080482c4 <__libc_start_main@plt>:
80482c4: ff 25 04 a0 04 08 jmp *0x804a004
80482ca: 68 08 00 00 00 push $0x8
80482cf: e9 d0 ff ff ff jmp 80482a4 <_init+0x30>

These three instructions are in the .plt section rather than the .text section. The .plt section helps carry out dynamic linking. The next chapter explains the dynamic linking process in detail.

The most standard prototype of main is int main(int argc, char *argv[]), which means the startup routine passes two arguments to main. We will explain what these two arguments mean after covering pointers. So far we have written the prototype of main as int main(void), which the C standard also allows. If you worked through the exercises in the previous section carefully, you know why this is safe: passing extra arguments that the function does not use causes no problem, whereas using arguments that were never passed does.

Because main is called by the startup routine, a return from main goes back to the startup routine, which receives main's return value. If the startup routine were expressed as equivalent C code (in practice it is usually written directly in assembly), its call to main would look like this:

exit(main(argc, argv));

That is, as soon as the startup routine gets main's return value, it passes it as the argument to exit. exit is also a libc function: it first performs some cleanup and then calls the _exit system call to terminate the process. The return value of main is thus ultimately passed to the _exit system call and becomes the exit status of the process. We can also call exit directly in main to terminate the process without returning to the startup routine. For example:

#include <stdlib.h>

int main(void)
{
    exit(4);
}

This has the same effect as int main(void) { return 4; }. Run the program in the shell and check its exit status:

$ ./a.out 
$ echo $?
4

By convention, an exit status of 0 means the program succeeded and a nonzero exit status means an error occurred. Note that the exit status is only 8 bits wide and the shell interprets it as an unsigned number. If the code above is changed to exit(-1); or return -1;, the result is:

$ ./a.out 
$ echo $?
255

Note: if a function is declared with return type int, every branch of control flow in the function must end with a return statement that specifies the return value. If return is missing, the return value is indeterminate (think about why), and the compiler usually issues a warning. However, if a branch calls exit or _exit instead of writing return, the compiler accepts it: the branch never gets a chance to return, so it does not matter that no return value is specified. Using exit requires including the header stdlib.h, and using _exit requires including the header unistd.h. Both functions will be explained in detail later.
