[Relationship between assembly and C] 2. The main function and the startup routine

Source: Internet
Author: User

Why is the entry point of an assembly program _start, while the entry point of a C program is the main function? This section explains why.

In the article "x86 Assembly Program Basics (AT&T syntax)", we assembled and linked in these steps:

$ as hello.s -o hello.o
$ ld hello.o -o hello

When we compile a C program with gcc main.c -o main, the work is actually split into three steps: compile, assemble, and link:

$ gcc -S main.c      Generate assembly code (main.s)
$ gcc -c main.s      Generate the object file (main.o)
$ gcc main.o         Generate the executable file

The object file hello.o that we generated from our first assembly program in "x86 Assembly Program Basics (AT&T syntax)" was linked with ld. Can it be linked with gcc instead? Trying it produces errors.

  

Two errors are reported: 1. _start has multiple definitions: one is in our assembly code, and the other comes from /usr/lib/crt1.o; 2. the _start function in crt1.o calls the main function, but our assembly code does not define main. The last line of the output shows that these errors are reported by ld. So when we link with gcc, gcc actually calls ld to link the object file crt1.o together with the hello.o we wrote.

If the object file was produced by compiling a C program, linking with gcc is the right thing to do. The entry point of the whole program is the _start provided by crt1.o, which first does some initialization (hereafter called the startup routine) and then calls the main function provided by the C code. So _start is the real entry point, and main is called by _start. Continuing the example from the previous article, "[Relationship between assembly and C] 1. Function calls": the link step gcc main.o -o main actually calls ld, and is equivalent to this command:

$ ld /usr/lib/crt1.o /usr/lib/crti.o main.o -o main -lc -dynamic-linker /lib/ld-linux.so.2

Besides crt1.o there is also crti.o; these two object files are linked together with our main.o to generate the executable main. -lc means that the libc library must be linked. For gcc, -lc is a default and need not be written, but for ld it is not a default, so it must be given explicitly. -dynamic-linker /lib/ld-linux.so.2 specifies that the dynamic linker is /lib/ld-linux.so.2.

We can use readelf to view the contents of crt1.o and crti.o. Here we only care about the symbol table. To see only the symbol table, use the -s option of the readelf command, or use the nm command.

  

$ nm /usr/lib/crt1.o
00000000 R _IO_stdin_used
00000000 D __data_start
         U __libc_csu_fini
         U __libc_csu_init
         U __libc_start_main
00000000 R _fp_hw
00000000 T _start
00000000 W data_start
         U main
$ nm /usr/lib/crti.o
         U _GLOBAL_OFFSET_TABLE_
         w __gmon_start__
00000000 T _fini
00000000 T _init

The line U main means that the symbol main is used in crt1.o but not defined there (U means undefined), so another object file must provide the definition and be linked with crt1.o. Specifically, crt1.o needs the address that the symbol main stands for. For example, it contains an instruction that pushes the address of main, but the address is not yet known, so in crt1.o the instruction is temporarily written as push $0x0. Once crt1.o is linked with main.o into an executable and the address is known, say 0x80483c4, the linker changes this instruction in the executable main to push $0x80483c4. Here the linker is performing symbol resolution. The linker also performs relocation. Because both tasks modify object files, the linker is itself a kind of editor: vi and similar editors edit source files, while the linker edits object files, which is why it is also called the link editor.

The line T _start means that the symbol _start is defined in crt1.o and that the symbol's type is code (T stands for text). We pick several symbols from the output above to illustrate the relationships among them:

  

The ld command we wrote above is simplified. gcc uses several other object files during linking, so the figure has an extra box indicating that, besides crt1.o, crti.o, and main.o, other object files also make up the executable main. Use the -v option of gcc to see the compilation process in more detail.
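The patching of push $0x0 into push $0x80483c4 described earlier can be sketched in C. This is only an illustration of the idea, not how ld is implemented: the helper names patch_abs32, read_abs32, and resolve_push_main are hypothetical, and the sketch assumes the 5-byte x86 encoding of push with a 32-bit immediate (opcode 0x68 followed by the little-endian operand).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a real linker API): store a 32-bit absolute
 * address, little-endian, into the operand field at code[offset]. */
void patch_abs32(unsigned char *code, size_t offset, uint32_t addr)
{
    assert(code != NULL);
    for (int i = 0; i < 4; i++)
        code[offset + i] = (unsigned char)(addr >> (8 * i));
}

/* Read the 32-bit little-endian operand back so it can be inspected. */
uint32_t read_abs32(const unsigned char *code, size_t offset)
{
    assert(code != NULL);
    uint32_t value = 0;
    for (int i = 0; i < 4; i++)
        value |= (uint32_t)code[offset + i] << (8 * i);
    return value;
}

/* Start from "push $0x0" as it might appear in crt1.o, then fill in
 * the address of main once it is known. Returns the patched operand. */
uint32_t resolve_push_main(uint32_t main_addr)
{
    /* 0x68 is the x86 opcode for push imm32; the operand is a placeholder. */
    unsigned char insn[5] = { 0x68, 0x00, 0x00, 0x00, 0x00 };
    patch_abs32(insn, 1, main_addr);  /* operand starts after the opcode byte */
    return read_abs32(insn, 1);
}
```

Calling resolve_push_main(0x080483c4) leaves the operand bytes as c4 83 04 08, which is the little-endian form of the address used in the text.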

The executable produced by linking contains the symbols defined by each object file, and the definitions of these symbols can be seen by disassembling it:

The symbol main, undefined in crt1.o, is defined in main.o, so linking them together causes no problem. crt1.o has another undefined symbol, __libc_start_main, which is not defined in any of the other object files, so it remains undefined in the executable main. This symbol is defined in libc. libc is not linked into the executable main the way the other object files are; instead it is dynamically linked at run time:

1. When the operating system loads and executes the main program, it first checks whether the program has undefined symbols that need dynamic linking.

2. If dynamic linking is needed, it checks which shared libraries the program specifies (we specified libc with -lc) and which dynamic linker to use (we specified it with -dynamic-linker /lib/ld-linux.so.2).

3. The dynamic linker looks up the definitions of these symbols in the shared libraries and completes the linking.

With this background, let's look at the disassembly of _start:

  

First a series of arguments is pushed onto the stack, and then the libc function __libc_start_main is called to do the initialization. The last argument pushed, push $0x80483c4, is the address of the main function; __libc_start_main calls main once initialization is complete. Since __libc_start_main has to be dynamically linked, the instructions of this library function cannot be found in the disassembly of the executable main. However, we do find this:

  

At first glance it looks as if libc had been linked in, but that is not the case. These three instructions are in the .plt section, not the .text section, and the .plt section assists in the dynamic linking process.

The prototype of the main function is int main(int argc, char *argv[]), which means that the startup routine passes two arguments to main.

Since main is called by the startup routine, returning from main returns to the startup routine, and the startup routine receives the return value of main. If the startup routine were expressed as equivalent C code (in practice it is usually written directly in assembly), it would call main in this form:

exit(main(argc, argv));

That is, as soon as the startup routine gets the return value of main, it calls the exit function with that value as the argument. exit is also a function in libc. It first does some cleanup and then calls the _exit system call to terminate the process, so the return value of main is ultimately passed to _exit and becomes the exit status of the process. We can also call exit directly inside main to terminate the process without returning to the startup routine.
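The difference between exit, which does cleanup first, and _exit, which terminates immediately, can be observed with a small experiment. The sketch below is an illustration under assumptions: it treats a handler registered with atexit as a stand-in for the "cleanup work", runs the test in a child process created with fork, and uses a pipe so the parent can see whether the handler ran. The names record_cleanup and cleanup_ran are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int pipe_write_fd = -1;

/* Cleanup handler: writes one byte to the pipe as evidence that it ran. */
static void record_cleanup(void)
{
    assert(pipe_write_fd >= 0);
    char mark = 'c';
    ssize_t n = write(pipe_write_fd, &mark, 1);
    (void)n;
}

/* Returns 1 if the atexit handler ran in the child, 0 if it did not,
 * and -1 on error. If use_exit is nonzero the child terminates with
 * exit(); otherwise it terminates with _exit(). */
int cleanup_ran(int use_exit)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: register the handler, then terminate one way or the other. */
        close(fds[0]);
        pipe_write_fd = fds[1];
        if (atexit(record_cleanup) != 0)
            _exit(2);
        if (use_exit)
            exit(0);   /* runs handlers registered with atexit, then terminates */
        _exit(0);      /* terminates at once, skipping those handlers */
    }

    /* Parent: one byte means the handler ran; end-of-file means it did not. */
    close(fds[1]);
    char mark = 0;
    ssize_t got = read(fds[0], &mark, 1);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return got == 1 ? 1 : 0;
}
```

With this sketch, cleanup_ran(1) reports that the handler ran and cleanup_ran(0) reports that it did not, which matches the description of exit doing cleanup before the process ends.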

Note that the exit status is only 8 bits wide and is interpreted by the shell as an unsigned number. If a program calls exit(-1); or its main function executes return -1;, then echo $? prints 255.
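Both the exit(main(argc, argv)) form and the 8-bit exit status can be tried from C without a shell. The following is a minimal sketch, not the real startup routine: run_like_startup is a hypothetical helper that forks a child, calls a stand-in for main in the same form the startup routine would, and returns the status the parent observes through waitpid. The two main_returning_* functions are likewise made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*main_fn)(int argc, char *argv[]);

/* Hypothetical stand-ins for a program's main function. */
int main_returning_minus_one(int argc, char *argv[])
{
    (void)argc;
    (void)argv;
    return -1;
}

int main_returning_argc(int argc, char *argv[])
{
    (void)argv;
    return argc;
}

/* Run entry in a child process in the same form the startup routine
 * uses for main, and return the exit status seen by the parent.
 * Returns -1 if the child could not be created or did not exit normally. */
int run_like_startup(main_fn entry, int argc, char *argv[])
{
    assert(entry != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        exit(entry(argc, argv));   /* same shape as exit(main(argc, argv)) */

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);    /* only the low 8 bits of the status survive */
}
```

Passing main_returning_minus_one yields 255 in the parent, the same value echo $? would show, while main_returning_argc with argc equal to 2 yields 2.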

Using the exit function requires including the header file stdlib.h, and using the _exit function requires including the header file unistd.h.
