is done by the compiler driver:
unix> gcc -o hello hello.c
Here, the GCC compiler driver reads the source file hello.c,
#include <stdio.h>
int main()
{
    printf("Hello, world\n");
    return 0;
}
and translates it into the executable object file hello. This process is divided into four stages. The programs that carry out these four stages (preprocessor, compiler, assembler, and linker) together make up the compilation system.
Preprocessing phase: The preprocessor (cpp) modifies the original C program according to the directives that begin with the character #. For example, the #include <stdio.h> directive on the first line of hello.c tells the preprocessor to read the contents of the system header file stdio.h and insert them directly into the program text. The result is another C program, usually with the file name extension .i.
Compilation phase: The compiler (cc1) translates the text file hello.i into the text file hello.s, which contains an assembly language program. Each statement in the assembly language program describes one low-level machine language instruction exactly, in a standard text form. Assembly language provides a common output language for different compilers for different high-level languages.
Assembly phase: The assembler (as) translates hello.s into machine language instructions, packages them in a format known as a relocatable object program, and saves the result in the object file hello.o. The hello.o file is a binary file whose bytes encode machine language instructions rather than characters. If you open it with a text editor, you will see what looks like gibberish.
Linking phase: The hello program calls the printf function, which is part of the standard C library provided by every C compiler. The printf function resides in a separate precompiled object file called printf.o, which must somehow be merged with hello.o. The linker (ld) handles this merging. The result is the hello file, an executable object file (or simply an executable). Once the executable has been loaded into memory, the system is responsible for executing it.
Original address: http://blog.csdn.net/lychee007/article/details/4123130
———————————————————————————————————————————————————————————————————————————————
GCC compilation process
The usual workflow of a modern compiler:
Source file --> preprocess --> compile/optimize --> assemble --> link --> executable file
For GCC:
Step 1: Preprocessing
Command: gcc -o test.i -E test.c
or cpp -o test.i test.c (cpp here does not mean C plus plus; it is the C preprocessor)
Result: generates the preprocessed file test.i (open it and compare it with the source before preprocessing; the length may startle you)
Role: reads the C source program and processes its pseudo-instructions (directives) and special symbols. This covers macros, conditional compilation, header file inclusion, and some special symbols. It is basically a text replacement process.
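To make the "replacement" idea concrete, here is a minimal sketch of the three kinds of directive mentioned above. All of the names (BUFFER_SIZE, SQUARE, VERBOSE, MODE and the three functions) are made up for illustration and do not come from the original article.

```c
#include <assert.h>   /* header inclusion: the file's text is pasted in here */

#define BUFFER_SIZE 64           /* object-like macro: plain text replacement */
#define SQUARE(x) ((x) * (x))    /* function-like macro: the argument is substituted */

#ifdef VERBOSE                   /* conditional compilation: only one branch survives */
#define MODE "verbose"
#else
#define MODE "quiet"
#endif

int square_of_next(int n)
{
    /* assert is itself a macro; when built with -DNDEBUG it expands to a no-op. */
    assert(n < BUFFER_SIZE);
    /* After preprocessing, the next line reads: return ((n + 1) * (n + 1)); */
    return SQUARE(n + 1);
}

int buffer_size(void)
{
    /* After preprocessing: return 64; */
    return BUFFER_SIZE;
}

const char *mode_name(void)
{
    /* After preprocessing: return "quiet"; unless VERBOSE was defined (for example with -DVERBOSE). */
    return MODE;
}
```

If this were saved as a file and run through gcc -E, the output should contain the expanded bodies described in the comments, preceded by the (long) contents of assert.h. The functions have no main of their own; they are meant to be called from a small test program.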
Step 2: Compilation and optimization
Command: gcc -o test.s -S test.i
or /path/cc1 -o test.s test.i
Result: generates the assembly file test.s (open it to see the assembly code generated from the source file)
Role: lexical and syntax analysis confirm that all statements conform to the grammar rules (otherwise a compilation error is reported). The code is then translated into the corresponding intermediate code, which in GCC is called RTL (Register Transfer Language) and is largely platform-independent; this part of the process is known as the compiler front end. The compiler back end trims and optimizes the RTL tree to produce assembly code that can run on the target machine. GCC uses as as its assembler, so the assembly code is in AT&T syntax rather than Intel syntax; inline assembly compiled with GCC therefore also uses AT&T syntax.
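As a small illustration of the AT&T syntax point, the sketch below adds two integers with GCC inline assembly. It assumes an x86 or x86-64 target and falls back to plain C elsewhere; add_att and check_add_att are hypothetical names chosen for this example.

```c
#include <assert.h>

int add_att(int a, int b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* AT&T operand order is "source, destination", so this computes a = a + b.
       In Intel syntax the same instruction would put the destination first. */
    __asm__("addl %1, %0"
            : "+r"(a)    /* %0: register operand, read and written */
            : "r"(b)     /* %1: register operand, read only */
            : "cc");     /* the add changes the condition flags */
#else
    a += b;              /* portable fallback on other targets */
#endif
    return a;
}

/* Quick self-check that the result matches ordinary C addition. */
void check_add_att(void)
{
    assert(add_att(2, 3) == 5);
}
```

Compiling a file like this with gcc -S and looking at the resulting .s file should show the addl line surrounded by compiler-generated code in the same AT&T style (register names prefixed with %, source operand before destination).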
Step 3: Assembly
Command: gcc -o test.o -c test.s
or as -o test.o test.s
Result: generates the target machine instruction file test.o (viewable with objdump)
Role: translates the assembly language code into target machine instructions. Running file test.o shows that test.o is a relocatable ELF file, which usually contains sections such as .text and .rodata, that is, code and data sections. You can use readelf -r test.o to view the entries that need to be relocated.
Step 4: Linking
Command: gcc -o test test.o
or ld -o test test.o (some parameters omitted)
Result: generates the executable file test (viewable with objdump)
Role: a symbol referenced in one file is connected to the definition of that symbol in another file, so that all of these object files are joined into one executable that the operating system can load into memory. (If a symbol has no definition, or has duplicate definitions, and so on, the linker reports an error.) Running file test shows that test is an executable ELF file.
Of course, linking also makes use of static libraries and dynamic libraries. Both static and dynamic libraries are collections of .o object files.
Static libraries:
Command: ar -v -q test.a test.o
Result: generates the static library test.a
Role: with a static library, the relevant code is extracted into the executable during linking (that is, the code of a function is copied from its place in the static library into the final executable). ar simply packs a number of files into one file. It can pack and, of course, unpack.
Dynamic libraries:
Command: gcc -shared -o test.so test.o
or /path/collect2 -shared -o test.so test.o (some parameters omitted)
Result: generates the dynamic library test.so
Role: with a dynamic library, only some symbol table entries are created at link time; the library code is loaded into memory at run time and mapped into the virtual address space of the corresponding process. If something goes wrong, for example the required .so file cannot be found, a dynamic linking error is reported at execution time (the search path can be specified with LD_LIBRARY_PATH). Running file test.so shows that test.so is a shared object ELF file.
Of course, the steps above can be carried out one at a time or several at once; for example, gcc -o test test.c goes straight to the executable file.
Appendix:
ELF file format
The ELF file format is part of the ABI (Application Binary Interface) and was adopted by the Tool Interface Standards (TIS) committee as a portable object file format for 32-bit Intel architectures. The format is fairly complex and is not described in detail here; only its file types are covered.
The types defined in version 1.1 of the specification, which are the values of e_type in the ELF header:
Name        Value    Meaning
====        =====    =======
ET_NONE     0        No file type
ET_REL      1        Relocatable file
ET_EXEC     2        Executable file
ET_DYN      3        Shared object file
ET_CORE     4        Core file
ET_LOPROC   0xff00   Processor-specific
ET_HIPROC   0xffff   Processor-specific
There are four main types:
1. A relocatable file holds code and data and is linked with other object files to create an executable file or a shared object file. (This is the familiar .o file.)
2. An executable file holds a program ready for execution; the system's exec() can load it to create a process. (This is what we usually call an executable.)
3. A shared object file holds code and data suitable for linking in two contexts. First, the link editor, such as ld, can process it together with other relocatable or shared object files to create another object file. Second, the dynamic linker combines it with an executable file or other shared object files to create a process image. (This is the dynamic link library, the familiar .so file.)
4. The contents of a core file are not defined by the specification; it is currently used to record core dump information.
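The e_type field can be read straight from the first bytes of a file. The sketch below assumes the standard ELF header layout (a 16-byte e_ident array followed by the 2-byte e_type, with e_ident[5] giving the byte order); elf_file_type and elf_type_name are hypothetical helper names, not functions from any library.

```c
#include <assert.h>
#include <stddef.h>

/* e_type values, as listed in the table above. */
enum { ET_NONE = 0, ET_REL = 1, ET_EXEC = 2, ET_DYN = 3, ET_CORE = 4 };

/* Returns e_type from the start of an ELF file, or -1 if the buffer is too
   short, lacks the ELF magic number, or has an unknown byte order. */
int elf_file_type(const unsigned char *hdr, size_t len)
{
    assert(hdr != NULL);
    if (len < 18)
        return -1;
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return -1;
    if (hdr[5] == 1)                          /* little-endian encoding */
        return hdr[16] | (hdr[17] << 8);
    if (hdr[5] == 2)                          /* big-endian encoding */
        return (hdr[16] << 8) | hdr[17];
    return -1;
}

/* Maps an e_type value to the meaning given in the table. */
const char *elf_type_name(int e_type)
{
    switch (e_type) {
    case ET_NONE: return "no file type";
    case ET_REL:  return "relocatable file";
    case ET_EXEC: return "executable file";
    case ET_DYN:  return "shared object file";
    case ET_CORE: return "core file";
    default:
        if (e_type >= 0xff00 && e_type <= 0xffff)
            return "processor-specific";
        return "unknown";
    }
}
```

To try it on a real file, read the first 18 bytes with fread and pass them to elf_file_type. On many current Linux distributions gcc builds position-independent executables by default, so an ordinary program may report ET_DYN rather than ET_EXEC.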
Original address: http://blog.csdn.net/lychee007/article/details/4123018
On UNIX systems, the compilation process from source file to object file to executable file