Analysis of the C program compilation process
A few days ago, I read chapter 2, "Compilation and Linking", of the book "Programmer's Self-Cultivation: Linking, Loading, and Libraries". Here I summarize the C program compilation process based on that chapter.
I usually use gcc these days, so it is natural to use GCC compiling a hello world program as the example. A brief summary follows.
The source code of hello.c is as follows:
    #include <stdio.h>

    int main(void)
    {
        printf("Hello, world.\n");
        return 0;
    }
We usually use gcc to generate an executable program. The command is gcc hello.c, and by default it produces the executable file a.out.
In fact, the command gcc hello.c (compilation including linking) can be divided into the following four major steps:
- Preprocessing
- Compilation
- Assembly
- Linking
(Figure: GCC compilation process)
1. Preprocessing
Preprocessing mainly consists of the following:
- Delete all #define directives and expand all macro definitions.
- Process all conditional compilation directives, such as #if, #ifdef, #elif, #else, and #endif.
- Process #include directives by inserting the included file at the location of the directive.
- Delete all comments, both "//" and "/* */".
- Add line numbers and file identifiers, so that the compiler can produce line numbers for debugging and for error and warning messages.
- Keep all #pragma directives, because the compiler needs them.
The following command is usually used for preprocessing:
$ gcc -E hello.c -o hello.i
The -E option tells gcc to perform preprocessing only. Alternatively, the C preprocessor cpp can be invoked directly:
$ cpp hello.c > hello.i
Run cat hello.i to see the preprocessed code.
2. Compilation
Compilation performs lexical analysis, syntax analysis, semantic analysis, and optimization on the preprocessed file to generate the corresponding assembly code.
$ gcc -S hello.i -o hello.s
Or
$ /usr/lib/gcc/i486-linux-gnu/4.4/cc1 hello.c
Note: In current versions of GCC, preprocessing and compilation are combined into a single step, carried out by the tool cc1. gcc itself is really a wrapper around these background programs. Depending on its arguments it invokes the programs that do the actual work, such as the preprocessor and compiler cc1, the assembler as, and the linker ld.
The generated assembly code (hello.s) is as follows:
        .file   "hello.c"
        .section        .rodata
    .LC0:
        .string "Hello, world."
        .text
    .globl main
        .type   main, @function
    main:
        pushl   %ebp
        movl    %esp, %ebp
        andl    $-16, %esp
        subl    $16, %esp
        movl    $.LC0, (%esp)
        call    puts
        movl    $0, %eax
        leave
        ret
        .size   main, .-main
        .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
        .section        .note.GNU-stack,"",@progbits
3. Assembly
The assembler converts assembly code into instructions that the machine can execute. Almost every assembly statement corresponds to one machine instruction. Compared with compilation, assembly is relatively simple: the assembler translates the statements one by one using a table that maps assembly instructions to machine instructions.
$ gcc -c hello.c -o hello.o
Or
$ as hello.s -o hello.o
Because hello.o contains machine code, it cannot be read as ordinary text (if you open it in vi, you will see garbage characters).
4. Linking
The linker ld links the object files the program needs, together with the library files they depend on, and produces the final executable.
$ ld -static crt1.o crti.o crtbeginT.o hello.o -start-group -lgcc -lgcc_eh -lc -end-group crtend.o crtn.o
(The path names of the files are omitted.)
That is the overall compilation and linking process for hello world. So what exactly do the compiler and the linker do?
Compilation can be divided into six steps: scanning (lexical analysis), syntax analysis, semantic analysis, source code optimization, code generation, and target code optimization.
Lexical analysis: the scanner splits the character sequence of the source code into a series of tokens. The lex tool implements lexical scanning.
Syntax analysis: the parser builds a syntax tree from the tokens. The yacc tool implements syntax analysis (yacc: Yet Another Compiler Compiler).
Semantic analysis: the semantic analyzer checks static semantics (semantics that can be determined at compile time). Dynamic semantics can only be determined at runtime.
Source code optimization: the source code optimizer converts the entire syntax tree into intermediate code, which is independent of the target machine and the runtime environment. Intermediate code divides the compiler into a front end and a back end. The front end is responsible for generating machine-independent intermediate code, and the back end converts the intermediate code into code for the target machine.
Target code generation: performed by the code generator, which turns the intermediate code into target machine code.
Target code optimization: performed by the target code optimizer.
The main job of linking is to handle the references between modules so that the modules fit together correctly.
Linking mainly consists of address and storage allocation, symbol resolution, and relocation.
Linking is classified as either static or dynamic.
Static linking means that the code from static libraries is copied directly into the executable when the program is built, so the executable is relatively large.
Dynamic linking means that only some descriptive information is added at link time, and the corresponding dynamic library is loaded into memory when the program runs.
The general process of static linking is shown in the figure:
(Figure: Static linking)
References:
Programmer's Self-Cultivation: Linking, Loading, and Libraries