What does compilation actually do? (From ***.c to ***.o)

Source: Internet
Author: User

(This is my first blog post, so I am a little excited to get going .......) As we know, taking a program from source code to an executable file usually involves these steps: preprocessing -> compilation -> assembly -> linking. Compilation runs lexical analysis, syntax analysis, semantic analysis and optimization over the preprocessed file and produces the corresponding assembly code file. It is usually the core of the whole build. So what happens in this core part? Let me roll up my sleeves and walk you through it.

What does the compiler do? Most intuitively, a compiler is a tool that translates a high-level language into machine language. Let's take C as the example and explain the process ***.c -> ***.o. Assume test.c contains the following code:

    array[index] = (index + 4) * (2 + 6);

Let's see how this statement is translated into machine language. The process has five main steps. It looks long, so calm down and take your time ....

1. Lexical analysis: split the source character sequence into a series of tokens

The source code is fed to the scanner. The scanner's job is to use an algorithm similar to a finite state machine to split the source character sequence into a series of tokens. It also does some other bookkeeping: identifiers go into the symbol table, and numbers and strings go into the literal table.

(The table of tokens for the example was an image in the original post, so please bear with its absence here.)

The tokens produced by lexical analysis fall into the following kinds: keywords, identifiers, literals (numbers, strings, and so on), and special symbols (such as + - * /). Note that macro replacement and file inclusion in C are generally not done by the compiler but by a separate preprocessor. There is a ready-made program called lex that can do lexical scanning.

2. Syntax analysis: generate a syntax tree (a tree whose nodes are expressions)

The grammar parser performs syntax analysis on the tokens produced above and generates a syntax tree, using the analysis methods of context-free grammars. Simply put, the syntax tree produced by the parser is a tree whose nodes are expressions. Many things have to be sorted out in this phase, such as the meaning and precedence of operators. If something is illegal (such as mismatched parentheses or an expression that is missing an operator), the compiler reports the error during syntax analysis. This phase only analyzes the syntactic structure of expressions; it does not know whether the statement really makes sense. Syntax analysis also has a ready-made tool, yacc (Yet Another Compiler Compiler).

3. Semantic analysis: annotate the nodes of the syntax tree with their meaning

Next comes the semantic analyzer. Its job is to label every expression in the syntax tree with a type. (The annotated tree was an image in the original post.) Symbols and numbers are the smallest expressions.

The semantics a compiler can analyze are static semantics; dynamic semantics cannot be analyzed at compile time.
Static semantics: semantics that can be determined at compile time, usually including declaration and type matching and type conversion.
Dynamic semantics: semantics that can only be determined at run time. For example, using 0 as a divisor is a run-time semantic error.

4. Intermediate language generation

A modern compiler optimizes at many levels. Here we introduce the source code optimizer, which optimizes at the source code level. In our example, the expression (2 + 6) is optimized away because its value, 8, can already be determined at compile time.
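The folding of (2 + 6) into 8 can be sketched on a toy expression tree. This is only an illustration under assumptions: the Node type and the helpers new_node, num, var, add, mul, fold and example_rhs are made-up names for this post, not taken from any real compiler, and integer overflow is not checked.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node kinds for a tiny expression tree. */
typedef enum { NODE_NUM, NODE_VAR, NODE_ADD, NODE_MUL } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;              /* meaningful only when kind == NODE_NUM */
    const char *name;       /* meaningful only when kind == NODE_VAR */
    struct Node *lhs, *rhs; /* children of NODE_ADD / NODE_MUL */
} Node;

Node *new_node(NodeKind kind, int value, const char *name, Node *lhs, Node *rhs) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL); /* sketch only: give up if allocation fails */
    n->kind = kind;
    n->value = value;
    n->name = name;
    n->lhs = lhs;
    n->rhs = rhs;
    return n;
}

Node *num(int v)            { return new_node(NODE_NUM, v, NULL, NULL, NULL); }
Node *var(const char *name) { return new_node(NODE_VAR, 0, name, NULL, NULL); }
Node *add(Node *a, Node *b) { return new_node(NODE_ADD, 0, NULL, a, b); }
Node *mul(Node *a, Node *b) { return new_node(NODE_MUL, 0, NULL, a, b); }

/* Constant folding, bottom-up: when both operands of + or * are numbers,
   the operator node is rewritten in place into a single number node. */
Node *fold(Node *n) {
    if (n->kind == NODE_NUM || n->kind == NODE_VAR)
        return n; /* leaves have nothing to fold */
    n->lhs = fold(n->lhs);
    n->rhs = fold(n->rhs);
    if (n->lhs->kind == NODE_NUM && n->rhs->kind == NODE_NUM) {
        int v = (n->kind == NODE_ADD) ? n->lhs->value + n->rhs->value
                                      : n->lhs->value * n->rhs->value;
        free(n->lhs);
        free(n->rhs);
        n->kind = NODE_NUM;
        n->value = v;
        n->lhs = NULL;
        n->rhs = NULL;
    }
    return n;
}

/* Right-hand side of the example statement: (index + 4) * (2 + 6). */
Node *example_rhs(void) {
    return mul(add(var("index"), num(4)), add(num(2), num(6)));
}
```

After fold(example_rhs()), the root is still a multiplication and its left child is still index + 4, because index is unknown at compile time, while the right child has become the single number 8.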
Because optimizing the syntax tree directly is difficult, the source code optimizer usually converts the whole syntax tree into intermediate code, a sequential representation of the syntax tree that is already very close to the target code. There are many kinds of intermediate code, and it takes different forms in different compilers. Common ones include three-address code and P-code. Intermediate code also lets the compiler be split into a front end and a back end.
Front end: produces machine-independent intermediate code.
Back end: converts intermediate code into target code.

5. Target code generation and optimization (the back end starts here; everything above is the front end)

The compiler back end mainly consists of the code generator and the target code optimizer.
Code generator: converts intermediate code into target machine code. This step depends heavily on the target machine, because different machines have different word lengths, registers, integer data types and floating-point data types. For our example, a code sequence in x86 assembly might be generated. (The assembly listing was an image in the original post.)
Target code optimizer: optimizes that target code, for example by choosing suitable addressing modes, replacing multiplication with shifts, and deleting redundant instructions. Our example might be optimized further in this way. (This listing was also an image in the original post.)

------ I am a divider line ------

Well, after all that, the source code has finally become target code. One problem remains: the addresses of index and array are not yet determined. If we use the assembler to turn the target code into instructions that can really run on a machine, where do these two addresses come from? If index and array are defined in the same compilation unit as the code above, the compiler can allocate space for them and determine their addresses. But what if they are defined in another module? That's a long story ......

Some remarks from the book (to help understanding):
(1) A modern compiler compiles a source file into an unlinked object file, and the linker finally links these object files into an executable file.
(2) The assembler converts assembly code into instructions the machine can execute. Almost every assembly statement corresponds to one machine instruction.
(3) So the assembler's job is simple compared with the compiler's. It has no complex syntax, no semantics and no instruction optimization; it only translates one by one according to a lookup table of assembly instructions and machine instructions.
(4) The object file is what comes out directly after preprocessing, compilation and assembly.

Reference: "The Programmer's Self-Cultivation: Linking, Loading and Libraries", pp. 41-48 (in fact this post is excerpted from it and tidied up a bit, haha)
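To make the three-address code mentioned in step 4 a little more concrete, here is a minimal sketch that walks an expression tree and writes one instruction per operator into a text buffer. Everything in it is an assumption made for illustration: the Expr and Emitter types, the functions emit_expr and emit_assign, and the t1, t2, ... naming of temporaries are hypothetical and do not reproduce the listing in the book or the output of any real compiler. No constant folding is applied here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical expression node: an operator with two children, or a leaf. */
typedef struct Expr {
    char op;                      /* '+' or '*' for operators, 0 for a leaf */
    const char *leaf;             /* operand text when op == 0, e.g. "4" */
    const struct Expr *lhs, *rhs; /* children when op != 0 */
} Expr;

/* Collects the emitted code and hands out temporary names. */
typedef struct {
    char text[256]; /* emitted instructions, one per line */
    int next_temp;  /* number of the last temporary handed out */
} Emitter;

/* Emits code for e in post-order (operands first, then the operator) and
   stores in result the name of the operand that holds the value of e. */
void emit_expr(Emitter *em, const Expr *e, char *result, size_t size) {
    assert(em != NULL && e != NULL);
    if (e->op == 0) {
        snprintf(result, size, "%s", e->leaf); /* a leaf is its own operand */
        return;
    }
    char a[16], b[16];
    emit_expr(em, e->lhs, a, sizeof a);
    emit_expr(em, e->rhs, b, sizeof b);
    snprintf(result, size, "t%d", ++em->next_temp);
    size_t used = strlen(em->text);
    snprintf(em->text + used, sizeof em->text - used,
             "%s = %s %c %s\n", result, a, e->op, b);
}

/* Emits code for the assignment "target = e". */
void emit_assign(Emitter *em, const char *target, const Expr *e) {
    char value[16];
    emit_expr(em, e, value, sizeof value);
    size_t used = strlen(em->text);
    snprintf(em->text + used, sizeof em->text - used,
             "%s = %s\n", target, value);
}
```

With a tree built for (index + 4) * (2 + 6) and the target "array[index]", the buffer ends up holding four lines: t1 = index + 4, t2 = 2 + 6, t3 = t1 * t2 and array[index] = t3. Each line has at most one operator, which is what makes this form easy for the later optimization steps to work on.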

 
