[Compiler Principles] Chapter 1: Introduction

Source: Internet
Author: User

I. Language processors

1) An integrated software development environment includes many kinds of language processors, such as compilers, interpreters, assemblers, linkers, loaders, debuggers, and profilers.

2) Compiler: translates the statements of the source program into machine language and saves the result as a binary file, so that the computer can then run the program directly in machine language at high speed. Related tools include decompilers and cross compilers.

3) Interpreter: system software that directly executes programs written in another computer language. It is also a translation program, but it translates and executes at the same time instead of producing a target program first. Its execution efficiency is therefore generally low, but an interpreter is relatively simple to implement, and the high-level language in which the source program is written can use more flexible and expressive syntax.

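A machine-language target program produced by a compiler usually runs much faster than the same program executed by an interpreter. An interpreter, however, can generally give better error diagnostics than a compiler, because it executes the source program statement by statement.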

4) Assembler: a program that translates assembly language into machine language. Generally, assembly produces object code, which can be executed only after the linker has turned it into executable code.

5) Linker: links functions and global variables, so an application can be built from the intermediate object files (.o or .obj files) generated by the compiler. The linker does not care which source file a function came from, only which object file contains it. In most cases there are many source files, so compilation produces many object files, and naming every one of them explicitly at link time is inconvenient. The object files are therefore packed together. On Windows this package is called a library file (a .lib file); on UNIX it is an archive file (a .a file).

6) Loader: places executable object files into memory for execution.

7) Debugger: its working principle is based on the exception mechanism of the CPU. The operating system's exception dispatch/event dispatch subsystem (or module) wraps and handles these exceptions and interacts with the debugger in real time in a user-friendly way.

 

II. Structure of a compiler

1) A compiler maps a source program to a semantically equivalent target program. This mapping process consists of two parts: analysis and synthesis.

2) Analysis part (compiler front end): breaks the source program into its constituent parts and imposes a grammatical structure on them, then uses this structure to create an intermediate representation of the source program. If the source program is syntactically or semantically incorrect, this part must give the user useful messages. The analysis part also collects information about the source program and stores it in a data structure called the symbol table.

3) Synthesis part (compiler back end): constructs the desired target program from the intermediate representation and the information in the symbol table.

4) Lexical analysis: the first phase of compilation and the basis for the later phases. It reads the source program one character at a time from left to right, that is, it scans the character stream that makes up the source program and recognizes words (also called tokens or symbols) according to the word-formation rules. The program that performs this task is the lexical analyzer (scanner). Lexical analyzers can be generated automatically with tools such as lex.
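The left-to-right scan described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the token kinds (TOK_IDENT, TOK_NUMBER, TOK_PUNCT, TOK_EOF), the Token struct, and the function next_token are hypothetical names invented here, and the word-formation rules are reduced to identifiers, unsigned integers, and single punctuation characters.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch; real languages have many more. */
typedef enum { TOK_IDENT, TOK_NUMBER, TOK_PUNCT, TOK_EOF } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start; /* first character of the word in the input */
    size_t length;     /* number of characters in the word */
} Token;

/* Scan one token starting at *cursor, reading left to right, and
   advance *cursor past it. */
Token next_token(const char **cursor) {
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;
    while (isspace((unsigned char)*p)) p++;          /* skip blanks */

    Token tok = { TOK_EOF, p, 0 };
    if (*p == '\0') {
        /* end of input: keep TOK_EOF with length 0 */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        tok.kind = TOK_IDENT;                        /* letter (letter | digit)* */
        while (isalnum((unsigned char)*p) || *p == '_') p++;
    } else if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_NUMBER;                       /* digit+ */
        while (isdigit((unsigned char)*p)) p++;
    } else {
        tok.kind = TOK_PUNCT;                        /* any other single character */
        p++;
    }
    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}
```

Calling next_token repeatedly on a string such as "b = arr * 10;" yields one token per call until TOK_EOF is returned. A generated scanner (for example from lex) would replace the hand-written character tests with rules derived from regular expressions.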

5) Syntax analysis: a logical phase of compilation. Building on lexical analysis, it groups the token sequence into syntactic phrases such as "program", "statement", and "expression". The syntax analyzer (parser) checks whether the source program is structurally correct. The structure of the source program is described by a context-free grammar.
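As a rough illustration of checking structure against a context-free grammar, here is a recursive-descent recognizer in C. The grammar in the comment is an assumption chosen for this sketch (arithmetic expressions over unsigned integers), and every function name is hypothetical. It only answers whether the input is well formed; a real parser would also build a syntax tree.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical context-free grammar used by this sketch:
     expr   -> term   { ('+' | '-') term }
     term   -> factor { ('*' | '/') factor }
     factor -> NUMBER | '(' expr ')'                      */

static const char *skip_blanks(const char *p) {
    while (isspace((unsigned char)*p)) p++;
    return p;
}

static const char *parse_expr(const char *p);

/* Each parse_* function returns the position just after the phrase it
   recognized, or NULL if the input does not match its grammar rule. */
static const char *parse_factor(const char *p) {
    p = skip_blanks(p);
    if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p)) p++;
        return p;
    }
    if (*p == '(') {
        p = parse_expr(p + 1);
        if (p == NULL) return NULL;
        p = skip_blanks(p);
        return (*p == ')') ? p + 1 : NULL;
    }
    return NULL;
}

static const char *parse_term(const char *p) {
    p = parse_factor(p);
    while (p != NULL) {
        const char *q = skip_blanks(p);
        if (*q != '*' && *q != '/') break;
        p = parse_factor(q + 1);
    }
    return p;
}

static const char *parse_expr(const char *p) {
    p = parse_term(p);
    while (p != NULL) {
        const char *q = skip_blanks(p);
        if (*q != '+' && *q != '-') break;
        p = parse_term(q + 1);
    }
    return p;
}

/* Returns 1 if the whole string is a well-formed expression, else 0. */
int is_well_formed(const char *source) {
    assert(source != NULL);
    const char *end = parse_expr(source);
    return end != NULL && *skip_blanks(end) == '\0';
}
```

Each grammar rule becomes one function, which is why this style is popular for hand-written parsers. Tools such as yacc generate table-driven parsers from the grammar instead.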

6) Semantic analysis: a logical phase of compilation. It checks the context-sensitive properties of a structurally correct source program. Type checking is an important part of it; for example, an array subscript must be an integer.

For example, take this C program fragment:
int arr[2], b;
b = arr * 10;
The structure of the source program is correct.
Semantic analysis checks the types and reports an error: the array variable arr cannot be used as an operand in this expression, and the types on the left and right sides of the assignment do not match.
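The check in this example can be sketched as a small type checker over an expression tree. Everything here is a simplifying assumption made for illustration: the three types (TYPE_INT, TYPE_INT_ARRAY, TYPE_ERROR), the Expr node layout, and the functions type_of and assignment_is_valid are hypothetical, and the rules are much coarser than those of a real C compiler.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TYPE_INT, TYPE_INT_ARRAY, TYPE_ERROR } Type;
typedef enum { EXPR_VAR, EXPR_CONST, EXPR_MUL } ExprKind;

typedef struct Expr {
    ExprKind kind;
    Type declared_type;              /* for EXPR_VAR: taken from the declaration */
    const struct Expr *left, *right; /* for EXPR_MUL: the two operands */
} Expr;

/* Compute the type of an expression, or TYPE_ERROR if it is ill-typed.
   In this sketch, multiplication is defined only for two integers. */
Type type_of(const Expr *e) {
    assert(e != NULL);
    switch (e->kind) {
    case EXPR_CONST:
        return TYPE_INT;
    case EXPR_VAR:
        return e->declared_type;
    case EXPR_MUL:
        if (type_of(e->left) == TYPE_INT && type_of(e->right) == TYPE_INT)
            return TYPE_INT;
        return TYPE_ERROR;
    }
    return TYPE_ERROR;
}

/* An assignment is accepted here only when both sides are integers. */
int assignment_is_valid(const Expr *target, const Expr *value) {
    return type_of(target) == TYPE_INT && type_of(value) == TYPE_INT;
}
```

With arr declared as TYPE_INT_ARRAY and b as TYPE_INT, the tree for arr * 10 gets TYPE_ERROR, so the assignment to b is rejected, which matches the error described above.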

7) Intermediate code generation: after syntax analysis and semantic analysis, some compilers convert the source program into an internal representation called an intermediate language, intermediate representation, or intermediate code. Intermediate code is a simple, well-defined notation whose complexity lies between that of the source language and that of machine language, so it is easy to translate into target code. Machine-independent optimization can also be performed at the intermediate code level. This conversion is called intermediate code generation.

8) Code optimization: applies equivalence-preserving transformations to the program code. The code being transformed can be intermediate code (such as quadruples) or target code. Equivalence-preserving means that the transformed code produces the same results as the code before the transformation. Optimization means that the final target code is more efficient in time and space (shorter running time and smaller memory footprint). In principle, optimization can be performed at every phase of compilation, but the most important kind is optimization of the intermediate code, which does not depend on a specific machine.
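One simple machine-independent transformation on quadruples is constant folding: when both operands of an operation are known constants, the result is computed at compile time. The sketch below assumes a hypothetical quadruple layout (Operand, Quad) and a hypothetical function fold_constants; it handles only '+', '-', and '*' and ignores integer overflow.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operand: either an integer constant or a temporary. */
typedef struct {
    int is_const; /* 1: value is a constant; 0: value is a temporary's index */
    int value;
} Operand;

/* A quadruple: result = arg1 op arg2. */
typedef struct {
    char op;      /* '+', '-', or '*' in this sketch */
    Operand arg1;
    Operand arg2;
    int result;   /* index of the temporary that receives the result */
} Quad;

/* Constant folding: if both operands are constants, compute the result
   now. Returns 1 and stores the value in *folded on success, or 0 if
   the quadruple must still be evaluated at run time. */
int fold_constants(const Quad *q, int *folded) {
    assert(q != NULL && folded != NULL);
    if (!q->arg1.is_const || !q->arg2.is_const) return 0;
    switch (q->op) {
    case '+': *folded = q->arg1.value + q->arg2.value; return 1;
    case '-': *folded = q->arg1.value - q->arg2.value; return 1;
    case '*': *folded = q->arg1.value * q->arg2.value; return 1;
    default:  return 0; /* unknown operator: leave the quadruple alone */
    }
}
```

The transformation is equivalence-preserving in the sense described above: replacing t0 = 6 * 7 with t0 = 42 does not change the program's result, only how much work remains at run time.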

9) Code generation: the code generator takes the intermediate representation of the source program as input and maps it to the target language. If the target language is machine code, a register or memory location must be chosen for each variable used by the program, and the intermediate instructions are then translated into sequences of machine instructions that perform the same task. A crucial aspect of code generation is the judicious allocation of registers.

10) Symbol table management:

The symbol table is the table in which the compiler records, during compilation, the attribute information of the names that appear in the source program. It is also called the name attribute table.

Names: program names, procedure names, function names, user-defined type names, variable names, constant names, enumeration value names, and label names.
Attribute information: the kind, type, dimension, number of parameters, value, and target address (storage unit address) of each of these names.

Insertion: when a declaration or definition statement in the program is analyzed, the declared or defined name and its related information are entered in the symbol table.
Example: Procedure P()

Lookup: (1) check whether a name is defined more than once in the same scope of the program;
(2) check whether the type of a name is consistent with its declaration;
(3) for strongly typed languages, check whether the types of the variables in an expression are consistent;
(4) obtain the required address when generating target instructions.
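The insertion and lookup operations above can be sketched with a fixed-size array. This is a minimal illustration, not how a production compiler does it (those usually use hash tables and richer type records); the names Symbol, SymbolTable, symtab_insert, symtab_lookup, and MAX_SYMBOLS are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 64

typedef struct {
    const char *name; /* the name as written in the source program */
    const char *type; /* e.g. "int"; a real compiler stores a richer type */
    int scope;        /* nesting depth of the scope that declares the name */
    int address;      /* storage offset, needed when emitting target code */
} Symbol;

typedef struct {
    Symbol entries[MAX_SYMBOLS];
    size_t count;
} SymbolTable;

/* Insertion: returns 0 if the name is already defined in the same scope
   (a repeated definition) or the table is full, and 1 on success. */
int symtab_insert(SymbolTable *table, Symbol symbol) {
    assert(table != NULL && symbol.name != NULL);
    for (size_t i = 0; i < table->count; i++) {
        if (table->entries[i].scope == symbol.scope &&
            strcmp(table->entries[i].name, symbol.name) == 0)
            return 0;
    }
    if (table->count == MAX_SYMBOLS) return 0;
    table->entries[table->count++] = symbol;
    return 1;
}

/* Lookup: returns the most recently inserted entry for the name, so an
   inner declaration hides an outer one, or NULL if the name is unknown. */
const Symbol *symtab_lookup(const SymbolTable *table, const char *name) {
    assert(table != NULL && name != NULL);
    for (size_t i = table->count; i > 0; i--) {
        if (strcmp(table->entries[i - 1].name, name) == 0)
            return &table->entries[i - 1];
    }
    return NULL;
}
```

The return value of symtab_insert covers lookup use (1), and the type and address fields of the entry returned by symtab_lookup support uses (2) through (4). The sketch never removes entries, so leaving a scope is not modeled.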

III. Applications of compiler technology

1) When a program is compiled while it runs (dynamic compilation), compilation time is also part of the running overhead. A common technique is to compile and optimize only the program fragments that run frequently.

2) Optimization for computer architectures: parallelism (at the instruction level, several operations execute in parallel; at the processor level, different threads of the same application run in parallel)

Memory hierarchy (memory consists of several layers of storage with different speeds and sizes; the layer closest to the processor is the fastest but has the smallest capacity)

3) Design of new computer architectures

RISC (Reduced Instruction-Set Computer): reduced instruction set

CISC (Complex Instruction-Set Computer): complex instruction set, which makes assembly programming easier

ARM is a well-known company in the microprocessor industry. It has designed a large number of high-performance, low-cost, low-power RISC processors, along with related technologies and software. They suit many fields, such as embedded control, consumer and educational multimedia, DSP, and mobile applications.

x86: the general name for a family of microprocessor architectures first developed and manufactured by Intel.

IV. Programming language basics

1) Environment: a mapping from names to memory locations (variables)

State: a mapping from memory locations to their values; in C terminology, it maps l-values to r-values

2) Identifier: a string of characters that refers to an entity (a data object, a procedure, a class, or a type). All identifiers are names, but not all names are identifiers. For example, x.y is a name, but not an identifier, that denotes the field y of the structure x.

Variable: refers to a particular location in storage

3) Declaration and definition: a declaration gives the type of a name (int a;), while a definition gives its value (a = 2;).

4) Dynamic scope: a scope policy is dynamic if it depends on one or more factors that can be known only when the program executes.

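5) Parameter passing mechanisms: 1> Call by value: the value of the actual parameter is copied into the formal parameter, so changes made by the callee do not affect the caller. Passing a pointer or an array, however, lets the callee change the original data.

2> Call by reference: the address of the actual parameter is passed to the callee as the value of the formal parameter. In the callee's code, each use of the formal parameter follows this pointer to the memory location indicated by the caller, so changing the formal parameter is just like changing the actual parameter.

3> Call by name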
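The difference between passing a plain value and passing a pointer or array can be seen in a short C sketch. The three function names are hypothetical. C itself only has call by value; the pointer version shows how a pointer argument gives the effect of call by reference.

```c
#include <assert.h>
#include <stddef.h>

/* Call by value: n is a copy, so the caller's variable is unchanged. */
void increment_copy(int n) {
    n = n + 1;
    (void)n; /* the incremented copy is discarded on return */
}

/* The pointer itself is passed by value, but the callee can follow it
   to the caller's memory location and change the original variable. */
void increment_through_pointer(int *n) {
    assert(n != NULL);
    *n = *n + 1;
}

/* An array argument is passed as a pointer to its first element, so
   writes to elements are visible to the caller. */
void zero_first(int values[], size_t count) {
    assert(values != NULL);
    if (count > 0) values[0] = 0;
}
```

After int x = 1; the call increment_copy(x) leaves x at 1, while increment_through_pointer(&x) changes it to 2. Calling zero_first on a caller's array overwrites the caller's first element in place.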
