Do It Yourself: A Simple Compiler (I) Formal Language Theory

Source: Internet
Author: User

One: Preliminary knowledge (compilation overview)

A translator is a program that translates a program written in one language (the source language) into an equivalent program (the target program) written in another language (the target language).

A compiler is a translator that translates a source program written in a high-level language into an equivalent target program in machine language or assembly language. Its work can generally be divided into the following five stages:

1: Lexical analysis

The task of the lexical analysis phase is to scan the character string that makes up the source program from left to right and, according to the lexical rules of the language, break it into words with independent meaning (also called word symbols, or simply symbols).

Note: The lexical rules are the formation rules of word symbols; they specify which character strings constitute a word symbol.
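As a minimal sketch of this phase, the Python fragment below scans a source string from left to right and splits it into word symbols. The token classes (NUMBER, IDENT, OP) and the function name tokenize are assumptions made for illustration; a real language would define its own lexical rules.

```python
import re

# Hypothetical lexical rules: each class of word symbol is a regular expression.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[-+*/=()]"),
    ("SKIP",   r"\s+"),          # whitespace separates symbols and is discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Scan source left to right and return a list of (kind, text) word symbols."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"illegal character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("x = a + 42"))
# [('IDENT', 'x'), ('OP', '='), ('IDENT', 'a'), ('OP', '+'), ('NUMBER', '42')]
```

A character that matches none of the rules raises SyntaxError, which is one simple way to report a lexical error.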

2: Syntax analysis

The task of syntax analysis (parsing) is to recognize grammatical units (such as expressions, declarations, and statements) in the string of word symbols according to the grammar rules of the language, and to check that each unit is structured correctly.

Note: The grammar rules of a language are the formation rules of its grammatical units; they specify how grammatical units are built from word symbols.
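To make the idea concrete, here is a small recursive-descent parser sketch in Python. The three grammar rules in the docstring and the nested-tuple tree format are assumptions chosen for this example; they are not taken from the original text.

```python
def parse_expression(tokens):
    """Parse a list of token strings and return a nested-tuple syntax tree.

    Hypothetical grammar rules used for this sketch:
        E -> T { ("+" | "-") T }
        T -> F { ("*" | "/") F }
        F -> name_or_number | "(" E ")"
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        token = advance()
        if token == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is None or not token.isalnum():
            raise SyntaxError(f"unexpected token {token!r}")
        return token

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse_expression(["a", "+", "b", "*", "2"]))
# ('+', 'a', ('*', 'b', '2'))
```

Each function corresponds to one grammatical unit, so a structural error in the input surfaces as a SyntaxError in the function for the unit being recognized.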

3: Semantic analysis and intermediate code generation

The task of semantic analysis is first to check the static semantics of each grammatical unit, and then to analyze its meaning and describe it in another language form: either an intermediate code, which is closer to the target language than the source language is, or the target language directly.
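One common form of intermediate code is the quadruple (operator, operand1, operand2, result). The sketch below, under the assumption that an expression is already available as a nested-tuple tree, flattens it into quadruples. The temporary names T1, T2, ... and the function name to_quadruples are hypothetical.

```python
def to_quadruples(tree):
    """Translate a nested-tuple expression tree into a list of quadruples.

    Each quadruple is (operator, operand1, operand2, result).
    """
    quads = []

    def walk(node):
        if isinstance(node, str):      # a leaf: variable name or constant
            return node
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        temp = f"T{len(quads) + 1}"    # fresh temporary for this result
        quads.append((op, left_name, right_name, temp))
        return temp

    walk(tree)
    return quads

for quad in to_quadruples(("+", "a", ("*", "b", "2"))):
    print(quad)
# ('*', 'b', '2', 'T1')
# ('+', 'a', 'T1', 'T2')
```

The operands of an operator are translated before the operator itself, so every temporary is defined before it is used.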

4: Code optimization

The task of code optimization is to transform the intermediate code produced by the previous phase so that the resulting target code is more efficient, saving both time and space. Optimization mainly includes local optimization and loop optimization.
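Constant folding is one simple local optimization. The sketch below assumes the intermediate code is a list of quadruples (operator, operand1, operand2, result) with operands given as strings; that representation and the function name fold_constants are assumptions for illustration only.

```python
import operator

FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(quads):
    """Evaluate quadruples whose operands are both integer constants.

    Assumes a folded temporary is only used by later quadruples in the
    same list, so the quadruple that computed it can be dropped.
    """
    known = {}      # temporary name -> constant value computed at compile time
    result = []
    for op, a, b, dest in quads:
        a = known.get(a, a)     # substitute operands already known to be constant
        b = known.get(b, b)
        if op in FOLDABLE and a.isdigit() and b.isdigit():
            known[dest] = str(FOLDABLE[op](int(a), int(b)))
        else:
            result.append((op, a, b, dest))
    return result

print(fold_constants([("*", "2", "3", "T1"), ("+", "a", "T1", "T2")]))
# [('+', 'a', '6', 'T2')]
```

The multiplication is carried out once at compile time, so the optimized code has one quadruple instead of two.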

5: Target code generation

The task of target code generation is to transform the intermediate code into absolute instruction code, relocatable instruction code, or assembly code for a particular machine.
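As a rough illustration, the sketch below turns quadruples into assembly-like text for a made-up single-register machine. The mnemonics LOAD, ADD, SUB, MUL, DIV, STORE and the register R0 are hypothetical and do not correspond to any real instruction set.

```python
# Hypothetical opcode table for an imaginary machine with one register, R0.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def generate_assembly(quads):
    """Emit three pseudo-instructions per quadruple (op, a, b, dest)."""
    lines = []
    for op, a, b, dest in quads:
        lines.append(f"LOAD  R0, {a}")              # R0 := a
        lines.append(f"{OPCODES[op]:<5} R0, {b}")   # R0 := R0 op b
        lines.append(f"STORE R0, {dest}")           # dest := R0
    return lines

for line in generate_assembly([("*", "b", "2", "T1"), ("+", "a", "T1", "T2")]):
    print(line)
```

A real code generator would also allocate registers and avoid the redundant STORE and LOAD between consecutive instructions; this sketch deliberately skips both.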

Two: Grammar and language (formal language theory)

Syntax defines the structure of a language. Semantics describes the meaning of a language. Pragmatics describes a language from the perspective of its use. Every specific language has both a syntactic and a semantic aspect; a formal language is a language studied without regard to its specific meaning, that is, without regard to semantics.

1. Alphabet:

A non-empty finite set of elements, for example, ∑={a, b, c}

2. Symbols (characters):

An element in an alphabet is called a symbol or a character.

3. Symbol string (Word):

A finite sequence of symbols is called a symbol string. The string that contains no symbols is called the empty string and is denoted by ε.
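The three definitions above can be tried out directly. The following sketch enumerates every symbol string over a small alphabet up to a given length; the function name strings_up_to is an assumption, and the empty Python string "" plays the role of ε.

```python
from itertools import product

def strings_up_to(alphabet, max_length):
    """Return every symbol string over alphabet with length 0..max_length."""
    result = []
    for length in range(max_length + 1):
        # product(..., repeat=0) yields one empty tuple, which gives the empty string
        for symbols in product(sorted(alphabet), repeat=length):
            result.append("".join(symbols))
    return result

print(strings_up_to({"a", "b"}, 2))
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```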

4: Formal Language:

A formal language is the set of all symbol strings over an alphabet that are formed according to some set of rules. Conversely, any set of symbol strings over an alphabet can be regarded as a formal language.

5: Rule (production):

A rule is an ordered pair (A, β) of a symbol and a symbol string, usually written A→β (or A∷=β). The purpose of a rule is to tell us how to use the symbols in it to generate the strings of the language. The symbols that appear in rules fall into two categories: terminal symbols and nonterminal symbols.

Nonterminal symbols are the symbols that appear on the left side of a rule and from which a symbol or symbol string can be derived; that is, each nonterminal represents a set of symbol strings. Nonterminals are written as uppercase letters or enclosed in angle brackets. For example, A in the rule above.

A terminal symbol is any symbol that is not a nonterminal. Terminals are the basic, indivisible symbols that make up the language, and they are usually written as lowercase letters.

6: Grammar:

A non-empty finite set of rules, usually expressed as a four-tuple G = (VN, VT, P, S)

VN is the set of nonterminal symbols in the rules, VT is the set of terminal symbols in the rules, P is the set of grammar rules, and S is a special nonterminal called the start symbol of the grammar. S must appear as the left side of at least one rule, and recognition of the language we define starts from it.

For ease of writing, several rules with the same left side are abbreviated as: A→α1 | α2 | ... | αn, where each αi is sometimes referred to as a candidate of A.
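A four-tuple of this kind maps naturally onto a small data structure. The sketch below is one possible representation, not a standard one: the field names, the use of single characters as symbols, and the check_grammar helper are all assumptions. Rules with the same left side are stored together as a list of candidates.

```python
from typing import NamedTuple

class Grammar(NamedTuple):
    nonterminals: frozenset   # VN
    terminals: frozenset      # VT
    rules: dict               # P: left-side nonterminal -> list of candidate strings
    start: str                # S

def check_grammar(g):
    """Return a list of problems; an empty list means the four-tuple is well formed."""
    problems = []
    if g.nonterminals & g.terminals:
        problems.append("VN and VT overlap")
    if g.start not in g.rules:
        problems.append("start symbol is not the left side of any rule")
    for left, candidates in g.rules.items():
        if left not in g.nonterminals:
            problems.append(f"left side {left} is not a nonterminal")
        for candidate in candidates:
            for symbol in candidate:
                if symbol not in g.nonterminals | g.terminals:
                    problems.append(f"unknown symbol {symbol} in {left} -> {candidate}")
    return problems

# A hypothetical grammar with the rules S -> aS | b
g = Grammar(frozenset("S"), frozenset("ab"), {"S": ["aS", "b"]}, "S")
print(check_grammar(g))   # []
```

The second check mirrors the requirement that the start symbol appear as the left side of at least one rule.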

Example 1: Let the alphabet be ∑={a, b}. Design a grammar that describes the language L = {a²ⁿ, b²ⁿ | n≥1}

A grammar that defines the language L is G[A] = (VN, VT, P, S)

where the nonterminal set VN={A, B, D}, the terminal set VT={a, b}, and the start symbol S = A

Set of grammar rules: P={A→aa | aaB | bb | bbD

B→aa | aaB

D→bb | bbD}

The grammar that describes a language is not unique. For instance, the following set of rules is another answer to the previous example:

P': {A→B | D

B→aa | aBa

D→bb | bDb}
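The claim that both rule sets describe the same language can be checked mechanically for short strings. The sketch below assumes the rules are read with uppercase letters as nonterminals and lowercase letters as terminals, that is, P = {A→aa | aaB | bb | bbD, B→aa | aaB, D→bb | bbD} and P' = {A→B | D, B→aa | aBa, D→bb | bDb}. It derives every sentence up to a length bound by repeatedly rewriting the leftmost nonterminal; the function names are hypothetical.

```python
from collections import deque

RULES_1 = {"A": ["aa", "aaB", "bb", "bbD"], "B": ["aa", "aaB"], "D": ["bb", "bbD"]}
RULES_2 = {"A": ["B", "D"], "B": ["aa", "aBa"], "D": ["bb", "bDb"]}

def sentences(rules, start, max_length):
    """Collect every terminal string of length <= max_length derivable from start.

    Pruning by length is only valid because no rule here shortens a string.
    """
    found = set()
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Locate the leftmost nonterminal in the current string, if any.
        index = next((i for i, s in enumerate(form) if s in rules), None)
        if index is None:
            found.add(form)          # only terminals left: a sentence of the language
            continue
        for candidate in rules[form[index]]:
            new_form = form[:index] + candidate + form[index + 1:]
            if len(new_form) <= max_length and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return found

def in_language(word):
    """Membership test for L: an even, non-zero number of a's, or of b's."""
    return len(word) >= 2 and len(word) % 2 == 0 and set(word) in ({"a"}, {"b"})

first = sentences(RULES_1, "A", 6)
second = sentences(RULES_2, "A", 6)
print(sorted(first))                          # ['aa', 'aaaa', 'aaaaaa', 'bb', 'bbbb', 'bbbbbb']
print(first == second)                        # True
print(all(in_language(w) for w in first))     # True
```

Agreement up to a length bound is evidence, not a proof, that the two grammars are equivalent.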

7: Sentential forms and sentences
