Principles of compilation: grammars (part 2)

Last Update:2018-12-04 Source: Internet

Author: User

In 1956, Noam Chomsky divided grammars into four types according to the restrictions imposed on their productions, and defined the four corresponding classes of formal languages, as follows:

Grammar type	Production restriction	Language
Type 0 grammar	α → β, where α and β belong to (VT ∪ VN)* and α is not empty	Type 0 language
Type 1 grammar	α → β, where α and β belong to (VT ∪ VN)* and the length of α is less than or equal to the length of β	Type 1 language (context-sensitive language)
Type 2 grammar	A → β, where A belongs to VN and β belongs to (VT ∪ VN)*	Type 2 language (context-free language)
Type 3 grammar	A → α | αB (right linear) or A → α | Bα (left linear), where A and B belong to VN and α belongs to VT or is the empty string	Type 3 language (regular language)

Grammar restrictions

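From type 0 to type 3, the restrictions gradually increase, and the language classes are nested: every type 3 language is also a type 2 language, every type 2 language is also a type 1 language, and so on.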

Type 0 Grammar

Type 0 grammar is also known as phrase-structure grammar. An important result is that type 0 grammars are equivalent in power to Turing machines. Every type 0 language is recursively enumerable; conversely, every recursively enumerable set is a type 0 language.

Type 0 grammar is the least restricted kind of grammar. The left side of a production only needs to contain at least one nonterminal (written in uppercase), for example, A0 → A0, A1 → B.

Type 1 Grammar

Type 1 grammar is also known as context-sensitive grammar. It adds one restriction to type 0 grammar: the right side of every production must be at least as long as the left side. The only exception is S → ε, which derives the empty string. For example, 0A0 → 011000.

Type 2 Grammar

Type 2 grammar, also known as context-free grammar, adds one restriction to type 1 grammar: the left side of every production must be a single nonterminal. For example, A → aB is a type 2 production, whereas AB → BaC is not, because its left side contains more than one symbol.

Type 3 Grammar

Type 3 grammar is the most restricted kind of grammar and is also known as regular grammar. It adds one restriction to type 2 grammar: the right side of every production has at most two symbols and takes one of the following forms: A → a or A → aB, where A, B ∈ VN and a ∈ VT. Note that a type 3 grammar must be either left linear or right linear throughout; it cannot mix the two. Left linear and right linear refer to whether the nonterminal on the right side of a production appears at the left end or at the right end.

Note that different grammars may generate the same language.
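The restrictions described above can be checked mechanically. The following is a minimal sketch, assuming productions are given as (left, right) string pairs, uppercase letters are nonterminals, every other character is a terminal, and an empty right side stands for ε. The function name classify is an illustrative choice, not something from the original article.

```python
def classify(productions, start="S"):
    """Return the highest type (0 to 3) whose restrictions all productions meet."""
    def nonterminal(ch):
        return ch.isupper()

    # Type 0: the left side must contain at least one nonterminal.
    if not all(any(nonterminal(c) for c in lhs) for lhs, _ in productions):
        raise ValueError("a left side contains no nonterminal")

    # Type 1: the right side is never shorter than the left side (S -> ε excepted).
    if not all(len(rhs) >= len(lhs) or (lhs == start and rhs == "")
               for lhs, rhs in productions):
        return 0

    # Type 2: the left side is a single nonterminal.
    if not all(len(lhs) == 1 and nonterminal(lhs) for lhs, _ in productions):
        return 1

    # Type 3: right sides are all right linear (a, aB) or all left linear (a, Ba).
    def right_linear(rhs):
        body = rhs[:-1] if rhs and nonterminal(rhs[-1]) else rhs
        return len(body) <= 1 and not any(nonterminal(c) for c in body)

    def left_linear(rhs):
        body = rhs[1:] if rhs and nonterminal(rhs[0]) else rhs
        return len(body) <= 1 and not any(nonterminal(c) for c in body)

    rights = [rhs for _, rhs in productions]
    if all(map(right_linear, rights)) or all(map(left_linear, rights)):
        return 3
    return 2


print(classify([("S", "aS"), ("S", "b")]))   # right linear
print(classify([("S", "aA"), ("A", "Sb")]))  # mixes left and right linear
print(classify([("0A0", "011000")]))         # context-sensitive production
```

The second call mixes a right linear production with a left linear one, so the sketch reports type 2 rather than type 3, which matches the remark that a type 3 grammar cannot be both.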

Derivation tree

The string formed by reading the leaf labels of a derivation tree from left to right is a sentential form of grammar G. If every leaf is labeled with a terminal, the string is a sentence of grammar G and the tree is a complete derivation tree.

A derivation tree has the following properties:
1. Each node has a label, which is a symbol of V.
2. The root is labeled S.
3. If a node n has at least one descendant other than itself and is labeled A, then A must be in VN.
4. If the direct children of node n, from left to right, are n1, n2, ..., nk with labels A1, A2, ..., Ak, then A → A1 A2 ... Ak must be a production in P.

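Given the grammar G = ({a, b}, {S, A}, S, P) with productions S → aAS | a, A → SbA | SS | ba, construct a derivation tree for aabbaa.
From the definition, VT = {a, b} and VN = {S, A}. S → aAS | a stands for the two productions S → aAS and S → a, and A → SbA | SS | ba stands for A → SbA, A → SS, and A → ba. Based on these productions, a derivation tree can be built: the root S has the children a, A, S; this A expands to S b A, whose S yields a and whose A yields b a; the last S yields a. Reading the leaves from left to right gives aabbaa.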
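Each step of a leftmost derivation expands one interior node of the derivation tree, so finding the derivation is enough to draw the tree. The following is a minimal sketch, assuming uppercase letters are nonterminals and using the productions S → aAS | a and A → SbA | SS | ba. The function name leftmost_derivation and the breadth-first search are illustrative choices, not part of the original article.

```python
from collections import deque

PRODUCTIONS = {
    "S": ["aAS", "a"],
    "A": ["SbA", "SS", "ba"],
}


def leftmost_derivation(target, start="S"):
    """Breadth-first search for a leftmost derivation of target; None if none exists."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        steps = queue.popleft()
        form = steps[-1]
        if form == target:
            return steps
        # Index of the leftmost nonterminal, if any is left.
        pos = next((i for i, ch in enumerate(form) if ch.isupper()), None)
        if pos is None:
            continue  # only terminals remain, but the string is not the target
        for rhs in PRODUCTIONS[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            # Prune: no production shortens a form, and the terminals in front
            # of the leftmost nonterminal can no longer change.
            end = next((i for i, ch in enumerate(new) if ch.isupper()), len(new))
            if (len(new) <= len(target) and target.startswith(new[:end])
                    and new not in seen):
                seen.add(new)
                queue.append(steps + [new])
    return None


print(" => ".join(leftmost_derivation("aabbaa")))
# prints: S => aAS => aSbAS => aabAS => aabbaS => aabbaa
```

The pruning relies on this particular grammar having no ε productions; a grammar with ε productions would need a different bound on the length of sentential forms.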

Regular expressions

A regular expression is a notation for describing regular sets. Each regular expression corresponds to a regular grammar (type 3 grammar).

Converting a regular grammar into a regular expression

Rule 1: from A → xB and B → y we get A → xB → xy, that is, A = xy.
Rule 2: A → xA | y stands for A → xA and A → y, which gives the derivation A → xA → x^2A → x^3A ... → x*A → x*y, that is, A = x*y.
Rule 3: from A → x and A → y we get A = x | y.
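Rule 2 can be checked on short strings by brute force. The following is a minimal sketch, assuming the single nonterminal A and the terminals x and y; the helper names derive and matching are hypothetical. It enumerates every sentence of bounded length derivable from A → xA | y and compares the result with the strings that match x*y.

```python
import re
from itertools import product


def derive(max_len):
    """Terminal strings of length <= max_len derivable from A with A -> xA | y."""
    sentences, pending = set(), ["A"]
    while pending:
        form = pending.pop()
        if "A" not in form:
            sentences.add(form)
            continue
        for rhs in ("xA", "y"):
            new_form = form.replace("A", rhs, 1)
            if len(new_form) <= max_len:  # productions never shorten a form
                pending.append(new_form)
    return sentences


def matching(max_len):
    """All strings over {x, y} of length <= max_len that match x*y."""
    return {
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product("xy", repeat=n)
        if re.fullmatch(r"x*y", "".join(chars))
    }


print(sorted(derive(4), key=len))
print(derive(4) == matching(4))
```

Agreement up to a fixed length is only a sanity check, not a proof that the grammar and the regular expression describe the same language.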

For example, consider the language L = {a^m b^n | m ≥ 0, n ≥ 1}. Because * means zero or more repetitions and m is greater than or equal to 0, a^m can be written as a*; because n is greater than or equal to 1, b^n can be written as bb*. Therefore the language L can be described by the regular expression a*bb*.
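The expression a*bb* can be tried out with Python's re module. The following is a minimal sketch; the function in_language is a hypothetical helper that checks the set definition directly (some number of a's followed by at least one b) so that it can be compared with the regular expression.

```python
import re

PATTERN = re.compile(r"a*bb*")


def in_language(text):
    """Direct check of {a^m b^n | m >= 0, n >= 1} without a regular expression."""
    m = len(text) - len(text.lstrip("a"))  # number of leading a's
    rest = text[m:]
    return len(rest) >= 1 and set(rest) == {"b"}


for sample in ["b", "aab", "abbb", "aa", "ba", ""]:
    by_regex = PATTERN.fullmatch(sample) is not None
    print(repr(sample), by_regex, in_language(sample))
```

fullmatch is used rather than search because membership in L concerns the whole string, not a substring.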

Finite Automaton

A finite automaton is a mathematical model of a system with discrete inputs and outputs. It has a finite number of states; each state can move to zero or more states, and the input string determines which transitions are taken. A finite automaton can be written as a quintuple M = (Q, Σ, δ, q0, F), where:

Q is a finite set of states.
Σ is the input alphabet, a finite set; each of its elements is called an input symbol.
The transition function δ: Q × Σ → Q is a single-valued mapping from the Cartesian product of Q and Σ to Q (for a nondeterministic automaton it maps to 2^Q, the set of subsets of Q).
The initial state is q0, and q0 belongs to Q.
F is the set of final states, and F is a subset of Q.

For example, M = ({S, A, B, C, F}, {0, 1}, δ, S, {F}), where δ is: δ(S, 0) = B, δ(S, 1) = A, δ(A, 0) = F, δ(A, 1) = C, δ(B, 0) = C, δ(B, 1) = F, δ(C, 0) = F, δ(C, 1) = F. These transitions define the state transition diagram of M.

This finite automaton can be read as follows: it starts in S and ends in F; from S, input 1 leads to A and input 0 leads to B, and so on. If a string w over the alphabet Σ takes M from S to F, then w is recognized by M, and the set of all strings that M recognizes is the language recognized by M.
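The example automaton can be simulated with a lookup table. The following is a minimal sketch, assuming the input alphabet is {0, 1} as the transitions suggest, and treating a missing transition (for instance, any input read while in F) as rejection. The names DELTA and accepts are illustrative.

```python
# Transition function of the example automaton, keyed by (state, input symbol).
DELTA = {
    ("S", "0"): "B", ("S", "1"): "A",
    ("A", "0"): "F", ("A", "1"): "C",
    ("B", "0"): "C", ("B", "1"): "F",
    ("C", "0"): "F", ("C", "1"): "F",
}
START = "S"
FINAL = {"F"}


def accepts(word):
    """Return True if the automaton is in a final state after reading word."""
    state = START
    for symbol in word:
        state = DELTA.get((state, symbol))
        if state is None:  # no transition defined for this state and symbol
            return False
    return state in FINAL


for word in ["01", "10", "000", "110", "00", "11", "010"]:
    print(word, accepts(word))
```

Under these assumptions the automaton accepts 01, 10, 000, and 110, and rejects 00, 11, and 010.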

Finite automata are divided into deterministic finite automata (DFA) and nondeterministic finite automata (NFA). The difference is that in an NFA the next state is not uniquely determined: for a given state and input symbol there may be several possible transitions, and some definitions also allow more than one initial state. Finite automata are used in the lexical analysis phase of compilation, where they drive the state transitions and trigger the associated semantic actions. For example, when an identifier is recognized, it is added to the symbol table and its token is passed to the syntax analyzer.

Conversion between regular expressions and finite automata

For each regular expression r there is a finite automaton M that accepts exactly the set of strings denoted by r.

Define an initial state S and a final state F, and draw a single edge labeled r from S to F; this forms a directed graph.

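Conversion rules: split edges repeatedly until every edge is labeled with a single symbol or ε. An edge labeled r1 | r2 becomes two parallel edges labeled r1 and r2; an edge labeled r1 r2 becomes two consecutive edges labeled r1 and r2 joined by a new state; an edge labeled r* becomes a new state with a self-loop labeled r, entered and left through ε edges.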

For example, an identifier in C must start with a letter or "_" and may contain letters, digits, and "_". Let a stand for any letter {A, a, B, b, ...} and b for any digit {0, ..., 9}; then the regular expression for the identifiers that C accepts can be written as:

(_|a)(_|a|b)*

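The finite automaton corresponding to this regular expression has two states: from the initial state, "_" or a letter leads to the final state, and the final state loops back to itself on "_", a letter, or a digit.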
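The identifier expression and its automaton can be compared side by side. The following is a minimal sketch, assuming ASCII letters and digits only and ignoring reserved words, which a real C lexer would have to exclude separately. The function name is_identifier and the state numbering are illustrative.

```python
import re
import string

LETTERS = set(string.ascii_letters)
DIGITS = set(string.digits)

# (_|a)(_|a|b)* written with character classes.
IDENTIFIER = re.compile(r"[_A-Za-z][_A-Za-z0-9]*")


def is_identifier(text):
    """Two-state automaton: state 0 is the initial state, state 1 the final state."""
    state = 0
    for ch in text:
        if state == 0 and (ch == "_" or ch in LETTERS):
            state = 1
        elif state == 1 and (ch == "_" or ch in LETTERS or ch in DIGITS):
            state = 1
        else:
            return False  # no transition for this character
    return state == 1


for sample in ["_tmp1", "x", "9lives", "a-b", ""]:
    print(repr(sample), is_identifier(sample),
          IDENTIFIER.fullmatch(sample) is not None)
```

Both checks agree on every sample: _tmp1 and x are accepted, while 9lives, a-b, and the empty string are rejected.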

That covers grammars, regular expressions, and finite automata for now.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
