The following content is mainly from Wikipedia.

**Formal science** is the study of abstract forms, such as logic, mathematics, the theory of computation, information theory, and statistics.

The branch of mathematics and computer science that specializes in the syntax of languages is called **formal language theory**. It studies only the syntax of a language and does not address its semantics.

In computer science, a **formal language** is a set of finite-length strings over an alphabet, and a formal grammar is a method of describing such a set. Formal grammars are so named because they resemble the grammars of human natural languages.

The basic idea behind describing a formal language with a formal grammar is to start from a special start symbol and repeatedly apply production rules to generate a set of strings. A production rule specifies how one combination of symbols is replaced by another. For example, if the alphabet contains only the two characters 'a' and 'b', and the start symbol is 'S', we can apply the following rules:

1. S → aSb

2. S → ba

We can rewrite "S" to "aSb" (rule 1), and then apply the same rule again to rewrite "aSb" to "aaSbb". The rewriting repeats until the result contains only letters of the alphabet. In this example, one possible derivation is S → aSb → aaSbb → aababb. The language described by the grammar contains all strings that can be generated in this way, such as ba, abab, aababb, and aaababbb.
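The rewriting process can be sketched in a few lines of Python. This is only an illustration of the two rules above; the function name `derive` and the choice to apply rule 1 a fixed number of times before rule 2 are assumptions made for the example.

```python
def derive(n: int) -> list[str]:
    """Apply rule 1 (S -> aSb) n times, then rule 2 (S -> ba).

    Returns every intermediate string of the derivation, starting with "S".
    """
    steps = ["S"]
    current = "S"
    for _ in range(n):
        current = current.replace("S", "aSb", 1)  # rule 1
        steps.append(current)
    current = current.replace("S", "ba", 1)  # rule 2 removes the last S
    steps.append(current)
    return steps


print(" -> ".join(derive(2)))              # S -> aSb -> aaSbb -> aababb
print([derive(n)[-1] for n in range(4)])   # ['ba', 'abab', 'aababb', 'aaababbb']
```

Every string of this language has the shape of n copies of 'a', then "ba", then n copies of 'b'.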

**A formal grammar G is a 4-tuple (N, Σ, P, S) composed of the following elements:**

- A set N of "nonterminal" symbols.
- A set Σ of "terminal" symbols, where Σ and N are disjoint.
- A set P of "production rules", each of the form (Σ ∪ N)* → (Σ ∪ N)*, where the string on the left-hand side must contain at least one nonterminal.
- A "start symbol" S, where S belongs to N.

The language generated by a formal grammar G = (N, Σ, P, S) is the set of all strings that consist only of symbols from the terminal set Σ and that can be derived from the start symbol S by repeatedly applying production rules in P.
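As a rough sketch, the 4-tuple and the language it generates can be modelled directly. The class name `Grammar`, its field names, and the helper `language` are hypothetical. The length-bounded search assumes that no production shortens a string, which holds for the example grammar but not for grammars in general.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # N
    terminals: frozenset     # Σ
    productions: tuple       # P, as (left, right) pairs of strings
    start: str               # S

    def validate(self) -> None:
        """Check the side conditions listed in the definition."""
        if not self.nonterminals.isdisjoint(self.terminals):
            raise ValueError("N and Σ must be disjoint")
        if self.start not in self.nonterminals:
            raise ValueError("S must belong to N")
        symbols = self.nonterminals | self.terminals
        for left, right in self.productions:
            if not set(left + right) <= symbols:
                raise ValueError(f"unknown symbol in {left} -> {right}")
            if not any(ch in self.nonterminals for ch in left):
                raise ValueError(f"no nonterminal on the left of {left} -> {right}")


def language(g: Grammar, max_len: int) -> set:
    """Terminal strings of length <= max_len derivable from g.start.

    Assumption: no production shortens a string, so any intermediate
    string longer than max_len can be discarded safely.
    """
    seen = {g.start}
    queue = deque([g.start])
    words = set()
    while queue:
        form = queue.popleft()
        if all(ch in g.terminals for ch in form):
            words.add(form)
            continue
        for left, right in g.productions:
            i = form.find(left)
            while i != -1:  # try the rule at every position where it matches
                new = form[:i] + right + form[i + len(left):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                i = form.find(left, i + 1)
    return words


g = Grammar(
    nonterminals=frozenset("S"),
    terminals=frozenset("ab"),
    productions=(("S", "aSb"), ("S", "ba")),
    start="S",
)
g.validate()
print(sorted(language(g, 6), key=len))
```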

The **Chomsky hierarchy** is a classification of grammars by their expressive power, proposed by Noam Chomsky in 1956. It consists of four levels:

**Type-0 grammars (unrestricted grammars, or phrase-structure grammars) include all formal grammars.** They generate exactly the languages that a Turing machine can recognize. These are the languages whose strings cause some Turing machine to halt, and they are also called recursively enumerable languages. Note that recursively enumerable languages differ from recursive languages: the latter are a proper subset of the former, namely the languages that can be decided by a Turing machine that always halts.

**Type-1 grammars (context-sensitive grammars) generate the context-sensitive languages.** Their production rules have the form αAβ → αγβ. Here A is a nonterminal, while α, β, and γ are strings of nonterminals and terminals. α and β may be empty, but γ must not be empty. Such a grammar may also contain the rule S → ε, but in that case S must not appear on the right-hand side of any production rule. The languages specified by these grammars are accepted by **linear bounded nondeterministic Turing machines**.

**Type-2 grammars generate the context-free languages.** Their production rules have the form A → γ. Here A is a nonterminal, and γ is a string of nonterminals and terminals. The languages specified by these grammars are accepted by **nondeterministic pushdown automata**. Context-free languages provide the theoretical basis for the syntax of most programming languages.
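To make the role of the stack concrete, here is a minimal sketch of a stack-based recognizer for the language of the earlier example grammar (S → aSb, S → ba). It is a hand-written, deterministic special case rather than a general pushdown automaton, and the function name `accepts` is an assumption for illustration.

```python
def accepts(word: str) -> bool:
    """Recognize n copies of 'a', then "ba", then n copies of 'b'."""
    stack = []
    i = 0
    # Phase 1: push one marker for every leading 'a'.
    while i < len(word) and word[i] == "a":
        stack.append("a")
        i += 1
    # Phase 2: the middle of the word must be "ba" (from rule S -> ba).
    if word[i:i + 2] != "ba":
        return False
    i += 2
    # Phase 3: pop one marker for every trailing 'b'.
    while i < len(word) and word[i] == "b":
        if not stack:
            return False  # more trailing b's than leading a's
        stack.pop()
        i += 1
    # Accept only if the whole word was read and the counts matched.
    return i == len(word) and not stack


for w in ["ba", "abab", "aababb", "aaabbb", "ababb"]:
    print(w, accepts(w))
```

The stack is what allows the recognizer to compare the number of leading a's with the number of trailing b's; a finite-state automaton has no such unbounded memory.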

**Type-3 grammars (regular grammars) generate the regular languages.** These grammars require that the left-hand side of each production contain exactly one nonterminal, and that the right-hand side be the empty string, a single terminal, or a single terminal followed by a single nonterminal. If the start symbol S does not appear on the right-hand side of any production, the rule S → ε is also allowed. The languages specified by these grammars are accepted by **finite-state automata** and can also be described by **regular expressions**. Regular languages are commonly used to define search patterns and the lexical structure of programming languages.

The class of regular languages is contained in the class of context-free languages, the class of context-free languages is contained in the class of context-sensitive languages, and the class of context-sensitive languages is contained in the class of recursively enumerable languages. Each of these inclusions is proper. That is, there are recursively enumerable languages that are not context-sensitive, context-sensitive languages that are not context-free, and context-free languages that are not regular.

The following table summarizes the main features of the above four types of grammar:

| Grammar | Language | Automaton | Production rules |
| --- | --- | --- | --- |
| Type 0 | Recursively enumerable language | Turing machine | Unrestricted |
| Type 1 | Context-sensitive language | Linear bounded nondeterministic Turing machine | αAβ → αγβ |
| Type 2 | Context-free language | Nondeterministic pushdown automaton | A → γ |
| Type 3 | Regular language | Finite-state automaton | A → aB, A → a |

**Type 0 Grammar**

Let G = (VN, VT, P, S). If every production α → β of G satisfies α ∈ (VN ∪ VT)* with α containing at least one nonterminal, and β ∈ (VN ∪ VT)*, then G is a type-0 grammar. A type-0 grammar is also called a phrase-structure grammar. A very important theoretical result is that the power of type-0 grammars is equivalent to that of Turing machines. In other words, every type-0 language is recursively enumerable, and conversely every recursively enumerable set is a type-0 language. Type-0 grammars are the least restricted of the four types.

**Type 1 Grammar**

A type-1 grammar is also called a context-sensitive grammar and corresponds to a linear bounded automaton. It adds one requirement to the type-0 grammar: every production α → β must satisfy |β| ≥ |α|, where |β| denotes the length of β.

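Note: although |β| ≥ |α| is required, there is one special case. The production S → ε is also permitted in a type-1 grammar, provided the start symbol S does not appear on the right-hand side of any production.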

For example, A → Ba has |β| = 2 and |α| = 1, so it meets the requirements of a type-1 grammar. By contrast, aA → a does not, because its right-hand side is shorter than its left-hand side.

**Type 2 Grammar**

A type-2 grammar is also called a context-free grammar and corresponds to a pushdown automaton. It adds one requirement to the type-1 grammar: in every production α → β, α must be a single nonterminal. For example, A → Ba meets the requirements of a type-2 grammar.

By contrast, although Ab → Bab meets the requirements of a type-1 grammar, it does not meet the requirements of a type-2 grammar, because its left-hand side α = Ab is not a single nonterminal.

**Type 3 Grammar**

A type-3 grammar is also called a regular grammar and corresponds to a finite-state automaton. It adds one requirement to the type-2 grammar: every production must have the form A → α | αB (right linear) or the form A → α | Bα (left linear), where A and B are nonterminals and α is a terminal.

For example, the productions A → a, A → aB, B → a, B → cB meet the requirements of a type-3 grammar.
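The link between a type-3 grammar and a finite-state automaton can be sketched as follows. The sketch assumes the right-linear productions A → a, A → aB, B → a, B → cB with A as the start symbol (the start symbol is an assumption; it is not named in the text). Each nonterminal is treated as a state, and the names `RULES`, `ACCEPT`, and `accepts` are hypothetical. The regular expression `a(c*a)?` is derived by hand from these productions for comparison.

```python
import re

# Right-linear productions, keyed by nonterminal.
# Each entry is (terminal, next nonterminal), with None meaning "stop here".
RULES = {
    "A": [("a", None), ("a", "B")],  # A -> a | aB
    "B": [("a", None), ("c", "B")],  # B -> a | cB
}
ACCEPT = "accept"  # extra accepting state reached by rules of the form X -> a


def accepts(word: str, start: str = "A") -> bool:
    """Simulate the grammar as a nondeterministic finite automaton."""
    states = {start}
    for ch in word:
        next_states = set()
        for state in states:
            for terminal, target in RULES.get(state, []):
                if terminal == ch:
                    next_states.add(ACCEPT if target is None else target)
        states = next_states
    return ACCEPT in states


PATTERN = re.compile(r"a(c*a)?")  # hand-derived equivalent of the grammar

for w in ["a", "aa", "aca", "acca", "ac", "ca", ""]:
    print(repr(w), accepts(w), bool(PATTERN.fullmatch(w)))
```

On these sample words the automaton simulation and the regular expression agree, which is consistent with the two being different descriptions of the same regular language.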

However, neither the productions A → ab, A → aB, B → a, B → cB nor the productions A → a, A → Ba, B → a, B → cB meet the requirements of a type-3 grammar.

Specifically, in the first set, A → ab does not conform to the definition of a type-3 grammar; changing ab to the form "a terminal followed by a nonterminal" (that is, aB) makes it conform. In the second set, changing B → cB to B → Bc makes the grammar conform. The reason is that the right-linear rules A → α | αB and the left-linear rules A → α | Bα cannot be mixed in one grammar: a grammar counts as type-3 only if all of its productions satisfy the same one of the two forms.

Note: in the preceding examples, upper-case letters denote nonterminals and lower-case letters denote terminals.
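The layered checks described in this section can be collected into a small classifier. This is a sketch under stated assumptions, not a complete decision procedure: the function name `chomsky_type` is hypothetical, symbols are single characters with upper case for nonterminals and lower case for terminals, and the special rule S → ε is not handled.

```python
def chomsky_type(productions: list) -> int:
    """Return the most restrictive type (0-3) satisfied by all productions.

    productions is a list of (left, right) string pairs.
    """
    # Type 0: every left-hand side contains at least one nonterminal.
    if not all(any(c.isupper() for c in left) for left, _ in productions):
        raise ValueError("every left-hand side needs a nonterminal")
    # Type 1: no production shortens the string, |right| >= |left|.
    if not all(len(right) >= len(left) for left, right in productions):
        return 0
    # Type 2: every left-hand side is a single nonterminal.
    if not all(len(left) == 1 for left, _ in productions):
        return 1
    # Type 3: all right-hand sides are right linear, or all are left linear.
    rights = [right for _, right in productions]

    def terminal(s: str) -> bool:
        return len(s) == 1 and s.islower()

    right_linear = all(
        terminal(r) or (len(r) == 2 and r[0].islower() and r[1].isupper())
        for r in rights
    )
    left_linear = all(
        terminal(r) or (len(r) == 2 and r[0].isupper() and r[1].islower())
        for r in rights
    )
    return 3 if right_linear or left_linear else 2


print(chomsky_type([("A", "a"), ("A", "aB"), ("B", "a"), ("B", "cB")]))
print(chomsky_type([("A", "a"), ("A", "Ba"), ("B", "a"), ("B", "cB")]))
print(chomsky_type([("Ab", "Bab")]))
print(chomsky_type([("aA", "a")]))
```

Because each type is defined by adding a requirement to the previous one, the function simply returns at the first check that fails.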

Extended Backus-Naur Form (**EBNF**) is a formal notation for describing computer programming languages and other formal languages. It is an extension of the basic Backus-Naur Form (BNF) syntax notation.