Grammars and grammar types in compiler principles

Source: Internet
Author: User

The following content is mainly from Wikipedia.

Formal science is the study of abstract forms, and includes fields such as logic, mathematics, the theory of computation, information theory, and statistics.

The branch of mathematics and computer science that studies the syntax of languages is called formal language theory. It studies only the syntax of a language and is not concerned with its semantics.

In computer science, a formal language is a set of finite-length strings over an alphabet, and a formal grammar is a method of describing such a set. Formal grammars are so named because they resemble the grammars of human natural languages.

The basic idea of describing a formal language with a formal grammar is to start from a special start symbol and repeatedly apply production rules to generate a set of strings. A production rule specifies how certain combinations of symbols are replaced by other combinations of symbols. For example, suppose the alphabet contains only the two characters 'a' and 'b', the start symbol is 'S', and we apply the following rules:

1. S → aSb

2. S → ba

We can then rewrite "S" as "aSb" (rule 1), and apply the same rule again to rewrite "aSb" as "aaSbb". This rewriting process repeats until the result contains only letters of the alphabet. In this example, one possible derivation is S → aSb → aaSbb → aababb. The language described by the grammar consists of all strings that can be generated in this way, such as ba, abab, aababb, and aaababbb.
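The rewriting process above can be sketched in a few lines of Python. This is only an illustration of the two example rules; the function name `derive` is a hypothetical helper, not something taken from the source.

```python
def derive(n: int) -> list[str]:
    """Apply rule 1 (S -> aSb) n times, then rule 2 (S -> ba) once.

    Returns every intermediate string of the derivation, starting from "S".
    """
    steps = ["S"]
    current = "S"
    for _ in range(n):
        current = current.replace("S", "aSb", 1)  # rule 1
        steps.append(current)
    current = current.replace("S", "ba", 1)  # rule 2 removes the last S
    steps.append(current)
    return steps


if __name__ == "__main__":
    print(" -> ".join(derive(2)))  # S -> aSb -> aaSbb -> aababb
```

Calling `derive(0)`, `derive(1)`, `derive(2)`, and so on yields the strings ba, abab, aababb, ... as the last element of each returned list.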

A formal grammar G is a quadruple (N, Σ, P, S) consisting of the following elements:

    • A set N of non-terminal symbols.
    • A set Σ of terminal symbols, where Σ and N are disjoint.
    • A set P of production rules, each of the form (Σ ∪ N)* → (Σ ∪ N)*, where the string on the left-hand side must contain at least one non-terminal.
    • A start symbol S, where S belongs to N.

The language generated by a formal grammar G = (N, Σ, P, S) is the set of all strings that consist solely of symbols from the terminal set Σ and that can be derived from the start symbol S by repeatedly applying the production rules in P.
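The definition can be made concrete with a small Python sketch. It assumes that every symbol is a single character and that no rule shortens a string, so that pruning by length is safe; the names `Grammar`, `validate`, and `language` are hypothetical and chosen only for this illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # N
    terminals: frozenset     # Σ
    productions: tuple       # P, as (left, right) pairs of strings
    start: str               # S

    def validate(self) -> None:
        """Check the three structural conditions of the definition."""
        if self.nonterminals & self.terminals:
            raise ValueError("N and Σ must be disjoint")
        if self.start not in self.nonterminals:
            raise ValueError("S must belong to N")
        for left, _ in self.productions:
            if not any(ch in self.nonterminals for ch in left):
                raise ValueError("left side needs at least one non-terminal")


def language(g: Grammar, max_len: int) -> set:
    """Collect the terminal strings of length <= max_len derivable from S.

    Breadth-first search over derivations; assumes no rule shortens a string.
    """
    g.validate()
    seen = {g.start}
    queue = deque([g.start])
    result = set()
    while queue:
        form = queue.popleft()
        if all(ch in g.terminals for ch in form):
            result.add(form)  # only terminals left: a string of the language
            continue
        for left, right in g.productions:
            pos = form.find(left)
            while pos != -1:  # try the rule at every position where it matches
                new = form[:pos] + right + form[pos + len(left):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(left, pos + 1)
    return result


if __name__ == "__main__":
    g = Grammar(frozenset("S"), frozenset("ab"),
                (("S", "aSb"), ("S", "ba")), "S")
    print(sorted(language(g, 6), key=len))
```

For the example grammar with rules S → aSb and S → ba, the strings of length at most 6 are ba, abab, and aababb.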

The Chomsky hierarchy is a classification of formal grammars by their expressive power. It was proposed by Noam Chomsky in 1956 and consists of four levels:

    • Type-0 grammars (unrestricted grammars, or phrase structure grammars) include all formal grammars. They generate exactly the languages that can be recognized by a Turing machine. A language is recognized by a Turing machine if the machine halts on every string of the language; such languages are called recursively enumerable languages. Note that recursively enumerable languages differ from recursive languages: the latter are a proper subset of the former, namely the languages that can be decided by a Turing machine that always halts.
    • Type-1 grammars (context-sensitive grammars) generate the context-sensitive languages. Their production rules have the form αAβ → αγβ. Here A is a non-terminal, while α, β, and γ are strings of non-terminals and terminals. α and β may be empty strings, but γ must not be empty. Such a grammar may also contain the rule S → ε, but in that case S must not appear on the right-hand side of any production rule. The languages described by these grammars are those accepted by a linear bounded nondeterministic Turing machine.
    • Type-2 grammars (context-free grammars) generate the context-free languages. Their production rules have the form A → γ. Here A is a non-terminal and γ is a string of non-terminals and terminals. The languages described by these grammars are those accepted by a nondeterministic pushdown automaton. Context-free languages provide the theoretical basis for the syntax of most programming languages.
    • Type-3 grammars (regular grammars) generate the regular languages. The left-hand side of each production rule must be a single non-terminal, and the right-hand side must be the empty string, a single terminal, or a single terminal followed by a single non-terminal. If the start symbol S does not appear on the right-hand side of any rule, the rule S → ε is also allowed. The languages described by these grammars are those accepted by a finite-state automaton, and they can also be described by regular expressions. Regular languages are commonly used to define search patterns and the lexical structure of programming languages.

The class of regular languages is contained in the class of context-free languages, the class of context-free languages is contained in the class of context-sensitive languages, and the class of context-sensitive languages is contained in the class of recursively enumerable languages. Each of these inclusions is proper. That is, there are recursively enumerable languages that are not context-sensitive, context-sensitive languages that are not context-free, and context-free languages that are not regular.
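A standard illustration of the last proper inclusion is the language { aⁿbⁿ | n ≥ 0 }: it is context-free (it is generated by S → aSb and S → ε) but not regular, because recognizing it requires counting an unbounded number of a's. The sketch below recognizes it with an explicit stack, in the spirit of a pushdown automaton; the function name `accepts_anbn` is hypothetical and the example itself is not from the source.

```python
def accepts_anbn(s: str) -> bool:
    """Recognize { a^n b^n | n >= 0 } using an explicit stack."""
    stack = []
    i = 0
    # Phase 1: push one marker for each leading 'a'.
    while i < len(s) and s[i] == "a":
        stack.append("a")
        i += 1
    # Phase 2: pop one marker for each following 'b'.
    while i < len(s) and s[i] == "b" and stack:
        stack.pop()
        i += 1
    # Accept only if the whole input was consumed and the stack is empty.
    return i == len(s) and not stack


if __name__ == "__main__":
    for word in ["", "ab", "aabb", "aab", "abb", "ba"]:
        print(repr(word), accepts_anbn(word))
```

A finite-state automaton has no such stack and only a fixed number of states, which is why no regular grammar can describe this language.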

The following table summarizes the main features of the four types of grammar:

Grammar   Language                 Automaton                                        Production rules
Type 0    Recursively enumerable   Turing machine                                   Unrestricted
Type 1    Context-sensitive        Linear bounded nondeterministic Turing machine   αAβ → αγβ
Type 2    Context-free             Nondeterministic pushdown automaton              A → γ
Type 3    Regular                  Finite-state automaton                           A → aB, A → a

Type 0 Grammar
Let G = (VN, VT, P, S). If every production α → β has the following structure: α ∈ (VN ∪ VT)* and contains at least one non-terminal, while β ∈ (VN ∪ VT)*, then G is a type-0 grammar. Type-0 grammars are also called phrase structure grammars. A very important theoretical result is that type-0 grammars are equivalent in power to Turing machines. In other words, every type-0 language is recursively enumerable, and conversely every recursively enumerable set is a type-0 language. Type 0 is the least restrictive of the four types of grammar.
Type 1 Grammar
A type-1 grammar is also called a context-sensitive grammar, and corresponds to a linear bounded automaton. It builds on the type-0 grammar by requiring that every production α → β satisfies |β| ≥ |α|, where |β| denotes the length of β.
Note: although |β| ≥ |α| is required, there is one special case: the rule S → ε is also permitted in a type-1 grammar, provided that S does not appear on the right-hand side of any rule.
For example, A → Ba has |β| = 2 and |α| = 1, so it meets the requirements of a type-1 grammar. By contrast, aA → a does not, because |β| = 1 is less than |α| = 2.
Type 2 Grammar
A type-2 grammar is also called a context-free grammar, and corresponds to a pushdown automaton. It builds on the type-1 grammar by requiring that in every production α → β, α is a single non-terminal. For example, A → Ba meets the requirements of a type-2 grammar.
By contrast, although Ab → Bab meets the requirements of a type-1 grammar, it does not meet the requirements of a type-2 grammar, because its α = Ab, and Ab is not a single non-terminal.
Type 3 Grammar
A type-3 grammar is also called a regular grammar, and corresponds to a finite-state automaton. It builds on the type-2 grammar by requiring that every production has the form A → α | αB (right linear) or A → α | Bα (left linear), where α is a single terminal.
For example, the rules A → a, A → aB, B → a, B → cB meet the requirements of a type-3 grammar.

If instead the rules are A → ab, A → aB, B → a, B → cB, or A → a, A → Ba, B → a, B → cB, the grammar no longer meets the requirements of type 3.

Specifically, in A → ab, A → aB, B → a, B → cB, the rule A → ab does not conform to the definition of a type-3 grammar; changing ab to the form "a terminal followed by a non-terminal" (that is, aB) makes it valid. In A → a, A → Ba, B → a, B → cB, changing B → cB to B → Bc makes the grammar valid, because the rule forms A → α | αB (right linear) and A → α | Bα (left linear) cannot be mixed within one grammar: a grammar counts as type 3 only if all of its rules follow one of the two forms.
Note: in the examples above, upper-case letters denote non-terminals and lower-case letters denote terminals.
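The four definitions above can be turned into a small classifier. The sketch below is a simplified illustration under stated assumptions: upper-case letters are non-terminals, lower-case letters are terminals, the S → ε special case is ignored, and type 3 is checked in the strict form with a single terminal. The function name `chomsky_type` is hypothetical.

```python
def chomsky_type(rules: list[tuple[str, str]]) -> int:
    """Return the most restrictive type (3, 2, 1 or 0) that the rules satisfy.

    Each rule is a (left, right) pair of strings. Raises ValueError if a
    left-hand side contains no non-terminal (not a valid grammar rule).
    """
    for left, _ in rules:
        if not any(ch.isupper() for ch in left):
            raise ValueError(f"no non-terminal on the left of {left!r}")

    # Type 1: no rule shortens the string (|beta| >= |alpha|).
    if not all(len(right) >= len(left) for left, right in rules):
        return 0
    # Type 2: every left-hand side is a single non-terminal.
    if not all(len(left) == 1 for left, _ in rules):
        return 1

    def right_linear(r: str) -> bool:  # forms a or aB
        return (len(r) == 1 and r.islower()) or (
            len(r) == 2 and r[0].islower() and r[1].isupper())

    def left_linear(r: str) -> bool:  # forms a or Ba
        return (len(r) == 1 and r.islower()) or (
            len(r) == 2 and r[0].isupper() and r[1].islower())

    rights = [right for _, right in rules]
    # Type 3: all rules right linear, or all rules left linear (never mixed).
    if all(map(right_linear, rights)) or all(map(left_linear, rights)):
        return 3
    return 2


if __name__ == "__main__":
    print(chomsky_type([("A", "a"), ("A", "aB"), ("B", "a"), ("B", "cB")]))
    print(chomsky_type([("A", "a"), ("A", "Ba"), ("B", "a"), ("B", "cB")]))
```

Under these assumptions, the first rule set from the text is classified as type 3, while the set that mixes A → Ba with B → cB falls back to type 2.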

Extended Backus-Naur Form (EBNF) is a formal notation for describing computer programming languages and other formal languages. It is an extension of the basic Backus-Naur Form (BNF) syntax notation.
