Regular Expressions and Context-Free Grammars

Source: Internet
Author: User

For a grammar G = (V, T, S, P), if every production has one of the following forms:

A -> xB
A -> x

where A and B belong to V and x belongs to T*, then G is called a right-linear grammar. Similarly, if every production has one of the following forms:

A -> Bx
A -> x

then G is called a left-linear grammar. Right-linear and left-linear grammars are collectively called regular grammars.
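To make the right-linear form concrete, here is a minimal Python sketch of a recognizer for such a grammar. The grammar S -> 0S | 1S | 1 (binary strings ending in 1), the dictionary layout, and the function name `accepts` are all assumptions chosen for illustration. Each pair (non-terminal, position) is treated like a state of a nondeterministic finite automaton, which is why no stack is needed.

```python
# Hypothetical right-linear grammar: S -> 0S | 1S | 1
# Each production is (terminal string x, next non-terminal B or None).
GRAMMAR = {
    "S": [("0", "S"), ("1", "S"), ("1", None)],
}


def accepts(grammar, start, text):
    """Return True if `text` can be derived from `start`."""
    pending = [(start, 0)]  # states still to explore: (non-terminal, position)
    seen = set()            # avoids looping on productions with empty x
    while pending:
        state = pending.pop()
        if state in seen:
            continue
        seen.add(state)
        nonterminal, pos = state
        for terminals, nxt in grammar[nonterminal]:
            if not text.startswith(terminals, pos):
                continue
            end = pos + len(terminals)
            if nxt is None:
                # A -> x ends the derivation, so x must consume the rest.
                if end == len(text):
                    return True
            else:
                # A -> xB continues the derivation from B.
                pending.append((nxt, end))
    return False


print(accepts(GRAMMAR, "S", "0101"))  # True
print(accepts(GRAMMAR, "S", "10"))    # False
```

Because the only thing remembered between steps is the current non-terminal and the input position, the recognizer needs a bounded amount of state per position.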

Regular expressions have the same expressive power as regular grammars. A regular expression is defined as follows:

Any letter of the alphabet is a regular expression, and the empty string and the empty set are also regular expressions;
If r and s are regular expressions, then r | s, rs, r*, and (r) are also regular expressions.

Extensions to regular expressions:

r+: one or more repetitions
.: any character
[a-z]: character range
[^abc]: any character not in the given set
r?: optional
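The extensions listed above can be tried out with Python's standard `re` module. This is only a sketch: the helper name `full` and the sample patterns are assumptions, and Python's dialect has details of its own (for example, `.` does not match a newline unless `re.DOTALL` is set).

```python
import re


def full(pattern, text):
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


print(full(r"a+", "aaa"))          # True: one or more repetitions
print(full(r"a+", ""))             # False: at least one is required
print(full(r"a.c", "abc"))         # True: "." matches any one character
print(full(r"[a-z]+", "hello"))    # True: character range
print(full(r"[^abc]", "d"))        # True: any character not in the set
print(full(r"[^abc]", "a"))        # False
print(full(r"colou?r", "color"))   # True: the "u" is optional
print(full(r"colou?r", "colour"))  # True
```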

Regular expressions can use only terminals (letters of the alphabet), so they easily become complex and hard to read. In practice, regular definitions are often used instead. A regular definition lets you name expressions with non-terminals, much like EBNF, but a non-terminal may not be used before it has been fully defined, so recursion and self-nesting are not allowed.
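One way to imitate regular definitions in Python is to build patterns from named fragments, each of which only uses names defined earlier. The names `DIGIT`, `LETTER`, `IDENT`, and `NUMBER` below are hypothetical and are not taken from any particular language specification.

```python
import re

# Each name is fully defined before it is used, so no definition can
# refer to itself, directly or indirectly.
DIGIT = r"[0-9]"
LETTER = r"[A-Za-z_]"
IDENT = rf"{LETTER}(?:{LETTER}|{DIGIT})*"
NUMBER = rf"{DIGIT}+(?:\.{DIGIT}+)?"

print(re.fullmatch(IDENT, "x1") is not None)     # True
print(re.fullmatch(IDENT, "1x") is not None)     # False
print(re.fullmatch(NUMBER, "3.14") is not None)  # True
```

Since every name expands by plain substitution into an ordinary regular expression, the definitions add readability but no expressive power.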

Just as regular expressions are equivalent in expressive power to regular grammars, BNF is equivalent in expressive power to context-free grammars. BNF is short for "Backus-Naur Form". John Backus and Peter Naur were the first to introduce a formal notation for describing the syntax of a given language.

Elements of BNF:

::= means "is defined as". Some books use -->
| means "or"
< > Angle brackets enclose non-terminals.

EBNF, an extension of BNF:

Optional items are enclosed in the metacharacters "[" and "]".
Repeated items (zero or more times) are enclosed in the metacharacters "{" and "}".
Terminals consisting of a single character are enclosed in quotation marks ("), to distinguish them from metacharacters.

These operators are not strictly fixed. Some people prefer to write EBNF directly with the extended regular-expression operators. Besides being convenient to write, another major reason for introducing EBNF is that it maps the grammar more closely onto the actual code of a recursive-descent parser. When you need to write a recursive-descent parser by hand, you usually have to rewrite the context-free grammar in EBNF first.
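The mapping from EBNF to recursive-descent code can be sketched with a small hypothetical grammar (not from the original text):

    expr   = term { "+" term }
    term   = [ "-" ] factor
    factor = digit | "(" expr ")"

In the Python sketch below, each rule becomes one method, each { ... } becomes a while loop, and each [ ... ] becomes an if. The class and function names are assumptions made for this example, and the parser evaluates the expression as it goes rather than building a tree.

```python
class Parser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        """Return the next character, or "" at end of input."""
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def expect(self, ch):
        if self.peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {self.pos}")
        self.pos += 1

    def expr(self):
        # expr = term { "+" term }
        value = self.term()
        while self.peek() == "+":   # repetition becomes a loop
            self.pos += 1
            value += self.term()
        return value

    def term(self):
        # term = [ "-" ] factor
        negative = False
        if self.peek() == "-":      # option becomes an if
            self.pos += 1
            negative = True
        value = self.factor()
        return -value if negative else value

    def factor(self):
        # factor = digit | "(" expr ")"
        ch = self.peek()
        if "0" <= ch <= "9":
            self.pos += 1
            return int(ch)
        self.expect("(")
        value = self.expr()         # nesting becomes a recursive call
        self.expect(")")
        return value


def evaluate(text):
    parser = Parser(text)
    value = parser.expr()
    if parser.pos != len(text):
        raise SyntaxError(f"unexpected input at position {parser.pos}")
    return value


print(evaluate("1+2+3"))     # 6
print(evaluate("-(1+2)+4"))  # 1
```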

If a context-free grammar G is not self-embedding (self-nesting), that is, if no derivation of the following form exists:

U =>* xUy

where x and y are both non-empty strings, then L(G) is a regular language. A self-embedding grammar does not necessarily generate a non-regular language. In fact, a context-free language is strictly context-free, that is, it cannot be generated by any regular grammar, if and only if every grammar for the language is self-embedding.
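As an illustration only (it proves nothing), consider the hypothetical self-embedding grammar S -> aSb | empty, which generates the strings with n a's followed by n b's. The Python sketch below follows that production directly and compares the result with the looser regular expression a*b*, which cannot keep the two counts equal. The names `matches_anbn` and `LOOSE` are assumptions for this example.

```python
import re


def matches_anbn(text):
    """Recognize S -> a S b | (empty) by peeling one a and one b per step."""
    if text == "":
        return True
    if len(text) >= 2 and text[0] == "a" and text[-1] == "b":
        return matches_anbn(text[1:-1])  # the nested S in the middle
    return False


# A regular approximation: any number of a's, then any number of b's.
LOOSE = re.compile(r"a*b*")

print(matches_anbn("aaabbb"))                # True
print(matches_anbn("aab"))                   # False
print(LOOSE.fullmatch("aab") is not None)    # True: the counts are not checked
```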

 

As described above, the recursive nature of context-free grammars strongly affects how they are parsed. First, an algorithm that recognizes these structures must use recursive calls or an explicitly managed parse stack. Second, the data structure that represents the syntactic structure of the language must also be recursive (usually a parse tree) rather than linear (as with lexemes and tokens).
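Both points can be seen in a small Python sketch that parses nested parentheses with an explicitly managed stack and returns a recursive structure (nested lists standing in for a parse tree). The function name `parse_nested` and the nested-list representation are assumptions made for this example.

```python
def parse_nested(text):
    """Parse parenthesized text into nested lists using an explicit stack."""
    root = []
    stack = [root]  # the innermost open group is always stack[-1]
    for pos, ch in enumerate(text):
        if ch == "(":
            node = []
            stack[-1].append(node)  # attach the new group to its parent
            stack.append(node)      # and make it the current group
        elif ch == ")":
            if len(stack) == 1:
                raise SyntaxError(f"unmatched ')' at position {pos}")
            stack.pop()             # close the current group
        else:
            stack[-1].append(ch)
    if len(stack) != 1:
        raise SyntaxError("unmatched '('")
    return root


print(parse_nested("a(b(c)d)e"))  # ['a', ['b', ['c'], 'd'], 'e']
```

The stack depth grows with the nesting depth of the input, which is exactly the unbounded memory a finite automaton does not have.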

In programming languages, regular expressions are usually used to describe lexical rules. However, the expressive power of regular expressions is limited: they cannot express constructs such as matching parentheses, so the more expressive context-free grammar is needed. In a compiler, a regular grammar is typically used for the lexical structure and a context-free grammar for the syntax. Which rules of a programming language should be lexical and which syntactic? A simple approach is to make every rule that a regular grammar can express a lexical rule, that is, to express as much as possible with the regular grammar, and to leave to the context-free grammar what regular expressions cannot represent, such as the nested { statement; } blocks of C.

Some rules of a language cannot be described even by a context-free grammar, such as declaring variables before use and type matching. These are usually called (static) semantics and are checked during the compiler's static semantic analysis phase.
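The lexical side of this split can be sketched with Python's `re` module. The token names and the tiny C-like token set below are assumptions chosen for illustration. The lexer only cuts the input into a flat list of tokens; checking that braces nest correctly is left to a parser.

```python
import re

# One alternative per token class; re.VERBOSE lets the pattern span lines.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>[0-9]+)
  | (?P<IDENT>[A-Za-z_][A-Za-z_0-9]*)
  | (?P<PUNCT>[{}();=+])
  | (?P<SKIP>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Split `text` into (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {text[pos]!r} at position {pos}"
            )
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x = 42;"))
# [('IDENT', 'x'), ('PUNCT', '='), ('NUMBER', '42'), ('PUNCT', ';')]
```

Note that `tokenize("{ { x; }")` succeeds even though the braces are unbalanced, because balance is a syntactic property, not a lexical one.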

 
