Principles of Modern Compilation, Chapter II: LL(k) Parsing

Last Update:2015-05-02 Source: Internet

Author: User

LL(k) parsing is based on predictive parsing, so let us look at predictive parsing first. Consider an expression grammar that includes the productions E -> E + T and E -> T.

When this grammar is used to parse (1*2-3)+4 and (1*2-3), the former must begin with E -> E + T, while the latter must begin with E -> T. How do we decide which production to use? This is the job of predictive parsing: we build a predictive parser, and LL(k) is one such technique. The key to predictive parsing is constructing a conflict-free predictive parsing table. The parser consults this table based on its current state to determine which production to use next.

Building a predictive parsing table requires two kinds of sets: FIRST sets and FOLLOW sets. Let γ be a string of terminals and nonterminals; FIRST(γ) is the set of terminals that can begin any string derivable from γ. Let A be a nonterminal; FOLLOW(A) is the set of all terminals that can appear immediately after A. The two sets can be computed as follows:

Computing the FIRST set:

FIRST sets are ultimately needed for the strings on the right-hand sides of productions, but the key is to find the FIRST set of each nonterminal. Since the FIRST set of a terminal is the terminal itself, the FIRST set of any string is easy to obtain once the FIRST sets of the nonterminals are known.

1. Direct collection: for a production of the form U -> a ... (where a is a terminal), add a to FIRST(U).
2. Repeated propagation: for a production of the form U -> P ... (where P is a nonterminal), add the entire contents of FIRST(P) to FIRST(U). If P can derive ε, the symbols after P contribute in the same way.

Computing the FOLLOW set:

FOLLOW sets are defined for nonterminals: FOLLOW(U) is the set of all terminals that can appear immediately after the nonterminal U. In particular, "#" (the end-of-input marker) belongs to the FOLLOW set of the start symbol.

1. Repeated propagation: for a production of the form U -> ... P (where P is a nonterminal), add the entire contents of FOLLOW(U) to FOLLOW(P).
2. Direct collection: for every production whose right-hand side has the form "... U a ..." (where a is a terminal), add a to FOLLOW(U).
3. Direct collection: for a right-hand side of the form "... U P a ..." (where P is a nonterminal), add FIRST(P), without ε, to FOLLOW(U). If FIRST(P) contains ε, then FIRST(a) is also added to FOLLOW(U).
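The rules above can be sketched as a fixed-point computation in Python: apply the rules repeatedly until no set grows any further. This is a minimal sketch, not a definitive implementation. The grammar (Z -> d | X Y Z, Y -> ε | c, X -> Y | a), the names `GRAMMAR`, `compute_first` and `compute_follow`, and the use of the empty string to stand for ε are all assumptions made for illustration.

```python
EPS = ""   # stands for the empty string ε (an assumption of this sketch)
END = "#"  # end-of-input marker, as in the text

# Illustrative grammar (assumed for this sketch):
#   Z -> d | X Y Z      Y -> ε | c      X -> Y | a
GRAMMAR = {
    "Z": [["d"], ["X", "Y", "Z"]],
    "Y": [[], ["c"]],
    "X": [["Y"], ["a"]],
}
START = "Z"


def first_of_string(symbols, first):
    """FIRST of a symbol string; contains EPS if the whole string can derive ε."""
    result = set()
    for sym in symbols:
        # The FIRST set of a terminal is the terminal itself.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result  # this symbol cannot vanish, so stop here
    result.add(EPS)  # every symbol (or none at all) could derive ε
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # repeat the propagation step until nothing new is added
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def compute_follow(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)  # "#" follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue  # FOLLOW is defined only for nonterminals
                    rest = first_of_string(body[i + 1:], first)
                    new = rest - {EPS}  # ε never enters a FOLLOW set
                    if EPS in rest:     # everything after sym can vanish
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, START, FIRST)
```

Representing ε as a member of the FIRST sets follows the convention used in the text; an alternative design keeps a separate "nullable" flag per nonterminal and leaves ε out of the sets.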
Note that ε can appear only in FIRST sets, never in FOLLOW sets.

From the methods above we can see that the FIRST set of a nonterminal A is the set of terminals with which A can begin. If FIRST(A) contains ε, then A can be skipped altogether, and to predict what happens after A derives the empty string we build the FOLLOW set. Both sets, then, exist to serve predictive parsing.

The predictive parsing table is a two-dimensional table in which each row is labeled with a nonterminal and each column with a terminal. Given the FIRST and FOLLOW sets, the table is built with the following rule: take a production A -> γ from the production set; if the terminal a is in FIRST(γ), place A -> γ in the cell (A, a); if γ can derive ε and a is in FOLLOW(A), also place A -> γ in the cell (A, a). In this way we obtain the predictive parsing table that corresponds to the grammar.

How is this table used? The data being analyzed is produced by lexical analysis, and from the parser's point of view the tokens produced by the lexer are the terminals. We use a stack to record the symbols currently being analyzed (recursion can generally be used instead of an explicit stack; see the code on page 47 of the English edition of the Tiger Book). We obtain a terminal a from the lexer and look up the table with a and the symbol on top of the stack, which determines the production to use. The right-hand side of that production is pushed onto the stack from right to left, so that its leftmost symbol ends up on top and the derivation stays leftmost, and the analysis continues with a and the new top of the stack.
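The table-construction rule described above can be sketched in Python as follows. This is a hedged illustration only: the grammar, its FIRST and FOLLOW sets (written out by hand here), and the names `PRODUCTIONS`, `build_table` and `CONFLICTS` are assumptions of the sketch. Each cell holds a list, so that a cell receiving more than one production shows up as a conflict.

```python
from collections import defaultdict

EPS = ""  # stands for ε (an assumption of this sketch)

# Productions of an illustrative grammar (assumed for this sketch):
#   Z -> d | X Y Z      Y -> ε | c      X -> Y | a
PRODUCTIONS = [
    ("Z", ("d",)),
    ("Z", ("X", "Y", "Z")),
    ("Y", ()),
    ("Y", ("c",)),
    ("X", ("Y",)),
    ("X", ("a",)),
]

# FIRST and FOLLOW sets of that grammar, written out by hand.
FIRST = {"X": {"a", "c", EPS}, "Y": {"c", EPS}, "Z": {"a", "c", "d"}}
FOLLOW = {"X": {"a", "c", "d"}, "Y": {"a", "c", "d"}, "Z": {"#"}}


def first_of_string(symbols):
    """FIRST of a right-hand side; contains EPS if it can derive ε."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(productions):
    """Map (nonterminal, terminal) to the list of productions placed there."""
    table = defaultdict(list)
    for head, body in productions:
        columns = first_of_string(body)
        if EPS in columns:
            # The body can derive ε: use FOLLOW(head) as well.
            columns = (columns - {EPS}) | FOLLOW[head]
        for terminal in columns:
            table[head, terminal].append((head, body))
    return table


TABLE = build_table(PRODUCTIONS)

# Cells with more than one production make the table unusable for LL(1).
CONFLICTS = {cell: prods for cell, prods in TABLE.items() if len(prods) > 1}
```

For this particular grammar `CONFLICTS` is not empty; a grammar is suitable for LL(1) parsing only when every cell holds at most one production.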
During parsing, this expansion repeats until the terminal a itself is on top of the stack; a is then matched and discarded, the next terminal is read, and the analysis continues. If at some step no production can lead to the terminal a, the input contains a syntax error.

The predictive parsing process described above is called the LL(1) algorithm: the first L stands for a left-to-right scan of the input, the second L for leftmost derivation, and the 1 for one symbol of lookahead. In other words, the parser reads the terminals produced by the lexer from left to right, follows a leftmost derivation, and looks ahead at only one terminal at a time to decide its next action.

In the table above, however, the cell (Z, d) contains two productions. The parser cannot choose between them, so the grammar is ambiguous and is not an LL(k) grammar. Such a grammar cannot be parsed predictively as written; it has to be rewritten, or a more powerful technique such as LR(k) has to be used.

Two common sources of such conflicts follow. The first is left recursion. With the productions E -> E + T and E -> T, FIRST(T) is a subset of FIRST(E + T), so a conflict is unavoidable. The left-recursive productions are replaced by right-recursive ones, for example E -> T E', E' -> + T E', E' -> ε; this is known as eliminating left recursion.

The second case arises when two productions for the same nonterminal begin with the same symbols. The remedy is called left factoring: the common prefix is kept in one production and the differing tails are moved into a new nonterminal.

That is the whole of LL(k) grammars. They are less powerful than LR(k) grammars (the comparison is made in the LR(k) chapter), but a predictive parsing table is very simple to construct. For cases that do not need the power of LR(k), a corresponding LL(k) parsing table can be set up quickly.
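The stack-based LL(1) procedure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the article's own code: the table is written by hand for an expression grammar with left recursion already eliminated (E -> T E', E' -> + T E' | - T E' | ε, T -> F T', T' -> * F T' | ε, F -> ( E ) | n). The nonterminal names E', T', F, the token "n" for a number, and the helper names `tokenize` and `parse` are all hypothetical, and the tokenizer accepts single-digit numbers only.

```python
END = "#"  # end-of-input marker, as in the text
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

# Hand-written LL(1) table: (nonterminal, lookahead) -> right-hand side.
# An empty list means the nonterminal derives ε.
TABLE = {
    ("E", "n"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", "-"): ["-", "T", "E'"],
    ("E'", ")"): [], ("E'", END): [],
    ("T", "n"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [], ("T'", "-"): [], ("T'", ")"): [], ("T'", END): [],
    ("F", "n"): ["n"], ("F", "("): ["(", "E", ")"],
}


def tokenize(text):
    """Turn text into terminals; every digit becomes the token "n"."""
    tokens = []
    for ch in text:
        if ch.isspace():
            continue
        if ch.isdigit():
            tokens.append("n")
        elif ch in "+-*()":
            tokens.append(ch)
        else:
            raise SyntaxError(f"unexpected character {ch!r}")
    tokens.append(END)
    return tokens


def parse(text):
    """Return the productions used, in order, or raise SyntaxError."""
    tokens = tokenize(text)
    stack = [END, "E"]  # start symbol on top of the end marker
    pos = 0
    used = []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, lookahead))
            if body is None:
                raise SyntaxError(f"unexpected {lookahead!r} while expanding {top}")
            used.append((top, tuple(body)))
            stack.extend(reversed(body))  # push right to left: leftmost on top
        elif top == lookahead:
            pos += 1  # terminal matched: discard it and read the next one
        else:
            raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
    return used
```

Calling `parse("(1*2-3)+4")` returns the sequence of productions of a leftmost derivation, while malformed input such as `parse("1+")` raises `SyntaxError`. With the left recursion removed, the parser no longer has to guess at the start whether a "+" will follow: that decision is postponed to E', where a single lookahead terminal settles it.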

This article is an English version of an article originally written in Chinese on aliyun.com.
