Interesting questions in Lex

Last Update:2015-04-13 Source: Internet

Author: User

Lex and Yacc are the classic lexical-analyzer and parser generators on UNIX. On Linux their counterparts are flex and bison. They are often used with C and C++ to build text-analysis programs.

This is not an introductory article, so we assume you already know the basic syntax of Lex and Yacc.
For an introduction, refer to IBM's "Yacc and Lex QuickStart".

Here we discuss some interesting uses of Lex and a few things to watch out for.

Recognition of strings

Ordinary regular-expression matching problems rarely cause trouble, so here is a harder question: how do you recognize a string literal in C?

A string literal usually looks like this:

"some \"string\" problem.\n"

Notice that it contains escape characters and quotation marks. If we simply write the regular expression as follows:

\"[^"]*\"

then the pattern cannot represent a quotation mark inside the string, so it does not meet the requirements of the C language.
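To make the failure concrete, here is a minimal hand-written sketch in C of what the naive pattern does. The function name match_naive_string is hypothetical and not part of Lex; it only imitates the pattern on a NUL-terminated buffer.

```c
#include <stddef.h>

/* Hypothetical helper imitating the naive pattern \"[^"]*\" :
 * returns the number of characters matched at the start of s,
 * or 0 if s does not start with a match. */
size_t match_naive_string(const char *s)
{
    size_t i = 0;

    if (s[i] != '"')                      /* opening quote */
        return 0;
    i++;

    while (s[i] != '\0' && s[i] != '"')   /* [^"]* : anything but a quote */
        i++;

    return s[i] == '"' ? i + 1 : 0;       /* closing quote, or no match */
}
```

Applied to the sample string above, this stops at the first escaped quote and reports only the 8 characters "some \" as the literal.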

So we split the inside of the expression into cases. A bare quotation mark is still not allowed, but the escaped form \" is, so we change the regular expression as follows:

\"(\\\"|[^"])*\"

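Now \" can be used to escape a quotation mark. But stopping here would be hasty, because one important case remains: the second alternative can still match a backslash on its own. A backslash is an escape character and must always be paired with the character that follows it; a lone backslash is not valid. With the pattern above, a string ending in an escaped backslash such as "a\\" is misread: the first backslash is consumed alone, the second is paired with the closing quotation mark, and the match runs past the end of the string. So we add a restriction that keeps a backslash from appearing unpaired, and the regular expression becomes: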

\"(\\.|[^"\\])*\"

This is the regular expression for recognizing C string literals.
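As a rough illustration of what the final pattern accepts, the following C sketch walks a buffer one alternative at a time. The name match_c_string is hypothetical, and the function assumes a NUL-terminated input; it follows the Lex convention that . does not match a newline.

```c
#include <stddef.h>

/* Hypothetical helper mirroring \"(\\.|[^"\\])*\" :
 * returns the length of the string literal that starts at s[0],
 * or 0 if s does not start with a complete literal. */
size_t match_c_string(const char *s)
{
    size_t i = 0;

    if (s[i] != '"')                 /* opening \" */
        return 0;
    i++;

    for (;;) {
        if (s[i] == '\0')
            return 0;                /* input ended before the closing quote */
        if (s[i] == '"')
            return i + 1;            /* closing \" */
        if (s[i] == '\\') {          /* \\. : a backslash plus exactly one character */
            if (s[i + 1] == '\0' || s[i + 1] == '\n')
                return 0;            /* '.' in Lex does not match a newline */
            i += 2;
        } else {
            i++;                     /* [^"\\] : any other character */
        }
    }
}
```

Because a backslash is always consumed together with the next character, an escaped quote never ends the literal and an escaped backslash never swallows the closing quote. On the sample string it matches all 28 characters.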

Recognition of comments

With string recognition solved, we run into another case: the C language has two kinds of comments. How do we recognize them correctly?

// hello world
/**
 * hello world
 */

The first kind is easy to implement in the same way as above, since the comment simply cannot contain a line break:

"//"[^\n]*

The second kind is more complex, but it can still be matched with a single regular expression:

"/*"([^\*]|(\*)*[^\*/])*(\*)*"*/"

This regular expression is fairly complicated, so let's break it down:

"/*"   ( [^\*]  |  (\*)*  [^\*/] )*  (\*)*   "*/"

The group ( [^\*] | (\*)* [^\*/] )* matches the body of the comment: either a character that is not * , or a run of * followed by a character that is neither * nor / . Someone may ask: why can't the character after a run of * be another * ?

Because if it could, the group might consume the * that belongs to the closing */ , and the / after it would then be matched as an ordinary character, so the pattern could run past the end of the comment. The restriction prevents this. And since a comment may end with several consecutive * , as in **/ , a trailing (\*)* is added just before the closing "*/".

Other regular-expression engines offer simpler solutions here. For details, see the English blog post "Finding Comments in Source Code Using Regular Expressions".

In practical use of Lex there is also a more convenient way: match only the opening "/*" and let a fixed piece of C code consume the rest of the comment. The method is as follows:

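"/*"                    { comment(); }

%%

/* Consumes the rest of a block comment; the rule above has already
   matched the opening delimiter. Declare "void comment(void);" in the
   definitions section so that the rule can call it. */
void comment(void)
{
    /* input() returns an int; 0 marks the end of input in classic lex
       (some flex versions return EOF instead). */
    int c, c1;

loop:
    /* Echo characters until the next '*' or the end of input.
       Remove the putchar calls to discard the comment text instead. */
    while ((c = input()) != '*' && c != 0)
        putchar(c);

    /* A '*' that is not followed by '/' does not end the comment:
       push the character back and keep scanning. */
    if ((c1 = input()) != '/' && c != 0)
    {
        unput(c1);
        goto loop;
    }

    if (c != 0)
        putchar(c1);
}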

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
