Compiler Principles: A Lexical Analysis Program

Last Update:2018-07-26 Source: Internet

Author: User



These are my notes from the lab exercises in my compiler principles course. The course has finished, so I have now collated and published them.

I. Experimental task

Read a classic compiler's lexical analysis source program, then write a lexical analyzer for a language in C or Java.

II. Contents of the experiment

1. Read the lexical analysis source program of an existing, classic compiler.
   Select a compiler such as TINY or PL/0 (you need to bring your own source code). Read its lexical analysis source and understand how the analyzer is built from state-diagram code. In particular, describe in some detail the purpose of the relevant functions and the important variables. Adding what you learned along the way is encouraged.

2. Based on the language's keywords, lexical units, comments, and so on, determine the keyword table and draw the DFA diagrams for all lexical units and comments.

3. Write the lexical analyzer for your chosen language, building on the lexical analyzer you studied.

4. Prepare two or three test cases, including both positive and negative examples, to test the results.

III. Lexical analysis

Lexical analysis of TINY

TINY's tokens

TINY has very few tokens; it is, after all, a simple language. But at first I was confused about where these tokens come from and how to tell them apart. I remember spending an afternoon reading TINY's lexical analysis source and the lab handout before I understood how to write it, and then one evening back in the dorm rewriting it on top of TINY. < (￣︶￣) >

Simply put, a token is a category within a language. When you read in a string, you have to identify what it is and which type it belongs to. So as you read characters according to certain rules, you must decide what each one is, whether it is an error, and how to classify it; the tokens are the categories you classify into.

State transitions

Given the token types, how do you decide which type a string belongs to? With a DFA transition diagram.

The diagram is simple. It has a START state, an INNUM state for numbers, an INID state for identifiers, the special symbols + - * / = < ( ) ; and a DONE state. From START, the first character read selects the next state. Each further character keeps the analyzer in that state as long as it fits the state's type. When the next character does not fit the current state, that character is not consumed, the string read so far ends, and its type is the type of the current state. After one string has been classified, the process repeats until all characters have been read.
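The loop described above can be sketched in C++ as follows. This is a minimal illustration, not the code from the experiment: the names `scanLine`, `Kind`, and `Token` are hypothetical, the standard `std::isalpha` and `std::isdigit` stand in for the author's own helpers, and every special symbol is treated as a single character (the two-character `>>` and `<<` are left out).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// States of the simplified DFA described in the text.
enum class State { START, INNUM, INID, DONE };

// Token kinds for this sketch only (hypothetical, smaller than the real table).
enum class Kind { NUM, ID, SYMBOL };

using Token = std::pair<Kind, std::string>;

// Scan one line into tokens. A character that does not fit the current
// state is not consumed: it ends the token and is read again from START.
std::vector<Token> scanLine(const std::string &src)
{
    std::vector<Token> tokens;
    std::size_t pos = 0;
    while (pos < src.size()) {
        State state = State::START;
        Kind kind = Kind::SYMBOL;
        std::string text;
        while (state != State::DONE) {
            assert(pos <= src.size());
            const bool atEnd = (pos == src.size());
            const unsigned char c = atEnd ? '\0' : src[pos];
            bool consume = !atEnd; // advance past c?
            bool save = !atEnd;    // does c become part of the token?
            switch (state) {
            case State::START:
                if (atEnd) {
                    state = State::DONE;
                } else if (std::isdigit(c)) {
                    state = State::INNUM;
                    kind = Kind::NUM;
                } else if (std::isalpha(c)) {
                    state = State::INID;
                    kind = Kind::ID;
                } else if (std::isspace(c)) {
                    save = false;        // skip blanks, stay in START
                } else {
                    state = State::DONE; // one-character special symbol
                }
                break;
            case State::INNUM:
                if (atEnd || !std::isdigit(c)) {
                    consume = save = false; // leave c for the next token
                    state = State::DONE;
                }
                break;
            case State::INID:
                if (atEnd || !std::isalpha(c)) {
                    consume = save = false; // leave c for the next token
                    state = State::DONE;
                }
                break;
            case State::DONE:
                break;
            }
            if (save) text += static_cast<char>(c);
            if (consume) ++pos;
        }
        if (!text.empty()) tokens.emplace_back(kind, text);
    }
    return tokens;
}
```

Calling `scanLine("x = 42;")` yields four tokens: the identifier `x`, the symbol `=`, the number `42`, and the symbol `;`. The `consume` and `save` flags are what implement "do not read the character": the offending character stays in the input and starts the next token.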

Adding the remaining states that need to be distinguished gives TINY's full lexical analysis transition diagram.

IV. Programming

The lexical analyzer I wrote is really just TINY's with a few modifications, so the lexical design is roughly the same.

Tokens

Reserved words:
cin while then cout end
Special symbols:
= + - * / ( ) ; >> <<
Comment:
{This is a comment}

Transition table
See the state transitions above.
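Compared with TINY, the token list above adds the two-character symbols `>>` and `<<` and keeps the `{ ... }` comment. The sketch below shows one way to recognize them. It is a simplified stand-in for the INCOMMENT, ININ, and INOUT states, not the experiment's code: `skipComment` and `specialSymbols` are hypothetical names, treating an unterminated comment as running to the end of the input is an assumption, and a lone `>` or `<` is simply ignored here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Return the index just past the comment that starts at src[pos].
// Assumption: if the closing '}' is missing, the rest of the input
// is treated as part of the comment.
std::size_t skipComment(const std::string &src, std::size_t pos)
{
    assert(pos < src.size() && src[pos] == '{');
    const std::size_t close = src.find('}', pos);
    return close == std::string::npos ? src.size() : close + 1;
}

// Collect the special symbols found in src, skipping { ... } comments.
// Identifiers, numbers, and blanks are ignored in this sketch.
std::vector<std::string> specialSymbols(const std::string &src)
{
    const std::string singles = "=+-*/();";
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < src.size()) {
        const char c = src[i];
        if (c == '{') {
            i = skipComment(src, i); // comments may span several lines
        } else if ((c == '>' || c == '<') &&
                   i + 1 < src.size() && src[i + 1] == c) {
            out.push_back(std::string(2, c)); // ">>" or "<<"
            i += 2;                           // consume both characters
        } else {
            if (singles.find(c) != std::string::npos)
                out.push_back(std::string(1, c));
            ++i;
        }
    }
    return out;
}
```

For the input `cin  >>{sdfsadf} x;` this returns `>>` followed by `;`, with the comment skipped in between.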

Code walkthrough

Macro definitions: the maximum token length is 40 characters, and there are 5 reserved words.

#define MAXTOKENLEN 40
#define MAXRESERVED 5

Define an enumeration type that represents the set of states of the DFA.

typedef enum { // DFA state set
    START,
    INCOMMENT,
    INNUM,
    INID,
    ININ,
    INOUT,
    DONE
} StateType;

Define an enumeration type that represents the token types the input string is matched against.

typedef enum { // token types, used to classify the input
    /* exceptional states */
    ENDFILE,
    ERROR,
    /* reserved words */
    CIN,
    COUT,
    WHILE,
    THEN,
    END,
    /* DFA states */
    ID,
    NUM,
    /* special symbols */
    IN,
    OUT,
    EQ,
    PLUS,
    MINUS,
    TIMES,
    OVER,
    LPAREN,
    RPAREN,
    SEMI
} TokenType;

Define a struct that maps each reserved word to its token type, so that a matched reserved word can be looked up and output.

static struct // reserved-word table, used for output
{
    const char *str;
    TokenType tok;
} reservedWords[MAXRESERVED] = {
    {"cin", CIN},
    {"while", WHILE},
    {"then", THEN},
    {"cout", COUT},
    {"end", END}};

Define a function returning bool that determines whether the input character is a letter.

bool isLetter(char c) // is the character a letter?
{
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
    {
        return true;
    }
    else
    {
        return false;
    }
}

Determine whether the matched string is a reserved word; if it is not, it is an identifier.

static TokenType reservedLookup(char *s)
{
    for (int i = 0; i < MAXRESERVED; i++)
        if (!strcmp(s, reservedWords[i].str))
            return reservedWords[i].tok;
    return ID;
}
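A short self-contained version of the lookup shows how it behaves. The enumeration here is cut down to the reserved words plus `ID`, and the parameter is made `const char *` so string literals can be passed; both are simplifications for this sketch rather than the experiment's exact declarations.

```cpp
#include <cassert>
#include <cstring>

// Reduced token set for this sketch: the five reserved words plus ID.
enum TokenType { CIN, COUT, WHILE, THEN, END, ID };

const int MAXRESERVED = 5;

struct ReservedWord {
    const char *str;
    TokenType tok;
};

const ReservedWord reservedWords[MAXRESERVED] = {
    {"cin", CIN},
    {"while", WHILE},
    {"then", THEN},
    {"cout", COUT},
    {"end", END}};

// Linear search over the table; anything not found is an identifier.
TokenType reservedLookup(const char *s)
{
    assert(s != nullptr); // strcmp needs a valid C string
    for (int i = 0; i < MAXRESERVED; i++)
        if (std::strcmp(s, reservedWords[i].str) == 0)
            return reservedWords[i].tok;
    return ID;
}
```

Because `strcmp` compares bytes exactly, the lookup is case-sensitive: `reservedLookup("while")` gives `WHILE`, while `reservedLookup("While")` and `reservedLookup("x")` both give `ID`. With only five entries a linear search is enough.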

Output a string according to the type of the matched token.

void printToken(TokenType token, const char tokenString[]); // output function
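Only the prototype is shown here, so the following is a guess at what such a function could look like, based on the output format in the test section (`reserved word:cin`, `id, name= x`, symbols printed as themselves). The helper `formatToken` is hypothetical, and the formats for numbers, errors, and end of file are assumptions because the test output does not show them.

```cpp
#include <cassert>
#include <iostream>
#include <string>

enum TokenType { ENDFILE, ERROR, CIN, COUT, WHILE, THEN, END, ID, NUM,
                 IN, OUT, EQ, PLUS, MINUS, TIMES, OVER, LPAREN, RPAREN, SEMI };

// Hypothetical helper: build the text for one token so it can be
// checked without capturing standard output.
std::string formatToken(TokenType token, const char tokenString[])
{
    assert(tokenString != nullptr);
    switch (token) {
    case CIN: case COUT: case WHILE: case THEN: case END:
        return std::string("reserved word:") + tokenString;
    case ID:
        return std::string("id, name= ") + tokenString;
    case NUM:
        return std::string("num, val= ") + tokenString; // assumed format
    case ENDFILE:
        return "EOF";                                    // assumed format
    case ERROR:
        return std::string("error: ") + tokenString;    // assumed format
    default:
        return tokenString; // special symbols print as themselves
    }
}

// Same signature as the prototype in the text.
void printToken(TokenType token, const char tokenString[])
{
    std::cout << formatToken(token, tokenString) << '\n';
}
```

The line-number prefix seen in the test output (for example `3:`) is not produced here; it is presumably added by the caller.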

The function that matches the input string, i.e. the lexical analysis itself.

void getToken(string ss); // lexical analysis

V. Experimental test

Input

{sdfs
adf}
cin  >>{sdfsadf} x;
{sdfsadf}
cin>>y;
while (cin>>z) then{sdfsadf}
x=x z y;
cout << x;
end;

Output

1:{sdfs
2:adf}
3:cin  >>{sdfsadf} x;
        3:reserved word:cin
        3: >>
        3:id, name= x
        3:;
4:{sdfsadf}
5:cin>>y;
        5:reserved word:cin
        5: >>
        5:id, name= y
        5:;
6:while (cin>>z) then{sdfsadf}
        6:reserved word:while
        6: (
        6:reserved word:cin
        6: >>
        6:id, name= z
        6:)
        6:reserved word:then
7:x=x z y;
        7:id, name= x
        7: =
        7:id, name= x
        7:id, name= z
        7:id, name= y
        7:;
8:cout << x;
        8:reserved word:cout
        8: <<
        8:id, name= x
        8:;
9:end;
        9:reserved word:end
        9:;

VI. Summary of the experiment

To be honest, when I started writing this post my mind was a bit blank, because I no longer remembered how TINY's analysis worked. I could only grit my teeth, reread the lab handout, and slowly piece it back together. The result is the post you are reading.

I really did spend two or three hours that afternoon reading the lab handout and the TINY source code, slowly working out what everything was and then planning what to change it into. I have revised part of the experiment description above: the original requirement let us choose our own language and encouraged us to define one. I was trying to write a language for math, and that is the input shown above.

In practice that turned out to be hard to write, especially the whole front end, and I felt more and more out of my depth. I came to understand this in the later experiments and did not insist on writing my own language. In fact, none of my classmates wrote a complete front end for a language of their own; everyone adapted either TINY or PL/0.

Also, this is where I fell into the pit of enumerations and structs. In some of the later experiments the code went crazy with structs and the structures became very complex. In short, if you go on to read the code for my later experiments, you will see what insanity looks like ...

VII. Downloads

For the full code, see Mathlex.
