(Compiler Principles) Implementing a Lexical Analyzer in Java

1. Preamble

I have recently been studying compiler principles and needed to implement a lexical analyzer in a programming language. It is actually fairly simple and mostly comes down to string manipulation. With regular expressions it would probably be simpler still, but I am not familiar with them, so I implemented it in plain Java, which also reinforced my grasp of Java's character-handling methods.

The algorithm behind this analyzer is not difficult, but the logic takes some thought. Put plainly, there are many cases, and some of them overlap, so the challenge is to keep from getting tangled up and to turn your ideas into code.

Enough preamble. How I thought about the problem and how I handled it are described below.

2. Problem statement

I. Purpose of the experiment

Design, write, and debug a lexical analysis program to deepen understanding of how lexical analysis works.

II. Content of the experiment

2.1 The simple lexicon to be analyzed

(1) Keywords (all keywords are lowercase):
begin if then while do end

(2) Operators and delimiters:
:= + - * / < <= <> > >= = ; ( ) #

(3) All other words are identifiers (ID) and integer constants (NUM), given by the following formal definitions:

ID = letter (letter | digit)*
NUM = digit digit*

(4) Whitespace consists of blanks, tabs, and line breaks. It generally separates IDs, NUMs, operators, delimiters, and keywords, and is usually ignored during lexical analysis.
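
The formal definitions in (3) translate directly into character-by-character checks. The following is a minimal sketch that avoids regular expressions. The class name LexRules and its method names are illustrative and not taken from the original source, and it assumes "letter" and "digit" mean ASCII letters and digits.

```java
final class LexRules {
    private LexRules() {}

    // letter: assumed to be an ASCII letter, a-z or A-Z
    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // digit: 0-9
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // ID = letter (letter | digit)*
    static boolean isId(String s) {
        if (s == null || s.isEmpty() || !isLetter(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!isLetter(c) && !isDigit(c)) {
                return false;
            }
        }
        return true;
    }

    // NUM = digit digit*
    static boolean isNum(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!isDigit(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Keywords such as begin also satisfy the ID pattern, so a word has to be checked against the keyword list before it is accepted as an ID.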

2.2 Category codes for each kind of word symbol:

2.3 Function of the lexical analysis program:

Input: the source program string for the given grammar.

Output: a sequence of pairs (syn, token or num),

where: syn is the category code of the word;
      token is the string of the word itself;
      num is an integer constant.

For example, given the source program

begin x:=9; if x>9 then x:=2*x+1/3; end #

lexical analysis outputs the following sequence:

(1,begin) (10,x) (18,:=) (11,9) (;) (2,if) ...
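
One way to represent these output pairs is a small value class whose toString prints the (syn,token) form. This is only a sketch: the class name Token and the format helper are hypothetical, and the only category codes it is used with here are the ones visible in the example output (1 for begin, 10 for an ID, 18 for :=, 11 for a NUM).

```java
import java.util.List;

final class Token {
    final int syn;      // category code of the word
    final String text;  // the word itself, or the digits of an integer constant

    Token(int syn, String text) {
        this.syn = syn;
        this.text = text;
    }

    // Prints one pair in the (syn,token) layout used by the example output.
    @Override
    public String toString() {
        return "(" + syn + "," + text + ")";
    }

    // Joins the pairs with single spaces.
    static String format(List<Token> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Token t : tokens) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(t);
        }
        return sb.toString();
    }
}
```

Keeping the code and the text together in one object avoids passing loosely typed maps around, at the cost of one extra class.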


3. Mind map

First, the mind map, which shows how I thought about this problem.


4. Code implementation

Key code:

    // Excerpt from the analyzer class; it needs java.util.ArrayList, HashMap, List and Map.

    /**
     * Checks each substring produced by splitting on the separators and
     * decides whether it is a valid word.
     * @param list the candidate substrings
     * @return each recognized word together with its category code
     */
    @SuppressWarnings("rawtypes")
    public List<Map> getStringAndSortNum(String[] list) {
        char firstChar;                    // first character of the current word
        String keyword = "", sortNum = ""; // word and category code to output
        String cType, word;                // type of the first character; word being judged
        // mList is returned once the whole list has been judged
        List<Map> mList = new ArrayList<Map>();
        for (int i = 0; i < list.length; i++) {
            // map holds one successfully judged word and its category code
            Map<String, String> map = new HashMap<String, String>();
            word = list[i];
            // Splitting on spaces can leave empty strings or bare line breaks; skip them
            if (word == null || word.trim().isEmpty()) continue;
            firstChar = word.charAt(0);
            // Get the type of this character
            cType = getCharType(firstChar);
            if (cType.equals("letter")) {
                if (firstChar == 'w' || firstChar == 'i' || firstChar == 'b'
                        || firstChar == 'd' || firstChar == 'e' || firstChar == 't') {
                    // The word starts like a keyword, so look it up as one first
                    Map<String, String> m = getKeyWord(word);
                    if (m != null) {
                        keyword = m.get("keyword");
                        sortNum = m.get("sortNum");
                    } else if (isID(word)) {
                        // Not a keyword, it only begins with a keyword's first letter
                        keyword = word;
                        sortNum = g.getSortNum("ID") + "";
                    } else {
                        System.out.println("\"" + word + "\" is not a valid ID; position: word " + (i + 1));
                    }
                } else if (isID(word)) {
                    // Starts with a letter, but still has to be a valid ID
                    keyword = word;
                    sortNum = g.getSortNum("ID") + "";
                } else {
                    System.out.println("\"" + word + "\" is not a valid ID; position: word " + (i + 1));
                }
            }
            if (cType.equals("digit")) {
                if (isNum(word)) {
                    keyword = word;
                    sortNum = g.getSortNum("NUM") + "";
                } else {
                    System.out.println("\"" + word + "\" is not a valid NUM; position: word " + (i + 1));
                }
            }
            if (cType.equals("opts")) {
                // Length 1 means a single-character operator, length 2 a two-character one
                int len = word.length();
                if (len == 1 && (isSingleOpt(word) || isEndOpt(word))) {
                    keyword = word;
                    sortNum = g.getSortNum(word) + "";
                } else if (len == 2 && isDoubleOpt(word)) {
                    keyword = word;
                    sortNum = g.getSortNum(word) + "";
                } else {
                    System.out.println("\"" + word + "\" is not a valid operator; position: word " + (i + 1));
                }
            }
            // Nothing was recognized for this word, so there is nothing to record
            if (keyword.isEmpty() || sortNum.isEmpty()) continue;
            map.put("keyword", keyword);
            map.put("sortNum", sortNum);
            mList.add(map);
            keyword = "";
            sortNum = "";
        }
        return mList;
    }
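
The key code relies on helpers that the article does not show. As a rough illustration of what they might look like, here is a sketch. The bodies are guesses and not the author's code. Only the helper names (written here in camelCase) and the type strings "letter", "digit" and "opts" come from the excerpt; the class name CharHelpers and the "other" return value are invented for this example, and the excerpt's separate end-symbol check is folded into isSingleOpt.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class CharHelpers {
    private CharHelpers() {}

    // Characters that can start an operator or delimiter in the lexicon of section 2.1
    private static final String OPT_CHARS = ":+-*/<>=;()#";

    // Two-character operators from section 2.1
    private static final Set<String> DOUBLE_OPTS =
            new HashSet<String>(Arrays.asList(":=", "<=", "<>", ">="));

    // Classifies a character by the kind of word it can start.
    static String getCharType(char c) {
        if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
            return "letter";
        }
        if (c >= '0' && c <= '9') {
            return "digit";
        }
        if (OPT_CHARS.indexOf(c) >= 0) {
            return "opts";
        }
        return "other"; // assumption: the caller reports anything else as invalid
    }

    // One-character operators and delimiters; ':' on its own is not a word.
    static boolean isSingleOpt(String s) {
        return s.length() == 1
                && s.charAt(0) != ':'
                && OPT_CHARS.indexOf(s.charAt(0)) >= 0;
    }

    // Two-character operators such as := and <=.
    static boolean isDoubleOpt(String s) {
        return DOUBLE_OPTS.contains(s);
    }
}
```

With these definitions a word such as == is classified as "opts" by its first character but then rejected, because it is neither a single nor a double operator of this lexicon.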

Result of running the code:

Test file:

Begin x: = 9; If x > 9 then x: = 2 * x + 1/3
x = 3;
Begin
$y: = 2
if x = = $y:
y = qqwe221;

A lexical analyzer can look very complicated, but once you draw a mind map or some other flow diagram, the approach becomes clear in your head: what to do next and how to do it. Once you have worked that out, record it in whatever form you like; the implementation then only has to reproduce those ideas in code.

Developing good programming habits is sometimes worth more than mastering a language.

The source code and test files for this lexical analyzer are on GitHub for anyone who wants to look:
https://github.com/Ashplumage/Compile-principle
