An In-Depth Look at the JVM Source Code Compilation Mechanism

Source: Internet
Author: User


These notes on the JVM source code compilation mechanism are based on the book "Distributed Java Applications: Basics and Practice". After studying it, I summarize the key points below.
I have not blogged recently, and my mood has been a mess.

 

javac compiles .java files into .class files.
Step 1: Parse the source and enter the symbols into the symbol table

Step 2: Annotation processing

Sun JDK 6 supports this processing.

Step 3: Semantic analysis and generation of class files

For class file generation, see the Gen class.
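The three steps above can be observed from outside the compiler. The following is a minimal sketch, assuming a JDK (not a bare JRE) on Java 9 or later: it compiles an in-memory class through the standard javax.tools API and records the phase events that javac reports through com.sun.source.util.TaskListener. The names CompilePhasesDemo, StringSource, RecordingProcessor, and compileAndRecordPhases are made up for this illustration and are not part of javac.

```java
import com.sun.source.util.JavacTask;
import com.sun.source.util.TaskEvent;
import com.sun.source.util.TaskListener;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

class CompilePhasesDemo {
    /** Hypothetical helper: source text held in memory, so no input file is needed. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) { return code; }
    }

    /** Hypothetical processor for step 2: it claims nothing and only records the classes it is shown. */
    static final class RecordingProcessor extends AbstractProcessor {
        final List<String> seen = new ArrayList<>();
        @Override public Set<String> getSupportedAnnotationTypes() { return Set.of("*"); }
        @Override public SourceVersion getSupportedSourceVersion() { return SourceVersion.latestSupported(); }
        @Override public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
            for (Element e : roundEnv.getRootElements()) {
                seen.add(e.getSimpleName().toString());
            }
            return false; // do not claim any annotations
        }
    }

    /** Compiles one class and returns the kinds of the phase events, in the order they started. */
    static List<String> compileAndRecordPhases(String className, String code, RecordingProcessor processor) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null when no compiler is available
        List<String> phases = new ArrayList<>();
        try {
            String outDir = Files.createTempDirectory("javac-demo").toString();
            JavacTask task = (JavacTask) compiler.getTask(null, null, null,
                    List.of("-d", outDir), null, List.of(new StringSource(className, code)));
            task.setProcessors(List.of(processor));
            task.addTaskListener(new TaskListener() {
                @Override public void started(TaskEvent e) { phases.add(e.getKind().name()); }
                @Override public void finished(TaskEvent e) { /* only start events are recorded */ }
            });
            task.call();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return phases;
    }

    public static void main(String[] args) {
        RecordingProcessor processor = new RecordingProcessor();
        System.out.println(compileAndRecordPhases("Hello", "class Hello { int x; }", processor));
        System.out.println(processor.seen);
    }
}
```

The recorded list should contain PARSE and ENTER (step 1), the annotation processing events (step 2), and ANALYZE and GENERATE (step 3). The exact sequence and any repeats depend on the JDK version.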

Final Mind Map:

Source code analysis:
Lexical analysis in the parse step.

Lexical analysis is all about tokens: this is the stage where tokens are generated.
Let's look at how it is implemented. First check the fields to see what needs to be stored.

We can infer their purpose from the names.
The first two hold temporary state after a token has been generated.
prevToken: a reference to the previous token, kept so that the next operation can look back at it.
savedTokens: stores the list of tokens that have already been generated.
JavaTokenizer: this is the class that actually generates tokens, so we step into it.
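To make the role of these fields concrete, here is a minimal sketch of a scanner with a lookahead buffer. It is modeled loosely on the description above, not copied from javac: the class name LookaheadScanner, the field names token and tokenizer, the peek method, and the use of plain strings as tokens are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class LookaheadScanner {
    private final Iterator<String> tokenizer;                    // stands in for the JavaTokenizer
    private final List<String> savedTokens = new ArrayList<>();  // tokens read ahead but not yet consumed
    private String token;                                        // current token
    private String prevToken;                                    // token that was current before the last advance

    LookaheadScanner(List<String> source) {
        this.tokenizer = source.iterator();
    }

    /** Advance by one token: drain the lookahead buffer first, then ask the tokenizer. */
    void nextToken() {
        prevToken = token;
        token = savedTokens.isEmpty() ? read() : savedTokens.remove(0);
    }

    /** Look at the token `lookahead` positions after the current one (1 = next) without consuming it. */
    String peek(int lookahead) {
        while (savedTokens.size() < lookahead) {
            savedTokens.add(read());
        }
        return savedTokens.get(lookahead - 1);
    }

    String token() { return token; }

    String prevToken() { return prevToken; }

    /** Once the input is exhausted, every further read yields an end-of-file marker. */
    private String read() {
        return tokenizer.hasNext() ? tokenizer.next() : "EOF";
    }

    public static void main(String[] args) {
        LookaheadScanner s = new LookaheadScanner(List.of("int", "x", ";"));
        s.nextToken();
        System.out.println(s.token() + " then " + s.peek(1));
    }
}
```

Keeping the read-ahead tokens in a separate list means the parser can inspect upcoming tokens without losing track of the current and previous ones.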

At a glance, readToken is the core method.
The methods whose names start with scan each scan one kind of input,
and the rest are helpers whose names start with skip, is, add, and so on.
Now analyze the readToken method.

public Token readToken() {
    this.reader.sp = 0;
    this.name = null;
    this.radix = 0;
    boolean var1 = false;
    boolean var2 = false;
    List var3 = null;
    try {
        int var9;
        label474:
        while(true) {
            var9 = this.reader.bp;
            int var4;
            boolean var11;
            switch(this.reader.ch) {
            case '\t':
            case '\f':
            case ' ':
                do {
                    do {
                        this.reader.scanChar();
                    } while(this.reader.ch == 32);
                } while(this.reader.ch == 9 || this.reader.ch == 12);
                this.processWhiteSpace(var9, this.reader.bp);
                break;
            case '\n':
                this.reader.scanChar();
                this.processLineTerminator(var9, this.reader.bp);
                break;
            case '\u000b': case '\u000e': case '\u000f': case '\u0010':
            case '\u0011': case '\u0012': case '\u0013': case '\u0014':
            case '\u0015': case '\u0016': case '\u0017': case '\u0018':
            case '\u0019': case '\u001a': case '\u001b': case '\u001c':
            case '\u001d': case '\u001e': case '\u001f':
            case '!': case '#': case '%': case '&': case '*': case '+':
            case '-': case ':': case '<': case '=': case '>': case '?':
            case '@': case '\\': case '^': case '`': case '|':
            default:
                if(this.isSpecial(this.reader.ch)) {
                    this.scanOperator();
                } else {
                    if(this.reader.ch < 128) {
                        var11 = false;
                    } else {
                        char var13 = this.reader.scanSurrogates();
                        if(var13 != 0) {
                            this.reader.putChar(var13);
                            var11 = Character.isJavaIdentifierStart(Character.toCodePoint(var13, this.reader.ch));
                        } else {
                            var11 = Character.isJavaIdentifierStart(this.reader.ch);
                        }
                    }
                    if(var11) {
                        this.scanIdent();
                    } else if(this.reader.bp != this.reader.buflen && (this.reader.ch != 26 || this.reader.bp + 1 != this.reader.buflen)) {
                        String var14 = 32 < this.reader.ch && this.reader.ch < 127?String.format("%s", new Object[]{Character.valueOf(this.reader.ch)}):String.format("\\u%04x", new Object[]{Integer.valueOf(this.reader.ch)});
                        this.lexError(var9, "illegal.char", new Object[]{var14});
                        this.reader.scanChar();
                    } else {
                        this.tk = TokenKind.EOF;
                        var9 = this.reader.buflen;
                    }
                }
                break label474;
            case '\r':
                this.reader.scanChar();
                if(this.reader.ch == 10) {
                    this.reader.scanChar();
                }
                this.processLineTerminator(var9, this.reader.bp);
                break;
            case '\"':
                this.reader.scanChar();
                while(this.reader.ch != 34 && this.reader.ch != 13 && this.reader.ch != 10 && this.reader.bp < this.reader.buflen) {
                    this.scanLitChar(var9);
                }
                if(this.reader.ch == 34) {
                    this.tk = TokenKind.STRINGLITERAL;
                    this.reader.scanChar();
                } else {
                    this.lexError(var9, "unclosed.str.lit", new Object[0]);
                }
                break label474;
            case '$':
            case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': case 'G':
            case 'H': case 'I': case 'J': case 'K': case 'L': case 'M': case 'N':
            case 'O': case 'P': case 'Q': case 'R': case 'S': case 'T': case 'U':
            case 'V': case 'W': case 'X': case 'Y': case 'Z':
            case '_':
            case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': case 'g':
            case 'h': case 'i': case 'j': case 'k': case 'l': case 'm': case 'n':
            case 'o': case 'p': case 'q': case 'r': case 's': case 't': case 'u':
            case 'v': case 'w': case 'x': case 'y': case 'z':
                this.scanIdent();
                break label474;
            case '\'':
                this.reader.scanChar();
                if(this.reader.ch == 39) {
                    this.lexError(var9, "empty.char.lit", new Object[0]);
                } else {
                    if(this.reader.ch == 13 || this.reader.ch == 10) {
                        this.lexError(var9, "illegal.line.end.in.char.lit", new Object[0]);
                    }
                    this.scanLitChar(var9);
                    char var12 = this.reader.ch;
                    if(this.reader.ch == 39) {
                        this.reader.scanChar();
                        this.tk = TokenKind.CHARLITERAL;
                    } else {
                        this.lexError(var9, "unclosed.char.lit", new Object[0]);
                    }
                }
                break label474;
            case '(':
                this.reader.scanChar();
                this.tk = TokenKind.LPAREN;
                break label474;
            case ')':
                this.reader.scanChar();
                this.tk = TokenKind.RPAREN;
                break label474;
            case ',':
                this.reader.scanChar();
                this.tk = TokenKind.COMMA;
                break label474;
            case '.':
                this.reader.scanChar();
                if(48 <= this.reader.ch && this.reader.ch <= 57) {
                    this.reader.putChar('.');
                    this.scanFractionAndSuffix(var9);
                } else if(this.reader.ch == 46) {
                    var4 = this.reader.bp;
                    this.reader.putChar('.');
                    this.reader.putChar('.', true);
                    if(this.reader.ch == 46) {
                        this.reader.scanChar();
                        this.reader.putChar('.');
                        this.tk = TokenKind.ELLIPSIS;
                    } else {
                        this.lexError(var4, "illegal.dot", new Object[0]);
                    }
                } else {
                    this.tk = TokenKind.DOT;
                }
                break label474;
            case '/':
                this.reader.scanChar();
                if(this.reader.ch == 47) {
                    do {
                        this.reader.scanCommentChar();
                    } while(this.reader.ch != 13 && this.reader.ch != 10 && this.reader.bp < this.reader.buflen);
                    if(this.reader.bp < this.reader.buflen) {
                        var3 = this.addComment(var3, this.processComment(var9, this.reader.bp, CommentStyle.LINE));
                    }
                    break;
                } else {
                    if(this.reader.ch != 42) {
                        if(this.reader.ch == 61) {
                            this.tk = TokenKind.SLASHEQ;
                            this.reader.scanChar();
                        } else {
                            this.tk = TokenKind.SLASH;
                        }
                        break label474;
                    }
                    var11 = false;
                    this.reader.scanChar();
                    CommentStyle var5;
                    if(this.reader.ch == 42) {
                        var5 = CommentStyle.JAVADOC;
                        this.reader.scanCommentChar();
                        if(this.reader.ch == 47) {
                            var11 = true;
                        }
                    } else {
                        var5 = CommentStyle.BLOCK;
                    }
                    while(!var11 && this.reader.bp < this.reader.buflen) {
                        if(this.reader.ch == 42) {
                            this.reader.scanChar();
                            if(this.reader.ch == 47) {
                                break;
                            }
                        } else {
                            this.reader.scanCommentChar();
                        }
                    }
                    if(this.reader.ch == 47) {
                        this.reader.scanChar();
                        var3 = this.addComment(var3, this.processComment(var9, this.reader.bp, var5));
                        break;
                    }
                    this.lexError(var9, "unclosed.comment", new Object[0]);
                    break label474;
                }
            case '0':
                this.reader.scanChar();
                if(this.reader.ch != 120 && this.reader.ch != 88) {
                    if(this.reader.ch != 98 && this.reader.ch != 66) {
                        this.reader.putChar('0');
                        if(this.reader.ch == 95) {
                            var4 = this.reader.bp;
                            do {
                                this.reader.scanChar();
                            } while(this.reader.ch == 95);
                            if(this.reader.digit(var9, 10) < 0) {
                                this.lexError(var4, "illegal.underscore", new Object[0]);
                            }
                        }
                        this.scanNumber(var9, 8);
                    } else {
                        if(!this.allowBinaryLiterals) {
                            this.lexError(var9, "unsupported.binary.lit", new Object[]{this.source.name});
                            this.allowBinaryLiterals = true;
                        }
                        this.reader.scanChar();
                        this.skipIllegalUnderscores();
                        if(this.reader.digit(var9, 2) < 0) {
                            this.lexError(var9, "invalid.binary.number", new Object[0]);
                        } else {
                            this.scanNumber(var9, 2);
                        }
                    }
                } else {
                    this.reader.scanChar();
                    this.skipIllegalUnderscores();
                    if(this.reader.ch == 46) {
                        this.scanHexFractionAndSuffix(var9, false);
                    } else if(this.reader.digit(var9, 16) < 0) {
                        this.lexError(var9, "invalid.hex.number", new Object[0]);
                    } else {
                        this.scanNumber(var9, 16);
                    }
                }
                break label474;
            case '1': case '2': case '3': case '4': case '5':
            case '6': case '7': case '8': case '9':
                this.scanNumber(var9, 10);
                break label474;
            case ';':
                this.reader.scanChar();
                this.tk = TokenKind.SEMI;
                break label474;
            case '[':
                this.reader.scanChar();
                this.tk = TokenKind.LBRACKET;
                break label474;
            case ']':
                this.reader.scanChar();
                this.tk = TokenKind.RBRACKET;
                break label474;
            case '{':
                this.reader.scanChar();
                this.tk = TokenKind.LBRACE;
                break label474;
            case '}':
                this.reader.scanChar();
                this.tk = TokenKind.RBRACE;
                break label474;
            }
        }
        int var10 = this.reader.bp;
        switch(null.$SwitchMap$com$sun$tools$javac$parser$Tokens$Token$Tag[this.tk.tag.ordinal()]) {
        case 1:
            Token var18 = new Token(this.tk, var9, var10, var3);
            return var18;
        case 2:
            NamedToken var17 = new NamedToken(this.tk, var9, var10, this.name, var3);
            return var17;
        case 3:
            StringToken var16 = new StringToken(this.tk, var9, var10, this.reader.chars(), var3);
            return var16;
        case 4:
            NumericToken var15 = new NumericToken(this.tk, var9, var10, this.reader.chars(), this.radix, var3);
            return var15;
        default:
            throw new AssertionError();
        }
    } finally {
        ;
    }
}

Processing is driven by the first character of the code.

Process whitespace characters:
32 space
9 HT (horizontal tab)
12 FF (form feed)

Process line terminators.

Process code characters.
Operator characters are handled first.
For the remaining characters, determine whether the character can start a Java identifier (for supplementary characters, by combining the surrogate pair into a code point).
Next come the checks that strip comments and determine their style.
Letter and digit processing.
Delimiters such as [ ] { } get their own case blocks.
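The dispatch described above can be condensed into a toy tokenizer. This is a simplified sketch, not javac's algorithm: MiniTokenizer, its "KIND:text" output format, and its small operator table are assumptions made for illustration. It only handles decimal integers, string literals without escapes, and line comments, and it reports keywords as plain identifiers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class MiniTokenizer {
    // Deliberately small operator table; the real scanner knows many more operators.
    private static final Set<String> OPERATORS = Set.of(
            "=", "==", "!", "!=", "<", "<=", ">", ">=", "+", "-", "*", "/", "&&", "||");
    private static final String DELIMITERS = "()[]{},;.";

    /** Splits src into "KIND:text" strings, choosing the branch by the first character of each token. */
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int bp = 0; // buffer position, in the spirit of reader.bp
        while (bp < src.length()) {
            int start = bp;
            int cp = src.codePointAt(bp); // joins a surrogate pair into a single code point
            if (cp == ' ' || cp == '\t' || cp == '\f' || cp == '\n' || cp == '\r') {
                bp++; // whitespace and line terminators produce no token
            } else if (src.startsWith("//", bp)) {
                // line comment: skip up to, but not including, the line terminator
                while (bp < src.length() && src.charAt(bp) != '\n' && src.charAt(bp) != '\r') bp++;
            } else if (Character.isJavaIdentifierStart(cp)) {
                bp += Character.charCount(cp);
                while (bp < src.length() && Character.isJavaIdentifierPart(src.codePointAt(bp))) {
                    bp += Character.charCount(src.codePointAt(bp));
                }
                tokens.add("IDENT:" + src.substring(start, bp));
            } else if (cp >= '0' && cp <= '9') {
                while (bp < src.length() && src.charAt(bp) >= '0' && src.charAt(bp) <= '9') bp++;
                tokens.add("NUMBER:" + src.substring(start, bp));
            } else if (cp == '"') {
                bp++; // opening quote
                while (bp < src.length() && src.charAt(bp) != '"'
                        && src.charAt(bp) != '\n' && src.charAt(bp) != '\r') bp++;
                if (bp == src.length() || src.charAt(bp) != '"') {
                    throw new IllegalArgumentException("unclosed string literal at " + start);
                }
                bp++; // closing quote
                tokens.add("STRING:" + src.substring(start, bp));
            } else if (DELIMITERS.indexOf(cp) >= 0) {
                bp++;
                tokens.add("DELIM:" + src.substring(start, bp));
            } else {
                // operators: take the longest entry of the table that matches at this position
                int len = Math.min(2, src.length() - bp);
                while (len > 0 && !OPERATORS.contains(src.substring(bp, bp + len))) len--;
                if (len == 0) {
                    throw new IllegalArgumentException("illegal character at " + start);
                }
                bp += len;
                tokens.add("OP:" + src.substring(start, bp));
            }
        }
        tokens.add("EOF:");
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("int x = 42; // answer"));
    }
}
```

Unlike readToken, which uses one large switch on the current character, this sketch uses an if/else chain. The idea is the same: the first character decides which scanning routine consumes the rest of the token.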
With the above in place, the code is converted into a token sequence, from which a syntax tree is straightforward to build. With the syntax tree, classes can then be converted into symbol table entries for storage.
For details about how the syntax tree is generated, refer to the Parser class.
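The syntax tree can also be inspected without reading the parser source. The sketch below assumes a JDK on Java 9 or later and uses the public Compiler Tree API (com.sun.source): JavacTask.parse() runs only the parse step, and a TreeScanner walks the result. SyntaxTreeDemo, StringSource, and describe are hypothetical names for this example.

```java
import com.sun.source.tree.ClassTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.tree.VariableTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

class SyntaxTreeDemo {
    /** Hypothetical helper: source text held in memory. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) { return code; }
    }

    /** Parses the source and lists the class, variable, and method nodes in visiting order. */
    static List<String> describe(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null when no compiler is available
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null,
                List.of(new StringSource(className, code)));
        List<String> nodes = new ArrayList<>();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                new TreeScanner<Void, Void>() {
                    @Override public Void visitClass(ClassTree t, Void p) {
                        nodes.add("class " + t.getSimpleName());
                        return super.visitClass(t, p);
                    }
                    @Override public Void visitVariable(VariableTree t, Void p) {
                        nodes.add("variable " + t.getName());
                        return super.visitVariable(t, p);
                    }
                    @Override public Void visitMethod(MethodTree t, Void p) {
                        nodes.add("method " + t.getName());
                        return super.visitMethod(t, p);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return nodes;
    }

    public static void main(String[] args) {
        System.out.println(describe("Point", "class Point { int x; int getX() { return x; } }"));
    }
}
```

Because only the parse step has run, the tree holds exactly what was written in the source. Compiler-added members such as the default constructor are not present yet.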
The conversion to the symbol table is done by the Enter class.
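The symbols created for a class can be listed through the public API as well. This is a sketch under assumptions: it calls JavacTask.analyze(), which runs the enter step and then attribution, so it shows the symbols after more than Enter alone has run. SymbolTableDemo, StringSource, and members are hypothetical names, and a JDK on Java 9 or later is assumed.

```java
import com.sun.source.util.JavacTask;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

class SymbolTableDemo {
    /** Hypothetical helper: source text held in memory. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) { return code; }
    }

    /** Returns "KIND name" for every member symbol of the classes declared in the source. */
    static List<String> members(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null when no compiler is available
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null,
                List.of(new StringSource(className, code)));
        List<String> out = new ArrayList<>();
        try {
            for (Element e : task.analyze()) {
                if (e instanceof TypeElement) {
                    for (Element member : e.getEnclosedElements()) {
                        out.add(member.getKind() + " " + member.getSimpleName());
                    }
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(members("Point", "class Point { int x; int getX() { return x; } }"));
    }
}
```

The list includes a constructor named <init> even though the source declares none, which shows that the symbol table holds more than the syntax tree produced by parsing.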
