JVM: Program Compilation and Early (Compile-Time) Code Optimization
Early (Compile-Time) Optimization
I. Javac Compiler
1. Javac source code and debugging
The source code of Javac is stored in JDK_SRC_HOME/langtools/src/share/classes/com/sun/tools/javac. Apart from the JDK's own APIs, it references only code under JDK_SRC_HOME/langtools/src/share/classes/com/sun/*, so a debugging environment is simple to set up: no other dependencies are needed.
The compilation process can be roughly divided into three stages:
(1) parsing and filling the symbol table
(2) annotation processing by pluggable annotation processors
(3) analysis and bytecode generation
The entry point of Javac compilation is the com.sun.tools.javac.main.JavaCompiler class. The code for the three stages above is concentrated in the compile() and compile2() methods of this class.
2. Parsing and filling the symbol table
The parsing steps include lexical analysis and syntax analysis.
(1) Lexical and syntax analysis
Lexical analysis converts the character stream of the source code into a set of tokens. A single character is the smallest element when writing code, while a token is the smallest element in the compilation process.
In the Javac source code, lexical analysis is implemented by the com.sun.tools.javac.parser.Scanner class.
Syntax analysis is the process of constructing an abstract syntax tree from the token sequence. An abstract syntax tree is a tree representation of the syntactic structure of program code; each node in the tree represents one syntactic construct in the code.
Syntax analysis is implemented by the com.sun.tools.javac.parser.Parser class, and the abstract syntax tree produced in this stage is represented by the com.sun.tools.javac.tree.JCTree class. After this step the compiler essentially no longer touches the source file; all subsequent operations work on the abstract syntax tree.
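The parse step can be observed from outside the compiler. The sketch below is only an illustration: it uses the public Compiler Tree API (com.sun.source) rather than the internal parser classes named above, and it assumes a full JDK with the jdk.compiler module available. The names AstDemo, StringSource, and declaredNames are made up for this example.

```java
import com.sun.source.tree.ClassTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class AstDemo {
    /** Hypothetical helper: a source file held in memory as a string. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Parses the source (no symbol entry, no attribution) and lists class and method nodes. */
    static List<String> declaredNames(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(new StringSource(className, code)));
        List<String> names = new ArrayList<>();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                // Walk the abstract syntax tree of one compilation unit.
                new TreeScanner<Void, Void>() {
                    @Override
                    public Void visitClass(ClassTree node, Void p) {
                        names.add("class " + node.getSimpleName());
                        return super.visitClass(node, p);
                    }

                    @Override
                    public Void visitMethod(MethodTree node, Void p) {
                        names.add("method " + node.getName());
                        return super.visitMethod(node, p);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

Calling AstDemo.declaredNames("Hello", "class Hello { int answer() { return 42; } }") walks the tree that the parser built for that one compilation unit.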
(2) Filling the symbol table
A symbol table is a table composed of a set of symbol addresses and symbol information.
In semantic analysis, the contents registered in the symbol table are used for semantic checks and intermediate code generation.
In the target code generation phase, when addresses are allocated for symbol names, the symbol table is the basis for address allocation.
In the Javac source code, filling the symbol table is implemented by the com.sun.tools.javac.comp.Enter class. The output of this process is a to-do list (ToDoList) containing the top-level node of the abstract syntax tree of each compilation unit and the top-level node of package-info.java.
3. Annotation processors
In the Javac source code, pluggable annotation processors are initialized in the initProcessAnnotations() method and executed in the processAnnotations() method. The latter determines whether any new annotation processor still needs to run. If so, the doProcessing() method of the com.sun.tools.javac.processing.JavacProcessingEnvironment class creates a new JavaCompiler object to handle the subsequent compilation steps.
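To make the plug-in mechanism concrete, here is a minimal sketch of a processor built on the standard javax.annotation.processing API and registered through javax.tools. The class name NameCollector and its run helper are hypothetical, and the example assumes a full JDK; it only records the top-level classes it is shown and generates nothing.

```java
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

@SupportedAnnotationTypes("*") // ask to be called for every compilation unit
class NameCollector extends AbstractProcessor {
    final List<String> seen = new ArrayList<>();

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        // Root elements are the top-level classes entered in this round.
        for (Element root : roundEnv.getRootElements()) {
            seen.add(root.getSimpleName().toString());
        }
        return false; // do not claim any annotations
    }

    /** Hypothetical helper: runs only annotation processing on an in-memory source. */
    static List<String> run(String className, String code) {
        SimpleJavaFileObject source = new SimpleJavaFileObject(
                URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return code;
            }
        };
        NameCollector processor = new NameCollector();
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavaCompiler.CompilationTask task = compiler.getTask(
                null, null, null, List.of("-proc:only"), null, List.of(source));
        task.setProcessors(List.of(processor));
        task.call();
        return processor.seen;
    }
}
```

The -proc:only option asks javac to stop after annotation processing, so no class files are written in this sketch.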
4. Semantic Analysis and bytecode generation
(1) Attribution check: this step checks, for example, whether variables are declared before they are used and whether the data types of variables and the values assigned to them match. Another important action in this step is constant folding.
In the Javac source code, the implementation classes are com.sun.tools.javac.comp.Attr and com.sun.tools.javac.comp.Check.
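Constant folding is easy to observe with string constants, because the Java Language Specification requires compile-time constant strings to be interned. The following is a small sketch; the class and method names are made up for illustration.

```java
class ConstantFoldingDemo {
    // A constant expression: javac folds it to the single literal "ab".
    static final String FOLDED = "a" + "b";

    /** Folded at compile time, so it is the same interned object as the literal "ab". */
    static boolean foldedIsSameLiteral() {
        return FOLDED == "ab";
    }

    /** Not a constant expression (a is not final), so the concatenation happens at run time. */
    static boolean runtimeConcatIsSameLiteral() {
        String a = "a";
        String joined = a + "b";
        return joined == "ab";
    }

    /** The expression 1 + 2 is folded as well; the class file holds the value 3. */
    static int foldedInt() {
        return 1 + 2;
    }
}
```

The reference comparisons with == are deliberate here: they show whether two expressions ended up as the same constant, which is not how strings should be compared in ordinary code.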
(2) Data-flow and control-flow analysis
This step further verifies the contextual logic of the program. It can detect problems such as a local variable that is not assigned before use, a method in which some path does not return a value, or a checked exception that is not handled correctly.
In the Javac source code, the entry point of data-flow and control-flow analysis is the flow() method. The actual work is done by the com.sun.tools.javac.comp.Flow class.
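As an illustration of what this analysis accepts and rejects, consider the sketch below. The names are hypothetical, and the rejected variant is kept in a comment so that the file still compiles.

```java
class FlowDemo {
    /** Accepted: result has no initializer, but it is assigned on every path before use. */
    static int sign(int x) {
        int result;
        if (x > 0) {
            result = 1;
        } else if (x < 0) {
            result = -1;
        } else {
            result = 0;
        }
        return result;
    }

    // Rejected by javac if uncommented, because r is unassigned when x <= 0:
    //
    // static int broken(int x) {
    //     int r;
    //     if (x > 0) {
    //         r = 1;
    //     }
    //     return r; // error: variable r might not have been initialized
    // }
}
```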
(3) Desugaring
Syntactic sugar, also known as sugar-coated syntax, refers to syntax added to a computer language that has no effect on what the language can do but makes it more convenient to use.
In the Javac source code, desugaring is triggered by the desugar() method and carried out in the com.sun.tools.javac.comp.TransTypes and com.sun.tools.javac.comp.Lower classes.
(4) Bytecode generation
Bytecode generation is the last phase of Javac compilation. In the Javac source code it is implemented by the com.sun.tools.javac.jvm.Gen class.
After the syntax tree has been traversed and adjusted, the symbol table, now filled with all the required information, is handed over to the com.sun.tools.javac.jvm.ClassWriter class. The writeClass() method of this class outputs the bytecode and generates the final class file. At this point the entire compilation process is complete.
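The end result of the pipeline can be checked with the standard javax.tools API. The sketch below is an illustration under assumptions, not part of Javac itself: it assumes a full JDK (Java 11 or later for Files.writeString), compiles a tiny source file into a temporary directory, and reads the first four bytes of the generated class file, which by the class file format are the magic number 0xCAFEBABE. The class and method names are made up.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

class ClassFileDemo {
    /** Compiles Hello.java in a temp directory and returns the first 4 bytes of Hello.class. */
    static int compileAndReadMagic() {
        try {
            Path dir = Files.createTempDirectory("javac-demo");
            Path source = dir.resolve("Hello.java");
            Files.writeString(source, "class Hello { int answer() { return 42; } }");

            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            // Runs all stages: parse, enter, annotation processing, attribute, flow, desugar, generate.
            int status = compiler.run(null, null, null, "-d", dir.toString(), source.toString());
            if (status != 0) {
                throw new IllegalStateException("compilation failed with status " + status);
            }

            byte[] bytes = Files.readAllBytes(dir.resolve("Hello.class"));
            return ByteBuffer.wrap(bytes).getInt(); // big-endian read of bytes 0..3
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```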
II. Java syntactic sugar
1. Generics and type erasure
In C#, generics exist in the program source code, in the compiled IL, and in the CLR at runtime. List<int> and List<string> are two different types: they are generated at runtime and have their own virtual method tables and type data. This mechanism is called type expansion, and generics implemented this way are called true generics.
Generics in the Java language are different: they exist only in the program source code. In the compiled bytecode they have already been replaced with the raw type, and casts are inserted where needed. At runtime, therefore, ArrayList<String> and ArrayList<Integer> are the same class, so generics are in fact syntactic sugar in the Java language. This implementation approach is called type erasure, and generics implemented this way are called pseudo-generics.
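A short sketch of what erasure means in practice follows. The names are hypothetical, and firstErased is a hand-written approximation of the code the compiler produces, not decompiler output.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    /** Both lists share one runtime class; the type arguments are gone. */
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    /** Source form: no cast is written. */
    static String firstSugared(List<String> names) {
        return names.get(0);
    }

    /** Roughly the erased form: a raw List plus the cast the compiler inserts. */
    @SuppressWarnings("rawtypes")
    static String firstErased(List names) {
        return (String) names.get(0);
    }
}
```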
New attributes such as Signature and LocalVariableTypeTable were introduced in the virtual machine specification to solve the problem of identifying parameter types that came with generics.
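Because of the Signature attribute, declared generic types remain visible to reflection even though the bytecode works on erased types. The following sketch (names made up for this example) reads both views of the same field.

```java
import java.lang.reflect.Field;
import java.util.List;

class SignatureDemo {
    List<String> names; // the field's descriptor is erased; its Signature attribute keeps <String>

    private static Field namesField() {
        try {
            return SignatureDemo.class.getDeclaredField("names");
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    /** The erased type, as used by the bytecode. */
    static String erasedType() {
        return namesField().getType().getName();
    }

    /** The declared generic type, recovered from the Signature attribute. */
    static String genericType() {
        return namesField().getGenericType().getTypeName();
    }
}
```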
2. Autoboxing, unboxing, and for-each loops
After compilation, autoboxing and unboxing are converted into calls to the corresponding wrapping and unwrapping methods, while the for-each loop is restored to an implementation based on an iterator.
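The two methods below sketch the same loop before and after desugaring. The second is a hand-written approximation of what the compiler generates, with hypothetical names, not literal decompiler output.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ForEachDemo {
    /** Source form: varargs, autoboxing, for-each loop, auto-unboxing. */
    static int sumSugared() {
        List<Integer> list = Arrays.asList(1, 2, 3, 4);
        int sum = 0;
        for (int i : list) {
            sum += i;
        }
        return sum;
    }

    /** Roughly the desugared form: explicit array, valueOf(), iterator, intValue(). */
    static int sumDesugared() {
        List<Integer> list = Arrays.asList(new Integer[] {
                Integer.valueOf(1), Integer.valueOf(2), Integer.valueOf(3), Integer.valueOf(4)});
        int sum = 0;
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            int i = it.next().intValue();
            sum += i;
        }
        return sum;
    }
}
```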
The "==" operator on wrapper classes does not unbox automatically unless an arithmetic operation is involved, and their equals() methods do not handle conversions between data types.
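These two rules lead to results that are easy to get wrong. The sketch below (class and method names are made up) collects a few comparisons. The values 1 to 3 are within the range that Integer.valueOf() is required to cache, while the result for 321 depends on the cache configuration of the JVM, so it is shown separately.

```java
class WrapperEqualityDemo {
    static boolean[] results() {
        Integer a = 1, b = 2, c = 3, d = 3;
        Long g = 3L;
        return new boolean[] {
                c == d,          // true: no unboxing, but both references come from the Integer cache
                c == (a + b),    // true: the arithmetic unboxes, so this compares 3 == 3
                c.equals(a + b), // true: a + b is boxed to an Integer with value 3
                g == (a + b),    // true: g is unboxed and the int sum is widened to long
                g.equals(a + b)  // false: Long.equals() receives an Integer and does not convert it
        };
    }

    /** Reference comparison outside the guaranteed cache range; usually false with default settings. */
    static boolean largeValuesSameReference() {
        Integer e = 321, f = 321;
        return e == f;
    }
}
```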
3. Conditional compilation
The Java language supports conditional compilation by using an if statement whose condition is a constant.
Conditional compilation in Java is also a form of syntactic sugar. Based on the true or false value of the Boolean constant, the compiler removes the code in the branch that can never execute. This is done in the compiler's desugaring stage (in the com.sun.tools.javac.comp.Lower class).
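A minimal sketch of the idiom, with hypothetical names: because DEBUG is a compile-time constant, only one branch of the if statement needs to survive in the class file.

```java
class ConditionalCompilationDemo {
    static final boolean DEBUG = false; // compile-time constant

    static String describe() {
        if (DEBUG) {
            // Dead branch: the condition is a constant false, so the compiler can drop this code.
            return "debug build";
        } else {
            return "release build";
        }
    }

    // Only if statements work this way. A loop such as
    //     while (false) { ... }
    // is rejected by javac with an "unreachable statement" error.
}
```

Running javap -c on the compiled class is one way to check which branch remains.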