Javac Compilation and JIT compilation

Source: Internet
Author: User
Compilation Process

Whether on a physical machine or a virtual machine, most program code goes through the steps shown in the following figure between the start of compilation and its final conversion into target code for the physical machine or an instruction set that the virtual machine can execute:

The green modules are optional. It is easy to see that the middle branch of the figure above is interpreted execution (that is, interpreting and executing the code one instruction at a time, as with JavaScript), while the lower branch is the process of generating target machine code from source code, as described in traditional compiler theory.

Today, most languages, whether they target physical machines or virtual machines, follow this approach, which is rooted in classical compiler theory: before a program is executed, its source code goes through lexical analysis and parsing and is turned into an abstract syntax tree. For a specific language implementation, lexical and syntax analysis, and even the subsequent optimizer and target code generator, can be implemented independently of the execution engine to form a complete compiler in the full sense; C/C++ is representative of this type. Alternatively, the steps up to the abstract syntax tree or instruction stream can be implemented as a partially independent compiler, as in the Java language. Or all of these steps can be bundled together with the execution engine, as in most JavaScript engines.

Javac Compilation

When "compilation" is mentioned in Java, it is natural to think of the javac compiler turning .java files into .class files. The javac compiler is called a front-end compiler; other front-end compilers include ECJ, the incremental compiler in Eclipse JDT. Its counterpart is the back-end compiler, which converts bytecode into machine code while the program is running (by default, Java bytecode is initially interpreted at run time). An example is the JIT (Just-In-Time) compiler built into the HotSpot virtual machine, which comes in Client and Server variants. In addition, you may occasionally encounter static ahead-of-time (AOT) compilers, which compile .java files directly into native machine code, such as GCJ and Excelsior JET, although these are met far less often.

The following briefly introduces the javac (front-end) compilation process.

Lexical and Syntax Analysis

Lexical analysis transforms the character stream of the source code into a set of tokens. A single character is the smallest element when writing a program, whereas a token is the smallest element during compilation. Keywords, variable names, literals, and operators can all be tokens. For example, the keyword int consists of three characters, but it is a single token and cannot be split further.
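The character-to-token step can be sketched with a small regex-based lexer. This is only an illustration under simplifying assumptions: the class name TinyLexer, the five token kinds, and the handful of keywords are invented for the example, and javac's real scanner handles far more (comments, string literals, Unicode escapes, and so on).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TinyLexer {
    // Hypothetical token categories for this sketch; javac's own token set is much larger.
    enum Kind { KEYWORD, IDENTIFIER, INT_LITERAL, OPERATOR, SEPARATOR }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
        @Override public String toString() { return kind + "(" + text + ")"; }
    }

    // A tiny subset of Java keywords, enough for the demo input.
    private static final Set<String> KEYWORDS = Set.of("int", "return", "if", "else", "while");

    // One alternative per lexical class; whitespace is matched only so it can be skipped.
    private static final Pattern TOKEN = Pattern.compile(
        "\\s+|(?<word>[A-Za-z_][A-Za-z0-9_]*)|(?<num>[0-9]+)|(?<op>==|[=+\\-*/<>])|(?<sep>[;(){}])");

    static List<Token> tokenize(String source) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        int pos = 0;
        while (pos < source.length()) {
            m.region(pos, source.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at offset " + pos);
            }
            if (m.group("word") != null) {
                String word = m.group("word");
                // The whole word is one token: "int" is a keyword, never 'i', 'n', 't'.
                tokens.add(new Token(KEYWORDS.contains(word) ? Kind.KEYWORD : Kind.IDENTIFIER, word));
            } else if (m.group("num") != null) {
                tokens.add(new Token(Kind.INT_LITERAL, m.group("num")));
            } else if (m.group("op") != null) {
                tokens.add(new Token(Kind.OPERATOR, m.group("op")));
            } else if (m.group("sep") != null) {
                tokens.add(new Token(Kind.SEPARATOR, m.group("sep")));
            }
            pos = m.end(); // every alternative consumes at least one character
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("int count = 42;")) {
            System.out.println(t);
        }
    }
}
```

For the input "int count = 42;" the sketch yields five tokens: a keyword, an identifier, an operator, an integer literal, and a separator.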

Parsing (syntax analysis) is the process of constructing an abstract syntax tree from the token sequence. An abstract syntax tree is a tree representation of the syntactic structure of program code; each node represents a syntactic construct in the code, such as a package, type, modifier, or operator. After this step, the compiler essentially no longer touches the source file; all subsequent operations are based on the abstract syntax tree.

Fill Symbol Table

After lexical and syntax analysis are complete, the next step is to fill the symbol table. A symbol table is a table made up of a set of symbol addresses and symbol information. The information registered in the symbol table is used at different stages of compilation: in semantic analysis (the next step), its contents are used for semantic checking and for generating intermediate code; in the target code generation stage, when addresses are assigned to symbol names, the symbol table is the basis for that assignment.

Semantic Analysis

The syntax tree can represent a structurally correct source program, but it cannot guarantee that the program is logically sound. The main task of semantic analysis is to check the context-dependent properties of a structurally correct source program. Semantic analysis consists of two steps: attribute checking, and data- and control-flow analysis.
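The kinds of mistakes these two steps catch, detailed in the next two paragraphs, can be observed from the outside with the standard javax.tools API, which runs the system Java compiler on source text held in memory. The following is a minimal sketch, assuming it runs on a JDK (on a plain JRE no system compiler is available). The class name SemanticCheckDemo and the sample classes A to E are invented for the example, and the API does not reveal which internal javac step reported each error, so the grouping in the comments simply follows the description in this section.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

final class SemanticCheckDemo {
    /** A source file held in memory, so nothing has to be written to disk first. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }
        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Compiles one class and returns the diagnostic codes of all errors the compiler reported. */
    static List<String> errorCodes(String className, String code) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("No system compiler found; run this on a JDK");
        }
        DiagnosticCollector<JavaFileObject> collector = new DiagnosticCollector<>();
        // -proc:none disables annotation processing, which is irrelevant here.
        javac.getTask(null, null, collector, List.of("-proc:none"), null,
                      List.of(new StringSource(className, code))).call();
        List<String> codes = new ArrayList<>();
        for (Diagnostic<? extends JavaFileObject> d : collector.getDiagnostics()) {
            if (d.getKind() == Diagnostic.Kind.ERROR) {
                codes.add(d.getCode());
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        // Every sample is syntactically valid but semantically wrong, so each one
        // parses into a syntax tree and is then rejected; no class file is written.
        String[][] samples = {
            // Attribute checking: use before declaration, mismatched types.
            {"A", "class A { void f() { y = 1; } }"},
            {"B", "class B { int f() { int x = \"hello\"; return x; } }"},
            // Data- and control-flow analysis: unassigned local, missing return,
            // unhandled checked exception.
            {"C", "class C { int f() { int x; return x; } }"},
            {"D", "class D { int f(boolean b) { if (b) return 1; } }"},
            {"E", "class E { void f() { throw new Exception(); } }"},
        };
        for (String[] sample : samples) {
            System.out.println(sample[0] + ": " + errorCodes(sample[0], sample[1]));
        }
    }
}
```

Each sample is compiled on its own so that an error in one class cannot hide an error in another.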

Attribute checking verifies, for example, whether a variable has been declared before it is used and whether the types on the two sides of an assignment match.
Data- and control-flow analysis further validates the program's contextual logic. It can detect, for example, whether a local variable is assigned before it is used, whether every path through a method returns a value, and whether all checked exceptions are handled properly.

Bytecode Generation

Bytecode generation is the last phase of javac compilation. This phase not only converts the information produced by the preceding steps into bytecode and writes it to disk; the compiler also performs a small amount of code addition and transformation. The instance constructor <init>() method and the class constructor <clinit>() method are added to the syntax tree at this stage. (The instance constructor here does not mean the default constructor. If the user code provides no constructor at all, the compiler automatically adds a no-argument default constructor whose accessibility matches that of the current class, but that work has already been done in the fill-symbol-table phase.)

JIT Compilation

Java programs were initially executed only by an interpreter, that is, bytecode was interpreted one instruction at a time. This is relatively slow, and it is especially inefficient when a method or block of code runs very frequently. The just-in-time (JIT) compiler was therefore introduced into the virtual machine: when the virtual machine finds that a method or block of code runs particularly often, it identifies that code as "hot spot code". To make hot code run faster, the virtual machine compiles it at run time into machine code for the local platform and applies various levels of optimization. The component that does this is the JIT compiler.

Today's mainstream commercial virtual machines, such as Sun HotSpot and IBM J9, contain both an interpreter and compilers. (JRockit, one of the three major commercial virtual machines, is an exception: it has no interpreter, so it has drawbacks such as a long startup time, but it mainly targets server-side applications, which generally do not focus on startup time.) Each has its advantages. When a program needs to start and run quickly, the interpreter can go to work first, skipping compilation time and executing immediately. As the program keeps running, the compiler gradually comes into play, compiling more and more code into native code and so achieving higher execution efficiency. Interpreted execution saves memory, while compiled execution improves efficiency.

The HotSpot virtual machine has two built-in JIT compilers, the Client Compiler and the Server Compiler, intended for client and server use respectively. Mainstream HotSpot virtual machines currently run with the interpreter working directly alongside one of these compilers.

Two kinds of "hot spot code" are compiled by the just-in-time compiler at run time: methods that are called many times, and loop bodies that are executed many times.
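A small program that contains both kinds of code might look like the following. It is only an illustration: the class and method names are invented, and whether and when a given virtual machine actually compiles these methods depends on its version and settings.

```java
final class HotCodeDemo {
    // A small method called many times: a candidate for the first kind of hot code.
    static int square(int x) {
        return x * x;
    }

    // One call containing a long-running loop: a candidate for the second kind.
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long total = 0;
        for (int i = 0; i < 1_000_000; i++) {
            total += square(i % 1000);   // many separate invocations
        }
        total += sumTo(10_000_000);      // a single invocation, many loop iterations
        System.out.println(total);
    }
}
```

On HotSpot, running the program with the -XX:+PrintCompilation option prints a line each time a method is compiled, which is one way to see which methods were treated as hot. The exact output format varies between JDK versions.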

In both cases, the compiler takes the entire method as the unit of compilation, which is also the standard way of compiling in the virtual machine. To determine whether a piece of code or a method is hot, and therefore whether just-in-time compilation should be triggered, hot spot detection is required. At present there are two main ways of identifying hot spots:

Sampling-based hot spot detection: A virtual machine using this method periodically checks the top of each thread's stack; if a method is often found at the top of the stack, it is considered hot code. The advantages are that it is simple and efficient, and that the call relationships between methods are easy to obtain. The disadvantages are that it is hard to measure precisely how hot a method is, and that detection is easily disturbed by thread blocking or other external factors.

Counter-based hot spot detection: A virtual machine using this method sets up a counter for each method, or even each block of code, and counts executions; once the count exceeds a certain threshold, the method is considered hot. This is more complicated to implement, since a counter must be created and maintained for every method, and the call relationships between methods cannot be obtained directly, but the results are more precise and rigorous.
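The counter-based approach can be sketched in a few lines. This is a simplified model rather than how any particular virtual machine stores its counters: the class name, the use of method names as map keys, and the threshold value are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class CounterHotSpotDetector {
    private final int threshold;                                  // hypothetical value chosen by the caller
    private final Map<String, Integer> counters = new HashMap<>(); // one counter per method name

    CounterHotSpotDetector(int threshold) {
        this.threshold = threshold;
    }

    /** Records one execution of the method and reports whether its count now exceeds the threshold. */
    boolean recordInvocation(String method) {
        int count = counters.merge(method, 1, Integer::sum);
        return count > threshold;
    }

    public static void main(String[] args) {
        CounterHotSpotDetector detector = new CounterHotSpotDetector(3);
        for (int call = 1; call <= 5; call++) {
            System.out.println("call " + call + " hot=" + detector.recordInvocation("compute"));
        }
    }
}
```

With a threshold of 3, the method is reported as hot from its fourth recorded execution onward. The sketch also shows the cost mentioned above: one counter has to be kept for every method that is ever executed.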

The HotSpot virtual machine uses the second method, counter-based hot spot detection, so it prepares two counters for each method: a method invocation counter and a back-edge counter.

The method invocation counter counts how many times a method is called. By default, it records not the absolute number of invocations but a relative execution frequency, that is, the number of times the method was invoked within a period of time.
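One way to turn an absolute count into a relative frequency is to reduce the counter at regular intervals. The sketch below assumes a policy of halving the counter at the end of each period. The halving rule, the notion of a "period", and all names are assumptions for illustration; the text above does not say how HotSpot derives the frequency.

```java
final class DecayingCounter {
    private int count;

    void increment() {
        count++;
    }

    /** Assumed decay policy for this sketch: halve the counter once per elapsed period. */
    void decay() {
        count /= 2;
    }

    boolean exceeds(int threshold) {
        return count > threshold;
    }

    /**
     * Simulates a method that is called callsPerPeriod[p] times in period p and returns
     * the 1-based period in which the counter first exceeds the threshold, or -1 if never.
     */
    static int firstHotPeriod(int[] callsPerPeriod, int threshold) {
        DecayingCounter counter = new DecayingCounter();
        for (int p = 0; p < callsPerPeriod.length; p++) {
            for (int i = 0; i < callsPerPeriod[p]; i++) {
                counter.increment();
                if (counter.exceeds(threshold)) {
                    return p + 1;
                }
            }
            counter.decay();
        }
        return -1;
    }

    public static void main(String[] args) {
        // Called steadily: becomes hot in period 3.
        System.out.println(firstHotPeriod(new int[] {60, 60, 60, 60}, 100));
        // One burst, then rare calls: 110 calls in total, but never hot (-1).
        System.out.println(firstHotPeriod(new int[] {80, 5, 5, 5, 5, 5, 5}, 100));
    }
}
```

With decay, a method that is called steadily crosses the threshold, while a method whose calls are spread thinly over time does not, even though its absolute call count is higher than the threshold.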

The back-edge counter counts how many times the loop body code in a method is executed (more precisely, the number of back edges taken, because not every loop is a back edge). A bytecode instruction that makes control flow jump backward is called a "back edge".
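The difference between "times the loop body ran" and "back edges taken" can be seen in a toy interpreter. The three-instruction set below is invented for this sketch and is not JVM bytecode; the only point is that a jump whose target is at or before the current instruction counts as a back edge.

```java
final class BackEdgeDemo {
    // A toy instruction set invented for this sketch; it is not JVM bytecode.
    enum Op { INC, JUMP_IF_LESS, HALT }

    static final class Insn {
        final Op op;
        final int limit;   // JUMP_IF_LESS: jump while the running value is below this limit
        final int target;  // JUMP_IF_LESS: index of the instruction to jump to
        Insn(Op op, int limit, int target) {
            this.op = op;
            this.limit = limit;
            this.target = target;
        }
    }

    /** A do-while style loop: increment, then jump back to index 0 while the value is below limit. */
    static Insn[] loop(int limit) {
        return new Insn[] {
            new Insn(Op.INC, 0, 0),
            new Insn(Op.JUMP_IF_LESS, limit, 0),
            new Insn(Op.HALT, 0, 0)
        };
    }

    /** Interprets the program and returns how many backward jumps were taken. */
    static int countBackEdges(Insn[] program) {
        int pc = 0;
        int value = 0;
        int backEdges = 0;
        while (true) {
            Insn insn = program[pc];
            switch (insn.op) {
                case INC:
                    value++;
                    pc++;
                    break;
                case JUMP_IF_LESS:
                    if (value < insn.limit) {
                        if (insn.target <= pc) {
                            backEdges++;   // control flow jumps backward: a back edge
                        }
                        pc = insn.target;
                    } else {
                        pc++;
                    }
                    break;
                case HALT:
                    return backEdges;
            }
        }
    }

    public static void main(String[] args) {
        // The loop body (INC) runs 5 times, but the backward jump is taken only 4 times.
        System.out.println(countBackEdges(loop(5)));
    }
}
```

In this model, a loop whose body executes only once takes no back edge at all, so it would never advance a back-edge counter.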

Once the virtual machine's run-time parameters are fixed, both counters have a definite threshold, and JIT compilation is triggered when a counter exceeds its threshold. By default, after triggering JIT compilation the execution engine does not wait for the compilation request to complete; it continues to execute the bytecode in the interpreter until the compiler has finished the submitted request (compilation runs in a background thread). Once compilation completes, the compiled version is used the next time the method or code is called.
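The "keep interpreting until the background compilation finishes" behavior can be modeled as follows. This is a simulation under stated assumptions, not HotSpot's implementation: the "interpreter" and the "compiled version" are just two Java functions, the "compiler" is a supplier run on a background thread, and all names are invented. As in the real mechanism, calls are expected to arrive on a single thread while compilation happens on another.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.function.IntUnaryOperator;
import java.util.function.Supplier;

final class BackgroundCompileDemo {
    private final IntUnaryOperator interpreter;          // slow path, always available
    private final Supplier<IntUnaryOperator> compiler;   // produces the fast path, may take a while
    private final int threshold;
    private int invocations;
    private CompletableFuture<Void> request;             // null until compilation has been requested
    private volatile IntUnaryOperator compiled;          // null until the compiled version is installed
    int interpretedCalls;
    int compiledCalls;

    BackgroundCompileDemo(IntUnaryOperator interpreter, Supplier<IntUnaryOperator> compiler, int threshold) {
        this.interpreter = interpreter;
        this.compiler = compiler;
        this.threshold = threshold;
    }

    int call(int arg) {
        IntUnaryOperator fast = compiled;
        if (fast != null) {                  // a compiled version exists: use it
            compiledCalls++;
            return fast.applyAsInt(arg);
        }
        invocations++;
        if (invocations > threshold && request == null) {
            // Submit the request to a background thread and do not wait for it.
            request = CompletableFuture.runAsync(() -> compiled = compiler.get());
        }
        interpretedCalls++;                  // meanwhile, keep interpreting
        return interpreter.applyAsInt(arg);
    }

    void awaitCompilation() {
        if (request != null) {
            request.join();
        }
    }

    /** Returns {interpreted calls, compiled calls} for a fixed scenario. */
    static int[] demo() {
        CountDownLatch gate = new CountDownLatch(1);     // holds the "compiler" back for the demo
        BackgroundCompileDemo method = new BackgroundCompileDemo(
            x -> x * 2,
            () -> {
                try {
                    gate.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return x -> x * 2;
            },
            3);
        for (int i = 0; i < 10; i++) {
            method.call(i);       // compilation is requested on the 4th call, yet all 10 are interpreted
        }
        gate.countDown();         // let the background compilation finish
        method.awaitCompilation();
        method.call(10);          // the next call uses the compiled version
        return new int[] {method.interpretedCalls, method.compiledCalls};
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("interpreted=" + result[0] + " compiled=" + result[1]);
    }
}
```

The latch exists only to make the scenario deterministic: it guarantees that the compiled version is not installed until the ten interpreted calls have been made.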

Because the back-edge counter triggers just-in-time compilation in much the same way as the method invocation counter does, only the process for the method invocation counter is given here:

Taken together, the javac bytecode compiler and the JIT compiler inside the virtual machine perform a process that is effectively equivalent to the compilation carried out by a traditional compiler.

Reference website

Http://wiki.jikexueyuan.com/project/java-vm/javac-jit.html
