In-Depth Java Virtual Machine: The Bytecode Execution Engine

Last Update:2018-08-14 Source: Internet

Author: User

Tags arithmetic comparable

Objective:
Class file structure, the class loading mechanism, class loaders, and the runtime data areas are four very important parts of the Java technology system. Having studied them, we know that a class is loaded into the virtual machine by a class loader and stored in the runtime data areas, and that the code in a method is compiled into bytecode stored in the Code attribute of the method table. But how does the virtual machine execute that code and produce the method's result? This section covers the virtual machine's bytecode execution engine. The key points to master are:

1. Run-time stack frame structure
2. Method invocation
3. Stack-based bytecode execution engine

Run-time stack frame structure

A stack frame is the data structure that supports method invocation and method execution. It is the element of the virtual machine stack in the runtime data areas. A stack frame stores the method's local variable table, operand stack, dynamic linking information, and method return address. Each method call, from start to completion, corresponds to one stack frame being pushed onto and then popped off the virtual machine stack.

Local variable table
The local variable table is a set of storage slots for values. It holds the method's parameters and the local variables defined inside the method. When a Java program is compiled to a class file, the maximum capacity of the local variable table that the method needs is recorded in the max_locals item of the method's Code attribute.

Operand stack
The operand stack, often simply called the operation stack, is a last-in-first-out (LIFO) stack. As with the local variable table, its maximum depth is written to the max_stack item of the Code attribute at compile time. When a method starts executing, its operand stack is empty. During execution, bytecode instructions push values onto the operand stack and pop values from it. For example, arithmetic is performed through the operand stack, and arguments are passed through it when another method is called.

Dynamic linking
Each stack frame contains a reference to the method it belongs to in the runtime constant pool. This reference is held to support dynamic linking during method invocation. From the class file structure we know that the constant pool contains many symbolic references, and that a method invocation instruction in the bytecode takes a symbolic reference to a method in the constant pool as its operand. Some of these symbolic references are converted to direct references during class loading or on first use; this is called static resolution. The rest are converted to direct references during each run; this is called dynamic linking.

Method return address
Once a method starts executing, there are two ways to exit it. The first is that the execution engine encounters one of the method return bytecode instructions, possibly passing a return value to the caller. This is called normal completion.
The other is that an exception occurs during execution and is not handled within the method body, which forces the method to exit. This is called abrupt completion. A method that exits this way gives its caller no return value.

Exiting a method amounts to popping the current stack frame: the caller's local variable table and operand stack are restored, the return value (if any) is pushed onto the operand stack of the caller's frame, and the PC counter is adjusted to point to the instruction that follows the method invocation instruction.
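
The two exit paths can be observed from the caller's side. The following is a minimal sketch; the class and method names (MethodExitDemo, completesNormally, completesAbruptly, describe) are hypothetical and chosen only for illustration.

```java
class MethodExitDemo {
    // Normal completion: a return instruction hands a value back to the caller.
    static int completesNormally() {
        return 42;
    }

    // Abrupt completion: the exception is not handled in this method body,
    // so its frame is popped without producing a return value.
    static int completesAbruptly() {
        throw new IllegalStateException("not handled in callee");
    }

    // The caller either receives a value or has to deal with the exception.
    static String describe(boolean fail) {
        try {
            int value = fail ? completesAbruptly() : completesNormally();
            return "returned " + value;
        } catch (IllegalStateException e) {
            return "no return value: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(false));
        System.out.println(describe(true));
    }
}
```

In the second call the assignment to value never happens, because the callee's frame is discarded before anything is pushed onto the caller's operand stack.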

Method invocation
Method invocation is not the same as method execution. The only task of the invocation phase is to determine which version of the method is called (that is, which method to call); it does not yet involve running the method body. Java has both method overloading and method overriding, so how is the version to call determined? Every method call is stored in the class file only as a symbolic reference, not as the entry address of the method in the run-time memory layout. This gives Java powerful dynamic extensibility, but it also makes method invocation relatively complex: the direct reference to the target method can only be determined during class loading, or even at run time.

During the resolution phase of class loading, some symbolic references are converted to direct references. This is possible only if the method has a determinable call version before the program actually runs and that version cannot change at run time. Calls to such methods are called resolution calls. The two main kinds of methods that are "knowable at compile time, immutable at run time" are static methods and private methods. Neither can be overridden through inheritance or any other means, so both are suitable for resolution during class loading.

A resolution call is always a static process. It is fully determined at compile time, and the symbolic references involved are all converted to definite direct references during the resolution phase of class loading rather than being deferred to run time.

Correspondingly, the Java Virtual Machine provides four method invocation bytecode instructions (JDK 7 added a fifth, invokedynamic, which is not covered here):
invokestatic: invokes a static method
invokespecial: invokes an instance constructor (<init>), a private method, or a superclass method
invokevirtual: invokes all virtual methods
invokeinterface: invokes an interface method; the object that implements the interface is determined at run time

Any method that is called through the invokestatic or invokespecial instruction has a unique call version that can be determined in the resolution phase. Four kinds of methods qualify: static methods, private methods, instance constructors, and superclass methods. Their symbolic references are converted to direct references when the class is loaded, and they are known as non-virtual methods. A final method is called with invokevirtual, but it cannot be overridden and so has no other versions; the Java Language Specification explicitly states that a final method is a non-virtual method.
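
To connect the four instructions to source code, here is a small sketch. The types Greeter, Base, and InvokeKinds are hypothetical. The comments name the instruction javac is expected to emit for each call site; the exact output depends on the compiler version (newer javac releases may, for instance, emit invokevirtual for private instance methods), so it is worth checking with `javap -c InvokeKinds`.

```java
interface Greeter {
    String greet();
}

class Base {
    String name() { return "base"; }
}

class InvokeKinds extends Base implements Greeter {
    static String util() { return "static"; }

    @Override
    String name() { return "derived"; }

    // super.name() is a superclass method call: invokespecial
    String parentName() { return super.name(); }

    @Override
    public String greet() { return "hello"; }

    static String demo() {
        InvokeKinds obj = new InvokeKinds(); // constructor <init>: invokespecial
        Base asBase = obj;
        Greeter asGreeter = obj;
        return util()                        // static method: invokestatic
                + "," + obj.parentName()     // ordinary instance method: invokevirtual
                + "," + asBase.name()        // virtual method, chosen by actual type: invokevirtual
                + "," + asGreeter.greet();   // call through an interface type: invokeinterface
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The call asBase.name() yields "derived" even though the variable's type is Base, while super.name() yields "base" because invokespecial bypasses virtual lookup.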

Java has three main object-oriented features: inheritance, encapsulation, and polymorphism. The most basic manifestations of polymorphism are overloading and overriding, so how does the virtual machine determine the correct target method for overloaded and overridden methods?

Dispatch call
Dispatch may be static or dynamic. Depending on the number of determinants (the method receiver and the method arguments) on which the choice is based, it may also be single or multiple. Combining the two gives static single dispatch, static multiple dispatch, dynamic single dispatch, and dynamic multiple dispatch. Let's take a look at the code below.

/**
 * @Author: Administrator
 * @CreatedTime: 2018/8/13
 * @EditTime: 2018/8/13
 */
public class StaticDispatch {
    static abstract class Human {
    }

    static class Man extends Human {
    }

    static class Women extends Human {
    }

    public void sayHello(Human guy) {
        System.out.println("Hello,guy!");
    }

    public void sayHello(Man guy) {
        System.out.println("Hello,gentleman!");
    }

    public void sayHello(Women guy) {
        System.out.println("Hello,lady!");
    }

    public static void main(String[] args) {
        // Both variables have static type Human but different actual types.
        Human women = new Women();
        Human man = new Man();
        StaticDispatch sd = new StaticDispatch();
        sd.sayHello(women);
        sd.sayHello(man);
    }
}

The execution results are:
Hello,guy!
Hello,guy!

Experienced developers can predict this result at a glance, but why does the virtual machine call the sayHello(Human) version? Before explaining, let's define two concepts using this line:
Human man = new Man();
Here Human is the variable's static type (also called its apparent type), and Man is its actual type. Both the static type and the actual type can change in a program. The difference is that the static type changes only at the point of use (for example through a cast), while the variable's own static type does not change, and the final static type is known at compile time. The actual type, by contrast, can only be determined at run time; when compiling the program, the compiler does not know what the actual type of an object will be.

Back to the code above: main calls sayHello twice, and since the receiver sd is already fixed, which overload is used depends entirely on the number and types of the arguments passed in. The code deliberately defines two variables with the same static type but different actual types. When choosing an overload, the virtual machine (more precisely, the compiler) goes by the static type of the argument, not its actual type. Because the static type is known at compile time, the javac compiler decides during compilation which overload to use, and so sayHello(Human) is selected as the call target both times.
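
A cast changes the static type at the point of use, which in turn changes the overload the compiler selects. The sketch below reuses the Human, Man, and Women types from the example but returns strings instead of printing; the class name StaticTypeDemo is an assumption made for illustration.

```java
class StaticTypeDemo {
    static abstract class Human {}
    static class Man extends Human {}
    static class Women extends Human {}

    static String sayHello(Human guy) { return "Hello,guy!"; }
    static String sayHello(Man guy) { return "Hello,gentleman!"; }
    static String sayHello(Women guy) { return "Hello,lady!"; }

    public static void main(String[] args) {
        Human human = new Man();
        // Static type Human: the compiler picks sayHello(Human).
        System.out.println(sayHello(human));
        // The cast changes the static type at the point of use to Man.
        System.out.println(sayHello((Man) human));

        human = new Women();
        // The actual type changed, but the static type is still Human.
        System.out.println(sayHello(human));
        System.out.println(sayHello((Women) human));
    }
}
```

Only the casts affect overload selection; reassigning the variable to an object of a different actual type does not.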

Static Dispatch

All dispatch that relies on static types to locate the version of the method to execute is called static dispatch. The most typical application of static dispatch is method overloading. Static dispatch happens at compile time, so the dispatch decision is not actually made by the virtual machine. Often the overloaded version is not unique, and the compiler can only pick the most appropriate one. Look at the following code:

import java.io.Serializable;

/**
 * @Author: Administrator
 * @CreatedTime: 2018/8/13
 * @EditTime: 2018/8/13
 */
public class StaticDispatch {
    public static void sayHello(Object obj) {
        System.out.println("hello,object!");
    }

    public static void sayHello(int c) {
        System.out.println("hello,int!");
    }

    public static void sayHello(double c) {
        System.out.println("hello,double!");
    }

    public static void sayHello(float c) {
        System.out.println("hello,float!");
    }

    public static void sayHello(long c) {
        System.out.println("hello,long!");
    }

    public static void sayHello(Character c) {
        System.out.println("hello,character!");
    }

    public static void sayHello(char c) {
        System.out.println("hello,char!");
    }

    public static void sayHello(char... c) {
        System.out.println("hello,char...!");
    }

    public static void sayHello(Serializable c) {
        System.out.println("hello,serializable!");
    }

    public static void main(String[] args) {
        sayHello('c');
    }
}

The result is hello,char!. If we comment out sayHello(char c), the result becomes hello,int!. An automatic type conversion happens here: 'c' can also represent the number 99 (the Unicode value of 'c'), so the int overload applies. Commenting out that method too gives hello,long!, after two automatic conversions: 'c' -> 99 -> 99L. Widening can continue in this way along the chain char -> int -> long -> float -> double, so commenting out the long and then the float overload selects float and then double in turn.

With the double overload also commented out, the output is hello,character!: the argument has been autoboxed. Comment that overload out as well and the output is hello,serializable!. What does a character have to do with serialization? After autoboxing, no overload takes the boxed type Character, but Character implements the Serializable interface, so the boxed value is converted once more. Note that the wrapper type Character cannot be converted to Integer; it can only be safely converted to an interface it implements or to a superclass. Character also implements java.lang.Comparable<Character>. If overloads taking Serializable and Comparable<Character> both exist, they have the same priority and the compiler rejects the call as ambiguous. The call then compiles only if the intended interface is specified explicitly, for example sayHello((Comparable<Character>) 'c').

Commenting out the Serializable overload gives hello,object!: after boxing, the value is converted to its superclass type. If there are several superclasses, the search goes up the inheritance hierarchy from the bottom, and the closer a type is to the top, the lower its priority. Finally, with that overload commented out as well, the variable-length overload sayHello(char... c) is chosen, which shows that variable-length parameters have the lowest matching priority. This example shows how Java resolves method overloading. It is an extreme case that is almost never useful in everyday work; it mostly turns up as an interview question meant to stump candidates.
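
The Serializable versus Comparable<Character> ambiguity can be tried in isolation. In this sketch the class name OverloadAmbiguity and the string "hello,comparable!" are hypothetical additions; the argument is boxed explicitly into a Character variable before casting so that each cast is a plain upcast.

```java
import java.io.Serializable;

class OverloadAmbiguity {
    static String sayHello(Serializable arg) { return "hello,serializable!"; }
    static String sayHello(Comparable<Character> arg) { return "hello,comparable!"; }

    static String viaSerializable() {
        Character boxed = 'c';
        // sayHello(boxed) would not compile here: both overloads apply to a
        // Character and neither parameter type is more specific than the other.
        return sayHello((Serializable) boxed);
    }

    static String viaComparable() {
        Character boxed = 'c';
        // The cast fixes the static type, so only one overload is applicable.
        return sayHello((Comparable<Character>) boxed);
    }

    public static void main(String[] args) {
        System.out.println(viaSerializable());
        System.out.println(viaComparable());
    }
}
```

Removing either cast and passing boxed directly is a quick way to see the compiler's ambiguity error.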

Dynamic Dispatch
Having understood static dispatch, let's see how dynamic dispatch works. Dynamic dispatch is closely tied to another important manifestation of polymorphism: overriding (Override). Look at the following code:

/**
 * @Author: Administrator
 * @CreatedTime: 2018/8/13
 * @EditTime: 2018/8/13
 */
public class DynamicDispatch {
    static abstract class Human {
        protected abstract void sayHello();
    }

    static class Man extends Human {
        @Override
        protected void sayHello() {
            System.out.println("man Say Hello!");
        }
    }

    static class Women extends Human {
        @Override
        protected void sayHello() {
            System.out.println("Women Say Hello!");
        }
    }

    public static void main(String[] args) {
        Human man = new Man();
        Human women = new Women();
        man.sayHello();
        women.sayHello();
        // Same variable, same static type, new actual type.
        man = new Women();
        man.sayHello();
    }
}

Execution result:
man Say Hello!
Women Say Hello!
Women Say Hello!

You no doubt expected this result; to anyone used to object-oriented programming it seems self-evident. But how does the virtual machine know which method to call? Clearly it cannot be decided by static type here, since both variables have the static type Human.
Human man = new Man();
Human women = new Women();
These two lines allocate memory for the Man and Women instances, call their instance constructors, and store references to the two instances in slots 1 and 2 of the local variable table (slot 0 holds args).

Each call to sayHello() is compiled to an invokevirtual instruction with the same symbolic reference. At run time, invokevirtual first finds the actual type of the receiver object on the operand stack and resolves the symbolic reference to a direct reference within that type, so the calls on man and women resolve to different direct references. This process is the essence of method overriding in Java. Dispatch that determines the method version from the actual type at run time is called dynamic dispatch.
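
Static and dynamic dispatch can both take part in a single call. The sketch below (the DispatchDemo class and its greet methods are hypothetical) passes an argument whose static type is Object but whose actual type is String: the overload is chosen at compile time from the static types of receiver and argument, while the overriding implementation is chosen at run time from the receiver's actual type only.

```java
class DispatchDemo {
    static class Human {
        String greet(Object other) { return "Human greets object"; }
        String greet(String other) { return "Human greets string"; }
    }

    static class Man extends Human {
        @Override
        String greet(Object other) { return "Man greets object"; }

        @Override
        String greet(String other) { return "Man greets string"; }
    }

    static String demo() {
        Human receiver = new Man();   // static type Human, actual type Man
        Object argument = "hi";       // static type Object, actual type String
        // Compile time: greet(Object) is selected from the static types.
        // Run time: Man's override of greet(Object) is selected from the
        // receiver's actual type; the argument's actual type is ignored.
        return receiver.greet(argument);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The result is "Man greets object" rather than "Man greets string", which is why Java is usually described as a static multiple dispatch, dynamic single dispatch language.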

Stack-based bytecode interpretation execution engine

Now that we have seen how the Java Virtual Machine invokes a method, the next question is how it executes the bytecode instructions inside the method. When executing code, the virtual machine can choose between interpreted execution (through an interpreter) and compiled execution (native code produced by a just-in-time compiler).

Interpreted execution
The Java language is often characterized as an interpreted language. That was fairly accurate in the JDK 1.0 era, but once virtual machines began to include just-in-time compilers, whether the code in a class file is interpreted or compiled became something only the virtual machine itself can decide.

Whether interpreted or compiled, and whether on a physical machine or a virtual machine, a machine cannot read and understand application code the way a human does and then execute it. Before most program code becomes the target code of a physical machine or an instruction set that a virtual machine can execute, it must pass through a series of steps, and interpretation is one path through that process.

In the Java language, the javac compiler performs lexical analysis and syntax analysis of the source code to build an abstract syntax tree, and then traverses the tree to generate a linear bytecode instruction stream. Because this part happens outside the virtual machine while the interpreter is inside it, the compilation of a Java program is a semi-independent implementation.

Stack-based instruction set and register-based instruction set
The instruction stream output by the Java compiler is basically a stack-based instruction set architecture. Most of its instructions are zero-address instructions, which rely on the operand stack to do their work. The other common kind of instruction set architecture is register-based.

Advantages and disadvantages of each:
1. The main advantage of a stack-based instruction set is portability. However, the same operation needs more instructions than on a register-based instruction set, and the frequent pushes and pops mean frequent memory access, so execution is slower.
2. The main advantage of a register-based instruction set is speed: fewer instructions are needed for the same operation. But because registers are provided directly by the hardware, portability suffers.
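
The difference in instruction count can be illustrated with the computation 1 + 1. The following is only a simulation in plain Java, not real bytecode or machine code: the comments show the stack instructions (JVM mnemonics) and the register instructions (x86-style mnemonics) that each statement stands for, and the class name InstructionSetSketch is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class InstructionSetSketch {
    // Stack style: operands are implicit; each instruction works on the
    // top of the operand stack. Four instructions are needed.
    static int stackStyle() {
        Deque<Integer> stack = new ArrayDeque<>();
        int[] locals = new int[1];
        stack.push(1);                          // iconst_1
        stack.push(1);                          // iconst_1
        stack.push(stack.pop() + stack.pop());  // iadd
        locals[0] = stack.pop();                // istore_0
        return locals[0];
    }

    // Register style: instructions name their operands explicitly.
    // Two instructions are enough.
    static int registerStyle() {
        int eax;
        eax = 1;        // mov eax, 1
        eax = eax + 1;  // add eax, 1
        return eax;
    }

    public static void main(String[] args) {
        System.out.println(stackStyle() + " " + registerStyle());
    }
}
```

Both versions compute 2; the stack version simply spends extra instructions moving values onto and off the stack.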

Stack-based interpreter execution process
This is illustrated with a simple arithmetic example. The code is:

public int calc() {
    int a = 100;
    int b = 200;
    int c = 300;
    return (a + b) * c;
}

[Figure: step-by-step bytecode execution of calc(), showing the bytecode instructions, PC counter, operand stack, and local variable table]

The walkthrough above is only a conceptual model. Actual execution inside a virtual machine will differ, because the interpreter and the just-in-time compiler both optimize the bytecode.
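
The conceptual model can be made concrete with a toy interpreter. The sketch below is an assumption-laden simplification: the instruction sequence mirrors what javac typically emits for calc() (bipush/sipush, istore_n, iload_n, iadd, imul, ireturn), but the Op enum, the Insn class, and the use of an array index as the PC are hypothetical stand-ins for real opcodes and byte offsets.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class CalcInterpreter {
    enum Op { PUSH, ISTORE, ILOAD, IADD, IMUL, IRETURN }

    static final class Insn {
        final Op op;
        final int operand;
        Insn(Op op, int operand) { this.op = op; this.operand = operand; }
    }

    static Insn insn(Op op, int operand) { return new Insn(op, operand); }
    static Insn insn(Op op) { return new Insn(op, 0); }

    // Slot 0 would hold `this`; a, b, c live in slots 1, 2, 3.
    static final Insn[] CALC = {
        insn(Op.PUSH, 100), insn(Op.ISTORE, 1),   // bipush 100; istore_1
        insn(Op.PUSH, 200), insn(Op.ISTORE, 2),   // sipush 200; istore_2
        insn(Op.PUSH, 300), insn(Op.ISTORE, 3),   // sipush 300; istore_3
        insn(Op.ILOAD, 1), insn(Op.ILOAD, 2),     // iload_1; iload_2
        insn(Op.IADD),                            // iadd
        insn(Op.ILOAD, 3),                        // iload_3
        insn(Op.IMUL),                            // imul
        insn(Op.IRETURN)                          // ireturn
    };

    static int run(Insn[] code, int maxLocals) {
        int[] locals = new int[maxLocals];          // local variable table
        Deque<Integer> stack = new ArrayDeque<>();  // operand stack
        int pc = 0;                                 // index of next instruction
        while (true) {
            Insn i = code[pc++];
            switch (i.op) {
                case PUSH:    stack.push(i.operand); break;
                case ISTORE:  locals[i.operand] = stack.pop(); break;
                case ILOAD:   stack.push(locals[i.operand]); break;
                case IADD:    stack.push(stack.pop() + stack.pop()); break;
                case IMUL:    stack.push(stack.pop() * stack.pop()); break;
                case IRETURN: return stack.pop();
                default: throw new IllegalStateException("unknown op " + i.op);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run(CALC, 4));
    }
}
```

Running it with four local variable slots returns (100 + 200) * 300 = 90000, and the operand stack never holds more than two values at once.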

Summary:
So far we have learned how Java programs are stored (class file structure), how they are loaded (class loading mechanism, class loaders), the runtime data areas, and how the code is executed. The next chapter covers the garbage collector and memory allocation strategies: how to tell whether the memory an object occupies can be reclaimed, which areas of the runtime data areas reclamation mainly targets, and what the memory allocation and reclamation policies are.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.