Deep Understanding of the Java Virtual Machine (5): The Bytecode Execution Engine and a Closer Look at Bytecode

Source: Internet
Author: User

What is bytecode?

Baidu's explanation is as follows:

Bytecode is a binary file that contains an executable program and is made up of a sequence of opcode/data pairs. It is an intermediate code, more abstract than machine code.

It is often regarded as a binary file containing an executable program, somewhat like an object model. It is called bytecode because each opcode is usually one byte long,

although the overall length of an instruction varies. Each instruction has a one-byte opcode from 0 to 255 (hexadecimal 00 to FF), followed by operands such as registers or memory addresses.

 

If that still leaves you unsure what bytecode is: simply put, it is the content of the ".class" file produced by compiling Java source.

A class file is therefore a bytecode file that is executed by a virtual machine. In other words, what distinguishes Java from C and C++ is that its compile-and-run process includes a virtual machine step. This milestone design was explained in the article "Understanding Java Virtual Machine (3)-class structure". The previous section described how the virtual machine loads classes.

This section describes how a virtual machine executes class files.

 

The Java virtual machine specification defines a conceptual model of how a virtual machine executes bytecode. Concrete virtual machines are free to implement it differently.

Runtime stack frame structure

The stack is memory private to each thread.

A stack frame stores the local variable table, the operand stack, the dynamic link, and the method return address.

Each method invocation corresponds to one stack frame being pushed onto the virtual machine stack when the method is called and popped off when it completes.

Only the stack frame at the top of the stack is active. It is called the current stack frame, and its method is called the current method.

All bytecode instructions run by the execution engine operate only on the current stack frame and the current method.
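As a rough illustration of "one frame per invocation", the sketch below compares the depth of the current thread's stack trace when a helper is called directly and when it is called through one extra method. The names StackFrameDemo, depth, and nested are made up for this example. A stack trace is only an approximation of the frames described by the specification, and a JVM is allowed to omit frames from it, so treat the result as typical behaviour rather than a guarantee.

```java
class StackFrameDemo {
    // Number of frames the JVM reports for the current thread at this point.
    static int depth() {
        return Thread.currentThread().getStackTrace().length;
    }

    // Reaches depth() through one additional method, so one more frame is live.
    static int nested() {
        return depth();
    }

    public static void main(String[] args) {
        int direct = depth();
        int viaNested = nested();
        // Typically prints 1: the extra call added exactly one frame.
        System.out.println("extra frames: " + (viaNested - direct));
    }
}
```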

1. Local variable table

The local variable table is a storage area for a group of variable values. It holds the method's parameters and the local variables defined inside the method.

When Java source is compiled into a class file, the maximum capacity that the method's local variable table needs has already been determined.

The smallest unit of the local variable table is a Slot.

The virtual machine specification does not fix the size of a Slot. It only requires that a Slot can hold a value of type boolean, byte, ..., reference, or returnAddress.

A reference refers to an object instance. Its size is not specified either, but it can be thought of as something like a pointer in C++.

The local variable table is accessed by index, starting from 0, so it really is just a table.

Slots in the local variable table are allocated in the following order:

The this reference (for instance methods). It can be considered an implicit parameter.

The method's parameters, in declaration order.

The local variables defined in the method body, in declaration order.

A variable of 32 bits or fewer occupies one Slot, and a 64-bit variable occupies two Slots. In Java, the 64-bit types are long and double.

To save as much space as possible, a Slot can be reused once the variable that occupied it has gone out of scope.

Note: unlike class fields, local variables have no preparation stage that gives them default values; they are only allocated space. Therefore, a local variable must be assigned a value before it is used.
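The allocation order above can be sketched as a small calculation. The helper slotIndexes below is hypothetical (it is not a JVM or JDK API): given whether the method is static and the types of its parameters and locals in declaration order, it returns the starting Slot index of each one. It deliberately ignores Slot reuse.

```java
import java.util.Arrays;

class SlotLayoutDemo {
    // Returns the starting Slot index of each parameter or local variable,
    // given their types in declaration order (illustrative helper only).
    static int[] slotIndexes(boolean isStatic, String... types) {
        int[] indexes = new int[types.length];
        int next = isStatic ? 0 : 1;          // Slot 0 holds `this` for instance methods
        for (int i = 0; i < types.length; i++) {
            indexes[i] = next;
            boolean wide = types[i].equals("long") || types[i].equals("double");
            next += wide ? 2 : 1;             // long and double take two Slots
        }
        return indexes;
    }

    public static void main(String[] args) {
        // Instance method such as: long sum(int a, long b) { int c = 1; ... }
        System.out.println(Arrays.toString(slotIndexes(false, "int", "long", "int"))); // [1, 2, 4]
        // The same variables in a static method: no `this`, so indexes start at 0
        System.out.println(Arrays.toString(slotIndexes(true, "int", "long", "int")));  // [0, 1, 3]
    }
}
```

For a real method, the actual layout can be checked with javap -v, which prints the local variable table when the class was compiled with debug information.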

2. Operand stack

Conceptually, the operand stack plays the role that registers play in a hardware CPU.

The Java virtual machine's instruction set does not operate on registers directly, so it uses an operand stack to hold intermediate data.

The virtual machine uses the operand stack as its workspace: most instructions pop data from it, perform an operation, and push the result back onto it.

For example, the iadd instruction pops two ints from the operand stack, adds them, and pushes the result back onto the operand stack. The following example demonstrates how the virtual machine adds two int local variables and then stores the result in a third local variable:

begin

iload_0 // push the int in local variable 0 onto the stack

iload_1 // push the int in local variable 1 onto the stack

iadd // pop two ints, add them, push result

istore_2 // pop int, store into local variable 2

end

Reading and writing data on the operand stack is done through push and pop operations.
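To make the listing above concrete, here is a minimal sketch that replays the four instructions on a toy frame, using an int array as the local variable table and a Deque as the operand stack. The class OperandStackDemo is invented for illustration and says nothing about how a real JVM stores its frames.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class OperandStackDemo {
    // Replays iload_0, iload_1, iadd, istore_2 and returns the local variable table.
    static int[] run(int local0, int local1) {
        int[] locals = {local0, local1, 0};          // local variable table
        Deque<Integer> stack = new ArrayDeque<>();   // operand stack

        stack.push(locals[0]);                       // iload_0
        stack.push(locals[1]);                       // iload_1
        int right = stack.pop();                     // iadd: pop two ints,
        int left = stack.pop();
        stack.push(left + right);                    //       push their sum
        locals[2] = stack.pop();                     // istore_2
        return locals;
    }

    public static void main(String[] args) {
        System.out.println(run(3, 4)[2]);            // prints 7
    }
}
```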

3. Dynamic linking

Each stack frame contains a reference into the runtime constant pool, pointing to the method the frame belongs to, in order to support dynamic linking.

Some of the symbolic references in the constant pool are converted into direct references during class loading or on first use. This is called static resolution.

The rest are converted into direct references at run time, each time they are executed. This is dynamic linking.

4. Method return address

A method can exit in only two ways: normally, by executing a return instruction, or abnormally, because of an exception that is not handled inside the method.

Normal exit: the caller's program counter value can serve as the return address, and the stack frame generally stores it. Execution then resumes in the caller,

and the return value (if any) is pushed onto the caller's operand stack.

Abnormal exit: the method returns no value to its caller. The return address is determined through the exception handler table, and the stack frame generally does not store this information.
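Seen from Java source, the two exit paths look like this. The sketch below uses made-up names (MethodExitDemo, divide, call): divide either completes normally and hands a value back to its caller, or exits abnormally with an ArithmeticException, in which case the caller receives no return value and control goes to its exception handler instead.

```java
class MethodExitDemo {
    // Exits normally via a return instruction, or abnormally when b == 0.
    static int divide(int a, int b) {
        return a / b;
    }

    // Reports which exit path divide() took.
    static String call(int a, int b) {
        try {
            int result = divide(a, b);               // normal exit: value handed back to the caller
            return "returned " + result;
        } catch (ArithmeticException e) {            // abnormal exit: no value, handler runs instead
            return "threw " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(call(6, 3));              // returned 2
        System.out.println(call(1, 0));              // threw ArithmeticException
    }
}
```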

5. Method call

The method call phase does not run the method body; its only task is to determine which method is being called. Class files have no link step at compile time,

so dynamic linking, a technique that already exists in C++, is taken to a new level in Java. In theory, every method (except private methods, constructors, and static methods; the same applies below)

behaves like a virtual function in C++. Such methods must therefore be bound dynamically to determine the actual method to execute.

Resolution

The target of every method call is a symbolic reference in the constant pool. During the resolution phase of class loading, some of these symbolic references are converted into direct references (which can be understood as the direct address of the specific method).

The methods that can be converted this way are mainly static methods and private methods.

The Java virtual machine provides the following method call instructions:

invokestatic: calls a static method

invokespecial: calls instance constructors, private methods, and superclass methods

invokevirtual: calls a virtual method

invokeinterface: calls an interface method

invokedynamic: the target method is resolved dynamically at run time and then executed

The methods called by invokestatic and invokespecial can be determined uniquely in the resolution phase. Therefore, these methods are non-virtual methods.

Java also treats final methods as non-virtual: they cannot be overridden, so there is only one possible target.
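The sketch below puts one call site of each kind into a small program. All names (InvokeKindsDemo, Base, Derived, Greeter) are invented for this example, and the instruction named in each comment is what javac typically emits for that kind of call; that is an assumption worth checking with javap -c on your own compiler version, since code generation details differ between releases.

```java
import java.util.function.Supplier;

class InvokeKindsDemo {
    interface Greeter { String greet(); }

    static class Base {
        String name() { return "base"; }
    }

    static class Derived extends Base implements Greeter {
        static String helper() { return "static"; }

        @Override
        String name() {
            return "derived+" + super.name();         // superclass call: invokespecial
        }

        @Override
        public String greet() { return "hello"; }
    }

    static String run() {
        Derived d = new Derived();                    // constructor <init>: invokespecial
        Base asBase = d;
        Greeter asGreeter = d;
        Supplier<String> lambda = () -> "lambda";     // lambda creation: invokedynamic (javac 8+)

        return Derived.helper()                       // invokestatic
                + "," + asBase.name()                 // invokevirtual
                + "," + asGreeter.greet()             // invokeinterface
                + "," + lambda.get();                 // invokeinterface on Supplier
    }

    public static void main(String[] args) {
        System.out.println(run());                    // static,derived+base,hello,lambda
    }
}
```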

Dispatch

Static dispatch

Let's take a look at an example:

    package com.joyfulmath.jvmexample.dispatch;

    import com.joyfulmath.jvmexample.TraceLog;

    /**
     * @author deman.lu
     * @version on 2016-05-19 13:53
     */
    public class StaticDispatch {
        static abstract class Human {
        }

        static class Man extends Human {
        }

        static class Woman extends Human {
        }

        public void sayHello(Human guy) {
            TraceLog.i("Hello guy!");
        }

        public void sayHello(Man man) {
            TraceLog.i("Hello gentleman!");
        }

        public void sayHello(Woman man) {
            TraceLog.i("Hello lady!");
        }

        public static void action() {
            Human man = new Man();
            Human woman = new Woman();
            StaticDispatch dispatch = new StaticDispatch();
            dispatch.sayHello(man);
            dispatch.sayHello(woman);
        }
    }

05-19 13:58:05.538 14881-14881/com.joyfulmath.jvmexample I/StaticDispatch: sayHello: Hello guy! [at (StaticDispatch.java:24)]
05-19 13:58:05.539 14881-14881/com.joyfulmath.jvmexample I/StaticDispatch: sayHello: Hello guy! [at (StaticDispatch.java:24)]

Both calls end up in public void sayHello(Human guy). Shouldn't polymorphism apply here?

Human man = new Man();

Here Human is the static type of the variable, and Man is its actual type. Only the static type is known at compile time; the actual type is not known until run time.

Therefore, when choosing among the overloaded sayHello methods, the compiler uses the static type of the argument rather than its actual type to decide which version to call.

If the arguments are cast explicitly:

    public static void action() {
        Human man = new Man();
        Human woman = new Woman();
        StaticDispatch dispatch = new StaticDispatch();
        dispatch.sayHello(man);
        dispatch.sayHello(woman);
        dispatch.sayHello((Man) man);
        dispatch.sayHello((Woman) woman);
    }

05-19 14:08:29.000 21838-21838/com.joyfulmath.jvmexample I/StaticDispatch: sayHello: Hello guy! [at (StaticDispatch.java:24)]
05-19 14:08:29.001 21838-21838/com.joyfulmath.jvmexample I/StaticDispatch: sayHello: Hello guy! [at (StaticDispatch.java:24)]
05-19 14:08:29.001 21838-21838/com.joyfulmath.jvmexample I/StaticDispatch: sayHello: Hello gentleman! [at (StaticDispatch.java:29)]
05-19 14:08:29.002 21838-21838/com.joyfulmath.jvmexample I/StaticDispatch: sayHello: Hello lady! [at (StaticDispatch.java:34)]

With an explicit cast, the static type of the argument expression changes as well, so a different overload is selected.

A typical application of static dispatch is method overloading. However, the overload to call is sometimes not a unique exact match, so the compiler can only select the most suitable version.

For example:

    public void sayHello(int data) {
        TraceLog.i("Hello int!");
    }

    public void sayHello(long data) {
        TraceLog.i("Hello long");
    }

Calling sayHello(1) normally invokes the int version. However, if the int version is commented out so that only the long version remains, the long version is called instead.
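The int/long case can be tried without commenting anything out by placing the two situations in separate classes. This is a minimal sketch with invented names (OverloadDemo, Both, LongOnly); the methods return their message instead of logging through TraceLog so that the example is self-contained.

```java
class OverloadDemo {
    // Both overloads are available.
    static class Both {
        String sayHello(int data)  { return "Hello int!"; }
        String sayHello(long data) { return "Hello long"; }
    }

    // Stands in for "the int version is commented out".
    static class LongOnly {
        String sayHello(long data) { return "Hello long"; }
    }

    public static void main(String[] args) {
        // The literal 1 has static type int, so the int overload is an exact match.
        System.out.println(new Both().sayHello(1));      // Hello int!
        // With no int overload, the compiler widens the int argument to long.
        System.out.println(new LongOnly().sayHello(1));  // Hello long
    }
}
```

In both cases the choice is made by the compiler, which is why it depends only on the static types visible at the call site.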

 

Dynamic dispatch

The previous part was about overloading; this part is about overriding (@Override).

    package com.joyfulmath.jvmexample.dispatch;

    import com.joyfulmath.jvmexample.TraceLog;

    /**
     * @author deman.lu
     * @version on 2016-05-19 14:26
     */
    public class DynamicDispatch {
        static abstract class Human {
            protected abstract void sayHello();
        }

        static class Man extends Human {
            @Override
            protected void sayHello() {
                TraceLog.i("Hello gentleman!");
            }
        }

        static class Woman extends Human {
            @Override
            protected void sayHello() {
                TraceLog.i("Hello lady!");
            }
        }

        public static void action() {
            Human man = new Man();
            Human woman = new Woman();
            man.sayHello();
            woman.sayHello();
            man = new Woman();
            man.sayHello();
        }
    }

Look at the man.sayHello() calls above: each one has to be resolved to some sayHello method, but what does man actually refer to? That is not known at compile time. Which class's method "man.sayHello();" executes can only be determined by the virtual machine at run time,

through dynamic linking, and this is polymorphism. Analyzing the class with javap makes this clear: in the class file, the call refers to DynamicDispatch$Human.sayHello. In other words, the class file itself does not record which implementation of sayHello will be called.

The resolution process of the invokevirtual instruction is roughly as follows. First, find the actual type of the object that the call is made on (the reference at the top of the operand stack in this example), and call it C.

If type C contains a method whose name and descriptor match those in the constant, and the access check passes, a direct reference to this method is returned.

Otherwise, the same search is carried out on each superclass of C in turn, from bottom to top.

In short, the lookup starts from the actual class of the object and moves up to the superclasses only when no matching method is found. The first match is the method that is actually invoked.

If no suitable method is found, an exception (java.lang.AbstractMethodError) is thrown. In general, the compiler prevents this situation from arising.
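The lookup order can be imitated with reflection. The sketch below is only an analogy: a real virtual machine does not use reflection for invokevirtual (implementations commonly rely on precomputed tables), and the hypothetical helper findDeclaringClass matches on the method name with no parameters and skips the access check. It does follow the same search order: start at the actual class C, then walk up the superclasses.

```java
class VirtualLookupDemo {
    static class Human { String sayHello() { return "Hello human"; } }
    static class Man extends Human {
        @Override String sayHello() { return "Hello gentleman!"; }
    }
    static class Boy extends Man { }                  // inherits sayHello from Man

    // Walks from the receiver's actual class up the superclass chain and returns
    // the first class that declares a no-argument method with the given name.
    static Class<?> findDeclaringClass(Object receiver, String name) {
        for (Class<?> c = receiver.getClass(); c != null; c = c.getSuperclass()) {
            try {
                c.getDeclaredMethod(name);            // throws if c does not declare it
                return c;
            } catch (NoSuchMethodException e) {
                // not declared here: continue with the superclass
            }
        }
        throw new AbstractMethodError(name);          // mirrors the failure case described above
    }

    public static void main(String[] args) {
        Human h = new Boy();                          // static type Human, actual type Boy
        System.out.println(findDeclaringClass(h, "sayHello").getSimpleName()); // Man
        System.out.println(h.sayHello());             // Hello gentleman!
    }
}
```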

Single dispatch and multiple dispatch

The concept is hard to grasp in the abstract. Put plainly, this is the case where overloading and overriding are both involved:

    package com.joyfulmath.jvmexample.dispatch;

    import com.joyfulmath.jvmexample.TraceLog;

    /**
     * @author deman.lu
     * @version on 2016-05-19 15:02
     */
    public class MultiDispatch {
        static class QQ {}

        static class _360 {}

        public static class Father {
            public void hardChoice(QQ qq) {
                TraceLog.i("Father QQ");
            }

            public void hardChoice(_360 aa) {
                TraceLog.i("Father 360");
            }
        }

        public static class Son extends Father {
            public void hardChoice(QQ qq) {
                TraceLog.i("Son QQ");
            }

            public void hardChoice(_360 aa) {
                TraceLog.i("Son 360");
            }
        }

        public static void action() {
            Father father = new Father();
            Father son = new Son();
            father.hardChoice(new _360());
            son.hardChoice(new QQ());
        }
    }

05-19 15:07:44.429 29011-29011/com.joyfulmath.jvmexample I/MultiDispatch$Father: hardChoice: Father 360 [at (MultiDispatch.java:19)]
05-19 15:07:44.429 29011-29011/com.joyfulmath.jvmexample I/MultiDispatch$Son: hardChoice: Son QQ [at (MultiDispatch.java:25)]

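The results hold no surprises, but the process is worth spelling out. Which hardChoice overload to call is decided at compile time, based on two things: the static type of the object the method is called on and the type of the argument. Static dispatch in Java is therefore multiple dispatch.

For son.hardChoice(new QQ()), the method signature has already been fixed at compile time; at run time the virtual machine only needs to determine the actual type of the object. Dynamic dispatch in Java is therefore single dispatch.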

 

Dynamic language support:

In C++, a function can be passed to another function through a function pointer:

void sort(int list[], const int size, int (*compare)(int, int));

This is harder to achieve in Java, where the usual form is

void sort(List list, Comparator c); the comparison is generally supplied as an object that implements an interface.

Java 7 added support for MethodHandle (in the java.lang.invoke package).

This part will be updated later because I have not been able to configure and run it in my local environment.
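For readers who want something to experiment with in the meantime, here is a minimal sketch of passing a comparison function as a MethodHandle, roughly in the spirit of the C++ function pointer above. It uses the standard java.lang.invoke API (MethodHandles.lookup, findStatic, MethodType.methodType, invoke); the class MethodHandleSortDemo and its methods descending, sort, and demo are made up for this example and are not the author's code. The lambda is used only to adapt the handle to List.sort, so the sketch assumes Java 8 or later.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MethodHandleSortDemo {
    // The function we want to pass around, like int (*compare)(int, int) in C++.
    static int descending(int a, int b) {
        return Integer.compare(b, a);
    }

    // Sorts a copy of the list, calling the handle for every comparison.
    static List<Integer> sort(List<Integer> input, MethodHandle compare) {
        List<Integer> copy = new ArrayList<>(input);
        copy.sort((x, y) -> {
            try {
                return (int) compare.invoke(x, y);   // invoke adapts Integer arguments to int
            } catch (Throwable t) {
                throw new RuntimeException(t);
            }
        });
        return copy;
    }

    static List<Integer> demo() {
        try {
            // Look up descending(int, int) -> int by name and type.
            MethodHandle handle = MethodHandles.lookup().findStatic(
                    MethodHandleSortDemo.class, "descending",
                    MethodType.methodType(int.class, int.class, int.class));
            return sort(Arrays.asList(3, 1, 2), handle);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());                  // [3, 2, 1]
    }
}
```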

 

With all that groundwork in place, let's turn to how bytecode is executed.

6. Stack-based bytecode execution engine

There are stack-based instruction sets and register-based instruction sets.

Let's first look at an addition:

iconst_1

iconst_1

iadd

istore_0

This is the stack-based form, and the stack is the operand stack described above.

First, the two constants are pushed onto the stack; iadd then pops them, adds them, and pushes the sum back onto the top of the stack; finally, istore_0 pops the value at the top of the stack and stores it in Slot 0.

Register-based instruction sets are not explained here.

Both register-based and stack-based instruction sets are in use, so it is hard to say that one is simply superior to the other.

Stack-based instruction sets are independent of hardware, while register-based instruction sets depend on the hardware's registers. Register-based instruction sets have the advantage in execution efficiency.

However, virtual machines exist to provide cross-platform support, so the JVM execution engine uses a stack-based instruction set.

    public int calc() {
        int a = 100;
        int b = 200;
        int c = 300;
        return (a + b) * c;
    }

The analysis result of javap is as follows:

[Figure: javap output for calc()]

The following figures show how the code, the operand stack, and the local variable table change throughout the execution process.

[Figures: step-by-step execution of calc()]

These steps are only a conceptual model; an actual virtual machine applies many optimizations.
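The conceptual model can be replayed with a toy interpreter. The instruction sequence in CALC below is an assumption: it is the kind of listing javac typically produces for calc(), written here as "istore 1" rather than the short form istore_1 to keep parsing simple, and it may differ from what your compiler prints. The class CalcInterpreterDemo is invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class CalcInterpreterDemo {
    // Assumed bytecode for calc(); Slot 0 would hold `this`, so a, b, c use Slots 1 to 3.
    static final String[] CALC = {
        "bipush 100", "istore 1",
        "sipush 200", "istore 2",
        "sipush 300", "istore 3",
        "iload 1", "iload 2", "iadd",
        "iload 3", "imul", "ireturn"
    };

    // Executes the listing on a toy frame and returns the value passed to ireturn.
    static int run(String[] code) {
        int[] locals = new int[4];                   // local variable table
        Deque<Integer> stack = new ArrayDeque<>();   // operand stack
        for (String insn : code) {
            String[] parts = insn.split(" ");
            switch (parts[0]) {
                case "bipush":
                case "sipush":
                    stack.push(Integer.parseInt(parts[1]));          // push a constant
                    break;
                case "istore":
                    locals[Integer.parseInt(parts[1])] = stack.pop(); // pop into a Slot
                    break;
                case "iload":
                    stack.push(locals[Integer.parseInt(parts[1])]);  // push a Slot's value
                    break;
                case "iadd":
                    stack.push(stack.pop() + stack.pop());
                    break;
                case "imul":
                    stack.push(stack.pop() * stack.pop());
                    break;
                case "ireturn":
                    return stack.pop();              // hand the top of the stack to the caller
                default:
                    throw new IllegalArgumentException("unknown instruction: " + insn);
            }
        }
        throw new IllegalStateException("method ended without ireturn");
    }

    public static void main(String[] args) {
        System.out.println(run(CALC));               // 90000, i.e. (100 + 200) * 300
    }
}
```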

Disclaimer: The images in this article come from the reference book, and their copyright belongs to the original author.

Reference:

Zhou Zhiming, Deep Understanding of the Java Virtual Machine

 

  

 
