JVM Summary (v): JVM bytecode execution engine

Source: Internet
Author: User

JVM byte code execution engine
Run-time stack frame structure
Local variable table
Operand stacks
Dynamic linking
Method return Address
Additional Information
Method invocation
Resolution
Dispatch – Implementation of "overloaded" and "overridden"
Static Dispatch
Dynamic Dispatch
Single Dispatch and multiple dispatch
Implementation of dynamic dispatch of the JVM
Stack-based byte-code interpretation execution engine
Stack-based instruction set and register-based instruction set

JVM byte code execution engine

A virtual machine is defined in contrast to a physical machine. Both can execute code; the difference is that a physical machine's execution engine is built directly on the processor, hardware, instruction set, and operating system, whereas a virtual machine's execution engine is implemented in software. Its designers can therefore define their own instruction set and execution engine structure, and can execute instruction formats that the hardware does not directly support.
The Java Virtual Machine Specification defines a conceptual model of the bytecode execution engine, and this model is the uniform facade presented by the execution engines of all virtual machine implementations. When executing Java code, an implementation may choose between interpreted execution (through an interpreter) and compiled execution (generating native code through a just-in-time compiler). Some virtual machines use only one of these, some use both, and some even contain several just-in-time compilers that work at different levels.
Seen from the outside, all Java virtual machine execution engines behave the same way: the input is a bytecode file, the processing is equivalent to interpreting that bytecode, and the output is the execution result.

Run-time stack frame structure

A stack frame is the data structure that supports method invocation and method execution in the virtual machine. It is the element type of the virtual machine stack, which is part of the run-time data area.
Contents: a stack frame holds the local variable table, the operand stack, the dynamic linking reference, the method return address, and some additional information.
Execution procedure: the method call chain in a thread can be long, so many methods may be in progress at the same time. For the active thread, only the frame at the top of the stack is in effect. It is called the current stack frame, the method associated with it is called the current method, and every bytecode instruction run by the execution engine operates only on the current stack frame.
Execution significance: each method, from the moment it is called until it completes, corresponds to one stack frame being pushed onto and then popped off the virtual machine stack.

It is worth noting that when the program is compiled, the size of the local variable table and the maximum depth of the operand stack that a frame needs are already fully determined and written into the Code attribute of the method table. The amount of memory a stack frame needs is therefore not affected by run-time data; it depends only on the specific virtual machine implementation.
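
As a rough illustration of frames being pushed and popped, the sketch below counts the frames on the current thread's stack with StackWalker (available since Java 9). The class and method names are made up for this example. The absolute numbers depend on how the code is launched, so only the difference between the two measurements is meaningful.

```java
// Hypothetical demo class: observes frames being pushed by nested calls.
class FrameDepthDemo {

    // Number of stack frames currently on this thread's stack.
    static long currentDepth() {
        return StackWalker.getInstance().walk(frames -> frames.count());
    }

    // Calls itself `extraCalls` times, then reports the stack depth.
    static long depthAfter(int extraCalls) {
        if (extraCalls == 0) {
            return currentDepth();
        }
        return depthAfter(extraCalls - 1);
    }

    public static void main(String[] args) {
        long shallow = depthAfter(0);
        long deep = depthAfter(3);
        // Each extra call pushed exactly one more frame.
        System.out.println(deep - shallow); // prints 3
    }
}
```

When each call returns, its frame is popped again, which is why a later measurement from the same caller starts from the same baseline.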

Local variable table

The local variable table is a set of storage slots for variable values. It holds the method's parameters and the local variables defined inside the method. When a Java program is compiled into a class file, the maximum capacity of the local variable table that the method needs is recorded in the max_locals data item of the method's Code attribute.
Contained types: boolean, byte, char, short, int, float, reference, and returnAddress, eight types in total.
Capacity unit: the variable slot (Slot). The virtual machine specification does not state how much memory a slot occupies. It only says, as a guideline, that each slot should be able to hold one value of type boolean, byte, char, short, int, float, reference, or returnAddress. This is subtly different from stating outright that "each slot occupies 32 bits of memory space", because it allows the slot length to vary with the processor, operating system, or virtual machine. Even if a virtual machine on a 64-bit system implements a slot with 64 bits of memory, it still has to use alignment and padding so that the slot looks the same as it does on a 32-bit virtual machine.

The Java data types that fit within 32 bits are boolean, byte, char, short, int, float, reference, and returnAddress. The first six need no explanation. A reference is a reference to an object instance. The virtual machine specification neither states its length nor prescribes its structure, but in general an implementation should at least be able to use the reference to find, directly or indirectly, the starting address of the object's data in the Java heap and the object's type data in the method area. The returnAddress type serves the bytecode instructions jsr, jsr_w, and ret, and points to the address of a bytecode instruction.
For a 64-bit data type, the virtual machine allocates two consecutive slots. The only 64-bit types in Java are long and double. In effect, a long or double value is split into two 32-bit reads and writes. However, because the local variable table lives on the thread's stack and is thread-private data, it does not matter whether reading or writing the two consecutive slots is atomic; no data safety problem arises.

Indexing: the virtual machine accesses the local variable table by index, and the index ranges from 0 to one less than the number of slots in the table. For a variable of a 32-bit data type, index n refers to the nth slot; for a variable of a 64-bit data type, the nth and (n+1)th slots are used together.
During method execution, the virtual machine uses the local variable table to pass argument values to the parameter list. For an instance method (one not declared static), the slot at index 0 holds by default the reference to the object instance on which the method was invoked, and the method can access this implicit parameter through the keyword "this". The remaining parameters follow in parameter-list order, occupying slots starting at index 1. After the parameters have been allocated, the remaining slots are assigned to the variables defined in the method body, according to their order and scope.
Slots in the local variable table are reusable. The scope of a variable defined in the method body does not necessarily cover the whole method body; once the bytecode PC counter has moved past a variable's scope, that variable's slot can be handed to another variable, which saves stack space. This reuse can, however, also affect the system's garbage collection behavior, for example when a slot that is out of scope but not yet overwritten still holds a reference to an object.
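
The indexing rules above (slot 0 for "this" in instance methods, two slots for long and double) can be sketched as a small calculation. This is an illustrative helper only; SlotLayout, parameterSlots, slotsOf, and the two sample methods are hypothetical names, and the code computes the layout from reflection data rather than reading it from a class file.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;

// Hypothetical helper: computes the local variable slot index of each parameter.
class SlotLayout {

    static int[] parameterSlots(Method method) {
        Class<?>[] types = method.getParameterTypes();
        int[] slots = new int[types.length];
        // Slot 0 holds `this` for instance methods; static methods start at 0.
        int next = Modifier.isStatic(method.getModifiers()) ? 0 : 1;
        for (int i = 0; i < types.length; i++) {
            slots[i] = next;
            // long and double occupy two consecutive slots, everything else one.
            next += (types[i] == long.class || types[i] == double.class) ? 2 : 1;
        }
        return slots;
    }

    // Convenience wrapper that looks up one of the sample methods below by name.
    static int[] slotsOf(String name, Class<?>... paramTypes) {
        try {
            return parameterSlots(SlotLayout.class.getDeclaredMethod(name, paramTypes));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Sample methods used only to demonstrate the layout.
    void instanceMethod(int a, long b, Object c) { }
    static void staticMethod(double x, int y) { }

    public static void main(String[] args) {
        // this -> 0, a -> 1, b -> 2 and 3, c -> 4
        System.out.println(Arrays.toString(
                slotsOf("instanceMethod", int.class, long.class, Object.class))); // [1, 2, 4]
        // x -> 0 and 1, y -> 2
        System.out.println(Arrays.toString(
                slotsOf("staticMethod", double.class, int.class)));               // [0, 2]
    }
}
```

Running javap -v on a compiled class shows the slot numbers the compiler actually assigned, provided the class was compiled with debug information.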

Note also that local variables have no "preparation phase" like the class variables described earlier. A class variable is assigned an initial value twice during class loading: once in the preparation phase, when it receives the system default value, and once in the initialization phase, when it receives the value the programmer defined. Local variables are different: a local variable that is declared but never assigned an initial value cannot be used. So do not assume that every variable in Java has a default value, such as 0 for an integer variable or false for a boolean variable.
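
A minimal sketch of the difference, using a hypothetical DefaultValueDemo class. The commented-out line is the one the compiler would reject.

```java
// Hypothetical demo class: fields get default values, local variables do not.
class DefaultValueDemo {
    static int classCounter;   // class variable: set to 0 in the preparation phase
    boolean flag;              // instance field: false by default when the object is created

    static int useLocal() {
        int local;             // declared but not yet assigned
        // return local;       // would not compile: variable might not have been initialized
        local = 42;            // must be assigned before first use
        return local;
    }

    public static void main(String[] args) {
        System.out.println(classCounter);                 // prints 0
        System.out.println(new DefaultValueDemo().flag);  // prints false
        System.out.println(useLocal());                   // prints 42
    }
}
```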

Operand stacks

The operand stack, also often called the operation stack, is a last-in, first-out (LIFO) stack. As with the local variable table, the maximum depth of the operand stack is written at compile time, into the max_stack data item of the Code attribute.
Each element of the operand stack can be any Java data type, including long and double. A 32-bit data type occupies one unit of stack capacity and a 64-bit data type occupies two. At no point during method execution does the depth of the operand stack exceed the maximum set in the max_stack data item.

When a method starts executing, its operand stack is empty. During execution, bytecode instructions write values to and read values from the operand stack, that is, they push and pop.
The data types of the elements on the operand stack must strictly match the sequence of bytecode instructions. The compiler must guarantee this when compiling the program code, and it is verified again by data-flow analysis during the class verification phase.
In the conceptual model, two stack frames, as elements of the virtual machine stack, are completely independent of each other. Most virtual machine implementations, however, optimize this so that two frames partially overlap: part of the lower frame's operand stack overlaps part of the upper frame's local variable table. A method call can then share that data without copying the arguments.

The Java virtual machine's interpreting execution engine is called a "stack-based execution engine", and the stack in question is the operand stack.

Dynamic linking

Each stack frame contains a reference to the method it belongs to in the run-time constant pool. The frame holds this reference to support dynamic linking during method invocation.
The constant pool of a class file contains a large number of symbolic references, and a method invocation instruction in the bytecode takes a symbolic reference to the target method in the constant pool as its operand. Some of these symbolic references are converted to direct references during class loading or on first use; this is called static resolution. The rest are converted to direct references during each run; this is called dynamic linking.

Method return Address

Once a method has started executing, there are only two ways to exit it.
The first is that the execution engine encounters one of the method-return bytecode instructions. A return value may then be passed to the upper-level caller (the method that called the current method is known as the caller); whether there is a return value, and its type, is determined by which return instruction is encountered. This way of exiting is called normal completion.
The second is that an exception occurs during execution and is not handled within the method body. Whether the exception is raised inside the JVM or by an athrow bytecode instruction in the code, the method exits as long as no matching exception handler is found in its exception table. This way of exiting is called abrupt (exceptional) completion. A method that exits this way never returns a value to its upper-level caller.

Whichever way a method exits, execution must return to the place where the method was called before the program can continue. When the method returns, some information may need to have been saved in the stack frame to help restore the execution state of the upper-level method. In general, when a method exits normally, the value of the caller's PC counter can serve as the return address, and the frame will likely save this counter value. When a method exits abnormally, the return address is determined by the exception handler table, and the frame generally does not save this information.
Exiting a method amounts to popping the current stack frame. The operations that may be performed on exit are therefore: restoring the upper-level method's local variable table and operand stack, pushing the return value (if any) onto the operand stack of the caller's frame, and adjusting the PC counter to point to the instruction after the method invocation instruction.
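
The two ways of exiting described above can be seen from the Java source level. The sketch below uses a hypothetical MethodExitDemo class. The comments describe what happens conceptually and are not a trace of any particular virtual machine.

```java
// Hypothetical demo class: the two ways a method invocation can complete.
class MethodExitDemo {

    // Normal completion: a return instruction hands the value to the caller.
    static int half(int value) {
        return value / 2;
    }

    // Abrupt completion: no handler in this method matches, so the exception
    // propagates to the caller and no return value is produced.
    static int parse(String text) {
        return Integer.parseInt(text); // may throw NumberFormatException
    }

    // The caller's exception table has a matching handler, so the search stops here.
    static String describe(String text) {
        try {
            return "parsed " + parse(text);
        } catch (NumberFormatException e) {
            return "failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(half(10));        // prints 5
        System.out.println(describe("12"));  // prints parsed 12
        System.out.println(describe("x"));   // prints failed
    }
}
```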

Additional Information

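The virtual machine specification allows a concrete implementation to add information that the specification does not describe to the stack frame, such as debugging information. In general, the dynamic linking reference, the method return address, and any other additional information are grouped together and called the stack frame information.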

Method invocation

Compiling a class file does not include the linking step of traditional compilation. All method calls are stored in the class file as symbolic references, not as the entry address of the method in the actual run-time memory layout (the equivalent of the direct reference mentioned earlier). This gives Java more powerful dynamic extensibility, but it also makes method invocation relatively complex: the direct reference to the target method can only be determined during class loading, or even at run time.

Resolution

The target of every method call in a class file is a symbolic reference in the constant pool. During the resolution phase of class loading, some of these symbolic references are converted to direct references. This is possible only if the method has a call version that can be determined before the program actually runs and that cannot change at run time. In other words, the call target must be fixed once the program has been written and compiled. Calls to such methods are called resolution.
The JVM provides five method invocation bytecode instructions:
invokestatic: invokes a static method.
invokespecial: invokes an instance constructor (<init> method), a private method, or a parent class method.
invokevirtual: invokes all virtual methods.
invokeinterface: invokes an interface method; the object that implements the interface is determined at run time.
invokedynamic: first dynamically resolves, at run time, the method referenced by the call site qualifier, and then executes that method.
Any method that can be called by the invokestatic or invokespecial instruction has a unique call version that can be determined in the resolution phase. Four kinds of methods meet this condition: static methods, private methods, instance constructors, and parent class methods. Their symbolic references are resolved to direct references when the class is loaded. These methods are called non-virtual methods; all other methods (except final methods) are called virtual methods. Besides the methods invoked by invokestatic and invokespecial, methods marked final are also non-virtual: although a final method is called with the invokevirtual instruction, it cannot be overridden, so it has only one possible version.
A resolution call is always a static process: the target is fully determined at compile time, and all the symbolic references involved are converted to definite direct references during the resolution phase of class loading, never deferred until run time. A dispatch call, by contrast, may be static or dynamic, and depending on how many things the selection is based on (the receiver, the parameters, or both), it may be single or multiple dispatch. Combining these two classifications pairwise gives four kinds of dispatch: static single dispatch, static multiple dispatch, dynamic single dispatch, and dynamic multiple dispatch.
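
The invocation instructions listed above can be matched to ordinary Java calls. The sketch below uses hypothetical types (Greeter, Base, InvokeDemo). The instruction named in each comment is what javac typically emits; it can vary with the compiler version and target release, so treat the comments as assumptions to check with javap -c on your own JDK.

```java
import java.util.function.Supplier;

// Hypothetical demo types: which invocation instruction javac typically emits per call.
interface Greeter {
    String greet();
}

class Base {
    String name() { return "base"; }
}

class InvokeDemo extends Base implements Greeter {
    static String util() { return "static"; }
    private String secret() { return "private"; }
    final String fixed() { return "final"; }

    @Override
    String name() {
        return "sub of " + super.name();        // invokespecial (parent class method)
    }

    @Override
    public String greet() { return "hello"; }

    static String run() {
        InvokeDemo demo = new InvokeDemo();     // invokespecial (<init>)
        Greeter greeter = demo;
        Supplier<String> lazy = () -> "lambda"; // invokedynamic creates the Supplier
        return String.join(", ",
                util(),                         // invokestatic
                demo.secret(),                  // invokespecial (invokevirtual when javac targets Java 11+)
                demo.fixed(),                   // invokevirtual, although the method is final
                demo.name(),                    // invokevirtual
                greeter.greet(),                // invokeinterface
                lazy.get());                    // invokeinterface on Supplier
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints static, private, final, sub of base, hello, lambda
    }
}
```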

Dispatch – Implementation of "overloaded" and "overridden"

Static Dispatch

When choosing among overloads, the virtual machine (more precisely, the compiler) decides by the static type of the argument rather than its actual type. Because the static type is known at compile time, the javac compiler chooses which overloaded version to use during compilation, based on the static types of the arguments, and writes the symbolic reference to that method into the operand of the corresponding invocation instruction.
All dispatch that relies on the static type to select the version of the method to execute is called static dispatch. The typical application of static dispatch is method overloading.
Static dispatch happens at compile time, so the act of selecting the method is not actually performed by the virtual machine.
Static methods are resolved during class loading, yet static methods can obviously have overloaded versions too; the choice among those overloads is also made through static dispatch.
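
A small sketch of overload selection by static type, modeled on the usual Human/Man/Woman illustration. The class and method names are chosen for this example only.

```java
// Hypothetical demo class: the overload is picked from the argument's static type.
class StaticDispatchDemo {
    static class Human { }
    static class Man extends Human { }
    static class Woman extends Human { }

    static String sayHello(Human guy) { return "hello, guy"; }
    static String sayHello(Man guy)   { return "hello, gentleman"; }
    static String sayHello(Woman guy) { return "hello, lady"; }

    public static void main(String[] args) {
        Human man = new Man();     // static type Human, actual type Man
        Human woman = new Woman(); // static type Human, actual type Woman
        System.out.println(sayHello(man));       // prints hello, guy
        System.out.println(sayHello(woman));     // prints hello, guy
        System.out.println(sayHello((Man) man)); // prints hello, gentleman
    }
}
```

The cast in the last call changes the static type of the argument expression, which is why a different overload is selected even though the object is the same.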

Dynamic Dispatch

Dispatch that selects the version of the method to execute at run time, based on the actual type of the receiver, is called dynamic dispatch. Dynamic dispatch is closely tied to method overriding.
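
A minimal sketch of overriding resolved by the receiver's actual type. DynamicDispatchDemo and its nested classes are hypothetical names for this example.

```java
// Hypothetical demo class: the receiver's actual type selects the override.
class DynamicDispatchDemo {
    abstract static class Human {
        abstract String sayHello();
    }
    static class Man extends Human {
        @Override String sayHello() { return "man say hello"; }
    }
    static class Woman extends Human {
        @Override String sayHello() { return "woman say hello"; }
    }

    public static void main(String[] args) {
        Human person = new Man();
        System.out.println(person.sayHello()); // prints man say hello
        person = new Woman();                  // same variable, same static type
        System.out.println(person.sayHello()); // prints woman say hello
    }
}
```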

Single Dispatch and multiple dispatch

The receiver of a method and the method's parameters are collectively referred to as the method's dispatch criteria. Depending on how many criteria the dispatch is based on, dispatch can be divided into single dispatch and multiple dispatch: single dispatch selects the target method based on one criterion, and multiple dispatch selects it based on more than one.
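
Java is commonly described as a language with static multiple dispatch and dynamic single dispatch: the compiler picks an overload from the static types of both the receiver and the arguments, while at run time only the receiver's actual type matters. The sketch below illustrates that reading with hypothetical classes (Parent, Child, Tea, Coffee).

```java
// Hypothetical demo class: two criteria at compile time, one at run time.
class DispatchKindsDemo {
    static class Tea { }
    static class Coffee { }

    static class Parent {
        String choose(Tea drink)    { return "parent picks tea"; }
        String choose(Coffee drink) { return "parent picks coffee"; }
        String choose(Object drink) { return "parent picks something"; }
    }
    static class Child extends Parent {
        @Override String choose(Tea drink)    { return "child picks tea"; }
        @Override String choose(Coffee drink) { return "child picks coffee"; }
        @Override String choose(Object drink) { return "child picks something"; }
    }

    static String demo() {
        Parent receiver = new Child(); // static type Parent, actual type Child
        Object drink = new Tea();      // static type Object, actual type Tea
        // Compile time: the overload comes from the static types of receiver and argument.
        // Run time: the override comes from the actual type of the receiver only,
        // so the actual type of `drink` (Tea) plays no part in the second call.
        return receiver.choose(new Coffee()) + "; " + receiver.choose(drink);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints child picks coffee; child picks something
    }
}
```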

Implementation of dynamic dispatch of the JVM

Dynamic dispatch is a very frequent operation, and selecting the method version requires searching the class's method metadata at run time for the right target method. To avoid these frequent searches, the most common "stability optimization" is to build a virtual method table (vtable) for the class in the method area, and to use virtual method table indexes instead of metadata lookups to improve performance.
The virtual method table holds the actual entry address of each method. If a subclass does not override a method, the entry in the subclass's virtual method table is the same as the parent class's entry for that method, and both point to the parent class's implementation. If the subclass overrides the method, the entry in the subclass's table is replaced with the entry address of the subclass's implementation.
To simplify the implementation, methods with the same signature should have the same index in the virtual method tables of the parent class and the subclass. When the type changes, only the table being searched needs to be switched, and the required entry address can then be found by the same index in the other virtual method table.
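
The table layout described above can be modeled in a few lines. This is a conceptual sketch only: a real virtual machine stores method entry addresses, whereas here a hypothetical Entry interface stands in for an address, and all names (VTableSketch, invokeVirtual, the index constants) are invented for the illustration.

```java
// Conceptual model only: not how any particular JVM lays out its tables.
class VTableSketch {
    // Stand-in for a method entry address.
    interface Entry { String invoke(); }

    // Index of each method signature, shared by the parent and child tables.
    static final int DESCRIBE = 0;
    static final int SPEAK = 1;

    static final Entry PARENT_DESCRIBE = () -> "Parent.describe";
    static final Entry PARENT_SPEAK = () -> "Parent.speak";
    static final Entry CHILD_SPEAK = () -> "Child.speak";

    // Parent table: every slot points to the parent's implementation.
    static final Entry[] PARENT_VTABLE = { PARENT_DESCRIBE, PARENT_SPEAK };

    // Child table: describe is inherited (same entry), speak is overridden.
    static final Entry[] CHILD_VTABLE = { PARENT_DESCRIBE, CHILD_SPEAK };

    // Dispatch is an array lookup by index in the receiver's table; no search by name.
    static String invokeVirtual(Entry[] vtable, int index) {
        return vtable[index].invoke();
    }

    public static void main(String[] args) {
        System.out.println(invokeVirtual(PARENT_VTABLE, SPEAK));   // prints Parent.speak
        System.out.println(invokeVirtual(CHILD_VTABLE, SPEAK));    // prints Child.speak
        System.out.println(invokeVirtual(CHILD_VTABLE, DESCRIBE)); // prints Parent.describe
    }
}
```

Because SPEAK has the same index in both tables, switching the receiver type only means switching which table is indexed.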

Stack-based byte-code interpretation execution engine


Before a Java program is executed, its source code goes through lexical analysis and syntax analysis and is converted into an abstract syntax tree. A particular language implementation can make lexical analysis, syntax analysis, and the subsequent optimizer and target code generator independent of the execution engine, forming a complete compiler; C and C++ are representative of this approach. It can instead implement some of these steps as a semi-independent compiler, which is what the Java language does. Or it can encapsulate all of these steps and the execution engine in one closed black box, as most JavaScript engines do.

Stack-based instruction set and register-based instruction set

The instruction stream output by the Java compiler is essentially a stack-based instruction set architecture. Most of its instructions are zero-address instructions, which rely on the operand stack to do their work.
The main advantage of a stack-based instruction set is portability. It has other advantages too, such as relatively more compact code (in bytecode each opcode occupies a single byte, whereas a multi-address instruction set also has to encode its operands) and a simpler compiler implementation.
The disadvantage is that execution is relatively slow, mainly because the same work generally takes more instructions than with a register-based instruction set, and in the conceptual model every push and pop is a memory access.
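
To make "zero-address instructions working on the operand stack" concrete, here is a toy interpreter. It is not JVM code: TinyStackMachine and its Op enum are hypothetical, and the opcode names merely mirror real bytecode mnemonics for readability. It evaluates 1 + 1 with the sequence iconst_1, iconst_1, iadd, istore_0.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy stack machine: every instruction takes its inputs from the operand stack.
class TinyStackMachine {
    enum Op { ICONST_1, ICONST_2, IADD, IMUL, ISTORE_0, ILOAD_0 }

    // Runs the program and returns the value left in local variable slot 0.
    static int run(Op... program) {
        Deque<Integer> operandStack = new ArrayDeque<>();
        int[] locals = new int[1];
        for (Op op : program) {
            switch (op) {
                case ICONST_1: operandStack.push(1); break;
                case ICONST_2: operandStack.push(2); break;
                case IADD: operandStack.push(operandStack.pop() + operandStack.pop()); break;
                case IMUL: operandStack.push(operandStack.pop() * operandStack.pop()); break;
                case ISTORE_0: locals[0] = operandStack.pop(); break;
                case ILOAD_0: operandStack.push(locals[0]); break;
            }
        }
        return locals[0];
    }

    public static void main(String[] args) {
        // 1 + 1
        System.out.println(run(Op.ICONST_1, Op.ICONST_1, Op.IADD, Op.ISTORE_0)); // prints 2
        // (1 + 2) * 2
        System.out.println(run(Op.ICONST_1, Op.ICONST_2, Op.IADD,
                               Op.ICONST_2, Op.IMUL, Op.ISTORE_0));              // prints 6
    }
}
```

None of the instructions names a source or destination operand, which is what keeps the encoding compact; the price is that even a simple addition takes four instructions.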
