Java Virtual Machine (JVM): compilation, download, interpretation, execution, and specification description

Source: Internet
Author: User

The Java Virtual Machine (JVM) is an abstract computer that runs Java code. Once an interpreter has been ported to a particular computer in accordance with the JVM specification, any compiled Java code can run on that system. This article first briefly describes the path from compiling a Java source file to executing it, and then describes the JVM specification.

1. Compilation, download, interpretation, and execution of Java source files

The Java application development cycle consists of compilation, download, interpretation, and execution.

The Java compiler translates a Java source program into bytecode, the executable code of the JVM. This compilation process differs from that of C/C++. When a C compiler produces object code, that code is generated to run on a specific hardware platform, so during compilation the compiler uses a lookup table to convert every symbol reference into a specific memory offset. The Java compiler does not compile references to variables and methods into numeric references, nor does it fix the memory layout used during execution. Instead, it keeps these symbolic references in the bytecode; the interpreter creates the memory layout at run time and then looks up the address of each method in a table. This is what makes Java portable and helps keep it secure.
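
The idea of referring to a method by name and binding it only at run time can be loosely illustrated from ordinary Java code with the standard java.lang.invoke API. This is only an analogy for what the interpreter does internally, not the JVM's actual resolution mechanism; the class SymbolicLookupSketch and its methods are hypothetical names chosen for this sketch.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class SymbolicLookupSketch {
    // Target method; the caller below refers to it only by name and signature.
    static int twice(int x) {
        return 2 * x;
    }

    // Resolve a static method "(int) -> int" by its name at run time, then call it.
    static int resolveAndCall(String methodName, int argument) {
        try {
            MethodType signature = MethodType.methodType(int.class, int.class);
            MethodHandle handle = MethodHandles.lookup()
                    .findStatic(SymbolicLookupSketch.class, methodName, signature);
            return (int) handle.invokeExact(argument);
        } catch (Throwable failure) {
            // A name that cannot be resolved only fails here, at run time.
            throw new IllegalStateException("cannot resolve " + methodName, failure);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveAndCall("twice", 21)); // prints 42
    }
}
```

Nothing in resolveAndCall depends on where twice ends up in memory; the name and signature are enough, which is the property the paragraph above attributes to bytecode.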


The interpreter runs the JVM bytecode. Execution is divided into three parts: code loading, code verification, and code execution. Code loading is performed by the class loader, which loads all the code a program needs to run, including the classes that the program's classes inherit from and the classes they call. When the class loader loads a class, it places the class in its own namespace. Classes cannot affect one another except through symbolic references into another namespace. All classes from the local machine share a single namespace, while each class imported from outside gets its own namespace. Local classes therefore run efficiently by sharing one namespace, and they are still kept apart from imported classes.

After all the classes the program needs have been loaded, the interpreter can determine the memory layout of the whole executable program. It builds a lookup table that maps symbolic references to specific addresses. Because the memory layout is fixed only at this stage, Java avoids the problem of subclasses breaking when a superclass changes, and it also prevents code from accessing addresses it is not authorized to use.

The loaded code is then checked by the bytecode verifier. The verifier can detect many kinds of errors, such as operand stack overflow and illegal data type conversions. Once verification succeeds, the code is executed.
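
A minimal sketch of how a program can ask which class loader loaded a given class, using the standard Class.getClassLoader() method. The class name LoaderSketch is hypothetical. The sketch assumes a JVM that, like common JDK builds, reports the built-in bootstrap loader as null.

```java
final class LoaderSketch {
    // Name the loader of a class; null is how many JVMs report the bootstrap loader.
    static String describeLoader(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        // A core library class and an application class usually have different loaders.
        System.out.println("String       -> " + describeLoader(String.class));
        System.out.println("LoaderSketch -> " + describeLoader(LoaderSketch.class));
    }
}
```

At run time a class is identified by its name together with its loader, so two loaders can each hold a class with the same name without the two interfering.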

Java bytecode can be executed in two ways:
1. Just-in-time (JIT) compilation: the interpreter first compiles the bytecode into machine code and then executes that machine code.
2. Interpreted execution: the interpreter interprets and executes a small piece of code at a time until all the operations of the bytecode program are complete.

The second method is the one normally used. Because the JVM specification is flexible enough, translating bytecode into machine code can also be done efficiently. For applications that need high running speed, the interpreter can compile the Java bytecode into machine code just in time, which preserves the portability of Java code while delivering high performance.
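
Whether a particular JVM has a JIT compiler available can be queried through the standard java.lang.management API. The sketch below is a small illustration under the assumption that the java.management module is present; the class name ExecutionModeSketch is hypothetical, and the compiler name that is printed depends on the JVM in use.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

final class ExecutionModeSketch {
    // Report the JIT compiler of the running JVM, if it has one.
    static String describe() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            // The API returns null when the JVM has no compilation system.
            return "no JIT compiler reported";
        }
        return "JIT compiler: " + jit.getName();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

On HotSpot-based JVMs the -Xint command-line option requests interpreted-only execution, which is a convenient way to compare the two methods on the same program.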

2. JVM Specification Description
The JVM is designed as a computer model built on an abstract specification. This gives interpreter developers a great deal of flexibility while ensuring that Java code can run on any system that complies with the specification. The JVM specification precisely defines certain aspects of an implementation, in particular the format of Java executable code, that is, the bytecode. This includes the syntax and values of opcodes and operands, the numeric representation of identifiers, the layout of Java objects in a Java class file, and the memory image of the constant pool in the JVM. These definitions give JVM interpreter developers the information and development environment they need. The Java designers wanted to leave developers free to use Java as they see fit.
The JVM defines five components that govern how Java code is interpreted and executed:
JVM instruction set
JVM registers
JVM stack structure
JVM garbage-collected heap
JVM storage areas
2.1 JVM instruction set
The JVM instruction set is very similar to that of other computers. A Java instruction consists of an opcode followed by operands. The opcode is an 8-bit binary number; the operands follow it, and their length varies with the instruction. The opcode specifies the nature of the operation (opcodes are given here as assembly mnemonics). For example, iload loads an integer from a local variable onto the operand stack, anewarray allocates space for a new array of object references, iand computes the bitwise AND of two integers, and ret is a flow-control instruction that returns from a subroutine (returning from a method call is done by instructions such as ireturn and return). An operand longer than 8 bits is split across two or more bytes. The JVM handles this with big-endian encoding, in which the high-order bits are stored in the byte at the lower address. This matches the encoding used by Motorola CPUs and differs from the little-endian encoding used by Intel, in which the low-order bits are stored in the byte at the lower address.
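The big-endian rule can be sketched by decoding the two-byte operand of sipush, an instruction that pushes a 16-bit signed constant (its opcode is 0x11). The class BigEndianOperandSketch and the method readSignedShort are hypothetical names for this illustration.

```java
final class BigEndianOperandSketch {
    static final int SIPUSH = 0x11; // opcode of sipush

    // Combine two operand bytes, high-order byte first, into a signed 16-bit value.
    static int readSignedShort(byte[] code, int offset) {
        int high = code[offset];           // kept signed so negative values sign-extend
        int low = code[offset + 1] & 0xFF; // low-order byte treated as unsigned
        return (high << 8) | low;
    }

    public static void main(String[] args) {
        byte[] pushes256 = {(byte) SIPUSH, 0x01, 0x00};
        byte[] pushesMinus2 = {(byte) SIPUSH, (byte) 0xFF, (byte) 0xFE};
        System.out.println(readSignedShort(pushes256, 1));    // prints 256
        System.out.println(readSignedShort(pushesMinus2, 1)); // prints -2
    }
}
```

Reading the same two bytes 0x01 0x00 in little-endian order would give 1 instead of 256, which is why the byte order has to be fixed by the specification.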
The Java instruction set is designed specifically for implementing the Java language. It includes instructions for invoking methods and for the monitors used in multithreaded programs. Because the opcode is 8 bits long, the JVM can have at most 256 instructions. More than 160 opcodes were in use at the time of writing.

2.2 JVM registers
Every CPU contains a set of registers that hold the system state and the information the processor needs. If a virtual machine defines more registers, more information can be taken from them without accessing the stack or memory, which improves running speed. However, if the virtual machine has more registers than the actual CPU, the processor has to spend a lot of time emulating the extra registers in ordinary memory, which lowers the efficiency of the virtual machine. For this reason the JVM defines only the four most commonly used registers:
pc: program counter
optop: pointer to the top of the operand stack
frame: pointer to the current execution environment
vars: pointer to the first local variable in the current execution environment
All registers are 32 bits wide. The pc register tracks the current position in the program. The optop, frame, and vars registers hold pointers into the Java stack.

2.3 JVM stack structure
The JVM is a stack-based computer, and the Java stack is its main means of storing information. When the JVM receives a Java bytecode application, it creates a stack frame for each invocation of a method of a class in that code to hold the state of the method. Each stack frame contains three kinds of information:
Local variables
Execution environment
Operand stack
The local variables area stores the local variables used by a method. The vars register points to the first local variable in the variable table.
The execution environment holds the information the interpreter needs to interpret the Java bytecode: the previously invoked method (the caller), the local variable pointer, and the pointers to the top and bottom of the operand stack. The execution environment is the control center for executing a method. For example, to execute iadd (integer addition), the interpreter first finds the current execution environment through the frame register, then finds the operand stack through the execution environment, pops two integers off the top of the stack, adds them, and pushes the result back onto the top of the stack.
The operand stack stores the operands an operation needs and the results it produces.
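The interplay of the pc, optop, and vars registers with the local variables and the operand stack can be sketched with a toy interpreter for a single method. This is a simplified illustration, not how any real JVM is implemented: it handles one frame only (so the frame register has nothing to do), supports five instructions, and skips verification. The opcode values are the real JVM ones; the class StackMachineSketch and the method run are hypothetical names.

```java
final class StackMachineSketch {
    // Real JVM opcode values for the few instructions supported here.
    static final int BIPUSH = 0x10, ILOAD = 0x15, ISTORE = 0x36, IADD = 0x60, IRETURN = 0xAC;

    // Interpret one method body. Locals occupy the start of the stack array,
    // and the operand stack grows upward right after them.
    static int run(byte[] code, int maxLocals, int... args) {
        int[] stack = new int[maxLocals + 16];
        int vars = 0;          // index of the first local variable
        int optop = maxLocals; // index of the next free operand stack slot
        int pc = 0;            // index of the next byte of code to read
        System.arraycopy(args, 0, stack, vars, args.length);
        while (true) {
            int opcode = code[pc++] & 0xFF;
            switch (opcode) {
                case BIPUSH: // push the signed byte that follows the opcode
                    stack[optop++] = code[pc++];
                    break;
                case ILOAD: // push the local variable whose index follows
                    stack[optop++] = stack[vars + (code[pc++] & 0xFF)];
                    break;
                case ISTORE: // pop into the local variable whose index follows
                    stack[vars + (code[pc++] & 0xFF)] = stack[--optop];
                    break;
                case IADD: { // pop two ints, push their sum
                    int right = stack[--optop];
                    int left = stack[--optop];
                    stack[optop++] = left + right;
                    break;
                }
                case IRETURN: // pop the result and leave the method
                    return stack[--optop];
                default:
                    throw new IllegalStateException("unsupported opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // iload 0; iload 1; iadd; bipush 10; iadd; ireturn
        byte[] code = {ILOAD, 0, ILOAD, 1, IADD, BIPUSH, 10, IADD, (byte) IRETURN};
        System.out.println(run(code, 2, 3, 4)); // prints 17
    }
}
```

The example program computes (local 0 + local 1) + 10, so with arguments 3 and 4 the two additions leave 17 on top of the operand stack.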

2.4 JVM garbage-collected heap
The storage for instances of Java classes is allocated from the heap. The interpreter is responsible for allocating space for class instances. After allocating storage for an instance, the interpreter starts tracking the use of the memory that the instance occupies. Once the object is no longer in use, its memory is returned to the heap.
In the Java language, the new statement is the only way to request memory for an object, and there is no statement for releasing it. Releasing and reclaiming memory is the job of the Java runtime system, which leaves the designers of the runtime system free to choose their own garbage collection method. In the Java interpreter and the HotJava environment developed by Sun, garbage collection runs in a background thread. This not only gives the running system good performance but also relieves programmers of the risks of managing memory themselves.
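A small sketch of this division of labor, using the standard WeakReference class to observe an object without keeping it alive. The class HeapSketch is a hypothetical name. Note that System.gc() is only a request, so the sketch makes no promise about when an unreferenced object is actually reclaimed.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

final class HeapSketch {
    // An object that is still strongly referenced is never reclaimed.
    static boolean reachableWhileReferenced() {
        byte[] buffer = new byte[1024]; // allocated on the heap by 'new'
        WeakReference<byte[]> observer = new WeakReference<>(buffer);
        System.gc();                    // a request; the runtime decides what to do
        boolean alive = observer.get() != null;
        Reference.reachabilityFence(buffer); // keep 'buffer' reachable up to this point
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + reachableWhileReferenced());

        // No strong reference is kept here, so the runtime is free to reclaim the array.
        WeakReference<byte[]> observer = new WeakReference<>(new byte[1024]);
        System.gc();
        System.out.println("reclaimed after request: " + (observer.get() == null));
    }
}
```

The program never frees anything itself; it only stops referring to the array, and the runtime system chooses if and when to reclaim it.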

2.5 JVM storage areas
The JVM has two kinds of storage area: the constant pool and the method area. The constant pool stores class names, method names, field names, and string constants. The method area stores the bytecode of Java methods. The JVM specification does not prescribe how these two storage areas are implemented. The storage layout of a Java application therefore has to be determined at run time and depends on the implementation on the specific platform.
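While the in-memory layout is left to the implementation, the class file that feeds these areas has a fixed format: a 4-byte magic number, two 2-byte version fields, and then a 2-byte count that introduces the constant pool. The sketch below reads those fields from the class file of java.lang.Object; it assumes the running JDK exposes its own class files as resources, and the class ClassFileHeaderSketch is a hypothetical name.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ClassFileHeaderSketch {
    // Returns {magic, minor_version, major_version, constant_pool_count}.
    static int[] readHeader(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IOException("class file not found: " + resource);
            }
            // DataInputStream reads multi-byte values in big-endian order.
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();
            int constantPoolCount = in.readUnsignedShort();
            return new int[] {magic, minor, major, constantPoolCount};
        } catch (IOException failure) {
            throw new UncheckedIOException(failure);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader(Object.class);
        System.out.printf("magic=0x%08X major=%d constant_pool_count=%d%n",
                header[0], header[2], header[3]);
    }
}
```

The constant pool entries that follow the count hold the class names, member names, and string constants mentioned above.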
The JVM is a platform-independent specification defined for Java bytecode, and it is the foundation of Java's platform independence. The JVM still has some limitations and shortcomings that need further improvement, but the idea behind the JVM is a successful one regardless.

Comparative analysis: if we think of a Java source program as the counterpart of a C source program, then the bytecode produced by compiling the Java source program corresponds to the 80x86 machine code (binary program file) produced by compiling the C source program. The JVM corresponds to the 80x86 computer system, and the Java interpreter corresponds to the 80x86 CPU. Machine code runs on the 80x86 CPU, and Java bytecode runs on the Java interpreter.
The Java interpreter acts as the "CPU" that runs Java bytecode, except that this "CPU" is implemented in software rather than hardware. The Java interpreter is in fact an application for a specific platform. Once an interpreter program has been implemented for a platform, Java bytecode can run on that platform through the interpreter, and this is the basis of Java's cross-platform capability. Not every platform has a corresponding Java interpreter, which is why Java cannot run on every platform: it runs only on platforms for which a Java interpreter has been implemented.
