Java Virtual Machine Study Notes: Overview of the Java Virtual Machine's Internal Architecture (Chapter 5)

Source: Internet
Author: User

Note: In this document, "type" means a class or an interface.

5.1. What is a Java Virtual Machine?

The term "Java Virtual Machine" can refer to any of three things:
1. The abstract Java Virtual Machine specification
2. A concrete implementation of that specification
3. A running Java Virtual Machine instance

5.2. Lifecycle of a Java Virtual Machine

A running Java Virtual Machine instance has one clear task: to execute a single Java program. An instance is created when the program starts and exits when the program ends. If you run three Java programs on the same machine, there will be three Java Virtual Machine instances, each running its own program.
A Java Virtual Machine instance starts its program by invoking the main() method of an initial class. This method must be public and static, return void, and accept a single String array as its parameter. When you start a program, you must tell the Java Virtual Machine the name of the class that contains this main() method.
The main() method is the starting point of the program's initial thread. Every other thread in the program is started, directly or indirectly, by this initial thread. Java has two kinds of threads: daemon and non-daemon. Daemon threads are usually threads that the virtual machine uses for itself; the thread that performs garbage collection is one example. An application can also mark any thread it creates as a daemon thread. The initial thread that runs main() is a non-daemon thread.
The Java Virtual Machine keeps running as long as any non-daemon thread is still alive, and it exits once all non-daemon threads have terminated. If the security policy permits it, the program can also terminate the virtual machine itself by calling the exit() method of Runtime or System.
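The daemon and non-daemon distinction described above can be tried out with a small program. This is only a minimal sketch: the class name DaemonDemo, the helper newDaemonWorker, and the thread name are made up for illustration, while Thread.setDaemon, Thread.isDaemon, and System.exit are standard library calls.

```java
class DaemonDemo {
    // Creates (but does not start) a thread marked as a daemon.
    static Thread newDaemonWorker(Runnable task) {
        Thread worker = new Thread(task, "background-worker");
        worker.setDaemon(true); // must be set before start()
        return worker;
    }

    public static void main(String[] args) throws InterruptedException {
        // The initial thread that runs main() is a non-daemon thread.
        System.out.println("main is daemon: " + Thread.currentThread().isDaemon());

        Thread worker = newDaemonWorker(() -> System.out.println("daemon worker running"));
        worker.start();
        worker.join(); // wait so the worker's output appears before main returns

        // When main() returns, no non-daemon thread is left and the VM exits.
        // Calling System.exit(0) here would instead end the program immediately.
    }
}
```

A daemon thread that is still running does not keep the virtual machine alive, which is why the sketch joins the worker before main() returns.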

5.3. Architecture of the Java Virtual Machine

The Java Virtual Machine specification describes the behavior of a virtual machine in terms of subsystems, memory areas, data types, and instructions. These components make up an abstract internal architecture. They serve less to dictate the inner structure of an implementation than to define strictly the external behavior that every implementation must exhibit.
Every Java Virtual Machine has a class loader subsystem, a mechanism that loads types (classes and interfaces) given their fully qualified names. Every Java Virtual Machine also has an execution engine, which executes the instructions contained in the methods of loaded classes.

A running program needs memory for many things: bytecodes, other information extracted from loaded class files, the objects the program creates, method parameters, return values, local variables, and intermediate results of computations. The Java Virtual Machine organizes this memory into several runtime data areas.
Some runtime data areas are shared by all threads of the program; others belong to a single thread. Each Java Virtual Machine instance has one method area and one heap, both shared by all threads. When the virtual machine loads a class file, it parses the type information from the binary data and places it in the method area. All objects that the program creates while it runs are placed on the heap.
Each new thread gets its own PC register (program counter) and its own Java stack. While a thread executes a Java method (not a native method), its PC register indicates the next instruction to execute. The thread's Java stack stores the state of its Java method invocations: local variables, the parameters with which each method was invoked, return values, and intermediate results. The state of native method invocations is stored in an implementation-dependent way in native method stacks, and possibly in registers or other implementation-dependent memory areas.
A Java stack is made up of stack frames (or simply frames). A stack frame holds the state of one Java method invocation. When a thread invokes a method, the virtual machine pushes a new frame onto that thread's Java stack. When the method completes, the virtual machine pops and discards the frame.
The Java Virtual Machine has no registers for holding intermediate values; its instructions use the Java stack instead. This keeps the instruction set compact and makes the virtual machine easier to implement on architectures with few or irregular general-purpose registers.
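The pushing of one frame per method invocation can be observed indirectly from Java code. The following is a rough sketch, not a view into the real frame layout: StackDepthDemo and depthAfter are illustrative names, and the stack trace returned by Thread.getStackTrace is only the virtual machine's report of the frames on the stack.

```java
class StackDepthDemo {
    // Recurses extraCalls times, then reports how many frames the VM
    // sees on the current thread's Java stack.
    static int depthAfter(int extraCalls) {
        if (extraCalls == 0) {
            return Thread.currentThread().getStackTrace().length;
        }
        return depthAfter(extraCalls - 1); // each call pushes one more frame
    }

    public static void main(String[] args) {
        int shallow = depthAfter(0);
        int deep = depthAfter(5);
        System.out.println("frames added by 5 nested calls: " + (deep - shallow));
    }
}
```

When each recursive call returns, its frame is popped again, so the depth measured by a later call from the same place is unchanged.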

[Figure: runtime data areas that are exclusive to each thread] In the figure, the Java stacks are drawn growing downward. The PC register of thread 3 is grayed out because thread 3 is executing a native method, so the value of its PC register is undefined.

5.3.1. Data Types

The Java Virtual Machine computes by operating on values of specific data types. Both the data types and the operations on them are strictly defined by the Java Virtual Machine specification. The data types fall into two groups: primitive types and reference types. A variable of a reference type holds a reference to an object, not the object itself. A variable of a primitive type holds the value itself and refers to nothing else.
All primitive types of the Java programming language are also primitive types of the Java Virtual Machine, but boolean is a special case. When a compiler translates Java source code into bytecodes, it uses int or byte to represent boolean values. In the Java Virtual Machine, false is represented by integer zero and true by any non-zero integer. Arrays of boolean are accessed as arrays of byte, although on the heap they may be stored either as byte arrays or as bit fields.
Apart from boolean, the primitive types of the Java language map directly onto Java Virtual Machine data types. They divide into the integral types byte, short, int, long, and char, and the floating-point types float and double. As in the Java language, each type has the same range of values on every host.
The Java Virtual Machine has one further primitive type that is not available to Java programmers: returnAddress. This type is used to implement finally clauses of Java programs.
Reference types come in three kinds: class types, interface types, and array types. Values of all three are references to dynamically created objects. A reference may also hold the special value null, which means that it refers to no object.
The Java Virtual Machine specification defines the range of values of each data type but not the amount of storage each type occupies. That decision is left to the implementer.
Value ranges of the Java Virtual Machine data types:
byte            8-bit signed integer (-2^7 to 2^7 - 1, inclusive)
short           16-bit signed integer (-2^15 to 2^15 - 1, inclusive)
int             32-bit signed integer (-2^31 to 2^31 - 1, inclusive)
long            64-bit signed integer (-2^63 to 2^63 - 1, inclusive)
char            16-bit unsigned Unicode character (0 to 2^16 - 1, inclusive)
float           32-bit IEEE 754 single-precision floating-point number
double          64-bit IEEE 754 double-precision floating-point number
returnAddress   address of an opcode within the same method
reference       reference to an object on the heap, or null
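The integral ranges listed above can be checked against the constants that the standard wrapper classes expose. This is a small sketch; the class TypeRanges and its range helper are names chosen for this example only.

```java
class TypeRanges {
    // Formats an inclusive range; char values widen to long automatically.
    static String range(long min, long max) {
        return min + " to " + max;
    }

    public static void main(String[] args) {
        System.out.println("byte:  " + range(Byte.MIN_VALUE, Byte.MAX_VALUE));
        System.out.println("short: " + range(Short.MIN_VALUE, Short.MAX_VALUE));
        System.out.println("int:   " + range(Integer.MIN_VALUE, Integer.MAX_VALUE));
        System.out.println("long:  " + range(Long.MIN_VALUE, Long.MAX_VALUE));
        System.out.println("char:  " + range(Character.MIN_VALUE, Character.MAX_VALUE));
    }
}
```

Because the ranges are fixed by the specification, the output is the same on every host.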

5.3.2. Word Size

The basic unit of size for data values in the Java Virtual Machine is the word, whose size is chosen by the implementer of each virtual machine. A word must be large enough to hold a value of type byte, short, int, char, float, returnAddress, or reference, and two words must be large enough to hold a long or double. Implementers must therefore choose a word size of at least 32 bits, and will usually pick the word size that is most efficient on the host platform.
A running Java program cannot determine the word size of the virtual machine it runs on. The word size does not affect program behavior; it is purely an internal attribute of the implementation.

5.3.3. Class Loader Subsystem

The Java Virtual Machine has two kinds of class loaders: a bootstrap class loader and user-defined class loaders. The bootstrap class loader is part of the virtual machine implementation; user-defined class loaders are part of the running Java application.
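A program can ask which class loader defined a given type. The sketch below assumes a common convention in which Class.getClassLoader() returns null for types loaded by the bootstrap class loader; implementations are allowed to behave differently. LoaderDemo and describeLoader are illustrative names.

```java
class LoaderDemo {
    // Describes the class loader that defined the given type.
    static String describeLoader(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        // Many implementations represent the bootstrap loader as null.
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("String     -> " + describeLoader(String.class));
        System.out.println("LoaderDemo -> " + describeLoader(LoaderDemo.class));
    }
}
```

Core library classes such as String are typically reported as "bootstrap", while application classes are reported with the name of a loader object that is itself an ordinary Java object.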
1. Loading, Linking, and Initialization

The class loader subsystem does more than locate and load class files. It must carry out the following steps in strict order (for details, see Chapter 7, "Class Lifecycle"):

1) Loading: finding and importing the binary data for a type (class or interface)
2) Linking: verification, preparation, and resolution
① Verification: ensuring the correctness of the imported type
② Preparation: allocating memory for class variables and initializing them to default values
③ Resolution: transforming symbolic references into direct references
3) Initialization: invoking the Java code that sets class variables to their proper starting values
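The separation between loading and initialization in the steps above can be made visible from Java code. This is a minimal sketch under stated assumptions: InitDemo, Lazy, and LOG are hypothetical names, and the three-argument Class.forName is used with initialize set to false so that the type is loaded without being initialized.

```java
class InitDemo {
    static final StringBuilder LOG = new StringBuilder();

    static class Lazy {
        static int value = 10;
        static { LOG.append("initialized;"); } // runs during initialization
    }

    static String run() {
        try {
            // Load Lazy by name without initializing it.
            Class.forName(InitDemo.class.getName() + "$Lazy", false,
                          InitDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        LOG.append("loaded;");
        int v = Lazy.value; // first active use triggers initialization
        LOG.append("used value ").append(v).append(';');
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call, the log records "loaded;" before "initialized;", which shows that the static initializer did not run when the type was loaded, only when the class variable was first read.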

5.3.4. Method Area

Inside a Java Virtual Machine instance, information about loaded types is stored in the method area. How this information is organized in memory is up to the implementer. All threads of the program share the same method area, so access to its data structures must be thread-safe. The method area need not have a fixed size: the virtual machine can expand and contract it as the application requires. The virtual machine may likewise unload a class that is no longer referenced in order to keep the method area small. An implementation may also let users specify the initial, minimum, and maximum sizes of the method area.

For each loaded type, the virtual machine stores the following information in the method area:
1) The fully qualified name of the type
2) The fully qualified name of the type's direct superclass (unless the type is an interface or java.lang.Object, neither of which has a superclass)
3) Whether the type is a class or an interface
4) The type's modifiers (some subset of public, abstract, and final)
5) An ordered list of the fully qualified names of all direct superinterfaces
6) The constant pool for the type
7) Field information: for each field, its name, its type, and its modifiers (some subset of public, private, protected, static, final, volatile, and transient)
8) Method information: for each method, its name, its return type (or void), the number and types (in order) of its parameters, and its modifiers (some subset of public, private, protected, static, final, synchronized, native, and abstract). For each method that is neither abstract nor native, the method area also holds the method's bytecodes, the sizes of the operand stack and the local variable area of the method's stack frame, and its exception table.
9) All class (static) variables declared in the type, except constants
10) A reference to the ClassLoader instance that loaded the type
11) A reference to the Class instance that represents the type

1) The constant pool for the type
The virtual machine must maintain a constant pool for each loaded type. A constant pool is an ordered set of the constants used by the type, including literals (string, integer, and floating-point constants) and symbolic references to types, fields, and methods. Entries in the pool are referenced by index, much like the elements of an array. Because the constant pool holds the symbolic references to all types, fields, and methods used by a type, it plays a central role in the dynamic linking of Java programs.
2) Class (static) variables
Class variables are shared by all instances of a class and can be accessed even when no instance exists. They are associated with the class rather than with any instance, so they are logically part of the class data. Before the virtual machine uses a class, it must allocate memory for every non-final class variable declared in that class.
Constants (class variables declared final) are treated differently from non-final class variables. Every type that uses a constant gets its own copy of the constant value in its constant pool. Because the constant pool lives in the method area, constants are stored in the method area just as non-final class variables are. The difference is where they belong: non-final class variables are stored as part of the data of the type that declares them, whereas constants are stored as part of the data of every type that uses them.
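One observable consequence of this difference is that reading a constant does not touch the declaring class at all, while reading an ordinary class variable does. The sketch below assumes the constant is a static final field initialized with a compile-time constant expression; ConstantDemo, Config, MAX, and counter are names invented for the example.

```java
class ConstantDemo {
    static final StringBuilder LOG = new StringBuilder();

    static class Config {
        static final int MAX = 42; // compile-time constant, copied into users
        static int counter = 7;    // ordinary class variable, owned by Config
        static { LOG.append("Config initialized;"); }
    }

    static String run() {
        int max = Config.MAX;         // uses the copy; Config stays untouched
        LOG.append("read MAX;");
        int counter = Config.counter; // first real use initializes Config
        LOG.append("read counter;");
        return LOG.toString() + " (" + max + ", " + counter + ")";
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call, "read MAX;" is logged before "Config initialized;", because the value 42 was already part of the data of the using class.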
3) Method tables
Besides the raw type information listed above, an implementation may keep additional data structures in the method area that speed up access to that information. One example is a method table, stored as part of the class data. A method table is an array of direct references to all the instance methods that can be invoked on instances of the class, including methods inherited from superclasses.

5.3.5. Heap

Whenever a running Java program creates a class instance or an array, the memory for the new object is allocated on the heap. A Java Virtual Machine instance has exactly one heap, which all threads share.
1. Garbage Collection
Garbage collection is the primary mechanism for reclaiming the memory of objects that are no longer referenced. A garbage collector may also move objects to reduce heap fragmentation.
2. Object Representation
The Java Virtual Machine specification does not define how objects are represented on the heap. The primary data stored for each object are the instance variables declared in the object's class and in all of its superclasses. Given an object reference, the virtual machine must be able to locate the object's instance data quickly. It must also be able to reach the object's class data (stored in the method area) from the same reference.

One possible heap design divides the heap into two parts: a handle pool and an object pool. An object reference is a native pointer to an entry in the handle pool. Each handle pool entry has two components: a pointer to the object's instance data in the object pool and a pointer to the object's class data in the method area. The advantage of this design is that it makes heap defragmentation easy: when the virtual machine moves an object in the object pool, it only has to update the one pointer in the handle pool entry. The disadvantage is that every access to an object's instance data requires dereferencing two pointers. [Figure: a heap split into a handle pool and an object pool]

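Another design makes the object reference a native pointer to a block of data that contains the object's instance data together with a pointer to its class data in the method area. This design needs only one pointer dereference to reach instance data, but it makes moving objects more complicated, because every reference to a moved object must be updated. [Figure: a heap in which object references point directly to object data]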
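The trade-off of the first design, an extra level of indirection in exchange for cheap object movement, can be modeled with two arrays. This is purely a toy model and not how any particular virtual machine lays out its heap: HandleHeapSketch, its fixed array sizes, and its one-int "objects" are all assumptions made for illustration.

```java
class HandleHeapSketch {
    private final int[] objectPool = new int[16]; // raw storage for instance data
    private final int[] handleTable = new int[4]; // handle -> offset in objectPool
    private int nextHandle = 0;
    private int nextFree = 0;

    // Allocates a one-int "object" and returns its handle.
    int allocate(int value) {
        int handle = nextHandle++;
        handleTable[handle] = nextFree;
        objectPool[nextFree++] = value;
        return handle;
    }

    // Two lookups per access: handle -> offset, then offset -> data.
    int read(int handle) {
        return objectPool[handleTable[handle]];
    }

    // Moves the object; only the handle table entry changes, so every
    // handle already held by the program stays valid.
    void moveTo(int handle, int newOffset) {
        int oldOffset = handleTable[handle];
        objectPool[newOffset] = objectPool[oldOffset];
        objectPool[oldOffset] = 0;
        handleTable[handle] = newOffset;
    }

    public static void main(String[] args) {
        HandleHeapSketch heap = new HandleHeapSketch();
        int h = heap.allocate(99);
        heap.moveTo(h, 10);
        System.out.println("value after move: " + heap.read(h));
    }
}
```

In the direct-pointer design there would be no handle table, so read would need one lookup, but moveTo would have to find and patch every stored reference.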


Note: The virtual machine must be able to get from an object reference to the object's type data for several reasons: 1) When a program casts an object reference to another type, the virtual machine must check whether the target type is the object's class or one of its supertypes. 2) When a program evaluates an instanceof expression, the virtual machine performs a similar check. 3) When a program invokes an instance method, the virtual machine must perform dynamic binding: it chooses the method to invoke based on the actual class of the object, not on the type of the reference.
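The three situations in the note correspond to ordinary Java constructs. The following sketch uses made-up classes (DispatchDemo, Shape, Circle) to show each of them; in every case the result depends on the actual class of the object, which the virtual machine finds through the reference.

```java
class DispatchDemo {
    static class Shape {
        String name() { return "shape"; }
    }

    static class Circle extends Shape {
        @Override
        String name() { return "circle"; }
    }

    // Dynamic binding and instanceof both consult the object's actual class.
    static String describe(Shape s) {
        String kind = s.name();              // chosen by the actual class of s
        boolean isCircle = s instanceof Circle;
        return kind + " (instanceof Circle: " + isCircle + ")";
    }

    // A cast is checked at run time against the object's actual class.
    static boolean castToCircleFails(Shape s) {
        try {
            Circle c = (Circle) s;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new Circle()));
        System.out.println(describe(new Shape()));
        System.out.println("cast of plain Shape fails: " + castToCircleFails(new Shape()));
    }
}
```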

Whichever object representation an implementer chooses, a method table is likely to be close at hand for each object. Method tables speed up the invocation of instance methods and can therefore play an important role in the overall performance of a virtual machine, but the specification does not require them. [Figure: a method table associated with an object reference]

The figure shows one way of connecting a method table to an object reference. Each object's data contains a pointer to a special structure in the method area. This structure has two components: a pointer to the full class data for the object's type, and the method table for the object. The method table is an array of pointers to the data of each instance method that can be invoked on the object. The method data reached through each entry includes the sizes of the operand stack and the local variable area of the method's stack frame, the method's bytecodes, and its exception table.
One more kind of data is logically part of an object's data on the heap: the object's lock. Every object in a Java Virtual Machine is associated with a lock. An implementation may allocate the lock data only when a thread requests the lock for the first time, in which case it needs some indirect way of associating the lock data with the object data. Besides the lock, every Java object is logically associated with the data that implements a wait set. Locks help threads work independently on shared data without interfering with one another, whereas wait sets help threads cooperate toward a common goal. Finally, an object may also be associated with data needed by the garbage collector.

3. Array Representation
In Java, arrays are full-fledged objects. Like other objects, they are stored on the heap, and each array has a reference to the Class instance for its class. All arrays with the same dimension and element type share the same class, regardless of their lengths. The name of an array class encodes its dimension and element type. For example, the class name of an array of int is "[I", the class name of a three-dimensional array of byte is "[[[B", and the class name of a two-dimensional array of Object is "[[Ljava.lang.Object;".

Multidimensional arrays are represented as arrays of arrays. [Figure: heap representation of one-dimensional and multidimensional arrays]

The data that must be kept on the heap for each array are the array's length, the array data, and a reference to the array's class data. Given a reference to an array, the virtual machine must be able to determine the array's length, get and set elements by index, and invoke the methods declared in class Object, which is the direct superclass of all array classes.
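The array class names and the relationship to Object described above can be confirmed with reflection. This is a short sketch; ArrayClassDemo and nameOf are illustrative names, while getClass, getName, and getSuperclass are standard methods.

```java
class ArrayClassDemo {
    // Returns the class name the VM uses for the given array object.
    static String nameOf(Object array) {
        return array.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(nameOf(new int[3]));        // one-dimensional int array
        System.out.println(nameOf(new byte[1][1][1])); // three-dimensional byte array
        System.out.println(nameOf(new Object[2][2]));  // two-dimensional Object array

        // Arrays of the same dimension and element type share one class,
        // whatever their length.
        System.out.println(new int[3].getClass() == new int[300].getClass());

        // Object is the direct superclass of every array class.
        System.out.println(int[].class.getSuperclass() == Object.class);
    }
}
```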

5.3.6. The Program Counter (PC Register)

Each thread of a running program gets its own program counter (PC register), created when the thread starts. The PC register is one word in size, so it can hold either a native pointer or a returnAddress. While a thread executes a Java method, its PC register contains the address of the instruction currently being executed. That address can be a native pointer or an offset from the beginning of the method's bytecodes. While a thread executes a native method, the value of its PC register is undefined.

5.3.7. The Java Stack

When a new thread is launched, the Java Virtual Machine creates a new Java stack for it. A Java stack stores a thread's state in discrete frames. The virtual machine performs only two operations directly on a Java stack: it pushes frames and pops frames.
The method a thread is currently executing is the thread's current method, and the stack frame for the current method is the current frame. The class in which the current method is defined is the current class, and the current class's constant pool is the current constant pool. As a thread executes a method, the virtual machine keeps track of the current class and the current constant pool.

5.3.8. The Stack Frame

A stack frame has three parts: local variables, an operand stack, and frame data. The sizes of the local variable area and the operand stack are measured in words. They are determined at compile time and recorded in the class file for each method. The size of the frame data is implementation dependent. When a program invokes a method, the virtual machine reads the sizes of the local variable area and the operand stack from the class data, creates a stack frame of the appropriate size, and pushes it onto the Java stack.
1. Local Variables
The local variable area of a Java stack frame is organized as a zero-based array of words. Instructions that use a local variable supply its index in this array. Values of type int, float, reference, and returnAddress occupy one entry. Values of type byte, short, and char are converted to int before they are stored. Values of type long and double occupy two consecutive entries.
An instruction refers to a long or double by the index of the first of the two entries it occupies. For example, if a long occupies entries 3 and 4, instructions refer to it by index 3.

The local variable area contains the method's parameters and local variables. Compilers place the parameters first, in the order in which they are declared. The remaining local variables may be placed in any order the compiler chooses.
Note: 1) Every instance method receives a hidden this reference, so the first entry (index 0) in the local variable area of an instance method is a reference to the object on which the method was invoked. 2) Because stack frames are organized in words, values of type byte, short, and char are stored as int inside a stack frame. They are converted back to their original types when they are stored on the heap or in the method area. 3) In Java, objects always live on the heap and are manipulated only through references. Neither the local variable area nor the operand stack ever contains a copy of an object, only object references.

2. Operand Stack
Like the local variable area, the operand stack is organized as an array of words. Unlike the local variable area, it is not accessed by index but by pushing and popping values. The virtual machine stores the same data types on the operand stack as in the local variable area. Values of type byte, short, and char are converted to int before being pushed onto the operand stack.
Apart from the program counter, which instructions cannot access directly, the Java Virtual Machine has no registers. It is stack-based rather than register-based: its instructions take their operands from the operand stack instead of from registers. Instructions can also take operands from other places, such as the bytes that immediately follow the opcode in the bytecode stream or the constant pool, but the main focus of the instruction set is the operand stack.
The Java Virtual Machine uses the operand stack as a work space. Many instructions pop values from the operand stack, operate on them, and push the result back.
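The interplay of local variables and the operand stack can be illustrated with a toy interpreter. This is a sketch of the stack-based idea only: the opcode names are borrowed from real JVM instructions, but the numeric encodings, the int[] code format, and the class TinyStackMachine are inventions for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TinyStackMachine {
    // Hypothetical opcode values; they are not the real JVM encodings.
    static final int ILOAD = 0, ISTORE = 1, IADD = 2, IMUL = 3, IRETURN = 4;

    // Executes code against one frame: a local variable array plus a stack.
    static int run(int[] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // index of the next instruction
        while (true) {
            int opcode = code[pc++];
            switch (opcode) {
                case ILOAD:   stack.push(locals[code[pc++]]); break;  // local -> stack
                case ISTORE:  locals[code[pc++]] = stack.pop(); break; // stack -> local
                case IADD:    stack.push(stack.pop() + stack.pop()); break;
                case IMUL:    stack.push(stack.pop() * stack.pop()); break;
                case IRETURN: return stack.pop();
                default: throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // Computes (locals[0] + locals[1]) * locals[2], keeping the
        // intermediate sum in local variable 3.
        int[] code = {
            ILOAD, 0, ILOAD, 1, IADD, ISTORE, 3,
            ILOAD, 3, ILOAD, 2, IMUL, IRETURN
        };
        System.out.println(run(code, new int[] {2, 3, 4, 0}));
    }
}
```

Every arithmetic step pops its operands from the stack and pushes its result back, and a value held in a local variable has to be loaded onto the stack before it can take part in a computation.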
3. Frame Data
In addition to the local variable area and the operand stack, a Java stack frame contains data that supports constant pool resolution, normal method return, and exception dispatch. This data is stored in the frame data portion of the stack frame.
Whenever the virtual machine encounters an instruction that refers to a constant pool entry, it uses the frame data's pointer to the constant pool to reach that entry. As mentioned earlier, references to types, fields, and methods in the constant pool are initially symbolic. When the virtual machine looks up a constant pool entry that refers to a class, interface, field, or method and the reference is still symbolic, the virtual machine must resolve it at that time.
The frame data must also help the virtual machine process a normal method completion. When a method returns normally, the virtual machine restores the stack frame of the invoking method. If the completing method returns a value, the virtual machine pushes that value onto the operand stack of the invoking method.

The frame data also contains a reference to the method's exception table, which the virtual machine uses to process exceptions. An exception table defines the ranges of bytecodes that are protected by catch clauses. Each entry holds, among other information, the starting and ending positions of the range protected by a catch clause. When a method throws an exception, the virtual machine uses the exception table to decide how to handle it. If it finds a matching catch clause, it transfers control to the bytecodes of that clause. If it finds none, the method completes abruptly, and the virtual machine rethrows the same exception in the context of the invoking method.
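The search for a matching catch clause across frames looks like this at the source level. In this sketch, ExceptionTableDemo, parse, and tryParse are illustrative names; parse has no handler of its own, so an exception thrown inside it completes that method abruptly and is handled in its caller.

```java
class ExceptionTableDemo {
    // No catch clause here: a NumberFormatException leaves this method
    // abruptly and continues in the invoking method.
    static int parse(String text) {
        return Integer.parseInt(text);
    }

    // The call to parse lies inside the range protected by the catch clause.
    static String tryParse(String text) {
        try {
            return "parsed " + parse(text);
        } catch (NumberFormatException e) {
            return "caught in caller";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("12"));
        System.out.println(tryParse("abc"));
    }
}
```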
In addition to data that supports these three functions, the frame data may include implementation-dependent information, such as debugging data.

5.3.9. Native Method Stacks

The native method stacks are implementation dependent. Implementers are free to decide what mechanisms the virtual machine uses to let a Java program invoke native methods.
Any native method interface will use some form of native method stack.

5.3.10. Execution Engine

At the core of any Java Virtual Machine implementation is its execution engine. The Java Virtual Machine specification describes the behavior of the execution engine in terms of an instruction set. For each instruction, the specification describes in detail what an implementation must do, but says little about how.
Each thread of a running Java program is a distinct instance of the virtual machine's execution engine. From the beginning of its lifetime to the end, a thread is either executing bytecodes or executing native methods. A thread may execute bytecodes directly, by interpreting them or by running them natively in silicon, or indirectly, by executing native code produced by a just-in-time compiler. An implementation may also use threads that are invisible to the running program, such as a garbage collection thread. Such threads need not be instances of the execution engine. All threads that belong to the running program, however, are execution engines in action.
1. Instruction Set
A method's bytecode stream is a sequence of instructions for the Java Virtual Machine. Each instruction consists of a one-byte opcode followed by zero or more operands. The opcode indicates the operation to perform. The operands supply extra data that the virtual machine needs to perform that operation. The abstract execution engine runs by executing one instruction at a time, and this happens for every thread of the program. Depending on the opcode, the virtual machine may use data from other areas in addition to the operands that follow the opcode: entries in the current constant pool, entries in the current frame's local variables, or values at the top of the current frame's operand stack.
From time to time, the execution engine encounters an instruction that requests a native method invocation. The execution engine then attempts to invoke the native method, and when the native method returns, it continues with the next instruction in the bytecode stream. Native methods can thus be thought of as extensions to the instruction set.
Part of the job of executing an instruction is determining the next instruction to execute. An execution engine does this in one of three ways. For most instructions, the next instruction is the one that immediately follows in the bytecode stream. For some instructions, such as goto or return, the instruction itself determines its successor. And if an instruction throws an exception, the execution engine determines the next instruction by searching for a matching catch clause.
The instruction set is centered on the operand stack: values are generally pushed onto the operand stack before they are used. Although the Java Virtual Machine has no registers for storing arbitrary values, each method has a set of local variables, and the instruction set treats these local variables much like a set of registers that are referred to by index. Nevertheless, a value stored in a local variable must usually be pushed onto the operand stack before it can be used.
Platform independence, network mobility, and security all influenced the design of the instruction set. Platform independence was a major influence: the stack-based approach makes it possible to implement the virtual machine on a wide variety of host architectures. Small opcodes and a compact encoding let bytecodes use network bandwidth efficiently. One-time bytecode verification improves security without sacrificing too much performance.
2. Execution Techniques
An implementation can use various execution techniques: interpretation, just-in-time compilation, adaptive optimization, and native execution in silicon. Adaptive optimization is among the most interesting and the fastest of these. First-generation virtual machines interpreted bytecodes one at a time. Second-generation virtual machines added a just-in-time compiler, which compiles each method to native code the first time it is executed and then runs the native code. An adaptively optimizing virtual machine begins by interpreting all code while monitoring its execution. Most programs spend 80% to 90% of their time executing 10% to 20% of their code. Once the virtual machine has identified this heavily used code, it starts a background thread that compiles those bytecodes to native code and optimizes the native code heavily. With this technique, the code that accounts for 80% to 90% of the run time eventually executes as heavily optimized native code, comparable to statically linked C++.
3. Threads
The Java Virtual Machine specification defines a threading model that is meant to be implementable on a wide variety of platforms. One goal of the model is to allow implementers to use native threads where possible. With native threads, the threads of a Java program can run truly simultaneously on a multiprocessor host.
One tradeoff of this threading model is that its specification of thread priorities is a lowest common denominator. A Java thread can run at any of ten priorities, from 1 (lowest) to 10 (highest). Implementers who use native threads can map the ten Java priorities onto native priorities in whatever way suits the platform. The specification only guarantees that threads at the highest priority get some CPU time, and that lower-priority threads get CPU time when all higher-priority threads are blocked. Lower-priority threads may get some CPU time while higher-priority threads are not blocked, but there is no guarantee. Therefore, to coordinate the work of different threads you must use synchronization rather than rely on priorities.
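The priority range is exposed through constants on java.lang.Thread. The sketch below only reads those constants and requests a priority; PriorityDemo and describe are illustrative names, and how the requested priority maps to the host's scheduler is left to the implementation.

```java
class PriorityDemo {
    // Summarizes the priority range defined by java.lang.Thread.
    static String describe() {
        return Thread.MIN_PRIORITY + ".." + Thread.MAX_PRIORITY
                + " (default " + Thread.NORM_PRIORITY + ")";
    }

    public static void main(String[] args) {
        System.out.println("priorities: " + describe());

        Thread worker = new Thread(() -> { });
        worker.setPriority(Thread.MAX_PRIORITY); // a hint, not a scheduling guarantee
        System.out.println("requested priority: " + worker.getPriority());
    }
}
```

A higher priority does not order the work of two threads, which is why the text points to synchronization for coordination.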
Synchronization has two parts: object locking, and thread wait and notify. Object locks help threads work independently of one another without interfering. Wait and notify help threads cooperate.
The Java Virtual Machine specification describes the behavior of Java threads in terms of variables, a main memory, and working memories. Each Java Virtual Machine instance has one main memory, which contains all of the program's variables: the instance variables of objects, the elements of arrays, and class variables. Each thread has its own working memory, in which it keeps working copies of the variables it uses or assigns. Local variables and parameters are private to each thread, so they can be thought of logically as part of the working memory. The low-level rules that govern thread behavior are expressed in terms of two kinds of transfers:
1) copying the value of a variable from main memory to working memory
2) writing the value from working memory back to main memory
Without synchronization, a thread may write the variables it has changed back to main memory in any order. To ensure that a multithreaded program executes correctly, you must use synchronization.
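A shared counter is a simple case where object locking is needed. In this sketch, SyncCounterDemo and runTwoThreads are names made up for the example; the synchronized methods acquire the lock of the counter object, so the two threads cannot interleave inside an increment.

```java
class SyncCounterDemo {
    private int count = 0;

    // Both methods lock this object, so updates are mutually exclusive
    // and visible to other threads that take the same lock.
    synchronized void increment() { count++; }
    synchronized int get() { return count; }

    // Runs two threads that each increment a shared counter perThread times.
    static int runTwoThreads(int perThread) {
        SyncCounterDemo counter = new SyncCounterDemo();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("final count: " + runTwoThreads(10000));
    }
}
```

Without the synchronized keyword on increment, the two threads could overwrite each other's updates and the final count could be lower than expected.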

5.3.11. Native Method Interface

A Java Virtual Machine implementation is not required to support any particular native method interface. Some implementations may support no native method interface at all. Sun's native method interface is the Java Native Interface (JNI).
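On the Java side, a method bound through a native method interface is simply declared with the native modifier and no body. The sketch below is deliberately incomplete: addNative is a hypothetical method with no native library behind it, so the example only inspects the declaration by reflection and never calls it (a call would fail with UnsatisfiedLinkError).

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeDeclDemo {
    // Hypothetical native method: its body would be supplied by native
    // code through an interface such as JNI. No library is loaded here.
    static native int addNative(int a, int b);

    // Reports whether the named method of this class is declared native.
    static boolean isNative(String methodName) {
        for (Method m : NativeDeclDemo.class.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return Modifier.isNative(m.getModifiers());
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("addNative is native: " + isNative("addNative"));
        System.out.println("isNative is native:  " + isNative("isNative"));
    }
}
```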
