Java Virtual Machine Architecture
Lifecycle of a Java Virtual Machine
A running Java virtual machine instance is responsible for running one Java program. When a Java program starts, a virtual machine instance is born; when the program exits, that instance dies. If three Java programs run on the same computer at the same time, there are three Java virtual machine instances. Each Java program runs in its own instance.
A Java virtual machine instance runs a Java program by calling the main() method of an initial class. The main() method must be public and static, return void, and take a String array as its only parameter. Any class with such a main() method can serve as the starting point of a Java program.
public class Test {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}
In the example above, the main() method of the initial class serves as the starting point of the program's initial thread. Every other thread is started, directly or indirectly, by this initial thread.
There are two kinds of threads inside the Java virtual machine: daemon threads and non-daemon threads. Daemon threads are usually used by the virtual machine itself, for example the threads that perform garbage collection. A Java program can, however, mark any thread it creates as a daemon thread. The initial thread of a Java program, the one that starts in main(), is a non-daemon thread.
As long as any non-daemon thread is still running, the Java program continues to run. When all non-daemon threads have terminated, the virtual machine instance exits automatically. If the security manager allows it, the program can also exit by calling the exit() method of the Runtime or System class.
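The distinction between the two kinds of threads can be sketched as follows. This is a minimal illustration, not part of the original example; the class name DaemonDemo and the helper newWorker are hypothetical. It relies only on the standard Thread.setDaemon() and Thread.isDaemon() methods.

```java
// Sketch: daemon vs. non-daemon threads. DaemonDemo and newWorker are illustrative names.
class DaemonDemo {
    // Creates (but does not start) a thread with the requested daemon status.
    static Thread newWorker(boolean daemon) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(50); // simulate some background work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.setDaemon(daemon); // must be called before start()
        return t;
    }

    public static void main(String[] args) {
        Thread background = newWorker(true);
        Thread foreground = newWorker(false);
        background.start();
        foreground.start();
        System.out.println("main is daemon: " + Thread.currentThread().isDaemon());
        System.out.println("background is daemon: " + background.isDaemon());
        System.out.println("foreground is daemon: " + foreground.isDaemon());
        // The virtual machine exits once main and foreground have finished,
        // whether or not background is still running.
    }
}
```

Note that setDaemon() has to be called before the thread is started; calling it on a running thread throws IllegalThreadStateException.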
Java virtual machine architecture
[Figure: structure diagram of the Java virtual machine.]
Each Java virtual machine has a class loading subsystem, which loads types (classes and interfaces) given their fully qualified names. Each Java virtual machine also has an execution engine, which is responsible for executing the instructions contained in the methods of loaded classes.
When the Java virtual machine runs a program, it needs memory to store many things: bytecode, other information extracted from loaded class files, objects created by the program, parameters passed to methods, return values, local variables, and so on. The Java virtual machine organizes these into several "runtime data areas" for easier management.
Some runtime data areas are shared by all threads of the program, while others belong to a single thread. Each Java virtual machine instance has one method area and one heap, both shared by all threads in that instance. When the virtual machine loads a class file, it parses the type information from the binary data in the file and places it in the method area. As the program runs, the virtual machine places all objects the program creates on the heap.
Each new thread gets its own PC register (program counter) and Java stack when it is created. If the thread is executing a Java method (not a native method), the PC register always points to the next instruction to execute, and the Java stack stores the state of the thread's Java method invocations, including local variables, the parameters passed in, return values, and intermediate results. The state of native method invocations is stored in an implementation-dependent way in native method stacks, or possibly in registers or other implementation-specific memory areas.
The Java stack is made up of stack frames. A stack frame holds the state of one Java method invocation. When a thread calls a Java method, the virtual machine pushes a new stack frame onto the thread's Java stack. When the method returns, the virtual machine pops that frame off the Java stack and discards it.
The Java virtual machine has no general-purpose registers for holding intermediate values; its instruction set uses the Java stack instead. This design keeps the instruction set compact and makes the virtual machine easier to implement on platforms with few general-purpose registers. The stack-based architecture also helps the dynamic and just-in-time compilers that some virtual machines use to optimize code at run time.
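To make the idea of a stack-based instruction set concrete, here is a toy evaluator. It is only a sketch under stated assumptions: it is not the real Java virtual machine, and the instruction names push, add, and mul as well as the class StackMachineSketch are invented for illustration. What it shows is that every instruction takes its inputs from an operand stack and leaves its result there, so no instruction needs to name a register.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy stack machine, NOT the real JVM: the instruction names are invented for illustration.
class StackMachineSketch {
    // Evaluates a whitespace-separated program such as "push 2 push 3 add".
    static int run(String program) {
        Deque<Integer> operands = new ArrayDeque<>();
        String[] tokens = program.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            switch (tokens[i]) {
                case "push":
                    // The operand follows the instruction in the program text.
                    operands.push(Integer.parseInt(tokens[++i]));
                    break;
                case "add":
                    operands.push(operands.pop() + operands.pop());
                    break;
                case "mul":
                    operands.push(operands.pop() * operands.pop());
                    break;
                default:
                    throw new IllegalArgumentException("unknown instruction: " + tokens[i]);
            }
        }
        return operands.pop(); // the result is whatever is left on top of the stack
    }

    public static void main(String[] args) {
        // Computes (2 + 3) * 4 without naming any register.
        System.out.println(run("push 2 push 3 add push 4 mul"));
    }
}
```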
[Figure: the memory areas the Java virtual machine creates for each thread, shown for a virtual machine instance with three threads.]
The memory areas created for each thread are private: no thread can access the PC register or Java stack of another thread.
The figure shows a snapshot of a virtual machine instance in which three threads are executing. Threads 1 and 2 are executing Java methods, while thread 3 is executing a native method.
In the figure, the Java stacks grow downward, so the top of each stack is at the bottom. The stack frame of the currently executing method is shown in a light color. For a thread that is running a Java method, the PC register always points to the next instruction to execute, so the PC registers of threads 1 and 2 are shown in a light color. Because thread 3 is executing a native method, its PC register, shown in a dark color, is undefined.
Data types
The Java virtual machine computes with a fixed set of data types, which fall into two groups: primitive types and reference types. Variables of a primitive type hold primitive values, and variables of a reference type hold reference values.
All primitive types of the Java language are also primitive types of the Java virtual machine, but boolean is a special case. Although the virtual machine treats boolean as a primitive type, its instruction set offers very limited support for it. When a compiler translates Java source code into bytecode, it uses int or byte to represent boolean. In the Java virtual machine, false is represented by the integer zero and true by a non-zero integer, and operations on boolean values use int instructions. boolean arrays are accessed as byte arrays, although on the heap they may be represented as bit fields.
The Java virtual machine also has one primitive type that is used only internally and is not available to Java programmers: returnAddress. It was historically used to implement finally clauses. The jsr, ret, and jsr_w instructions operate on this type, and its value is a pointer to the opcode of a Java virtual machine instruction. Unlike the numeric primitive types, returnAddress does not correspond to any Java language type, and its value cannot be modified by the running program.
The reference types of the Java virtual machine are collectively called "references". There are three kinds: class types, interface types, and array types. Their values are all references to dynamically created objects. A class-type value refers to a class instance; an array-type value refers to an array object (in the Java virtual machine, arrays are real objects); and an interface-type value refers to an instance of a class that implements the interface. One more special reference value is null, which means the variable does not refer to any object.
Passing object references as method parameters in Java
Parameter passing in Java is often described as either passing by value or passing by reference. Strictly speaking, Java always passes arguments by value: when an object is passed to a method, the value that gets copied is the reference to the object, and it is this case that is loosely called "pass by reference". Passing primitive values needs no explanation, so let's look at passing references.
public class User {
    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

public class Test {
    public void set(User user) {
        user.setName("hello world");
    }

    public static void main(String[] args) {
        Test test = new Test();
        User user = new User();
        test.set(user);
        System.out.println(user.getName());
    }
}
The code above prints "hello world", as expected. What is the result if the set method is changed to the following?
public void set(User user) {
    user.setName("hello world");
    user = new User();
    user.setName("change");
}
The answer is still "hello world". Let's analyze the code step by step.
First:
User user = new User();
This creates an object on the heap and a reference on the stack that points to that object.
test.set(user);
This passes the reference user as an argument to the set method. Note that what is passed is not the reference itself but a copy of it. In other words, two references (the original and the copy) now point to the same object on the heap.
user.setName("hello world");
In the set() method, the copy of the user reference operates on the User object on the heap and sets its name attribute to the string "hello world".
user = new User();
In the set() method, another User object is created, and the copy of the user reference is pointed at this newly created object on the heap. The original reference in main() still points to the first object.
user.setName("change");
In the set() method, the copy of the user reference now operates on the newly created User object on the heap.
The set() method then completes, and control returns to the main() method:
System.out.println(user.getName());
Earlier, the copy of the user reference set the name attribute of the original User object to "hello world", so when main() calls getName() through its own user reference, the printed result is "hello world".
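The whole walkthrough can be condensed into one self-contained program. This is a sketch that reuses the User class and set() method from the text; the wrapper class PassByValueDemo and the helper run() are hypothetical names added so the example can be compiled and run as a single file.

```java
// Sketch: reassigning a parameter does not affect the caller's reference.
class PassByValueDemo {
    static class User {
        private String name;

        String getName() {
            return name;
        }

        void setName(String name) {
            this.name = name;
        }
    }

    // Mutates the caller's object, then rebinds only the local copy of the reference.
    static void set(User user) {
        user.setName("hello world"); // visible to the caller: same object on the heap
        user = new User();           // rebinds the local copy only
        user.setName("change");      // modifies the new object, invisible to the caller
    }

    // Runs the scenario and returns the name seen by the caller.
    static String run() {
        User user = new User();
        set(user);
        return user.getName();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "hello world"
    }
}
```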
Class loading subsystem
In the Java virtual machine, the part responsible for finding and loading types is called the class loading subsystem.
The Java virtual machine has two kinds of class loaders: the bootstrap (startup) class loader and user-defined class loaders. The former is part of the virtual machine implementation; the latter are part of the Java program. Classes loaded by different class loaders are placed in different namespaces inside the virtual machine.
The class loading subsystem involves several other parts of the Java virtual machine as well as several classes from the java.lang library. For example, a user-defined class loader is an ordinary Java object whose class must descend from java.lang.ClassLoader. The methods of ClassLoader give the program an interface to the class loading mechanism. In addition, for every type it loads, the Java virtual machine creates an instance of class java.lang.Class to represent that type. Like all other objects, user-defined class loaders and instances of Class reside on the heap, while the loaded type information resides in the method area.
Besides locating and importing binary class files, the class loading subsystem is also responsible for verifying the correctness of imported classes, allocating and initializing memory for class variables, and helping to resolve symbolic references. These activities must be performed strictly in the following order:
(1) Loading: find and import the binary data for a type.
(2) Linking: perform verification, preparation, and (optionally) resolution.
● Verification: ensure the correctness of the imported type.
● Preparation: allocate memory for class variables and initialize them to default values.
● Resolution: convert symbolic references in the type to direct references.
(3) Initialization: initialize class variables to their proper initial values.
Every Java virtual machine implementation must have a bootstrap class loader that knows how to load trusted classes.
Each class loader has its own namespace, which keeps track of the types it has loaded. A Java program can therefore load several types with the same fully qualified name, and a fully qualified name alone is not always enough to identify a type uniquely within a Java virtual machine instance. When several class loaders have each loaded a type with the same name, a type is identified uniquely by qualifying its name with the identity of the class loader that loaded it, which indicates the namespace the type belongs to.
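A program can ask any Class object which loader defined it. The sketch below uses the standard Class.getClassLoader() method; the class name LoaderDemo and the helper loaderOf are hypothetical. It assumes a common implementation choice, documented as permitted but not required, in which getClassLoader() returns null for types defined by the bootstrap (startup) class loader.

```java
// Sketch: finding out which class loader defined a type.
class LoaderDemo {
    // Describes the loader that defined the given type.
    static String loaderOf(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        // Assumption: this implementation represents the bootstrap loader as null.
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        // A core library class, expected to come from the bootstrap loader.
        System.out.println("String: " + loaderOf(String.class));
        // A class of this program, defined by some other loader.
        System.out.println("LoaderDemo: " + loaderOf(LoaderDemo.class));
    }
}
```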
Method area
In the Java virtual machine, information about loaded types is stored in a logical area of memory called the method area. When the virtual machine loads a type, it uses a class loader to locate the corresponding class file, reads the file (a linear stream of binary data), and passes it to the virtual machine. The virtual machine then extracts the type information from the binary data and stores it in the method area. The type's class (static) variables are stored in the method area as well.
How a Java virtual machine stores type information internally is up to the designer of the specific implementation.
When the virtual machine runs a Java program, it looks up and uses the type information stored in the method area. Because all threads share the method area, access to its data must be thread-safe. For example, suppose two threads try to access a class named Lava at the same time and the class has not yet been loaded. Only one thread should load the class, while the other thread waits.
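Class loading itself is hard to observe from Java code, but a closely related guarantee is easy to check: a class's static initializer runs exactly once even when several threads trigger it at the same time. The sketch below demonstrates that guarantee, as an analogy to the loading scenario rather than a direct view of it. The names InitOnceDemo, INIT_COUNT, touch, and race are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: two threads race to use a class; its static initializer still runs once.
class InitOnceDemo {
    static final AtomicInteger INIT_COUNT = new AtomicInteger();

    static class Lava {
        static {
            INIT_COUNT.incrementAndGet(); // runs when Lava is initialized
        }

        static void touch() {
            // Calling a static method triggers initialization of Lava if needed.
        }
    }

    // Starts two threads that both use Lava, then reports how often it was initialized.
    static int race() {
        Thread first = new Thread(() -> Lava.touch());
        Thread second = new Thread(() -> Lava.touch());
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return INIT_COUNT.get();
    }

    public static void main(String[] args) {
        System.out.println("Lava initialized " + race() + " time(s)");
    }
}
```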
For each loaded type, the virtual machine stores the following type information in the method area:
● The fully qualified name of the type
● The fully qualified name of the type's direct superclass (unless the type is an interface or java.lang.Object, which have no superclass)
● Whether the type is a class or an interface
● The type's modifiers (some subset of public, abstract, final)
● An ordered list of the fully qualified names of any direct superinterfaces
In addition to the basic type information listed above, the virtual machine must store the following for each loaded type:
● The constant pool of the type
● Field information
● Method information
● All class (static) variables declared in the type, except constants
● A reference to class ClassLoader
● A reference to class Class
Constant pool
The virtual machine must maintain a constant pool for each loaded type. A constant pool is an ordered set of the constants used by the type, including direct constants (literals) and symbolic references to other types, fields, and methods. Entries in the pool are accessed by index, like the elements of an array. Because the constant pool holds the symbolic references to all types, fields, and methods used by the type, it plays a central role in the dynamic linking of Java programs.
Field information
For each field declared in the type, the following information must be stored in the method area. The order in which the fields are declared in the class or interface must also be recorded.
○ Field name
○ Type of field
○ Field modifiers (a subset of public, private, protected, static, final, volatile, transient)
Method information
For each method declared in the type, the following information must be saved in the method area. Like fields, the order in which these methods are declared in a class or interface must be preserved.
○ Method name
○ Method return type (or void)
○ Number and type of method parameters (in order of declaration)
○ Method modifiers (a subset of public, private, protected, static, final, synchronized, native, abstract)
In addition to the items listed above, the following information must be stored for every method that is neither abstract nor native:
○ Method bytecodes
○ Sizes of the operand stack and the local variable area of the method's stack frame
○ Exception table
Class (static) variables
Class variables are shared by all instances of a class, but they can be accessed even when no instance exists. They belong to the class rather than to any instance, so they are always stored in the method area as part of the type information. Before using a class, the virtual machine must allocate space in the method area for all of the class's variables except the compile-time constants declared in the class.
Compile-time constants (class variables that are declared final and initialized with values known at compile time) are treated differently from ordinary class variables. Every type that uses a compile-time constant gets its own copy of the constant, either in its constant pool or embedded in its bytecode stream. As part of a constant pool or bytecode stream, compile-time constants are stored in the method area, just like ordinary class variables. But whereas ordinary class variables are stored as part of the data of the type that declares them, compile-time constants are stored as part of the data of the types that use them.
Reference to ClassLoader class
For each type it loads, the virtual machine must keep track of whether the type was loaded by the bootstrap (startup) class loader or by a user-defined class loader. If a user-defined class loader loaded the type, the virtual machine must store a reference to that loader with the type information, as part of the type's data in the method area.
The virtual machine uses this information during dynamic linking. When one type refers to another, the virtual machine asks the class loader that loaded the referring type to load the referenced type. This dynamic linking process is also central to the way the virtual machine keeps namespaces separate. To perform dynamic linking correctly and maintain multiple namespaces, the virtual machine must record in the method area which class loader loaded each type.
Reference to Class
For each loaded type (whether a class or an interface), the virtual machine creates an instance of class java.lang.Class, and it must somehow associate this instance with the type's data in the method area.
Java programs can obtain and use references to Class objects. A static method of class Class lets you get a reference to the Class instance of any loaded class:
public static Class<?> forName(String className)
For example, calling forName("java.lang.Object") returns a reference to the Class object that represents java.lang.Object. You can use forName() to get the Class object of any type in any package, as long as the type can be (or already has been) loaded into the current namespace. If the virtual machine cannot load the requested type into the current namespace, forName() throws ClassNotFoundException.
Another way to get a reference to a Class object is to call getClass() on any object reference. Every object inherits this method from class Object:
public final native Class<?> getClass();
For example, if you have a reference to an object of class java.lang.Integer, you can call getClass() on that reference to get the Class object that represents java.lang.Integer.
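Both ways of obtaining a Class object can be tried in a few lines. In this sketch the class name ClassRefDemo, the helper lookup, and the type name "no.such.Type" are hypothetical; Class.forName() and Object.getClass() are the standard methods described in the text.

```java
// Sketch: obtaining Class objects by name and from an instance.
class ClassRefDemo {
    // Looks up a Class object by name; returns null if the type cannot be loaded.
    static Class<?> lookup(String className) {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Class<?> byName = lookup("java.lang.Integer");
        Integer boxed = Integer.valueOf(42);
        Class<?> byInstance = boxed.getClass();

        // Both routes lead to the same Class object for java.lang.Integer.
        System.out.println(byName == byInstance);
        // A type that cannot be found is reported through ClassNotFoundException,
        // which lookup() turns into null.
        System.out.println(lookup("no.such.Type"));
    }
}
```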
Method area usage example
To show how the virtual machine uses the information in the method area, the following is an example:
class Lava {
    private int speed = 5;

    void flow() {
    }
}

public class Volcano {
    public static void main(String[] args) {
        Lava lava = new Lava();
        lava.flow();
    }
}
Different virtual machine implementations may work in completely different ways. The following describes just one possibility, not the only one.
To run the Volcano program, you first give the virtual machine the name "Volcano" in some implementation-dependent way. The virtual machine then finds and reads the class file "Volcano.class", extracts the type information from the binary data in that file, and places it in the method area. The virtual machine starts executing main() by executing the bytecode stored in the method area. During execution, it maintains a pointer to the constant pool (a data structure in the method area) of the current class, Volcano.
Note: when the virtual machine starts executing the bytecode of main() in class Volcano, class Lava has not been loaded yet. Like most (perhaps all) virtual machine implementations, this one does not wait until every class used by the program has been loaded before it starts running; it loads a class only when the class is needed.
The first instruction of main() tells the virtual machine to allocate enough memory for the class listed in the first entry of the constant pool. The virtual machine uses its pointer to Volcano's constant pool to look up the first entry and finds a symbolic reference to class Lava. It then checks the method area to see whether Lava has already been loaded.
The symbolic reference is just a string giving the fully qualified name of the class, "Lava". So that the virtual machine can find a class from its name as quickly as possible, the designer of the virtual machine should choose the best data structure and algorithm for this lookup.
When the virtual machine finds that the class named "Lava" has not been loaded, it finds and loads the file "Lava.class" and places the type information extracted from the binary data in the method area.
Next, the virtual machine replaces the first entry of the constant pool (the string "Lava") with a pointer directly to the Lava class data in the method area, so that it can use this pointer to access the Lava class quickly in the future. This replacement is called constant pool resolution: symbolic references in the constant pool are replaced with direct references.
Finally, the virtual machine is ready to allocate memory for a new Lava object. At this point it needs information from the method area again. Remember the pointer just placed in the first entry of Volcano's constant pool? The virtual machine now uses it to reach the Lava type information and find out how much heap space a Lava object requires.
A Java virtual machine can always determine how much memory an object needs from the type information stored in the method area. Once it has determined the size of a Lava object, it allocates that much space on the heap and initializes the object's instance variable speed to its default initial value of 0.
When the reference to the newly created Lava object has been pushed onto the stack, the first instruction of main() is complete. The following instructions use this reference to invoke Java code that initializes the speed variable to its proper initial value of 5. Another instruction then uses the reference to invoke the flow() method on the Lava object.
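The two-step initialization of speed (first the default value 0, then the proper value 5) can be observed from Java code. The sketch below is one way to do it, under the assumption that a superclass is added to the example: Base, seenDuringConstruction, currentSpeed, and observe are hypothetical names that do not appear in the original Lava class. The superclass constructor runs before Lava's field initializer, so it sees the default value.

```java
// Sketch: observing the default value of a field before its initializer has run.
class DefaultValueDemo {
    static class Base {
        final int seenDuringConstruction;

        Base() {
            // Runs before the field initializers of any subclass,
            // so the overridden method sees speed at its default value.
            seenDuringConstruction = currentSpeed();
        }

        int currentSpeed() {
            return -1;
        }
    }

    static class Lava extends Base {
        private int speed = 5; // assigned only after Base() has finished

        @Override
        int currentSpeed() {
            return speed;
        }
    }

    // Returns { value seen during construction, value after construction }.
    static int[] observe() {
        Lava lava = new Lava();
        return new int[] { lava.seenDuringConstruction, lava.currentSpeed() };
    }

    public static void main(String[] args) {
        int[] values = observe();
        System.out.println("during construction: " + values[0]); // prints 0
        System.out.println("after construction: " + values[1]);  // prints 5
    }
}
```

Calling an overridable method from a constructor is used here only to expose the default value; it is generally something to avoid in production code for exactly this reason.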
Heap
All class instances and arrays that a Java program creates at run time are placed on the same heap. A Java virtual machine instance has only one heap, so all of its threads share it. Because each Java program has exclusive use of its own Java virtual machine instance, each program also has its own heap, and programs cannot interfere with one another. Multiple threads of the same Java program, however, share one heap, so you must consider synchronizing access to objects (heap data) from multiple threads.
The Java virtual machine has an instruction that allocates a new object on the heap but no instruction that frees memory, just as Java code has no way to free an object explicitly. The virtual machine itself decides how and when to reclaim the memory occupied by objects that the running program no longer references. Usually it delegates this task to the garbage collector.
Internal representation of the array
In Java, arrays are real objects. Like other objects, arrays are always stored on the heap, and like other objects they have a Class instance associated with their class. All arrays with the same number of dimensions and the same element type are instances of the same class, regardless of their length (or, for a multidimensional array, the length of each dimension). For example, an array of 3 ints and an array of 300 ints have the same class. The length of an array is part of its instance data only.
The name of an array class has two parts: one open square bracket "[" for each dimension, followed by a character or string that identifies the element type. For example, the class name of a one-dimensional array of int is "[I", that of a three-dimensional array of byte is "[[[B", and that of a two-dimensional array of Object is "[[Ljava/lang/Object;".
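These names can be checked with Class.getName(). One difference is worth noting: getName() reports the element class with dots instead of the slashes used inside class files, so the two-dimensional Object array appears as "[[Ljava.lang.Object;". The class name ArrayClassNames and the helper nameOf in this sketch are hypothetical.

```java
// Sketch: array class names as reported by Class.getName().
class ArrayClassNames {
    // Returns the class name of the given array object.
    static String nameOf(Object array) {
        return array.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(nameOf(new int[3]));        // [I
        System.out.println(nameOf(new byte[1][1][1])); // [[[B
        System.out.println(nameOf(new Object[2][2]));  // [[Ljava.lang.Object;
        // Arrays of different lengths share one class.
        System.out.println(new int[3].getClass() == new int[300].getClass()); // true
    }
}
```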
Multidimensional arrays are represented as arrays of arrays. For example, a two-dimensional array of int is represented as a one-dimensional array in which each element is a reference to a one-dimensional int array.
Each array object on the heap must also store the length of the array, the array data, and some kind of reference to the array's class data. Given a reference to an array object, the virtual machine must be able to determine the array's length, access its elements by index (checking that the index is within the array bounds), invoke the methods declared in Object, the direct superclass of all arrays, and so on.
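Because a two-dimensional array is an array of references to one-dimensional arrays, its rows need not have the same length, and every element access is bounds-checked. A minimal sketch follows; the class name ArrayOfArraysDemo and the helpers jagged and inBounds are hypothetical.

```java
// Sketch: a two-dimensional array is an array of references to one-dimensional arrays.
class ArrayOfArraysDemo {
    // Builds a "jagged" array whose rows have different lengths.
    static int[][] jagged() {
        int[][] grid = new int[2][]; // outer array holds two references, initially null
        grid[0] = new int[3];
        grid[1] = new int[5];
        return grid;
    }

    // Reports whether the index is accepted by the virtual machine's bounds check.
    static boolean inBounds(int[] array, int index) {
        try {
            int value = array[index]; // the index is checked on every access
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int[][] grid = jagged();
        System.out.println(grid[0].length + " and " + grid[1].length);
        System.out.println(grid[0].getClass() == int[].class); // each row is an int[]
        System.out.println(inBounds(grid[0], 2));
        System.out.println(inBounds(grid[0], 3));
    }
}
```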
Program counter
In a running Java program, each thread has its own PC (program counter) register, which is created when the thread starts. The PC register is one word in size, so it can hold either a native pointer or a returnAddress. While a thread executes a Java method, the PC register contains the "address" of the next instruction to execute. This "address" can be a native pointer or an offset from the start of the method's bytecode. If the thread is executing a native method, the value of the PC register is undefined.
Java stack
Each time a new thread is started, the Java virtual machine allocates a Java stack for it. The Java stack stores the thread's state in units of frames. The virtual machine performs only two operations directly on a Java stack: it pushes frames and pops frames.
The method a thread is executing is called the thread's current method. The stack frame of the current method is the current frame, the class in which the current method is defined is the current class, and the constant pool of the current class is the current constant pool. While a thread executes a method, the virtual machine keeps track of the current class and the current constant pool. When it encounters instructions that operate on data stored in the stack frame, it performs those operations on the current frame.
Each time a thread calls a Java method, the virtual machine pushes a new frame onto the thread's Java stack, and this new frame becomes the current frame. While the method executes, it uses the frame to store its parameters, local variables, intermediate results, and other data.
A Java method can complete in two ways. If it completes by executing a return, it completes normally; if it terminates by throwing an exception, it completes abruptly. Either way, the virtual machine pops the current frame off the Java stack and discards it, and the frame of the previous method becomes the current frame again.
All data on a thread's Java stack is private to that thread. No thread can access the stack data of another thread, so there is no need to synchronize access to stack data in multithreaded programs. When a thread calls a method, the method's local variables are stored in a frame on the calling thread's Java stack, and only that thread, the one that called the method, can ever access them.
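The one-frame-per-call behavior can be made visible with Thread.getStackTrace(), which reports the frames currently on the calling thread's stack. This is a rough sketch: the class FrameDepthDemo and its methods are hypothetical, and it assumes the virtual machine reports every frame, which the API documentation does not strictly guarantee.

```java
// Sketch: each nested Java method call adds one frame to the thread's stack.
class FrameDepthDemo {
    // Number of frames on the calling thread's stack, as reported by the virtual machine.
    static int depth() {
        return Thread.currentThread().getStackTrace().length;
    }

    // Calls itself the given number of times, then measures the stack depth.
    static int nested(int levels) {
        if (levels == 0) {
            return depth();
        }
        return nested(levels - 1);
    }

    // How many extra frames the given number of nested calls added.
    static int extraFrames(int levels) {
        return nested(levels) - nested(0);
    }

    public static void main(String[] args) {
        System.out.println("extra frames for 3 nested calls: " + extraFrames(3));
    }
}
```

Once each nested call returns, its frame is popped again, which is why measuring nested(0) afterwards gives the same baseline as before.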
Native method stack
All the runtime data areas described so far are precisely defined in the Java Virtual Machine Specification. A running Java program may also use data areas associated with native methods. When a thread calls a native method, it enters a world that is no longer restricted by the virtual machine. A native method can access the virtual machine's runtime data areas through the native method interface, but it can also do anything else it wants.
Native methods are inherently implementation-dependent, and designers of virtual machine implementations are free to decide what mechanism Java programs use to call native methods.
Any native method interface will use some kind of native method stack. When a thread calls a Java method, the virtual machine creates a new stack frame and pushes it onto the Java stack. When the thread calls a native method, however, the virtual machine leaves the Java stack unchanged and pushes no new frame onto it; it simply links dynamically to the specified native method and invokes it directly.
If a virtual machine's native method interface uses the C linkage model, its native method stack is a C stack. When a C program calls a C function, the stack operations are well defined: the arguments are pushed onto the stack in a certain order, and the return value is passed back to the caller in a certain way. This, again, is the behavior of the native method stack in such a virtual machine implementation.
A native method may well need to call back into a Java method in the Java virtual machine. In that case, the thread saves the state of the native method stack and enters another Java stack.
[Figure: a thread whose Java methods call a native method, which in turn calls back into a Java method.]
The figure gives a panoramic view of a thread inside the Java virtual machine. A thread may spend its whole lifetime executing Java methods and working with its Java stack, or it may move freely back and forth between the Java stack and the native method stack.
In the figure, the thread first called two Java methods, and the second Java method called a native method, which caused the virtual machine to use a native method stack. Suppose this is a C stack holding two C functions: the first C function is called as a native method by the second Java method, and it in turn calls the second C function. The second C function then calls back a Java method (the third Java method) through the native method interface, and finally that Java method calls another Java method, which is the current method in the figure.