Understanding the Java Virtual Machine, reading notes: the class loading process
I. Loading
Loading is one stage of the class loading process. During the loading stage, the virtual machine must complete the following three tasks:
A. Obtain the binary byte stream that defines the class, using the class's fully qualified name.
B. Convert the static storage structure represented by this byte stream into the run-time data structure of the method area.
C. Generate a java.lang.Class object in the Java heap that represents the class and serves as the access entry to its data in the method area.
The virtual machine specification does not spell out these three requirements in detail, so virtual machine implementations and applications have considerable flexibility. For example, the requirement to "obtain the binary byte stream that defines the class, using the class's fully qualified name" does not say that the byte stream must come from a Class file; more precisely, it does not say where or how to obtain it at all. The virtual machine design team left the loading stage very open, and many important Java technologies were built on this basis over the course of Java's development. For example:
A. Read from a ZIP package. This is very common and later became the basis for the JAR, EAR, and WAR formats.
B. Obtain from the network. The most typical application of this scenario is the Applet.
C. Generate by computation at run time. The most common use of this scenario is dynamic proxy technology: java.lang.reflect.Proxy uses ProxyGenerator.generateProxyClass to generate the binary byte stream of a proxy class of the form *$Proxy for specific interfaces.
D. Generate from other files. Typical scenario: JSP applications, where classes are generated from JSP files.
E. Read from a database. This scenario is relatively rare. Some middleware servers (such as SAP NetWeaver) can install programs into a database to distribute program code across a cluster.
F ........
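As an illustration of item C, the following minimal sketch asks java.lang.reflect.Proxy for a proxy instance and inspects the class that the JDK generated for it at run time. The Greeter interface, the handler, and the class name RuntimeGeneratedClassDemo are made up for this example. The exact name of the generated class varies between JDK versions, so the code only relies on it containing $Proxy.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class RuntimeGeneratedClassDemo {
    // Hypothetical interface, used only for this illustration.
    interface Greeter {
        String greet(String name);
    }

    // Builds a proxy whose class does not exist in any .class file on disk;
    // the JDK generates its bytes at run time and defines the class itself.
    static Greeter newGreeter() {
        InvocationHandler handler = (proxy, method, args) -> "Hello, " + args[0];
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = newGreeter();
        // The generated class name contains "$Proxy"; the prefix and number vary.
        System.out.println(greeter.getClass().getName());
        System.out.println(Proxy.isProxyClass(greeter.getClass())); // true
        System.out.println(greeter.greet("JVM"));                   // Hello, JVM
    }
}
```

The point of the sketch is only that a class can come into existence without a Class file: the loading stage accepts any source that yields a valid byte stream.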
Parts of the loading stage and the linking stage (for example, some of the bytecode file format verification actions) are interleaved: linking may begin before loading has finished. However, the actions that take place in the middle of the loading stage still belong to the linking stage, and the start times of the two stages still keep a fixed order.
II. Verification
Verification is the first step of the linking stage. Its purpose is to ensure that the information contained in the byte stream of the Class file meets the requirements of the current virtual machine and does not endanger the security of the virtual machine itself.
The verification stage is very important, and its workload accounts for a large part of the virtual machine's class loading subsystem. If the input byte stream does not conform to the constraints of the Class file format, a java.lang.VerifyError or one of its subclasses is thrown. The virtual machine specification does not state clearly what exactly to check, how to check it, or when to check it, so different virtual machines may implement verification differently. In general, verification is completed in four stages: file format verification, metadata verification, bytecode verification, and symbolic reference verification.
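A quick way to see the earliest of these checks at work is to hand a class loader bytes that are not a Class file at all. The sketch below is only an illustration: ByteLoader, MalformedClassDemo, and the name demo.Broken are invented for the example. Here the rejection surfaces as java.lang.ClassFormatError, which, like VerifyError, is a LinkageError.

```java
class MalformedClassDemo {
    // Minimal loader that exposes the protected defineClass method.
    static class ByteLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    // Returns the simple name of the LinkageError raised for the given bytes,
    // or "accepted" if the virtual machine defined a class from them.
    static String tryDefine(byte[] bytes) {
        try {
            new ByteLoader().define("demo.Broken", bytes);
            return "accepted";
        } catch (LinkageError e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Four arbitrary bytes: too short and without the 0xCAFEBABE magic number.
        byte[] notAClassFile = { 0x00, 0x01, 0x02, 0x03 };
        System.out.println(tryDefine(notAClassFile)); // ClassFormatError
    }
}
```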
III. Preparation
The preparation stage is where memory is formally allocated for class variables and their initial values are set. This memory is allocated in the method area. Two points need emphasis. First, memory is allocated at this time only for class variables, not for instance variables; instance variables are allocated in the Java heap together with the object when the object is instantiated. Second, the initial value here is "normally" the zero value of the data type. Assume that a class variable is defined as:
public static int value = 123;
After the preparation stage, the initial value of the variable value is 0 rather than 123, because no Java method has started executing yet. The putstatic instruction that assigns 123 to value is placed in the class constructor method <clinit>() when the program is compiled, so the assignment of 123 is executed only in the initialization stage.
The initial value is "normally" zero, as stated above, but there are special cases. If the field attribute table of a class field contains a ConstantValue attribute, the variable is initialized in the preparation stage to the value specified by that attribute. Assume that the class variable value above is instead defined as:
public static final int value = 123;
At compile time, javac generates a ConstantValue attribute for value, and in the preparation stage the virtual machine sets value to 123 according to that attribute.
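The difference can be observed from ordinary Java code. In the sketch below (the class and member names are invented for the illustration), a static initializer that runs before the assignment of 123 reads the field through a helper method and sees the zero value left by preparation. For the static final variant no zero is ever observable; note that javac also inlines such compile-time constants at their use sites, so this part of the example shows the visible effect rather than the ConstantValue mechanism itself.

```java
class PreparationDemo {
    // Runs first in <clinit>(): 'value' has not been assigned 123 yet,
    // so the helper method reads the zero value set during preparation.
    static int seenBeforeAssignment = readValue();
    static int value = 123;

    // Also runs before the textual position of CONSTANT, yet sees 123.
    static int constantSeenEarly = readConstant();
    static final int CONSTANT = 123;

    static int readValue() {
        return value;
    }

    static int readConstant() {
        return CONSTANT;
    }

    public static void main(String[] args) {
        System.out.println(seenBeforeAssignment); // 0
        System.out.println(value);                // 123
        System.out.println(constantSeenEarly);    // 123
    }
}
```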
IV. Resolution
Resolution is the process in which the virtual machine replaces the symbolic references in the constant pool with direct references. In the Class file, symbolic references appear as constants of types such as CONSTANT_Class_info, CONSTANT_Fieldref_info, and CONSTANT_Methodref_info. Direct references and symbolic references are related as follows:
A. A symbolic reference describes the referenced target with a set of symbols. The symbols can be any form of literal, as long as they locate the target unambiguously when used. A symbolic reference is independent of the memory layout of the virtual machine implementation, and the referenced target need not have been loaded into memory.
B. A direct reference can be a pointer directly to the target, a relative offset, or a handle that locates the target indirectly. A direct reference depends on the memory layout of the virtual machine implementation: the same symbolic reference generally translates to different direct references on different virtual machine instances. If a direct reference exists, the referenced target must already be in memory.
The virtual machine specification does not prescribe when the resolution stage takes place. It only requires that the symbolic references used by the following bytecode instructions be resolved before those instructions are executed: anewarray, checkcast, getfield, getstatic, instanceof, invokeinterface, invokespecial, invokestatic, invokevirtual, multianewarray, new, putfield, and putstatic. The virtual machine implementation therefore decides, as needed, whether to resolve the symbolic references in the constant pool when the class is loaded by the class loader or to wait until a symbolic reference is about to be used.
It is common for the same symbolic reference to be resolved multiple times. A virtual machine implementation may cache the result of the first resolution to avoid repeating the work. Whether or not resolution is actually performed multiple times, the virtual machine must guarantee consistent results: if a symbolic reference has been resolved successfully, subsequent resolution requests for it must always succeed; likewise, if the first resolution fails, resolution requests for that symbolic reference from other instructions must receive the same exception.
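The consistency rule can be pictured with a toy model. The sketch below is not how any real virtual machine stores resolved references; ResolutionCacheSketch is a hypothetical class that uses Class.forName as a stand-in for the resolver and a map as a stand-in for the constant pool cache. What it illustrates is only the contract: the first outcome, success or failure, is what every later request sees.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ResolutionCacheSketch {
    // Maps a symbolic name to its first outcome: a Class or the error raised.
    private final Map<String, Object> outcomes = new ConcurrentHashMap<>();

    Class<?> resolve(String binaryName) {
        Object outcome = outcomes.computeIfAbsent(binaryName, name -> {
            try {
                // Load without initializing, using this class's own loader.
                return Class.forName(name, false,
                        ResolutionCacheSketch.class.getClassLoader());
            } catch (ClassNotFoundException e) {
                NoClassDefFoundError error = new NoClassDefFoundError(name);
                error.initCause(e);
                return error; // remember the failure as well
            }
        });
        if (outcome instanceof NoClassDefFoundError) {
            throw (NoClassDefFoundError) outcome; // same error on every request
        }
        return (Class<?>) outcome;
    }

    public static void main(String[] args) {
        ResolutionCacheSketch pool = new ResolutionCacheSketch();
        System.out.println(pool.resolve("java.lang.String") == String.class); // true
        try {
            pool.resolve("no.such.Type");
        } catch (NoClassDefFoundError first) {
            try {
                pool.resolve("no.such.Type");
            } catch (NoClassDefFoundError second) {
                System.out.println(first == second); // true
            }
        }
    }
}
```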
Resolution mainly applies to four kinds of symbolic references: class or interface, field, class method, and interface method, which correspond to the constant pool entries CONSTANT_Class_info, CONSTANT_Fieldref_info, CONSTANT_Methodref_info, and CONSTANT_InterfaceMethodref_info respectively. The resolution process for these four kinds of references is described below.
1. Class or Interface Resolution
Assume that the current code is in class D and that a symbolic reference N, which has never been resolved, is to be resolved into a direct reference to a class or interface C. The virtual machine completes the resolution in the following three steps:
A. If C is not an array type, the virtual machine passes the fully qualified name represented by N to the class loader of D to load class C. During loading, the needs of metadata verification and bytecode verification may trigger the loading of other related classes, such as the superclass of this class or the interfaces it implements. If any exception occurs during loading, resolution fails.
B. If C is an array type and the element type of the array is an object, that is, the descriptor of N has a form like "[Ljava.lang.Integer", the array element type is loaded according to the rule in step A. With the descriptor form just given, the element type to load is "java.lang.Integer". The virtual machine then generates an array object that represents the dimensions and element type of the array.
C. If no exception occurs in the steps above, C is now a valid class or interface in the virtual machine. Before resolution completes, however, symbolic reference verification must confirm that D has access permission to C. If D does not have access permission, a java.lang.IllegalAccessError is thrown.
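The array case in step B has a visible counterpart in the reflection API. The following sketch (ArrayClassDemo and loadArrayClass are names chosen for the example) loads an array class by name with Class.forName and checks its element type. The name format used here is the one returned by Class.getName, which ends with a semicolon for object element types.

```java
class ArrayClassDemo {
    // Loads a class by its Class.getName-style name, for example "[[I".
    static Class<?> loadArrayClass(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(name, e);
        }
    }

    public static void main(String[] args) {
        Class<?> arrayClass = loadArrayClass("[Ljava.lang.Integer;");
        System.out.println(arrayClass.isArray());          // true
        System.out.println(arrayClass.getComponentType()); // class java.lang.Integer
        System.out.println(arrayClass == Integer[].class); // true
    }
}
```

No Class file exists for Integer[]; only the element type java.lang.Integer is loaded from one, and the array class itself is created by the virtual machine.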
2. Field Resolution
To resolve a field symbolic reference that has not been resolved before, the virtual machine first resolves the CONSTANT_Class_info symbolic reference indexed by the class_index item of the field table entry, that is, the symbolic reference to the class or interface that the field belongs to. If any exception occurs while resolving this class or interface symbolic reference, resolution of the field symbolic reference fails. If it is resolved successfully, let C denote the class or interface that the field belongs to. The virtual machine specification requires the following steps for the subsequent field search in C:
A. If C itself contains a field whose simple name and field descriptor both match the target, a direct reference to this field is returned and the search ends.
B. Otherwise, if C implements interfaces, each interface and its superinterfaces are searched recursively according to the inheritance relationship. If an interface contains a field whose simple name and field descriptor both match the target, a direct reference to this field is returned and the search ends.
C. Otherwise, if C is not java.lang.Object, its superclasses are searched recursively, moving upward through the inheritance hierarchy. If a superclass contains a field whose simple name and field descriptor both match the target, a direct reference to this field is returned and the search ends.
D. Otherwise, the search fails and a java.lang.NoSuchFieldError is thrown.
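The search order in steps A to D can be sketched with reflection. This is an illustrative model rather than the virtual machine's own code: FieldLookupSketch and the nested types Marker, Parent, and Child are hypothetical, and a field's type stands in for its descriptor. The example is arranged so that both an interface and a superclass declare a field named X, which makes the interface-before-superclass order visible.

```java
import java.lang.reflect.Field;

class FieldLookupSketch {
    interface Marker { int X = 2; }
    static class Parent { static int X = 1; }
    static class Child extends Parent implements Marker { }

    // Follows steps A to D; throws NoSuchFieldError when nothing matches.
    static Field lookup(Class<?> c, String name, Class<?> type) {
        Field found = search(c, name, type);
        if (found == null) {
            throw new NoSuchFieldError(name); // step D
        }
        return found;
    }

    private static Field search(Class<?> c, String name, Class<?> type) {
        // Step A: fields declared by C itself.
        for (Field f : c.getDeclaredFields()) {
            if (f.getName().equals(name) && f.getType() == type) {
                return f;
            }
        }
        // Step B: each directly implemented interface and its superinterfaces.
        for (Class<?> iface : c.getInterfaces()) {
            Field f = search(iface, name, type);
            if (f != null) {
                return f;
            }
        }
        // Step C: continue with the superclass (null for Object and interfaces).
        Class<?> superclass = c.getSuperclass();
        return superclass == null ? null : search(superclass, name, type);
    }

    public static void main(String[] args) {
        Field f = lookup(Child.class, "X", int.class);
        System.out.println(f.getDeclaringClass() == Marker.class); // true
    }
}
```

In source code, javac rejects a reference to Child.X as ambiguous, so this situation normally only matters at the bytecode level.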
3. Class Method Resolution
The first step of class method resolution is the same as for field resolution: the virtual machine first resolves the symbolic reference to the class or interface that the method belongs to, indexed by the class_index item of the class method table entry. If resolution succeeds, again let C denote this class. The virtual machine then performs the subsequent class method search in the following steps:
A. The constant types for class method and interface method symbolic references are defined separately. If C, indexed by class_index in the class method table entry, turns out to be an interface, a java.lang.IncompatibleClassChangeError is thrown directly.
B. If step A passes, class C is searched for a method whose simple name and descriptor both match the target. If one exists, a direct reference to this method is returned and the search ends.
C. Otherwise, the superclasses of class C are searched recursively for a method whose simple name and descriptor both match the target. If one exists, a direct reference to this method is returned and the search ends.
D. Otherwise, the interfaces implemented by class C and their superinterfaces are searched recursively for a method whose simple name and descriptor both match the target. If a match exists, class C must be an abstract class; the search ends and a java.lang.AbstractMethodError is thrown.
E. Otherwise, the method search fails and a java.lang.NoSuchMethodError is thrown.
Finally, if the search returns a direct reference, access permission to this method is verified. If the method may not be accessed, a java.lang.IllegalAccessError is thrown.
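The steps above can be modelled with reflection in the same spirit as the field search. The sketch below follows the steps exactly as listed in these notes; MethodLookupSketch and Shape are hypothetical names, matching uses the method name and parameter types only (a real descriptor also includes the return type), and the access check is left out. Virtual machines for Java 8 and later treat superinterface methods differently because of default methods, so step D should be read as a description of the older rule.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

class MethodLookupSketch {
    // Abstract class that inherits compareTo from an interface without implementing it.
    static abstract class Shape implements Comparable<Shape> { }

    static Method lookup(Class<?> c, String name, Class<?>... params) {
        // Step A: a class method reference must not point at an interface.
        if (c.isInterface()) {
            throw new IncompatibleClassChangeError(c.getName());
        }
        // Steps B and C: C itself, then its superclasses.
        for (Class<?> k = c; k != null; k = k.getSuperclass()) {
            Method m = declared(k, name, params);
            if (m != null) {
                return m;
            }
        }
        // Step D: found only in an implemented interface.
        if (declaredInInterfaces(c, name, params)) {
            throw new AbstractMethodError(name);
        }
        // Step E: not found anywhere.
        throw new NoSuchMethodError(name);
    }

    private static Method declared(Class<?> k, String name, Class<?>[] params) {
        for (Method m : k.getDeclaredMethods()) {
            if (m.getName().equals(name)
                    && Arrays.equals(m.getParameterTypes(), params)) {
                return m;
            }
        }
        return null;
    }

    private static boolean declaredInInterfaces(Class<?> c, String name, Class<?>[] params) {
        for (Class<?> k = c; k != null; k = k.getSuperclass()) {
            for (Class<?> iface : k.getInterfaces()) {
                if (declared(iface, name, params) != null
                        || declaredInInterfaces(iface, name, params)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(lookup(Shape.class, "toString").getDeclaringClass()); // class java.lang.Object
        try {
            lookup(Shape.class, "compareTo", Object.class);
        } catch (AbstractMethodError e) {
            System.out.println("AbstractMethodError for " + e.getMessage()); // AbstractMethodError for compareTo
        }
    }
}
```

Compared with field resolution, the order is reversed: superclasses are searched before interfaces.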
4. Interface Method Resolution
Interface method resolution likewise first resolves the symbolic reference to the class or interface that the method belongs to, indexed by the class_index item of the interface method table entry. If resolution succeeds, again let C denote this interface. The virtual machine then performs the subsequent interface method search in the following steps:
A. In contrast to class method resolution, if C, indexed by class_index in the interface method table entry, turns out to be a class rather than an interface, a java.lang.IncompatibleClassChangeError is thrown directly.
B. Otherwise, interface C is searched for a method whose simple name and descriptor both match the target. If one exists, a direct reference to this method is returned and the search ends.
C. Otherwise, the superinterfaces of interface C are searched recursively, up to the java.lang.Object class, for a method whose simple name and descriptor both match the target. If one exists, a direct reference to this method is returned and the search ends.
D. Otherwise, the method search fails and a java.lang.NoSuchMethodError is thrown.
Because all methods in an interface are public by default, there is no access permission problem, so symbolic resolution of an interface method should not throw java.lang.IllegalAccessError.
V. Initialization
Initialization is the last step of the class loading process. In the preceding class loading actions, apart from the loading stage, in which user applications can participate through custom class loaders, everything is directed and controlled by the virtual machine. Only in the initialization stage does the Java program code (that is, the bytecode) defined in the class actually begin to execute.
In the preparation stage, variables have already been assigned the initial values required by the system. In the initialization stage, class variables and other resources are initialized according to the plan the programmer expressed in the program code. Put another way, the initialization stage is the process of executing the class constructor method <clinit>(). The following features and details of how <clinit>() executes may affect the run-time behavior of a program:
1. The <clinit>() method is generated by the compiler, which automatically collects the assignment actions of all class variables in the class and merges them with the statements in static statement blocks (static {} blocks). The order in which the compiler collects them is determined by the order in which the statements appear in the source file. A static statement block can access only the variables defined before it; a variable defined after the block can be assigned in the block but cannot be read there.
2. The <clinit>() method differs from the class's constructor (the instance constructor <init>() method): it does not need to call the superclass constructor explicitly. The virtual machine guarantees that the <clinit>() method of the superclass has finished before the <clinit>() method of the subclass executes. Therefore, the first <clinit>() method executed in the virtual machine must belong to java.lang.Object.
3. Because the <clinit>() method of the superclass executes first, the static statement blocks defined in the superclass take precedence over the class variable assignments of the subclass.
4. The <clinit>() method is not mandatory for a class or interface. If a class has no static statement blocks and no assignments to class variables, the compiler need not generate a <clinit>() method for it.
5. An interface cannot contain static statement blocks, but it can still have assignments that initialize variables, so a <clinit>() method is generated for interfaces just as for classes. Unlike a class, however, executing the <clinit>() method of an interface does not require first executing the <clinit>() method of its superinterface; the superinterface is initialized only when a variable defined in it is used. Likewise, a class that implements an interface does not execute the interface's <clinit>() method when the class is initialized.
6. The virtual machine guarantees that the <clinit>() method of a class is properly locked and synchronized in a multi-threaded environment. If multiple threads try to initialize a class at the same time, only one thread executes the class's <clinit>() method; the others block and wait until the active thread has finished executing it. If the <clinit>() method of a class contains a long-running operation, multiple threads may be blocked, and in real applications this kind of blocking is often well hidden.
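Points 2, 3, and 6 can be checked with a small experiment. In the sketch below (InitOrderDemo, Parent, Child, and LOG are names invented for the example), several threads touch Child.b at the same time. The log shows that the superclass initializer ran first and that each initializer ran exactly once, and Child.b ends up with the value the superclass's static block assigned.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class InitOrderDemo {
    // Records which static initializers ran, in order.
    static final List<String> LOG =
            Collections.synchronizedList(new ArrayList<String>());

    static class Parent {
        static int a = 1;
        static {
            a = 2;
            LOG.add("Parent");
        }
    }

    static class Child extends Parent {
        static int b = a; // reads Parent.a after Parent's <clinit>() has finished
        static {
            LOG.add("Child");
        }
    }

    // Starts several threads that all trigger initialization of Child.
    static int touchFromManyThreads(int threadCount) {
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> { int ignored = Child.b; });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return Child.b;
    }

    public static void main(String[] args) {
        System.out.println(touchFromManyThreads(4)); // 2
        System.out.println(LOG);                     // [Parent, Child]
    }
}
```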