Java Virtual Machine Class Loading Mechanism: Reading Notes on "In-Depth Understanding of the Java Virtual Machine"
The class loading mechanism of the Java Virtual Machine is the process by which the VM loads the data describing a class from a Class file into memory, then verifies, converts, resolves, and initializes that data, finally producing a Java type that the VM can use directly.
Java's dynamic extensibility relies on dynamic loading and dynamic linking at run time.
The lifecycle of a class begins when it is loaded into VM memory and ends when it is unloaded from memory. It has seven stages: loading, verification, preparation, resolution (also translated as parsing), initialization, using, and unloading. Verification, preparation, and resolution are collectively called linking. The order of loading, verification, preparation, initialization, and unloading is fixed, and the class loading process must start these stages in that order. They do not have to run or finish strictly one after another, though, because the stages are usually interleaved: one stage is typically invoked or activated while another is executing. In some cases the resolution stage can begin after the initialization stage, which supports the run-time (dynamic) binding of the Java language.
The Virtual Machine specification strictly stipulates that a class must be initialized immediately in exactly five cases, which are called active references to the class:
(1) When one of the four bytecode instructions new, getstatic, putstatic, or invokestatic is encountered and the class has not been initialized, its initialization must be triggered first.
(2) When a method of the java.lang.reflect package makes a reflective call on the class and the class has not been initialized, its initialization must be triggered first.
(3) When a class is being initialized and its parent class has not been initialized, the initialization of the parent class must be triggered first.
(4) When the virtual machine starts, the user specifies a main class to execute, and the virtual machine initializes this main class first.
(5) When the dynamic language support of JDK 1.7 is used: if the final resolution result of a java.lang.invoke.MethodHandle instance is a method handle of kind REF_getStatic, REF_putStatic, or REF_invokeStatic, and the class corresponding to that method handle has not been initialized, its initialization must be triggered first.
Every other way of referencing a class does not trigger initialization and is called a passive reference. For example:
(1) Referencing a static field of the parent class through a subclass does not initialize the subclass. For a static field, only the class that directly defines the field is initialized.
(2) Referencing a class through an array definition does not trigger initialization of that class.
(3) Constants are stored in the constant pool of the calling class at compile time. The caller therefore holds no direct reference to the class that defines the constant, so initialization of the defining class is not triggered.
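The three passive references above can be observed with a small experiment. The following is a minimal sketch, not code from the book: the class names (PassiveReferenceDemo, SuperClass, SubClass, ArrayElement, ConstClass) and the LOG list are made up for illustration. Each static block records its class name when the class is initialized.

```java
import java.util.ArrayList;
import java.util.List;

class PassiveReferenceDemo {
    // Hypothetical log: every static initializer below appends its class name.
    static final List<String> LOG = new ArrayList<>();

    static class SuperClass {
        static int value = 123;                 // ordinary static field, not a constant
        static { LOG.add("SuperClass"); }
    }

    static class SubClass extends SuperClass {
        static { LOG.add("SubClass"); }
    }

    static class ArrayElement {
        static { LOG.add("ArrayElement"); }
    }

    static class ConstClass {
        static final String HELLO = "hello world";  // compile-time constant
        static { LOG.add("ConstClass"); }
    }

    static List<String> run() {
        int v = SubClass.value;                      // (1) initializes SuperClass only
        ArrayElement[] array = new ArrayElement[10]; // (2) creates an array, no initialization
        String s = ConstClass.HELLO;                 // (3) constant was copied into this class
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run());                   // prints [SuperClass]
    }
}
```

Only SuperClass appears in the log: SubClass, ArrayElement, and ConstClass are never initialized by these references.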
Interface loading differs slightly from class loading. An interface also has an initialization process, and in this respect it is consistent with a class. A class can use a static statement block "static {}" to output initialization information; an interface cannot contain a "static {}" block, but the compiler still generates a "<clinit>()" class constructor for the interface to initialize the member variables defined in it. The real difference is this: initializing a class requires that all of its parent classes have already been initialized, whereas initializing an interface does not require all of its parent interfaces to be initialized. A parent interface is initialized only when it is actually used.
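The difference between class and interface initialization can be checked with a sketch like the one below. All names here (InterfaceInitDemo, Parent, Child, Impl, record) are hypothetical. The interface fields are initialized through a method call so that they are not compile-time constants, which forces the compiler to emit a <clinit>() for each interface. The interfaces deliberately declare no default methods.

```java
import java.util.ArrayList;
import java.util.List;

class InterfaceInitDemo {
    static final List<String> LOG = new ArrayList<>();

    // Records that the named type ran its initializer.
    static int record(String name) {
        LOG.add(name);
        return LOG.size();
    }

    interface Parent {
        int PARENT_MARK = record("Parent");   // evaluated in Parent's <clinit>()
    }

    interface Child extends Parent {
        int CHILD_MARK = record("Child");     // evaluated in Child's <clinit>()
    }

    static class Impl implements Child {
        static { record("Impl"); }
        static void touch() { }
    }

    static List<String> run() {
        Impl.touch();                  // initializes Impl, but neither interface
        int mark = Child.CHILD_MARK;   // initializes Child, but not its parent interface
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run());     // prints [Impl, Child]
    }
}
```

Parent never appears in the log because nothing in the program actually uses it.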
Class loading process:
In the loading phase, the Java Virtual Machine must complete the following three tasks:
A. Obtain the binary byte stream that defines the class, using the fully qualified name of the class.
B. Convert the static storage structure represented by that byte stream into the run-time data structure of the method area.
C. Generate a java.lang.Class object representing the class in the Java heap, which serves as the access entry to the class data in the method area.
The loading phase of a non-array class (more precisely, the action of obtaining the binary byte stream of the class during loading) is the part developers can control most. Array classes are different: an array class itself is not created by a class loader but directly by the Java Virtual Machine, although the element type of the array class is ultimately loaded by a class loader. For the rules for creating an array class, see p. 215 of the book.
After the loading phase is complete, the binary byte stream from outside the virtual machine is stored in the method area in the format the virtual machine requires. The data storage format of the method area is defined by the virtual machine implementation.
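The special treatment of array classes shows up in what Class.getClassLoader() reports. A minimal sketch, with ArrayClassLoaderDemo and Element as made-up names: an array of primitives has no class loader, and an array of a reference type reports the class loader of its element type.

```java
class ArrayClassLoaderDemo {
    static class Element { }

    // An array class with a primitive element type is not associated with any loader.
    static boolean primitiveArrayHasNoLoader() {
        return int[].class.getClassLoader() == null;
    }

    // An array class with a reference element type reports the element type's loader.
    static boolean arraySharesElementLoader() {
        return Element[].class.getClassLoader() == Element.class.getClassLoader();
    }

    public static void main(String[] args) {
        System.out.println(primitiveArrayHasNoLoader());
        System.out.println(arraySharesElementLoader());
    }
}
```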
Verification phase:
Verification is the first step of linking. Its purpose is to ensure that the byte stream of the Class file meets the requirements of the current virtual machine and does not endanger the security of the virtual machine itself. The Java language is relatively safe: pure Java code cannot, for example, access data outside the bounds of an array, because such an attempt is rejected by the compiler or stopped by an exception at run time. A Class file, however, does not have to be compiled from Java source and can be produced in any way. If the virtual machine trusted the input byte stream without checking it, loading a harmful byte stream could well crash the system, so verification is an important way for the virtual machine to protect itself.
Four phases of verification:
1. File format verification:
Checks whether the file starts with the magic number 0xCAFEBABE, whether the major and minor version numbers are within the range the current virtual machine can handle, whether the constant pool contains constants of unsupported types, and so on. The main purpose is to ensure that the input byte stream can be parsed correctly and stored in the method area.
2. Metadata verification:
Performs semantic analysis on the information described by the bytecode to ensure that it complies with the Java language specification. The checks include: whether the class has a parent class, whether the parent class is a class that may not be inherited, and, if the class is not abstract, whether it implements all the methods required by its parent class or interfaces.
3. Bytecode verification:
The main purpose is to determine, through data flow and control flow analysis, that the program semantics are legal and logical. After the data types in the metadata have been checked in the second phase, this phase verifies and analyzes the method bodies of the class. A method body that fails bytecode verification definitely has a problem, but a method body that passes is still not guaranteed to be safe.
Because data flow verification is highly complex, and to avoid spending too much time on it, an attribute named "StackMapTable" is added to the attribute table of the Code attribute of a method body. It describes the state of the local variable table and the operand stack at the start of every basic block in the method body, so the virtual machine only needs to check that these recorded states are legal instead of deducing them from the program (see p. 218 of the book).
4. Symbolic reference verification:
The purpose is to ensure that resolution can execute normally. This check takes place when the virtual machine converts symbolic references into direct references, that is, during the resolution phase.
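The first items of file format verification, the magic number and the version numbers, sit in the first eight bytes of every Class file and are easy to inspect. The sketch below is only an illustration of where those fields live, not how a virtual machine verifies them. ClassFileHeaderDemo and readHeader are hypothetical names, and the helper assumes a top-level class whose Class file is reachable through Class.getResourceAsStream.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeaderDemo {
    // Returns {magic, minorVersion, majorVersion} read from the class file of `type`.
    static int[] readHeader(Class<?> type) {
        String resource = type.getSimpleName() + ".class";  // resolved relative to the package
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();            // u4 magic
            int minor = data.readUnsignedShort();  // u2 minor_version
            int major = data.readUnsignedShort();  // u2 major_version
            return new int[] { magic, minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader(Object.class);
        System.out.printf("magic=0x%08X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```

The major version printed depends on the JDK that compiled java.lang.Object, so it varies from one installation to another.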
Preparation phase:
The preparation phase formally allocates memory for class variables and sets their initial values. Class variables here mean static variables only, not instance variables; instance variables are allocated on the Java heap together with the object when the object is instantiated. The initial value is normally the zero value of the data type. If the attribute table of the field contains a ConstantValue attribute, however, the variable is initialized during the preparation phase to the value that the attribute specifies.
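The zero value set during preparation can be made visible by reading a static field before its assignment has run. This is a minimal sketch with made-up names (PreparationDemo, peekValue, peekConstant). Note one caveat: for the static final field, javac also copies the constant into the code that reads it, so the sketch shows that a constant never appears as zero but cannot isolate the ConstantValue attribute on its own.

```java
class PreparationDemo {
    // Runs first during initialization: `value` still holds its zero value.
    static int seenBeforeAssignment = peekValue();
    static int value = 123;                 // assigned later, during initialization

    // Reads the constant before its declaration line has been reached.
    static int constantSeenEarly = peekConstant();
    static final int CONSTANT = 123;        // compile-time constant

    static int peekValue() {
        return value;
    }

    static int peekConstant() {
        return CONSTANT;
    }

    public static void main(String[] args) {
        System.out.println(seenBeforeAssignment);  // 0
        System.out.println(value);                 // 123
        System.out.println(constantSeenEarly);     // 123
    }
}
```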
Resolution (parsing) phase:
Resolution is the process in which the virtual machine replaces the symbolic references in the constant pool with direct references. A symbolic reference is independent of the memory layout of the virtual machine implementation, whereas a direct reference depends on it. Resolving the same symbolic reference several times is very common. For every instruction except invokedynamic, the virtual machine may cache the result of the first resolution to avoid repeating the work. This rule does not hold for the invokedynamic instruction: because it exists to support dynamic languages, its resolution must wait until the program actually reaches that instruction.
The resolution process covers four kinds of references:
1. Class or interface resolution
2. Field resolution
3. Class method resolution
4. Interface method resolution
Initialization phase:
Initialization is the last step of class loading. Only at this stage does the virtual machine actually start executing the Java program code defined in the class. In the preparation stage, variables were already assigned the initial values required by the system; the initialization stage is the process of executing the class constructor <clinit>() method.
First, here are the features and details of <clinit>() execution that may affect how a program runs:
(1) The <clinit>() method is generated by the compiler, which automatically collects the assignments to all class variables in the class and merges them with the statements in static statement blocks. The order in which the compiler collects them is the order in which the statements appear in the source file. A static statement block can only access variables defined before it. A variable defined after the block can be assigned in the block but cannot be read there.
(2) The <clinit>() method differs from a class's constructor (the instance constructor <init>()). It does not need to call the parent class constructor explicitly, because the virtual machine guarantees that the parent class's <clinit>() has finished before the subclass's <clinit>() is executed.
(3) Because the parent class's <clinit>() is executed first, the static statement blocks of the parent class take precedence over the variable assignments of the subclass.
(4) The <clinit>() method is not mandatory for a class or interface. If a class has no static statement block and no assignment to a class variable, the compiler does not generate this method for it.
(5) Executing the <clinit>() method of an interface does not require executing the <clinit>() method of its parent interface first.
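Points (1) to (3) can be illustrated with a short sketch. The class names (ClinitOrderDemo, Parent, Sub, ForwardAssign) are chosen here for illustration only.

```java
class ClinitOrderDemo {
    static class Parent {
        static int a = 1;
        static { a = 2; }    // collected after the assignment above, in source order
    }

    static class Sub extends Parent {
        static int b = a;    // the parent's initializer has already finished, so this reads 2
    }

    static class ForwardAssign {
        static {
            i = 5;           // assigning a variable declared later is allowed
            // System.out.println(i);  // reading it here does not compile (illegal forward reference)
        }
        static int i = 1;    // runs after the block, so the final value is 1
    }

    public static void main(String[] args) {
        System.out.println(Sub.b);            // 2
        System.out.println(ForwardAssign.i);  // 1
    }
}
```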
Class loading in the Java Virtual Machine is carried out by class loaders. For any class, its uniqueness within the Java Virtual Machine is established jointly by the class itself and the class loader that loads it. In other words, comparing two classes for equality is meaningful only when both were loaded by the same class loader.
(1) Bootstrap ClassLoader: the bootstrap (startup) class loader. It loads into virtual machine memory the class libraries that are stored in the %JAVA_HOME%\lib directory or in the path specified by the -Xbootclasspath parameter and that the virtual machine recognizes. Recognition is by file name only, such as rt.jar; a library whose name is not recognized is not loaded even if it is placed in that directory. The bootstrap class loader cannot be referenced directly by Java programs.
(2) Extension ClassLoader: the extension class loader, implemented by sun.misc.Launcher$ExtClassLoader. It loads all class libraries in the %JAVA_HOME%\lib\ext directory or in the paths specified by the java.ext.dirs system variable. Developers can use the extension class loader directly.
(3) Application ClassLoader: the application class loader, implemented by sun.misc.Launcher$AppClassLoader. It loads the class libraries specified on the user class path (classpath). It is the return value of the getSystemClassLoader() method of ClassLoader. Developers can use the application class loader directly, and if a program defines no class loader of its own, it is the default class loader of the program.
Note: although the three class loaders provided by the JDK stand in a parent-child relationship, that relationship is implemented with composition rather than inheritance.
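The parent chain of the built-in loaders can be walked with getParent(). The sketch below uses made-up names (LoaderHierarchyDemo, chain). The loader class names it prints depend on the JDK: the sun.misc.Launcher names described above apply to older JDKs, and newer releases report different implementation classes, so no fixed output is shown. The bootstrap class loader is represented by null, which is why the loop ends there.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderHierarchyDemo {
    // Walks from the application (system) class loader up to the bootstrap loader.
    static List<String> chain() {
        List<String> names = new ArrayList<>();
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent();   // composition: each loader holds its parent
        }
        names.add("(bootstrap)");          // getParent() returned null
        return names;
    }

    public static void main(String[] args) {
        for (String name : chain()) {
            System.out.println(name);
        }
        // Classes loaded by the bootstrap loader report null as their class loader.
        System.out.println(String.class.getClassLoader() == null);
    }
}
```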
Since JDK 1.2, developers have been recommended to load classes using the Parents Delegation Model. The loading process is as follows:
(1) When a class loader receives a class loading request, it does not try to load the class itself first; it delegates the request to its parent class loader.
(2) The class loader at every level does the same, so every loading request is eventually passed up to the top-level bootstrap class loader.
(3) If the top-level bootstrap class loader cannot complete the request, the child loader below it tries to load the class itself, and so on down the chain. If the class loader that originally received the request cannot complete it either, it throws ClassNotFoundException; it does not ask any child loader of its own to try.
The advantage of the parents delegation model is that Java classes acquire a priority hierarchy together with their class loaders. The more basic a class is, the higher the loader that loads it, so basic classes stay the same in every class loader environment and Java programs run more stably. The parents delegation model is implemented as follows:
    protected synchronized Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // First, check whether the requested class has already been loaded.
        Class<?> c = findLoadedClass(name);
        if (c == null) {
            try {
                if (parent != null) {
                    // Delegate to the parent class loader.
                    c = parent.loadClass(name, false);
                } else {
                    // No parent: delegate to the bootstrap class loader.
                    c = findBootstrapClassOrNull(name);
                }
            } catch (ClassNotFoundException e) {
                // The parent class loader could not complete the loading request.
            }
            if (c == null) {
                // Fall back to this class loader's own findClass().
                c = findClass(name);
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }
To implement a custom class loader, you only need to extend the java.lang.ClassLoader class and override its findClass() method. The basic job of java.lang.ClassLoader is to find or generate the bytecode for a class with a given name and then define a Java class from that bytecode, that is, an instance of java.lang.Class. ClassLoader is also responsible for loading the resources a Java application needs, such as files and configuration files. The ClassLoader methods related to loading classes are as follows:
Method | Description
getParent() | Returns the parent class loader of this class loader.
loadClass(String name) | Loads the class whose binary name is name. The result is an instance of java.lang.Class.
findClass(String name) | Finds the class named name. The result is an instance of java.lang.Class.
findLoadedClass(String name) | Finds the already loaded class named name. The result is an instance of java.lang.Class.
resolveClass(Class c) | Links the specified Java class.
Note: before JDK 1.2, class loading had no parents delegation model, so a custom class loader usually overrode the loadClass method and had to supply any delegation logic itself. Since JDK 1.2, the parents delegation model is built into the class loading system, and a custom class loader no longer needs to write the delegation logic on its own. Overriding loadClass is therefore discouraged; override the findClass method instead.
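The recommendation to override only findClass can be tried out with a loader that merely records when findClass is reached. This is a sketch under stated assumptions, not a working loader: FindClassOnlyLoader and trace are hypothetical names, "no.such.Type" is a deliberately nonexistent class name, and a real implementation would obtain bytes in findClass and call defineClass instead of throwing.

```java
import java.util.ArrayList;
import java.util.List;

class FindClassOnlyLoader extends ClassLoader {
    // Names for which the inherited loadClass() fell back to findClass().
    final List<String> findClassCalls = new ArrayList<>();

    FindClassOnlyLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Reached only after the whole parent chain has failed to load the class.
        findClassCalls.add(name);
        throw new ClassNotFoundException(name);
    }

    // Loads one class the parents can find and one they cannot, then reports the fallbacks.
    static List<String> trace() {
        FindClassOnlyLoader loader = new FindClassOnlyLoader(ClassLoader.getSystemClassLoader());
        try {
            Class<?> string = loader.loadClass("java.lang.String");
            if (string != String.class) {
                throw new IllegalStateException("String was not loaded by delegation");
            }
            loader.loadClass("no.such.Type");
        } catch (ClassNotFoundException expected) {
            // Thrown by findClass() above for the nonexistent class.
        }
        return loader.findClassCalls;
    }

    public static void main(String[] args) {
        System.out.println(trace());   // prints [no.such.Type]
    }
}
```

java.lang.String is returned by the parent chain without findClass ever being consulted, which is exactly the delegation order that the inherited loadClass provides.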
In Java, the uniqueness of any class within the Java Virtual Machine is determined jointly by the class itself and the class loader that loads it. Comparing two classes for equality is therefore meaningful only when both were loaded by the same class loader. Otherwise, even if the two classes come from the same Class file, they are not equal as long as the class loaders that loaded them differ. Equality here covers the results of the equals(), isAssignableFrom(), and isInstance() methods of the Class object that represents the class, as well as the instanceof keyword. The sample code is as follows:
    package com.test;

    import java.io.IOException;
    import java.io.InputStream;

    public class ClassLoaderTest {
        public static void main(String[] args) throws Exception {
            // Custom class loader implemented as an anonymous inner class.
            // It overrides loadClass() on purpose, so that it defines classes found
            // next to it itself instead of delegating them to the application class loader.
            ClassLoader myClassLoader = new ClassLoader() {
                @Override
                public Class<?> loadClass(String name) throws ClassNotFoundException {
                    try {
                        // Derive the class file name from the class name.
                        String filename = name.substring(name.lastIndexOf(".") + 1) + ".class";
                        InputStream in = getClass().getResourceAsStream(filename);
                        if (in == null) {
                            // Not found here (for example java.lang.Object): delegate as usual.
                            return super.loadClass(name);
                        }
                        byte[] b = new byte[in.available()];
                        in.read(b);
                        return defineClass(name, b, 0, b.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name);
                    }
                }
            };
            Object obj = myClassLoader.loadClass("com.test.ClassLoaderTest").newInstance();
            System.out.println(obj.getClass());
            System.out.println(obj instanceof com.test.ClassLoaderTest);
        }
    }
The output is as follows:
class com.test.ClassLoaderTest
false
instanceof returns false because the com.test.ClassLoaderTest class named in the source code is loaded by the Application ClassLoader by default, whereas obj is an instance of the class loaded by the custom class loader. The two classes were loaded by different class loaders, so they are not equal.
The parents delegation model of class loaders was introduced in JDK 1.2 and is only a recommended model, not a mandatory one. There are therefore some cases that do not follow it:
(1) Before JDK 1.2, a custom class loader had to override the loadClass method to implement class loading. After JDK 1.2 introduced the parents delegation model, the loadClass method delegates to the parent class loader, and only when the parent class loader cannot complete the request does it call its own findClass method. loadClass overrides written before JDK 1.2 therefore do not follow the parents delegation model. Since JDK 1.2, overriding loadClass is not recommended; override only the findClass method.
(2) The parents delegation model solves the problem of keeping basic classes uniform across class loaders: the more basic a class is, the higher the class loader that loads it. However, when a basic class needs to call back into lower-level user code, it cannot delegate to a child class loader to load that code. To solve this problem, the JDK introduced the thread context class loader, which can be set through the setContextClassLoader method of the Thread class.
Java EE is only a specification: Sun provides the interface specifications, and the implementations come from various vendors. For JNDI, JDBC, JAXB, and similar services, code in the JDK class library must therefore call third-party implementation libraries, and it does so through the thread context class loader. This use of the thread context class loader does not follow the parents delegation model.
(3) In recent years, hot code replacement, hot module deployment, and similar needs have required code modules to be plug-and-play without restarting the Java Virtual Machine, which gave rise to OSGi. In OSGi the class loader system develops into a mesh structure, so OSGi does not fully follow the parents delegation model.
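For case (2), the thread context class loader is read and replaced through java.lang.Thread. The sketch below only demonstrates that mechanism. ContextLoaderDemo and swapAndRestore are hypothetical names, and the empty URLClassLoader merely stands in for a loader that would know about vendor-provided implementation classes.

```java
import java.net.URL;
import java.net.URLClassLoader;

class ContextLoaderDemo {
    // Installs a stand-in loader as the context class loader, then restores the original.
    static boolean swapAndRestore() {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        // Stand-in for a loader that can see third-party implementation libraries.
        ClassLoader pluginLoader = new URLClassLoader(new URL[0], original);
        current.setContextClassLoader(pluginLoader);
        try {
            // Library code running on this thread can now ask for the context loader,
            // even if that library code itself was loaded by a higher-level loader.
            return Thread.currentThread().getContextClassLoader() == pluginLoader;
        } finally {
            current.setContextClassLoader(original);   // always put the original back
        }
    }

    public static void main(String[] args) {
        System.out.println(swapAndRestore());
    }
}
```

Restoring the original loader in a finally block keeps the change from leaking into unrelated code that later runs on the same thread.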