In-depth understanding of class loading mechanisms
Description
Before starting, I would like to explain the purpose and benefit of writing these articles. Mainly, they let me summarize what I have learned. Sometimes I spend a long time learning something and straightening out my ideas, but a few days later my memory of it has blurred, and I have to spend time finding and rereading the relevant material again. Writing a summary like this consolidates and deepens new knowledge, and when I look back later there is a single place with the material and the general line of thought, so I can restore my memory quickly. I used to take notes in a notebook or write directly in a book, but those are hard to search later and easy to lose, which is why I write my own blog.
Overview
This article collates and summarizes material I found online while studying the JVM class loading mechanism; the references are given in the text. There is a lot of material, so here I summarize the general process and fill in the explanation of many conceptual details.
I am going to cover the JVM class loading mechanism in two articles: one mainly about the life cycle of classes in the JVM, and the other focused on the ClassLoader. The class loader gets its own article because class loading is the only part of the process that we can intervene in through our own code, while the rest is done directly inside the JVM. After class loading there will be an article on reflection, because the two are closely related. If there is a chance, I also want to say a bit about bytecode.
Before we start, let's look at two diagrams.
The first is the execution flowchart of a Java program.
The second is the approximate physical structure diagram of the JVM.
This article covers only part of what the two diagrams show, not all of it; a general impression of them is enough.
Class loading mechanism concepts
*The Java virtual machine loads the data describing a class from the class file into memory, verifies it, resolves and initializes it, and finally forms a Java type that the virtual machine can use directly. This is the class loading mechanism of the virtual machine.*
When a class file is loaded by a class loader, a meta-information object describing the class structure is created in the JVM. Through this object you can learn the structure of the class, such as its constructors, fields, and methods. Java lets the user indirectly invoke the functionality of the class through this meta-information object, which is the Class object we often see.
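To make the idea of the meta-information object concrete, here is a minimal sketch that inspects a class and calls one of its methods purely through its Class object. The Person class and the nameViaReflection helper are hypothetical names invented for this illustration; only the java.lang.reflect calls are standard API.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;

class ClassMetaDemo {
    // Hypothetical sample class, used only for illustration.
    static class Person {
        private final String name;
        Person(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Builds a Person and calls getName() using only the Class object.
    static String nameViaReflection(String name) {
        try {
            Class<?> type = Person.class;
            Constructor<?> ctor = type.getDeclaredConstructor(String.class);
            Object person = ctor.newInstance(name);
            Method getter = type.getDeclaredMethod("getName");
            return (String) getter.invoke(person);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> type = Person.class;
        // The Class object exposes the structure of the loaded class.
        for (Constructor<?> c : type.getDeclaredConstructors()) {
            System.out.println("constructor: " + c);
        }
        for (Field f : type.getDeclaredFields()) {
            System.out.println("field: " + f.getName());
        }
        for (Method m : type.getDeclaredMethods()) {
            System.out.println("method: " + m.getName());
        }
        System.out.println(nameViaReflection("Ada"));
    }
}
```

The helper never names the constructor or getName() in ordinary source code; everything goes through the Class object, which is the indirect invocation described above.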
The life cycle of a class, from being loaded into the virtual machine's memory to being unloaded from it, consists of seven phases, which begin in this order: loading, verification, preparation, resolution, initialization, using, and unloading. Verification, preparation, and resolution are collectively called linking.
Working mechanism
A class loader is the component that locates the bytecode file of a class and constructs the object that represents the class inside the JVM. In Java, a class loader brings a class into the JVM through the following steps:
(1) Load: find and import the class file.
(2) Link: merge the binary data of the class into the run-time state of the JVM.
(a) Verify: check the correctness of the loaded class file data.
(b) Prepare: allocate storage space for the static variables of the class.
(c) Resolve: convert symbolic references to direct references.
(3) Initialize: run the initialization of the class's static variables and static code blocks.
Java programs can be extended dynamically because loading and linking happen at run time. For example, an application written against an interface can wait until run time to specify the actual implementation (polymorphism). Resolution can sometimes be carried out after initialization, which is what supports dynamic binding.
The order of the five phases of loading, verification, preparation, initialization, and unloading is fixed, and class loading must begin in this order. The resolution phase is the exception: in some cases it can start after the initialization phase.
Note that the phases only have to begin in this order. They are usually interleaved, with one phase often invoking or activating another while it is still executing.
Details
In the material I consulted, class initialization is usually introduced first, but I think that causes confusion about what initialization is and when it happens. Here I describe each phase in the order of the class loading life cycle.
1. Loading
What class loading is
Class loading means reading the binary data in a class's .class file into memory, placing it in the method area of the run-time data area, and then creating a java.lang.Class object in the heap that encapsulates the class's data structures in the method area. The end product of class loading is this Class object in the heap, which provides the Java programmer with an interface for accessing the data structures in the method area.
A class loader does not have to wait until a class is "first actively used" before loading it. The JVM specification allows a class loader to preload a class when it expects the class to be used. If a .class file is missing or contains an error during preloading, the class loader must not report the error (a LinkageError) until the program first actively uses the class. If the program never actively uses the class, the class loader does not report the error at all.
The ways to load a .class file are:
1. Load it directly from the local file system
2. Download the .class file over the network
3. Load the .class file from an archive such as a ZIP or JAR file
4. Extract the .class file from a proprietary database
5. Dynamically compile a Java source file into a .class file
Now that we understand what class loading is, let's look at what the JVM does in the loading phase. The virtual machine needs to do three things:
1. Obtain the binary byte stream that defines the class, using the class's fully qualified name.
2. Convert the static storage structure represented by this byte stream into the run-time data structure of the method area.
3. Generate a java.lang.Class object representing the class in the Java heap, as the access entry for the data in the method area.
Compared with the other phases of class loading, the loading phase (more precisely, the action of obtaining the class's binary byte stream) is the one developers can control most. It can be done either with a system-provided class loader or with a user-defined one, so developers can define their own class loader to control how the byte stream is obtained. I will go into class loaders in detail in the next article.
When the loading phase is complete, the binary byte stream from outside the virtual machine is stored in the method area in the format the virtual machine requires, and an object of the java.lang.Class class is created in the Java heap so that the data in the method area can be accessed through it.
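The three things done in the loading phase can be sketched with a small user-defined class loader. This is only an illustration under some assumptions: ResourceClassLoader, Greeter, and loadCopy are hypothetical names, the class bytes are assumed to be readable as a classpath resource, and InputStream.readAllBytes needs Java 9 or later. The only point is that the loader itself decides where the byte stream comes from.

```java
import java.io.IOException;
import java.io.InputStream;

class ByteLoaderDemo {
    // Hypothetical class that the custom loader will load a second time.
    public static class Greeter {
        public Greeter() { }
        @Override public String toString() { return "hello from Greeter"; }
    }

    // Custom loader whose parent is the bootstrap loader, so the request
    // for Greeter falls through to findClass below.
    static class ResourceClassLoader extends ClassLoader {
        private final ClassLoader source;

        ResourceClassLoader(ClassLoader source) {
            super((ClassLoader) null);
            this.source = source;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Step 1: obtain the binary byte stream from the fully qualified name.
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = source.getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                // Steps 2 and 3: the JVM builds its run-time structures
                // and hands back the Class object.
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    // Loads a second, independent copy of Greeter through the custom loader.
    static Class<?> loadCopy() {
        try {
            ClassLoader loader = new ResourceClassLoader(Greeter.class.getClassLoader());
            return loader.loadClass(Greeter.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?> copy = loadCopy();
        System.out.println(copy.getClassLoader());
        System.out.println(copy == Greeter.class);
        System.out.println(copy.getDeclaredConstructor().newInstance());
    }
}
```

The copy has the same name as Greeter but is a different Class object, because it was defined by a different loader from the bytes that loader fetched.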
2. Verification
The purpose of verification is to ensure that the information in the byte stream of the class file meets the requirements of the current virtual machine and does not compromise the security of the virtual machine itself. Virtual machine implementations may differ in how they verify classes, but they generally perform the following four stages: file format verification, metadata verification, bytecode verification, and symbolic reference verification.
1) File format verification: checks that the byte stream conforms to the class file format specification and can be processed by the current version of the virtual machine. Its main purpose is to ensure that the input byte stream can be parsed correctly and stored in the method area. Once this stage passes, the byte stream is stored in the method area, and the remaining three stages all work on the storage structure in the method area.
2) Metadata verification: performs semantic checks on the class's metadata (in effect, checks on each data type in the class) to ensure that no metadata violates the Java language specification.
3) Bytecode verification: analyzes data flow and control flow and verifies the method bodies of the class, to ensure that the methods of the verified class do nothing at run time that endangers the security of the virtual machine.
4) Symbolic reference verification: the last stage of verification. It happens when the virtual machine converts symbolic references to direct references (a conversion performed in the resolution phase, explained later), and is mainly a matching check on information outside the class itself (the various symbolic references in the constant pool).
3. Preparation
The preparation phase formally allocates memory for class variables and sets their initial values. This memory is allocated in the method area.
Note:
1) Memory is allocated at this point only for class variables (static), not for instance variables. Instance variables are allocated in the Java heap along with the object when the object is instantiated.
2) The initial value set here is normally the zero value of the data type (such as 0, 0L, null, or false), not the value explicitly assigned in the Java code.
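The zero value set during preparation can be observed from ordinary code. In the sketch below (Holder and peek are hypothetical names for this illustration), a static initializer reads a class variable before that variable's own assignment has run, so it sees the default left by the preparation phase.

```java
class PreparationDemo {
    static class Holder {
        // Initializers run in source order, so peek() runs before the next line.
        static int early = peek();
        static int value = 123;

        static int peek() {
            return value; // still the zero value set during preparation
        }
    }

    public static void main(String[] args) {
        System.out.println(Holder.early); // prints 0
        System.out.println(Holder.value); // prints 123
    }
}
```

The explicit value 123 is assigned only during initialization, which is why early captures 0.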
4. Resolution
The resolution phase is the process in which the virtual machine replaces symbolic references in the constant pool with direct references.
Symbolic reference: a set of symbols that describes the referenced target. A symbolic reference can be a literal of any form. It is independent of the memory layout of the virtual machine implementation, and the referenced target need not already be loaded into memory.
Direct reference: a pointer directly to the target, a relative offset, or a handle that locates the target indirectly. A direct reference depends on the memory layout of the virtual machine implementation, and the same symbolic reference generally resolves to different direct references on different virtual machine instances. If a direct reference exists, the referenced target must already be in memory.
1. Class or interface resolution: determines whether the symbolic reference refers to an array type or an ordinary object type, and resolves it accordingly.
2. Field resolution: first looks in the class itself for a field whose simple name and field descriptor match the target; if one is found, the search ends. If not, the interfaces the class implements and their superinterfaces are searched recursively, following the inheritance hierarchy. If the field is still not found, the superclasses are searched recursively, following the inheritance hierarchy, until the lookup ends.
3. Class method resolution: similar to the field lookup, with an extra step that checks whether the method is declared in a class or an interface. The search looks at the superclasses first, then the interfaces.
4. Interface method resolution: similar to class method resolution, but an interface has no superclass, so only the superinterfaces are searched recursively.
5. Initialization
Class initialization is the last step of the class loading process. In the earlier phases, apart from the loading phase, in which user applications can take part through a custom class loader, every action is directed and controlled entirely by the virtual machine. Only in the initialization phase does the Java program code defined in the class actually start to execute.
Initialization assigns the correct initial values to the static variables of a class. The JVM is responsible for initializing the class, which mainly means initializing its class variables. There are two ways to set the initial value of a class variable in Java:
① Specify an initial value when declaring the class variable
② Assign the class variable an initial value in a static code block
JVM initialization steps
1. If the class has not been loaded and linked, the program loads and links it first.
2. If the class's direct superclass has not been initialized, the direct superclass is initialized first.
3. If the class has initialization statements, the system executes them in order.
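The steps above can be watched in action with a small sketch. Parent, Child, and LOG are hypothetical names for this illustration. The code uses both ways of setting an initial value (a declaration and a static block) and records the order in which the static blocks run.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static int a = 1;            // way 1: initial value at declaration
        static {                     // way 2: assignment in a static block
            a = 2;
            LOG.add("Parent static block");
        }
    }

    static class Child extends Parent {
        static int b = a;            // runs only after Parent is initialized
        static {
            LOG.add("Child static block");
        }
    }

    // Touching Child.b initializes Parent first, then Child.
    static List<String> run() {
        int ignored = Child.b;
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints [Parent static block, Child static block]
        System.out.println(Child.b); // prints 2
    }
}
```

Child.b ends up as 2, not 1, because the parent's declaration and static block both finish before the child's initializers start.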
The initialization phase is the process of executing the class constructor <clinit>() method.
1) The <clinit>() method is generated by the compiler, which automatically collects the assignments to all class variables in the class and the statements in static blocks (static{} blocks). The compiler collects them in the order in which the statements appear in the source file.
2) Unlike a class's constructor, the <clinit>() method does not need to call the superclass's version explicitly: the virtual machine guarantees that the superclass's <clinit>() method has finished before the subclass's <clinit>() method runs. So the first <clinit>() method executed in the virtual machine is always that of java.lang.Object.
3) Because the superclass's <clinit>() method runs first, static blocks defined in the superclass execute before the subclass's variable assignments.
4) The <clinit>() method is not mandatory for a class or interface. If a class has no static blocks and no assignments to class variables, the compiler need not generate a <clinit>() method for it.
5) Interfaces can contain variable assignments, so a <clinit>() method is generated for them too. Unlike a class, though, running an interface's <clinit>() method does not require running the superinterface's <clinit>() method first. The superinterface is initialized only when a variable defined in it is used. Likewise, a class that implements the interface does not run the interface's <clinit>() method when it is initialized.
6) The virtual machine ensures that a class's <clinit>() method is properly locked and synchronized in a multithreaded environment. If several threads try to initialize a class at the same time, only one thread executes the class's <clinit>() method; the others block and wait until that thread has finished. A long-running operation in a class's <clinit>() method can therefore leave multiple threads blocked.
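The locking behaviour of <clinit>() can be sketched as follows. Slow, INIT_COUNT, and initCountAfterRace are hypothetical names for this illustration, and the 100 ms sleep merely stands in for a lengthy operation. Several threads race to use the class, yet the static block runs once.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ConcurrentInitDemo {
    static final AtomicInteger INIT_COUNT = new AtomicInteger();

    static class Slow {
        static {
            INIT_COUNT.incrementAndGet();
            try {
                Thread.sleep(100); // simulated lengthy initialization
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        static void touch() { }
    }

    // Starts several threads that all call a static method of Slow,
    // then reports how many times the static block ran.
    static int initCountAfterRace(int threads) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> Slow.touch());
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return INIT_COUNT.get();
    }

    public static void main(String[] args) {
        System.out.println(initCountAfterRace(4)); // prints 1
    }
}
```

While one thread sleeps inside the static block, the other threads wait for initialization to finish, which is the blocking described above.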
Triggering conditions for class initialization: a class is initialized only when it is actively used.
(1) On encountering any of the four bytecode instructions new, getstatic, putstatic, or invokestatic, if the class has not been initialized, its initialization is triggered first. The most common Java code that produces these four instructions: instantiating an object with the new keyword, reading or setting a static field of a class (except static fields modified by final whose values were placed in the constant pool at compile time), and invoking a static method of a class.
(2) When a method of the java.lang.reflect package makes a reflective call on a class, if the class has not been initialized, its initialization is triggered first.
(3) When a class is initialized and its superclass has not yet been initialized, the superclass's initialization is triggered first.
(4) When the virtual machine starts, the user specifies a main class to execute (the class that contains the main() method), and the virtual machine initializes this main class first.
Only these four situations trigger initialization; they are called active references to a class. All other ways of referencing a class do not trigger initialization and are called passive references.
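A few typical passive references can be checked with a short sketch. Base, Derived, Consts, and LOG are hypothetical names for this illustration. Each static block records its class name, so the log shows which classes were actually initialized.

```java
import java.util.ArrayList;
import java.util.List;

class PassiveRefDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Base {
        static int value = 7;
        static { LOG.add("Base"); }
    }

    static class Derived extends Base {
        static { LOG.add("Derived"); }
    }

    static class Consts {
        static final String GREETING = "hi"; // compile-time constant
        static { LOG.add("Consts"); }
    }

    static List<String> run() {
        // Static field reached through the subclass: only the declaring class
        // (Base) is initialized.
        int v = Derived.value;
        // Creating an array of a class does not initialize that class.
        Derived[] array = new Derived[2];
        // A compile-time constant is inlined by the compiler, so Consts
        // is not initialized.
        String s = Consts.GREETING;
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [Base]
    }
}
```

Only Base appears in the log. Derived and Consts are referenced but never actively used.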
These four situations are commonly described as the following six:
(1) Creating an instance of the class, that is, with new
(2) Reading a static variable of a class or interface, or assigning a value to it
(3) Invoking a static method of the class
(4) Reflection (for example, Class.forName("com.shengsiyuan.Test"))
(5) Initializing a subclass of the class, which also initializes the superclass
(6) Being designated as the startup class when the Java virtual machine starts (java Test), that is, running a main class directly with the java.exe command
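For the reflection case, it is worth noting that Class.forName has a three-argument overload whose second parameter controls whether the class is initialized. The sketch below uses hypothetical nested classes Lazy and Eager in place of a real application class: one is loaded without initialization, the other with the one-argument form.

```java
import java.util.ArrayList;
import java.util.List;

class ForNameDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Lazy {
        static { LOG.add("Lazy"); }
    }

    static class Eager {
        static { LOG.add("Eager"); }
    }

    static List<String> run() {
        try {
            ClassLoader loader = ForNameDemo.class.getClassLoader();
            // initialize = false: the class is loaded but its static block does not run.
            Class.forName(Lazy.class.getName(), false, loader);
            // The one-argument form initializes the class.
            Class.forName(Eager.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [Eager]
    }
}
```

Using Lazy.class and Eager.class to obtain the names does not initialize either class; only the one-argument Class.forName call does.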
End of the life cycle
The Java virtual machine ends its life cycle in the following cases:
1. The System.exit() method is executed
2. The program finishes executing normally
3. The program encountered an exception or error during execution and terminated abnormally
4. The Java Virtual machine process terminates due to an operating system error
Related articles:
Java virtual machine learning: the class loading mechanism
Analyzing the loading and unloading mechanisms of Java classes from the JVM