Deep understanding of the JVM (5)--Virtual machine class loading mechanism

Source: Internet
Author: User

The information described in a class file must be loaded into the virtual machine before it can be run and used. How does the virtual machine load class files, and what happens to the information in a class file once it is inside the virtual machine? This article works through these questions step by step.

Class loading process overview

From the moment a class is loaded into virtual machine memory until it is unloaded from memory, its life cycle consists of the following 7 phases:

  • Loading
  • Verification (validation)
  • Preparation
  • Resolution (parsing)
  • Initialization
  • Using
  • Unloading

The first five phases make up the class loading process and are described in detail below. Verification, preparation, and resolution are collectively called linking (connection). The 7 phases occur in the following order: loading → verification → preparation → resolution → initialization → using → unloading.

In this sequence, the order of the 5 phases loading, verification, preparation, initialization, and unloading is fixed, and class loading must begin in that order. Note that the phases begin in order rather than complete in order: they usually overlap, with one phase invoking or activating another while it is still executing. The resolution phase is the exception: in some cases it can start after the initialization phase, in order to support Java's runtime binding (also known as dynamic binding or late binding).

When class initialization is triggered

The JVM specification places no constraint on when loading, the first phase of class loading, begins; that is left to the specific JVM implementation. For the initialization phase, however, the specification strictly stipulates that there are only 5 cases in which a class must be initialized immediately (loading, verification, and preparation naturally have to start before this):

1. When one of the four bytecode instructions new, getstatic, putstatic, or invokestatic is encountered and the class has not been initialized, its initialization must be triggered first. The most common Java code that produces these 4 instructions is: instantiating an object with the new keyword, reading or setting a static field of a class (except a static field declared final whose value was placed in the constant pool at compile time), and invoking a static method of a class.
2. When a method of the java.lang.reflect package is used to make a reflective call on a class that has not been initialized, its initialization must be triggered first.
3. When a class is initialized and its parent class has not yet been initialized, initialization of the parent class must be triggered first.
4. When the virtual machine starts, the user specifies a main class to execute (the class that contains the main() method), and the virtual machine initializes this main class first.
5. When JDK 1.7's dynamic language support is used, if the final resolution result of a java.lang.invoke.MethodHandle instance is a method handle of kind REF_getStatic, REF_putStatic, or REF_invokeStatic, and the class corresponding to that method handle has not been initialized, its initialization must be triggered first.
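The triggers based on new, getstatic, invokestatic, reflection, and parent-class initialization can be observed with static blocks that record when they run. The following is only a minimal sketch: the class names (ByNew, ByStaticField, and so on) and the LOG list are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ActiveReferenceDemo {
    // Records the order in which the nested classes are initialized.
    static final List<String> LOG = new ArrayList<>();

    static class ByNew {
        static { LOG.add("ByNew"); }
    }

    static class ByStaticField {
        static int counter = 1; // not final, so reading it compiles to getstatic
        static { LOG.add("ByStaticField"); }
    }

    static class ByStaticMethod {
        static { LOG.add("ByStaticMethod"); }
        static void touch() { }
    }

    static class ByReflection {
        static { LOG.add("ByReflection"); }
        static void touch() { }
    }

    static class Base {
        static { LOG.add("Base"); }
    }

    static class Derived extends Base {
        static { LOG.add("Derived"); }
    }

    static List<String> run() {
        new ByNew();                          // new instruction
        int ignored = ByStaticField.counter;  // getstatic instruction
        ByStaticMethod.touch();               // invokestatic instruction
        try {
            // Reflective invocation of a static method
            ByReflection.class.getDeclaredMethod("touch").invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        new Derived();                        // parent class is initialized first
        return LOG;
    }

    public static void main(String[] args) {
        // ActiveReferenceDemo itself is the main class, so it is initialized before main runs.
        System.out.println(run());
        // prints [ByNew, ByStaticField, ByStaticMethod, ByReflection, Base, Derived]
    }
}
```

Each class appears in the log exactly once, because a class is initialized at most once no matter how many active references follow.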

The behavior in these 5 scenarios is called an active reference to a class. All other ways of referencing a class do not trigger initialization and are called passive references. Common examples of passive references include:

  • Referencing a static field of a parent class through a subclass does not initialize the subclass; only the class that directly defines the field is initialized.
  • Referencing a class through an array definition does not trigger initialization of that class, for example SuperClass[] sca = new SuperClass[10];.
  • Constants are stored in the constant pool of the calling class at compile time, so in essence there is no direct reference to the class that defines the constant, and initialization of that class is not triggered.
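The three passive references can be checked in one small program. This is a sketch under assumed names: SuperClass, SubClass, ConstClass, and the LOG buffer are hypothetical, and each static block simply records that its class was initialized.

```java
class PassiveReferenceDemo {
    // Collects a marker each time one of the nested classes is initialized.
    static final StringBuilder LOG = new StringBuilder();

    static class SuperClass {
        static int value = 123;
        static { LOG.append("SuperClass init;"); }
    }

    static class SubClass extends SuperClass {
        static { LOG.append("SubClass init;"); }
    }

    static class ConstClass {
        static final String HELLO = "hello"; // compile-time constant, copied into the caller
        static { LOG.append("ConstClass init;"); }
    }

    static String run() {
        SuperClass[] sca = new SuperClass[10]; // array definition: SuperClass is not initialized
        String hello = ConstClass.HELLO;       // constant: ConstClass is not initialized
        int v = SubClass.value;                // field is declared in SuperClass: only SuperClass is initialized
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints SuperClass init;
    }
}
```

Neither SubClass nor ConstClass appears in the output, and SuperClass appears only because of the static field read, not because of the array creation.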

The loading process of an interface differs slightly from that of a class. The real difference lies in the 3rd of the 5 initialization scenarios above: when a class is initialized, its parent class must already have been initialized, but when an interface is initialized, its parent interfaces are not required to have completed initialization. A parent interface is initialized only when it is actually used (for example, when a constant defined in that interface is referenced).

A detailed description of the class loading process

Loading

Loading is one phase of the class loading process; do not confuse the two terms. The virtual machine specification requires the JVM to complete the following three things during the loading phase:

1. Obtain the binary byte stream that defines the class through the fully qualified name of the class.
2. Convert the static storage structure represented by this byte stream into the runtime data structure of the method area.
3. Generate a java.lang.Class object representing the class in memory, to serve as the access entry for the various data of this class in the method area.

These three requirements are not very specific, which leaves JVM implementations a great deal of flexibility. The first requirement, for example, does not say that the binary byte stream must come from a class file; in fact it does not say where or how to obtain it at all. This has become the basis of many Java technologies, such as:

  • Reading from a ZIP package. This is very common and eventually became the basis of the JAR, EAR, and WAR formats.
  • Obtaining it from the network. The most typical application of this scenario is the applet.
  • Generating it by computation at run time. The most widely used case is dynamic proxy technology: java.lang.reflect.Proxy uses ProxyGenerator.generateProxyClass to generate the binary byte stream of the proxy class.
  • Generating it from other files. A typical scenario is a JSP application, in which the corresponding class is generated from a JSP file.
  • Reading from a database. This is relatively rare; for example, some middleware servers (such as SAP NetWeaver) can install programs into a database to distribute program code across a cluster.
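The run-time generation case is easy to see with the public java.lang.reflect.Proxy API. The sketch below asks the JDK for a proxy that implements Runnable; no class file for the proxy class exists in the program. The class name RuntimeGeneratedClassDemo and the helper newLoggingProxy are hypothetical names used only for this illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class RuntimeGeneratedClassDemo {
    // Builds a Runnable whose implementing class is generated by the JDK at run time.
    static Runnable newLoggingProxy(StringBuilder log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.append(method.getName()); // record which interface method was called
            return null;
        };
        return (Runnable) Proxy.newProxyInstance(
                RuntimeGeneratedClassDemo.class.getClassLoader(),
                new Class<?>[] { Runnable.class },
                handler);
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        Runnable proxy = newLoggingProxy(log);
        proxy.run();
        System.out.println(Proxy.isProxyClass(proxy.getClass())); // prints true
        System.out.println(proxy.getClass().getName());           // generated name, varies by JDK
        System.out.println(log);                                  // prints run
    }
}
```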

Loading of non-array classes

Compared with the other phases of class loading, the loading phase of a non-array class (more precisely, the action of obtaining the binary byte stream of the class during the loading phase) is the one developers can control most. The loading phase can be carried out either by the bootstrap class loader provided by the system or by a user-defined class loader. A developer can control how the byte stream is obtained by defining a custom class loader, that is, by overriding a class loader's loadClass() method. Class loaders are covered in the next article in the series.
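A custom loader that decides for itself where the bytes come from might look like the sketch below. Two points are assumptions of this example rather than something taken from the text: it overrides findClass() instead of loadClass(), which is the hook the ClassLoader documentation recommends for subclasses, and it reads the bytes back from the application class path purely for convenience. Payload, ResourceClassLoader, and loadPayloadWithCustomLoader are hypothetical names, and readAllBytes() needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;

class CustomLoaderDemo {
    /** Hypothetical class whose bytes the custom loader fetches itself. */
    static class Payload { }

    /** Obtains the byte stream from a class path resource and defines the class itself. */
    static class ResourceClassLoader extends ClassLoader {
        ResourceClassLoader() {
            super(null); // parent is the bootstrap loader only, so application classes reach findClass()
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in =
                     CustomLoaderDemo.class.getClassLoader().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                // Hand the byte stream to the JVM, which builds the Class object from it.
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    static Class<?> loadPayloadWithCustomLoader() {
        try {
            return new ResourceClassLoader().loadClass(Payload.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> loaded = loadPayloadWithCustomLoader();
        System.out.println(loaded.getName().equals(Payload.class.getName())); // prints true
        System.out.println(loaded == Payload.class);                          // prints false
    }
}
```

The two Class objects share a name but are distinct, because each is identified together with the class loader that defined it.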

Loading of array classes

An array class itself is not created by a class loader; it is created directly by the JVM. However, the element type of an array class (the type that remains after all dimensions of the array are removed) is ultimately loaded by a class loader. The creation of an array class C follows these rules:

  • If the component type of the array (the type that remains after one dimension is removed) is a reference type, that component type is loaded recursively using the loading process defined in this section, and the array class C is identified in the class namespace of the class loader that loads the component type. This matters because, as the next article explains, a class must be identified uniquely together with its class loader.
  • If the component type of the array is not a reference type (for example, an int[] array), the Java virtual machine marks the array class C as associated with the bootstrap class loader.
  • The visibility of an array class is the same as the visibility of its component type. If the component type is not a reference type, the visibility of the array class defaults to public.
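These rules can be probed through java.lang.Class. The sketch below is illustrative only; Element and the method names are hypothetical. A null result from getClassLoader() is how the API reports the bootstrap class loader.

```java
import java.lang.reflect.Modifier;

class ArrayClassDemo {
    /** Hypothetical package-private element type. */
    static class Element { }

    // An array of primitives is associated with the bootstrap loader (reported as null).
    static boolean primitiveArrayUsesBootstrapLoader() {
        return int[].class.getClassLoader() == null;
    }

    // An array of a reference type reports the same loader as its element type.
    static boolean referenceArraySharesLoader() {
        return Element[].class.getClassLoader() == Element.class.getClassLoader();
    }

    // The component type is the array type with one dimension removed.
    static boolean componentTypeRemovesOneDimension() {
        return Element[][].class.getComponentType() == Element[].class;
    }

    // Visibility follows the component type; a primitive array is public.
    static boolean visibilityFollowsComponentType() {
        return Modifier.isPublic(int[].class.getModifiers())
                && Modifier.isPublic(Element[].class.getModifiers())
                   == Modifier.isPublic(Element.class.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(primitiveArrayUsesBootstrapLoader());   // prints true
        System.out.println(referenceArraySharesLoader());          // prints true
        System.out.println(componentTypeRemovesOneDimension());    // prints true
        System.out.println(visibilityFollowsComponentType());      // prints true
    }
}
```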

After the loading phase is complete, the binary byte stream from outside the virtual machine is stored in the method area in the format required by the virtual machine. The data storage format of the method area is defined by the virtual machine implementation; the virtual machine specification does not prescribe a specific data structure for this area. An object of the java.lang.Class class is then instantiated in memory. The specification does not explicitly require it to be in the Java heap; in the HotSpot virtual machine the Class object is a special case: although it is an object, it is stored in the method area. This object serves as the external interface through which the program accesses the type data in the method area.

Verification

Verification is the first step of the linking phase. Its purpose is to ensure that the byte stream of the input class file can be parsed correctly and stored in the method area, that its format meets the requirements for describing a Java type, and that it does not compromise the security of the virtual machine itself. The rigor of the verification phase directly determines whether the Java virtual machine can withstand attacks by malicious code. Overall, the verification phase generally performs the following four stages of checks: file format verification, metadata verification, bytecode verification, and symbolic reference verification.

1. File Format Verification

The first stage verifies that the byte stream conforms to the class file format specification and can be processed by the current version of the virtual machine. This stage may include the following verification points:

  • Whether the stream starts with the magic number 0xCAFEBABE.
  • Whether the major and minor version numbers are within the range that the current virtual machine can process.
  • Whether the constants in the constant pool include an unsupported constant type (checked through the constant tag).
  • Whether any of the index values that point to constants refer to a constant that does not exist or that is not of the expected type.
  • Whether any part of the class file, or the file itself, has had information deleted or appended.
  • ......

Verification in this stage operates on the binary byte stream. Only after the byte stream passes this stage is it stored in the method area, so the following 3 verification stages all operate on the storage structure in the method area and no longer manipulate the byte stream directly.
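The first two checks, the magic number and the version range, can be sketched in a few lines. This is not how any particular virtual machine implements verification; it only illustrates the layout of the first 8 bytes of a class file (a 4-byte magic number, then the 2-byte minor version, then the 2-byte major version). The class name ClassFileHeaderCheck and the maxSupportedMajor parameter are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class ClassFileHeaderCheck {
    static final int MAGIC = 0xCAFEBABE;

    /**
     * Checks the first 8 bytes of a class file.
     * maxSupportedMajor is a hypothetical stand-in for the highest
     * major version the "current virtual machine" can process.
     */
    static String check(byte[] classBytes, int maxSupportedMajor) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != MAGIC) {            // u4 magic, big-endian
                return "bad magic number";
            }
            int minor = in.readUnsignedShort();     // u2 minor_version
            int major = in.readUnsignedShort();     // u2 major_version
            if (major > maxSupportedMajor) {
                return "unsupported version " + major + "." + minor;
            }
            return "ok";
        } catch (IOException e) {                   // includes EOFException for short input
            return "truncated header";
        }
    }

    public static void main(String[] args) {
        // Hand-built header: magic number, minor version 0, major version 52 (Java 8).
        byte[] header = { (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52 };
        System.out.println(check(header, 52)); // prints ok
        System.out.println(check(header, 51)); // prints unsupported version 52.0
    }
}
```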

2. Metadata Verification

The second stage performs semantic analysis of the information described by the bytecode (that is, the metadata of the class) to ensure that it meets the requirements of the Java language specification. Verification points include, for example:

  • Whether the class has a parent class (every class except java.lang.Object should have one).
  • Whether the parent class of this class is a class that may not be inherited from (a class declared final).
  • If the class is not abstract, whether it implements all the methods required by its parent class or interfaces.
  • ......

The main purpose of this stage is to verify the metadata of the class and ensure that none of it violates the Java language specification.

3. Bytecode Verification

The main purpose of the third stage is to determine, through data flow and control flow analysis, that the program semantics are legal and logical. After the data types in the metadata have been verified in the second stage, this stage verifies the method bodies of the class to ensure that its methods will not do anything at run time that endangers the security of the virtual machine. For example:

  • Ensure that the data types on the operand stack and the instruction sequence work together correctly at every point.
  • Ensure that jump instructions do not jump to bytecode instructions outside the method body.
  • Ensure that type conversions in the method body are valid: assigning a subclass object to a parent class data type is safe, but assigning a parent class object to a subclass data type is dangerous and illegal.
  • ......

4. Symbolic Reference Verification

The last stage of verification takes place when the virtual machine converts symbolic references into direct references, a conversion that happens in the third phase of linking, the resolution phase. Symbolic reference verification can be seen as a check that information outside the class itself (the various symbolic references in the constant pool) matches up. It usually verifies the following:

  • Whether the class corresponding to the fully qualified name described by the string in the symbolic reference can be found.
  • Whether the specified class contains a method or field that matches the simple name and descriptor.
  • Whether the accessibility (private, protected, public, default) of the class, field, or method in the symbolic reference allows access from the current class.
  • ......

The purpose of symbolic reference verification is to ensure that the resolution action can be carried out properly.

For the JVM's class loading mechanism, verification is a very important but not strictly necessary phase (it has no effect on the program at run time). If all the code being run has already been used and verified repeatedly, you can consider using the -Xverify:none parameter at deployment time to turn off most class verification measures and shorten the virtual machine's class loading time.

Preparation

The preparation phase has two main tasks:

  • Allocating memory for class variables.
  • Setting the initial values of class variables.

The memory used by these variables will be allocated in the method area.

First, memory allocation in the preparation phase covers only class variables (variables modified by static), not instance variables; instance variables are allocated in the Java heap together with the object when the object is instantiated.

Second, the initial value referred to here is the zero value of the data type. Suppose a class variable is defined as:

public static int value = 123;

After the preparation phase, the initial value of value is 0 rather than 123, because no Java method has started executing yet. The putstatic instruction that assigns 123 to value is generated when the program is compiled and is stored in the class constructor <clinit>() method, so the assignment of 123 to value is not executed until the initialization phase. Note that if the field attribute table of a class field contains a ConstantValue attribute, the variable is initialized in the preparation phase to the value specified by the ConstantValue attribute. Suppose the definition of the class variable value above becomes:

public static final int value = 123;

At compile time javac generates a ConstantValue attribute for value, and in the preparation phase the virtual machine assigns 123 to value according to the ConstantValue setting.

Resolution

The resolution phase is the process in which the virtual machine replaces symbolic references in the constant pool with direct references. Symbolic references and direct references are related as follows:

  • Symbolic references: a symbolic reference describes the referenced target with a set of symbols. The symbols can be literals of any form, as long as they can be used to locate the target without ambiguity. Symbolic references are independent of the memory layout implemented by the virtual machine, and the referenced target does not have to be loaded into memory already. The memory layouts of different virtual machine implementations can differ, but the symbolic references they accept must be the same, because the literal forms of symbolic references are clearly defined in the class file format of the Java Virtual Machine Specification.
  • Direct references: a direct reference can be a pointer directly to the target, a relative offset, or a handle that locates the target indirectly. Direct references depend on the memory layout implemented by the virtual machine, and the direct references translated from the same symbolic reference on different virtual machine instances are generally not the same. If a direct reference exists, the referenced target must already be present in memory.

The virtual machine specification does not prescribe when resolution takes place. It only requires that the symbolic references used by the following 13 bytecode instructions be resolved before those instructions are executed: anewarray, checkcast, getfield, getstatic, instanceof, invokeinterface, invokespecial, invokestatic, invokevirtual, multianewarray, new, putfield, and putstatic. A virtual machine implementation can therefore decide, as needed, whether to resolve the symbolic references in the constant pool as soon as the class is loaded by the class loader, or to wait until a symbolic reference is about to be used.

It is common for the same symbolic reference to be resolved multiple times. Except for the invokedynamic instruction, the virtual machine implementation may cache the result of the first resolution (the direct reference is saved in the run-time constant pool). Regardless of whether an implementation actually performs the resolution multiple times, it must guarantee that within the same entity, if a symbolic reference has been resolved successfully before, subsequent resolution requests for that reference always succeed; conversely, if the first resolution failed, subsequent requests must fail as well. The invokedynamic instruction exists for dynamic language support; its reference is called a "dynamic call site specifier", and resolution can take place only when the program actually runs to this instruction.

Resolution mainly applies to the following 7 kinds of symbolic references:

  • Class or interface
  • Field
  • Class method (static method)
  • Interface method
  • Method type
  • Method handle
  • Call site specifier

The last three are closely related to Java's dynamic language support.

Initialization

The class initialization phase is the last step of the class loading process. In the preceding phases, apart from the loading phase, in which the user application can take part through a custom class loader, all actions are entirely directed and controlled by the virtual machine. Only in the initialization phase does the Java program code (that is, the bytecode) defined in the class actually start to execute.

In the preparation phase, variables have already been assigned the initial values required by the system. In the initialization phase, class variables and other resources are initialized according to the plan the programmer expressed in the program. Put simply, the initialization phase is the process in which the virtual machine executes the class constructor <clinit>() method.
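The two steps, zero value from preparation and then the programmer's value from <clinit>(), can be made visible by reading a class variable before its assignment has run. This is a minimal sketch with hypothetical names (ZeroValueDemo, readValue, and so on); the read goes through a method because a direct read of a later-declared variable is rejected by the compiler.

```java
class ZeroValueDemo {
    // Runs first inside <clinit>(): value still holds the zero value set in the preparation phase.
    static final int SEEN_EARLY = readValue();

    // The constant is already 123 here: javac records it as a ConstantValue
    // attribute (and also inlines it wherever it is read).
    static final int SEEN_CONSTANT_EARLY = readConstant();

    static int value = 123;            // assigned by a putstatic instruction inside <clinit>()
    static final int CONSTANT = 123;   // compile-time constant

    // Runs after the assignment to value above.
    static final int SEEN_LATE = readValue();

    static int readValue() {
        return value;
    }

    static int readConstant() {
        return CONSTANT;
    }

    public static void main(String[] args) {
        System.out.println(SEEN_EARLY);          // prints 0
        System.out.println(SEEN_CONSTANT_EARLY); // prints 123
        System.out.println(SEEN_LATE);           // prints 123
    }
}
```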

Before going into how the <clinit>() method is generated, it helps to understand the characteristics and details of this method that may affect program behavior. The <clinit>() method is produced by the compiler, which automatically collects the assignments to all class variables in the class and the statements in static blocks (static {} blocks) and merges them. The order in which the compiler collects them is determined by the order in which the statements appear in the source file. In particular, a static block can only access class variables defined before it; a class variable defined after it can be assigned in the static block but cannot be read. For example:

public class Test {
    static {
        i = 0;                // assigning to the variable compiles normally
        System.out.print(i);  // the compiler reports "illegal forward reference"
    }
    static int i = 1;
}

The <clinit>() method differs from the class's constructor (the instance constructor <init>() method): it does not need to explicitly call the parent class's <clinit>() method. The virtual machine guarantees that the parent class's <clinit>() method has finished executing before the subclass's <clinit>() method runs, so the first class whose <clinit>() method is executed in the virtual machine is always java.lang.Object. Because the parent class's <clinit>() method executes first, static blocks defined in the parent class run before the variable assignments of the subclass. For example:

static class Parent {
    public static int A = 1;
    static {
        A = 2;
    }
}

static class Sub extends Parent {
    public static int B = A;
}

public static void main(String[] args) {
    System.ou
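The listing above is cut off in the source. A complete, runnable version might look like the following sketch, which assumes that the missing statement prints Sub.B; the enclosing class name ClinitOrderDemo is hypothetical.

```java
class ClinitOrderDemo {
    static class Parent {
        public static int A = 1;
        static {
            A = 2; // runs as part of Parent.<clinit>(), after A = 1
        }
    }

    static class Sub extends Parent {
        public static int B = A; // runs in Sub.<clinit>(), after Parent.<clinit>() has finished
    }

    public static void main(String[] args) {
        System.out.println(Sub.B); // prints 2
    }
}
```

The value is 2 rather than 1 because the parent's static block has already run by the time the subclass assigns B.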
