"Deep Java Virtual Machine" Four: class loading mechanism

Last Update:2018-07-19 Source: Internet

Author: User

When reprinting, please credit the source: http://blog.csdn.net/ns_code/article/details/17881581

Class Loading Process

A class's lifecycle runs from the moment it is loaded into virtual machine memory until it is unloaded from memory. It consists of seven phases: loading, verification, preparation, resolution, initialization, using, and unloading. The order in which they begin is shown in the following illustration:

The class loading process covers the first five phases: loading, verification, preparation, resolution, and initialization. Of these, loading, verification, preparation, and initialization always begin in a fixed order. Resolution does not: in some cases it can begin after initialization, in order to support Java's run-time binding (also called dynamic binding or late binding). Note that these phases are only guaranteed to start in order, not to proceed or complete in order, because they are usually interleaved: one phase typically invokes or activates another while it is still executing.

Here is a brief description of binding in Java. Binding means associating a method call with the class (method body) in which the method is defined. Java distinguishes static binding from dynamic binding:

Static binding: also called early binding. The method is bound before the program executes, by the compiler or another linker. For Java, this can simply be understood as binding at compile time. In Java, only final, static, and private methods and constructors are bound early.

Dynamic binding: also called late binding or run-time binding. The method is bound at run time according to the actual type of the object. In Java, almost all methods are late-bound.

The following sections detail the work done at each stage of the class loading process.
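Before moving on, the binding distinction described above can be made concrete with a small sketch. The classes Animal, Dog, and BindingDemo are hypothetical names invented for this illustration; they do not come from the original article.

```java
// Hypothetical classes for illustration only.
class Animal {
    // private methods are bound early: reveal() always calls this version
    private String secret() { return "animal secret"; }

    String reveal() { return secret(); }

    // ordinary instance methods are bound late, by the object's run-time type
    String sound() { return "generic sound"; }
}

class Dog extends Animal {
    // same name, but this does NOT override the private Animal.secret()
    String secret() { return "dog secret"; }

    @Override
    String sound() { return "woof"; }
}

class BindingDemo {
    public static void main(String[] args) {
        Animal pet = new Dog();              // static type Animal, run-time type Dog
        System.out.println(pet.sound());     // late binding picks Dog.sound(): woof
        System.out.println(pet.reveal());    // early binding keeps Animal.secret(): animal secret
    }
}
```

The call to sound() is chosen by the object's actual type, while the call to the private secret() inside reveal() was fixed when Animal was compiled.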
Loading

Loading is the first phase of the class loading process. During the loading phase, the virtual machine needs to do the following three things:

1. Obtain the binary byte stream that defines the class, using the class's fully qualified name.

2. Convert the static storage structure represented by this byte stream into the run-time data structure of the method area.

3. Generate a java.lang.Class object representing this class in the Java heap, as the access point to that data in the method area.

Note that the binary byte stream in step 1 does not have to come from a class file. It can also be read from a JAR package, fetched over the network (the most typical application being applets), generated from other files (as with JSP), and so on.

Compared with the other stages of class loading, the loading phase (more precisely, the action of obtaining the class's binary byte stream during the loading phase) is the most controllable, because developers can complete the load either with a system-supplied class loader or with a class loader of their own.

After the loading phase is complete, the binary byte stream from outside the virtual machine is stored in the method area in the format the virtual machine requires, and an object of the java.lang.Class class is created in the Java heap, through which the data in the method area can be accessed.
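As a rough illustration of the Class object acting as the access point to a loaded class's data, the sketch below inspects a class through reflection without creating any instance of it. Point and ClassObjectDemo are hypothetical names used only for this example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical class whose metadata we inspect.
class Point {
    static int instances;   // class variable
    int x;                  // instance variables
    int y;
}

class ClassObjectDemo {
    /** Counts the non-synthetic fields of the given class that are declared static. */
    static int countStaticFields(Class<?> type) {
        int count = 0;
        for (Field field : type.getDeclaredFields()) {
            if (!field.isSynthetic() && Modifier.isStatic(field.getModifiers())) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Class<?> type = Point.class;    // the Class object; no Point instance is created
        System.out.println(type.getName());
        System.out.println(type.getSuperclass().getName());
        System.out.println(countStaticFields(type));
    }
}
```

Everything printed here is read from the class's metadata via its Class object, which is the role the text assigns to that object.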

Any discussion of loading has to mention class loaders, which are described in detail below.

Although a class loader is only used to implement the loading action, its role in a Java program goes far beyond the loading phase. Every class is uniquely identified within the Java virtual machine by the class itself together with the class loader that loaded it. In other words, even if two classes originate from the same class file, they are not equal as long as the class loaders that loaded them are different. "Equal" here covers the results of the Class object's equals(), isAssignableFrom(), and isInstance() methods, as well as the result of using the instanceof keyword to test which class an object belongs to.
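The identity rule above can be observed with a small experiment. This is a minimal sketch under stated assumptions: instead of reading a real class file, it generates the bytes of a tiny member-less class by hand (the name LoadedTwice, the helper methods, and BytesLoader are all hypothetical), then defines those same bytes in two separate loaders.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class IdentityDemo {
    /** Builds a minimal class file: public class LoadedTwice extends Object, with no members. */
    static byte[] classBytes() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(0xCAFEBABE);                            // magic number
            out.writeShort(0);                                   // minor version
            out.writeShort(52);                                  // major version 52 (Java 8)
            out.writeShort(5);                                   // constant pool count (entries 1..4)
            out.writeByte(7); out.writeShort(2);                 // #1 CONSTANT_Class -> #2
            out.writeByte(1); out.writeUTF("LoadedTwice");       // #2 CONSTANT_Utf8
            out.writeByte(7); out.writeShort(4);                 // #3 CONSTANT_Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 CONSTANT_Utf8
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                   // this_class = #1
            out.writeShort(3);                                   // super_class = #3
            out.writeShort(0);                                   // interfaces count
            out.writeShort(0);                                   // fields count
            out.writeShort(0);                                   // methods count
            out.writeShort(0);                                   // attributes count
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Hypothetical loader that simply exposes defineClass. */
    static final class BytesLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Defines the same bytes in a brand-new loader each time it is called. */
    static Class<?> loadWithFreshLoader() {
        return new BytesLoader().define("LoadedTwice", classBytes());
    }

    public static void main(String[] args) {
        Class<?> first = loadWithFreshLoader();
        Class<?> second = loadWithFreshLoader();
        System.out.println(first.getName().equals(second.getName())); // true: same name
        System.out.println(first.equals(second));                     // false: different loaders
        System.out.println(first.isAssignableFrom(second));           // false
    }
}
```

Both Class objects come from identical bytes and carry the same name, yet the virtual machine treats them as unrelated types because their defining loaders differ.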

From the point of view of the Java virtual machine, there are only two different kinds of class loader:

Bootstrap ClassLoader: implemented in C++ and part of the virtual machine itself. (This applies only to HotSpot, the default virtual machine since JDK 1.5; many other virtual machines implement it in Java.)

All other class loaders: implemented in Java, independent of the virtual machine, and all inheriting from the abstract class java.lang.ClassLoader. These loaders must themselves be loaded into memory by the bootstrap class loader before they can load other classes.

From the Java developer's point of view, class loaders can be roughly divided into the following three categories:

Bootstrap class loader: the Bootstrap ClassLoader described above. It is responsible for loading the class libraries that are stored in the JDK\jre\lib directory (JDK stands for the JDK installation directory, here and below) or in the path specified by the -Xbootclasspath parameter, and that the virtual machine recognizes (such as rt.jar; all classes whose names begin with java.* are loaded by the Bootstrap ClassLoader). The bootstrap class loader cannot be referenced directly by Java programs.

Extension class loader: the Extension ClassLoader, implemented by sun.misc.Launcher$ExtClassLoader. It is responsible for loading all class libraries in the JDK\jre\lib\ext directory or in the path specified by the java.ext.dirs system property (such as classes beginning with javax.*). Developers can use the extension class loader directly.

Application class loader: the Application ClassLoader, implemented by sun.misc.Launcher$AppClassLoader. It is responsible for loading the classes on the user's class path (ClassPath). Developers can use this class loader directly; if the application does not define its own class loader, this is generally the program's default class loader.
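One way to see these loaders on a running JVM is to walk the parent chain from the system (application) class loader. The sketch below is only illustrative; LoaderChainDemo and chainOf are hypothetical names, and the class names it prints depend on the JDK version (newer JDKs, for example, report a platform class loader where the article describes the extension class loader).

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    /** Lists the implementation class of each loader from start up to the bootstrap loader. */
    static List<String> chainOf(ClassLoader start) {
        List<String> names = new ArrayList<>();
        for (ClassLoader loader = start; loader != null; loader = loader.getParent()) {
            names.add(loader.getClass().getName());
        }
        // The bootstrap loader has no Java object; the API represents it as null.
        names.add("bootstrap (null)");
        return names;
    }

    public static void main(String[] args) {
        for (String name : chainOf(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
        // Core classes report null because the bootstrap loader cannot be referenced.
        System.out.println(String.class.getClassLoader());
    }
}
```

The null returned for String.class matches the statement that Java programs cannot reference the bootstrap class loader directly.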

Applications are loaded by these three kinds of loader working together, and we can add custom class loaders where necessary. Because the JVM's own class loaders only load standard Java class files from the local file system, writing your own class loader lets you do the following:

1. Automatically verify a digital signature before executing untrusted code.

2. Dynamically create custom-built classes that meet specific user needs.

3. Obtain Java classes from a specific location, such as a database or the network.

In fact, applets use a special class loader, because the Java classes have to be loaded from the network and the relevant security information has to be checked. Most application servers also use custom class loader technology.
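A custom loader along these lines might look like the following sketch. It is not a real signature scheme: an in-memory map stands in for a database or network source, and a plain SHA-256 digest comparison stands in for digital signature verification. CheckedStoreLoader, CustomLoaderDemo, and the class name DemoPluginFromMemory are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical loader: serves class bytes from memory, but only if their digest is trusted. */
class CheckedStoreLoader extends ClassLoader {
    private final Map<String, byte[]> store = new HashMap<>();
    private final Map<String, byte[]> trusted = new HashMap<>();

    void put(String name, byte[] classBytes, byte[] trustedDigest) {
        store.put(name, classBytes);
        trusted.put(name, trustedDigest);
    }

    // Overriding findClass (not loadClass) keeps the default parent delegation intact.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = store.get(name);
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        if (!MessageDigest.isEqual(sha256(bytes), trusted.get(name))) {
            throw new ClassNotFoundException(name + " (digest mismatch)");
        }
        return defineClass(name, bytes, 0, bytes.length);
    }

    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}

class CustomLoaderDemo {
    static final String NAME = "DemoPluginFromMemory";

    /** Bytes of a minimal class file: a public, member-less class with the given simple name. */
    static byte[] minimalClass(String simpleName) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(0xCAFEBABE);                            // magic number
            out.writeShort(0);                                   // minor version
            out.writeShort(52);                                  // major version 52 (Java 8)
            out.writeShort(5);                                   // constant pool count
            out.writeByte(7); out.writeShort(2);                 // #1 Class -> #2
            out.writeByte(1); out.writeUTF(simpleName);          // #2 Utf8
            out.writeByte(7); out.writeShort(4);                 // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                   // this_class
            out.writeShort(3);                                   // super_class
            for (int i = 0; i < 4; i++) {
                out.writeShort(0);                               // no interfaces, fields, methods, attributes
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true only if the custom loader itself defined the class. */
    static boolean canLoad(byte[] classBytes, byte[] digest) {
        CheckedStoreLoader loader = new CheckedStoreLoader();
        loader.put(NAME, classBytes, digest);
        try {
            return loader.loadClass(NAME).getClassLoader() == loader;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] bytes = minimalClass(NAME);
        System.out.println(canLoad(bytes, CheckedStoreLoader.sha256(bytes)));       // true
        System.out.println(canLoad(bytes, CheckedStoreLoader.sha256(new byte[0]))); // false
    }
}
```

A production loader would verify real signatures and fetch bytes from its actual source; the point here is only where such checks fit, namely inside findClass before defineClass is called.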

The hierarchy of these class loaders is shown in the following illustration:

This hierarchy is called the parent delegation model of class loaders. We call the class loader in the layer above the parent loader of the class loader in the current layer. The parent-child relationship between them is not implemented through inheritance; instead, composition is used to reuse the parent loader's code. The model was introduced in JDK 1.2 and has since been used in almost all Java programs. It is not a mandatory constraint, but a class loader implementation approach that Java's designers recommend to developers.

The parent delegation model works as follows: when a class loader receives a request to load a class, it does not try to load the class itself first. Instead, it delegates the request to its parent loader, and this happens at every level, so every class loading request should eventually reach the top-level bootstrap class loader. Only when the parent loader reports that it cannot complete the load (it did not find the required class in its search scope) does the child loader try to load the class itself.
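The workflow just described can be sketched as a loader that spells out the delegation steps and records when it is finally asked to search on its own. TracingLoader and DelegationDemo are hypothetical names; the sketch also assumes a non-null parent, whereas java.lang.ClassLoader itself handles the bootstrap loader (a null parent) internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical loader that makes the delegation order visible. */
class TracingLoader extends ClassLoader {
    final List<String> ownSearches = new ArrayList<>();

    TracingLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);        // 1. already loaded by this loader?
            if (loaded == null) {
                try {
                    loaded = getParent().loadClass(name);   // 2. delegate to the parent first
                } catch (ClassNotFoundException e) {
                    loaded = findClass(name);               // 3. parent failed: search ourselves
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        ownSearches.add(name);                              // record that we were the last resort
        throw new ClassNotFoundException(name);             // this sketch has nothing of its own to load
    }
}

class DelegationDemo {
    /** Returns the loaded class, or null if no loader in the chain could find it. */
    static Class<?> tryLoad(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        TracingLoader loader = new TracingLoader(ClassLoader.getSystemClassLoader());
        System.out.println(tryLoad(loader, "java.lang.Object") == Object.class); // true
        System.out.println(loader.ownSearches);                                  // []
        System.out.println(tryLoad(loader, "no.such.Type"));                     // null
        System.out.println(loader.ownSearches);                                  // [no.such.Type]
    }
}
```

The request for java.lang.Object never reaches findClass because a parent satisfies it, while the unknown name falls through to the child only after every parent has failed.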

The obvious benefit of organizing class loaders with the parent delegation model is that a Java class acquires, along with its class loader (put plainly, along with the directory it lives in), a prioritized hierarchy, which is important for the stable operation of Java programs. For example, the class java.lang.Object is stored in rt.jar under JDK\jre\lib. Whichever class loader is asked to load it, the request is eventually delegated to the bootstrap class loader, which guarantees that Object is the same class in every class loader environment in the program.

Verification

The purpose of verification is to ensure that the information in the class file's byte stream meets the requirements of the current virtual machine and does not compromise the security of the virtual machine itself. Implementations of class verification vary between virtual machines, but they generally complete the following four stages: file format verification, metadata verification, bytecode verification, and symbolic reference verification.

File format verification: checks that the byte stream conforms to the class file format specification and can be processed by the current version of the virtual machine. The main purpose of this stage is to ensure that the input byte stream can be correctly parsed and stored in the method area. After this stage passes, the byte stream is stored in the method area in memory, and the three later stages work on the storage structure in the method area.

Metadata verification: performs semantic checks on the class's metadata (in effect, checks on the various data types in the class against the language rules) to ensure that no metadata violates the Java language specification.

Bytecode verification: the main work of this stage is data-flow and control-flow analysis of the class's method bodies, to ensure that the methods of the verified class do nothing at run time that endangers the virtual machine's security.

Symbolic reference verification: the last stage of verification. It takes place when the virtual machine converts symbolic references into direct references (this conversion happens in the resolution phase, explained later) and mainly checks that information outside the class itself (the various symbolic references in the constant pool) can be matched.
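To give a feel for the first of these stages, the sketch below performs one tiny part of a file format check: it confirms the class file magic number and reads the major version from the header. HeaderCheck is a hypothetical helper, and a real virtual machine checks far more than this.

```java
import java.nio.ByteBuffer;

class HeaderCheck {
    static final int MAGIC = 0xCAFEBABE;

    /** Returns the class file's major version, or throws if the header is not valid. */
    static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("truncated header");
        }
        ByteBuffer header = ByteBuffer.wrap(classBytes);   // big-endian by default, like class files
        if (header.getInt() != MAGIC) {
            throw new IllegalArgumentException("bad magic number");
        }
        header.getShort();                                 // skip the minor version
        return header.getShort() & 0xFFFF;                 // major version as an unsigned value
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        System.out.println(majorVersion(header));          // 52, the version used by Java 8
    }
}
```

A virtual machine would go on to compare the version against the range it supports and validate the constant pool and the rest of the structure.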

Preparation

The preparation phase formally allocates memory for class variables and sets their initial values; this memory is allocated in the method area. Several points are worth noting about this phase:

1. Memory is allocated at this point only for class variables (static), not for instance variables. Instance variables are allocated in the Java heap together with the object when the object is instantiated.

2. The initial value set here is normally the zero value of the data type (such as 0, 0L, null, false, and so on), not the value explicitly assigned in the Java code.

Suppose a class variable is defined as:

public static int value = 3;

Then the initial value of value after the preparation phase is 0, not 3, because no Java method has been executed at this point. The putstatic instruction that assigns 3 to value is placed in the class constructor <clinit>() method when the program is compiled, so the assignment of 3 is not performed until the initialization phase.
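The zero value set during preparation is normally overwritten before any code can see it, but it can be observed indirectly. In this hypothetical sketch (PrepDemo, seenEarly, and peek are made-up names), a static initializer that runs earlier in source order reads value through a method, before the assignment of 3 has executed.

```java
class PrepDemo {
    // Runs first during initialization: value still holds the zero value from preparation.
    static int seenEarly = peek();

    // Runs second: only now is 3 assigned.
    static int value = 3;

    static int peek() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(seenEarly);   // 0
        System.out.println(value);       // 3
    }
}
```

Reading value through peek() sidesteps the compiler's forward-reference check, which is why the zero value becomes visible here.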

The following table lists the zero values of all the basic data types and of reference types in Java:

Here are a few points to note:

For basic data types: class variables (static) and instance variables that are used without being explicitly assigned are given the default zero value by the system. Local variables must be explicitly assigned before use; otherwise compilation fails.

For constants declared both static and final: they must be explicitly assigned, either at declaration or in a static initializer block; otherwise compilation fails. A variable declared only final can be assigned either at declaration or in an instance initializer or constructor. In short, it must be explicitly assigned before it is used, and the system does not give it a default zero value.

For reference types (array references, object references, and so on): if used without being explicitly assigned, the system gives them the default zero value, null.

If an array is created without assigning values to its elements, the elements are given the zero value of the corresponding data type.
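These default values are easy to check directly. The class and field names in the sketch below are hypothetical and exist only to show the zero value for a few representative types.

```java
class DefaultValues {
    static int count;          // class variables: zero values without explicit assignment
    static long total;
    static double ratio;
    static boolean flag;
    static char letter;
    static String text;        // reference type
    int instanceCounter;       // instance variable

    public static void main(String[] args) {
        int[] numbers = new int[2];         // array elements also get zero values
        String[] names = new String[2];

        System.out.println(count + " " + total + " " + ratio + " " + flag);
        System.out.println((int) letter);   // '\u0000' shown as its numeric value
        System.out.println(text);
        System.out.println(new DefaultValues().instanceCounter);
        System.out.println(numbers[0] + " " + names[0]);

        // A local variable gets no default value:
        // int local;
        // System.out.println(local);       // would not compile
    }
}
```

The commented-out lines show the one case from the list above where the compiler, rather than the virtual machine, enforces assignment.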

3. If a class field's field attribute table contains a ConstantValue attribute (which javac generates for fields declared both final and static with a constant initializer), then in the preparation phase the variable is initialized to the value specified by the ConstantValue attribute.

Suppose the class variable value above is defined as:

public static final int value = 3;

At compile time javac generates a ConstantValue attribute for value, and in the preparation phase the virtual machine assigns 3 to value according to that ConstantValue setting. Recall the second example of passive references in the previous blog post, where this was the case. We can understand it this way: at compile time, the value of a static final constant is placed into the constant pool of the class that uses it.
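One visible consequence is that reading such a constant does not initialize the class that declares it, whereas reading an ordinary class variable does. The following sketch uses hypothetical names (ConstHolder, ConstDemo, LOG) to record the order of events.

```java
import java.util.ArrayList;
import java.util.List;

class ConstHolder {
    static final int VALUE = 3;   // compile-time constant, copied into the caller's class
    static int plain = 4;         // ordinary class variable, assigned in <clinit>()

    static {
        ConstDemo.LOG.add("ConstHolder initialized");
    }
}

class ConstDemo {
    static final List<String> LOG = new ArrayList<>();

    /** Reads the constant first, then the ordinary variable, logging each step. */
    static List<String> run() {
        LOG.add("read VALUE = " + ConstHolder.VALUE);   // does not touch ConstHolder at run time
        LOG.add("read plain = " + ConstHolder.plain);   // triggers ConstHolder's initialization
        return LOG;
    }

    public static void main(String[] args) {
        for (String event : run()) {
            System.out.println(event);
        }
        // read VALUE = 3
        // ConstHolder initialized
        // read plain = 4
    }
}
```

The log shows ConstHolder being initialized only when plain is read, after the constant has already been used.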

Resolution

The resolution phase is the process by which the virtual machine converts symbolic references in the constant pool into direct references. The differences and connections between symbolic and direct references were compared in the article on the class file structure and are not repeated here. As noted earlier, resolution may start before initialization or after it. The virtual machine decides, as needed, whether to resolve the symbolic references in the constant pool when the class is loaded by the loader (before initialization) or to wait until a symbolic reference is about to be used (after initialization).

It is common for the same symbolic reference to be resolved several times. The virtual machine implementation may cache the result of the first resolution (recording the direct reference in the run-time constant pool and marking the constant as resolved), thereby avoiding repeated resolution.

Resolution mainly applies to four kinds of symbolic reference: classes or interfaces, fields, class methods, and interface methods. They correspond to four constant pool types: CONSTANT_Class_info, CONSTANT_Fieldref_info, CONSTANT_Methodref_info, and CONSTANT_InterfaceMethodref_info.

1. Class or interface resolution: determines whether the direct reference to be produced refers to an array type or to an ordinary object type, and resolves it accordingly.

2. Field resolution: to resolve a field, the virtual machine first looks in the class itself for a field whose simple name and field descriptor both match the target; if one is found, the search ends. If not, it recursively searches the interfaces the class implements and their parent interfaces, following the inheritance relationship. If there is still no match, it recursively searches the parent classes, moving up the inheritance chain, until the search ends. The search process is shown in the following figure:

The search order for field resolution is easy to see from the execution results of the following code:

class Super {
	// Field resolution reaches this m only if no subclass declares one.
	public static int m = 11;
	static {
		System.out.println("Super class static statement block executed");
	}
}

class Father extends Super {
	// Hides Super.m, so Child.m resolves to this field first.
	public static int m = 33;
	static {
		System.out.println("Father class static statement block executed");
	}
}

class Child extends Father {
	static {
		System.out.println("Child class static statement block executed");
	}
}

public class StaticTest {
	public static void main(String[] args) {
		// Child declares no m, so the search moves up to Father.
		System.out.println(Child.m);
	}
}

The output is as follows:

Super class static statement block executed
Father class static statement block executed
33

If you comment out the line that defines m in the Father class, the output is as follows:

Super class static statement block executed
11

This is clearly the same situation as the first example in the previous blog post, and we can analyze it as follows. Access to a static variable is settled in the resolution phase (static resolution), before initialization: by then the symbolic reference to the field has been converted into a direct memory reference and associated with the class that declares it. Because no field matching m is found in the subclass, m is not associated with the subclass, so the access does not trigger initialization of the subclass.

One last point: in theory the search follows the order described above, but in practice the compiler of a virtual machine implementation may be stricter than the specification requires. If a field with the same name appears both in an interface of the class and in its parent class, or in several interfaces of the class or its parent class, the compiler may refuse to compile the code. If we modify the code above so that Super becomes an interface and Child both extends Father and implements Super, the following error is reported at compile time:

StaticTest.java:24: reference to m is ambiguous, both variable m in Father and variable m in Super match
	System.out.println(Child.m);
	^
1 error

3. Class method resolution: similar to the search steps for field resolution, with one extra step that checks whether the method is declared in a class or an interface. The matching search for a class method looks in the parent classes first and then in the interfaces.

4. Interface method resolution: similar to class method resolution, except that an interface has no parent class, so only the parent interfaces are searched recursively.
Initialization

Initialization is the final step of the class loading process. Only at this point does the Java program code defined in the class actually start to execute. In the preparation phase, class variables were already assigned the initial values required by the system; in the initialization phase, class variables and other resources are initialized according to the plan the programmer expressed in the code. Put another way, the initialization phase is the process of executing the class constructor <clinit>() method.

Here is a brief description of the execution rules of the <clinit>() method:

1. The <clinit>() method is generated by the compiler, which automatically collects the assignments to all class variables in the class and the statements in its static statement blocks and merges them. The compiler collects them in the order in which the statements appear in the source file. A static statement block can only access variables defined before it; a variable defined after it can be assigned in the block but cannot be read there.

2. The <clinit>() method differs from the instance constructor <init>() method (the class's constructor): it does not need to call the parent class's <clinit>() method explicitly. The virtual machine guarantees that the parent class's <clinit>() method has finished executing before the subclass's <clinit>() method runs. Therefore, the first class whose <clinit>() method is executed in the virtual machine must be java.lang.Object.

3. The <clinit>() method is not mandatory for a class or interface. If a class has no static statement block and no assignment to a class variable, the compiler need not generate a <clinit>() method for it.

4. Static statement blocks cannot be used in interfaces, but interfaces can still have initializing assignments to class variables (final static), so interfaces get a <clinit>() method just as classes do. Interfaces differ from classes in one respect: executing an interface's <clinit>() method does not require executing the parent interface's <clinit>() method first. The parent interface is initialized only when a variable defined in it is used. In addition, a class that implements the interface does not execute the interface's <clinit>() method when it is initialized.

5. The virtual machine guarantees that a class's <clinit>() method is correctly locked and synchronized in a multithreaded environment. If several threads try to initialize a class at the same time, only one of them executes the class's <clinit>() method; the others block and wait until the active thread has finished executing it. If a class's <clinit>() method contains a long-running operation, it can cause several threads to block, and in practice this kind of blocking is often very hard to notice.
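Rules 4 and 5 can be checked with a short experiment. In this sketch all names (InitLog, Marker, Impl, Slow, ClinitRulesDemo) are hypothetical. The first method shows that initializing an implementing class does not initialize its interface; the second starts several threads that touch the same uninitialized class and counts how many times its static block ran.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class InitLog {
    static final List<String> EVENTS = new CopyOnWriteArrayList<>();
    static final AtomicInteger SLOW_RUNS = new AtomicInteger();

    static int record(String who) {
        EVENTS.add(who);
        return EVENTS.size();
    }
}

interface Marker {
    int TOKEN = InitLog.record("Marker");   // not a compile-time constant, so Marker has a <clinit>()
}

class Impl implements Marker {
    static int id = InitLog.record("Impl");
}

class Slow {
    static int ready;

    static {
        InitLog.SLOW_RUNS.incrementAndGet();   // counts executions of this <clinit>()
        try {
            Thread.sleep(50);                  // simulate a long-running initializer
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        ready = 1;
    }
}

class ClinitRulesDemo {
    /** Rule 4: initializing Impl does not initialize the interface it implements. */
    static List<String> interfaceRule() {
        int first = Impl.id;         // initializes Impl only
        int second = Marker.TOKEN;   // now Marker is initialized
        return InitLog.EVENTS;
    }

    /** Rule 5: many threads touch Slow at once, yet its static block runs once. */
    static int threadRule(int threads) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    return;
                }
                if (Slow.ready != 1) {   // blocks until initialization has finished
                    throw new IllegalStateException("saw a half-initialized class");
                }
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return InitLog.SLOW_RUNS.get();
    }

    public static void main(String[] args) {
        System.out.println(interfaceRule());   // [Impl, Marker]
        System.out.println(threadRule(8));     // 1
    }
}
```

Note that since Java 8 an interface that declares default methods is initialized along with its implementing class; Marker declares none, so the behavior described in Rule 4 applies.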
A simple example is given below to give a clearer picture of the above rules:

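class Father {
	public static int a = 1;
	static {
		// Runs after the assignment above, in source order.
		a = 2;
	}
}

class Child extends Father {
	// Reading a triggers Father's <clinit>() first, so b becomes 2.
	public static int b = a;
}

public class ClinitTest {
	public static void main(String[] args) {
		System.out.println(Child.b);
	}
}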

Executing the code above prints 2, which means that b was assigned the value 2.

Let's walk through how this result comes about. First, in the preparation phase, memory is allocated for the class variables and their initial values are set, so both a and b are given the default value 0. They receive the values specified in the program later, when the <clinit>() methods are called. When we access Child.b, Child's <clinit>() method is triggered. According to Rule 2, the <clinit>() method of its parent class Father is executed first. According to Rule 1, the statements inside a <clinit>() method run in the order in which the static statements and static variable assignments appear in the code. So when Father's <clinit>() method runs, a is first assigned 1, then the statement in the static block assigns it 2. Child's <clinit>() method then runs and assigns b the value 2.

If we swap the order of the "public static int a = 1;" statement and the static statement block in the Father class, the program prints 1. Following Rule 1, Father's <clinit>() method now executes the contents of the static block first and then the "public static int a = 1;" statement.

Also, after swapping the two, reading a inside the static statement block (for example, assigning a to another variable) causes a compile-time error, because under Rule 1 the block can only assign to a and cannot read it.
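The effect of swapping the two statements can be seen side by side in the following sketch. The class names Ordered, Reversed, and OrderDemo are hypothetical and chosen to avoid clashing with the example above.

```java
class Ordered {
    static int a = 1;
    static {
        a = 2;          // runs second, so a ends up as 2
    }
}

class Reversed {
    static {
        a = 2;          // assigning to a variable declared later is allowed
        // int copy = a;   // reading it here would not compile: illegal forward reference
    }
    static int a = 1;   // runs second, so a ends up as 1
}

class OrderDemo {
    public static void main(String[] args) {
        System.out.println(Ordered.a);    // 2
        System.out.println(Reversed.a);   // 1
    }
}
```

In both classes the last assignment in source order wins, which is exactly the collection order described in Rule 1.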

Summary

Throughout the class loading process, apart from the loading phase, in which a user application can take part through a custom class loader, every action is entirely driven and controlled by the virtual machine. Only at initialization does execution of the Java program code (that is, bytecode) defined in the class begin, and even then the code executed is limited to the <clinit>() method. The class loading process is mainly about loading the class file (more precisely, the class's binary byte stream) into virtual machine memory; the real execution of bytecode only starts once loading is complete.

