Java Magic Hall: Introduction to class loading mechanism

Source: Internet
Author: User

I. Preface

When you type java Main in cmd or a shell and press Enter, the Main program starts running. Before it can run, however, Main.class and the classes it depends on must be loaded into the JVM. This article records what I have learned about the class loading mechanism, for future reference. If you find a mistake, please correct me, thank you!

Everything below is based on JDK 7 and the HotSpot VM.

II. The moment java is executed

We all know that the java command starts the JVM and runs the application, but what actually happens?

1. First, the launcher decides whether to run the JVM in client or server mode, based on the java command-line options or <JAVA_HOME>/jre/lib/i386/jvm.cfg. It then loads jvm.dll from <JAVA_HOME>/jre/bin/client or <JAVA_HOME>/jre/bin/server and starts the JVM;

2. The Bootstrap ClassLoader (the bootstrap class loader, written in C++ and part of the JVM itself) is loaded while the JVM starts;

3. The Bootstrap ClassLoader loads the sun.misc.Launcher class (ExtClassLoader and AppClassLoader are its inner classes);

4. During its initialization phase, sun.misc.Launcher creates an instance of itself. In the process it creates an ExtClassLoader (extension class loader) instance and an AppClassLoader (system class loader) instance, and sets the AppClassLoader instance as the thread context class loader of the main thread;

5. The AppClassLoader instance then starts loading Main.class and the classes it depends on.
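
The loader hierarchy described in these steps can be observed from Java code. The following is a minimal sketch, not part of the original article; the class name LoaderChain and the "<bootstrap>" label are made up for illustration. It walks getParent() upwards from the system class loader. The bootstrap loader is implemented inside the JVM, so it shows up as null.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    /** Collects the class names of the loaders from the given one up to the bootstrap loader. */
    static List<String> chainFrom(ClassLoader loader) {
        List<String> names = new ArrayList<String>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("<bootstrap>"); // the bootstrap loader has no Java object, it is reported as null
        return names;
    }

    public static void main(String[] args) {
        // The loader that loads the application classes (AppClassLoader on JDK 7) and its parents.
        System.out.println(chainFrom(ClassLoader.getSystemClassLoader()));
        // Core classes are loaded by the bootstrap loader, so this prints null.
        System.out.println(String.class.getClassLoader());
        // The context class loader of the current thread.
        System.out.println(Thread.currentThread().getContextClassLoader());
    }
}
```

On a JDK 7 HotSpot VM the first line is expected to name sun.misc.Launcher$AppClassLoader and sun.misc.Launcher$ExtClassLoader; later JDK versions use different internal classes, so the exact names should not be relied on.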

III. The class loading process

1. Loading

2. Linking, subdivided into verification, preparation and resolution

3. Initialization

4. Using

5. Unloading

Note: the loading, linking and initialization phases are interleaved. A phase does not wait for the previous one to finish completely before it starts.

Class loading information can be viewed with the -XX:+TraceClassLoading option.

IV. Loading phase

Of the whole class loading mechanism, only the loading phase can be controlled by the programmer; the remaining phases are fully controlled by the JVM.

The loading phase consists of 3 steps:

1. Obtain the binary byte stream that defines the class, through the class loader, based on the binary name of the class. File format verification (part of the verification step of the linking phase) starts while the byte stream is being read, and the data is stored in the method area only after it passes this check. If the check fails, a java.lang.LinkageError subclass such as java.lang.ClassFormatError or java.lang.VerifyError is thrown. (File format verification ensures that the data read can be parsed correctly and stored in the JVM's method area. The class file format is defined by the JVM specification, whereas the data structure of the method area is decided by each JVM implementation.)

The binary byte stream can come from many sources, for example:

A. Convert the binary name (such as com.fsjohnhuang.test.Main) to a platform-dependent file system path (on Linux, com/fsjohnhuang/test/Main.class), then look for the class file relative to the locations managed by the class loader;

B. Convert the binary name to a path as in A, then look for the class file inside the archive files (jar, ear, war and so on) managed by the class loader;

C. Obtain the binary byte stream over the network.

2. Convert the static storage structure (class file structure) represented by the byte stream into the run-time data structure of the method area.

3. Generate in memory a java.lang.Class instance that represents the class or interface and serves as the entry point for accessing its metadata (reflection relies on this Class instance).
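
Because the loading phase is the part a programmer can control, the three steps can be sketched with a small custom class loader. This is only an illustrative sketch under assumptions: the class DirClassLoader, its root directory and the helper toResourcePath are hypothetical names, and only source A (a class file in a directory) is handled. ClassLoader.findClass and ClassLoader.defineClass are the standard extension points.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical loader that looks for class files under a single root directory. */
class DirClassLoader extends ClassLoader {
    private final Path root;

    DirClassLoader(Path root, ClassLoader parent) {
        super(parent);
        this.root = root;
    }

    /** Source A: binary name to relative path, e.g. a.b.Main becomes a/b/Main.class. */
    static String toResourcePath(String binaryName) {
        return binaryName.replace('.', '/') + ".class";
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        Path file = root.resolve(toResourcePath(name));
        try {
            byte[] bytes = Files.readAllBytes(file);          // step 1: obtain the byte stream
            return defineClass(name, bytes, 0, bytes.length); // steps 2 and 3 are done by the JVM
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    public static void main(String[] args) throws Exception {
        // "classes" is an assumed directory name; it does not need to exist for this call.
        ClassLoader loader = new DirClassLoader(Paths.get("classes"), DirClassLoader.class.getClassLoader());
        // loadClass() asks the parent loader first, so JDK classes never reach findClass().
        System.out.println(loader.loadClass("java.lang.String") == String.class);
    }
}
```

Overriding findClass rather than loadClass keeps the default parent-first delegation, which is why the request for java.lang.String is answered by the bootstrap loader and not by this sketch.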

Notes:

1. The primitive types (byte, short, boolean, char, int, float, long, double) do not go through class loading;

2. For an array type, what is actually loaded is the component type of the array (the component type of String[] is String), and the JVM generates the array type [Ljava.lang.String; internally (written as [Ljava/lang/String; in bytecode). Because array access goes through this JVM-generated type, an out-of-bounds index is detected and reported as java.lang.ArrayIndexOutOfBoundsException instead of touching illegal memory.
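
Both notes can be checked through reflection. The sketch below is an added illustration (the class name ArrayTypes is made up); it only uses java.lang.Class methods.

```java
class ArrayTypes {
    public static void main(String[] args) {
        // Primitive types are not loaded by any class loader.
        System.out.println(int.class.getClassLoader());        // null
        // The array type generated by the JVM, and its component type.
        System.out.println(String[].class.getName());          // [Ljava.lang.String;
        System.out.println(String[].class.getComponentType()); // class java.lang.String
        // An array class reports the class loader of its element type.
        System.out.println(ArrayTypes[].class.getClassLoader() == ArrayTypes.class.getClassLoader()); // true
    }
}
```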

V. Linking phase

The linking phase is subdivided into 3 sub-stages: verification, preparation and resolution.

Resolution is not necessarily performed at class load time; it may be performed at run time.

Verification

Verification consists of 4 operations: file format verification, metadata verification, bytecode verification and symbolic reference verification.

1. File Format Verification

Verification is not strictly necessary for classes that have been used repeatedly and have already been verified. You can turn it off with -Xverify:none to shorten the time the virtual machine spends loading classes.

Target: the binary byte stream
Purpose: verify that the byte stream conforms to the class file format specification.

2. Metadata verification

Target: the information of the class or interface in the method area
Purpose: perform semantic analysis on the class metadata described by the bytecode, to ensure that it conforms to the Java language specification.
The metadata of a class includes:
A. information about the parent class (fully qualified name, modifiers, etc.);
B. field and method information of the parent class;
C. information about the class itself (fully qualified name, modifiers, etc.);
D. field and method information of the class itself;
and so on. Note: method bodies are not included!

3. Bytecode verification

Target: the Code attribute of the class information in the method area
Purpose: perform semantic analysis on the statements of each method body, to ensure that the method does nothing at run time that compromises the security of the JVM.
This is a time-consuming operation that requires type inference, because the semantic analysis has to perform checks such as the following.
1. Check that the data types on the operand stack are compatible with the operand types of the instruction;
2. Check that jump instructions do not jump to a bytecode instruction outside the method body;
3. Check that type conversions are safe.

JDK 1.6 added a StackMapTable attribute to the Code attribute. It describes the state of the local variable table and the operand stack at the start of every basic block of the method (a basic block is a block of code delimited by control flow). Bytecode verification then only needs type checking instead of type inference, which improves verification performance. You can use -XX:-UseSplitVerifier to turn type checking off and fall back to type inference, or -XX:+FailOverToOldVerifier to fall back to type inference when type checking fails.

From JDK 1.7 on, only type checking can be used (for class files whose major version is greater than 50).

However, the data in StackMapTable can itself be tampered with, which is something the JVM development team has to take into account.

Note: bytecode verification triggers resolution of the symbolic references to the parent class and the implemented interfaces (that is, it triggers their class loading).

4. Symbolic reference verification

Target: the information of the class or interface in the method area
Purpose: match the symbolic references of a class against the actual information of the referenced targets (classes, fields, methods), to ensure that each symbolic reference can be resolved to a direct reference and that the current class is allowed to access it.
Symbolic reference verification is performed when the resolution sub-stage of the linking phase runs. It checks the following:
A. whether the class named by the fully qualified name in the symbolic reference can be found;
B. whether the field or method identified by the simple name and descriptor in the symbolic reference exists in the specified class;
C. whether the current class has permission to access the class, field or method named by the symbolic reference.
If verification fails, a subclass of java.lang.IncompatibleClassChangeError is thrown, such as java.lang.IllegalAccessError, java.lang.NoSuchFieldError or java.lang.NoSuchMethodError.

Preparation

Memory for class variables is allocated in the method area and initialized to the zero value. For example:

    // After the preparation phase, the class variable value is stored in the method area with the value 0.
    // The assignment of 123 happens in the initialization phase.
    public static int value = 123;

    // A class constant (static final) is initialized directly to the value of its ConstantValue attribute.
    // After the preparation phase, the class variable value is stored in the method area with the value 123.
    public static final int value = 123;

Zero values of each type

Type       Zero value
int        0
long       0L
short      (short) 0
char       '\u0000'
byte       (byte) 0
boolean    false
float      0.0f
double     0.0d
reference  null
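
The difference between the zero value assigned during preparation and the value assigned during initialization can be observed with a small experiment. This is an added sketch with made-up names (PrepDemo, peek): static initializers run in textual order, so a method called from an earlier initializer still sees the zero value of a field declared later.

```java
class PrepDemo {
    // Runs first during initialization, before the assignment on the next line.
    static int seenBeforeAssignment = peek();
    static int value = 123;

    // Fields without an initializer keep the zero value given to them during preparation.
    static long l;
    static char c;
    static boolean b;
    static double d;
    static Object ref;

    static int peek() {
        return value; // still the zero value when called from the first initializer
    }

    public static void main(String[] args) {
        System.out.println(seenBeforeAssignment); // 0
        System.out.println(value);                // 123
        System.out.println(l + " " + (int) c + " " + b + " " + d + " " + ref); // 0 0 false 0.0 null
    }
}
```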

Resolution

Again, resolution need not be performed at class load time; it may be performed at run time, when the symbolic reference is actually used.

The essence of the resolution phase is to replace the symbolic references in the constant pool with direct references.

Symbolic references: a symbolic reference describes the referenced target (class, interface, method, field and so on) with a set of symbols. It only has to locate the target unambiguously; it is independent of the internal memory layout of the JVM, and the referenced target does not have to be loaded into memory yet. The form of symbolic references is defined by the JVM specification.

Direct references: a direct reference can be a pointer that points directly to the target, a relative offset, or a handle that locates the target indirectly. If a direct reference exists, the target must already be in memory.

The symbolic references used by the following 16 bytecode instructions are resolved before the instruction is executed: anewarray, checkcast, getfield, getstatic, instanceof, invokedynamic, invokeinterface, invokespecial, invokestatic, invokevirtual, ldc, ldc_w, multianewarray, new, putfield and putstatic.

For every instruction except invokedynamic, once a symbolic reference has been resolved to a direct reference the result is cached to avoid resolving it again. (A JVM may also choose not to cache, but it must guarantee that if the first resolution succeeds, later resolutions succeed as well, and that if the first resolution fails, later resolutions fail with the same exception.) For invokedynamic, the result resolved for one instruction is not reused by other invokedynamic instructions.

Resolution mainly deals with 7 kinds of symbolic references: classes or interfaces (CONSTANT_Class_info), fields (CONSTANT_Fieldref_info), class methods (CONSTANT_Methodref_info), interface methods (CONSTANT_InterfaceMethodref_info), method types (CONSTANT_MethodType_info), method handles (CONSTANT_MethodHandle_info) and call site specifiers (CONSTANT_InvokeDynamic_info). (The last three were added in JDK 1.7 to support dynamic languages.)

1. Resolution of classes or interfaces

To resolve the symbolic reference N in class D to the direct reference C, the fully qualified name of N is first passed to the class loader of D to load class C. C then goes through loading, verification and preparation, and bytecode verification in turn triggers loading of C's parent class and implemented interfaces. If loading of any of these classes or interfaces fails, resolving the symbolic reference N to the direct reference C fails.
After resolution succeeds, symbolic reference verification checks whether D has permission to access C. If not, java.lang.IllegalAccessError is thrown.

2. Resolution of fields

First, class or interface resolution is performed on the symbolic reference that the class_index item of CONSTANT_Fieldref_info points to. If that succeeds and yields the class or interface C, C itself is searched for a field whose simple name and field descriptor match the content that the name_and_type_index item of CONSTANT_Fieldref_info points to. If that fails, the interfaces implemented by C and their parent interfaces are searched recursively from the bottom up. If that fails too, the parent classes of C are searched recursively from the bottom up. If nothing matches, java.lang.NoSuchFieldError is thrown.
If a direct reference is obtained, symbolic reference verification is performed on it; on failure java.lang.IllegalAccessError is thrown.
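
The search order (the class itself, then its interfaces, then its parent classes) cannot easily be shown with plain field access, because javac rejects an ambiguous reference at compile time. As a rough illustration only, the sketch below uses Class.getField, whose documented lookup for public fields follows the same order; it is reflection, not the JVM's resolution step itself, and all type names in it are made up.

```java
class FieldLookup {
    interface Marker { int X = 1; }                  // implicitly public static final
    static class Parent { public static int X = 2; }
    static class Child extends Parent implements Marker { }

    /** Returns the simple name of the type in which reflection finds the public field. */
    static String ownerOf(Class<?> type, String field) {
        try {
            return type.getField(field).getDeclaringClass().getSimpleName();
        } catch (NoSuchFieldException e) {
            return "NoSuchFieldException";
        }
    }

    public static void main(String[] args) {
        // Child declares no X, so the interface is searched before the parent class.
        System.out.println(ownerOf(Child.class, "X"));       // Marker
        System.out.println(ownerOf(Child.class, "missing")); // NoSuchFieldException
    }
}
```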

3. Resolution of class methods

First, class or interface resolution is performed on the symbolic reference that the class_index item of CONSTANT_Methodref_info points to. If that succeeds and yields the class or interface C, C itself is searched for a method whose simple name and descriptor match the content that the name_and_type_index item of CONSTANT_Methodref_info points to. If that fails, the parent classes of C are searched recursively from the bottom up. If that fails too, the interfaces implemented by C and their parent interfaces are searched recursively from the bottom up (a match found only here means that C is an abstract class, and java.lang.AbstractMethodError is thrown). If nothing matches, java.lang.NoSuchMethodError is thrown.
If a direct reference is obtained, symbolic reference verification is performed on it; on failure java.lang.IllegalAccessError is thrown.

4. Resolution of interface methods

First, interface resolution is performed on the symbolic reference that the class_index item of CONSTANT_InterfaceMethodref_info points to. If that succeeds and yields C (if C is not an interface, java.lang.IncompatibleClassChangeError is thrown), C is searched for a method whose simple name and descriptor match the content that the name_and_type_index item of CONSTANT_InterfaceMethodref_info points to. If that fails, the parent interfaces of C are searched recursively. If nothing matches, java.lang.NoSuchMethodError is thrown.

VI. Initialization phase

Both classes and interfaces have an initialization process, which essentially means executing the <clinit> method in the bytecode.

The assignments to static variables and the static code blocks of a class are gathered by the compiler, in source order, into the <clinit> method. The parent class must be initialized before the subclass is initialized.

The assignments to the static fields of an interface are likewise gathered by the compiler into a <clinit> method. However, initializing an interface does not require its parent interfaces to be initialized first; a parent interface is initialized only when it is actually used (for example, when a field defined in the parent interface is used).

In a multithreaded environment the JVM automatically synchronizes the execution of <clinit>, so that it runs only once. Therefore a time-consuming operation inside <clinit> blocks the other threads that are waiting for the class.
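
Both guarantees, parent before subclass and a single synchronized execution of <clinit>, can be watched with a short experiment. This is an added sketch; the names InitOrder, Parent, Child and events are made up for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class InitOrder {
    static final List<String> events = new CopyOnWriteArrayList<String>();

    static class Parent {
        static { events.add("Parent.<clinit>"); }
    }

    static class Child extends Parent {
        static int value = 1;
        static { events.add("Child.<clinit>"); }
    }

    /** Lets two threads race to use Child, then returns the recorded initialization events. */
    static List<String> run() {
        Runnable touch = new Runnable() {
            public void run() {
                int ignored = Child.value; // getstatic triggers initialization of Child
            }
        };
        Thread t1 = new Thread(touch);
        Thread t2 = new Thread(touch);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [Parent.<clinit>, Child.<clinit>]
    }
}
```

Although two threads touch Child at the same time, each static block is recorded exactly once, and the parent's entry always comes first.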

Active references

The JVM specification stipulates that a class must be initialized immediately in the following 5 cases (loading and linking naturally have to start before that):

1. When one of the 4 bytecode instructions new, getstatic, putstatic or invokestatic is encountered and the class has not been initialized, initialization is triggered first. In Java code this corresponds to creating an instance with the new keyword, reading or writing a class variable, and calling a class method.
2. When a method of the java.lang.reflect package is used to operate on the class reflectively and the class has not been initialized, initialization is triggered first.
3. When a class is initialized and its parent class has not been initialized, the parent class is initialized first.
4. When the virtual machine starts, it initializes the class that contains the entry method (main).
5. JDK 1.7 added support for dynamic languages. If the final resolution result of a java.lang.invoke.MethodHandle instance is a REF_getStatic, REF_putStatic or REF_invokeStatic method handle and the class that the handle refers to has not been initialized, initialization is triggered first.

Apart from these 5 cases, no other way of referencing a class triggers initialization; such references are called passive references. The following are examples of passive references:
1. Accessing a static field of the parent class through a subclass does not initialize the subclass; only the parent class is initialized.
2. Creating an array object in Java code does not initialize the component class of the array (for example, the component class of SuperClass[] is SuperClass), because the bytecode instruction that creates the array is anewarray.
3. When class A accesses a static constant of class B, class B is not initialized. The constant is copied into the constant pool of class A at compile time, so at run time class A actually accesses its own constant, which no longer has anything to do with class B.
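
The three passive references can be reproduced with static blocks that record when they run. The following is a minimal added sketch; all class names in it are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PassiveRefs {
    static final List<String> log = new ArrayList<String>();

    static class Parent {
        static int value = 1;
        static { log.add("Parent.<clinit>"); }
    }

    static class Child extends Parent {
        static { log.add("Child.<clinit>"); }
    }

    static class Elem {
        static { log.add("Elem.<clinit>"); }
    }

    static class Consts {
        static final String HELLO = "hello"; // compile-time constant
        static { log.add("Consts.<clinit>"); }
    }

    /** Performs the three passive references and returns which classes were initialized. */
    static List<String> run() {
        int v = Child.value;        // 1. parent field through the subclass: only Parent is initialized
        Elem[] elems = new Elem[2]; // 2. creating an array does not initialize Elem
        String s = Consts.HELLO;    // 3. the constant was copied into this class at compile time
        return new ArrayList<String>(log);
    }

    public static void main(String[] args) {
        System.out.println(run()); // [Parent.<clinit>]
    }
}
```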

VII. Summary

If there is a mistake, please correct me, thank you!

Please respect the original work. When reprinting, cite the source: http://www.cnblogs.com/fsjohnhuang/p/4283511.html ^_^ Fat Boy John

VIII. Reference

Understanding the Java Virtual Machine in Depth: Advanced JVM Features and Best Practices
