In-depth understanding of Java Virtual Machines: The process of class loading

A class's entire lifecycle, from being loaded into the virtual machine's memory until it is unloaded from memory, consists of seven phases: loading, verification, preparation, resolution, initialization, use, and unloading. The first five phases, from loading through initialization, make up the class loading process.
The following details the work done at each stage of the class loading process.

Loading

Loading is the first stage of the class loading process. During this phase the virtual machine needs to do three things:

  • 1. Obtain the binary byte stream that defines a class, using the class's fully qualified name.
  • 2. Convert the static storage structure represented by this byte stream into the runtime data structure of the method area.
  • 3. Generate a java.lang.Class object representing this class in the Java heap, to serve as the access point for the class's data in the method area.

Note that the binary byte stream in step 1 does not have to come from a class file on disk. For example, it can be read from a JAR package, fetched over the network (the most typical application is the applet), or generated from other files (as with JSP).

Compared with the other stages of class loading, the loading phase (more precisely, the action of obtaining a class's binary byte stream during the loading phase) is the stage developers can control most, because the load can be done either by a system-provided class loader or by a class loader the developer writes.

When the loading phase is complete, the binary byte stream from outside the virtual machine is stored in the method area in the format the virtual machine requires, and a java.lang.Class object is created in the Java heap so that the data in the method area can be accessed through it.
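
As a rough illustration of a developer-controlled loading phase, here is a minimal sketch of a custom class loader. The names CustomLoaderDemo, Payload, and ByteStreamLoader are hypothetical, and the sketch assumes the class file is reachable as a resource through the parent loader. The loader fetches the byte stream itself (step 1) and hands it to defineClass, which leaves steps 2 and 3 to the virtual machine.

```java
import java.io.IOException;
import java.io.InputStream;

public class CustomLoaderDemo {

    /** Hypothetical class whose bytes the custom loader will define. */
    public static class Payload {}

    /** Hypothetical loader that fetches the byte stream itself for one class. */
    static class ByteStreamLoader extends ClassLoader {
        private final String targetName;

        ByteStreamLoader(String targetName, ClassLoader parent) {
            super(parent);
            this.targetName = targetName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(targetName)) {
                return super.loadClass(name, resolve); // everything else: normal delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded == null) {
                    byte[] bytes = readClassBytes(name);                // step 1: obtain the byte stream
                    loaded = defineClass(name, bytes, 0, bytes.length); // steps 2 and 3: done by the JVM
                }
                if (resolve) {
                    resolveClass(loaded);
                }
                return loaded;
            }
        }

        /** Assumption: the class file is available as a resource of the parent loader. */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes(); // requires Java 9 or later
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads Payload a second time, this time through the custom loader. */
    public static Class<?> loadWithCustomLoader() {
        String name = Payload.class.getName();
        ClassLoader parent = CustomLoaderDemo.class.getClassLoader();
        try {
            return new ByteStreamLoader(name, parent).loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> custom = loadWithCustomLoader();
        System.out.println("same name:  " + custom.getName().equals(Payload.class.getName()));
        System.out.println("same Class: " + (custom == Payload.class));
        System.out.println("defined by: " + custom.getClassLoader().getClass().getSimpleName());
    }
}
```

The two Class objects share a name but are distinct, because each java.lang.Class object is tied to the loader that defined it.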

Verification

The purpose of verification is to ensure that the information in the byte stream of the class file meets the requirements of the current virtual machine and does not compromise the security of the virtual machine itself. Virtual machine implementations may verify classes differently, but they generally perform the following four phases: file format verification, metadata verification, bytecode verification, and symbolic reference verification.

1. File format verification: Verify that the byte stream conforms to the class file format specification and can be processed by the current version of the virtual machine. This phase may include the following checks:

  • Whether the file starts with the magic number
  • Whether the major and minor version numbers are within the range the virtual machine can handle
  • Whether the constant pool contains any constant of an unsupported type
  • ...

The main purpose of this phase is to ensure that the input byte stream can be correctly parsed and stored in the method area. Once the byte stream passes this phase, it is stored in the method area, and the remaining three verification phases operate on the method area's storage structure rather than on the byte stream.
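
The first two checks in the list can be imitated from plain Java. The sketch below is only a simplified stand-in for what the virtual machine does internally: it reads the first eight bytes of a class file (the magic number 0xCAFEBABE, then the minor and major version) and compares the major version with the standard java.class.version system property. The class name ClassFileHeaderDemo and its methods are hypothetical.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ClassFileHeaderDemo {

    /** Returns { magic, minor_version, major_version } read from the class file of the given type. */
    public static int[] readHeader(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();           // u4 magic
            int minor = in.readUnsignedShort(); // u2 minor_version
            int major = in.readUnsignedShort(); // u2 major_version
            return new int[] { magic, minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Highest class file major version the running JVM reports that it supports. */
    public static int maxSupportedMajor() {
        return (int) Double.parseDouble(System.getProperty("java.class.version"));
    }

    /** Simplified check: correct magic number and a major version the JVM can handle. */
    public static boolean passesBasicFormatCheck(int[] header) {
        return header[0] == 0xCAFEBABE && header[2] <= maxSupportedMajor();
    }

    public static void main(String[] args) {
        int[] header = readHeader(String.class);
        System.out.println("magic: " + Integer.toHexString(header[0]).toUpperCase());
        System.out.println("class file version: " + header[2] + "." + header[1]);
        System.out.println("passes basic check: " + passesBasicFormatCheck(header));
    }
}
```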

2. Metadata verification: Perform semantic analysis on the information described by the bytecode to ensure that it contains no metadata that violates the Java language specification. For example:

  • Whether this class has a parent class (every class except java.lang.Object should have one)
  • Whether the parent of this class is a class that may not be inherited from (a class declared final)
  • If this class is not abstract, whether it implements all the methods required by its parent class or interfaces
  • Whether the fields and methods of the class conflict with those of the parent class (for example, a method overload that does not conform to the rules)

3. Bytecode verification: This phase mainly performs data flow and control flow analysis on the method bodies of the class, to ensure that the methods of the verified class will not do anything at runtime that compromises the security of the virtual machine. For example:

  • Ensure that the data types on the operand stack and the instruction sequence work together at every point
  • Ensure that a jump instruction never jumps to a bytecode instruction outside the method body
  • Ensure that the type conversions in the method body are valid (for example, assigning a subclass object to a variable of the parent type is safe, but assigning a parent object to a variable of the subclass type is not)
    ........

4. Symbolic reference verification: This is the last phase of verification. It takes place when the virtual machine converts a symbolic reference into a direct reference (the conversion happens in the resolution phase, explained later). It mainly checks that information outside the class itself (the various symbolic references in the constant pool) matches. For example:

  • Whether the classes, methods, and fields named in a symbolic reference are accessible to the current class
  • Whether the fully qualified name given as a string in a symbolic reference can be matched to a class, and whether that class contains the referenced method or field
  • ........

Preparation

The preparation phase formally allocates memory for class variables and sets their initial values. The memory used by these variables is allocated in the method area. A few points need explaining:
First: only class variables (static) are allocated at this point, not instance variables. Instance variables are allocated in the Java heap together with the object when the object is instantiated.
Second: the initial value set here is normally the zero value of the data type (0, 0L, null, false, and so on), not the value explicitly assigned in the Java code.

For example, suppose a class variable is defined as: public static int value = 123;
After the preparation phase the variable value holds 0, not 123, because no Java method has started executing yet. The putstatic instruction that assigns 123 to value is placed in the class constructor <clinit>() method when the program is compiled, so the assignment of 123 does not happen until the initialization phase.

Third: the second point says the initial value is "normally" the zero value. The special case is this: if the field's attribute table contains a ConstantValue attribute (which javac generates for a variable that is both static and final and is initialized with a compile-time constant), then in the preparation phase the variable is initialized to the value specified by the ConstantValue attribute.

Suppose the class variable above is instead defined as: public static final int value = 123;
At compile time javac generates a ConstantValue attribute for value, and in the preparation phase the virtual machine sets value to 123 according to that attribute.

To repeat, the preparation phase formally allocates memory for class variables (static) and sets their initial values (normally the zero value of the type).
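
The zero value left by preparation can be observed from ordinary code. In the sketch below (PreparationDemo and its members are hypothetical names), a static initializer that runs first reads both variables through a method, before their own assignments have executed. One caveat: javac also inlines reads of compile-time constants, so this example shows only that CONSTANT is never seen as 0; it cannot tell the ConstantValue attribute apart from inlining.

```java
public class PreparationDemo {

    // Evaluated first during initialization, before the assignment to `value` below has run.
    private static final int[] SEEN_EARLY = snapshot();

    // Not final: the assignment of 123 is compiled into <clinit>() and runs during initialization.
    public static int value = 123;

    // static final with a constant initializer: javac emits a ConstantValue attribute for it.
    public static final int CONSTANT = 123;

    private static int[] snapshot() {
        // Reading through a method sidesteps the compiler's forward-reference check.
        return new int[] { value, CONSTANT };
    }

    /** Returns { value as seen early, CONSTANT as seen early }. */
    public static int[] seenEarly() {
        return SEEN_EARLY.clone();
    }

    public static void main(String[] args) {
        int[] early = seenEarly();
        System.out.println("value seen early:    " + early[0]); // zero value from preparation
        System.out.println("CONSTANT seen early: " + early[1]);
        System.out.println("value now:           " + value);
    }
}
```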

The zero values of the basic data types mentioned above are listed below, with the zero value in parentheses:

  • int (0)
  • long (0L)
  • short ((short) 0)
  • char ('\u0000')
  • byte ((byte) 0)
  • boolean (false)
  • float (0.0f)
  • double (0.0d)
  • reference (null)

A few remarks about class variables, instance variables, and local variables:

  • 1. For primitive types, class variables (static) and instance variables that are used without being explicitly assigned receive the default zero value from the system. Local variables must be explicitly assigned before use, otherwise the code does not compile.
  • 2. A constant that is both static and final must be explicitly assigned, either at its declaration or in a static initializer block, otherwise the code does not compile. A constant that is only final can be assigned at its declaration, in an instance initializer block, or in a constructor. In short, a final variable must be explicitly assigned before it is used; the compiler does not accept relying on a default zero value.
  • 3. For reference types, such as array references and object references, a class or instance variable that is used without being explicitly assigned receives the default zero value, that is, null.
  • 4. If the elements of an array are not assigned when the array is created, each element receives the default zero value of its data type.
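
The four remarks above can be checked with a short program. This is a minimal sketch with hypothetical names (DefaultValuesDemo and its fields); the commented-out lines show the local-variable case, which the compiler rejects.

```java
public class DefaultValuesDemo {

    // Class variables: never assigned explicitly.
    static int classCount;
    static boolean classFlag;
    static Object classRef;

    // Instance variables: never assigned explicitly.
    double instanceTotal;
    String instanceName;

    /** Joins the observed default values into one comma-separated string. */
    public static String summary() {
        DefaultValuesDemo obj = new DefaultValuesDemo();
        int[] numbers = new int[1];      // elements are never assigned
        String[] names = new String[1];  // elements are never assigned

        // int local;
        // System.out.println(local);    // does not compile: local variables have no default value

        return classCount + "," + classFlag + "," + classRef + ","
                + obj.instanceTotal + "," + obj.instanceName + ","
                + numbers[0] + "," + names[0];
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```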

Resolution

The resolution phase is the process by which the virtual machine converts the symbolic references in the constant pool into direct references.

The differences and relationship between symbolic references and direct references are as follows:

  • Symbolic reference: A symbolic reference describes the referenced target with a set of symbols. The symbols can be literals of any form, as long as they locate the target unambiguously when used. A symbolic reference is independent of the memory layout implemented by the virtual machine, and the referenced target does not have to be loaded into memory yet.
  • Direct reference: A direct reference can be a pointer straight to the target, a relative offset, or a handle that locates the target indirectly. A direct reference depends on the memory layout implemented by the virtual machine, so the direct references produced from the same symbolic reference on different virtual machine instances are generally not the same. If a direct reference exists, the referenced target must already be in memory.

The resolution phase may start before initialization or after it. The virtual machine implementation decides, as needed, whether to resolve the symbolic references in the constant pool when the class is loaded (before initialization) or to wait until a symbolic reference is about to be used (after initialization).

It is common for the same symbolic reference to be resolved more than once, so a virtual machine implementation may cache the result of the first resolution (recording the direct reference in the runtime constant pool and marking the constant as resolved) to avoid repeating the work.

Resolution mainly targets seven kinds of symbolic references: classes or interfaces, fields, class methods, interface methods, method types, method handles, and call site specifiers. These correspond to seven constant types, including CONSTANT_Class_info, CONSTANT_Fieldref_info, CONSTANT_Methodref_info, and CONSTANT_InterfaceMethodref_info. The last three kinds are closely tied to the dynamic language support added in JDK 1.7, so only the first four are explained here.

1. Resolution of classes or interfaces: The virtual machine first determines whether the direct reference to be produced refers to an array type or to an ordinary object type, and resolves the two cases differently. The process is as follows:

Assume the current code is in class D, and a symbolic reference N that has never been resolved is to be resolved into a direct reference to a class or interface C. The virtual machine needs the following 3 steps to complete the resolution.
(1) If C is not an array type, the virtual machine passes the fully qualified name represented by N to the class loader of D to load class C. During loading, metadata verification and bytecode verification may trigger the loading of other related classes, such as the parent class of C or the interfaces it implements. If any exception occurs during this loading process, resolution fails.
(2) If C is an array type and the element type of the array is an object type, the descriptor of N has a form like "[Ljava/lang/Integer", and the array element type is loaded according to rule (1). With the descriptor just assumed, the element type to load is "java.lang.Integer". The virtual machine then generates an array object representing the array's dimensions and element type.
(3) If the steps above complete without any exception, C is now a valid class or interface in the virtual machine. Before resolution finishes, symbolic reference verification still has to confirm that D has access to C. If it does not, a java.lang.IllegalAccessError is thrown.
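
The array case in step (2) has a visible counterpart in the reflection API. The sketch below is an illustration, not the resolution step itself: it asks Class.forName for an array class by name. Note that the class file descriptor uses slashes ("[Ljava/lang/Integer;"), while Class.forName expects the dotted form. ArrayClassDemo is a hypothetical name.

```java
public class ArrayClassDemo {

    /** Looks up the class object for Integer[] by its binary name. */
    public static Class<?> lookUpIntegerArrayClass() {
        try {
            return Class.forName("[Ljava.lang.Integer;");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> arrayClass = lookUpIntegerArrayClass();
        System.out.println("is array:       " + arrayClass.isArray());
        System.out.println("element type:   " + arrayClass.getComponentType().getName());
        System.out.println("same as literal: " + (arrayClass == Integer[].class));
    }
}
```

No class file exists for the array type: only the element type java.lang.Integer is loaded from a byte stream, and the array class is created by the virtual machine.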

2. Field resolution:
To resolve a field symbolic reference that has not been resolved before, the virtual machine first resolves the CONSTANT_Class_info symbolic reference indexed by the class_index item of the field reference's constant pool entry, that is, the symbolic reference to the class or interface the field belongs to, following the class or interface resolution process above. If that succeeds, the class is searched first for a field whose simple name and field descriptor both match the target, and the search ends if one is found. If not, the interfaces the class implements and their parent interfaces are searched recursively, from bottom to top along the inheritance relationship. If the field is still not found, the parent classes are searched recursively, again from bottom to top, until the lookup ends. If no matching field is found, a java.lang.NoSuchFieldError is thrown.
3. Class method resolution and interface method resolution: Similar to field resolution.
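
The field search order (the class itself, then its interfaces, then its parent classes) can be seen by analogy through reflection, because Class.getField is documented to search in the same order. This is only an analogy: getField considers public fields only and is not the resolution step performed by the virtual machine. All names in the sketch (FieldLookupDemo, Named, Base, Derived) are hypothetical.

```java
import java.lang.reflect.Field;

public class FieldLookupDemo {

    public interface Named {
        int ID = 1; // implicitly public static final
    }

    public static class Base {
        public static int ID = 2;
    }

    // Declares no ID of its own; it inherits one from Named and one from Base.
    public static class Derived extends Base implements Named {}

    /** Returns the class or interface in which the lookup starting at `start` finds the field. */
    public static Class<?> ownerOf(Class<?> start, String fieldName) {
        try {
            Field field = start.getField(fieldName);
            return field.getDeclaringClass();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Interfaces are searched before the parent class.
        System.out.println("Derived.ID found in: " + ownerOf(Derived.class, "ID").getSimpleName());
        System.out.println("Base.ID found in:    " + ownerOf(Base.class, "ID").getSimpleName());
    }
}
```

In source code the expression Derived.ID would be rejected by javac as ambiguous, which is why the sketch goes through reflection.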

Initialization

Initialization is the last step of the class loading process. Only at this stage does the Java program code defined in the class really start to execute. In the preparation phase the class variables were assigned the initial values the system requires; in the initialization phase, class variables and other resources are initialized according to the plan the programmer wrote into the program. Put another way, the initialization phase is the process of executing the class constructor <clinit>() method.
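
That initialization is a separate step from loading can be seen with the three-argument Class.forName, whose second parameter controls whether the class is initialized. In this sketch (InitTriggerDemo, Lazy, and EVENTS are hypothetical names) the static block of Lazy runs only when one of its static fields is first read, not when the class is loaded.

```java
import java.util.ArrayList;
import java.util.List;

public class InitTriggerDemo {

    public static final List<String> EVENTS = new ArrayList<>();

    public static class Lazy {
        public static int answer = 42;

        static {
            EVENTS.add("Lazy.<clinit>");
        }
    }

    /** Loads Lazy without initializing it, then triggers initialization by reading a static field. */
    public static List<String> run() {
        try {
            String name = Lazy.class.getName(); // a class literal alone does not initialize Lazy
            ClassLoader loader = InitTriggerDemo.class.getClassLoader();
            Class.forName(name, false, loader); // initialize = false: load only
            EVENTS.add("loaded");
            int answer = Lazy.answer;           // first active use: <clinit>() runs here
            EVENTS.add("used, answer=" + answer);
            return new ArrayList<>(EVENTS);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```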

The following describes how the <clinit>() method is generated and how it is executed:

    • 1. The <clinit>() method is generated by the compiler, which automatically collects the assignments to all class variables in the class and the statements in static statement blocks (static {} blocks) and merges them. The order in which the compiler collects them is the order in which the statements appear in the source file. A static statement block can only read variables defined before it. A variable defined after the block can be assigned in the block, but it cannot be read there. For example:
        package org.wrh.classupload;

        public class TestClassDemo03 {
            static {
                i = 0; // a static block may assign to a variable declared after it, but may not read it
                // System.out.println(i); // compile error: Cannot reference a field before it is defined
            }

            public static int i = 1;
        }
    • 2. The <clinit>() method differs from the constructor of the class (the instance constructor <init>() method) in that it does not need to call the parent class constructor explicitly: the virtual machine guarantees that the parent class's <clinit>() method has finished executing before the subclass's <clinit>() method runs. Therefore, the first class whose <clinit>() method is executed in the virtual machine is always java.lang.Object. For example:
        package org.wrh.classupload;

        class Parent {
            public static int value = 1;

            static {
                System.out.println("Parent Init");
                value = 2;
            }
        }

        class Son extends Parent {
            public static int value_1 = value;

            static {
                System.out.println("Son Init");
            }
        }

        public class TestInit {
            public static void main(String[] args) {
                System.out.println(Son.value_1);
            }
        }

The results of the program run as follows:

Parent Init
Son Init
2

    • 3. The <clinit>() method is not mandatory for a class or interface. If a class has no static statement block and no assignment to a class variable, the compiler may omit the <clinit>() method for that class.
    • 4. An interface cannot contain a static statement block, but it can still contain assignments that initialize its variables (which are implicitly static and final), so an interface can have a <clinit>() method just as a class does. Interfaces differ from classes in this way: executing an interface's <clinit>() method does not require executing the parent interface's <clinit>() method first. The parent interface is initialized only when a variable defined in the parent interface is used. In addition, a class that implements the interface does not execute the interface's <clinit>() method when it is initialized.
    • 5. The virtual machine ensures that a class's <clinit>() method is correctly locked and synchronized in a multi-threaded environment. If multiple threads try to initialize a class at the same time, only one thread executes the class's <clinit>() method, and the other threads block until that thread has finished. If a class's <clinit>() method contains a long-running operation, it can block multiple threads, and in practice this kind of blocking is often well hidden.
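
Point 4 can be observed with a small experiment. In the sketch below (InterfaceInitDemo, Settings, FileSettings, and record are hypothetical names), the interface field is initialized by a method call, so its assignment ends up in the interface's <clinit>() method. One caveat beyond the text above: since Java 8, an interface that declares default methods is initialized together with an implementing class, so the sketch deliberately declares none.

```java
import java.util.ArrayList;
import java.util.List;

public class InterfaceInitDemo {

    public static final List<String> EVENTS = new ArrayList<>();

    static int record(String event) {
        EVENTS.add(event);
        return EVENTS.size();
    }

    public interface Settings {
        // Not a compile-time constant, so the assignment is compiled into the interface's <clinit>().
        int LOADED_AT = record("Settings.<clinit>");
    }

    public static class FileSettings implements Settings {
        static {
            record("FileSettings.<clinit>");
        }
    }

    /** Creates an implementing object first, then reads the interface field. */
    public static List<String> run() {
        new FileSettings();                // initializes FileSettings, but not Settings
        record("created");
        int loadedAt = Settings.LOADED_AT; // first use of the interface's variable
        record("read LOADED_AT=" + loadedAt);
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On a first run, the entry for the interface appears after "created", which shows that instantiating the implementing class did not initialize the interface.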

    • For more on class initialization, see the blog post "In-depth understanding of Java Virtual Machines: Class initialization".

Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.
