An in-depth look at the Java object initialization process

Source: Internet
Author: User

Let's take a look at this interview question:

    public class Base {
        private String baseName = "base";

        // Constructor
        public Base() {
            callName();
        }

        // Instance method
        public void callName() {
            System.out.println(baseName);
        }

        // Static nested class
        static class Sub extends Base {
            // Field of the static nested class
            private String baseName = "sub";

            // Overrides callName() of the parent class
            public void callName() {
                System.out.println(baseName);
            }
        }

        // Program entry point
        public static void main(String[] args) {
            Base b = new Sub();
        }
    }


Find the output of this program.

[Note: don't underestimate this short piece of code. Below we walk through the whole process step by step, from class loading to the final output, to see how the JVM handles it.]

1. Let's start with class loading. After the source file is compiled, two .class files are generated in the output directory: Base.class and Base$Sub.class. The class loader loads these .class files into memory when the classes are first needed.

2. Static initializer blocks run first, for the parent class and then for the subclass. In this example neither Base nor Sub declares a static block, so there is nothing to execute at this stage.

[Note 1] Static initialization runs parent class first, then subclass. [Note 2] Instance field initializers are compiled into the constructor and run in the order in which they appear in the source.
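
The two notes above can be checked with a small experiment. The following is only a sketch: the class names `StaticOrderDemo`, `Parent`, `Child` and the helper `record` are made up for illustration and are not part of the interview question.

```java
import java.util.ArrayList;
import java.util.List;

public class StaticOrderDemo {
    // Shared log so that the order of events can be inspected afterwards.
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static { LOG.add("Parent static block"); }
        private String name = record("Parent field initializer");
        Parent() { LOG.add("Parent constructor body"); }
    }

    static class Child extends Parent {
        static { LOG.add("Child static block"); }
        private String name = record("Child field initializer");
        Child() { LOG.add("Child constructor body"); }
    }

    // Hypothetical helper: logs the message and returns it as the field value.
    static String record(String message) {
        LOG.add(message);
        return message;
    }

    public static void main(String[] args) {
        new Child();
        LOG.forEach(System.out::println);
    }
}
```

On the first `new Child()` the log should read: both static blocks (parent, then child), then the parent's field initializer and constructor body, then the child's field initializer and constructor body.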

3. Once class initialization is complete, execution enters the main method. Pay close attention here, because this is the key point. The declared type Base on the left-hand side has no effect on what happens next, so you can skip it and start from new Sub(). This calls the implicit constructor of the Sub class, which the compiler generates for you for free. In essence, the constructor of Sub looks like this:

    public Sub() {
        // super() is always the first statement in a constructor. Why? Much as a
        // software upgrade must stay compatible with the old version, the parent
        // class's original initialization must run before the current class's own.
        super();
        baseName = "sub";
    }

4. Now the super() line executes, which takes us into the parent class. The constructor of the parent class is public Base() { callName(); }. Again, let's restore what this code really looks like:

    public Base() {
        // 4.1 Java has no field overriding, only method overriding. So at this
        // point it is the baseName field of the parent class that is assigned.
        baseName = "base";
        // 4.2 Call callName(). One execution detail needs attention here: when a
        // method is invoked from the parent class, the subclass version runs if
        // the subclass overrides it; otherwise the parent version runs. So here
        // the callName() method of the subclass is called.
        callName();
    }

5. The callName() method of the subclass runs and prints the baseName field of the subclass. At this point the field assignment in the subclass constructor has not run yet, so the output is null.

6. After the parent constructor finishes, control returns to the subclass constructor and continues from the line after super(). Only now is the subclass field baseName assigned its value "sub".
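
To see steps 5 and 6 side by side, here is a small variation on the interview code. It is a sketch with assumed names (`EarlyCallDemo` and the `SEEN` list are not in the original): instead of printing directly, `callName()` records what it observes, once during the parent constructor and once after construction has finished.

```java
import java.util.ArrayList;
import java.util.List;

public class EarlyCallDemo {
    // Records every value that callName() observed, in call order.
    static final List<String> SEEN = new ArrayList<>();

    static class Base {
        private String baseName = "base";

        Base() {
            callName(); // dispatches to Sub.callName() when the object is a Sub
        }

        public void callName() {
            SEEN.add(String.valueOf(baseName));
        }
    }

    static class Sub extends Base {
        private String baseName = "sub"; // assigned only after Base() has returned

        @Override
        public void callName() {
            SEEN.add(String.valueOf(baseName));
        }
    }

    public static void main(String[] args) {
        Base b = new Sub(); // first call happens inside Base()
        b.callName();       // second call happens after construction
        System.out.println(SEEN); // prints [null, sub]
    }
}
```

The same method on the same object sees two different values, which is why calling overridable methods from a constructor is usually discouraged.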

[Figure: call sequence diagram of this process]

As you can see, the subclass field is printed before it has been initialized, so the output at that moment is clearly null.

The output of the program is:

    null




Parsing the initialization process of Java classes and objects


Problem introduction

Recently I was debugging an enumeration resolver that loads more than 10,000 enumeration codes from the database into a cache. To quickly locate an enumeration code, or all the elements of a specific enumeration category, the class builds in-memory indexes using two strategies while it loads the codes. Because it is a common service class used at every layer of the program, I implemented it as a singleton. The class worked fine until I moved the statement that instantiates it: once I moved the instantiation into a static initializer, the program stopped working for me.

The following is the simplified sample code:

[List 1]

    package com.ccb.framework.enums;

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;

    public class CachingEnumResolver {
        // Singleton instance; the root cause of all the trouble
        private static final CachingEnumResolver SINGLE_ENUM_RESOLVER = new CachingEnumResolver();

        /* MSGCODE -> Category in-memory index */
        private static Map CODE_MAP_CACHE;

        static {
            CODE_MAP_CACHE = new HashMap();
            // To illustrate the problem, I initialize one entry here
            CODE_MAP_CACHE.put("0", "Beijing");
        }

        // Private, for the singleton
        private CachingEnumResolver() {
            // Initializing and loading the data is also this class's job
            initEnums();
        }

        /**
         * Initialize all enumeration types
         */
        public static void initEnums() {
            // ~~~~~~~~~ The problem is exposed from here ~~~~~~~~~~~ //
            if (null == CODE_MAP_CACHE) {
                System.out.println("CODE_MAP_CACHE is empty. The problem is exposed here.");
                CODE_MAP_CACHE = new HashMap();
            }
            CODE_MAP_CACHE.put("1", "Beijing");
            CODE_MAP_CACHE.put("2", "Yunnan Province");
            // ..... other code ...
        }

        public Map getCache() {
            return Collections.unmodifiableMap(CODE_MAP_CACHE);
        }

        /**
         * Get the singleton instance
         *
         * @return
         */
        public static CachingEnumResolver getInstance() {
            return SINGLE_ENUM_RESOLVER;
        }

        public static void main(String[] args) {
            System.out.println(CachingEnumResolver.getInstance().getCache());
        }
    }


After reading the code above, you may be a little puzzled. The class looks fine; it is a typical eager ("hungry-style") singleton. How could anything be wrong with it?

Indeed, it looks fine, but if you run it, it does not work correctly. Run the class, and the result is:

[List 2]

CODE_MAP_CACHE is empty. The problem is exposed here.
{0=Beijing}

How can my program behave like this? Why is CODE_MAP_CACHE null inside the initEnums() method? Why does the printed CODE_MAP_CACHE contain only one entry, and where did the other two go?

If you were debugging this program, you would surely be surprised at this point. Is something wrong with my JVM? No. Then what is wrong with my program? This is definitely not the result I wanted. No matter how I modified the initEnums() method, nothing helped, and at first I never suspected that the problem might lie in how the CachingEnumResolver instance is created. Precisely because I trusted the way the instance was created, and because I misunderstood the underlying principles of Java class initialization and object instantiation, the problem cost me three or four hours, about half a working day.

So where exactly is the problem? Why does something this strange happen? Before solving it, let's look at the underlying mechanism of class and object initialization in the JVM.


Class lifecycle


[Figure: the class life cycle]

The figure above shows the flow of the class life cycle. In this article I only discuss two of its stages: class "initialization" and "object instantiation".


Class initialization

Initialization is the last step a class or interface goes through before it is used for the first time. This stage assigns the correct initial values to class variables.

The Java compiler collects all class variable initializers and static initializer blocks into the <clinit>() method. This method can only be called by the JVM and is dedicated to class initialization.

Except for interfaces, the direct superclass of a class must already be initialized before the class itself is initialized, and the JVM guarantees that the initialization process is thread-safe. In addition, not every class has a <clinit>() method. A class has no <clinit>() method in the following cases:

The class declares neither class variables nor static initializer blocks;
The class declares class variables, but does not explicitly initialize them with class variable initializers or static initializer blocks;
The class contains only class variable initializers for static final variables, and those initializers are compile-time constant expressions.


Object initialization

Once a class has been loaded, linked, and initialized, it is ready to be used at any time. Object instantiation and initialization are the activities at the very start of an object's life. Here we mainly discuss the characteristics of object initialization.

When compiling a class, the Java compiler generates at least one instance initialization method for it, the <init>() method. There is one <init>() method for each constructor in the source code. If the class declares no constructor explicitly, the compiler generates a default no-argument constructor that only calls the no-argument constructor of the parent class, and also generates the <init>() method that corresponds to this default constructor.

In general, an <init>() method may contain three kinds of code: a call to another <init>() method, the initialization of instance variables, and the code of the corresponding constructor body.

If the constructor explicitly begins by calling another constructor of the same class, the corresponding <init>() method contains: a call to that <init>() method of the same class, followed by all the bytecode of the constructor body.

If the constructor does not begin by calling another constructor of its own class, and the class is not Object, the <init>() method contains: a call to an <init>() method of the parent class, the bytecode of the instance variable initializers, and the bytecode of the constructor body.

If the class is Object, its <init>() method does not include a call to a parent <init>() method.
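
The two shapes of <init>() described above can be observed from plain Java. This is a rough sketch with invented names (`InitMethodDemo`, `Parent`, `Child`, and the helper `log`): one constructor delegates with this(...), the other starts with an implicit super(), and a shared log shows that the field initializer runs only once, in the constructor that calls the parent.

```java
import java.util.ArrayList;
import java.util.List;

public class InitMethodDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        Parent() {
            LOG.add("Parent()");
        }
    }

    static class Child extends Parent {
        // Instance variable initializer: the compiler places it only in the
        // constructor that begins with super(...), not in the delegating one.
        private final int size = log("Child field initializer", 16);

        Child() {
            this("default");          // delegates to the other <init>() first
            LOG.add("Child() body");
        }

        Child(String label) {
            // implicit super(); then the field initializer; then this body
            LOG.add("Child(String) body: " + label);
        }
    }

    // Hypothetical helper: logs the message and returns the given value.
    static int log(String message, int value) {
        LOG.add(message);
        return value;
    }

    public static void main(String[] args) {
        new Child();
        LOG.forEach(System.out::println);
    }
}
```

The expected order is Parent(), Child field initializer, Child(String) body: default, Child() body.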


Class initialization time

So far we have learned about the stages of the class life cycle. But when is class loading, the start of the life cycle, triggered? And when is a class initialized? Let's continue looking for answers with these questions in mind.

The Java Virtual Machine Specification strictly defines when a class is initialized: "initialize on first active use". This rule directly affects how classes are loaded, linked, and initialized, because a type must be linked before it is initialized, and it must be loaded before it is linked.
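
The difference between "loaded" and "initialized" can be made visible with the three-argument form of Class.forName, whose second parameter controls whether the class is initialized. The sketch below uses made-up names (`LoadVsInitDemo`, `Lazy`, `observe`); it is an illustration, not something taken from the original program.

```java
public class LoadVsInitDemo {
    // Flipped by Lazy's static initializer so that initialization can be observed.
    static boolean initialized = false;

    static class Lazy {
        static { initialized = true; }
    }

    // Reports the flag after loading only, and again after an initializing lookup.
    static String observe() {
        try {
            // A class literal does not trigger initialization.
            String name = Lazy.class.getName();
            ClassLoader loader = LoadVsInitDemo.class.getClassLoader();

            // Load the class but ask the JVM not to initialize it.
            Class.forName(name, false, loader);
            boolean afterLoad = initialized;

            // The one-argument form initializes the class if needed.
            Class.forName(name);
            boolean afterInit = initialized;

            return afterLoad + " -> " + afterInit;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(observe());
    }
}
```

On the first call in a fresh JVM this should print false -> true: the class was already loaded when the flag was still false.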

The Java Virtual Machine Specification does not strictly define when a class is loaded relative to when it is initialized, which lets each JVM adopt a loading strategy that suits its own characteristics. Consider how the JBoss AOP framework works: it intervenes when your class files are loaded and inserts the AOP interception bytecode, which makes the process completely transparent to programmers. Even object instances that you create with the new operator can be intercepted by the AOP framework. With Spring AOP, by contrast, you must go through its BeanFactory to obtain managed objects that have been proxied by AOP. Of course, the drawback of JBoss AOP is just as obvious: it is tightly bound to the JBoss server, so you cannot easily port your application to other servers. Well, I have digressed. The implementation strategies of AOP could fill a thick book, so I will stop here.

After all this, we know that a class is initialized "on first active use". So what counts as a first active use?

First active use:

When a new instance of the class is created: by new, reflection, cloning, or deserialization;
When a static method of the class is called;
When a static field of a class or interface is read or assigned (except final fields that are initialized with a compile-time constant);
When certain Java reflection methods are called;
When a subclass of the class is initialized;
When the class is the startup class that contains the main() method at VM startup.

All other uses of a Java type are passive uses, and they do not trigger class initialization.
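
A quick way to tell active from passive use is to log from a static block and then touch the class in different ways. The sketch below uses invented names (`ActiveUseDemo`, `Config`, `MAX`, `LIMIT`): creating an array of the type and reading a compile-time constant are passive, while reading a non-constant static field is active.

```java
import java.util.ArrayList;
import java.util.List;

public class ActiveUseDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Config {
        // Compile-time constant: its value is inlined at the use site.
        static final int MAX = 100;
        // Not a compile-time constant: reading it requires initialization.
        static final Integer LIMIT = Integer.valueOf(200);

        static { LOG.add("Config initialized"); }
    }

    public static void main(String[] args) {
        Config[] array = new Config[2]; // passive: array of the type
        int max = Config.MAX;           // passive: compile-time constant
        LOG.add("after passive uses");

        Integer limit = Config.LIMIT;   // active: non-constant static field
        LOG.add("after active use");

        LOG.forEach(System.out::println);
    }
}
```

On the first run, "Config initialized" should appear between the two markers, showing that only the read of LIMIT triggered the static block.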


Where exactly is my problem?

Now that we understand how the JVM initializes classes and objects, we have a theoretical basis for analyzing the problem rationally.

Next, let's look at the bytecode disassembled from the Java source code in [List 1]:
[List 3]

    public class com.ccb.framework.enums.CachingEnumResolver extends java.lang.Object {

    static {};
      Code:
        0: new #2; //class CachingEnumResolver
        3: dup
        4: invokespecial #14; //Method "<init>":()V   ①
        7: putstatic #16; //Field SINGLE_ENUM_RESOLVER:Lcom/ccb/framework/enums/CachingEnumResolver;
       10: new #18; //class HashMap   ②
       13: dup
       14: invokespecial #19; //Method java/util/HashMap."<init>":()V
       17: putstatic #21; //Field CODE_MAP_CACHE:Ljava/util/Map;
       20: getstatic #21; //Field CODE_MAP_CACHE:Ljava/util/Map;
       23: ldc #23; //String 0
       25: ldc #25; //String Beijing
       27: invokeinterface #31, 3; //InterfaceMethod java/util/Map.put:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;   ③
       32: pop
       33: return

    private com.ccb.framework.enums.CachingEnumResolver();
      Code:
        0: aload_0
        1: invokespecial #34; //Method java/lang/Object."<init>":()V
        4: invokestatic #37; //Method initEnums:()V   ④
        7: return

    public static void initEnums();
      Code:
        0: getstatic #21; //Field CODE_MAP_CACHE:Ljava/util/Map;   ⑤
        3: ifnonnull 24
        6: getstatic #44; //Field java/lang/System.out:Ljava/io/PrintStream;
        9: ldc #46; //String CODE_MAP_CACHE is empty. The problem is exposed here.
       11: invokevirtual #52; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
       14: new #18; //class HashMap
       17: dup
       18: invokespecial #19; //Method java/util/HashMap."<init>":()V   ⑥
       21: putstatic #21; //Field CODE_MAP_CACHE:Ljava/util/Map;
       24: getstatic #21; //Field CODE_MAP_CACHE:Ljava/util/Map;
       27: ldc #54; //String 1
       29: ldc #25; //String Beijing
       31: invokeinterface #31, 3; //InterfaceMethod java/util/Map.put:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;   ⑦
       36: pop
       37: getstatic #21; //Field CODE_MAP_CACHE:Ljava/util/Map;
       40: ldc #56; //String 2
       42: ldc #58; //String Yunnan Province
       44: invokeinterface #31, 3; //InterfaceMethod java/util/Map.put:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;   ⑧
       49: pop
       50: return

    public java.util.Map getCache();
      Code:
        0: getstatic #21; //Field CODE_MAP_CACHE:Ljava/util/Map;
        3: invokestatic #66; //Method java/util/Collections.unmodifiableMap:(Ljava/util/Map;)Ljava/util/Map;
        6: areturn

    public static com.ccb.framework.enums.CachingEnumResolver getInstance();
      Code:
        0: getstatic #16; //Field SINGLE_ENUM_RESOLVER:Lcom/ccb/framework/enums/CachingEnumResolver;   ⑨
        3: areturn

    }


[List 3] shows the bytecode of [List 1] as compiled in a JDK 1.4 environment. This listing may not be very appealing to many readers, because JVM instructions are not as easy to read as source code. But it is the most direct way to find and locate the problem: the answer we want is in this list of JVM instructions.

Now let's trace the execution of the code in [List 1], from class initialization to object instance initialization.

As described above, class initialization is the last preparatory step before a class can actually be used. This stage assigns the correct initial values to all class variables. It is thread-safe: the JVM guarantees synchronization across threads.

Step 1: Call the class initialization method CachingEnumResolver.<clinit>(). This method is invisible to the outside world; in other words, it is a special method reserved for the JVM. <clinit>() contains the initialization statements of all class variables in CachingEnumResolver that are given an explicit initial value. Note that not every class has this method; the conditions were described earlier.

Step 2: Enter the <clinit>() method and look at line "①" in the bytecode. Together with the two lines above it, this line creates a new CachingEnumResolver instance; the line itself invokes the <init>() method of the CachingEnumResolver class. Every Java class has an <init>() method, which is generated by the Java compiler and is invisible to outside code. <init>() contains all instance variable initializers that specify an initial value, plus all the statements in the constructor. It is used to initialize the object while it is being instantiated. At this point, however, a hidden problem is already lying in wait.

Step 3: Follow the execution order down to line "④". The method that contains this line is the constructor of the class. It first calls the <init>() method of the parent class to initialize the parent part of the object, and then calls CachingEnumResolver.initEnums() to load the data.

Step 4: Line "⑤" reads the value of the CODE_MAP_CACHE field, which is null at run time. This is where the problem starts to show. (As the programmer, you certainly expected the field to be initialized by now, but in fact it is not.) Because the field is null, the program goes on to line "⑥" and assigns a new HashMap to the field.

Step 5: Lines "⑦" and "⑧" put two entries into the CODE_MAP_CACHE field.

Step 6: Exit the object initialization method <init>() and assign the newly created instance to the class field SINGLE_ENUM_RESOLVER. (Note: at this moment the class variables of the class are not yet fully initialized. CODE_MAP_CACHE, which was just assigned through initEnums(), has not yet been initialized by <clinit>(), and it will be overwritten later in the class initialization process.)

Step 7: Continue with the rest of the <clinit>() method, at line "②". This line assigns a new HashMap instance to the CODE_MAP_CACHE field. (Note: the field was already assigned during object instantiation, and now it is assigned another instance. The map that held the two entries is replaced, and line "③" then puts the single entry "0" into the new map. At this point, our questions have been answered.)

Step 8: Class initialization is complete, and with it the instantiation of the singleton.
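
The essence of these steps is that static initializers run in textual order, so a constructor invoked from an earlier initializer sees later static fields at their default values. The following stripped-down sketch isolates that effect; the names `TextualOrderDemo`, `Broken`, `Fixed`, `rate`, and `seenRate` are assumptions made for illustration and are not part of [List 1].

```java
public class TextualOrderDemo {
    static class Broken {
        // Runs first: the constructor executes while 'rate' still holds
        // its default value 0.
        static final Broken INSTANCE = new Broken();
        static int rate = 5;

        final int seenRate;

        Broken() {
            seenRate = rate;
        }
    }

    static class Fixed {
        // Same members, but 'rate' is declared (and assigned) before the instance.
        static int rate = 5;
        static final Fixed INSTANCE = new Fixed();

        final int seenRate;

        Fixed() {
            seenRate = rate;
        }
    }

    public static void main(String[] args) {
        System.out.println("Broken saw rate = " + Broken.INSTANCE.seenRate); // 0
        System.out.println("Fixed saw rate = " + Fixed.INSTANCE.seenRate);   // 5
    }
}
```

The only difference between the two classes is the order of the two static declarations, which is exactly the difference between the working and the failing version of the resolver.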

After this walk through the bytecode, you may now understand the underlying cause of the error, or you may have been thoroughly confused by the analysis. Either way, the effort is worth it. I could have explained the problem from the source code alone, but that would not go deep enough, and it would risk being merely my personal opinion, lacking credibility.


Solution

Fixing the problem in the code above is simple: move the initialization of the SINGLE_ENUM_RESOLVER variable into the getInstance() method. In other words, before class initialization is complete, avoid instantiating the class from within itself and avoid referencing fields that have not been initialized yet.
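
One possible shape of that fix is sketched below. The class name `LazyCachingEnumResolver` is hypothetical, and the `synchronized` modifier and the generic `Map<String, String>` type are my own additions rather than part of the original code; treat it as an illustration of the idea, not as the original author's final implementation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class LazyCachingEnumResolver {
    // No longer created during <clinit>(); created on the first getInstance() call.
    private static LazyCachingEnumResolver instance;

    private static final Map<String, String> CODE_MAP_CACHE = new HashMap<>();

    static {
        CODE_MAP_CACHE.put("0", "Beijing");
    }

    private LazyCachingEnumResolver() {
        initEnums();
    }

    private static void initEnums() {
        // By the time this runs, class initialization has already finished.
        CODE_MAP_CACHE.put("1", "Beijing");
        CODE_MAP_CACHE.put("2", "Yunnan Province");
    }

    public Map<String, String> getCache() {
        return Collections.unmodifiableMap(CODE_MAP_CACHE);
    }

    // 'synchronized' is an assumption added here to keep lazy creation thread-safe.
    public static synchronized LazyCachingEnumResolver getInstance() {
        if (instance == null) {
            instance = new LazyCachingEnumResolver();
        }
        return instance;
    }

    public static void main(String[] args) {
        System.out.println(getInstance().getCache());
    }
}
```

With this arrangement the cache should contain all three entries, because the constructor can only run after the static block has finished.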


Conclusion

Take a moment to think calmly about whether you have really mastered the topic of this article. If you feel you have mastered it completely, or at least mostly, that is great. To finish, I will modify the earlier code slightly. Do the following two programs still have a problem?

Procedure 1

    public class CachingEnumResolver {
        public static Map CODE_MAP_CACHE;
        static {
            CODE_MAP_CACHE = new HashMap();
            // To illustrate the problem, I initialize one entry here
            CODE_MAP_CACHE.put("0", "Beijing");
            initEnums();
        }
        // ...

Procedure 2

    public class CachingEnumResolver {
        private static final CachingEnumResolver SINGLE_ENUM_RESOLVER;
        public static Map CODE_MAP_CACHE;
        static {
            CODE_MAP_CACHE = new HashMap();
            // To illustrate the problem, I initialize one entry here
            CODE_MAP_CACHE.put("0", "Beijing");
            SINGLE_ENUM_RESOLVER = new CachingEnumResolver();
            initEnums();
        }
        // ...

Finally, a few comments about the Java community. Spring is a popular open-source framework that has attracted the attention of a large number of JEE developers (I am a fan myself). But let's look more closely: how many Spring fans have studied the Spring source code? How many have a deep understanding of Spring's design ideas? Of course, I am not really qualified to talk in such a tone. I just want to make the point that when you learn something, you should get to the root of it.
