Java constant pool technology

Source: Internet
Author: User

Reprinted from http://my.oschina.net/xianggao/blog/84179

Author: Old demon

 


Constant pool technology in Java exists so that certain objects can be created quickly and cheaply. When such an object is needed, it is taken from the pool if the pool already holds one, which saves time when equal values are requested repeatedly. A constant pool is simply a region of memory, distinct from the heap space used by objects created with the new keyword. String is one of the most heavily used classes in Java, so a constant pool is implemented for it as well to make String objects cheap to create.

Before describing the constant pool, let's look at the memory model of the JVM runtime data area. According to the book Inside the Java Virtual Machine, the runtime data area consists of five parts:

[1] Method area
[2] Heap
[3] Java stack
[4] PC register
[5] Native method stack

For String s = "Haha"; the JVM instructions are:

0: ldc #16; // String Haha
2: astore_1
3: return

The book Inside the Java Virtual Machine describes each of these instructions as follows (applied to the example above):

ldc instruction format: ldc, index

ldc instruction process: to execute ldc, the JVM first looks up the constant pool entry specified by index. That entry must be a CONSTANT_Integer_info, CONSTANT_Float_info, or CONSTANT_String_info entry. If the entry has not been resolved yet, the JVM resolves it. For "Haha" above, the JVM finds a CONSTANT_String_info entry and pushes a reference to the interned String object (produced when the entry is resolved) onto the operand stack.

astore_1 instruction format: astore_1

astore_1 instruction process: to execute astore_1, the JVM pops a value of type reference or returnAddress from the top of the operand stack and stores it in local variable 1.

return instruction process: return from the method. The return type is void.
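A minimal sketch of what the ldc description implies: every execution of the same ldc pushes a reference to the same interned String object. The class and method names (LdcDemo, haha) are made up for illustration. Compiling the class and running javap -c on it should show an ldc instruction similar to the listing above, although the constant pool index will probably differ.

```java
public class LdcDemo {
    // The method body compiles to an ldc of the "Haha" entry followed by areturn.
    static String haha() {
        return "Haha";
    }

    public static void main(String[] args) {
        String first = haha();
        String second = haha();
        // Both calls return the reference pushed by ldc, so they are the same object.
        System.out.println(first == second); // prints true
    }
}
```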

My personal understanding:

From the way ldc executes, we can conclude that the value of s is a reference to the interned String object (produced when the entry is resolved); in other words, it is a copy of the reference to the interned String object. My understanding is therefore that the value of s lives on the stack. That covers s; now for the value "Haha". In String s = "Haha"; the value "Haha" is fixed at compile time. Put simply, once the program has been compiled, the value Haha is present in the class file (you can open the class file with UltraEdit or another editor and find Haha in the bytecode). To run a Java program, the class file is first generated and then loaded into memory for the JVM to execute. When the JVM loads the class, how does it allocate space for it, and where in memory does it store the value Haha?

To answer that, let's look at the structure of the JVM constant pool as described in Inside the Java Virtual Machine:

Constant pool

The virtual machine must maintain a constant pool for each type it loads. A constant pool is an ordered set of the constants used by that type, including direct constants (string, integer, and floating-point constants) and symbolic references to other types, fields, and methods. The value of a String constant lives in the constant pool. In the JVM, the constant pool exists in memory as a table. For strings there is a fixed-length CONSTANT_String_info table whose string_index points to the CONSTANT_Utf8 table holding the literal text. Note that a CONSTANT_String_info table stands for a literal string value, not a symbolic reference. This should make clear where string values are stored in the constant pool.

Detailed structure

A Java program contains many things that are fixed and never change while it runs: class names, field names and types, method names with their return types and parameter types, constants, and a large number of literal values. Each of these is a constant table (constant item) in the constant pool. The constant tables differ in kind; the class file format described here has 11 of them (later class file versions add more), as shown below:

(1) CONSTANT_Utf8 represents, in UTF-8 encoding, all the important constant strings in the program. These strings include: ① fully qualified names of classes or interfaces, ② fully qualified names of superclasses, ③ fully qualified names of superinterfaces, ④ field names and type names, ⑤ method names, return type names, and parameter type names, ⑥ string literal values.

Table format: tag (value 1, 1 byte) length (number of bytes the string occupies, 2 bytes) bytes (the byte sequence of the string)

(2) CONSTANT_Integer, CONSTANT_Float, CONSTANT_Long, and CONSTANT_Double hold literal values of the primitive types. For example, a 1 in the program is represented by CONSTANT_Integer, and 3.1415926f by CONSTANT_Float.

Table format: tag bytes (the byte sequence required by the primitive type)

(3) CONSTANT_Class is a symbolic reference to a class or interface. All class names are stored as CONSTANT_Utf8 tables, but nothing in a CONSTANT_Utf8 table says whether its string is a class name or a method name. A constant that points at the class-name string is therefore needed.

Table format: tag name_index (index of the CONSTANT_Utf8 table holding the class or interface name)

(4) CONSTANT_String works like CONSTANT_Class: it points to the CONSTANT_Utf8 table holding the string literal value.

Table format: tag string_index (index of the CONSTANT_Utf8 table holding the string literal value)

(5) CONSTANT_Fieldref, CONSTANT_Methodref, and CONSTANT_InterfaceMethodref point to the CONSTANT_Class table of the class that declares the field or method, and to the CONSTANT_NameAndType table holding the name and descriptor of the field or method.

Table format: tag class_index (index of the CONSTANT_Class table of the declaring class) name_and_type_index (index of the CONSTANT_NameAndType table holding the field or method name and descriptor)

(6) CONSTANT_NameAndType points to the CONSTANT_Utf8 tables holding the field or method name and the descriptor.

Table format: tag name_index (index of the CONSTANT_Utf8 table holding the field or method name) descriptor_index (index of the CONSTANT_Utf8 table holding the descriptor)
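The table formats above can be sketched as a small reader. This is only an illustration under stated assumptions: the class ConstantPoolSketch and its methods are made-up names, and the input is a hand-built fragment (a count followed by three entries), not a complete class file. It handles only the 11 tags discussed here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ConstantPoolSketch {
    // Tag values of the 11 constant tables discussed in the text.
    static final int UTF8 = 1, INTEGER = 3, FLOAT = 4, LONG = 5, DOUBLE = 6,
            CLASS = 7, STRING = 8, FIELDREF = 9, METHODREF = 10,
            INTERFACE_METHODREF = 11, NAME_AND_TYPE = 12;

    /** Hand-built pool: #1 Utf8 "Haha", #2 String -> #1, #3 Integer 1. */
    static byte[] sampleFragment() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeShort(4);    // constant_pool_count is the number of entries plus one
            out.writeByte(UTF8);
            out.writeUTF("Haha"); // 2-byte length, then the encoded bytes
            out.writeByte(STRING);
            out.writeShort(1);    // string_index points at entry #1
            out.writeByte(INTEGER);
            out.writeInt(1);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Walks the pool and returns one description line per entry. */
    static List<String> describe(byte[] pool) {
        List<String> lines = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(pool))) {
            int count = in.readUnsignedShort();
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                String prefix = "#" + i + " ";
                switch (tag) {
                    case UTF8:    lines.add(prefix + "Utf8 " + in.readUTF()); break;
                    case INTEGER: lines.add(prefix + "Integer " + in.readInt()); break;
                    case FLOAT:   lines.add(prefix + "Float " + in.readFloat()); break;
                    case LONG:    lines.add(prefix + "Long " + in.readLong()); i++; break;
                    case DOUBLE:  lines.add(prefix + "Double " + in.readDouble()); i++; break;
                    case CLASS:   lines.add(prefix + "Class -> #" + in.readUnsignedShort()); break;
                    case STRING:  lines.add(prefix + "String -> #" + in.readUnsignedShort()); break;
                    case FIELDREF:
                    case METHODREF:
                    case INTERFACE_METHODREF:
                    case NAME_AND_TYPE:
                        // All four hold two 2-byte indexes into the pool.
                        lines.add(prefix + "tag " + tag + " -> #" + in.readUnsignedShort()
                                + ", #" + in.readUnsignedShort());
                        break;
                    default:
                        throw new IOException("Unsupported tag " + tag);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe(sampleFragment())) {
            System.out.println(line);
        }
    }
}
```

The i++ in the Long and Double cases reflects that those entries occupy two slots in the pool. The String entry stores only an index; the text itself sits in the Utf8 entry it points to.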

Each string literal in Java source code is compiled into the class file as a constant table with tag 8 (CONSTANT_String_info). When the JVM loads the class file, it builds an in-memory data structure for the constant pool and stores it in the method area. The JVM also automatically creates a String object in the heap (an interned String object) for the literal value of each CONSTANT_String_info table. It then replaces the entry of the CONSTANT_String_info table with the direct address of that String object in the heap (constant pool resolution).

Interned String objects

All string constants with the same literal value in the source code share one unique interned String object. The JVM maintains this guarantee through an internal data structure that records the references to interned strings. In a Java program, you can call the intern() method of String to obtain the interned String object that is equal to an ordinary String object.
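A small sketch of intern(), using nothing beyond java.lang.String; the class name InternDemo and the helper sameObject are made up for illustration. The second string is built from a char array so that it is created at run time rather than taken from the constant pool.

```java
public class InternDemo {
    /** Reference comparison, spelled out so the intent is obvious. */
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "Haha";
        String runtime = new String(new char[] {'H', 'a', 'h', 'a'});
        System.out.println(sameObject(literal, runtime));          // false: two objects
        System.out.println(literal.equals(runtime));               // true: same content
        System.out.println(sameObject(literal, runtime.intern())); // true: interned object
    }
}
```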

Wrapper classes of the eight primitive types and object pools

Most of the primitive wrapper classes in Java implement constant pool technology: Byte, Short, Integer, Long, Character, and Boolean do, while the two floating-point wrapper classes, Float and Double, do not. In addition, the five integral wrapper classes Byte, Short, Integer, Long, and Character use the object pool only for small values: -128 to 127 for Byte, Short, Integer, and Long, and 0 to 127 for Character. Objects outside that range are not created and managed by the pool. Test code:

public class Test {
    public static void main(String[] args) {
        // The five integral wrapper classes Byte, Short, Integer, Long, and Character
        // take objects from the pool for values up to 127.
        Integer i1 = 127;
        Integer i2 = 127;
        System.out.println(i1 == i2); // prints true

        // Values above 127 are not taken from the pool (with the default cache size).
        Integer i3 = 128;
        Integer i4 = 128;
        System.out.println(i3 == i4); // prints false

        // Boolean also implements constant pool technology.
        Boolean bool1 = true;
        Boolean bool2 = true;
        System.out.println(bool1 == bool2); // prints true

        // The floating-point wrapper classes do not implement constant pool technology.
        Double d1 = 1.0;
        Double d2 = 1.0;
        System.out.println(d1 == d2); // prints false
    }
}

Supplementary code for Integer objects

public static Integer valueOf(int i) {
    final int offset = 128;
    if (i >= -128 && i <= 127) {
        return IntegerCache.cache[i + offset];
    }
    return new Integer(i);
}

When you assign an int value directly to an Integer, the compiler actually calls valueOf. The value 128 is special: it falls outside the cached range, so no cached object is used and each assignment creates a new object. The two assignments of 128 in the test code are therefore equivalent to:

Integer a = new Integer(128);
Integer b = new Integer(128);

Comparing a and b with == then prints false. If you change the number to 127 and run the following:

Integer a = 127;
Integer b = 127;
System.out.println(a == b);

The result is true.

It is best to compare objects with equals(), which lets you control the comparison to suit your purpose. To distinguish equals() from ==: equals() compares content (for a String, the character sequence; for an Integer, the numeric value), while == compares references.
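A short sketch of the advice above, with made-up names (WrapperCompare, sameValue). It also shows one pitfall worth knowing: equals() on a wrapper returns false when the argument is a different wrapper type, even if the numeric values match.

```java
public class WrapperCompare {
    /** Compares two Integer objects by value rather than by reference. */
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Integer a = 128;
        Integer b = 128;
        System.out.println(sameValue(a, b));              // true: compares the int values
        System.out.println(a.intValue() == b.intValue()); // true: compares primitives

        Long c = 127L;
        System.out.println(c.equals(127));  // false: 127 is boxed to an Integer, not a Long
        System.out.println(c.equals(127L)); // true
    }
}
```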

Let's look at the contents of the IntegerCache class:

private static class IntegerCache {
    private IntegerCache() {
    }

    static final Integer cache[] = new Integer[-(-128) + 127 + 1];

    static {
        for (int i = 0; i < cache.length; i++)
            cache[i] = new Integer(i - 128);
    }
}

Because cache[] is a static array in the IntegerCache class, it is initialized only once, in the static { ... } block. For an Integer in the range -128 to 127, no new memory needs to be allocated: every such object is the one already held in IntegerCache.cache, which improves efficiency to some extent.
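The same caching technique can be sketched with a made-up class. SmallInt, its LOW and HIGH bounds, and of() are hypothetical names chosen for illustration and are not part of the JDK; the sketch only mirrors the idea of IntegerCache shown above.

```java
public final class SmallInt {
    private static final int LOW = -128;
    private static final int HIGH = 127;
    // One shared instance per value in [LOW, HIGH], created once at class initialization.
    private static final SmallInt[] CACHE = new SmallInt[HIGH - LOW + 1];

    static {
        for (int i = 0; i < CACHE.length; i++) {
            CACHE[i] = new SmallInt(i + LOW);
        }
    }

    private final int value;

    private SmallInt(int value) {
        this.value = value;
    }

    /** Returns the cached instance for small values and a fresh instance otherwise. */
    public static SmallInt of(int value) {
        if (value >= LOW && value <= HIGH) {
            return CACHE[value - LOW];
        }
        return new SmallInt(value);
    }

    public int value() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(SmallInt.of(127) == SmallInt.of(127)); // true: same cached object
        System.out.println(SmallInt.of(128) == SmallInt.of(128)); // false: two new objects
    }
}
```

Making the constructor private forces callers through of(), which is what lets the class decide whether a cached object can be reused.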

Supplement on String

Within one class, equal string literals refer to the same String object.

In different classes of the same package, equal string literals refer to the same String object.

In different classes of different packages, equal string literals still refer to the same String object.

Literals joined in a constant expression are recognized as one string when compiled into the .class file and automatically folded into a single constant, so they too refer to the same String object.

A string created at run time has its own memory address, so it does not refer to the same String object.

The intern() method of String checks whether the constant pool already contains an equal string. If it does, a reference to that string is returned. If not, the string is added to the constant pool. Note that only the string value is added, so two copies exist afterwards: the copy in the constant pool, which is managed privately by the String class, and the original object, which continues to follow the normal object life cycle.
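The rules above can be checked with a small sketch. The class SharedLiteralDemo, its nested classes First and Second, and the helper atRuntime are made-up names; the nested classes stand in for "different classes", which is an assumption of this example rather than a test across packages.

```java
public class SharedLiteralDemo {
    static class First {
        static final String VALUE = "Haha";
    }

    static class Second {
        static final String VALUE = "Ha" + "ha"; // constant expression, folded to "Haha"
    }

    /** Builds the text at run time, so the result is a separate object. */
    static String atRuntime(String left, String right) {
        return new StringBuilder(left).append(right).toString();
    }

    public static void main(String[] args) {
        System.out.println(First.VALUE == Second.VALUE);   // true: one shared object
        String built = atRuntime("Ha", "ha");
        System.out.println(First.VALUE == built);          // false: created at run time
        System.out.println(First.VALUE == built.intern()); // true: intern() finds the pooled one
    }
}
```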

String s = "Haha"

With the concepts of the JVM constant pool in place, let's return to the question raised at the beginning: where in memory is the value "Haha" stored? By the time the class file has been loaded into memory by the JVM, and before the execution engine decodes and executes the ldc instruction, the JVM has already allocated space for the Haha string in the CONSTANT_String_info table of the constant pool. Since the Haha string constant is stored in the constant pool, we can follow the description in Inside the Java Virtual Machine: the constant pool is part of the type information, type information is kept for every loaded type, and it resides in the method area of the JVM memory model. In other words, the constant pool that belongs to the type information lives in the method area, and the method area is allocated by the JVM from the heap of the JVM memory model. Therefore, the value Haha should reside in heap space.

For String s = new String("Haha"); the JVM instructions are:

0: new #16; // class String
3: dup
4: ldc #18; // String Haha
6: invokespecial #20; // Method java/lang/String."<init>":(Ljava/lang/String;)V
9: astore_1
10: return

The book Inside the Java Virtual Machine describes each of these instructions as follows (applied to the example above):

new instruction format: new, indexbyte1, indexbyte2

new instruction process:

To execute new, the JVM computes (indexbyte1 << 8) | indexbyte2 to produce an unsigned 16-bit index into the constant pool and looks up the entry at that index. The entry must be a CONSTANT_Class_info. If the entry has not been resolved yet, the JVM resolves it; it must resolve to a class. The JVM then allocates enough space from the heap for the image of the new object and sets the object's instance variables to their default values. Finally, the JVM pushes the reference objectref to the new object onto the operand stack.
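The index computation can be worked through in a few lines. IndexBytes and poolIndex are made-up names. The masking with 0xFF is needed only because Java's byte type is signed, whereas the operand bytes of the instruction are unsigned.

```java
public class IndexBytes {
    /** Combines the two operand bytes of an instruction into an unsigned 16-bit index. */
    static int poolIndex(byte indexbyte1, byte indexbyte2) {
        return ((indexbyte1 & 0xFF) << 8) | (indexbyte2 & 0xFF);
    }

    public static void main(String[] args) {
        // "new #16" is encoded as the opcode followed by the bytes 0x00 0x10.
        System.out.println(poolIndex((byte) 0x00, (byte) 0x10)); // prints 16
        // 0x01 0xFF gives (1 << 8) | 255 = 511.
        System.out.println(poolIndex((byte) 0x01, (byte) 0xFF)); // prints 511
    }
}
```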

dup instruction format: dup

dup instruction process:

To execute dup, the JVM copies the one-word value at the top of the operand stack and pushes the copy onto the stack. This instruction can copy any single-word value at the top of the operand stack, but it must never be used to copy one half of a two-word value (long or double) at the top of the operand stack. In the example above, it copies the reference objectref, so two references to the new object now sit on the operand stack.

ldc instruction format: ldc, index

ldc instruction process:

To execute ldc, the JVM first looks up the constant pool entry specified by index. That entry must be a CONSTANT_Integer_info, CONSTANT_Float_info, or CONSTANT_String_info entry. If the entry has not been resolved yet, the JVM resolves it. For "Haha" above, the JVM finds a CONSTANT_String_info entry and pushes a reference to the interned String object (produced when the entry is resolved) onto the operand stack.

invokespecial instruction format: invokespecial, indexbyte1, indexbyte2

invokespecial instruction process: here, this instruction is used to call the instance initialization method <init>. For the full details of this instruction, see the description in Inside the Java Virtual Machine. In the example above, the constructor of the String class is called through one of the two references to initialize the object instance, so the other, identical reference now points to an initialized object. The reference used for the call is popped from the operand stack.

astore_1 instruction format: astore_1

astore_1 instruction process:

To execute astore_1, the JVM pops a value of type reference or returnAddress from the top of the operand stack and stores it in local variable 1.

return instruction process:

Return from the method. The return type is void.


From the six instructions above, we can see that the Haha in String s = new String("Haha"); is stored in heap space, while s itself is held in a local variable on the Java stack.
That is my analysis of where s and the value Haha reside in memory. The next question is: how many objects does the statement String s = new String("Haha"); create?
In my understanding, "Haha" is itself an object in the constant pool. When new String() executes at run time, the content of the object in the constant pool is copied into a new object on the heap, and the reference to that heap object is assigned to s. This statement therefore creates two String objects.
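One way to observe this is to count distinct objects by identity. The sketch below uses made-up names (NewStringDemo, distinctObjects) and an identity-based set from the standard library; it shows that the literal and each new String("Haha") are separate objects, while intern() maps all of them back to the pooled one.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class NewStringDemo {
    /** Counts how many distinct objects the given references point to (by identity). */
    static int distinctObjects(Object... refs) {
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Collections.addAll(seen, refs);
        return seen.size();
    }

    public static void main(String[] args) {
        String literal = "Haha";
        String s1 = new String("Haha");
        String s2 = new String("Haha");
        System.out.println(distinctObjects(literal, s1, s2));                   // prints 3
        System.out.println(distinctObjects(literal, s1.intern(), s2.intern())); // prints 1
    }
}
```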

FAQs about String

Using and understanding final with String:

final StringBuffer a = new StringBuffer("111");
final StringBuffer b = new StringBuffer("222");
a = b; // does not compile

final StringBuffer a = new StringBuffer("111");
a.append("222"); // compiles

As this shows, final applies only to the "value" of the reference (that is, the memory address): it forces the reference to keep pointing at the object it was initialized with, and changing what it points to is a compile error. final places no restriction on changes to the object itself.

Several examples of String constant pool questions:

[1]

String a = "a1";
String b = "a" + 1;
System.out.println(a == b); // result = true

String a = "atrue";
String b = "a" + "true";
System.out.println(a == b); // result = true

String a = "a3.4";
String b = "a" + 3.4;
System.out.println(a == b); // result = true

Analysis: When string constants are joined with "+", the compiler folds the concatenation into the resulting value at compile time. Take "a" + 1 as an example: after the compiler's optimization it is already a1 in the class file. The value of the string constant is fixed at compile time, so the result of the program above is true.

[2]

String a = "ab";
String bb = "b";
String b = "a" + bb;
System.out.println(a == b); // result = false

Analysis: Here the "+" concatenation involves a string reference, and the value of a reference cannot be determined at compile time, so "a" + bb cannot be folded by the compiler. The concatenation is performed at run time, and the address of the newly created string is assigned to b. The result of the program above is therefore false.

[3]

String a = "ab";
final String bb = "b";
String b = "a" + bb;
System.out.println(a == b); // result = true

Analysis: The only difference from [2] is that bb is declared final. A final variable initialized with a constant is treated at compile time as a local copy of that constant value, stored in the class's own constant pool or embedded in its bytecode stream. "a" + bb therefore has the same effect as "a" + "b", and the result of the program above is true.

[4]

String a = "ab";
final String bb = getBB();
String b = "a" + bb;
System.out.println(a == b); // result = false

private static String getBB() {
    return "b";
}

Analysis: The compiler cannot determine the value of bb at compile time. Only when the method is called at run time is its return value joined with "a" and the address of the result assigned to b. The result of the program above is therefore false.

The four examples above show that:

String s = "a" + "b" + "c";

is equivalent to String s = "abc";

String a = "a";
String b = "b";
String c = "c";
String s = a + b + c;

This is different. The final result is equivalent to:

StringBuffer temp = new StringBuffer();
temp.append(a).append(b).append(c);
String s = temp.toString();

From this analysis it is easy to see why joining strings with the concatenation operator (+) is inefficient, as in code such as:

public class Test {
    public static void main(String[] args) {
        String s = null;
        for (int i = 0; i < 100; i++) {
            s += "a";
        }
    }
}

Each + operation creates a StringBuilder object that is discarded after the append. On the next iteration another StringBuilder is created and appended to, and so on until the loop ends. If we append to a single StringBuilder directly, we avoid creating and destroying N - 1 objects. For code that concatenates strings in a loop, the append operations are therefore generally done on a StringBuffer or StringBuilder object.
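A minimal sketch of the recommended approach, using one StringBuilder for the whole loop. ConcatLoop and repeat are made-up names, and the initial capacity passed to the constructor is an optional refinement rather than something the discussion above requires.

```java
public class ConcatLoop {
    /** Appends the piece the given number of times using a single StringBuilder. */
    static String repeat(String piece, int times) {
        StringBuilder builder = new StringBuilder(piece.length() * times);
        for (int i = 0; i < times; i++) {
            builder.append(piece);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        String s = repeat("a", 100);
        System.out.println(s.length()); // prints 100
    }
}
```

StringBuilder is the usual choice when only one thread touches the buffer; StringBuffer offers the same methods with synchronization.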

The intern() method of String behaves as follows:

public class Test4 {
    private static String a = "ab";

    public static void main(String[] args) {
        String s1 = "a";
        String s2 = "b";
        String s = s1 + s2;
        System.out.println(s == a);          // false
        System.out.println(s.intern() == a); // true
    }
}

This again comes down to Java's constant pool. The s1 + s2 operation actually creates a new object on the heap, and s holds a reference to that new heap object, so s and a are not the same reference. Calling s.intern() returns the address of the equal string in the constant pool. Because a refers to the string stored in the constant pool, s.intern() and a are equal.
