In-depth study on Java's creation and management of string objects

Source: Internet
Author: User

I often see people discussing Java strings, and the explanations found online differ a great deal, which left me confused.
So I spent an evening reading the relevant chapters of the Java Virtual Machine Specification and the Java Language Specification,
ran a number of experiments, and summarized what I learned about String here.
There is still a lot I am not sure about. I point those places out below and hope more experienced readers will correct me.

The concept of the constant pool:

Whenever special cases of String come up, people mention the string pool or the constant pool, but I think many do not
really understand what the constant pool looks like or where it lives at run time. So let's start with the constant pool itself.
The string pool is the part of the constant pool that holds String constants. It is also called the
string constant pool; there does not seem to be one formal name.

Every class file compiled from Java source contains a region called the constant pool. It is a table, declared as
cp_info constant_pool[], that stores the constants used by the program, such as class references, strings, and integers.
For details, see section 4.4 of the Java Virtual Machine Specification.

The general structure of every entry in the constant pool table is:
cp_info {
    u1 tag;
    u1 info[];
}

tag is a number that identifies the type of the stored constant. For example, 8 means a string and 5 means a long.
The layout of info[] depends on the value of tag.

For a string, the entry structure is:
CONSTANT_String_info {
    u1 tag;
    u2 string_index;
}

tag is fixed at 8, and string_index is the constant pool index of the entry that holds the string's content. That entry has the type:
CONSTANT_Utf8_info {
    u1 tag;
    u2 length;
    u1 bytes[length];
}

tag is fixed at 1, length is the number of bytes in the encoded string, and bytes[length] holds the content of the string (in modified UTF-8).

(The code below was compiled with JDK 6.)
To see the constant pool structure in practice, consider the following code:
String s1 = "sss111";
String s2 = "sss222";
System.out.println(s1 + " " + s2);

Because "sss111" and "sss222" are both string constants, they are already determined at compile time and stored in the class file.
The compiled class file therefore contains the following representation of the two constants:
08 00 11 01 00 06 73 73 73 31 31 31 08 00 13 01 ; ......sss111....
00 06 73 73 73 32 32 32 ; ..sss222

Using the string constant structures described above, we can read these bytes as follows.
The first 08 is the tag of the CONSTANT_String_info structure, and 00 11 is its string_index, the constant pool index of the
entry that holds the text. 01 is the tag of the CONSTANT_Utf8_info structure, 00 06 is the length of the string in bytes,
and 73 73 73 31 31 31 is the encoding of "sss111". The entries for "sss222" follow in the same layout.
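The byte-by-byte reading above can also be done in code. The following is only a minimal sketch: ConstantPoolFragment and decode are hypothetical names made up for this illustration, only tags 8 and 1 are handled, and the content is assumed to be plain ASCII (real entries use modified UTF-8). It decodes the twelve bytes that make up the "sss111" entries.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of any JDK API.
final class ConstantPoolFragment {
    static final int CONSTANT_UTF8 = 1;   // tag of CONSTANT_Utf8_info
    static final int CONSTANT_STRING = 8; // tag of CONSTANT_String_info

    // Decodes a run of String/Utf8 entries; every other tag is out of scope here.
    static List<String> decode(byte[] raw) {
        List<String> entries = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(raw); // big-endian by default, like class files
        while (buf.hasRemaining()) {
            int tag = buf.get() & 0xFF; // u1 tag
            if (tag == CONSTANT_STRING) {
                int stringIndex = buf.getShort() & 0xFFFF; // u2 string_index
                entries.add("String -> #" + stringIndex);
            } else if (tag == CONSTANT_UTF8) {
                int length = buf.getShort() & 0xFFFF; // u2 length, counted in bytes
                byte[] bytes = new byte[length];
                buf.get(bytes); // u1 bytes[length]
                // Assumption: ASCII-only content.
                entries.add("Utf8 \"" + new String(bytes, StandardCharsets.US_ASCII) + "\"");
            } else {
                throw new IllegalArgumentException("tag not handled in this sketch: " + tag);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        byte[] raw = {0x08, 0x00, 0x11, 0x01, 0x00, 0x06, 0x73, 0x73, 0x73, 0x31, 0x31, 0x31};
        for (String entry : decode(raw)) {
            System.out.println(entry);
        }
    }
}
```

Running main prints one line for the CONSTANT_String_info entry, showing that its string_index 0x11 is 17 in decimal, and one line for the CONSTANT_Utf8_info entry with the decoded text.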

From this analysis we know that 00 11 and 00 13 are the string_index values of the two strings. The bytecode refers to the
two CONSTANT_String_info entries by their own constant pool indexes, 10 and 12 (hexadecimal), so we can change what is
printed by editing the class file. Change
00 6e 00 04 00 03 00 00 00 24 12 10 4C 12 12 4D
to
00 6e 00 04 00 03 00 00 00 24 12 10 4C 12 10 4D
In this sequence, 12 is the ldc opcode and the byte after it is the constant pool index to load. The program now prints
sss111 sss111 instead of sss111 sss222, because we changed the index 12 for "sss222" to the index 10 for "sss111".

------------
public class Test {
    public static void main(String[] args) {
        String s1 = "sss111";
        String s2 = "sss111";
    }
}

The program above contains two identical constants, "sss111". For any number of string constants with the same value, only
one entry is created in the constant pool. In the compiled class file we therefore find only one representation of "sss111":
201700abh: 08 00 11 01 00 06 73 73 73 31 31 31 ; ......sss111

While the program runs, the constant pool is stored in the method area rather than on the heap. (This describes the JDK 6 era; as far as I know, HotSpot has kept interned strings on the heap since JDK 7.)

In addition, for the empty string literal "", a String of length 0 with empty content is created and placed in the constant pool.
The constant pool can also grow dynamically at run time.

About the String class:
1. String stores its characters in a private final char value[] field (in JDK 6). Once a String object has been created,
the content stored in it cannot be modified. That is why String is said to be immutable.

2. String has a special creation syntax: double quotation marks. For example, new String("I am") actually involves two
String objects. One comes from the literal "I am" in double quotation marks, and the other is created by new. They also originate at different times:
the literal is fixed at compile time, while the new object is created at run time.

3. Java overloads the + operator for String, so + can be used directly to concatenate two strings.

4. Calling the intern() method of String at run time dynamically adds a string to the string pool.
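Point 1 is easy to check. The following is a small sketch (the class name ImmutabilityDemo and its methods are made up for this illustration): methods such as concat and toUpperCase never change the object they are called on; they return a new String, and only reassigning the variable makes it refer to the new object.

```java
// Hypothetical demo class for illustration only.
final class ImmutabilityDemo {
    // The results of concat and toUpperCase are discarded, so s is unchanged.
    static String original() {
        String s = "I am";
        s.concat(" here");
        s.toUpperCase();
        return s;
    }

    // Here the variable is reassigned to the new object returned by concat.
    static String reassigned() {
        String s = "I am";
        s = s.concat(" here");
        return s;
    }

    public static void main(String[] args) {
        System.out.println(original());   // I am
        System.out.println(reassigned()); // I am here
    }
}
```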

Strings are generally created in the following ways:
1. Directly with a literal in double quotation marks.
2. With new String().
3. With new String("somestring") or one of the other overloaded constructors.
4. With the overloaded string concatenation operator +.

Example 1
/*
 * "sss111" is a compile-time constant: its value is known during compilation
 * and is already recorded in the compiled class file. When this statement runs,
 * the string pool is searched for a string equal to "sss111" (as determined by
 * the equals(Object) method). If one exists, its reference is returned and
 * assigned to s1. If not, a String "sss111" is created, placed in the string
 * pool, and its reference is assigned to s1.
 */
String s1 = "sss111";

// This statement behaves the same way as the one above
String s2 = "sss111";

/*
 * The string pool keeps only one String object per value, so the two
 * statements above obtain the same object from the pool and the
 * references are equal.
 */
System.out.println(s1 == s2); // The result is true.

Example 2
/*
 * In Java, the new keyword always creates a new object. In this example,
 * even if an object with the same value already exists in the string pool,
 * a new String object is created on the heap and its reference is assigned to s1.
 * This example uses the constructor public String(String original).
 */
String s1 = new String("sss111");

/*
 * This statement looks in the string pool, as described in Example 1
 */
String s2 = "sss111";

/*
 * s1 refers to a new object on the heap, while s2 refers to the object
 * stored in the string pool, so they are definitely not the same object.
 * Although the string values are equal, == returns false.
 */
System.out.println(s1 == s2); // The result is false.

Example 3
String s1 = new String("sss111");
/*
 * When intern() is called, if the string pool already contains a string equal
 * to this String object (as determined by the equals(Object) method), the
 * string from the pool is returned. Otherwise this String object is added to
 * the pool and a reference to it in the string pool is returned.
 */
s1 = s1.intern();

String s2 = "sss111";

/*
 * Because s1 = s1.intern() has been executed, s1 now refers to the String
 * object with the value "sss111" in the string pool. s2 refers to the same
 * object, so the result is true.
 */
System.out.println(s1 == s2);

Example 4
String s1 = new String("111");
String s2 = "sss111";

/*
 * Both operands of this concatenation are constants, so the resulting value
 * can be determined at compile time. The compiler represents the expression
 * directly as "sss111" and stores that in the string pool.
 * Because s2 = "sss111" above has already put "sss111" into the string pool,
 * this statement makes s3 refer to the same object as s2.
 * With javac, the operands "sss" and "111" are folded away, so this statement
 * by itself does not add them to the pool ("111" is there only because of the
 * literal in the first line).
 */
String s3 = "sss" + "111";

/*
 * Because s1 is a variable, its value cannot be determined at compile time.
 * A new String object is created at run time, stored on the heap,
 * and assigned to s4.
 */
String s4 = "sss" + s1;

System.out.println(s2 == s3); // true
System.out.println(s2 == s4); // false
System.out.println(s2 == s4.intern()); // true
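Examples 1 to 4 are fragments, so here is a sketch that gathers them into one runnable class. The class name StringIdentityExamples and the method names are my own choices for illustration; the statements inside each method follow the examples above.

```java
import java.util.Arrays;

// Hypothetical wrapper class; each method mirrors one of Examples 1 to 4.
final class StringIdentityExamples {
    // Two identical literals share one pooled object.
    static boolean example1() {
        String s1 = "sss111";
        String s2 = "sss111";
        return s1 == s2;
    }

    // new always creates a separate object on the heap.
    static boolean example2() {
        String s1 = new String("sss111");
        String s2 = "sss111";
        return s1 == s2;
    }

    // intern() returns the pooled object with the same value.
    static boolean example3() {
        String s1 = new String("sss111").intern();
        String s2 = "sss111";
        return s1 == s2;
    }

    // Constant concatenation is folded; concatenation with a variable is not.
    static boolean[] example4() {
        String s1 = new String("111");
        String s2 = "sss111";
        String s3 = "sss" + "111";
        String s4 = "sss" + s1;
        return new boolean[] {s2 == s3, s2 == s4, s2 == s4.intern()};
    }

    public static void main(String[] args) {
        System.out.println(example1());                  // true
        System.out.println(example2());                  // false
        System.out.println(example3());                  // true
        System.out.println(Arrays.toString(example4())); // [true, false, true]
    }
}
```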

Example 5
This is the example from section 3.10.5 of the Java Language Specification. With the explanations above, it is not hard to understand.
package testPackage;
class Test {
    public static void main(String[] args) {
        String hello = "Hello", lo = "lo";
        System.out.print((hello == "Hello") + " ");
        System.out.print((Other.hello == hello) + " ");
        System.out.print((other.Other.hello == hello) + " ");
        System.out.print((hello == ("Hel" + "lo")) + " ");
        System.out.print((hello == ("Hel" + lo)) + " ");
        System.out.println(hello == ("Hel" + lo).intern());
    }
}
class Other { static String hello = "Hello"; }

package other;
public class Other { public static String hello = "Hello"; }

The output is true true true true false true. Please analyze it yourself!
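One detail of Example 5 is worth a separate experiment: the comparison that concatenates a literal with the variable lo is the only one that is not a compile-time constant expression. The sketch below (FinalConstantDemo is a made-up name for illustration) shows that declaring the local variable final turns it into a constant variable in the sense of the Java Language Specification, so the concatenation is folded at compile time and refers to the pooled object again.

```java
// Hypothetical demo class for illustration only.
final class FinalConstantDemo {
    // lo is an ordinary local variable: the concatenation happens at run time
    // and produces a new String object.
    static boolean withPlainLocal() {
        String hello = "Hello";
        String lo = "lo";
        return hello == ("Hel" + lo);
    }

    // lo is final and initialized with a constant, so "Hel" + lo is itself a
    // constant expression and is folded to the pooled "Hello".
    static boolean withFinalLocal() {
        String hello = "Hello";
        final String lo = "lo";
        return hello == ("Hel" + lo);
    }

    public static void main(String[] args) {
        System.out.println(withPlainLocal()); // false
        System.out.println(withFinalLocal()); // true
    }
}
```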

The analysis above can be summarized as follows:
1. Strings written as literals in double quotation marks are constants. They are determined at compile time and stored in the string pool.
2. Objects created with new String("") are stored on the heap and are newly created at run time.
3. A concatenation that contains only constants, such as "aa" + "aa", also yields a constant. It is determined at compile time and stored in the string pool.
4. Objects created by a concatenation that contains a variable, such as "aa" + s1, are created at run time and stored on the heap.
5. I am not completely sure whether objects created as "aa" + s1 or new String("aa" + s1) are added to the string pool. Judging from Example 4,
they are not added automatically, and intern() has to be called explicitly. I hope an expert can confirm this.

There are also a few frequently asked questions:

1.
String s1 = new String("s1");
String s2 = new String("s1");
How many String objects are created above?
Answer: Three. One in the constant pool, fixed at compile time, and two on the heap, created at run time.

2.
String s1 = "s1";
String s2 = s1;
s2 = "s2";
What is the string in the object that s1 refers to?
Answer: "s1"
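Both answers can be checked with a few lines of code. This is just a sketch with names I made up (FaqDemo and its two methods): the first method shows that the two new expressions yield distinct heap objects that share one pooled value, and the second shows that reassigning s2 has no effect on s1.

```java
// Hypothetical demo class for illustration only.
final class FaqDemo {
    // Question 1: two different heap objects, one pooled "s1".
    static boolean twoHeapObjectsOnePooled() {
        String s1 = new String("s1");
        String s2 = new String("s1");
        return s1 != s2 && s1.intern() == s2.intern();
    }

    // Question 2: assigning a new literal to s2 only changes the variable s2.
    static String s1AfterReassigningS2() {
        String s1 = "s1";
        String s2 = s1;
        s2 = "s2";
        return s1;
    }

    public static void main(String[] args) {
        System.out.println(twoHeapObjectsOnePooled()); // true
        System.out.println(s1AfterReassigningS2());    // s1
    }
}
```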

Finally, some notes on the + concatenation operator:

A few small experiments help deepen our understanding.

Compile with javac Test.java
View the VM instructions with javap -c Test
Experiment 1: a plain string

Java code
public class Test {
    public static void main(String args[]) {
        String str = "a";
    }
}

// Push the string "a" from the constant pool onto the operand stack
0: ldc #2; //String a
// Store the reference in local variable 1
2: astore_1
3: return

Experiment 2: adding strings

Java code
public class Test {
    public static void main(String args[]) {
        String str = "a" + "b";
    }
}

// Push the string "ab" from the constant pool onto the operand stack
0: ldc #2; //String ab
2: astore_1
3: return

Experiment 2 shows that the compiler has already optimized "a" + "b" into "ab" in the generated bytecode.
The addition of more than two strings is optimized in the same way. Note that this applies only to string constants.

Experiment 3: adding a string and an automatically promoted constant

Java code
public class Test {
    public static void main(String args[]) {
        String str = "a" + (1 + 2);
    }
}

// Push the string "a3" from the constant pool onto the operand stack
0: ldc #2; //String a3
2: astore_1
3: return

The VM instructions show that the constant produced by 1 + 2, promoted to a string, and the string constant are also combined by the compiler.

Conclusion from experiments 2 and 3: adding constants causes no efficiency problem.
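The folding seen in experiments 2 and 3 can also be observed without javap. As a small sketch (ConstantFoldingDemo is a name made up for this illustration), a folded constant expression refers to the same pooled object as the equivalent literal, so even the == comparison is true.

```java
// Hypothetical demo class for illustration only.
final class ConstantFoldingDemo {
    // "a" + "b" is a constant expression, folded to the literal "ab".
    static boolean foldedStrings() {
        return ("a" + "b") == "ab";
    }

    // 1 + 2 is folded to 3 and then joined with "a", giving the literal "a3".
    static boolean foldedWithInt() {
        return ("a" + (1 + 2)) == "a3";
    }

    public static void main(String[] args) {
        System.out.println(foldedStrings()); // true
        System.out.println(foldedWithInt()); // true
    }
}
```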

Experiment 4: adding a string and a variable

Java code
public class Test {
    public static void main(String args[]) {
        String s = "b";
        String str = "a" + s;
    }
}

// Push the string "b" from the constant pool onto the operand stack
0: ldc #2; //String b
// Store the reference in local variable 1
2: astore_1
// The operands are not all constants, so a StringBuilder object is created
3: new #3; //class java/lang/StringBuilder
// Duplicate the top of the stack, that is, the reference to the new StringBuilder
6: dup
// Call the StringBuilder constructor
7: invokespecial #4; //Method java/lang/StringBuilder."<init>":()V
// Push the string "a" from the constant pool onto the operand stack
10: ldc #5; //String a
// Call the append method of StringBuilder to add the string "a"
12: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
// Load the reference from local variable 1
15: aload_1
// Call the append method of StringBuilder to add the string "b"
16: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
// Call the toString method of StringBuilder
19: invokevirtual #7; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
// Store the toString result in local variable 2
22: astore_2
23: return

Experiment 4 shows that when a string constant is added to a variable, the variable only holds a reference to a string.
Its actual value cannot be known at compile time, so the expression cannot be optimized away.
To perform the concatenation, the compiler generates code that uses StringBuilder (introduced in JDK 5; I do not have
JDK 1.4 at hand, but I expect StringBuffer is used there), appends each part,
and calls toString to produce the result.

If s has another type, for example int, it is handled in the same way.

Similarly, following the result of experiment 2, in String str = "a" + "b" + s; the constants are first folded into "ab",
and s is then handled as in experiment 4. In this case StringBuilder calls append only twice.

If the statement is String str = "a" + s + "b"; it cannot be optimized in this way, and StringBuilder has to call
append three times.

The conclusion from experiment 4 is that a StringBuilder object is created internally whenever a string and a variable are added.

For a single statement such as String str = "a" + s; the efficiency is the same as that of
String str = new StringBuilder().append("a").append(s).toString();
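To make the append counts concrete, here is a sketch that writes out by hand the StringBuilder chains a JDK 6 compiler generates for the two forms. ConcatExpansion and its method names are made up for this illustration. Newer compilers (JDK 9 and later, as far as I know) may emit different bytecode for +, but the resulting strings are the same.

```java
// Hypothetical demo class for illustration only.
final class ConcatExpansion {
    // "a" + s + "b": nothing can be folded, three appends are needed.
    static String variableInMiddle(String s) {
        return "a" + s + "b";
    }

    static String variableInMiddleExpanded(String s) {
        return new StringBuilder().append("a").append(s).append("b").toString();
    }

    // "a" + "b" + s: the constants are folded to "ab" first, two appends remain.
    static String constantsFirst(String s) {
        return "a" + "b" + s;
    }

    static String constantsFirstExpanded(String s) {
        return new StringBuilder().append("ab").append(s).toString();
    }

    public static void main(String[] args) {
        System.out.println(variableInMiddle("x").equals(variableInMiddleExpanded("x"))); // true
        System.out.println(constantsFirst("x").equals(constantsFirstExpanded("x")));     // true
    }
}
```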

In general, the inefficiency of concatenating strings with the + operator shows up mainly in situations like this:

Java code
public class Test {
    public static void main(String args[]) {
        String s = null;
        for (int i = 0; i < 100; i++) {
            s += "";
        }
    }
}

Each time the + operation runs, a StringBuilder object is created, used for the append, and then thrown away. On the next
iteration a new StringBuilder object is created and the append is done again, and so on until the loop ends.

If we use a single StringBuilder object and call append on it directly, we save the time spent creating and
destroying n - 1 objects.
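The two approaches can be put side by side as follows. This is a minimal sketch: LoopConcat, the parameter n, and the appended piece "a" are my own choices for illustration. Both methods build the same string, but the second one reuses a single StringBuilder for the whole loop.

```java
// Hypothetical demo class for illustration only.
final class LoopConcat {
    // One temporary builder (or equivalent) per iteration.
    static String withOperator(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += "a";
        }
        return s;
    }

    // A single builder reused across all iterations.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder(n); // n is only the initial capacity
        for (int i = 0; i < n; i++) {
            sb.append("a");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withOperator(100).equals(withBuilder(100))); // true
        System.out.println(withBuilder(100).length());                  // 100
    }
}
```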

 

This article comes from the CSDN blog. When reproducing it, please credit the source: http://blog.csdn.net/luojihaidao/archive/2009/02/05/3863658.aspx
