When reprinting, please credit the source: http://blog.csdn.net/tang9140/article/details/43982887
- Introduction
- 1. Immutability
- 2. The nature of the + concatenation operator
- 3. Two ways to test equality (== and equals)
- Question 1
- Question 2
- Question 3
- Supplementary notes
- Appendix: modifying a String object through reflection
Introduction
Java programmers work with strings almost every day, so it is worth understanding String and its usage in depth. This article explains the characteristics and usage of String from three angles:
- Immutability
- The nature of the + concatenation operator
- Two ways to test equality (== and equals)
1. Immutability
To make string operations convenient, the designers of Java abstracted them into the String class, which encapsulates operations such as searching, concatenating, replacing, and taking substrings. The source code of java.lang.String carries the following description:
The String class represents character strings. All string literals in Java programs, such as "abc", are implemented as instances of this class.
Strings are constant; their values cannot be changed after they are created. String buffers support mutable strings. Because String objects are immutable they can be shared.
In other words, the String class represents a sequence of characters, and every string literal in a Java program, such as "abc", is an instance of String.
A String object is a constant whose value cannot be changed once it has been created. String buffers support mutable character sequences. Because String objects are immutable, they can be shared.
The immutability of String shows up in two places:
The String class is declared final, which means it cannot be subclassed. Because String cannot have subclasses, none of its methods can be overridden to change their behavior, which guarantees safety.
The private member variables of the String class, such as value (char[]), offset (int), and count (int) in the JDK source examined here, are declared final. Once a String object has been created, offset and count cannot change (they are primitive ints), and value cannot be made to point to a different character array (it is a reference variable).
At this point one might object: although the value field cannot point to another character array, the contents of the array it points to can still change, and if the contents can change, doesn't that make a String object mutable?
That is true in principle, but the String class itself offers no method that modifies the character array. The only way to do it is to change the private field through unconventional means such as reflection (a code example appears at the end of this article). With reflection it is entirely possible to change every field of a String object after it has been created, even fields declared private final, but that is not normal usage. In normal usage we cannot change the value of a String object, so we still regard String objects as immutable.
Note: "immutable" here means that the value of a String object cannot change after it has been created. A reference variable, however, can be pointed at a different String object to change the value it represents. For example:
    String s = "abc";
    s = "def";
The reference variable s first points to a String object whose value is "abc", and then points to a String object whose value is "def". The string that s represents does change, but the String objects "abc" and "def" themselves do not; s simply points to a different object.
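The difference between reassigning a reference and modifying an object can be checked with a small sketch. The class name ImmutabilityDemo and the helper transform are hypothetical names chosen for illustration; only standard String methods are used.

```java
class ImmutabilityDemo {
    // Applies two "modifying" operations and returns the original alongside
    // the results, so the caller can see that the original is untouched.
    static String[] transform(String original) {
        String upper = original.toUpperCase();        // a new String object
        String replaced = original.replace('a', 'x'); // another new String object
        return new String[] { original, upper, replaced };
    }

    public static void main(String[] args) {
        String s = "abc";
        String alias = s; // second reference to the same "abc" object
        s = "def";        // only the variable s changes; "abc" still exists
        System.out.println(alias); // abc
        System.out.println(s);     // def

        for (String r : transform("abc")) {
            System.out.println(r); // abc, ABC, xbc
        }
    }
}
```

Every "modifying" method returns a new object, which is why the first element of the returned array is still "abc".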
Designing String as immutable has at least two benefits:
The first is safety. Because String is final, no subclass can inherit from it and change its behavior. And because String objects are immutable, they are also safe to use in a multithreaded environment.
The second is efficiency. Because String is final, all of its methods are implicitly final, which lets the compiler perform some optimizations. In addition, because String objects are immutable, they can be shared in many places without synchronization between threads, which improves efficiency.
Precisely because String objects are immutable, concatenating strings with + can be inefficient. The next section explains this in detail: what is the nature of the + operator, how is string concatenation carried out underneath, and under what circumstances is + concatenation inefficient?
2. The nature of the + concatenation operator
To understand what + really does, start with Java compilation. Java code must be compiled into a class file before it runs (the structure of the class file is not covered in detail here for reasons of space). One part of the class file is the attribute table collection, which includes the Code attribute; put simply, the Code attribute holds the bytecode instructions compiled from a method body. We can therefore see what + really does by looking at the bytecode instructions in the class file. The sample code is as follows:
    public class StringTest {
        public static void main(String[] args) {
            String s = "Hello";
            s = s + " world!";
        }
    }
Because the class file structure is unfamiliar and raw bytecode is hard to read, we will not inspect the compiled StringTest.class directly. Instead, we decompile it with the jad tool and look at the result. The tool can be downloaded from http://download.csdn.net/detail/tang9140/8426571 (my own CSDN upload; pardon the plug). Run the following jad command from the command prompt:
    jad -o -a -sjava StringTest.class
If the command succeeds, a decompiled StringTest.java file appears in the same directory as StringTest.class, with the following contents:
    public class StringTest
    {
        public StringTest()
        {
        //    0    0:aload_0
        //    1    1:invokespecial   #8   <Method void Object()>
        //    2    4:return
        }

        public static void main(String args[])
        {
            String s = "Hello";
        //    0    0:ldc1            #16  <String "Hello">
        //    1    2:astore_1
            s = (new StringBuilder(String.valueOf(s))).append(" world!").toString();
        //    2    3:new             #18  <Class StringBuilder>
        //    3    6:dup
        //    4    7:aload_1
        //    5    8:invokestatic    #20  <Method String String.valueOf(Object)>
        //    6   11:invokespecial   #26  <Method void StringBuilder(String)>
        //    7   14:ldc1            #29  <String " world!">
        //    8   16:invokevirtual   #31  <Method StringBuilder StringBuilder.append(String)>
        //    9   19:invokevirtual   #35  <Method String StringBuilder.toString()>
        //   10   22:astore_1
        //   11   23:return
        }
    }
The comment lines in the decompiled source above show the bytecode instructions that correspond to each source statement. Clearly the decompiled source no longer contains the string concatenation operator +. In other words, with the compiler used here, + was replaced during compilation by a call to StringBuilder's append method (before JDK 1.5, the compiler replaced + with a call to StringBuffer's append method instead). So-called + concatenation is, in essence, creating a new StringBuilder object and calling its append method to join the strings.
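To make the rewrite concrete, the sketch below places a + concatenation next to the StringBuilder form that appears in the decompiled output. The class and method names are hypothetical, and whether a given compiler emits exactly this form depends on the JDK version, so treat withBuilder as an illustration of the jad output above rather than a statement about every javac.

```java
class PlusDesugarDemo {
    // Concatenation as written in source code.
    static String withPlus(String s) {
        return s + " world!";
    }

    // The form shown in the decompiled output: String.valueOf guards against
    // null, then StringBuilder does the actual joining.
    static String withBuilder(String s) {
        return new StringBuilder(String.valueOf(s)).append(" world!").toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("Hello"));    // Hello world!
        System.out.println(withBuilder("Hello")); // Hello world!
        // Both forms render a null reference as the text "null".
        System.out.println(withPlus(null).equals(withBuilder(null))); // true
    }
}
```

The call to String.valueOf explains why concatenating a null reference yields the text "null" instead of throwing an exception.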
By overloading the + operator for strings at compile time, Java makes string handling convenient, but this also has side effects. For example, programmers who do not know what + really does may write inefficient code such as the following:
    public String concat() {
        String result = "";
        for (int i = 0; i < 1000; i++) {
            result += i;
        }
        return result;
    }
Inside the for loop, the compiler replaces the statement containing + with the following call:
result = (new StringBuilder(String.valueOf(result))).append(i).toString();
Each iteration therefore copies the character array of result when the StringBuilder object is constructed, and copies the StringBuilder's character array again when toString is called to construct the String object. That amounts to two object creations and two character array copies per iteration, so the program is inefficient. A more efficient version is:
    public String concat() {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            result.append(i);
        }
        return result.toString();
    }
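As a rough way to compare the two versions, the sketch below runs both and checks that they produce the same string. The class name, the method names, and the parameter n are hypothetical additions. The timing is a crude single-run measurement with no JIT warm-up, so the numbers are only indicative and will vary between machines and JDK versions.

```java
class ConcatComparison {
    static String concatWithPlus(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += i; // a new String is produced on every iteration
        }
        return result;
    }

    static String concatWithBuilder(int n) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < n; i++) {
            result.append(i); // appends into one growing buffer
        }
        return result.toString();
    }

    public static void main(String[] args) {
        int n = 1000;
        long t0 = System.nanoTime();
        String a = concatWithPlus(n);
        long t1 = System.nanoTime();
        String b = concatWithBuilder(n);
        long t2 = System.nanoTime();

        System.out.println("same result:   " + a.equals(b));
        System.out.println("+ in loop:     " + (t1 - t0) / 1_000 + " us");
        System.out.println("StringBuilder: " + (t2 - t1) / 1_000 + " us");
    }
}
```

For a trustworthy measurement a benchmarking harness would be the better choice; the point here is only that the two methods are interchangeable in their result.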
By now you should understand what + really does and how to avoid using it inefficiently. Next, let's look at the two ways of testing String objects for equality, a topic that comes up often in Java interviews.
3. Two ways to test equality (== and equals)
- ==: when both operands are primitive types, it compares whether their values are equal; when both operands are reference types, it compares whether they point to the same object.
- The equals method compares whether the contents of two objects are equal.
Because String is a reference type, == tests whether two String variables point to the same String object, while equals tests whether the contents of two String objects are equal. In real projects, what you almost always want is to compare the contents of two String objects, so equals is the recommended way to compare strings.
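A minimal sketch of the two comparisons side by side follows. The class and method names are hypothetical; java.util.Objects.equals is used for the content comparison because it also tolerates null arguments.

```java
import java.util.Objects;

class EqualityDemo {
    // Identity comparison: true only if both references point to one object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    // Content comparison: true if both strings hold the same characters.
    static boolean sameContent(String a, String b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        String literal = "abc";
        String copy = new String("abc"); // distinct object with equal content

        System.out.println(sameObject(literal, copy));  // false
        System.out.println(sameContent(literal, copy)); // true
        System.out.println(sameContent(null, literal)); // false
    }
}
```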
Comparing strings with == shows up far more often in interview questions than in project code. It has little bearing on day-to-day work and mainly serves as a test of how well you understand String. A few such questions follow:
Question 1
    String a = "a1";
    String b = "a" + 1;
    System.out.println(a == b);
Answer: true
Explanation: When two literals are joined with +, the Java compiler performs the concatenation at compile time. The generated class file therefore contains no bytecode for String b = "a" + 1; it has been optimized into the bytecode for String b = "a1". This optimization is easy to understand: evaluating at compile time whatever can be determined at compile time reduces the number of bytecode instructions in the class file, so fewer instructions run at execution time and the program is more efficient (you can decompile the class file with the jad command above to verify this). Likewise, arithmetic on primitive literals is evaluated at compile time; for example, int day = 24 * 60 * 60 is replaced after compilation with int day = 0x15180. Because the concatenation is done at compile time, the local variables a and b both point to the "a1" object in the constant pool, so a == b prints true.
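The compile-time folding described above can be observed directly. In the sketch below, the class name, the method name, and the constant SECONDS_PER_DAY are hypothetical; the behavior relies on the language rule that constant expressions of type String are interned.

```java
class ConstantFoldingDemo {
    // Folded by the compiler into the single constant 86400 (0x15180).
    static final int SECONDS_PER_DAY = 24 * 60 * 60;

    // "a" + 1 is a constant expression, so it is folded to "a1" at compile
    // time and refers to the same pooled object as the literal "a1".
    static boolean foldedLiteralIsShared() {
        String a = "a1";
        String b = "a" + 1;
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(foldedLiteralIsShared());     // true
        System.out.println(SECONDS_PER_DAY);             // 86400
        System.out.println(SECONDS_PER_DAY == 0x15180);  // true
    }
}
```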
Question 2
    String hw = "Hello world!";
    String h = "Hello";
    h = h + " world!";
    System.out.println(h == hw);
Answer: false
Explanation: From the earlier analysis of the + operator, we know that h = h + " world!" is replaced after compilation with h = (new StringBuilder(String.valueOf(h))).append(" world!").toString(). Looking at StringBuilder's toString method, you can see that it is in effect return new String(value, 0, count), so h points to an object on the Java heap, while hw points to an object in the constant pool. Although h and hw have the same contents, they point to different String objects, so the output is false.
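The claim that toString hands back a fresh object can be checked with the sketch below. The class name and the helper build are hypothetical; the sketch calls StringBuilder explicitly so that it does not depend on how a particular compiler translates +.

```java
class BuilderToStringDemo {
    // Joins two strings at run time through an explicit StringBuilder.
    static String build(String head, String tail) {
        return new StringBuilder(head).append(tail).toString();
    }

    public static void main(String[] args) {
        String hw = "Hello world!";           // literal from the constant pool
        String h = build("Hello", " world!"); // object created at run time

        System.out.println(h == hw);      // false: different objects
        System.out.println(h.equals(hw)); // true: same contents

        // Each toString() call allocates a new String.
        StringBuilder sb = new StringBuilder("Hello world!");
        System.out.println(sb.toString() == sb.toString()); // false
    }
}
```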
Question 3
    public static final String h2 = "Hello";
    public static final String h4 = getH();

    private static String getH() {
        return "Hello";
    }

    public static void main(String[] args) {
        String hw = "Hello world!";
        final String h1 = "Hello";
        final String h3 = getH();
        String hw1 = h1 + " world!";
        String hw2 = h2 + " world!";
        String hw3 = h3 + " world!";
        String hw4 = h4 + " world!";
        System.out.println(hw == hw1);
        System.out.println(hw == hw2);
        System.out.println(hw == hw3);
        System.out.println(hw == hw4);
    }
Answer: true, true, false, false
Explanation: The local variable h1 is declared final, which makes it a constant, and it is assigned the string literal "Hello" directly. The Java compiler can therefore determine the value of h1 at compile time and replaces every occurrence of h1 with the literal "Hello" (similar to constants defined with #define in C++). Combined with the earlier point that literals are concatenated at compile time, the statement String hw1 = h1 + " world!" is optimized into String hw1 = "Hello world!". Both hw and hw1 then point to the same String object in the constant pool, and the output is true. Likewise, h2 is a static constant assigned directly from a literal, so every occurrence of h2 is replaced with the literal "Hello" after compilation; hw2 also ends up pointing to the String object in the constant pool, and the output is true.
The local variable h3 is also declared final, but it is assigned through a method call, so its value cannot be determined at compile time. The code has not run yet, and the compiler does not try to derive a method's return value by static analysis, even when the method body simply returns a string constant as in this example. Following the earlier analysis of +, String hw3 = h3 + " world!" becomes String hw3 = (new StringBuilder(String.valueOf(h3))).append(" world!").toString() after compilation, so hw3 points to a String object on the Java heap and hw == hw3 prints false. For the same reason, hw4 also points to a String object on the Java heap and hw == hw4 prints false.
Supplementary notes
There are two ways to assign a value to a variable of type String:
- First, direct assignment from a literal: String str = "abc";
- Second, assignment with new: String str = new String("abc");
In the first form, the variable str points directly to the String object for the literal "abc" in the constant pool.
In the second form, the variable str is assigned through the constructor String(String original) and points to a String object on the Java heap. The constructor takes a String argument, and that argument, "abc", itself points to a String object in the constant pool.
Apart from pointing to different String objects, the two forms behave the same. For efficiency, the first form is recommended, because the second allocates one extra String object on the Java heap.
As mentioned earlier, string literals are instances of the String class. They are stored in the constant pool of the class file at compile time, and when the JVM loads the class file they enter the runtime constant pool in the method area. To add a new constant to the pool at run time, call String's intern() method.
When intern is called, if the pool already contains a string equal to this String object (as determined by the equals(Object) method), the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to it is returned.
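The effect of intern() on identity comparison can be sketched as follows. The class name and the helper runtimeConcat are hypothetical; intern() is the standard String method described above.

```java
class InternDemo {
    // Produces a String at run time, so the result is not the pooled literal.
    static String runtimeConcat(String head, String tail) {
        return new StringBuilder(head).append(tail).toString();
    }

    public static void main(String[] args) {
        String hw = "Hello world!";                   // pooled literal
        String h = runtimeConcat("Hello", " world!"); // heap object, same contents

        System.out.println(h == hw);          // false
        System.out.println(h.intern() == hw); // true: intern returns the pooled object

        // new String(...) always yields a distinct object, but it interns
        // back to the same pooled instance.
        String copy = new String("abc");
        System.out.println(copy == "abc");          // false
        System.out.println(copy.intern() == "abc"); // true
    }
}
```

This is why intern() can make == behave like equals, although calling equals directly remains the simpler choice in project code.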
Appendix: code that modifies a String object through reflection (it relies on the char[] value and int count fields of the older String implementation described earlier, and needs import java.lang.reflect.Field):
    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
        String name = "Angel";
        String name1 = "Angel";

        // Reach into the private final character array and overwrite one element.
        Field strField = String.class.getDeclaredField("value");
        strField.setAccessible(true);
        char[] data = (char[]) strField.get(name);
        data[4] = 'R';
        System.out.println(name);
        System.out.println(name1);
        System.out.println(name == name1);

        // Overwrite the private final length field as well.
        strField = String.class.getDeclaredField("count");
        strField.setAccessible(true);
        strField.setInt(name, 10);
        int i = (Integer) strField.get(name);
        System.out.println(i);
        System.out.println(name.length());
    }
- Note: the string constant pool referred to above is maintained by the String class and belongs to the runtime constant pool in the method area.