[Java] Getting to the root of String: String interview questions
When reprinting, please credit the source: http://blog.csdn.net/tang9140/article/details/43982887
- Introduction
- Immutable features
- The essence of the concatenation operator +
- Equality checks (== / equals)
- Question 1
- Question 2
- Question 3
- Supplementary notes
- Appendix: code for modifying a String object via reflection
Introduction
In Java programming we deal with String almost every day, so it is worth understanding String and its usage thoroughly. This article covers three aspects of String:
- Immutability
- The essence of the concatenation operator +
- The two kinds of equality check (== / equals)
I. Immutable features
To make string operations convenient, the Java designers abstracted them into the String class, which encapsulates operations such as searching, concatenating, replacing, and extracting substrings. If you open the source code of java.lang.String, the first thing you see is the following description:
The String class represents character strings. All string literals in Java programs, such as "abc", are implemented as instances of this class.
Strings are constant; their values cannot be changed after they are created. String buffers support mutable strings. Because String objects are immutable they can be shared.
In other words, the String class represents a character sequence, and every string literal in a Java program, such as "abc", is an instance of String.
String objects are constants: their values cannot be changed after creation. String buffers support mutable character sequences. Because String objects are immutable, they can be shared.
The immutability of String shows up in two ways:
The String class is declared final, which means it cannot be subclassed. Because String can have no subclasses, none of its methods can be overridden to change their behavior, which ensures security.
The private fields of String, such as value (char[]), offset (int), and count (int), are all declared final (this is the field layout of the JDK source this article is based on; newer JDKs no longer have offset and count). After a String object is created, the offset and count fields cannot change (they are of the primitive type int), and the value field cannot be pointed at a different character array (it is a reference-type variable).
Some people may object: although value cannot point to another character array, the contents of the array it points to can still be changed. If the contents of the array can change, doesn't that mean the String object is mutable?
The objection is technically correct, but the String class itself provides no method that modifies the character array. The only way to do so is by unconventional means such as reflection, which can change the value of a private field (the code is attached at the end of this article). Reflection can indeed change the fields of an existing String object, even though they are private final, but it is an unconventional technique. In normal code we cannot change the value of a String object, so we still consider String objects immutable.
Note: String immutability means that the value of a String object cannot be changed after the object is created. A reference-type variable, however, can be pointed at a different String object, which changes the value seen through that variable. For example:
String s = "abc"; s = "def";
The reference variable s first points to the String object whose value is "abc", and then to the String object whose value is "def". The string that s represents has indeed changed, but the String objects "abc" and "def" themselves have not; s simply points to a different String object.
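The difference between re-pointing a variable and changing an object can be made concrete with a small sketch. The class and method names below (ImmutabilityDemo, aliasAfterReassign, originalAfterUpperCase) are hypothetical and exist only for illustration; they are not part of the original article.

```java
class ImmutabilityDemo {
    // Returns what a second reference still sees after the first variable is re-pointed.
    static String aliasAfterReassign() {
        String s = "abc";
        String alias = s;   // second reference to the same "abc" object
        s = "def";          // s now points to another object; "abc" itself is untouched
        return alias;
    }

    // Methods such as toUpperCase() return a new String instead of editing the receiver.
    static String originalAfterUpperCase() {
        String s = "abc";
        String upper = s.toUpperCase();
        return s + "/" + upper;
    }

    public static void main(String[] args) {
        System.out.println(aliasAfterReassign());      // abc
        System.out.println(originalAfterUpperCase());  // abc/ABC
    }
}
```

The alias still reads "abc" because only the variable s was changed, never the object it used to point to.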
Designing String as immutable has at least two benefits:
First, security. Because String is final, no subclass can inherit from it and change its behavior. In addition, a String object never changes once created, so it is safe to use in a multi-threaded environment.
Second, efficiency. Because String is final, all of its methods are implicitly final, which gives the compiler room for some optimizations. In addition, because String objects never change, they can be shared in many places without any synchronization between threads, which improves efficiency.
Because String objects are immutable, concatenating strings with + can be inefficient. What is the essence of the + operator? How are strings actually concatenated under the hood? And in which situations is concatenation with + inefficient?
II. The essence of the concatenation operator +
To understand what + really does, we start with Java compilation. As we all know, Java code must be compiled into a class file before it can run (the structure of the class file is not described here for reasons of space). A class file contains several attribute tables, one of which is the Code attribute. In short, the Code attribute holds the bytecode instructions that the method body compiles to. We can therefore look at the bytecode instructions in the class file to understand the essence of +. The sample code is as follows:
public class StringTest {
    public static void main(String[] args) {
        String s = "Hello";
        s = s + " world!";
    }
}
Because the class file structure is unfamiliar and raw bytecode is hard to read, we do not inspect the compiled StringTest.class file directly. Instead, we decompile the bytecode with jad and look at the result. The tool can be downloaded from http://download.csdn.net/detail/tang9140/8426571 (my own CSDN resource; please excuse the plug). Run the following jad command in cmd:
jad -o -a -sjava StringTest.class
After the command runs successfully, you will find a new StringTest.java file in the directory that contains StringTest.class. Its content is as follows:
public class StringTest
{
    public StringTest()
    {
    //    0    0:aload_0
    //    1    1:invokespecial   #8   <Method void Object()>
    //    2    4:return
    }

    public static void main(String args[])
    {
        String s = "Hello";
    //    0    0:ldc1            #16  <String "Hello">
    //    1    2:astore_1
        s = (new StringBuilder(String.valueOf(s))).append(" world!").toString();
    //    2    3:new             #18  <Class StringBuilder>
    //    3    6:dup
    //    4    7:aload_1
    //    5    8:invokestatic    #20  <Method String String.valueOf(Object)>
    //    6   11:invokespecial   #26  <Method void StringBuilder(String)>
    //    7   14:ldc1            #29  <String " world!">
    //    8   16:invokevirtual   #31  <Method StringBuilder StringBuilder.append(String)>
    //    9   19:invokevirtual   #35  <Method String StringBuilder.toString()>
    //   10   22:astore_1
    //   11   23:return
    }
}
The decompiled source contains comment lines that show the bytecode instructions corresponding to each source statement. Clearly, the string concatenation operator + no longer appears in it. That is, after compilation the + has been replaced with calls to StringBuilder's append method (before JDK 1.5, + was compiled into calls to StringBuffer's append method instead; javac in JDK 9 and later emits an invokedynamic call rather than explicit StringBuilder code, but the result is still a new String object created at run time). So-called string concatenation with + is essentially creating a new StringBuilder object and calling its append method.
By overloading the + operator for strings at compile time, Java makes string operations convenient, but this also has side effects: programmers who do not know what + really does may write inefficient code, such as the following:
public String concat() {
    String result = "";
    for (int i = 0; i < 1000; i++) {
        result += i;
    }
    return result;
}
In the for loop body, where the + sign appears, it will be replaced with the following call after compilation:
result = (new StringBuilder(String.valueOf(result))).append(i).toString();
Clearly, every iteration has to copy the character array of result when constructing the StringBuilder object, and copy the character array of the StringBuilder again when toString constructs the String object. In other words, each iteration of the for loop creates two objects and copies the character array twice, which makes the program inefficient. A more efficient version is as follows:
public String concat() {
    StringBuilder result = new StringBuilder();
    for (int i = 0; i < 1000; i++) {
        result.append(i);
    }
    return result.toString();
}
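As a rough check that the two versions are interchangeable in terms of their result, the sketch below wraps both loops in one class and compares their output. The class name ConcatDemo and the parameter n are assumptions added for illustration; the loop bodies follow the two concat() methods above.

```java
class ConcatDemo {
    // Concatenation with += : each iteration builds a brand-new String.
    static String concatWithPlus(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result += i;
        }
        return result;
    }

    // One StringBuilder reused across all iterations; a single String is built at the end.
    static String concatWithBuilder(int n) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < n; i++) {
            result.append(i);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String a = concatWithPlus(1000);
        String b = concatWithBuilder(1000);
        System.out.println(a.equals(b));  // true
        System.out.println(a.length());   // 2890
    }
}
```

Both methods return the same text; they differ only in how many intermediate objects and array copies are needed to get there.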
By now you should understand the essence of + and how to avoid using it inefficiently. Next, let's take a closer look at the two ways of testing String objects for equality (a frequent topic in Java interview questions).
III. Equality checks (== / equals)
- ==: when both operands are of a primitive type, it compares whether their values are equal; when both operands are of a reference type, it compares whether they point to the same object.
- The equals method compares whether the contents of two objects are equal.
Because String is a reference type, == tests whether two String variables point to the same String object, while equals tests whether the contents of two String objects are equal. When strings are compared in real projects, what matters is almost always whether the contents are equal, so we recommend using the equals method for all string comparisons.
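A minimal sketch of the difference between the two checks follows. The names EqualityDemo, sameObject, and sameContent are hypothetical helpers introduced only for this example; Objects.equals from java.util is used so that a null argument does not throw.

```java
import java.util.Objects;

class EqualityDemo {
    // Reference comparison: true only if both variables point to the same object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    // Content comparison; Objects.equals also copes with null arguments.
    static boolean sameContent(String x, String y) {
        return Objects.equals(x, y);
    }

    public static void main(String[] args) {
        String literal = "abc";
        String copy = new String("abc");  // explicitly creates a second object

        System.out.println(sameObject(literal, copy));   // false
        System.out.println(sameContent(literal, copy));  // true
        System.out.println(sameContent(literal, null));  // false
    }
}
```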
Comparing strings with == shows up in interview questions far more often than in project code. It has little practical value for everyday work and mainly tests how well the candidate understands String. Here are some typical interview questions:
Question 1
String a = "a1";
String b = "a" + 1;
System.out.println(a == b);
Answer: true
Note: When both operands of + are literals, the Java compiler performs the concatenation at compile time. In other words, the compiled class file contains no bytecode for String b = "a" + 1; it has been optimized into the bytecode for String b = "a1". This optimization is easy to understand: when a result can be determined at compile time, computing it then reduces the number of bytecode instructions in the class file, that is, the instructions that must be executed at run time, and so improves efficiency (you can verify this by decompiling the class file with the jad command shown earlier). Likewise, arithmetic on literals of primitive types is also evaluated at compile time. For example, int day = 24 * 60 * 60 becomes int day = 0x15180 after compilation. Because the local variables a and b both point to the "a1" object in the constant pool, a == b prints true.
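The contrast between compile-time and run-time concatenation can be sketched as follows. The class and method names (ConstantFoldingDemo, literalConcatIsPooled, runtimeConcatIsPooled) and the variable one are assumptions made for this illustration.

```java
class ConstantFoldingDemo {
    // "a" + 1 is a constant expression, so the compiler folds it into the literal "a1".
    static boolean literalConcatIsPooled() {
        String a = "a1";
        String b = "a" + 1;
        return a == b;
    }

    // one is an ordinary (non-final) variable, so the concatenation happens at run time
    // and yields a newly created String object.
    static boolean runtimeConcatIsPooled() {
        String a = "a1";
        int one = 1;
        String b = "a" + one;
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(literalConcatIsPooled());  // true
        System.out.println(runtimeConcatIsPooled());  // false
    }
}
```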
Question 2
String hw = "Hello world!";
String h = "Hello";
h = h + " world!";
System.out.println(h == hw);
Answer: false
Note: From the earlier analysis of the concatenation operator +, we know that h = h + " world!" is replaced after compilation with h = (new StringBuilder(String.valueOf(h))).append(" world!").toString(). If you look at StringBuilder's toString method, you can see that it is simply return new String(value, 0, count). That is, h points to an object on the Java heap, while hw points to the object in the constant pool. Although h and hw have the same content, they point to different String objects, so the output is false.
Question 3
public static final String h2 = "Hello";
public static final String h4 = getH();

private static String getH() {
    return "Hello";
}

public static void main(String[] args) {
    String hw = "Hello world!";
    final String h1 = "Hello";
    final String h3 = getH();
    String hw1 = h1 + " world!";
    String hw2 = h2 + " world!";
    String hw3 = h3 + " world!";
    String hw4 = h4 + " world!";
    System.out.println(hw == hw1);
    System.out.println(hw == hw2);
    System.out.println(hw == hw3);
    System.out.println(hw == hw4);
}
Answer: true, true, false, false
Note: The local variable h1 is declared final, which makes it a constant, and it is assigned the string literal "Hello" directly. The Java compiler can therefore determine the value of h1 at compile time and replaces every occurrence of h1 with the literal "Hello" (similar to a constant defined with #define in C/C++). Combined with the earlier point that literals are concatenated at compile time, the statement String hw1 = h1 + " world!" is optimized to String hw1 = "Hello world!". Both hw and hw1 point to the String object in the constant pool, so the output is true. Similarly, h2 is a static constant that is also assigned a literal directly, so every occurrence of h2 is replaced with the literal "Hello" at compile time. In the end hw2 also points to the String object in the constant pool, and the output is true.
The local variable h3 is also declared final, but it is assigned through a method call, so its value cannot be determined at compile time (the code has not run yet, and the compiler does not derive a method's return value by static analysis, even when the method body simply returns a string constant as in this example). Combined with the earlier analysis of +, String hw3 = h3 + " world!" becomes String hw3 = (new StringBuilder(String.valueOf(h3))).append(" world!").toString() after compilation. hw3 therefore points to a String object on the Java heap, and hw == hw3 prints false. For the same reason, hw4 points to a String object on the Java heap, and hw == hw4 prints false.
Supplementary notes
There are two ways to assign a value to a variable of type String:
- 1. Direct literal assignment, that is, String str = "abc";
- 2. Assignment with new, that is, String str = new String("abc");
Method 1: The variable str points directly to the String object for the literal "abc" in the string constant pool.
Method 2: The variable str is assigned through the constructor String(String original), so it points to a String object on the Java heap. The constructor takes a String parameter, and the argument "abc" itself points to the String object in the constant pool.
Apart from pointing to different String objects, the two methods behave the same. For efficiency, we recommend method 1 (direct literal assignment), because method 2 allocates an additional String object on the Java heap.
As mentioned above, string literals are instances of the String class. At compile time they are stored in the constant pool of the class file; when the class file is loaded by the JVM, they enter the runtime constant pool in the method area. To add a new constant to the pool at run time, call String's intern() method.
When intern is called, if the pool already contains a string equal to this String object (as determined by the equals(Object) method), the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to it is returned.
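The two assignment styles and the effect of intern() can be tried out with a short sketch. The class and method names (InternDemo, literalVsNew, literalVsInterned, runtimeConcatInterned) are hypothetical and chosen only for this example; the last method reuses the variables from Question 2.

```java
class InternDemo {
    // A literal and a new String(...) copy are different objects.
    static boolean literalVsNew() {
        String a = "abc";
        String b = new String("abc");
        return a == b;
    }

    // intern() returns the pooled string with the same content, i.e. the literal's object.
    static boolean literalVsInterned() {
        String a = "abc";
        String b = new String("abc").intern();
        return a == b;
    }

    // The run-time concatenation from Question 2, followed by intern().
    static boolean runtimeConcatInterned() {
        String hw = "Hello world!";
        String h = "Hello";
        h = h + " world!";
        return h.intern() == hw;
    }

    public static void main(String[] args) {
        System.out.println(literalVsNew());           // false
        System.out.println(literalVsInterned());      // true
        System.out.println(runtimeConcatInterned());  // true
    }
}
```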
Code for modifying the String object with reflection:
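import java.lang.reflect.Field;

// Relies on the private field layout described earlier (char[] value plus int count).
// Newer JDKs store strings differently, so this code only works on a JDK with that layout.
public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
    String name = "angel";
    String name1 = "angel";

    // Overwrite one character of the backing array through reflection.
    Field strField = String.class.getDeclaredField("value");
    strField.setAccessible(true);
    char[] data = (char[]) strField.get(name);
    data[4] = 'r';
    System.out.println(name);           // the literal now reads "anger"
    System.out.println(name1);          // same pooled object, so it changed too
    System.out.println(name == name1);  // true

    // Change the private final count field as well.
    strField = String.class.getDeclaredField("count");
    strField.setAccessible(true);
    strField.setInt(name, 10);
    int i = (Integer) strField.get(name);
    System.out.println(i);
    System.out.println(name.length());
}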