Java String: getting to the root of the question

Source: Internet
Author: User

An introduction to the fundamentals of the Java String

Java programmers deal with strings almost daily, so it is worth understanding the String class and its usage in depth. This article covers three aspects of String in detail:

• The immutability of String
• The essence of the concatenation operator +
• The two ways of testing equality (== and equals)

First, immutability

To make string operations convenient, the designers of Java abstracted them into the String class, which encapsulates operations such as searching, concatenating, replacing, and extracting substrings. The source code of java.lang.String carries the following description:

The String class represents character strings. All string literals in Java programs, such as "abc", are implemented as instances of this class. Strings are constant; their values cannot be changed after they are created. String buffers support mutable strings. Because String objects are immutable they can be shared.

In other words, the String class represents a sequence of characters. Every string literal in a Java program, such as "abc", is implemented as an instance of the String class. A String object is a constant whose value cannot be changed after it has been created. String buffers support mutable character sequences. Because String objects are immutable, they can be shared.

The immutability of String shows up in two ways:

The String class is declared with the final keyword, so it cannot be subclassed. Because String has no subclasses, its methods cannot be overridden to change their behavior, which helps guarantee security.

The private fields inside the String class, such as value (char[]), offset (int), and count (int) in the older JDK source examined here, are declared final. Once a String object has been created, offset and count cannot change (they are of the primitive type int), and value cannot be made to point to another character array (it is a reference variable). Later JDK releases dropped offset and count and changed the type of value, but value remains private and final.
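The first of the two points can be checked from code. The following is a minimal sketch, assuming only the standard reflection API; the class name StringFinalCheck is made up for illustration.

```java
import java.lang.reflect.Modifier;

class StringFinalCheck {
    // Returns true if java.lang.String is declared final,
    // which is what prevents any class from extending it.
    static boolean stringClassIsFinal() {
        return Modifier.isFinal(String.class.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println("String is final: " + stringClassIsFinal());
    }
}
```

Trying to write class MyString extends String is rejected by the compiler for the same reason.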

At this point one might object: although the value field cannot point to another character array, the contents of the array it points to can change, and if the contents can change, doesn't that make the String object mutable?

That is true in principle, but the String class itself provides no method that modifies the character array. The only way to change it is by unconventional means such as reflection (a code example is given at the end of this article). So although the fields of a String object can be altered after creation, even though they are private and final, doing so requires going around the language through reflection. In normal usage the value of a String object cannot be changed, so we still regard String objects as immutable.

Note: the immutability described here means that the value of a String object cannot be changed after the object is created. A reference variable, however, can be pointed at a different String object, which changes the value the variable represents. For example:

String s = "abc"; s = "def";

The reference variable s first points to a String object with the value "abc", and then to a String object with the value "def". The string that s represents does change, but the String objects "abc" and "def" themselves do not; s simply points to a different object.
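The difference between rebinding a variable and changing an object can be made visible by keeping a second reference to the first object. This is only an illustrative sketch; the class name ReassignDemo and the variable names are assumptions, not part of the original article.

```java
class ReassignDemo {
    // Returns { value seen through the old reference, value of s, upper-cased copy }.
    static String[] run() {
        String s = "abc";
        String original = s;                    // second reference to the same object
        s = "def";                              // rebinds s; the "abc" object is untouched
        String upper = original.toUpperCase();  // returns a new String, original unchanged
        return new String[] { original, s, upper };
    }

    public static void main(String[] args) {
        for (String value : run()) {
            System.out.println(value);
        }
    }
}
```

The old reference still sees "abc" after s has been reassigned, and toUpperCase hands back a new object rather than altering the existing one.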

Designing String as immutable has at least two benefits:

• First, safety. The String class is final, so no subclass can extend it and alter its behavior. In addition, a String object is immutable once created, which makes it safe to use in a multithreaded environment.

• Second, efficiency. Because the String class is final, all of its methods are implicitly final, which allows the compiler to perform some optimizations. And because String objects are immutable, they can be shared in many places without synchronization between threads, which improves efficiency.

Because String objects are immutable, concatenating strings with the + operator can be inefficient. The next section explains in detail what the + operator really is, how string concatenation is carried out underneath, and in which situations concatenating with + is inefficient.

Second, the essence of the concatenation operator +

To understand the essence of the + operator, start with Java compilation. Java code must be compiled into a class file before it runs (the class file structure is not covered in detail here for reasons of space). One part of a class file is the attribute table collection, which includes the Code attribute; put simply, the Code attribute holds the bytecode instructions compiled from a method body. We can therefore see what + really does by looking at the bytecode instructions in the class file. The sample code is as follows:

    public class StringTest {
        public static void main(String[] args) {
            String s = "Hello";
            s = s + " world!";
        }
    }

Since the class file structure is unfamiliar and raw bytecode is hard to read, we do not inspect the compiled StringTest.class file directly. Instead we decompile it with the Jad tool and look at the result. Run the following jad command from the command prompt (cmd):

    jad -o -a -s java StringTest.class

After the command succeeds, a source file StringTest.java appears in the same directory as StringTest.class, with the following contents:

    public class StringTest
    {
        public StringTest()
        {
        //    0    0:aload_0
        //    1    1:invokespecial   #8   <Method void Object()>
        //    2    4:return
        }

        public static void main(String args[])
        {
            String s = "Hello";
        //    0    0:ldc1            #16  <String "Hello">
        //    1    2:astore_1
            s = (new StringBuilder(String.valueOf(s))).append(" world!").toString();
        //    2    3:new             #18  <Class StringBuilder>
        //    3    6:dup
        //    4    7:aload_1
        //    5    8:invokestatic    #20  <Method String String.valueOf(Object)>
        //    6   11:invokespecial   #26  <Method void StringBuilder(String)>
        //    7   14:ldc1            #29  <String " world!">
        //    8   16:invokevirtual   #31  <Method StringBuilder StringBuilder.append(String)>
        //    9   19:invokevirtual   #35  <Method String StringBuilder.toString()>
        //   10   22:astore_1
        //   11   23:return
        }
    }

The decompiled source above contains comment lines giving the bytecode instructions that correspond to each source statement. Clearly the string concatenation operator + no longer appears: after compilation it has been replaced by a call to the append method of StringBuilder (compilers before JDK 1.5 replaced + with calls to the append method of StringBuffer instead). With this compiler, so-called string concatenation with + is essentially a matter of creating a new StringBuilder object and calling its append method. (javac from JDK 9 onward emits a different instruction sequence for +, so the decompiled output shown here reflects older compilers.)
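The rewrite performed by the compiler can be written out by hand and compared with the + form. This is a minimal sketch with made-up names (ConcatEquivalence, withPlus, withBuilder); it shows only that the two forms produce equal contents, not what any particular compiler version emits.

```java
class ConcatEquivalence {
    // Concatenation as written in source code.
    static String withPlus(String s) {
        return s + " world!";
    }

    // The same operation spelled out the way the decompiled output shows it.
    static String withBuilder(String s) {
        return new StringBuilder(String.valueOf(s)).append(" world!").toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("Hello"));
        System.out.println(withBuilder("Hello"));
        System.out.println(withPlus("Hello").equals(withBuilder("Hello")));
    }
}
```

The call to String.valueOf matters when s is null: both forms then yield "null world!" instead of throwing.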

By overloading the + operator for strings at compile time, Java makes string handling convenient, but this also has side effects. For example, a programmer who does not know what + really does may write inefficient code such as the following:

    public String concat() {
        String result = "";
        for (int i = 0; i < 1000; i++) {
            result += i;
        }
        return result;
    }

Inside the for loop body, the statement containing + is replaced at compile time with a call of the following form:

    result = (new StringBuilder(String.valueOf(result))).append(i).toString();

Clearly, every iteration copies the character array of result when the StringBuilder object is constructed, and the toString call copies the StringBuilder's character array again to construct the String object. Each iteration of the for loop thus costs two object creations and two character array copies, so the program is inefficient. The more efficient code is as follows:

    public String concat() {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            result.append(i);
        }
        return result.toString();
    }
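To get a feel for the difference, the copying can be counted under the simplified model described above: one copy of result into the new StringBuilder and one copy in toString per iteration. The sketch below is an assumption-laden estimate, not a measurement, and the names ConcatCostModel, copiedByPlus, and finalLength are invented for illustration.

```java
class ConcatCostModel {
    // Characters copied by the "+=" loop under the simplified model:
    // each iteration copies result into a new StringBuilder, and
    // toString() then copies the builder's contents into a new String.
    static long copiedByPlus(int n) {
        long copied = 0;
        int length = 0;                            // current length of result
        for (int i = 0; i < n; i++) {
            copied += length;                      // copy into the new StringBuilder
            length += String.valueOf(i).length();  // digits appended for i
            copied += length;                      // copy made by toString()
        }
        return copied;
    }

    // Length of the final string, built with a single StringBuilder.
    static int finalLength(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.length();
    }

    public static void main(String[] args) {
        System.out.println("final length:          " + finalLength(1000));
        System.out.println("chars copied with +=:  " + copiedByPlus(1000));
    }
}
```

For 1000 iterations the final string has 2890 characters, while the modeled += loop copies far more than that in total, because the copying grows roughly with the square of the result length.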

By now you should understand the essence of the + operator and how to avoid using it inefficiently. Next, let's take a closer look at the two ways of testing String objects for equality (a frequent topic in Java interviews).

Third, the two ways of testing equality (== and equals)

==: when both operands are of a primitive type, it compares whether their values are equal; when both operands are of a reference type, it compares whether they point to the same object. The equals method is used to compare whether the contents of two objects are equal.

Because String is a reference type, == tests whether two String variables point to the same String object, while equals tests whether the contents of two String objects are equal. String comparisons in real projects are almost always about content, so the equals method is the recommended way to compare.
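A short sketch of the distinction, using two separately constructed objects with the same contents. The class name EqualityDemo and its helper methods are hypothetical.

```java
class EqualityDemo {
    // Reference comparison: true only if both variables point to one object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    // Content comparison: true if both strings hold the same characters.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String a = new String("hello");
        String b = new String("hello");
        System.out.println(sameObject(a, b));   // two distinct objects
        System.out.println(sameContent(a, b));  // identical contents
    }
}
```

Here == reports false because new always creates a distinct object, while equals reports true.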

Comparing strings with == shows up in interview questions far more often than in project code. It says little about real work and mainly serves as a test of how well one understands String. Some such questions follow:

Question 1

    String a = "a1";
    String b = "a" + 1;
    System.out.println(a == b);

Answer: true. Explanation: when two literals are concatenated, the concatenation is actually performed by the Java compiler at compile time. That is, the generated class file contains no bytecode for String b = "a" + 1; it has been optimized into the bytecode for String b = "a1". This optimization is easy to understand: computing at compile time whatever can be determined then reduces the number of bytecode instructions in the class file, and so the number of instructions executed at run time, which improves efficiency (you can verify this by decompiling the class file with the jad command above). Similarly, arithmetic on primitive-type literals is also evaluated at compile time; for example, int day = 24 * 60 * 60 is replaced with int day = 0x15180. Because the concatenation was done at compile time, the local variables a and b both point to the "a1" object in the constant pool, so a == b prints true.
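Both kinds of compile-time folding mentioned in the answer can be checked in a few lines. This is a sketch only; ConstantFoldingDemo and SECONDS_PER_DAY are names chosen for the example.

```java
class ConstantFoldingDemo {
    // Folded by the compiler into a single int constant.
    static final int SECONDS_PER_DAY = 24 * 60 * 60;

    // "a" + 1 is a constant expression, so b refers to the same
    // pooled "a1" object as the literal assigned to a.
    static boolean literalConcatIsSameObject() {
        String a = "a1";
        String b = "a" + 1;
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(literalConcatIsSameObject());
        System.out.println(SECONDS_PER_DAY == 0x15180);
    }
}
```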

Question 2

    String hw = "Hello world!";
    String h = "Hello";
    h = h + " world!";
    System.out.println(h == hw);

Answer: false. Explanation: from the earlier analysis of the + operator, we know that h = h + " world!" is replaced after compilation with h = (new StringBuilder(String.valueOf(h))).append(" world!").toString(). Looking at the toString method of StringBuilder, you can see that it is actually return new String(value, 0, count), so h points to an object on the Java heap, whereas hw points to an object in the constant pool. Although h and hw have the same contents, they point to different String objects, so the output is false.
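Question 2 can be wrapped in a small method that reports both comparisons side by side. The class name RuntimeConcatDemo is an assumption made for this sketch.

```java
class RuntimeConcatDemo {
    // Returns { h == hw, h.equals(hw) } for the code in Question 2.
    static boolean[] run() {
        String hw = "Hello world!";
        String h = "Hello";
        h = h + " world!";   // h is not a constant, so this runs at run time
        return new boolean[] { h == hw, h.equals(hw) };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("same object:  " + r[0]);
        System.out.println("same content: " + r[1]);
    }
}
```

The concatenation produces a new object at run time, so the reference comparison fails even though the contents match.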

Question 3

    public static final String h2 = "Hello";
    public static final String h4 = getH();

    private static String getH() {
        return "Hello";
    }

    public static void main(String[] args) {
        String hw = "Hello world!";
        final String h1 = "Hello";
        final String h3 = getH();
        String hw1 = h1 + " world!";
        String hw2 = h2 + " world!";
        String hw3 = h3 + " world!";
        String hw4 = h4 + " world!";
        System.out.println(hw == hw1);
        System.out.println(hw == hw2);
        System.out.println(hw == hw3);
        System.out.println(hw == hw4);
    }

Answer: true, true, false, false. Explanation: the local variable h1 is declared final, which makes it a constant, and it is assigned the string literal "Hello" directly. The Java compiler can therefore determine the value of h1 at compile time and replaces every occurrence of h1 with the literal "Hello" (similar to a constant defined with #define in C++). Combined with the earlier point that literals are concatenated at compile time, the statement String hw1 = h1 + " world!" is optimized to String hw1 = "Hello world!". Both hw and hw1 point to the same String object in the constant pool, so the output is true. Likewise, h2 is a static constant assigned directly from a literal, so every occurrence of h2 is replaced with the literal "Hello" after compilation. hw2 therefore also refers to the String object in the constant pool, and the output is true.

The local variable h3 is also declared final and is a constant, but it is assigned through a method call, so its value cannot be determined at compile time (the code has not run yet, and the compiler does not work out a method's return value by static analysis, even when the method body simply returns a string constant as in this example). Following the earlier analysis of +, String hw3 = h3 + " world!" is compiled into String hw3 = (new StringBuilder(String.valueOf(h3))).append(" world!").toString(). hw3 points to a String object on the Java heap, so hw == hw3 prints false. Similarly, hw4 also points to a String object on the Java heap, so hw == hw4 prints false.

Supplementary notes

There are two ways to assign a value to a variable of type String:

• One, direct literal assignment, that is, String str = "abc";
• Two, assignment with new, that is, String str = new String("abc");

With method one, the variable str points directly to the String object for the literal "abc" in the constant pool. With method two, the variable str is assigned through the constructor String(String original) and points to a String object on the Java heap. The constructor receives a String argument, and that argument "abc" points to a String object in the constant pool.

The two ways of assigning a String variable are no different except that the variables point to different String objects. From the point of view of efficiency, method one is recommended, because method two allocates one more String object on the Java heap.
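The two assignment styles can be compared directly. The following sketch uses a made-up class name, AssignmentDemo, and simply reports which references coincide.

```java
class AssignmentDemo {
    // Returns { s1 == s2, s1 == s3, s1.equals(s3) }.
    static boolean[] run() {
        String s1 = "abc";              // method one: literal from the constant pool
        String s2 = "abc";              // same literal, same pooled object
        String s3 = new String("abc");  // method two: an extra object on the heap
        return new boolean[] { s1 == s2, s1 == s3, s1.equals(s3) };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("literal == literal:    " + r[0]);
        System.out.println("literal == new String: " + r[1]);
        System.out.println("literal equals new:    " + r[2]);
    }
}
```

Two identical literals share one object, whereas new String always produces a separate object with equal contents.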

As mentioned earlier, string literals are treated directly as instances of the String class. They are stored in the constant pool of the class file at compile time, and when the class file is loaded by the JVM they enter the runtime constant pool in the method area. To add a new constant to the pool at run time, call the intern() method of String. When intern is called, if the pool already contains a string equal to this String object (as determined by the equals(Object) method), the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to it is returned.
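The effect of intern() is easiest to see with a string assembled at run time. This is a minimal sketch; the class name InternDemo is hypothetical.

```java
class InternDemo {
    // Returns { built == literal, built.intern() == literal }.
    static boolean[] run() {
        String literal = "abc";                                        // pooled constant
        String built = new StringBuilder("ab").append("c").toString(); // created at run time
        return new boolean[] { built == literal, built.intern() == literal };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("before intern: " + r[0]);
        System.out.println("after intern:  " + r[1]);
    }
}
```

Because the pool already holds "abc" from the literal, intern() returns that pooled object rather than the one built at run time.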

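Attached: code that modifies a String object through reflection:

    import java.lang.reflect.Field;

    // Relies on the private fields value (char[]) and count (int) described
    // earlier; on JDKs whose String class lacks these fields it fails instead.
    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
        String name = "angel";
        String name1 = "angel";   // same pooled object as name

        // Overwrite one character in the private backing array.
        Field strField = String.class.getDeclaredField("value");
        strField.setAccessible(true);
        char[] data = (char[]) strField.get(name);
        data[4] = 'r';
        System.out.println(name);
        System.out.println(name1);
        System.out.println(name == name1);

        // Overwrite the private length field.
        strField = String.class.getDeclaredField("count");
        strField.setAccessible(true);
        strField.setInt(name, 10);
        int i = (Integer) strField.get(name);
        System.out.println(i);
        System.out.println(name.length());
    }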

 
