A Thorough Understanding of String in Java

Source: Internet
Author: User
Tags: stringreplace
To understand how String works in Java, first be clear that String is an immutable class. What is an immutable class? Simply put, an instance of an immutable class cannot be modified: all the information an instance holds must be supplied when it is created, and it stays fixed for the lifetime of the object. Why did Java's designers make String immutable? You can ask James Gosling :). But immutable classes do have real advantages. First, an immutable object has a single state, so it is simple and easy to maintain. Second, such objects are inherently thread-safe and need no synchronization. In addition, immutable objects, and even their internals, can be shared freely. (For details, see Effective Java, Item 13.) String is used everywhere in Java and even appears in class files, so designing it as a simple, lightweight immutable class is appropriate.

1. Creation

Now that we know String is immutable, we can look at how String objects are created. There are two ways to create a String object:

    String str1 = new String("ABC");
    String str2 = "ABC";

Although both statements yield a reference to a String object, the JVM treats them differently. For the first form, the JVM always creates a new String object on the heap and returns a reference to it. For the second form, the JVM first looks in its internally maintained string pool for a string equal to the literal, as determined by String's equals method. If one exists, the reference to that existing object is returned and no new object is created on the heap. If none exists, the JVM creates a new String object on the heap, returns its reference, and adds that reference to the string pool.

Note: the JVM never places an object created with the first form into the string pool on its own initiative; only the literal passed to the constructor is pooled. To obtain the pooled instance, the program must call String's intern method. Consider the following example:

    // The literal "ABC" goes into the string pool, and new String(...)
    // creates a second, separate String object on the heap.
    String str1 = new String("ABC");

    // The JVM finds "ABC" in the string pool and returns the pooled object,
    // which is not the object str1 refers to.
    // There are now two String objects on the heap.
    String str2 = "ABC";

    if (str1 == str2) {
        System.out.println("str1 == str2");
    } else {
        System.out.println("str1 != str2");
    }
    // Prints str1 != str2, because they are two different objects on the heap.

    String str3 = "ABC";
    // The JVM finds an "ABC" object in the string pool, because "ABC" equals "ABC",
    // so str3 receives the same reference as str2: both point to the same object.

    if (str2 == str3) {
        System.out.println("str2 == str3");
    } else {
        System.out.println("str2 != str3");
    }
    // Prints str2 == str3.

Now look at the following example:

    String str1 = new String("ABC"); // The JVM creates a String object on the heap.

    // The program explicitly asks for the pooled string. intern works as follows:
    // it looks in the string pool for a string equal to "ABC"; if one is there, it
    // returns that reference, otherwise it adds this string to the pool and returns it.
    // After this statement, the String object str1 originally referred to is
    // garbage and will be collected by the GC.
    str1 = str1.intern();

    // The JVM finds an "ABC" object in the string pool, because "ABC" equals "ABC",
    // so str2 receives the same reference as str1: both refer to the same object.
    // Only one live "ABC" object remains on the heap.
    String str2 = "ABC";

    if (str1 == str2) {
        System.out.println("str1 == str2");
    } else {
        System.out.println("str1 != str2");
    }
    // Prints str1 == str2.

Why can the JVM handle String objects this way? Because String is immutable. Since the referenced object never changes once it is created, any number of references can share one object without affecting each other.

2. Concatenation

Java programmers should know that misusing the string concatenation operator hurts performance. Where does the performance problem come from? In the final analysis, from the immutability of the String class. A String object cannot change its internal state once it is created, yet concatenation requires the string to grow, which means changing its internal state. The two requirements conflict. What can be done? To preserve immutability, a new String object has to be created after each concatenation to represent the new string. In other words, every concatenation produces a new object. If concatenation is performed frequently, a large number of objects are created and performance suffers.

To solve this problem, the JDK provides a mutable companion class for String: StringBuffer. A StringBuffer object is mutable; appending to it changes its internal data structure but does not create a new object, so performance improves considerably.
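The difference between concatenating an immutable String and appending to a mutable StringBuffer can be sketched as follows. This is a minimal illustration, not code from the original article; the class name ConcatDemo and its helper methods concat and append are made up for the example.

```java
final class ConcatDemo {

    /** Joins two strings with +. The result is a new String; base is left untouched. */
    static String concat(String base, String suffix) {
        return base + suffix;
    }

    /** Appends in place. StringBuffer.append returns the very buffer it was called on. */
    static StringBuffer append(StringBuffer buffer, String suffix) {
        return buffer.append(suffix);
    }

    public static void main(String[] args) {
        String base = "a";
        String joined = concat(base, "b");
        System.out.println(base);            // prints a (the original is unchanged)
        System.out.println(joined);          // prints ab
        System.out.println(base == joined);  // prints false (a new object was created)

        StringBuffer buffer = new StringBuffer("a");
        StringBuffer result = append(buffer, "b");
        System.out.println(buffer);            // prints ab (the same object was modified)
        System.out.println(buffer == result);  // prints true (no new buffer was created)
    }
}
```

In a loop that appends many times, the first style allocates a fresh String on every iteration, while the second keeps reusing one buffer.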
For single-threaded code, JDK 5.0 also provides the StringBuilder class. Because it needs no synchronization, it can improve performance further in a single-threaded environment.

3. Length of a string

We can build long strings with the concatenation operator. How many characters can a String object hold at most? Looking at the source code of String, we see that the class records the number of characters in a field named count, whose type is int. Since int is a signed 32-bit type, the upper bound is 2^31 - 1 characters, about 2G, not 2^32.

However, a string literal written in source code, such as String str = "aaaa";, can contain at most 65534 ASCII characters between the double quotes. Why? In the class file specification, the CONSTANT_Utf8_info structure records the length of the string in a 16-bit unsigned integer, so the encoded literal can occupy at most 65535 bytes. The 65534 figure is one less than that; it most likely reflects the compiler's own check rather than any space reserved for the null character. The class file stores characters in a modified form of UTF-8, in which the null character takes two bytes, but that only costs space when a string actually contains it. Because of this encoding, a literal that contains non-ASCII characters such as Chinese can hold fewer characters (a Chinese character occupies three bytes). If the limit is exceeded, the compiler reports an error at compile time.

    public class Test {
        public static void stringReplace(String text) {
            // text receives a copy of the reference held by textString, so text
            // also points to the "java" object that textString points to.
            // text.replace('j', 'i') returns a new string, "iava".
            // text = text.replace('j', 'i') only re-points the local variable text at "iava".
            // The parameter is a copy of the reference and String is immutable,
            // so the caller's textString is not changed.
            text = text.replace('j', 'i');
        }

        public static void bufferReplace(StringBuffer text) {
            // text receives a copy of the reference held by textBuffer.
            // append("C") modifies the shared StringBuffer in place, so although
            // nothing is returned, the object textBuffer points to is affected.
            // The previous method leaves "java"; this one leaves "javaC".
            text = text.append("C");
        }

        public static void main(String[] args) {
            String textString = new String("java");
            StringBuffer textBuffer = new StringBuffer("java");

            stringReplace(textString);
            bufferReplace(textBuffer);

            System.out.println(textString + textBuffer);
        }
    }

Let's start with the question. Look at this code first:

    String a = "ab";
    String b = "a" + "b";
    System.out.println((a == b));

What is printed? People have tested me with questions like this, and I have tested others (it is quite fun; try it on someone). The usual answers are as follows.

1. True

"a" + "b" evaluates to "ab", so both a and b are "ab". The content is the same, so they are "equal" and the result is true.

This is usually the answer from people new to Java.

2. False

"a" + "b" generates a new object "ab", which is a different object from the one in String a = "ab". Since (a == b) compares object references, they are not equal and the result is false.

This is usually the answer from people who have some understanding of Java strings.

3. True

String a = "ab"; creates a new object "ab". Then String b = "a" + "b"; gives b the value "ab", but no new object is created; the existing "ab" object is fetched from the JVM string constant pool. Therefore a and b refer to the same String object, the two references are equal, and the result is true.

Anyone who gives this answer is close to an expert, with a good grasp of the string mechanism in Java.

Unfortunately, this answer is still not accurate enough.
More precisely, String b = "a" + "b"; is never computed at run time; after compilation the statement is simply String b = "ab";.

Answer 3 applies to the following situation:

    String a = "ab";
    String b = "ab";
    System.out.println((a == b));

If String b = "a" + "b"; were executed at run time, answer 3 could not explain the result, because adding two strings at run time produces a new object. (This is explained later in the article.)

4. True

Here is my answer: compile-time optimization + the mechanism in answer 3 = true.

For String b = "a" + "b";, the compiler treats "a" + "b" as a constant expression and evaluates it during compilation, producing "ab" directly. The problem then reduces to:

    String a = "ab";
    String b = "ab";
    System.out.println((a == b));

By the reasoning in answer 3, the result is true.

The open question is that String is not a primitive type. For something like

    int secondsOfDay = 24 * 60 * 60;

it is easy to accept that the compiler evaluates the constant expression at compile time. But String is an object type, not a primitive. Will the compiler also treat "a" + "b" as a constant expression?

Here is a simple experiment that supports my inference. First compile this class:

    public class Test {
        private String a = "aa";
    }

Back up the class file, then change the source to:

    public class Test {
        private String a = "a" + "a";
    }

Compile again and open both class files in an editor that can show binary content, such as UE. The two class files are identical, byte for byte.

So it is clear that String b = "a" + "b"; involves no run-time processing; such code is evaluated directly at compile time. Next, which String + expressions does the compiler treat as constant expressions?

    String b = "a" + "b";

String + String clearly qualifies. What about String + a primitive type?
    String a = "a1";
    String b = "a" + 1;
    System.out.println((a == b)); // result = true

    String a = "atrue";
    String b = "a" + true;
    System.out.println((a == b)); // result = true

    String a = "a3.4";
    String b = "a" + 3.4;
    System.out.println((a == b)); // result = true

As you can see, the compiler also evaluates String + primitive as a constant expression. Note that every operand so far has been a literal. Now let's try a variable:

    String a = "ab";
    String bb = "b";
    String b = "a" + bb;
    System.out.println((a == b)); // result = false

This is easy to understand: bb in "a" + bb is a variable, so the expression cannot be evaluated at compile time. It also shows why answer 3 is wrong: when String + String is performed at run time, a new object is created rather than fetched from the JVM string pool.

Now make bb a constant variable:

    String a = "ab";
    final String bb = "b";
    String b = "a" + bb;
    System.out.println((a == b)); // result = true

The result is true again; the compiler handles this case as well. Now consider the following situation:

    String a = "ab";
    final String bb = getBB();
    String b = "a" + bb;
    System.out.println((a == b)); // result = false

    private static String getBB() {
        return "b";
    }

Here bb is final, but its value comes from a method call and is only known at run time, so it is not a compile-time constant and the concatenation happens at run time.

It seems that String optimization in Java (in both the compiler and the JVM) has been pushed very far. A String "object" can hardly be regarded as an ordinary object; Java handles String almost like a primitive type and optimizes it wherever it can. In addition, the + operator on String is the only case of "operator overloading" in the Java language (which will look familiar to anyone who has worked with C++).
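The cases above can be gathered into one runnable class. This is a sketch for experimenting, not code from the original article; the class name ConstantFoldingDemo and its method names are made up for the example. The last method is an extra case: it assumes the intern method discussed earlier and shows that a string built at run time can still be mapped back to the pooled literal.

```java
final class ConstantFoldingDemo {

    /** Stands in for any value that is only known at run time. */
    static String runtimeB() {
        return "b";
    }

    static boolean literalPlusLiteral() {
        String a = "ab";
        String b = "a" + "b";   // constant expression, evaluated by the compiler
        return a == b;
    }

    static boolean literalPlusVariable() {
        String a = "ab";
        String bb = "b";
        String b = "a" + bb;    // bb is not a constant: concatenated at run time
        return a == b;
    }

    static boolean literalPlusFinalConstant() {
        String a = "ab";
        final String bb = "b";  // final and initialized with a literal: a constant variable
        String b = "a" + bb;
        return a == b;
    }

    static boolean literalPlusFinalFromMethod() {
        String a = "ab";
        final String bb = runtimeB(); // final, but the value is only known at run time
        String b = "a" + bb;
        return a == b;
    }

    static boolean internedRuntimeResult() {
        String a = "ab";
        String bb = "b";
        String b = ("a" + bb).intern(); // intern returns the pooled "ab"
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(literalPlusLiteral());         // prints true
        System.out.println(literalPlusVariable());        // prints false
        System.out.println(literalPlusFinalConstant());   // prints true
        System.out.println(literalPlusFinalFromMethod()); // prints false
        System.out.println(internedRuntimeResult());      // prints true
    }
}
```

In everyday code, string contents should be compared with equals; the == comparisons here exist only to reveal which expressions end up sharing the pooled object.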