String is one of the most important data types in Java. Strings are among the most heavily used objects in software development, and string objects often occupy the largest share of heap memory. Processing strings efficiently therefore improves the overall performance of a system.
In Java, a String object can be thought of as a further encapsulation of a char array. In the classic implementation (JDK 6 and earlier) its main components are a char array, an offset, and a length. The char array holds the characters and may be a superset of the text the String object represents; the offset and length locate the actual content within that array. (See the JDK source for the definitions of the char array, offset, and length fields.)
The three basic characteristics of a String object:
1. Immutability. Once a String object has been created, it cannot be changed. This is the immutable object pattern: the state of an object never changes after construction. Immutability is most useful when an object is shared by many threads and accessed frequently, because it removes the need for synchronization and lock waits, which can greatly improve performance.
2. Constant pool optimization. When two string literals (or interned strings) have the same value, they refer to the same copy in the constant pool. When the same string appears repeatedly, this saves a significant amount of memory.
3. The class is declared final. String cannot have any subclasses, which protects the security of the system.
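The three characteristics above can be observed directly. The following is a minimal sketch; the class name StringTraitsDemo and its method names are made up for this illustration.

```java
import java.lang.reflect.Modifier;

class StringTraitsDemo {

    // Immutability: "modifying" methods return a new object and leave the original untouched.
    static boolean originalUnchanged() {
        String original = "abc";
        String upper = original.toUpperCase();
        return original.equals("abc") && upper.equals("ABC");
    }

    // Constant pool: identical literals resolve to the same pooled instance.
    static boolean literalsShareInstance() {
        String a = "abc";
        String b = "abc";
        return a == b;
    }

    // new String(...) always creates a distinct object; intern() maps it back to the pooled one.
    static boolean internReturnsPooled() {
        String a = "abc";
        String copy = new String("abc");
        return copy != a && copy.intern() == a;
    }

    // String is declared final, so it cannot be subclassed.
    static boolean stringIsFinal() {
        return Modifier.isFinal(String.class.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(originalUnchanged());     // true
        System.out.println(literalsShareInstance()); // true
        System.out.println(internReturnsPooled());   // true
        System.out.println(stringIsFinal());         // true
    }
}
```

Comparing with == is used here only to expose object identity; ordinary code should compare string contents with equals().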
The String class includes methods for examining individual characters, comparing strings, searching strings, extracting substrings, creating a copy of a string, converting all characters to uppercase or lowercase, and so on. Skilled use of these methods is of great help in enterprise development.
Extracting a substring is one of the most common string operations, and Java provides two methods for it:
1. substring(int beginIndex, int endIndex)
2. substring(int beginIndex)
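As a quick illustration of the two overloads, here is a small sketch. The class SubstringDemo and its helper methods are hypothetical wrappers added only so the calls can be shown side by side.

```java
class SubstringDemo {

    // substring(int beginIndex): from beginIndex to the end of the string.
    static String tail(String s, int beginIndex) {
        return s.substring(beginIndex);
    }

    // substring(int beginIndex, int endIndex): beginIndex inclusive, endIndex exclusive.
    static String slice(String s, int beginIndex, int endIndex) {
        return s.substring(beginIndex, endIndex);
    }

    // Returns true when the JDK rejects the range with an index exception.
    static boolean rejectsBadRange(String s, int beginIndex, int endIndex) {
        try {
            s.substring(beginIndex, endIndex);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(tail("hello world", 6));       // world
        System.out.println(slice("hello world", 0, 5));   // hello
        System.out.println(rejectsBadRange("abc", 3, 2)); // true
    }
}
```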
The source code of substring(int beginIndex, int endIndex):

    public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > value.length) {
            throw new StringIndexOutOfBoundsException(endIndex);
        }
        int subLen = endIndex - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return ((beginIndex == 0) && (endIndex == value.length)) ? this
                : new String(value, beginIndex, subLen);
    }
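At the end of the method a new String object is returned. In JDK 6 and earlier, that object was created through the following constructor:

    // Package-private constructor that shares the value array for speed.
    String(int offset, int count, char value[]) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }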
In the source code this is a package-private constructor, designed to share a char array between String objects quickly and efficiently. When a substring is created this way, however, the entire value array of the original string is handed to the new substring by reference. If the substring is short but the original string is very long, the substring still holds on to all of the original content and keeps that memory alive, using only its offset and length to determine its own value. This approach increases speed but wastes space: it trades space for time.
Example code:

    import java.util.ArrayList;
    import java.util.List;

    public class Test {

        public static void main(String[] args) {
            List<String> handler = new ArrayList<String>();
            for (int i = 0; i < 100000; i++) {
                HugeStr str = new HugeStr();
                // ImpHugeStr str = new ImpHugeStr();
                handler.add(str.getSubString(0, 5));
            }
        }

        static class ImpHugeStr {
            private String str = new String(new char[10000]);

            // Extract the substring, then wrap it in a freshly built String.
            public String getSubString(int begin, int end) {
                return new String(str.substring(begin, end));
            }
        }

        static class HugeStr {
            private String str = new String(new char[10000]);

            // Extract the substring directly.
            public String getSubString(int begin, int end) {
                return str.substring(begin, end);
            }
        }
    }
ImpHugeStr rebuilds the result with a String constructor that does not leak. The leaky String returned by substring() then loses its last strong reference and can be reclaimed by the garbage collector, which keeps the system's memory use stable.
substring() causes the memory leak because it uses the String(int offset, int count, char[] value) constructor, which trades space for time.
That sharing constructor was used by substring() through JDK 6. From JDK 7 (update 6) onward, substring() uses the following constructor instead:

    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset + count);
    }
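Because this code uses this.value = Arrays.copyOfRange(value, offset, offset + count), the new string holds only its own copy of the characters it needs. The original value array is no longer referenced by the substring and can be garbage collected, so the memory leak does not occur.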
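The effect of Arrays.copyOfRange can be seen on a plain char array. This is only a sketch of the copying idea, not the JDK's String internals; the class CopyOfRangeDemo and the method slice are invented for the illustration.

```java
import java.util.Arrays;

class CopyOfRangeDemo {

    // Copies count chars starting at offset into a new, independent array.
    static char[] slice(char[] source, int offset, int count) {
        return Arrays.copyOfRange(source, offset, offset + count);
    }

    public static void main(String[] args) {
        char[] huge = new char[10000];
        Arrays.fill(huge, 'x');

        char[] small = slice(huge, 0, 5);
        huge[0] = 'y'; // changing the source does not affect the copy

        System.out.println(small.length); // 5
        System.out.println(small[0]);     // x
    }
}
```

Since the copy has its own five-element array, nothing in it keeps the 10000-element source alive once the caller drops its reference.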
- Segmentation and lookup of strings
Splitting and searching are also among the most common string operations. Splitting cuts the original string into a set of smaller strings based on a delimiter.
There are roughly three ways to split a string:
1. The most basic way to split a string:
split() is the most basic way to split a string, yet it is very powerful. Its argument can be a regular expression, which allows strings to be split according to complex rules. For simple splitting, however, the performance of split() is less satisfactory, so frequent use in performance-sensitive code is undesirable.
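A short sketch of regex-based splitting follows. The class SplitDemo, the sample delimiter set, and the precompiled SEPARATORS pattern are assumptions chosen for the illustration; precompiling avoids recompiling the same expression on every call.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {

    // Compiled once and reused: a semicolon or comma, followed by optional whitespace.
    private static final Pattern SEPARATORS = Pattern.compile("[;,]\\s*");

    // Regex split with the precompiled pattern.
    static String[] splitFields(String s) {
        return SEPARATORS.split(s);
    }

    // Plain split on a single literal delimiter.
    static String[] splitSimple(String s) {
        return s.split(";");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitFields("a;b, c;d"))); // [a, b, c, d]
        System.out.println(Arrays.toString(splitSimple("a;b;c")));    // [a, b, c]
    }
}
```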
2. Use the StringTokenizer class to split the string:
StringTokenizer is a utility class provided by the JDK specifically for splitting a string into substrings.
Its constructor is new StringTokenizer(String str, String delim), where str is the string to be split and delim is the set of delimiter characters. Once a StringTokenizer object has been created, nextToken() returns the next token and hasMoreTokens() reports whether any tokens remain. See the API documentation for more detailed usage of StringTokenizer.
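The hasMoreTokens()/nextToken() loop described above might look like this. The class TokenizerDemo and the method tokens are hypothetical names for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {

    // Collects every token of s, treating each character in delim as a delimiter.
    static List<String> tokens(String s, String delim) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(s, delim);
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("a;b;c", ";"));  // [a, b, c]
        System.out.println(tokens("a;b;;c", ";")); // [a, b, c]
    }
}
```

Note that StringTokenizer skips empty tokens between consecutive delimiters, so its result can differ from split() on the same input.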
3. Implement your own splitting method:
This approach uses two methods of the String object: indexOf() and substring(). As mentioned earlier, substring() trades space for time and executes quickly, and indexOf() is also very efficient.
Implementation code:

    String str = "";  // string to be split
    while (true) {
        String subStr = null;
        int j = str.indexOf(";");      // position of the separator
        if (j < 0) break;
        subStr = str.substring(0, j);  // extract the next token
        str = str.substring(j + 1);    // keep the remainder for the next pass
    }
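The same indexOf()/substring() technique can be wrapped in a reusable method. The sketch below is one possible variant, not the original author's code: the class IndexOfSplitDemo is invented, and it tracks a start index instead of repeatedly shortening the string. It also keeps the final token after the last separator.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfSplitDemo {

    // Splits s on a single separator character using only indexOf() and substring().
    static List<String> split(String s, char separator) {
        List<String> result = new ArrayList<>();
        int start = 0;
        int j;
        while ((j = s.indexOf(separator, start)) >= 0) {
            result.add(s.substring(start, j)); // token before the separator
            start = j + 1;                     // continue after the separator
        }
        result.add(s.substring(start));        // trailing token (may be empty)
        return result;
    }

    public static void main(String[] args) {
        System.out.println(split("a;b;c", ';')); // [a, b, c]
        System.out.println(split("a;;c", ';'));  // [a, , c]
    }
}
```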
Comparing the three approaches: split() is the most powerful but the least efficient; StringTokenizer is more efficient than split(), so prefer StringTokenizer wherever it is sufficient; the third approach is the most efficient, but its readability is rather poor.
In addition, the String object provides a charAt(int index) method that returns the character at the given index. Its function is the inverse of indexOf(), and it is equally efficient. For example, a project often needs to check whether a string starts or ends with some fixed text. The startsWith() and endsWith() methods of the String object come to mind first, but using charAt() instead can be faster. For example, "abcd".startsWith("abc") can be rewritten as "abcd".charAt(0) == 'a' && "abcd".charAt(1) == 'b' && "abcd".charAt(2) == 'c', which improves efficiency when the check runs a very large number of times.
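A hand-written prefix check with charAt() could be sketched as follows. The class CharAtPrefixDemo and the method startsWithAbc are illustrative names, and the length guard is an addition needed to avoid an exception on short inputs. Whether this actually beats startsWith() depends on the JVM and should be measured.

```java
class CharAtPrefixDemo {

    // Equivalent to s.startsWith("abc"), written with charAt().
    static boolean startsWithAbc(String s) {
        return s.length() >= 3
                && s.charAt(0) == 'a'
                && s.charAt(1) == 'b'
                && s.charAt(2) == 'c';
    }

    public static void main(String[] args) {
        System.out.println(startsWithAbc("abcd")); // true
        System.out.println(startsWithAbc("ab"));   // false
        System.out.println(startsWithAbc("xbcd")); // false
    }
}
```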
- StringBuffer and StringBuilder
A String object is immutable, so every modification produces a new object, and frequent modification performs poorly. For this reason the JDK provides classes dedicated to building and modifying strings: StringBuffer and StringBuilder.
1. Concatenating string constants:
A String object is immutable, so once an instance has been created it cannot be changed. Consider the following code:

    String str = "abc" + "de" + "f" + "gh";

In theory, "abc" and "de" first produce an "abcde" object, and then "abcdef" and "abcdefgh" objects are produced in turn, which would be inefficient.
The same result built with StringBuilder:

    StringBuilder sb = new StringBuilder();
    sb.append("abc");
    sb.append("de");
    sb.append("f");
    sb.append("gh");

In practice, however, Java fully optimizes the concatenation of constant strings at compile time: the operands are merged into a single long string during compilation, so the first form is already efficient.
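The compile-time folding of constant operands can be observed through object identity. This is a small sketch; the class ConstantFoldingDemo and its method names are made up for the example, and == is used only to show whether two references point at the same pooled object.

```java
class ConstantFoldingDemo {

    // All operands are literals, so the compiler emits the single constant "abcdefgh".
    static boolean literalConcatIsPooled() {
        String str = "abc" + "de" + "f" + "gh";
        return str == "abcdefgh";
    }

    // The operands are ordinary (non-final) variables, so the concatenation
    // happens at run time and yields a new object.
    static boolean variableConcatIsPooled() {
        String str1 = "abc";
        String str2 = "de";
        String str = str1 + str2;
        return str == "abcde";
    }

    public static void main(String[] args) {
        System.out.println(literalConcatIsPooled());  // true
        System.out.println(variableConcatIsPooled()); // false
    }
}
```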
2. Concatenating string variables:
Concatenation of string variables:

    String str1 = "abc";
    String str2 = "de";
    String str3 = "f";
    String str4 = "gh";
    String str = str1 + str2 + str3 + str4;

The Java compiler also optimizes this case to some extent: for concatenation of string variables it uses a StringBuilder object to build the result. Even so, when writing code it is recommended to use StringBuffer or StringBuilder explicitly. Furthermore, repeated concatenation with "+=" or "+" is less efficient than the concat() method of the String object, and concat() is in turn much slower than StringBuilder. Therefore, use StringBuilder wherever possible in your project to improve efficiency.
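The difference matters most inside loops, where each "+=" builds a fresh String. Below is a minimal sketch contrasting the two styles; the class ConcatLoopDemo and its method names are hypothetical, and no timing figures are claimed.

```java
class ConcatLoopDemo {

    // Each iteration creates a new String holding everything accumulated so far.
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // A single StringBuilder accumulates all parts and is converted once at the end.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"abc", "de", "f", "gh"};
        System.out.println(joinWithPlus(parts));    // abcdefgh
        System.out.println(joinWithBuilder(parts)); // abcdefgh
    }
}
```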
3. Choosing between StringBuffer and StringBuilder
Almost all StringBuffer methods are synchronized, while StringBuilder has almost no synchronization. Synchronized methods consume some system resources, so StringBuilder is more efficient than StringBuffer. In a multithreaded environment, however, StringBuilder cannot guarantee thread safety. Use the better-performing StringBuilder when thread safety is not a concern, and use StringBuffer when the system has thread-safety requirements.
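The thread-safety guarantee of StringBuffer can be sketched with several threads appending to one shared buffer. The class BufferThreadsDemo and the method appendConcurrently are invented for this illustration. Swapping in StringBuilder would make the result unpredictable, which is why only the StringBuffer case is shown.

```java
class BufferThreadsDemo {

    // Starts the given number of threads, each appending perThread characters
    // to one shared StringBuffer, and returns the final length.
    static int appendConcurrently(int threads, int perThread) {
        StringBuffer buffer = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    buffer.append('x'); // synchronized, so no append is lost
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return buffer.length();
    }

    public static void main(String[] args) {
        System.out.println(appendConcurrently(4, 10000)); // 40000
    }
}
```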
In addition, both StringBuffer and StringBuilder allow an initial capacity to be specified. When the capacity is exceeded, the internal array is expanded as follows:

    void expandCapacity(int minimumCapacity) {
        int newCapacity = value.length * 2 + 2;  // grow the capacity
        if (newCapacity - minimumCapacity < 0)
            newCapacity = minimumCapacity;
        if (newCapacity < 0) {
            if (minimumCapacity < 0)  // overflow
                throw new OutOfMemoryError();
            newCapacity = Integer.MAX_VALUE;
        }
        value = Arrays.copyOf(value, newCapacity);  // copy into the larger array
    }

Estimating the required capacity in advance effectively reduces the number of array expansions and copies, and thereby improves system performance.
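Pre-sizing can be sketched with the StringBuilder(int capacity) constructor and the capacity() method. The class CapacityDemo and the method capacityAfterFill are hypothetical names, and the sizes 1024 and 1000 are arbitrary sample values.

```java
class CapacityDemo {

    // Builds a StringBuilder with the given initial capacity, appends the given
    // number of characters, and reports the capacity afterwards.
    static int capacityAfterFill(int initialCapacity, int chars) {
        StringBuilder sb = new StringBuilder(initialCapacity);
        for (int i = 0; i < chars; i++) {
            sb.append('x');
        }
        return sb.capacity();
    }

    public static void main(String[] args) {
        // Large enough up front: the internal array never has to grow.
        System.out.println(capacityAfterFill(1024, 1000) == 1024); // true

        // Too small up front: the array is expanded (and copied) along the way.
        System.out.println(capacityAfterFill(16, 1000) >= 1000);   // true
    }
}
```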