Strings are among the most important objects in software development. String objects typically account for the largest share of a program's memory, so handling them efficiently is key to improving overall performance.
1. String objects and their characteristics
Java's eight primitive data types do not include a string type, because String is a class that wraps a char array.
In JDK 6 and earlier, the implementation of the String class consists mainly of three parts: the char array, an offset into that array, and the length (count) of the string.
There are three basic characteristics of the string type:
Immutability
Immutability means that once a String object has been created, it can no longer be changed.
The benefit of immutability is that an object shared and frequently accessed by multiple threads needs no synchronization or lock waiting, which can greatly improve system performance.
Constant pool optimization
When two string literals (or interned strings) have the same value, they both reference the same single copy in the constant pool.
Final class definition
Because String is a final class, it cannot have any subclasses, which helps protect system security.
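The first two characteristics can be observed directly. The following is a minimal sketch; the class name StringTraitsDemo and the helper sameObject are made up for illustration and are not part of the JDK.

```java
// Sketch: immutability and constant-pool sharing of String.
// StringTraitsDemo and sameObject are hypothetical names used for illustration.
class StringTraitsDemo {
    // Returns true when both references point to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String a = "hello";
        String b = "hello";              // same literal, same pooled object
        String c = new String("hello");  // explicitly allocates a new object

        System.out.println(sameObject(a, b));          // true
        System.out.println(sameObject(a, c));          // false
        System.out.println(sameObject(a, c.intern())); // true

        String upper = a.toUpperCase();  // returns a new String, a is untouched
        System.out.println(a);           // hello
        System.out.println(upper);       // HELLO
    }
}
```

Comparing with == is used here only to show object identity; normal code should compare string values with equals().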
1.1 Memory leak in the substring() method
This problem has been resolved in JDK 1.7 and later.
Before 1.7, substring() produced the new string simply by adjusting the offset and length, so the resulting string still referenced the original, full-size char array. A short substring could therefore keep a very large array alive.
Now, substring() copies the selected characters into a new object.
1.2 String splitting and searching
1. The plain String.split() method
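The String.split() method is simple, powerful, and supports regular expressions, but it is not advisable to call it frequently in performance-sensitive systems.
Note: symbols such as *, ^, | and . are regular expression metacharacters. To split on them literally, remember to escape them with a backslash (written as \\ inside a Java string literal).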
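The escaping rule is easy to get wrong, so here is a short sketch of it. The class SplitEscapeDemo and its two helper methods are hypothetical names; split() and Pattern.quote() are the standard JDK methods.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Sketch: splitting on characters that are regex metacharacters.
// SplitEscapeDemo, splitOnDot and splitOnPipe are hypothetical names.
class SplitEscapeDemo {
    // Escapes the dot by hand so that it is matched literally.
    static String[] splitOnDot(String s) {
        return s.split("\\.");
    }

    // Lets Pattern.quote build the literal pattern instead of escaping by hand.
    static String[] splitOnPipe(String s) {
        return s.split(Pattern.quote("|"));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnDot("a.b.c")));  // [a, b, c]
        System.out.println(Arrays.toString(splitOnPipe("a|b|c"))); // [a, b, c]

        // Unescaped, "." matches every character, so every token is empty
        // and split() drops them all.
        System.out.println("a.b.c".split(".").length);             // 0
    }
}
```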
2. Splitting with the more efficient StringTokenizer class
StringTokenizer is a utility class in the JDK designed specifically for splitting strings. Constructor:
public StringTokenizer(String str, String delim, boolean returnDelims)
Here str is the string to be split, delim is the set of delimiter characters, and returnDelims controls whether the delimiters themselves are returned as tokens; it is false by default.
import java.util.StringTokenizer;

String s = "a;b;c";
StringTokenizer stringTokenizer = new StringTokenizer(s, ";", false);
// Number of tokens still available: 3
System.out.println(stringTokenizer.countTokens());
// Prints a, b and c on separate lines
while (stringTokenizer.hasMoreTokens()) {
    System.out.println(stringTokenizer.nextToken());
}
3. A hand-optimized splitting method
indexOf() is a very fast method, and substring() (in the pre-1.7 implementation described in 1.1) trades space for time, so it is also relatively fast. The two can be combined into a custom split method:
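import java.util.ArrayList;
import java.util.List;

// Splits str on every occurrence of delim (delim must not be empty).
public static List<String> mySplit(String str, String delim) {
    List<String> stringList = new ArrayList<>();
    while (true) {
        int k = str.indexOf(delim);
        if (k < 0) {
            // No delimiter left: the remainder is the last token
            stringList.add(str);
            break;
        }
        String s = str.substring(0, k);
        stringList.add(s);
        // Skip the whole delimiter, not just one character
        str = str.substring(k + delim.length());
    }
    return stringList;
}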
4. Comparing and choosing among the three splitting methods
The split() method is the most powerful, but the least efficient;
StringTokenizer performs better than split(), so where the full power of split() is not required, StringTokenizer can be used instead;
The hand-written splitting algorithm performs best, but its readability and maintainability are the worst, so it is recommended only when performance has become the system's main bottleneck.
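Besides speed, the three approaches also differ in how they treat empty tokens, which matters when choosing between them. The sketch below puts them side by side on the same input; SplitComparisonDemo and its method names are hypothetical, and withIndexOf is a single-character variant of the hand-written approach rather than the exact method shown earlier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Sketch: the three splitting approaches applied to the same input.
// All class and method names here are hypothetical.
class SplitComparisonDemo {
    static List<String> withSplit(String s) {
        return Arrays.asList(s.split(";"));
    }

    static List<String> withTokenizer(String s) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(s, ";");
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    // Hand-written variant based on indexOf() and substring().
    static List<String> withIndexOf(String s) {
        List<String> out = new ArrayList<>();
        int start = 0;
        int k;
        while ((k = s.indexOf(';', start)) >= 0) {
            out.add(s.substring(start, k));
            start = k + 1;
        }
        out.add(s.substring(start)); // remainder after the last delimiter
        return out;
    }

    public static void main(String[] args) {
        // An empty token in the middle: StringTokenizer skips it.
        System.out.println(withSplit("a;;b").size());     // 3
        System.out.println(withTokenizer("a;;b").size()); // 2
        System.out.println(withIndexOf("a;;b").size());   // 3

        // A trailing delimiter: split() drops the trailing empty token.
        System.out.println(withSplit("a;b;").size());     // 2
        System.out.println(withTokenizer("a;b;").size()); // 2
        System.out.println(withIndexOf("a;b;").size());   // 3
    }
}
```

If the input can contain empty fields, these differences may be more important than the raw performance of each method.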
5. The efficient charAt() method
charAt(int index) returns the char value at the specified index. Its function is the inverse of indexOf(), and it is just as efficient.
6. Checking the prefix and suffix of a string
public boolean startsWith(String prefix)
Tests whether this string starts with the specified prefix
public boolean endsWith(String suffix)
Tests whether this string ends with the specified suffix
According to the source this article is based on, these two built-in methods are much less efficient than equivalent checks written with charAt(). A unit test comparing the two forms:
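import org.junit.Test; // assuming JUnit 4

@Test
public void test() {
    String str = "hello";
    // Manual prefix check with charAt()
    if (str.charAt(0) == 'h' && str.charAt(1) == 'e') {
        System.out.println(true);
    }
    // Equivalent check with startsWith()
    if (str.startsWith("he")) {
        System.out.println(true);
    }
}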
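One caveat when replacing startsWith() with charAt(): charAt() throws StringIndexOutOfBoundsException when the string is shorter than the prefix, so a length check is needed. A minimal sketch, with the hypothetical helper name startsWithHe:

```java
// Sketch: a charAt()-based prefix check that is safe for short strings.
// PrefixCheckDemo and startsWithHe are hypothetical names.
class PrefixCheckDemo {
    static boolean startsWithHe(String s) {
        // The length guard keeps charAt() from throwing on short input.
        return s.length() >= 2 && s.charAt(0) == 'h' && s.charAt(1) == 'e';
    }

    public static void main(String[] args) {
        System.out.println(startsWithHe("hello"));   // true
        System.out.println(startsWithHe("h"));       // false
        System.out.println("hello".startsWith("he")); // true
        System.out.println("h".startsWith("he"));     // false
    }
}
```

Whether the charAt() form is measurably faster depends on the JVM, so it is worth benchmarking before giving up the readability of startsWith().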
1.3 StringBuffer and StringBuilder
1. Concatenating string constants
String s = "123"+"456"+"789";
Although repeated string concatenation is inefficient in theory, the measured execution time of this statement is effectively 0. Decompiling the class shows that the code has become:
String s = "123456789";
Clearly, Java has fully optimized this at compile time, so the statement does not create the large number of String instances one might expect.
For concatenation of constant strings, the compiler merges all the operands into a single long string at compile time.
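The compile-time merging can be checked without a decompiler, because a merged constant ends up in the constant pool like any other literal. The sketch below uses hypothetical class and method names.

```java
// Sketch: constant concatenation is folded at compile time,
// while concatenation involving a variable happens at run time.
// ConstantFoldingDemo and its method names are hypothetical.
class ConstantFoldingDemo {
    static boolean constantsArePooled() {
        String s = "123" + "456" + "789"; // constant expression
        return s == "123456789";          // same pooled object
    }

    static boolean variableConcatIsPooled() {
        String part = "123";              // non-final variable, not a constant
        String s = part + "456" + "789";  // built at run time as a new object
        return s == "123456789";
    }

    public static void main(String[] args) {
        System.out.println(constantsArePooled());     // true
        System.out.println(variableConcatIsPooled()); // false
    }
}
```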
2. Concatenating string variables
String str = "hello"; str+="word"; str+="!!!";
Using += appears to change the content of the string, but in fact the original String object does not change at all.
When str += "word" executes, a new String object "helloword" is created in heap memory (a new char array is allocated and both operands are copied into it), and the reference str is pointed at it. The "hello" object itself is untouched, and any intermediate string that is no longer referenced, such as "helloword" after the following str += "!!!", becomes garbage to be reclaimed by the JVM.
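A short sketch makes the reference switch visible by keeping a second reference to the original object. PlusEqualsDemo and its method name are hypothetical.

```java
// Sketch: += rebinds the variable to a new String; the old object is unchanged.
// PlusEqualsDemo and plusEqualsCreatesNewObject are hypothetical names.
class PlusEqualsDemo {
    static boolean plusEqualsCreatesNewObject() {
        String str = "hello";
        String before = str;   // second reference to the original object
        str += "word";         // str now refers to a newly created String
        return str != before
                && before.equals("hello")
                && str.equals("helloword");
    }

    public static void main(String[] args) {
        System.out.println(plusEqualsCreatesNewObject()); // true
    }
}
```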
3. Concatenating with concat()
concat() is a String method dedicated to concatenation, and for joining string variables it is reported to be much more efficient than "+" or "+=".
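As a usage sketch, concat() returns a new string and leaves the receiver unchanged. ConcatDemo and join are hypothetical names.

```java
// Sketch: basic use of String.concat().
// ConcatDemo and join are hypothetical names.
class ConcatDemo {
    static String join(String a, String b) {
        return a.concat(b);
    }

    public static void main(String[] args) {
        String s = "hello";
        String t = join(s, "word");
        System.out.println(t); // helloword
        System.out.println(s); // hello (the original is not modified)

        // Per the Javadoc, concatenating an empty string returns this object.
        System.out.println(s.concat("") == s); // true
    }
}
```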
4. StringBuffer and StringBuilder
These two classes are the most efficient way to concatenate strings. The difference between them is that almost all StringBuffer methods are synchronized, while StringBuilder does no synchronization and is therefore faster. In a multithreaded context, however, StringBuilder cannot guarantee thread safety and should not be shared between threads; use StringBuffer there instead.
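The typical division of labor can be sketched as follows: a StringBuilder for a buffer confined to one thread, and a StringBuffer when several threads append to the same buffer. The class and method names are hypothetical.

```java
// Sketch: StringBuilder for single-threaded use, StringBuffer when shared.
// BuilderDemo, joinNumbers and appendFromTwoThreads are hypothetical names.
class BuilderDemo {
    // Single-threaded concatenation in a loop: one buffer, no intermediate Strings.
    static String joinNumbers(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    // Two threads append to one shared StringBuffer; its append() is synchronized.
    static int appendFromTwoThreads(int perThread) {
        StringBuffer shared = new StringBuffer();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                shared.append('x');
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(joinNumbers(5));             // 01234
        System.out.println(appendFromTwoThreads(1000)); // 2000
    }
}
```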
5. Capacity Parameters
StringBuffer and StringBuilder are wrappers for building strings, and like String they are backed by a char array. Because an array has a fixed size, it can run out of room; when it does, the only option is to expand, that is, to allocate a larger array and copy the existing contents into it. Choosing an appropriate capacity parameter reduces the number of expansions and so improves efficiency.
At initialization, the default capacity is 16 characters. The capacity can also be specified in the constructor, for example new StringBuilder(1024).
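The documented initial capacities can be inspected with capacity(). In this sketch, CapacityDemo and withCapacity are hypothetical names; the constructors and capacity() are standard JDK API.

```java
// Sketch: initial capacity of StringBuilder for each constructor.
// CapacityDemo and withCapacity are hypothetical names.
class CapacityDemo {
    // Pre-sizes the buffer when the approximate final length is known.
    static StringBuilder withCapacity(int expectedLength) {
        return new StringBuilder(expectedLength);
    }

    public static void main(String[] args) {
        System.out.println(new StringBuilder().capacity());      // 16
        System.out.println(withCapacity(1024).capacity());       // 1024
        // 16 plus the length of the initial string
        System.out.println(new StringBuilder("abc").capacity()); // 19
    }
}
```

The same constructors exist on StringBuffer.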
1.4 Some practical methods
Determine string equality (ignoring case)
equalsIgnoreCase(String anotherString)
Determine if a substring exists (returns a Boolean type)
contains(CharSequence s)
Connects the specified string to the end of this string
concat(String str)
Returns a formatted string using the specified format string and parameters
format(String format, Object... args)
Converts all characters in this String to lowercase using the rules of the default locale.
toLowerCase()
Converts all characters in this String to uppercase using the rules of the default locale.
toUpperCase()
Returns a copy of the string with leading and trailing whitespace removed.
trim()
Replaces each substring of this string that matches the given regular expression with the given replacement.
String replaceAll(String regex, String replacement)
Compares two strings lexicographically, ignoring case.
int compareToIgnoreCase(String str)
- If the argument string equals this string, the value 0 is returned;
- If this string is less than the string parameter, a value less than 0 is returned;
- If this string is larger than the string argument, a value greater than 0 is returned.
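The methods listed in this section can be tried out together in a short sketch. StringUtilityDemo, tidy and maskDigits are hypothetical names; everything they call is standard String API.

```java
// Sketch: the utility methods from this section in use.
// StringUtilityDemo, tidy and maskDigits are hypothetical names.
class StringUtilityDemo {
    // Trims surrounding whitespace and lowercases the result.
    static String tidy(String raw) {
        return raw.trim().toLowerCase();
    }

    // Replaces every run of digits with a single '#'.
    static String maskDigits(String s) {
        return s.replaceAll("[0-9]+", "#");
    }

    public static void main(String[] args) {
        System.out.println("Hello".equalsIgnoreCase("hELLO"));              // true
        System.out.println("hello".contains("ell"));                        // true
        System.out.println("hello".concat("!"));                            // hello!
        System.out.println(String.format("%s has %s items", "cart", "3"));  // cart has 3 items
        System.out.println(tidy("  Hello World "));                         // hello world
        System.out.println("abc".toUpperCase());                            // ABC
        System.out.println(maskDigits("a1b22c"));                           // a#b#c
        System.out.println("apple".compareToIgnoreCase("APPLE"));           // 0
        System.out.println("apple".compareToIgnoreCase("Banana") < 0);      // true
    }
}
```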
This article is based on "Java Program Performance Optimization" by Ge Yi.