Original address: http://alphawang.com/blog/2014/12/the-impovement-of-string-substring-in-java7-/
The implementation of String#substring() differs between Java 6 and Java 7. The Java 6 implementation can cause memory problems, so Java 7 changed it to address them. But is the Java 7 implementation really reasonable?
Let's start by guessing how Java implements substring(). Since String is immutable, we might guess that substring() simply returns a completely new, independent string.
However, that picture is not entirely correct; it does not fully represent what really happens in the Java heap.
substring() in Java 6
In Java, a string is backed by a character array. In JDK 6, the String class contains three instance fields:
- char[] value: the actual array of characters;
- int offset: the index in the array at which the string starts;
- int count: the number of characters the string contains.
When substring() is called, a new String object is created, but its value field still points to the same array on the Java heap. The two strings differ only in their count and offset values.
The Java 6 source code shows this:
// Java 6
String(int offset, int count, char value[]) {
    this.value = value;
    this.offset = offset;
    this.count = count;
}

public String substring(int beginIndex, int endIndex) {
    // check boundary
    return new String(offset + beginIndex, endIndex - beginIndex, value);
}
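To make the sharing concrete, here is a minimal sketch of the same layout. SharedString and retainedChars() are hypothetical names invented for this illustration; this is a model of the three fields described above, not the real java.lang.String.

```java
import java.util.Arrays;

// Hypothetical model of the JDK 6 field layout; NOT the real java.lang.String.
final class SharedString {
    private final char[] value;  // backing array, possibly shared
    private final int offset;    // index of the first character in value
    private final int count;     // number of characters in this string

    SharedString(String s) {
        this(0, s.length(), s.toCharArray());
    }

    private SharedString(int offset, int count, char[] value) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Like JDK 6: no copy, only a new (offset, count) window on the same array.
    SharedString substring(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException();
        }
        return new SharedString(offset + beginIndex, endIndex - beginIndex, value);
    }

    // Size of the array this object keeps reachable, regardless of count.
    int retainedChars() {
        return value.length;
    }

    int length() {
        return count;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        char[] big = new char[1_000_000];
        Arrays.fill(big, 'x');
        SharedString large = new SharedString(new String(big));
        SharedString small = large.substring(0, 2);
        // A 2-character substring still keeps the 1,000,000-char array alive.
        System.out.println(small + " length=" + small.length()
                + " retained=" + small.retainedChars());
    }
}
```

In this model, the two-character substring reports a retained array of one million characters, which is the situation the next section describes.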
Problems that substring() may cause in Java 6
The problem with this implementation is that if you have a very long string but need only a tiny part of it, the substring still references the entire character array. The large array cannot be garbage-collected while the substring is alive, which wastes memory and can eventually lead to an OutOfMemoryError.
One way to work around this problem is to create a new object from the string returned by substring(). For example:
String littleString = largeString.substring(0, 2) + "";
Or:
String littleString = new String(largeString.substring(0, 2));
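The second form can be wrapped in a small helper so that the intent is visible at the call site. This is only a sketch: SubstringWorkaround and detachedSubstring are hypothetical names, not part of the JDK. On JVMs where substring() already copies, the extra copy is harmless but redundant.

```java
// Hypothetical helper class; the names are made up for this example.
final class SubstringWorkaround {

    // Returns the substring as a fresh String object, so that on a JDK 6
    // style implementation it no longer references the large backing array.
    static String detachedSubstring(String source, int beginIndex, int endIndex) {
        return new String(source.substring(beginIndex, endIndex));
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("ab");
        for (int i = 0; i < 100_000; i++) {
            sb.append('x');
        }
        String largeString = sb.toString();
        String littleString = detachedSubstring(largeString, 0, 2);
        System.out.println(littleString);
    }
}
```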
substring() in Java 7
This problem is fixed in Java 7 (as of update 7u6): when substring() is called, a new character array is actually created on the heap, and the original array can be garbage-collected once nothing references it any more.
Let's look at the source code:
// Java 7
public String substring(int beginIndex, int endIndex) {
    // check boundary
    int subLen = endIndex - beginIndex;
    return ((beginIndex == 0) && (endIndex == value.length)) ? this
            : new String(value, beginIndex, subLen);
}

public String(char value[], int offset, int count) {
    // check boundary
    this.value = Arrays.copyOfRange(value, offset, offset + count);
}
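The effect of Arrays.copyOfRange can be checked in isolation. The sketch below uses a hypothetical helper, copyWindow, that takes the same arguments as the constructor above; it shows that the copy has exactly count elements and is independent of the source array.

```java
import java.util.Arrays;

// Hypothetical demo class; copyWindow mirrors the constructor's arguments.
final class CopyOnSubstring {

    // Copies count chars starting at offset into a new, right-sized array.
    static char[] copyWindow(char[] value, int offset, int count) {
        return Arrays.copyOfRange(value, offset, offset + count);
    }

    public static void main(String[] args) {
        char[] large = new char[1_000_000];
        Arrays.fill(large, 'x');

        char[] small = copyWindow(large, 0, 2);
        large[0] = 'y'; // changing the source does not affect the copy

        System.out.println(small.length + " " + small[0]); // prints "2 x"
    }
}
```

Because the copy holds no reference to the source array, the large array becomes eligible for collection as soon as nothing else points to it.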
You can see that Java 7 creates a new character array through Arrays.copyOfRange.
Is the Java 7 change reasonable?
Java 7 avoids the memory problem that substring() could cause, but is the new implementation really better?
In the Java 6 implementation, substring() shares the character array, so it is faster and does not need to allocate new memory. Although it has the potential memory problem described above, there are ways to work around it.
In the Java 7 implementation, every substring() call allocates and copies a new array, even when the string is not large, so the cost grows with the length of the substring instead of being constant. If most of the strings our program handles are not large, this performance cost outweighs the benefit.
If the Java 6 implementation had been kept, we could call substring() directly in the common case and apply the workaround above only when dealing with large strings.
Why has the implementation of List#subList() not changed?
Java has a method that is similar to String#substring in logic, function and implementation: List#subList. In Java 6, taking a subList of a large list has the same memory problem, yet, oddly, Java 7 did not change this implementation:
// AbstractList
public List<E> subList(int fromIndex, int toIndex) {
    return (this instanceof RandomAccess ?
            new RandomAccessSubList<>(this, fromIndex, toIndex) :
            new SubList<>(this, fromIndex, toIndex));
}

SubList(AbstractList<E> list, int fromIndex, int toIndex) {
    l = list;
    offset = fromIndex;
    size = toIndex - fromIndex;
    this.modCount = l.modCount;
}
So we still need a workaround when dealing with large lists:
public static <E> List<E> subList(List<E> originalList, int fromIndex, int toIndex) {
    return new ArrayList<E>(originalList.subList(fromIndex, toIndex));
}
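The difference between the shared sublist and the copied one can be observed directly. In the sketch below, SubListDemo and copiedSubList are hypothetical names; the helper is the workaround from this section under a different name. A write through the list returned by subList() shows up in the original list, while the copy is unaffected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; copiedSubList is the workaround renamed.
final class SubListDemo {

    // Copies the requested range into a new ArrayList that does not
    // reference the original list.
    static <E> List<E> copiedSubList(List<E> originalList, int fromIndex, int toIndex) {
        return new ArrayList<E>(originalList.subList(fromIndex, toIndex));
    }

    public static void main(String[] args) {
        List<String> original = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));

        List<String> view = original.subList(1, 3);        // backed by original
        List<String> copy = copiedSubList(original, 1, 3); // independent

        view.set(0, "B"); // writes through to the original list

        System.out.println(original); // [a, B, c, d]
        System.out.println(copy);     // [b, c]
    }
}
```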
Why did Java 7 not change List#subList to keep it consistent with the implementation of String#substring? Unknown.
Reference
http://www.programcreek.com/2013/09/the-substring-method-in-jdk-6-and-jdk-7/