Is the improvement of String.substring() in Java 7 really reasonable?

Source: Internet
Author: User

Original address: http://alphawang.com/blog/2014/12/the-impovement-of-string-substring-in-java7-/


String#substring() is implemented differently in Java 6 and Java 7. The Java 6 implementation can cause memory problems, so it was changed in Java 7 to address them. But is the Java 7 implementation really reasonable?

Let's start by guessing how Java implements substring. Since String is immutable, we might guess that substring() creates a new, independent string object that holds just the requested characters.

However, that picture is not entirely correct; it does not fully represent what really happens in the Java heap.

substring() in Java 6

In Java, a string is backed by a character array. In JDK 6, the String class contains three instance variables:
- char[] value: the actual array of characters;
- int offset: the index in the array of the string's first character;
- int count: the number of characters the string contains.

When the substring() method is invoked, a new String object is created, but its value still points to the same array in the Java heap; the two strings differ only in their count and offset values.

You can refer to the Java 6 source code:

// Java 6
String(int offset, int count, char value[]) {
    this.value = value;
    this.offset = offset;
    this.count = count;
}

public String substring(int beginIndex, int endIndex) {
    // check boundary
    return new String(offset + beginIndex, endIndex - beginIndex, value);
}

Problems that substring() may cause in Java 6

The problem with this implementation is that if you have a very long string but need only a tiny portion of it, the substring still references the entire character array, so the array cannot be garbage collected while the substring is alive. This can cause a memory leak and, in the worst case, an out-of-memory error.
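The retention problem can be sketched with a small model of the Java 6 layout. SharedString, compact() and retainedChars() are hypothetical names introduced only for this illustration; they are not part of the JDK, and the class merely mimics the three fields described above.

```java
import java.util.Arrays;

/** Hypothetical model of the Java 6 String layout; not the real JDK class. */
final class SharedString {
    private final char[] value; // backing array, possibly shared with other strings
    private final int offset;   // index in value of this string's first character
    private final int count;    // number of characters in this string

    SharedString(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    /** Java 6 style: the result shares the backing array of this string. */
    SharedString substring(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException();
        }
        return new SharedString(value, offset + beginIndex, endIndex - beginIndex);
    }

    /** Copies only the characters in use into a fresh array. */
    SharedString compact() {
        return new SharedString(Arrays.copyOfRange(value, offset, offset + count), 0, count);
    }

    /** Length of the array that this string keeps reachable. */
    int retainedChars() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }
}

class SharedStringDemo {
    public static void main(String[] args) {
        SharedString large = new SharedString(new String(new char[1_000_000]).replace('\0', 'x'));
        SharedString little = large.substring(0, 2);
        // The two-character substring still pins the whole million-character array.
        System.out.println(little + " retains " + little.retainedChars() + " chars");
        // After copying, only two characters are retained.
        System.out.println(little.compact() + " retains " + little.compact().retainedChars() + " chars");
    }
}
```

In this model, the large array can only be reclaimed once every string that shares it has become unreachable or has been copied.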

One way to work around this problem is to create a new object from the string returned by substring(). For example:

String littleString = largeString.substring(0, 2) + "";

Or:

String littleString = new String(largeString.substring(0, 2));

substring() in Java 7

This problem is fixed in Java 7: when substring() is invoked, a new array is actually created in the heap, so the original character array can be garbage collected once nothing else references it.
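A minimal sketch of the copying approach, using plain char arrays rather than the real String internals. The copyRange helper is a hypothetical name for illustration; Arrays.copyOfRange is the standard library call.

```java
import java.util.Arrays;

class CopyOnSubstringDemo {
    /** Copies the requested range into a new array, in the spirit of the Java 7 substring. */
    static char[] copyRange(char[] source, int beginIndex, int endIndex) {
        return Arrays.copyOfRange(source, beginIndex, endIndex);
    }

    public static void main(String[] args) {
        char[] large = "a very long string".toCharArray();
        char[] little = copyRange(large, 0, 2);

        large[0] = 'X'; // changing the original array does not affect the copy
        System.out.println(new String(little)); // prints "a "
        System.out.println(little.length);      // prints 2
    }
}
```

Because the copy holds no reference to the source array, the source becomes eligible for garbage collection as soon as its other references are gone.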

Let's look at the source code:

// Java 7
public String substring(int beginIndex, int endIndex) {
    // check boundary
    int subLen = endIndex - beginIndex;
    return ((beginIndex == 0) && (endIndex == value.length)) ? this
            : new String(value, beginIndex, subLen);
}

public String(char value[], int offset, int count) {
    // check boundary
    this.value = Arrays.copyOfRange(value, offset, offset + count);
}

You can see that Java 7 creates a new character array through Arrays.copyOfRange.

Is the Java 7 modification reasonable?

Java 7 avoids the memory problem that substring could cause, but is the new implementation really better?

The Java 6 implementation shares the character array when taking a substring, which is faster because no new memory has to be allocated. It does have the potential memory problem described in this article, but there are ways to work around it.

The Java 7 implementation allocates new memory for every substring, even when the original string is not large, so it is slower. If most of the strings our program handles are not large, this performance cost outweighs the benefit.
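To get a feel for the extra work, here is a back-of-the-envelope model rather than a benchmark. It assumes a hypothetical workload that repeatedly calls substring(1) until the string is empty, and it assumes that the copying approach copies exactly the characters it returns while the sharing approach copies none. The class and method names are made up for this illustration.

```java
class SubstringCopyCost {
    /** Characters copied in total if every substring(1) call copies its result. */
    static long charsCopiedWithCopying(int length) {
        long copied = 0;
        for (int remaining = length; remaining > 0; remaining--) {
            copied += remaining - 1; // substring(1) yields remaining - 1 characters
        }
        return copied;
    }

    /** Characters copied if every substring shares the original array (assumed zero). */
    static long charsCopiedWithSharing(int length) {
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(charsCopiedWithCopying(1000)); // prints 499500
        System.out.println(charsCopiedWithSharing(1000)); // prints 0
    }
}
```

Under these assumptions the copying approach does quadratic work on this workload, while the sharing approach only creates small wrapper objects. Real costs depend on the JVM and the allocation pattern.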

If the Java 6 implementation had been kept, we could call substring directly for strings that are not large and apply the workaround mentioned above for large strings.

Why has the implementation of List#subList() not changed?

Java has another method whose logic, function, and implementation mechanism are similar to String#substring: List#subList. In Java 6, taking a subList of a large list has the same memory problem, and oddly, Java 7 did not change the implementation:

// AbstractList
public List<E> subList(int fromIndex, int toIndex) {
    return (this instanceof RandomAccess ?
            new RandomAccessSubList<>(this, fromIndex, toIndex) :
            new SubList<>(this, fromIndex, toIndex));
}

SubList(AbstractList<E> list, int fromIndex, int toIndex) {
    // boundary checks omitted
    l = list;
    offset = fromIndex;
    size = toIndex - fromIndex;
    this.modCount = l.modCount;
}

So we still need the workaround when dealing with a large list:

public static <E> List<E> subList(List<E> originalList, int fromIndex, int toIndex) {
    return new ArrayList<E>(originalList.subList(fromIndex, toIndex));
}
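The difference between the view returned by List#subList and the copy made by the workaround can be sketched as follows. The class name and the copyOfSubList helper are hypothetical; the helper mirrors the workaround method above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SubListViewDemo {
    /** Returns an independent copy of the range, like the workaround above. */
    static <E> List<E> copyOfSubList(List<E> original, int fromIndex, int toIndex) {
        return new ArrayList<E>(original.subList(fromIndex, toIndex));
    }

    public static void main(String[] args) {
        List<String> original = new ArrayList<String>(Arrays.asList("a", "b", "c", "d"));
        List<String> view = original.subList(1, 3);        // backed by the original list
        List<String> copy = copyOfSubList(original, 1, 3); // owns its own storage

        view.set(0, "B"); // writes through to the original list
        System.out.println(original); // prints [a, B, c, d]
        System.out.println(copy);     // prints [b, c]
    }
}
```

Since the view holds a reference to the original list, the whole list stays reachable for as long as the view does; the copy keeps only its own elements.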

Why did Java 7 not change List#subList to keep it consistent with the implementation of String#substring? That remains unknown.

Reference

http://www.programcreek.com/2013/09/the-substring-method-in-jdk-6-and-jdk-7/


