"Java Program Optimization": In-Depth Analysis of List Performance

Source: Internet
Author: User

List is one of the most important data structures. The three most common implementations are ArrayList, Vector, and LinkedList.

[Figure: class diagram of ArrayList, Vector, and LinkedList]

All three implement the Collection and List interfaces.

Of these three implementations, ArrayList and Vector are backed by an array and encapsulate the operations on that internal array. The main difference between the two is that none of ArrayList's methods is synchronized, whereas Vector is thread-synchronized. The rest of this article uses ArrayList as the example.

LinkedList uses a doubly linked list and maintains two member variables, first and last, that point to the head and tail of the list. This is a very different implementation technique from ArrayList's, and it means the two are suited to different scenarios.

A LinkedList is a chain of linked nodes. Each node always has three parts: the element itself, a reference to the predecessor node, and a reference to the successor node.

[Figure: structure of a LinkedList node]

In older JDK versions (JDK 6 and earlier), the list always contains a header entry, whether or not the LinkedList is empty; it marks both the beginning and the end of the list. The JDK 7 code quoted in this article uses the first and last references instead, and both are null when the list is empty.
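The node layout can be illustrated with a small sketch. This is a simplified model and not the JDK source: the names TinyLinkedList, addLast, getFirst, and getLast are assumptions chosen for the example, and only appending is implemented.

```java
// Simplified sketch of a doubly linked list; not the JDK implementation.
class TinyLinkedList<E> {
    // Each node holds the element plus references to both neighbours.
    private static final class Node<E> {
        E item;
        Node<E> prev;
        Node<E> next;

        Node(Node<E> prev, E item, Node<E> next) {
            this.prev = prev;
            this.item = item;
            this.next = next;
        }
    }

    private Node<E> first; // head of the list, null when the list is empty
    private Node<E> last;  // tail of the list, null when the list is empty
    private int size;

    // Appends by linking a new node after the current tail.
    void addLast(E e) {
        Node<E> oldLast = last;
        Node<E> node = new Node<>(oldLast, e, null);
        last = node;
        if (oldLast == null) {
            first = node;        // list was empty: the node is also the head
        } else {
            oldLast.next = node; // otherwise hook it behind the old tail
        }
        size++;
    }

    E getFirst() { return first == null ? null : first.item; }
    E getLast()  { return last == null ? null : last.item; }
    int size()   { return size; }

    public static void main(String[] args) {
        TinyLinkedList<String> list = new TinyLinkedList<>();
        list.addLast("a");
        list.addLast("b");
        System.out.println(list.getFirst() + " " + list.getLast() + " " + list.size());
    }
}
```

Note that no array is involved: every element costs one node object and a few reference assignments.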

The following examples of insertion and deletion compare ArrayList and LinkedList.

1. Add elements to the end of the list

The code for adding an element to the end of an ArrayList is as follows:

    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Make sure the internal array has enough space
        elementData[size++] = e;
        return true;
    }


The performance of the add() method in ArrayList depends on the ensureCapacityInternal() method, which is implemented as follows:

    /**
     * The maximum array size to allocate.
     * Some VMs reserve header words in an array; trying to allocate
     * a larger array may cause an OutOfMemoryError.
     */
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    private void ensureCapacityInternal(int minCapacity) {
        modCount++;
        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }

    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);  // plan to grow to 1.5 times the current capacity
        if (newCapacity - minCapacity < 0)  // if the new capacity is below the minimum required, use the minimum
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);  // copy the old array into the new one
    }

    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }

As you can see, add() is very efficient as long as the current capacity of the ArrayList is large enough. The array needs to grow only when the capacity the ArrayList requires exceeds the size of the current array. Growing copies every element into a new, larger array, which is ultimately done by calling Arrays.copyOf().
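To get a feel for how rarely the array actually grows, the 1.5x rule can be simulated on its own. The sketch below is a rough model, not the JDK source: GrowthSketch and countResizes are hypothetical names, overflow handling is ignored, and the list is assumed to start with a backing array of the given capacity.

```java
// Rough model of ArrayList growth; a sketch, not the JDK source.
class GrowthSketch {
    // Counts how many times a backing array that starts at initialCapacity
    // must grow by the 1.5x rule before it can hold elementCount elements.
    // Overflow handling (MAX_ARRAY_SIZE) is deliberately left out.
    static int countResizes(int initialCapacity, int elementCount) {
        int capacity = initialCapacity;
        int resizes = 0;
        while (capacity < elementCount) {
            int grown = capacity + (capacity >> 1);   // same formula as grow()
            capacity = Math.max(grown, capacity + 1); // grow() never ends below the minimum required
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println("resizes for 500000 appends from capacity 10: "
                + countResizes(10, 500000));
    }
}
```

Under these assumptions the number of resizes grows only logarithmically with the element count, which is why appending stays cheap on average even though each individual resize copies the whole array.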


LinkedList's add() operation is implemented as follows; it appends the element to the end of the list:

    public boolean add(E e) {
        linkLast(e);
        return true;
    }

    void linkLast(E e) {
        final Node<E> l = last;
        final Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null)
            first = newNode;
        else
            l.next = newNode;
        size++;
        modCount++;
    }

The linkLast() method inserts e at the end of the list. As you can see, LinkedList does not need to manage capacity because it uses a linked structure. In this respect it has a certain performance advantage over ArrayList. However, every added element requires creating a new Node object and performing more assignments, which has a noticeable impact on performance when add() is called frequently.

Run the following code using ArrayList and LinkedList, respectively:

    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;

    public class ListTest {
        public static void testArrayList() {
            Object obj = new Object();
            List<Object> list = new ArrayList<>();
            for (int i = 0; i < 500000; i++) {
                list.add(obj);
            }
        }

        public static void testLinkedList() {
            Object obj = new Object();
            List<Object> list = new LinkedList<>();
            for (int i = 0; i < 500000; i++) {
                list.add(obj);
            }
        }

        public static void main(String[] args) {
            testArrayList();
            testLinkedList();
        }
    }

Profiling the run with TPTP shows that ArrayList performs about 4 times better than LinkedList.
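Without a profiler such as TPTP, a rough comparison can be made with System.nanoTime(). The following is a minimal sketch, not the measurement method used in this article: AppendTiming and timeAppend are hypothetical names, and numbers from a hand-rolled timer like this vary with the JVM, heap settings, and warm-up, so they should be read as indicative only.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hand-rolled timing sketch; results are indicative only.
class AppendTiming {
    // Appends the same object n times and returns the elapsed nanoseconds.
    static long timeAppend(List<Object> list, int n) {
        Object obj = new Object();
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            list.add(obj);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 500000;
        // Warm-up rounds so the JIT has likely compiled the loop before measuring.
        for (int i = 0; i < 5; i++) {
            timeAppend(new ArrayList<>(), n);
            timeAppend(new LinkedList<>(), n);
        }
        long arrayNanos = timeAppend(new ArrayList<>(), n);
        long linkedNanos = timeAppend(new LinkedList<>(), n);
        System.out.println("ArrayList:  " + arrayNanos / 1_000_000 + " ms");
        System.out.println("LinkedList: " + linkedNanos / 1_000_000 + " ms");
    }
}
```

For anything beyond a quick sanity check, a dedicated benchmarking harness is the safer choice.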


2. Add elements to any position in the list

In ArrayList, inserting at an arbitrary position is implemented as follows:

    public void add(int index, E element) {
        rangeCheckForAdd(index);  // array bounds check

        ensureCapacityInternal(size + 1);  // increments modCount
        System.arraycopy(elementData, index, elementData, index + 1,
                         size - index);  // every element from index onward is moved back one position
        elementData[index] = element;
        size++;
    }

As you can see, every insertion copies part of the array. This copy does not happen when the element is added at the end of the list. A large number of such array reorganizations leads to poor system performance. In addition, the closer to the front of the list the element is inserted, the greater the cost of the reorganization.
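The shifting step can be tried in isolation on a plain array. This is a minimal sketch under stated assumptions: ArrayInsertSketch and insert are hypothetical names, the array holds int values for brevity, and the caller is assumed to guarantee at least one free slot (ArrayList handles that with ensureCapacityInternal()).

```java
import java.util.Arrays;

// Sketch of the shift performed by an array-backed insert.
class ArrayInsertSketch {
    // Inserts value at index among the first `size` slots of data, shifting
    // the elements from index onward one position to the right.
    // Assumes data.length > size, i.e. there is room for one more element.
    static void insert(int[] data, int size, int index, int value) {
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 0}; // four elements plus one free slot
        insert(data, 4, 0, 9);        // inserting at the front moves all four elements
        System.out.println(Arrays.toString(data));
    }
}
```

The number of elements moved is size - index, so an insert at index 0 moves everything while an insert at index size moves nothing.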

This is where LinkedList shows its advantage:

    public void add(int index, E element) {
        checkPositionIndex(index);

        if (index == size)
            linkLast(element);
        else
            linkBefore(element, node(index));
    }

It can be seen that for LinkedList, inserting at the end of the list and inserting at any other position both come down to relinking a few references; no elements are moved. The linking step itself does not get slower with the insertion position, although node(index) still has to walk to that position first. Now test both classes in an extreme case, where every element is inserted at the front of the list.

    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;

    public class ListTest {
        public static void testArrayList() {
            Object obj = new Object();
            List<Object> list = new ArrayList<>();
            for (int i = 0; i < 50000; i++) {
                list.add(0, obj);
            }
        }

        public static void testLinkedList() {
            Object obj = new Object();
            List<Object> list = new LinkedList<>();
            for (int i = 0; i < 50000; i++) {
                list.add(0, obj);
            }
        }

        public static void main(String[] args) {
            testArrayList();
            testLinkedList();
        }
    }

Running the above code shows that the two are worlds apart in performance, with LinkedList far ahead.

Next, insert at the middle position each time:

    list.add(list.size() >> 1, obj);

The results show that LinkedList is now several times slower than ArrayList.

Finally, insert at the end position each time:

    list.add(list.size(), obj);

The results show that LinkedList is still the slower of the two.

Comparing the insertion operation in these three cases leads to the following conclusion: when the insertion position is near the front, ArrayList performs worse than LinkedList; as the insertion position moves toward the back, ArrayList performs increasingly better than LinkedList.

This is because ArrayList's cost comes mainly from copying part of the array on every insertion, while LinkedList's cost comes from walking the linked list to find the element at the insertion position. When the insertion position is near the front, ArrayList spends a lot of time copying the array, while LinkedList's walk to the insertion position is at its cheapest. As the insertion position moves toward the back, the portion of the array that ArrayList has to copy gets smaller and takes less time, while LinkedList spends more time searching (to be exact, the search takes longest when the insertion position is in the middle) and ends up costing more than ArrayList's array copy. At the very end of the list, add(int, E) calls linkLast() directly with no search, so LinkedList's remaining cost is the node allocation described in section 1.
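The two costs can be put side by side with a small counting model. This is only a sketch of the reasoning above: InsertCostSketch and its method names are hypothetical, and it counts elementary steps rather than measuring time. The counts are not directly comparable, since a bulk array copy is typically much cheaper per element than following node references one by one.

```java
// Step-counting model of add(index, e); it counts work, it does not measure time.
class InsertCostSketch {
    // Elements an array-backed list shifts to insert at index (size - index).
    static int arrayListShifts(int size, int index) {
        return size - index;
    }

    // References a linked list follows to reach the insertion point,
    // mirroring node(index): walk from the head when index is in the
    // first half, otherwise walk back from the tail.
    static int linkedListSteps(int size, int index) {
        if (index == size) {
            return 0; // appending goes straight to linkLast()
        }
        return index < (size >> 1) ? index : size - 1 - index;
    }

    public static void main(String[] args) {
        int size = 50000;
        int[] positions = {0, size >> 1, size};
        for (int index : positions) {
            System.out.println("index " + index
                    + ": ArrayList shifts " + arrayListShifts(size, index)
                    + ", LinkedList steps " + linkedListSteps(size, index));
        }
    }
}
```

In this model ArrayList's work falls steadily from front to back, while LinkedList's work peaks in the middle, which matches the pattern described above.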

3. Delete an element at any position

For ArrayList, the remove() method behaves much like the add(int, E) method: whenever an element is removed at any position, the array has to be reorganized. The implementation is as follows:

    public E remove(int index) {
        rangeCheck(index);

        modCount++;
        E oldValue = elementData(index);

        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index + 1, elementData, index,
                             numMoved);
        elementData[--size] = null; // Let GC do its work

        return oldValue;
    }

You can see that every successful delete operation in ArrayList reorganizes the array. The closer to the front the removed element is, the greater the cost of the reorganization; the closer to the back, the smaller the cost.

For LinkedList, the implementation of the delete operation is as follows:

    public E remove(int index) {
        checkElementIndex(index);
        return unlink(node(index));
    }

    Node<E> node(int index) {
        // assert isElementIndex(index);

        if (index < (size >> 1)) {
            Node<E> x = first;
            for (int i = 0; i < index; i++)
                x = x.next;
            return x;
        } else {
            Node<E> x = last;
            for (int i = size - 1; i > index; i--)
                x = x.prev;
            return x;
        }
    }

Here too, most of the time is spent in node(index), searching for the element at the position to delete.

The performance comparison for deletion is essentially the same as the comparison for insertion.

4. Capacity Parameters

The capacity parameter is a performance parameter specific to array-based lists such as ArrayList and Vector; it sets the initial size of the backing array. Whenever the number of elements exceeds the array's current size, the array is expanded, and each expansion copies the entire array in memory. Choosing a reasonable initial size therefore reduces the number of expansions and improves system performance.

By default, the initial capacity of an ArrayList is 10, and each expansion sets the new array size to 1.5 times the original.
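When the approximate number of elements is known in advance, the capacity can be set up front so that no expansion happens while filling the list. The sketch below shows two ways to do this with the standard ArrayList API, the ArrayList(int initialCapacity) constructor and ensureCapacity(int minCapacity); the class and method names CapacitySketch, fillPresized, and fillWithEnsureCapacity are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: sizing an ArrayList before filling it.
class CapacitySketch {
    // Allocates the backing array once, at construction time.
    static List<Object> fillPresized(int n) {
        List<Object> list = new ArrayList<>(n);
        Object obj = new Object();
        for (int i = 0; i < n; i++) {
            list.add(obj);
        }
        return list;
    }

    // Grows an existing list in a single step before a bulk fill.
    static List<Object> fillWithEnsureCapacity(int n) {
        ArrayList<Object> list = new ArrayList<>();
        list.ensureCapacity(n); // declared on ArrayList, not on the List interface
        Object obj = new Object();
        for (int i = 0; i < n; i++) {
            list.add(obj);
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(fillPresized(500000).size());
        System.out.println(fillWithEnsureCapacity(500000).size());
    }
}
```

The trade-off is memory: an initial capacity that is far too large reserves array space that may never be used.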

5. Traversing the list

Since JDK 1.5 there are at least three ways to traverse a list: the for-each loop, an iterator, and an indexed for loop.

    package bupt.xiaoye.charpter2.list;

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    public class TestFor {
        public static void testForEach(List<Object> list) {
            Object temp;
            for (Object t : list)
                temp = t;
        }

        public static void testFor(List<Object> list) {
            Object temp;
            for (int i = 0; i < 1000000; i++) {
                temp = list.get(i);
            }
        }

        public static void testIterator(List<Object> list) {
            Object temp;
            for (Iterator<Object> it = list.iterator(); it.hasNext();) {
                temp = it.next();
            }
        }

        public static void main(String[] args) {
            Object obj = new Object();
            List<Object> list = new ArrayList<>();
            for (int i = 0; i < 1000000; i++) {
                list.add(obj);
            }
            testFor(list);
            testForEach(list);
            testIterator(list);
        }
    }

The results show that, on this ArrayList, the indexed for loop is the most efficient, followed by the iterator and then the for-each loop.

The for-each loop is syntactic sugar: once compiled to bytecode, it is implemented with an iterator. The decompiled testForEach method looks like this:

    public static void testForEach(List list) {
        for (Iterator iterator = list.iterator(); iterator.hasNext();) {
            Object t = iterator.next();
            Object obj = t;
        }
    }

As you can see, compared with the explicit iterator traversal it only adds one extra step that assigns an intermediate variable, which is why its performance is slightly lower.
