Java list object performance analysis and testing

Source: Internet
Author: User

 

The SDK provides several implementations of the java.util.List interface for ordered collections. Three of them are the best known: Vector, ArrayList, and LinkedList. The performance difference between these List classes is a frequently asked question. In this article, I discuss the performance differences between LinkedList and Vector/ArrayList.

To fully analyze the performance differences between these classes, we must know how they are implemented. I will therefore first briefly describe the implementation of each class from a performance perspective.

I. Implementation of Vector and ArrayList
Both Vector and ArrayList hold an underlying Object[] array that stores the elements. Accessing an element by index only requires indexing into that internal array:

public Object get(int index) {
    // First check whether the index is legal... that part of the code is not shown here
    return elementData[index];
}


The internal array can be larger than the number of elements held by the Vector/ArrayList object. The difference between the two serves as spare capacity for adding new elements quickly. With spare capacity available, adding an element is very simple: store the new element in an empty slot of the internal array, then increment the index of the next free slot:

public boolean add(Object o) {
    ensureCapacity(size + 1);
    elementData[size++] = o;
    return true; // return value required by List.add(Object)
}


Inserting an element at an arbitrary position in the collection (rather than at the end) is a little more complicated: all the array elements at and above the insertion point must be shifted up by one position before the new value is assigned:

public void add(int index, Object element) {
    // First check whether the index is legal... that part of the code is not shown here
    ensureCapacity(size + 1);
    // Shift the elements in [index, size) up by one slot
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    elementData[index] = element;
    size++;
}
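
The shifting step can be tried in isolation on a plain array. The following is a minimal sketch, not the SDK's code: the class name ShiftInsertDemo and the helper insertAt are hypothetical, and the caller is assumed to have already ensured that the array has at least one spare slot.

```java
import java.util.Arrays;

class ShiftInsertDemo {
    /**
     * Inserts element at index into data, whose first `size` slots are in use.
     * Assumes data.length > size (spare capacity was ensured by the caller).
     * Returns the new logical size.
     */
    static int insertAt(Object[] data, int size, int index, Object element) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        // Move the elements in [index, size) one slot toward the end.
        // System.arraycopy copes with overlapping source and destination ranges.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = element;
        return size + 1;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", null}; // three elements, one spare slot
        int size = insertAt(data, 3, 1, "x");
        System.out.println(Arrays.toString(data) + " size=" + size);
        // prints: [a, x, b, c] size=4
    }
}
```

Inserting at index 1 of a three-element array moves two elements, which is the size - index count passed to System.arraycopy.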


When the spare capacity is used up and more elements need to be added, the Vector/ArrayList object must replace its internal Object[] array with a new, larger array and copy all the elements into it. Depending on the SDK version, the new array is 50% or 100% larger than the original (the code shown below grows the array by 100%):

public void ensureCapacity(int minCapacity) {
    int oldCapacity = elementData.length;
    if (minCapacity > oldCapacity) {
        Object oldData[] = elementData;
        int newCapacity = Math.max(oldCapacity * 2, minCapacity);
        elementData = new Object[newCapacity];
        System.arraycopy(oldData, 0, elementData, 0, size);
    }
}
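
To get a feel for what the two growth factors cost, the reallocation policy can be simulated without allocating any arrays. This is a rough model under stated assumptions, not a measurement of any particular SDK: the class GrowthCostDemo, its simulate method, and the starting capacity of 10 are all chosen for illustration.

```java
class GrowthCostDemo {
    /**
     * Simulates n appends to an array-backed list.
     * Returns {number of reallocations, total elements copied}.
     */
    static long[] simulate(int n, int initialCapacity, boolean doubling) {
        int capacity = initialCapacity;
        long reallocations = 0;
        long copied = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                // Grow by 100% or by 50%, but always by at least one slot.
                int grown = doubling ? capacity * 2 : capacity + (capacity >> 1);
                capacity = Math.max(grown, size + 1);
                copied += size; // every existing element moves to the new array
                reallocations++;
            }
        }
        return new long[] {reallocations, copied};
    }

    public static void main(String[] args) {
        long[] twice = simulate(100, 10, true);
        long[] half = simulate(100, 10, false);
        System.out.println("100% growth: " + twice[0] + " reallocations, " + twice[1] + " copies");
        System.out.println(" 50% growth: " + half[0] + " reallocations, " + half[1] + " copies");
    }
}
```

For 100 appends starting from a capacity of 10, this model gives 4 reallocations and 150 copied elements with 100% growth, against 6 reallocations and 202 copied elements with 50% growth. When the final size is known in advance, passing it to the ArrayList(int) constructor or calling ensureCapacity(int) once avoids the intermediate reallocations.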


The main difference between the Vector class and the ArrayList class is synchronization. Apart from two methods used only for serialization, none of ArrayList's methods are synchronized. By contrast, most of Vector's methods are synchronized, either directly or indirectly. Vector is therefore thread-safe and ArrayList is not, which makes ArrayList faster than Vector. On some of the latest JVMs the speed difference between the two classes is negligible: strictly speaking, on those JVMs the difference is smaller than the variation in timings seen across runs of the tests that compare these classes.

When accessing and updating elements by index, Vector and ArrayList perform excellently, because there is no overhead beyond the range check. Adding elements to the end of the list, or removing them from the end, also performs excellently unless the internal array is full and must be expanded. Inserting or deleting elements anywhere else always requires copying part of the array (and when the array must be expanded first, two copies are required). The number of elements copied is proportional to [size - index], that is, the distance between the insertion/deletion point and the last index in the collection. For insertion, performance is worst when the element is inserted at the beginning of the collection (index 0) and best when it is inserted at the end (after the last existing element). As the collection grows, the cost of array copying also grows rapidly, because each insertion must copy more elements.
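
The [size - index] relationship can be worked through for the two extreme cases. The sketch below only does the arithmetic and touches no real list; the class InsertCostDemo and its method are hypothetical names, and the cost of any array expansion is deliberately left out.

```java
class InsertCostDemo {
    /**
     * Total number of elements shifted while building a list of n elements,
     * inserting every element either at index 0 or at the end.
     */
    static long totalShifted(int n, boolean atFront) {
        long shifted = 0;
        for (int size = 0; size < n; size++) {
            int index = atFront ? 0 : size;
            shifted += size - index; // elements in [index, size) move up by one
        }
        return shifted;
    }

    public static void main(String[] args) {
        System.out.println("front: " + totalShifted(1000, true));  // prints: front: 499500
        System.out.println("end:   " + totalShifted(1000, false)); // prints: end:   0
    }
}
```

Building 1000 elements by always inserting at index 0 shifts n(n-1)/2 = 499500 elements in total, while appending at the end shifts none, which is why the cost climbs so quickly as the collection grows.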

II. Implementation of LinkedList
LinkedList is implemented as a doubly linked list of nodes. To access an element by index, you must walk through the nodes until you reach the target node:

public Object get(int index) {
    // First check whether the index is legal... that part of the code is not shown here
    Entry e = header; // starting node
    // Search forward or backward,
    // whichever end is nearer
    if (index < size / 2) {
        for (int i = 0; i <= index; i++)
            e = e.next;
    } else {
        for (int i = size; i > index; i--)
            e = e.previous;
    }
    return e.element; // the value stored in the node, not the node itself
}
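
Reading the two loops above as arithmetic gives the number of links followed for any index. The following sketch just evaluates that count; HopCountDemo and hops are hypothetical names, and the formula is derived from the listing shown here rather than from any particular SDK release.

```java
class HopCountDemo {
    /** Number of links followed from the header node to reach the given index. */
    static int hops(int index, int size) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        // Forward loop runs index + 1 times; backward loop runs size - index times.
        return (index < size / 2) ? index + 1 : size - index;
    }

    public static void main(String[] args) {
        int size = 10;
        for (int index = 0; index < size; index++) {
            System.out.println("index " + index + " -> " + hops(index, size) + " hops");
        }
    }
}
```

For a list of 10 elements, both ends are reached in a single hop and the middle indexes (4 and 5) need 5 hops, so the worst case is roughly half the size of the list.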


Inserting an element into the list is simple: find the node at the specified index, then insert a new node immediately before it:

public void add(int index, Object element) {
    // First check whether the index is legal... that part of the code is not shown here
    Entry e = header; // starting node
    // Search forward or backward,
    // whichever end is nearer
    if (index < size / 2) {
        for (int i = 0; i <= index; i++)
            e = e.next;
    } else {
        for (int i = size; i > index; i--)
            e = e.previous;
    }
    // Link the new node in between e.previous and e
    Entry newEntry = new Entry(element, e, e.previous);
    newEntry.previous.next = newEntry;
    newEntry.next.previous = newEntry;
    size++;
}
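
The two listings can be assembled into a small runnable class to see how the header node ties the list together. This is a sketch under assumptions, not the SDK source: TinyLinkedList is a hypothetical name, the Entry fields and constructor order are inferred from the listings above, and the header is assumed to be a sentinel node that links to itself when the list is empty.

```java
class TinyLinkedList {
    private static final class Entry {
        Object element;
        Entry next;
        Entry previous;

        Entry(Object element, Entry next, Entry previous) {
            this.element = element;
            this.next = next;
            this.previous = previous;
        }
    }

    // Sentinel node: header.next is the first element, header.previous the last.
    private final Entry header = new Entry(null, null, null);
    private int size = 0;

    TinyLinkedList() {
        header.next = header;
        header.previous = header;
    }

    /** Finds the node at index; index == size yields the header itself. */
    private Entry entry(int index) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        Entry e = header;
        if (index < size / 2) {
            for (int i = 0; i <= index; i++) e = e.next;
        } else {
            for (int i = size; i > index; i--) e = e.previous;
        }
        return e;
    }

    Object get(int index) {
        if (index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return entry(index).element;
    }

    void add(int index, Object element) {
        Entry e = entry(index);
        // The new node goes immediately before e; no other node is touched.
        Entry newEntry = new Entry(element, e, e.previous);
        newEntry.previous.next = newEntry;
        newEntry.next.previous = newEntry;
        size++;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        TinyLinkedList list = new TinyLinkedList();
        list.add(0, "b");
        list.add(0, "a");
        list.add(2, "d");
        list.add(2, "c");
        for (int i = 0; i < list.size(); i++) {
            System.out.println(i + ": " + list.get(i));
        }
    }
}
```

Because inserting at index == size resolves to the header node, appending at the end needs no special case: the new node is simply linked in before the sentinel.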


Thread-safe LinkedList and other collections
To get a thread-safe LinkedList from the Java SDK, you can obtain one from Collections.synchronizedList(List), which returns a synchronized wrapper. However, using a synchronized wrapper adds a layer of indirection, which can carry a high performance cost. Because the wrapper forwards each call to the wrapped method, every call requires an additional method call, and a method invoked through the synchronized wrapper is two to three times slower than the unwrapped method. For complex operations such as searches, the overhead of the indirect call is not significant. For simple methods such as accessors or updaters, however, this overhead can seriously affect performance.
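
As an illustration of the wrapper in use, the sketch below has several threads append to one wrapped LinkedList. The class SyncListDemo, its method names, and the thread counts are hypothetical; only Collections.synchronizedList and the standard thread API come from the SDK. It shows correctness only and says nothing about how much the indirection costs.

```java
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

class SyncListDemo {
    /** Has several threads append to one wrapped LinkedList; returns the final size. */
    static int fillConcurrently(int threads, int perThread) {
        List<Integer> list = Collections.synchronizedList(new LinkedList<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i); // the wrapper takes its lock, then delegates
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return list.size();
    }

    /** Iteration is not protected by the wrapper; the caller must hold its lock. */
    static long sum(List<Integer> syncList) {
        long total = 0;
        synchronized (syncList) {
            for (int value : syncList) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("final size: " + fillConcurrently(4, 1000));
    }
}
```

Each individual call is safe, but a sequence of calls such as an iteration is not atomic, which is why sum synchronizes on the wrapper for the whole loop.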

This means that, compared with Vector, a LinkedList in a synchronized wrapper is at a significant performance disadvantage, because Vector needs no extra indirect call to be thread-safe. If you want a thread-safe LinkedList, you can copy the LinkedList class and synchronize the necessary methods to get a faster implementation. The same applies to all the other collection classes: only List and Map have efficient thread-safe implementations (the Vector and Hashtable classes). Interestingly, these two efficient thread-safe classes exist only for backward compatibility, not for performance reasons.

For accessing and updating elements by index, LinkedList has slightly higher overhead, because reaching any index requires traversing multiple nodes. Inserting an element incurs, in addition to the traversal cost, the cost of creating a node object. On the positive side, LinkedList's insert and delete operations have no other overhead. The insertion/deletion cost therefore depends almost entirely on how far the insertion/deletion point is from the nearer end of the collection.

III. Performance Testing
These classes offer many different operations that could be tested. LinkedList is often chosen because it is considered to perform well for random insertion and deletion. The following analysis therefore focuses on the performance of insertion, that is, on building a collection. I tested and compared LinkedList and ArrayList, because both are unsynchronized.

The speed of insertion is determined mainly by the size of the collection and the position at which the element is inserted. The best-case and worst-case insertion performance occur when the insertion point is at one of the two ends or in the middle of the collection.
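
A test of this kind can be sketched as follows. This is not the harness used for the article: InsertBenchmark, its Position enum, and the collection size of 20,000 are hypothetical choices, and a single timed run with System.nanoTime gives only a rough figure because it ignores JIT warm-up and garbage collection.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class InsertBenchmark {
    enum Position { START, MIDDLE, END }

    /** Builds a list of n integers by inserting each one at the given position. */
    static List<Integer> build(List<Integer> list, int n, Position pos) {
        for (int i = 0; i < n; i++) {
            int index;
            switch (pos) {
                case START:  index = 0; break;
                case MIDDLE: index = list.size() / 2; break;
                default:     index = list.size(); break;
            }
            list.add(index, i);
        }
        return list;
    }

    /** Wall-clock time in nanoseconds for one build; a rough figure only. */
    static long timeNanos(List<Integer> list, int n, Position pos) {
        long start = System.nanoTime();
        build(list, n, pos);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 20_000; // hypothetical size, chosen only to keep the run short
        for (Position pos : Position.values()) {
            long arrayTime = timeNanos(new ArrayList<>(), n, pos);
            long linkedTime = timeNanos(new LinkedList<>(), n, pos);
            System.out.printf("%-6s ArrayList %8d us, LinkedList %8d us%n",
                    pos, arrayTime / 1_000, linkedTime / 1_000);
        }
    }
}
```

Running each case several times and discarding the first runs, or using a dedicated benchmarking harness, would give more trustworthy numbers than this single pass.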
