Analysis of several ways to traverse collections in Java (implementation principles, algorithm performance, applicable scenarios)

Last Update:2016-04-25 Source: Internet

Author: User

Overview

The Java language provides a collections framework that defines abstract data types such as List and Set, together with concrete implementations of each. A single abstract type can have several implementations; for example, ArrayList and LinkedList both implement List.

Java also offers several ways to traverse a collection. Developers should understand the characteristics of each traversal method, where it applies, and how it performs on different underlying implementations. This article analyzes these points in detail.

How are data elements stored in memory?

Data elements are stored in memory in two main ways:

1. Sequential storage, random access (direct access):

In this mode, adjacent data elements are stored at adjacent memory addresses, and the whole block of memory is contiguous. The address of an element can be computed directly from its position and read directly, so reading the element at a given position takes O(1) time on average. Normally only array-based collections have this property. In Java, ArrayList is the representative.

2. Chained storage, sequential access:

In this mode, data elements need not be adjacent in memory; each element holds the memory address of the next one. The address of an element cannot be computed from its position, so elements can only be read one after another. Reading the element at a given position takes O(n) time on average. The linked list is the main example. In Java, LinkedList is the representative.
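The two storage modes can be sketched with a plain array and a hand-rolled chain. This is only an illustration: StorageSketch, Node, arrayGet, chainGet and chainOf are hypothetical names, not JDK classes, and the real ArrayList and LinkedList are more elaborate.

```java
class StorageSketch {
    /** One link of a hypothetical singly linked chain. */
    static final class Node {
        final int value;
        Node next;

        Node(int value) {
            this.value = value;
        }
    }

    /** Sequential storage: the slot is found directly from the index. */
    static int arrayGet(int[] data, int index) {
        return data[index];
    }

    /** Chained storage: follow `next` links from the head, `index` times. */
    static int chainGet(Node head, int index) {
        Node current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current.value;
    }

    /** Builds a chain holding the same values as the array, in the same order. */
    static Node chainOf(int[] values) {
        Node head = null;
        for (int i = values.length - 1; i >= 0; i--) {
            Node node = new Node(values[i]);
            node.next = head;
            head = node;
        }
        return head;
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30, 40};
        Node head = chainOf(data);
        System.out.println(arrayGet(data, 2)); // prints 30, one array access
        System.out.println(chainGet(head, 2)); // prints 30, after two link hops
    }
}
```

Both calls return the same value; the difference is the number of steps needed to reach it.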

What traversal methods does Java provide?

1. Traditional for loop traversal, based on a counter:

The code doing the traversal maintains a counter outside the collection, reads the element at each position in turn, and stops after reading the last one. The key point is that elements are read by position. This is also the most primitive way to traverse a collection.

It is written as:

for (int i = 0; i < list.size(); i++) {
    list.get(i);
}
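A minimal runnable version of the counter-based loop might look as follows. The class name CounterLoop and the method sumByIndex are made up for this sketch; summing the elements is just a stand-in for whatever work the loop body does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CounterLoop {
    /** Sums the list by reading each position with get(i). */
    static int sumByIndex(List<Integer> list) {
        int sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i); // positional read
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(sumByIndex(list)); // prints 6
    }
}
```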

2. Iterator traversal, Iterator:

Iterator was originally an object-oriented design pattern. Its main purpose is to hide the characteristics of different data collections behind a unified traversal interface. Java, as an object-oriented language, naturally supports the Iterator pattern in its collections.

It is written as:

Iterator<ElementType> iterator = list.iterator();
while (iterator.hasNext()) {
    iterator.next();
}
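As a small sketch of the "unified interface" point, the same iterator loop can walk a linked list and a hash set without knowing how either stores its elements. IteratorLoop and sum are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedList;

class IteratorLoop {
    /** Sums any collection of integers through its Iterator. */
    static int sum(Collection<Integer> collection) {
        int sum = 0;
        Iterator<Integer> iterator = collection.iterator();
        while (iterator.hasNext()) {
            sum += iterator.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sum(new LinkedList<>(Arrays.asList(1, 2, 3)))); // prints 6
        System.out.println(sum(new HashSet<>(Arrays.asList(1, 2, 3))));    // prints 6
    }
}
```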

3. Foreach loop traversal:

The explicit iterator and counter declarations are hidden.

Pros: the code is concise and less error-prone.

Cons: only simple traversal is possible; the collection cannot be modified (elements deleted or replaced) during traversal.

It is written as:

for (ElementType element : list) {
}
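The limitation can be observed with a short experiment. The sketch below assumes an ArrayList, whose iterators are fail-fast; the JDK documents that behaviour as best-effort, so it should be read as an illustration and not as a guarantee. ForEachLimit and removeInsideForEachFails are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class ForEachLimit {
    /** Returns true if removing an element inside a foreach loop fails fast. */
    static boolean removeInsideForEachFails(List<Integer> list) {
        try {
            for (Integer value : list) {
                if (value == 1) {
                    // remove(Object) changes the list behind the hidden iterator
                    list.remove(value);
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println(removeInsideForEachFails(list)); // prints true
    }
}
```

Here the first element is removed during the first pass, so the hidden iterator detects the change when it tries to fetch the next element.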

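How is each traversal method implemented?

1. Traditional for loop traversal, based on a counter:

The counter is kept outside the collection, and each element is fetched by position with get(i). How that positional read works depends on the storage mode of the collection.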

2. Iterator traversal, Iterator:

Each concrete collection implementation generally has to provide its own iterator. Compared with the traditional for loop, the iterator does away with the explicit traversal counter. An iterator over a sequentially stored collection can access the data directly by position. An iterator over a chained-storage collection normally saves the current traversal position and then moves a pointer forward or backward from that position.

3. Foreach loop traversal:

Disassembling the compiled bytecode shows that foreach is implemented internally with an iterator; the Java compiler simply generates that code for us.
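One way to see this without reading bytecode is to wrap a list in an Iterable that counts how often iterator() is requested. The sketch below uses made-up names (ForEachUsesIterator, Counting, iteratorCallsDuringForEach); it only shows that a foreach loop over an Iterable obtains an iterator from it.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ForEachUsesIterator {
    /** Hypothetical wrapper that records how often iterator() is requested. */
    static final class Counting<T> implements Iterable<T> {
        private final List<T> items;
        int iteratorCalls = 0;

        Counting(List<T> items) {
            this.items = items;
        }

        @Override
        public Iterator<T> iterator() {
            iteratorCalls++;
            return items.iterator();
        }
    }

    /** Runs a foreach loop over the wrapper and reports the iterator() call count. */
    static int iteratorCallsDuringForEach(List<String> items) {
        Counting<String> source = new Counting<>(items);
        for (String item : source) {
            // body intentionally empty; only the traversal matters here
        }
        return source.iteratorCalls;
    }

    public static void main(String[] args) {
        System.out.println(iteratorCallsDuringForEach(Arrays.asList("a", "b", "c"))); // prints 1
    }
}
```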

How does each traversal method perform on the different storage modes?

1. Traditional for loop traversal, based on a counter:

Because this method reads elements by position, its cost follows from the cost of a positional read. For sequential storage, reading the element at a given position takes O(1) on average, so traversing the whole collection takes O(n). For chained storage, reading the element at a given position takes O(n) on average, so traversing the whole collection takes O(n²).

ArrayList code for reading by position: the element is read directly by its index.

transient Object[] elementData;

public E get(int index) {
    rangeCheck(index);
    return elementData(index);
}

E elementData(int index) {
    return (E) elementData[index];
}

LinkedList code for reading by position: each read has to walk the links from one end of the list. The implementation includes a small optimization: it starts from the head or the tail, whichever is closer to the requested index.

transient int size = 0;
transient Node<E> first;
transient Node<E> last;

public E get(int index) {
    checkElementIndex(index);
    return node(index).item;
}

Node<E> node(int index) {
    if (index < (size >> 1)) {
        // index is in the first half of the list: walk forward from the head
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        // index is in the second half of the list: walk backward from the tail
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}

2. Iterator traversal, Iterator:

For RandomAccess collections an iterator brings little benefit, and the extra operations it performs add some running time. For sequential-access collections it matters a great deal: the iterator keeps the current traversal position internally, so reading the next element does not have to start again from the first element of the collection. It only needs to move the pointer one step. The time complexity of traversing the whole collection therefore drops to O(n).
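The difference between the two approaches on chained storage can be made concrete by counting link hops. The sketch below uses a hypothetical bare-bones chain (TraversalCost, Node) and deliberately ignores the head-or-tail optimization of the real LinkedList, so the numbers illustrate the growth rate only.

```java
class TraversalCost {
    /** Hypothetical chain node; values are omitted because only hops are counted. */
    static final class Node {
        Node next;
    }

    /** Builds a chain of n nodes (n >= 1) and returns its head. */
    static Node chain(int n) {
        Node head = new Node();
        Node tail = head;
        for (int i = 1; i < n; i++) {
            tail.next = new Node();
            tail = tail.next;
        }
        return head;
    }

    /** Link hops when every position is reached by walking from the head. */
    static long hopsWithPositionalGet(Node head, int n) {
        long hops = 0;
        for (int index = 0; index < n; index++) {
            Node current = head;
            for (int i = 0; i < index; i++) {
                current = current.next;
                hops++;
            }
        }
        return hops;
    }

    /** Link hops when a cursor remembers the current node, as an iterator does. */
    static long hopsWithCursor(Node head) {
        long hops = 0;
        for (Node current = head; current.next != null; current = current.next) {
            hops++;
        }
        return hops;
    }

    public static void main(String[] args) {
        int n = 1000;
        Node head = chain(n);
        System.out.println(hopsWithPositionalGet(head, n)); // prints 499500, i.e. n(n-1)/2
        System.out.println(hopsWithCursor(head));           // prints 999, i.e. n-1
    }
}
```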

(LinkedList is used here only as an example.) Internally, the LinkedList iterator keeps the current traversal position and moves the pointer from there:

Code:

public E next() {
    checkForComodification();
    if (!hasNext())
        throw new NoSuchElementException();
    lastReturned = next;
    next = next.next;
    nextIndex++;
    return lastReturned.item;
}

public E previous() {
    checkForComodification();
    if (!hasPrevious())
        throw new NoSuchElementException();
    lastReturned = next = (next == null) ? last : next.prev;
    nextIndex--;
    return lastReturned.item;
}
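From the caller's side, these two methods are reached through ListIterator. The following sketch (ListIteratorWalk and forwardThenBackward are hypothetical names) walks a LinkedList to the end with next() and then back to the start with previous(), each call moving the cursor by a single link.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class ListIteratorWalk {
    /** Walks forward to the end, then backward to the start, recording each element. */
    static List<String> forwardThenBackward(LinkedList<String> list) {
        List<String> visited = new ArrayList<>();
        ListIterator<String> iterator = list.listIterator();
        while (iterator.hasNext()) {
            visited.add(iterator.next());
        }
        while (iterator.hasPrevious()) {
            visited.add(iterator.previous());
        }
        return visited;
    }

    public static void main(String[] args) {
        LinkedList<String> list = new LinkedList<>(Arrays.asList("a", "b", "c"));
        System.out.println(forwardThenBackward(list)); // prints [a, b, c, c, b, a]
    }
}
```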

3. Foreach loop traversal:

Analyzing the Java bytecode shows that foreach is also implemented with an iterator; the difference is that the Java compiler generates the iterator code, so we do not have to write it by hand. Because the generated code performs a type-conversion check (checkcast) on every element, it takes slightly longer than using the iterator directly. The time complexity is the same as for the iterator.

Bytecode using iterator:

Code:
   0: new             #16      // class java/util/ArrayList
   3: dup
   4: invokespecial   #18      // Method java/util/ArrayList."<init>":()V
   7: astore_1
   8: aload_1
   9: invokeinterface #19, 1   // InterfaceMethod java/util/List.iterator:()Ljava/util/Iterator;
  14: astore_2
  15: goto            25
  18: aload_2
  19: invokeinterface #25, 1   // InterfaceMethod java/util/Iterator.next:()Ljava/lang/Object;
  24: pop
  25: aload_2
  26: invokeinterface #31, 1   // InterfaceMethod java/util/Iterator.hasNext:()Z
  31: ifne            18
  34: return

Bytecode using foreach:

Code:
   0: new             #16      // class java/util/ArrayList
   3: dup
   4: invokespecial   #18      // Method java/util/ArrayList."<init>":()V
   7: astore_1
   8: aload_1
   9: invokeinterface #19, 1   // InterfaceMethod java/util/List.iterator:()Ljava/util/Iterator;
  14: astore_3
  15: goto            28
  18: aload_3
  19: invokeinterface #25, 1   // InterfaceMethod java/util/Iterator.next:()Ljava/lang/Object;
  24: checkcast       #31      // class loop/Model
  27: astore_2
  28: aload_3
  29: invokeinterface #33, 1   // InterfaceMethod java/util/Iterator.hasNext:()Z
  34: ifne            18
  37: return

Where does each traversal method apply?

1. Traditional for loop traversal, based on a counter:

Sequential storage: read performance is high. Suitable for traversing sequentially stored collections.

Chained storage: the time complexity is too high. Not suitable for traversing chained-storage collections.

2. Iterator traversal, Iterator:

Sequential storage: if running time is not a major concern, this method is recommended; the code is more concise and it also avoids off-by-one errors.

Chained storage: this matters a great deal. The average time complexity drops to O(n), which is very attractive, so this traversal method is recommended.

3. Foreach loop traversal:

foreach only makes the code more concise, and it has a drawback: the collection cannot be modified (for example, by deleting elements) during traversal, so it is unsuitable in some situations. It is implemented on top of the iterator, but because of the type-conversion check it is slightly slower than using the iterator directly; fortunately, the time complexity is the same. To choose, weigh the two methods above and make a compromise.
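When elements do have to be deleted during traversal, the explicit iterator offers a remove() method for that purpose. The sketch below assumes a mutable list such as ArrayList (Iterator.remove is an optional operation and not every collection supports it); RemoveWhileIterating and removeEven are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class RemoveWhileIterating {
    /** Deletes every even number from the list while traversing it. */
    static void removeEven(List<Integer> list) {
        Iterator<Integer> iterator = list.iterator();
        while (iterator.hasNext()) {
            if (iterator.next() % 2 == 0) {
                iterator.remove(); // removes the element just returned by next()
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        removeEven(list);
        System.out.println(list); // prints [1, 3]
    }
}
```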

What are the best practices in Java?

The Java collections framework provides the RandomAccess interface. It has no methods and is only a marker. It is typically implemented by List implementations to indicate whether they support random access.

A collection that implements this interface supports random access, and reading an element by position takes O(1) time on average. ArrayList is an example.

A collection that does not implement it does not support random access. LinkedList is an example.

The JDK developers were evidently aware of this issue as well. The recommended approach is: before traversing a List, first check whether it supports random access, that is, list instanceof RandomAccess.

For example:

if (list instanceof RandomAccess) {
    // use traditional for loop traversal
} else {
    // use iterator or foreach
}
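Filled in, the check might look like the sketch below. TraversalChooser, strategyFor and sum are hypothetical names, and summing is only a placeholder for the real loop body.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class TraversalChooser {
    /** Names the traversal this sketch would pick for the given list. */
    static String strategyFor(List<?> list) {
        return (list instanceof RandomAccess) ? "indexed for loop" : "iterator";
    }

    /** Sums the list, choosing the traversal by the RandomAccess marker. */
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            // positional reads are cheap, so a counter-based loop is fine
            for (int i = 0, n = list.size(); i < n; i++) {
                total += list.get(i);
            }
        } else {
            // foreach uses the list's iterator and avoids repeated walks
            for (int value : list) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linkedList = new LinkedList<>(Arrays.asList(1, 2, 3));
        System.out.println(strategyFor(arrayList) + ": " + sum(arrayList));   // prints indexed for loop: 6
        System.out.println(strategyFor(linkedList) + ": " + sum(linkedList)); // prints iterator: 6
    }
}
```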

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.