Introduction to common collections in the Java Collections Framework: features, applicable scenarios, and implementation principles



The JDK provides a large number of well-designed collection classes. A competent programmer should be able to select the most appropriate collection for the functional scenario and performance requirements at hand, which requires familiarity with the common Java collection classes. This article introduces the collections commonly used in the Java Collections Framework, along with their characteristics, applicable scenarios, and implementation principles. For a deeper understanding of how the collections are implemented, reading the JDK source code is recommended.

Most of the collection classes provided by Java derive from two interfaces: the Collection interface and the Map interface.

The Collection interface defines a group of objects. Its methods include:

size() - number of objects in the collection
add(E)/addAll(Collection) - add a single object or a batch of objects to the collection
remove(Object)/removeAll(Collection) - delete a single object or a batch of objects from the collection
contains(Object)/containsAll(Collection) - determine whether one or several objects exist in the collection
toArray() - return an array containing all objects in the collection
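As a small illustration of the Collection methods listed above, the following sketch exercises each of them on an ArrayList. The class name CollectionBasics and its demo method are hypothetical names chosen for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

class CollectionBasics {
    // Builds a small collection using the add/remove family of methods.
    static Collection<String> demo() {
        Collection<String> names = new ArrayList<>();
        names.add("a");                              // add a single element
        names.addAll(Arrays.asList("b", "c", "d"));  // add a batch
        names.remove("a");                           // remove a single element
        names.removeAll(Arrays.asList("c", "x"));    // remove a batch; absent elements are ignored
        return names;                                // now holds b and d
    }

    public static void main(String[] args) {
        Collection<String> names = demo();
        System.out.println(names.size());                                // 2
        System.out.println(names.contains("b"));                         // true
        System.out.println(names.containsAll(Arrays.asList("b", "d")));  // true
        System.out.println(Arrays.toString(names.toArray()));            // [b, d]
    }
}
```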

The Map interface associates a key with each object (the value) and stores each key-value pair as an Entry, so that a value can be located quickly through its key. Note that Map does not extend Collection. The main methods of the Map interface include:

size() - number of key-value pairs in the map
put(K, V)/putAll(Map) - add a single entry or a batch of entries to the map
get(Object) - return the value corresponding to the key
remove(Object) - delete the entry corresponding to the key
keySet() - return a Set containing all keys in the map
values() - return a Collection containing all values in the map
entrySet() - return a Set containing all key-value pairs (entries) in the map
containsKey(Object)/containsValue(Object) - determine whether the specified key/value exists in the map
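The Map methods listed above can be tried out with a few lines of code. This is only a minimal sketch; MapBasics and the sample keys are hypothetical names for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class MapBasics {
    // Builds a small map using put, putAll and remove.
    static Map<String, Integer> demo() {
        Map<String, Integer> stock = new HashMap<>();
        stock.put("apple", 3);
        stock.put("pear", 5);

        Map<String, Integer> more = new HashMap<>();
        more.put("plum", 7);
        stock.putAll(more);      // batch insert

        stock.remove("pear");    // delete the entry for this key
        return stock;            // apple=3, plum=7
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = demo();
        System.out.println(stock.get("apple"));         // 3
        System.out.println(stock.containsKey("pear"));  // false
        System.out.println(stock.containsValue(7));     // true
        // entrySet() exposes each key-value pair as a Map.Entry
        for (Map.Entry<String, Integer> e : stock.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```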

After learning about the Collection and Map interfaces, let's take a look at the common collections derived from these two interfaces.

List

The List interface extends Collection and defines an ordered collection. It assigns an index to each element to mark its position in the list, and an element at a given position can be located through its index.

The main methods that List adds to Collection are:

get(int) - return the element at the specified index
add(E)/add(int, E) - append an element to the end of the list / insert an element at the specified index
set(int, E) - replace the element at the specified index
indexOf(Object) - return the index of the first occurrence of the specified object in the list (-1 if absent)
subList(int, int) - return a view of the list from the starting index (inclusive) to the ending index (exclusive)
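A short sketch of the index-based List methods listed above; ListBasics is a hypothetical class name used only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListBasics {
    // Applies index-based insertion and replacement to a small list.
    static List<String> demo() {
        List<String> letters = new ArrayList<>(Arrays.asList("a", "b", "d"));
        letters.add(2, "c");   // insert at index 2  -> a, b, c, d
        letters.set(0, "A");   // replace index 0    -> A, b, c, d
        return letters;
    }

    public static void main(String[] args) {
        List<String> letters = demo();
        System.out.println(letters.get(1));         // b
        System.out.println(letters.indexOf("c"));   // 2
        System.out.println(letters.subList(1, 3));  // [b, c]
    }
}
```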

Common Implementation classes of the List interface:

ArrayList

ArrayList is implemented on top of an array. It maintains an Object array that holds all elements and automatically grows the array when it runs out of capacity.

LinkedList

LinkedList is implemented on top of a doubly linked list. It defines a static nested Node class; each element in the collection is held by a Node, and each Node holds references to the previous and next nodes.

ArrayList vs LinkedList
ArrayList is more efficient at random access. The array-based ArrayList can jump directly to the target element, whereas LinkedList must start from the first or last Node and walk forward or backward, node by node, to reach it.
LinkedList is more efficient at inserting and deleting elements at a position that has already been reached (for example at either end, or through an iterator), because each Node holds references to the previous and next Node and only those references need to change; ArrayList has to shift the elements that follow.
When traversing a LinkedList, use an iterator (or a for-each loop) rather than get(int). Otherwise every access walks the list again and efficiency is low.
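The advice about traversing a LinkedList can be made concrete with two loops that compute the same sum. This is a sketch only; LinkedListTraversal and its method names are hypothetical.

```java
import java.util.LinkedList;
import java.util.List;

class LinkedListTraversal {
    // O(n) overall: the iterator behind for-each follows each next reference once.
    static long sumWithIterator(List<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    // O(n^2) on a LinkedList: every get(i) walks from the head or the tail again.
    static long sumWithGet(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>();
        for (int i = 0; i < 1000; i++) {
            list.add(i);
        }
        // Same result, very different cost on long linked lists.
        System.out.println(sumWithIterator(list));  // 499500
        System.out.println(sumWithGet(list));       // 499500
    }
}
```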

Vector

Vector and ArrayList are very similar; both are array-based collections. The main differences between Vector and ArrayList are:

Vector is thread-safe, while ArrayList is not. Almost all methods of Vector are synchronized, so its performance is lower than that of ArrayList.
Vector lets you specify the amount by which the array grows (capacityIncrement); ArrayList does not.
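The growth increment mentioned above is set through the two-argument Vector constructor, and the current array length can be observed with capacity(). The following sketch uses a hypothetical class name, VectorGrowth, and a deliberately tiny initial capacity so that growth is triggered by the third element.

```java
import java.util.Vector;

class VectorGrowth {
    // Capacity after adding three elements to a Vector created with
    // initial capacity 2 and the given capacityIncrement.
    static int capacityAfterThreeAdds(int capacityIncrement) {
        Vector<Integer> v = new Vector<>(2, capacityIncrement);
        v.add(1);
        v.add(2);
        v.add(3);   // exceeds the initial capacity, so the array grows
        return v.capacity();
    }

    public static void main(String[] args) {
        System.out.println(capacityAfterThreeAdds(3));  // 5: grows by the increment
        System.out.println(capacityAfterThreeAdds(0));  // 4: with no increment the capacity doubles
    }
}
```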

CopyOnWriteArrayList

Like Vector, CopyOnWriteArrayList can be regarded as a thread-safe version of ArrayList. The difference is that on every write, CopyOnWriteArrayList first copies the underlying array, performs the write on the new copy, and then switches the array reference to the copy. Because of this, read operations need no lock, which makes reads on CopyOnWriteArrayList much faster than on Vector. The idea is similar to read/write splitting, and it suits multi-threaded scenarios with many reads and few writes. Note that CopyOnWriteArrayList only guarantees eventual consistency, not real-time consistency: a read that runs concurrently with a write may return stale data.
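The copy-on-write behavior can be observed even in a single thread: an iterator created before a write keeps reading the old array. The sketch below assumes nothing beyond the standard API; SnapshotIteration and iterateAcrossWrite are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIteration {
    // Returns what an iterator created BEFORE a write observes afterwards.
    static List<String> iterateAcrossWrite(CopyOnWriteArrayList<String> list, String added) {
        Iterator<String> it = list.iterator();  // captures the current array
        list.add(added);                        // the write goes to a fresh copy
        List<String> seen = new ArrayList<>();
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        CopyOnWriteArrayList<String> list =
                new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        System.out.println(iterateAcrossWrite(list, "c"));  // [a, b]  (stale snapshot)
        System.out.println(list);                           // [a, b, c]
    }
}
```

The same loop over a plain ArrayList would throw ConcurrentModificationException instead of returning a stale view.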

 

Vector vs CopyOnWriteArrayList

Both are thread-safe, array-based List implementations.
Vector synchronizes both reads and writes, so a read always sees the latest data. Reads on CopyOnWriteArrayList are not locked and may see slightly stale data.
CopyOnWriteArrayList has much higher read performance than Vector.
CopyOnWriteArrayList occupies more memory, because every write copies the array.

 

 

Map

The basic features of the Map interface have been described above. Let's take a look at the common implementation classes of the Map interface.

 

Common Implementation classes of the Map interface:

 

HashMap

As mentioned above, Map stores each key-value Pair in an Entry object. HashMap stores the Entry object in an array and uses a hash table to implement quick access to the Entry:

The position of an Entry in the array is determined by the hash value of its key. This makes it possible to find the Entry, and thus the value, quickly from the key. When there are no hash collisions, lookup time complexity is O(1).

If two different keys produce the same index, they map to the same position in the array; this is a hash collision. HashMap handles collisions by separate chaining: each array slot actually holds a linked list of Entry objects, and each Entry holds a reference to the next Entry in the list. When a collision occurs, the new Entry is linked into the list at that slot. During lookup, if HashMap finds several entries at the index for a key, it walks the list at that slot until it finds the target Entry.

Entry class of HashMap:

static class Entry<K,V> implements Map.Entry<K,V> {
    final K key;
    V value;
    Entry<K,V> next;
    int hash;
}

 

Because of its fast addressing, HashMap is the most frequently used Map implementation class.
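To make the chaining scheme described above concrete, here is a heavily simplified sketch of a hash table with per-slot linked lists. It is not the JDK implementation: ChainedMap and indexFor are hypothetical names, the table never resizes, null keys are not supported, and new entries are appended at the tail of the chain as an arbitrary choice.

```java
class ChainedMap<K, V> {
    // One node of a per-slot chain, mirroring the key/value/next shape of an Entry.
    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry<K, V> next;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final Entry<K, V>[] table;

    @SuppressWarnings("unchecked")
    ChainedMap(int capacity) {
        table = (Entry<K, V>[]) new Entry[capacity];
    }

    // Maps a key's hash code to an array index (sign bit cleared first).
    private int indexFor(K key) {
        return (key.hashCode() & 0x7fffffff) % table.length;
    }

    // Returns the previous value for the key, or null if the key is new.
    V put(K key, V value) {
        int i = indexFor(key);
        Entry<K, V> last = null;
        for (Entry<K, V> e = table[i]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
            last = e;
        }
        Entry<K, V> created = new Entry<>(key, value);
        if (last == null) {
            table[i] = created;   // empty slot: no collision
        } else {
            last.next = created;  // collision: link into the chain
        }
        return null;
    }

    // Walks the chain at the key's slot until the key is found.
    V get(K key) {
        for (Entry<K, V> e = table[indexFor(key)]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }
}
```

With a capacity of 4, the Integer keys 1 and 5 land in the same slot, so a lookup of 5 has to walk past the entry for 1.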

 

 

Hashtable

Hashtable can be regarded as the predecessor of HashMap (Hashtable has existed since JDK 1.0, while HashMap and the entire Map interface were introduced in JDK 1.2). Its implementation is almost the same as that of HashMap: entries are stored in an array, the index of an Entry is computed from the hash value of its key, and hash collisions are resolved by separate chaining. The biggest difference between the two is that Hashtable is thread-safe; almost all of its methods are synchronized.

 

 

ConcurrentHashMap

ConcurrentHashMap is the thread-safe version of HashMap (introduced in JDK 1.5) and offers much better concurrent performance than Hashtable.

Hashtable locks the entire Entry array for every read and write, so its performance degrades as data volume and contention grow. ConcurrentHashMap addresses this with lock striping: it splits the Entry array into segments (16 by default) and uses a hash function to decide which Segment an Entry belongs to. A write then locks only one Segment, which greatly improves concurrent write performance. (This describes the implementation up to JDK 7; JDK 8 replaced segments with finer-grained per-bucket synchronization.)

Reads on ConcurrentHashMap need no lock in most cases. The value field of its Entry is volatile, which guarantees that a modified value is visible to other threads, so thread-safe reads can be implemented without locking.

HashEntry class of ConcurrentHashMap:

 

static final class HashEntry<K,V> {
    final int hash;
    final K key;
    volatile V value;
    volatile HashEntry<K,V> next;
}

 

However, you cannot have it both ways. The high performance of ConcurrentHashMap comes at a cost (otherwise Hashtable would have no reason to exist): it cannot guarantee absolute consistency for read operations. ConcurrentHashMap guarantees that a read sees the latest value of an existing Entry and the results of completed write operations. But if a write operation is still in the middle of creating a new Entry, a concurrent read may not see that Entry.
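As a usage sketch, several threads can update one ConcurrentHashMap without external locking, provided each update is a single atomic call such as merge (available since Java 8). ConcurrentCounter and the key "hits" are hypothetical names, and both arguments are assumed to be positive.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ConcurrentCounter {
    // Starts the given number of threads; each increments the same key repeatedly.
    static int countConcurrently(int threads, int incrementsPerThread) {
        ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge("hits", 1, Integer::sum);  // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();  // all writes are complete and visible after join
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 1000));  // 4000
    }
}
```

A separate get followed by put would not be atomic and could lose updates; the single merge call is what keeps the count exact.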

 

HashMap vs Hashtable vs ConcurrentHashMap
The three use essentially the same data storage mechanism.
HashMap is not thread-safe. In a multi-threaded environment, besides losing data consistency, concurrent modification may link the Entry list into a cycle, causing the get method to loop forever.
Hashtable is thread-safe and guarantees absolute data consistency, but because it locks every operation wholesale, its performance is poor.
ConcurrentHashMap is thread-safe. Lock striping and volatile greatly improve its read/write performance, and it guarantees data consistency in most cases, but not absolutely: until a thread has finished adding an Entry to the map, other threads may not be able to read that Entry.

 

 

LinkedHashMap

LinkedHashMap and HashMap are very similar. The only difference is that the Entry of LinkedHashMap adds, on top of the HashMap Entry, references to the previously inserted and the next inserted Entry, which makes it possible to traverse the entries in insertion order.
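The insertion-order guarantee is easy to observe. In this sketch, InsertionOrderDemo and the sample keys are hypothetical; the helper accepts any Map so that the same code can be run against a plain HashMap for comparison.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class InsertionOrderDemo {
    // Inserts three keys in a fixed order and returns them in iteration order.
    static List<String> keysInIterationOrder(Map<String, Integer> map) {
        map.put("zebra", 1);
        map.put("apple", 2);
        map.put("mango", 3);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // LinkedHashMap iterates in insertion order.
        System.out.println(keysInIterationOrder(new LinkedHashMap<>()));  // [zebra, apple, mango]
    }
}
```

Passing a HashMap instead gives an order that depends on the hash values of the keys and is not guaranteed.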

 

 

TreeMap

TreeMap is a Map implemented on a red-black tree. Its Entry class holds references to the left and right child nodes and to the parent node, and records its own color:

static final class Entry<K,V> implements Map.Entry<K,V> {
    K key;
    V value;
    Entry<K,V> left = null;
    Entry<K,V> right = null;
    Entry<K,V> parent;
    boolean color = BLACK;
}

A red-black tree is a somewhat complex but efficient self-balancing binary search tree. It has the basic property of a binary search tree: the key of any node is greater than the keys in its left subtree and less than the keys in its right subtree. This property lets TreeMap keep its entries sorted and look them up quickly.

For a detailed introduction to red-black trees, see this article: http://blog.csdn.net/chenssy/article/details/26668941

Because the entries of a TreeMap are ordered, it provides a series of convenient operations, such as obtaining the key set (or entry set) in ascending or descending order, or obtaining the key (or Entry) immediately before or after a specified key (or Entry). It is suitable for scenarios that require ordered operations on keys.
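A few of the ordered operations that TreeMap offers are sketched below. TreeMapNavigation and the sample data are hypothetical; the methods called are part of the standard NavigableMap API.

```java
import java.util.TreeMap;

class TreeMapNavigation {
    // A small sorted map used by the examples.
    static TreeMap<Integer, String> sample() {
        TreeMap<Integer, String> map = new TreeMap<>();
        map.put(20, "twenty");
        map.put(10, "ten");
        map.put(30, "thirty");
        return map;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> map = sample();
        System.out.println(map.keySet());            // [10, 20, 30]  ascending
        System.out.println(map.descendingKeySet());  // [30, 20, 10]  descending
        System.out.println(map.lowerKey(20));        // 10  key just before 20
        System.out.println(map.higherKey(20));       // 30  key just after 20
        System.out.println(map.floorKey(25));        // 20  greatest key <= 25
        System.out.println(map.headMap(20));         // {10=ten}  keys strictly below 20
    }
}
```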

 

 

ConcurrentSkipListMap

ConcurrentSkipListMap also keeps its entries ordered, but its implementation differs from TreeMap: it is based on a skip list (SkipList).



ConcurrentSkipListMap is implemented as a multi-level linked list. The bottom level contains all elements, and each level above it contains progressively fewer. A search starts from the top level and moves right first, then down, which allows fast lookup.

 

static class Index<K,V> {
    final Node<K,V> node;
    final Index<K,V> down;      // down reference
    volatile Index<K,V> right;  // right reference
}

Unlike TreeMap, an insert or delete in ConcurrentSkipListMap only needs to modify the right references of the affected nodes, and the right reference is volatile. This is what allows ConcurrentSkipListMap to be thread-safe. However, like ConcurrentHashMap, ConcurrentSkipListMap does not guarantee absolute data consistency: in some cases a read may not see data that is still being inserted.

TreeMap vs ConcurrentSkipListMap
Both provide an ordered set of entries with similar performance; lookup time complexity is O(log N).
ConcurrentSkipListMap occupies more memory.
ConcurrentSkipListMap is thread-safe; TreeMap is not.
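A minimal sketch of the combination that ConcurrentSkipListMap offers, concurrent writes plus sorted iteration. SkipListDemo and insertFromTwoThreads are hypothetical names; doing the same with an unsynchronized TreeMap would not be safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

class SkipListDemo {
    // Two threads insert interleaved keys; the map keeps them sorted.
    static List<Integer> insertFromTwoThreads() {
        ConcurrentSkipListMap<Integer, String> map = new ConcurrentSkipListMap<>();
        Thread evens = new Thread(() -> {
            for (int i = 0; i < 10; i += 2) {
                map.put(i, "even");
            }
        });
        Thread odds = new Thread(() -> {
            for (int i = 1; i < 10; i += 2) {
                map.put(i, "odd");
            }
        });
        evens.start();
        odds.start();
        try {
            evens.join();
            odds.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(map.keySet());  // ascending key order
    }

    public static void main(String[] args) {
        System.out.println(insertFromTwoThreads());  // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```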

Set

The Set interface extends Collection and represents a collection without duplicate elements. The common Set implementations are each built on the Map of the corresponding type; simply put, a Set is a stripped-down Map. Each Set holds a Map instance of the corresponding type, stores its elements as keys in that Map, and uses a placeholder object as every value. Common implementations of Set include HashSet, TreeSet, and ConcurrentSkipListSet, whose principles are the same as those of HashMap, TreeMap, and ConcurrentSkipListMap respectively.
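The idea of a Set as a thin wrapper around a Map can be sketched in a few lines. This is an illustrative simplification, not the JDK source: MapBackedSet and PRESENT are names chosen for this example, and only four methods are shown.

```java
import java.util.HashMap;
import java.util.Map;

class MapBackedSet<E> {
    // Placeholder stored as the value for every key.
    private static final Object PRESENT = new Object();

    // Elements live in the key set of this map.
    private final Map<E, Object> map = new HashMap<>();

    // Returns true only if the element was not already present.
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    boolean contains(Object element) {
        return map.containsKey(element);
    }

    // Returns true only if the element was present and has been removed.
    boolean remove(Object element) {
        return map.remove(element) == PRESENT;
    }

    int size() {
        return map.size();
    }
}
```

Uniqueness comes for free: adding an element twice simply overwrites the same key, so the second add reports false.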

Queue

The Queue interface extends the Collection interface and represents a FIFO (first-in, first-out) collection. Common methods of the Queue interface include:

add(E)/offer(E): enqueue, that is, add an element to the tail of the queue. The difference between the two is that if the queue is bounded and full, add throws IllegalStateException, while offer only returns false.
remove()/poll(): dequeue, that is, remove one element from the head of the queue. The difference between the two is that if the queue is empty, remove throws NoSuchElementException, while poll only returns null.
element()/peek(): look at the head element without removing it. The difference between the two is that if the queue is empty, element throws NoSuchElementException, while peek only returns null.
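The throwing and non-throwing method pairs described above can be compared on a bounded queue of capacity 1. QueueMethodPairs and its method names are hypothetical; ArrayBlockingQueue is used here only because it is bounded.

```java
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

class QueueMethodPairs {
    // add on a full bounded queue throws IllegalStateException.
    static boolean addThrowsWhenFull() {
        Queue<String> queue = new ArrayBlockingQueue<>(1);
        queue.add("first");
        try {
            queue.add("second");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    // offer on a full bounded queue just reports failure.
    static boolean offerReturnsFalseWhenFull() {
        Queue<String> queue = new ArrayBlockingQueue<>(1);
        queue.offer("first");
        return !queue.offer("second");
    }

    // remove and element on an empty queue throw NoSuchElementException.
    static boolean removeAndElementThrowWhenEmpty() {
        Queue<String> queue = new ArrayBlockingQueue<>(1);
        try {
            queue.remove();
            return false;
        } catch (NoSuchElementException expected) {
            // fall through to check element()
        }
        try {
            queue.element();
            return false;
        } catch (NoSuchElementException expected) {
            return true;
        }
    }

    // poll and peek on an empty queue return null.
    static boolean pollAndPeekReturnNullWhenEmpty() {
        Queue<String> queue = new ArrayBlockingQueue<>(1);
        return queue.poll() == null && queue.peek() == null;
    }
}
```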

 

Common Implementation classes of the Queue interface:

 

ConcurrentLinkedQueue

ConcurrentLinkedQueue is a queue based on a linked list. Each Node in the queue holds a reference to the next Node:

private static class Node<E> {
    volatile E item;
    volatile Node<E> next;
}

All fields of the Node class are volatile, and ConcurrentLinkedQueue updates them with atomic compare-and-swap operations rather than locks, which makes it thread-safe. It guarantees the atomicity and consistency of enqueue and dequeue operations, but traversal and size() are only weakly consistent.

LinkedBlockingQueue

Unlike ConcurrentLinkedQueue, LinkedBlockingQueue is a blocking queue, and it is effectively unbounded unless a capacity is given (the default capacity is Integer.MAX_VALUE). In a blocking queue, a thread that tries to enqueue with put while the queue is full is blocked until space becomes available, and a thread that tries to dequeue with take while the queue is empty is blocked until an element becomes available. LinkedBlockingQueue is also implemented on a linked list. It uses ReentrantLock to guard enqueue and dequeue operations and is therefore thread-safe, but likewise it guarantees atomicity and consistency only for enqueue and dequeue; traversal is only weakly consistent.

ArrayBlockingQueue

ArrayBlockingQueue is a bounded blocking queue based on an array. Its blocking mechanism is essentially the same as that of LinkedBlockingQueue. The difference is that ArrayBlockingQueue uses a single lock for both production and consumption, while LinkedBlockingQueue uses two separate locks, one for production and one for consumption.

The biggest difference between the two is that ArrayBlockingQueue is bounded and suits fixed-capacity blocking queues, while LinkedBlockingQueue is unbounded by default and suits blocking queues without a fixed capacity.

ConcurrentLinkedQueue vs LinkedBlockingQueue vs ArrayBlockingQueue
ConcurrentLinkedQueue is a non-blocking queue; the other two are blocking queues.
All three are thread-safe, but none of them guarantees absolute data consistency during traversal.
LinkedBlockingQueue is unbounded by default and suits queues without a fixed capacity; ArrayBlockingQueue is bounded and suits fixed-capacity queues.
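The blocking behavior of put and take is what makes these queues convenient for producer/consumer code. The sketch below uses a deliberately small ArrayBlockingQueue so that the producer has to wait for the consumer; ProducerConsumer and transfer are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ProducerConsumer {
    // Sends the numbers 0..count-1 from a producer thread to the calling thread.
    static List<Integer> transfer(int count) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2);  // small bound forces blocking
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    queue.put(i);  // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        List<Integer> received = new ArrayList<>();
        try {
            for (int i = 0; i < count; i++) {
                received.add(queue.take());  // blocks while the queue is empty
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(transfer(5));  // [0, 1, 2, 3, 4]
    }
}
```

Replacing the queue with new LinkedBlockingQueue<>() would remove the bound, so only the consumer side would ever block.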

SynchronousQueue

SynchronousQueue is one of the most unusual queues in the JDK. It cannot store any elements: its size is always 0, and peek() always returns null. A thread that inserts an element blocks until another thread takes that element, and a thread that takes an element blocks until another thread inserts one.

This mechanism is well suited to hand-off scenarios, that is, scenarios in which the producer thread must confirm that the task it produced has been taken by a consumer thread before it continues with its subsequent logic.
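A hand-off through SynchronousQueue might look like the following sketch, in which put returns only once the other thread has taken the element. HandOff and its method names are hypothetical.

```java
import java.util.concurrent.SynchronousQueue;

class HandOff {
    // Passes one task from a producer thread directly to the calling thread.
    static String handOff(String task) {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        Thread producer = new Thread(() -> {
            try {
                channel.put(task);  // returns only after a consumer has taken the task
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            String received = channel.take();  // blocks until the producer arrives
            producer.join();
            return received;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // With no consumer waiting, the non-blocking offer cannot hand anything over.
    static boolean offerWithoutConsumer() {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        return channel.offer("nobody is waiting");
    }

    public static void main(String[] args) {
        System.out.println(handOff("job-1"));        // job-1
        System.out.println(offerWithoutConsumer());  // false
    }
}
```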

PriorityQueue & PriorityBlockingQueue

These two queues are not FIFO queues. They order elements by priority, so that the smallest element is always the first to leave the queue. You can also pass a Comparator instance when constructing the queue, in which case the elements are ordered according to that Comparator.

PriorityQueue is a non-blocking queue and is not thread-safe. PriorityBlockingQueue is a blocking queue and thread-safe.
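The effect of the Comparator on dequeue order can be seen by draining the same elements under two orderings. PriorityOrder and drain are hypothetical names; the single-argument Comparator constructor used here requires Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class PriorityOrder {
    // Adds 5, 1, 4 and returns them in the order the queue releases them.
    static List<Integer> drain(Comparator<Integer> order) {
        PriorityQueue<Integer> queue = new PriorityQueue<>(order);
        queue.addAll(Arrays.asList(5, 1, 4));
        List<Integer> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            out.add(queue.poll());  // always the smallest element under the comparator
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(Comparator.<Integer>naturalOrder()));  // [1, 4, 5]
        System.out.println(drain(Comparator.<Integer>reverseOrder()));  // [5, 4, 1]
    }
}
```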

Deque

Deque extends the Queue interface and defines a double-ended queue: elements can be added or removed at both the head and the tail. It is more flexible than Queue and can be used to implement data structures such as queues and stacks. On top of Queue, Deque adds the following methods:

addFirst(E)/addLast(E)/offerFirst(E)/offerLast(E)
removeFirst()/removeLast()/pollFirst()/pollLast()
getFirst()/getLast()/peekFirst()/peekLast()

The Deque implementation classes include LinkedList (described earlier), ConcurrentLinkedDeque, and LinkedBlockingDeque. Their implementation mechanisms are very similar to those of ConcurrentLinkedQueue and LinkedBlockingQueue described above.
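How one Deque can act as either a stack or a queue is sketched below using LinkedList. DequeDemo and its method names are hypothetical; only the end at which elements are added differs between the two methods.

```java
import java.util.Deque;
import java.util.LinkedList;

class DequeDemo {
    // Stack (LIFO): add at the head, remove from the head.
    static String asStack() {
        Deque<String> deque = new LinkedList<>();
        deque.addFirst("a");
        deque.addFirst("b");
        deque.addFirst("c");
        return deque.pollFirst() + deque.pollFirst() + deque.pollFirst();
    }

    // Queue (FIFO): add at the tail, remove from the head.
    static String asQueue() {
        Deque<String> deque = new LinkedList<>();
        deque.addLast("a");
        deque.addLast("b");
        deque.addLast("c");
        return deque.pollFirst() + deque.pollFirst() + deque.pollFirst();
    }

    public static void main(String[] args) {
        System.out.println(asStack());  // cba
        System.out.println(asQueue());  // abc
    }
}
```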

Finally, a brief summary of the common set implementation classes described in this article:



 
