Java Concurrent Programming: ConcurrentHashMap Principle Analysis

Preface

Collections are among the most commonly used data structures in programming, and concurrent code is almost always built on collection support. For example, two threads may need to access a shared critical region such as a Queue at the same time, or a HashMap may serve as an in-memory cache of external files.

Before Java 5 (codename Tiger), the two most commonly used hash structures were HashMap and Hashtable. HashMap performs no synchronization at all, while Hashtable marks its methods synchronized; in other words, HashMap is not thread-safe, while Hashtable is. The practical consequence is a trade-off: in single-threaded code we can use HashMap for efficiency, while in multithreaded code Hashtable guarantees safety.

Principle Analysis of ConcurrentHashMap

While we enjoy the conveniences the JDK brings, we also bear the costs that come with them. Analyzing Hashtable shows that synchronized is applied to the entire hash table: every operation locks the whole table, so one thread holds it exclusively at a time. That is safe, but hugely wasteful. Doug Lea's answer to this is ConcurrentHashMap.

On the left (in the original figure) is Hashtable's approach: lock the entire hash table. On the right is ConcurrentHashMap's approach: lock a single bucket (segment).

This is what is said in the official document:

A hash table supporting full concurrency of retrievals and adjustable expected concurrency for updates. This class obeys the same functional specification as Hashtable, and includes versions of methods corresponding to each method of Hashtable. However, even though all operations are thread-safe, retrieval operations do not entail locking, and there is no support for locking the entire table in a way that prevents all access. This class is fully interoperable with Hashtable in programs that rely on its thread safety but not on its synchronization details.

ConcurrentHashMap divides the hash table into 16 buckets (segments) by default. Common operations such as get, put, and remove lock only the bucket currently in use. Where previously only one writing thread could enter the table at a time, now up to 16 writing threads can enter simultaneously (write threads must take a lock, while read threads are almost unrestricted, as discussed later). The gain in concurrency is obvious.
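
As a quick illustration (class and thread names here are invented for the example), the three-argument constructor of the segmented, pre-JDK-8 design exposes this tuning directly: concurrencyLevel is the estimated number of concurrently updating threads, i.e. the number of segments.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class SegmentedWritesDemo {
        public static void main(String[] args) throws InterruptedException {
            // initialCapacity = 16, loadFactor = 0.75, concurrencyLevel = 16:
            // with 16 segments, up to ~16 writers can proceed in parallel.
            Map<String, Integer> map = new ConcurrentHashMap<>(16, 0.75f, 16);

            Runnable writer = () -> {
                for (int i = 0; i < 1_000; i++) {
                    map.put(Thread.currentThread().getName() + "-" + i, i);
                }
            };
            Thread t1 = new Thread(writer, "w1");
            Thread t2 = new Thread(writer, "w2");
            t1.start();
            t2.start();
            t1.join();
            t2.join();
            System.out.println(map.size()); // 2000: no updates are lost
        }
    }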

What is even more impressive is ConcurrentHashMap's read concurrency: since most read operations take no lock at all, reads are almost fully concurrent, and the granularity of the write lock is very fine, so writes are faster than before as well (the effect is more pronounced with more buckets).

ConcurrentHashMap only needs to lock the entire table for cross-table operations such as size(). For iteration, ConcurrentHashMap uses an approach different from the traditional collection's fail-fast iterator, called a weakly consistent iterator. In this mode, when the collection changes after the iterator has been created, no ConcurrentModificationException is thrown. Instead, changes are made on new copies of the affected nodes so they do not disturb the data the iterator is traversing; once a change completes, the head pointer is swapped to the new nodes. The iterator thread can thus keep using the old data while a writer thread completes its change concurrently. More importantly, this preserves the continuity and scalability of multithreaded execution, which is key to the performance improvement.
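
A small sketch (the map contents are chosen arbitrarily for the example) makes the weakly consistent behavior concrete: modifying the map after creating an iterator throws nothing, whereas the same sequence on a plain HashMap would fail fast.

    import java.util.Iterator;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class WeaklyConsistentDemo {
        public static void main(String[] args) {
            Map<String, Integer> map = new ConcurrentHashMap<>();
            map.put("a", 1);
            map.put("b", 2);

            Iterator<String> it = map.keySet().iterator();
            map.put("c", 3);   // modify after the iterator was created
            map.remove("a");

            // No ConcurrentModificationException is thrown; the iterator
            // may or may not reflect the concurrent changes.
            while (it.hasNext()) {
                System.out.println(it.next());
            }
        }
    }
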
Next, let's take a look at several important methods in ConcurrentHashMap; knowing the implementation mechanism gives us more confidence when using it.

There are three main classes in the ConcurrentHashMap implementation:
1. ConcurrentHashMap (the entire hash table);
2. Segment (a bucket);
3. HashEntry (a node).
The figure above shows the relationship between them.

ConcurrentHashMap allows concurrent modification operations. The key is its use of lock separation (lock striping): multiple locks control modifications to different parts of the hash table. ConcurrentHashMap represents these parts as segments, each of which is itself a small hash table with its own lock. As long as concurrent modifications land on different segments, they can proceed in parallel.
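
The following is a minimal sketch of the lock-striping idea only, not the actual JDK source (fields and method bodies are heavily abbreviated): each Segment extends ReentrantLock and guards its own small table of HashEntry chains.

    import java.util.concurrent.locks.ReentrantLock;

    // Sketch: one node in a bucket's chain.
    class HashEntry<K, V> {
        final K key;
        final int hash;
        volatile V value;
        final HashEntry<K, V> next;

        HashEntry(K key, int hash, HashEntry<K, V> next, V value) {
            this.key = key;
            this.hash = hash;
            this.next = next;
            this.value = value;
        }
    }

    // Sketch: a segment is a small hash table that is itself a lock.
    class Segment<K, V> extends ReentrantLock {
        volatile HashEntry<K, V>[] table; // this segment's buckets
        volatile int count;               // number of entries in this segment

        V put(K key, int hash, V value) {
            lock();    // locks only this segment, not the whole map
            try {
                // ... locate the chain via hash and insert the entry ...
                return null;
            } finally {
                unlock();
            }
        }
    }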

Some methods must span segments, such as size() and containsValue(); these may need to lock the entire table rather than a single segment, which requires acquiring the locks of all segments in order and, after the operation completes, releasing them all again. Locking "in order" is essential here; otherwise deadlock is possible. Within ConcurrentHashMap, the segments array is declared final, and its member variables are effectively final as well. Note, however, that declaring an array final does not make the array's elements final; that guarantee must be provided by the implementation.
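
Continuing the sketch above (again illustrative, not the JDK source), a cross-segment operation acquires every segment's lock in array order and releases them all afterwards; two threads that both lock in the same order can never deadlock against each other.

    // Sketch: a whole-table operation such as size().
    class StripedTable<K, V> {
        final Segment<K, V>[] segments; // final reference; elements set at construction

        @SuppressWarnings("unchecked")
        StripedTable(int level) {
            segments = (Segment<K, V>[]) new Segment[level];
            for (int i = 0; i < level; i++) {
                segments[i] = new Segment<>();
            }
        }

        int size() {
            for (Segment<K, V> s : segments) {
                s.lock();                 // acquire in index order: no deadlock
            }
            try {
                int sum = 0;
                for (Segment<K, V> s : segments) {
                    sum += s.count;
                }
                return sum;
            } finally {
                for (Segment<K, V> s : segments) {
                    s.unlock();           // release them all when done
                }
            }
        }
    }

The real implementation is slightly smarter: it first attempts a few lock-free passes over the segments and falls back to locking them all only if the table keeps changing (see RETRIES_BEFORE_LOCK among the constants below).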

ConcurrentHashMap allows multiple read operations to be performed concurrently, and the read operation does not need to be locked.

Get method (note that the analysis here is per bucket, because ConcurrentHashMap's biggest improvement is refining the lock granularity to the bucket). First, it checks whether the number of entries in the current bucket is 0; if so, nothing can possibly be found, and null is returned immediately, avoiding an unnecessary search at minimal cost. After obtaining the head node (through a method covered below), it walks the chain, comparing hash and key node by node; if a match is found and its value is not null, that value is returned directly.

The code is simple, but there is one puzzling point: what is return readValueUnderLock(e) for? Studying its code shows that it takes the lock and then reads the value again. But we already read the node's value with V v = e.value, so is readValueUnderLock(e) redundant? In fact, it exists entirely for concurrency. If v is null, another thread may be in the middle of modifying the node, and since the preceding steps of get take no lock, by Bernstein's conditions a read that overlaps a write may observe inconsistent data. So the segment is locked on e and the value is read again, guaranteeing that the correct value is obtained. One has to admire Doug Lea's rigor here. The entire get operation takes a lock only in this rare case; compared with the old Hashtable, the concurrency win is undeniable!
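
For reference, here is a lightly annotated version of the pre-JDK-8 Segment.get together with readValueUnderLock, paraphrased from the JDK 6 source (it lives inside Segment, so lock(), unlock(), getFirst(), and count are Segment members; treat details as approximate):

    V get(Object key, int hash) {
        if (count != 0) {                 // read-volatile: skip empty segments
            HashEntry<K, V> e = getFirst(hash); // head node of the bucket's chain
            while (e != null) {
                if (e.hash == hash && key.equals(e.key)) {
                    V v = e.value;
                    if (v != null)
                        return v;         // common case: found, no lock taken
                    return readValueUnderLock(e); // rare: recheck under the lock
                }
                e = e.next;
            }
        }
        return null;                      // empty segment or key not present
    }

    V readValueUnderLock(HashEntry<K, V> e) {
        lock();                           // a writer may be mid-update; wait for it
        try {
            return e.value;
        } finally {
            unlock();
        }
    }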

Common ConcurrentHashMap Methods


This is the inheritance and implementation relationship of ConcurrentHashMap in the source code.

Note: ConcurrentMap
Official explanation: "Memory consistency effects: as with other concurrent collections, actions in a thread prior to placing an object into a ConcurrentMap as a key or value happen-before actions subsequent to the access or removal of that object from the ConcurrentMap in another thread."
In practice, ConcurrentMap can be viewed as a cache container: it adds conditional, atomic methods such as remove(key, value) and replace(key, oldValue, newValue), similar in spirit to Guava's CacheBuilder (http://blog.csdn.net/xlgen157387/article/details/47293517).
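
A short usage sketch (keys and values invented here) of the atomic methods ConcurrentMap adds on top of Map:

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public class AtomicMapOpsDemo {
        public static void main(String[] args) {
            ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

            cache.putIfAbsent("config", "v1");   // insert only if absent, atomically
            cache.replace("config", "v1", "v2"); // compare-and-set style update
            cache.remove("config", "v2");        // remove only if value still matches

            System.out.println(cache.isEmpty()); // true
        }
    }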

The source defines these constants:

    static final int DEFAULT_INITIAL_CAPACITY = 16;     // default initial table capacity
    static final float DEFAULT_LOAD_FACTOR = 0.75f;     // default load factor
    static final int DEFAULT_CONCURRENCY_LEVEL = 16;    // default concurrency level (number of segments)
    static final int MAXIMUM_CAPACITY = 1 << 30;        // maximum table capacity: 2^30 = 1073741824
    static final int MIN_SEGMENT_TABLE_CAPACITY = 2;    // minimum capacity of each segment's table
    static final int MAX_SEGMENTS = 1 << 16;            // maximum number of segments
    static final int RETRIES_BEFORE_LOCK = 2;           // unlocked retries in size()/containsValue() before locking

(1) Delete:

    public V remove(Object key) {
        int hash = hash(key.hashCode());
        return segmentFor(hash).remove(key, hash, null);
    }

As the code shows, the delete operation first computes the hash from the key's hashCode, uses it to locate the specific segment of the ConcurrentHashMap, and then delegates the remove to that segment. When multiple delete operations execute concurrently, they can proceed simultaneously as long as they target different segments.
Because each HashEntry's next reference is final, a removed node is not simply unlinked: the entries in front of it are cloned onto a new chain, and the detached old nodes are left for the garbage collector to reclaim.
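
For reference, a lightly annotated version of the pre-JDK-8 Segment.remove, paraphrased from the JDK 6 source (details approximate). Because each entry's next reference is final, the nodes in front of the removed one must be cloned onto a new chain:

    V remove(Object key, int hash, Object value) {
        lock();                                   // lock only this segment
        try {
            int c = count - 1;
            HashEntry<K, V>[] tab = table;
            int index = hash & (tab.length - 1);  // find the bucket
            HashEntry<K, V> first = tab[index];
            HashEntry<K, V> e = first;
            while (e != null && (e.hash != hash || !key.equals(e.key)))
                e = e.next;

            V oldValue = null;
            if (e != null) {
                V v = e.value;
                if (value == null || value.equals(v)) {
                    oldValue = v;
                    // Entries after e can stay on the chain, but every
                    // entry before e is cloned, since next is final.
                    HashEntry<K, V> newFirst = e.next;
                    for (HashEntry<K, V> p = first; p != e; p = p.next)
                        newFirst = new HashEntry<K, V>(p.key, p.hash,
                                                       newFirst, p.value);
                    tab[index] = newFirst;
                    count = c;                    // write-volatile
                }
            }
            return oldValue;                      // null if nothing was removed
        } finally {
            unlock();
        }
    }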

For the remaining common methods, see: http://tool.oschina.net/apidocs/apidoc?api=jdk-zh
