Illustrated Collections 5: Incorrect use of HashMap causes infinite loops and element loss

Source: Internet
Author: User

Problem cause

The previous article explained how HashMap is implemented and noted that HashMap is not thread-safe. So what problems does HashMap have in a multi-threaded environment?

A few months ago, a module of the company's project hit an endless loop while running in production, and the looping code was stuck in HashMap's get method. Although the cause eventually turned out not to be HashMap, it drew my attention to the fact that HashMap can cause an endless loop in a multi-threaded environment. Below, a piece of code simulates the HashMap endless loop:

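import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class HashMapThread extends Thread {
    // Both fields are static, so every thread shares the same counter and map.
    private static AtomicInteger ai = new AtomicInteger(0);
    private static Map<Integer, Integer> map = new HashMap<Integer, Integer>(1);

    public void run() {
        while (ai.get() < 100000) {
            // Unsynchronized put on a shared HashMap: the misuse being demonstrated.
            map.put(ai.get(), ai.get());
            ai.incrementAndGet();
        }
    }
}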

This thread does something very simple: it keeps incrementing the AtomicInteger and writing the value into the HashMap. Both the AtomicInteger and the HashMap are shared globally, that is, all threads operate on the same AtomicInteger and HashMap. Start five threads to run the code in the run method:

public static void main(String[] args) {
    HashMapThread hmt0 = new HashMapThread();
    HashMapThread hmt1 = new HashMapThread();
    HashMapThread hmt2 = new HashMapThread();
    HashMapThread hmt3 = new HashMapThread();
    HashMapThread hmt4 = new HashMapThread();
    hmt0.start();
    hmt1.start();
    hmt2.start();
    hmt3.start();
    hmt4.start();
}

After running it several times, the endless loop appeared. I ran it about seven or eight times, and several of those runs threw the array index out-of-bounds exception ArrayIndexOutOfBoundsException instead. One thing to mention here: thread-unsafe code does not necessarily fail every time it runs in a multi-threaded environment, but once it does fail, with a deadlock or an endless loop, there is no remedy other than restarting your project, so the thread safety of the code must be considered during development and review. OK. Check the console:

The red box stays lit, indicating that the code is stuck in an endless loop. An endless loop is usually located by using jps + jstack to view the stack information:

We can see that Thread-0 is in RUNNABLE, and we can see from the stack information that this endless loop is caused by the expansion of HashMap by Thread-0.

Therefore, this article explains why the expansion of HashMap results in an endless loop.

Normal resizing process

Let's look at a normal expansion of HashMap first. Assume three keys whose final hash values after rehashing are 5, 7 and 3, and a HashMap table of length 2. All three values are odd, so they all land in table[1]. After HashMap puts the three entries into the data structure, it should look like this:

This should be easy to follow. Now let's look at the resize code that appeared in the stack above:

void addEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
    if (size++ >= threshold)
        resize(2 * table.length);
}

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}

void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}

To sum up the three pieces of code, one HashMap expansion proceeds as follows:

1. Take twice the size of the current table as the size of the new table.

2. Create a new Entry array of the calculated size, named newTable.

3. Iterate over each slot of the original table, calculate the position in the new table of every Entry chained at that slot, and link it in as a linked list.

4. Once every slot of the original table has been traversed, all the entries in the original table have been moved to the new table, and the table field in HashMap is pointed at newTable.

This completes one expansion, which is shown in the figure below:

The normal expansion of HashMap is like this, which is easy to understand.
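
The four steps above can be replayed outside of HashMap with a small sketch. Node here is a hypothetical stand-in for the JDK's internal Entry class, and put is a simplified addEntry without the threshold check; only the transfer loop is copied from the code quoted above.

```java
public class NormalResizeDemo {
    // Hypothetical stand-in for HashMap's internal Entry; it only keeps the hash.
    static final class Node {
        final int hash;
        Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    // Same formula as indexFor in the quoted code; length must be a power of two.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // Head insertion, like addEntry without the resize check.
    static void put(Node[] table, int hash) {
        int i = indexFor(hash, table.length);
        table[i] = new Node(hash, table[i]);
    }

    // Mirrors the transfer loop quoted above.
    static void transfer(Node[] src, Node[] newTable) {
        for (int j = 0; j < src.length; j++) {
            Node e = src[j];
            if (e != null) {
                src[j] = null;
                do {
                    Node next = e.next;
                    int i = indexFor(e.hash, newTable.length);
                    e.next = newTable[i];
                    newTable[i] = e;
                    e = next;
                } while (e != null);
            }
        }
    }

    // Renders one bucket as "a -> b -> c", or "empty".
    static String chain(Node head) {
        if (head == null) return "empty";
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(n.hash);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node[] table = new Node[2];
        put(table, 5);
        put(table, 7);
        put(table, 3);
        System.out.println("before, table[1]: " + chain(table[1])); // 3 -> 7 -> 5

        Node[] newTable = new Node[4];
        transfer(table, newTable);
        System.out.println("after,  table[1]: " + chain(newTable[1])); // 5
        System.out.println("after,  table[3]: " + chain(newTable[3])); // 7 -> 3
    }
}
```

Because transfer inserts at the head of each new bucket, entries that stay together (3 and 7) come out in reverse order. That reversal is what the multi-threaded case later trips over.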

Endless loop caused by expansion

Since it is an endless loop caused by expansion, continue to look at the expansion code:

 1 void transfer(Entry[] newTable) {
 2     Entry[] src = table;
 3     int newCapacity = newTable.length;
 4     for (int j = 0; j < src.length; j++) {
 5         Entry<K,V> e = src[j];
 6         if (e != null) {
 7             src[j] = null;
 8             do {
 9                 Entry<K,V> next = e.next;
10                 int i = indexFor(e.hash, newCapacity);
11                 e.next = newTable[i];
12                 newTable[i] = e;
13                 e = next;
14             } while (e != null);
15         }
16     }
17 }

Two threads: thread A and thread B. Assume that thread A is switched out right after it finishes executing line 9 (next = e.next), so that in thread A, e is 3 and next is 7. For thread A, it is as follows:

The CPU then switches to thread B, which completes the whole expansion process and produces the following:

At this point the CPU switches back to thread A, which runs the do...while loop on lines 8 to 14 and first places Entry 3:

Keep in mind that thread B has already finished, so, assuming its writes are visible to thread A under the Java Memory Model (JMM), the entries that thread A now sees are the updated ones: the next of 7 is 3, and the next of 3 is null. Entry 3 is placed at table[3]. The following steps are:

1. e = next, that is, e = 7

2. e is not null, so the loop continues.

3. next = e.next, that is, the next of 7, which is 3

4. Place the Entry 7

Therefore, the graph representation is:

After 7 is placed, the code continues:

1. e = next, that is, e = 3

2. e is not null, so the loop continues.

3. next = e.next, that is, the next of 3, which is null

4. Place the Entry 3

Move 3 to table[3], and the endless loop appears:

When 3 is moved to table[3], the next of 3 points to 7. Since the next of 7 already points to 3, the linked list becomes a cycle.

Line 13, e = next, then sets e to null, and the loop ends. Although this loop does finish, any later operation that traverses the HashMap data structure, whether iteration or expansion, will loop forever on the linked list at table[3]. This explains the stack trace shown earlier, which stopped at line 484 of transfer: an expansion has to traverse the HashMap data structure, and transfer is the last method called during expansion.
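
The interleaving above can be replayed deterministically in a single thread. The sketch below is an illustration, not the JDK code: Node is a hypothetical stand-in for Entry, and drain is a hypothetical helper that runs lines 10 to 14 (and line 9 for the following entries) from a given e and next. It also assumes that both threads read the bucket head on line 5 before either one cleared the slot on line 7, since otherwise the second thread would find the bucket empty.

```java
public class ResizeCycleDemo {
    // Hypothetical stand-in for HashMap's internal Entry; it only keeps the hash.
    static final class Node {
        final int hash;
        Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // Runs the do...while body of transfer from a state in which
    // e and next (line 9) have already been read.
    static void drain(Node e, Node next, Node[] newTable) {
        while (true) {
            int i = indexFor(e.hash, newTable.length); // line 10
            e.next = newTable[i];                       // line 11
            newTable[i] = e;                            // line 12
            e = next;                                   // line 13
            if (e == null) return;                      // line 14
            next = e.next;                              // line 9, next iteration
        }
    }

    // Floyd's cycle detection on one bucket.
    static boolean hasCycle(Node head) {
        Node slow = head, fast = head;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) return true;
        }
        return false;
    }

    // Replays the schedule and returns thread A's new table.
    static Node[] simulate() {
        Node[] table = new Node[2];
        for (int h : new int[] {5, 7, 3}) {
            table[1] = new Node(h, table[1]); // head insertion: 3 -> 7 -> 5
        }

        // Thread A: reads e and next, then is suspended after line 9.
        Node eA = table[1];     // 3
        Node nextA = eA.next;   // 7

        // Thread B: runs its whole transfer. Afterwards 7 -> 3 and 5 alone.
        Node eB = table[1];
        drain(eB, eB.next, new Node[4]);

        // Thread A: resumes with its stale e and next.
        Node[] newTableA = new Node[4];
        drain(eA, nextA, newTableA);
        return newTableA;
    }

    public static void main(String[] args) {
        Node[] a = simulate();
        System.out.println("cycle at table[3]: " + hasCycle(a[3])); // true
        System.out.println("table[1] is empty: " + (a[1] == null)); // true
    }
}
```

In this replay thread A's table ends up with 3 and 7 pointing at each other in table[3], and 5 is not in thread A's table at all, so a later get or resize that walks table[3] would never terminate.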

What happens with 3 5 7?

Some people may think that the numbers 5, 7 and 3 above are a coincidence, as if they were deliberately chosen to make HashMap loop forever.

My answer to this question: I remember a passage in "Principles and Practices of distributed consistency from Paxos to Zookeeper" that says roughly the following, and I have taken it as my own conclusion: any error that can occur under multithreading will eventually occur.

The numbers 5, 7 and 3 are unlucky, yes, but it is perfectly normal for two entries that are adjacent before the expansion to be assigned to the same table slot after the expansion. The key point is that even if this failure scenario is unlikely, once it happens your system becomes partly or even wholly unavailable, and there is no remedy other than restarting the system. Therefore, a possible failure scenario like this must be eliminated in advance.

OK, enough of that. As shown above, 5 7 3 leads to an endless loop. Now let's see what happens with the ordinary order 3 5 7. This is a brief look, not as detailed as the description above:

This is the content in the data structure before expansion. After expansion, it should be normal:

Now suppose multithreading goes wrong. One thread places 7 first:

Then it places 5:

Because the next of 5 is null at this point, the expansion ends here. The result for 3 5 7 is that an element is lost.
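
The 3 5 7 case can be replayed the same way. As before, this is only a sketch: Node and drain are hypothetical helpers, the schedule assumes that thread A is suspended after line 9 while thread B finishes its transfer, and both threads are assumed to have read the bucket head before either cleared the slot.

```java
import java.util.ArrayList;
import java.util.List;

public class ElementLossDemo {
    // Hypothetical stand-in for HashMap's internal Entry; it only keeps the hash.
    static final class Node {
        final int hash;
        Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // The do...while body of transfer, starting from an already-read e and next.
    static void drain(Node e, Node next, Node[] newTable) {
        while (true) {
            int i = indexFor(e.hash, newTable.length);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
            if (e == null) return;
            next = e.next;
        }
    }

    // Replays the schedule and lists every hash reachable from thread A's table.
    static List<Integer> simulate() {
        Node[] table = new Node[2];
        for (int h : new int[] {3, 5, 7}) {
            table[1] = new Node(h, table[1]); // head insertion: 7 -> 5 -> 3
        }

        // Thread A is suspended after line 9 with e = 7 and next = 5.
        Node eA = table[1];
        Node nextA = eA.next;

        // Thread B finishes its transfer: 3 -> 7 in slot 3, 5 alone in slot 1.
        Node eB = table[1];
        drain(eB, eB.next, new Node[4]);

        // Thread A resumes: places 7, then 5, then stops because 5.next is null.
        Node[] newTableA = new Node[4];
        drain(eA, nextA, newTableA);

        List<Integer> found = new ArrayList<>();
        for (Node head : newTableA) {
            for (Node n = head; n != null; n = n.next) {
                found.add(n.hash);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(simulate()); // [5, 7]
    }
}
```

If thread A's table is the one that ends up assigned to the table field, the entry 3 is no longer reachable from the map, although no exception was ever thrown.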

Solution

Sharing a collection that is not thread-safe as a global object is itself an incorrect approach, and errors are bound to occur under concurrency.

Therefore, there are two ways to solve this problem:

1. Use the Collections.synchronizedMap(Map m) method to turn the HashMap into a thread-safe Map.

2. Use Hashtable or ConcurrentHashMap, both of which are thread-safe Maps.

However, once we choose a thread-safe approach, we must pay a certain price in performance. After all, nothing in the world is perfect; we cannot demand both the highest running efficiency and thread safety.
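
As a minimal sketch of both options, the code below runs the five-thread test against a synchronized wrapper and against a ConcurrentHashMap. The class name SafeMapDemo and the helper fill are made up for this illustration. Unlike the original test, it uses getAndIncrement so that each number is handed to exactly one thread, which makes the final size predictable.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class SafeMapDemo {
    // Fills the given map with keys 0..limit-1 from several threads and returns its size.
    static int fill(final Map<Integer, Integer> map, int threads, final int limit) {
        final AtomicInteger ai = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                int n;
                // getAndIncrement gives every value to exactly one thread.
                while ((n = ai.getAndIncrement()) < limit) {
                    map.put(n, n);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(ex);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        // Option 1: wrap a HashMap so that every call is synchronized.
        Map<Integer, Integer> wrapped =
                Collections.synchronizedMap(new HashMap<Integer, Integer>(1));
        System.out.println(fill(wrapped, 5, 100000)); // 100000

        // Option 2: use a map designed for concurrent access.
        Map<Integer, Integer> concurrent = new ConcurrentHashMap<Integer, Integer>();
        System.out.println(fill(concurrent, 5, 100000)); // 100000
    }
}
```

Both variants terminate and keep every entry. Which one performs better depends on the workload, so it is worth measuring before choosing.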
