A thread blocking problem caused by using a local cache

Source: Internet
Author: User

Symptom

A colleague's Java system started blocking requests after running for some time (clients received HTTP 504). The only memory dump file available showed that most threads were blocked inside a local cache (jodd cache), waiting in ReentrantReadWriteLock$ReadLock.lock.

Troubleshooting Phase 1

My first instinct was that this can only happen while the write lock is held. So I searched the dump for the keyword "WriteLock.lock" to find the writer, but found nothing. In hindsight that is expected: a thread that already holds the write lock has returned from WriteLock.lock, so it cannot be sitting in that frame.

I then read through jodd's LRUCache code and found that it is implemented on top of LinkedHashMap. Searching the dump for LinkedHashMap write operations turned up one thread executing LRUCache's put method. It was stopped in LRUCache's pruneCache method, which runs during put when the cache is full and evicts expired entries:

    protected int pruneCache() {
        if (isPruneExpiredActive() == false) {
            return 0;
        }
        int count = 0;
        // cacheMap is an instance of LinkedHashMap
        Iterator<CacheObject<K, V>> values = cacheMap.values().iterator();
        while (values.hasNext()) {
            CacheObject<K, V> co = values.next();
            if (co.isExpired() == true) {
                values.remove();
                count++;
            }
        }
        return count;
    }

This confirmed the original conjecture: so many reader threads can only be blocked while the write lock is held.

So jodd's LRUCache, built from LinkedHashMap plus ReentrantReadWriteLock, has a performance problem: a write operation locks the entire cache and blocks every read operation. This is the first problem.
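The blocking behavior can be reproduced with a bare ReentrantReadWriteLock. The following is a minimal sketch, not jodd's code: the class and method names are made up for illustration, the writer thread stands in for a put that never leaves its pruning loop, and the reader uses tryLock with a timeout (instead of lock) only so that the demo terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class WriteLockBlocksReaders {

    // Returns true if a reader obtained the read lock within timeoutMillis
    // while another thread was still holding the write lock.
    static boolean readerGetsInWhileWriterHolds(long timeoutMillis) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch writerHasLock = new CountDownLatch(1);
        CountDownLatch releaseWriter = new CountDownLatch(1);

        // Hypothetical stand-in for a put() stuck inside the write lock.
        Thread writer = new Thread(() -> {
            lock.writeLock().lock();
            try {
                writerHasLock.countDown();
                releaseWriter.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.writeLock().unlock();
            }
        });
        writer.start();

        boolean acquired = false;
        try {
            writerHasLock.await();
            // A real get() would call readLock().lock() and park indefinitely.
            acquired = lock.readLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.readLock().unlock();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            releaseWriter.countDown(); // let the writer finish
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquired;
    }

    public static void main(String[] args) {
        // prints "reader acquired lock: false"
        System.out.println("reader acquired lock: " + readerGetsInWhileWriterHolds(200));
    }
}
```

With lock() in place of tryLock, the reader would show up in a dump parked in ReadLock.lock, which matches what the blocked threads looked like.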

Phase 2

Clearly the analysis cannot stop here; we should dig deeper into how LRUCache is implemented. The main logic is that put takes the write lock and get takes the read lock, and internally a LinkedHashMap with accessOrder enabled is used as the data store.

At first glance this looks fine. In fact, multi-threaded get on a LinkedHashMap with accessOrder enabled has a concurrency problem, because get moves the accessed entry to the most recently used end of the doubly linked list. See the get method of LinkedHashMap:

    public V get(Object key) {
        Entry<K,V> e = (Entry<K,V>)getEntry(key);
        if (e == null)
            return null;
        e.recordAccess(this);
        return e.value;
    }

    void recordAccess(HashMap<K,V> m) {
        LinkedHashMap<K,V> lm = (LinkedHashMap<K,V>)m;
        if (lm.accessOrder) {
            lm.modCount++;
            remove();
            addBefore(lm.header);
        }
    }
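The effect of this relinking is visible even from a single thread. A small sketch (the class and method names are illustrative only) that builds an access-ordered LinkedHashMap and shows that a plain get changes the iteration order:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AccessOrderGetMutates {

    // Returns the key order of an access-ordered map after one get("a").
    static List<String> keysAfterGet() {
        // The third constructor argument (true) enables accessOrder.
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, true);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // relinks "a" as the most recently used entry
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterGet()); // prints [b, c, a]
    }
}
```

Because get rewrites the linked list, it behaves like a write as far as thread safety is concerned.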

In LinkedHashMap's get and recordAccess, nothing guards the change to the linked list structure, so concurrent get calls on a LinkedHashMap are not safe. jodd only takes the read lock in get, and a read lock can be held by many threads at the same time, so the concurrency problem remains (if this is unclear, read up on how ReentrantReadWriteLock works). This is the second problem.
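That the read lock does not serialize readers can be checked directly. In this sketch (names are hypothetical, and the latches exist only to keep all readers inside the lock at once), several threads take the read lock and the lock's own getReadLockCount() reports how many hold it at the same moment:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadLockIsShared {

    // Starts `readers` threads that each take the read lock and stay inside,
    // then reports how many read holds exist at the same moment.
    static int concurrentReadHolders(int readers) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch allInside = new CountDownLatch(readers);
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> threads = new ArrayList<>();

        for (int i = 0; i < readers; i++) {
            Thread t = new Thread(() -> {
                lock.readLock().lock();
                try {
                    allInside.countDown(); // signal: this reader is inside
                    release.await();       // stay inside until released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    lock.readLock().unlock();
                }
            });
            threads.add(t);
            t.start();
        }

        int holders = -1;
        try {
            allInside.await();
            holders = lock.getReadLockCount();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return holders;
    }

    public static void main(String[] args) {
        // prints "simultaneous read-lock holders: 4"
        System.out.println("simultaneous read-lock holders: " + concurrentReadHolders(4));
    }
}
```

Each of those holders could be inside an access-ordered get, relinking the same list with no mutual exclusion.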

You can imagine the linked list being corrupted in all sorts of strange ways under high concurrency (I will not enumerate them here; it takes too much effort). It is entirely possible for values.hasNext() in the pruneCache() method above to remain true forever. This time the thread got stuck in LRUCache#pruneCache; next time it might be in LinkedHashMap#transfer. Once the code holding the write lock hangs, every reader thread is blocked. How often this happens varies, and it is hard to simulate or reproduce.
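For contrast, here is a minimal sketch of an LRU cache in which get is treated as the write it really is. This is neither jodd's implementation nor the Guava approach recommended in the summary; it assumes a single exclusive lock (synchronized) is acceptable for the workload, and the class name is made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ExclusiveLruCache<K, V> {
    private final int capacity;
    private final LinkedHashMap<K, V> map;

    ExclusiveLruCache(int capacity) {
        this.capacity = capacity;
        // accessOrder = true gives LRU ordering; removeEldestEntry evicts
        // the least recently used entry once the capacity is exceeded.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > ExclusiveLruCache.this.capacity;
            }
        };
    }

    // get relinks the entry, so it takes the same exclusive lock as put.
    synchronized V get(K key) {
        return map.get(key);
    }

    synchronized void put(K key, V value) {
        map.put(key, value);
    }

    synchronized int size() {
        return map.size();
    }

    public static void main(String[] args) {
        ExclusiveLruCache<String, Integer> cache = new ExclusiveLruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes most recently used
        cache.put("c", 3); // evicts "b", the least recently used
        System.out.println(cache.get("b")); // prints null
    }
}
```

The trade-off is that reads no longer run in parallel, but the linked list can no longer be modified by two threads at once.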

JUC Bug

Some bugs in earlier JDK versions are also worth mentioning.

ReentrantReadWriteLock may hang without any thread holding the lock:
http://bugs.sun.com/view_bug.do?bug_id=6822370
http://bugs.sun.com/view_bug.do?bug_id=6903249

Summary
  • Do not use jodd cache.
  • Guava cache is recommended.
    It is implemented based on ConcurrentLinkedHashMap and has been integrated into Guava.
  • Do not blindly trust open-source components. Study them thoroughly before using them.

This article was first published at http://jenwang.me

Further communication:

- Email: jenwang@foxmail.com

- If you are interested in topics from this blog and want to discuss them further, join the QQ group: 2825967

- More technical discussion takes place in the "Architecture Miscellaneous" circle, where experienced engineers talk about cutting-edge Internet technologies, architectures, tools, and solutions.
