Guava Source Code Analysis (Cache principle)

Last Update:2018-08-01 Source: Internet

Author: User


Objective

Google's Guava is a widely used library that extends the Java core libraries.

I use it often. This time I happened to be using its Cache component, so I took the opportunity to see how the experts at Google designed it.

Cache

This article mainly discusses caching.

Caching is important in everyday development. If your application reads a certain kind of data very frequently and that data rarely changes, caching is a great way to improve performance.

Caching improves performance because reads are fast, like a CPU's L1, L2, and L3 caches: the closer a level sits to the core (L1 being the closest), the faster it can be read.

But it is not all upside. The faster the storage, the smaller and more precious it is, so we should cache only the data that really needs it.

In effect, this is the classic trade of space for time.

The following discusses the caches commonly used in Java.

JVM Cache

The first is the JVM cache, which can also be considered an on-heap cache.

It simply means creating some global variables, such as Map or List containers, to store the data.

The advantage is that it is simple to use, but it has the following problems:

You can only write and clear data explicitly.
Data cannot be evicted according to rules such as LRU, LFU, or FIFO.
There is no callback notification when data is purged.
Other customization features are missing.

Ehcache, Guava Cache

So there are open source tools built specifically for JVM caching, such as the Guava Cache discussed in this article.

It has the features that the plain JVM cache above lacks, such as automatic purging of data, multiple eviction algorithms, removal callbacks, and so on.

But because of these features, such a cache inevitably has more to maintain internally, which naturally increases system overhead.

Distributed cache

The two caches just mentioned are on-heap caches that can only be used within a single node, so they fall short in distributed scenarios.

There is also cache middleware, such as Redis and Memcached, which lets the nodes of a distributed environment share cached data.

The details are beyond the scope of this discussion.

Guava Cache Example

I thought of Guava's Cache because of a requirement I worked on recently, roughly as follows:

Application log messages are read from Kafka in real time; they carry the health status of the application.
If an exception occurs X times within a time window N, I need to respond (alarms, logs, etc.).

Guava's Cache fits this well: I rely on it to clear an entry once N amount of time has passed since it was written, and on every read I check whether the exception count has reached X.

The pseudocode is as follows:

    @Value("${alert.in.time:2}")
    private int time;

    @Bean
    public LoadingCache<Long, AtomicLong> buildCache() {
        return CacheBuilder.newBuilder()
                .expireAfterWrite(time, TimeUnit.MINUTES)
                .build(new CacheLoader<Long, AtomicLong>() {
                    @Override
                    public AtomicLong load(Long key) throws Exception {
                        // A key with no cached value starts counting from 0.
                        return new AtomicLong(0);
                    }
                });
    }

    /**
     * Decide whether an alert needs to be raised.
     * (counter is the cache built above; KEY, limit and LOGGER are defined elsewhere.)
     */
    public void checkAlert() {
        try {
            if (counter.get(KEY).incrementAndGet() >= limit) {
                LOGGER.info("***********Alert***********");
                // Reset the cached counter.
                counter.get(KEY).getAndSet(0L);
            }
        } catch (ExecutionException e) {
            LOGGER.error("Exception", e);
        }
    }

First, a LoadingCache is built that reclaims an entry N minutes after it was written (when a key has no cached value, the loader returns 0 by default).

Then checkAlert() is called each time a message is consumed, which satisfies the requirement above.

Let's imagine how Guava implements automatic purging of expired data, and how it can evict in an LRU-like fashion.

A bold hypothesis:

Internally, a queue maintains the order of the cache entries, each accessed entry is moved to the head of the queue, and an additional thread checks whether data has expired and deletes it if so. This is somewhat like my earlier article, Implementing an LRU Cache.

As Hu Shi said: hypothesize boldly, verify carefully.

Let's see how Guava actually implements it.

Principle Analysis

The best way to understand the principle is to step through the code:

The sample code is here:

https://github.com/crossoverJie/Java-Interview/blob/master/src/main/java/com/crossoverjie/guava/CacheLoaderTest.java

To see how Guava deletes expired data, the sample sleeps for 5 seconds before reading from the cache, so that the entry has timed out.

Stepping through, you will eventually find that line 2187 of the com.google.common.cache.LocalCache class is the key one.

Stepping in, line 2182 first checks whether count is greater than 0. count holds the number of entries currently cached and is declared volatile to guarantee visibility.

For more about volatile, see the article "The volatile keyword you should know".

Then keep following down to:

Line 2761. Judging by the method name, it checks whether the current Entry has expired; the Entry is looked up by key.

Here it is clear that the key is judged expired according to the expiration policy specified at construction time.

If it has expired, execution continues and tries to remove the expired entry (this requires a lock, which will be discussed in detail later).

The logic here is clear:

Get the current total number of cached entries.
Decrement it by one (the lock was acquired earlier, so this is thread safe).
Delete the entry and assign the updated total to count.

That is the overall process. Guava does not use a separate thread to maintain expired data, as was hypothesized earlier.
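The steps above can be sketched in plain Java. This is only a minimal illustration of expire-on-read, not Guava's actual code: the class name LazyExpiringCache, the injectable ticker, and the single lock guarding a HashMap are assumptions made for brevity, whereas Guava does its locking per Segment.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongSupplier;

/** Hypothetical sketch: entries are expired lazily, when a read touches them. */
class LazyExpiringCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long writeTime;

        Entry(V value, long writeTime) {
            this.value = value;
            this.writeTime = writeTime;
        }
    }

    private final Map<K, Entry<V>> table = new HashMap<>();
    private final ReentrantLock lock = new ReentrantLock();
    private final long expireAfterWrite;
    private final LongSupplier ticker; // assumed clock source, injectable for tests
    private volatile int count;        // volatile so readers see the latest size

    LazyExpiringCache(long expireAfterWrite, LongSupplier ticker) {
        this.expireAfterWrite = expireAfterWrite;
        this.ticker = ticker;
    }

    void put(K key, V value) {
        lock.lock();
        try {
            Entry<V> previous = table.put(key, new Entry<>(value, ticker.getAsLong()));
            if (previous == null) {
                count = count + 1; // safe: only modified while holding the lock
            }
        } finally {
            lock.unlock();
        }
    }

    V get(K key) {
        if (count == 0) {
            return null; // fast path: nothing cached, no lock needed
        }
        lock.lock();
        try {
            Entry<V> entry = table.get(key);
            if (entry == null) {
                return null;
            }
            if (ticker.getAsLong() - entry.writeTime >= expireAfterWrite) {
                int newCount = count - 1; // 1. read the total, 2. decrement it
                table.remove(key);        // 3. delete the expired entry
                count = newCount;         //    and publish the updated total
                return null;
            }
            return entry.value;
        } finally {
            lock.unlock();
        }
    }

    int size() {
        return count;
    }
}
```

No background thread appears anywhere in the sketch: an expired entry stays in the table until some read happens to touch it.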

The likely reasons are:

A new thread consumes resources.
Maintaining expired data would also require acquiring extra locks, which adds overhead.

Instead, this work is done along the way during lookups. If the cache is not accessed for a long time, expired data may not be reclaimed, but that is not a problem for a high-throughput application.

Summary

Finally, let's summarize Guava's Cache.

In fact, in the above code, you will find the following code when locating data through a key:

If you have studied how ConcurrentHashMap works, this should look very familiar.

In fact, to support concurrent use, Guava Cache's core data structure is modeled on ConcurrentHashMap. What you see here is the process of locating a key's specific position.

It first finds the Segment and then finds the specific position inside it, which amounts to locating by hash twice.
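The two-step lookup can be illustrated with a small sketch. It is an assumption-laden simplification rather than Guava's code: the names SegmentedIndex, segmentFor and bucketFor are made up for this example, the high bits of the hash pick a segment, and the low bits pick a slot in that segment's table, whose length is assumed to be a power of two.

```java
/** Hypothetical sketch of locating a key in two steps: segment first, then bucket. */
class SegmentedIndex {
    private final int segmentShift;
    private final int segmentMask;
    private final int tableLength; // per-segment table size, assumed to be a power of two

    SegmentedIndex(int concurrencyLevel, int tableLength) {
        // Round the segment count up to the next power of two.
        int shift = 0;
        int segmentCount = 1;
        while (segmentCount < concurrencyLevel) {
            shift++;
            segmentCount <<= 1;
        }
        this.segmentShift = 32 - shift;
        this.segmentMask = segmentCount - 1;
        this.tableLength = tableLength;
    }

    /** Step 1: the high bits of the hash choose the segment. */
    int segmentFor(int hash) {
        return (hash >>> segmentShift) & segmentMask;
    }

    /** Step 2: the low bits of the hash choose the slot inside that segment. */
    int bucketFor(int hash) {
        return hash & (tableLength - 1);
    }
}
```

Using different bits for the two steps keeps keys that land in the same segment from also piling up in the same slot.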

One part of the hypothesis above was right: internally it maintains two queues, accessQueue and writeQueue, that record the order of cache entries so that data can be evicted in order (similar to using LinkedHashMap for an LRU cache).
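For comparison, the LinkedHashMap approach mentioned above can be sketched as follows. This is the standard JDK idiom, not part of Guava; the class name LruCache and the maxEntries parameter are chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache built on LinkedHashMap's access ordering. */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true: every get() moves the entry to the end of the iteration order
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true evicts the least recently used entry.
        return size() > maxEntries;
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "a" was touched more recently.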

Judging from the way a cache is constructed, it also uses the builder pattern to create objects.

As a tool for developers, it has many custom attributes to configure, and the builder pattern suits that well.
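A rough sketch of why a builder fits this kind of configuration is shown below. The class SimpleCacheSpec and its two options are hypothetical and only loosely inspired by the CacheBuilder calls in the earlier example; they are not Guava's API.

```java
/** Hypothetical immutable cache configuration created through a builder. */
class SimpleCacheSpec {
    final long expireAfterWriteMillis; // -1 means "not configured"
    final int maximumSize;             // -1 means "not configured"

    private SimpleCacheSpec(Builder builder) {
        this.expireAfterWriteMillis = builder.expireAfterWriteMillis;
        this.maximumSize = builder.maximumSize;
    }

    static Builder newBuilder() {
        return new Builder();
    }

    static final class Builder {
        private long expireAfterWriteMillis = -1;
        private int maximumSize = -1;

        Builder expireAfterWrite(long millis) {
            if (millis < 0) {
                throw new IllegalArgumentException("duration must not be negative");
            }
            this.expireAfterWriteMillis = millis;
            return this; // returning this allows calls to be chained
        }

        Builder maximumSize(int size) {
            if (size < 0) {
                throw new IllegalArgumentException("size must not be negative");
            }
            this.maximumSize = size;
            return this;
        }

        SimpleCacheSpec build() {
            return new SimpleCacheSpec(this);
        }
    }
}
```

Each option is optional and validated where it is set, so callers configure only what they need, for example SimpleCacheSpec.newBuilder().expireAfterWrite(2000).build().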

There is actually much more to say about Guava, such as using GC to reclaim memory and the callback notifications issued when data is removed. We will discuss those later.

Finally, a small ad:

Java-Interview currently has close to 8K stars.

The next small goal: push for 10K stars.

Thank you all for the support and the likes.
