Talk about the Cache

Source: Internet
Author: User

Title: Talk about the cache

Tags

    • Cache

Categories

    • Tech

Comments: true
Date: 2018-06-18 22:00:00

Last year, while working on system performance optimization, I spent a lot of effort customizing a caching scheme for the business and felt it was complete at the time. A few days ago, however, a casual chat about caching made me realize that some details had still not been considered. Here is a summary of the issues you need to think about when building a cache.

The outline is as follows:

    • Cache mode
    • Cache eviction
    • Cache breakdown
    • Cache penetration
    • Cache avalanche

Cache mode

The more common patterns fall into two main categories: Cache-aside and Cache-as-SoR, where SoR (System of Record) is the underlying store, i.e. the DB that holds the authoritative data. Cache-as-SoR further includes Read-Through, Write-Through, and Write-Behind.

Cache-aside

Cache-aside is the more general-purpose caching pattern. The read path can be summarized as follows:

    1. Read the cache and return directly if the value is present. If it does not exist, go to step 2
    2. Read the SoR, update the cache, and return
      The code is as follows:
# read v1
def get(key):
    value = cache.get(key)
    if value is None:
        value = db.get(key)
        cache.set(key, value)
    return value

The process for writing data is:

    1. Write the SoR
    2. Write the cache
      The code is as follows:
# write v1
def set(key, value):
    db.set(key, value)
    cache.set(key, value)

The logic seems simple, but there are plenty of surprises in a highly concurrent, distributed scenario.

Cache-as-SoR

In Cache-aside mode, the cache maintenance logic is implemented and maintained by the business side. In Cache-as-SoR, the cache logic lives on the storage side: the DB + cache is a single transparent whole to the business caller, which does not need to care about the implementation details and only needs to call get/set. Common Cache-as-SoR patterns are Read Through, Write Through, and Write Behind.

    • Read Through: when a read occurs, query the cache; on a miss, the cache layer queries the SoR and updates itself, so subsequent accesses hit the cache directly (essentially Cache-aside implemented on the storage side)
    • Write Through: when a write occurs, query the cache; if it hits, update the cache, and the cache component then updates the SoR
    • Write Behind: when a write occurs, the SoR is not updated immediately; only the cache is updated and the call returns at once, and the SoR is updated asynchronously (eventual consistency)

Read/Write Through is easy to understand: the cache and SoR are updated synchronously, and reads go to the cache first, falling back to the SoR on a miss. The main purpose of this kind of pattern is to relieve the read pressure on the SoR and improve the overall response time; it does nothing to optimize writes, so it suits read-heavy, write-light workloads. Write Behind updates the cache and SoR asynchronously, and the asynchronous step can batch and merge writes, thus improving write performance.
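As a rough illustration of Write Behind, the following sketch updates the cache synchronously and defers the SoR write to a background thread that merges queued writes to the same key. The cache and db handles are the same assumed objects used in the snippets above, and the flushing policy is deliberately simplified:

# write behind sketch: update the cache synchronously, flush to the SoR asynchronously
import queue
import threading

write_queue = queue.Queue()

def set(key, value):
    cache.set(key, value)          # update the cache and return immediately
    write_queue.put((key, value))  # defer the SoR write

def _flush_worker():
    while True:
        batch = {}
        k, v = write_queue.get()   # block until at least one write is pending
        batch[k] = v
        while not write_queue.empty():
            k, v = write_queue.get()
            batch[k] = v           # merge queued writes to the same key
        for k, v in batch.items():
            db.set(k, v)           # eventually consistent with the cache

threading.Thread(target=_flush_worker, daemon=True).start()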

The following two figures are flowcharts of Write Through and Write Behind from Wikipedia:


Write Through and Write Behind

Summary

Many DBs now have memory-based caches of their own that can respond to requests more quickly: HBase, for example, caches data blocks, and part of MongoDB's high performance comes from its heavy use of system memory as cache. Even so, a local cache in the service itself is noticeably more effective, because it eliminates a large amount of network I/O; it greatly reduces the processing latency of the system and relieves the pressure on the downstream cache + DB.

Cache eviction

Cache eviction is a fairly old topic, and the commonly used strategies are only a few: FIFO, LFU, and LRU. LRU is essentially the default eviction strategy, though depending on the business scenario another strategy may fit better.

A FIFO eviction strategy is usually implemented with a queue + dict; after all, a queue is inherently first-in-first-out. A new cache object is appended to the tail of the queue, and when the queue is full, the object at the head is dequeued and evicted.
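A minimal FIFO sketch along those lines, assuming a fixed max_size and the same kind of in-process dict used elsewhere in this post:

# FIFO eviction sketch: queue + dict
from collections import deque

fifo_queue = deque()
data_dict = dict()
max_size = 1024

def add(key, value):
    if key not in data_dict:
        if len(data_dict) >= max_size:
            oldest = fifo_queue.popleft()  # the object at the head of the queue is evicted
            del data_dict[oldest]
        fifo_queue.append(key)
    data_dict[key] = value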

The core idea of LFU (Least Frequently Used) is that the data used least often is evicted first: count how many times each object is accessed and, when eviction is needed, evict the object with the smallest count. LFU is therefore usually implemented with a min-heap + dict. Because each update of the min-heap costs O(log n), LFU runs in O(log n), slightly less efficient than the O(1) of FIFO and LRU.
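A minimal LFU sketch in that spirit, using heapq as the min-heap with lazy invalidation of stale entries; the class name and max_size are illustrative assumptions:

# LFU eviction sketch: min-heap + dict, stale heap entries skipped at eviction time
import heapq
import itertools

class LFUCache:
    def __init__(self, max_size=1024):
        self.max_size = max_size
        self.data = {}            # key -> value
        self.counts = {}          # key -> access count
        self.heap = []            # (count, tiebreak, key); may hold stale entries
        self._tick = itertools.count()

    def _push(self, key):
        heapq.heappush(self.heap, (self.counts[key], next(self._tick), key))

    def get(self, key):
        if key not in self.data:
            return None
        self.counts[key] += 1
        self._push(key)           # old entries for this key become stale
        return self.data[key]

    def set(self, key, value):
        if key not in self.data and len(self.data) >= self.max_size:
            self._evict()
        self.data[key] = value
        self.counts[key] = self.counts.get(key, 0) + 1
        self._push(key)

    def _evict(self):
        while self.heap:
            count, _, key = heapq.heappop(self.heap)
            if key in self.counts and self.counts[key] == count:
                del self.data[key]
                del self.counts[key]
                return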

LRU (Least Recently Used) is based on the principle of locality: if data has been used recently, it is very likely to be used again in the future; conversely, if data has not been used for a long time, the probability of future use is lower.

LRU eviction is typically implemented with a doubly linked list + dict (in production the linked list is generally doubly linked): the most recently accessed data is moved from its original position to the head of the list, so the head of the list holds the most recently used data, the tail holds the data unused for the longest time, and the item to evict can be found in O(1).

# LRU cache eviction, outline logic (lock-free version)
data_dict = dict()
link = DoubleLink()  # doubly linked list

def get(key):
    node = data_dict.get(key)
    if node is not None:
        link.MoveToFront(node)
    return node

def add(key, value):
    node = Node(key, value)
    link.PushFront(node)
    data_dict[key] = node
    if link.size() > max_size:
        tail = link.back()
        del data_dict[tail.key]
        link.remove_back()

Ps:

    1. The lru_cache implementation in Python 3's functools
    2. A Golang implementation of an LRU cache
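For a self-contained, runnable variant of the same idea, Python's collections.OrderedDict can play both roles (hash map + recency-ordered list). This is only a sketch, not the functools or Golang implementation referenced above:

# LRU sketch using OrderedDict (max_size is an assumption)
from collections import OrderedDict

class LRUCache:
    def __init__(self, max_size=1024):
        self.max_size = max_size
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.max_size:
            self.data.popitem(last=False)  # evict the least recently used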

Cache breakdown

In high-concurrency scenarios (such as flash sales), if a hot key expires at some moment while a large number of requests are still accessing it, those requests all fall straight through to the downstream DB. This is 缓存击穿 (cache breakdown): it puts great pressure on the DB, and one such wave of traffic may well knock the DB, and the business with it, offline.

In this case, the usual way to protect the downstream is to guard DB access with a mutex: the thread/process that obtains the lock reads the DB and updates the cache, while the processes that fail to acquire the lock retry the whole get logic.

This logic can be implemented with Redis's SET (NX, EX) acting as the lock, as follows:

# read v2
import time
import redis

r = redis.StrictRedis()

def get(key, retry=3):
    def _get(k):
        value = cache.get(k)
        if value is None:
            if r.set(k, 1, ex=1, nx=True):  # acquire the lock
                value = db.get(k)
                cache.set(k, value)
                return value, True
            else:
                return None, False
        else:
            return value, True

    while retry:
        value, flag = _get(key)
        if flag:
            return value
        time.sleep(1)  # failed to acquire the lock, sleep and retry
        retry -= 1
    raise Exception("failed to fetch the value")

Cache penetration

When the requested data does not exist at all, this nonexistent data is never written to the cache, so every request for it lands directly on the downstream DB. When the volume of such requests is large, this also poses a risk to the downstream DB.

Solutions:

    1. Consider caching this nonexistent data for an appropriately short period, storing a special sentinel value for the empty result (see the sketch after this list).

    2. Another, more rigorous approach is to use a Bloom filter. A Bloom filter never gives a false negative when testing whether a key exists (if it says a key does not exist, the key definitely does not exist), but it may give false positives (if it says a key exists, the key may still be absent). HBase uses Bloom filters internally to quickly determine that a row does not exist.
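For the first approach, a minimal sketch of caching the miss as a sentinel value; the EMPTY marker, the 60-second TTL, and the ttl keyword on cache.set are assumptions, not part of the snippets above:

# negative caching sketch: store a sentinel for keys that do not exist in the DB
EMPTY = "__EMPTY__"

def get(key):
    value = cache.get(key)
    if value is not None:
        return None if value == EMPTY else value
    value = db.get(key)
    if value is None:
        cache.set(key, EMPTY, ttl=60)  # cache the miss briefly (assumed ttl parameter)
        return None
    cache.set(key, value)
    return value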

Preventing penetration with a Bloom filter:

# read v3
import time
import redis

r = redis.StrictRedis()

def get(key, retry=3):
    def _get(k):
        value = cache.get(k)
        if value is None:
            if not Bloomfilter.get(k):
                # on a cache miss, check the Bloom filter first;
                # the Bloom filter must be updated in the same transaction as DB writes
                return None, True
            if r.set(k, 1, ex=1, nx=True):  # acquire the lock
                value = db.get(k)
                cache.set(k, value)
                return value, True
            else:
                return None, False
        else:
            return value, True

    while retry:
        value, flag = _get(key)
        if flag:
            return value
        time.sleep(1)
        retry -= 1
    raise Exception("failed to fetch the value")

Cache avalanche

When a large number of cache entries become invalid at the same time for some reason, such as simultaneous expiration or a restart, a flood of requests hits the downstream service or DB directly, putting it under great pressure and possibly bringing it down: an avalanche.

The 同时过期 (simultaneous expiration) scenario typically arises from a cold start or a traffic burst: a large amount of data is written to the cache within a very short period with the same TTL, so it all expires at roughly the same time.

Solutions:

    1. A simple approach is 随机过期 (randomized expiration): set each entry's expiration time to expire + random, so entries do not all expire at the same moment (see the sketch after this list).

    2. Another good solution is a two-level cache, for example a local_cache + redis storage scheme, or the redis + redis pattern used in earlier cache designs.
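A minimal sketch of randomized expiration; the base TTL, the jitter range, and the ttl keyword on cache.set are assumptions:

# randomized expiration sketch: expire + random, so entries do not expire together
import random

def set_with_jitter(key, value, base_ttl=600, jitter=120):
    ttl = base_ttl + random.randint(0, jitter)
    cache.set(key, value, ttl=ttl)  # assumed ttl parameter on the cache client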

In addition, a reasonable degradation scheme helps. In high-concurrency scenarios, when excessive concurrency is detected or the resource is already under heavy impact, protect downstream resources through rate limiting and degradation so they are not overwhelmed; the cache is rebuilt gradually during the throttling period, and normal traffic is restored as the cache recovers.

Reference

http://www.cs.utah.edu/~stutsman/cs6963/public/papers/memcached.pdf
http://www.ehcache.org/documentation/3.5/caching-patterns.html
https://docs.microsoft.com/en-us/azure/architecture/patterns/cache-aside
https://coolshell.cn/articles/17416.html
https://en.wikipedia.org/wiki/Cache_(computing)
https://docs.oracle.com/cd/E13924_01/coh.340/e13819/readthrough.htm
https://blog.csdn.net/zeb_perfect/article/details/54135506
http://blog.didispace.com/chengchao-huancun-zuijiazhaoshi/
