Directory
- Cache breakdown/penetration/avalanche
- Intro
- Cache breakdown
- Cache penetration
- Cache avalanche
- Reference
- Contact
Cache breakdown/penetration/avalanche intro
Using caching requires understanding several common cache problems: cache breakdown, cache penetration, and cache avalanche. You need to understand their causes and how to avoid them, especially when planning your own caching framework, where you have to consider how to handle these issues.
Cache breakdown
A typical cache system looks up queries by key; when there is no corresponding value, it falls back to the backend system (such as the database) to look the data up. If a key has no value in the cache and a large number of concurrent requests arrive for that key, the backend system comes under heavy pressure.
Under high concurrency, when many threads query the same resource at the same time and the cache does not hold it, all of those threads go to the backend service or database to look it up. The database comes under great pressure, and the cache loses its reason to exist.
Cache breakdown solution
The problem with cache breakdown is that, in a high-concurrency multi-threaded scenario, many requests reach the backend services and databases all at once, producing a sudden burst of pressure on them.
To deal with this, requests for the same key should be queued: the first request goes to the backend service or database and updates the cached value, and the requests behind it then fetch the data from the cache instead of hitting the backend service and database again.
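A minimal sketch of this queueing idea in Java (the in-process map standing in for the cache and the `loadFromDb` helper are assumptions for illustration, not part of the original text): one lock object per key makes concurrent requests for the same key line up, so only the first one reaches the database.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class BreakdownSafeCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // One lock object per key: requests for the same key queue up,
    // requests for different keys stay independent.
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();

    public String get(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value;
        }
        Object lock = locks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            // Double-check: a queued thread may find the value already loaded
            // by the thread in front of it.
            value = cache.get(key);
            if (value == null) {
                value = loadFromDb(key); // only the first thread reaches here
                cache.put(key, value);
            }
        }
        return value;
    }

    // Placeholder for the real backend/database lookup (assumption).
    private String loadFromDb(String key) {
        return "value-for-" + key;
    }
}
```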
Cache penetration
Cache penetration refers to a user querying data that does not exist in the database, and therefore naturally does not exist in the cache either. Every such query misses the cache and goes to the database, so the cache loses its purpose; compared with querying the database directly, each request even adds the extra cost of checking the cache first.
Cache penetration solution
The cause of the problem is requests for nonexistent data: the data never lands in the cache, so the backend system (mainly the database) has to absorb a lot of pressure. To solve it, a large share of the requests must be intercepted at the cache layer, so that as few requests as possible reach the backend system and query the database.
The usual way to handle this is, when the data does not exist, to cache an empty value for that key with a short expiration time, thereby reducing the number of requests that actually reach the backend and the number of database queries.
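A sketch of that empty-value approach, assuming an in-process cache with per-entry expiry (the TTL values, the `Entry` record, and the `loadFromDb` placeholder are all illustrative): a "not found" result is cached briefly, so repeated queries for missing data stop at the cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PenetrationSafeCache {
    // Sentinel stored when the database has no row for the key.
    private static final String EMPTY = "";
    private static final long NORMAL_TTL_MS = 10 * 60 * 1000; // 10 minutes
    private static final long EMPTY_TTL_MS  = 30 * 1000;      // short TTL for "not found"

    private record Entry(String value, long expiresAt) {}
    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    public String get(String key) {
        long now = System.currentTimeMillis();
        Entry e = cache.get(key);
        if (e != null && e.expiresAt() > now) {
            // Cache hit; a cached EMPTY means "known to not exist".
            return EMPTY.equals(e.value()) ? null : e.value();
        }
        String value = loadFromDb(key); // may return null for nonexistent data
        if (value == null) {
            cache.put(key, new Entry(EMPTY, now + EMPTY_TTL_MS));
            return null;
        }
        cache.put(key, new Entry(value, now + NORMAL_TTL_MS));
        return value;
    }

    // Placeholder for the real database query (assumption);
    // returns null when the row does not exist.
    private String loadFromDb(String key) {
        return null;
    }
}
```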
A more sophisticated solution is the Bloom filter. The basic principle is to maintain a compact representation of the set of existing keys and, before querying the cache, check it to judge whether the key can exist at all. A brief introduction follows.
The Bloom filter was proposed by Burton Howard Bloom in 1970. It is essentially a very long binary vector together with a series of random mapping (hash) functions, and it can be used to test whether an element is in a set. Its advantage is that its space efficiency and query time far exceed those of general-purpose algorithms; its disadvantages are a certain false-positive rate and the difficulty of removing elements.
To determine whether an element is in a set, the usual idea is to store all the elements and then compare. Data structures such as linked lists and trees follow this idea, but as the set grows they need more storage space and retrieval becomes slower (O(n) or O(log n)). A hash table, by contrast, can map an element through a hash function to a position in a bit array; we then only need to check whether that bit is 1 to know whether the element is in the set. This is the basic idea of the Bloom filter.
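A minimal Bloom filter sketch in Java (the bit-array size, the number of hash functions, and the double-hashing trick used to derive them are illustrative choices, not prescribed by the text above):

```java
import java.util.BitSet;

public class BloomFilter {
    private final BitSet bits;
    private final int size;      // length of the bit array
    private final int hashCount; // number of hash functions

    public BloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive the i-th bit position from two base hashes (double hashing),
    // a common way to simulate several independent hash functions.
    private int position(String item, int i) {
        int h1 = item.hashCode();
        int h2 = h1 >>> 16 | h1 << 16; // crude second hash: rotate h1
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String item) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(item, i));
        }
    }

    // false -> definitely not in the set
    // true  -> probably in the set (false positives possible)
    public boolean mightContain(String item) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(item, i))) {
                return false;
            }
        }
        return true;
    }
}
```

Used against penetration, the key is checked against the filter before the cache: a `false` from `mightContain` means the key certainly does not exist, so the request never reaches the cache or the database; a `true` means it probably exists, subject to a small false-positive rate.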
Cache avalanche
A cache avalanche happens when the cache server restarts or a large number of cache entries expire at the same time; in either case, the failure suddenly puts a lot of pressure on the backend system and the database.
Cache avalanche solutions
The root cause of a cache avalanche is large-scale cache failure: a large number of requests miss the cache and go to the backend services and databases, putting them under heavy stress.
If the system relies on many cached entries at startup, the cache can be pre-warmed by another service that puts the required data into the cache in advance, preventing the newly started system from sending a large number of requests directly to the backend service and database, as in the sketch below.
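A small pre-warming sketch, assuming an in-process map as the cache and a hypothetical `hotKeys` list and `loadFromDb` lookup: the hot data is loaded once, before the service starts taking traffic.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheWarmer {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Run once before the service starts serving requests: load the known
    // hot keys into the cache so the first wave of traffic does not all
    // fall through to the database.
    public void warmUp(List<String> hotKeys) {
        for (String key : hotKeys) {
            cache.put(key, loadFromDb(key));
        }
    }

    // Placeholder for the real database lookup (assumption).
    private String loadFromDb(String key) {
        return "value-for-" + key;
    }
}
```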
Since an avalanche comes from many cache entries failing at the same time, we can also optimize the cache expiration times so that entries do not all expire at the same point in time.
Concretely, when setting an expiration time, add a random few seconds to it, which keeps a large number of cache entries from expiring at the same moment.
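A one-function sketch of that jitter (the base TTL and jitter range are made-up values):

```java
import java.util.concurrent.ThreadLocalRandom;

public class TtlJitter {
    // Spread expirations out by adding a random offset to the base TTL,
    // so entries written together do not all expire at the same moment.
    public static long jitteredTtlSeconds(long baseTtlSeconds, long maxJitterSeconds) {
        long jitter = ThreadLocalRandom.current().nextLong(maxJitterSeconds + 1);
        return baseTtlSeconds + jitter;
    }

    public static void main(String[] args) {
        // e.g. a 10-minute TTL plus 0-60 seconds of jitter
        System.out.println(jitteredTtlSeconds(600, 60));
    }
}
```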
Reference
- github.com/weihanli/weihanli.redis/issues/2
- www.cnblogs.com/jinjiangongzuoshi/archive/2016/03/03/5240280.html
- 54135506
- 79459095
- baike.baidu.com/item/%e5%b8%83%e9%9a%86%e8%bf%87%e6%bb%a4%e5%99%a8/5384697?fr=aladdin
Contact
Contact me: weihanli@outlook.com