Cache Avalanche, Cache Penetration, Cache Warming, Cache Updates, Cache Degradation, and More

Source: Internet
Author: User
Tags: manual, redis

I. Cache Avalanche

A cache avalanche can be understood simply as follows: the original cache expires before the new cache is in place (for example, many keys were set with the same expiration time, so a large portion of the cache expires at the same moment). All the requests that should have hit the cache instead query the database, putting enormous pressure on the database's CPU and memory; in severe cases the database goes down, triggering a chain reaction that brings down the entire system.

Under normal conditions, the cache is read from Redis as follows (original schematic omitted):

At the moment the cache expires, requests fall through to the database (original schematic omitted).

The avalanche effect of mass cache expiration is terrible for the underlying system. Most system designers use locking or queueing to guarantee that large numbers of threads do not read from and write to the database at once, thereby preventing a flood of concurrent requests from falling onto the underlying storage system when the cache fails. There is also a simpler approach: spread out the cache expiration times. For example, we can add a random value, say 1 to 5 minutes, to the base expiration time, so that the overlap between expiration times is reduced and it becomes difficult to trigger a collective expiration event.
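The randomized-expiration idea above can be sketched in a few lines. This is a minimal illustration, not a complete implementation; the 1-5 minute jitter range follows the article's example and the base TTL is arbitrary:

```python
import random

def expire_seconds(base_seconds):
    """Base TTL plus a random 1-5 minute jitter, so keys set at the same
    moment do not all expire at the same moment."""
    return base_seconds + random.randint(60, 300)
```

Each key would then be written with its own jittered TTL, e.g. something like `redis.setex(key, expire_seconds(1800), value)` (the `setex` call here assumes a typical Redis client API).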

The following briefly describes two implementations in pseudo-code:

(1) When concurrency is not especially high, the most common solution is to lock and queue. Pseudo-code as follows:

public Object getProductListNew() {
    int cacheTime = 30;            // cache TTL; the value is illustrative
    String cacheKey = "product_list";
    String lockKey = cacheKey;

    String cacheValue = CacheHelper.get(cacheKey);
    if (cacheValue != null) {
        return cacheValue;
    } else {
        synchronized (lockKey) {
            // double-check: another thread may have rebuilt the cache while we waited
            cacheValue = CacheHelper.get(cacheKey);
            if (cacheValue != null) {
                return cacheValue;
            } else {
                // typically an SQL query against the database
                cacheValue = getProductListFromDb();
                CacheHelper.add(cacheKey, cacheValue, cacheTime);
            }
        }
        return cacheValue;
    }
}

Locking and queueing only relieves pressure on the database; it does not improve system throughput. Suppose that under high concurrency the key is locked while the cache is being rebuilt: of 1,000 incoming requests, 999 are blocked. Users end up waiting until they time out, so this is only a stopgap.

Note: what the lock/queue solution addresses is the concurrency problem in a distributed environment, and it may have to rely on a distributed lock; threads will still block and the user experience is poor. It is therefore seldom used in real high-concurrency scenarios.
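The distributed lock mentioned above is often built on Redis's SET-with-NX semantics: only the one caller that wins the lock rebuilds the cache. Below is a sketch under assumptions: `FakeRedis` is a tiny in-memory stand-in for a real Redis client, and `load_from_db` is a hypothetical loader:

```python
import time

class FakeRedis:
    """Minimal in-memory stand-in for Redis, for illustration only."""
    def __init__(self):
        self.store = {}   # key -> (value, expires_at)

    def set_nx_px(self, key, value, ttl_ms):
        """Emulates Redis `SET key value NX PX ttl`: set only if absent/expired."""
        now = time.time()
        entry = self.store.get(key)
        if entry is not None and entry[1] > now:
            return False
        self.store[key] = (value, now + ttl_ms / 1000.0)
        return True

    def delete(self, key):
        self.store.pop(key, None)

def rebuild_with_lock(redis, cache, key, load_from_db):
    """Only the caller that wins the lock queries the DB; the rest fall back
    to whatever (possibly stale) value is in the cache."""
    lock_key = key + ":lock"
    if redis.set_nx_px(lock_key, "1", ttl_ms=3000):  # lock auto-expires
        try:
            cache[key] = load_from_db()              # the expensive DB query
        finally:
            redis.delete(lock_key)
        return cache[key]
    # lost the race: return the cached value (may be None), or retry shortly
    return cache.get(key)
```

The auto-expiring lock TTL (3 seconds here, arbitrary) guards against a crashed holder blocking everyone forever; this is why locked rebuilds still hurt latency for the losers.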

(2) Another solution is to attach a corresponding cache tag to each piece of cached data, recording whether the cache has expired; if the tag has expired, the data cache is updated in the background. Example pseudo-code:

public Object getProductListNew() {
    int cacheTime = 30;            // tag TTL; the value is illustrative
    String cacheKey = "product_list";
    // cache tag key
    String cacheSign = cacheKey + "_sign";

    String sign = CacheHelper.get(cacheSign);
    // get the cached value
    String cacheValue = CacheHelper.get(cacheKey);
    if (sign != null) {
        return cacheValue;  // tag not expired, return directly
    } else {
        CacheHelper.add(cacheSign, "1", cacheTime);
        ThreadPool.queueUserWorkItem(arg -> {
            // typically an SQL query against the database
            cacheValue = getProductListFromDb();
            // the data cache lives twice as long as the tag, allowing stale reads
            CacheHelper.add(cacheKey, cacheValue, cacheTime * 2);
        });
        return cacheValue;
    }
}

Explanatory notes:

1. Cache tag: records whether the cached data has expired; when it expires, a background thread is notified to update the actual key's cache.

2. Cache data: its expiration time is twice that of the cache tag. For example, if the tag is cached for 30 minutes, the data is cached for 60 minutes. When the tag key expires, the actual cache can still return the old data to the caller until the background thread finishes updating and writes the new value.

To work around cache avalanches, three approaches have been covered here: using a lock or queue, setting an expiration tag that triggers a cache update, and setting different expiration times for different keys. There is also an approach called "second-level cache," which interested readers can study on their own.

II. Cache Penetration

Cache penetration occurs when a user queries data that does not exist in the database and therefore naturally is not in the cache either. Each such query misses the cache, falls through to the database, and comes back empty (effectively two useless lookups). These requests bypass the cache and hit the database directly, which is often described as a cache hit-rate problem.

There are many ways to effectively solve cache penetration. The most common is a Bloom filter: hash all the data that could possibly exist into a sufficiently large bitmap, so a query for data that definitely does not exist is intercepted by the bitmap, avoiding query pressure on the underlying storage system.
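The bitmap filter described above can be sketched as a minimal Bloom filter. This is an illustration only: the bit-array size, number of hash functions, and the MD5-based hashing are arbitrary choices, not a recommendation for production:

```python
import hashlib

class BloomFilter:
    """A minimal Bloom filter sketch: no false negatives, rare false positives."""
    def __init__(self, size_bits=8192, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key):
        # derive several bit positions per key by salting the hash input
        for i in range(self.num_hashes):
            h = hashlib.md5(f"{i}:{key}".encode()).hexdigest()
            yield int(h, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means definitely absent; True means possibly present
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))
```

At startup the filter would be populated with every valid key (e.g. all product IDs); a request whose key fails `might_contain` can be rejected before touching the cache or the database.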

There is also a more straightforward method: if a query returns an empty result (whether because the data does not exist or because of a system failure), we still cache the empty result, but with a very short expiration time, five minutes at most. Storing this default value in the cache means the second lookup gets the value from the cache and does not hit the database again. This method is the simplest and bluntest.

public Object getProductListNew() {
    int cacheTime = 30;            // cache TTL; the value is illustrative
    String cacheKey = "product_list";

    String cacheValue = CacheHelper.get(cacheKey);
    if (cacheValue != null) {
        return cacheValue;
    } else {
        // database query; may return null
        cacheValue = getProductListFromDb();
        if (cacheValue == null) {
            // if the result is empty, set a default value and cache it too
            cacheValue = "";
        }
        CacheHelper.add(cacheKey, cacheValue, cacheTime);
        return cacheValue;
    }
}

Because the empty result is also cached, the next identical request can return empty directly, avoiding cache penetration when the queried value is empty. You can also set up a separate cache area to store null values and pre-check the queried key against it before releasing the request to the normal cache-processing logic.

III. Cache Warming

Cache warming should be a fairly common concept, and many readers will find it easy to understand: when the system goes live, the relevant data is loaded into the cache system directly. This avoids the pattern of querying the database on the first user request and only then caching the data; users query pre-warmed cache data from the start.
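A startup warming routine can be as simple as one bulk load. In this sketch, `load_from_db` is a hypothetical bulk loader returning a `{key: value}` mapping, and the cache is modeled as a plain dict:

```python
def warm_cache(cache, hot_keys, load_from_db):
    """At startup, bulk-load the hot keys so the first user requests
    hit the cache instead of the database."""
    rows = load_from_db(hot_keys)   # one bulk query instead of N cache misses
    for key, value in rows.items():
        cache[key] = value
    return len(rows)                # number of keys pre-warmed
```

The same routine could back a manual "refresh cache" page or a periodic job, matching the three solution ideas listed below.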

Solution ideas:

1. Write a cache-refresh page and run it manually when going live;

2. If the data volume is not large, load the cache automatically when the project starts;

3. Refresh the cache periodically.

IV. Cache Updates

In addition to the cache invalidation policies that come with the caching server (Redis, for example, ships with six eviction policies to choose from), we can also customize cache eviction based on specific business requirements. Two strategies are common:

(1) Periodically clean up expired cache entries;

(2) When a user request arrives, check whether the cache it uses has expired; if so, go to the underlying system for fresh data and update the cache.
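Strategy (2), expiration checked lazily on read, can be sketched as follows. The dict-based store and the `load_from_db` callback are illustrative stand-ins, not a real cache client:

```python
import time

class LazyCache:
    """Check expiry on each read; reload from the DB when stale or missing."""
    def __init__(self, load_from_db, ttl_seconds=60):
        self.load_from_db = load_from_db
        self.ttl = ttl_seconds
        self.store = {}   # key -> (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry is not None and entry[1] > time.time():
            return entry[0]                    # still fresh: serve from cache
        value = self.load_from_db(key)         # expired or missing: reload
        self.store[key] = (value, time.time() + self.ttl)
        return value
```

Note how the expiry check runs on every `get`; that per-request logic is exactly the complexity cost the comparison below attributes to this strategy.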

Each has advantages and drawbacks. The drawback of the first is that maintaining a large number of cache keys is troublesome; the drawback of the second is that every user request must check for cache expiration, so the logic is relatively complex. Weigh them against your own application scenario.

V. Cache Degradation

When traffic surges, a service has problems (such as slow or unresponsive response times), or a non-core service affects the performance of the core flow, the service must still be kept available, even in a degraded (lossy) form. The system can degrade automatically based on some key metrics, or switches can be configured for manual degradation.

The ultimate goal of degradation is to keep core services available, even if lossy. Some services cannot be degraded (such as adding to the shopping cart, or checkout).
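A degradation switch combining automatic and manual control might look like the sketch below. This is an assumption-laden illustration: the 90% availability threshold echoes the "Error" level described further on, the sliding window size is arbitrary, and `fetch_recommendations` is a hypothetical non-core service:

```python
class DegradeSwitch:
    """Auto-degrade when recent availability drops below a threshold;
    also supports an operator-controlled manual switch."""
    def __init__(self, threshold=0.90, window=100):
        self.threshold = threshold
        self.window = window
        self.results = []          # recent call outcomes (True = success)
        self.manual_off = False    # manual degradation switch

    def record(self, success):
        self.results.append(success)
        if len(self.results) > self.window:
            self.results.pop(0)    # keep only the sliding window

    def degraded(self):
        if self.manual_off:
            return True
        if len(self.results) < self.window:
            return False           # not enough data to judge yet
        return sum(self.results) / self.window < self.threshold

def fetch_recommendations(switch, call_service):
    """Return the real result when healthy, a safe lossy fallback when degraded."""
    if switch.degraded():
        return []                  # lossy fallback: empty recommendations
    return call_service()
```

The core flow (cart, checkout) would never consult such a switch; only degradable, non-core calls are wrapped this way.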

Before degrading, the system should be reviewed to decide what can be sacrificed to protect what matters most, that is, which services must be protected and which can be degraded. For example, you can refer to a log-level style setup plan:

(1) General: for example, some services occasionally time out because of network jitter or because a service is going online; these can be degraded automatically;

(2) Warning: the success rate of some services fluctuates over a period of time (for example, between 95% and 100%); these can be degraded automatically or manually, and an alert is sent;

(3) Error: for example, availability drops below 90%, the database connection pool is exhausted, or traffic suddenly spikes to the maximum threshold the system can withstand; depending on the situation, degrade automatically or manually;

(4) Severe error: for example, data is wrong for some special reason; this requires urgent manual degradation.

VI. Summary

The above are problems you may encounter in real projects, and they are also knowledge points often asked about in interviews. In practice there are many more problems of every kind, and the solutions in this article cannot cover every scenario; they are relatively basic starting points. Real business scenarios tend to be more complex, and different scenarios call for different methods and solutions. Because the scenarios and problems above are not comprehensive, they do not apply directly to production development, but they work as a conceptual introduction; the specific solution should be determined by the actual situation.

