Analysis of distributed system cache design

Source: Internet
Author: User

A few days ago I attended a tech talk on distributed caching given by Peng chunda from our department, and I found it very rewarding.

The slides (PPT) from the talk were embedded here in the original post.

The subtitle of the talk was "simple things are never simple", and it rings true. A cache looks simple, but building a "good" cache system takes a great deal of care.

My own takeaways:

1. Three major problems a distributed cache faces:

(1) Data consistency.

This is especially important in distributed systems. It has three main aspects:

Consistency between the cache system and the underlying data. This matters most when the underlying system is both readable and writable.

Consistency between caches that are in a hierarchical relationship. To maximize the cache hit rate, caches are often layered: a global cache and second-level caches. Because the global cache can be composed from second-level caches, the two layers must agree with each other.

Consistency between multiple cached copies. To keep the system highly available, the cache system usually sits on two storage systems (such as memcache and redis). The slides above also focus on this aspect.

(2) Cache avalanche

A cache avalanche happens when the cache system restarts, or when all cached entries expire at the same time, and the application fails because it cannot withstand the resulting load. For example, some systems load most of their data into the cache at startup to speed things up; if every entry gets a 24-hour cache time, they all expire together 24 hours later, and the result is a tragedy.
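One common way to soften the "everything expires at once" case is to add a random offset to each cache time, so entries loaded together do not expire in the same instant. The talk, as far as my notes go, does not prescribe this; the sketch below is only an illustration, and `jitteredTtl`, `baseTtl` and `spread` are names I made up.

```javascript
// Hypothetical helper: spread expirations over [baseTtl, baseTtl + spread).
// `random` is injectable so the behavior can be checked deterministically.
function jitteredTtl(baseTtl, spread, random = Math.random) {
  return baseTtl + Math.floor(random() * spread);
}

// Example: entries warmed at startup get a TTL between 24h and 25h (seconds).
const DAY = 24 * 60 * 60;
const HOUR = 60 * 60;
const ttls = ['user:1', 'user:2', 'user:3'].map(() => jitteredTtl(DAY, HOUR));
```

With a one-hour spread, the reload traffic that would have arrived in a single moment is distributed over an hour instead.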

(3) Cache penetration

Cache penetration happens when requests query data that does not exist. Because the key is never in the cache, every such request goes through to the database. If someone does this maliciously, the load is likely to hit the database directly.
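A frequently used countermeasure is to cache the "not found" result itself for a short time, so repeated lookups of a missing key stop reaching the database. This is not something taken from the talk; it is a minimal sketch under my own assumptions, where `makeLookup`, `loadFromDb`, `negativeTtlMs` and `MISSING` are hypothetical names and `loadFromDb` stands in for the real database query.

```javascript
// Sentinel stored in the cache to remember that a key does not exist.
const MISSING = Symbol('missing');

// Hypothetical read-through lookup. `loadFromDb` returns undefined when
// the requested data does not exist.
function makeLookup(loadFromDb, negativeTtlMs, now = Date.now) {
  const cache = new Map(); // key -> { value, expiresAt }
  return function lookup(key) {
    const hit = cache.get(key);
    if (hit && hit.expiresAt > now()) {
      return hit.value === MISSING ? undefined : hit.value;
    }
    const value = loadFromDb(key);
    if (value === undefined) {
      // Remember the miss briefly so repeated lookups skip the database.
      cache.set(key, { value: MISSING, expiresAt: now() + negativeTtlMs });
    } else {
      // Found values never expire in this sketch, to keep it short.
      cache.set(key, { value, expiresAt: Infinity });
    }
    return value;
  };
}
```

The negative entry should have a short lifetime, otherwise data created later would stay invisible for too long.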

The first problem is about the correctness and freshness of data, while the second and third are more about performance. Also, a cache is not always necessary, especially when writes are very frequent.

2. Cache data eviction

Traditionally, cached data is evicted by setting a cache time (TTL). For example, if I set the cache time of an entry to 24 hours, it will not expire for the next 24 hours. The advantage is simplicity; the obvious disadvantage is that it is inflexible and does not allow fine-grained management.

The tools we can use are: 1. tags attached to cache entries; 2. a version number (it must increase monotonically, and a timestamp is the best choice); 3. an interface for clearing the cache manually.
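To make the three tools concrete, here is one way I imagine they could fit together: each tag remembers the timestamp of its last manual clear, and an entry counts as stale if any of its tags was cleared at or after the entry was written. This is a sketch under my own assumptions, not the implementation from the slides; `TaggedCache` is a hypothetical name, and the `offset` and `flush` parameters of `tagrm` are left out because my notes do not explain them.

```javascript
// In-memory sketch; a real deployment would keep both maps in the cache store.
class TaggedCache {
  constructor(now = Date.now) {
    this.now = now;
    this.entries = new Map();    // key -> { i, e, v, t }
    this.tagVersion = new Map(); // tag -> timestamp of the last clear
  }

  set(key, value, ttl, tags = []) {
    const i = this.now();
    this.entries.set(key, { i, e: i + ttl, v: value, t: tags });
  }

  get(key) {
    const rec = this.entries.get(key);
    if (!rec || rec.e <= this.now()) return undefined;
    // Stale if any tag was cleared at or after the entry was written.
    const stale = rec.t.some(
      (tag) => (this.tagVersion.get(tag) ?? -Infinity) >= rec.i
    );
    return stale ? undefined : rec.v;
  }

  // Manual cleaning: one write invalidates every entry carrying `tag`.
  tagrm(tag) {
    this.tagVersion.set(tag, this.now());
  }
}
```

Because the version is a timestamp, clearing a tag never has to enumerate the keys that carry it, which is where the monotonic requirement pays off.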

For more information, see the above PPT.

Cache-related interfaces:

  var me = Cache.create(...);
  me.set(key, value, ttl, tags);
  me.get(key);
  me.tagrm(tag, offset, flush);

The cached data structure is as follows:

  var data = {
    'i': now,        /* data write timestamp */
    'e': now + ttl,  /* expected expiration time */
    'k': key,        /* original key */
    'v': value,      /* original value */
    't': tags        /* tag list */
  };

At first I did not understand why the "original key" had to be stored in the data. After a reminder from Peng Chun, I realized that the original key may be too long or contain special characters, so it cannot always be used directly as the key in some storage systems. The original key is therefore often hashed, and the hash serves as the actual cache key.
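A small sketch of that idea follows, assuming Node.js and its built-in `crypto` module. MD5 is only an example hash (the talk did not name one), and `storageKey`, `makeRecord` and `readRecord` are hypothetical helper names. Comparing the stored `k` with the requested key on read is one plausible use of the field: it catches the rare case where two original keys hash to the same value.

```javascript
const crypto = require('crypto');

// Derive a fixed-length, store-safe key from an arbitrary original key.
function storageKey(originalKey) {
  return crypto.createHash('md5').update(originalKey, 'utf8').digest('hex');
}

// Build the record with the fields of the data structure shown above.
function makeRecord(key, value, ttl, tags, now) {
  return { i: now, e: now + ttl, k: key, v: value, t: tags };
}

// On read, check that the stored original key matches the requested one.
function readRecord(store, key) {
  const rec = store.get(storageKey(key));
  return rec && rec.k === key ? rec.v : undefined;
}
```

Here `store` can be any object with a `get` method, for example a `Map` standing in for the real cache backend.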

3. Cache eviction policy

There are two cache eviction policies:

(1) Clear expired cache entries periodically.

(2) When a user request arrives, check whether the cache entry it uses has expired. If it has, fetch new data and update the cache.

Each has its own advantages and disadvantages. The drawback of the first is that maintaining a large number of cached keys is troublesome. The drawback of the second is that every user request has to check whether the cache has expired, which makes the logic more complicated. Weigh the options against your own application scenario.
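The two policies can be sketched side by side as below. This is an illustration under my own assumptions rather than code from the talk: entries live in a plain `Map` and carry the expiration field `e`, and `sweepExpired`, `getOrRefresh` and `loadFresh` are hypothetical names.

```javascript
// Policy 1: a periodic sweep walks every key and deletes expired entries.
function sweepExpired(entries, now) {
  let removed = 0;
  for (const [key, rec] of entries) {
    if (rec.e <= now) {
      entries.delete(key); // deleting during Map iteration is safe
      removed++;
    }
  }
  return removed;
}

// Policy 2: check expiry on each request and refresh on demand.
// `loadFresh` is a hypothetical callback that fetches the current value.
function getOrRefresh(entries, key, ttl, now, loadFresh) {
  const rec = entries.get(key);
  if (rec && rec.e > now) return rec.v;
  const value = loadFresh(key);
  entries.set(key, { i: now, e: now + ttl, v: value });
  return value;
}
```

The sweep would typically be driven by a timer, for example `setInterval(() => sweepExpired(entries, Date.now()), 60000)`, while `getOrRefresh` sits on the request path. The loop over all keys in the first function is exactly the maintenance cost mentioned above, and the expiry check in the second is the extra per-request logic.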
