Memcached is an open-source distributed caching system. Many large web applications, including Facebook, YouTube, Wikipedia, and Yahoo, use memcached to serve hundreds of millions of pages every day. By integrating a cache layer into their web architecture, these applications improve performance while greatly reducing the load on the database.
If you are not familiar with memcached, read the following link:
- Use memcached to build high-performance Web Applications
This article focuses on how to efficiently manage the cache stored in memcached. In our actual project, because the website is updated frequently, some cache entries inevitably become out of sync with the database while remaining technically valid, and the site administrator then has to clear one or more caches by hand.
For example, on Dianping every merchant page is cached, with the cache object stored under the key shop.[shopid]. If the page cache for merchant ID 1234 goes stale, we can call remove("shop.1234") to tell memcached to clear the corresponding cache object.
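The single-key pattern above can be sketched as follows. This is a minimal illustration, not the project's actual code: a plain dict stands in for the memcached client, and the `shop_key` helper name is hypothetical.

```python
# Stand-in for a memcached client: a plain dict plays the role of the cache.
cache = {}

def shop_key(shop_id):
    """Build the cache key for a merchant page, e.g. 'shop.1234'."""
    return "shop.%d" % shop_id

# Store a merchant page, then explicitly invalidate just that one entry.
cache[shop_key(1234)] = "<html>merchant 1234 page</html>"
cache.pop(shop_key(1234), None)   # analogous to remove("shop.1234")
```

With a real client library the last line would be a `delete`/`remove` call against the server, but the key-construction convention is the same.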
But a problem comes with this approach. Sometimes we find that a whole batch of caches has gone bad and will not expire in time, and then we face a dilemma:
- Clear all cache via flush_all
This is crude and brutal... it wipes out the bad entries and the good ones together...
- Find all the affected key values in the cache and clean them up one by one.
This sounds better, but there is no good way to do it. Because memcached is a hash-table structure, it cannot offer SELECT ... FROM ... WHERE queries the way SQL can, so you have no way of knowing which keys are currently stored in the cache.
We obviously will not settle for the brute-force approach; otherwise this article would not be needed. So let's see whether there is a feasible way to implement option 2.
LogDB Solution
This solution is simple. Since memcached has no SELECT operation, we use a database (the "logDB") to record every cache set operation: whenever memcached stores a cache object into an instance, we also write a record into the logDB.
When we want to delete a batch, we run a SELECT against the logDB to find all shop.xxxx keys and perform a remove operation on each one.
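The logDB idea can be sketched like this. As an assumption for illustration, an in-memory SQLite table plays the logDB and a dict plays the memcached instance; `cached_set` and `remove_by_prefix` are hypothetical helper names, not an API from any real client.

```python
import sqlite3

cache = {}  # stand-in for the memcached instance
logdb = sqlite3.connect(":memory:")
logdb.execute("CREATE TABLE cache_log (key TEXT PRIMARY KEY)")

def cached_set(key, value):
    """Store into the cache and record the key in the log DB."""
    cache[key] = value
    logdb.execute("INSERT OR REPLACE INTO cache_log (key) VALUES (?)", (key,))

def remove_by_prefix(prefix):
    """SELECT all logged keys with the prefix, then remove them one by one."""
    rows = logdb.execute(
        "SELECT key FROM cache_log WHERE key LIKE ?", (prefix + "%",)
    ).fetchall()
    for (key,) in rows:
        cache.pop(key, None)   # analogous to remove(key) on memcached
        logdb.execute("DELETE FROM cache_log WHERE key = ?", (key,))

cached_set("shop.1001", "page A")
cached_set("shop.1002", "page B")
cached_set("user.42", "profile")
remove_by_prefix("shop.")   # batch-clears only the shop.* entries
```

Note that in production every `cached_set` costs an extra database write, which is exactly the bottleneck the next paragraph warns about.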
Although this method achieves precise deletion, the cost is not small: just to track keys, you have to stand up an extra database. Moreover, cache traffic is very high, and the resulting stream of INSERT operations against that database may become another performance bottleneck. Use this solution with caution when hardware resources are limited.
Custom KeyLocator Solution
Readers familiar with memcached know that the instance a cache object is assigned to is determined by the memcached client. The client's default allocation algorithm is a hash (SHA-1 in our client), so by default the merchant cache objects (shop.xxxx) are hashed across different memcached instances:
(X: user100, shop202; Y: user101, shop200; Z: user102, shop201)
If we could customize this allocation algorithm so that all shop.xxxx cache objects are stored on one instance (or a few dedicated instances), then we could clear those dedicated instances to achieve batch clearing:
(X: user100; Y: shop200, shop201, shop202; Z: user101, user102)
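A custom locator along these lines might look like the sketch below. This is an assumption-laden illustration: three dicts stand in for the instances X, Y, Z, SHA-1 is used as the hash per the description above, and `custom_locate`, `put`, and `flush_shops` are hypothetical names, not part of any real memcached client API.

```python
import hashlib

instances = {"X": {}, "Y": {}, "Z": {}}  # three stand-in memcached instances

def custom_locate(key):
    """Custom locator: pin every 'shop.' key to the dedicated instance Y;
    hash all other keys across the remaining instances."""
    if key.startswith("shop."):
        return "Y"
    names = ["X", "Z"]
    h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return names[h % len(names)]

def put(key, value):
    instances[custom_locate(key)][key] = value

def flush_shops():
    """Batch-clear all merchant caches by flushing only the dedicated
    instance, leaving the other instances untouched."""
    instances["Y"].clear()

put("shop.200", "menu page")
put("shop.201", "menu page")
put("user.100", "profile page")
```

In a real deployment this would be a pluggable node-locator in the client library; the trade-off, as discussed next, is that instance Y now carries all shop.* load.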
This solution seems good, but there are still many derivative problems:
From the two examples above, the hash algorithm stores two objects on each instance, but with the custom method instance Y holds three objects while X holds only one. If the distribution of key usage is not carefully considered before implementing the custom locator, instance utilization may become very uneven.
Some popular classes of keys have very high read rates. Under hashing they were spread across all servers, so the read load was naturally divided among them. With a dedicated server, all of those reads are concentrated on one machine, which undoubtedly increases its burden.
Key flag Solution
This is one of the simplest and most effective solutions: achieve lazy invalidation by tagging the key. This approach also suits memcached's appetite well; as mentioned in earlier articles, the memcached server's own cleanup policy is exactly this kind of lazy cleanup.
Let's take a look at the actual example:
We append a version tag to every key: "shop.200_1" is the cache for merchant ID 200 at version 1. Whenever the merchant caches need to be cleared, we simply bump the version number used for shop keys.
In the example above, the merchant cache version has been bumped to 2, so the key used for reads has changed as well. The original shop.xxxx_1 entries will never be read again, which makes them effectively invalid.
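The versioned-key scheme can be sketched as follows. Again a dict stands in for memcached, and the helper names are hypothetical; in practice the current version number would itself be stored somewhere shared (for example in memcached or a config store) rather than in a module-level variable.

```python
cache = {}         # stand-in for memcached
shop_version = 1   # current version tag for all shop.* keys

def versioned_key(shop_id):
    """Build a version-tagged key, e.g. 'shop.200_1'."""
    return "shop.%d_%d" % (shop_id, shop_version)

def put_shop(shop_id, value):
    cache[versioned_key(shop_id)] = value

def get_shop(shop_id):
    return cache.get(versioned_key(shop_id))

def invalidate_all_shops():
    """Bump the version: old shop.*_1 entries become unreachable and are
    left for memcached's LRU eviction to reclaim later."""
    global shop_version
    shop_version += 1

put_shop(200, "menu page v1")
```

No delete is ever issued; invalidation is just a version bump, and the stale entries sit unreachable until LRU evicts them.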
Another advantage is that no batch delete operation, with all its maintenance cost, is needed at all. The out-of-date versions follow the LRU principle: once the cache fills up, they are evicted automatically, without us having to lift a finger.
Hence a saying in the memcached community: if an entry is no longer needed, don't worry about it... it will simply disappear.
If this solution has a drawback, it is that keys grow longer because they must carry a version number; the performance impact of that, however, is minimal.
Batch cleanup in memcached has always been a troublesome issue. We hope the discussion of the solutions above gives you some inspiration.