Cache Server Design and Implementation (4)

This article focuses on one problem: what happens when the cache is full. A cache is normally configured with a capacity; neither the memory cache nor the disk cache can grow without limit. Take the disk cache as an example: what should be done when the configured quota is used up? In nginx, if caching is enabled, you can see a dedicated process in the `ps` output: the cache manager process. Its main job is to delete objects when files expire or when disk space runs short.

So how does deletion work? Each cached object has a maximum validity period; once that time passes, the object is considered invalid and must be deleted. The nginx cache system maintains an LRU queue whose elements are the same nodes that represent objects in the rbtree. Every time an object is hit, its node is detached from the queue and re-inserted at the head, so recently accessed nodes stay near the front while cold objects gradually sink toward the tail. This is the basic LRU idea (a well-known weakness of which is that it cannot accurately judge how hot or cold a file really is; the object at the tail is not necessarily the coldest). The cache manager process scans from the tail of the queue looking for expired files. Clearly, if the node at the tail has not expired, nothing earlier in the queue has either, so nginx sets a timer and checks again later. If, however, the cache is full and no file has expired, some cold files must still be deleted, and there is no alternative but to let LRU decide which. Deletion again starts from the tail of the LRU queue: if the current file is no longer in use, it can be deleted directly; if a request is still using it, the manager looks forward for the next deletable object. It normally tries up to 20 times.
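The mechanism above can be sketched in a few dozen lines of C. This is a minimal illustration of the idea, not nginx's actual code; all names (`cache_node`, `lru_touch`, `evict_pass`, and so on) are invented for the example, and the real implementation also unlinks the disk file and updates the rbtree.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical node: in nginx the LRU queue element and the rbtree
 * node describe the same cached object. */
typedef struct cache_node {
    struct cache_node *prev, *next;
    time_t expires;      /* object becomes invalid after this time */
    int    refs;         /* > 0 means a request is still using it  */
    const char *key;
} cache_node;

typedef struct { cache_node *head, *tail; } lru_queue;

static void lru_detach(lru_queue *q, cache_node *n) {
    if (n->prev) n->prev->next = n->next; else q->head = n->next;
    if (n->next) n->next->prev = n->prev; else q->tail = n->prev;
    n->prev = n->next = NULL;
}

static void lru_push_head(lru_queue *q, cache_node *n) {
    n->prev = NULL; n->next = q->head;
    if (q->head) q->head->prev = n; else q->tail = n;
    q->head = n;
}

/* On every cache hit: detach the node and re-insert it at the head,
 * so cold objects gradually sink toward the tail. */
static void lru_touch(lru_queue *q, cache_node *n) {
    lru_detach(q, n);
    lru_push_head(q, n);
}

/* Manager pass: walk from the tail and delete expired, unreferenced
 * entries. In forced mode (cache full) also delete cold-but-valid
 * entries, giving up after examining 20 nodes. Returns the number of
 * nodes removed from the queue. */
static int evict_pass(lru_queue *q, time_t now, int forced) {
    int deleted = 0, tries = 20;
    cache_node *n = q->tail;
    while (n && tries-- > 0) {
        cache_node *prev = n->prev;
        if (n->refs == 0 && (forced || n->expires <= now)) {
            lru_detach(q, n);
            deleted++;          /* real code also removes the file */
        } else if (!forced && n->expires > now) {
            break;              /* tail not expired: rest is fresh  */
        }
        n = prev;
    }
    return deleted;
}
```

Note how the non-forced pass can stop at the first unexpired node: because every hit moves a node to the head, anything closer to the head was touched more recently and cannot have expired earlier than the tail.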
If no deletable target is found after 20 attempts, the manager waits a while before trying again. How far should deletion go? Generally, until disk usage drops back below the quota, though other criteria can be set, such as deleting 10% of the total. One more note: each cache object has a corresponding in-memory control structure allocated from shared memory, so allocation can fail when memory runs short, and some rescue measure is then needed. nginx's answer is again LRU eviction: although LRU deletes files, the control structures in shared memory are freed at the same time, which relieves the memory pressure. That is essentially the whole picture; the rest is code detail, and interested readers can consult the nginx source.

Now for the second problem: a cache server goes down, whether through a crash or planned maintenance. When the machine brings the cache service back up, it faces an unpleasant situation. It had been running for a long time and had cached huge numbers of files, perhaps several hundred GB or even terabytes in volume. If it starts serving immediately, all of that previous cache is unavailable (strictly speaking, invisible), producing a flood of back-to-origin requests. If the customer's origin site collapses under the load, expect the customer to come settle accounts with you. The key question is how to rebuild the previous cache. All the files are in fact still on disk; what is missing is the control information in memory, and without it the cache system cannot be connected to the actual disk files. In nginx, this arduous rebuilding task is handled by the "cache loader process", which can likewise be seen in the `ps` output. The principle is simple. Recall the example from the previous article:

/Cache/0/8d/8ef9229f02c5672c747dc7a324d658d0
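The path above follows nginx's two-level directory scheme (`levels=1:2` in `proxy_cache_path`): the file name is the 32-character md5 hex of the cache key, the first directory level is its last character, and the second level is the two characters before that. A small sketch, with an invented helper name, shows how such a path can be derived:

```c
#include <stdio.h>
#include <string.h>

/* Build the on-disk path for a 32-char md5 hex name under a
 * levels=1:2 layout: /<root>/<last char>/<prior two chars>/<md5>.
 * cache_file_path is an illustrative name, not an nginx function. */
static void cache_file_path(const char *root, const char *md5hex,
                            char *out, size_t outlen) {
    size_t len = strlen(md5hex);                 /* expected: 32 */
    snprintf(out, outlen, "%s/%c/%c%c/%s",
             root,
             md5hex[len - 1],                    /* level 1 dir  */
             md5hex[len - 3], md5hex[len - 2],   /* level 2 dir  */
             md5hex);
}
```

Running this on the hash from the example reproduces the path shown above, which is also why the loader can validate candidate files cheaply: the name itself encodes where the file must live.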

While running, the cache loader process traverses every subdirectory and file under the cache directory. Each file needs additional verification to determine whether it has the characteristics of a normal cache file: for example, whether the file name is 32 characters long (the hash string that names the file in this example), and whether the file meets a minimum length (in nginx, a cache file always begins with control information of a fixed size). Files that nginx judges invalid are deleted. When nginx finds a valid file, it registers it with the cache system: it allocates the control structure and adds it to the rbtree and the LRU queue. Insertion into the rbtree is especially convenient because the file name is exactly the keyword the cache originally used for insertion and lookup.

Once the cache loader process has finished rebuilding the cache, it exits, since it has no further value. Another interesting point: if there are many files, reconstruction can be slow, so nginx uses three variables to throttle the loader: loader_threshold, loader_files, and loader_sleep. If the loader process has been working continuously for loader_threshold, it takes a rest; if the number of rebuilt files reaches loader_files, it also takes a rest as a reward. The rest period is loader_sleep. All three variables can be set in the configuration.

One important point to emphasize: while the loader process rebuilds the cache, the nginx workers are serving requests at the same time. Does the loader interfere with normal service? The loader process and the ordinary workers are all forked from the master process, so they all see the same shared memory, and since the cache metadata for the whole system lives in shared memory, access to this shared resource must be mutually exclusive. In the loader, inserting an object into the cache follows the same procedure as in a worker: acquire the lock first, then search, then insert. So which process actually registers a given object depends entirely on which of the loader and the worker wins the race for the lock. The loser, when it later acquires the lock, finds that the object has already been registered and simply skips the insertion.
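The loader's pacing can be reduced to a single decision: after each file, rest if either the burst's file count or its elapsed working time has hit the configured limit. The sketch below models that decision; the struct and function names are invented, but the three parameters mirror the real `loader_files`, `loader_threshold`, and `loader_sleep` settings of `proxy_cache_path`:

```c
/* Hypothetical mirror of nginx's loader throttling parameters. */
typedef struct {
    int  loader_files;         /* files per burst before resting      */
    long loader_threshold_ms;  /* max working time per burst, in ms   */
    long loader_sleep_ms;      /* how long to rest between bursts     */
} loader_conf;

/* Return 1 if the loader should now pause for loader_sleep_ms:
 * either it has processed loader_files entries in this burst, or the
 * burst has been running for at least loader_threshold_ms. After the
 * rest, both counters reset and a new burst begins. */
static int loader_should_rest(const loader_conf *cf,
                              int files_this_burst, long elapsed_ms) {
    return files_this_burst >= cf->loader_files
        || elapsed_ms >= cf->loader_threshold_ms;
}
```

Either limit alone is enough to trigger a rest, which is the point of having both: `loader_files` bounds the work done per burst when files are small and fast to register, while `loader_threshold` bounds the time spent when individual files are slow to process.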

In general, many of nginx's mechanisms are simple and clear, and they carry over its characteristic efficiency. From the perspective of a professional cache, many features are still incomplete, but there is plenty worth borrowing. If you want to design and implement a cache of your own and the functional requirements are not too demanding, the nginx model is a good reference. One more mechanism important to cache design is expiration control. It matters a great deal, and anyone with even a rough understanding of caching can appreciate why. I will discuss it in a later article.
