Memcache Internal Anatomy


Memcache is a very well-known system in the web community, and for good reason: it is fast, stable, lightweight, and installing Memcache on your web server seems to automatically make website access about 10 times faster. While that may sound a bit magical, a well-tuned caching strategy really is that useful for websites and apps. If you just want to know how to apply memcache to your website, unfortunately this article will not teach you that. Instead, we will look under the hood and see what makes Memcache so magical.

Although Memcache itself is not very complex software, it has many good features, and covering them all would take a long time. I will focus on four areas:

    1. Big-O

    2. LRU

    3. Memory allocation

    4. Consistent hashing


Big-O

Most Memcache functions (add, get, set, flush) have a time complexity of O(1). This means they are constant-time functions: no matter how many objects are in the cache, these functions take the same amount of time as they would with only one piece of data in the cache. For more information on Big-O, see this blog post (https://www.adayinthelifeof.nl/2009/12/21/big-o-notation/). Keeping everything O(1) has a clear benefit (it keeps Memcache fast), but it also has drawbacks. For example, you cannot iterate over all objects in memcache (and if you want to, you are probably abusing memcache, but that is another topic). Iterating would be an O(N) operation, meaning that if the number of cached objects doubles, the time spent doubles as well. This is precisely why memcache does not support iteration.
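
To make this concrete, here is a minimal sketch using PHP's Memcached extension (an assumption: the extension is installed and a memcached server listens on localhost:11211). Whether the cache holds ten objects or ten million, each of these calls costs roughly the same amount of time.

    <?php
    // Minimal sketch of the common O(1) operations, using PHP's Memcached extension.
    $cache = new Memcached();
    $cache->addServer('localhost', 11211);

    $cache->set('user:42', array('name' => 'Alice'), 3600); // store a value for one hour
    $cache->add('counter', 0);                              // store only if the key does not exist yet
    $user = $cache->get('user:42');                         // fetch a single key
    $cache->flush();                                        // empty the whole cache

    // Note that there is deliberately no call to iterate over all stored keys.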

LRU algorithm

Before you start a memcache daemon, you tell it how much memory it may use. Memcache allocates that memory right at startup, so if you give it 1GB for cached data, that 1GB is claimed directly and cannot be used for anything else (like Apache or a DB instance). But since we have told Memcache how much memory it may use, it can eventually fill all of that memory with our data. So what happens when we need to add more data?

Perhaps you already know that Memcache deletes old data to make room for new data, but it needs to know which data objects it can delete. Is it the largest object we recently put in? Or the first object we cached, which would give us a FIFO-based cache? In reality, Memcache uses a more refined technique called LRU: least recently used. Simply put, it deletes the objects that have gone unused for the longest time. That is not necessarily the largest object, and it may not even be the first object that was cached.

Internally, every object in Memcache has a "counter". This counter holds a timestamp: each time a new object is created, the counter is set to the current time, and each time an object is fetched, Memcache resets the counter to the current time again. When memcache needs to "evict" an object to make room for a new one, it looks for the object with the lowest counter. That is the object that either was never fetched at all, or was fetched the longest time ago (and is probably not in much demand, otherwise its counter would be close to the current timestamp).

In effect, this creates a simple system that uses the cache very efficiently: if it is not used, it is removed from the system.

Adam Presley wrote a blog post showing how to implement this mechanism in a PHP project. Inside Memcache, the system is used a little differently; among other things, it is arranged so that most functions keep their O(1) time complexity.
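
To give a feel for the idea, here is a tiny PHP sketch of timestamp-based LRU eviction. It is not Memcache's actual implementation (which is organized differently so the common operations stay O(1)); the class name and capacity are made up for the example.

    <?php
    // Toy LRU cache: every item carries a "last used" timestamp, and when the
    // cache is full the item with the oldest timestamp is evicted first.
    class TinyLruCache
    {
        private $items = array();     // key => value
        private $lastUsed = array();  // key => timestamp of the last set/get
        private $capacity;

        public function __construct($capacity)
        {
            $this->capacity = $capacity;
        }

        public function set($key, $value)
        {
            if (!isset($this->items[$key]) && count($this->items) >= $this->capacity) {
                // Evict the least recently used key: the one with the oldest timestamp.
                $oldestKey = array_search(min($this->lastUsed), $this->lastUsed);
                unset($this->items[$oldestKey], $this->lastUsed[$oldestKey]);
            }
            $this->items[$key] = $value;
            $this->lastUsed[$key] = microtime(true); // reset the counter on every write
        }

        public function get($key)
        {
            if (!isset($this->items[$key])) {
                return null; // cache miss
            }
            $this->lastUsed[$key] = microtime(true); // reset the counter on every read
            return $this->items[$key];
        }
    }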

Memory allocation

Memory allocation is an area that most PHP developers never have to look into. That is what makes a high-level language simpler: most of this work is done by the underlying layers, such as the compiler or the operating system. But since Memcache is written in C, it has to do its own memory allocation and management. Fortunately, most of that work can be delegated to the operating system, so in practice we only need one function to allocate memory (malloc), one function to free memory when it is no longer needed (free), and perhaps one function to resize an existing memory block (realloc).

In an ordinary C project this works fine: you allocate memory before you create, say, a string, do your work with it and finally free the memory again. But a high-performance system such as memcache runs into problems with this approach. The main reason is that malloc and free are not really optimized for this usage pattern: the memory fragments easily, which means a lot of it goes to waste. It is like writing and deleting files on a disk, which causes a lot of disk fragmentation (although, in all fairness, this depends on the file system you use). Remember the days when you had to defragment your disk every so often? Basically the same thing happens in memory, and in the end fragmentation makes memory allocation slower and leaves large amounts of memory unusable.

To avoid these malloc problems, Memcache uses its own memory manager by default (you can make memcache use the standard malloc, but that would be unwise). Memcache's memory manager allocates the maximum amount you configured (say 64MB, or more) from the operating system through one big malloc call, and from then on it manages that memory itself with a system called the slab allocator.

Slab allocation

When Memcache starts, it divides the memory allocated to it into smaller parts called pages. Each page is 1MB in size (coincidentally, 1MB is also the maximum size of a single object you can store in Memcache). Each page can be assigned to a slab class, or it can be unassigned (a free page). A slab class determines how large the objects stored in that page may be. Every page assigned to a slab class is divided into smaller pieces called chunks. All chunks within a page have the same size, so you cannot have two chunks of different sizes in the same page. For example, one page may hold 64-byte chunks (slab class 1), another 128-byte chunks (slab class 2), and so on, up to the largest slab class, where a page contains just one 1MB chunk. A slab class can span multiple pages, but once a page has been assigned to a slab class (and has been divided into chunks), it cannot be reassigned to another slab class.

The smallest chunk size starts at 80 bytes, and each next chunk size is about 1.25 times the previous one (rounded up). So after 80 bytes, the next chunk size is 100, and so on. You can see this for yourself by starting memcached with the "-vv" flag. You can also change the growth factor with the -f option and the initial chunk size with the -n option, but do not change these defaults unless you really need to.

The -vv output shows you the slab classes, the chunk size of each class, and how many chunks fit in a page (obviously: the larger the chunks, the fewer of them fit). Memcache initializes one page per slab class; the other pages stay free and are assigned to a slab class only when that class needs one. Now that Memcache has divided up its memory, it can add data to the slabs. Suppose we have a 105-byte object (this size includes Memcache's own overhead, so the actual data we can store is a little smaller). Memcache decides that this object should be stored in a slab class 3 chunk, because a 128-byte chunk is the smallest one that fits a 105-byte object. That leaves 128 - 105 = 23 bytes of the chunk unused, and those 23 bytes cannot be used for anything else: the whole chunk is marked as in use and holds just that one object. This is the price we pay for the slab allocator, but in return the memory never fragments. It is a trade-off between speed and some memory waste.
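
Here is a rough sketch of that bookkeeping. The 80-byte starting size and 1.25 growth factor come from the description above; rounding each size up to a multiple of 4 is an assumption made here so the numbers line up with the 80, 100 and 128-byte chunks used in this article (real memcached applies its own alignment, so the exact sizes depend on version and options).

    <?php
    // Sketch: build the list of slab classes (chunk sizes) and see where an item fits.
    $pageSize   = 1048576; // every page is 1MB
    $chunkSizes = array();
    for ($size = 80; $size < $pageSize; $size = (int) (ceil($size * 1.25 / 4) * 4)) {
        $chunkSizes[] = $size;
    }
    $chunkSizes[] = $pageSize; // the largest class holds a single 1MB chunk per page

    // Find the smallest chunk that fits a 105-byte item (size includes memcache's overhead).
    $itemSize = 105;
    foreach ($chunkSizes as $index => $chunkSize) {
        if ($chunkSize >= $itemSize) {
            printf("a %d-byte item goes into slab class %d: chunk size %d, %d bytes wasted\n",
                   $itemSize, $index + 1, $chunkSize, $chunkSize - $itemSize);
            break;
        }
    }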

Once a page is full (all chunks in the page hold data) and we need to store more data, Memcache fetches a new free page, assigns it to the slab class in question, divides it into chunks and uses the first available chunk to store the data. But when there are no free pages left, Memcache uses the LRU algorithm to evict an existing chunk and make room. That means that if we need a 128-byte chunk, it evicts a 128-byte chunk, even if there is a 256-byte chunk holding older data. Each slab class has its own LRU queue.

This can lead to a nasty scenario. Suppose your data consists almost entirely of objects that go into 128-byte chunks (slab class 3 in the example above). Memcache will then assign nearly every page to that slab class, while the other slab classes may each own only the single page they were given at initialization. Now store a 1MB object: it goes into the one page whose chunk size is 1MB. When a second 1MB object arrives, there is no free page left to assign, so Memcache must use the LRU algorithm to evict something; and since the 1MB slab class owns only that one page, it evicts the first 1MB object. So it pays to know how the slab allocator works.

Consistent hashing

Your web app can talk to several different memcache servers at the same time. You simply configure your app with the list of Memcache server IPs, and it will use all of those servers by default.

When we add an object to Memcache, a server is automatically selected to store that data. With only one Memcache server the choice is easy, but with multiple servers memcache must somehow decide which server stores which object. There are several possible algorithms for this, such as a round-robin system that sends each store operation to the next server in turn (the first object goes to server 1, the second to server 2, the third to server 1 again, and so on). But how would such a system know which server to ask when we want to fetch a specific piece of data back?

Memcache's load-balancing technique is very simple yet very effective: it creates a hash for each key (think of md5(key), although in practice a more specialized and faster hashing method is used). Because these hashes are evenly distributed, we can use the modulo operation to work out which server stores the object we need:

In PHP, the idea looks roughly like this (the server list is made up, and crc32() merely stands in for the real hash function):
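
    <?php
    // Hash the key and use modulo to pick one of the servers.
    $servers     = array('10.0.0.1:11211', '10.0.0.2:11211', '10.0.0.3:11211');
    $servercount = count($servers);

    $key      = 'user:42';
    $serverId = crc32($key) % $servercount;

    echo "key '$key' is stored on server {$servers[$serverId]}\n";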

Very good: with this simple formula we can work out which server holds a given key. The problem with this mechanism is that as soon as $servercount (the number of servers) changes, almost every key maps to a different server; a few keys may keep the same server ID, but purely by coincidence. In practice, whenever you change the number of Memcache servers (up or down, it does not matter), your backend gets hit by a flood of requests because practically all keys miss at once.
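
You can see the scale of the problem with a quick experiment (the key names are made up, and crc32() once more stands in for the real hash):

    <?php
    // Count how many keys map to a different server when the pool grows from 3 to 4 servers.
    $moved = 0;
    $total = 10000;
    for ($i = 0; $i < $total; $i++) {
        $hash = crc32("key$i");
        if ($hash % 3 !== $hash % 4) {
            $moved++;
        }
    }
    printf("%.1f%% of the keys moved to another server\n", 100 * $moved / $total);
    // With plain modulo, roughly 75% of the keys end up on a different server.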

Now let's bring in consistent hashing. With this algorithm, we no longer have to worry about keys changing servers when a server is added or removed. Here's how it works:

A consistent hash uses a counter that wraps around, like a clock: once it reaches "12", it rolls back to "1". Suppose this counter is 16 bits, so its values range from 0 to 65535. If we picture it as a clock face, 0 and 65535 sit where 12 o'clock would be, 32200 is roughly at 6 o'clock, 48000 at 9 o'clock, and so on. We call this the continuum.

On this continuum we place a (relatively) large number of "dots" for each server. These dots are placed more or less at random, scattered around the clock face.

As an example, picture a continuum with 3 servers (S1, S2, S3) and 2 dots per server.

If this continuum represents a 16-bit number, S1's dots sit roughly at 10 and 29000, S2's at 39000 and 55000, and S3's at 8000 and 48000. Now, when we store a key, we compute a 16-bit hash for it, which can also be plotted on the continuum. Suppose we have four keys (K1 to K4) with hashes 15000, 52000, 34000 and 38000 respectively; these are plotted as red dots on the continuum.

To find the server where a key should be stored, the only thing we have to do is follow the continuum clockwise from the key's position until we reach a server dot. For K1, we follow the continuum and arrive at S1. K2 finds S2, and K3 and K4 both find S3. So far nothing special has happened; in fact, it looks like we have done a lot of extra work for something a simple modulo could have handled.

This is where consistent hashing pays off. Suppose server S2 is removed from the Memcache server pool. What happens if we want to get K1? Nothing odd at all: K1 still sits at the same spot on the continuum, and the first server dot it reaches is still S1. K3 and K4 are unaffected as well, since they were stored on S3.

Fetching K2, however, now results in a miss: K2 was stored on S2, and S2 is gone. Following the continuum clockwise from K2's position, we end up on S1 instead, and that is where K2 will live from now on.

In practice, the more dots per server we place on the continuum, the fewer keys we lose when a server is removed (or added). A good number of dots per server lies somewhere between 100 and 200; with many more, lookups on the continuum become slower (although it remains a very fast operation). And the more servers you have, the better consistent hashing performs.

Unlike the standard modulo algorithm, which causes almost every key to move, consistent hashing typically remaps only around 10%-25% of the keys (and that percentage drops quickly as the number of servers grows). In other words, your backend (such as the database) is put under far less pressure than with the modulo algorithm.
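
To tie the pieces together, here is a small, simplified continuum in PHP. The dot placement, the 16-bit hash and the server names are all made up for this sketch; real client libraries use many more dots per server and a better hash function.

    <?php
    // Simplified consistent hashing: a 16-bit continuum with several dots per
    // server; a key is stored on the first server dot found clockwise from it.
    function continuumHash($value) {
        return crc32($value) & 0xFFFF; // made-up 16-bit hash, just for the sketch
    }

    function buildContinuum(array $servers, $dotsPerServer = 100) {
        $continuum = array(); // position on the clock => server
        foreach ($servers as $server) {
            for ($i = 0; $i < $dotsPerServer; $i++) {
                $continuum[continuumHash("$server#$i")] = $server;
            }
        }
        ksort($continuum); // order the dots clockwise around the clock
        return $continuum;
    }

    function serverForKey(array $continuum, $key) {
        $position = continuumHash($key);
        foreach ($continuum as $dot => $server) {
            if ($dot >= $position) {
                return $server; // first server dot at or after the key's position
            }
        }
        return reset($continuum); // past the last dot: wrap around to the first one
    }

    $continuum = buildContinuum(array('s1', 's2', 's3'));
    echo serverForKey($continuum, 'user:42'), "\n";

    // Remove s2: only the keys whose nearest clockwise dot belonged to s2 are
    // remapped; every other key still ends up on the same server as before.
    $continuum = buildContinuum(array('s1', 's3'));
    echo serverForKey($continuum, 'user:42'), "\n";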

Conclusion

It is a pleasure to gain some in-depth knowledge of systems we usually take for granted. As in "real life", things are more complicated than they look and are built from many smaller solutions that you can reuse in your own work. Algorithms like LRU and consistent hashing are not hard to understand, and simply knowing they exist will, in the long run, help you become a better developer.

Translated from: https://www.adayinthelifeof.nl/2011/02/06/memcache-internals/
