Memcached Caching System Principles


In web service development, server-side caching is one of the most frequently used techniques for improving service performance. By recording the results of earlier computations, it lets a service avoid repeating the expensive calculations needed to produce those results, improving its operating efficiency.

Beyond improving operating efficiency, server-side caching is often used to improve the scalability of a service. Large-scale web applications such as Facebook therefore often build huge server-side caches, and the tool most commonly used to build them is memcached.

In this article, we will briefly introduce memcached.

Memcached Introduction

Before we introduce memcached, let's start with an example of what a server-side cache is.

I believe you have played some online games. In my day (around 2003), these games often included PvP features and provided a ladder that showed the top-ranked players along with the current player's own rank. Game developers often used the ladder to build popularity, hoping that players eager to prove themselves in the game would be motivated to spend more time climbing the ladder for better results.

The main function of a ladder system is to present ladder rank information to players; players are not allowed to modify the data recorded in the system. The result is a system that is almost entirely read traffic, with very few writes. At the same time, a game may have tens of millions or even hundreds of millions of players, with tens of thousands online at any moment, so the ladder is accessed very frequently. Even if only 10 requests per second hit the ladder, reading and sorting the ranks of hundreds of millions of players is an extremely expensive operation.

A natural idea is to cache the ladder ranking on the server after computing it once, and return the cached result when other players request it. After a certain period of time, such as one hour, we recompute the data in the cache. That way we no longer have to perform thousands of full ladder rankings every hour.
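To make the idea concrete, here is a minimal sketch of that pattern in Java, independent of memcached itself. The computeLadder() method is purely hypothetical and stands in for the expensive read-and-sort query:

import java.util.List;

// A minimal sketch of the pattern: the ladder is recomputed at most once
// per hour, and all other requests are served from the cached copy.
public class LadderCache {
    private static final long TTL_MILLIS = 60 * 60 * 1000; // one hour

    private volatile List<String> cachedLadder;
    private volatile long computedAt;

    public synchronized List<String> getLadder() {
        long now = System.currentTimeMillis();
        if (cachedLadder == null || now - computedAt > TTL_MILLIS) {
            cachedLadder = computeLadder(); // the expensive read-and-sort
            computedAt = now;
        }
        return cachedLadder;
    }

    private List<String> computeLadder() {
        // Placeholder: in a real service this would read and sort
        // hundreds of millions of player records.
        return List.of("player1", "player2", "player3");
    }
}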

This illustrates the most important things server-side caching provides: it improves the response speed of individual requests, and it reduces the pressure on the service layer and the database layer. Beyond that, because multiple service instances can read the same server-side cache, we no longer need to keep a copy of the data inside each service instance and synchronize those copies with one another, which improves extensibility.

Memcached is a server-side cache implementation released under a BSD license. Unlike most other server-side cache implementations, however, it consists of two components: standalone memcached service instances, and a client for accessing those instances. The memcached service instances therefore run independently of the application service instances, in contrast to the common design in which each cache runs inside a service instance:


As you can see, because the memcached cache instances run independently of the application server instances, any application instance can access any cache instance. A traditional cache is bound to a specific application instance, so each application instance can only access its own cache. That binding limits the total cache capacity available to the application, and it can leave redundant copies of the same data in different cache instances, reducing the overall efficiency of the caching system.

At run time, a memcached service instance consumes very little CPU but requires a lot of memory. So before deciding how to organize your server-side cache, you first need to understand the load on each service instance in your current deployment. If a server's CPU usage is very high but it has plenty of free memory, it is perfectly reasonable to run a memcached instance on it. If none of the servers in the current deployment has much free memory, then we need a set of dedicated servers to build the server-side cache. A large service often has hundreds of memcached instances, and the data stored on each of them is different. Because the data is partitioned this way, before reading anything from memcached we must first decide which memcached instance in the cache system holds the data we want:

As shown, users access the cached information through a memcached client. The client knows all the memcached service instances contained in the server-side cache system. When data with a specific key needs to be read, such as "foo", the client computes a hash from that key and from the current memcached cache configuration to determine which memcached instance records the information the user needs. Once that instance is determined, the client reads the instance's address from its configuration and sends the instance a request for the value stored under the key "foo". In forum discussions this is commonly called memcached's two-stage hashing.

Recording data follows a similar process. Suppose the user wants to store the value "bar" in the server-side cache under the key "foo". The memcached client first performs a hash calculation over the key "foo" and the number of available service instances known to it, determines from the result which memcached instance should store the data, and then sends that instance a request to record the value "bar" under the key "foo".
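As an illustration of the first stage, here is a minimal Java sketch that maps a key to an instance. It uses a simple modulo scheme for clarity; as we will see in the High Availability section, real memcached clients typically use consistent hashing instead:

import java.util.List;

// First-stage hashing, sketched with a simple modulo scheme for
// illustration. The chosen instance then hashes the key again internally
// to locate the record, hence "two-stage hashing".
public class NaiveInstanceSelector {
    private final List<String> instances; // e.g. "10.0.0.1:11211", ...

    public NaiveInstanceSelector(List<String> instances) {
        this.instances = instances;
    }

    public String instanceFor(String key) {
        int h = key.hashCode();
        int index = Math.floorMod(h, instances.size());
        return instances.get(index);
    }

    public static void main(String[] args) {
        NaiveInstanceSelector selector = new NaiveInstanceSelector(
                List.of("10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"));
        // Reads and writes of "foo" always go to the same instance.
        System.out.println(selector.instanceFor("foo"));
    }
}

The important property is that every client computes the same mapping, so reads and writes of the same key always reach the same instance.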

The advantage of this design is that the memcached service instances are independent and do not interact with each other at all. That lets us omit a lot of complex logic, such as data synchronization between nodes and message broadcasting among them. This lightweight architecture simplifies operations considerably: if a node fails, we just replace the old node with a new memcached node; to scale the cache, we just add servers and update the client configuration.

The data recorded in the server-side cache is globally visible. That is, once a record has been successfully added to the memcached server-side cache, every other application instance that uses the caching service can access it:

In memcached, each record is made up of four parts: the record's key, an expiration time, a set of optional flags, and the data that forms the record's value. Because the value carries no structure of its own, data must be serialized before being recorded in memcached.
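Conceptually, a stored item can be pictured like this; the field names below are illustrative, not memcached's actual internal structure:

// A sketch of the logical layout of a memcached item.
public class CacheItem {
    String key;     // at most 250 bytes (see the next section)
    long expiresAt; // expiration time; 0 means never expire
    int flags;      // optional client-defined flags
    byte[] value;   // the serialized record data
}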

Memory Management

One of the problems we must consider when using a cache is how to manage the lifetime of the cached data: how records in the cache expire, and how data is replaced when cache space runs out. In this section, we introduce memcached's memory management mechanism.

First, let's look at memcached's memory management model. One of the most common problems a memory management algorithm must deal with is memory fragmentation: after a long period of allocation and reclamation, the memory used by the system tends to be scattered across non-contiguous regions. This makes contiguous free space hard to find, which increases the probability of allocation failures and also makes allocation more complex and less efficient.

To solve this problem, memcached uses a structure called a slab. Under this allocation scheme, memory is divided into pages of 1MB, and each page is further split into a series of blocks of equal size:

Memcached therefore does not allocate memory to match the exact size of each record. When a new record arrives, memcached first checks the record's size and, based on it, selects the slab class the record should be stored in. Next, memcached examines the slabs of that class it holds internally. If one of those slabs has a free block, memcached uses that block to store the record. If no slab of the appropriate class has a free chunk, memcached allocates a new page and divides it into blocks of the target class's size.
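The selection of a slab class can be sketched as follows. The 80-byte smallest chunk and the 1.25 growth factor are assumptions for illustration; actual class sizes depend on the memcached version and configuration:

// A sketch of slab class selection: find the smallest chunk size that
// can hold an item, given a geometric progression of chunk sizes.
public class SlabClasses {
    static final int PAGE_SIZE = 1024 * 1024; // 1MB page
    static final double GROWTH_FACTOR = 1.25; // ratio between classes
    static final int SMALLEST_CHUNK = 80;     // assumed smallest class

    static int chunkSizeFor(int itemSize) {
        int chunk = SMALLEST_CHUNK;
        while (chunk < itemSize) {
            chunk = (int) (chunk * GROWTH_FACTOR);
        }
        return chunk;
    }

    public static void main(String[] args) {
        int chunk = chunkSizeFor(100);
        System.out.println("chunk size: " + chunk
                + ", chunks per page: " + PAGE_SIZE / chunk);
    }
}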

A special case worth considering is the update of a record. When a record is updated, its size may change, and with it the slab class it belongs to. As a result, the record's location in memory may change during an update, although from the user's point of view this is invisible.

The benefit of allocating memory this way is that it avoids the fragmentation that repeated reads and writes of records would otherwise cause. On the other hand, because memcached chooses a block class based on the record's size, the block allocated for a record is usually larger than the record actually needs, which wastes memory. You can, of course, tune the block sizes through memcached's configuration to minimize this waste.

Note, however, that because each memcached page is 1MB by default, the maximum size of a single block is also 1MB. In addition, memcached limits the key of each record to at most 250 bytes.

In general, the block sizes within each slab class, and the increment between block sizes of adjacent classes, have a significant impact on where records are placed and on memory utilization. For example, in the current implementation, block sizes grow by a factor of 1.25 between classes by default. That is, if one slab class in a memcached instance provides 80KB blocks, then the class providing the next larger size will offer 100KB blocks. If we now need to insert an 81KB record, memcached will select the class with 100KB blocks and look for a slab of that class with a free block to hold the record.

Note that we are then using a 100KB block to record 81KB of data, so storing this record wastes 19KB, or 19% of the block. Assuming the sizes of stored records are evenly distributed, the average memory waste is about 9%. The exact figure depends on the growth factor between block sizes just mentioned: in memcached's initial implementation, the factor defaulted to 2 rather than the current 1.25, which produced an average waste of about 25%. This factor may well change again in future releases to optimize memcached's real-world performance.
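These figures are easy to check with a back-of-the-envelope calculation. Assuming record sizes are uniformly distributed within each class, a record stored in a block of size c has a size between c/f and c, where f is the growth factor:

// A rough check of the average-waste figures under the uniform-size
// assumption; the result is a fraction of the block that goes unused.
public class SlabWaste {
    static double averageWaste(double growthFactor) {
        // mean item size as a fraction of the block size, for sizes
        // uniform between 1/f and 1 of the block
        double avgItem = (1.0 / growthFactor + 1.0) / 2.0;
        return 1.0 - avgItem;
    }

    public static void main(String[] args) {
        System.out.printf("factor 1.25: %.0f%% average waste%n",
                100 * averageWaste(1.25)); // ~10%, close to the ~9% above
        System.out.printf("factor 2.00: %.0f%% average waste%n",
                100 * averageWaste(2.0));  // ~25%
    }
}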

If you know the characteristics of the data you need to cache, such as its typical size and how much that size varies, you can set the parameters above accordingly. If the data is generally small, reduce the size of the smallest block. If the sizes do not vary much, use a smaller growth factor so that each block is as close as possible to the size of the data it stores, improving memory utilization.

Another notable point is that when the memcached client computes which service instance should record the data for a given key, it does not account for differences among the servers that make up the cache system. If exactly one memcached instance is installed on each server, the instances may differ considerably in available memory; yet because every instance is equally likely to be selected, the instances with more memory will not be fully utilized. We can solve this problem by deploying multiple memcached instances on the servers that have more memory:

For example, suppose the caching system shown consists of two servers with different amounts of memory: the first has 32GB while the second has only 8GB. To take advantage of the memory of both, we deploy four memcached instances on the 32GB server and one memcached instance on the 8GB server. The four instances on the 32GB server will then collectively receive four times the load of the 8GB server's instance, making full use of the larger server's memory.
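On the client side, this weighting simply means listing more instances for the larger machine. Below is a sketch using spymemcached with hypothetical addresses: four instances on the 32GB host and one on the 8GB host, so roughly four fifths of the keys land on the larger machine:

import net.spy.memcached.AddrUtil;
import net.spy.memcached.MemcachedClient;

// Hypothetical addresses: four instances on the 32GB host (ports
// 11211-11214) and one instance on the 8GB host.
public class WeightedClient {
    public static void main(String[] args) throws Exception {
        MemcachedClient client = new MemcachedClient(AddrUtil.getAddresses(
                "10.0.0.1:11211 10.0.0.1:11212 10.0.0.1:11213 "
                + "10.0.0.1:11214 10.0.0.2:11211"));
        client.set("foo", 3600, "bar").get(); // wait for the write
        System.out.println(client.get("foo"));
        client.shutdown();
    }
}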

Of course, the caching system's resources are limited, and at some point it will be filled with data generated by the service. When the cache system then receives another request to store data, it determines which data to evict based on the LRU (least recently used) algorithm and the expiration times of the records. Memcached's expiration algorithm is unusual and is known as lazy expiration: when a user reads data from a memcached instance, memcached first checks the record's expiration time to decide whether it is still valid. If a record has expired, then the next time data is written and space is insufficient, memcached selects the memory block holding the expired record as the destination for the new data. If no record is marked as expired at write time, the LRU algorithm is executed to find the least recently used data to replace.

Note that the LRU here operates within a slab class, not globally. Suppose the most frequently used data in the memcached cache system is stored in 100KB blocks, while another slab class with 300KB blocks holds data that is rarely used. When a 99KB record needs to be inserted and memcached no longer has enough memory to allocate another slab, it does not release the slab with 300KB blocks; instead, it looks within the slabs of the 100KB class for a block to free and stores the new data there.
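The read path of lazy expiration can be sketched as follows. This is a simplified illustration in plain Java, ignoring slabs and eviction; the point is that expiry is only checked when a record is read, never by a background sweep:

import java.util.HashMap;
import java.util.Map;

// A sketch of lazy expiration: nothing is removed when its TTL passes;
// an expired entry is simply treated as a miss when it is next read.
public class LazyExpirationCache {
    private static class Entry {
        final Object value;
        final long expiresAt; // absolute time in millis, 0 = never
        Entry(Object value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> store = new HashMap<>();

    public void set(String key, Object value, long ttlMillis) {
        long expiresAt =
                ttlMillis == 0 ? 0 : System.currentTimeMillis() + ttlMillis;
        store.put(key, new Entry(value, expiresAt));
    }

    public Object get(String key) {
        Entry e = store.get(key);
        if (e == null) {
            return null;
        }
        if (e.expiresAt != 0 && System.currentTimeMillis() > e.expiresAt) {
            return null; // expired: treat as a miss; its block can be reused
        }
        return e.value;
    }
}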

High Availability

In enterprise applications, we often emphasize that a system needs high availability and high reliability. For a component, this means it must run stably and, when an exception occurs, confine the impact of the exception to a limited scope rather than bringing the whole system down. After the exception, the component should be able to return to normal operation easily.

So what kind of high availability does memcached need? Before answering that question, let's look at what a memcached server-side cache consists of in a large service:

As you can see, in a large service the server-side cache built on memcached is actually made up of many memcached instances. As described earlier, these instances are completely independent, with no interaction among them, so when one fails, the other memcached service instances are unaffected. If one instance fails in a server-side cache system of 16 memcached instances, the system still has the remaining 15 instances, or 93.75% of its cache capacity, to keep working. The reduced capacity slightly increases the pressure on the service instances behind the cache, but an application routinely experiences far larger load fluctuations than that, so the service should still work properly.

This also demonstrates the soundness of keeping memcached instances independent. Because memcached is committed to being an efficient and simple yet highly scalable cache component, it does not emphasize the safety of the data. Once a memcached instance fails, we can recompute the data from the database and the application servers and record it on the other available memcached instances.

I suspect you are now thinking: "No, there is another problem. A change in the number of memcached instances changes the results of the hash calculation, directing requests for data to the wrong memcached instances. The caching service provided by the whole memcached cluster is effectively invalidated, and the database suddenly faces a burst of pressure."

Yes, that is exactly the worry, and it is not limited to cache failures: this problem occurs whenever the number of memcached instances in the server-side cache changes.

The solution memcached uses is consistent hashing. With the help of this algorithm, a change in the number of memcached instances changes the mapping of only a small subset of keys. So how exactly does the algorithm work?

First, picture a circle with points distributed around it representing the integers 0 through 1023, spread evenly over the whole circle:

In the figure, we highlight six blue dots, which divide the circle into six roughly equal arcs. They correspond to the three memcached instances contained in the current cache system, M1, M2 and M3, with two dots per instance. Next, we hash the data we currently need to store and find the point on the circle corresponding to the hash result, 900:

As you can see, that point is closest to the blue dot at position 0, so the record with hash value 900 is stored on the memcached instance that dot represents, M1.

If one of the memcached instances fails, the data that instance was responsible for temporarily becomes unavailable, while the data recorded on the other instances remains in place:

As you can see, when instance M1 fails, the record with hash value 900 is lost, while the records with hash values 112 and 750 remain on instances M2 and M3. In other words, the failure of one node now causes only part of the data to disappear from the cache system; it does not change which instances the data on the other nodes maps to.
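A consistent hashing ring like the one in the figures can be sketched with a sorted map. The positions below mirror the 0-1023 circle and are illustrative; real clients use a much larger hash space and many virtual points per instance. The "closest dot" rule is implemented here as walking clockwise to the next instance point, a common convention:

import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// A sketch of a consistent hashing ring using a sorted map.
public class HashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    public void addInstance(String name, List<Integer> points) {
        for (int p : points) {
            ring.put(p, name);
        }
    }

    public void removeInstance(String name) {
        ring.values().removeIf(v -> v.equals(name));
    }

    // Walk clockwise from the key's position to the next instance point,
    // wrapping around to the start of the circle if necessary.
    public String instanceFor(int keyHash) {
        SortedMap<Integer, String> tail = ring.tailMap(keyHash);
        int point = tail.isEmpty() ? ring.firstKey() : tail.firstKey();
        return ring.get(point);
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing();
        ring.addInstance("M1", List.of(0, 512));
        ring.addInstance("M2", List.of(170, 682));
        ring.addInstance("M3", List.of(341, 853));
        System.out.println(ring.instanceFor(900)); // M1, as in the figure
        ring.removeInstance("M1");
        // 900 now maps to M2 (a cache miss), while keys such as 112 and
        // 750 still map to M2 and M3 as before.
        System.out.println(ring.instanceFor(900));
    }
}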

But we must also consider another situation: a server-side cache made up of only one or a few memcached instances. In that case, the failure of a single memcached instance is fatal, because the database and the application server instances will suddenly receive a large number of requests requiring expensive computation and will eventually be overloaded. So when designing a server-side cache, we often provision it beyond the required capacity: for example, when the service actually needs 5 memcached nodes, we design a server-side caching system with 6 nodes to increase the fault tolerance of the whole system.

Building a Cache System with Memcached

OK, now that we have covered memcached's internal operating principles, let's look at how to use memcached to build a caching system for your service.

First, download the memcached installation package from the official memcached website and install it on the machines that will serve as cache servers. At installation time, you need to configure memcached appropriately: the port it listens on, the amount of memory it may allocate, and so on. Once memcached is properly configured and started, we can access these memcached instances and manipulate them through client software. Since I am not an operations person, we will not cover these configurations in detail here.
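For illustration only, a typical startup on a cache server might look like "memcached -d -m 1024 -p 11211", which runs memcached as a daemon (-d) with 1024MB of memory (-m) listening on port 11211 (-p); consult the official documentation for the full list of options.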

Here we will show how to read and write data using a Java memcached client. A commonly used client is spymemcached. Let's look at how the main function of a spymemcached-based program uses the functionality it provides:

import java.util.concurrent.Future;

import net.spy.memcached.AddrUtil;
import net.spy.memcached.MemcachedClient;

public class MemcachedTool {
    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("Please specify command line options");
            return;
        }
        MemcachedClient memcachedClient =
                new MemcachedClient(AddrUtil.getAddresses("127.0.0.1:11211"));
        String commandName = args[0];
        if (commandName.equals("get")) {
            String keyName = args[1];
            System.out.println("Key Name " + keyName);
            System.out.println("Value of key " + memcachedClient.get(keyName));
        } else if (commandName.equals("set")) {
            String keyName = args[1];
            String value = args[2];
            System.out.println("Key Name " + keyName + " value=" + value);
            Future<Boolean> result = memcachedClient.set(keyName, 0, value);
            System.out.println("Result of set " + result.get());
        } else if (commandName.equals("add")) {
            String keyName = args[1];
            String value = args[2];
            System.out.println("Key Name " + keyName + " value=" + value);
            Future<Boolean> result = memcachedClient.add(keyName, 0, value);
            System.out.println("Result of add " + result.get());
        } else if (commandName.equals("replace")) {
            String keyName = args[1];
            String value = args[2];
            System.out.println("Key Name " + keyName + " value=" + value);
            Future<Boolean> result = memcachedClient.replace(keyName, 0, value);
            System.out.println("Result of replace " + result.get());
        } else if (commandName.equals("delete")) {
            String keyName = args[1];
            System.out.println("Key Name " + keyName);
            Future<Boolean> result = memcachedClient.delete(keyName);
            System.out.println("Result of delete " + result.get());
        } else {
            System.out.println("Command not found");
        }
        memcachedClient.shutdown();
    }
}

As you can see, with the help of this client, reading and writing data stored in memcached instances becomes very simple. We can complete these operations simply by calling the client's get(), set(), add(), replace(), and delete() methods.

This article is a reprint; the original content is available at: http://www.zhaoyafei.cn/index.php/Article/articleinfo.html?id=20
