memcached Complete Anatomy Series Tutorial (repost), Part 4: memcached's distributed algorithm


Contents

memcached's distribution

· What does "distributed" mean for memcached?

· Cache::Memcached's distribution method

· Distribution based on the remainder

· Drawbacks of remainder-based distribution

Consistent Hashing

· A brief description of Consistent Hashing

· Client libraries that support Consistent Hashing

· Summary

memcached's distribution

As described in the first installment, memcached is called a "distributed" cache server, yet the server side has no distributed functionality at all. The server contains only the memory storage features covered in the second and third installments, and its implementation is very simple. Distribution is implemented entirely by the client library, and this is memcached's most distinctive feature.

What does "distributed" mean for memcached?

The word "distributed" has come up many times so far without a detailed explanation. Here is a brief look at the principle, which is essentially the same in every client implementation.

Assume there are three memcached servers, node1 to node3, and the application wants to store data under the keys "tokyo", "kanagawa", "chiba", "saitama", and "gunma".

Figure 1 Introduction to distribution: preparation

First, add "tokyo" to memcached. When "tokyo" is passed to the client library, the algorithm implemented in the client determines, from the key, which memcached server will hold the data. Once the server is selected, the client tells it to store "tokyo" and its value.

Figure 2 Introduction to distribution: adding data

"kanagawa", "chiba", "saitama", and "gunma" are handled the same way: a server is selected first, then the data is stored.

Next, retrieve the stored data. The key to fetch, "tokyo", is again passed to the library. The library selects a server from the key using the same algorithm as when the data was stored. Because the algorithm is the same, it selects the same server the data was stored on and sends it a get command. As long as the data has not been deleted for some reason, the stored value is returned.

Figure 3 Introduction to distribution: retrieving data

Storing different keys on different servers in this way is what makes memcached distributed. As the number of memcached servers grows, the keys are spread more widely; even if one memcached server fails and cannot be reached, the other caches are unaffected and the system keeps running.

Next, we look at the distribution method implemented by Cache::Memcached, the Perl client library mentioned in the first installment.

Cache::Memcached's distribution method

Cache::Memcached, the Perl client library for memcached, was written by memcached's author Brad Fitzpatrick and can be considered the original client library.

This library implements distribution, and its approach is memcached's standard distribution method.

Distribution based on the remainder

Cache::Memcached's distribution method is, put simply, "distribute according to the remainder of dividing by the number of servers": it computes an integer hash of the key, divides it by the number of servers, and selects the server from the remainder.

The following Perl script is a simplified illustration of what Cache::Memcached does.

    use strict;
    use warnings;
    use String::CRC32;

    my @nodes = ('node1', 'node2', 'node3');
    my @keys  = ('tokyo', 'kanagawa', 'chiba', 'saitama', 'gunma');

    foreach my $key (@keys) {
        my $crc    = crc32($key);              # CRC value of the key
        my $mod    = $crc % ($#nodes + 1);     # remainder by the number of servers
        my $server = $nodes[$mod];             # select the server by remainder
        printf "%s => %s\n", $key, $server;
    }

Cache::Memcached uses CRC to compute the hash value.

The script first computes the CRC value of the string, then divides it by the number of server nodes and picks the server from the remainder. Running the code above prints the following:

    tokyo    => node2
    kanagawa => node3
    chiba    => node2
    saitama  => node1
    gunma    => node1

According to this result, "tokyo" is assigned to node2, "kanagawa" to node3, and so on. In addition, when the selected server cannot be reached, Cache::Memcached appends the number of connection attempts to the key, computes the hash again, and tries to connect to the newly selected server. This is called rehash. If you do not want rehashing, specify the "rehash => 0" option when creating the Cache::Memcached object.
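The rehash behaviour described above can be sketched as follows. This is a minimal illustration, not Cache::Memcached's actual code: the way the attempt number is combined with the key, the pick_server helper, and the $is_alive callback are all assumptions made for the example, and crc32() from the core Compress::Zlib module stands in for String::CRC32.

```perl
use strict;
use warnings;
use Compress::Zlib;    # core module; crc32() stands in for String::CRC32

my @nodes = ('node1', 'node2', 'node3');

# Pick a server by remainder. If it is unreachable, combine the attempt
# number with the key and hash again (assumed scheme, for illustration only).
sub pick_server {
    my ($key, $is_alive, $max_tries) = @_;
    for my $try (0 .. $max_tries - 1) {
        my $hash_input = $try == 0 ? $key : $try . $key;
        my $server     = $nodes[ crc32($hash_input) % scalar @nodes ];
        return $server if $is_alive->($server);
    }
    return;    # no reachable server within $max_tries attempts
}

# With every server up, the first choice is used.
my $primary = pick_server('tokyo', sub { 1 }, 20);

# Pretend the first choice is down: rehash until another server is chosen.
my $fallback = pick_server('tokyo', sub { $_[0] ne $primary }, 20);

printf "primary=%s fallback=%s\n", $primary, $fallback // 'none';
```

The retry limit matters: without it, a key could loop forever when every server is down.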

Drawbacks of remainder-based distribution

The remainder method is simple and spreads data very well, but it has a drawback: when a server is added or removed, the cost of reorganizing the cache is very high. After a server is added, the remainders change drastically, so keys no longer map to the server they were stored on, and the cache hit rate suffers. The following short Perl script verifies this cost.

    use strict;
    use warnings;
    use String::CRC32;

    my @nodes = @ARGV;          # server names from the command line
    my @keys  = ('a' .. 'z');
    my %nodes;

    # Assign each key to a server by remainder.
    foreach my $key (@keys) {
        my $hash   = crc32($key);
        my $mod    = $hash % ($#nodes + 1);
        my $server = $nodes[$mod];
        push @{ $nodes{$server} }, $key;
    }

    # Print the keys held by each server.
    foreach my $node (sort keys %nodes) {
        printf "%s: %s\n", $node, join ",", @{ $nodes{$node} };
    }

This Perl script simulates storing the keys "a" through "z" in memcached and shows which server each key is assigned to. Save it to a file and run it with the server names as arguments.

First, with only three servers:

    $ node1 node2 node3
    node1: a,c,d,e,h,j,n,u,w,x
    node2: g,i,k,l,p,r,s,y
    node3: b,f,m,o,q,t,v,z

As a result, node1 holds a, c, d, e, ..., node2 holds g, i, k, ..., and each server holds 8 to 10 keys.

Next, add one memcached server.

    $ node1 node2 node3 node4
    node1: d,f,m,o,t,v
    node2: b,i,k,p,r,y
    node3: e,g,l,n,u,w
    node4: a,c,h,j,q,s,x,z

node4 has been added. As you can see, only d, i, k, p, r, and y still hit. After a node is added, the servers that keys map to change drastically: only 6 of the 26 keys are still looked up on their original server, and all the others have moved to a different server. The hit rate drops to 23%. When memcached is used in a web application, cache efficiency drops sharply the moment a memcached server is added, the load concentrates on the database servers, and there is a risk that normal service cannot be provided.
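The comparison above can also be done in one step by counting the keys whose server does not change when the pool is resized. The sketch below is an illustration under assumptions: server_index and keys_still_hitting are hypothetical helper names, and crc32() comes from the core Compress::Zlib module rather than String::CRC32 (both are expected to compute the standard CRC-32).

```perl
use strict;
use warnings;
use Compress::Zlib;    # core module; crc32() stands in for String::CRC32

# Map a key to a server index by remainder.
sub server_index {
    my ($key, $server_count) = @_;
    return crc32($key) % $server_count;
}

# Count the keys that map to the same server index before and after resizing.
sub keys_still_hitting {
    my ($keys, $old_count, $new_count) = @_;
    my @same = grep {
        server_index($_, $old_count) == server_index($_, $new_count)
    } @$keys;
    return scalar @same;
}

my @keys = ('a' .. 'z');
my $hits = keys_still_hitting(\@keys, 3, 4);
printf "%d of %d keys still hit (%.0f%%)\n",
    $hits, scalar @keys, 100 * $hits / scalar @keys;
```

Calling keys_still_hitting with other pool sizes shows how the loss varies with the number of servers before and after the change.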

mixi's web applications had this problem too, which made it impossible to add memcached servers. With a new distribution method, however, adding memcached servers is now easy. This method is called Consistent Hashing.

Consistent Hashing

The idea behind Consistent Hashing has been covered in many places, including the mixi Engineers' Blog, so only a brief explanation is given here.

· mixi Engineers' Blog: A Comfortable Cache Life Through Smart Distribution
· ConsistentHashing: The Consistent Hashing Method

A brief description of Consistent Hashing

Consistent Hashing works as follows. First, compute the hash value of each memcached server (node) and place it on a circle (continuum) ranging from 0 to 2^32. Then compute the hash value of the key under which the data is stored, using the same method, and map it onto the circle. Starting from the point where the key is mapped, search clockwise and store the data on the first server found. If no server is found before passing 2^32, the data is stored on the first memcached server.

Figure 4 Consistent Hashing: basic principle

Now add one memcached server to the state shown in Figure 4. With the remainder-based algorithm, the servers holding the keys change drastically, which hurts the cache hit rate. With Consistent Hashing, only the keys that lie between the new server's position on the continuum and the first server counterclockwise from it are affected.

Figure 5 Consistent Hashing: adding a server

Consistent Hashing therefore keeps the redistribution of keys to a minimum. In addition, some Consistent Hashing implementations use the idea of virtual nodes. With an ordinary hash function, the servers' positions on the circle are distributed very unevenly. Virtual nodes address this by assigning 100 to 200 points on the continuum to each physical node (server). This evens out the distribution and minimizes cache redistribution when servers are added or removed.
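The continuum and virtual nodes can be sketched in a few lines of Perl. This is a simplified illustration, not the algorithm of any particular client library: taking the first four bytes of an MD5 digest as the position, naming virtual nodes "server-N", using 150 points per server, and the helper names position, build_ring, and lookup are all assumptions made for the example.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);    # core module

# 32-bit position on the continuum (first 4 bytes of MD5; an assumed choice).
sub position {
    my ($string) = @_;
    return unpack 'N', md5($string);
}

# Build a sorted list of [position, server] with $points virtual nodes each.
sub build_ring {
    my ($servers, $points) = @_;
    my @ring;
    for my $server (@$servers) {
        push @ring, [ position("$server-$_"), $server ] for 1 .. $points;
    }
    return [ sort { $a->[0] <=> $b->[0] or $a->[1] cmp $b->[1] } @ring ];
}

# Search clockwise: first point at or after the key's position, else wrap.
sub lookup {
    my ($ring, $key) = @_;
    my $pos = position($key);
    my ($lo, $hi) = (0, scalar @$ring);
    while ($lo < $hi) {                      # binary search on the sorted ring
        my $mid = int(($lo + $hi) / 2);
        if ($ring->[$mid][0] < $pos) { $lo = $mid + 1 } else { $hi = $mid }
    }
    my $index = $lo == scalar @$ring ? 0 : $lo;
    return $ring->[$index][1];
}

my @keys   = map { "key$_" } 1 .. 1000;
my $before = build_ring([qw(node1 node2 node3)], 150);
my $after  = build_ring([qw(node1 node2 node3 node4)], 150);

# Keys whose server changed after node4 joined the ring.
my @moved = grep { lookup($before, $_) ne lookup($after, $_) } @keys;
printf "%d of %d keys moved\n", scalar @moved, scalar @keys;
```

Every key that changes server in this sketch moves to the newly added node4; keys owned by the other servers never move between them, which is the property that keeps redistribution small.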

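Tests with a memcached client library that uses the Consistent Hashing algorithm (described below) show that the percentage of keys redistributed when servers are added, that is, the share of lookups that stop hitting, can be estimated from the number of existing servers (n) and the number of servers added (m):

(1 - n/(n+m)) * 100

The hit rate immediately after the addition is the remainder, n/(n+m) * 100. Adding one server to three, for example, redistributes about 25% of the keys and leaves about 75% still hitting, compared with 23% under the remainder method.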
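As a quick check of the arithmetic, the expression above can be evaluated for a few pool sizes. Since 1 - n/(n+m) simplifies to m/(n+m), the result is the share of keys expected to land on the newly added servers. The function name formula_percent and the sample pool sizes are made up for this example.

```perl
use strict;
use warnings;

# Evaluate (1 - n/(n+m)) * 100 for n existing servers and m added servers.
sub formula_percent {
    my ($n, $m) = @_;
    my $percent = (1 - $n / ($n + $m)) * 100;
    return $percent;
}

for my $case ([3, 1], [10, 1], [200, 10]) {
    my ($n, $m) = @$case;
    printf "n=%d m=%d -> %.1f%%\n", $n, $m, formula_percent($n, $m);
}
# prints:
# n=3 m=1 -> 25.0%
# n=10 m=1 -> 9.1%
# n=200 m=10 -> 4.8%
```

The larger the existing pool relative to the number of servers added, the smaller the share of keys affected.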

Client libraries that support Consistent Hashing

Cache::Memcached does not support Consistent Hashing, but several client libraries already support this new distribution algorithm. The first memcached client library to support Consistent Hashing and virtual nodes was the PHP library libketama, developed by

· libketama - a consistent hashing algo for memcache clients - RJ blog - Users at

Among Perl clients, Cache::Memcached::Fast and Cache::Memcached::libmemcached, both introduced in the first installment, support Consistent Hashing.

· Cache::Memcached::Fast -

· Cache::Memcached::libmemcached -

Both have interfaces almost identical to Cache::Memcached, so if you are already using Cache::Memcached, switching is easy. Cache::Memcached::Fast reimplements libketama; to use Consistent Hashing, specify the ketama_points option when creating the object.

    my $memcached = Cache::Memcached::Fast->new({
        servers       => [ "", "" ],
        ketama_points => ...
    });

Cache::Memcached::libmemcached, on the other hand, is a Perl module that uses libmemcached, the C library developed by Brian Aker. libmemcached itself supports several distribution algorithms, including Consistent Hashing, and its Perl binding supports Consistent Hashing as well.

· Tangent Software: libmemcached

Summary

This installment covered memcached's distribution algorithms: distribution is implemented by the client library, and the Consistent Hashing algorithm distributes data efficiently. The next installment presents some of mixi's experience running memcached, along with related compatible applications.

memcached Complete Anatomy Series Tutorial, Part 5: memcached applications and compatible programs

[2010-01-07 22:29 by Plhwin]

The memcached series finally comes to an end. So far we have covered topics directly related to memcached; this installment presents mixi's case and practical operational topics, and introduces some memcached-compatible programs.

Contents

mixi case study

· Server configuration and count

· memcached processes

· memcached usage and clients

· Maintaining connections with Cache::Memcached::Fast

· Handling shared data and rehash

memcached operational experience

· Starting with daemontools

· Monitoring

· memcached performance

Compatible applications

· Tokyo Tyrant case

· Summary

mixi case study

mixi began using memcached in the early days of the service. As site traffic grew rapidly, simply adding slaves to the database was no longer enough, so memcached was introduced. We also verified memcached's scalability and confirmed that its speed and stability met our needs. Today memcached is a very important component of the mixi service.

Figure 1 Current system components

Server configuration and count

mixi uses a great many servers: database servers, application servers, image servers, reverse proxy servers, and so on. Nearly 200 servers run memcached alone. A typical memcached server is configured as follows:

· CPU: Intel Pentium 4 2.8GHz

· Memory: 4GB


· Operating system: Linux (x86_64)

These servers were previously used as database servers and the like. As CPU performance has risen and memory prices have fallen, we have been actively replacing database servers, application servers, and others with more powerful machines that have more memory. This keeps the total number of servers mixi uses from growing sharply and lowers management costs. Because memcached consumes almost no CPU, the replaced servers are reused as memcached servers.

memcached processes

Each memcached server runs only one memcached process. 3GB of memory is allocated to memcached, with the following startup parameters:

    /usr/bin/memcached -p 11211 -u nobody -m 3000 -c 30720

Because the operating system is x86_64, more than 2GB of memory can be allocated. On a 32-bit operating system, each process can use at most 2GB. We considered running several processes with less than 2GB each, but that would multiply the number of TCP connections per server and complicate management, so mixi standardized on 64-bit operating systems.

Also, although the servers have 4GB of memory, only 3GB is allocated, because allocating more than that could cause swapping. As Maesaka explained in the second installment when covering the slab allocator, memcached's memory storage mechanism, the amount of memory specified at startup is the amount memcached uses for storing data. It does not include the memory the slab allocator itself occupies or the management space set up for storing data. Keep in mind, therefore, that the memcached process actually allocates more memory than the specified capacity.

Most of the data mixi stores in memcached is small, so the process grows much larger than the specified capacity. We therefore changed the allocation repeatedly and confirmed that 3GB does not trigger swapping; that is the value currently in use.

memcached usage and clients

mixi's services currently use about 200 memcached servers as a single pool. Each server has a capacity of 3GB, so together they form a huge in-memory database of nearly 600GB. The client library is Cache::Memcached::Fast, mentioned many times in this series, which handles communication with the servers. The distribution algorithm is, of course, the Consistent Hashing algorithm introduced in the fourth installment.

· Cache::Memcached::Fast -

How memcached is used at the application layer is decided and implemented by the engineers developing each application. However, to avoid reinventing the wheel and to keep past lessons with Cache::Memcached::Fast from being repeated, we provide a wrapper module for Cache::Memcached::Fast and use that.

Maintaining connections with Cache::Memcached::Fast

With Cache::Memcached, the connection to memcached (a file handle) is stored in a class variable inside the Cache::Memcached package. In environments such as mod_perl and FastCGI, package variables are not reinitialized on every request as they are under CGI, but persist for the life of the process. As a result, the connection to memcached is not closed, which reduces the overhead of establishing TCP connections and also prevents TCP port exhaustion caused by repeatedly opening and closing connections in a short time.

Cache::Memcached::Fast does not have this feature, so the Cache::Memcached::Fast object must be kept in a class variable outside the module to keep the connection persistent.

    package Gihyo::Memcached;

    use strict;
    use warnings;
    use Cache::Memcached::Fast;

    my @server_list = qw//;   # server addresses
    my $fast;                 # holds the shared object

    sub new {
        my $self = bless {}, shift;
        if ( !$fast ) {
            # servers expects an array reference
            $fast = Cache::Memcached::Fast->new({ servers => \@server_list });
        }
        $self->{_fast} = $fast;
        return $self;
    }

    sub get {
        my $self = shift;
        $self->{_fast}->get(@_);
    }

    1;

In the example above, the Cache::Memcached::Fast object is kept in the class variable $fast.

Handling shared data and rehash

Cache data and configuration shared by all users, such as the news shown on mixi's home page, appears on many pages and is accessed very frequently. Under these conditions, accesses tend to concentrate on a single memcached server. The concentration itself is not a problem, but if the server receiving that traffic fails and memcached cannot be reached, the consequences are serious.

As mentioned in the fourth installment, Cache::Memcached has a rehash feature: when the server holding the data cannot be reached, it computes the hash again and connects to another server.

Cache::Memcached::Fast does not have this feature. It can, however, stop trying to connect to a server for a short period after connections to it fail.

    my $fast = Cache::Memcached::Fast->new({
        max_failures    => 3,
        failure_timeout => 1
    });

If connecting fails max_failures times or more within failure_timeout seconds, the client stops connecting to that memcached server. Our setting is 3 or more failures within 1 second.

In addition, mixi defines a naming convention for the keys of cache data shared by all users. Data whose key matches the convention is automatically stored on multiple memcached servers, and only one of those servers is chosen when it is read. With a library built this way, a memcached server failure no longer causes any further impact.
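The idea of storing shared keys on several servers and reading from one of them can be sketched as below. The article does not give mixi's actual naming convention or library, so everything here is hypothetical: the "shared:" key prefix, the replica count of 3, the helper names, and the in-memory hashes that stand in for memcached servers. crc32() comes from the core Compress::Zlib module.

```perl
use strict;
use warnings;
use Compress::Zlib;    # core module, used here for crc32()

# In-memory hashes stand in for memcached servers in this sketch.
my @servers = map { +{} } 1 .. 4;
my $COPIES  = 3;              # assumed number of replicas for shared data
my $SHARED  = qr/^shared:/;   # hypothetical naming convention

# Indexes of the servers that should hold a key.
sub servers_for {
    my ($key) = @_;
    my $first = crc32($key) % scalar @servers;
    my $count = $key =~ $SHARED ? $COPIES : 1;
    return map { ($first + $_) % scalar @servers } 0 .. $count - 1;
}

# Write the value to every server responsible for the key.
sub cache_set {
    my ($key, $value) = @_;
    $servers[$_]{$key} = $value for servers_for($key);
    return;
}

# Read from one randomly chosen replica, spreading the read load.
sub cache_get {
    my ($key) = @_;
    my @candidates = servers_for($key);
    my $chosen     = $candidates[ int(rand(scalar @candidates)) ];
    return $servers[$chosen]{$key};
}

cache_set('shared:news', 'headline');
cache_set('user:42',     'profile');
printf "shared:news -> %s\n", cache_get('shared:news');
```

A real implementation would also skip replicas that are currently unreachable when reading; this sketch leaves failure handling out.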

memcached operational experience

That concludes the introduction to memcached's internals and client libraries. Next are some other lessons from operating it.

Starting with daemontools

memcached normally runs quite stably, but with 1.2.5, the latest version mixi currently uses, the memcached process has died several times. The architecture ensures that the service is unaffected even if several memcached servers fail, but a server whose memcached process has died works normally again as soon as memcached is restarted. We therefore monitor the memcached process and restart it automatically, using daemontools.

daemontools is a set of UNIX service management tools developed by D. J. Bernstein (djb), the author of qmail. Its supervise program starts services and restarts them when they stop.

· daemontools

Installing daemontools is not covered here. mixi starts memcached with the following run script.

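    #!/bin/sh

    # Load PORT, USER, CACHESIZE, MAXCONN and OPTIONS if the config file exists.
    if [ -f /etc/sysconfig/memcached ]; then
        . /etc/sysconfig/memcached
    fi

    # Send stderr to stdout, then replace the shell with memcached.
    exec 2>&1
    exec /usr/bin/memcached -p $PORT -u $USER -m $CACHESIZE -c $MAXCONN $OPTIONS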


Monitoring

mixi monitors memcached with the open-source monitoring software Nagios.

· Nagios: Home

Plugins are easy to develop for Nagios, so it would be possible to monitor memcached operations such as get and add in detail. mixi, however, only checks memcached's running state with the stats command.

    define command {
        command_name  check_memcached
        command_line  $USER1$ ... 11211 ... 5 ... 'stats\r\nquit\r\n' ... 'uptime' -M crit
    }

In addition, mixi turns the output of the stats command into graphs with RRDtool for performance monitoring, and compiles a daily memory usage report that is shared with developers by email.
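Before stats output can be graphed or reported, it has to be parsed. memcached's text protocol answers the stats command with lines of the form "STAT name value" followed by "END", so a parser can be sketched as below. The parse_stats helper is hypothetical and the sample reply uses made-up values; how mixi feeds the numbers into RRDtool is not described in the article and is not shown.

```perl
use strict;
use warnings;

# Parse the text reply of memcached's "stats" command into a hash reference.
sub parse_stats {
    my ($reply) = @_;
    my %stats;
    for my $line (split /\r?\n/, $reply) {
        last if $line eq 'END';                          # end of the reply
        $stats{$1} = $2 if $line =~ /^STAT (\S+) (.*)$/;
    }
    return \%stats;
}

# Sample reply with made-up values, in the protocol's "STAT name value" form.
my $reply = join "\r\n",
    'STAT uptime 86400',
    'STAT curr_connections 120',
    'STAT bytes 1048576',
    'STAT limit_maxbytes 3145728000',
    'END', '';

my $stats = parse_stats($reply);
printf "uptime=%ds connections=%d memory used=%.2f%%\n",
    $stats->{uptime}, $stats->{curr_connections},
    100 * $stats->{bytes} / $stats->{limit_maxbytes};
```

The ratio of bytes to limit_maxbytes is the kind of figure a daily memory usage report could be built on.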

memcached performance

As the series has shown, memcached performs very well. Here is a real example from mixi. The graphs below are from the memcached server on which the service's accesses are most concentrated.

Figure 2 Number of requests

Figure 3 Traffic

Figure 4 Number of TCP connections

From top to bottom: number of requests, traffic, and number of TCP connections. Requests peak at 15,000 qps and traffic reaches 400 Mbps, at which point the number of connections exceeds 10,000. This server has no special hardware; it is the ordinary memcached server described at the beginning. CPU utilization at that time is:

Figure 5 CPU utilization

As you can see, there is still idle capacity. memcached's performance is thus very high, and web application developers can safely use it to store temporary data and cache data.

Compatible applications

memcached's implementation and protocol are very simple, so there are many memcached-compatible implementations. Some add powerful extensions, such as writing memcached's in-memory data to disk for persistence and redundancy. As described in the third installment, memcached's storage layer is due to become pluggable and will gradually support such features.

Here are a few memcached-compatible applications.

repcached: a patch that adds replication to memcached.
Flared: stores data in QDBM. Also implements asynchronous replication and failover.
memcachedb: stores data in Berkeley DB. Also implements a message queue.
Tokyo Tyrant: stores data in Tokyo Cabinet. In addition to the memcached protocol, it can also be accessed over HTTP.

Tokyo Tyrant case

Of the compatible applications above, mixi uses Tokyo Tyrant. Tokyo Tyrant is the network interface to Tokyo Cabinet, the DBM developed by Mikio Hirabayashi. It has its own protocol, but it also speaks a memcached-compatible protocol and can exchange data over HTTP. Although Tokyo Cabinet writes data to disk, it is very fast.

mixi does not use Tokyo Tyrant as a cache server but as a DBMS for storing key-value pairs, mainly as the database that records each user's last access time. That data is tied to almost every mixi service and is updated every time a user views a page, so the load is quite high. Handling this in MySQL is cumbersome, and storing it only in memcached risks losing data, so Tokyo Tyrant was introduced. No new client had to be developed: Cache::Memcached::Fast can be used as is, which is one of its advantages.
For more information about Tokyo Tyrant, see:

· mixi Engineers' Blog: Building a High-Load-Tolerant DB with Tokyo Tyrant

· mixi Engineers' Blog: New Features of Tokyo (Cabinet|Tyrant)

Summary

With this installment, the "Memcached Comprehensive Analysis" series comes to an end. We covered memcached's basics, internal structure, distribution algorithms, and applications. We would be delighted if reading it has sparked your interest in memcached. For information on mixi's systems and applications, please refer to here. Thank you for reading.

