Link: http://gihyo.jp/dev/feature/01/memcached/0005
The previous installments are listed here:
- Part 1: http://tech.idv2.com/2008/07/10/memcached-001/
- Part 2: http://tech.idv2.com/2008/07/11/memcached-002/
- Part 3: http://tech.idv2.com/2008/07/16/memcached-003/
- Part 4: http://tech.idv2.com/2008/07/24/memcached-004/
I am Nagano from Mixi. This memcached series is coming to an end. In the previous installments we covered topics directly related to memcached itself; this final article introduces some of Mixi's use cases and several memcached-compatible programs.
- Mixi Case Study
- Server Configuration and Numbers
- Memcached Process
- Memcached Usage and Client Library
- Maintaining Connections with Cache::Memcached::Fast
- Shared Data Handling and Rehashing
- Memcached Application Experience
- Starting via daemontools
- Monitoring
- Memcached Performance
- Compatible Applications
- Summary
Mixi Case Study
Mixi has used memcached since the early days of the service.
As website traffic grew sharply, simply scaling out the database could no longer keep up with demand, so memcached was introduced.
We also verified that it scaled well as we added capacity, and confirmed that memcached's speed and stability met our needs.
Today, memcached is a very important component of the Mixi service.
Figure 1 Current System Components
Server Configuration and Numbers
Mixi uses a large number of servers: database servers, application servers, image servers, reverse proxy servers, and so on. Of these, nearly 200 servers are running memcached.
The typical configuration of the memcached server is as follows:
- CPU: Intel Pentium 4 2.8 GHz
- Memory: 4 GB
- Hard Disk: 146 GB SCSI
- Operating System: Linux (x86_64)
These machines were previously used as database servers. As CPU performance improves and memory prices fall,
we are actively replacing database servers and application servers with more powerful machines that have more memory.
Doing so keeps the total number of servers Mixi uses from growing too quickly, which keeps management costs down.
Because memcached uses almost no CPU, the machines retired in this way are reused as memcached servers.
Memcached Process
Each memcached server runs only one memcached process, with 3 GB of memory allocated to it.
The startup parameters are as follows:
/usr/bin/memcached -p 11211 -u nobody -m 3000 -c 30720
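Here -p specifies the TCP port to listen on, -u the user to run as, -m the maximum amount of memory for item storage in megabytes, and -c the maximum number of simultaneous connections.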
Because the operating system is x86_64, more than 2 GB of memory can be allocated to a single process. On a 32-bit operating system, each process can use at most 2 GB. We also considered running multiple processes with less than 2 GB each, but that would multiply the number of TCP connections per server and make management more complex, so Mixi settled on a 64-bit operating system.
Also, although each server has 4 GB of memory, only 3 GB is allocated, because allocating more than that can cause swapping.
In the 2nd installment I explained memcached's memory storage mechanism, the slab allocator. The memory size specified at startup is the amount memcached can use to store data; it does not include the memory taken by the slab allocator itself or the administrative space used to manage stored items. Keep in mind, therefore, that the actual memory footprint of the memcached process is larger than the specified capacity.
Most of the data Mixi stores in memcached is relatively small, which makes the process size that much larger than the specified capacity. We therefore tested repeatedly with different allocation sizes and confirmed that 3 GB does not cause swapping; that is the value used in production today.
Memcached Usage and Client Library
Mixi currently uses its roughly 200 memcached servers as a single pool.
With a capacity of 3 GB per server, this amounts to a huge in-memory database of nearly 600 GB.
The client library used to communicate with the servers is the previously mentioned Cache::Memcached::Fast, and the cache distribution algorithm is, of course, the Consistent Hashing algorithm introduced in the 4th installment.
- Cache::Memcached::Fast - search.cpan.org
How memcached is used at the application level is decided and implemented by the engineers developing each application. However, to avoid reinventing the wheel and repeating past lessons learned with Cache::Memcached::Fast, we provide and use a wrapper module around Cache::Memcached::Fast.
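To make the distribution side concrete, here is a minimal sketch (not Mixi's actual wrapper) of constructing a Cache::Memcached::Fast client with the Ketama consistent-hashing option; the server addresses are placeholders.

use strict;
use warnings;
use Cache::Memcached::Fast;

# Placeholder server list; ketama_points enables the Ketama-style
# consistent-hashing distribution described in the 4th installment.
my $memd = Cache::Memcached::Fast->new({
    servers       => [ '192.168.1.1:11211', '192.168.1.2:11211' ],
    ketama_points => 150,
});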
Maintaining Connections with Cache::Memcached::Fast
Cache::Memcached keeps the connection (file handle) to memcached in a class variable inside the Cache::Memcached package.
In environments such as mod_perl and FastCGI, package variables are not re-initialized on every request as they are under plain CGI; they persist for the lifetime of the process. As a result, the connection to memcached is not torn down, which avoids the overhead of re-establishing TCP connections and prevents TCP port exhaustion caused by repeatedly connecting and disconnecting in a short period.
Cache::Memcached::Fast does not have this behavior built in, so outside of the module we keep the Cache::Memcached::Fast object in a class variable to ensure a persistent connection.
package Gihyo::Memcached;

use strict;
use warnings;
use Cache::Memcached::Fast;

my @server_list = qw/192.168.1.1:11211 192.168.1.1:11211/;
my $fast;  ## used to hold the object

sub new {
    my $self = bless {}, shift;
    if ( !$fast ) {
        $fast = Cache::Memcached::Fast->new({ servers => \@server_list });
    }
    $self->{_fast} = $fast;
    return $self;
}

sub get {
    my $self = shift;
    $self->{_fast}->get(@_);
}
In the example above, the Cache::Memcached::Fast object is stored in the class variable $fast.
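A hypothetical use of this wrapper looks like the following; every instance shares the single Cache::Memcached::Fast object held in $fast, so the connection is reused for the lifetime of the process.

use Gihyo::Memcached;

my $memd  = Gihyo::Memcached->new;   # reuses the object kept in the class variable
my $value = $memd->get('foo');       # delegated to Cache::Memcached::Fast->get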
Shared Data Handling and Rehashing
Cached data shared by all users, such as the news on the Mixi home page and configuration information, is used on a great many pages and receives a very large number of accesses. Under these conditions, accesses tend to concentrate on a single memcached server.
The concentration of access is not itself a problem, but if the server on which accesses are concentrated fails and memcached can no longer be reached, it causes a huge problem.
As mentioned in the 4th installment, Cache::Memcached has a rehash feature: when the server holding a piece of data cannot be connected to, the hash is recalculated and another server is used instead.
Cache::Memcached::Fast does not have this feature, but it can stop trying to connect to a server for a short time after connections to it fail.
my $fast = Cache::Memcached::Fast->new({
max_failures => 3,
failure_timeout => 1
});
If connections fail max_failures times within failure_timeout seconds, that memcached server is no longer connected to. Our setting is 3 or more failures within 1 second.
In addition, Mixi defined a naming rule for the keys of cache data shared by all users. Data whose key matches the rule is automatically stored on multiple memcached servers, and reads pick just one of those servers. Once this library was in place, the failure of a memcached server no longer had any further impact; a minimal sketch of the idea follows.
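The sketch below illustrates the approach, but it is not Mixi's actual library; the package name, server addresses, and the "shared:" key prefix are hypothetical. Keys matching the naming rule are written to every server in a small dedicated list, and reads pick one server at random, so losing a single server has little effect.

package Gihyo::Memcached::Shared;   # hypothetical package name

use strict;
use warnings;
use Cache::Memcached::Fast;

my @shared_servers = qw/192.168.1.1:11211 192.168.1.2:11211 192.168.1.3:11211/;
# one client per server so a shared key can be written everywhere
my @clients = map { Cache::Memcached::Fast->new({ servers => [$_] }) } @shared_servers;

sub is_shared_key { return $_[0] =~ /^shared:/ }    # hypothetical naming rule

sub set {
    my ( $key, $value ) = @_;
    die "not a shared key" unless is_shared_key($key);
    $_->set( $key, $value ) for @clients;           # store on every server in the pool
}

sub get {
    my ($key) = @_;
    die "not a shared key" unless is_shared_key($key);
    my $client = $clients[ int rand @clients ];     # read from any one server
    return $client->get($key);
}

1;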
Memcached Application Experience
So far we have introduced memcached's internal structure and the client library; next, some of our operational experience.
Starting via daemontools
Normally memcached runs quite stably, but with 1.2.5, the latest version currently in use at Mixi, the memcached process has died on us several times.
The architecture guarantees that the service is not affected even if a few memcached servers fail. Still, for a server on which the memcached process has died, simply restarting memcached is enough to bring it back, so we monitor the memcached process and restart it automatically. For this we use daemontools.
daemontools is a collection of UNIX service-management tools by djb, the author of qmail; its supervise program is used to start and stop a service.
We will not cover installing daemontools here. Mixi starts memcached with the following run script.
#!/bin/sh
if [ -f /etc/sysconfig/memcached ];then
. /etc/sysconfig/memcached
fi
exec 2>&1
exec /usr/bin/memcached -p $PORT -u $USER -m $CACHESIZE -c $MAXCONN $OPTIONS
Monitoring
Mixi monitors memcached with the open-source monitoring software Nagios.
Nagios makes it easy to write plug-ins, so memcached operations such as get and add could be monitored in detail, but Mixi only goes as far as using the stats command to check memcached's running state.
define command {
command_name check_memcached
command_line $USER1$/check_tcp -H $HOSTADDRESS$ -p 11211 -t 5 -E -s 'stats\r\nquit\r\n' -e 'uptime' -M crit
}
In addition, Mixi graphs the output of the stats command with rrdtool for performance monitoring, and a daily report on memory usage is shared with the developers by email.
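As an illustration of how such numbers can be collected, here is a minimal sketch (not Mixi's actual script) that sends the stats command over a raw socket and extracts a few values that could then be fed to rrdtool; the host address is a placeholder.

use strict;
use warnings;
use IO::Socket::INET;

# connect to a (placeholder) memcached server and issue the stats command
my $sock = IO::Socket::INET->new(
    PeerAddr => '192.168.1.1',
    PeerPort => 11211,
    Proto    => 'tcp',
) or die "cannot connect: $!";

print $sock "stats\r\n";
my %stats;
while ( my $line = <$sock> ) {
    last if $line =~ /^END/;
    # each response line looks like "STAT bytes 123456"
    $stats{$1} = $2 if $line =~ /^STAT (\S+) (\S+)/;
}
print $sock "quit\r\n";

printf "items: %s  bytes: %s  curr_connections: %s\n",
    $stats{curr_items}, $stats{bytes}, $stats{curr_connections};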
Memcached Performance
The performance of memcached has already been covered earlier in this series; here, let's look at Mixi's actual numbers.
The graphs shown below are from the memcached server that receives the most concentrated access in the service.
Figure 2 Request count
Figure 3 Traffic
Figure 4 Number of TCP connections
From top to bottom, the graphs show requests, traffic, and TCP connections. The peak request rate is 15,000 qps, traffic reaches about 400 Mbps, and the number of connections exceeds 10,000.
This server has no special hardware; it is an ordinary memcached server of the kind described at the beginning of this article.
The CPU usage is as follows:
Figure 5 CPU utilization
As you can see, the CPU still has plenty of idle capacity. memcached does indeed deliver very high performance, so web application developers can use it with confidence as a place to keep temporary data and cached data.
Compatible Applications
Because memcached's implementation and protocol are very simple, there are many memcached-compatible implementations. Some powerful extensions write memcached's in-memory data to disk, providing persistence and redundancy.
As mentioned in the 3rd installment, memcached's storage layer is set to become pluggable, and support for these features will gradually appear.
Here are a few memcached-compatible applications.
- repcached: A patch that adds replication to memcached.
- Flared: Stores data in QDBM. Also implements asynchronous replication and failover.
- memcachedb: Stores data in BerkeleyDB. Also implements message queues.
- Tokyo Tyrant: Stores data in Tokyo Cabinet. Not only is it compatible with the memcached protocol, it can also be accessed over HTTP.
Tokyo Tyrant Case Study
Of the compatible applications above, Mixi uses Tokyo Tyrant. Tokyo Tyrant is a network interface to the Tokyo Cabinet DBM, developed by Mikio Hirabayashi. It has its own protocol, but it also speaks a memcached-compatible protocol, and data can be exchanged over HTTP as well. Although Tokyo Cabinet writes data to disk, it is still quite fast.
Mixi does not use Tokyo Tyrant as a cache server but as a DBMS for storing key-value pairs, mainly as the database that holds each user's last access time. This data is relevant to almost all Mixi services and must be updated every time a user accesses a page, so the load is quite high. Handling it in MySQL was cumbersome, and storing it in memcached alone risked losing the data, so Tokyo Tyrant was introduced.
One of its advantages is that no new client had to be developed: Cache::Memcached::Fast can be used completely unchanged, as the sketch after the links below illustrates. For more information about Tokyo Tyrant, see our company's development blog.
- Mixi Engineers' Blog - Building a high-load-tolerant database with Tokyo Tyrant
- Mixi Engineers' Blog - Tokyo (Cabinet|Tyrant)
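As a rough sketch of what was described above (not Mixi's actual code), the existing Cache::Memcached::Fast client can simply be pointed at a Tokyo Tyrant instance; the address below is a placeholder, 1978 is Tokyo Tyrant's default port, and the key scheme is hypothetical.

use strict;
use warnings;
use Cache::Memcached::Fast;

# Tokyo Tyrant speaks the memcached protocol, so the same client works unchanged.
my $tt = Cache::Memcached::Fast->new({
    servers => [ '192.168.1.10:1978' ],
});

# e.g. record and read back a user's last access time
$tt->set( 'lastaccess:12345', time() );
my $last = $tt->get('lastaccess:12345');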
Summary
With this, the "Comprehensive Analysis of memcached" series comes to an end. We have introduced memcached's basics, its internal structure, its distribution algorithm, and its applications. We would be honored if this series has sparked your interest in memcached.
For more about Mixi's systems and application development, see our company's development blog.
Thank you for reading.