Link: http://gihyo.jp/dev/feature/01/memcached/0005
Here are the links to the other articles in this series:
- Part 1: http://www.phpchina.com/html/29/n-35329.html
- Part 2: http://www.phpchina.com/html/30/n-35330.html
- Part 3: http://www.phpchina.com/html/31/n-35331.html
- Part 4: http://www.phpchina.com/html/32/n-35332.html
- Part 5: http://www.phpchina.com/html/32/n-35333.html
I am Nagano of mixi. This memcached series is coming to an end. So far we have covered topics directly related to memcached itself; this final article introduces some of mixi's real-world cases and practical application topics, as well as some memcached-compatible programs.
Mixi Case Study
- Server Configuration and quantity
- Memcached Process
- Memcached usage and client
- Maintaining connections with Cache::Memcached::Fast
- Public Data Processing and rehash
Memcached Application Experience
- Start with daemontools
- Monitoring
- Memcached Performance
Compatible Applications
Summary
Mixi Case Study
Mixi started using memcached early in the life of the service. As traffic to the site grew sharply, simply adding database slaves could no longer keep up, so memcached was introduced. We also verified its scalability and confirmed that memcached's speed and stability could meet our needs. Today, memcached is a very important component of the mixi service.
Figure 1 Current System Components
Server Configuration and quantity
Mixi uses many kinds of servers, such as database servers, application servers, image servers, and reverse proxy servers. Nearly 200 memcached servers alone are in operation. A typical memcached server configuration is as follows:
CPU: Intel Pentium 4 2.8 GHz
Memory: 4 GB
Hard Disk: 146 GB SCSI
Operating System: Linux (x86_64)
These machines were previously used as database servers. As CPU performance rises and memory prices fall, we actively replace database and application servers with machines that have faster CPUs and more memory. This holds down the growth in the total number of servers mixi uses and reduces management costs. Because memcached servers use almost no CPU, the replaced machines are reused as memcached servers.
Memcached Process
Each memcached server runs only one memcached process, with 3 GB of memory allocated to it. The startup parameters are as follows:
/usr/bin/memcached -p 11211 -u nobody -m 3000 -c 30720
Because the operating system is x86_64, a single process can be allocated 2 GB or more of memory; on a 32-bit operating system each process can use at most 2 GB. We also considered running multiple processes with less than 2 GB each, but that would multiply the number of TCP connections on one server and make management more complex, so mixi chose a 64-bit operating system instead.
In addition, although each server has 4 GB of memory, only 3 GB is allocated, because allocating more than that can cause swapping. In the 2nd installment I explained memcached's memory storage, the "slab allocator"; the memory size specified at startup is the amount memcached uses to store data, and it does not include the overhead of the slab allocator itself or the management space attached to stored items. The actual memory footprint of the memcached process is therefore larger than the specified size.
Most of the data mixi stores in memcached is fairly small, so the per-item overhead is relatively large and the process grows well beyond the specified size. We therefore adjusted the allocated amount repeatedly and verified that 3 GB does not cause swapping; that is the value currently used in production.
Memcached usage and client
Currently, mixi uses about 200 memcached servers as a single pool. With 3 GB per server, this amounts to a huge in-memory database of roughly 600 GB (about 200 servers × 3 GB each). The client library is Cache::Memcached::Fast, mentioned repeatedly in this series, and the cache distribution algorithm is the Consistent Hashing algorithm introduced in the 4th installment.
- Cache::Memcached::Fast - search.cpan.org
How memcached is used at the application layer is decided and implemented by the engineers developing each application. However, to avoid reinventing the wheel and repeating the lessons already learned with Cache::Memcached::Fast, we provide a wrapper module around Cache::Memcached::Fast for everyone to use.
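To make this concrete, here is a minimal sketch of how a Cache::Memcached::Fast client using Consistent Hashing might be set up. The server addresses, the ketama_points value, and the namespace below are illustrative assumptions, not mixi's actual settings.

use strict;
use warnings;
use Cache::Memcached::Fast;

# Hypothetical pool; a real deployment would list every memcached server here.
my $memd = Cache::Memcached::Fast->new({
    servers       => [ '192.168.0.1:11211', '192.168.0.2:11211', '192.168.0.3:11211' ],
    ketama_points => 150,         # enable the Ketama consistent hashing algorithm
    namespace     => 'example:',  # prefix added to every key (assumed, for illustration)
});

$memd->set('news:latest', 'some cached value', 60);  # cache for 60 seconds
my $value = $memd->get('news:latest');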
Maintaining connections with Cache::Memcached::Fast
With Cache::Memcached, the connection (file handle) to memcached is held in a class variable inside the Cache::Memcached package. In environments such as mod_perl and FastCGI, package variables are not reinitialized on every request as they are with CGI, but persist for the lifetime of the process. As a result, the connection to memcached is never closed, which removes the overhead of repeatedly establishing TCP connections and also prevents TCP port exhaustion caused by rapid connect/disconnect cycles.
Cache::Memcached::Fast, however, does not have this behavior built in, so you need to hold the Cache::Memcached::Fast object in a class variable yourself to keep the connection persistent, as in the example below.
package Gihyo::Memcached;

use strict;
use warnings;
use Cache::Memcached::Fast;

my @server_list = qw(192.168.1.1:11211 192.168.1.1:11211);
my $fast;  # used to hold the object

sub new {
    my $self = bless {}, shift;
    if ( !$fast ) {
        $fast = Cache::Memcached::Fast->new({ servers => \@server_list });
    }
    $self->{_fast} = $fast;
    return $self;
}

sub get {
    my $self = shift;
    $self->{_fast}->get(@_);
}
In this example, the Cache::Memcached::Fast object is kept in the class variable $fast.
Public Data Processing and rehash
Cache data and configuration information shared by all users, such as the news on the mixi home page, is used on many pages and accessed very frequently. Under these conditions, accesses easily concentrate on one particular memcached server. The concentration of access itself is not a problem, but if the server receiving that concentrated access fails and memcached becomes unreachable, a serious problem arises.
Cache::Memcached has a rehash feature: when the server holding the data cannot be reached, it recomputes the hash value and connects to another server.
Cache::Memcached::Fast does not have this feature. It can, however, stop trying to connect to a server for a short period after a connection failure:
my $fast = Cache::Memcached::Fast->new({ max_failures => 3, failure_timeout => 1 });
If connections fail max_failures or more times within failure_timeout seconds, that memcached server is no longer connected to. Our setting is 3 or more failures within 1 second.
In addition, mixi has established a naming rule for the keys of cache data shared by all users. Data whose key matches the rule is automatically saved to multiple memcached servers, and on retrieval only one of those servers is chosen. With this library in place, the failure of a single memcached server no longer causes any further impact.
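The following is only a rough sketch of that idea, not mixi's actual library; the "shared:" key prefix, the server addresses, and the module name are assumptions made for illustration.

package Gihyo::Memcached::Shared;   # hypothetical module name, for illustration only

use strict;
use warnings;
use Cache::Memcached::Fast;

# Dedicated handles, one per server, for data shared by all users.
my @shared_servers = map { Cache::Memcached::Fast->new({ servers => [$_] }) }
                     qw(192.168.0.1:11211 192.168.0.2:11211 192.168.0.3:11211);

# Keys that follow the naming rule are treated as shared data.
sub is_shared_key { return $_[0] =~ /^shared:/ }

# Save shared data to every server so that any one of them can serve it later.
sub set_shared {
    my ($key, $value, $expire) = @_;
    $_->set($key, $value, $expire) for @shared_servers;
}

# Retrieve shared data from just one server, chosen at random.
sub get_shared {
    my ($key) = @_;
    my $memd = $shared_servers[ int rand @shared_servers ];
    return $memd->get($key);
}

1;

With something like this, even if one of the servers holding shared data fails, the same key can still be fetched from the remaining ones.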
Memcached Application Experience
So far we have covered memcached's internal structure and client libraries. Next, let us look at some other practical experience.
Start with daemontools
In general, memcached runs very stably, but the version mixi currently uses, 1.2.5, has been killed unexpectedly several times. The architecture ensures that the failure of a few memcached servers does not affect the service, but since a crashed memcached process works normally again once restarted, we monitor the process and restart it automatically. For this we use daemontools.
Daemontools is a set of UNIX service management tools developed by DJB, the author of qmail. A program called supervise is used to start, stop, and restart services.
The installation of daemontools is not described here. Mixi uses the following run script to start memcached.
#!/bin/sh

if [ -f /etc/sysconfig/memcached ]; then
    . /etc/sysconfig/memcached
fi

exec 2>&1
exec /usr/bin/memcached -p $PORT -u $USER -m $CACHESIZE -c $MAXCONN $OPTIONS
Monitoring
Mixi uses an open-source monitoring software named "nagios" to monitor memcached.
Nagios allows custom plug-ins to be developed, which makes it possible to monitor memcached's get, add, and other operations in detail. However, mixi simply uses the stats command to check memcached's running state.
define command {
    command_name    check_memcached
    command_line    $USER1$/check_tcp -H $HOSTADDRESS$ -p 11211 -t 5 -E -s 'stats\r\nquit\r\n' -e 'uptime' -M crit
}
In addition, mixi graphs the output of the stats command with rrdtool for performance monitoring, produces a daily report of memory usage, and shares it with developers by email.
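As a rough illustration of this kind of monitoring, the sketch below connects to memcached, issues the stats command, and computes a memory usage ratio that could be fed to a graphing tool such as rrdtool. The host and port are placeholders; this is not mixi's actual script.

#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# Connect to a memcached instance (address and port are placeholders).
my $sock = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => 11211,
    Proto    => 'tcp',
    Timeout  => 5,
) or die "cannot connect to memcached: $!";

print {$sock} "stats\r\n";

my %stats;
while (my $line = <$sock>) {
    last if $line =~ /^END/;
    # Each line looks like: "STAT <name> <value>"
    $stats{$1} = $2 if $line =~ /^STAT (\S+) (\S+)/;
}
print {$sock} "quit\r\n";
close $sock;

# bytes / limit_maxbytes gives the memory usage ratio to report or graph.
printf "memory used: %.1f%%\n",
    100 * $stats{bytes} / $stats{limit_maxbytes};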
Memcached Performance
We have already discussed memcached's performance earlier in this series, so let us look at a real case from mixi. The graphs shown here are from the memcached server with the most concentrated access among those used by the service.
Figure 2 Request count
Figure 3 Traffic
Figure 4 Number of TCP connections
From top to bottom, the graphs show requests, traffic, and TCP connections. At peak, the request rate is roughly 15000 qps, the traffic about 400 Mbps, and the number of connections exceeds 10000. This server has no special hardware; it is just the ordinary memcached server described at the beginning. Its CPU utilization is as follows:
Figure 5 CPU utilization
As you can see, there is still plenty of idle capacity. memcached offers high performance, and Web application developers can safely use it as a place to store temporary or cached data.
Compatible Applications
The implementation and protocol of memcached are both very simple, so there are many memcached-compatible implementations. Some powerful extensions can write memcached's in-memory data to disk, providing persistence and redundancy. memcached's storage layer is expected to become pluggable and gradually support these features.
Here we will introduce several memcached-compatible applications.
- Repcached: Provides a replication patch for memcached.
- Flared: Stores data in QDBM. Also implements asynchronous replication and failover.
- Memcachedb: Stores data in BerkeleyDB. Also implements message queues.
- Tokyo Tyrant: Stores data in Tokyo Cabinet. Not only is it compatible with the memcached protocol, it can also be accessed over HTTP.
Tokyo Tyrant Case Study
Mixi uses Tokyo Tyrant, one of the compatible applications listed above. Tokyo Tyrant is a network interface to the Tokyo Cabinet DBM developed by Hirabayashi. It has its own protocol, but it also speaks a memcached-compatible protocol and can exchange data over HTTP. Although Tokyo Cabinet writes data to disk, it is still quite fast.
Mixi does not use Tokyo Tyrant as a cache server but as a DBMS for storing key-value pairs, mainly as a database holding each user's last access time. This data is related to almost all mixi services and must be updated every time a user visits a page, so the load is quite high. Handling it with MySQL would be too heavy, while keeping the data only in memcached risks losing it, so Tokyo Tyrant was introduced. There was no need to develop a new client, though: we could simply keep using Cache::Memcached::Fast as-is, which is another one of its advantages. For more information about Tokyo Tyrant, see our company's development blog.
- Mixi Engineers' Blog - Building a high-load-tolerant database architecture with Tokyo Tyrant
- Mixi Engineers' Blog - Tokyo (Cabinet | Tyrant)
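As a rough illustration of the client reuse described above, the sketch below talks to Tokyo Tyrant through its memcached-compatible protocol using Cache::Memcached::Fast. The host name, the port (1978 is Tokyo Tyrant's default), and the key format are assumptions, not mixi's actual configuration.

use strict;
use warnings;
use Cache::Memcached::Fast;

# Point the memcached client at a Tokyo Tyrant server instead of memcached.
my $tt = Cache::Memcached::Fast->new({
    servers => ['tokyotyrant.example.local:1978'],
});

# Store and read back a "last access time" record, the kind of key-value
# data the article says mixi keeps in Tokyo Tyrant. The key format is invented.
$tt->set('last_access:12345', time());
my $last_access = $tt->get('last_access:12345');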
Summary
So far, the "memcached comprehensive analysis" series has ended. We introduced the basics, internal structure, distributed algorithms, and applications of memcached. After reading this article, we will be honored if you are interested in memcached. For more information about mixi systems and applications, see the company's development blog. Thank you for reading this article.
Copyright notice: Reprinting is permitted, but any reprint must credit the original author (charlee), include the original link, and include this statement.