Although memcached is called a "distributed" cache server, it has no distributed functionality on the server side. Each server is a fully independent, isolated service. The distribution is implemented entirely by the client library, and this client-side distribution is memcached's most distinctive feature.
Distributed principle
The word "distributed" has been used many times here without being explained in detail. Here is a brief introduction to the principle, which is basically the same across client implementations.
The following assumes three memcached servers, node1 to node3, and an application that stores data under the keys "Tokyo", "Kanagawa", "Chiba", "Saitama" and "Gunma".
Figure 1 Introduction to distribution: preparation
First, add "Tokyo" to memcached. When "Tokyo" is passed to the client library, the algorithm implemented in the client determines, from the key, which memcached server will hold the data. Once the server is selected, the client sends it a command to save "Tokyo" and its value.
Figure 2 Introduction to distribution: adding data
Similarly, for "Kanagawa", "Chiba", "Saitama" and "Gunma", the server is selected first and then the data is saved.
Next, retrieve the saved data. The key to fetch, "Tokyo", is again passed to the library. The library selects the server from the key using the same algorithm as when the data was saved. Because the algorithm is the same, it selects the same server the data was saved to, and then sends a get command. As long as the data has not been deleted for some reason, the saved value is returned.
Figure 3 Introduction to distribution: getting data
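The save and get flow above can be sketched roughly as follows. This is only a toy model, not a real memcached client: the `TinyClient` class, the plain dictionaries standing in for node1 to node3, and the CRC32-modulo selection rule are all assumptions made for illustration.

```python
import zlib


class TinyClient:
    """Toy client: all distribution logic lives here, none on the servers."""

    def __init__(self, servers):
        # servers maps a node name to a dict standing in for one memcached node.
        self.servers = servers
        self.names = sorted(servers)

    def _pick(self, key):
        # Placeholder selection rule (an assumption): CRC32 of the key
        # modulo the number of nodes. Any deterministic rule would do.
        index = zlib.crc32(key.encode("utf-8")) % len(self.names)
        return self.names[index]

    def set(self, key, value):
        # The server is chosen from the key, then the value is stored there.
        self.servers[self._pick(key)][key] = value

    def get(self, key):
        # The same rule picks the same server, so the value can be found again.
        return self.servers[self._pick(key)].get(key)


servers = {"node1": {}, "node2": {}, "node3": {}}
client = TinyClient(servers)
for key in ["Tokyo", "Kanagawa", "Chiba", "Saitama", "Gunma"]:
    client.set(key, key.upper())

print(client.get("Tokyo"))  # TOKYO
for name, store in servers.items():
    print(name, sorted(store))
```

The servers never talk to each other in this sketch; only the client knows which node holds which key.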
In this way memcached achieves distribution by saving different keys to different servers. With more memcached servers, the keys are spread more widely; even if one memcached server fails and cannot be reached, the other caches are unaffected and the system can keep running.
Distributed algorithms
Cache systems commonly use two distribution methods: distribution by remainder calculation and distribution by consistent hashing.
Principle of remainder-based distribution
Put simply, the remainder method distributes keys "according to the remainder of dividing by the number of servers".
1. Compute an integer hash of the incoming key (int hashcode).
2. Divide the hashcode by the number of servers (n) and take the remainder (c = hashcode % n).
3. Select the server with ordinal c among the n servers.
Characteristics
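As a hedged illustration, the three steps above can be sketched in Python as follows; the sketch also counts how many keys change servers when a fourth node is added. The MD5-based `hashcode` function, the node names and the synthetic `key0`, `key1`, ... keys are assumptions for illustration, not the behaviour of any particular memcached client.

```python
import hashlib


def hashcode(key):
    # Step 1: integer hash of the key. MD5 is an assumption, chosen only
    # because it is stable across runs and spreads keys evenly.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def pick_server(key, servers):
    # Step 2: remainder of the hash divided by the number of servers.
    c = hashcode(key) % len(servers)
    # Step 3: the server with ordinal c.
    return servers[c]


def moved_fraction(keys, before, after):
    # Fraction of keys whose server differs between the two server lists.
    moved = sum(1 for k in keys if pick_server(k, before) != pick_server(k, after))
    return moved / len(keys)


three = ["node1", "node2", "node3"]
four = three + ["node4"]
keys = ["key%d" % i for i in range(10000)]

print("Tokyo ->", pick_server("Tokyo", three))
print("moved after adding node4: %.1f%%" % (100 * moved_fraction(keys, three, four)))
```

With an evenly spread hash, a key keeps its server only when hashcode % 3 equals hashcode % 4, which holds for 3 of every 12 residues, so about three quarters of the keys are expected to move.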
The remainder method is simple and disperses data very well, but it has a shortcoming: when a server is added or removed, the cost of reorganizing the cache is significant. After a server is added, the remainders change, so a key no longer maps to the server it was saved on, which lowers the cache hit ratio.
Consistent Hashing algorithm
As I understand it, consistent hashing works as follows, in a nutshell. Construct a virtual ring over a large integer range, with its head (0) joined to its tail (Integer.MAX_VALUE). Then use a hash algorithm to place virtual nodes on the ring (one physical node can be mapped to n virtual nodes, for example 160, 200 or 1000) so that the nodes are spread more evenly around the ring. When a key is requested, compute its hash value with the same hash algorithm, locate the nearest virtual node on the ring, and finally use the mapping from virtual nodes to physical nodes to find the physical node that serves the request.
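A minimal sketch of such a ring, under stated assumptions, might look like this. The `HashRing` class, the MD5-based `ring_hash`, the `node#i` naming of virtual nodes and the default of 160 virtual nodes per physical node are illustrative choices; "nearest" is taken here to mean the first virtual node in the clockwise direction, which is a common convention but an assumption about the description above.

```python
import bisect
import hashlib


def ring_hash(value):
    # Position on the ring in the range 0..Integer.MAX_VALUE (2**31 - 1).
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") & 0x7FFFFFFF


class HashRing:
    def __init__(self, nodes, replicas=160):
        self.replicas = replicas  # virtual nodes per physical node
        self._points = []         # sorted positions of virtual nodes
        self._owner = {}          # position -> physical node
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            point = ring_hash("%s#%d" % (node, i))
            if point in self._owner:
                continue  # rare collision: keep the existing virtual node
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        # First virtual node at or after the key's position, wrapping
        # around to the start of the ring when the end is passed.
        i = bisect.bisect_left(self._points, ring_hash(key))
        if i == len(self._points):
            i = 0
        return self._owner[self._points[i]]


ring = HashRing(["node1", "node2", "node3"])
for key in ["Tokyo", "Kanagawa", "Chiba", "Saitama", "Gunma"]:
    print(key, "->", ring.lookup(key))
```

Removing a node only reassigns the keys that node owned; keys owned by the remaining nodes keep their server, because their nearest virtual node is still on the ring.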
There are many introductions online with plenty of diagrams, so I will not reproduce them here. Here is a link instead:
http://blog.csdn.net/sparkliang/article/details/5279393
In addition, the company's existing projects also use consistent hashing for table positioning, cache positioning and so on. An implementation of the algorithm is also available in the project.
Characteristics
1. The algorithm is more cumbersome to implement, because a virtual ring has to be built.
2. It addresses the hit-ratio drop that the remainder method suffers when nodes are added. In theory, inserting one physical node affects on average only about (number of virtual nodes)/2 ring segments' worth of cached data.
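To make the second point concrete, the following sketch measures how many keys change servers when a fourth node joins a consistent-hash ring. It is only a rough experiment under assumptions: the MD5-based `h31` hash, 160 virtual nodes per physical node, the `node#i` naming and the synthetic keys are all illustrative. With evenly spread hashes, the new node is expected to take over roughly one quarter of the keys.

```python
import bisect
import hashlib


def h31(text):
    # Ring position in the range 0..2**31 - 1 (MD5 is an assumption).
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") & 0x7FFFFFFF


def build_ring(nodes, replicas=160):
    # Sorted list of (position, physical node), one entry per virtual node.
    return sorted((h31("%s#%d" % (n, i)), n) for n in nodes for i in range(replicas))


def ring_lookup(ring, key):
    # First virtual node at or after the key's position, wrapping around.
    i = bisect.bisect_left(ring, (h31(key), ""))
    return ring[i % len(ring)][1]


keys = ["key%d" % i for i in range(10000)]
ring3 = build_ring(["node1", "node2", "node3"])
ring4 = build_ring(["node1", "node2", "node3", "node4"])

moved = [k for k in keys if ring_lookup(ring3, k) != ring_lookup(ring4, k)]
print("moved after adding node4: %.1f%%" % (100 * len(moved) / len(keys)))
# Every key that moved now lives on the new node; no key is shuffled
# between the three original nodes.
print(all(ring_lookup(ring4, k) == "node4" for k in moved))  # True
```

Keys that do not move still hit their old server, so only the share taken over by the new node starts out as cache misses.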
Reference: http://tech.idv2.com/2008/07/10/memcached-001/
Memcached study notes, part 5: memcached distributed implementation