Memcached is usually described as a distributed cache, but unlike MongoDB it has no built-in clustering: you cannot configure a group of nodes and have data assigned to them automatically.
Memcached nodes do not communicate with each other, so the distribution has to be done on the client side: the user chooses an algorithm that spreads the keys across multiple memcached nodes.
Let's take a look at the commonly used distributed algorithms:
1. Modulo algorithm:
The first idea that comes to mind is the modulo algorithm: number the n nodes from 0 to n-1, hash the key, and take the hash modulo n. If the remainder is i, the key is stored on server i.
In other words, the server is chosen as hash % n. The drawback is that when a server is added or removed, most cached keys map to a different server and have to be loaded again, and the keys may also be distributed unevenly (one server may get many keys while others get very few).
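The modulo rule can be sketched in a few lines of PHP. This is only an illustration: the function name pickServer and the server addresses are made up, and crc32() is assumed as the hash function.

```php
<?php
// Hypothetical helper: choose a server index for a key using hash % n.
function pickServer(string $key, int $serverCount): int
{
    // Assumes 64-bit PHP, where crc32() always returns a non-negative integer.
    return crc32($key) % $serverCount;
}

// Made-up server addresses, for illustration only.
$servers = ['10.0.0.1:11211', '10.0.0.2:11211', '10.0.0.3:11211'];

$index = pickServer('user:42', count($servers));
echo "user:42 -> {$servers[$index]}\n";
```

The same key always maps to the same index as long as the number of servers stays the same.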
Now let's look at the effect of this algorithm on the cache hit ratio.
Assume there are 8 servers and one of them suddenly goes down, so the modulus changes from 8 to 7.
A key still hits the cache only if its hash gives the same remainder modulo 8 and modulo 7.
In general, mathematically:
Suppose there are N servers and one goes down, leaving N-1 servers.
In every N*(N-1) consecutive hash values, only N-1 give the same result for % N and % (N-1).
So right after a server goes down, the hit rate drops sharply to about (N-1)/(N*(N-1)) = 1/N.
So the more servers there are, the worse the consequences of one going down!
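The count can be checked directly. The sketch below (the function name survivingFraction is made up for this example) loops over a range of integer hash values and counts how many give the same remainder before and after one server is lost.

```php
<?php
// Fraction of hash values in [0, $samples) whose server index is unchanged
// when the number of servers drops from $n to $n - 1.
function survivingFraction(int $n, int $samples): float
{
    $same = 0;
    for ($h = 0; $h < $samples; $h++) {
        if ($h % $n === $h % ($n - 1)) {
            $same++;
        }
    }
    return $same / $samples;
}

// 8 servers -> 7 servers, over one full period of 8 * 7 = 56 hash values.
// 7 of the 56 values match, i.e. 0.125.
echo survivingFraction(8, 56), "\n";
```

This only models the remainder changing; it ignores which server actually failed and how the remaining ones are renumbered.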
2. Consistent hashing algorithm:
An intuitive way to understand consistent hashing:
Imagine the server nodes placed at positions around a clock face. Each key is also mapped to a position on the clock. Starting from that position, move clockwise: the first node you encounter is the server that stores the key.
Question 1: A clock face has only 12 positions. What if I have hundreds of memcached nodes?
A: The clock is just an analogy to aid understanding. In a real application, the nodes are placed on a ring of integers in the range [0, 2^32-1], so there is room for all the servers in the world.

Question 2: How do I turn a node name or a key name into an integer?
A: You can use an existing function such as crc32(). You can also design your own conversion rule, but make sure its collision rate is low, that is, the probability that different node names are converted to the same integer should be small.
Now, what is the impact when a node goes down?
When a node goes down, only the next node clockwise is affected: it takes over the failed node's keys, and the other nodes are unaffected. Consistent hashing therefore minimizes the redistribution of keys.
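This property can be checked with a small experiment. The sketch below is an assumption-laden illustration, not a production client: the names ringPos, buildRing and lookup, the node names and the test keys are all made up, and each server gets exactly one position on the ring.

```php
<?php
// Ring position of a node name or key.
// Assumes 64-bit PHP, where crc32() always returns a non-negative integer.
function ringPos(string $s): int
{
    return crc32($s);
}

// Build a map "ring position => server name", sorted by position.
function buildRing(array $servers): array
{
    $ring = [];
    foreach ($servers as $server) {
        $ring[ringPos($server)] = $server;
    }
    ksort($ring, SORT_NUMERIC);
    return $ring;
}

// Walk clockwise: return the first node at or after the key's position,
// wrapping around to the first node on the ring.
function lookup(array $ring, string $key): string
{
    $point = ringPos($key);
    foreach ($ring as $pos => $server) {
        if ($point <= $pos) {
            return $server;
        }
    }
    return reset($ring);
}

$before = buildRing(['node1', 'node2', 'node3', 'node4']);
$after  = buildRing(['node1', 'node2', 'node4']); // node3 goes down

$moved = 0;          // keys whose server changed
$movedFromOther = 0; // keys that changed server but were NOT on node3
for ($i = 0; $i < 1000; $i++) {
    $key = "key$i";
    $old = lookup($before, $key);
    if ($old !== lookup($after, $key)) {
        $moved++;
        if ($old !== 'node3') {
            $movedFromOther++;
        }
    }
}
echo "moved: $moved, moved from surviving nodes: $movedFromOther\n";
```

Only keys that were stored on the failed node change servers, so the second counter stays at zero.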
However, if node 6 goes down, for example, all of its load is transferred to node 7, which puts node 7 under particularly heavy pressure. Can the load of node 6 be spread over all the remaining nodes instead?
This is where the concept of virtual nodes comes in:
With n real nodes, each real node is mapped to m virtual nodes, and the resulting m*n virtual nodes are hashed onto the ring. The virtual nodes that belong to different real nodes end up interleaved with each other.
Thus, when a real node goes down, its load is shared roughly evenly among all the other nodes.
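To see where the keys of a failed node end up once virtual nodes are used, something like the following can be tried. It is a rough sketch under stated assumptions: the function names, the server names a to d, the 32 virtual nodes per server and the "server-i" naming of virtual nodes are all choices made for this example.

```php
<?php
// Ring position; assumes 64-bit PHP, where crc32() is non-negative.
function vPos(string $s): int
{
    return crc32($s);
}

// Place $replicas virtual nodes per server on the ring, sorted by position.
function buildVirtualRing(array $servers, int $replicas): array
{
    $ring = [];
    foreach ($servers as $server) {
        for ($i = 1; $i <= $replicas; $i++) {
            $ring[vPos("$server-$i")] = $server;
        }
    }
    ksort($ring, SORT_NUMERIC);
    return $ring;
}

// First virtual node at or after the key's position, wrapping around.
function vLookup(array $ring, string $key): string
{
    $point = vPos($key);
    foreach ($ring as $pos => $server) {
        if ($point <= $pos) {
            return $server;
        }
    }
    return reset($ring);
}

$before = buildVirtualRing(['a', 'b', 'c', 'd'], 32);
$after  = buildVirtualRing(['a', 'b', 'd'], 32); // server c goes down

// For every key that lived on c, record which server takes it over.
$receivers = [];
for ($i = 0; $i < 10000; $i++) {
    $key = "key$i";
    if (vLookup($before, $key) === 'c') {
        $new = vLookup($after, $key);
        $receivers[$new] = ($receivers[$new] ?? 0) + 1;
    }
}
print_r($receivers);
```

The printed counts show how the orphaned keys are split among the surviving servers. How even the split is depends on the hash function and on the number of virtual nodes per server.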
That covers the theory of consistent hashing. Now, how do we implement it?
Here is an implementation in PHP:
<?php

class Consistent {
    // Ring position => server name, kept sorted by position
    protected $_nodes = array();

    // Convert a string to an unsigned 32-bit integer
    public function _hash($str) {
        return sprintf("%u", crc32($str));
    }

    // Find the server responsible for $key
    public function find($key) {
        $point = $this->_hash($key);
        // Default to the first node on the ring (wrap-around case)
        $pos = reset($this->_nodes);

        // Walk clockwise to the first node at or after the key's position
        foreach ($this->_nodes as $k => $v) {
            if ($point <= $k) {
                $pos = $v;
                break;
            }
        }
        return $pos;
    }

    // Add a server to the ring as 32 virtual nodes
    public function addServer($server) {
        for ($i = 1; $i <= 32; $i++) {
            $pos = $this->_hash($server . '-' . $i);
            $this->_nodes[$pos] = $server;
        }

        ksort($this->_nodes, SORT_REGULAR);
    }

    public function printNodes() {
        print_r($this->_nodes);
    }
}

$cons = new Consistent();

$cons->addServer('a');
$cons->addServer('b');
$cons->addServer('c');

echo $cons->_hash('name') . "<br/>";
echo 'Should fall in ' . $cons->find('name') . "<br/>";

$cons->printNodes();
?>