Overview
- Our Memcache client (here I am looking at the Spymemcached source) uses the Ketama consistent hashing algorithm to choose the node that stores each piece of data. A conventional hash algorithm hashes only the key of the data we want to store and uses that hash to assign the data to a node. A consistent hashing algorithm also hashes the servers that will store the data, and then uses both sets of hashes to determine where each key is stored.
- Conventional hashing and its drawbacks
The most common approach is hash modulo. For example, if the cluster has N available machines, a request for the data with key K is simply routed to machine hash(K) mod N. This scheme is simple and practical, but it has drawbacks in fast-growing web systems. As access pressure rises, the cache system has to add machine nodes to raise the cluster's throughput and data capacity. Under hash modulo, adding a node changes N, so most keys map to a different machine: a large part of the cache has to be rebuilt, or the cached data migrated wholesale. In that moment the cache misses fall through to the DB, which puts a very high load on it and can even bring the DB server down.
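To make the drawback concrete, here is a minimal sketch that counts how many sample keys change machine when a hash-modulo cluster grows from 4 to 5 nodes. The helper name `moduloNode`, the sample keys, and the use of `crc32()` as the hash are assumptions chosen for illustration; they are not taken from any particular client.

```php
<?php
// Hypothetical helper (not from the article): route a key with hash(K) mod N.
// Assumes 64-bit PHP, where crc32() always returns a non-negative integer.
function moduloNode(string $key, int $n): int
{
    return crc32($key) % $n;
}

// Count how many of the sample keys land on a different machine
// when the cluster grows from 4 to 5 nodes.
$total = 10000;
$moved = 0;
for ($i = 0; $i < $total; $i++) {
    $key = "key_$i";
    if (moduloNode($key, 4) !== moduloNode($key, 5)) {
        $moved++;
    }
}

printf("%d of %d keys changed node (%.1f%%)\n", $moved, $total, 100 * $moved / $total);
```

With a uniformly distributed hash, a key keeps its machine only when hash mod 4 equals hash mod 5, which happens for about one key in five, so roughly 80% of the cache would be expected to miss right after the resize.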
- How can a consistent hashing algorithm help solve these problems when designing a distributed cache system?
The core requirements of distributed cache design: keys should be distributed evenly across the cache servers, and when a cache server is added, as little cached data as possible should have to migrate.
The Ketama consistent hashing algorithm mentioned here works like this: the choice of machine node does not rely only on the hash of the key being cached; each machine node is itself hashed as well. Keys and nodes then share one hash space, arranged as a ring, and a key is stored on the node whose hash is the first one at or after the key's own hash.
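The idea can be sketched in a few lines. This is an illustrative sketch only, not the Ketama or Spymemcached implementation: the function names, the node names, the choice of `crc32()` and the figure of 64 virtual positions per node are all assumptions. It builds a ring before and after adding a fifth node and counts which sample keys move.

```php
<?php
// Illustrative sketch only; the names below are not from Ketama or Spymemcached.

// Map a string to a position on the ring.
// Assumes 64-bit PHP, where crc32() always returns a non-negative integer.
function ringHash(string $s): int
{
    return crc32($s);
}

// Build the ring: each node is hashed to several positions (virtual nodes).
function buildRing(array $nodes, int $replicas = 64): array
{
    $ring = [];
    foreach ($nodes as $node) {
        for ($i = 0; $i < $replicas; $i++) {
            $ring[ringHash("{$node}_{$i}")] = $node;
        }
    }
    ksort($ring); // positions in ascending order
    return $ring;
}

// A key belongs to the first node position at or after the key's own hash,
// wrapping around to the start of the ring. (Linear scan for brevity.)
function locate(array $ring, string $key): string
{
    $h = ringHash($key);
    foreach ($ring as $position => $node) {
        if ($position >= $h) {
            return $node;
        }
    }
    return reset($ring);
}

$before = buildRing(['cache-a', 'cache-b', 'cache-c', 'cache-d']);
$after  = buildRing(['cache-a', 'cache-b', 'cache-c', 'cache-d', 'cache-e']);

$total = 2000;
$moved = 0;          // keys whose node changed
$movedElsewhere = 0; // keys that moved to a node other than the new one
for ($i = 0; $i < $total; $i++) {
    $key = "key_$i";
    $old = locate($before, $key);
    $new = locate($after, $key);
    if ($old !== $new) {
        $moved++;
        if ($new !== 'cache-e') {
            $movedElsewhere++;
        }
    }
}

printf("%d of %d keys moved; %d of them went somewhere other than the new node\n",
    $moved, $total, $movedElsewhere);
```

Every key that moves ends up on the newly added node, because the existing positions on the ring do not change. The rest of the cache stays where it was, which is the minimal-migration property described above.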
Theory
http://blog.csdn.net/cywosp/article/details/23397179
http://blog.csdn.net/kongqz/article/details/6695417
http://blog.csdn.net/caigen1988/article/details/7708806
Code (PHP)
class ConsistentHash
{
    // Actual nodes (node name => true)
    public $nodes = array();
    // Virtual nodes (hash position => actual node); each actual node has several
    protected $v_nodes = array();
    // Virtual nodes per actual node. The value is garbled in the source; 32 is an assumed placeholder.
    protected $v_mul = 32;

    // Binary search for the first virtual-node position >= $key.
    protected function binary_search($key)
    {
        $arr_key = array_keys($this->v_nodes);
        $arr_num = count($arr_key);
        // At or before the first position, or past the last one: wrap around to the first.
        if ($arr_key[0] >= $key || $arr_key[$arr_num - 1] < $key) {
            return $arr_key[0];
        }
        $low = 0;
        $high = $arr_num - 1;
        while ($low <= $high) {
            $mid = (int) (($low + $high) / 2);
            if ($arr_key[$mid] < $key) {
                if ($arr_key[$mid + 1] >= $key) {
                    return $arr_key[$mid + 1];
                }
                $low = $mid + 1;
            } else if ($arr_key[$mid] > $key) {
                $high = $mid - 1;
            } else {
                return $arr_key[$mid];
            }
        }
        return $arr_key[0];
    }

    // Initialize with a list of node names
    public function __construct($nodes = array())
    {
        foreach ($nodes as $v) {
            $this->addNode($v);
        }
    }

    // Hash function: unsigned CRC32 of the string
    public function hash($str)
    {
        return sprintf('%u', crc32($str));
    }

    // Get the node for a string (a node name or a key name)
    public function getPosition($str)
    {
        $hash = $this->hash($str);
        $node_hash = $this->binary_search($hash);
        return $this->v_nodes[$node_hash];
    }

    // Add a node and its virtual nodes
    public function addNode($node)
    {
        if (isset($this->nodes[$node])) {
            return; // already present
        }
        $this->nodes[$node] = true;
        for ($i = 0; $i < $this->v_mul; $i++) {
            $hash = $this->hash("{$node}_{$i}");
            $this->v_nodes[$hash] = $node;
        }
        ksort($this->v_nodes); // keep positions sorted for the binary search
    }

    // Delete a node and its virtual nodes
    public function deleteNode($node)
    {
        if (!isset($this->nodes[$node])) {
            return; // unknown node
        }
        unset($this->nodes[$node]);
        foreach ($this->v_nodes as $k => $v) {
            if ($v == $node) {
                unset($this->v_nodes[$k]);
            }
        }
    }
}