Implementing memcached (MC) distribution algorithms, with a summary

Source: Internet
Author: User
Tags: crc32, memcached, node, server

First, memcached is commonly described as a distributed cache system, but unlike MongoDB it does not let you configure multiple nodes and have data assigned to them automatically.

That is to say, memcached nodes do not communicate with each other. Distribution therefore depends on the user: the user designs an algorithm that spreads the data across multiple memcached nodes.

Let's take a look at the commonly used distributed algorithms:

1. Modulo algorithm:

The easiest one to think of is the modulo algorithm: number the n nodes from 0 to n-1, take the key's hash modulo n to get a remainder i, and the key falls on server i.

In other words, the server is chosen by the remainder of the key's hash value divided by the total number of servers, that is, hash % n. The disadvantage of this method is that when a server is added or removed, most of the cache has to be reassigned, and the allocation can also be uneven (one server may be assigned a lot of keys while others get very few).
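The modulo rule described above can be sketched as follows. This is a minimal illustration, not memcached client code: the function name pickServerByModulo() and the server names are hypothetical, and it assumes a 64-bit PHP build, where crc32() never returns a negative number.

```php
<?php
// Minimal sketch of modulo-based key placement.
// pickServerByModulo() and the server names are hypothetical.
function pickServerByModulo(string $key, array $servers): string
{
    // Assumption: 64-bit PHP, so crc32() returns a non-negative integer.
    $hash = crc32($key);

    // hash % n selects one of the n servers, numbered 0 to n-1.
    return $servers[$hash % count($servers)];
}

$servers = ['mc0', 'mc1', 'mc2', 'mc3'];
echo pickServerByModulo('user:42', $servers), "\n";
```

The same key always maps to the same server as long as the server list does not change; the problem discussed next appears as soon as count($servers) changes.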

But let's look at the effect of this algorithm on cache hit ratios:

Assume there are 8 servers and one of them suddenly goes down while running, so the divisor used for the remainder changes from 8 to 7.

Let's take a look at the consequences:

In general, we summarize mathematically:

Suppose there are N servers, which become N-1 servers after one goes down.

Out of every N*(N-1) consecutive hash values, only N-1 give the same result under % N and % (N-1).

So in the short term after a server goes down, the hit rate drops sharply, to about (N-1) / (N*(N-1)) = 1/N.

So the more servers there are, the worse the consequences of one machine going down!
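The count above can be checked with a short script. This is only a sketch: countUnchanged() is a hypothetical helper, and it assumes the surviving servers keep their index numbers after the failure.

```php
<?php
// Count hash values whose server index is unchanged when N servers become N-1.
// countUnchanged() is a hypothetical helper; it assumes surviving servers
// keep their index numbers.
function countUnchanged(int $n): int
{
    $same = 0;
    // One full window of N*(N-1) consecutive hash values.
    for ($h = 0; $h < $n * ($n - 1); $h++) {
        if ($h % $n === $h % ($n - 1)) {
            $same++;
        }
    }
    return $same;
}

$n = 8;
printf("%d of %d hash values keep their server\n", countUnchanged($n), $n * ($n - 1));
// prints "7 of 56 hash values keep their server"
```

For the 8-server example, only the hash values 0 through 6 in each window of 56 still map to the same index.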

Next, let's discuss the consistent hashing algorithm:

An intuitive way to understand consistent hashing:

Place each server node at some position on a clock face, and map each key to a position on the same clock face. Starting from the key's position, move clockwise; the key falls on the first node it reaches, wrapping around past the top of the clock if necessary.

Question 1: The hand on a clock only goes up to 11. What if I have hundreds of memcached nodes?
A: The clock is just an analogy to aid understanding. In an actual application, we can distribute the nodes on a ring of integers [0, 2^32-1]. That way there is room for servers from all over the world.

Question 2: How do I turn a "node name" or a "key name" into an integer?
A: You can use an existing function, such as crc32(). You can also design the conversion rule yourself, but pay attention to keeping the collision rate low after conversion, that is, the probability that different node names convert to the same integer should be low.
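As a small illustration of the answer to Question 2, the sketch below turns names into ring positions with PHP's built-in crc32(). The helper name ringPosition() and the sample names are hypothetical, and the code assumes a 64-bit PHP build so that the unsigned value fits in an integer.

```php
<?php
// Map a node name or key name to a position on the ring [0, 2^32 - 1].
// ringPosition() is a hypothetical helper name; assumes 64-bit PHP.
function ringPosition(string $name): int
{
    // sprintf('%u') renders the checksum as an unsigned decimal string.
    return (int) sprintf('%u', crc32($name));
}

foreach (['node-a', 'node-b', 'name'] as $name) {
    printf("%-8s -> %d\n", $name, ringPosition($name));
}
```

The same function is applied to node names and to key names, so both end up as points on the same ring.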

Now, let's think about what the impact is after a node goes down.

When a node goes down, only the next node clockwise from it is affected; the other nodes are unaffected. Consistent hashing therefore minimizes the redistribution of keys.
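To make this concrete, here is a small sketch of a ring lookup. The ring positions and key positions are hand-picked hypothetical numbers rather than real CRC32 values, and locate() is an illustrative helper name.

```php
<?php
// Sketch of a ring lookup with hand-picked positions (hypothetical values,
// not real CRC32 output).
function locate(array $ring, int $point): string
{
    ksort($ring);                     // ascending order = clockwise order
    foreach ($ring as $pos => $node) {
        if ($point <= $pos) {
            return $node;             // first node at or after the key
        }
    }
    return reset($ring);              // past the last node: wrap to the first
}

$ring = [10 => 'A', 20 => 'B', 30 => 'C'];
$keys = [5, 15, 25, 35];

$before = [];
foreach ($keys as $k) {
    $before[$k] = locate($ring, $k);
}

unset($ring[20]);                     // node B goes down

$after = [];
foreach ($keys as $k) {
    $after[$k] = locate($ring, $k);
    printf("key %2d: %s -> %s\n", $k, $before[$k], $after[$k]);
}
```

Only the key at position 15 changes owner, moving from B to C, the next node clockwise. The keys at 5, 25 and 35 stay where they were.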

However, we see that after node 6 goes down, all of its load is transferred to node 7, which puts the node 7 server under particularly heavy pressure. So we ask whether the load of node 6 can instead be spread across the rest of the nodes.

So we introduce the concept of virtual nodes:

With N real nodes, each real node is mapped to M virtual nodes, and the resulting M*N virtual nodes are placed on the ring by hashing. The virtual nodes belonging to different real nodes are interleaved with each other.

Thus, when a real node goes down, its load is shared roughly evenly across all the other nodes.

That covers the theory of consistent hashing. Next, let's consider how to implement it.
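The effect of virtual nodes can be explored with a sketch like the one below. It places 32 virtual nodes per server on the ring, then counts where the keys of a failed server end up. The helpers buildRing() and locate(), the server names, the key names and the figure of 10000 keys are all assumptions made for illustration; the code also assumes a 64-bit PHP build.

```php
<?php
// Sketch: place $replicas virtual nodes per server on the ring, then see
// where a failed server's keys go. buildRing(), locate() and the server
// names are hypothetical; assumes 64-bit PHP (non-negative crc32()).
function buildRing(array $servers, int $replicas): array
{
    $ring = [];
    foreach ($servers as $server) {
        for ($i = 1; $i <= $replicas; $i++) {
            // Each virtual node gets its own position from "name-index".
            $ring[crc32($server . '-' . $i)] = $server;
        }
    }
    ksort($ring);
    return $ring;
}

function locate(array $ring, string $key): string
{
    $point = crc32($key);
    foreach ($ring as $pos => $server) {
        if ($point <= $pos) {
            return $server;
        }
    }
    return reset($ring);              // wrap around to the first virtual node
}

$full    = buildRing(['a', 'b', 'c', 'd'], 32);
$without = buildRing(['a', 'c', 'd'], 32);   // server b is down

$receivers = [];
for ($i = 0; $i < 10000; $i++) {
    $key = 'key' . $i;
    if (locate($full, $key) === 'b') {
        $to = locate($without, $key);
        $receivers[$to] = ($receivers[$to] ?? 0) + 1;
    }
}
print_r($receivers);   // how many of b's keys each surviving server took over
```

Running it prints, for each surviving server, how many of b's keys it inherited. How even the split is depends on the hash values and on the number of virtual nodes per server.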

Here is the code implemented in PHP:

<?php

class Consistent {
    // Ring position => server name, kept sorted by position
    protected $_nodes = array();

    // Generate a numeric value (unsigned CRC32) for a string
    public function _hash($str) {
        return sprintf("%u", crc32($str));
    }

    // Find the server that a key falls on
    public function find($key) {
        $point = $this->_hash($key);
        // Default to the first node, for keys past the last position on the ring
        $pos = reset($this->_nodes);

        foreach ($this->_nodes as $k => $v) {
            if ($point <= $k) {
                $pos = $v;
                break;
            }
        }
        return $pos;
    }

    // Add a server to the ring as 32 virtual nodes
    public function addServer($server) {
        for ($i = 1; $i <= 32; $i++) {
            $pos = $this->_hash($server . '-' . $i);
            $this->_nodes[$pos] = $server;
        }

        ksort($this->_nodes, SORT_REGULAR);
    }

    // Print all virtual nodes in ring order
    public function printNodes() {
        print_r($this->_nodes);
    }
}

$cons = new Consistent();

$cons->addServer('a');
$cons->addServer('b');
$cons->addServer('c');

echo $cons->_hash('name') . "<br/>";
echo 'Should fall in ' . $cons->find('name') . "<br/>";

$cons->printNodes();
?>
