Understanding the consistent hashing algorithm for distributed caches

Source: Internet
Author: User

Today I read the section on the consistent hashing algorithm for distributed caches in a book on large-scale website technical architecture. A large distributed system needs a cache, and in a clustered environment cache nodes have to be added from time to time. When that happens, the cache servers must keep the cache hit rate as high as possible, so the way keys are mapped to nodes matters a great deal.

The common method is to hash the key and take the remainder modulo the number of cache machines. Take a cluster of 3 distributed cache nodes as an example:

Hash the key of the data and take the hash value modulo 3: a remainder of 0 goes to node0, a remainder of 1 goes to node1, and a remainder of 2 goes to node2.

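If you add a node and grow the cluster to 4, the modulus changes from 3 to 4, and keys on node0, node1, and node2 are all reshuffled, not only the ones that now land on node3. A key stays on its node only if its hash has the same remainder modulo 3 and modulo 4, which holds for 3 of every 12 hash values, so about 75% of lookups miss and the hit rate is only about 25%.

If 2 nodes are added, the modulus becomes 5, and only 3 of every 15 hash values keep the same remainder, so the hit rate drops to about 20%.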

With the common method, the more nodes the cluster has when a node is added, the lower the hit rate afterwards, and the extra cache misses put greater load on the database.
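The effect of resizing under modulo placement can be checked with a short Python sketch. The hash function (the first four bytes of MD5), the helper names, and the generated key names are assumptions made for illustration, not something taken from the book. The sketch measures the share of keys that still map to the same node index after the cluster grows from 3 nodes to 4 and to 5.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 32-bit hash: first 4 bytes of MD5 (an assumed choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


def node_for(key: str, node_count: int) -> int:
    """Modulo placement: the remainder selects the node index."""
    return stable_hash(key) % node_count


def hit_rate_after_resize(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys that still map to the node they were cached on."""
    same = sum(1 for k in keys if node_for(k, old_count) == node_for(k, new_count))
    return same / len(keys)


# Hypothetical key names, only used to sample the hash space.
keys = [f"key{i}" for i in range(100_000)]
rate_3_to_4 = hit_rate_after_resize(keys, 3, 4)
rate_3_to_5 = hit_rate_after_resize(keys, 3, 5)
print(f"3 -> 4 nodes: {rate_3_to_4:.1%} of keys still hit")
print(f"3 -> 5 nodes: {rate_3_to_5:.1%} of keys still hit")
```

A key keeps its node only when the two remainders agree, so the measured shares should come out close to 3/12 and 3/15 respectively.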

  

Using a consistent hash algorithm:

Construct an integer ring covering 0 to 2^32 - 1, then hash the name of each node, such as node0, and place the node on the ring at the position given by its hash value.

Then hash the key and compare its hash value with the node positions:

Example: node0's hash value is 432323232 and node1's hash value is 879798098. If the hash value of key1 is 559798098, it is greater than node0's hash value, so we move clockwise from the key's position. The first node we reach is node1, so key1 is stored in the cache on node1.
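The clockwise lookup in this example can be sketched in Python with a sorted list and a binary search. The node positions and the key hash are the numbers from the example; the function name `locate` and the dictionary layout are assumptions for illustration.

```python
import bisect

RING_SIZE = 2 ** 32  # positions run from 0 to 2^32 - 1

# Ring positions taken from the example in the text.
node_positions = {"node0": 432323232, "node1": 879798098}


def locate(key_hash: int, positions: dict) -> str:
    """Return the first node at or after key_hash, moving clockwise."""
    ring = sorted((pos, name) for name, pos in positions.items())
    points = [pos for pos, _ in ring]
    idx = bisect.bisect_left(points, key_hash % RING_SIZE)
    if idx == len(ring):
        idx = 0  # walked past the last node: wrap around to the first one
    return ring[idx][1]


owner = locate(559798098, node_positions)
print(owner)  # prints "node1"
```

A key whose hash is larger than every node position, for example 900000000, wraps around the ring and lands on node0.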

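After expansion, the cluster grows from 3 nodes to 4.

Suppose node3 is inserted between node2 and node0. The arc between node2 and node0, whose data was previously all stored on node0, is split in two: keys between node2 and node3 are now stored on node3, while keys between node3 and node0 stay on node0. The arcs node0-node1 and node1-node2 have not changed. On average the new node takes over about a quarter of the keys, so roughly 75% of the cached data can still be hit, and this share grows as the cluster gets larger. The remaining problem is load balance: the new node relieves only node0, so if the original nodes were evenly spaced, node0 and node3 each end up covering about half the range that node1 or node2 covers. This still needs a solution.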
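A minimal Python sketch of the expansion step, assuming node names are hashed onto the ring with the first four bytes of MD5 (the hash choice, helper names, and key names are illustrative assumptions). Because the positions come from hashing, node3 does not necessarily land between node2 and node0 here, but the property it demonstrates is the same: every key that changes owner moves to the new node, and all other keys stay where they were.

```python
import bisect
import hashlib
from collections import Counter


def ring_hash(value: str) -> int:
    """Deterministic position on the 0..2^32-1 ring (assumed hash choice)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:4], "big")


def build_ring(node_names):
    """Sorted list of (position, node name) pairs."""
    return sorted((ring_hash(name), name) for name in node_names)


def owner_of(key: str, ring) -> str:
    """First node clockwise from the key's position, wrapping at the end."""
    points = [pos for pos, _ in ring]
    idx = bisect.bisect_left(points, ring_hash(key))
    return ring[idx % len(ring)][1]


keys = [f"key{i}" for i in range(20_000)]  # hypothetical keys
old_ring = build_ring(["node0", "node1", "node2"])
new_ring = build_ring(["node0", "node1", "node2", "node3"])

moved = [k for k in keys if owner_of(k, old_ring) != owner_of(k, new_ring)]
moved_targets = {owner_of(k, new_ring) for k in moved}
load = Counter(owner_of(k, new_ring) for k in keys)

print(f"{len(moved) / len(keys):.1%} of keys changed owner, targets: {moved_targets}")
print("keys per node after expansion:", dict(load))
```

The per-node counts printed at the end are typically quite uneven with only one ring position per server, which is the load problem described above.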

Introduce virtual nodes: each physical distributed cache server is mapped to n virtual nodes, and these virtual nodes are placed around the ring. Because hash values are scattered irregularly, the virtual nodes of one server land in different regions of the ring. When a new server is added, its virtual nodes are likewise scattered across the ring, so it takes over a small part of the range of every existing server instead of half of one server's range. The load is spread more evenly, and the problem above is solved.

  

According to the book, in practice it is appropriate to map each physical server to about 150 virtual nodes.
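The virtual-node scheme can be sketched in Python as follows. The figure of 150 comes from the text; the `server#i` naming of virtual nodes, the MD5-based hash, and the helper names are assumptions made for illustration, not the book's implementation.

```python
import bisect
import hashlib
from collections import Counter

VIRTUAL_NODES = 150  # per physical server, the figure quoted from the book


def ring_hash(value: str) -> int:
    """Deterministic position on the 0..2^32-1 ring (assumed hash choice)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:4], "big")


def build_ring(servers, replicas=VIRTUAL_NODES):
    """Place `replicas` virtual nodes per server; each points back to its server."""
    ring = [(ring_hash(f"{server}#{i}"), server)  # "server#i" is a hypothetical naming scheme
            for server in servers for i in range(replicas)]
    ring.sort()
    return ring


def owner_of(key: str, ring, points) -> str:
    """Physical server owning the first virtual node clockwise from the key."""
    idx = bisect.bisect_left(points, ring_hash(key))
    return ring[idx % len(ring)][1]


def load_share(keys, servers, replicas=VIRTUAL_NODES):
    """Share of keys stored on each physical server."""
    ring = build_ring(servers, replicas)
    points = [pos for pos, _ in ring]
    counts = Counter(owner_of(k, ring, points) for k in keys)
    return {s: counts[s] / len(keys) for s in servers}


keys = [f"key{i}" for i in range(20_000)]  # hypothetical keys
servers = ["node0", "node1", "node2", "node3"]
shares = load_share(keys, servers)
for server, share in shares.items():
    print(f"{server}: {share:.1%}")
```

Calling `load_share(keys, servers, replicas=1)` gives the single-position ring for comparison; with 150 virtual nodes per server the shares should sit much closer to an even quarter each.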
