Consistent hashing algorithm

Source: Internet
Author: User

In a distributed system, when a request can be handled by any of several identical nodes, a natural approach is to hash each request onto one of the nodes. With N nodes, the target node is computed as hash(ID) % N.

This is simple and effective when the work is stateless. When the nodes hold per-user state, such as a shopping cart or a memcached cache service, the scheme still appears to work: as long as the same data is used as the ID, the hash result stays the same and the request always reaches the same node. The trouble starts when the number of nodes changes, for example when nodes are added to absorb more traffic or removed because a machine goes down. If 2 nodes are added, the node for an ID changes from hash(ID) % N to hash(ID) % (N + 2), so a large share of IDs are reshuffled onto different nodes. The user data saved on the old node is no longer used, and the new node does not have that data at all. During this period the state of those users is lost: a shopping cart appears empty, and a cache misses on entries it had already stored, so it no longer serves its purpose. Users may have to log in again, and the cache may have to be refilled from the database, which can introduce new problems.
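The reshuffling described above can be measured with a small sketch. The key names (`user-0`, `user-1`, ...), the node counts 4 and 6, and the use of MD5 as the hash function are assumptions made for illustration only; they do not come from the original text.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 32-bit hash: the first 4 bytes of MD5.

    Python's built-in hash() for strings is salted per process,
    so it is not suitable for routing requests across machines.
    """
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


def node_for(key: str, n: int) -> int:
    """Simple modulo placement: hash(ID) % N."""
    return stable_hash(key) % n


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose node index changes when N goes from old_n to new_n."""
    moved = sum(1 for k in keys if node_for(k, old_n) != node_for(k, new_n))
    return moved / len(keys)


# Hypothetical workload: 10,000 user IDs, cluster grows from 4 to 6 nodes.
keys = [f"user-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 6)
print(f"{fraction:.1%} of IDs change node when N goes from 4 to 6")
```

With a uniform hash, an ID keeps its node only when hash(ID) % 4 equals hash(ID) % 6, which holds for 4 of every 12 residues, so roughly two thirds of the IDs are expected to move in this scenario.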

Consistent hashing alleviates this problem to some extent. It works as follows:

1. Organize the entire hash value space into a virtual ring. Assume the hash function H produces values in the range 0 to 2^32 - 1, that is, 32-bit unsigned integers.

2. Hash each node with H, using the server's IP address or hostname as the key, so that each node gets a position on the hash ring.

3. Map each ID to a value in the hash space with the same function H, then walk clockwise along the ring from that value; the first node encountered is the processing node.

For example, if the hash value of an ID falls between the hash values of Node1 and Node2, the request for that ID is handled by Node2.
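The three steps above can be sketched as a sorted list of node positions searched with binary search. This is a minimal illustration, not a reference implementation: the class name `HashRing`, the MD5-based 32-bit hash, and the IP addresses are all assumptions introduced here.

```python
import bisect
import hashlib


def h(key: str) -> int:
    """Hypothetical H: maps a string into the 32-bit ring 0 .. 2^32 - 1."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


class HashRing:
    def __init__(self, nodes):
        # Step 2: place every node on the ring by hashing its IP or hostname.
        self._points = sorted((h(node), node) for node in nodes)
        self._hashes = [point for point, _ in self._points]

    def lookup(self, key: str) -> str:
        # Step 3: find the first node at or after the key's position,
        # wrapping around to the first node when the key is past the last one.
        i = bisect.bisect_left(self._hashes, h(key))
        return self._points[i % len(self._points)][1]


# Example usage with made-up addresses.
nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
ring = HashRing(nodes)
print("user-42 is handled by", ring.lookup("user-42"))
```

Keeping the positions in a sorted list makes each lookup a binary search over the nodes; the modulo on the index is what closes the list into a ring.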

When a node is added, only one adjacent node is affected; all other nodes are unaffected. For example, if Node5 is added between Node2 and Node4, only some of the IDs previously handled by Node4 (those whose hash values fall after Node2 and before Node5) move to Node5. Every other node keeps handling the same IDs as before. This is a great improvement over the simple modulo hashing described at the beginning.
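The claim that adding a node only takes IDs away from one neighbor can be checked with a short sketch. The node names reuse node1 to node5 from the text, but where each one actually lands on the ring depends on the assumed MD5-based hash, so node5 will not necessarily sit between node2 and node4 here. The helper `owner_map` and the key names are hypothetical.

```python
import bisect
import hashlib


def h(key: str) -> int:
    """Hypothetical H: maps a string into the 32-bit ring 0 .. 2^32 - 1."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


def owner_map(nodes, keys):
    """Return {key: node} using the first node at or after each key on the ring."""
    points = sorted((h(node), node) for node in nodes)
    hashes = [point for point, _ in points]
    owners = {}
    for key in keys:
        i = bisect.bisect_left(hashes, h(key)) % len(points)
        owners[key] = points[i][1]
    return owners


keys = [f"user-{i}" for i in range(10_000)]
before = owner_map(["node1", "node2", "node3", "node4"], keys)
after = owner_map(["node1", "node2", "node3", "node4", "node5"], keys)

moved = [k for k in keys if before[k] != after[k]]
old_owners = {before[k] for k in moved}
new_owners = {after[k] for k in moved}

# Every moved key ends up on the new node, and all of them come from one old node.
print(f"{len(moved)} of {len(keys)} keys moved, from {old_owners} to {new_owners}")
```

How many keys move depends on how large an arc the new node happens to claim, which is one reason the virtual nodes discussed next are useful.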

If there are only a few nodes, their positions on the ring may be very unevenly distributed, so some nodes end up handling far more IDs than others and the load is not balanced. This can be solved by introducing virtual nodes: compute several hash values for each real node, and try to ensure that these values are spread evenly over the value space. To find the node for an ID, first find the virtual node on the ring, then map that virtual node to its real node; the lookup simply takes one extra step. This algorithm is already supported by libmemcached, the memcached client library.

The above was compiled from Internet sources as a note. References:

1. http://www.cnblogs.com/liunx/archive/2010/03/24/1693925.html
2. http://blog.csdn.net/crazy_dreamer/article/details/1475413
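As an illustration of the virtual-node idea, the sketch below places each real node on the ring several times and compares the resulting load with and without virtual nodes. The `node#i` naming of virtual nodes, the replica counts, the class name `VirtualNodeRing`, and the MD5-based hash are assumptions for this example; they are not taken from libmemcached or from the sources above.

```python
import bisect
import hashlib
from collections import Counter


def h(key: str) -> int:
    """Hypothetical H: maps a string into the 32-bit ring 0 .. 2^32 - 1."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


class VirtualNodeRing:
    def __init__(self, nodes, replicas: int = 100):
        # Each real node appears `replicas` times on the ring.
        # Every ring entry is (position, real node), so the mapping
        # from virtual node back to real node is stored alongside it.
        self._points = sorted(
            (h(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._hashes = [point for point, _ in self._points]

    def lookup(self, key: str) -> str:
        # Find the first virtual node at or after the key, then return its real node.
        i = bisect.bisect_left(self._hashes, h(key)) % len(self._points)
        return self._points[i][1]


def load(ring: VirtualNodeRing, keys) -> Counter:
    """Count how many keys each real node handles."""
    return Counter(ring.lookup(k) for k in keys)


nodes = ["node1", "node2", "node3", "node4"]
keys = [f"user-{i}" for i in range(10_000)]

print("1 point per node:   ", dict(load(VirtualNodeRing(nodes, replicas=1), keys)))
print("200 points per node:", dict(load(VirtualNodeRing(nodes, replicas=200), keys)))
```

More virtual nodes per real node generally gives a more even split, at the cost of a larger sorted list to build and search.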
