Consistent hashing algorithm

Source: Internet
Author: User

This article skips the origin and history of the consistent hashing algorithm. It is typically used in distributed caching to decide which of several cache servers holds a given piece of cached data.
As I understand it, a cache hit means the requested data is found on the cache server and returned directly from there. On a miss, the data has to be fetched from the underlying data server, which is comparatively slow. Using cache servers therefore speeds up back-end responses and greatly reduces the load on the data server.

    • General algorithm

Discussions of consistent hashing usually start with the ordinary hashing approach, and this one does too.

Following most examples on the web, suppose there are 3 cache servers: A, B, and C. Each item to be cached is hashed, and the hash is taken modulo the number of servers. The result is the index of the cache server that stores the item.
The formula is:

hash(object) % N

Here N is 3, so the result is always 0, 1, or 2: 0 corresponds to A, 1 to B, and 2 to C.
Under normal conditions this approach works. However, if business volume grows and a cache server has to be added, N changes and most cached items map to a different server, so most of the cache is effectively invalidated. The same happens when a server is removed. If too much cached data becomes invalid at once, traffic to the data server can spike suddenly and bring the data server down.
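
To make the scale of the problem concrete, here is a minimal sketch that counts how many keys change server when N goes from 3 to 4. The MD5-based `stable_hash` helper and the `object0` ... `object9999` key names are assumptions for illustration; the article does not name a hash function.

```python
import hashlib

def stable_hash(key: str) -> int:
    """Deterministic 32-bit hash. MD5 is an assumption, not the article's choice."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")

def server_index(key: str, n: int) -> int:
    """The plain scheme from the text: hash(object) % N."""
    return stable_hash(key) % n

def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose server index differs between old_n and new_n servers."""
    moved = sum(1 for k in keys if server_index(k, old_n) != server_index(k, new_n))
    return moved / len(keys)

# Hypothetical keys, used only to measure the effect.
keys = [f"object{i}" for i in range(10000)]
fraction = moved_fraction(keys, 3, 4)
print(f"{fraction:.0%} of keys map to a different server after adding one server")
```

With a well-mixed hash, a key keeps its index only when `h % 3 == h % 4`, which holds for 3 of every 12 consecutive hash values, so roughly three quarters of the keys should move.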

    • Consistent hashing algorithm

With the shortcomings of the approach above in mind, now look at the consistent hashing algorithm.
Some sources describe consistent hashing as taking the object's hash value modulo 2^32; others describe it as mapping the object's hash value onto a space of 2^32 values.

Consistent hashing relies on one very important concept, the hash ring: a ring with 2^32 points, numbered 0 through 2^32-1, that form a closed loop.

0 is the starting point, and the first point to the left of 0 is 2^32-1, so moving clockwise past 2^32-1 wraps back around to 0.

First compute the hash values of the three servers A, B, and C, generally by hashing each server's IP address or server name.
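
As a rough sketch of this step, the snippet below hashes each server's IP address to a position on the 2^32 ring. The IP addresses and the use of MD5 are assumptions made for illustration; any well-distributed hash function would serve.

```python
import hashlib

RING_SIZE = 2 ** 32  # number of points on the hash ring

def ring_position(name: str) -> int:
    """Map a string to a point in [0, 2^32). MD5 is an assumed hash function."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    # The first 4 bytes already fit in 32 bits; the modulo makes the ring size explicit.
    return int.from_bytes(digest[:4], "big") % RING_SIZE

# Hypothetical server addresses, not taken from the article.
servers = {"A": "10.0.0.1", "B": "10.0.0.2", "C": "10.0.0.3"}
server_positions = {name: ring_position(ip) for name, ip in servers.items()}

# Print the servers in clockwise order around the ring.
for name, position in sorted(server_positions.items(), key=lambda item: item[1]):
    print(name, position)

# The ring is closed: stepping back one point from 0 lands on 2^32 - 1.
wrapped = (0 - 1) % RING_SIZE
print(wrapped == RING_SIZE - 1)
```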

Suppose the three servers A, B, and C map to the following positions on the ring:

Now take 4 data items to store as an example: Object1, Object2, Object3, Object4.
Compute their hash values:

hash(object1) = key1
hash(object2) = key2
hash(object3) = key3
hash(object4) = key4

Each of these four values is placed on the hash ring, whose positions start from 0. Moving clockwise from a key's position, the first server node encountered is the storage server for that object.

The corresponding storage is as follows:

There is no node after Object4 on the ring, so the search wraps around and continues from 0. Object4 is therefore stored on A.
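
The clockwise lookup, including the wrap-around for Object4, can be sketched with a sorted list and a binary search. All ring positions below are made-up numbers chosen to mirror the example; the article does not give actual hash values, and the placement of Object1 and Object3 is an assumption.

```python
import bisect

# Hypothetical ring positions, sorted clockwise. Real values would come from a hash.
ring = [(1000, "A"), (2000, "B"), (3000, "C")]
object_positions = {"Object1": 500, "Object2": 1500, "Object3": 2500, "Object4": 3500}

def find_server(ring, key_position):
    """Return the first server at or clockwise after key_position."""
    positions = [position for position, _ in ring]
    index = bisect.bisect_left(positions, key_position)
    if index == len(ring):
        # No node after this key: wrap around and continue from 0.
        index = 0
    return ring[index][1]

placement = {name: find_server(ring, pos) for name, pos in object_positions.items()}
print(placement)
# {'Object1': 'A', 'Object2': 'B', 'Object3': 'C', 'Object4': 'A'}
```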

Now if node B is removed, a lookup for Object2 goes to C instead, where it is not found, but the cached copies of Object1, Object3, and Object4 remain valid.

Adding a new node is similar: the impact on the existing servers is relatively small.
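
A small sketch can confirm how little moves when the node set changes. The positions are again hypothetical, and node D is an invented extra server used only to show the effect of an addition.

```python
import bisect

def find_server(node_positions, key_position):
    """node_positions maps server name -> ring position (hypothetical values)."""
    ordered = sorted((pos, name) for name, pos in node_positions.items())
    index = bisect.bisect_left([pos for pos, _ in ordered], key_position)
    # index == len(ordered) means no node follows the key, so wrap to the first node.
    return ordered[index % len(ordered)][1]

def placement(node_positions, object_positions):
    return {obj: find_server(node_positions, pos) for obj, pos in object_positions.items()}

def changed(before, after):
    """Objects whose storage server differs between two placements."""
    return sorted(obj for obj in before if before[obj] != after[obj])

nodes = {"A": 1000, "B": 2000, "C": 3000}
objects = {"Object1": 500, "Object2": 1500, "Object3": 2500, "Object4": 3500}

original = placement(nodes, objects)
without_b = placement({k: v for k, v in nodes.items() if k != "B"}, objects)
with_d = placement({**nodes, "D": 3800}, objects)  # D is a hypothetical new node

moved_on_removal = changed(original, without_b)
moved_on_addition = changed(original, with_d)
print(moved_on_removal)   # ['Object2']
print(moved_on_addition)  # ['Object4']
```

In both cases only the objects in the arc next to the changed node are remapped; everything else stays where it was.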

In practice, however, nodes A, B, and C often end up distributed on the hash ring like this:

In this case node C is nearly useless and the load is concentrated on A. Virtual nodes are introduced to deal with this scenario.

    • Virtual node

With virtual nodes, each physical host is assigned multiple nodes on the hash ring. In general, the more nodes there are on the ring, the more evenly they are distributed, which is what lets consistent hashing show its advantages.

Here we map two virtual nodes for each of A, B, and C. The virtual node positions can be generated along the following lines:

hash(ipA#A1) = key1
hash(ipA#A2) = key2
hash(ipB#B1) = key3
hash(ipB#B2) = key4
hash(ipC#C1) = key5
hash(ipC#C2) = key6

The cache hosts are now spread relatively evenly over the ring, so cached data is also distributed relatively evenly across the hosts.
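
The effect of virtual nodes can be checked with a short experiment. This is a sketch under assumptions: MD5 as the hash, made-up IP addresses, labels of the form `ip#NameIndex` modeled on the `ipA#A1` pattern above, and 200 virtual nodes per host as an arbitrary larger setting for comparison.

```python
import bisect
import hashlib
from collections import Counter

RING_SIZE = 2 ** 32

def ring_position(label: str) -> int:
    """Map a string to a point on the ring. MD5 is an assumed hash function."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE

def build_ring(servers, replicas):
    """Place `replicas` virtual nodes per server, labeled like 'ip#A1', 'ip#A2', ..."""
    ring = []
    for name, ip in servers.items():
        for i in range(1, replicas + 1):
            ring.append((ring_position(f"{ip}#{name}{i}"), name))
    ring.sort()
    return ring

def load_share(servers, replicas, keys):
    """Fraction of keys that land on each physical server."""
    ring = build_ring(servers, replicas)
    positions = [pos for pos, _ in ring]
    counts = Counter()
    for key in keys:
        # Clockwise lookup; the modulo wraps past the last node back to the first.
        index = bisect.bisect_left(positions, ring_position(key)) % len(ring)
        counts[ring[index][1]] += 1
    return {name: counts[name] / len(keys) for name in servers}

# Hypothetical servers and keys, used only for the measurement.
servers = {"A": "10.0.0.1", "B": "10.0.0.2", "C": "10.0.0.3"}
keys = [f"object{i}" for i in range(10000)]

for replicas in (1, 2, 200):
    shares = load_share(servers, replicas, keys)
    print(replicas, {name: round(share, 3) for name, share in shares.items()})
```

With only one or two nodes per host the shares can still be quite uneven; as the number of virtual nodes grows, each host's share should approach one third.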
