Consistent hash algorithm


Consistent hashing is a distributed hash table (DHT) technique proposed at MIT in 1997. It was designed to address hot-spot problems on the Internet, with a goal similar to that of CARP (Cache Array Routing Protocol). Consistent hashing corrects the problems caused by the simple hashing algorithm used by CARP, so that distributed hash tables can genuinely be applied in peer-to-peer environments.

Consistent hashing defines four properties for judging how good a hash algorithm is in a dynamically changing cache environment:

1. Balance: the results of the hash should be spread as evenly as possible across all buffers, so that all buffer space is used. Many hash algorithms satisfy this condition.

2. Monotonicity: if some content has already been assigned to buffers by the hash and a new buffer is then added to the system, the hash must ensure that previously assigned content is mapped either to its original buffer or to the new buffer, and never to a different buffer from the old buffer set. (The sketch after this list shows how simple modulo hashing breaks this property.)

3. Spread: in a distributed environment, a terminal may not see all the buffers, only part of them. When a terminal maps content to a buffer through the hash, different terminals may see different buffer ranges, so the same content can end up mapped to different buffers by different terminals. This should clearly be avoided, because it causes the same content to be stored in multiple buffers and reduces the storage efficiency of the system. Spread is defined as the severity of this situation; a good hash algorithm should keep such inconsistency, and therefore spread, as low as possible.

4. Load: the load problem is the spread problem viewed from the other side. Since different terminals may map the same content to different buffers, a particular buffer may also be mapped to different content by different users. As with spread, this situation should be avoided, so a good hash algorithm should keep the load on each buffer as low as possible.
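The following sketch, which is not part of the original article, illustrates why the monotonicity property matters: under simple modulo hashing, adding a single cache server remaps most keys. The choice of MD5 and the helper name bucket_modulo are assumptions made for the demonstration only.

```python
# Minimal sketch: simple modulo hashing violates monotonicity.
# Adding one server changes the bucket of most existing keys.
import hashlib

def bucket_modulo(key: str, num_servers: int) -> int:
    """Map a key to a server index using simple modulo hashing."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % num_servers

keys = [f"key-{i}" for i in range(1000)]
before = {k: bucket_modulo(k, 4) for k in keys}  # 4 cache servers
after = {k: bucket_modulo(k, 5) for k in keys}   # one server added
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved}/{len(keys)} keys remapped after adding one server")
# Roughly 80% of keys move, so almost every cached item becomes a miss.
```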

The basic implementation of consistent hashing maps both the machine nodes and the keys onto a 0~2^32 ring using the same hash algorithm. When a write-cache request for key K arrives, hash(K) is computed; if the value coincides with the hash of a machine node, the data is written to that node directly; otherwise the ring is searched clockwise and the data is written to the next node found; if the search passes 2^32 without finding a node, it wraps around and continues from 0 (because the structure is a ring), as shown in Figure 1.

Figure 1: Consistent hash algorithm (ring structure)
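The ring lookup described above can be sketched roughly as follows. This is a minimal illustration, assuming MD5 truncated to 32 bits as the hash function and Python's bisect module for the clockwise search; the class name HashRing and the node and key names are hypothetical and not taken from the original article.

```python
# Minimal consistent-hash ring sketch: nodes and keys share one hash space,
# and a key is served by the first node found clockwise from hash(key).
import bisect
import hashlib

def hash32(value: str) -> int:
    """Hash a string onto the 0 .. 2**32 - 1 ring."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % (2 ** 32)

class HashRing:
    def __init__(self, nodes=()):
        self._ring = []    # sorted hash positions of the nodes
        self._nodes = {}   # position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        pos = hash32(node)
        bisect.insort(self._ring, pos)
        self._nodes[pos] = node

    def get_node(self, key: str) -> str:
        """Walk clockwise from hash(key); wrap to 0 past the end of the ring."""
        pos = hash32(key)
        idx = bisect.bisect_left(self._ring, pos)
        if idx == len(self._ring):  # passed 2**32 without finding a node
            idx = 0                 # wrap around, since it is a ring
        return self._nodes[self._ring[idx]]

ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.get_node("user:42"))  # the node responsible for key "user:42"
```

Using bisect keeps each lookup at O(log n) over the sorted node positions, and adding or removing a node only affects the keys that fall between it and its clockwise neighbor, which is what the monotonicity property requires.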
