Design Concept of consistent hash

Source: Internet
Author: User

First, let's look at a simple hash function.

# Define hash_range 5 <br/> int simple_hash (INT key) <br/>{< br/> return key % hash_range <br/>}

Call simple_hash to hash the values 5, 8, and 9. The result is as follows:

Simple_hash (5) = 0 <br/> simple_hash (8) = 3 <br/> simple_hash (9) = 4

In practical application, hash_range represents a certain practical significance, such as the number of machines. This article takes the number of machines as an example. When hash_range = 5, the machine node number is 0 ~ 4. simple_hash (5) = 0 indicates that data with the key of 5 will be placed in machine 0 node. When querying, you only need to hash key 5 to get the node number 0, then you can find the node and read the data.

 

In this way, the system has been running for a period of time, and the data is coming and going.

 

One day, a new machine is added to the system. At this time, hash_range = 6. We will find that the hash values of 5, 8, and 9 have changed! What does this mean? In the view of the new function, the original values 5, 8, and 9 are allocated to the wrong machine. If key = 5 is used for query, the returned node is 5 instead of 0. Obviously, this query will fail. Because the data is still on node 0!

 

// Hash_factor = 6 <br/> simple_hash (5) = 5 <br/> simple_hash (8) = 2 <br/> simple_hash (9) = 3

 

In textbooks, hash_range is called a hash factor.

 

How can we ensure that the correct data is still read when the number of machines increases? The simplest way is to use the number of machines as the hash factor.
Instead, some constants are selected as the hash factor. Consistency hash is produced under this guiding ideology.

 

First, let's look at the basic idea of consistent hash:

 

The number of machines (node) is not used as a hash factor, but is hashed to the range of [0, 2 ^ 32) through a common hash function, all keys are also hashed to this range, and then the node is located "Forward" along the number axis ring. The machine where the key is to be stored is first found.

 

Consistency hash is not perfect. When a new machine is added to the system, some keys need to be re-distributed, for example. However, the general hash function still uses the original one. In terms of effect, the effect of consistency hash is much smaller than that of normal hash.

 

 

 

For more consistent hash content, you can read it on Wikipedia.
.

 

In common hash functions, hash and storage are tightly coupled. In consistency hash, hash and storage are separated. Hash is used to determine the approximate range, find the nearest node for storage by forward addressing.

 

The essential problems of hash functions are not solved. Consistency hash only weakens the role of Hash Functions in algorithms and introduces some new ideas to solve the problems.

 

 

Remarks:

1. image captured in this article: Comprehensive Analysis of memcached
P28

2. To prevent data from being hashed to a very close location, consistency hash also uses the concept of virtual nodes. That is, a physical node may be virtualized into 200 virtual nodes and evenly distributed on the ring. This idea is also clever and worth learning.

 

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.