Distributed hashing algorithm


One, ordinary hashing

Before introducing distributed hashing algorithms, let us first look at how an ordinary hash table works. The java.util.HashMap class in the JDK implements a hash table with the following characteristics: ① A hash table (HashMap) is created with an initial size, i.e., a default number of buckets it can hold elements in; for HashMap this default size is 16.

② As elements are added, the HashMap gradually fills up. When the number of elements exceeds the load factor multiplied by the table length, the table must be expanded. During expansion, every element that has already been mapped to a position (bucket) in the hash table must be re-hashed before being copied into the new table (see the snippet below).
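For illustration, here is a minimal Java snippet showing these two parameters. The capacity of 16 and load factor of 0.75 are HashMap's documented defaults; the class name and the printed threshold are only for demonstration:

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapResizeDemo {
    public static void main(String[] args) {
        int initialCapacity = 16;   // HashMap's default table size
        float loadFactor = 0.75f;   // HashMap's default load factor

        Map<String, String> map = new HashMap<>(initialCapacity, loadFactor);

        // Once the number of entries exceeds capacity * loadFactor (16 * 0.75 = 12),
        // the table is resized (doubled) and the existing entries are re-hashed.
        int resizeThreshold = (int) (initialCapacity * loadFactor);
        System.out.println("Resize is triggered after " + resizeThreshold + " entries");

        for (int i = 0; i < 20; i++) {
            map.put("key" + i, "value" + i); // with the defaults, the 13th put triggers a resize
        }
    }
}
```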

For an ordinary hash table, the cost of this expansion is significant, because the bucket address is computed as hash(key) % M. For convenience, here is a deliberately extreme example:

Assume the hash function is hash(x) = x and the hash table has a length of 5 (5 buckets).

key = 6: hash(6) % 5 = 1, so the element with key 6 is stored in bucket 1.

key = 7: hash(7) % 5 = 2, so the element with key 7 is stored in bucket 2.

key = 13: hash(13) % 5 = 3, so the element with key 13 is stored in bucket 3.

Now suppose the table length is expanded to 8. The data for keys 6, 7, and 13 must then be re-hashed, because:

key = 6: hash(6) % 8 = 6, so the element with key 6 is now stored in bucket 6.

key = 7: hash(7) % 8 = 7, so the element with key 7 is now stored in bucket 7.

key = 13: hash(13) % 8 = 5, so the element with key 13 is now stored in bucket 5.

As can be seen, after the expansion the positions of the elements have changed completely. For example, the element with key 6 was originally stored in bucket 1, but must be stored in bucket 6 after the expansion.

This is the disadvantage of ordinary hashing: an expansion may move every element in the table. It is also why a hash table is usually grown to twice its original size: with power-of-two capacities (as in Java's HashMap), doubling means each element either stays in its bucket or moves to exactly one other bucket (its old index plus the old capacity), which keeps the cost of re-hashing as low as possible.
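A small Java sketch of the example above, using the same toy hash function hash(x) = x; the class and method names are illustrative only:

```java
public class ModuloRehashDemo {
    // hash(x) = x, as in the example above; the bucket index is hash(key) % m
    static int bucket(int key, int m) {
        return key % m;
    }

    public static void main(String[] args) {
        int[] keys = {6, 7, 13};

        System.out.println("Table length 5:");
        for (int key : keys) {
            System.out.println("key " + key + " -> bucket " + bucket(key, 5));
        }

        System.out.println("Table length 8 (after expansion):");
        for (int key : keys) {
            // every key now lands in a different bucket, so all of them must move
            System.out.println("key " + key + " -> bucket " + bucket(key, 8));
        }
    }
}
```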

Two, consistent hashing

In a distributed system, it is common for nodes to go down, join, or leave the cluster. For distributed storage, suppose the cluster has 10 machines and data is sharded by hashing (each piece of data is mapped to a machine by a hash function), so the mapping looks like: hash(file) % 10. As shown above, expansion is very unfavorable here: if a machine is added, many files already stored on the existing machines have to be reassigned, and if the files are large, moving them puts a heavy load on the network.

Consistent hashing was introduced to address this. For background, see: Consistency hashing algorithm learning and Java Code Implementation analysis (listed in the references).

With consistent hashing, the space that the hash function maps to (the analogue of the fixed number of buckets in ordinary hashing) is fixed, for example to the range [0, 2^n - 1], and is organized into a ring.

Each machine chooses an n-bit ID (in practice usually a hash of some fixed piece of information) and is mapped onto the ring. Each lookup key is also an n-bit ID, i.e., the hash of the real lookup key, so node IDs and keys occupy the same space.

For example, suppose four machines are mapped onto a hash ring of fixed size (positions 0 to 2^n - 1). The four machines divide the ring into four arcs: (A, B), (B, C), (C, D), (D, A).

Machine A is responsible for storing data that falls within the arc (D, A), machine B for data that falls within (A, B), and so on.

In other words, when a piece of data is hashed, its address falls on some point of the ring, and the data is stored on the first machine reached by moving clockwise from that point.

The advantage of consistent hashing over ordinary hashing is that adding or removing a machine does not affect the placement of all the data; it only affects the data stored on that machine (the data falling on the arc the machine is responsible for).

For example, when machine B is removed, the data that falls within the arc (A, B) must now be stored by machine C, and only the data in (A, B) is affected.

Expansion is equally convenient: for example, to add another machine E on the arc (C, D), only part of the data on machine D needs to be copied to machine E.
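As a rough illustration, the ring described above can be sketched in Java with a sorted map. The node names, the use of String.hashCode(), and the class name are assumptions for demonstration, not part of the original example:

```java
import java.util.SortedMap;
import java.util.TreeMap;

// A minimal consistent-hash ring: node IDs and keys share the same hash space,
// and a key is stored on the first node found clockwise from its hash position.
public class ConsistentHashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    // Illustrative hash; a real system would use a stronger hash such as MD5 or MurmurHash.
    private int hash(String s) {
        return s.hashCode() & 0x7fffffff;
    }

    public void addNode(String node) {
        ring.put(hash(node), node);
    }

    public void removeNode(String node) {
        ring.remove(hash(node));
    }

    // Walk clockwise: the first node at or after hash(key), wrapping to the start of the ring.
    public String nodeFor(String key) {
        if (ring.isEmpty()) return null;
        SortedMap<Integer, String> tail = ring.tailMap(hash(key));
        Integer pos = tail.isEmpty() ? ring.firstKey() : tail.firstKey();
        return ring.get(pos);
    }

    public static void main(String[] args) {
        ConsistentHashRing ring = new ConsistentHashRing();
        for (String node : new String[]{"A", "B", "C", "D"}) ring.addNode(node);

        String key = "some-file";
        System.out.println(key + " -> " + ring.nodeFor(key));

        // Removing B only affects keys that were stored on B; other keys keep their node.
        ring.removeNode("B");
        System.out.println(key + " -> " + ring.nodeFor(key));
    }
}
```

Here a TreeMap keeps the node positions sorted, so finding "the first machine clockwise" is a single tailMap lookup.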

Does consistent hashing have no shortcomings? Of course it does. In short, on its own it cannot achieve good load balancing, because it takes no account of how heterogeneous the machines are.

For example, machine C may have a high-end configuration and good performance while machine D has a low-end configuration, yet most of the data may happen to hash onto the arc (C, D), which puts heavy storage pressure directly on machine D.

In addition, consistent hashing has a "hot spot" (hotspot) problem.

For example, because a large amount of data is stored on machine D, another machine E is added on the arc (C, D) to relieve the pressure on D, which requires copying part of D's data to E. But only one machine, machine D, participates in this copy, so the network load between machine D and machine E suddenly spikes. Machine D is what we call a hot spot.

Three, consistent hashing with virtual nodes

To address these shortcomings of consistent hashing, the concept of virtual nodes is introduced.

Introducing virtual nodes effectively prevents the physical nodes (machines) from being mapped onto the ring unevenly, for example with machines A, B, and C all landing on the right half of the ring.

In general there are many more virtual nodes than physical nodes, and they can be distributed evenly around the ring, which improves load balancing. For example:

① Each physical machine is mapped to many virtual nodes, so when a physical machine goes down, the data it held can be shared among several other physical machines.

② When a new machine is added, it can correspond to virtual nodes on multiple non-adjacent arcs of the ring, so the hashed data it takes over is spread out more evenly (see the sketch below).
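A minimal sketch of the virtual-node idea, assuming each physical machine is placed on the ring under several derived names; the "#i" suffix, the replica count of 100, and the hash function are illustrative choices:

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Each physical node is mapped onto the ring several times via "virtual" names,
// so its load is spread across multiple non-adjacent arcs.
public class VirtualNodeRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int replicas;

    public VirtualNodeRing(int replicas) {
        this.replicas = replicas;
    }

    private int hash(String s) {
        return s.hashCode() & 0x7fffffff; // illustrative hash function
    }

    public void addNode(String node) {
        for (int i = 0; i < replicas; i++) {
            ring.put(hash(node + "#" + i), node); // virtual node i of this machine
        }
    }

    public void removeNode(String node) {
        for (int i = 0; i < replicas; i++) {
            ring.remove(hash(node + "#" + i));
        }
    }

    public String nodeFor(String key) {
        SortedMap<Integer, String> tail = ring.tailMap(hash(key));
        return ring.get(tail.isEmpty() ? ring.firstKey() : tail.firstKey());
    }

    public static void main(String[] args) {
        VirtualNodeRing ring = new VirtualNodeRing(100); // 100 virtual nodes per machine
        for (String node : new String[]{"A", "B", "C", "D"}) ring.addNode(node);

        // When D is removed, its keys are taken over by several different machines,
        // because D's virtual nodes were scattered around the ring.
        System.out.println("key42 -> " + ring.nodeFor("key42"));
        ring.removeNode("D");
        System.out.println("key42 -> " + ring.nodeFor("key42"));
    }
}
```

With, say, 100 virtual nodes per machine, removing one physical machine spreads its data over many small arcs instead of dumping it all on a single neighbor.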

Four, the query process in a distributed hash table

When a machine receives a query request, it first checks whether the data being looked up is stored locally; if not, it forwards the query to the next node in the clockwise direction.

For example, machine N10 receives a client query: "Who has the data for Key80?" Because the data for Key80 is stored on machine N90, the request has to be forwarded clockwise: N10 --> N32 --> N60 --> N80 --> N90.

In the worst case, such a query has time complexity O(N).
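A rough sketch of this clockwise forwarding, assuming each node knows only its immediate successor; the node IDs and the key come from the example above, while the class name and the ownership rule are illustrative:

```java
// Each node knows only its clockwise successor, so a lookup is forwarded
// hop by hop around the ring: O(N) hops in the worst case.
public class SuccessorOnlyLookup {
    // Node IDs on the ring, from the example above.
    static final int[] NODES = {10, 32, 60, 80, 90};

    // Following the article's example (Key80 lives on N90), a key is owned by
    // the first node found strictly clockwise after its position, wrapping around.
    static int ownerOf(int key) {
        for (int id : NODES) {
            if (id > key) return id;
        }
        return NODES[0]; // wrap around the ring
    }

    public static void main(String[] args) {
        int key = 80;
        int owner = ownerOf(key);

        // Start at N10 and forward clockwise, one successor at a time.
        int index = 0; // N10
        StringBuilder path = new StringBuilder("N" + NODES[index]);
        while (NODES[index] != owner) {
            index = (index + 1) % NODES.length;
            path.append(" --> N").append(NODES[index]);
        }
        // Prints: Lookup path for key 80: N10 --> N32 --> N60 --> N80 --> N90
        System.out.println("Lookup path for key " + key + ": " + path);
    }
}
```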

To improve query efficiency, each machine needs to maintain some routing information: which keys are stored on which nodes. For example, if machine N10 holds the routing entry "the data for Key80 is on machine N90" (<Key80, N90>), it can forward the query directly to machine N90 instead of forwarding it clockwise hop by hop.

How is this routing information maintained in an actual system?

Chord uses a mechanism called the "finger table": entry i in the finger table of node n is the first node that succeeds or equals n + 2^i.

Suppose the ID of machine 1 is 1, the ID of machine 2 is 2, and the ID of machine 6 is 6; succ denotes the node responsible for that ID. The finger table on machine 1 is as follows:

i    id + 2^i        succ
0    1 + 2^0 = 2     2
1    1 + 2^1 = 3     6
2    1 + 2^2 = 5     6

This says that the data for keys 2, 3, and 5 can be reached via machine 2, machine 6, and machine 6, respectively.

Here is how a query proceeds:

Machine 1 stores the data for key 1 locally, machine 2 stores the data for key 2, machine 6 stores the data for keys 3, 4, 5, and 6, and machine 0 stores the data for keys 7 and 0.

At the same time, machine 1 also stores the addresses of the machines holding the data for keys 2, 3, and 5. For example, machine 1 knows that the data for key 5 is stored on machine 6.

For example, when machine 1 receives a query asking which machine holds the data for key 7, it sees that 7 is larger than 5, the largest entry in its routing table, so it forwards the query to machine 6, the node recorded for entry 5.

The routing table on machine 6 indicates that the data for key 7 is on machine 0, so the data for key 7 is found on machine 0.
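A minimal sketch of the finger-table construction and lookup above, using the 3-bit ring (IDs 0 to 7) and the nodes 0, 1, 2, and 6 from the example; the class name and the use of global knowledge in main are simplifications for demonstration:

```java
import java.util.TreeSet;

// Finger table example on a 3-bit Chord-style ring (IDs 0..7) with nodes 0, 1, 2 and 6.
public class ChordFingerDemo {
    static final int M = 3;                       // ID space is 2^M = 8
    static final TreeSet<Integer> NODES = new TreeSet<>();

    // successor(id): the first node whose ID is >= id, wrapping around the ring.
    static int successor(int id) {
        Integer s = NODES.ceiling(id % (1 << M));
        return s != null ? s : NODES.first();
    }

    // Entry i of node n's finger table: successor(n + 2^i).
    static int[] fingerTable(int n) {
        int[] fingers = new int[M];
        for (int i = 0; i < M; i++) {
            fingers[i] = successor(n + (1 << i));
        }
        return fingers;
    }

    public static void main(String[] args) {
        for (int id : new int[]{0, 1, 2, 6}) NODES.add(id);

        // Node 1's fingers: successor(2)=2, successor(3)=6, successor(5)=6,
        // matching the table in the text.
        int[] fingers = fingerTable(1);
        for (int i = 0; i < M; i++) {
            System.out.println("finger[" + i + "] of node 1: id " + (1 + (1 << i))
                    + " -> node " + fingers[i]);
        }

        // With global knowledge we can compute the owner of key 7 directly; in the
        // distributed lookup described in the text, node 1 forwards the query to node 6
        // (its largest finger not past the key), whose table then points to node 0.
        System.out.println("Key 7 is stored on node " + successor(7));
    }
}
```

Running it prints the same finger table as above and confirms that key 7 belongs to node 0.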

By keeping such routing information on every machine, queries can be answered with O(log N) hops in this way. For implementations with other time complexities, see the Wikipedia article in the references.

In addition, as the Amazon Dynamo paper says, Dynamo achieves O(1) lookups by keeping enough routing information on each machine:

"Dynamo can be characterized as a zero-hop DHT, where each node maintains enough routing information locally to route a request to the appropriate node directly."

Five, references:

Distributed Hash consistency algorithm

https://en.wikipedia.org/wiki/Distributed_hash_table

Distributed Hash Tables (slides), 15-441, Spring 2004, Jeff Pang

Consistency hashing algorithm Learning and Java Code Implementation analysis

Original: http://www.cnblogs.com/hapjin/p/5760463.html
