Deep Dive into the Consistent Hashing Algorithm, with a 100-Line Implementation


This article explains the principle of the consistent hashing algorithm, presents a Java implementation, and reports test results; these are key techniques used in a distributed task scheduling system.

Background introduction

Consistent hashing is often used in distributed systems to minimize the data migration caused by node changes. The algorithm was proposed in 1997 in the paper "Consistent Hashing and Random Trees".

First, consider what problem hashing solves here. Suppose a distributed task scheduling system has n machines serving as task nodes, with m jobs running on them; each of the m jobs must be mapped to one of the n nodes. A simple hash such as hash(job) % n spreads the m jobs evenly across the n nodes, which looks perfect until you consider the following two scenarios:

    1. One of the n nodes goes down. The node count becomes n - 1, and the mapping formula becomes hash(job) % (n - 1).
    2. The job count grows and a new machine must be added. The node count becomes n + 1, and the mapping formula becomes hash(job) % (n + 1).

In both scenarios, almost every job gets reassigned to a node different from the one before the change, which means nearly all running jobs must be migrated. Consider the complexity and performance loss this inflicts on the system; the sketch below illustrates the scale of the churn.
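As a rough illustration, here is a minimal, self-contained sketch (the job names and String.hashCode() are stand-ins for real jobs and a real hash function) that counts how many of 1000 jobs change nodes when plain modulo mapping drops from 10 nodes to 9:

public class ModuloRehashDemo {
    public static void main(String[] args) {
        int n = 10, jobs = 1000, moved = 0;
        for (int i = 0; i < jobs; i++) {
            // non-negative hash of a made-up job identifier
            int h = ("job_" + i).hashCode() & Integer.MAX_VALUE;
            // does the job land on a different node after one node is lost?
            if (h % n != h % (n - 1)) {
                moved++;
            }
        }
        System.out.println(moved + " of " + jobs + " jobs remapped");
    }
}

A key keeps its node only when h % n == h % (n - 1), which holds for roughly one key in ten here, so about 90% of the jobs migrate.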

There is another situation: nodes may differ in hardware performance, and you may want high-performance nodes to take on more jobs. The simple hash mapping above also struggles to express that.

How can we avoid the heavy data migration and the uneven data distribution caused by node changes? The consistent hashing algorithm solves both problems elegantly.

Consistent hashing is a hashing scheme whose defining feature is that when nodes are removed or added, it preserves existing key-to-node mappings as far as possible, minimizing key migration.

Consistent hashing algorithm principle: the job -> node mapping process

1. Determine the hash value space

Fix the size of the value space at 2^32: the interval [0, 2^32 - 1] contains every possible hash value, and is conventionally depicted as a ring.

2. Node-to-value space mapping

To map nodes into this value space, hash each node: pick an attribute that uniquely and stably identifies the node and hash it. Assuming the input is a string, the algorithm works as follows.

Take the MD5 digest of the node identifier, then extract 32 bits of it as the mapping value. The MD5 helper:

private byte[] md5(String value) {
    MessageDigest md5;
    try {
        md5 = MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException e) {
        throw new IllegalStateException(e.getMessage(), e);
    }
    md5.reset();
    byte[] bytes;
    try {
        bytes = value.getBytes("UTF-8");
    } catch (UnsupportedEncodingException e) {
        throw new IllegalStateException(e.getMessage(), e);
    }
    md5.update(bytes);
    return md5.digest();
}

Because the mapping value needs only 32 bits, it can be computed with the following method (pass 0 for the number parameter):

private long hash(byte[] digest, int number) {
    return (((long) (digest[3 + number * 4] & 0xFF) << 24)
            | ((long) (digest[2 + number * 4] & 0xFF) << 16)
            | ((long) (digest[1 + number * 4] & 0xFF) << 8)
            | (digest[0 + number * 4] & 0xFF))
            & 0xFFFFFFFFL;
}
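One note on the number parameter: an MD5 digest is 16 bytes, so a single digest yields four independent 32-bit ring positions (number = 0 through 3). The complete implementation later takes advantage of this when generating virtual nodes. A small fragment using the two helpers above (the node name is made up):

byte[] digest = md5("node_0" + 0);   // one 16-byte MD5 digest
for (int h = 0; h < 4; h++) {
    // each 4-byte group becomes one ring position in [0, 2^32)
    System.out.println(hash(digest, h));
}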

Hashing the n nodes this way places each of them at a position on the ring value space.

In the implementation, the mapping from each node's hash value to its physical node information is cached in memory as an ordered map: private final TreeMap<Long, String> virtualNodes.

3. Data-to-value space mapping

Data objects (jobs) are hashed exactly the same way as nodes: the md5 -> hash method above maps every job onto the same ring.
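For example, with a made-up job identifier and the two helpers above, a job's ring position is simply:

long jobPosition = hash(md5("job_42"), 0);   // a value in [0, 2^32)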

4. Data and Node mappings

With both nodes and data mapped onto the ring, a rule is needed to decide which node owns which data. The rule: starting from a data item's hash value and moving clockwise, the first node hash value encountered identifies the node that the data item maps to. This fixes the data-to-node mapping.

The clockwise search for the next node hash value is implemented as follows:

public String select(Trigger trigger) {
    String key = trigger.toString();
    byte[] digest = md5(key);
    return selectForKey(hash(digest, 0));
}

private String selectForKey(long hash) {
    Long key = hash;
    if (!virtualNodes.containsKey(key)) {
        // tailMap holds all virtual nodes at or after this hash;
        // empty means we wrapped around, so take the first node on the ring
        SortedMap<Long, String> tailMap = virtualNodes.tailMap(key);
        key = tailMap.isEmpty() ? virtualNodes.firstKey() : tailMap.firstKey();
    }
    return virtualNodes.get(key);
}

Trigger is an abstraction of a job trigger and can be ignored here; it overrides toString to return a unique identifier for the job, which is hashed and then located on the ring by the rule above. Virtual nodes are introduced below. The standalone sketch that follows shows the ring lookup in isolation.
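To make the clockwise search concrete, here is a self-contained sketch with made-up node names and hash positions (real positions would come from the md5/hash helpers):

import java.util.SortedMap;
import java.util.TreeMap;

public class RingLookupDemo {

    static String owner(TreeMap<Long, String> ring, long hash) {
        // tailMap(hash) holds every node at or after this position;
        // an empty tail means we wrapped past the largest hash,
        // so the first node on the ring wins
        SortedMap<Long, String> tail = ring.tailMap(hash);
        long key = tail.isEmpty() ? ring.firstKey() : tail.firstKey();
        return ring.get(key);
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ring = new TreeMap<>();
        ring.put(100L, "node_a");
        ring.put(200L, "node_b");
        ring.put(300L, "node_c");

        System.out.println(owner(ring, 250L));   // node_c: first node clockwise from 250
        System.out.println(owner(ring, 350L));   // node_a: wraps around past the largest hash
    }
}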

Algorithm performance

Now let's see how this data structure delivers the advantages claimed earlier.

1. When a node is removed, which data objects must migrate?

Suppose node_1 goes down: only the data objects that fell in its range (say, job_1) are remapped, to the next node clockwise (node_k); every other job keeps its original mapping intact.

2. When a node is added

Suppose a new node_i is added: only the data objects that fall in its range (say, job_k) are remapped to node_i; all other jobs keep their original mappings intact.

Algorithm optimization: virtual nodes

Two problems come to mind with the process above. First, data objects may be distributed unevenly, especially right after nodes are added or removed. Second, if some nodes should receive a larger share of the data objects (weighting), the scheme so far cannot express that. Virtual nodes solve both problems.

Each physical node is virtualized into some number of virtual nodes that are scattered across the value space, as randomly and evenly as possible.

Suppose there are 4 physical nodes; each virtual node covers an arc of hash values on the ring and maps back to one physical node. The fewer the physical nodes, the more virtual nodes are needed for good consistency. Testing shows that with a single-digit number of physical nodes, 160 virtual nodes per physical node performs well (results are given later): with 160*n virtual nodes in total, when one node changes, the fraction of mappings that change is roughly 1/n, as expected.

In the implementation, given the physical nodes and 160 virtual nodes per physical node, the 160*n hash values are computed up front and cached in memory as an ordered map, with the hash value as key and the physical node identifier as value; subsequent lookups for data objects query this map. As in the code below, the physical node for every virtual node hash value is cached in virtualNodes.

public ConsistentHash(List<String> nodes) {
    this.virtualNodes = new TreeMap<>();
    this.identityHashCode = identityHashCode(nodes);
    this.replicaNumber = 160;
    for (String node : nodes) {
        // one MD5 digest yields 4 ring positions, so hash "node + i"
        // replicaNumber / 4 times to get 160 virtual nodes per node
        for (int i = 0; i < replicaNumber / 4; i++) {
            byte[] digest = md5(node + i);
            for (int h = 0; h < 4; h++) {
                long m = hash(digest, h);
                virtualNodes.put(m, node);
            }
        }
    }
}

Algorithm testing

Having covered the principle and implementation of the consistent hashing algorithm in detail, here are the test results.

The test uses 10 physical nodes, 160 virtual nodes per physical node, and 1000 data objects. With all 10 physical nodes present, the 1000 data objects map as follows:

path_7: 113
path_0: 84
path_6: 97
path_8: 122
path_3: 102
path_2: 99
path_4: 98
path_9: 102
path_1: 99
path_5: 84

Now remove one physical node, path_9, leaving 9 physical nodes. The original 1000 data objects now map as follows:

path_7: 132
path_6: 107
path_0: 117
path_8: 134
path_3: 104
path_4: 104
path_2: 115
path_5: 89
path_1: 98

Comparing each physical node's data object count before and after:

path_7: 113 -> 132
path_6: 97 -> 107
path_0: 84 -> 117
path_8: 122 -> 134
path_3: 102 -> 104
path_4: 98 -> 104
path_2: 99 -> 115
path_5: 84 -> 89
path_1: 99 -> 98

The change is essentially uniform. Comparing where each individual data object maps before and after the node change, the proportion of data objects whose mapping changed is:

Data object migration ratio: 0.9%

This result is close to the best consistent hashing can offer: the data migration caused by a node change is reduced as far as possible.

Complete Java code

Finally, the complete algorithm code is attached for reference. The data objects in the code are abstracted as Trigger; adapt it to your specific scenario to run tests.

package com.cronx.core.common;

import com.cronx.core.entity.Trigger;
import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

/**
 * Created by Echov on 2018/1/9.
 */
public class ConsistentHash {

    // virtual node hash -> physical node identifier
    private final TreeMap<Long, String> virtualNodes;
    private final int replicaNumber;
    private final int identityHashCode;

    private static ConsistentHash consistentHash;

    public ConsistentHash(List<String> nodes) {
        this.virtualNodes = new TreeMap<>();
        this.identityHashCode = identityHashCode(nodes);
        this.replicaNumber = 160;
        for (String node : nodes) {
            // one MD5 digest yields 4 ring positions, so hash "node + i"
            // replicaNumber / 4 times to get 160 virtual nodes per node
            for (int i = 0; i < replicaNumber / 4; i++) {
                byte[] digest = md5(node + i);
                for (int h = 0; h < 4; h++) {
                    long m = hash(digest, h);
                    virtualNodes.put(m, node);
                }
            }
        }
    }

    // fingerprint of the node list, used to detect node changes
    private static int identityHashCode(List<String> nodes) {
        Collections.sort(nodes);
        StringBuilder sb = new StringBuilder();
        for (String s : nodes) {
            sb.append(s);
        }
        return sb.toString().hashCode();
    }

    public static String select(Trigger trigger, List<String> nodes) {
        int identityHashCode = identityHashCode(nodes);
        if (consistentHash == null || consistentHash.identityHashCode != identityHashCode) {
            synchronized (ConsistentHash.class) {
                if (consistentHash == null || consistentHash.identityHashCode != identityHashCode) {
                    // node list changed: rebuild the ring
                    consistentHash = new ConsistentHash(nodes);
                }
            }
        }
        return consistentHash.select(trigger);
    }

    public String select(Trigger trigger) {
        String key = trigger.toString();
        byte[] digest = md5(key);
        return selectForKey(hash(digest, 0));
    }

    private String selectForKey(long hash) {
        Long key = hash;
        if (!virtualNodes.containsKey(key)) {
            // clockwise search: first virtual node at or after this hash,
            // wrapping around to the first node on the ring if necessary
            SortedMap<Long, String> tailMap = virtualNodes.tailMap(key);
            key = tailMap.isEmpty() ? virtualNodes.firstKey() : tailMap.firstKey();
        }
        return virtualNodes.get(key);
    }

    private long hash(byte[] digest, int number) {
        return (((long) (digest[3 + number * 4] & 0xFF) << 24)
                | ((long) (digest[2 + number * 4] & 0xFF) << 16)
                | ((long) (digest[1 + number * 4] & 0xFF) << 8)
                | (digest[0 + number * 4] & 0xFF))
                & 0xFFFFFFFFL;
    }

    private byte[] md5(String value) {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e.getMessage(), e);
        }
        md5.reset();
        byte[] bytes;
        try {
            bytes = value.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e.getMessage(), e);
        }
        md5.update(bytes);
        return md5.digest();
    }
}
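For completeness, here is a rough harness along the lines of the experiment above. This is a sketch, not the original test code: it assumes a select(String key) overload has been added to ConsistentHash (a one-line variant of select(Trigger) that hashes the string directly), and it prints the migration count rather than the full per-node breakdown:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConsistentHashTest {
    public static void main(String[] args) {
        List<String> nodes = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            nodes.add("path_" + i);
        }
        ConsistentHash before = new ConsistentHash(new ArrayList<>(nodes));

        // record where each of 1000 data objects maps with 10 nodes
        Map<String, String> mapping = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            String job = "job_" + i;
            mapping.put(job, before.select(job));   // assumed String overload
        }

        // drop path_9 and rebuild the ring with 9 nodes
        nodes.remove("path_9");
        ConsistentHash after = new ConsistentHash(new ArrayList<>(nodes));

        int moved = 0;
        for (Map.Entry<String, String> e : mapping.entrySet()) {
            if (!after.select(e.getKey()).equals(e.getValue())) {
                moved++;
            }
        }
        System.out.println("objects remapped: " + moved + " / 1000");
    }
}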

Transferred from: https://my.oschina.net/yaohonv/blog/1610096

