Consistent hashing is an algorithm frequently used in distributed systems.
With a traditional hash algorithm, adding or removing a slot forces almost all of the data to be redistributed. Consistent hashing ensures that only about k/n of the data is moved (k is the total amount of data, n is the number of slots) and that only one of the existing slots is affected.
As a result, a distributed system can cope with machines being added or removed, and such change requests can be processed much faster.
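To make the difference concrete, here is a minimal sketch that counts how many keys change slot under plain modulo hashing, and how many fall into the range taken over by a new node on a ring. The class name `RemapDemo`, its methods, and the generated `key-N` test keys are assumptions made for this illustration; they are not part of the implementation shown later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; none of these names come from the article's code.
class RemapDemo {
    static final int SCOPE = 10000; // ring length, matching the article's default

    // Traditional scheme: the slot index is the hash modulo the slot count.
    static int moduloSlot(String key, int slotCount) {
        return Math.floorMod(key.hashCode(), slotCount);
    }

    // Ring scheme: position of a key on a ring of length SCOPE.
    static int position(String key) {
        return Math.floorMod(key.hashCode(), SCOPE);
    }

    // Number of keys whose slot changes when the slot count changes.
    static int movedByModulo(List<String> keys, int before, int after) {
        int moved = 0;
        for (String key : keys) {
            if (moduloSlot(key, before) != moduloSlot(key, after)) {
                moved++;
            }
        }
        return moved;
    }

    // Number of keys that move when a new node takes over [start, end) on the ring.
    static int movedByRangeSplit(List<String> keys, int start, int end) {
        int moved = 0;
        for (String key : keys) {
            int p = position(key);
            if (p >= start && p < end) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            keys.add("key-" + i); // made-up test keys
        }
        System.out.println("modulo, 4 -> 5 slots, keys moved: "
                + movedByModulo(keys, 4, 5));
        System.out.println("ring, new node on [1250, 2500), keys moved: "
                + movedByRangeSplit(keys, 1250, 2500));
    }
}
```

With modulo hashing, a key stays in place only when its hash gives the same remainder for both slot counts, so most keys move. On the ring, only the keys inside the new node's range move.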
This article uses Java to implement a simple version of consistent hashing, purely to illustrate the core idea of the algorithm.
Introduction to consistent hashing algorithms
Consistent hashing has been described in many places, such as wikis and numerous blogs, so only a brief outline of the concept is given here. Please refer to the relevant papers for a detailed introduction.
The first concept is the node, which corresponds to a single machine in a distributed system. Logically, all of the nodes are arranged to form a ring. The second concept is data. Each piece of data has a key, and data always needs to be stored on some node. How is data associated with a node? Through the concept of a range. Each node is responsible for a range on the ring, and data whose key falls within that range is stored on that node. The range is usually closed on the left and open on the right, such as [2500, 5000).
The following is a consistent hash ring with 4 nodes:
The total range is set to 10000, which also limits the total number of slots. Choose a size that suits the needs of your project.
- The starting position of Node1 is 0, and it is responsible for storing data in [0, 2500)
- The starting position of Node2 is 2500, and it is responsible for storing data in [2500, 5000)
- The starting position of Node3 is 5000, and it is responsible for storing data in [5000, 7500)
- The starting position of Node4 is 7500, and it is responsible for storing data in [7500, 10000)
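The four-node layout above can be sketched as a sorted map from start position to node name. Note that this sketch uses `java.util.TreeMap` and its `floorEntry` method for the lookup, which is a different technique from the linear scan over a node list used in the implementation later in this article. The class name `RingLookup` and its methods are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; the article's own implementation scans a list instead.
class RingLookup {
    // Maps each node's start position to the node's name.
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int scope;

    RingLookup(int scope) {
        this.scope = scope;
    }

    void addNode(int start, String name) {
        ring.put(start, name);
    }

    // Returns the node whose range [start, nextStart) contains the position.
    String nodeFor(int position) {
        if (position < 0 || position >= scope) {
            throw new IllegalArgumentException("position outside the ring: " + position);
        }
        // floorEntry gives the entry with the greatest key <= position.
        Map.Entry<Integer, String> entry = ring.floorEntry(position);
        if (entry == null) {
            throw new IllegalStateException("no node starts at or before " + position);
        }
        return entry.getValue();
    }

    // Builds the four-node ring described in the list above.
    static RingLookup fourNodeRing() {
        RingLookup lookup = new RingLookup(10000);
        lookup.addNode(0, "Node1");
        lookup.addNode(2500, "Node2");
        lookup.addNode(5000, "Node3");
        lookup.addNode(7500, "Node4");
        return lookup;
    }

    public static void main(String[] args) {
        RingLookup lookup = fourNodeRing();
        System.out.println(lookup.nodeFor(2499)); // last position owned by Node1
        System.out.println(lookup.nodeFor(2500)); // first position owned by Node2
    }
}
```

Because each range is closed on the left and open on the right, position 2500 belongs to Node2, not Node1.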
The most important feature of a consistent hashing algorithm is how it handles added or removed nodes.
Suppose a new node, Node5, is added with a starting position of 1250. The only node affected is Node1, whose storage range shrinks from [0, 2500) to [0, 1250). The storage range of Node5 is [1250, 2500), so data falling in [1250, 2500) must be moved to Node5. Importantly, the other nodes do not need to change; in effect, Node5 takes over part of Node1's work. If Node3 is deleted, the data on Node3 must be moved to Node2, and the range of Node2 expands to [2500, 7500); Node2 takes over Node3's work.
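The two scenarios just described can be checked with a short sketch that counts how many ring positions change owner. It models the ring as a `TreeMap` from start position to node name, which is an assumption of this example rather than the data structure used in the implementation later on; the class `RingChangeDemo` and its methods are likewise hypothetical.

```java
import java.util.TreeMap;

// Hypothetical demo; models the ring as start position -> node name.
class RingChangeDemo {
    static final int SCOPE = 10000;

    // Owner of a position: the node with the greatest start <= position.
    // Assumes some node starts at 0, so floorEntry never returns null.
    static String owner(TreeMap<Integer, String> ring, int position) {
        return ring.floorEntry(position).getValue();
    }

    // Counts the ring positions whose owner differs between the two rings.
    static int changedPositions(TreeMap<Integer, String> before,
                                TreeMap<Integer, String> after) {
        int changed = 0;
        for (int p = 0; p < SCOPE; p++) {
            if (!owner(before, p).equals(owner(after, p))) {
                changed++;
            }
        }
        return changed;
    }

    static TreeMap<Integer, String> fourNodeRing() {
        TreeMap<Integer, String> ring = new TreeMap<>();
        ring.put(0, "Node1");
        ring.put(2500, "Node2");
        ring.put(5000, "Node3");
        ring.put(7500, "Node4");
        return ring;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> original = fourNodeRing();

        // Add Node5 at 1250: only [1250, 2500) changes owner.
        TreeMap<Integer, String> withNode5 = new TreeMap<>(original);
        withNode5.put(1250, "Node5");
        System.out.println(changedPositions(original, withNode5)); // prints 1250

        // Remove Node3: only [5000, 7500) changes owner, and it goes to Node2.
        TreeMap<Integer, String> withoutNode3 = new TreeMap<>(original);
        withoutNode3.remove(5000);
        System.out.println(changedPositions(original, withoutNode3)); // prints 2500
        System.out.println(owner(withoutNode3, 6000)); // prints Node2
    }
}
```

In both cases the positions owned by the untouched nodes stay exactly where they were.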
A detailed implementation of the consistent hashing algorithm in Java
Java is an object-oriented language, so the first step is to abstract the objects. Node represents a node and has three properties: a name, a starting position, and a list of data. Because matching data to a node works on a range, Node also carries an end property for simplicity. Strictly speaking there should be separate concepts of data and data key, but for simplicity the data in this demo is a string, and its key is the string itself.
The entire ring has a length defined as scope, which defaults to 10000.
The algorithm for adding a node is to find the largest section (the node with the widest range) and place the new node in the middle of it. It could of course be changed to find the node under the most pressure (the largest data volume) and place the new node after it. Removing a node is a little trickier. Normally the removed node's range and data are merged into the preceding node. If the removed node starts at position 0, however, the next node takes over instead and its start position is set to 0.
This guarantees that as long as any node exists, there is always a node that starts at 0, which simplifies the algorithm and the processing logic.
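The placement rule for a new node can be sketched in isolation as follows. This is a simplified illustration that works on a plain array of sorted start positions; the class `LargestSection` and the method `newNodeStart` are hypothetical names, and ties are resolved in favour of the first largest section, as a strict `>` comparison would do.

```java
// Hypothetical helper illustrating the "split the largest section" rule.
class LargestSection {
    // starts: node start positions in ascending order, with starts[0] == 0.
    // scope:  length of the ring.
    // Returns the start position a new node would get: the midpoint of the
    // widest section. An empty ring yields 0, the start of the first node.
    static int newNodeStart(int[] starts, int scope) {
        int bestStart = 0;
        int bestLength = 0;
        for (int i = 0; i < starts.length; i++) {
            // A section ends where the next node starts, or at the ring's end.
            int end = (i + 1 < starts.length) ? starts[i + 1] : scope;
            int length = end - starts[i];
            if (length > bestLength) { // strict: the first widest section wins
                bestLength = length;
                bestStart = starts[i];
            }
        }
        return bestStart + bestLength / 2;
    }

    public static void main(String[] args) {
        // Four equal sections: the first one, [0, 2500), is split at 1250.
        System.out.println(newNodeStart(new int[] {0, 2500, 5000, 7500}, 10000));
        // After that split, the widest remaining section is [2500, 5000).
        System.out.println(newNodeStart(new int[] {0, 1250, 2500, 5000, 7500}, 10000));
    }
}
```

For the four-node ring this yields 1250, the same position used for Node5 in the earlier example.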
The addItem method puts data into the system. To show how the data is distributed, a desc method is provided that prints the distribution, which is interesting to watch.
The complete code is as follows:
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.LinkedList;
    import java.util.List;

    public class ConsistentHash {
        // Length of the ring; positions fall in [0, scope).
        private int scope = 10000;
        // Nodes ordered by start position; the first node always starts at 0.
        private List<Node> nodes;

        public ConsistentHash() {
            nodes = new ArrayList<Node>();
        }

        public int getScope() {
            return scope;
        }

        public void setScope(int scope) {
            this.scope = scope;
        }

        // Maps a key to a ring position. floorMod is used instead of
        // Math.abs(...) % scope, which is negative for Integer.MIN_VALUE.
        private int position(String data) {
            return Math.floorMod(data.hashCode(), scope);
        }

        public void addNode(String nodeName) {
            if (nodeName == null || nodeName.trim().equals("")) {
                throw new IllegalArgumentException("name can't be null or empty");
            }
            if (containNodeName(nodeName)) {
                throw new IllegalArgumentException("duplicate name");
            }
            Node node = new Node(nodeName);
            if (nodes.size() == 0) {
                node.setStart(0);
                node.setEnd(scope);
                nodes.add(node);
            } else {
                // Split the largest section; the new node takes the upper half.
                Node maxNode = getMaxSectionNode();
                int middle = maxNode.start + (maxNode.end - maxNode.start) / 2;
                node.start = middle;
                node.end = maxNode.end;
                int maxPosition = nodes.indexOf(maxNode);
                nodes.add(maxPosition + 1, node);
                maxNode.setEnd(middle);
                // Move the data that now falls in the new node's range.
                // Iterator.remove is the safe way to delete while iterating.
                Iterator<String> iter = maxNode.datas.iterator();
                while (iter.hasNext()) {
                    String data = iter.next();
                    if (position(data) >= middle) {
                        iter.remove();
                        node.datas.add(data);
                    }
                }
            }
        }

        public void removeNode(String nodeName) {
            if (!containNodeName(nodeName)) {
                throw new IllegalArgumentException("unknown name");
            }
            if (nodes.size() == 1 && nodes.get(0).datas.size() > 0) {
                throw new IllegalArgumentException("last node, and still have data");
            }
            Node node = findNode(nodeName);
            int position = nodes.indexOf(node);
            if (position == 0) {
                if (nodes.size() > 1) {
                    // The next node takes over the data and now starts at 0.
                    Node newFirstNode = nodes.get(1);
                    newFirstNode.datas.addAll(node.datas);
                    newFirstNode.setStart(0);
                }
            } else {
                // The previous node takes over the data and extends its range.
                Node lastNode = nodes.get(position - 1);
                lastNode.datas.addAll(node.datas);
                lastNode.setEnd(node.end);
            }
            nodes.remove(position);
        }

        public void addItem(String item) {
            if (item == null || item.trim().equals("")) {
                throw new IllegalArgumentException("item can't be null or empty");
            }
            Node node = findNode(position(item));
            if (node == null) {
                throw new IllegalStateException("no node available");
            }
            node.datas.add(item);
        }

        // Prints every node with its range and the data it holds.
        public void desc() {
            System.out.println("Status:");
            for (Node node : nodes) {
                System.out.println(node.name + ":(" + node.start + "," + node.end + "):"
                        + listString(node.datas));
            }
        }

        private String listString(LinkedList<String> datas) {
            return "{" + String.join(",", datas) + "}";
        }

        private boolean containNodeName(String nodeName) {
            return findNode(nodeName) != null;
        }

        // Finds the node whose range [start, end) contains the position.
        private Node findNode(int value) {
            for (Node node : nodes) {
                if (value >= node.start && value < node.end) {
                    return node;
                }
            }
            return null;
        }

        private Node findNode(String nodeName) {
            for (Node node : nodes) {
                if (node.name.equals(nodeName)) {
                    return node;
                }
            }
            return null;
        }

        // Returns the node with the widest range (the first one on a tie).
        private Node getMaxSectionNode() {
            int maxSection = 0;
            Node maxNode = null;
            for (Node node : nodes) {
                int section = node.end - node.start;
                if (section > maxSection) {
                    maxNode = node;
                    maxSection = section;
                }
            }
            return maxNode;
        }

        static class Node {
            private String name;
            private int start;
            private int end;
            private LinkedList<String> datas;

            public Node(String name) {
                this.name = name;
                datas = new LinkedList<String>();
            }

            public String getName() { return name; }
            public void setName(String name) { this.name = name; }
            public int getStart() { return start; }
            public void setStart(int start) { this.start = start; }
            public int getEnd() { return end; }
            public void setEnd(int end) { this.end = end; }
            public LinkedList<String> getDatas() { return datas; }
            public void setDatas(LinkedList<String> datas) { this.datas = datas; }
        }

        public static void main(String[] args) {
            ConsistentHash hash = new ConsistentHash();
            hash.addNode("Machine-1");
            hash.addNode("Machine-2");
            hash.addNode("Machine-3");
            hash.addNode("Machine-4");
            hash.addItem("Hello");
            hash.addItem("hash");
            hash.addItem("main");
            hash.addItem("args");
            hash.addItem("LinkedList");
            hash.addItem("End");
            hash.desc();

            hash.removeNode("Machine-1");
            hash.desc();

            hash.addNode("Machine-5");
            hash.desc();

            hash.addItem("Scheduling");
            hash.addItem("queue");
            hash.addItem("thumb");
            hash.addItem("quantum");
            hash.addItem("approaches");
            hash.addItem("Migration");
            hash.addItem("null");
            hash.addItem("Feedback");
            hash.addItem("ageing");
            hash.addItem("Bursts");
            hash.addItem("shorter");
            hash.desc();

            hash.addNode("Machine-6");
            hash.addNode("Machine-7");
            hash.addNode("Machine-8");
            hash.desc();

            hash.addNode("Machine-9");
            hash.addNode("Machine-10");
            hash.addNode("Machine-11");
            hash.desc();

            hash.addNode("Machine-12");
            hash.addNode("Machine-13");
            hash.addNode("Machine-14");
            hash.addNode("Machine-15");
            hash.addNode("Machine-16");
            hash.addNode("Machine-17");
            hash.desc();
        }
    }
Areas that still need improvement:
- Have different nodes back each other up, to improve the reliability of the system.
- Adjust node ranges dynamically, since the data distribution may sometimes be unbalanced.