Consistent hashing is a commonly used algorithm in distributed systems. Consider a distributed storage system that must place each piece of data on a specific node. With an ordinary hash scheme such as key % n, where key is the data's key and n is the number of machine nodes, nearly every mapping becomes invalid as soon as a machine joins or leaves the cluster. For persistent storage this means migrating data; for a distributed cache it means that most of the existing cache entries are invalidated.
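The cost of key % n can be made concrete with a small sketch. It assumes integer keys that stand in for hash values; the class and method names are illustrative and do not come from the original article.

```java
class ModuloRemapDemo {
    // Node index for a key under plain modulo placement (key % n).
    static int nodeFor(int key, int n) {
        return Math.floorMod(key, n);
    }

    // Counts how many keys in [0, keyCount) land on a different node
    // when the cluster size changes from oldN to newN.
    static int countMoved(int keyCount, int oldN, int newN) {
        int moved = 0;
        for (int key = 0; key < keyCount; key++) {
            if (nodeFor(key, oldN) != nodeFor(key, newN)) {
                moved++;
            }
        }
        return moved;
    }
}
```

With 1,000 consecutive keys, ModuloRemapDemo.countMoved(1000, 4, 5) reports 800 moved keys: a key keeps its node only when key % 4 equals key % 5, which holds for 4 out of every 20 keys.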
Consistent hashing was introduced to address this problem:
The data is mapped into a large space using a hash function (such as MD5), and that space is treated as a ring. Each machine node also occupies a position on the ring. When a piece of data is stored, its hash value corresponds to a position on the ring. For example, K1 corresponds to the position shown in the figure; moving clockwise from there, the first machine node found is B, so K1 is stored on node B.
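The clockwise lookup maps naturally onto a sorted map. The following is a minimal sketch, assuming ring positions are passed in directly instead of being computed by a hash function; SimpleRing and its methods are hypothetical names chosen for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

class SimpleRing {
    // Ring position -> node name; the TreeMap keeps positions sorted.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(long position, String node) {
        ring.put(position, node);
    }

    // Returns the first node at or after the key's position,
    // wrapping around to the first node on the ring if none follows.
    String nodeFor(long keyPosition) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(keyPosition);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }
}
```

If nodes A, B, C and D are added at the made-up positions 100, 200, 300 and 400, a key at position 150 is stored on B, and a key at position 450, past the last node, wraps around to A.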
If node B goes down, the data on B falls to node C, as shown in the figure.
In this way only node C is affected, and the data on the other nodes A and D is not. However, this can cause an "avalanche": node C now has to carry node B's data as well, so its load rises and it is more likely to go down too, and so on in turn until the whole cluster is down.
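The load shift behind the avalanche can be illustrated with a small sketch. It assumes four nodes at the made-up ring positions 100, 200, 300 and 400, and one key at every position from 1 to 400; NodeFailureDemo and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class NodeFailureDemo {
    // Ring with nodes A to D at assumed, evenly spaced positions.
    static TreeMap<Integer, String> fourNodeRing() {
        TreeMap<Integer, String> ring = new TreeMap<>();
        ring.put(100, "A");
        ring.put(200, "B");
        ring.put(300, "C");
        ring.put(400, "D");
        return ring;
    }

    // First node at or after the key's position, wrapping around the ring.
    static String owner(TreeMap<Integer, String> ring, int keyPosition) {
        Map.Entry<Integer, String> entry = ring.ceilingEntry(keyPosition);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Number of keys (one per position 1..400) stored on each node.
    static Map<String, Integer> load(TreeMap<Integer, String> ring) {
        Map<String, Integer> counts = new HashMap<>();
        for (int position = 1; position <= 400; position++) {
            counts.merge(owner(ring, position), 1, Integer::sum);
        }
        return counts;
    }
}
```

Before the failure each node holds 100 keys. After ring.remove(200) takes B off the ring, A and D still hold 100 keys each, while C holds 200: all of B's keys land on its single clockwise successor.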
To solve this, the concept of "virtual nodes" is introduced: the ring holds many virtual nodes, data is stored by finding the next virtual node clockwise on the ring, and each virtual node is associated with a real node, as shown in the figure.
In the figure, A1, A2, B1, B2, C1, C2, D1 and D2 are virtual nodes. Machine A stores the data of A1 and A2, machine B stores the data of B1 and B2, machine C stores the data of C1 and C2, and machine D stores the data of D1 and D2. Because the virtual nodes are numerous and evenly distributed, the data of a failed machine is spread over several remaining machines instead of a single one, which makes the "avalanche" much less likely.
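The effect of virtual nodes can be sketched in the same style. The example assumes two virtual nodes per machine at made-up, interleaved ring positions and one key at every position from 1 to 400; VirtualNodeDemo and its methods are hypothetical names, and the positions are not taken from the figure.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class VirtualNodeDemo {
    // Ring position -> real machine that owns the virtual node there.
    static TreeMap<Integer, String> ring() {
        TreeMap<Integer, String> ring = new TreeMap<>();
        ring.put(50, "A");   // A1
        ring.put(100, "B");  // B1
        ring.put(150, "C");  // C1
        ring.put(200, "D");  // D1
        ring.put(250, "B");  // B2
        ring.put(300, "D");  // D2
        ring.put(350, "A");  // A2
        ring.put(400, "C");  // C2
        return ring;
    }

    // Removes every virtual node that belongs to the given machine.
    static void removeMachine(TreeMap<Integer, String> ring, String machine) {
        ring.values().removeIf(machine::equals);
    }

    // Number of keys (one per position 1..400) stored on each machine.
    static Map<String, Integer> load(TreeMap<Integer, String> ring) {
        Map<String, Integer> counts = new HashMap<>();
        for (int position = 1; position <= 400; position++) {
            Map.Entry<Integer, String> entry = ring.ceilingEntry(position);
            String machine = (entry != null ? entry : ring.firstEntry()).getValue();
            counts.merge(machine, 1, Integer::sum);
        }
        return counts;
    }
}
```

Each machine starts with 100 keys. After removeMachine(ring, "B"), the keys behind B1 move to C and the keys behind B2 move to D, so C and D hold 150 keys each and A still holds 100. With more virtual nodes per machine, the extra load would be divided into correspondingly smaller shares.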
Java implementation:
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.util.List;
    import java.util.SortedMap;
    import java.util.TreeMap;

    public class Shard<S> { // S encapsulates information about a machine node, such as name, password, IP, port, etc.
        private TreeMap<Long, S> nodes; // virtual nodes
        private List<S> shards; // real machine nodes
        private final int NODE_NUM = 100; // number of virtual nodes associated with each machine node

        public Shard(List<S> shards) {
            super();
            this.shards = shards;
            init();
        }

        private void init() { // initialize the consistent hash ring
            nodes = new TreeMap<Long, S>();
            for (int i = 0; i != shards.size(); ++i) { // each real machine node needs associated virtual nodes
                final S shardInfo = shards.get(i);
                for (int n = 0; n < NODE_NUM; n++)
                    // one real machine node is associated with NODE_NUM virtual nodes
                    nodes.put(hash("shard-" + i + "-node-" + n), shardInfo);
            }
        }

        public S getShardInfo(String key) {
            SortedMap<Long, S> tail = nodes.tailMap(hash(key)); // find a virtual node clockwise along the ring
            if (tail.size() == 0) {
                // no virtual node at or after the key's hash: wrap around to the first one on the ring
                return nodes.get(nodes.firstKey());
            }
            return tail.get(tail.firstKey()); // return the real machine node associated with that virtual node
        }

        /**
         * MurmurHash: a non-cryptographic, high-performance hash algorithm.
         * It is much faster than traditional hash algorithms such as CRC32, MD5 and SHA-1
         * (the latter two are cryptographic hash algorithms whose inherent complexity
         * carries an unavoidable performance cost), and its collision rate is reported to be very low.
         * http://murmurhash.googlepages.com/
         */
        private Long hash(String key) {
            ByteBuffer buf = ByteBuffer.wrap(key.getBytes());
            int seed = 0x1234ABCD;
            ByteOrder byteOrder = buf.order();
            buf.order(ByteOrder.LITTLE_ENDIAN);
            long m = 0xc6a4a7935bd1e995L;
            int r = 47;
            long h = seed ^ (buf.remaining() * m);
            long k;
            while (buf.remaining() >= 8) { // mix the input eight bytes at a time
                k = buf.getLong();
                k *= m;
                k ^= k >>> r;
                k *= m;
                h ^= k;
                h *= m;
            }
            if (buf.remaining() > 0) { // mix the remaining one to seven bytes
                ByteBuffer finish = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
                // for the big-endian version, do this first:
                // finish.position(8 - buf.remaining());
                finish.put(buf).rewind();
                h ^= finish.getLong();
                h *= m;
            }
            h ^= h >>> r;
            h *= m;
            h ^= h >>> r;
            buf.order(byteOrder);
            return h;
        }
    }