Consistent Hashing: A Java Implementation Using TreeMap

Source: Internet
Author: User

Storing the data for different account numbers on different machines spreads the load. Suppose we have 1 million QQ numbers and 10 machines: how should we divide them?

The simplest, most brute-force method is to take the QQ number modulo 10; the results 0-9 correspond to the 10 machines. For example, the user with QQ number 23900 goes to machine 0, the user with QQ number 23901 goes to machine 1, and so on. Then the problem arrives: QQ users grow sharply from 1 million to 5 million, and 10 machines clearly can no longer cope, so we expand to 25 machines. Now every lookup uses modulo 25 instead of modulo 10, so most of the existing data is no longer on the machine the new mapping points to. We're finished! Time to run away ...

One measure of a hash algorithm is monotonicity, which is defined as follows:

Monotonicity means that if some content has already been assigned to buffers by hashing and a new buffer is then added to the system, the hash result should ensure that previously assigned content either stays in its original buffer or maps to the new buffer; it must not be remapped to a different buffer in the old buffer set.

It is easy to see that the simple hash algorithm above, hash(object) % N, can hardly meet the monotonicity requirement.
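To make the failure concrete, the following minimal sketch counts how many ids land on a different machine when the modulus changes from 10 to 25. It assumes, purely for illustration, that the QQ numbers are the consecutive integers 0 to 999,999; the class and method names are hypothetical.

```java
import java.util.stream.LongStream;

final class ModuloRemap {
    /** Counts ids in [0, total) whose machine index differs between id % oldN and id % newN. */
    static long movedCount(long total, int oldN, int newN) {
        return LongStream.range(0, total)
                .filter(id -> id % oldN != id % newN)
                .count();
    }

    public static void main(String[] args) {
        long total = 1_000_000L; // assumed: ids are simply 0 .. 999,999
        long moved = movedCount(total, 10, 25);
        System.out.printf("%d of %d ids change machine (%.0f%%)%n",
                moved, total, 100.0 * moved / total);
    }
}
```

Under that assumption only ids whose remainder modulo 50 is below 10 keep their machine, so 800,000 of the 1,000,000 ids (80%) would have to be moved.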

So we want to be able to expand while still keeping the data reasonably dispersed. This is what consistent hashing provides. The consistent hash algorithm maps a value to a 32-bit key, that is, into the numeric space 0 ~ 2^32-1. We can think of this space as a ring whose head (0) joins its tail (2^32-1). When data comes in, we move clockwise from its position to the nearest node point, and that point is the machine we want. For example:

hash("192.168.128.670") ----> A    // node generated from the hash of the server IP

hash("192.168.148.670") ----> C    // node generated from the hash of the server IP

hash("81288812") ----> K1    // value hashed from the QQ number ----> look clockwise to find the machine

hash("8121243812") ----> K4    // value hashed from the QQ number ----> look clockwise to find the machine
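The clockwise search can be sketched with a sorted map. This is a minimal illustration, not the final implementation: the node positions are placed by hand instead of being hashed, and the class name and positions are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

final class RingLookup {
    // Ring position -> node name, kept sorted by position.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(long position, String name) {
        ring.put(position, name);
    }

    /** Returns the first node at or after keyPosition, wrapping to the start of the ring. */
    String nodeFor(long keyPosition) {
        if (ring.isEmpty()) throw new IllegalStateException("ring has no nodes");
        Map.Entry<Long, String> hit = ring.ceilingEntry(keyPosition);
        return (hit != null ? hit : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        RingLookup ring = new RingLookup();
        ring.addNode(1_000L, "A"); // hand-placed positions, for illustration only
        ring.addNode(3_000L, "C");
        System.out.println(ring.nodeFor(2_000L)); // between A and C, so clockwise reaches C
        System.out.println(ring.nodeFor(3_500L)); // past C, so the search wraps around to A
    }
}
```

A sorted map fits here because "nearest point clockwise" is the smallest position greater than or equal to the key's position, with a wrap to the first entry when no such position exists.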


With this scheme, adding a new machine or removing an old one affects only a fraction of the data. This seems perfect, but suppose the data on node B surges and B goes down: all of its data falls to C, C cannot carry the load and goes down too, then everything falls to D, and so on, until every node is down. The whole world goes quiet!!!
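The claim that adding a machine only affects a fraction of the data can be checked with a small experiment. The sketch below is an illustration under assumptions: it uses CRC32 as a stand-in hash into 0 ~ 2^32-1, and the node names and key format are made up. It counts how many keys change owner after a node is added, and how many of those now belong to the new node.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

final class RingGrowth {
    /** Stand-in hash into 0 .. 2^32-1 (CRC32 is an assumption made for this sketch). */
    static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Builds a ring with one point per node name. */
    static TreeMap<Long, String> ringOf(String... nodes) {
        TreeMap<Long, String> ring = new TreeMap<>();
        for (String n : nodes) ring.put(hash(n), n);
        return ring;
    }

    /** First node clockwise from the key's position, wrapping to the start of the ring. */
    static String owner(TreeMap<Long, String> ring, String key) {
        Map.Entry<Long, String> hit = ring.ceilingEntry(hash(key));
        return (hit != null ? hit : ring.firstEntry()).getValue();
    }

    /** Adds newNode; returns {keys that changed owner, keys that changed owner to newNode}. */
    static int[] addAndCount(TreeMap<Long, String> ring, String newNode, int totalKeys) {
        String[] before = new String[totalKeys];
        for (int i = 0; i < totalKeys; i++) before[i] = owner(ring, "qq-" + i);
        ring.put(hash(newNode), newNode);
        int moved = 0, movedToNew = 0;
        for (int i = 0; i < totalKeys; i++) {
            String now = owner(ring, "qq-" + i);
            if (!now.equals(before[i])) {
                moved++;
                if (now.equals(newNode)) movedToNew++;
            }
        }
        return new int[] {moved, movedToNew};
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ring = ringOf("node-a", "node-b", "node-c", "node-d");
        int[] r = addAndCount(ring, "node-e", 100_000);
        System.out.println(r[0] + " keys moved, " + r[1] + " of them to the new node");
    }
}
```

The two counts are always equal: a key either keeps its old owner or moves to the newly added node, which is exactly the monotonicity property. How large the moved fraction is depends on where the hash happens to place the new node.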

Clearly, the service goes down this way because the data is not evenly distributed. So our consistent hash also needs balance.

Balance means that the hash results are spread across all buffers as evenly as possible, so that all of the buffer space is used.
To achieve balance, consistent hashing introduces the concept of virtual nodes. A virtual node is a replica of an actual node in the hash space. One real node corresponds to several virtual nodes, that count is called the replication number, and each virtual node is placed in the hash space by its own hash value. So if we have 25 servers and give each one 10 virtual nodes, there are 250 virtual nodes. This keeps any single node's load from growing too large, and the pressure is shared far more evenly. Everyone carries a part!!!

hash("192.168.128.670#36kr01") ----> A    // node generated from the hash of the server IP

hash("192.168.128.670#36kr02") ----> B    // node generated from the hash of the server IP

hash("192.168.128.670#36kr03") ----> B    // node generated from the hash of the server IP

......
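The effect of the replication number can be explored with a rough experiment. The sketch below makes several assumptions: it uses the first 4 bytes of an MD5 digest as a stand-in for a well-spread hash (it is not MurmurHash), and the server names, the "#" suffix scheme, and the key format are all made up. It counts how many keys each real server receives for a given number of virtual nodes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class VirtualNodeBalance {
    /** Stand-in hash: first 4 bytes of MD5, read as an unsigned 32-bit value. */
    static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.wrap(d).getInt() & 0xFFFFFFFFL;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is required on every Java platform
        }
    }

    /** Returns how many of the keys "qq-0" .. "qq-(keys-1)" each real server receives. */
    static Map<String, Integer> load(List<String> servers, int replicas, int keys) {
        TreeMap<Long, String> ring = new TreeMap<>();
        for (String s : servers)
            for (int r = 0; r < replicas; r++)
                ring.put(hash(s + "#" + r), s); // each virtual node points back to its real server
        Map<String, Integer> counts = new TreeMap<>();
        for (String s : servers) counts.put(s, 0);
        for (int k = 0; k < keys; k++) {
            Map.Entry<Long, String> hit = ring.ceilingEntry(hash("qq-" + k));
            String owner = (hit != null ? hit : ring.firstEntry()).getValue();
            counts.merge(owner, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("server-0", "server-1", "server-2", "server-3");
        for (int replicas : new int[] {1, 10, 100})
            System.out.println(replicas + " replicas: " + load(servers, replicas, 100_000));
    }
}
```

With more virtual nodes per server, the per-server counts would be expected to move closer together, although the exact numbers depend on the hash function and the names used.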

In the end, virtual nodes + MurmurHash is our solution:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// S encapsulates the machine node information, such as name, password, IP, port, etc.
public class Shard<S> {

    private TreeMap<Long, S> nodes; // virtual nodes
    private List<S> shards; // real machine nodes
    private static final int NODE_NUM = 100; // number of virtual nodes associated with each machine node

    public Shard(List<S> shards) {
        super();
        this.shards = shards;
        init();
    }

    private void init() { // initialize the consistent hash ring
        nodes = new TreeMap<Long, S>();
        for (int i = 0; i != shards.size(); ++i) { // each real machine node needs its associated virtual nodes
            final S shardInfo = shards.get(i);

            for (int n = 0; n < NODE_NUM; n++) {
                // one real machine node is associated with NODE_NUM virtual nodes
                nodes.put(hash("shard-" + i + "-node-" + n), shardInfo);
            }
        }
    }

    public S getShardInfo(String key) {
        // follow the ring clockwise to find a virtual node
        SortedMap<Long, S> tail = nodes.tailMap(hash(key));
        if (tail.size() == 0) {
            // nothing at or after the key: wrap around to the first node on the ring
            return nodes.get(nodes.firstKey());
        }
        return tail.get(tail.firstKey()); // the real machine node behind that virtual node
    }

    /**
     * MurmurHash: a high-performance, non-cryptographic hash algorithm.
     * Compared with traditional hashes such as CRC32, MD5, and SHA-1 (the latter two
     * are cryptographic hash algorithms, whose inherent complexity makes a performance
     * cost unavoidable), it is reported to be much faster, with a very low collision rate.
     * http://murmurhash.googlepages.com/
     */
    private Long hash(String key) {
        // An explicit charset keeps the hash identical across platforms.
        ByteBuffer buf = ByteBuffer.wrap(key.getBytes(StandardCharsets.UTF_8));
        int seed = 0x1234ABCD;

        ByteOrder byteOrder = buf.order();
        buf.order(ByteOrder.LITTLE_ENDIAN);

        long m = 0xc6a4a7935bd1e995L;
        int r = 47;

        long h = seed ^ (buf.remaining() * m);

        long k;
        while (buf.remaining() >= 8) {
            k = buf.getLong();

            k *= m;
            k ^= k >>> r;
            k *= m;

            h ^= k;
            h *= m;
        }

        if (buf.remaining() > 0) {
            ByteBuffer finish = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
            // for a big-endian version, do this first:
            // finish.position(8 - buf.remaining());
            finish.put(buf).rewind();
            h ^= finish.getLong();
            h *= m;
        }

        h ^= h >>> r;
        h *= m;
        h ^= h >>> r;

        buf.order(byteOrder);
        return h;
    }
}
