Consistent hashing is a distributed hash table (DHT) technique proposed at MIT in 1997. It was designed to solve the stubborn hot spot problem on the Internet. A consistent hashing algorithm is usually judged by four properties:
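Balance: hash results are spread across all caches as evenly as possible, so that the cache space is used as fully as possible.
Monotonicity: when new caches are added, content that is already cached either stays where it is or moves to one of the new caches. It is never remapped from one old cache to another old cache.
Spread (dispersion): different clients may see different sets of caches, so the same content can end up mapped to different caches. A good algorithm keeps this inconsistency as small as possible.
Load: the same problem seen from the cache side. A single cache may be assigned different content by different clients, and a good algorithm keeps the load on each cache as low as possible.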
Chord, an implementation of consistent hashing
Basic elements: node ID (NID), resource ID (KID), the consistent hash function (SHA-1), and the Chord ring (a ring of size 2^m).
Because hash(k) falls within the 32-bit integer range, we treat the values 0 to 2^32 - 1 as a ring. The steps are:
1. Compute hash(server IP) for each server and place the server at that position on the 0 to 2^n - 1 ring (n is usually 32).
2. Compute the hash of each object's primary key in the same way and place it on the ring as well. Then search clockwise from the object's position; the object is stored on the first server node found.
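The two steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a production implementation: the names ring_hash and HashRing, the example IP addresses, and the choice to take the first 4 bytes of SHA-1 as the 32-bit ring position are all assumptions made for the example.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a position on the 0 .. 2^32 - 1 ring (assumed scheme)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    def __init__(self, servers):
        # Step 1: place every server on the ring, sorted by position.
        self._ring = sorted((ring_hash(s), s) for s in servers)
        self._positions = [pos for pos, _ in self._ring]

    def lookup(self, key: str) -> str:
        # Step 2: hash the key, then search clockwise for the first server.
        idx = bisect.bisect_left(self._positions, ring_hash(key))
        # Past the last server on the ring: wrap around to the first one.
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
owner = ring.lookup("user:42")
print(owner)
```

Keeping the positions in a sorted list lets each lookup use binary search instead of scanning every server. The sketch assumes the ring holds at least one server.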
This is the principle of consistent hashing. Adding a Redis node, say node5, affects only the keys that fall between node5 and the node before it on the ring (the part marked in yellow). It does not cause a global remapping.
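The limited impact of adding a node can be checked with a small experiment. The sketch below is an illustration under assumptions: the node names, the synthetic keys, and the helper names ring_hash and make_lookup are hypothetical, and real Redis clients may hash differently.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Assumed scheme: first 4 bytes of SHA-1 as a 32-bit ring position."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:4], "big")


def make_lookup(nodes):
    """Build the ring once and return a function mapping key -> owning node."""
    ring = sorted((ring_hash(n), n) for n in nodes)
    positions = [pos for pos, _ in ring]

    def lookup(key: str) -> str:
        idx = bisect.bisect_left(positions, ring_hash(key))
        return ring[idx % len(ring)][1]  # wrap around past the last node

    return lookup


nodes = ["node1", "node2", "node3", "node4"]
keys = [f"key-{i}" for i in range(10_000)]

before = make_lookup(nodes)
after = make_lookup(nodes + ["node5"])

moved = [k for k in keys if before(k) != after(k)]
# Every key that changed owner now lives on node5; none moved between old nodes.
only_to_new_node = all(after(k) == "node5" for k in moved)
print(f"{len(moved)} of {len(keys)} keys moved, all to node5: {only_to_new_node}")
```

By contrast, a plain hash(key) % N scheme would remap most keys when N changes from 4 to 5.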
Avalanche effect
Suppose a node goes down, for example node B. The keys K1 and K2 that were stored on B are migrated to node C, and this can cause serious problems. If B held hot data, that data moves to C, so C now has to carry the load of both B and C. If C cannot bear it, C goes down too, its data moves on to D, and then D fails as well. This chain of failures is the avalanche effect.
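The root of the problem is that, with one ring position per node, everything a failed node owned lands on a single successor. The sketch below shows this with hypothetical node names and synthetic keys. Which node ends up as B's successor depends on the hash positions, so it is not necessarily C as in the description above.

```python
import bisect
import hashlib
from collections import Counter


def ring_hash(key: str) -> int:
    """Assumed scheme: first 4 bytes of SHA-1 as a 32-bit ring position."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:4], "big")


def make_lookup(nodes):
    """One ring position per node, no virtual nodes."""
    ring = sorted((ring_hash(n), n) for n in nodes)
    positions = [pos for pos, _ in ring]

    def lookup(key: str) -> str:
        idx = bisect.bisect_left(positions, ring_hash(key))
        return ring[idx % len(ring)][1]

    return lookup


nodes = ["A", "B", "C", "D"]
keys = [f"key-{i}" for i in range(10_000)]

healthy = make_lookup(nodes)
degraded = make_lookup([n for n in nodes if n != "B"])  # B is down

orphaned = [k for k in keys if healthy(k) == "B"]
# Count which surviving nodes inherit B's keys: a single node takes them all.
inherited_by = Counter(degraded(k) for k in orphaned)
print(inherited_by)
```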
To avoid this, we can create several virtual nodes for each physical node, map the virtual nodes to positions on the ring, and store the corresponding data on them.
Node A corresponds to virtual nodes A1 and A2, and the same goes for B, C, and D. Now if node A goes down, its data migrates as follows: A1's data moves to C2, and A2's data moves to D1. A's data is therefore shared between C and D, which prevents the failure from cascading into an avalanche. The number of virtual nodes is configurable, so we can tune it to suit our application.
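A minimal sketch of virtual nodes follows, under stated assumptions: the "A#0", "A#1" labelling of virtual nodes, the count of 100 virtual nodes per physical node, and the helper names are all hypothetical choices for the illustration. It checks that the keys of a failed node A are inherited by more than one surviving node.

```python
import bisect
import hashlib
from collections import Counter


def ring_hash(key: str) -> int:
    """Assumed scheme: first 4 bytes of SHA-1 as a 32-bit ring position."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:4], "big")


def make_lookup(nodes, vnodes_per_node=100):
    # Each physical node appears on the ring many times, once per virtual node.
    ring = sorted(
        (ring_hash(f"{node}#{i}"), node)  # "#" label format is an assumption
        for node in nodes
        for i in range(vnodes_per_node)
    )
    positions = [pos for pos, _ in ring]

    def lookup(key: str) -> str:
        idx = bisect.bisect_left(positions, ring_hash(key))
        return ring[idx % len(ring)][1]  # virtual node resolves to its physical node

    return lookup


nodes = ["A", "B", "C", "D"]
keys = [f"key-{i}" for i in range(10_000)]

healthy = make_lookup(nodes)
degraded = make_lookup([n for n in nodes if n != "A"])  # A is down

orphaned = [k for k in keys if healthy(k) == "A"]
# A's keys are now split among several survivors instead of one successor.
inherited_by = Counter(degraded(k) for k in orphaned)
print(inherited_by)
```

More virtual nodes per physical node generally give a more even split, at the cost of a larger ring to store and search.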