Consistent hashing addresses a common problem: how to distribute a large key space (for example, the MD5 value range) across multiple server nodes.
The straightforward approach is an ordinary hash (for example, taking the hash modulo the number of nodes), but if the set of server nodes can change dynamically, every node change invalidates the vast majority of the existing key-to-node mappings.
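To make the cost of plain modulo hashing concrete, here is a minimal sketch that measures how many keys change nodes when the node count goes from 4 to 5. The key names, the node counts, and the helper names (key_hash, modulo_node, moved_fraction) are illustrative assumptions, not part of any particular system.

```python
import hashlib

def key_hash(key):
    """MD5 of the key as a 128-bit integer."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

def modulo_node(key, num_nodes):
    """Plain hashing: the node index is the hash modulo the node count."""
    return key_hash(key) % num_nodes

def moved_fraction(keys, old_nodes, new_nodes):
    """Fraction of keys whose node index changes when the node count changes."""
    moved = sum(
        1 for k in keys
        if modulo_node(k, old_nodes) != modulo_node(k, new_nodes)
    )
    return moved / len(keys)

# Hypothetical workload: 10,000 synthetic keys, cluster grows from 4 to 5 nodes.
keys = [f"key-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.1%} of keys moved when going from 4 to 5 nodes")
```

A key keeps its node only when its hash gives the same remainder modulo 4 and modulo 5, which holds for 4 of every 20 residues, so roughly 80% of the keys should move in this setup.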
Consistent hashing instead treats the key's value range as a ring. Each server is mapped to multiple points on the ring (virtual nodes, or vnodes), and the points of all servers together divide the ring into segments. The key -> server mapping works as follows: find the key's point on the ring, then move clockwise to the nearest virtual node. The server that owns that vnode handles the key.
Note that each server should be mapped to many virtual nodes (for example, on the order of 100), and the virtual nodes should be spread evenly around the ring rather than clustered together. In the extreme case where each server maps to a single vnode, a failed server's entire load falls on the next server on the ring, and the load-balancing effect is lost.
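The effect of the vnode count on failure handling can be checked with a small experiment. The sketch below builds a ring for five hypothetical servers, removes one, and counts which survivors inherit its keys, once with 1 vnode per server and once with 100. The server addresses, the "server#i" vnode naming, and the function names are assumptions made for illustration.

```python
import bisect
import hashlib
from collections import Counter

def md5_int(text):
    """MD5 digest of text as an integer in [0, 2**128)."""
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

def build_ring(servers, vnodes_per_server):
    """Return (sorted vnode positions, owning server for each position)."""
    ring = sorted(
        (md5_int(f"{server}#{i}"), server)  # "server#i" naming is an assumption
        for server in servers
        for i in range(vnodes_per_server)
    )
    return [pos for pos, _ in ring], [server for _, server in ring]

def lookup(ring, key):
    """Owner of the first vnode clockwise from the key, wrapping past the end."""
    positions, owners = ring
    idx = bisect.bisect_left(positions, md5_int(key)) % len(positions)
    return owners[idx]

def takeover_counts(servers, failed, vnodes_per_server, keys):
    """Count how many of the failed server's keys each survivor inherits."""
    before = build_ring(servers, vnodes_per_server)
    after = build_ring([s for s in servers if s != failed], vnodes_per_server)
    return Counter(lookup(after, k) for k in keys if lookup(before, k) == failed)

servers = [f"10.0.0.{n}" for n in range(1, 6)]   # hypothetical addresses
keys = [f"key-{i}" for i in range(5000)]         # synthetic keys
single = takeover_counts(servers, "10.0.0.1", 1, keys)
many = takeover_counts(servers, "10.0.0.1", 100, keys)
print("1 vnode per server:", dict(single))
print("100 vnodes per server:", dict(many))
```

With one vnode per server, all of the failed server's keys land on a single successor. With 100 vnodes per server, the same keys should be shared among several survivors.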
In practice, you can dynamically adjust the number of vnodes mapped to a server based on the server capacity and current load to better balance the load.
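One way to picture capacity-based vnode counts is the sketch below, which gives each server a number of vnodes proportional to an integer capacity weight. The capacities, the vnodes_per_unit parameter, and the "server#i" naming are hypothetical. Adjusting to current load would amount to rebuilding the ring with new weights, which this sketch does not automate.

```python
import bisect
import hashlib
from collections import Counter

def md5_int(text):
    """MD5 digest of text as an integer in [0, 2**128)."""
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

def build_weighted_ring(capacities, vnodes_per_unit=50):
    """capacities maps server -> integer weight; vnode count scales with it."""
    ring = sorted(
        (md5_int(f"{server}#{i}"), server)
        for server, capacity in capacities.items()
        for i in range(capacity * vnodes_per_unit)
    )
    return [pos for pos, _ in ring], [server for _, server in ring]

# Hypothetical cluster: the third server has four times the capacity of the first.
capacities = {"10.0.0.1": 1, "10.0.0.2": 2, "10.0.0.3": 4}
positions, owners = build_weighted_ring(capacities)

# Assign 20,000 synthetic keys and count how many each server receives.
load = Counter(
    owners[bisect.bisect_left(positions, md5_int(f"key-{i}")) % len(positions)]
    for i in range(20_000)
)
for server, count in sorted(load.items()):
    print(server, count)
```

The share of keys each server receives should roughly follow its share of vnodes, so the server with weight 4 ends up with the largest portion.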
Assume that the ring's space is the MD5 value range, that is, 0 to 2^128 - 1. Two hash mappings must be considered during implementation.
1. The server -> vnode_list mapping. You can append several sequence numbers (or random numbers) to the server's IP address and take the MD5 of each resulting string to obtain the vnode list.
2. The resource_key -> ring_key mapping. For example, if the resource key is a URL, you can apply MD5 to it directly.
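The two mappings above can be sketched in a few lines. The function names ring_key and vnode_list, the "-i" suffix format, and the example IP address and URL are assumptions for illustration. Only the use of MD5 over the IP-plus-number string and over the URL comes from the description above.

```python
import hashlib

RING_SIZE = 2 ** 128  # MD5 produces 128-bit values: 0 .. 2**128 - 1

def ring_key(resource_key):
    """resource_key -> ring_key: MD5 of the key, read as a big-endian integer."""
    digest = hashlib.md5(resource_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

def vnode_list(server_ip, count):
    """server -> vnode_list: hash the IP with a sequence number appended."""
    return [ring_key(f"{server_ip}-{i}") for i in range(count)]

vnodes = vnode_list("10.0.0.1", 100)                 # hypothetical server address
url_position = ring_key("http://example.com/index.html")
print(len(vnodes), "vnodes; URL maps to ring position", url_position)
```

Both mappings land in the same 128-bit space, which is what lets a key's position be compared directly against vnode positions.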
We also need to consider the partition relationship, that is, which segment of the ring each vnode is responsible for. This can be stored in a binary tree or a sorted array; with a sorted array, the lookup is a binary search.
Finally, there is the ownership relationship between vnodes and servers, which is the first of the two hash mappings above.
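Putting the pieces together, a minimal ring might look like the sketch below: a sorted array of vnode positions for the partition (searched with binary search) and a dictionary for vnode ownership. The class name HashRing, its methods, the "server#i" vnode naming, and the example addresses and URLs are all assumptions. This is one possible arrangement, not the implementation of any specific library.

```python
import bisect
import hashlib

def md5_int(text):
    """MD5 digest of text as an integer in [0, 2**128)."""
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

class HashRing:
    def __init__(self, vnodes_per_server=100):
        self.vnodes_per_server = vnodes_per_server
        self._positions = []  # sorted vnode positions: the partition of the ring
        self._owner = {}      # vnode position -> server: the ownership relationship

    def add_server(self, server):
        for i in range(self.vnodes_per_server):
            pos = md5_int(f"{server}#{i}")
            if pos in self._owner:  # already present (or a collision): skip
                continue
            self._owner[pos] = server
            bisect.insort(self._positions, pos)

    def remove_server(self, server):
        self._positions = [p for p in self._positions if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key):
        """Binary-search for the first vnode clockwise from the key's position."""
        if not self._positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._positions, md5_int(key))
        if idx == len(self._positions):  # past the last vnode: wrap around
            idx = 0
        return self._owner[self._positions[idx]]

ring = HashRing()
for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"):
    ring.add_server(ip)

keys = [f"http://example.com/page/{i}" for i in range(5000)]
before = {k: ring.lookup(k) for k in keys}
ring.remove_server("10.0.0.4")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved after removing one server")
```

In this sketch, only keys that were owned by the removed server change owner. Every other key keeps its mapping, which is the property that plain modulo hashing lacks.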
Consistent hashing is a standard tool in many Internet platform implementations, such as Amazon Dynamo (see "Dynamo: Amazon's Highly Available Key-value Store").
For memcache, consistent hashing is done on the client side; see the Last.fm implementation (libketama): http://cn.last.fm/user/RJ/journal/2007/04/10/rz_libketama_-_a_consistent_hashing_algo_for_memcache_clients
For more information, see the Wikipedia entry: http://en.wikipedia.org/wiki/Consistent_hashing