The evolution of the NoSQL distributed architecture:
Phase one: Because a single server cannot carry the whole load, a distributed architecture is used. The initial algorithm is hash() mod n, where hash() is usually computed over the user ID and n is the number of nodes. This method is easy to implement and meets operational requirements. The disadvantage is that the system cannot recover automatically when a node fails.
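As a rough illustration of the hash() mod n scheme, here is a minimal Python sketch. The helper names (stable_hash, pick_node), the server names, and the choice of MD5 as the hash function are assumptions made for the example and do not come from the original design.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash of a string (assumed helper).

    Python's built-in hash() is salted per process for strings,
    so a digest is used to get the same value on every client.
    """
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def pick_node(user_id: str, n: int) -> int:
    """Return the index of the node that owns this user: hash() mod n."""
    return stable_hash(user_id) % n


# Hypothetical cluster of four servers.
nodes = ["server-0", "server-1", "server-2", "server-3"]
print(nodes[pick_node("user-42", len(nodes))])
```

Every user maps to exactly one node here, so if that node goes down there is nowhere else to send the user, which is the weakness described above.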
Phase two: To handle single-node failures, use hash() mod (N/2), so that every user has two candidate servers and the client picks one of them at random. Because users on different servers need to interact with each other, every server must know exactly where each user is, so the user's location is saved in memcached.
When a machine fails, the client can automatically switch to the corresponding backup. However, the backup server does not hold the user's session before the switch, so the design of this phase has the following problems:
Load imbalance: especially after a single machine fails, the remaining server of the pair takes on too much load.
Machines cannot be added or removed dynamically.
The client has to log in again when a node fails.
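The phase-two scheme could be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the two candidate servers are derived from hash() mod (N/2), so the pairing used here (servers 2g and 2g+1 for group g) is a guess, the function names are hypothetical, and a plain dictionary stands in for memcached.

```python
import hashlib
import random


def stable_hash(key: str) -> int:
    """Deterministic hash of a string (assumed helper)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def candidates(user_id: str, n: int) -> list:
    """Return the two servers that may host this user.

    Assumed pairing: group g = hash() mod (n/2) maps to servers 2g and 2g+1.
    """
    group = stable_hash(user_id) % (n // 2)
    return [2 * group, 2 * group + 1]


# Stand-in for memcached: user id -> server currently holding the user.
location_cache = {}


def connect(user_id: str, n: int, alive: set) -> int:
    """Pick one live candidate at random and record the user's location."""
    options = [s for s in candidates(user_id, n) if s in alive]
    if not options:
        raise RuntimeError("both servers for this user are down")
    server = random.choice(options)
    location_cache[user_id] = server
    return server


alive = set(range(8))                  # hypothetical cluster of 8 servers
first = connect("user-42", 8, alive)
alive.discard(first)                   # simulate a failure of that server
second = connect("user-42", 8, alive)  # client falls back to the other one
print(first, second)
```

After the failure, all users of the failed server land on its single partner, which shows the load-imbalance problem listed above.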
Phase three: To get rid of the hard-coded hash() mod n algorithm, use consistent hashing instead.
We divide each server into V virtual nodes and place all N*V virtual nodes at random positions on a consistent hash ring. Each user is mapped to a position on the ring, and the first virtual node found clockwise from that position is the user's node. When that node fails, the next virtual node clockwise is used as the backup.
Advantage: when a single node fails, its load is spread evenly across all the other nodes, and the implementation is also more elegant.
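A consistent hash ring with virtual nodes might look like the sketch below. It is a minimal illustration, not the original implementation: the class and method names are hypothetical, and virtual nodes are placed by hashing "server#i" (a pseudo-random placement) instead of drawing truly random positions.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash of a string (assumed helper)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent hash ring where each server owns `vnodes` virtual nodes."""

    def __init__(self, servers, vnodes=100):
        # Sorted (position, server) pairs: N*V virtual nodes on the ring.
        self._ring = sorted(
            (stable_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, user_id, down=frozenset()):
        """Walk clockwise from the user's position to the first live server."""
        start = bisect.bisect_left(self._points, stable_hash(user_id))
        for step in range(len(self._ring)):
            _, server = self._ring[(start + step) % len(self._ring)]
            if server not in down:
                return server
        raise RuntimeError("no live servers")


ring = HashRing(["server-a", "server-b", "server-c", "server-d"], vnodes=50)
primary = ring.lookup("user-42")
backup = ring.lookup("user-42", down={primary})  # next live node clockwise
print(primary, backup)
```

Because each server's virtual nodes are scattered around the ring, the users of a failed server fall through to many different successors, while users on the surviving servers keep their original node.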