Consistent hashing and open-source implementation

Source: Internet
Author: User
Tags value store

Consistent hashing addresses a common problem: how to distribute a large key space (such as the MD5 value range) across multiple server nodes.

The straightforward approach is an ordinary hash (for example, the hash value modulo the number of servers). However, if the set of server nodes can change dynamically, every node change invalidates the vast majority of the existing key-to-server mappings.
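To make the cost of modulo hashing concrete, here is a minimal sketch that counts how many keys change server when a cluster grows from 4 to 5 nodes. The key names and cluster sizes are made up for illustration; with a uniform hash, roughly 80% of keys would be expected to move in this particular case.

```python
import hashlib

def md5_int(key: str) -> int:
    """Interpret the MD5 digest of key as a 128-bit integer."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

def modulo_server(key: str, n_servers: int) -> int:
    """Plain modulo hashing: server index = hash mod number of servers."""
    return md5_int(key) % n_servers

# Hypothetical keys; any reasonably large set shows the same effect.
keys = [f"key-{i}" for i in range(10000)]

# A key "moves" if its server index differs between the 4-node and 5-node cluster.
moved = sum(1 for k in keys if modulo_server(k, 4) != modulo_server(k, 5))
fraction_moved = moved / len(keys)
print(f"{fraction_moved:.0%} of keys changed server when going from 4 to 5 nodes")
```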

Consistent hashing treats the value range of the keys as a ring. Each server maps to multiple points on the ring (virtual nodes, or vnodes), and the points of all servers together divide the ring into segments. The key -> server mapping works as follows: find the point on the ring that corresponds to the key, then move clockwise to the nearest virtual node. The server that owns that vnode handles the key.
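The clockwise lookup can be illustrated on a toy ring. This is only a sketch: the ring size of 100, the vnode positions, and the server names A, B, and C are invented for the example, and "clockwise" is taken to mean increasing position with wrap-around at the top of the range.

```python
RING_SIZE = 100  # toy ring; a real ring would span the full hash range

# Hypothetical vnode positions and the server each one belongs to.
# Server A owns two vnodes, B and C own one each.
vnodes = {10: "A", 40: "B", 75: "A", 90: "C"}

def find_server(point: int) -> str:
    """Walk clockwise from point to the nearest vnode and return its server."""
    point = point % RING_SIZE
    candidates = [p for p in vnodes if p >= point]
    # If no vnode lies at or after the point, wrap around to the smallest one.
    nearest = min(candidates) if candidates else min(vnodes)
    return vnodes[nearest]

for point in (5, 40, 80, 95):
    print(point, "->", find_server(point))
# prints: 5 -> A, 40 -> B, 80 -> C, 95 -> A
```

The linear scan is only there to keep the semantics obvious; it is not how a lookup would normally be implemented on a large ring.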

Note that each server should be mapped to many virtual nodes (for example, 100 or more), and the virtual nodes should be spread evenly around the ring (if they cluster together, the load is balanced poorly). In the extreme case where each server maps to a single vnode, the entire load of a failed server falls on the next server on the ring, and the load-balancing effect is lost.
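The effect of the vnode count on failure handling can be checked with a small experiment. The sketch below assumes four servers with made-up IP addresses, MD5 for all hashing, and vnode names of the form "ip#i"; it removes one server and counts which surviving servers inherit its keys.

```python
import bisect
import hashlib
from collections import Counter

def md5_int(text):
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

def build_ring(servers, vnodes_per_server):
    """Return (sorted vnode positions, owning server for each position)."""
    pairs = sorted(
        (md5_int(f"{server}#{i}"), server)
        for server in servers
        for i in range(vnodes_per_server)
    )
    return [p for p, _ in pairs], [s for _, s in pairs]

def lookup(ring, key):
    """Return the server owning the first vnode clockwise from the key."""
    positions, owners = ring
    idx = bisect.bisect_left(positions, md5_int(key)) % len(positions)
    return owners[idx]

def takeover_counts(vnodes_per_server, failed="10.0.0.1"):
    """Count how the failed server's keys are spread over the survivors."""
    servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]  # hypothetical
    before = build_ring(servers, vnodes_per_server)
    after = build_ring([s for s in servers if s != failed], vnodes_per_server)
    keys = [f"key-{i}" for i in range(10000)]
    orphaned = [k for k in keys if lookup(before, k) == failed]
    return Counter(lookup(after, k) for k in orphaned)

print("1 vnode per server:   ", dict(takeover_counts(1)))
print("100 vnodes per server:", dict(takeover_counts(100)))
```

With a single vnode per server, all orphaned keys land on one successor; with 100 vnodes per server, they should be shared among several of the remaining servers.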

In practice, you can dynamically adjust the number of vnodes mapped to a server based on the server capacity and current load to better balance the load.
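One simple way to act on capacity is to make the vnode count proportional to it. The following is a sketch under assumptions: the capacity figures, the `vnodes_per_unit` parameter, and the server addresses are all hypothetical, and adjusting for current load (as opposed to static capacity) is not shown.

```python
import bisect
import hashlib
from collections import Counter

def md5_int(text):
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

def vnode_counts(capacities, vnodes_per_unit=100):
    """Give each server a vnode count proportional to its relative capacity."""
    return {s: max(1, round(c * vnodes_per_unit)) for s, c in capacities.items()}

def build_ring(counts):
    """Return (sorted vnode positions, owning server for each position)."""
    pairs = sorted(
        (md5_int(f"{s}-{i}"), s) for s, n in counts.items() for i in range(n)
    )
    return [p for p, _ in pairs], [s for _, s in pairs]

def lookup(ring, key):
    positions, owners = ring
    return owners[bisect.bisect_left(positions, md5_int(key)) % len(positions)]

# Hypothetical relative capacities: the third server is twice as large.
capacities = {"10.0.0.1": 1.0, "10.0.0.2": 1.0, "10.0.0.3": 2.0}
counts = vnode_counts(capacities)
ring = build_ring(counts)
load = Counter(lookup(ring, f"key-{i}") for i in range(10000))
for server in capacities:
    print(server, counts[server], "vnodes,", load[server], "keys")
```

The larger server should end up with roughly twice the keys of each smaller one, though the exact split depends on where the hashed vnodes happen to fall.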

Assume that the ring's space is the MD5 value range, that is, 0 to 2^128 - 1. An implementation has to consider two hash mappings.

1. The server -> vnode_list mapping. You can append several sequence numbers (or random numbers) to the server IP address and take the MD5 of each resulting string to obtain the vnode list.

2. The resource_key -> ring_key mapping. For example, if the resource key is a URL, you can use its MD5 directly.
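The two mappings might look like this in Python. The function names, the "ip-i" suffix format, and the example IP address are assumptions made for the sketch; only the use of MD5 and the 0 to 2^128 - 1 range come from the text.

```python
import hashlib

RING_MAX = 2**128 - 1  # largest value an MD5 digest can take

def md5_int(text: str) -> int:
    """Interpret the MD5 digest of text as an integer in 0..RING_MAX."""
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

def server_vnodes(server_ip: str, count: int) -> list:
    """Mapping 1: server -> vnode_list, by appending a sequence number to the IP."""
    return [md5_int(f"{server_ip}-{i}") for i in range(count)]

def ring_key(resource_key: str) -> int:
    """Mapping 2: resource_key -> ring_key, here simply the MD5 of the key."""
    return md5_int(resource_key)

vnode_list = server_vnodes("10.0.0.1", 3)  # hypothetical server address
print([hex(v) for v in vnode_list])
print(hex(ring_key("http://example.com/index.html")))
```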

We also need to maintain the partition relationship, that is, the segment of the ring that each vnode is responsible for. This can be stored in a binary tree or a sorted array; with an array, lookups can use binary search.
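With a sorted array, the partition lookup reduces to a binary search. The sketch below uses Python's `bisect` module on a toy ring; the vnode positions are invented, and each vnode is assumed to own the arc from just after the previous vnode up to and including itself.

```python
import bisect

# Sorted vnode positions (hypothetical values on a toy ring of size 100).
points = [10, 40, 75, 90]

def owning_vnode(ring_key: int) -> int:
    """Binary-search the sorted array for the first vnode at or after ring_key."""
    idx = bisect.bisect_left(points, ring_key)
    # idx == len(points) means we ran past the last vnode, so wrap to the first.
    return points[idx % len(points)]

def segment(vnode: int):
    """Return the (start, end] arc this vnode is responsible for."""
    idx = points.index(vnode)
    # For the first vnode, idx - 1 is -1, which selects the last point (wrap-around).
    return points[idx - 1], vnode

print(owning_vnode(41))   # 75
print(owning_vnode(95))   # 10, after wrapping around
print(segment(10))        # (90, 10): the arc that crosses the top of the ring
```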

The last piece is the ownership relationship between vnodes and servers (which server each vnode belongs to), which follows from the first of the two hash mappings above.
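Putting the pieces together, a ring can be kept as a sorted array of vnode positions plus a dictionary from vnode to server. This is a minimal sketch under stated assumptions, not a reference implementation: the class name `HashRing`, its method names, the default of 100 vnodes, and the example addresses are all hypothetical.

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring: sorted vnode array plus vnode -> server map."""

    def __init__(self, vnodes_per_server=100):
        self.vnodes_per_server = vnodes_per_server
        self.points = []  # sorted vnode positions (the partition)
        self.owner = {}   # vnode position -> server (the ownership relation)

    @staticmethod
    def _hash(text):
        return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

    def add_server(self, server):
        for i in range(self.vnodes_per_server):
            point = self._hash(f"{server}-{i}")
            if point not in self.owner:  # skip the (very unlikely) collision
                bisect.insort(self.points, point)
                self.owner[point] = server

    def remove_server(self, server):
        self.points = [p for p in self.points if self.owner[p] != server]
        self.owner = {p: s for p, s in self.owner.items() if s != server}

    def get_server(self, key):
        if not self.points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self.points, self._hash(key)) % len(self.points)
        return self.owner[self.points[idx]]

ring = HashRing()
for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    ring.add_server(ip)
keys = [f"http://example.com/page/{i}" for i in range(1000)]
before = {k: ring.get_server(k) for k in keys}
ring.remove_server("10.0.0.2")
after = {k: ring.get_server(k) for k in keys}
changed = [k for k in keys if before[k] != after[k]]
# Only keys that lived on the removed server should have moved.
print(all(before[k] == "10.0.0.2" for k in changed))  # True
```

The final check shows the property that distinguishes this scheme from modulo hashing: removing a server leaves every key on the other servers where it was.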

Consistent hashing is a standard tool in many Internet platform implementations, such as Amazon Dynamo (see Dynamo: Amazon's Highly Available Key-value Store).

For memcache, consistent hashing is done on the client side. You can refer to the Last.fm implementation (libketama): http://cn.last.fm/user/RJ/journal/2007/04/10/rz_libketama_-_a_consistent_hashing_algo_for_memcache_clients

For more information, see the Wiki entry http://en.wikipedia.org/wiki/Consistent_hashing

