C++ implementation of consistent hashing
Consistent hashing is a technique widely used in distributed computing. It appears in many distributed systems, including Amazon Dynamo, memcached, and Riak.
The principle behind consistent hashing is fairly simple. There are many good articles about it online, along with some related code, but we did not find the existing code satisfactory, so we implemented our own. The code is very simple and is available on GitHub.
consistent_hash_map
The consistent hashing logic is encapsulated in the class template consistent_hash_map:
- template <typename T,
- typename Hash,
- typename Alloc = std::allocator<std::pair<const typename Hash::result_type,T > > >
- class consistent_hash_map
consistent_hash_map follows the interface of the STL map. In fact, it uses std::map internally to manage and maintain all nodes.
consistent_hash_map provides only the basic consistent hashing functionality and does not directly support virtual nodes. Virtual nodes are easy to add, however, through custom T and Hash types. This keeps the design and implementation of consistent_hash_map very simple and leaves users a great deal of flexibility and room for customization.
The virtual node example later in this article shows how to do this.
Template parameters
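Judging from the class declaration and from the CRC32 hasher shown later in this article, T is the node type and Hash is a function object that maps a T to a hash value and exposes a nested result_type; Alloc presumably serves as the allocator of the internal map. As a minimal sketch of the Hash requirement, the hypothetical string_hasher below (not part of the library) hashes a plain host name with 32-bit FNV-1a. The choice of FNV-1a is an assumption made only to keep the example free of dependencies.

```cpp
#include <cstddef>
#include <stdint.h>
#include <string>

// Hypothetical hasher for nodes that are plain host names.
// It provides the two things the Hash parameter appears to need:
// a nested result_type and a call operator taking the node type.
struct string_hasher {
    typedef uint32_t result_type;

    result_type operator()(const std::string& node) const {
        // 32-bit FNV-1a: start from the offset basis, then for each byte
        // XOR it into the state and multiply by the FNV prime.
        uint32_t h = 2166136261u;
        for (std::size_t i = 0; i < node.size(); ++i) {
            h ^= static_cast<unsigned char>(node[i]);
            h *= 16777619u;
        }
        return h;
    }
};
```

With a hasher of this shape, a ring without virtual nodes could be declared as consistent_hash_map<std::string, string_hasher>.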
Member types
- size_type: Hash::result_type, the return type of the hash function
- value_type: std::pair<const size_type, T>; first is the hash value of the node and second is the node itself
- iterator: a bidirectional iterator to value_type
- reverse_iterator: reverse_iterator<iterator>, the reverse iterator
Member functions
- std::size_t size() const;
- Returns the number of nodes in consistent_hash_map.
-
- bool empty() const;
- Returns whether consistent_hash_map is empty.
-
- std::pair<iterator, bool> insert(const T& node);
- Inserts a node. If the bool in the returned pair is true, the iterator points to the inserted node. If it is false, the insertion failed,
- either because the node already exists or because its hash value collides with that of another node.
-
- void erase(iterator it);
- Deletes the node that the iterator points to.
-
- std::size_t erase(const T& node);
- Deletes the node that matches the given node value.
-
- iterator find(size_type hash);
- Takes a hash value and returns an iterator to the corresponding node in consistent_hash_map.
-
- iterator begin();
- iterator end();
- Return the corresponding iterators.
-
- reverse_iterator rbegin();
- reverse_iterator rend();
- Return the corresponding reverse iterators.
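To make the interface above more concrete, here is a minimal sketch of how such a class could sit on top of std::map. The names ring_sketch and identity_hasher are hypothetical, and this is not the library's actual code. In particular, the lookup rule in find (first node at or after the hash, wrapping around to the first node) is the usual consistent hashing rule and is assumed here rather than taken from the article.

```cpp
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical stand-in for the real class: a ring of nodes ordered by hash.
template <typename T, typename Hash>
class ring_sketch {
public:
    typedef typename Hash::result_type size_type;
    typedef std::map<size_type, T> map_type;
    typedef typename map_type::iterator iterator;

    std::size_t size() const { return nodes_.size(); }
    bool empty() const { return nodes_.empty(); }

    // std::map::insert refuses a key that is already present, so a repeated
    // node or a hash collision between two nodes shows up as "false".
    std::pair<iterator, bool> insert(const T& node) {
        return nodes_.insert(std::make_pair(hasher_(node), node));
    }

    // Erase by node value: hash the node and remove that ring position.
    std::size_t erase(const T& node) { return nodes_.erase(hasher_(node)); }

    // First node at or after `hash`, wrapping around to the first node.
    iterator find(size_type hash) {
        if (nodes_.empty()) return nodes_.end();
        iterator it = nodes_.lower_bound(hash);
        return it == nodes_.end() ? nodes_.begin() : it;
    }

    iterator begin() { return nodes_.begin(); }
    iterator end() { return nodes_.end(); }

private:
    Hash hasher_;
    map_type nodes_;
};

// Trivial hasher for the example: an unsigned node is its own hash.
struct identity_hasher {
    typedef unsigned result_type;
    result_type operator()(unsigned node) const { return node; }
};
```

With nodes 10 and 20 on a ring_sketch<unsigned, identity_hasher>, find(15) yields node 20 and find(25) wraps around to node 10. Keeping the nodes in an ordered map makes each lookup a single lower_bound call.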
Virtual node example
The complete code of the entire example is here.
In this example, we first define the virtual node type and its corresponding hasher.
- #include <stdint.h>
- #include <iostream>
- #include <string>
- #include <boost/format.hpp>
- #include <boost/crc.hpp>
-
- #include "consistent_hash_map.hpp"
-
- // List of all hosts
- const char* nodes[] = {
-     "192.168.1.100",
-     "192.168.1.101",
-     "192.168.1.102",
-     "192.168.1.103",
-     "192.168.1.104"
- };
-
- // Virtual node
- struct vnode_t {
-     vnode_t() : node_id(0), vnode_id(0) {}
-     vnode_t(std::size_t n, std::size_t v) : node_id(n), vnode_id(v) {}
-
-     // Label of the form "<host>-<vnode id>", used as input to the hasher
-     std::string to_str() const {
-         return boost::str(boost::format("%1%-%2%") % nodes[node_id] % vnode_id);
-     }
-
-     std::size_t node_id;   // host ID: index of the host in the host list
-     std::size_t vnode_id;  // virtual node ID
-
- };
-
- // Hasher that uses CRC32 as the hash algorithm. Note that result_type must be defined.
- struct crc32_hasher {
-     uint32_t operator()(const vnode_t& node) {
-         boost::crc_32_type ret;
-         std::string vnode = node.to_str();
-         ret.process_bytes(vnode.c_str(), vnode.size());
-         return ret.checksum();
-     }
-     typedef uint32_t result_type;
- };
Generate 100 virtual nodes for each host and add them to consistent_hash_map.
- typedef consistent_hash_map<vnode_t,crc32_hasher> consistent_hash_t;
- consistent_hash_t consistent_hash_;
-
- for (std::size_t i = 0; i < 5; ++i) {
-     for (std::size_t j = 0; j < 100; ++j) {
-         consistent_hash_.insert(vnode_t(i, j));
-     }
- }
Find the vnode and host corresponding to a hash value:
- consistent_hash_t::iterator it;
- it = consistent_hash_.find(290235110);
- // it->first is the hash value of the node, and it->second is the virtual node.
- std::cout << boost::format("node: %1%, vnode: %2%, hash: %3%")
-     % nodes[it->second.node_id] % it->second.vnode_id % it->first << std::endl;
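The lookup above starts from a raw hash value. In practice one usually starts from a key, hashes it, and then looks up the host. The following self-contained sketch illustrates that flow and shows why the ring is useful: when a host is removed, only the keys it owned move elsewhere. Everything in it is an assumption made for illustration: the names fnv1a, ring_t, build_ring, host_for, and remove_host are hypothetical, a plain std::map stands in for consistent_hash_map, FNV-1a replaces CRC32 to avoid the Boost dependency, and virtual nodes are labelled by host index rather than by IP address.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <stdint.h>
#include <string>

// 32-bit FNV-1a, used here in place of the article's CRC32 (an assumption).
inline uint32_t fnv1a(const std::string& s) {
    uint32_t h = 2166136261u;
    for (std::size_t i = 0; i < s.size(); ++i) {
        h ^= static_cast<unsigned char>(s[i]);
        h *= 16777619u;
    }
    return h;
}

// Ring position -> host index. Hypothetical stand-in for the real map.
typedef std::map<uint32_t, std::size_t> ring_t;

// Place `vnodes` virtual nodes per host, labelled "<host index>-<vnode id>".
// If two labels hash to the same position, the first one inserted wins.
inline ring_t build_ring(std::size_t hosts, std::size_t vnodes) {
    ring_t ring;
    for (std::size_t i = 0; i < hosts; ++i) {
        for (std::size_t j = 0; j < vnodes; ++j) {
            std::ostringstream label;
            label << i << '-' << j;
            ring.insert(ring_t::value_type(fnv1a(label.str()), i));
        }
    }
    return ring;
}

// Hash the key, then take the first virtual node at or after that hash,
// wrapping around to the start of the ring. The ring must not be empty.
inline std::size_t host_for(const ring_t& ring, const std::string& key) {
    ring_t::const_iterator it = ring.lower_bound(fnv1a(key));
    if (it == ring.end()) it = ring.begin();
    return it->second;
}

// Drop every virtual node that belongs to `host`.
inline void remove_host(ring_t& ring, std::size_t host) {
    for (ring_t::iterator it = ring.begin(); it != ring.end();) {
        if (it->second == host) ring.erase(it++);
        else ++it;
    }
}
```

After build_ring(5, 100), host_for(ring, key) returns the index of the host that owns a key. After remove_host(ring, 2), any key that host 2 did not own still maps to the same host as before, because the virtual node that served it is still the first one at or after its hash.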
Traverse all vnodes in consistent_hash_ and count the number of keys that fall to each virtual node, as well as the total number of keys owned by each host:
- std::size_t sums[] = {0, 0, 0, 0, 0};  // one counter per host
- consistent_hash_t::iterator i = consistent_hash_.begin();  // the first node
- consistent_hash_t::reverse_iterator j = consistent_hash_.rbegin();  // the last node
- // The first node owns the range that wraps around from the last node.
- std::size_t n = i->first + UINT32_MAX - j->first;  // number of keys contained in the first node
- std::cout << boost::format("vnode: %1%, hash: %2%, contains: %3%")
-     % i->second.to_str() % i->first % n << std::endl;
- sums[i->second.node_id] += n;  // update the number of keys contained in the host
-
- // Calculate the number of keys contained in each remaining node and update the per-host totals.
- uint32_t priv = i->first;
- uint32_t cur;
- consistent_hash_t::iterator end = consistent_hash_.end();
- while (++i != end) {
-     cur = i->first;
-     n = cur - priv;  // keys between the previous node and this one
-     std::cout << boost::format("vnode: %1%, hash: %2%, contains: %3%")
-         % i->second.to_str() % cur % n << std::endl;
-     sums[i->second.node_id] += n;
-     priv = cur;
- }
-
- for (std::size_t i = 0; i < 5; ++i) {
-     std::cout << boost::format("node: %1% contains: %2%") % nodes[i] % sums[i] << std::endl;