The consistent hashing algorithm and its implementation (Consistent Hashing)

Tags: hash, md5, memcached, redis, redis cluster

One, the principle of the consistent hashing algorithm

1, the background of the consistent hashing algorithm
Technology and business reinforce and advance each other, and the consistent hashing algorithm likewise grew out of business requirements. As a business grows, a single machine can no longer meet its needs, and distributed architectures emerge. In a distributed environment, multiple machines must cooperate, and guaranteeing the consistency of data in that environment becomes an urgent problem. The consistent hashing algorithm exists so that, when machines are dynamically added and removed, the mapping of data to machines stays as consistent as possible.
Consistent hashing is a distributed hashing algorithm originally designed to solve hot-spot problems on the Internet. Its original design was very similar to CARP, the Cache Array Routing Protocol, one of whose goals is to improve the availability of services: in a multi-server environment, failover improves the availability of the system. Consistent hashing corrects the problems of the simple hashing algorithm used by CARP, so that distributed hash tables (DHTs) can genuinely be applied in peer-to-peer environments.

2, design criteria for a consistent hashing algorithm
A consistent hashing algorithm generally needs to satisfy the following criteria:
2.1, Balance: balance does not mean a strictly even split; it is better understood as a weighted average. Tasks are allocated according to each server's capacity, so that every machine's resources are fully used.
2.2, Monotonicity: monotonicity is the hardest criterion to grasp, and most online explanations just quote the definition without making the concept clear. It is easier to understand it from the problem it solves. In a dynamically changing distributed environment, adding and removing server nodes are the most common operations. With a simple hashing algorithm, such as mapping each key to a node by hash % N (N being the number of nodes), any change in N invalidates almost every existing mapping. Monotonicity demands that when nodes are added, existing data moves only from old nodes to new nodes, never reshuffled among the old ones; it exists to solve exactly this problem (see the sketch after this list).
2.3, Dispersion (Spread): in a distributed environment, node A may not see all of the other N-1 nodes, only a subset of them. When node A maps data onto the nodes it can see, different nodes may hold different views of the cluster, producing inconsistent hash results, so the same data ends up mapped to different nodes by different viewers. This should clearly be avoided, because storing the same data on multiple nodes reduces the storage efficiency of the system. The same data can be mapped to different nodes precisely because clients hold different partial views of the ring; a good algorithm keeps this spread low.
2.4, Load: load is the same problem seen from the server's side: across all the different client views, a given node should be responsible for roughly the same data, so that no node carries disproportionate load simply because clients disagree about the ring. A good algorithm keeps each node's load low.
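
To make the monotonicity problem concrete, here is a minimal sketch (an illustration written for this point; k stands in for an already-hashed key): it maps 100,000 keys onto 4 nodes with hash % N, then onto 5 nodes, and counts how many keys move. With modulo placement roughly 80% of the keys are relocated; a consistent hash ring would move only about 1/5 of them.

public class ModuloRehashDemo {
    public static void main(String[] args) {
        int keys = 100000;
        int moved = 0;
        for (int k = 0; k < keys; k++) {
            int before = k % 4; // old cluster: 4 nodes
            int after = k % 5;  // new cluster: 5 nodes
            if (before != after) moved++;
        }
        // With hash % N, adding a single node remaps roughly 4/5 of all keys
        System.out.printf("moved %d of %d keys (%.1f%%)%n",
                moved, keys, 100.0 * moved / keys);
    }
}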


Two, the innovations of the consistent hashing algorithm

The implementation of the consistent hashing algorithm is, in effect, the process of solving the problems above. Rather than repeating the dry explanations found everywhere online, this section explains the algorithm by contrast with ordinary hashing. Consistent hashing is implemented differently in different system environments, but the overall process is the same.

1, static mapping --> dynamic mapping
With a common hashing algorithm such as the hash % N scheme mentioned above, data and nodes are statically bound: once the hash is computed, the relationship between a piece of data and its node is fixed, and any change in the number of nodes invalidates every mapping. How does consistent hashing solve this?
The consistent hashing algorithm introduces the concept of a ring, and its most important innovation is splitting the placement of nodes and the placement of data into two independent processes. The association between data and nodes is no longer established directly by the hash function, so data and nodes stay relatively independent: a change to node A does not affect the whole system, because there is no need to rehash all the data.
The advance of consistent hashing is that the association between data and nodes changes from "static" to "dynamic".

2, locating the node clockwise
How does the consistent hashing algorithm associate data with nodes? After both the nodes and the data have been hashed onto the ring, each piece of data searches clockwise and associates itself with the first node it finds: that node becomes its storage location. In this way data and nodes are cleanly correlated.
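
As a minimal sketch of this clockwise lookup (the class name and the placeholder hash are assumptions; the full implementation follows in section Four), the ring maps naturally onto a Java TreeMap: the first node hash >= the data hash is the clockwise successor, wrapping to the smallest hash at the top of the ring.

import java.util.TreeMap;

public class RingLookup {
    private final TreeMap<Long, String> ring = new TreeMap<Long, String>();

    public void addNode(String node) {
        ring.put(hash(node), node);
    }

    // First node clockwise from the key's position; wrap around to the
    // smallest hash when the key lands past the largest node hash
    public String nodeFor(String key) {
        Long h = ring.ceilingKey(hash(key));
        return ring.get(h != null ? h : ring.firstKey());
    }

    // Placeholder 32-bit hash; a real ring would use MD5, MurmurHash, etc.
    private long hash(String s) {
        return s.hashCode() & 0xFFFFFFFFL;
    }
}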

Three, problems faced by the consistent hashing algorithm

The consistent hashing algorithm solves problems that ordinary hashing cannot, but it also has some defects. If node A dies, the data mapped to node A is affected: keys previously mapped to A are found, by the clockwise search, on the node that follows A. Likewise, adding a node affects a subset of the data.
Another drawback is that when the cluster has few nodes, the data skews: some nodes receive far more data than others. The skew problem can be solved with virtual nodes, by adding a second mapping between virtual nodes and actual nodes.
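
A hedged sketch of the effect (node names, counts and the placeholder hash are made up): place each of three nodes on the ring once, then 160 times via virtual nodes, and count where 100,000 keys land; the second distribution is visibly more even.

import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class VirtualNodeDemo {
    public static void main(String[] args) {
        for (int replicas : new int[] { 1, 160 }) {
            TreeMap<Long, String> ring = new TreeMap<Long, String>();
            for (String node : new String[] { "nodeA", "nodeB", "nodeC" }) {
                for (int i = 0; i < replicas; i++) {
                    ring.put(hash(node + "#" + i), node); // one entry per virtual node
                }
            }
            Map<String, Integer> counts = new HashMap<String, Integer>();
            for (int k = 0; k < 100000; k++) {
                Long h = ring.ceilingKey(hash("key" + k));
                String node = ring.get(h != null ? h : ring.firstKey());
                counts.merge(node, 1, Integer::sum);
            }
            System.out.println(replicas + " point(s) per node: " + counts);
        }
    }

    // Placeholder hash (FNV-1a, truncated to 32 bits); substitute MD5, MurmurHash, etc.
    private static long hash(String s) {
        long h = 0xcbf29ce484222325L;
        for (int i = 0; i < s.length(); i++) {
            h = (h ^ s.charAt(i)) * 0x100000001b3L;
        }
        return h & 0xFFFFFFFFL;
    }
}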


In short, compared with the ordinary hashing algorithm, the consistent hashing algorithm provides a degree of fault tolerance and scalability in the face of dynamically added and removed nodes.



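A ring hash needs speed and dispersion rather than cryptographic strength. One suitable choice is MurmurHash; the method below is a Java port of 64-bit MurmurHash2: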
/**
     * MurmurHash is a high-performance, non-cryptographic hash algorithm.
     * It is much faster than cryptographic hashes such as MD5 and SHA-1
     * (whose complexity inevitably costs performance) and faster than CRC32,
     * and it is reported to have a very low collision rate.
     * http://murmurhash.googlepages.com/
     */
    private long hash(String key) {
          
        ByteBuffer buf = ByteBuffer.wrap(key.getBytes());
        int seed = 0x1234ABCD;
          
        ByteOrder byteOrder = buf.order();
        buf.order(ByteOrder.LITTLE_ENDIAN);
  
        long m = 0xc6a4a7935bd1e995L;
        int r = 47;
  
        long h = seed ^ (buf.remaining() * m);
  
        long k;
        while (buf.remaining() >= 8) {
            k = buf.getLong();
  
            k *= m;
            k ^= k >>> r;
            k *= m;
  
            h ^= k;
            h *= m;
        }
  
        if (buf.remaining() > 0) {
            ByteBuffer finish = ByteBuffer.allocate(8).order(
                    ByteOrder.LITTLE_ENDIAN);
            // for big-endian version, do this first:
            // finish.position(8-buf.remaining());
            finish.put(buf).rewind();
            h ^= finish.getLong();
            h *= m;
        }
  
        h ^= h >>> r;
        h *= m;
        h ^= h >>> r;
  
        buf.order(byteOrder);
        return h;
    }

Four, a Java implementation of the consistent hashing algorithm

package redis.cn;
import java.nio.charset.Charset;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import com.google.common.hash.HashFunction;
import com.google.common.hash.Hashing;
public class ConsistentHash {
// ------------------ java implementation of consistent hashing algorithm ------------------
    private SortedMap<Long,String> ketamaNodes = new TreeMap<Long,String>();
    private int numberOfReplicas = 1024;
    // Google's jar package is used here - guava-18.0.jar
    private HashFunction hashFunction = Hashing.md5();
    private List<String> nodes;
    private volatile boolean init = false; // whether initialization has completed
    // parameterized constructor
    public ConsistentHash(int numberOfReplicas,List<String> nodes){
        this.numberOfReplicas = numberOfReplicas;
        this.nodes = nodes;
        init();
    }
    // According to the hash value of the key, find the nearest node (server) clockwise
    public String getNodeByKey(String key){
        if(!init){
            throw new RuntimeException("init incomplete...");
        }
        // Note that here is the NIO package java.nio.charset.Charset
        byte[] digest = hashFunction.hashString(key, Charset.forName("UTF-8")).asBytes();
        long hash = hash(digest,0);
        // If the hash hits a virtual node exactly, use that node directly
        if(!ketamaNodes.containsKey(hash)){
            // Otherwise take the sub-map of keys greater than this hash; its first key is the nearest virtual node clockwise
            SortedMap<Long,String> tailMap = ketamaNodes.tailMap(hash);
            if(tailMap.isEmpty()){
                hash = ketamaNodes.firstKey();
            }else{
                hash = tailMap.firstKey();
            }

        }
        return ketamaNodes.get(hash);
    }
    // Add a node and rebuild the ring
    public synchronized void addNode(String node){
        init = false;
        nodes.add(node);
        init();
    }

    private void init(){
        //For all nodes, generate numberOfReplicas virtual nodes
        for(String node:nodes){
            // Each iteration produces one group of four virtual nodes
            for(int i=0;i<numberOfReplicas/4;i++){
                //Get a unique name for this group of virtual nodes
                byte[] digest = hashFunction.hashString(node+i, Charset.forName("UTF-8")).asBytes();
                // The MD5 digest is 16 bytes; each 4-byte slice yields one virtual node, which is why virtual nodes are generated in groups of four
                for(int h=0;h<4;h++){
                    Long k = hash(digest,h);
                    ketamaNodes.put(k,node);
                }
            }
        }
        init = true;
    }

    public void printNodes(){
        for(Long key:ketamaNodes.keySet()){
            System.out.println(ketamaNodes.get(key));
        }
    }
    // Ketama-style hash: takes the 4 bytes of the digest at offset nTime*4 as an unsigned 32-bit value
    public static long hash(byte[] digest, int nTime)
    {
        long rv = ((long)(digest[3 + nTime * 4] & 0xFF) << 24)
                | ((long)(digest[2 + nTime * 4] & 0xFF) << 16)
                | ((long)(digest[1 + nTime * 4] & 0xFF) << 8)
                | ((long)digest[0 + nTime * 4] & 0xFF);
        return rv;
    }
}
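
A short usage sketch of the class above (the node addresses are hypothetical):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ConsistentHashDemo {
    public static void main(String[] args) {
        // Hypothetical node addresses; 1024 virtual nodes per node, per the class default
        List<String> nodes = new ArrayList<String>(Arrays.asList(
                "192.168.0.1:6379", "192.168.0.2:6379", "192.168.0.3:6379"));
        ConsistentHash ring = new ConsistentHash(1024, nodes);

        System.out.println(ring.getNodeByKey("user:1001"));
        ring.addNode("192.168.0.4:6379"); // only about 1/4 of the keys should move
        System.out.println(ring.getNodeByKey("user:1001"));
    }
}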

Five, applying the consistent hashing algorithm to Redis

Redis itself does not natively support clustering, so client APIs or other third-party products are needed to deploy a cluster; of course, you can also implement a Redis cluster with a consistent hashing algorithm. Memcached should be familiar to everyone: keys are mapped to memcached servers for fast reads. Its nodes can be added dynamically without disturbing the existing mapping between keys and servers, because a consistent hashing algorithm is used. Memcached's hashing policy is implemented on the client side, so different clients differ; Spymemcached and Xmemcached, for example, both use Ketama as their implementation.
To build a distributed Redis cluster, two ideas are worth considering:
* use Jedis;
* implement a consistent hashing algorithm yourself.

1, Jedis
Jedis is a Redis client API. There is no sharding on the redis-server side, but we can use Jedis to distribute the data: Jedis applies an idea called sharding.
What is sharding? Simply put, it is database "sharding": the core idea is to scatter data across the different physical machines of a cluster according to some strategy, achieving distributed storage of big data and embodying the concept of a cluster. For example, 100 million records can be hash-distributed onto 5 physical machines according to each record's hashcode.
Jedis's sharding implementation is also based on a consistent hashing algorithm. Let's look at the key source code of the sharding implementation.
1.1, the hashcode value: source code from redis.clients.util.Hashing. Jedis ships an MD5-based implementation, MD5 being the familiar Message Digest Algorithm 5 (note that Sharded actually defaults to Hashing.MURMUR_HASH, with MD5 available as an alternative).

    // A small performance optimization: one MessageDigest instance per thread
    public ThreadLocal<MessageDigest> md5Holder = new ThreadLocal<MessageDigest>();

    public static final Hashing MD5 = new Hashing() {
        public long hash(String key) {
            return hash(SafeEncoder.encode(key));
        }

        // The hash algorithm used by sharding is MD5
        public long hash(byte[] key) {
            try {
                if (md5Holder.get() == null) {
                    md5Holder.set(MessageDigest.getInstance("MD5"));
                }
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException("++++ no md5 algorythm found");
            }
            MessageDigest md5 = md5Holder.get();
            md5.reset();
            md5.update(key);
            // Get the MD5 byte sequence
            byte[] bKey = md5.digest();
            // The first four bytes are combined into a 32-bit value.
            // This keeps the hash of the key suitably "random"/"discrete";
            // an overly dense hash works against consistent hashing, especially with a "virtual node" design.
            long res = ((long) (bKey[3] & 0xFF) << 24)
                    | ((long) (bKey[2] & 0xFF) << 16)
                    | ((long) (bKey[1] & 0xFF) << 8)
                    | (long) (bKey[0] & 0xFF);
            return res;
        }
    };

1.2, the node build process (redis.clients.util.Sharded):

    // The shards list gives the client the configuration of every redis-server: ip, port, weight, name.
    // weight directly determines the "proportion" (density) of a node's virtual nodes: the higher the
    // weight, the more likely the node is to be hit by the hash when storing --- the more data it holds.
    // name is the "node name"; jedis uses it as an input when computing the "virtual node" hash values.
    // ---
    // Consistent hashing requires every "virtual node" to have a hash value, and each actual server
    // may have multiple "virtual nodes" (at the API level, number of virtual nodes = "logical interval step" * weight).
    // The "virtual nodes" of each server are scattered by hashing across the global range, whose total
    // length is 2^32; each "virtual node" is mapped into the range by its hash value.
    // Ring: 0-->vnode1(:1230)-->vnode2(:2800)-->vnode3(:400000)---2^32-->0
    // All "virtual nodes" are ordered by their "node hash", so between two adjacent "virtual nodes"
    // there is a gap of hash values; that gap is the range of data hashes carried by the following
    // (or preceding, depending on the implementation) "virtual node". In jedis the data goes to the
    // next virtual node clockwise: data whose hash value is "2000", for example, is accepted by vnode2.
    private void initialize(List<S> shards){
        // Virtual nodes are stored in a TreeMap: sorted, a red-black (binary search) tree
        nodes = new TreeMap<Long, S>();
        for (int i = 0; i != shards.size(); ++i) {
            final S shardInfo = shards.get(i);
            if (shardInfo.getName() == null) {
                // When "name" is not set, "SHARD-<i>-NODE-<n>" is the input for the "virtual node" hash.
                // The "logical interval step" is 160 --- why? See the note after this listing.
                // The "virtual nodes" of the different servers end up interleaved, though not perfectly uniformly.
                for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                    nodes.put(this.algo.hash("SHARD-" + i + "-NODE-" + n), shardInfo);
                }
            } else {
                for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                    nodes.put(this.algo.hash(shardInfo.getName() + "*" + shardInfo.getWeight() + n), shardInfo);
                }
            }
            resources.put(shardInfo, shardInfo.createResource());
        }
    }
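
As for the step of 160: it appears to mirror the ketama algorithm popularized by memcached clients, which places 160 points per server on the ring (40 hashes, each yielding 4 points); jedis simply scales that count by the shard's weight.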

1.3, the node selection method:

    public R getShard(String key) {
        return resources.get(getShardInfo(key));
    }

    public S getShardInfo(byte[] key) {
        // Get the sub-map of "virtual nodes" whose hash is >= the key's hash
        SortedMap<Long, S> tail = nodes.tailMap(algo.hash(key));
        // If no such virtual node exists, wrap around to the first node on the ring
        if (tail.isEmpty()) {
            return nodes.get(nodes.firstKey());
        }
        // Otherwise return the first "virtual node" that satisfies the (>= key) condition
        return tail.get(tail.firstKey());
    }

In jedis sharding mode, if a server fails, the client does not remove that shard, so accessing it throws an exception. This preserves a consistent view of the data across all clients. You might prefer a dynamic consistent-hash topology (that is, when a shard fails, the sharding structure is rebuilt and the failed shard's data is hashed onto the other shards), but unfortunately ShardedJedis cannot support this: adding it would require large code changes plus an extra topology auto-discovery mechanism (see the Redis Cluster architecture, which now provides the final solution to this problem). For persistent storage, however, we can use this "strong hash" sharding, in which case we need to override its hash algorithm. Under strong hashing, if the physical server hosting a virtual node fails, that data becomes inaccessible (cannot be read or stored); that is, the failed server is not removed from the list of virtual nodes.
If you override the consistent hashing algorithm for jedis, you need to consider the following:
1) whether the virtual node hashes are reasonably uniform;
2) whether the hash values of the data are evenly distributed;
3) whether the virtual nodes are spread evenly across the "global" range.
If these are designed badly, the data will very likely be distributed unevenly across the servers, and sharding will lose its meaning (the sketch after this list shows one way to check).
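
A minimal, hedged sketch of such a check (the shard count and the FNV-1a placeholder hash are assumptions; substitute the hash algorithm actually under test): bucket a large number of generated keys by shard and compare the counts.

import java.util.TreeMap;

public class UniformityCheck {
    public static void main(String[] args) {
        int shardCount = 5;       // placeholder shard count
        int pointsPerShard = 160; // the same step jedis uses
        TreeMap<Long, Integer> ring = new TreeMap<Long, Integer>();
        int[] counts = new int[shardCount];

        for (int s = 0; s < shardCount; s++) {
            for (int n = 0; n < pointsPerShard; n++) {
                ring.put(hash("SHARD-" + s + "-NODE-" + n), s);
            }
        }
        for (int k = 0; k < 1000000; k++) {
            Long h = ring.ceilingKey(hash("key:" + k));
            counts[ring.get(h != null ? h : ring.firstKey())]++;
        }
        for (int s = 0; s < shardCount; s++) {
            System.out.println("shard " + s + ": " + counts[s]);
        }
    }

    // Placeholder 64-bit FNV-1a hash; swap in the hash being evaluated
    private static long hash(String s) {
        long h = 0xcbf29ce484222325L;
        for (int i = 0; i < s.length(); i++) {
            h = (h ^ s.charAt(i)) * 0x100000001b3L;
        }
        return h;
    }
}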

2, a demo of using Jedis in Java

package redis.cn;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
import redis.clients.jedis.JedisShardInfo;
import redis.clients.jedis.ShardedJedis;
import redis.clients.jedis.ShardedJedisPool;
/**
 * @author yangcq
 * @category jedis is also an implementation of the consistent hash algorithm. To build a redis distributed cluster, you can use jedis.
 */
public class ShardedRedis {

    // In addition to the toolkit that comes with the JDK, you need to import the following 2 jar packages:
    // commons-pool2-2.0.jar
    // jedis-2.4.2.jar

    public static void main(String[] args){
        // jedis configuration parameters
        GenericObjectPoolConfig genericObjectPoolConfig = new GenericObjectPoolConfig();
        genericObjectPoolConfig.setMaxTotal(1000);
        genericObjectPoolConfig.setMaxIdle(500);

        List<JedisShardInfo> jedisShardInfoList = new ArrayList<JedisShardInfo>();
        JedisShardInfo jedisShardInfo1 = new JedisShardInfo("127.0.0.1",1234);
        JedisShardInfo jedisShardInfo2 = new JedisShardInfo("127.0.0.1",1235);
        JedisShardInfo jedisShardInfo3 = new JedisShardInfo("127.0.0.1",1236);
        jedisShardInfoList.add(jedisShardInfo1);
        jedisShardInfoList.add(jedisShardInfo2);
        jedisShardInfoList.add(jedisShardInfo3);

        ShardedJedisPool shardedJedisPool = new ShardedJedisPool(genericObjectPoolConfig,jedisShardInfoList);

        set("key1","value1",shardedJedisPool);
        set("key2","value2",shardedJedisPool);
        set("key3","value3",shardedJedisPool);
        set("key4","value4",shardedJedisPool);
        set("key5","value5",shardedJedisPool);
        
        // jedis hides the details of implementing a consistent hash algorithm, but only provides us with a simple interface call to implement the construction of a redis distributed cluster
        // So how does jedis implement a consistent hash algorithm?
    }

    public static void set(String key, String value, ShardedJedisPool pool){
        // Get a redis instance from the shared resource pool
        ShardedJedis shardedJedis = pool.getResource();
        // Assignment
        shardedJedis.set(key,value);
        pool.returnResource(shardedJedis);
    }
}


Reference source:

Jedis writes data to the Redis cluster through ShardedJedis; the key methods in Sharded and ShardedJedis are:

public Sharded(List<S> shards, Hashing algo, Pattern tagPattern) {
    this.algo = algo;
    this.tagPattern = tagPattern;
    initialize(shards);
}

//Initialize the hash ring
private void initialize(List<S> shards) {
    nodes = new TreeMap<Long, S>();

    for (int i = 0; i != shards.size(); ++i) {
        final S shardInfo = shards.get(i);
        if (shardInfo.getName() == null) {
            for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                nodes.put(this.algo.hash("SHARD-" + i + "-NODE-" + n),
                    shardInfo);
            }
        } else {
            for (int n = 0; n < 160 * shardInfo.getWeight(); n++) {
                nodes.put(
                    this.algo.hash(shardInfo.getName() + "*"
                        + shardInfo.getWeight() + n), shardInfo);
            }
        }
        resources.put(shardInfo, shardInfo.createResource());
    }
}

//Store key and value on the corresponding shard
public String set(String key, String value) {
    Jedis j = getShard(key);
    return j.set(key, value);
 }

public R getShard(String key) {
    return resources.get(getShardInfo(key));
}

//Acquire shard according to key
public S getShardInfo(byte[] key) {
    SortedMap<Long, S> tail = nodes.tailMap(algo.hash(key));
    if (tail.isEmpty()) {
        return nodes.get(nodes.firstKey());
    }
    return tail.get(tail.firstKey());
}

