Toward an Idealized Redis Cluster


The key to staying open-minded is to be honest and optimistic about systems that fail. You should not worry too much about any single component; you should think about what each one is capable of. That is why architecture design matters so much: many excellent systems cannot grow much further on their own, and what we should do is share the work across several systems.

Redis is one of the systems that attracts me: a high-performance, in-memory key-value store with persistence. It is an excellent key-value database, and I use it all the time. Even though AWS recently announced its ElastiCache caching service, a master-less Redis cluster still has an important role to play. No single system can do all of this work on its own, so can we combine several components so that they work together as a fault-tolerant, master-less cluster? In this article I will describe such an idealized Redis cluster.

Consistent hashing

The key to building a data-storage cluster is an effective mechanism for storing and replicating data. I want to build the cluster in such a way that Redis nodes can be added or removed at will, while the data that is already stored remains available and never disappears. The technique that makes this possible is called consistent hashing.

Since it is not a completely obvious concept, let me take a moment to explain it. To understand consistent hashing, imagine a function f(x) that, for a given x, always returns a value between 1 and 60 (why 60? You will see, but please be patient). Also, for a given x, f(x) always returns the same result. The values from 1 to 60 are arranged clockwise on a ring.


Each node in the cluster needs a unique name, right? So if you pass that name to f('<redis_node_name>'), the function returns a number between 1 and 60 (inclusive), and that number is the node's position on the ring. It is, of course, only the node's logical (recorded) position. You simply take a node, pass its name to the hash function, get a result, and place it on the ring. Easy, isn't it? This way every node has its own position on the ring. Suppose there are five Redis nodes named 'A', 'B', 'C', 'D', and 'E'. Each name is passed to the hash function f(x) and the node is placed on the ring; here f('A') = 21, f('B') = 49, f('C') = 11, f('D') = 40, f('E') = 57. Remember that these positions are logical positions.

So why do we place the nodes on a ring? The point of placing a node on the ring is to determine which part of the hash space it owns. The hash space owned by node 'D' in the figure is the part of the ring between f('A') = 21 (exclusive) and f('D') = 40 (inclusive), that is, (21, 40]. In other words, node 'D' owns a key x if f(x) falls in the range (21, 40]. For example, if the key is 'apple' and f('apple') = 35, then the key 'apple' lives on node 'D'. In the same way, every key stored in the cluster is hashed and stored on the nearest node clockwise on the ring.
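To make the clockwise lookup concrete, here is a minimal sketch in Python. The 60-slot ring and the md5-based hash are illustrative assumptions for this article's example, not the hash function Twemproxy actually uses:

    import bisect
    import hashlib

    RING_SIZE = 60

    def f(x):
        # Illustrative stand-in for the article's hash function:
        # deterministically maps any string to a slot between 1 and 60.
        digest = hashlib.md5(x.encode()).hexdigest()
        return int(digest, 16) % RING_SIZE + 1

    class HashRing:
        # Nodes sit at f(name); a key belongs to the nearest node clockwise
        # from f(key), wrapping around at the top of the ring.
        def __init__(self, nodes):
            self.nodes = list(nodes)
            self._rebuild()

        def _rebuild(self):
            self.positions = sorted((f(n), n) for n in self.nodes)

        def remove_node(self, name):
            self.nodes.remove(name)
            self._rebuild()

        def owner(self, key):
            slots = [pos for pos, _ in self.positions]
            i = bisect.bisect_left(slots, f(key)) % len(self.positions)
            return self.positions[i][1]

    ring = HashRing(['A', 'B', 'C', 'D', 'E'])
    print(ring.owner('apple'))   # the node whose slot is nearest clockwise from f('apple')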

That covers consistent hashing, but we should keep in mind that systems like this are usually built for high availability. To make the data highly available, it is replicated according to a setting called the replication factor. Suppose the replication factor of our cluster is 2: then the data belonging to node 'D' is copied to the two nodes closest to it clockwise, 'B' and 'E'. This guarantees that if the data cannot be obtained from 'D', it can still be obtained from 'B' or 'E'.
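Continuing the same sketch, replica placement under a replication factor of 2 just means walking further around the ring. This is an illustration of the idea, not something Twemproxy or Redis provides out of the box:

    def replicas_for(ring, key, replication_factor=2):
        # The key's owner plus the next `replication_factor` nodes clockwise
        # hold copies of the data.
        slots = [pos for pos, _ in ring.positions]
        start = bisect.bisect_left(slots, f(key)) % len(ring.positions)
        names = [ring.positions[(start + step) % len(ring.positions)][1]
                 for step in range(replication_factor + 1)]
        return names[0], names[1:]          # (owner, replica nodes)

    owner, copies = replicas_for(ring, 'apple')
    print(owner, copies)                    # owner node and its two clockwise neighbours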

Not only are keys stored using consistent hashing; failed nodes are also easy to cover, and the replication factor stays effective. For example, if node 'D' fails, node 'B' takes ownership of 'D''s hash space, and that hash space can in turn easily be replicated on to node 'C'.
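In the same sketch, covering a failed node is simply a matter of removing it from the ring; every key that hashed into its range now resolves to the next node clockwise and can be re-replicated from there:

    ring.remove_node('D')          # 'D' has failed and is dropped from the ring
    print(ring.owner('apple'))     # keys formerly owned by 'D' now resolve to its clockwise neighbour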

Bad and good

The bad news is that none of the concepts discussed so far, such as replication (redundancy), failure handling, and cluster sizing, exist out of the box around Redis. Consistent hashing only describes the mapping of nodes onto the hash ring and the ownership of the hashed data. Even so, it is an excellent starting point for building an elastic, scalable system.

The good news is that there are other discrete tools that implement consistent hashing over a set of Redis nodes and that can tell us when nodes fail or new nodes are added. No single tool offers all of this, but we will see how several systems can be combined to run an idealized Redis cluster.

Twemproxy aka Nutcracker

Twemproxy is an open-source tool that acts as a fast and lightweight proxy for the memcached and Redis protocols. In essence, if you have a few Redis servers running and want to build a cluster out of them, you only need to deploy Twemproxy in front of them and let all Redis traffic pass through it.

Besides proxying Redis traffic, Twemproxy can apply consistent hashing when it stores data on the Redis servers, which ensures that the data is distributed across the Redis nodes according to the consistent hash.
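For illustration, a Twemproxy pool that spreads keys across several Redis servers with consistent (ketama) hashing is configured in nutcracker.yml roughly like this; the addresses, node names, and timeout values are placeholders, so check the Twemproxy documentation for your version:

    redis_cluster:
      listen: 127.0.0.1:22121
      hash: fnv1a_64
      distribution: ketama        # consistent hashing across the servers below
      redis: true
      auto_eject_hosts: true
      server_retry_timeout: 30000
      server_failure_limit: 3
      servers:
        - 10.0.0.1:6379:1 node_a
        - 10.0.0.2:6379:1 node_b
        - 10.0.0.3:6379:1 node_c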

However, Twemproxy does not make the Redis cluster highly available. The simplest way to get high availability is to create a slave (redundant) server for each node in the cluster and promote the slave to master when the master fails. Configuring a slave for a Redis server is very easy.
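For example, pointing a redundant server at its master takes a single command; here is a minimal sketch with redis-py, where the addresses are placeholders:

    import redis

    # Make the redundant instance replicate from its master node.
    # (Newer Redis versions call this REPLICAOF; redis-py exposes it as slaveof().)
    replica = redis.Redis(host='10.0.0.4', port=6379)
    replica.slaveof('10.0.0.1', 6379)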

The downside of this model is obvious: it requires running two servers for every node of the Redis cluster. Node failures, however, are just as obvious and more dangerous, so how do we find out about them and deal with them?

Gossip on serf

Gossip is a standard mechanism through which the nodes of a cluster keep an up-to-date view of its membership. Each node in the cluster thereby learns about membership changes, such as nodes being added or removed.

Serf provides exactly this help by implementing the gossip mechanism. Serf is agent-based: an agent runs continuously on each node and exchanges membership information using the gossip protocol. In addition, it can fire custom events.

Take our cluster of nodes as an example. If each node also runs a Serf agent, details can be exchanged between the nodes, so every node in the cluster is always aware of the existence and the state of every other node.

 

This alone is not enough, though. For high availability we also need to let Twemproxy know when a node has gone down, so that it can modify its configuration accordingly. As mentioned above, Serf can do this through custom handlers triggered by gossip events. So whenever a Redis node in the cluster goes down for any reason, another node can send a node-down message to any endpoint we choose; in our case that endpoint is a listener sitting next to the Twemproxy server.
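Here is a minimal sketch of such a Serf event handler, registered on each node with something like serf agent -event-handler=<script>. The listener URL is hypothetical, and the exact stdin format Serf uses should be checked against its documentation:

    #!/usr/bin/env python3
    import json
    import os
    import sys
    import urllib.request

    LISTENER_URL = 'http://twemproxy-host:8080/members'   # hypothetical listener endpoint

    event = os.environ.get('SERF_EVENT', '')
    if event in ('member-join', 'member-leave', 'member-failed'):
        # Serf pipes the affected members to the handler on stdin, one per line.
        members = [line.split() for line in sys.stdin if line.strip()]
        payload = json.dumps({'event': event, 'members': members}).encode()
        request = urllib.request.Request(LISTENER_URL, data=payload,
                                         headers={'Content-Type': 'application/json'})
        urllib.request.urlopen(request)    # tell the listener which node changed state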

This is not all

Now we have a Redis cluster built on a consistent hash ring: Twemproxy stores the data (using consistent hashing), and Serf uses the gossip protocol to detect failed cluster members and send failure messages to Twemproxy. But we have not yet assembled our idealized Redis cluster.

Message listener

Although Serf can deliver node-down or node-up messages to any endpoint, Twemproxy has no mechanism for listening to such events. We therefore need to write a custom listener, a kind of Redis/Twemproxy agent, which has to do the following:

  • Listen for Serf messages
  • Update nutcracker.yml to reflect the new topology
  • Restart Twemproxy

The message listener can be a small HTTP server that, when it receives a batch of POSTed data, performs the actions above on Twemproxy's behalf. One thing to keep in mind is that such a message must be handled as an atomic operation: when a node fails (or goes offline unexpectedly), every live node that can send messages will report the failure to the listener, but the listener should act on it only once.
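A minimal sketch of such a listener follows, assuming the JSON payload produced by the Serf handler shown earlier. The configuration path and the restart command are assumptions and depend on how Twemproxy is installed:

    #!/usr/bin/env python3
    import json
    import subprocess
    from http.server import BaseHTTPRequestHandler, HTTPServer

    NUTCRACKER_CONF = '/etc/nutcracker.yml'    # assumed location of the Twemproxy config
    handled = set()                            # failures we have already acted upon

    def rewrite_config_without(node):
        # Placeholder: drop the failed node from the servers list in NUTCRACKER_CONF
        # and write the file back out (for example with a YAML library).
        pass

    class ListenerHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            body = self.rfile.read(int(self.headers.get('Content-Length', 0)))
            message = json.loads(body)
            for member in message.get('members', []):
                node = member[0]
                # Every live node reports the same failure; react to it only once.
                if message.get('event') == 'member-failed' and node not in handled:
                    handled.add(node)
                    rewrite_config_without(node)                           # new topology
                    subprocess.call(['service', 'nutcracker', 'restart'])  # assumed restart command
            self.send_response(200)
            self.end_headers()

    HTTPServer(('0.0.0.0', 8080), ListenerHandler).serve_forever()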

Data Replication

In the section on consistent hashing above, I mentioned the replication factor of the Redis cluster. Replication is not an inherent feature of Twemproxy either: Twemproxy only cares about storing one copy using consistent hashing. So in our pursuit of the idealized Redis cluster, we also need to build this replication capability into either Twemproxy or Redis.

Building replication into Twemproxy would mean turning the replication factor into a configuration option and saving the data to the adjacent Redis nodes in the cluster (according to that factor). Since Twemproxy already knows the node positions, this would be a great feature to add to it.
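As a thought experiment, here is what that replication could look like if it were done on the client side instead, reusing the HashRing and replicas_for() from the consistent-hashing sketch above; the address mapping is hypothetical:

    import redis

    NODE_ADDRESSES = {                         # hypothetical ring-name -> server mapping
        'A': ('10.0.0.1', 6379), 'B': ('10.0.0.2', 6379), 'C': ('10.0.0.3', 6379),
        'D': ('10.0.0.4', 6379), 'E': ('10.0.0.5', 6379),
    }

    def replicated_set(ring, key, value, replication_factor=2):
        # Write the value to the owning node and to its clockwise neighbours,
        # so that `replication_factor` extra copies exist on the ring.
        owner, copies = replicas_for(ring, key, replication_factor)
        for name in [owner] + copies:
            host, port = NODE_ADDRESSES[name]
            redis.Redis(host=host, port=port).set(key, value)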

But since Twemproxy is only a proxy server, its simplicity is its strength, and bolting replica management onto it would make it bloated.

Redis Master/Slave Ring

While thinking about how this could work, it suddenly occurred to me: why not make each node a replica, or slave, of another node, so that they form a master-slave ring?

In this scheme, if a node fails, its data can still be obtained from the adjacent node on the ring: that node holds a copy of the data because it has been acting as the failed node's slave. It is a ring in which every node is both a master and a slave. As before, Serf is the agent that spreads node-failure messages. This time, however, the client on the Twemproxy side, that is, the listener, not only updates Twemproxy with the failure but also rearranges the Redis servers in the cluster to adapt to the change.

There is an obvious flaw in this ring, and a technical one as well. The obvious flaw is that the ring breaks down because a slave's own slave cannot tell which data belongs to its immediate master and which data its master received from its own master; as a result, all the data would circulate around the ring endlessly.

The technical problem is that when Redis synchronizes a slave from its master, it first erases the slave's data, so everything previously written to the slave is deleted. This kind of master-slave ring obviously cannot be used in practice unless the replication mechanism is modified to suit our needs, so that a slave does not pass its master's data on to its own slave. To achieve this, every node must be able to separate its own key space from the key space of its master; that way it will never forward the master's data to its slave.

With that in place, four actions are needed when a node fails: 1. Make the failed node's slave the owner of its key space. 2. Distribute those keys to the new owner's own slave for replication. 3. Make the failed node's slave the slave of the failed node's master. 4. Finally, reconfigure Twemproxy for the new topology.
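Put as a pseudocode-style sketch, the recovery sequence might look like this; every helper below is a hypothetical placeholder for the behaviour just described, not an existing API:

    def handle_node_failure(failed, ring):
        successor = clockwise_neighbour(ring, failed)             # the failed node's slave
        predecessor = counter_clockwise_neighbour(ring, failed)   # the failed node's master

        # 1. The failed node's slave takes ownership of its key space.
        promote_key_space(successor, key_space_of(failed))
        # 2. Those keys are pushed on to the new owner's own slave for replication.
        replicate_key_space(successor, clockwise_neighbour(ring, successor))
        # 3. The failed node's slave becomes the slave of the failed node's master.
        make_slave_of(successor, predecessor)
        # 4. Finally, Twemproxy is reconfigured for the new topology.
        rewrite_twemproxy_config(ring_without(ring, failed))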

Why is it idealized?

In reality, no such Redis cluster exists today, one with consistent hashing, high availability, and partition tolerance. So the last picture depicts an idealized Redis cluster; but it is not impossible. Below are the things that would be needed to turn it into a real product.

Transparent Twemproxy

Twemproxy would need to be deployed in such a way that the positions of the Redis nodes on the hash ring are transparent, so that each Redis node can know its own position and the positions of its adjacent nodes. This information is necessary for master-slave replication between nodes and for repairing a failed node. Since Twemproxy is open source, it could be modified and extended to expose this node-position information.

Redis data ownership

This is the harder part. Each Redis node would need to keep track of which data is its own and which data belongs to its master. At present such a separation is not possible. It would require changes to the core Redis code so that a node knows when to synchronize data on to its slave and when not to.

To sum up, making our idealized Redis cluster a reality means modifying these two components. Both have long been large, industrial-grade pieces of software used in production environments, so turning this cluster into reality would be well worth anyone's effort.

This article is from the Open Source China Community [http://www.oschina.net]
Title: Idealized Redis Cluster
Address: http://www.oschina.net/translate/utopian-redis-cluster
Translation: Qingfeng Xiaoyue, super0555, wobuzhidao _, longears, Java grassroots
