Building Data Services Based on Redis


Today, let's talk about how to extend a data service built on Redis: how to implement sharding, and how to achieve high availability.

There is no perfect design for a distributed system; trade-offs are everywhere.

Therefore, before we begin, we need to establish the principles of the discussion. Take the CAP theorem as the frame for the design decisions: since the protagonist here is Redis, performance is clearly the highest design goal, so every decision made below will favor AP over C in CAP terms.


Let's take these two topics in order, starting with sharding.

What is sharding? To put it simply, it is scaling a single Redis instance horizontally.

Of course, readers working on games may ask: with one Redis instance per server, why do we need to scale horizontally? We have discussed this topic in a few previous articles, so I won't repeat it here.

If you want to achieve service-level reuse, the data service is often positioned as a global service. A single Redis instance can hardly cope with that load; after all, Redis is single-threaded.

Readers who have used MySQL will already be used to horizontal splitting. Redis follows a similar principle: the overall data set is sliced, each part is a shard, and different shards maintain disjoint key sets.

The essence of the sharding problem is how to build a globally unified data service on top of multiple Redis instances, under the constraint that we cannot guarantee strong consistency.

In other words, the data service does sharding on the premise that it makes no guarantees about cross-shard transactions. Redis Cluster does not provide such support either, because distributed transactions are inherently at odds with Redis's positioning.

Therefore, our sharding scheme has two limitations:

· Data in different shards must be strictly isolated, for example data belonging to different groups of services, or data that is completely unrelated. To achieve cross-shard data interaction, a higher-level coordination mechanism is required; the data service layer itself makes no promises. If you do want to offer the application layer such a coordination mechanism, a simple and direct approach is to deploy the single-instance lock mechanism introduced in the previous article on each shard (see the sketch after this list).

· Our sharding scheme cannot provide a cross-shard data-redundancy mechanism the way a distributed storage system does; in other words, a single piece of data cannot span multiple shards.
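For illustration, here is a minimal Python sketch (using redis-py) of deploying a simple per-shard lock, as mentioned in the first point above. The previous article's lock is not reproduced here; this assumes a common SET NX EX style lock as a stand-in, and the function names are hypothetical.

```python
import uuid
import redis

def acquire_lock(r: redis.Redis, name: str, ttl: int = 10):
    """Try to take a lock on one shard; returns a token on success, else None."""
    token = str(uuid.uuid4())
    # NX: only set if absent; EX: expire so a crashed holder cannot block forever.
    if r.set("lock:" + name, token, nx=True, ex=ttl):
        return token
    return None

def release_lock(r: redis.Redis, name: str, token: str):
    """Release the lock only if we still hold it (not strictly atomic;
    a Lua script would be needed for full correctness)."""
    if r.get("lock:" + name) == token.encode():
        r.delete("lock:" + name)
```

The lock lives entirely on one shard, which is exactly why cross-shard coordination must happen above the data service layer.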

How do we implement sharding?

First, we need to determine what problems the sharding scheme has to solve.

A sharded Redis cluster in effect forms a stateful service. To design a stateful service, we usually consider two aspects:

· Cluster membership: the relationships between the nodes, or shards, of the system.

· Work distribution: which node should handle a given external request, or which shard should a consumer of the data service (hereinafter referred to as dbclient) go to for reads and writes.

For the first problem, there are usually three solutions:

· Presharding, also known as static shard configuration.

· Gossip protocol, which is the scheme used by Redis Cluster. Simply put, due to network partitions, node jitter, and so on, each node in the cluster may hold a different view of the cluster as a whole, and node information is shared between nodes via the gossip protocol. This is currently a popular decentralized approach in the industry.

· Consensus system, which, in contrast to the previous one, relies on an external distributed consensus facility whose quorum decides the identity of each node in the cluster.

Which solution to choose depends on the requirements. I believe that for game servers and most application back-end scenarios, the cost of the latter two is too high and would add a lot of unnecessary complexity, so neither is the right choice. Moreover, most services can usually determine the capacity limit of each shard during the design phase and do not need much in the way of complex mechanisms.

However, the shortcomings of presharding are also obvious: it supports neither dynamic scaling nor high availability. In fact, though, a little modification is enough to meet the demand.

Before we talk about the concrete modifications, let's first look at the second problem the sharding scheme has to solve: work distribution.

This problem is really sharding viewed from another dimension. There are many solutions, but judging by their impact on the architecture, they fall into roughly two types:

· One is proxy-based, relying on an additional forwarding proxy. Examples are twemproxy and codis.

· The other is client sharding: each dbclient (every service that needs the data service) maintains the sharding rules itself and decides which Redis instance to go to. Redis Cluster is essentially this type, with the client side caching part of the sharding information.

The disadvantage of the first scheme is obvious: it adds an extra layer of indirection to the architecture, and therefore one more network round trip. If the proxy supports high availability, as twemproxy and codis do, that is acceptable, but on GitHub you can find a great many proxy-based solutions that do not; such a proxy is nothing more than a single point of failure, which completely defeats the purpose of sharding.

As for the second scheme, the drawback I can think of is that when the cluster state changes, dbclient is not notified immediately.

The first scheme we can actually rule out directly. It is better suited to private-cloud scenarios, where the team that develops the data service may be organizationally far from the business teams and therefore needs a unified forwarding proxy service. For simpler application development scenarios, where the data service and the logic service are written by the same group of people, there is no need to add an extra layer.

So it seems we can only choose the second option.

Combining presharding with client sharding, we now have the following result: the data service is global, Redis runs as multiple instances, unrelated data is accessed on different shards, and dbclient holds this mapping relationship.
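As a concrete illustration, here is a minimal Python sketch (using redis-py) of the client-sharding idea described above. The shard addresses and the crc32-modulo routing are illustrative assumptions, not the article's prescribed implementation.

```python
import zlib
import redis

# Presharded: the shard list is fixed configuration known to every dbclient.
SHARDS = [
    {"host": "10.0.0.1", "port": 6379},   # placeholder addresses
    {"host": "10.0.0.2", "port": 6379},
]

# One client per shard, created up front.
_clients = [redis.Redis(host=s["host"], port=s["port"]) for s in SHARDS]

def shard_for(key: str) -> redis.Redis:
    # Naive modulo hashing; fine while the shard count never changes,
    # but adding a shard scrambles the mapping (discussed below).
    slot = zlib.crc32(key.encode()) % len(_clients)
    return _clients[slot]

shard_for("player:1001").set("player:1001", "some-blob")
value = shard_for("player:1001").get("player:1001")
```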


However, the current scheme can only be considered to meet an application's basic data-service needs.

In the game industry, most teams that use Redis typically end up with roughly this as their data service. The extensions that follow are not impossible for them, but they may bring maintenance complexity and uncertainty.

But as a programmer with professional ethics, I choose to keep extending it.

There are two problems with this solution:

· First, while we do not need to support online data migration, offline data migration is a must; after all, presharding cannot be foolproof. In this scenario, if a naive hashing algorithm is used, adding a shard scrambles the original key-to-shard correspondence, raising the cost of data migration.

· Second, the sharding scheme spreads the risk of the entire data service going down across the different shards; compared with an unsharded data service, one machine going down only affects some of the clients. Even so, we should be able to extend the data service further and make it more available.

For the first problem, the approach is not much different from that of proxy-based solutions. Because our data service scheme is relatively simple, we can use consistent hashing, or a relatively simple two-stage mapping: the first stage is a fixed static hash, and the second stage is a dynamically configurable map. The former works by algorithm, the latter by maintaining the mapping configuration; either way, the impact on the key set can be minimized.
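Here is a minimal sketch of the two-stage mapping just described, assuming a fixed slot count and an in-memory slot-to-shard map; the numbers and the migration step are illustrative only.

```python
import zlib

NUM_SLOTS = 1024  # stage one is fixed forever, so the key-to-slot hash never changes

# Stage two: dynamically configurable (e.g. published later through the watcher/ZK).
# Initially two shards: the lower half of the slots on shard 0, the upper half on shard 1.
slot_to_shard = {slot: 0 if slot < 512 else 1 for slot in range(NUM_SLOTS)}

def key_to_slot(key: str) -> int:
    return zlib.crc32(key.encode()) % NUM_SLOTS

def shard_for_key(key: str) -> int:
    return slot_to_shard[key_to_slot(key)]

# Offline migration example: move slots 768-1023 from shard 1 to a new shard 2.
# Only the keys in those slots need to be migrated; everything else is untouched.
for slot in range(768, NUM_SLOTS):
    slot_to_shard[slot] = 2
```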

For the second problem, the solution is to achieve high availability.

How can we make the data service highly available? Before discussing that, let's first look at how Redis itself achieves "availability".

What is the essence of availability for Redis? In fact, it means that when a Redis instance goes down, a standby node can take its place.

Redis supports this through two mechanisms.

The first mechanism is replication. Common replication schemes fall into two main types.

· One is active-passive: the active node modifies its own state first and writes a unified persistent log, and the passive nodes read the log to catch up with that state.

· The other is active-active: write requests are uniformly written to a persistent log, and each active node synchronizes its own progress from the log.

Redis's replication uses an active-passive scheme with weak consistency. That is, the master maintains the log itself and syncs it to the slaves; if the master goes down, some log entries may be lost. The client receives a success reply as soon as the write to the master completes, so this is asynchronous replication.
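The asynchronous nature can be seen in a small redis-py sketch: the write returns once the master has it, and the WAIT command (exposed as wait() in redis-py) can optionally block until replicas acknowledge. The address below is a placeholder.

```python
import redis

master = redis.Redis(host="10.0.0.1", port=6379)   # placeholder master address

master.set("balance:1001", 100)     # returns as soon as the master has applied it
# Optionally ask the master to wait (up to 100 ms) until at least 1 replica has
# acknowledged the write; returns how many replicas actually acked.
acked = master.wait(1, 100)
print("replicas that acknowledged:", acked)
```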

This mechanism only solves node data redundancy. For Redis to be available, we also need to solve the problem of having the standby automatically take over when a Redis instance goes down; after all, manually monitoring the master's state and manually switching over is not realistic. Hence a second mechanism is needed.

The second mechanism is Redis's own Redis Sentinel, which automates failover. Redis Sentinel is actually a special Redis instance that is itself a highly available service: multiple Sentinels can run at once, they can do automatic service discovery (based on Redis's built-in pub/sub support; Sentinel does not disable the pub/sub command set), and they can do leader election (based on a Raft-like algorithm implemented as part of Sentinel). When the master is found to be down, the elected leader initiates the failover and, once the old master comes back online, demotes it to a slave of the new master.
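For reference, redis-py ships a Sentinel helper that lets a client discover the current master through Sentinel. A minimal sketch, using the conventional "mymaster" service name and a placeholder Sentinel address:

```python
from redis.sentinel import Sentinel

sentinel = Sentinel([("127.0.0.1", 26379)], socket_timeout=0.5)

# Sentinel tells us who the current master and slaves are; after a failover,
# the same calls transparently return the new topology.
master = sentinel.master_for("mymaster", socket_timeout=0.5)
replica = sentinel.slave_for("mymaster", socket_timeout=0.5)

master.set("hello", "world")
print(replica.get("hello"))
```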

Based on these two mechanisms, Redis achieves a certain degree of availability.


Next, let's look at how to make the data service highly available.

What is the essence of availability for the data service? In addition to the requirements on Redis availability, namely data redundancy across Redis instances and automatic failover, each dbclient must also be notified when a switchover happens.

In other words, the first diagram is changed to look like this:

Each shard is changed to master-slave mode.

If Redis Sentinel is responsible for master-slave switching, the most natural idea is to have dbclient ask Sentinel for the current master-slave connection information. But Redis Sentinel is itself a Redis instance, and its numbers are dynamic; Sentinel connection information is not only a problem to configure but also brings all sorts of problems when it is updated dynamically.

Also, Redis Sentinel is essentially a static part of the whole backend (from dbclient's point of view it is a service), yet it depends on Redis to start, which is not particularly elegant. On the other hand, for dbclient to ask Redis Sentinel for the current connection information, it can only rely on Sentinel's built-in pub/sub mechanism. Redis pub/sub is just simple message distribution with no message persistence, so a polling model for requesting connection information is required.

Can we build a tailored service at a lower cost to replace Redis Sentinel and solve these problems?

Recall the ideas we used earlier to solve the resharding problem:

1. Consistent hashing.

2. A relatively simple two-stage mapping: the first stage is a fixed static hash, the second a dynamically configurable map. The former works by algorithm, the latter by maintaining the mapping configuration; either way, the impact on the key set can be minimized.

Both schemes can support dynamic resharding, and dbclient can be updated dynamically:

· If two-stage mapping is used, we can dynamically publish the configuration data of the second stage.

· If consistent hashing is used, we can dynamically publish the shards' connection information.

To restate, the service we are going to implement (hereinafter referred to as the watcher) must at least satisfy these requirements:

· It must be able to monitor the liveness of Redis. This is easy to implement: PING the Redis instances periodically. The information needed and the criteria for judging subjective down and objective down can be copied directly from the Sentinel implementation (a minimal sketch follows this list).

· It must implement autonomous service discovery, including discovering other watchers and discovering new nodes within the monitored master-slave groups. For implementation, the former can be based on the pub/sub functionality of a message queue, and for the latter it is enough to periodically fetch INFO from the Redis instances.

· When the master is judged objectively down, a leader must be elected to perform the subsequent failover process. This is the most complex part of the implementation and is discussed in the next section.

· The elected leader then promotes the most suitable slave to master, waits for the old master to come back online, and demotes it to a slave of the new master.
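Referring back to the first requirement in the list above, here is a minimal Python sketch of the watcher's liveness/INFO loop; addresses, timeouts, and the down threshold are illustrative assumptions.

```python
import time
import redis

MONITORED = [("10.0.0.1", 6379), ("10.0.0.2", 6379)]   # placeholder addresses
DOWN_AFTER = 3          # consecutive failed PINGs before we call it subjectively down

clients = {addr: redis.Redis(host=addr[0], port=addr[1], socket_timeout=0.5)
           for addr in MONITORED}
failures = {addr: 0 for addr in MONITORED}

while True:
    for addr, r in clients.items():
        try:
            r.ping()                              # liveness check
            failures[addr] = 0
            info = r.info("replication")          # role and attached slaves
            if info.get("role") == "master":
                slaves = [v for k, v in info.items() if k.startswith("slave")]
        except redis.RedisError:
            failures[addr] += 1
            if failures[addr] >= DOWN_AFTER:
                print(addr, "looks subjectively down")  # next: agree on odown, elect leader
    time.sleep(1)
```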

Once these issues are solved, the watcher has both scalability and customizability, and can also provide some online migration mechanisms for the sharded data service. In this way, our data service becomes more robust and more available.

With this, although the master-slave group of each Redis shard is now available, introducing a new service introduces new uncertainty: if the availability of the data service depends on this new service, then we have to make sure the service itself is available.

That may sound a bit circular; in other words, service A's high availability is achieved with the help of service B, and service B itself needs to be highly available.

Let's start with a brief look at how Redis Sentinel achieves high availability. Multiple Sentinels can monitor the same master-slave group; when the master goes down, these Sentinels elect a leader using a Raft-like algorithm implemented by Redis itself. The algorithm flow is not particularly complex, at least much simpler than Paxos. All Sentinels start as followers; a Sentinel that judges the master objectively down promotes itself to candidate and canvasses votes from the other followers, and within the same epoch each follower can only vote for the first candidate that asks for its vote. In practice, one or two epochs are usually enough to form a majority and elect a leader. With a leader, issuing SLAVEOF on the Redis instances afterwards is much easier.
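To make the "one vote per epoch" rule concrete, here is a highly simplified Python sketch; it illustrates the idea only and is not Sentinel's actual implementation.

```python
class Follower:
    def __init__(self):
        self.voted_in_epoch = {}   # epoch -> the candidate we already voted for

    def request_vote(self, epoch: int, candidate: str) -> bool:
        # Grant the vote only to the first candidate that asks in this epoch.
        if epoch not in self.voted_in_epoch:
            self.voted_in_epoch[epoch] = candidate
            return True
        return self.voted_in_epoch[epoch] == candidate

followers = [Follower() for _ in range(5)]
votes = sum(f.request_vote(epoch=1, candidate="sentinel-A") for f in followers)
print("sentinel-A elected:", votes > len(followers) // 2)   # majority check
```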


If we want to replace Sentinel with the watcher, the most complex implementation detail is probably this part of the logic.

This part of the logic is about maintaining consistent state in a distributed system: take the concept of "who is the leader" as a state variable maintained jointly by several peer nodes. Since any of them may modify this variable, whose modification actually takes effect?

Fortunately, for this common problem scenario there is a ready-made infrastructure abstraction to solve it.

This infrastructure is the coordinator component of distributed systems. The veteran is ZooKeeper (based on ZAB, a protocol derived from Paxos; hereinafter referred to as ZK), and a newer one is etcd (which everyone knows, based on the Raft protocol). Such components usually do not need to be re-developed: an algorithm like Paxos takes a long time just to understand, and the implementation details are even harder to get right. That is why many open-source projects rely on one of the two for high availability; for example, codis started out relying on ZK.

What problems does ZK solve?

For the needs of typical application services, ZK can be used to elect a leader, and it can also be used to maintain dbclient configuration data: dbclient simply goes to ZK to fetch the data.

I won't introduce ZK's specific principles here. If you have the time, study Paxos and read Lamport's paper; if you don't have the time or energy, blog posts on how ZK is implemented will do.

Let me briefly introduce how to implement leader election based on ZK. ZK provides a directory structure similar to an OS file system; each node in the tree has a type and can also store some data. ZK also provides a one-shot watch mechanism.

The application layer can implement leader election based on these concepts.

Suppose there is a directory node "/election". When a watcher starts, it creates a child node under it with the type ephemeral sequential: ephemeral means the node disappears when its creator goes away (its session ends), and sequential means a numeric suffix is appended to the node's name, uniquely identifying the node's ID among the children of "/election".

· A simple scheme is to have every watcher watch all the child nodes of "/election" and check whether its own ID is the smallest. If it is, it knows it is the leader, tells the application layer so, and the application layer proceeds accordingly. However, this causes a thundering-herd effect: when a child node is deleted, every watcher is notified, but at most one of them actually goes from follower to leader.

· A slightly optimized scheme is for each node to watch only the node whose ID is one position smaller than its own. This way, when the node with the smallest ID goes down, only the node with the next-smallest ID is notified and learns that it has become the leader, avoiding the thundering-herd effect. A minimal sketch of this scheme follows below.
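Here is a minimal sketch of this election scheme using the kazoo ZooKeeper client. The "/election" path matches the example above, while the node prefix and callback names are hypothetical.

```python
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()
zk.ensure_path("/election")

# Ephemeral + sequential: the node vanishes when our session ends,
# and ZK appends a monotonically increasing suffix to the name.
my_path = zk.create("/election/watcher-", ephemeral=True, sequence=True)
my_name = my_path.rsplit("/", 1)[1]

def become_leader():
    print("this watcher is now the leader")   # hand off to the application layer

def check_leadership():
    children = sorted(zk.get_children("/election"))
    if children[0] == my_name:
        become_leader()
        return
    # Watch only the node immediately ahead of us to avoid the thundering herd.
    predecessor = children[children.index(my_name) - 1]
    def on_predecessor_change(event):
        check_leadership()
    if not zk.exists("/election/" + predecessor, watch=on_predecessor_change):
        check_leadership()   # predecessor already gone; re-evaluate

check_leadership()
```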

In practice I found one more point worth noting: the "ephemeral" in ephemeral nodes refers to the end of a session, not of a connection.

For example, suppose each application node is named watcher1. The first time it successfully creates its node, the full name is, say, watcher10002 (ZK appends the sequence number automatically). If the process then goes offline, the watcher10002 node will continue to exist for a while; if watcher1 comes back online during that period and tries to create watcher1 again, the creation fails, and shortly afterwards the previous node is destroyed when its session expires, so this watcher1 effectively disappears from the election.

There are two solutions: explicitly delete the old node before creating a new one, or ensure through some other mechanism that every created node has a different name, for example by using GUIDs.

As for configuration distribution, it is much simpler. When the configuration changes, update the node data directly, and ZK notifies the dbclients that are watching it. Compared with Sentinel's model of polling for configuration data, this event-notification mechanism is more elegant.
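A minimal sketch of this configuration-distribution pattern with kazoo; the "/data-service/shard-map" path and the map contents are illustrative assumptions.

```python
import json
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()
zk.ensure_path("/data-service/shard-map")

# Publisher side (e.g. the watcher after a failover): overwrite the node data.
new_map = {"0": "10.0.0.1:6379", "1": "10.0.0.2:6379"}
zk.set("/data-service/shard-map", json.dumps(new_map).encode())

# dbclient side: the watch fires again on every data change, so no polling is needed.
@zk.DataWatch("/data-service/shard-map")
def on_shard_map_change(data, stat):
    if data:
        print("reloaded shard map:", json.loads(data.decode()))
```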

Take a look at the final architecture diagram:

