Redis Replication and Scalable Cluster Creation

Last Update:2018-12-07 Source: Internet

Author: User

Tags redis cluster

Document directory

Redis Replication Process Overview
Redis replication mechanism Defects
Cache or storage
Build a scalable redis Cluster
Ideas for improving redis Replication
Integration of redis and MySQL

This article discusses the Redis replication function, the strengths and weaknesses of its replication mechanism, and how to build a cluster.

Redis Replication Process Overview

Redis replication is built entirely on the memory-snapshot persistence strategy discussed earlier. In other words, whatever persistence strategy you choose, enabling replication means memory snapshots will be taken. So start by planning your system's memory capacity; for the reasons, see the Redis disk I/O problem described in my previous article.

The Redis replication process is driven by a pair of state machines, one on the slave side and one on the master side. The states involved are:

Slave side:

REDIS_REPL_NONE
REDIS_REPL_CONNECT
REDIS_REPL_CONNECTED

Master side:

REDIS_REPL_WAIT_BGSAVE_START
REDIS_REPL_WAIT_BGSAVE_END
REDIS_REPL_SEND_BULK
REDIS_REPL_ONLINE

The entire state machine process is as follows:

1. The slave has the slaveof directive in its configuration file, so it reads the configuration at startup and its initial state is REDIS_REPL_CONNECT.
2. In the scheduled task serverCron (an event triggered by Redis's internal timer), the slave connects to the master and sends the SYNC command, then blocks while waiting for the master to send back its memory snapshot file (the latest version of Redis no longer needs to block the slave).
3. When the master receives the SYNC command, it checks whether a memory snapshot child process is already running. If not, it starts a snapshot immediately; if so, it waits for that one to finish. Once the snapshot is complete, the master sends the file to the slave.
4. The slave receives the snapshot file from the master and stores it locally. Once the file has been fully received, the slave clears its in-memory table, loads the snapshot file, and rebuilds the data structures of the whole in-memory table. Its final state is set to REDIS_REPL_CONNECTED, and the slave's state machine is finished.
5. While the master is sending the snapshot file, every command that changes the dataset is held temporarily in the send buffer queue (a list data structure) of the slave's network connection. After the snapshot has been sent, these commands are forwarded to the slave in order, and commands received afterwards are handled the same way. The master then sets the state to REDIS_REPL_ONLINE.
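
As a rough illustration of the five steps above, the following Python sketch simulates the two state machines in memory. It is not Redis source code: the class names, the single-slave simplification, and the use of a dict copy in place of a forked snapshot are all assumptions made for the example. Only the state names come from the article.

```python
from enum import Enum, auto


class SlaveState(Enum):
    NONE = auto()        # REDIS_REPL_NONE
    CONNECT = auto()     # REDIS_REPL_CONNECT: slaveof configured, not yet synced
    CONNECTED = auto()   # REDIS_REPL_CONNECTED: snapshot loaded


class MasterLinkState(Enum):
    WAIT_BGSAVE_START = auto()  # REDIS_REPL_WAIT_BGSAVE_START (unused: this toy snapshots at once)
    WAIT_BGSAVE_END = auto()    # REDIS_REPL_WAIT_BGSAVE_END
    SEND_BULK = auto()          # REDIS_REPL_SEND_BULK
    ONLINE = auto()             # REDIS_REPL_ONLINE


class Slave:
    def __init__(self):
        self.state = SlaveState.CONNECT  # step 1: slaveof was read from the config
        self.data = {}

    def load_snapshot(self, snapshot):
        # Step 4: clear the in-memory table and rebuild it from the snapshot.
        self.data.clear()
        self.data.update(snapshot)
        self.state = SlaveState.CONNECTED

    def apply(self, key, value):
        self.data[key] = value


class Master:
    def __init__(self):
        self.data = {}
        self.slave = None
        self.link_state = None
        self.pending = []  # step 5: writes buffered while the snapshot is in flight

    def set(self, key, value):
        self.data[key] = value
        if self.slave is None:
            return
        if self.link_state is MasterLinkState.ONLINE:
            self.slave.apply(key, value)
        else:
            self.pending.append((key, value))

    def start_sync(self, slave):
        # Step 3: take a snapshot (a dict copy stands in for the child process).
        self.slave = slave
        self.link_state = MasterLinkState.WAIT_BGSAVE_END
        snapshot = dict(self.data)
        self.link_state = MasterLinkState.SEND_BULK
        return snapshot

    def finish_sync(self):
        # Step 5: replay the buffered writes in order, then go online.
        for key, value in self.pending:
            self.slave.apply(key, value)
        self.pending.clear()
        self.link_state = MasterLinkState.ONLINE


master, slave = Master(), Slave()
master.set("a", 1)
snapshot = master.start_sync(slave)  # step 2: the slave asked for a sync
master.set("b", 2)                   # arrives during the transfer, so it is buffered
slave.load_snapshot(snapshot)
master.finish_sync()
master.set("c", 3)                   # forwarded directly once the link is online
```

Note that a reconnecting slave would have to go through `start_sync` and `load_snapshot` again from scratch, which is the behavior the next section criticizes.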

This completes the entire replication process.

(Figure: the Redis replication process)

Redis replication mechanism Defects

As the process above shows, whenever a slave connects to the master, the master takes a memory snapshot and sends the entire snapshot file to the slave. Redis has no concept of a replication position like MySQL's, which means there is no incremental replication, and this causes many problems for the cluster as a whole.

For example, suppose a master running in production has one slave configured for simple read/write splitting, and the slave loses its connection to the master because of a network or other problem. When the slave reconnects, it has to fetch a memory snapshot of the entire master again: all data on the slave is cleared and the whole in-memory table is rebuilt. This makes slave recovery very slow and also puts pressure on the master.

Therefore, if your Redis cluster needs master-slave replication, it is best to configure all slaves in advance and avoid adding slaves midway.

Cache or storage

Having analyzed Redis's replication and persistence functions, we can conclude that the current version of Redis is still a single-machine system. Its main problems are that the persistence mechanism is not mature enough and that the replication mechanism has serious defects. This leads us to rethink how Redis should be positioned: as a cache or as storage?

As a cache, unless some very special business scenario requires one of Redis's data structures, memcached may suit us better. After all, both the memcached client libraries and the server itself have been tested for much longer.

If Redis is used as storage, the biggest problem we face is that neither persistence nor replication solves Redis's single point of failure: when a Redis instance goes down, there is no good way to recover quickly. With persistent data typically in the tens of GB, it takes several hours for Redis to restart and reload, and replication is flawed. How can this problem be solved?

Build a scalable redis Cluster

1. Active replication avoids redis replication defects.

Since Redis's replication function is defective, we might as well abandon it and build our cluster environment through active replication instead.

Active replication means that the business layer, or a proxy middleware, performs dual or multiple writes of the data stored in Redis, so that keeping multiple copies of the data achieves the same goal as replication. Active replication is not limited to Redis clusters. Many companies now use it to work around the replication latency between MySQL master and slave databases; Twitter, for example, developed the middleware gizzard (https://github.com/twitter/gizzard) specifically for replication and partitioning.
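
A minimal sketch of client-side dual writing might look like the following. Plain Python dicts stand in for independent Redis groups, and `DualWriteClient` and `DownStore` are hypothetical names invented for this example; a real deployment would wrap actual Redis connections and would need a policy for writes that reach only some groups.

```python
class DownStore(dict):
    """Stand-in for a Redis group that is unreachable."""

    def __setitem__(self, key, value):
        raise ConnectionError("group is down")

    def __getitem__(self, key):
        raise ConnectionError("group is down")


class DualWriteClient:
    """Writes every key to all groups and reads from the first group that answers."""

    def __init__(self, groups):
        self.groups = groups  # one dict-like store per Redis group

    def set(self, key, value):
        written = 0
        for group in self.groups:
            try:
                group[key] = value
                written += 1
            except ConnectionError:
                continue  # tolerate a failed group; the others keep the data
        if written == 0:
            raise ConnectionError("no group accepted the write")
        return written

    def get(self, key):
        for group in self.groups:
            try:
                return group[key]
            except (ConnectionError, KeyError):
                continue  # fall through to the next copy
        return None


group_a, group_b = {}, {}
client = DualWriteClient([group_a, group_b])
copies = client.set("user:1", "alice")   # written to both groups

client.groups[0] = DownStore()           # simulate losing the first group
value_after_failure = client.get("user:1")
copies_after_failure = client.set("user:2", "bob")
```

After the simulated failure, reads are served from the surviving group, while `user:2` exists in only one copy. That gap is the consistency problem the text turns to next.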

Although active replication solves the latency problem of passive replication, it introduces a new problem: data consistency. If data is written twice or more, how do we keep the copies consistent? If your application does not have strict consistency requirements and can accept eventual consistency, a simple solution is usually to use timestamps or vector clocks and have the client read and check several copies of the data at once. If your application has strict consistency requirements, you need to introduce a more complex consistency algorithm such as Paxos, but write performance then drops considerably.

With multiple copies of the data kept through active replication, we no longer need to worry about a Redis single point of failure. If one group of Redis instances goes down, we can quickly switch the business to another group, which reduces business risk.
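
The timestamp approach can be sketched as follows, assuming each value is stored together with its write time and the newest copy wins. The function names and the read-repair step are assumptions made for illustration, dicts again stand in for Redis groups, and vector clocks and Paxos are not covered.

```python
import time


def write_all(replicas, key, value, ts=None):
    """Store (timestamp, value) in every replica."""
    ts = time.time() if ts is None else ts
    for replica in replicas:
        replica[key] = (ts, value)


def read_latest(replicas, key):
    """Read every copy, return the newest value, and repair stale copies."""
    versions = [replica[key] for replica in replicas if key in replica]
    if not versions:
        return None
    ts, value = max(versions, key=lambda version: version[0])
    for replica in replicas:
        if replica.get(key, (None,))[0] != ts:
            replica[key] = (ts, value)  # read repair: overwrite the stale copy
    return value


replica_a, replica_b = {}, {}
write_all([replica_a, replica_b], "k", "v1", ts=1)
replica_a["k"] = (2, "v2")   # a later write that reached only one replica
latest = read_latest([replica_a, replica_b], "k")
```

This gives eventual consistency only: it depends on reasonably synchronized clocks, and two writes with the same timestamp cannot be ordered.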

2. Online redis resizing through presharding.

Active replication solves the single point of failure (SPOF) problem in Redis. That leaves another important problem: capacity planning and online resizing.

As analyzed earlier, Redis is suited to scenarios where all data fits in memory, but memory capacity is limited. First, we need a preliminary capacity plan based on the volume of business data. For example, if your business data requires GB of storage space and each server has 48 GB of memory, then, given the Redis disk I/O problem discussed in the previous article, we need about 3 to 4 servers for storage. This plan only covers the current state of the business. If the business grows quickly, the capacity will soon prove insufficient, and the data stored in Redis will exceed the physical memory size. How can we resize Redis online?
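
The arithmetic behind such a plan can be written down as a small helper. The 48 GB server size comes from the example above; the 100 GB data size and the usable-memory fractions are hypothetical values chosen for illustration, on the assumption that part of each server's memory must stay free as headroom while a snapshot is being taken.

```python
import math


def servers_needed(data_gb, server_memory_gb, usable_fraction):
    """Number of servers needed when only part of each server's RAM holds data."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    usable_gb = server_memory_gb * usable_fraction
    return math.ceil(data_gb / usable_gb)


# Hypothetical inputs: 100 GB of data on 48 GB servers.
conservative = servers_needed(100, 48, 0.6)  # leave 40% headroom
aggressive = servers_needed(100, 48, 0.7)    # leave 30% headroom
```

With these assumed inputs the two calls give 4 and 3 servers respectively; the right headroom for a real system depends on its write rate during snapshots.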

The author of Redis proposed a scheme called presharding to handle dynamic resizing and data partitioning. The idea is to deploy multiple Redis instances on the same machine; when capacity runs short, the instances are split across different machines, which achieves the effect of expansion.
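
The key point of presharding is that the number of instances is fixed up front, so the key-to-instance mapping never changes; only the instance-to-machine mapping does. The sketch below illustrates this under assumed values: the instance count, host addresses, port numbers, and the choice of CRC32 as the hash are all hypothetical.

```python
import zlib

NUM_INSTANCES = 8  # fixed for the lifetime of the cluster (hypothetical value)


def instance_for(key):
    """Map a key to an instance index; this never changes when machines are added."""
    return zlib.crc32(key.encode("utf-8")) % NUM_INSTANCES


def address_for(key, layout):
    """Look up the (host, port) currently serving the key's instance."""
    return layout[instance_for(key)]


# Initially all instances run on one machine, one port per instance.
layout = {i: ("10.0.0.1", 6380 + i) for i in range(NUM_INSTANCES)}
before = address_for("user:42", layout)

# Resizing: move the upper half of the instances to a second machine.
for i in range(NUM_INSTANCES // 2, NUM_INSTANCES):
    layout[i] = ("10.0.0.2", 6380 + i)
after = address_for("user:42", layout)
```

A key may end up on a different host after the move, but it always belongs to the same instance (here identified by its port), so no data has to be rehashed.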

The splitting process is as follows:

1. Start the redis instance of the corresponding port on the new machine.
2. Configure the new port as the slave database of the port to be migrated.
3. After the replication is complete and the master database is synchronized, switch all clients to the new slave database port.
4. Configure the slave database as the new master database.
5. Remove the old port instance.
6. Repeat the above process to migrate all ports to the specified server.

The procedure above is the smooth migration process proposed by the author of Redis. However, it still relies heavily on Redis's own replication function. If the master's snapshot file is too large, replication takes a long time and puts pressure on the master, so we recommend performing the split during off-peak hours.
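
Steps 2 to 5 of the splitting procedure listed earlier could be scripted roughly as follows. The function is a sketch that assumes `old` and `new` are client objects exposing `slaveof()`, `info()`, and `shutdown()` in the style of the redis-py `Redis` class; `switch_clients` is a hypothetical callback for whatever mechanism repoints the application. Starting the new instance (step 1) is left outside the function.

```python
import time


def migrate_instance(old, new, old_host, old_port, switch_clients,
                     poll_seconds=1.0, timeout_seconds=3600.0):
    """Move one instance to a new machine, following the order of the listed steps."""
    # Step 2: make the new instance a slave of the instance being migrated.
    new.slaveof(old_host, old_port)

    # Step 3a: wait until the initial synchronization has finished.
    deadline = time.monotonic() + timeout_seconds
    while True:
        info = new.info("replication")
        in_sync = (info.get("master_link_status") == "up"
                   and not info.get("master_sync_in_progress"))
        if in_sync:
            break
        if time.monotonic() > deadline:
            raise TimeoutError("replication did not complete in time")
        time.sleep(poll_seconds)

    # Step 3b: point all clients at the new instance.
    switch_clients()

    # Step 4: promote the new instance (no arguments means SLAVEOF NO ONE in redis-py).
    new.slaveof()

    # Step 5: retire the old instance.
    old.shutdown()
```

Calling this once per port and repeating for every instance corresponds to step 6. In a Redis version where slaves are read-only, writes arriving between steps 3b and 4 would fail, so the two steps should be as close together as possible.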

Ideas for improving redis Replication

Our online system runs our own version of Redis. Its main purpose is to fix the lack of incremental replication: like the MySQL binlog, it can replicate incrementally from a position in the master's request log.

Our persistence scheme is to write the Redis AOF file first, automatically splitting and rolling it by file size, with Redis's rewrite command disabled. Memory snapshots are taken during off-peak hours, and the current AOF file position is written into the snapshot file along with the data, so that the snapshot and the AOF position are consistent. This gives us a memory snapshot of the system at a point in time together with the AOF position corresponding to that point. When a slave sends the synchronization command, we first send it the snapshot file. The slave then reads the AOF position stored in the snapshot file and sends it to the master, the master sends all commands after that position, and from then on replication carries only the incremental changes after that position.
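
The idea of pairing a snapshot with a log position can be shown with a small in-memory model. This is only a sketch of the scheme as described, not the authors' implementation: the class names are invented, a Python list stands in for the AOF, the position is simply a list index, and file splitting and rolling are ignored.

```python
class LogMaster:
    def __init__(self):
        self.data = {}
        self.aof = []         # append-only command log; the index is the position
        self.snapshot = None  # (copy of data, AOF position at snapshot time)

    def set(self, key, value):
        self.aof.append(("SET", key, value))  # log first, then apply
        self.data[key] = value

    def take_snapshot(self):
        # Store the data together with the matching log position.
        self.snapshot = (dict(self.data), len(self.aof))

    def commands_after(self, position):
        return self.aof[position:]


class LogSlave:
    def __init__(self):
        self.data = {}
        self.position = 0

    def full_sync(self, master):
        # Load the snapshot, then ask for everything after its position.
        data, position = master.snapshot
        self.data = dict(data)
        self.position = position
        self.catch_up(master)

    def catch_up(self, master):
        # Incremental replication: apply only the commands not yet seen.
        for _op, key, value in master.commands_after(self.position):
            self.data[key] = value
            self.position += 1


log_master, log_slave = LogMaster(), LogSlave()
log_master.set("a", 1)
log_master.take_snapshot()      # off-peak snapshot at position 1
log_master.set("b", 2)
log_slave.full_sync(log_master)

log_master.set("c", 3)          # written while the slave is disconnected
log_slave.catch_up(log_master)  # reconnect: no new snapshot is needed
```

The reconnect at the end transfers a single command instead of the whole dataset, which is the improvement over the stock mechanism described at the start of the article.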

Integration of redis and MySQL

Most Internet companies currently use MySQL as their primary persistent data store. How can Redis and MySQL be integrated well? We mainly use a heterogeneous read/write splitting scheme, with MySQL as the master database and Redis as a slave for high-speed data queries.

To this end, we developed our own MySQL replication tool that conveniently synchronizes MySQL data to Redis in real time.

(MySQL-redis heterogeneous read/write splitting)
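
The article does not describe how the replication tool works internally. As a purely hypothetical sketch of the general idea, the code below applies a stream of row-change events, of the kind a binlog reader might emit, to a key-value cache. The event format, the `table:primary_key` key scheme, and the function name are all assumptions, and a dict stands in for Redis.

```python
def apply_row_event(cache, event):
    """Mirror one MySQL row change into the cache."""
    key = f"{event['table']}:{event['pk']}"
    if event["type"] in ("insert", "update"):
        cache[key] = dict(event["row"])  # store the latest version of the row
    elif event["type"] == "delete":
        cache.pop(key, None)
    else:
        raise ValueError(f"unknown event type: {event['type']}")


cache = {}
events = [
    {"type": "insert", "table": "users", "pk": 1, "row": {"name": "alice"}},
    {"type": "insert", "table": "users", "pk": 2, "row": {"name": "bob"}},
    {"type": "update", "table": "users", "pk": 2, "row": {"name": "robert"}},
    {"type": "delete", "table": "users", "pk": 1, "row": None},
]
for event in events:
    apply_row_event(cache, event)
```

Applying events strictly in log order keeps the cache consistent with the master, so reads can be served from Redis while all writes continue to go to MySQL.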

Summary:

1. Redis replication does not support incremental replication; on every reconnection the master's entire memory snapshot is sent to the slave. You should therefore avoid adding slaves to a master that is under heavy load from online services.
2. Because Redis replication relies on snapshot persistence, if you choose append-only logging (AOF) for persistence, the system may flush the AOF log to disk and write a snapshot to disk at the same time, which degrades Redis's responsiveness. If you use AOF persistence, be even more cautious about adding slaves.
3. You can use active replication and presharding to build Redis clusters and scale them online.

Together with the previous two articles, this article has analyzed and discussed the most commonly used functions, application scenarios, and optimizations of Redis. Redis has many other auxiliary functions, and its author keeps trying out new ideas; I will not list them all here. Interested readers can study them and are welcome to join the discussion on my Weibo (http://weibo.com/bachmozart), @ swing Bach.

This article is an English version of an article originally published in Chinese on aliyun.com.
