A Redis cluster automatically shards data across multiple nodes when it starts. It also provides a degree of availability across shards: the cluster can keep working when a subset of the Redis nodes fails or the network is partitioned. However, in a large-scale failure (for example, when a majority of the master nodes are unavailable), the cluster becomes unusable.
So, in practical terms, a Redis cluster provides the following features:
- Data is automatically sharded across multiple Redis nodes
- The cluster can keep working when some nodes fail or are unreachable
TCP ports for Redis clusters
Each node in a Redis cluster maintains 2 TCP connections and listens on 2 ports. One, the "client port", accepts client commands and handles all interaction with clients, for example 6379. The other, the "cluster bus port", is the client port plus 10000, for example 16379, and is used for node-to-node communication over a binary protocol. Through the cluster bus, nodes detect failed nodes, update the configuration, and authorize failovers. Clients must only use the client port, never the cluster bus port. Make sure your firewall keeps both ports open, or the cluster will not work. The offset between the two ports is fixed: the cluster bus port is always 10000 higher than the client port.
Note that a cluster node uses 2 ports:
- The client port (typically 6379) must be open to all clients and to all other cluster nodes, because nodes also use it to transfer data.
- The cluster bus port (typically 16379) must be reachable only by the other nodes of the cluster.
Both ports must be open, or the cluster will not work properly.
Nodes exchange data over the cluster bus using a binary protocol, different from the client protocol, which reduces bandwidth and processing time.
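The relationship between the two ports is simple arithmetic; here is a quick sketch (the helper name is ours, not part of Redis):

```python
def cluster_bus_port(client_port: int) -> int:
    """Hypothetical helper: the cluster bus port is always the
    client port plus a fixed offset of 10000."""
    return client_port + 10000

print(cluster_bus_port(6379))  # → 16379
print(cluster_bus_port(7000))  # → 17000
```

When opening firewall rules for a cluster, both `client_port` and `client_port + 10000` have to be allowed for every node.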
Sharding of Redis cluster data
Redis clusters do not use consistent hashing; instead they use hash slots. The whole cluster has 16384 hash slots, and the slot a given key belongs to is determined by computing the CRC16 of the key modulo 16384.
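The slot computation can be sketched in a few lines of Python. This uses the CRC16-CCITT (XMODEM) variant that Redis Cluster is documented to use; treat it as an illustration rather than the reference implementation:

```python
def crc16(data: bytes) -> int:
    # CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0,
    # no reflection, no final XOR
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    # The slot for a key is CRC16(key) modulo 16384
    return crc16(key.encode()) % 16384

print(key_slot("foo"))    # 12182, matching the redirections in the
print(key_slot("hello"))  # 866    redis-cli session later in this tutorial
```

Note that this sketch ignores hash tags (discussed below), which the real slot computation applies before hashing.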
Each node in the cluster is responsible for a portion of the hash slots. For example, in a cluster with 3 nodes:
- Node A holds hash slots 0–5500
- Node B holds hash slots 5501–11000
- Node C holds hash slots 11001–16383
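Mapping a slot to its owner in this example layout is a simple range lookup; a sketch (the ranges are just the ones from the example above):

```python
# Example slot layout from above; a real cluster tracks this per node
SLOT_RANGES = {
    "A": range(0, 5501),       # slots 0-5500
    "B": range(5501, 11001),   # slots 5501-11000
    "C": range(11001, 16384),  # slots 11001-16383
}

def slot_owner(slot: int) -> str:
    """Return the node responsible for a given hash slot."""
    for node, slots in SLOT_RANGES.items():
        if slot in slots:
            return node
    raise ValueError(f"slot {slot} out of range 0-16383")

print(slot_owner(5500), slot_owner(5501), slot_owner(16383))  # A B C
```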
This way of distributing data makes it easy to add and remove nodes. For example, to add a new node D, some hash slots are moved from A, B, and C to D. Similarly, to remove node A from the cluster, its hash slots are moved to B and C; once A holds no data, it can be removed from the cluster entirely.
Because moving hash slots from one node to another does not require stopping operations, nodes can be added or removed, and the hash slots held by each node changed, without any downtime.
If multiple keys belong to the same hash slot, the cluster supports operating on them in a single command (or transaction, or Lua script). The concept of "hash tags" lets the user force multiple keys into the same hash slot. Hash tags are described in the cluster specification; in brief: if a key contains curly braces "{}", only the string inside the braces participates in the hash. For example, the keys "This{foo}" and "Another{foo}" will be assigned to the same hash slot, so they can be manipulated together in one command.
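The hash-tag rule can be sketched as follows; the corner cases (empty braces, nested braces) follow the cluster specification, and the function name is ours:

```python
def hash_tag(key: str) -> str:
    """Return the part of the key that participates in hashing.
    If the key contains a non-empty '{...}' section, only the text
    between the first '{' and the first following '}' is hashed;
    otherwise the whole key is hashed."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end > start + 1:
            return key[start + 1:end]
    return key

# 'This{foo}' and 'Another{foo}' hash the same substring, 'foo',
# so they land in the same hash slot
print(hash_tag("This{foo}"), hash_tag("Another{foo}"))
```

Note the empty-braces case: a key like `foo{}{bar}` is hashed as a whole, because the first `{}` pair contains nothing.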
Master-slave mode for Redis clusters
To keep the cluster working when some nodes fail or the network breaks, the cluster uses a master-slave model, with one master and N-1 slave replicas for each group of hash slots. In our earlier three-node example with A, B, and C, if node B fails the cluster cannot work properly, because the hash slots held by B can no longer be served. But if we give each node a slave, the picture becomes: A, B, and C are masters, and A1, B1, and C1 are their slaves; now the cluster survives the failure of node B. B1 is a replica of B, so when B fails, the cluster promotes B1 to master and keeps working. However, if B and B1 fail at the same time, the cluster cannot continue to work.
Consistency guarantees for Redis clusters
Redis clusters do not guarantee strong consistency: under certain failure conditions, writes that were already acknowledged to the client can be lost.
The first reason writes can be lost is that masters replicate to their slaves asynchronously.
A write operation proceeds like this:
- 1) The client sends a write to master node B
- 2) Master node B acknowledges the write to the client
- 3) Master node B propagates the write to its slaves B1, B2, B3
As the steps show, master node B does not wait for B1, B2, and B3 to confirm the write before replying to the client. So if node B fails after acknowledging the write but before propagating it to its slaves, and one of the slaves that never received the write is then promoted to master, the write is lost forever.
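The loss window can be illustrated with a toy model; this is not Redis code, just a simulation of the ordering described above:

```python
class ToyMaster:
    """Toy model of asynchronous replication: the master acknowledges
    a write before its replica has seen it."""
    def __init__(self):
        self.data = {}
        self.replica = {}      # stands in for B1/B2/B3
        self.backlog = []      # writes acked but not yet replicated

    def write(self, key, value):
        self.data[key] = value
        self.backlog.append((key, value))
        return "OK"            # step 2: ack before replication

    def replicate(self):
        while self.backlog:    # step 3: propagate to the replica
            key, value = self.backlog.pop(0)
            self.replica[key] = value

m = ToyMaster()
m.write("foo", "bar")          # client sees OK immediately
# If the master crashed at this point, the acked write would not
# exist on any replica and would be lost on failover:
print("foo" in m.replica)      # False: still only in the backlog
m.replicate()
print("foo" in m.replica)      # True once replication catches up
```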
This is similar to a traditional database that flushes to disk once per second, even without involving distributed systems: the acknowledgment does not mean the write is safe. Consistency could be improved by replying to the client only after replication completes, but at a real performance cost; that would amount to the Redis cluster using synchronous replication.
Basically, it is a tradeoff between performance and consistency.
If you really need it, the Redis cluster does support a form of synchronous replication through the WAIT command, which makes losing a write much less likely. However, even with WAIT, the Redis cluster is still not strongly consistent: in some complex failure scenarios, such as a slave being elected master after losing contact with its master, writes can still be lost.
Such an inconsistency arises when clients are in a minority partition containing at least one master, while most of the other nodes are in the majority partition. For example: six nodes, where A, B, and C are masters, A1, B1, and C1 are their slaves, and Z1 is a client.
Suppose a network failure splits the six nodes into two partitions: on one side A, C, A1, B1, and C1 can all reach each other; on the other side are B and Z1. Z1 can still write to B, and B still accepts its writes. If the partition heals quickly enough, the cluster continues working normally. But if it lasts long enough for B1, on the majority side, to be elected master, the writes Z1 sent to B in the meantime are lost.
Note that there is a limit to how long Z1 can keep writing to B: once the majority side has had enough time to elect a new master, every master on the minority side stops accepting writes.
The length of this window is controlled by a very important cluster setting called the node timeout. Once the node timeout has elapsed, a master that cannot be reached is considered failed and can be replaced by one of its slaves. Likewise, a master that cannot reach a majority of the other masters within the node timeout enters an error state and stops accepting writes.
Redis cluster parameter configuration
- cluster-enabled <yes/no>: If set to "yes", cluster support is enabled and this Redis instance becomes a node of a cluster; otherwise it is an ordinary standalone Redis instance.
- cluster-config-file <filename>: Note: despite being called a "cluster configuration file", this file cannot be edited by hand. It is automatically maintained by the cluster node and records which nodes are in the cluster, their state, and some persistent parameters, so these states can be restored on restart. The file is usually rewritten after a message is received.
- cluster-node-timeout <milliseconds>: The maximum amount of time a cluster node may be unreachable before it is considered failed. If a master is unreachable for longer than this, its slaves will initiate a failover and one of them will be promoted to master. Note that a node that cannot reach a majority of the masters within this time stops accepting requests.
- cluster-slave-validity-factor <factor>: If set to 0, a slave will always attempt a failover, regardless of how long it has been disconnected from its master. If set to a positive number, the product of cluster-node-timeout and this factor is the maximum disconnection time after which a slave's data is considered too stale for it to be promoted, and it will not initiate a failover. For example, with a cluster-node-timeout of 5 seconds and a cluster-slave-validity-factor of 10, a slave disconnected from its master for more than 50 seconds cannot become a master. Note that with a non-zero value, the cluster may become unavailable when a master fails and none of its slaves is eligible for promotion; in that case the cluster only recovers when the original master rejoins.
- cluster-migration-barrier <count>: The minimum number of slaves a master must keep; only slaves beyond this number may migrate to a master that has lost all its slaves. See the section on replica migration later in this tutorial for a more detailed introduction.
- cluster-require-full-coverage <yes/no>: If set to "yes" (the default), the whole cluster stops accepting operations as soon as any key's hash slot is not covered by a reachable node. If set to "no", the cluster still serves operations for the keys on the nodes that can be reached.
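The eligibility rule behind cluster-slave-validity-factor can be sketched as follows. This is a simplification (the real check in Redis also adds a small extra allowance on top of the product), and the function name is ours:

```python
def slave_can_failover(disconnect_ms: int,
                       node_timeout_ms: int,
                       validity_factor: int) -> bool:
    """Simplified eligibility check: may a slave that has been
    disconnected from its master for disconnect_ms start a failover?"""
    if validity_factor == 0:
        return True  # slave data is always considered valid
    return disconnect_ms <= node_timeout_ms * validity_factor

# node timeout 5s, factor 10: a slave disconnected for more than
# 50 seconds is considered too stale to be promoted
print(slave_can_failover(40_000, 5_000, 10))  # True
print(slave_can_failover(60_000, 5_000, 10))  # False
print(slave_can_failover(60_000, 5_000, 0))   # True: factor 0 disables the check
```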
Create and use a Redis cluster
To create a cluster, you first need some empty Redis instances running in cluster mode. That is to say, a Redis server started in the normal mode is not a cluster node; only instances started in cluster mode have the features of cluster nodes, support the cluster commands, and can become nodes of a cluster.
The following is the minimum configuration file for the Redis cluster:
port 7000
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
To enable cluster mode, just turn on the cluster-enabled configuration option. Every instance also keeps a node configuration file, nodes.conf by default, where it stores some information about this node. This file is created and updated by the cluster nodes themselves and must never be edited by hand.
A minimal cluster requires at least 3 master nodes. For your first test, it is strongly recommended to start 6 nodes: 3 masters and 3 slaves.
To begin the test, create a new directory, and inside it one subdirectory per instance, named after the instance's port number; we will run each instance from its own directory.
Something like this:
mkdir cluster-test
cd cluster-test
mkdir 7000 7001 7002 7003 7004 7005
Create a configuration file named redis.conf in each of the directories 7000 through 7005, using the minimal configuration above as a template. Remember to change the port number to match the directory name.
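Generating the six per-directory configuration files can be scripted; here is a minimal sketch based on the configuration template above (in practice you would write each rendered string into `<port>/redis.conf`):

```python
TEMPLATE = """port {port}
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
"""

def minimal_conf(port: int) -> str:
    # Render the minimal cluster configuration for one instance
    return TEMPLATE.format(port=port)

for port in range(7000, 7006):
    # e.g. open(f"{port}/redis.conf", "w").write(minimal_conf(port))
    print(minimal_conf(port).splitlines()[0])
```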
Copy your redis-server binary (compiled from the latest code on the unstable branch on GitHub) into the cluster-test directory, then open 6 terminal tabs for the test.
Start an instance in each terminal, like this:
cd 7000
../redis-server ./redis.conf
In the logs we can see that, because no nodes.conf file exists, each node assigns itself a new ID:
[82462] 26 Nov 11:56:55.329 * No cluster configuration found, I'm 97a3a64667477371c4479320d683e4c8db5858b1
This ID will always be used by this node as its unique identifier in the cluster. Nodes recognize each other by this ID, not by IP or port: the IP may change and the port may change, but the ID does not change as long as the node remains in the cluster. This identifier is called the Node ID.
Create a cluster
Now that the six instances are running, we need to write some meaningful configuration to the nodes in order to create the cluster. The Redis cluster command-line utility redis-trib makes this very easy. redis-trib is a Ruby script that sends commands to the nodes to create a cluster, check its state, or reshard it. It lives in the src directory of the Redis source distribution. You need the redis gem to run redis-trib:
gem install redis
To create a cluster, just enter the command:
./redis-trib.rb create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005
The command used here is create, since we want to create a new cluster. The option --replicas 1 means we want one slave for every master. The remaining arguments are the addresses of the instances to include in the cluster.
The cluster we created has 3 master nodes and 3 slave nodes.
redis-trib will propose a configuration; type yes to accept it. The cluster will then be configured and joined, which means the instances are instructed to talk to each other until they form a cluster. Finally, if everything went well, you will see a message like this:
[OK] All 16384 slots covered
This means that all 16384 hash slots are being served by a master node.
Create a Redis cluster using the create-cluster script
If you don't want to configure each node by hand as above, there is a simpler option (at the cost of learning fewer details of how the cluster operates).
In the utils/create-cluster directory there is a bash script named create-cluster. To start a cluster with 3 masters and 3 slaves, just run the following commands:
1. create-cluster start
2. create-cluster create
In step 2, type "yes" when redis-trib asks you to accept the proposed cluster layout.
Now you can interact with the cluster; the first node listens on port 30001 by default. When you are done, stop the cluster with the following command:
1. create-cluster stop
Please check the README in the directory for a detailed description of how to use this script.
Try the cluster
You can test the cluster using the client mentioned above or with the redis-cli command.
The following uses redis-cli as an example to test:
$ redis-cli -c -p 7000
redis 127.0.0.1:7000> set foo bar
-> Redirected to slot [12182] located at 127.0.0.1:7002
OK
redis 127.0.0.1:7002> set hello world
-> Redirected to slot [866] located at 127.0.0.1:7000
OK
redis 127.0.0.1:7000> get foo
-> Redirected to slot [12182] located at 127.0.0.1:7002
"bar"
redis 127.0.0.1:7000> get hello
-> Redirected to slot [866] located at 127.0.0.1:7000
"world"
Note: if you created the cluster with the script, your nodes may listen on different ports; the first node's port is 30001 by default.
redis-cli relies on the fact that any cluster node can redirect a client to the correct node, implementing only the most basic features of a cluster client. A more rigorous client can cache the hash slot to node mapping, connect directly to the correct node, and refresh the cache only when the cluster's node configuration changes, for example after a failover, or when an administrator adds or removes nodes.
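The caching behaviour just described can be sketched like this; SlotCache is a hypothetical helper, not the API of any real client library:

```python
class SlotCache:
    """Hypothetical client-side cache of the hash-slot -> node mapping.
    redis-cli -c simply follows -MOVED redirections; a smarter client
    remembers them and goes straight to the right node next time."""
    def __init__(self, default_node: str):
        self.default_node = default_node
        self.slots = {}  # slot number -> "host:port"

    def node_for(self, slot: int) -> str:
        # Fall back to any known node until the slot has been learned
        return self.slots.get(slot, self.default_node)

    def handle_moved(self, error: str) -> str:
        # A redirection error looks like: "MOVED 12182 127.0.0.1:7002"
        _, slot, node = error.split()
        self.slots[int(slot)] = node
        return node

cache = SlotCache("127.0.0.1:7000")
print(cache.node_for(12182))                      # default node at first
cache.handle_moved("MOVED 12182 127.0.0.1:7002")
print(cache.node_for(12182))                      # cached after the redirect
```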
Resharding
Resharding simply means moving hash slots from one set of nodes to another. Like cluster creation, it is done with redis-trib.
To start resharding, enter the following command:
./redis-trib.rb reshard 127.0.0.1:7000
You only need to specify one node in the cluster, and redis-trib will automatically find the other nodes in the cluster.
Currently, redis-trib only works interactively, driven by the administrator; you cannot just say "move 50% of the hash slots from this node to that node". It proceeds by asking questions, the first of which is how many hash slots you want to reshard:
How many slots do you want to move (from 1 to 16384)?
Since our earlier script has been running the whole time without any sleep calls, quite a few keys should have been inserted by now. We can try resharding 1000 hash slots.
Next, redis-trib needs to know the target of the resharding, that is, the node that will receive those 1000 hash slots. I will use the node 127.0.0.1:7000. You have to identify the node by its node ID, and redis-trib has already printed all the nodes with their IDs on the screen. The ID of a given node can also be found with the following command:
$ redis-cli -p 7000 cluster nodes | grep myself
97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5460
OK, my target node is 97a3a64667477371c4479320d683e4c8db5858b1.
Now redis-trib asks: from which nodes do you want to take these hash slots? I type all, so the slots are taken from all the other master nodes.
After the final confirmation, redis-trib prints which hash slot will be moved from which node to which node, and a dot is printed for every key actually moved.
While the resharding is in progress, you can see that the script you started earlier keeps running unaffected; you can even stop and restart it repeatedly during the resharding.
When the resharding is complete, you can check that the cluster is healthy by running the following command:
./redis-trib.rb check 127.0.0.1:7000
All hash slots are covered. This time the master at 127.0.0.1:7000 holds a few more hash slots, about 6461.
Scripted resharding
Resharding does not have to be interactive; it can be run automatically with a command like this:
./redis-trib.rb reshard --from <node-id> --to <node-id> --slots <number of slots> --yes <host>:<port>
If you reshard often you can automate it this way, but currently the redis-trib script does not rebalance the cluster or migrate hash slots intelligently based on the key distribution across nodes. This feature will be added in the future.
Test failover
Note: during this test, please keep the consistency test application above running.
To trigger a failover, the simplest thing to do is to crash a process: in our case, one of the master processes.
We can identify the cluster's masters with the following command:
$ redis-cli -p 7000 cluster nodes | grep master
3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385482984082 0 connected 5960-10921
2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 master - 0 1385482983582 0 connected 11423-16383
97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5959 10922-11422
So 7000, 7001, and 7002 are masters. Let's crash node 7002 with the DEBUG SEGFAULT command:
$ redis-cli -p 7002 debug segfault
Error: Server closed the connection
Now, let's see what the previous example of the consistency detector outputs:
18849 R (0 err) | 18849 W (0 err) |
23151 R (0 err) | 23151 W (0 err) |
27302 R (0 err) | 27302 W (0 err) |
... many error warnings here ...
29659 R (578 err) | 29660 W (577 err) |
33749 R (578 err) | 33750 W (577 err) |
37918 R (578 err) | 37919 W (577 err) |
42077 R (578 err) | 42078 W (577 err) |
The example reports 578 read errors and 577 write errors, but no inconsistency. As mentioned in the earlier chapter, the Redis cluster is not strongly consistent: because it replicates to slaves asynchronously, data can be lost when a master fails. Why was nothing lost here? Because the master propagates each write to its slaves almost immediately after replying to the client, so the window for loss is tiny; only a master failure inside that tiny window causes inconsistency. The probability is small, but not zero, which is why the Redis cluster is still not strongly consistent.
Now let's look at what the cluster did about the failed node (note that I have since restarted the failed node; it has rejoined the cluster as a slave):
$ redis-cli -p 7000 cluster nodes
3fc783611028b1707fd65345e763befb36454d73 127.0.0.1:7004 slave 3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 0 1385503418521 0 connected
a211e242fc6b22a9427fed61285e85892fa04e08 127.0.0.1:7003 slave 97a3a64667477371c4479320d683e4c8db5858b1 0 1385503419023 0 connected
97a3a64667477371c4479320d683e4c8db5858b1 :0 myself,master - 0 0 0 connected 0-5959 10922-11422
3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 127.0.0.1:7005 master - 0 1385503419023 3 connected 11423-16383
3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385503417005 0 connected 5960-10921
2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 slave 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385503418016 3 connected
Now the masters are on ports 7000, 7001, and 7005: 7002 was a master before, and its former slave on port 7005 has been promoted in its place.
The output of the cluster nodes command may look intimidating, but it is actually quite simple. The columns mean the following:
* Node ID
* IP:port
* Flags: master, slave, myself, fail, ...
* If the node is a slave, the node ID of its master
* Time of the last pending ping still waiting for a reply
* Time the last pong was received
* Configuration epoch (the last time this node's configuration was updated)
* Link state of the connection to the node
* Hash slots served
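A rough parser for one line of cluster nodes output, following the column order above (the field names are ours):

```python
def parse_node_line(line: str) -> dict:
    """Parse one line of CLUSTER NODES output into named fields.
    Columns: id, address, flags, master id, ping sent, pong received,
    config epoch, link state, then zero or more slot ranges."""
    parts = line.split()
    return {
        "id": parts[0],
        "addr": parts[1],
        "flags": parts[2].split(","),
        "master_id": parts[3],          # "-" when the node is a master
        "ping_sent": int(parts[4]),
        "pong_recv": int(parts[5]),
        "config_epoch": int(parts[6]),
        "link_state": parts[7],
        "slots": parts[8:],             # e.g. ["5960-10921"]
    }

line = ("3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 "
        "master - 0 1385503417005 0 connected 5960-10921")
info = parse_node_line(line)
print(info["flags"], info["slots"])  # ['master'] ['5960-10921']
```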
Manual failover
Sometimes a manual failover is very useful, and it causes no trouble for the master. For example, to upgrade the Redis process of one of the masters, you can first demote it to a slave with a manual failover, so that the upgrade has a minimal impact on the availability of the cluster.
The Redis cluster supports manual failovers through the CLUSTER FAILOVER command, which must be executed on one of the slaves of the master you want to fail over.
Compared with an actual master crash, a manual failover is safer: it avoids data loss, because clients are redirected from the original master to the new master only after the new master has received the complete replication stream.
Here are some logs I saw after executing the cluster failover instruction on one of the slave nodes:
# Manual failover user request accepted.
# Received replication offset for paused master manual failover: 347540
# All master replication stream processed, manual failover can start.
# Start of election delayed for 0 milliseconds (rank #0, offset 347540).
# Starting a failover election for epoch 7545.
# Failover election won: I'm the new master.
In short: clients connected to the master being failed over are paused; at the same time, the master sends its slave whatever part of the replication stream has not yet been delivered; once the slave has processed the whole stream, the failover starts, the old master is notified of the configuration change and demoted, and clients are redirected to the new master.
Add new node
Adding a new node means adding an empty node to the cluster. There are two cases: if the new node is a master, some data is moved into it from the other nodes of the cluster; if it is a slave, it is told to replicate from a known node.
We will try both, starting with adding a new master node to the cluster.
In both cases, the first step is to add an empty node to the cluster.
Given that we already started 6 nodes on ports 7000-7005, the new node will use port 7006. Add the new empty node with the same steps used for the first 6 nodes (remember to change the port number in the configuration file):
* Open a new page in the terminal
* Go to the cluster-test directory
* Create a directory named "7006"
* Create a redis.conf file in this directory, identical to the one used for the other nodes, but with the port number changed to 7006.
* Finally start it: ../redis-server ./redis.conf
The node should now be up and running.
Now, we use redis-trib to add a new node to the cluster:
./redis-trib.rb add-node 127.0.0.1:7006 127.0.0.1:7000
Use the add-node command to add a node. The first address is the address of the node to be added, and the second address is the address of any node in the cluster.
The redis-trib script simply sends a CLUSTER MEET message to the node. This could also be done manually with any client, but redis-trib checks the state of the cluster before sending it, so it is better to operate on the cluster through redis-trib.
Now we can connect to the new node and see if it has joined the cluster:
redis 127.0.0.1:7006> cluster nodes
3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 127.0.0.1:7001 master - 0 1385543178575 0 connected 5960-10921
3fc783611028b1707fd65345e763befb36454d73 127.0.0.1:7004 slave 3e3a6cb0d9a9a87168e266b0a0b24026c0aae3f0 0 1385543179583 0 connected
f093c80dde814da99c5cf72a7dd01590792b783b :0 myself,master - 0 0 0 connected
2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 slave 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543178072 3 connected
a211e242fc6b22a9427fed61285e85892fa04e08 127.0.0.1:7003 slave 97a3a64667477371c4479320d683e4c8db5858b1 0 1385543178575 0 connected
97a3a64667477371c4479320d683e4c8db5858b1 127.0.0.1:7000 master - 0 1385543179080 0 connected 0-5959 10922-11422
3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 127.0.0.1:7005 master - 0 1385543177568 3 connected 11423-16383
Although the new node is now part of the cluster and can redirect clients to the correct cluster nodes, it differs from the other masters:
* It holds no data, because no hash slots are assigned to it
* Because it is a master without hash slots, it does not take part in the election when a slave must be promoted
You can assign hash slots to the new node with redis-trib's resharding command; since resharding was described above, we won't repeat the details here.
Adding a slave node
There are two ways to add a slave node. The first is to use the redis-trib add-node command with the --slave option, like this:
./redis-trib.rb add-node --slave 127.0.0.1:7006 127.0.0.1:7000
Note that this command line is almost identical to the one used to add a master, so it does not specify which master the new slave should replicate. In this case, redis-trib picks a random master among those with the fewest slaves and makes it the master of the new node.
Of course, you can also specify the master node of the new slave node by using the following command:
./redis-trib.rb add-node --slave --master-id 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 127.0.0.1:7006 127.0.0.1:7000
With this command we can specify exactly which master the new slave should replicate.
Another method is to add the new node to the cluster as an empty master, and then turn it into a slave with the CLUSTER REPLICATE command. This also works for moving an existing slave to a different master.
For example, suppose master 127.0.0.1:7005 serves hash slots 11423-16383 and has node ID 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e, and we want to add a slave for it. First add an empty master node as before, then connect to the new node and send the following command:
redis 127.0.0.1:7006> cluster replicate 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
The new slave is now in place, and all the other nodes in the cluster already know about it (though it may take a little time for them to update their configuration). We can verify this with:
$ redis-cli -p 7000 cluster nodes | grep slave | grep 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e
f093c80dde814da99c5cf72a7dd01590792b783b 127.0.0.1:7006 slave 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543617702 3 connected
2938205e12de373867bf38f1ca29d31d0ddb3e46 127.0.0.1:7002 slave 3c3a0c74aae0b56170ccb03a76b60cfe7dc1912e 0 1385543617198 3 connected
Now node 3c3a0c... has two slaves: the node running on port 7002 and the newly added node on port 7006.
Delete node
Use the redis-trib del-node command to remove a node:
./redis-trib.rb del-node 127.0.0.1:7000 <node-id>
The first argument is any node of the cluster; the second is the ID of the node to remove.
The same command can remove a master node, but the master must be empty first: reshard all of its data to the other nodes before deleting it.
An alternative way to remove a master is to trigger a manual failover so that one of its slaves is promoted, and then delete the node once it has become a slave. Of course, this does not reduce the number of masters in the cluster; if you need fewer masters, resharding is unavoidable.
Replica migration
In a Redis cluster, you can reconfigure a slave to replicate a different master at any time with the following command:
CLUSTER REPLICATE <master-node-id>
There is, however, a special scenario in which you want the replica configuration to change automatically, without the administrator stepping in. This automatic reconfiguration of slaves is called replica migration, and it increases the robustness of a Redis cluster.
Note: you can find the details in the Redis Cluster Specification; here we only briefly introduce the feature and why it is useful.
In a cluster where every master has exactly one slave, the cluster cannot survive a master and its slave failing at the same time, because the hash slot data they held can no longer be read or written. While a network partition can isolate many nodes at once, many other kinds of failure (hardware or software faults that crash a single node) are also an important cause of node failure, and they generally do not happen simultaneously. For example, in a cluster where every master has one slave, a slave may be killed at 4 o'clock and its master at 6 o'clock. The failures are hours apart, yet the cluster still ends up unable to work.
To improve availability you could add a second slave to every master, but that is expensive. Replica migration lets you add extra slaves to only some masters. For example, with 10 masters each having 1 slave (20 nodes in total), you can add a few extra slaves (say 3) to some of the masters, so that some masters have more than one slave.
When a master is left without slaves, and some other master in the cluster has multiple slaves, the replica migration mechanism takes one of those spare slaves and makes it a replica of the orphaned master. So in the example above: when a slave dies at 4 o'clock, a spare slave migrates to replace it as that master's slave; then, when the master dies at 6 o'clock, there is still a slave ready to be promoted, and the cluster can keep running.
In short, replica migration works like this:
* The cluster finds the master node with the most slave nodes, picks one of its slaves, and performs the replica migration.
* For replica migration to take effect, you only need to add a few extra slave nodes to the cluster, and they can be attached to any master.
* Replica migration is controlled by a configuration parameter called "cluster-migration-barrier", which is described in detail in the cluster's sample configuration file.
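The selection rule above can be sketched as a toy model in Python. The function name, the data shapes, and the `barrier` parameter (standing in for cluster-migration-barrier, the minimum number of slaves a donor master must keep) are hypothetical illustrations, not Redis source code:

```python
# Toy model of the replica-migration selection rule (hypothetical, not Redis source).
# `masters` maps a master id to the list of its replica ids;
# `barrier` plays the role of cluster-migration-barrier.

def pick_migrating_replica(masters, orphan, barrier=1):
    """Pick a replica to move to `orphan`, a master with no replicas.

    Only masters that would still keep at least `barrier` replicas
    after donating one are eligible donors; among those, the master
    with the most replicas donates.
    """
    donor = None
    for master, replicas in masters.items():
        if len(replicas) - 1 >= barrier:
            if donor is None or len(replicas) > len(masters[donor]):
                donor = master
    if donor is None:
        return None  # no master can spare a replica
    return masters[donor].pop()  # this replica now replicates the orphan

masters = {"A": ["a1", "a2", "a3"], "B": ["b1"]}
print(pick_migrating_replica(masters, orphan="C"))  # a replica donated by "A"
```

In real clusters cluster-migration-barrier defaults to 1, so a master only donates a slave if it keeps at least one for itself.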
Upgrade nodes in a cluster
Upgrading a slave node is very simple: just stop it and restart it with the updated version. If a client is connected to that slave, it needs to reconnect to another available slave while the node is down.
Upgrading a master node is more complicated. Here is the recommended process:
1. Use the CLUSTER FAILOVER command to trigger a manual failover, so that the master node becomes a slave node
2. Wait until the master node has become a slave node
3. Upgrade the node, now a slave
4. If you want the upgraded node to become a master again, trigger another manual failover to make it the new master
Following this procedure, all nodes can be upgraded one by one.
Migrating to a Redis cluster
Users may need to migrate existing Redis data to a Redis cluster. The original data may live on a single master node, or it may already be sharded in some ad-hoc way, with the keys stored across N nodes.
Both situations are easy to migrate. The most important detail is whether the application operates on multiple keys at once, and how. There are 3 different cases:
1. No multiple-key operations are used (no commands, transactions, or Lua scripts that touch multiple keys); all keys are operated on independently.
2. Multiple-key operations are used (commands, transactions, or Lua scripts that touch multiple keys), but all the keys involved share the same hash tag, for example keys operated on together like: SUNION {user:1000}.foo {user:1000}.bar
3. Multiple-key operations are used (commands, transactions, or Lua scripts that touch multiple keys), and the keys are not specially arranged: they do not share the same hash tag.
The third case cannot be handled by the Redis cluster; you need to modify the application either to stop using multiple-key operations or to give the keys involved the same hash tag.
The first and second cases can be handled, and they are handled the same way.
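The hash tag rule in case 2 can be checked offline. Redis cluster assigns each key to one of 16384 slots using CRC16 (the XMODEM variant) of the key, or of the substring between the first `{` and the following `}` when a non-empty hash tag is present. A small self-contained sketch:

```python
# Compute Redis cluster hash slots, including the hash tag rule.

def crc16(data: bytes) -> int:
    """CRC-16/XMODEM (poly 0x1021, init 0), as used by Redis cluster."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    """Slot of a key: hash only the non-empty tag between the first
    '{' and the next '}', if one exists; otherwise the whole key."""
    s = key.find('{')
    if s != -1:
        e = key.find('}', s + 1)
        if e != -1 and e != s + 1:
            key = key[s + 1:e]
    return crc16(key.encode()) % 16384

# Keys sharing a hash tag land in the same slot, so SUNION on them is legal:
print(hash_slot("{user:1000}.foo") == hash_slot("{user:1000}.bar"))  # True
```

Because both keys hash only the tag `user:1000`, they are guaranteed to live on the same node, which is exactly why case 2 works on a cluster.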
Suppose your existing data is stored across N master nodes (when N = 1, there is no sharding). To migrate the data to a Redis cluster, perform the following steps:
1. Stop your clients. There is currently no way to migrate to a Redis cluster automatically while staying online; you will have to plan for yourself how to make your application support a live migration.
2. Use the BGREWRITEAOF command to make every master node generate an AOF file, and wait for these files to be fully written.
3. Save these AOF files, naming them aof-1, aof-2, ... aof-N. At this point you can stop the original Redis instances if necessary (for non-virtualized deployments where the same machines will be reused, stopping the old processes is helpful).
4. Create a Redis cluster with N master nodes and 0 slave nodes (slave nodes can be added later). Make sure all nodes have the appendonly configuration enabled.
5. Stop all nodes in the cluster, and replace the AOF file of each node with the AOF file just saved, aof-1 for the first node, aof-2 for the second node, and so on.
6. Restart all nodes. These nodes may report that, according to the configuration, some keys should not be stored on them.
7. Use the redis-trib fix command to let the cluster automatically migrate data according to hash slot ownership.
8. Use the redis-trib check command to make sure your cluster is healthy.
9. Switch your clients to a Redis cluster-aware client library and restart them.
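For step 4, a cluster with N master nodes splits the 16384 hash slots into contiguous ranges. A rough sketch of such a split (the exact boundaries a tool like redis-trib picks may differ slightly):

```python
# Sketch of splitting the 16384 hash slots into contiguous ranges
# for N master nodes, similar in spirit to what redis-trib does at
# cluster-creation time (its exact boundaries may differ).

SLOTS = 16384

def split_slots(n_masters):
    ranges = []
    per = SLOTS // n_masters
    start = 0
    for i in range(n_masters):
        # the last node absorbs any remainder slots
        end = SLOTS - 1 if i == n_masters - 1 else start + per - 1
        ranges.append((start, end))
        start = end + 1
    return ranges

print(split_slots(3))  # [(0, 5460), (5461, 10921), (10922, 16383)]
```

Whatever the exact boundaries, every one of the 16384 slots must be covered by exactly one master, which is what the redis-trib check command in step 8 verifies.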
There is also a way to import data from an existing Redis instance into a Redis cluster: the redis-trib import command. This command moves all data out of the source instance and writes it into the previously deployed cluster. Note that if your source instance runs Redis version 2.8, the import may take a long time, because version 2.8 does not implement connection caching for data migration, so it is best to upgrade the source instance to a 3.x version first.
Redis cluster knowledge analysis