Redis master-slave switchover drill
Environment
IP address     | Port | Attribute
192.168.31.208 | 6379 | Master
192.168.31.208 | 6378 | Slave
192.168.31.209 | 6379 | Master
192.168.31.209 | 6378 | Slave
192.168.31.210 | 6379 | Master
192.168.31.210 | 6378 | Slave
Environment variables
PATH=$PATH:$HOME/cpprelease/redis-3.0.2/src/:$HOME/bin
BASE_PATH=/home/beidou_soa/cpprelease/
export PATH
alias redisstart1='cd ~/redis/ && redis-server $BASE_PATH/cfg/redis/redis1.conf && cd -'
alias redisstart2='cd ~/redis/ && redis-server $BASE_PATH/cfg/redis/redis2.conf && cd -'
alias redisstop1='cd ~/redis/ && redis-cli -p 6379 shutdown && cd -'
alias redisstop2='cd ~/redis/ && redis-cli -p 6378 shutdown && cd -'
1. Check the environment
[beidou_soa@localhost ~]$ redis-trib.rb check 192.168.31.208:6379
Connecting to node 192.168.31.208:6379: OK
Connecting to node 192.168.31.210:6378: OK
Connecting to node 192.168.31.209:6378: OK
Connecting to node 192.168.31.210:6379: OK
Connecting to node 192.168.31.209:6379: OK
Connecting to node 192.168.31.208:6378: OK
>>> Performing Cluster Check (using node 192.168.31.208:6379)
M: 273aa3c0416e7d1795ce678d56bd2db148613f7e 192.168.31.208:6379
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 08f61dcd66389dae5c39e375d4f52e1defa77ec1 192.168.31.210:6378
   slots: (0 slots) slave
   replicates [...]
S: [...] 192.168.31.209:6378
   slots: (0 slots) slave
   replicates [...]
M: 40cecda23f32cb3b8ff60752c00514f2d7d9c3d0 192.168.31.210:6379
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
M: 4f11d4175178d72e0ccf7edf0ddabf835e9c56df 192.168.31.209:6379
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 4b6e2b13b1be1a081db2153dc4beaf1_b489605 192.168.31.208:6378
   slots: (0 slots) slave
   replicates 273aa3c0416e7d1795ce678d56bd2db148613f7e
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
[beidou_soa@localhost ~]$ redis-cli -c -p 6379
127.0.0.1:6379> get testkey001
(nil)
127.0.0.1:6379> set testkey002 testvalue002
-> Redirected to slot [401] located at 192.168.31.210:6379
OK
192.168.31.210:6379> get testkey002
"testvalue002"
192.168.31.210:6379> set testkey003 testvalue003
OK
2. Shut down the master on 208
[beidou_soa@localhost ~]$ redisstop1
3. Check the cluster status
View the slave log:
[beidou_soa@localhost redis]$ vim redis-6378.log
6874:S 30 Dec 15:28:04.755 # Error condition on socket for SYNC: Connection refused
6874:S 30 Dec 15:28:05.758 * Connecting to MASTER 192.168.31.208:6379
6874:S 30 Dec 15:28:05.758 * MASTER <-> SLAVE sync started
6874:S 30 Dec 15:28:05.759 # Error condition on socket for SYNC: Connection refused
6874:S 30 Dec 15:28:06.647 * [...]
6874:S 30 Dec 15:28:06.647 # Cluster state changed: fail
6874:S 30 Dec 15:28:06.662 # Start of election delayed for 842 milliseconds (rank #0, offset 105547).
6874:S 30 Dec 15:28:06.762 * Connecting to MASTER 192.168.31.208:6379
6874:S 30 Dec 15:28:06.762 * MASTER <-> SLAVE sync started
6874:S 30 Dec 15:28:06.763 # Error condition on socket for SYNC: Connection refused
6874:S 30 Dec 15:28:07.565 # Starting a failover election for epoch 4.
6874:S 30 Dec 15:28:07.567 # Failover election won: I'm the new master.
6874:S 30 Dec 15:28:07.567 # [...]
6874:M 30 Dec 15:28:07.567 * Discarding previously cached master state.
6874:M 30 Dec 15:28:07.567 # Cluster state changed: ok
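The timestamps in this log show how long the slave considered the cluster down. The following is a minimal Python sketch of that arithmetic, not part of the original drill: the five log lines are copied from the excerpt above, the function names are hypothetical, and it assumes that both events fall on the same day.

```python
# Lines copied from the slave log of this drill (redis-6378.log).
LOG_LINES = [
    "6874:S 30 Dec 15:28:06.647 # Cluster state changed: fail",
    "6874:S 30 Dec 15:28:06.662 # Start of election delayed for 842 milliseconds (rank #0, offset 105547).",
    "6874:S 30 Dec 15:28:07.565 # Starting a failover election for epoch 4.",
    "6874:S 30 Dec 15:28:07.567 # Failover election won: I'm the new master.",
    "6874:M 30 Dec 15:28:07.567 # Cluster state changed: ok",
]


def seconds_of_day(line):
    """Convert the HH:MM:SS.mmm field of a log line to seconds since midnight."""
    clock = line.split()[3]  # fields: pid:role, day, month, time, level, message
    hours, minutes, seconds = clock.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def state_change_time(lines, state):
    """Return the time of the first 'Cluster state changed: <state>' line."""
    marker = "cluster state changed: " + state
    for line in lines:
        if line.lower().endswith(marker):
            return seconds_of_day(line)
    raise ValueError("no state change to " + state + " found")


def failover_seconds(lines):
    """Seconds between the cluster being marked fail and being marked ok again."""
    return state_change_time(lines, "ok") - state_change_time(lines, "fail")


if __name__ == "__main__":
    print(f"cluster was down for {failover_seconds(LOG_LINES):.3f} s")
```

For this log the gap between the two state changes is a little under one second, most of which is the 842 ms election delay.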
View the cluster status:
[beidou_soa@localhost ~]$ redis-trib.rb check 192.168.31.208:6378
Connecting to node 192.168.31.208:6378: OK
Connecting to node 192.168.31.209:6378: OK
Connecting to node 192.168.31.209:6379: OK
Connecting to node 192.168.31.210:6378: OK
Connecting to node 192.168.31.210:6379: OK
>>> Performing Cluster Check (using node 192.168.31.208:6378)
M: 4b6e2b13b1be1a081db2153dc4beaf1_b489605 192.168.31.208:6378
   slots:10923-16383 (5461 slots) master
   0 additional replica(s)
S: [...] 192.168.31.209:6378
   slots: (0 slots) slave
   replicates [...]
M: 4f11d4175178d72e0ccf7edf0ddabf835e9c56df 192.168.31.209:6379
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 08f61dcd66389dae5c39e375d4f52e1defa77ec1 192.168.31.210:6378
   slots: (0 slots) slave
   replicates [...]
M: 40cecda23f32cb3b8ff60752c00514f2d7d9c3d0 192.168.31.210:6379
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
4. Shut down the slave
[beidou_soa@localhost ~]$ redisstop1
5. Check the cluster status
[beidou_soa@localhost ~]$ redis-trib.rb check 192.168.31.209:6379
Connecting to node 192.168.31.209:6379: OK
Connecting to node 192.168.31.210:6379: OK
Connecting to node 192.168.31.210:6378: OK
Connecting to node 192.168.31.209:6378: OK
Connecting to node 192.168.31.208:6378: OK
>>> Performing Cluster Check (using node 192.168.31.209:6379)
M: 4f11d4175178d72e0ccf7edf0ddabf835e9c56df 192.168.31.209:6379
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: 40cecda23f32cb3b8ff60752c00514f2d7d9c3d0 192.168.31.210:6379
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 08f61dcd66389dae5c39e375d4f52e1defa77ec1 192.168.31.210:6378
   slots: (0 slots) slave
   replicates [...]
S: [...] 192.168.31.209:6378
   slots: (0 slots) slave
   replicates [...]
M: 4b6e2b13b1be1a081db2153dc4beaf1_b489605 192.168.31.208:6378
   slots:10923-16383 (5461 slots) master
   0 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
6. Shut down the three slaves
[beidou_soa@localhost ~]$ redisstop2
[beidou_soa@localhost ~]$ redisstop2
[beidou_soa@localhost ~]$ redisstop2
7. Check the cluster status
[beidou_soa@localhost ~]$ redis-trib.rb check 192.168.31.209:6379
Connecting to node 192.168.31.209:6379: OK
Connecting to node 192.168.31.210:6379: OK
Connecting to node 192.168.31.208:6378: OK
>>> Performing Cluster Check (using node 192.168.31.209:6379)
M: 4f11d4175178d72e0ccf7edf0ddabf835e9c56df 192.168.31.209:6379
   slots:5461-10922 (5462 slots) master
   0 additional replica(s)
M: 40cecda23f32cb3b8ff60752c00514f2d7d9c3d0 192.168.31.210:6379
   slots:0-5460 (5461 slots) master
   0 additional replica(s)
M: 4b6e2b13b1be1a081db2153dc4beaf1_b489605 192.168.31.208:6378
   slots:10923-16383 (5461 slots) master
   0 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
8. Shut down a master while its slave is down
[beidou_soa@localhost ~]$ redisstop1
9. Check the cluster status
[beidou_soa@localhost ~]$ redis-trib.rb check 192.168.31.209:6379
Connecting to node 192.168.31.209:6379: OK
Connecting to node 192.168.31.210:6379: OK
>>> Performing Cluster Check (using node 192.168.31.209:6379)
M: 4f11d4175178d72e0ccf7edf0ddabf835e9c56df 192.168.31.209:6379
   slots:5461-10922 (5462 slots) master
   0 additional replica(s)
M: 40cecda23f32cb3b8ff60752c00514f2d7d9c3d0 192.168.31.210:6379
   slots:0-5460 (5461 slots) master
   0 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[ERR] Not all 16384 slots are covered by nodes.
The cluster enters the fail state and is unavailable.
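The coverage error follows directly from the slot ranges in the check output. As a rough illustration, this Python sketch finds the slots that no surviving master serves; the ranges are copied from this drill, and the function name `uncovered_ranges` is hypothetical.

```python
TOTAL_SLOTS = 16384  # Redis Cluster always has slots 0..16383


def uncovered_ranges(ranges):
    """Return the (first, last) slot ranges that no master in `ranges` serves."""
    covered = [False] * TOTAL_SLOTS
    for first, last in ranges:
        for slot in range(first, last + 1):
            covered[slot] = True

    gaps = []
    start = None  # first slot of the gap currently being scanned, if any
    for slot in range(TOTAL_SLOTS):
        if not covered[slot] and start is None:
            start = slot
        elif covered[slot] and start is not None:
            gaps.append((start, slot - 1))
            start = None
    if start is not None:
        gaps.append((start, TOTAL_SLOTS - 1))
    return gaps


if __name__ == "__main__":
    # Masters still reachable after the master owning 10923-16383 was stopped.
    remaining = [(5461, 10922), (0, 5460)]
    print(uncovered_ranges(remaining))
```

With only the 209 and 210 masters left, the single gap is the range that 192.168.31.208 used to serve, which is why the check reports that not all slots are covered.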
10. Start a slave while the cluster is down
[beidou_soa@localhost ~]$ redisstart2
11. Check the cluster status
[beidou_soa@localhost ~]$ redis-trib.rb check 192.168.31.209:6379
Connecting to node 192.168.31.209:6379: OK
Connecting to node 192.168.31.210:6379: OK
Connecting to node 192.168.31.208:6379: OK
*** WARNING: 192.168.31.208:[...]
>>> Performing Cluster Check (using node 192.168.31.209:6379)
M: 4f11d4175178d72e0ccf7edf0ddabf835e9c56df 192.168.31.209:6379
   slots:5461-10922 (5462 slots) master
   0 additional replica(s)
M: 40cecda23f32cb3b8ff60752c00514f2d7d9c3d0 192.168.31.210:6379
   slots:0-5460 (5461 slots) master
   0 additional replica(s)
S: 273aa3c0416e7d1795ce678d56bd2db148613f7e 192.168.31.208:6379
   slots: (0 slots) slave
   replicates 4b6e2b13b1be1a081db2153dc4beaf1_b489605
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[ERR] Not all 16384 slots are covered by nodes.
The cluster is still in the fail state. A new master cannot be elected automatically, so the cluster remains unavailable.
12. Start the master and check the cluster status
[beidou_soa@localhost ~]$ redis-trib.rb check 192.168.31.209:6379
Connecting to node 192.168.31.209:6379: OK
Connecting to node 192.168.31.210:6379: OK
Connecting to node 192.168.31.208:6379: OK
Connecting to node 192.168.31.208:6378: OK
>>> Performing Cluster Check (using node 192.168.31.209:6379)
M: 4f11d4175178d72e0ccf7edf0ddabf835e9c56df 192.168.31.209:6379
   slots:5461-10922 (5462 slots) master
   0 additional replica(s)
M: 40cecda23f32cb3b8ff60752c00514f2d7d9c3d0 192.168.31.210:6379
   slots:0-5460 (5461 slots) master
   0 additional replica(s)
S: 273aa3c0416e7d1795ce678d56bd2db148613f7e 192.168.31.208:6379
   slots: (0 slots) slave
   replicates 4b6e2b13b1be1a081db2153dc4beaf1_b489605
M: 4b6e2b13b1be1a081db2153dc4beaf1_b489605 192.168.31.208:6378
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
13. Shut down a master
[beidou_soa@localhost ~]$ redisstop1
14. Check the cluster status
[beidou_soa@localhost ~]$ redis-trib.rb check 192.168.31.209:6379
Connecting to node 192.168.31.209:6379: OK
Connecting to node 192.168.31.210:6379: OK
Connecting to node 192.168.31.208:6379: OK
*** WARNING: 192.168.31.208:[...]
>>> Performing Cluster Check (using node 192.168.31.209:6379)
M: 4f11d4175178d72e0ccf7edf0ddabf835e9c56df 192.168.31.209:6379
   slots:5461-10922 (5462 slots) master
   0 additional replica(s)
M: 40cecda23f32cb3b8ff60752c00514f2d7d9c3d0 192.168.31.210:6379
   slots:0-5460 (5461 slots) master
   0 additional replica(s)
S: 273aa3c0416e7d1795ce678d56bd2db148613f7e 192.168.31.208:6379
   slots: (0 slots) slave
   replicates 4b6e2b13b1be1a081db2153dc4beaf1_b489605
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[ERR] Not all 16384 slots are covered by nodes.
One minute later:
[beidou_soa@localhost ~]$ redis-trib.rb check 192.168.31.209:6379
Connecting to node 192.168.31.209:6379: OK
Connecting to node 192.168.31.210:6379: OK
Connecting to node 192.168.31.208:6379: OK
>>> Performing Cluster Check (using node 192.168.31.209:6379)
M: 4f11d4175178d72e0ccf7edf0ddabf835e9c56df 192.168.31.209:6379
   slots:5461-10922 (5462 slots) master
   0 additional replica(s)
M: 40cecda23f32cb3b8ff60752c00514f2d7d9c3d0 192.168.31.210:6379
   slots:0-5460 (5461 slots) master
   0 additional replica(s)
M: 273aa3c0416e7d1795ce678d56bd2db148613f7e 192.168.31.208:6379
   slots:10923-16383 (5461 slots) master
   0 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
The election is complete and the Redis cluster has recovered.
Redis Architecture
1) Architecture details:
(1) All Redis nodes are interconnected (PING-PONG mechanism) and use a binary protocol internally to optimize transmission speed and bandwidth.
(2) A node is marked as failed only when more than half of the nodes in the cluster detect that it has failed.
(3) Clients connect directly to Redis nodes, with no intermediate proxy layer. A client does not need to connect to every node in the cluster; connecting to any one available node is enough.
(4) redis-cluster maps all physical nodes onto the slots [0-16383], and the cluster maintains the node <-> slot <-> value mapping.
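To make the node <-> slot <-> value mapping concrete, here is a minimal Python sketch. It relies on the Redis Cluster rule that a key's slot is CRC16(key) mod 16384 (the XMODEM variant of CRC16, with optional hash tags in braces), which the text above does not spell out. The `SLOT_OWNERS` table is copied from the initial check output of this drill, and the function names are hypothetical.

```python
import binascii

# Slot ranges reported by the initial redis-trib.rb check in this drill.
SLOT_OWNERS = [
    (0, 5460, "192.168.31.210:6379"),
    (5461, 10922, "192.168.31.209:6379"),
    (10923, 16383, "192.168.31.208:6379"),
]


def key_slot(key):
    """Return the cluster slot for a key: CRC16 (XMODEM) mod 16384."""
    data = key.encode()
    # Hash tag rule: if the key contains a non-empty {...}, hash only that part.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with initial value 0 computes the CRC16-XMODEM variant.
    return binascii.crc_hqx(data, 0) % 16384


def slot_owner(slot):
    """Look up which master serves a slot according to SLOT_OWNERS."""
    for first, last, node in SLOT_OWNERS:
        if first <= slot <= last:
            return node
    raise LookupError("slot %d is not covered" % slot)


if __name__ == "__main__":
    for key in ("testkey001", "testkey002", "testkey003"):
        slot = key_slot(key)
        print(key, slot, slot_owner(slot))
```

A client started with `redis-cli -c` follows the same lookup implicitly: when the node it is talking to does not own the key's slot, the node answers with a redirection and the client reconnects to the owner, as in the "Redirected to slot [401]" line of step 1.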
2) redis-cluster election: fault tolerance
(1) The election involves all masters in the cluster. If more than half of the masters fail to communicate with a given master for longer than cluster-node-timeout, that master is considered down.
(2) When does the entire cluster become unavailable (cluster_state:fail)? While the cluster is unavailable, every operation on it fails with the error (error) CLUSTERDOWN The cluster is down.
A: If any master in the cluster goes down and it has no slave, the cluster enters the fail state. Put another way, the cluster enters the fail state whenever the slot mapping [0-16383] is incomplete.
B: If more than half of the masters in the cluster go down, the cluster enters the fail state regardless of whether they have slaves.
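Rules A and B can be written as a small decision function. The sketch below is only an illustration of the two rules as stated here, not how Redis implements them: the `Master` class and its fields are hypothetical, and the result is the state the cluster settles into once any possible failover has finished (the log in step 3 shows a brief fail state before the slave is promoted).

```python
from dataclasses import dataclass


@dataclass
class Master:
    name: str
    alive: bool
    live_slaves: int  # slaves of this master that are still running


def cluster_state(masters):
    """Apply rules A and B and return 'ok' or 'fail'."""
    down = [m for m in masters if not m.alive]
    # Rule B: more than half of the masters are down, with or without slaves.
    if len(down) * 2 > len(masters):
        return "fail"
    # Rule A: a down master with no slave to promote leaves its slots uncovered.
    if any(m.live_slaves == 0 for m in down):
        return "fail"
    # Otherwise a slave can take over and every slot stays covered.
    return "ok"


if __name__ == "__main__":
    # Step 2 of the drill: one master stopped while its slave is running.
    step2 = [
        Master("192.168.31.208:6379", alive=False, live_slaves=1),
        Master("192.168.31.209:6379", alive=True, live_slaves=1),
        Master("192.168.31.210:6379", alive=True, live_slaves=1),
    ]
    # Step 8 of the drill: all slaves stopped, then one master stopped.
    step8 = [
        Master("192.168.31.208:6378", alive=False, live_slaves=0),
        Master("192.168.31.209:6379", alive=True, live_slaves=0),
        Master("192.168.31.210:6379", alive=True, live_slaves=0),
    ]
    print(cluster_state(step2), cluster_state(step8))
```

The two scenarios mirror steps 2 and 8 of the drill: the first recovers through an election, while the second leaves slots 10923-16383 without an owner.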