Building a Redis Sentinel cluster and testing it with Jedis [III]

Source: Internet
Author: User
Tags failover redis cluster redis server



Parts [I] and [II] of this series briefly described how to build a Sentinel cluster in Redis and how to test it from Java code with Jedis.

This article briefly analyzes how a Redis-sentinel cluster works. It follows the information that Sentinel prints during cluster testing and uses it to explain the underlying principles in detail, including the Sentinel output in a master-slave setup, the failover process, and the leader election that follows a master outage.


One, common Redis HA schemes


HA (High Availability) cluster is short for a highly available cluster of machines, an effective way to ensure business continuity. It generally has two or more nodes, divided into active nodes and standby nodes. The node that is currently running the business is usually called the active node, and the node that acts as its backup is called the standby node. When the active node has a problem that prevents the running business (task) from functioning properly, the standby node detects it and immediately takes over the business from the active node. This keeps the business uninterrupted, or interrupted only briefly. (From Baidu Encyclopedia.)
Redis is typically deployed in master/slave mode (in the setup discussed here, the slave instances are mainly used for backup and the master instance serves reads and writes). There are several ways to implement HA on top of this:
1) keepalived: A keepalived virtual IP gives the master and slave a single access address. When the master fails, keepalived runs a script that promotes the slave to master; after the old master recovers, it first synchronizes automatically. The benefit of this scheme is that the application does not need to know about the master-slave switch (the virtual IP it accesses stays the same). The downside is that keepalived adds deployment complexity and can in some cases lead to data loss;
2) Zookeeper: Zookeeper monitors the master and slave instances and maintains the latest valid IP; the application obtains the IP from Zookeeper and then accesses Redis. This scheme requires writing a large amount of monitoring code;
3) Sentinel: Sentinel monitors the master and slave instances and fails over automatically. The scheme has one flaw: because the master and slave instances have different addresses (IP and port), the application cannot know the new address after a master-slave switch. For this reason Jedis 2.2.2 added support for Sentinel: Jedis instances obtained through redis.clients.jedis.JedisSentinelPool.getResource() are switched to the new master address in a timely manner.
The following is the deployment logic diagram for this scenario.
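The client-side lookup behind scheme 3 can be sketched roughly as follows. This is a minimal Python illustration, not Jedis code: a client walks a list of Sentinel addresses and takes the master address from the first Sentinel that answers. The function `discover_master`, the `ask` hook, and the in-memory `fake_ask` stub are hypothetical names introduced for illustration; in a real client, `ask` would send the Sentinel command SENTINEL get-master-addr-by-name. Port 26379 is an assumed Sentinel port.

```python
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def discover_master(
    sentinels: Iterable[Address],
    ask: Callable[[Address, str], Optional[Address]],
    master_name: str,
) -> Address:
    """Return the master address reported by the first Sentinel that answers.

    `ask` is a hypothetical transport hook. A real one would issue
    SENTINEL get-master-addr-by-name <master_name> against one Sentinel
    and return (ip, port), return None if the name is unknown, or raise
    OSError if that Sentinel is unreachable.
    """
    for sentinel in sentinels:
        try:
            addr = ask(sentinel, master_name)
        except OSError:
            continue  # this Sentinel is down; try the next one
        if addr is not None:
            return addr
    raise RuntimeError(f"no Sentinel could report a master for {master_name!r}")


# In-memory stand-in for the Sentinels (illustrative data only).
_KNOWN = {
    ("127.0.0.1", 26379): None,                   # answers, but knows no master
    ("127.0.0.1", 16379): ("127.0.0.1", 63792),   # reports the promoted master
}


def fake_ask(sentinel: Address, name: str) -> Optional[Address]:
    if sentinel not in _KNOWN:
        raise ConnectionRefusedError(sentinel)  # a subclass of OSError
    return _KNOWN[sentinel]


SENTINELS = [("127.0.0.1", 36379), ("127.0.0.1", 26379), ("127.0.0.1", 16379)]
print(discover_master(SENTINELS, fake_ask, "mymaster"))
```

Trying the Sentinels in order means the client keeps working as long as at least one Sentinel is reachable and knows the master.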


Two, Redis-sentinel configuration and deployment


For details, see part [I] of this series.
Because the Sentinel program (redis-sentinel) already ships with the Redis server environment, we use it directly. (In production, Sentinel and redis-server generally run on separate machines, and the number of Sentinels is not tied to the number of redis-server instances. For convenience of learning, the 3 Sentinel services below are all placed on one PC.)
  Note: The Sentinel configuration file only refers to the master; it does not mention the slaves. Our example uses 3 redis-server instances and 3 redis-sentinel instances, but the number of redis-sentinel instances does not have to match the number of redis-server instances; anything from 1 to n works.
First of all, Sentinel is a process independent of redis-server and does not provide key/value services. In the Redis installation directory its executable is named redis-sentinel. It is mainly used to monitor redis-server processes and manage master/slave roles; if your Redis is not running in master/slave mode, you do not need Sentinel.
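To make the note about the configuration file concrete, here is a small Python sketch that renders a minimal Sentinel configuration for each of the three Sentinels. The helper name `render_sentinel_conf` is hypothetical. The directives themselves (port, sentinel monitor, sentinel down-after-milliseconds, sentinel auth-pass) are standard Sentinel options, while the values are assumptions pieced together from this article: master 127.0.0.1:6379, quorum 2, 30 seconds, Sentinel ports 16379 and 36379 plus an assumed 26379. The actual files from part [I] may differ.

```python
def render_sentinel_conf(port, master_name, master_ip, master_port, quorum,
                         down_after_ms=30000, auth_pass=None):
    """Build the text of a minimal sentinel.conf.

    Only the master is named here; Sentinel discovers the slaves itself.
    """
    lines = [
        f"port {port}",
        f"sentinel monitor {master_name} {master_ip} {master_port} {quorum}",
        f"sentinel down-after-milliseconds {master_name} {down_after_ms}",
    ]
    if auth_pass is not None:
        # Needed when the monitored redis-server requires a password.
        lines.append(f"sentinel auth-pass {master_name} {auth_pass}")
    return "\n".join(lines) + "\n"


# One configuration per Sentinel; only the listening port differs.
for sentinel_port in (16379, 26379, 36379):
    print(f"# sentinel on port {sentinel_port}")
    print(render_sentinel_conf(sentinel_port, "mymaster", "127.0.0.1", 6379,
                               quorum=2, auth_pass="abcd123457"))
```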


Three, Redis-sentinel start-up and detection


Start Redis-server and Redis-sentinel respectively.
  
1) Sentinel console for redis-0 when starting redis-0


2) Sentinel console for redis-1 when starting redis-1


Sentinel console for redis-0 when starting redis-1


3) Sentinel console for redis-2 when starting redis-2


Sentinel console for redis-0 when starting redis-2


Sentinel console for redis-1 when starting redis-2



From the console output we can see that, as the services are started in order, the consoles print +sentinel, +slave, -sdown and other events. This shows that the Sentinels are communicating with each other and are also monitoring the redis-server instances. +sentinel indicates that a new Sentinel instance has joined the monitoring. Tip: When you first build your Sentinel environment, you must start the master machine first.
  
4) View related information
Use the command "netstat -ntlp | grep redis" to see which Redis processes are currently running.


View the status of redis-server through redis-cli:


/usr/local/webserver/redis/redis-cli -h 127.0.0.1 -p 6379 -a abcd123457 info replication


(The -a option supplies the password.)



Description: the INFO command
This command prints complete information about the server, including the cluster. We only need to pay attention to the "Replication" section (in the command above we added "replication" after "info"; without it, the output would also include sections such as Server, Clients, Memory, Persistence, Stats, CPU, and Keyspace). This section tells us the role of the current server and lists all the slaves that point to it. Running "INFO" on any slave shows the master that this slave points to. The following is the Replication information for slave1.


This command not only helps us inspect the state of the cluster; the Sentinel component also uses "INFO" to do the same thing. Below is the Sentinel information for slave2.
   
  
Through the above information, we can clearly see the Redis service status and the master-slave relationship.
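As a rough illustration of how the "Replication" section can be read programmatically, the Python sketch below parses INFO-style text into a dictionary. The function name `parse_info_replication` is hypothetical and the sample text is made-up data, not output captured from the cluster in this article. It assumes the key=value slave line format used by Redis 2.8 and later.

```python
def parse_info_replication(text):
    """Parse the '# Replication' section of INFO output into a dict.

    Lines such as 'slave0:ip=...,port=...,state=...' are collected into
    info['slaves']; all other 'key:value' lines are stored as strings.
    """
    info = {}
    slaves = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, _, value = line.partition(":")
        if key.startswith("slave") and key[5:].isdigit():
            slaves.append(dict(item.split("=", 1) for item in value.split(",")))
        else:
            info[key] = value
    info["slaves"] = slaves
    return info


# Made-up sample resembling what a master reports.
SAMPLE_MASTER = """\
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=63791,state=online,offset=1234,lag=0
slave1:ip=127.0.0.1,port=63792,state=online,offset=1234,lag=1
"""

parsed = parse_info_replication(SAMPLE_MASTER)
print(parsed["role"], [s["port"] for s in parsed["slaves"]])
```

On a slave, the same function would expose fields such as master_host, master_port, and master_link_status as plain dictionary keys.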

5) Failover Test
Once the deployment above is stable, we shut down redis-0 directly. After the "down-after-milliseconds" interval (30 seconds) has passed, the Sentinel windows of redis-0/redis-1/redis-2 print "+sdown", "+odown", "+failover", "+selected-slave", "+promoted-slave", "+slave-reconf" and a series of other events, which show the failover process that the Sentinel components carry out when the master fails.
Simulate a master outage. At this point, each Sentinel console outputs the following information.
  
Sentinel information on redis-0



Sentinel information on redis-1


Sentinel information on redis-2



As you can see from the three windows above, the three Sentinels carried out a failover of the master after the master outage.


From the Sentinel console of redis-1, you can see that the following operations were performed.
a. +sdown master mymaster 127.0.0.1 6379 (this Sentinel subjectively considers the master to have failed);
b. +odown master mymaster 127.0.0.1 6379 #quorum 2/2 (two Sentinels now consider the master subjectively down, so the master is considered objectively down);
c. +new-epoch 1 (the epoch is incremented in preparation for selecting a new master);
d. +try-failover master mymaster 127.0.0.1 6379 (a failover attempt starts for the failed master);
e. +vote-for-leader 1eb0f03b7a7815c3c5506b0fa041ad8d6ca9db90 1 (this Sentinel votes for a leader);
f. 127.0.0.1:16379 voted for 1eb0f03b7a7815c3c5506b0fa041ad8d6ca9db90 1 (the Sentinel on port 16379 votes for the same leader);
g. 127.0.0.1:36379 voted for 1eb0f03b7a7815c3c5506b0fa041ad8d6ca9db90 1 (the Sentinel on port 36379 votes for the same leader);
h. +elected-leader master mymaster 127.0.0.1 6379 (this Sentinel has been elected leader for the failover of this master);
i. +failover-state-select-slave master mymaster 127.0.0.1 6379 (the leader starts looking for a suitable slave);
j. +selected-slave slave 127.0.0.1:63792 127.0.0.1 63792 @ mymaster 127.0.0.1 6379 (the leader has found a suitable slave);
k. +failover-state-send-slaveof-noone slave 127.0.0.1:63792 127.0.0.1 63792 @ mymaster 127.0.0.1 6379 (the leader sends a "slaveof no one" command to the slave; with this the slave completes its role transition and becomes the master);
l. +failover-state-wait-promotion slave 127.0.0.1:63792 127.0.0.1 63792 @ mymaster 127.0.0.1 6379 (wait for the promotion of the slave to be confirmed);
m. +promoted-slave slave 127.0.0.1:63792 127.0.0.1 63792 @ mymaster 127.0.0.1 6379 (confirmed successfully);
n. +failover-state-reconf-slaves master mymaster 127.0.0.1 6379 (begin reconfiguring the slaves);
o. +slave-reconf-sent slave 127.0.0.1:63791 127.0.0.1 63791 @ mymaster 127.0.0.1 6379 (a "slaveof" command is sent to the specified slave, telling it to follow the new master);
p. +slave-reconf-inprog slave 127.0.0.1:63791 127.0.0.1 63791 @ mymaster 127.0.0.1 6379 (this slave is executing the slaveof + SYNC process; a slave performs the slaveof operation after the command announced by "+slave-reconf-sent" reaches it);
q. +slave-reconf-done slave 127.0.0.1:63791 127.0.0.1 63791 @ mymaster 127.0.0.1 6379 (this slave has finished synchronizing; the leader can then continue with the reconfiguration of the next slave);
r. +failover-end master mymaster 127.0.0.1 6379 (end of failover);
s. +switch-master mymaster 127.0.0.1 6379 127.0.0.1 63792 (after the failover succeeds, each Sentinel instance starts to monitor the new master);
t. +slave slave 127.0.0.1:63791 127.0.0.1 63791 @ mymaster 127.0.0.1 63792 (this and the following steps add slaves to the new master);
u. +slave slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 63792;
v. +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 63792.
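The event lines listed above follow a regular layout, which a monitoring script could exploit. The Python sketch below is a minimal, hypothetical parser (`parse_event` is not part of any library). It assumes events are written the way Sentinel prints them, with no space after the sign (for example "+sdown master mymaster 127.0.0.1 6379"). It handles the common "type name ip port @ master-name master-ip master-port" layout and the special "+switch-master" layout, and returns anything else as raw arguments.

```python
def parse_event(line):
    """Split one Sentinel event line into a small dict."""
    parts = line.split()
    if not parts:
        raise ValueError("empty event line")
    event, args = parts[0], parts[1:]

    # +switch-master <master-name> <old-ip> <old-port> <new-ip> <new-port>
    if event == "+switch-master" and len(args) == 5:
        name, old_ip, old_port, new_ip, new_port = args
        return {"event": event, "master": name,
                "old": (old_ip, int(old_port)), "new": (new_ip, int(new_port))}

    # <event> <type> <name> <ip> <port> [@ <master-name> <master-ip> <master-port>]
    if len(args) >= 4 and args[0] in ("master", "slave", "sentinel"):
        parsed = {"event": event, "type": args[0], "name": args[1],
                  "addr": (args[2], int(args[3]))}
        if "@" in args:
            at = args.index("@")
            if len(args) >= at + 4:
                parsed["master"] = args[at + 1]
                parsed["master_addr"] = (args[at + 2], int(args[at + 3]))
        return parsed

    # Other events (+new-epoch, +vote-for-leader, ...) carry free-form arguments.
    return {"event": event, "args": args}


for sample in (
    "+odown master mymaster 127.0.0.1 6379 #quorum 2/2",
    "+switch-master mymaster 127.0.0.1 6379 127.0.0.1 63792",
    "+slave slave 127.0.0.1:63791 127.0.0.1 63791 @ mymaster 127.0.0.1 63792",
):
    print(parse_event(sample))
```

A client that only cares about where the master lives needs to watch just the "+switch-master" event, since it carries both the old and the new address.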


Once the environment was stable again, we found that redis-2 had been promoted ("promoted") to master, and that redis-1 followed redis-2 after going through the "slave-reconf" process.
If the redis-0 server returns to normal at this point, Sentinel automatically adds redis-0 as a slave of redis-2.
The following is the Sentinel console output on redis-0.


View the info of redis-2 (the current master) again



  Tip: All Sentinel instances need to be up and running. If you start a server without starting the corresponding Sentinel, there is still no guarantee that the server is properly monitored and managed.


Four, Redis-sentinel principle analysis


1) Sdown and Odown conversion process
Terminology
Sdown: Subjectively down. The current Sentinel instance considers a Redis service to be "unavailable".
Odown: Objectively down. Multiple Sentinel instances consider the master to be in the "Sdown" state, so the master is marked Odown. Odown can simply be understood as the cluster having agreed that the master is "unavailable", after which failover begins.
(Sdown applies to both master and slaves, but Odown applies only to the master. When a slave has been unreachable for longer than "down-after-milliseconds", every Sentinel instance marks it as "Sdown".)



Conversion procedure
A. After each Sentinel instance starts, it establishes TCP connections with the known slaves, the master, and the other Sentinels, and sends PING periodically (every 1 second by default);
B. If a redis-server does not respond, or responds with an error, within the "down-after-milliseconds" time, it is considered to be in the Sdown state;
C. If the Sdown server in B is the master, the Sentinel instance then sends the "is-master-down-by-addr" command to the other Sentinels at intervals (every second) and collects the responses. If enough Sentinel instances report that the master is in Sdown, the current Sentinel instance marks the master as Odown; the other Sentinel instances do the same. "Enough" is defined by the quorum in the "sentinel monitor" configuration item: once the number of Sentinels that see the master in the Sdown state reaches this value, the Sentinel instance considers the master to be in Odown;
D. Each Sentinel instance sends an "INFO" command to the master and slaves at intervals (every 10 seconds). If the master has failed and no new master has been selected yet, "INFO" is sent every 1 second instead. The main purpose of "INFO" is to obtain and confirm the liveness of the slaves and master in the current cluster environment;
After the above procedure, the Sentinels have agreed that the master has failed, and failover starts.
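The Sdown to Odown conversion described in steps A to C can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions: each Sentinel is reduced to the timestamp of its last valid PING reply from the master, and the views are compared directly, whereas real Sentinels exchange "is-master-down-by-addr" replies. `SentinelView` and `is_odown` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SentinelView:
    """What one Sentinel knows about the master (simplified)."""
    name: str
    last_valid_reply_ms: int  # time of the last acceptable PING reply

    def is_sdown(self, now_ms, down_after_ms):
        # Subjectively down: no valid reply for longer than down-after-milliseconds.
        return now_ms - self.last_valid_reply_ms > down_after_ms


def is_odown(views, now_ms, down_after_ms, quorum):
    """Objectively down: at least `quorum` Sentinels see the master as Sdown."""
    agreeing = sum(1 for view in views if view.is_sdown(now_ms, down_after_ms))
    return agreeing >= quorum


views = [
    SentinelView("sentinel-a", last_valid_reply_ms=0),
    SentinelView("sentinel-b", last_valid_reply_ms=5_000),
    SentinelView("sentinel-c", last_valid_reply_ms=29_000),
]

# With a 30 second threshold and quorum 2, one Sdown report is not enough.
print(is_odown(views, now_ms=31_000, down_after_ms=30_000, quorum=2))
print(is_odown(views, now_ms=36_000, down_after_ms=30_000, quorum=2))
```

The two calls differ only in the clock: at 31 seconds a single Sentinel has crossed the threshold, and at 36 seconds two have, which reaches the quorum.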



2) Sentinel and slaves "autodiscovery" mechanism
In Sentinel's configuration file (local-sentinel.conf), a port is specified. This is the port on which the Sentinel instance listens for connections from other Sentinel instances. Once the cluster is stable, a TCP connection exists between every pair of Sentinel instances. It carries "PING" and commands such as "is-master-down-by-addr", which are used to check that the other Sentinel instances are alive and to exchange information during the "Odown" and "failover" processes.
Before connections between Sentinels are established, each Sentinel first tries to connect to the master specified in its configuration file. Communication between Sentinel and the master is primarily based on Pub/Sub: Sentinel publishes and receives information through the master, including the listening port of the current Sentinel instance.
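The Pub/Sub announcement mentioned above can be illustrated with a small Python parser. This sketch rests on assumptions that are not stated in this article and should be checked against the Redis version in use: that Sentinels publish on the channel `__sentinel__:hello`, and that the payload is eight comma-separated fields (Sentinel ip, port, runid, current epoch, master name, master ip, master port, master configuration epoch). `Hello`, `parse_hello`, and `remember` are hypothetical names.

```python
from typing import Dict, NamedTuple

HELLO_CHANNEL = "__sentinel__:hello"  # assumed channel name


class Hello(NamedTuple):
    ip: str
    port: int
    runid: str
    current_epoch: int
    master_name: str
    master_ip: str
    master_port: int
    master_config_epoch: int


def parse_hello(payload: str) -> Hello:
    """Parse one hello payload (assumed 8 comma-separated fields)."""
    fields = payload.split(",")
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    ip, port, runid, epoch, m_name, m_ip, m_port, m_epoch = fields
    return Hello(ip, int(port), runid, int(epoch),
                 m_name, m_ip, int(m_port), int(m_epoch))


def remember(known: Dict[str, Hello], payload: str) -> None:
    """Record (or refresh) a Sentinel, keyed by its runid."""
    hello = parse_hello(payload)
    known[hello.runid] = hello


known_sentinels: Dict[str, Hello] = {}
remember(known_sentinels,
         "127.0.0.1,16379,1eb0f03b7a7815c3c5506b0fa041ad8d6ca9db90,1,mymaster,127.0.0.1,6379,0")
for hello in known_sentinels.values():
    print(hello.ip, hello.port, "monitors", hello.master_name)
```

Keying the table by runid rather than by address means a Sentinel that restarts on a different port replaces its old entry instead of appearing twice.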



3) Leader election
During a Sentinel failover, a "leader" is still required to drive the whole process: choosing the new master, and reconfiguring and synchronizing the slaves. When there are multiple Sentinel instances in a cluster, how is one Sentinel elected as leader?

In Redis 2.8.7 the selection has two stages: first some nodes are filtered out by the conditions below, then the best of the remaining nodes is chosen.


    • Use the following criteria to filter the candidate nodes:
      1, exclude slave nodes whose state is S_DOWN, O_DOWN, or DISCONNECTED
      2, the most recent ping reply must be no older than 5 times the ping interval (with a ping interval of 1 second, the last reply must be no more than 5 seconds old; the Redis Sentinel default is 1 second)
      3, the most recent info_refresh reply must be no older than 3 times the info_refresh interval (same principle as 2; the Redis Sentinel default is 10 seconds)
      4, the time for which the slave node has lost contact with the master node must not exceed ((now - master->s_down_since_time) + (master->down_after_period * 10)). In other words, a slave that has been out of sync with the master for too long (such as a newly launched node) should not take part in the selection.
      5, the slave priority must not be 0 (it is specified in the configuration file; the default is 100).
    • From the candidate nodes, select the new master in the following order:
      1, the lower slave_priority (it is specified in the configuration file; the default is 100)
      2, the larger replication offset (each slave's offset increases automatically as it synchronizes with the master)
      3, the smaller runid (every Redis instance has a runid, usually a 40-character random string set at Redis startup, so duplicates are very unlikely)
      4, if the conditions above are still not enough to single out one node, the slave that had processed more of the commands sent by the master is selected.
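The filter and ordering rules above translate fairly directly into code. The following Python sketch is an illustration of the listed rules, not the Redis source: the `Slave` dataclass, its field names, and the two functions are hypothetical, and the intervals use the defaults quoted in the list (1 second ping, 10 second INFO). Ordering rule 4 is left out because the runid comparison already yields a unique result in practice.

```python
from dataclasses import dataclass

PING_PERIOD_MS = 1_000    # default ping interval quoted above
INFO_PERIOD_MS = 10_000   # default INFO interval quoted above


@dataclass
class Slave:
    runid: str
    priority: int = 100
    repl_offset: int = 0
    sdown: bool = False
    odown: bool = False
    disconnected: bool = False
    last_ping_reply_age_ms: int = 0   # age of the most recent ping reply
    info_age_ms: int = 0              # age of the most recent INFO refresh
    master_link_down_ms: int = 0      # how long the link to the master has been down


def is_candidate(slave, master_sdown_for_ms, down_after_ms):
    """Apply filter rules 1 to 5."""
    max_link_down_ms = master_sdown_for_ms + down_after_ms * 10
    return (
        not (slave.sdown or slave.odown or slave.disconnected)
        and slave.last_ping_reply_age_ms <= 5 * PING_PERIOD_MS
        and slave.info_age_ms <= 3 * INFO_PERIOD_MS
        and slave.master_link_down_ms <= max_link_down_ms
        and slave.priority != 0
    )


def select_slave(slaves, master_sdown_for_ms, down_after_ms):
    """Pick the new master: lowest priority, then largest offset, then smallest runid."""
    candidates = [s for s in slaves
                  if is_candidate(s, master_sdown_for_ms, down_after_ms)]
    if not candidates:
        return None
    return min(candidates, key=lambda s: (s.priority, -s.repl_offset, s.runid))


slaves = [
    Slave("aaa", repl_offset=500),
    Slave("bbb", repl_offset=900),
    Slave("ccc", priority=0, repl_offset=2_000),   # priority 0: never promoted
    Slave("aab", repl_offset=900, sdown=True),     # filtered out: Sdown
]
chosen = select_slave(slaves, master_sdown_for_ms=31_000, down_after_ms=30_000)
print(chosen.runid if chosen else None)
```

Expressing the ordering as a single sort key keeps the precedence of the rules explicit: a later rule only matters when all earlier ones tie.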


We need enough Sentinel instances so that, when the leader fails, another Sentinel can still be elected leader to carry out the failover. If no leader can be elected, for example because too few Sentinel instances are alive, the failover process cannot continue.
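Why too few live Sentinels block the failover can be seen from a small vote-counting sketch in Python. It assumes the rule used by Redis Sentinel as I understand it: a candidate becomes leader only with votes from an absolute majority of all known Sentinels, and with at least as many votes as the configured quorum. `elect_leader` is a hypothetical helper, and votes are given as a plain dictionary rather than collected over the network.

```python
from collections import Counter


def elect_leader(votes, total_sentinels, quorum):
    """Return the runid of the elected leader, or None if nobody wins.

    `votes` maps each voter's runid to the runid it voted for, or to None
    when that Sentinel is unreachable or did not vote.
    """
    tally = Counter(choice for choice in votes.values() if choice is not None)
    if not tally:
        return None
    candidate, count = tally.most_common(1)[0]
    needed = max(total_sentinels // 2 + 1, quorum)
    return candidate if count >= needed else None


LEADER = "1eb0f03b7a7815c3c5506b0fa041ad8d6ca9db90"

# All three Sentinels vote for the same runid, as in the console log above.
print(elect_leader({"s1": LEADER, "s2": LEADER, "s3": LEADER},
                   total_sentinels=3, quorum=2))

# Only one of three Sentinels is alive: no majority, so no leader.
print(elect_leader({"s1": LEADER, "s2": None, "s3": None},
                   total_sentinels=3, quorum=2))
```

With three Sentinels the majority is two, so the cluster tolerates the loss of one Sentinel but not two.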



4) Failover process
Before the leader triggers failover, it waits a few seconds (0~5) so that the other Sentinel instances can prepare and adjust. If everything is normal, the leader then promotes a slave to master. This slave must be in a good state (it cannot be in the Sdown/Odown state) and must have the lowest priority value (set in redis.conf). Once the identity of the new master is confirmed, the failover proceeds.


Five, summary of Redis-sentinel learning


1) Horizontal scaling of Redis. So far we have built a master-slave HA cluster for Redis (with the slave servers used for backup). When the cache grows to the point where a single server is no longer enough, the next step is to distribute Redis and spread the cache across more than one server. Redis officially provides Redis Cluster for this, but it has no official release yet (Redis 3.0 appears to provide support, which has not been studied here). In Java, the data can be partitioned with Jedis's ShardedJedis.
2) Monitoring of Redis. Knowing whether something is running normally and stably, and how it is performing, requires monitoring it. Current Redis monitoring tools include RedMon, redis-live and others. This article does not cover monitoring for now; readers may refer to other materials to learn how to use these tools.
3) Read/write separation in a cluster. The master handles writes and the slaves handle reads. In a Redis HA cluster the master and slave roles change, so it is not easy for a program to know which server is currently the master and which servers are slaves. We can obtain Jedis instances for the slave servers with our own code to achieve read/write separation; see part [II] of this series.
4) "Cache data synchronization" is also a problem that every caching tool has to consider.
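The client-side partitioning mentioned in point 1) can be illustrated with a small consistent-hash ring. This Python sketch only shows the general idea of mapping keys to servers; it is not how ShardedJedis is implemented, and its hash function and number of virtual nodes are arbitrary choices. `HashRing` and the node names are hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Map keys to nodes with a consistent-hash ring (illustrative only)."""

    def __init__(self, nodes, replicas=64):
        # Each node is placed on the ring at `replicas` pseudo-random points.
        ring = [(self._hash(f"{node}#{i}"), node)
                for node in nodes for i in range(replicas)]
        ring.sort()
        self._ring = ring
        self._points = [point for point, _ in ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


NODES = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]
ring = HashRing(NODES)
for key in ("user:1", "user:2", "order:42"):
    print(key, "->", ring.node_for(key))
```

The point of the ring is that removing one server only remaps the keys that lived on it; keys on the other servers stay where they were.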


