Redis Sentinel cluster construction and Jedis test illustrated tutorial [3]



The first two articles, the Redis Sentinel cluster construction and Jedis test illustrated tutorials [1] and [2], briefly described the sentinel cluster in Redis and how to test it from Java code with Jedis.
  
This article analyzes the principles of the Redis-sentinel cluster in detail, based on the sentinel output traced during the cluster tests. It covers the sentinel information in master-slave mode, the failover process, the leader election after a master crash, and related aspects.

I. Common HA solutions for Redis

HA (high availability cluster) is an effective solution for ensuring business continuity. It generally has two or more nodes, divided into active nodes and standby nodes. The node that runs the business is called the active node, and the node that backs it up is called the standby node. When a problem occurs on the active node and the running business (task) cannot continue normally, the standby node detects the problem and immediately takes over from the active node. In this way the business is not interrupted, or is interrupted only briefly. (Source: Baidu Encyclopedia.)
Redis is generally deployed in master/slave mode (in the usage discussed here, the slave instance mainly serves as a backup and the master instance serves reads and writes). There are several main ways to implement HA in this mode:
1) keepalived: the master and slave are accessed through a single virtual IP address provided by keepalived. When the master fails, a script run by keepalived promotes the slave to master. After the original master recovers, it first synchronizes data and is then automatically made master again. The advantage of this solution is that the application does not need to know about the switchover (the virtual IP address it accesses does not change). The disadvantages are that keepalived adds deployment complexity and that data may be lost in some cases;
2) zookeeper: the master and slave instances are monitored through zookeeper, which maintains the latest valid IP address, and applications obtain the IP address from zookeeper to access Redis. This solution requires writing a lot of monitoring code;
3) sentinel: the master/slave instances are monitored by sentinel, which performs automatic fault recovery. This solution has a defect: because the master and slave instance addresses (IP and port) are different, the application cannot know the new address after a master-slave switchover. For this reason, sentinel support was added in Jedis 2.2.2, and the Jedis instance obtained through redis.clients.jedis.JedisSentinelPool.getResource() is updated to the new master address in time.
The following figure shows the deployment logic of the solution.

II. Redis-sentinel configuration and deployment

For details, see the Redis Sentinel cluster construction and Jedis test illustrated tutorial [1].
Because the sentinel program (redis-sentinel) is already part of the redis server environment, we use it directly here. (In a production environment, the sentinels and the redis-servers are generally separated and deployed on different machines, and the number of sentinels is not tied to the number of redis-servers. For ease of use, the three sentinel services below are deployed on one machine.)
  Note: By default the sentinel configuration file refers only to the master and contains nothing about the slaves. The example above uses three redis-server instances and three redis-sentinel instances, but the number of redis-sentinel instances does not have to match the number of redis-server instances.
First, it should be clear that sentinel is a process independent of redis-server and does not provide a key/value service to clients. In the redis installation directory its name is redis-sentinel. It is mainly used to monitor redis-server processes and to manage the master/slave relationship. If your redis is not running in master/slave mode, you do not need to set up sentinel.

III. Redis-sentinel startup and detection

Start redis-server and redis-sentinel respectively.
  
1) redis-0 sentinel console when starting redis-0

2) redis-1 sentinel console when starting redis-1

redis-1 sentinel console when starting redis-0

3) redis-2 sentinel console when starting redis-2

redis-2 sentinel console when starting redis-0

redis-2 sentinel console when starting redis-1
  

The console output shows that as the services are started in sequence, the consoles print events such as +sentinel, +slave, and -sdown. This indicates that the sentinels are communicating with each other and monitoring the redis-servers. +sentinel indicates that a new sentinel instance has been added to monitoring. Tip: when building the sentinel environment for the first time, start the master first.
  
4) View related information
Run the # netstat -ntlp | grep redis command to view the redis processes that are currently running.
  
  
View the status of redis-server through redis-cli:

/usr/local/webserver/redis-cli -h 127.0.0.1 -p 6379 -a abcd123457 info Replication

(The -a option above supplies the password.)
  

Note: the info command
This command prints the complete service information, including the cluster. We only need to pay attention to the "Replication" section. (In the preceding command we added the Replication argument after info. Without it, the Server, Clients, Memory, Persistence, Stats, CPU, and Keyspace sections are output as well.) This section tells us the role of the current server and lists all the slaves that point to it. You can run "INFO" on any slave to obtain the master that the slave currently points to. The following is the Replication information of slave1.
   
  
This command not only helps us obtain the cluster information; the sentinel component also uses "INFO" to do the same thing. The following is the sentinel information of slave2.
   
  
The above information clearly shows the redis service status and master-slave relationship.
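Because the Replication section returned by INFO is plain "key:value" text, it is easy to read from code as well as from the console. The following is a minimal sketch, not part of the original tests: the class name ReplicationInfo and its two methods are hypothetical helpers, and they assume the section contains the role, master_host, and master_port fields that a slave reports.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal parser for the text returned by "INFO Replication" (hypothetical helper). */
final class ReplicationInfo {

    /** Splits "key:value" lines into a map; blank lines and "# Section" headers are skipped. */
    static Map<String, String> parse(String info) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : info.split("\r?\n")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int colon = line.indexOf(':');
            if (colon > 0) {
                fields.put(line.substring(0, colon), line.substring(colon + 1));
            }
        }
        return fields;
    }

    /** Returns "host:port" of the master a slave points to, or null if the node is not a slave. */
    static String masterAddress(Map<String, String> fields) {
        if (!"slave".equals(fields.get("role"))) {
            return null;
        }
        return fields.get("master_host") + ":" + fields.get("master_port");
    }
}
```

Feeding it the text printed by the redis-cli command above would give the same role and master address that the screenshots show.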
 
5) Failover test
Once the deployment above is stable, we shut down redis-0 directly and wait for the "down-after-milliseconds" period (30 seconds). The redis-0/redis-1/redis-2 sentinel windows then print events such as "+sdown", "+odown", "+failover", "+selected-slave", "+promoted-slave", and "+slave-reconf". These events show the sentinel component carrying out the failover process after the master fails.
With the master's downtime simulated in this way, the sentinel consoles output the following information.
  
Sentinel information on redis-0.
   

Sentinel information on redis-1
  
  
Sentinel information on redis-2
  

From the information in the three windows above, we can see that when the master goes down, the three sentinels perform failover on the master.

As the sentinel console of redis-1 shows, the following operations are performed:
a. +sdown master mymaster 127.0.0.1 6379 (the master is regarded as failed);
b. +odown master mymaster 127.0.0.1 6379 #quorum 2/2;
c. +new-epoch 1 (a new master is about to be selected);
d. +try-failover master mymaster 127.0.0.1 6379 (this sentinel starts trying to fail over the master);
e. +vote-for-leader Example 1 (vote for a leader);
f. 127.0.0.1:16379 voted for 1eb0f03b7a7815c3c5506b0fa041ad8d6ca9db90 1 (vote for a leader);
g. 127.0.0.1:36379 voted for 1eb0f03b7a7815c3c5506b0fa041ad8d6ca9db90 1 (vote for a leader);
h. +elected-leader master mymaster 127.0.0.1 6379 (this sentinel has been elected leader for the failover);
i. +failover-state-select-slave master mymaster 127.0.0.1 6379 (the leader starts to look for a suitable slave);
j. +selected-slave 127.0.0.1:63792 127.0.0.1 63792 @ mymaster 127.0.0.1 6379 (the leader has found a suitable slave);
k. +failover-state-send-slaveof-noone slave 127.0.0.1:63792 127.0.0.1 63792 @ mymaster 127.0.0.1 6379 (the leader sends the "slaveof no one" command to the slave; the slave then completes its role change and becomes the master);
l. +failover-state-wait-promotion slave 127.0.0.1:63792 127.0.0.1 63792 @ mymaster 127.0.0.1 6379 (wait for the other sentinels to confirm the slave);
m. +promoted-slave 127.0.0.1:63792 127.0.0.1 63792 @ mymaster 127.0.0.1 6379 (confirmed);
n. +failover-state-reconf-slaves master mymaster 127.0.0.1 6379 (start the reconfig operation on the slaves);
o. +slave-reconf-sent slave 127.0.0.1:63791 127.0.0.1 63791 @ mymaster 127.0.0.1 6379 (send the "slaveof" command to the specified slave, telling it to follow the new master);
p. +slave-reconf-inprog slave 127.0.0.1:63791 127.0.0.1 63791 @ mymaster 127.0.0.1 6379 (this slave is executing the slaveof + SYNC process; a slave performs the slaveof operation after it receives "+slave-reconf-sent");
q. +slave-reconf-done slave 127.0.0.1:63791 127.0.0.1 63791 @ mymaster 127.0.0.1 6379 (this slave has finished synchronizing, and the leader can continue with the reconfig operation of the next slave);
r. +failover-end master mymaster 127.0.0.1 6379 (failover ended);
s. +switch-master mymaster 127.0.0.1 6379 127.0.0.1 63792 (after the failover succeeds, every sentinel instance starts to monitor the new master);
t. +slave 127.0.0.1:63791 127.0.0.1 63791 @ mymaster 127.0.0.1 63792 (this step and the next add the slaves to the new master);
u. +slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 63792;
v. +sdown slave 127.0.0.1:6379 127.0.0.1 6379 @ mymaster 127.0.0.1 63792.
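Of the events in this trace, +switch-master is the one an application cares about most, because it carries both the old and the new master address. The sketch below shows one way such a line could be picked apart. It is not the code Jedis uses; SwitchMaster is a hypothetical class, and it assumes the line has exactly the shape "+switch-master <name> <old-ip> <old-port> <new-ip> <new-port>" seen in the console output.

```java
/** Address change announced by a "+switch-master" line (hypothetical helper). */
final class SwitchMaster {
    final String masterName;
    final String oldHost;
    final int oldPort;
    final String newHost;
    final int newPort;

    private SwitchMaster(String masterName, String oldHost, int oldPort,
                         String newHost, int newPort) {
        this.masterName = masterName;
        this.oldHost = oldHost;
        this.oldPort = oldPort;
        this.newHost = newHost;
        this.newPort = newPort;
    }

    /** Returns the parsed event, or null if the line is not a +switch-master line. */
    static SwitchMaster parse(String line) {
        String[] t = line.trim().split("\\s+");
        if (t.length != 6 || !"+switch-master".equals(t[0])) {
            return null;
        }
        return new SwitchMaster(t[1], t[2], Integer.parseInt(t[3]),
                                t[4], Integer.parseInt(t[5]));
    }
}
```

For the line from the trace, parse("+switch-master mymaster 127.0.0.1 6379 127.0.0.1 63792") yields newHost 127.0.0.1 and newPort 63792, which is the address a client should reconnect to.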

Once the environment is stable again, we find that redis-2 has been promoted ("promoted") to master and that redis-1 follows it after going through the "slave-reconf" process.
If the redis-0 server returns to normal at this point, sentinel automatically adds redis-0 as a slave of redis-2.
Below is the output of the redis-0 sentinel console.
  
  
View redis-2 again (info for the current master)
  

  Tip: The sentinel instances must all be started. If you start only a server without its corresponding sentinel, the server cannot be properly monitored and managed.

IV. Analysis of Redis-sentinel principles

1) SDOWN to ODOWN conversion process
[Glossary]
  SDOWN: subjectively down, that is, the current sentinel instance considers a redis service "unavailable".
  ODOWN: objectively down, that is, multiple sentinel instances consider the master to be in the SDOWN state, so the master is in the ODOWN state. ODOWN can be understood simply as the master having been judged "unavailable" by the cluster, after which failover starts.
(SDOWN applies to both the master and the slaves, but ODOWN applies only to the master. When a slave has been unreachable for longer than "down-after-milliseconds", all sentinel instances mark it as SDOWN.)

[Conversion process]
A. After each sentinel instance starts, it establishes TCP connections with the known slaves, the master, and the other sentinels, and periodically sends PING (every 1 second by default);
B. During this interaction, if a redis-server does not respond, or responds with an error, within the "down-after-milliseconds" period, that redis-server is considered to be in the SDOWN state;
C. If the server in B is the master, the sentinel instance sends the "is-master-down-by-addr" command to the other sentinels at intervals (one second) and collects their responses. If enough sentinel instances report that the master is in the SDOWN state, the current sentinel instance marks the master as ODOWN. The other sentinel instances perform the same interaction. "Enough" is defined by the quorum in the "sentinel monitor" configuration item: if the number of sentinel instances that detect the master in the SDOWN state reaches the quorum, the sentinel instance considers the master to be in the ODOWN state;
D. Each sentinel instance sends the "INFO" command to the master and the slaves at intervals (10 seconds). If the master has failed and no new master has been selected, "INFO" is sent every second. The main purpose of "INFO" is to obtain and confirm the liveness of the slaves and the master in the current cluster environment;
After the above process, all sentinels agree that the master has failed, and failover starts.
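Steps B and C come down to two small checks, which the following sketch spells out. It is an illustration rather than sentinel's own code: the class DownState and its method names are hypothetical, and it assumes the sentinel counts its own SDOWN verdict as one vote toward the quorum.

```java
/** Sketch of the SDOWN -> ODOWN decision for one monitored master (hypothetical helper). */
final class DownState {

    /** SDOWN: no valid reply from the instance for longer than down-after-milliseconds. */
    static boolean isSubjectivelyDown(long nowMs, long lastValidReplyMs, long downAfterMs) {
        return nowMs - lastValidReplyMs > downAfterMs;
    }

    /**
     * ODOWN: this sentinel already sees the master as SDOWN and, counting itself,
     * at least `quorum` sentinels agree. Each array entry is the answer one other
     * sentinel gave to "is-master-down-by-addr".
     */
    static boolean isObjectivelyDown(boolean selfSdown, boolean[] othersSayDown, int quorum) {
        if (!selfSdown) {
            return false;
        }
        int agree = 1; // this sentinel's own verdict
        for (boolean down : othersSayDown) {
            if (down) {
                agree++;
            }
        }
        return agree >= quorum;
    }
}
```

With three sentinels and a quorum of 2, as in the "#quorum 2/2" line of the failover trace, one agreeing peer is enough for the master to be marked ODOWN.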

2) Sentinel and slaves "automatic discovery" Mechanism
In the sentinel configuration file (local-sentinel.conf), a port is specified. This is the port on which the sentinel instance listens for connections from other sentinel instances. After the cluster is stable, every pair of sentinel instances keeps a TCP connection over which "PING" and commands such as "is-master-down-by-addr" are sent. The connection is used to check the liveness of the other sentinel instances and to exchange information during "ODOWN" and "failover".
Before connecting to the other sentinels, a sentinel first tries to connect to the master specified in its configuration file. Communication between the sentinels and the master is mainly based on pub/sub, which is used to publish and receive information. The published information includes the listening port of the current sentinel instance, so sentinels that monitor the same master can discover one another.
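To make the published information concrete, here is a small sketch that parses one such announcement. It rests on an assumption that should be checked against your Redis version: that the message is a comma-separated string of the form ip,port,runid,current-epoch,master-name,master-ip,master-port,master-config-epoch, published on the __sentinel__:hello channel. The class SentinelHello is a hypothetical helper and the sample values are made up.

```java
/** One sentinel announcement, under the field layout assumed above (hypothetical helper). */
final class SentinelHello {
    final String ip;
    final int port;          // port on which the announcing sentinel listens
    final String runId;
    final String masterName; // master that this sentinel monitors

    private SentinelHello(String ip, int port, String runId, String masterName) {
        this.ip = ip;
        this.port = port;
        this.runId = runId;
        this.masterName = masterName;
    }

    /** Returns the parsed announcement, or null if the payload does not have 8 fields. */
    static SentinelHello parse(String payload) {
        String[] t = payload.split(",");
        if (t.length != 8) {
            return null;
        }
        return new SentinelHello(t[0], Integer.parseInt(t[1]), t[2], t[4]);
    }
}
```

A sentinel that receives such a message from an address it has not seen before can add the sender to its list of known sentinels, which corresponds to the +sentinel event printed at startup.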

3) Leader election
In a sentinel failover, a "leader" is still needed to schedule the entire process: selecting the new master, then reconfiguring and synchronizing the slaves. When there are multiple sentinel instances in the cluster, how is one sentinel elected as the leader?

The election in redis 2.8.7 has two stages: first, unsuitable nodes are filtered out; then the new master is chosen from the remaining candidates.

  • Use the following conditions to filter the candidate nodes:
    1. Exclude slave nodes in the S_DOWN, O_DOWN, or DISCONNECTED state.
    2. The last ping response must be no older than 5 times the ping interval (if the ping interval is 1 second, the last response must be no more than 5 seconds old; the redis sentinel default is 1 second).
    3. The last info_refresh must be no older than 3 times the info refresh interval (the principle is the same as in 2; the redis sentinel default is 10 seconds).
    4. The time for which the slave node has been disconnected from the master node must not exceed (now - master->s_down_since_time) + (master->down_after_period * 10). A slave node beyond this limit is regarded as not synchronized with the master in time (for example, a newly started node) and should not be elected.
    5. The slave priority is not 0 (this is specified in the configuration file; the default is 100).
  • Select the new master from the candidate nodes in the following order:
    1. The lower slave_priority (this is specified in the configuration file; the default is 100).
    2. The larger replication offset (the offset grows each time the slave synchronizes with the master).
    3. The smaller runid (each redis instance has a runid, usually a 40-character random string generated when redis starts, so the probability of a duplicate is very low).
    4. If none of the above is enough to single out one node, the slave node that has processed more of the commands previously sent by the master is chosen.
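The two stages above can be sketched as a filter followed by a comparison. This is a simplified illustration, not sentinel's implementation: SlaveSelector and the Slave snapshot class are hypothetical, the caller is assumed to compute maxMasterLinkDownMs from the formula in filter condition 4, and the last tie-break (commands processed) is left out.

```java
import java.util.Comparator;
import java.util.List;

/** Sketch of choosing the slave to promote (hypothetical helper). */
final class SlaveSelector {

    /** Hypothetical snapshot of what a sentinel knows about one slave. */
    static final class Slave {
        final String runId;
        final int priority;
        final long replOffset;
        final boolean down;            // S_DOWN, O_DOWN or DISCONNECTED
        final long msSinceLastPing;    // age of the last ping response
        final long msSinceInfoRefresh; // age of the last INFO refresh
        final long masterLinkDownMs;   // time disconnected from the master

        Slave(String runId, int priority, long replOffset, boolean down,
              long msSinceLastPing, long msSinceInfoRefresh, long masterLinkDownMs) {
            this.runId = runId;
            this.priority = priority;
            this.replOffset = replOffset;
            this.down = down;
            this.msSinceLastPing = msSinceLastPing;
            this.msSinceInfoRefresh = msSinceInfoRefresh;
            this.masterLinkDownMs = masterLinkDownMs;
        }
    }

    static final long PING_PERIOD_MS = 1_000;  // default ping interval from the text
    static final long INFO_PERIOD_MS = 10_000; // default INFO interval from the text

    /** Returns the best candidate, or null if every slave is filtered out. */
    static Slave select(List<Slave> slaves, long maxMasterLinkDownMs) {
        // Lower priority first, then larger offset, then smaller runid.
        Comparator<Slave> order = (a, b) -> {
            if (a.priority != b.priority) return Integer.compare(a.priority, b.priority);
            if (a.replOffset != b.replOffset) return Long.compare(b.replOffset, a.replOffset);
            return a.runId.compareTo(b.runId);
        };
        Slave best = null;
        for (Slave s : slaves) {
            if (s.down) continue;                                      // condition 1
            if (s.msSinceLastPing > PING_PERIOD_MS * 5) continue;      // condition 2
            if (s.msSinceInfoRefresh > INFO_PERIOD_MS * 3) continue;   // condition 3
            if (s.masterLinkDownMs > maxMasterLinkDownMs) continue;    // condition 4
            if (s.priority == 0) continue;                             // condition 5
            if (best == null || order.compare(s, best) < 0) best = s;
        }
        return best;
    }
}
```

Between two healthy slaves with the default priority of 100, the one with the larger replication offset wins, which matches the intent of promoting the slave that is closest to the old master's data.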

We want enough sentinel instances so that when the leader fails, another sentinel can still be elected leader to carry out the failover. If no leader can be produced, for example because only a minority of the sentinel instances are alive, the failover process cannot continue.
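The reason a minority of sentinels cannot proceed becomes clear once the vote count is written down. The sketch below assumes the rule that the winner needs votes from a majority of all known sentinels (alive or not) and also at least the configured quorum; LeaderElection and its parameters are hypothetical names used for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of counting leader votes among sentinels (hypothetical helper). */
final class LeaderElection {

    /**
     * @param votes          runid each responding sentinel voted for
     * @param totalSentinels number of known sentinels, including unreachable ones
     * @param quorum         quorum from the "sentinel monitor" configuration item
     * @return the winning runid, or null if no candidate has enough votes
     */
    static String winner(String[] votes, int totalSentinels, int quorum) {
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        int bestVotes = 0;
        for (String v : votes) {
            int c = counts.merge(v, 1, Integer::sum);
            if (c > bestVotes) {
                bestVotes = c;
                best = v;
            }
        }
        int majority = totalSentinels / 2 + 1;
        return (bestVotes >= majority && bestVotes >= quorum) ? best : null;
    }
}
```

In the three-sentinel setup of this article the majority is 2, so a single surviving sentinel can vote only for itself and never becomes leader.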

4) failover Process
Before the leader triggers a failover, it waits a few seconds (0 to 5) so that the other sentinel instances can prepare and adjust. If everything works properly, the leader begins to promote a slave to master. This slave must be in a good state (not in the SDOWN/ODOWN state) and have the lowest priority value (configured in redis.conf). Once the identity of the new master is confirmed, failover starts.

V. Redis-sentinel learning summary

1) Horizontal scaling of redis. In the previous articles we implemented a redis master-slave HA cluster (the slave servers exist as backups). When the cache grows to the point where a single server can no longer meet the requirements, we turn to a distributed redis architecture that spreads the cache across multiple servers. Redis officially provides redis cluster for distributed deployment, but the official version has not yet been released (redis 3.0 seems to provide support; I have not had time to study it). Java code can use ShardedJedis from jedis for sharding.
2) Redis monitoring. To know whether something runs normally and stably, and how it performs, you have to monitor it. Current redis monitoring tools include redmon and redis-live. This article does not cover monitoring for now. Readers can refer to other materials to learn and use these tools.
3) Read/write splitting in the cluster. The master is used for writing and the slaves are used for reading. In a redis HA cluster the master and slave servers change, which makes it harder for the program to obtain the master and slave services. We can obtain a jedis instance of a slave server in code to achieve read/write splitting. For details, see the Redis Sentinel cluster construction and Jedis test illustrated tutorial [2].
4) "Cache data synchronization" is also a question that every caching tool must consider.
