Keepalived+Redis: a high-availability Redis master-slave solution

Source: Internet
Author: User
Tags failover redis version haproxy redis server

Background introduction:

Currently, Redis has no official HA solution comparable to MySQL Proxy or Oracle RAC.
As of version 2.8, Redis officially provides a master-slave switching scheme named Sentinel (see the appendix; not tested here).

Therefore, automatic failover when a failure occurs is a problem that still needs to be solved.

Most material found online recommends HAProxy or Keepalived for this. If the goal is failover rather than load balancing, Keepalived is in my view the more efficient of the two, so I decided to adopt the Keepalived scheme.

Environment Introduction:
Master: 192.168.0.100
Slave: 192.168.0.101
Virtual IP address (VIP): 192.168.0.200

Design ideas:

When Master and Slave are both working normally, Master serves requests and Slave acts as standby;
When Master goes down and Slave is healthy, Slave takes over the service and turns off master-slave replication;
When Master comes back, it first synchronizes data from Slave, then turns off replication and resumes the master identity; meanwhile, Slave waits for Master to finish synchronizing before it resumes the slave identity.
The cycle then repeats.

Note that this requires local persistence to be enabled on both Master and Slave. Otherwise, during automatic switching, the side without persistence will wipe the other side's data, resulting in complete data loss.
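As a quick pre-flight check for the persistence requirement above, the following sketch reports whether a redis.conf explicitly enables RDB snapshots (a "save <seconds> <changes>" rule) or AOF ("appendonly yes"). The function name has_persistence and the example path are hypothetical. It only inspects the file: it does not see settings changed at runtime, and a file with no save line at all is reported as unconfigured even though Redis may then apply built-in defaults.

```shell
#!/bin/bash
# has_persistence FILE: succeed if FILE explicitly enables RDB or AOF.
# (Hypothetical helper; it reads the config file only, not the live server.)
has_persistence() {
    conf="$1"
    # AOF is enabled by "appendonly yes".
    if grep -Eq '^[[:space:]]*appendonly[[:space:]]+yes' "$conf"; then
        return 0
    fi
    # RDB is enabled by a "save <seconds> <changes>" rule; save "" disables it.
    if grep -Eq '^[[:space:]]*save[[:space:]]+[0-9]+[[:space:]]+[0-9]+' "$conf"; then
        return 0
    fi
    return 1
}

# Example (the path is an assumption; adjust to your installation):
#   has_persistence /etc/redis.conf || echo "WARNING: no persistence configured"
```

Running this on both Master and Slave before enabling Keepalived gives an early warning if either side would come up empty after a restart.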

The specific implementation steps follow.

Install Keepalived on Master and Slave:

$ yum install keepalived

The installation ships a default configuration file, which we overwrite.

First, create the following configuration file on Master:
$ vim /etc/keepalived/keepalived.conf

! Configuration File for keepalived
global_defs {
    router_id redis100
}
vrrp_script chk_redis {
    script "/etc/keepalived/scripts/redis_check.sh 127.0.0.1 6379"
    interval 2
    timeout 2
    fall 3
}
vrrp_instance Redis {
    state MASTER        # the master may also be set to BACKUP
    interface eth0
    virtual_router_id 50
    priority 150
    nopreempt           # do not preempt; must be added
    advert_int 1
    authentication {    # must be the same on all nodes
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.0.200/24
    }
    track_script {
        chk_redis
    }
    notify_master "/etc/keepalived/scripts/redis_master.sh 127.0.0.1 192.168.0.101 6379"
    notify_backup "/etc/keepalived/scripts/redis_backup.sh 127.0.0.1 192.168.0.101 6379"
    notify_fault /etc/keepalived/scripts/redis_fault.sh
    notify_stop /etc/keepalived/scripts/redis_stop.sh
}

Then, create the following configuration file on Slave:

! Configuration File for keepalived

global_defs {
    router_id redis101
}
vrrp_script chk_redis {
    script "/etc/keepalived/scripts/redis_check.sh 127.0.0.1 6379"
    interval 2
    timeout 2
    fall 3
}
vrrp_instance Redis {
    state BACKUP
    interface eth0
    virtual_router_id 50
    priority 100
    advert_int 1
    authentication {    # must be the same on all nodes
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.0.200/24
    }
    track_script {
        chk_redis
    }
    notify_master "/etc/keepalived/scripts/redis_master.sh 127.0.0.1 192.168.0.100 6379"
    notify_backup "/etc/keepalived/scripts/redis_backup.sh 127.0.0.1 192.168.0.100 6379"
    notify_fault /etc/keepalived/scripts/redis_fault.sh
    notify_stop /etc/keepalived/scripts/redis_stop.sh
}

Create a script to monitor Redis on both Master and Slave:
$ mkdir /etc/keepalived/scripts
$ vim /etc/keepalived/scripts/redis_check.sh

#!/bin/bash
# Usage: redis_check.sh <host> <port>
# Exits 0 if Redis answers PING with PONG, 1 otherwise.
ALIVE=$(/usr/redis/redis-cli -h "$1" -p "$2" PING)
LOGFILE="/var/log/keepalived-redis-check.log"
echo "[CHECK]" >> $LOGFILE
date >> $LOGFILE
if [ "$ALIVE" == "PONG" ]; then
    echo "Success: redis-cli -h $1 -p $2 PING $ALIVE" >> $LOGFILE 2>&1
    exit 0
else
    echo "Failed: redis-cli -h $1 -p $2 PING $ALIVE" >> $LOGFILE 2>&1
    exit 1
fi

Next, write the following key scripts, which carry out the actual role changes:
notify_master /etc/keepalived/scripts/redis_master.sh
notify_backup /etc/keepalived/scripts/redis_backup.sh
notify_fault /etc/keepalived/scripts/redis_fault.sh
notify_stop /etc/keepalived/scripts/redis_stop.sh

Keepalived calls them according to the state it transitions to:
notify_master is called when the instance enters the MASTER state;
notify_backup is called when it enters the BACKUP state;
notify_fault is called when an abnormal condition is detected and it enters the FAULT state;
notify_stop is called when the Keepalived program terminates.
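As an alternative to wiring each state to its own directive, Keepalived also accepts a generic notify directive whose script receives the transition as arguments ($1 = "GROUP" or "INSTANCE", $2 = the instance name, $3 = the new state). The sketch below shows how one dispatcher could map the state to the per-state scripts used in this article. The script name redis_notify.sh and the function state_to_script are hypothetical, and only the MASTER, BACKUP and FAULT states are handled here.

```shell
#!/bin/bash
# Hypothetical single entry point, configured for example as:
#   notify "/etc/keepalived/scripts/redis_notify.sh"
# Keepalived appends: $1 = "GROUP" or "INSTANCE", $2 = name, $3 = new state.

# state_to_script STATE: print the per-state script used in this article,
# or fail for a state this sketch does not handle.
state_to_script() {
    dir="/etc/keepalived/scripts"
    case "$1" in
        MASTER) echo "$dir/redis_master.sh" ;;
        BACKUP) echo "$dir/redis_backup.sh" ;;
        FAULT)  echo "$dir/redis_fault.sh" ;;
        *)      return 1 ;;
    esac
}

# In the real dispatcher the last lines would be something like:
#   script=$(state_to_script "$3") && exec "$script" 127.0.0.1 <peer-ip> 6379
```

The rest of this article keeps the four separate directives, which makes each transition easier to log and test on its own.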

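First, create the notify_master and notify_backup scripts on the Redis master:
$ vim /etc/keepalived/scripts/redis_master.sh

#!/bin/bash
# Arguments (see keepalived.conf): $1 = local host, $2 = peer IP, $3 = port
REDISCLI="/usr/redis/redis-cli -h $1 -p $3"
LOGFILE="/var/log/keepalived-redis-state.log"
SYNC_WAIT=10   # assumed value; the original delay was lost from the source text
echo "[Master]" >> $LOGFILE
date >> $LOGFILE
echo "Being master ..." >> $LOGFILE 2>&1
echo "Run MASTER cmd ..." >> $LOGFILE 2>&1
# Replicate from the peer first, to pull back data written during the outage
$REDISCLI SLAVEOF $2 $3 >> $LOGFILE 2>&1
sleep $SYNC_WAIT   # wait for the data sync to finish before cancelling replication
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1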

$ sudo vim /etc/keepalived/scripts/redis_backup.sh

#!/bin/bash
# Arguments (see keepalived.conf): $2 = peer IP, $3 = port
REDISCLI="/usr/redis/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"
SYNC_WAIT=10   # assumed value; the original delay was lost from the source text
echo "[Backup]" >> $LOGFILE
date >> $LOGFILE
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF $2 $3 >> $LOGFILE 2>&1
# echo "Being slave ..." >> $LOGFILE 2>&1
sleep $SYNC_WAIT   # wait for the data sync before exchanging roles


Next, create the notify_master and notify_backup scripts on the Redis slave:

$ vim /etc/keepalived/scripts/redis_master.sh

#!/bin/bash
# Arguments (see keepalived.conf): $1 = local host, $2 = peer IP, $3 = port
REDISCLI="/usr/redis/redis-cli -h $1 -p $3"
LOGFILE="/var/log/keepalived-redis-state.log"
SYNC_WAIT=10   # assumed value; the original delay was lost from the source text
echo "[Master]" >> $LOGFILE
date >> $LOGFILE
echo "Being master ..." >> $LOGFILE 2>&1
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF $2 $3 >> $LOGFILE 2>&1
#echo "SLAVEOF $2 cmd can't execute ..." >> $LOGFILE
sleep $SYNC_WAIT   # wait for the data sync before exchanging roles
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF NO ONE >> $LOGFILE 2>&1


$ vim /etc/keepalived/scripts/redis_backup.sh

#!/bin/bash
# Arguments (see keepalived.conf): $2 = peer IP, $3 = port
REDISCLI="/usr/redis/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"
SYNC_WAIT=10   # assumed value; the original delay was lost from the source text
echo "[BACKUP]" >> $LOGFILE
date >> $LOGFILE
echo "Being slave ..." >> $LOGFILE 2>&1
echo "Run SLAVEOF cmd ..." >> $LOGFILE 2>&1
$REDISCLI SLAVEOF $2 $3 >> $LOGFILE 2>&1
sleep $SYNC_WAIT   # wait for the data sync to settle
exit 0


Then create the following identical scripts on both Master and Slave:
$ vim /etc/keepalived/scripts/redis_fault.sh

#!/bin/bash
LOGFILE=/var/log/keepalived-redis-state.log
echo "[Fault]" >> $LOGFILE
date >> $LOGFILE

$ sudo vim /etc/keepalived/scripts/redis_stop.sh

#!/bin/bash
LOGFILE=/var/log/keepalived-redis-state.log
echo "[Stop]" >> $LOGFILE
date >> $LOGFILE

Add executable permissions to the scripts:

(This is important. I skipped this step at first, and Keepalived kept reporting the error "VRRP_Instance(Redis) Now in FAULT state".)

$ sudo chmod +x /etc/keepalived/scripts/*.sh

After the scripts are created, test the setup as follows:
1. Start Redis on Master
$ /etc/init.d/redis start

2. Start Redis on Slave
$ /etc/init.d/redis start

3. Start Keepalived on Master
$ /etc/init.d/keepalived start

4. Start Keepalived on Slave
$ /etc/init.d/keepalived start


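5. Try to connect to Redis through the VIP (in the test transcripts below, 10.6.1.200 is the VIP, 10.6.1.143 is Master and 10.6.1.144 is Slave):
$ redis-cli -h 10.6.1.200 INFO

The connection succeeds, and Slave is connected as well:
role:master
slave0:10.6.1.144,6379,online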
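To check the role from a script rather than by eye, the INFO output can be filtered for the role field. This is a minimal sketch; the function name redis_role is hypothetical, and it reads INFO text from standard input so that it does not itself depend on a running server.

```shell
#!/bin/bash
# redis_role: read INFO output on stdin and print the value of the "role" field.
redis_role() {
    # INFO lines end in CRLF, so strip the carriage returns before matching.
    tr -d '\r' | awk -F: '$1 == "role" { print $2 }'
}

# Example (requires a reachable server; replace <vip> with your VIP):
#   redis-cli -h <vip> INFO replication | redis_role
```

Comparing the result against "master" on the VIP gives a simple way to confirm which node currently holds the service.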

6. Try inserting some data:
$ redis-cli -h 10.6.1.200 SET Hello Redis
OK

Read the data through the VIP:
$ redis-cli -h 10.6.1.200 GET Hello
"Redis"

Read the data from Master:
$ redis-cli -h 10.6.1.143 GET Hello
"Redis"

Read the data from Slave:
$ redis-cli -h 10.6.1.144 GET Hello
"Redis"


Next, simulate a failure:
Stop Redis on Master:
$ service redis_6379 stop

View the Keepalived log on Master:
$ tailf /var/log/keepalived-redis-state.log
[Fault]
Thu Sep 08:29:01 CST 2012

At the same time, the log on Slave shows:
$ tailf /var/log/keepalived-redis-state.log
[Master]
Fri Sep 14:14:09 CST 2012
Being master ....
Run SLAVEOF cmd ...
OK
Run SLAVEOF NO ONE cmd ...
OK

We can then see that Slave has taken over the service and assumed the master role:
$ redis-cli -h 192.168.0.200 INFO

role:master

Then restore the Redis process on Master:
$ service redis_6379 start

View the Keepalived log on Master:
$ tailf /var/log/keepalived-redis-state.log
[Master]
Thu Sep 08:31:33 CST 2012
Being master ....
Run SLAVEOF cmd ...
OK
Run SLAVEOF NO ONE cmd ...
OK

At the same time, the log on Slave shows:
$ tailf /var/log/keepalived-redis-state.log
[Backup]
Fri Sep 14:16:37 CST 2012
Being slave ....
Run SLAVEOF cmd ...
OK

You can see that Master has resumed the master role: both failover and automatic recovery succeeded.

The master and slave scripts and keepalived.conf can be downloaded from http://download.csdn.net/detail/huwei2003/8252221

Note: both the master and the slave Redis must have local persistence turned on.

Appendix:

Redis Sentinel's master-slave switching scheme

Starting with version 2.8, Redis provides a master-slave switching scheme called Sentinel. Sentinel manages multiple Redis server instances and is primarily responsible for three tasks:

1. Monitoring: Sentinel constantly checks whether your master and slave servers are working properly.
2. Notification: when a monitored Redis server has a problem, Sentinel can notify administrators or other applications through an API.
3. Automatic failover: when a master fails, Sentinel starts an automatic failover that promotes one of the failed master's slaves to be the new master and makes the failed master's other slaves replicate from the new master. When a client tries to connect to the failed master, the cluster returns the address of the new master, so the cluster can use the new master in place of the failed one.

Redis Sentinel is a distributed system: you can run multiple Sentinel processes in one architecture. These processes use a gossip protocol to receive information about whether a master is offline, and an agreement protocol to decide whether to perform an automatic failover and which slave to select as the new master.

Launch Sentinel

Start redis-server with the --sentinel option and a configuration file. Sentinel uses the configuration file to save its current state, and reloads the file to restore that state when it restarts.

redis-server /path/to/sentinel.conf --sentinel

Sentinel listens on TCP port 26379 by default; you can use redis-cli or any other client to communicate with it.

If no configuration file is specified when Sentinel starts, or if the specified file is not writable, Sentinel refuses to start.

Configure Sentinel

The following is an example of a configuration file:

sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1

sentinel monitor resque 192.168.1.3 6380 4
sentinel down-after-milliseconds resque 10000
sentinel failover-timeout resque 180000
sentinel parallel-syncs resque 5

The first line instructs Sentinel to monitor a master named mymaster, whose IP address is 127.0.0.1 and port is 6379. At least 2 Sentinels must agree before this master is judged to have failed (as long as the number of agreeing Sentinels is below this value, automatic failover is not performed).
Note, however, that no matter how many Sentinels you require to agree on a failure, a Sentinel must also obtain support from the majority of Sentinels in the system before it can start an automatic failover and reserve a configuration epoch (a configuration epoch is the version number of a new master configuration). In other words, if only a minority of Sentinel processes are working, automatic failover cannot be performed.
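The two conditions above (quorum to agree on the failure, majority to authorize the failover) can be illustrated with a little arithmetic. This is only a sketch of the rule as described in this article, not Sentinel's own code; the function names majority and can_failover are hypothetical.

```shell
#!/bin/bash
# majority N: smallest number of Sentinels that forms a majority of N.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# can_failover TOTAL ALIVE QUORUM: succeed only if the Sentinels still running
# can both reach the quorum (to agree that the master is down) and form a
# majority of all Sentinels (to authorize the failover).
can_failover() {
    total="$1"; alive="$2"; quorum="$3"
    [ "$alive" -ge "$quorum" ] && [ "$alive" -ge "$(majority "$total")" ]
}

# Example: 5 Sentinels in total, quorum 2.
#   can_failover 5 3 2   succeeds: quorum met, and 3 of 5 is a majority
#   can_failover 5 2 2   fails: quorum met, but 2 of 5 is not a majority
```

This is why a low quorum does not by itself make failover possible when most Sentinels are unreachable.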

The down-after-milliseconds option specifies how many milliseconds must pass without a valid reply before Sentinel considers a server to be down (subjectively down, SDOWN).
The parallel-syncs option specifies the maximum number of slaves that may synchronize with the new master at the same time during a failover. The smaller the number, the longer the failover takes to complete; the larger the number, the more slaves are unavailable at once because of replication. Setting this value to 1 ensures that only one slave at a time is unable to process command requests.

Subjectively down and objectively down

1. Subjectively down (SDOWN) is the judgment made by a single Sentinel instance that a server is offline.
2. Objectively down (ODOWN) is the judgment reached after multiple Sentinel instances have each judged the same server as SDOWN and have exchanged these judgments with one another using the SENTINEL is-master-down-by-addr command.

The objectively down condition applies only to masters. For any other type of Redis instance, Sentinel does not need to negotiate before judging it as down, so slaves and other Sentinels never reach the objectively down condition.
As soon as a Sentinel finds that a master has become objectively down, that Sentinel may be elected by the other Sentinels to perform an automatic failover of the failed master.

Periodic tasks performed by each Sentinel instance

1. Each Sentinel sends a PING command once per second to the masters, slaves, and other Sentinel instances it knows about.
2. If the time since an instance's last valid reply to PING exceeds the value of the down-after-milliseconds option, Sentinel marks the instance as subjectively down. A valid reply is one of: +PONG, -LOADING, or -MASTERDOWN.
3. If a master is marked as subjectively down, all Sentinels monitoring that master confirm, once per second, whether the master has indeed entered the subjectively down state.
4. If a master is marked as subjectively down and a sufficient number of Sentinels (at least the number specified in the configuration file) agree with this judgment within the specified time range, the master is marked as objectively down.
5. Normally, each Sentinel sends an INFO command once every 10 seconds to all masters and slaves it knows about. When a master is marked as objectively down, the frequency at which Sentinel sends INFO to all slaves of that master changes from once every 10 seconds to once per second.
6. When not enough Sentinels agree that the master is down, the master's objectively down status is removed. When the master again returns a valid reply to Sentinel's PING command, its subjectively down status is removed.
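Rules 2 and 4 above reduce to two simple comparisons, sketched below. This only illustrates the conditions as this article states them and is not Sentinel's implementation; the function names is_sdown and is_odown and their argument order are hypothetical.

```shell
#!/bin/bash
# is_sdown LAST_VALID_REPLY_MS NOW_MS DOWN_AFTER_MS
# Succeeds when the time since the last valid PING reply exceeds
# down-after-milliseconds (rule 2).
is_sdown() {
    [ $(( $2 - $1 )) -gt "$3" ]
}

# is_odown AGREEING_SENTINELS QUORUM
# Succeeds when at least QUORUM Sentinels report the master as down (rule 4).
is_odown() {
    [ "$1" -ge "$2" ]
}

# Example with "down-after-milliseconds mymaster 60000" and quorum 2:
#   is_sdown 0 60001 60000   succeeds (more than 60000 ms of silence)
#   is_odown 2 2             succeeds (two Sentinels agree)
```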

Sentinel API

There are two ways to communicate with Sentinel: commands, and publish/subscribe.

Sentinel commands

PING: returns PONG.
SENTINEL masters: lists all monitored masters and their current status;
SENTINEL slaves <master name>: lists all slaves of the given master and their current status;
SENTINEL get-master-addr-by-name <master name>: returns the IP address and port number of the master with the given name. If a failover is in progress or has completed for this master, the command returns the IP address and port number of the new master;
SENTINEL reset <pattern>: resets all masters whose name matches the given pattern. The pattern argument is glob-style. The reset clears all current state of the master, including any failover in progress, and removes all slaves and Sentinels discovered so far and associated with the master;
SENTINEL failover <master name>: when the master has failed, forces an automatic failover to start without asking the other Sentinels for agreement.

A client can obtain the current master's IP address and port number with SENTINEL get-master-addr-by-name <master name>, and information about all slaves with SENTINEL slaves <master name>.
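When called through redis-cli from a script, SENTINEL get-master-addr-by-name prints the IP address and the port on two separate lines. The sketch below joins them into IP:PORT; the function name master_addr is hypothetical, and it reads the reply from standard input so that it can be tried without a running Sentinel.

```shell
#!/bin/bash
# master_addr: read the two-line reply (IP, then port) on stdin, print "IP:PORT".
# Fails if either line is missing.
master_addr() {
    ip=""; port=""
    read -r ip   || [ -n "$ip" ]   || return 1
    read -r port || [ -n "$port" ] || return 1
    [ -n "$ip" ] && [ -n "$port" ] || return 1
    echo "${ip}:${port}"
}

# Example (requires a running Sentinel; "mymaster" is the name in sentinel.conf):
#   redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster | master_addr
```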

Publish and subscribe messages

A client can treat Sentinel as a Redis server that provides only subscription functionality: you cannot use the PUBLISH command to send messages to it, but you can use the SUBSCRIBE or PSUBSCRIBE command to subscribe to a given channel and receive the corresponding event notifications.
A channel receives the events whose name matches the channel name. For example, the channel named +sdown receives every event in which an instance enters the subjectively down (SDOWN) state.
All events can be received by executing PSUBSCRIBE *.

+switch-master <master name> <oldip> <oldport> <newip> <newport>: the configuration has changed; the master's IP address and port have changed. This is the message most external users care about.

As you can see, Sentinel commands and publish/subscribe together are enough to integrate well with clients:
the get-master-addr-by-name and slaves commands return the current master and slave addresses and information, and when a failover switches the master, subscribing to the +switch-master event delivers the latest master information.
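A subscriber that receives a +switch-master message only needs the last two fields of the payload. The following sketch extracts them; the function name new_master_from_event is hypothetical, and the payload layout is the one given above (<master name> <oldip> <oldport> <newip> <newport>).

```shell
#!/bin/bash
# new_master_from_event PAYLOAD: print "NEWIP:NEWPORT" from a +switch-master
# payload, or fail if the payload does not have exactly five fields.
new_master_from_event() {
    # Intentionally unquoted: split the payload on whitespace into $1..$5.
    set -- $1
    [ "$#" -eq 5 ] || return 1
    echo "$4:$5"
}

# Example:
#   new_master_from_event "mymaster 192.168.0.100 6379 192.168.0.101 6379"
```

A client wrapper could call this on each +switch-master message and reconnect to the address it prints.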

PS: see the official documentation for more Sentinel events.

The notification-script option in sentinel.conf

Multiple "sentinel notification-script <master name> <script path>" entries can be configured in sentinel.conf, for example: sentinel notification-script mymaster ./check.sh

The specified script is executed when a failover is triggered in the cluster. If the script exits with 1, it is retried later (up to 10 retries); if it exits with 2, execution ends. The maximum execution time of a script is 60 seconds; a script that runs longer is terminated.
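The exit-code convention above can be sketched as follows. Sentinel calls a notification script with two arguments, the event type and the event description. The function name notify_event and the log path are assumptions made for illustration; the distinction between retryable and permanent errors is a design choice of this sketch.

```shell
#!/bin/bash
# Hypothetical body for a notification script such as ./check.sh.
# Sentinel passes: $1 = event type, $2 = event description.
LOGFILE="${LOGFILE:-/var/log/sentinel-notify.log}"   # assumed path

notify_event() {
    event_type="$1"; description="$2"
    # Missing arguments will not be fixed by retrying: return 2 (do not retry).
    [ -n "$event_type" ] || return 2
    # A failed write may be transient: return 1 (Sentinel retries later).
    echo "$(date) $event_type $description" >> "$LOGFILE" || return 1
    return 0
}

# In the real script the last line would be:
#   notify_event "$@"; exit $?
```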

PS: we currently see the script being executed several times. The explanation found is:
scripts are divided into two levels, sentinel_leader and sentinel_observer. The former is executed only by the leader Sentinel (one Sentinel); the latter is executed by every Sentinel monitoring the same master (multiple Sentinels).
