MHA high-availability failover Tool


MHA Introduction
MHA is a MySQL failover solution written in Perl by Yoshinori Matsunobu (DeNA, Japan) to provide high availability for MySQL database systems. It completes failover within a short downtime window (usually 10-30 seconds). Deploying MHA also helps avoid master data inconsistency and saves the cost of purchasing additional servers. It is easy to install and requires no changes to the existing deployment.

MHA Solution


In a MySQL master-slave replication architecture, when the master node fails, some (or all) slaves may not have received the latest binlog events. As a result, the slaves become inconsistent with the master and may even diverge from each other.
MHA eliminates these data differences between slaves, maximizes data consistency, and achieves high availability in the true sense. During failover, MHA: 1) saves binary log events (binlog events) from the crashed master; 2) identifies the slave with the most recent data; 3) applies the differential relay logs to the other slaves; 4) promotes one slave to be the new master; 5) points the other slaves at the new master and resumes replication.

MHA Architecture

MHA Manager can be deployed on an independent machine to manage multiple master-slave clusters, or it can run on one of the slave nodes. MHA Node runs on every MySQL server. MHA Manager periodically probes the master node in the cluster; when the master fails, it automatically promotes the slave with the most recent data to be the new master and then points all other slaves at it. The entire failover process is completely transparent to applications. During automatic failover, MHA tries to save the binary logs from the failed master to minimize data loss, but this is not always possible: if the master's hardware has failed or it cannot be reached over SSH, MHA cannot save the binary logs and performs the failover anyway, losing the most recent data. Semi-synchronous replication, available since MySQL 5.5, greatly reduces this risk and can be combined with MHA: as long as at least one slave has received the latest binary log events, MHA can apply them to all other slaves, keeping every node consistent.
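If you want to pair MHA with semi-synchronous replication as described above, here is a minimal sketch of enabling it on MySQL 5.5, assuming the bundled semisync plugins (semisync_master.so and semisync_slave.so) are available on your build; adapt it to your own instances:

# on the master instance
mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so';
mysql> set global rpl_semi_sync_master_enabled = 1;
mysql> set global rpl_semi_sync_master_timeout = 1000;    # milliseconds; the master falls back to asynchronous replication after this timeout

# on each slave instance
mysql> install plugin rpl_semi_sync_slave soname 'semisync_slave.so';
mysql> set global rpl_semi_sync_slave_enabled = 1;
mysql> stop slave io_thread; start slave io_thread;       # reconnect so the slave registers as semi-synchronous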

MHA automates the whole process from master monitoring to failover, and failover can also be executed manually. Its main features:

Failover within seconds

Any slave can be promoted to master

Install and uninstall MHA without stopping the running MySQL processes

MHA itself does not increase the server load, does not reduce performance, and does not require additional servers

Independent of Storage Engine

Does not depend on the binlog format (statement or row)

Ability to call external scripts at multiple points during failover, for example to power off the failed master or to fail over a virtual IP (see the sketch below)
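These hooks are declared in the manager configuration file. A hedged sketch of what that can look like (the script paths below are placeholders, not part of this deployment):

[server default]
master_ip_failover_script = /usr/local/bin/master_ip_failover    # move a virtual IP or update DNS when the master changes (placeholder path)
shutdown_script           = /usr/local/bin/power_manager         # power off the failed master to avoid split brain (placeholder path)
report_script             = /usr/local/bin/send_report           # send a notification after failover (placeholder path)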

MHA deployment environment: CentOS 6.7 x64, MySQL 5.5 multi-instance (master-slave replication already configured)

Master database (port 3306)                 /data/3306/    172.16.2.10    # master
Slave database (port 3307)                  /data/3307/    172.16.2.10    # candidate master (alternative master)
Slave database + MHA manager (port 3308)    /data/3308/    172.16.2.10    # slave

1) Although I use a local multi-instance environment, an SSH key must still be distributed so that the MHA management host can reach every node, and the slaves can reach each other, without entering a password. If the nodes run on different hosts, remember to distribute the public key to every server; the management node must also send the key to itself.

[root@db02 ~]# ssh-keygen -t rsa
[root@db02 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@172.16.2.10
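Before moving on, it is worth confirming that key-based login really works from the manager to every node; a simple check for this single-server example (masterha_check_ssh later in this article tests it more thoroughly):

[root@db02 ~]# ssh -o BatchMode=yes root@172.16.2.10 "hostname; date"    # must succeed without a password prompt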


2) Install the MHA node package on all nodes

[root@db02 ~]# yum install perl-DBD-MySQL -y                          # install the dependency package
[root@db02 ~]# rpm -ivh tools/mha4mysql-node-0.54-0.el6.noarch.rpm    # MHA node package (https://downloads.mariadb.com/files/MHA/mha4mysql-node-0.54-0.el6.noarch.rpm)

3) Install the manager and related dependency packages on the management node

[root@db02 ~]# yum install perl-DBD-MySQL -y
[root@db02 ~]# yum install perl-Config-Tiny -y
[root@db02 ~]# yum install perl-Log-Dispatch -y
[root@db02 ~]# yum install perl-Parallel-ForkManager -y
[root@db02 ~]# yum localinstall tools/mha4mysql-manager-0.55-0.el6.noarch.rpm    # localinstall resolves the circular dependency problem
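If the install succeeds, the manager's masterha_* tools should now be on the PATH; a quick sanity check (not part of the original steps):

[root@db02 ~]# rpm -ql mha4mysql-manager | grep bin/    # list the commands installed by the manager package
[root@db02 ~]# which masterha_manager masterha_check_ssh masterha_check_repl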


4) Add relay_log_purge = 0 to my.cnf on every slave server.
By default, MySQL master-slave replication deletes a slave's relay logs automatically once the SQL thread has applied them. In an MHA scenario, however, recovering a lagging slave may depend on the relay logs of another slave, so this automatic deletion must be disabled:

mysql> set global relay_log_purge = 0;

Edit the configuration file:

# my.cnf
[mysqld]
relay_log_purge = 0
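You can confirm the setting has taken effect on each slave:

mysql> show variables like 'relay_log_purge';    # should report OFF on every slave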


5) Add a management account on all node servers

mysql> grant all privileges on *.* to mha@'172.16.2.%' identified by 'mha';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
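It is also worth verifying from the manager host that the monitoring account can reach every instance (ports as listed in the environment above):

[root@db02 ~]# mysql -umha -pmha -h172.16.2.10 -P3306 -e "select version();"
[root@db02 ~]# mysql -umha -pmha -h172.16.2.10 -P3307 -e "select version();"
[root@db02 ~]# mysql -umha -pmha -h172.16.2.10 -P3308 -e "select version();"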

6) Configure /etc/mha/app1.cnf on the management node.

[root@db02 ~]# mkdir /etc/mha                  # create the MHA configuration folder
[root@db02 ~]# mkdir -p /var/log/mha/app1      # MHA log folder
[root@db02 ~]# vim /etc/mha/app1.cnf           # management-node configuration file

[server default]
manager_log = /var/log/mha/app1/manager.log    # manager log
manager_workdir = /var/log/mha/app1            # manager working directory
master_binlog_dir = /data/3306/                # where the master stores its binlogs, so MHA can find them
user = mha                                     # monitoring user
password = mha                                 # monitoring user password
ping_interval = 2                              # interval in seconds between ping packets sent to the master (default 3); failover is triggered after three failed attempts
repl_user = rep                                # master-slave replication user
repl_password = 123456                         # password of the master-slave replication user
ssh_user = root                                # user name for SSH connections

[server1]
hostname = 172.16.2.10
port = 3306

[server2]
candidate_master = 1                           # candidate master: after a switchover this slave is promoted to master, even if it does not hold the most recent data
check_repl_delay = 0                           # by default MHA will not pick a slave that lags behind the master by more than 100 MB of relay logs, because recovering it would take too long; with check_repl_delay = 0 MHA ignores replication delay when choosing the new master, which is what you want for a host with candidate_master = 1
hostname = 172.16.2.10
port = 3307

[server3]
hostname = 172.16.2.10
port = 3308



Check whether the MHA manager is configured successfully
1) Check SSH Login

[root@db02 ~]# masterha_check_ssh --conf=/etc/mha/app1.cnf
...
Sun Jul 10 23:43:10 2016 - [info] All SSH connection tests passed successfully.

2) Check whether MySQL master-slave replication is healthy

[root@db02 ~]# ln -s /application/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog    # run on all nodes
[root@db02 ~]# ln -s /application/mysql/bin/mysql /usr/bin/mysql                # run on all nodes
[root@db02 ~]# masterha_check_repl --conf=/etc/mha/app1.cnf
...
Mon Jul 11 00:11:14 2016 - [info] Checking replication health on 172.16.2.10..
Mon Jul 11 00:11:14 2016 - [info]  ok.
Mon Jul 11 00:11:14 2016 - [info] Checking replication health on 172.16.2.10..
Mon Jul 11 00:11:14 2016 - [info]  ok.
Mon Jul 11 00:06:56 2016 - [warning] shutdown_script is not defined.
Mon Jul 11 00:06:56 2016 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.


3) Start monitoring on the management node and test it

[root@db02 ~]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
[1] 100414

# --remove_dead_master_conf   when a master-slave switchover occurs, the old master's entry is removed from the configuration file
# --ignore_last_failover      by default, if MHA detects two failovers less than 8 hours apart it refuses to fail over again, to avoid a ping-pong effect; this is enforced through the app1.failover.complete file that MHA writes to its working directory after each switchover, and the next switchover is blocked unless that file is deleted first. For convenience, --ignore_last_failover tells MHA to ignore that file.

[root@db02 ~]# masterha_check_status --conf=/etc/mha/app1.cnf    # check the master and node status
app1 (pid:100414) is running(0:PING_OK), master:172.16.2.10
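If you need to stop monitoring cleanly (for maintenance, for example), the manager ships a companion command:

[root@db02 ~]# masterha_stop --conf=/etc/mha/app1.cnf            # stop the running masterha_manager process
[root@db02 ~]# masterha_check_status --conf=/etc/mha/app1.cnf    # should now report that the monitor is not running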



At this point the MHA configuration is complete. Let's stop the master and simulate a failure.
First, check the replication status on the 3308 instance (the slave that also hosts the manager):

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.16.2.10
                  Master_User: rep
                  Master_Port: 3306          # the master is still 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000023
          Read_Master_Log_Pos: 3887
               Relay_Log_File: relay-bin.000031
                Relay_Log_Pos: 701
        Relay_Master_Log_File: mysql-bin.000023
             Slave_IO_Running: Yes

Now stop the 3306 master:

[root@db02 ~]# mysqladmin -uroot -pli123456 -S /data/3306/mysql.sock shutdown

Let's take a look at the synchronization information.

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.16.2.10
                  Master_User: rep
                  Master_Port: 3307          # the master has switched to 3307, and the switchover was very fast
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 3529
               Relay_Log_File: relay-bin.000002

On the 3307 instance, show slave status no longer returns anything, which confirms that this slave has been automatically promoted to master by MHA:

mysql> show slave status\G
Empty set (0.00 sec)

Note: In production, when the master goes down and must be switched over within seconds, this MHA solution changes which DB in the cluster the applications talk to, which affects many backend web applications. We recommend having the applications connect to the database by host name (DNS or hosts-file resolution). Once the master goes down and MHA elects a new master, you can point that host name at the new master on all nodes in one step, avoiding configuration changes in every application and improving efficiency, as sketched below.
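A minimal sketch of that one-click idea, assuming the applications resolve the database through a host name such as db-master in /etc/hosts, and that web01 and web02 are placeholder names for the web servers:

# after MHA elects the new master, push its address to every web node's /etc/hosts
NEW_MASTER_IP=172.16.2.10
for web in web01 web02; do
    ssh root@$web "sed -i 's/^.* db-master$/$NEW_MASTER_IP db-master/' /etc/hosts"
done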
