Two MHA switching exceptions (masterha_master_switch line 53)
While testing manual failover and online master switching with MHA, I ran into two odd problems. Both tests failed when the hosts were specified by IP address, with the errors "Detected dead master xxx does not match with specified dead master" and "xxx is not alive". The two errors and their solution are described below.
1. MHA configuration file
[root@vdbsrv4 ~]# more /etc/masterha/app1.cnf
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
user=mha
password=xxx
ssh_user=root
repl_user=repl
repl_password=repl
ping_interval=1
shutdown_script=""
master_ip_online_change_script=""
report_script=""
#master_ip_failover_script=/usr/bin/master_ip_failover
master_ip_failover_script=/tmp/master_ip_failover
[server1]
hostname=vdbsrv1
master_binlog_dir=/data/mysqldata
[server2]
hostname=vdbsrv2
master_binlog_dir=/data/mysqldata
[server3]
hostname=vdbsrv3
master_binlog_dir=/data/mysqldata/
#candidate_master=1
2. Error message during manual failover
[root@vdbsrv4 ~]# masterha_master_switch --master_state=dead --conf=/etc/masterha/app1.cnf --dead_master_host=192.168.1.6 \
> --dead_master_port=3306 --new_master_host=192.168.1.8 --new_master_port=3306 --ignore_last_failover
--dead_master_ip=<dead_master_ip> is not set. Using 192.168.1.6.
Wed Apr 21 09:08:30 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Apr 21 09:08:30 2015 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Wed Apr 21 09:08:30 2015 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Wed Apr 21 09:08:30 2015 - [info] MHA::MasterFailover version 0.56.
Wed Apr 21 09:08:30 2015 - [info] Starting master failover.
Wed Apr 21 09:08:30 2015 - [info]
Wed Apr 21 09:08:30 2015 - [info] * Phase 1: Configuration Check Phase..
Wed Apr 21 09:08:30 2015 - [info]
Wed Apr 21 09:08:31 2015 - [info] GTID failover mode = 0
Wed Apr 21 09:08:31 2015 - [error] [/usr/lib/perl5/site_perl/5.8.8/MHA/MasterFailover.pm, ln2083] Detected dead master vdbsrv1(192.168.1.6:3306)
does not match with specified dead master 192.168.1.6(192.168.1.6:3306)!
Wed Apr 21 09:08:31 2015 - [error] [/usr/lib/perl5/site_perl/5.8.8/MHA/MasterFailover.pm, ln2151]
Got ERROR: at /usr/bin/masterha_master_switch line 53
3. Error message during online switching
[root@vdbsrv4 ~]# masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=192.168.1.8 \
> --orig_master_is_new_slave --running_updates_limit=10000
Tue Apr 21 11:50:14 2015 - [info] MHA::MasterRotate version 0.56.
Tue Apr 21 11:50:14 2015 - [info] Starting online master switch..
Tue Apr 21 11:50:14 2015 - [info]
Tue Apr 21 11:50:14 2015 - [info] * Phase 1: Configuration Check Phase..
Tue Apr 21 11:50:14 2015 - [info]
Tue Apr 21 11:50:14 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Apr 21 11:50:14 2015 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Tue Apr 21 11:50:14 2015 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Tue Apr 21 11:50:14 2015 - [info] GTID failover mode = 0
Tue Apr 21 11:50:14 2015 - [info] Current Alive Master: vdbsrv1(192.168.1.6:3306)
Tue Apr 21 11:50:14 2015 - [info] Alive Slaves:
Tue Apr 21 11:50:14 2015 - [info]   vdbsrv2(192.168.1.7:3306)  Version=5.6.22-log (oldest major version between slaves) log-bin:enabled
Tue Apr 21 11:50:14 2015 - [info]     Replicating from 192.168.1.6(192.168.1.6:3306)
Tue Apr 21 11:50:14 2015 - [info]   vdbsrv3(192.168.1.8:3306)  Version=5.6.22-log (oldest major version between slaves) log-bin:enabled
Tue Apr 21 11:50:14 2015 - [info]     Replicating from 192.168.1.6(192.168.1.6:3306)
It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it OK to execute on vdbsrv1(192.168.1.6:3306)? (YES/no): yes
Tue Apr 21 11:50:41 2015 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Tue Apr 21 11:50:41 2015 - [info] OK.
Tue Apr 21 11:50:41 2015 - [info] Checking MHA is not monitoring or doing failover..
Tue Apr 21 11:50:41 2015 - [info] Checking replication health on vdbsrv2..
Tue Apr 21 11:50:41 2015 - [info] OK.
Tue Apr 21 11:50:41 2015 - [info] Checking replication health on vdbsrv3..
Tue Apr 21 11:50:41 2015 - [info] OK.
Tue Apr 21 11:50:41 2015 - [error] [/usr/lib/perl5/site_perl/5.8.8/MHA/MasterRotate.pm, ln228] 192.168.1.8 is not alive!
Tue Apr 21 11:50:41 2015 - [error] [/usr/lib/perl5/site_perl/5.8.8/MHA/MasterRotate.pm, ln613] Failed to get new master!
Tue Apr 21 11:50:41 2015 - [error] [/usr/lib/perl5/site_perl/5.8.8/MHA/MasterRotate.pm, ln652] Got ERROR: at /usr/bin/masterha_master_switch line 53
4. Solution
After replacing the IP addresses on the command line with the host names, both problems were solved.
The official documentation describes the parameter as --dead_master_host=(hostname): it expects a host name, not an IP address.
According to the same documentation, if they are not set, --dead_master_ip defaults to the result of gethostbyname(dead_master_host) and --dead_master_port defaults to 3306.
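The fix can be backed by a small pre-check that runs before masterha_master_switch. The following is only a sketch. It assumes, as the two errors above suggest, that the host arguments have to match a hostname entry in the MHA configuration file. The function name mha_host_in_conf is a hypothetical helper and not part of MHA.

```shell
# Hypothetical helper (not part of MHA): succeeds only if the value in $2
# appears as a "hostname" entry in the MHA configuration file given in $1.
mha_host_in_conf() {
    awk -F= -v want="$2" '
        {
            key = $1
            val = $2
            # Trim surrounding blanks so "hostname = vdbsrv1" also matches.
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (tolower(key) == "hostname" && val == want) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

With the configuration from section 1, mha_host_in_conf /etc/masterha/app1.cnf vdbsrv1 succeeds, while mha_host_in_conf /etc/masterha/app1.cnf 192.168.1.6 fails. A wrapper could therefore refuse to call masterha_master_switch until --dead_master_host and --new_master_host are given as vdbsrv1 and vdbsrv3 rather than as 192.168.1.6 and 192.168.1.8.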