Environment: CentOS 5.8
MySQL 5.5.17
Experiment: set up the MHA high-availability architecture with SSH equivalence (passwordless key-based login) configured for a non-root user.
SSH equivalence user: concert, SSH port: 1314
MHA configuration files
$ more /etc/masterha_default.cnf
[server default]
user=root
password=mysql_admin
ssh_user=concert
ssh_port=1314
repl_user=repl
repl_password=repl_pwd
ping_interval=3
ping_type=select
$ more /etc/appl.cnf
[server default]
manager_workdir=/MHA/appl
manager_log=/MHA/appl/manager.log
remote_workdir=/MHA/appl
[server1]
hostname=192.168.66.88
master_binlog_dir=/data/lib/mysql
candidate_master=1
[server2]
hostname=192.168.66.89
master_binlog_dir=/data/lib/mysql
candidate_master=1
[server3]
hostname=192.168.66.120
no_master=1
port=3307
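Anything that has to be repeated on every database host (for example a permission check) needs the host list from the application configuration. The following is a minimal sketch, assuming one hostname line per [serverN] section; list_mha_hosts is a hypothetical helper name, not a tool shipped with MHA.

```shell
# Hypothetical helper (not part of MHA): print the value of every
# "hostname" line in an MHA application configuration file.
list_mha_hosts() {
    # $1: path to the configuration file
    awk -F= '
        # Match the key case-insensitively and allow padding around it.
        tolower($1) ~ /^[ \t]*hostname[ \t]*$/ {
            value = $2
            gsub(/[ \t\r]/, "", value)   # strip padding around the value
            print value
        }
    ' "$1"
}
```

It could then drive a loop such as: for h in $(list_mha_hosts /etc/appl.cnf); do ssh -p 1314 concert@"$h" hostname; done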
Problem: after SSH equivalence is configured for the non-root user, the masterha_check_ssh check succeeds.
$ /usr/bin/masterha_check_ssh --conf=/etc/appl.cnf
Tue Sep 2 15:06:01 2014 - [info] Reading default configurations from /etc/masterha_default.cnf..
Tue Sep 2 15:06:01 2014 - [info] Reading application default configurations from /etc/appl.cnf..
Tue Sep 2 15:06:01 2014 - [info] Reading server configurations from /etc/appl.cnf..
Tue Sep 2 15:06:01 2014 - [info] Starting SSH connection tests..
Tue Sep 2 15:06:01 2014 - [debug]
Tue Sep 2 15:06:01 2014 - [debug] Connecting via SSH from concert@192.168.66.88(192.168.66.88:1314) to concert@192.168.66.89(192.168.66.89:1314)..
Tue Sep 2 15:06:01 2014 - [debug] ok.
Tue Sep 2 15:06:01 2014 - [debug] Connecting via SSH from concert@192.168.66.88(192.168.66.88:1314) to concert@192.168.66.120(192.168.66.120:1314)..
Tue Sep 2 15:06:01 2014 - [debug] ok.
Tue Sep 2 15:06:02 2014 - [debug]
Tue Sep 2 15:06:01 2014 - [debug] Connecting via SSH from concert@192.168.66.89(192.168.66.89:1314) to concert@192.168.66.88(192.168.66.88:1314)..
Tue Sep 2 15:06:01 2014 - [debug] ok.
Tue Sep 2 15:06:01 2014 - [debug] Connecting via SSH from concert@192.168.66.89(192.168.66.89:1314) to concert@192.168.66.120(192.168.66.120:1314)..
Tue Sep 2 15:06:02 2014 - [debug] ok.
Tue Sep 2 15:06:02 2014 - [debug]
Tue Sep 2 15:06:02 2014 - [debug] Connecting via SSH from concert@192.168.66.120(192.168.66.120:1314) to concert@192.168.66.88(192.168.66.88:1314)..
Tue Sep 2 15:06:02 2014 - [debug] ok.
Tue Sep 2 15:06:02 2014 - [debug] Connecting via SSH from concert@192.168.66.120(192.168.66.120:1314) to concert@192.168.66.89(192.168.66.89:1314)..
Tue Sep 2 15:06:02 2014 - [debug] ok.
Tue Sep 2 15:06:02 2014 - [info] All SSH connection tests passed successfully.
However, the masterha_check_repl check fails.
$ /usr/bin/masterha_check_repl --conf=/etc/appl.cnf
Tue Sep 2 17:10:08 2014 - [info] Reading default configurations from /etc/masterha_default.cnf..
Tue Sep 2 17:10:08 2014 - [info] Reading application default configurations from /etc/appl.cnf..
Tue Sep 2 17:10:08 2014 - [info] Reading server configurations from /etc/appl.cnf..
Tue Sep 2 17:10:08 2014 - [info] MHA::MasterMonitor version 0.55.
Tue Sep 2 17:10:08 2014 - [info] Dead Servers:
Tue Sep 2 17:10:08 2014 - [info] Alive Servers:
Tue Sep 2 17:10:08 2014 - [info] 192.168.66.88(192.168.66.88:3306)
Tue Sep 2 17:10:08 2014 - [info] 192.168.66.89(192.168.66.89:3306)
Tue Sep 2 17:10:08 2014 - [info] 192.168.66.120(192.168.66.120:3307)
Tue Sep 2 17:10:08 2014 - [info] Alive Slaves:
Tue Sep 2 17:10:08 2014 - [info] 192.168.66.89(192.168.66.89:3306) Version=5.5.17-log (oldest major version between slaves) log-bin:enabled
Tue Sep 2 17:10:08 2014 - [info] Replicating from 192.168.66.88(192.168.66.88:3306)
Tue Sep 2 17:10:08 2014 - [info] Primary candidate for the new Master (candidate_master is set)
Tue Sep 2 17:10:08 2014 - [info] 192.168.66.120(192.168.66.120:3307) Version=5.5.17-log (oldest major version between slaves) log-bin:enabled
Tue Sep 2 17:10:08 2014 - [info] Replicating from 192.168.66.88(192.168.66.88:3306)
Tue Sep 2 17:10:08 2014 - [info] Not candidate for the new Master (no_master is set)
Tue Sep 2 17:10:08 2014 - [info] Current Alive Master: 192.168.66.88(192.168.66.88:3306)
Tue Sep 2 17:10:08 2014 - [info] Checking slave configurations..
Tue Sep 2 17:10:08 2014 - [info] Checking replication filtering settings..
Tue Sep 2 17:10:08 2014 - [info] binlog_do_db= , binlog_ignore_db=
Tue Sep 2 17:10:08 2014 - [info] Replication filtering check ok.
Tue Sep 2 17:10:08 2014 - [info] Starting SSH connection tests..
Tue Sep 2 17:10:10 2014 - [error][/usr/lib/perl5/vendor_perl/MHA/MasterMonitor.pm, ln386] Error happend on checking configurations. SSH Configuration Check Failed!
at /usr/lib/perl5/vendor_perl/MHA/MasterMonitor.pm line 341
Tue Sep 2 17:10:10 2014 - [error][/usr/lib/perl5/vendor_perl/MHA/MasterMonitor.pm, ln482] Error happened on monitoring servers.
Tue Sep 2 17:10:10 2014 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
Solution:
1. Grant permissions on the remote_workdir working directory on every server that runs a MySQL instance. MHA writes its log files there, so set the directory owner to concert.
# chown -R concert:concert /MHA/
2. Add concert to the mysql user group so that it has permission to read the MySQL binary and relay log files and the relay-log.info file, and to write to the log directory.
# usermod -G mysql concert
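Before rerunning the MHA check, the effect of the two steps above can be verified by hand. The following is only a sketch under assumptions: the helper names are hypothetical and not part of MHA, and the checks are meant to be run as concert on each MySQL host.

```shell
# Sketch only: hypothetical helpers, not part of MHA.

# Return 0 if the current user can create and remove a file in directory $1
# (what step 1 is meant to guarantee for remote_workdir).
can_write_dir() {
    [ -d "$1" ] || return 1
    probe="$1/.write_probe.$$"
    # touch fails when the directory is not writable by this user
    touch "$probe" 2>/dev/null || return 1
    rm -f "$probe"
}

# Return 0 if group $2 appears in the space-separated group list $1,
# for example the output of: id -Gn concert
has_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Return 0 if every file given as an argument is readable by the current user
# (what step 2 is meant to guarantee for the binary and relay logs).
all_readable() {
    for f in "$@"; do
        [ -r "$f" ] || { echo "cannot read: $f" >&2; return 1; }
    done
    return 0
}
```

Typical calls would be can_write_dir /MHA/appl, has_group "$(id -Gn concert)" mysql, and all_readable "$dir"/mysql-bin.* "$dir"/relay-log.info with dir set to that host's binary log directory. Two points worth keeping in mind: usermod -G replaces the user's whole supplementary group list (usermod -a -G appends to it instead), and a new group membership only applies to login sessions started after the change.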
Run the check again:
$ /usr/bin/masterha_check_repl --conf=/etc/appl.cnf
Wed Sep 3 22:27:41 2014 - [info] Reading default configurations from /etc/masterha_default.cnf..
Wed Sep 3 22:27:41 2014 - [info] Reading application default configurations from /etc/appl.cnf..
Wed Sep 3 22:27:41 2014 - [info] Reading server configurations from /etc/appl.cnf..
Wed Sep 3 22:27:41 2014 - [info] MHA::MasterMonitor version 0.55.
Wed Sep 3 22:27:41 2014 - [info] Dead Servers:
Wed Sep 3 22:27:41 2014 - [info] Alive Servers:
Wed Sep 3 22:27:41 2014 - [info] 192.168.66.88(192.168.66.88:3306)
Wed Sep 3 22:27:41 2014 - [info] 192.168.66.89(192.168.66.89:3306)
Wed Sep 3 22:27:41 2014 - [info] 192.168.66.120(192.168.66.120:3307)
Wed Sep 3 22:27:41 2014 - [info] Alive Slaves:
Wed Sep 3 22:27:41 2014 - [info] 192.168.66.89(192.168.66.89:3306) Version=5.5.17-log (oldest major version between slaves) log-bin:enabled
Wed Sep 3 22:27:41 2014 - [info] Replicating from 192.168.66.88(192.168.66.88:3306)
Wed Sep 3 22:27:41 2014 - [info] Primary candidate for the new Master (candidate_master is set)
Wed Sep 3 22:27:41 2014 - [info] 192.168.66.120(192.168.66.120:3307) Version=5.5.17-log (oldest major version between slaves) log-bin:enabled
Wed Sep 3 22:27:41 2014 - [info] Replicating from 192.168.66.88(192.168.66.88:3306)
Wed Sep 3 22:27:41 2014 - [info] Not candidate for the new Master (no_master is set)
Wed Sep 3 22:27:41 2014 - [info] Current Alive Master: 192.168.66.88(192.168.66.88:3306)
Wed Sep 3 22:27:41 2014 - [info] Checking slave configurations..
Wed Sep 3 22:27:41 2014 - [info] Checking replication filtering settings..
Wed Sep 3 22:27:41 2014 - [info] binlog_do_db= , binlog_ignore_db=
Wed Sep 3 22:27:41 2014 - [info] Replication filtering check ok.
Wed Sep 3 22:27:41 2014 - [info] Starting SSH connection tests..
Wed Sep 3 22:27:42 2014 - [info] All SSH connection tests passed successfully.
Wed Sep 3 22:27:42 2014 - [info] Checking MHA Node version..
Wed Sep 3 22:27:43 2014 - [info] Version check ok.
Wed Sep 3 22:27:43 2014 - [info] Checking SSH publickey authentication settings on the current master..
Wed Sep 3 22:27:43 2014 - [info] HealthCheck: SSH to 192.168.66.88 is reachable.
Wed Sep 3 22:27:43 2014 - [info] Master MHA Node version is 0.54.
Wed Sep 3 22:27:43 2014 - [info] Checking recovery script configurations on the current master..
Wed Sep 3 22:27:43 2014 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/lib/mysql --output_file=/MHA/appl/save_binary_logs_test --manager_version=0.55 --start_file=mysql-bin.000004
Wed Sep 3 22:27:43 2014 - [info] Connecting to concert@192.168.66.88(192.168.66.88)..
Creating /MHA/appl if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /data/lib/mysql, up to mysql-bin.000004
Wed Sep 3 22:27:43 2014 - [info] Master setting check done.
Wed Sep 3 22:27:43 2014 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Wed Sep 3 22:27:43 2014 - [info] Executing command: apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.66.89 --slave_ip=192.168.66.89 --slave_port=3306 --workdir=/MHA/appl --target_version=5.5.17-log --manager_version=0.55 --relay_log_info=/data/lib/mysql/relay-log.info --relay_dir=/data/lib/mysql/ --slave_pass=xxx
Wed Sep 3 22:27:43 2014 - [info] Connecting to concert@192.168.66.89(192.168.66.89:1314)..
Checking slave recovery environment settings..
Opening /data/lib/mysql/relay-log.info ... ok.
Relay log found at /data/lib/mysql, up to mysql-relay-bin.000006
Temporary relay log file is /data/lib/mysql/mysql-relay-bin.000006
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Wed Sep 3 22:27:43 2014 - [info] Executing command: apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.66.120 --slave_ip=192.168.66.120 --slave_port=3307 --workdir=/MHA/appl --target_version=5.5.17-log --manager_version=0.55 --relay_log_info=/data/lib/mysqlb/relay-log.info --relay_dir=/data/lib/mysqlb/ --slave_pass=xxx
Wed Sep 3 22:27:43 2014 - [info] Connecting to concert@192.168.66.120(192.168.66.120:1314)..
Checking slave recovery environment settings..
Opening /data/lib/mysqlb/relay-log.info ... ok.
Relay log found at /data/lib/mysqlb, up to mysql-relay-bin.000005
Temporary relay log file is /data/lib/mysqlb/mysql-relay-bin.000005
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Wed Sep 3 22:27:44 2014 - [info] Slaves settings check done.
Wed Sep 3 22:27:44 2014 - [info]
192.168.66.88 (current master)
 +--192.168.66.89
 +--192.168.66.120
Wed Sep 3 22:27:44 2014 - [info] Checking replication health on 192.168.66.89..
Wed Sep 3 22:27:44 2014 - [info] ok.
Wed Sep 3 22:27:44 2014 - [info] Checking replication health on 192.168.66.120..
Wed Sep 3 22:27:44 2014 - [info] ok.
Wed Sep 3 22:27:44 2014 - [warning] master_ip_failover_script is not defined.
Wed Sep 3 22:27:44 2014 - [warning] shutdown_script is not defined.
Wed Sep 3 22:27:44 2014 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
OK! Problem solved.
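When the check is rerun repeatedly, reading the whole log each time is tedious. The following is a minimal sketch, assuming the check output has been saved to a file; repl_check_passed is a hypothetical helper, and it only looks for the error markers and the final health line seen in the logs above.

```shell
# Hypothetical helper (not part of MHA): decide from saved
# masterha_check_repl output whether the check passed.
repl_check_passed() {
    # $1: file containing the captured output
    # Any [error] line counts as a failure; matching lines go to stderr.
    if grep -i '\[error\]' "$1" >&2; then
        return 1
    fi
    # "is NOT OK" does not contain "is ok", so only the success line matches.
    grep -qi 'replication health is ok' "$1"
}
```

For example: /usr/bin/masterha_check_repl --conf=/etc/appl.cnf > /tmp/check_repl.log 2>&1; repl_check_passed /tmp/check_repl.log && echo "replication healthy" (the log path here is an arbitrary choice).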
MHA + non-root user SSH equivalence configuration