MHA high availability deployment and Testing
MHA high availability deployment and testing, MySQL high availability solution
Directory
I. MHA features
Ii. MHA Mechanism
Iii. Master-slave architecture applicable to MHA
Iv. Construction of MHA high-availability environment
4.1 experiment environment
4.2 experiment steps
4.3 script description
4.4 MHA deployment process
4.5 VIP configuration
5. Common MHA commands
Vi. Precautions
7. Problems Encountered during deployment
I. MHA features
1. Automatic Monitoring and Failover of the master server
MHA monitors the master server in the replication architecture. Once a master server fault is detected, failover is automatically performed. Even if some slave servers do not receive the latest relay log, MHA automatically identifies the differential relay log from the latest slave server and applies the log to other slave servers, therefore, all slave servers are consistent. MHA usually completes failover within several seconds. It can detect faults on the primary server in 9-12 seconds, and shut down the faulty primary server in 7-10 seconds to avoid split-brain, the application differential relay log in a few seconds is sent to the new master server. The entire process can be completed within 10-30 s. You can also set a priority to specify a Server Load balancer instance as the master candidate. Because MHA fixes consistency between slaves, any slave can be changed to a new master without consistency issues, resulting in replication failure.
2. Interactive master server failover
Zookeeper can only use MHA failover instead of monitoring the master server. When the master server fails, MHA is manually called for failover.
3. non-interactive master failover
Zookeeper does not monitor the master server, but automatically implements failover. This feature applies to the use of other software to monitor the status of the master server, such as heartbeat to detect master server faults and virtual IP address takeover. MHA can be used to achieve failover and upgrade the slave server to the master server.
4. Switch the master server online
Zookeeper needs to migrate existing primary servers to another server in many cases. For example, if the hardware of the primary server is faulty, the RAID controller must be rebuilt to move the primary server to a server with better performance. Maintenance of the master server causes performance degradation, resulting in downtime at least Data Writing. In addition, blocking or killing the currently running session will cause data inconsistency between the master and the master. MHA provides fast switching and elegant blocking of writing. This switching process only takes 0.5-2 seconds, during which data cannot be written. In many cases, 0.5-2 seconds of blocking write is acceptable. Therefore, you do not need to schedule the maintenance time window for switching the master server (haha, you do not need to complete the task of switching the master server all night long when the night is dark ).
Ii. MHA Mechanism
MHA automatic Failover process Parsing
Http://www.mysqlsystems.com/2012/03/figure-out-process-of-autofailover-on-mha.html
Https://code.google.com/p/mysql-master-ha/wiki/Sequences_of_MHA
Iii. Master-slave architecture applicable to MHA
Https://code.google.com/p/mysql-master-ha/wiki/UseCases
Iv. Construction of MHA high-availability environment
4.1 experiment environment
• Node1: 192.168.10.216 (master)
• Node2: 192.168.10.217 (slave and slave for Master failover)
• Node3: 192.168.10.218 (slave, and MHA management node)
• VIP: 192.168.10.219
• Mysql: Percona-Server-5.6.16-rel64.2-569
• The above node systems are CentOS6.5 x64
4.2 experiment steps
A. Configure the epel yum source at three nodes and install the relevant dependency packages.
B. Establish a master-slave replication relationship
C. ssh-keygen for mutual key-free login between the three machines
D. Three-node installation mha4mysql-node-0.56, mha4mysql-manager-0.56 installed on node3
E. Manage the MHA configuration file on node3
F. masterha_check_ssh verify that the ssh trusted logon is successful, and masterha_check_repl verify that mysql replication is successful.
G. Start MHA manager and monitor log files
H. Test whether automatic switch is normal after mysql of master (Node1) is down
I. Configure the VIP. After the switch, the master service is automatically taken over and transparent to the client.
4.3 script description
MHA node has three scripts, depending on the perl Module
Save_binary_logs: stores and copies the binary logs of the Active Server.
Apply_diff_relay_logs: identifies differential relay log events and applies them to all slave server nodes.
Purge_relay_logs: clears the relay log file.
4.4 MHA deployment process
A. Configure the epel yum source at three nodes and install the relevant dependency packages.
Rpm-Uvh http://dl.Fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
Rpm -- import/etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
Yum-y install perl-DBD-MySQL ncftp
B. Establish a master-slave replication relationship
On node1:
Mysql> grant replication slave on *. * to 'rep '@ '192. 192. %' identified by 'geekwolf ';
Mysql> grant all on *. * to 'root' @ '192. 192.% 'identified by 'geekwolf ';
Mysql> show master status;
Copy the data directory of node1 and synchronize it to node2, node3.
On node2 node3:
Mysql> change master to master_host = '2017. 168.10.216 ', master_user = 'rep', master_password = 'geekwolf ', master_port = 3306, master_log_file = 'mysql-in.000006', master_log_pos = 120, master_connect_retry = 1;
Mysql> start slave;
Each node has a soft link for mysql commands.
Ln-s/usr/local/mysql/bin/*/usr/local/bin/
C. ssh-keygen for mutual key-free login between the three machines
Run
Ssh-keygen-t rsa
Ssh-copy-id-I/root/. ssh/id_rsa.pub root @ node1
Ssh-copy-id-I/root/. ssh/id_rsa.pub root @ node2
Ssh-copy-id-I/root/. ssh/id_rsa.pub root @ node3
D. Three-node installation mha4mysql-node-0.56, mha4mysql-manager-0.56 installed on node3
Install mha4mysql-node in node1 node2 node3
Wget https://googledrive.com/host/0B1lu97m8-haWeHdGWXp0YVVUSlk/mha4mysql-node-0.56.tar.gz
Tar xf mha4mysql-node-0.56.tar.gz
Cd mha4mysql-node
Perl Makefile. PL
Make & make install
Install mha4mysql-manager on node3
Wget https://googledrive.com/host/0B1lu97m8-haWeHdGWXp0YVVUSlk/mha4mysql-manager-0.56.tar.gz
Tar xf mha4mysql-manager-0.56.tar.gz
Cd mha4mysql-manager-0.56
Yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Config-IniFiles perl-Time-HiRes
E. Manage the MHA configuration file on node3
Mkdir-p/etc/mha/{app1, scripts}
Cp mha4mysql-manager-0.56/samples/conf/etc/mha/
Cp mha4mysql-manager-0.56/samples/scripts/etc/mha/scripts/
Mv/etc/mha/app1.cnf/etc/mha/app1/
Mv/etc/mha/masterha_default.cnf/etc/masterha_default.cnf
Set global Configuration:
Vim/etc/mha/masterha_default.cnf
[Server default]
User = root
Password = geekwolf
Ssh_user = root
Repl_user = rep
Repl_password = geekwolf
Ping_interval = 1
# Shutdown_script = ""
Secondary_check_script = masterha_secondary_check-s node1-s node2-s node3 -- user = root -- master_host = node 1 -- master_ip = 192.168.10.216 -- master_port = 3306
# Master_ip_failover_script = "/etc/mha/scripts/master_ip_failover"
# Master_ip_online_change_script = "/etc/mha/scripts/master_ip_online_change"
# Shutdown_script =/script/masterha/power_manager
# Report_script = ""
Vim/etc/mha/app1/app1.cnf
[Server default]
Manager_workdir =/var/log/mha/app1
Manager_log =/var/log/mha/app1/manager. log
[Server1]
Hostname = node1
Master_binlog_dir = "/usr/local/mysql/logs"
Candidate_master = 1
[Server2]
Hostname = node2
Master_binlog_dir = "/usr/local/mysql/logs"
Candidate_master = 1
[Server3]
Hostname = node3
Master_binlog_dir = "/usr/local/mysql/logs"
No_master = 1
Note:
Required bytes candidate_master = 1 indicates that the host can be preferentially elected as the new master. When multiple [serverX] and other parameters are set, the priority is determined by the order in which [serverX] is configured.
Secondary_check_script mha strongly recommends that you have two or more network lines to check the availability of the MySQL master server. By default, there is only one route for MHA Manager checks: from Manager to Master, but this is not desirable. MHA can actually have two or more check routes by calling an external script to define the secondary check Script Parameters
Master_ip_failover_script calls this script when MySQL is upgraded from the server to the new master server. Therefore, the vip information can be written to this configuration file.
Master_ip_online_change_script this script is called when you use the masterha_master_switch command to manually switch to the MySQL master server. The parameters are similar to master_ip_failover_script, and scripts can be used with each other.
Shutdown_script this script (the script in samples by default) uses the remote control IDRAC of the server, and forces the shutdown using ipmitool to prevent the fence device from restarting the master server, resulting in Brain column Phenomena
When the new master server switchover is complete, send the mail report through this script, refer to using http://caspian.dotconf.net/menu/Software/SendEmail/sendEmail-v1.56.tar.gz
The scripts involved above can be modified from the mha4mysql-manager-0.56/samples/scripts/* Copy Using
Other manager detailed configuration parameter https://code.google.com/p/mysql-master-ha/wiki/Parameters
F. masterha_check_ssh verify that the ssh trusted logon is successful, and masterha_check_repl verify that mysql replication is successful.
Verify ssh trust: masterha_check_ssh-conf =/etc/mha/app1/app1.cnf
[Root @ localhost ~] # Masterha_check_ssh -- conf =/etc/mha/app1/app1.cnf
Tue May 13 07:53:15 2014-[warning] Global configuration file/etc/masterha_default.cnf not found. Skipping.
Tue May 13 07:53:15 2014-[info] Reading application default configuration from/etc/mha/app1/app1.cnf ..
Tue May 13 07:53:15 2014-[info] Reading server configuration from/etc/mha/app1/app1.cnf ..
Tue May 13 07:53:15 2014-[info] Starting SSH connection tests ..
Tue May 13 07:53:16 2014-[debug]
Tue May 13 07:53:15 2014-[debug] Connecting via SSH from root @ node1 (192.168.10.216: 22) to root @ node2 (192.168.10.217: 22 )..
Tue May 13 07:53:15 2014-[debug] OK.
Tue May 13 07:53:15 2014-[debug] Connecting via SSH from root @ node1 (192.168.10.216: 22) to root @ node3 (192.168.10.218: 22 )..
Tue May 13 07:53:16 2014-[debug] OK.
Tue May 13 07:53:16 2014-[debug]
Tue May 13 07:53:16 2014-[debug] Connecting via SSH from root @ node2 (192.168.10.217: 22) to root @ node1 (192.168.10.216: 22 )..
Tue May 13 07:53:16 2014-[debug] OK.
Tue May 13 07:53:16 2014-[debug] Connecting via SSH from root @ node2 (192.168.10.217: 22) to root @ node3 (192.168.10.218: 22 )..
Tue May 13 07:53:16 2014-[debug] OK.
Tue May 13 07:53:17 2014-[debug]
Tue May 13 07:53:16 2014-[debug] Connecting via SSH from root @ node3 (192.168.10.218: 22) to root @ node1 (192.168.10.216: 22 )..
Tue May 13 07:53:16 2014-[debug] OK.
Tue May 13 07:53:16 2014-[debug] Connecting via SSH from root @ node3 (192.168.10.218: 22) to root @ node2 (192.168.10.217: 22 )..
Tue May 13 07:53:17 2014-[debug] OK.
Tue May 13 07:53:17 2014-[info] All SSH connection tests passed successfully.
Verify master-slave replication: masterha_check_repl-conf =/etc/mha/app1/app1.cnf
[Root @ localhost mha] # masterha_check_repl -- conf =/etc/mha/app1/app1.cnf
Tue May 13 08:10:54 2014-[info] Reading default configuration from/etc/masterha_default.cnf ..
Tue May 13 08:10:54 2014-[info] Reading application default configuration from/etc/mha/app1/app1.cnf ..
Tue May 13 08:10:54 2014-[info] Reading server configuration from/etc/mha/app1/app1.cnf ..
Tue May 13 08:10:54 2014-[info] MHA: MasterMonitor version 0.56.
Tue May 13 08:10:54 2014-[info] GTID failover mode = 0
Tue May 13 08:10:54 2014-[info] Dead Servers:
Tue May 13 08:10:54 2014-[info] Alive Servers:
Tue May 13 08:10:54 2014-[info] node1 (192.168.10.216: 3306)
Tue May 13 08:10:54 2014-[info] node2 (192.168.10.217: 3306)
Tue May 13 08:10:54 2014-[info] node3 (192.168.10.218: 3306)
Tue May 13 08:10:54 2014-[info] Alive Slaves:
Tue May 13 08:10:54 2014-[info] node2 (192.168.10.217: 3306) Version = 5.6.16-64.2-rel64.2-log (oldest major version between slaves) log-bin: enabled
Tue May 13 08:10:54 2014-[info] Replicating from 192.168.10.216 (192.168.10.216: 3306)
Tue May 13 08:10:54 2014-[info] Primary candidate for the new Master (candidate_master is set)
Tue May 13 08:10:54 2014-[info] node3 (192.168.10.218: 3306) Version = 5.6.16-64.2-rel64.2-log (oldest major version between slaves) log-bin: enabled
Tue May 13 08:10:54 2014-[info] Replicating from 192.168.10.216 (192.168.10.216: 3306)
Tue May 13 08:10:54 2014-[info] Not candidate for the new Master (no_master is set)
Tue May 13 08:10:54 2014-[info] Current Alive Master: node1 (192.168.10.216: 3306)
Tue May 13 08:10:54 2014-[info] Checking slave invocations ..
Tue May 13 08:10:54 2014-[info] read_only = 1 is not set on slave node2 (192.168.10.217: 3306 ).
Tue May 13 08:10:54 2014-[warning] relay_log_purge = 0 is not set on slave node2 (192.168.10.217: 3306 ).
Tue May 13 08:10:54 2014-[info] read_only = 1 is not set on slave node3 (192.168.10.218: 3306 ).
Tue May 13 08:10:54 2014-[warning] relay_log_purge = 0 is not set on slave node3 (192.168.10.218: 3306 ).
Tue May 13 08:10:54 2014-[info] Checking replication filtering settings ..
Tue May 13 08:10:54 2014-[info] binlog_do_db =, binlog_ignore_db =
Tue May 13 08:10:54 2014-[info] Replication filtering check OK.
Tue May 13 08:10:54 2014-[info] GTID (with auto-pos) is not supported
Tue May 13 08:10:54 2014-[info] Starting SSH connection tests ..
Tue May 13 08:10:55 2014-[info] All SSH connection tests passed successfully.
Tue May 13 08:10:55 2014-[info] Checking MHA Node version ..
Tue May 13 08:10:55 2014-[info] Version check OK.
Tue May 13 08:10:55 2014-[info] Checking SSH publickey authentication settings on the current master ..
Tue May 13 08:10:56 2014-[info] HealthCheck: SSH to node1 is reachable.
Tue May 13 08:10:56 2014-[info] Master MHA Node version is 0.56.
Tue May 13 08:10:56 2014-[info] Checking recovery script deployments on node1 (192.168.10.216: 3306 )..
Tue May 13 08:10:56 2014-[info] Executing command: save_binary_logs -- command = test -- start_pos = 4 -- binlog_dir =/usr/local/mysql/logs -- output_file =/var/tmp/save_binary_logs_test -- manager_version = 0.56 -- start_file = mysql-bin.000009
Tue May 13 08:10:56 2014-[info] Connecting to root@192.168.10.216 (node1: 22 )..
Creating/var/tmp if not exists... OK.
Checking output directory is accessible or not ..
OK.
Binlog found at/usr/local/mysql/logs, up to mysql-bin.000009
Tue May 13 08:10:56 2014-[info] Binlog setting check done.
Tue May 13 08:10:56 2014-[info] Checking SSH publickey authentication and checking recovery script deployments on all alive slave servers ..
Tue May 13 08:10:56 2014-[info] Executing command: export -- command = test -- slave_user = 'root' -- slave_host = node2 -- slave_ip = 192.168.10.217 -- slave_port = 3306 -- workdir =/var/tmp -- target_version = 5.6.16-64.2-rel64.2-log -- manager_version = 0.56 -- relay_log_info =/usr/local/mysql/data/relay-log.info -- relay_dir =/usr/local/mysql/data/-- slave_pass = xxx
Tue May 13 08:10:56 2014-[info] Connecting to root@192.168.10.217 (node2: 22 )..
Checking slave recovery environment settings ..
Opening/usr/local/mysql/data/relay-log.info... OK.
Relay log found at/usr/local/mysql/logs, up to relay-bin.000006
Temporary relay log file is/usr/local/mysql/logs/relay-bin.000006
Testing mysql connection and privileges .. Warning: Using a password on the command line interface can be insecure.
Done.
Testing mysqlbinlog output... done.
Cleaning up test file (s)... done.
Tue May 13 08:10:57 2014-[info] Executing command: export -- command = test -- slave_user = 'root' -- slave_host = node3 -- slave_ip = 192.168.10.218 -- slave_port = 3306 -- workdir =/var/tmp -- target_version = 5.6.16-64.2-rel64.2-log -- manager_version = 0.56 -- relay_log_info =/usr/local/mysql/data/relay-log.info -- relay_dir =/usr/local/mysql/data/-- slave_pass = xxx
Tue May 13 08:10:57 2014-[info] Connecting to root@192.168.10.218 (node3: 22 )..
Checking slave recovery environment settings ..
Opening/usr/local/mysql/data/relay-log.info... OK.
Relay log found at/usr/local/mysql/logs, up to relay-bin.000006
Temporary relay log file is/usr/local/mysql/logs/relay-bin.000006
Testing mysql connection and privileges .. Warning: Using a password on the command line interface can be insecure.
Done.
Testing mysqlbinlog output... done.
Cleaning up test file (s)... done.
Tue May 13 08:10:57 2014-[info] Slaves settings check done.
Tue May 13 08:10:57 2014-[info]
Node1 (192.168.10.216: 3306) (current master)
+ -- Node2 (192.168.10.217: 3306)
+ -- Node3 (192.168.10.218: 3306)
Tue May 13 08:10:57 2014-[info] Checking replication health on node2 ..
Tue May 13 08:10:57 2014-[info] OK.
Tue May 13 08:10:57 2014-[info] Checking replication health on node3 ..
Tue May 13 08:10:57 2014-[info] OK.
Tue May 13 08:10:57 2014-[warning] master_ip_failover_script is not defined.
Tue May 13 08:10:57 2014-[warning] shutdown_script is not defined.
Tue May 13 08:10:57 2014-[info] Got exit code 0 (Not master dead ).
MySQL Replication Health is OK.
G. Start MHA manager and monitor log files
Killall mysqld on node1 and start the manager service on node3 at the same time
[Root @ localhost mha] # masterha_manager -- conf =/etc/mha/app1/app1.cnf
Tue May 13 08:19:01 2014-[info] Reading default configuration from/etc/masterha_default.cnf ..
Tue May 13 08:19:01 2014-[info] Reading application default configuration from/etc/mha/app1/app1.cnf ..
Tue May 13 08:19:01 2014-[info] Reading server configuration from/etc/mha/app1/app1.cnf ..
Creating/var/tmp if not exists... OK.
Checking output directory is accessible or not ..
OK.
Binlog found at/usr/local/mysql/logs, up to mysql-bin.000009
Tue May 13 08:19:18 2014-[info] Reading default configuration from/etc/masterha_default.cnf ..
Tue May 13 08:19:18 2014-[info] Reading application default configuration from/etc/mha/app1/app1.cnf ..
Tue May 13 08:19:18 2014-[info] Reading server configuration from/etc/mha/app1/app1.cnf ..
Then observe/var/log/mha/app1/manager on node3. log shows the status of node1 dead. The master node is automatically switched to node2, And the Master/Slave configuration on node3 points to node2, after a switchover,/var/log/mha/app1/app1.failover is generated. complete file;
For more details, please continue to read the highlights on the next page: