MHA (Master High Availability) is currently a mature solution for MySQL high availability across multiple servers (more than two).
I. What is MHA and what are its features?
1. Automatic monitoring and failover of the master server
MHA monitors the master server in the replication architecture. Once the master is detected to be down, failover is performed automatically. Even if some slaves have not received the latest relay log events, MHA automatically identifies the differential relay log from the most up-to-date slave and applies it to the other slaves, so that all slaves end up consistent. MHA usually completes failover within seconds: it can detect a master failure in 9-12 seconds, power off the failed master within another 7-10 seconds to avoid split-brain, and apply the differential relay logs to the new master in a few more seconds, so the whole process typically finishes within 10-30 seconds. You can also set priorities to designate a particular slave as the candidate master. Because MHA repairs consistency between the slaves, any slave can be promoted to the new master without running into the consistency problems that would otherwise cause replication failures.
2. Interactive master server failover
MHA can also be used only for failover, without monitoring the master. When the master fails, you invoke MHA manually to perform the failover, for example with the command sketched below.
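(The following command is not from the original article; the flags should be checked against your MHA version's documentation.) A manual failover for a master that has already died is typically run on the manager roughly like this, using the app1.cnf configuration described later:
# masterha_master_switch --master_state=dead --conf=/etc/mha/app1.cnf --dead_master_host=192.168.10.103 --dead_master_port=3306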
3. Non-interactive master failover
The master is not monitored by MHA, but failover is still performed automatically. This mode suits setups where other software monitors the master's status, for example heartbeat detecting master failures and taking over the virtual IP address, while MHA is used to carry out the failover and promote a slave to master.
4. Switch master/slave servers online
In many cases you need to migrate an existing master to another machine, for example when the current master's hardware is failing, the RAID controller needs rebuilding, or you want to move the master to better-performing hardware. Maintaining the master then degrades performance and causes downtime, or at least blocks writes; moreover, blocking or killing currently running sessions can cause data inconsistency between the old and new masters. MHA provides fast switching and graceful blocking of writes: the switch itself takes only 0.5-2 seconds, during which writes are blocked. In many cases a 0.5-2 second write block is acceptable, so you do not need to schedule a maintenance window for switching the master (haha, no more staying up all night in the dark just to switch masters).
5. MHA consists of two parts: MHA Manager (the management node) and MHA Node (the data node)
To build MHA, the replication cluster must contain at least three database servers, one master and two slaves: one acting as the master, one as a standby master, and one as a plain slave. The management node can run on the same machine as the master. So if you only have two machines, heartbeat and keepalived are better choices.
6. MHA is flexible. You can write scripts for failover or master-slave switchover.
7. After MHA performs a failover, it modifies the configuration file (the dead master's entry is removed), which I find rather amusing: once a failover has happened, you have to edit the configuration file again and restart the masterha_manager service.
II. Server description
192.168.10.103 masters // master
192.168.10.209 slave1 // slave
192.168.10.219 slave2 // slave (standby master)
192.168.10.220 manage // management node
One master, two slaves, and one management node. Write the entries above into /etc/hosts on each machine.
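For example, the entries could be appended to /etc/hosts on every machine like this:
# cat >> /etc/hosts << 'EOF'
192.168.10.103 masters
192.168.10.209 slave1
192.168.10.219 slave2
192.168.10.220 manage
EOF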
III. Password-less SSH login between servers
# ssh-keygen -t rsa
# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.10.103
# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.10.209
# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.10.219
# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.10.220
The five commands above have to be run on every machine; if they are being run on machine 103, the ssh-copy-id to 103 itself can be skipped. Once the keys are copied, run ssh between the machines to verify that no password is required.
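A quick check (a simple sketch, run from any one machine) is to loop over all four hosts and confirm that ssh does not prompt for a password:
# for h in 192.168.10.103 192.168.10.209 192.168.10.219 192.168.10.220; do ssh root@$h hostname; done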
IV. Install MHA
1. Download MHA
https://code.google.com/p/mysql-master-ha/downloads/list
2. Install on all nodes:
# yum install -y perl-DBD-MySQL
# rpm -ivh mha4mysql-node-0.54-0.el6.noarch.rpm
3. On the management node:
# yum install -y perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager
# rpm -ivh mha4mysql-manager-0.55-0.el6.noarch.rpm
Note: the manager and node versions can be different.
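As a quick sanity check (not part of the original steps), you can confirm which MHA packages ended up on a node:
# rpm -qa | grep mha4mysql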
V. Configure MySQL replication
See the article on MySQL replication master-slave synchronization.
To fit the MHA configuration, there are a few differences from that article.
1. Both the master and the slave configurations must contain:
binlog-do-db = test
replicate-do-db = test
Generally, the master needs binlog-do-db = test and the slave needs replicate-do-db = test for master-slave synchronization to work. With only that configuration, however, MHA reports the following error:
All log-bin enabled servers must have same binlog filtering rules (same binlog-do-db and binlog-ignore-db). Check SHOW MASTER STATUS output and set my.cnf correctly.
I spent a long time on this configuration. I kept thinking the English above meant that the databases being synchronized had to be the same, when it actually means the database filtering rules in the configuration files must be identical on all log-bin enabled servers.
2. Add relay_log_purge = 0 on the slave servers. Without it, MHA reports the warning: relay_log_purge=0 is not set on slave.
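Putting points 1 and 2 together, a minimal my.cnf sketch of the replication-related settings might look like this (assuming the replicated database is test and the binlog base name is mysql-bin, as elsewhere in this article; server-id must be unique on every machine):
[mysqld]
server-id = 1              # different on every server
log-bin = mysql-bin
binlog-do-db = test
replicate-do-db = test
relay_log_purge = 0        # on the slave servers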
VI. Corosync + Pacemaker + MySQL replication configuration
See the article on corosync + pacemaker + MySQL replication for high availability.
The goal of configuring corosync + pacemaker is to get a virtual IP address that floats between the master and the standby master. The application connects to the master through this virtual IP, so if the master goes down, the virtual IP moves to the new master and the application can keep connecting without any code changes.
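As a rough sketch (not from the original article; the resource name is arbitrary), the virtual IP used later in testing (192.168.10.130) could be defined in pacemaker's crm shell like this, with the colocation/ordering constraints that tie it to the MySQL master covered in the referenced article:
# crm configure primitive vip ocf:heartbeat:IPaddr2 params ip=192.168.10.130 cidr_netmask=24 op monitor interval=10s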
VII. Configure the MHA manager
1. Add a management account. Perform the following on each machine:
grant all privileges on *.* to mha@'192.%' identified by 'test';
flush privileges;
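A simple way to verify the account from the management node (assuming the grant above took effect on every MySQL server):
# mysql -u mha -ptest -h 192.168.10.103 -e "select version();"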
2. Configure /etc/mha/app1.cnf. This is done only on the management end, i.e. the manage machine:
# mkdir /etc/mha
# mkdir -p /var/log/mha/app1
[root@manage mysql]# cat /etc/mha/app1.cnf
[server default]
manager_log=/var/log/mha/app1/manager.log
manager_workdir=/var/log/mha/app1
master_binlog_dir=/var/lib/mysql
user=mha
password=test
ping_interval=2
repl_password=test
repl_user=test
ssh_user=root
[server1]
hostname=192.168.10.103
port=3306
[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.10.219
port=3306
[server3]
hostname=192.168.10.209
port=3306
The settings under [server default] are shared by all three machines; they can also be overridden inside a specific server section.
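For example, if one data node kept its binlogs somewhere non-default (the path below is purely hypothetical), master_binlog_dir could be overridden inside that server's own section:
[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.10.219
port=3306
master_binlog_dir=/data/mysql    # hypothetical per-server override of the [server default] value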
VIII. Check whether the MHA manager is configured correctly
1. Check SSH login
# masterha_check_ssh --conf=/etc/mha/app1.cnf
If "All SSH connection tests passed successfully" is displayed, the SSH configuration is correct.
2. Check whether MySQL replication is configured correctly
# masterha_check_repl --conf=/etc/mha/app1.cnf
If output like the following appears, the configuration is successful (the check should end with "MySQL Replication Health is OK.").
(Screenshot in the original article: MHA checking MySQL replication.)
3. Common management commands
masterha_check_ssh: checks MHA's SSH configuration
masterha_check_repl: checks the MySQL replication status
masterha_manager: starts MHA
masterha_check_status: checks the current MHA running status
masterha_master_monitor: monitors whether the master is down
masterha_master_switch: controls failover (automatic or manual)
masterha_conf_host: adds or removes server entries in the configuration file
IX. Start monitoring on the management end
[root@manage mha]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
[1] 13675
[root@manage mha]# masterha_check_status --conf=/etc/mha/app1.cnf // check the status
app1 (pid:13675) is running(0:PING_OK), master:192.168.10.103
[root@manage mha]# masterha_stop --conf=/etc/mha/app1.cnf // stop monitoring
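While monitoring is running, the simplest way to watch what MHA is doing (including a failover) is to follow the manager log configured above:
# tail -f /var/log/mha/app1/manager.log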
At this point, MHA is configured.
X. My testing process
1. mysql -u test -p -h 192.168.10.130, logging in through the virtual IP address
2. Insert data and check whether the data on master 103 is synchronized to the two slaves, for example:
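(The t1 table in the test database is just a made-up table for this quick check.)
# mysql -u test -p -h 192.168.10.130 test -e "create table if not exists t1 (id int); insert into t1 values (1);"
# mysql -u test -p -h 192.168.10.209 test -e "select * from t1;"
# mysql -u test -p -h 192.168.10.219 test -e "select * from t1;"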
3. On master 103, executing crm node standby produces the following results.
On machine 220, this section disappears from /etc/mha/app1.cnf:
[server1]
hostname=192.168.10.103
port=3306
On machine 219, show master status; now returns data, and 219 has become the master.
On machine 209, Master_Host in show slave status\G has changed to 192.168.10.219.
4. Run # crm node online on 103. At this point 103 is neither a master nor a slave, and going into standby stopped its mysqld process, so start mysqld again and then attach 103 as a slave of 219:
mysql> change master to MASTER_HOST='192.168.10.219',
MASTER_USER='test', MASTER_PASSWORD='test',
MASTER_LOG_FILE='mysql-bin.000048',
MASTER_LOG_POS=107;
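The original steps stop at CHANGE MASTER; in practice replication also has to be started and verified afterwards, roughly:
mysql> start slave;
mysql> show slave status\G
and check that Slave_IO_Running and Slave_SQL_Running both show Yes.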
5. Online master/slave switching
[root@manage mysql]# masterha_master_switch --conf=/etc/mha/app1.cnf --master_state=alive --new_master_host=192.168.10.103 --new_master_port=3306 --timeout --running_updates_limit=10000
Wed Apr 29 04:14:55 2015 - [info] MHA::MasterRotate version 0.55.
Wed Apr 29 04:14:55 2015 - [info] Starting online master switch..
Wed Apr 29 04:14:55 2015 - [info]
Wed Apr 29 04:14:55 2015 - [info] * Phase 1: Configuration Check Phase..
Wed Apr 29 04:14:55 2015 - [info]
Wed Apr 29 04:14:55 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Apr 29 04:14:55 2015 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Wed Apr 29 04:14:55 2015 - [info] Reading server configuration from /etc/mha/app1.cnf..
Wed Apr 29 04:14:55 2015 - [info] Current Alive Master: 192.168.10.219(192.168.10.219:3306)
Wed Apr 29 04:14:55 2015 - [info] Alive Slaves:
Wed Apr 29 04:14:55 2015 - [info]   192.168.10.209(192.168.10.209:3306)  Version=5.1.73-log (oldest major version between slaves) log-bin:enabled
Wed Apr 29 04:14:55 2015 - [info]     Replicating from 192.168.10.219(192.168.10.219:3306)
It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 192.168.10.219(192.168.10.219:3306)? (YES/no): yes
Wed Apr 29 04:15:10 2015 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Wed Apr 29 04:15:10 2015 - [info] OK.
Wed Apr 29 04:15:10 2015 - [info] Checking MHA is not monitoring or doing failover..
............. Omitted ...............
This switches things back to the original state.
MySQL MHA testing
The setup is basically the same as with MMM.
Before the test, I had two questions:
1) After the master-slave switchover, how does the application access the new master?
2) Can I monitor multiple groups of MySQL master/slave architectures on an MHA Manager server?
Deployment and installation are not difficult; see the documentation, so they are not repeated here.
I set up two groups of MySQL master/slave architectures, with three machines in each group: Master-Primary, Master-Standby, and Slave.
Plus one MHA Manager.
The test was basically smooth, and I also learned about the MHA working mechanism.
Summary:
1) MHA is only responsible for master-slave failover. It does not provide a fixed access point for the application side, so in a real environment tools such as Keepalived are still needed to provide a VIP;
2) A single MHA Manager server cannot monitor multiple MySQL master/slave groups; a conflict is reported;
3) MHA Failover is fast and stable;
4) After an MHA failover, the Master-Standby, i.e. the new master, is handled with RESET SLAVE, so manual recovery is required afterwards.
I basically got all the answers I needed. Since the answers to the two questions above are not ideal, I am not planning to apply MHA to the company's production environment for now.