MySQL High-Availability Solution MHA: An Introduction


Overview

MHA is a MySQL failover solution written in Perl by a Japanese MySQL expert, designed to keep a database system highly available. When the master goes down, MHA completes failover within a short time (typically 10-30 seconds of downtime) while avoiding master-slave consistency problems. Deploying MHA saves the cost of buying additional servers, does not hurt server performance, is easy to install, and requires no changes to the existing deployment. MHA also supports online switching: the currently running master can be moved to a new master in a very short time (0.5-2 seconds), during which only writes are blocked and reads are unaffected, which makes host hardware maintenance convenient. For systems with high availability and data consistency requirements, MHA provides useful functionality and can meet maintenance needs with almost no service interruption.

Advantages

1. Automatic master monitoring and failover

In an existing master-slave replication environment, MHA can monitor the master for failures and fail over automatically. Even if some slaves have not received the latest relay log events, MHA automatically identifies the differential relay log events on the most up-to-date slave and applies them to the other slaves, so all slaves end up consistent. MHA fails over in seconds: 9-12 seconds to detect the host failure, optionally about 7 seconds to power off the failed host to avoid split-brain, and then applying the differential relay logs and registering the new master, for a total downtime of usually 10-30 seconds. In addition, the configuration file lets you give a specific slave priority to become the new master (a configuration sketch is given after point 5 below). Because MHA repairs consistency between the slaves, DBAs do not have to deal with consistency problems themselves. After the new master is promoted, the remaining slaves recover in parallel, so even with a very large number of slaves the master recovery time is not affected and the slaves catch up quickly. DeNA runs MHA in an environment with 150+ master-slave setups; when one of the masters crashed, MHA completed failover in 4 seconds, which an active/passive cluster solution cannot do.

2. Interactive (manual) master failover

MHA can also be used only for failover, without monitoring the master; in this mode it acts purely as an interactive failover tool.

3. Non-interactive master failover

Non-interactive failover (no master monitoring, but automatic failover) is also provided. This is useful when other software already monitors the master. For example, Pacemaker (Heartbeat) can detect the master failure and take over the VIP, while MHA handles the failover and slave promotion.

4. Online switching of the master to a different host

In many cases it is necessary to move the master to another host (for example, to replace a RAID controller or upgrade the master hardware). This is not a master crash, but planned maintenance, and planned maintenance means downtime that must end as quickly as possible. Fast master switching and graceful blocking of writes are required, and MHA provides exactly that (see the command sketch after point 9): a graceful master switch that blocks writes for only 0.5-2 seconds. In many cases a 0.5-2 second write stall is acceptable, even outside a scheduled maintenance window, so a DBA can act easily when a faster machine is needed or a newer MySQL version is to be rolled out.

5. No master-slave data inconsistency after a master crash

When the master crashes, MHA automatically identifies the differences in relay log events between the slaves and applies them to the slaves that are behind, so that in the end all slaves are synchronized. Combined with semi-synchronous replication, there is virtually no data loss.
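The following is a minimal sketch, not part of the original article, of what an MHA Manager configuration might look like. The parameter and command names are those documented for MHA; host names, credentials, and paths are placeholders. ping_interval controls how often the Manager pings the master (3 seconds is the default), candidate_master marks a slave preferred for promotion, and no_master marks a slave that must never be promoted.

    [server default]
    user=mha
    password=mha_password
    ssh_user=root
    repl_user=repl
    repl_password=repl_password
    ping_interval=3
    manager_workdir=/var/log/masterha/app1
    manager_log=/var/log/masterha/app1/manager.log

    [server1]
    hostname=db-master1

    [server2]
    hostname=db-slave1
    candidate_master=1

    [server3]
    hostname=db-slave2
    no_master=1

With a file like this saved as, say, /etc/app1.cnf, monitoring and automatic failover are started with:

    masterha_manager --conf=/etc/app1.cnf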
6. MHA deployment does not change the current environment

One of MHA's most important design principles is to be as easy to use as possible. It works with existing MySQL 5.0+ master-slave environments; while other HA solutions require changes to the MySQL deployment, MHA does not force DBAs to change their deployment configuration, and both asynchronous and semi-synchronous replication setups are supported (see the command sketch after point 9). MHA can be started, stopped, upgraded, downgraded, installed, and uninstalled without touching the MySQL master or slaves. When MHA needs to be upgraded to a new version, MySQL does not have to be stopped: just update the MHA version and restart the MHA Manager. MHA supports MySQL 5.0/5.1/5.5 (it should also support 5.6; at the time this document was translated, the MHA developer had not yet updated it for 5.6). Some HA solutions require a specific MySQL version (such as MySQL Cluster, or MySQL with global transaction IDs), and you may not want to migrate applications just for master HA. In many cases a company has already deployed many traditional MySQL applications, and developers or DBAs do not want to spend much time migrating to a different storage engine or to newer, bleeding-edge features.

7. No increase in server cost

MHA consists of the MHA Manager and the MHA Node. The MHA Node runs on every MySQL server, while the Manager can be deployed on a single machine and monitor 100+ masters, so the total number of servers does not grow much. Note that the Manager can also run on one of the slave machines.

8. No performance impact

When monitoring the master, MHA only sends ping packets every few seconds (every 3 seconds by default) and does not send heavy queries, so master-slave replication performance is unaffected.

9. Works with any storage engine

MHA is not limited to the transaction-safe InnoDB engine; it works with any engine that master-slave replication supports. Even a legacy MyISAM environment can use MHA without migrating.
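As a rough illustration of points 4 and 6 above (a sketch, not taken from the original article): because MHA works against the existing master-slave setup, the usual workflow is simply to verify the environment and, for planned maintenance, to switch the master online. The commands are from the MHA toolset; the configuration path and host names are the same placeholders used in the earlier sketch.

    # Verify SSH connectivity and the existing replication setup; no MySQL changes are made
    masterha_check_ssh --conf=/etc/app1.cnf
    masterha_check_repl --conf=/etc/app1.cnf

    # Planned maintenance: switch the running master to another host online.
    # Writes are blocked only for the short switch window.
    masterha_master_switch --conf=/etc/app1.cnf --master_state=alive \
      --new_master_host=db-slave1 --orig_master_is_new_slave

    # Interactive (manual) failover after a master crash, without running the monitor
    masterha_master_switch --conf=/etc/app1.cnf --master_state=dead \
      --dead_master_host=db-master1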
Compared with other HA solutions

Doing everything manually

MySQL replication is asynchronous or semi-synchronous. When the master crashes, it is likely that some slaves have not yet received the latest relay log events, which means each slave may be in a different state. Fixing those consistency problems by hand is not trivial, and without fixing them, replication may not restart at all (for example, because of duplicate key errors). It is not unusual to spend more than an hour getting master-slave replication running again.

A single master and a single slave

With one master and one slave, the situation where some slaves lag behind other slaves cannot occur. When the master crashes, it is easy to point the application at the new master and keep serving, so failover is very simple.

One master, one candidate master, and multiple slaves

A dual-master, multi-slave architecture is also common: when the primary master goes down, the candidate master takes over as the new master. Some environments are configured as a multi-master topology:

    M(RW) ---- M2(R)                           M(RW), promoted from M2
      |               --(master crash)-->         |
    +-+--+-----+                               +-x-+---x-+
    S(R)     S2(R)                             S(?)     S(?)

              (From which position should S restart replication?)

However, this is not by itself a master failover solution. When the current master goes down, the remaining slaves have not necessarily received all of its relay log events, so fixing data consistency is still a problem. This architecture is widely used, but not everyone understands the problem described above: after the current master goes down, the slaves can end up inconsistent, or a slave may be unable to replicate from the new master. A dual-master setup in which one master is read-only and each master has at least one slave of its own might work around the problem:

    M(RW) ---- M2(R)
      |          |
    S(R)       S2(R)

Pacemaker + DRBD

Pacemaker (Heartbeat) + DRBD + MySQL is a common solution, but it has the following problems.

1. Cost, especially when running many master-slave environments. Pacemaker + DRBD is an active/passive solution, so a passive server that serves no application traffic is required. At a minimum four MySQL servers are needed: one active master, one passive master, and two slaves.

2. Downtime. Pacemaker + DRBD is an active/standby cluster: when the active master goes down, the standby master is brought up. This can take a long time, especially without the InnoDB plugin; even with the InnoDB plugin, it is not unusual for the standby master to need several minutes before it starts accepting connections. In addition, because the data/file cache on the standby master is empty, warm-up (loading data into the buffer pool) takes a noticeable amount of time before recovery is complete; in practice one or more slaves are usually needed to provide enough read capacity, and during warm-up the empty cache also hurts write performance.

3. Write performance and consistency problems. For the active/passive cluster to really work, the transaction logs (the binary log and the InnoDB log) must be flushed to disk on every commit, that is, innodb_flush_log_at_trx_commit=1 and sync_binlog=1 must be set. Setting sync_binlog=1 reduces write performance because the fsync() calls are serialized (with sync_binlog=1, group commit is lost). In most cases, sync_binlog=1 is not set.
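For reference, a minimal my.cnf fragment with the fully durable settings just described might look like the following (added here for illustration; it is not from the original article):

    [mysqld]
    # flush the InnoDB log to disk at every commit
    innodb_flush_log_at_trx_commit = 1
    # fsync the binary log at every commit
    sync_binlog = 1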
If sync_binlog=1 is not set and the active master crashes, the new master (the former passive server) may have lost binary log events that had already been sent to the slaves. Suppose that when the master goes down, slave A has received events up to mysqld-bin.000123, position 1500, while the binlog on the master was flushed to disk only up to position 1000. Then the data on the new master only reaches mysqld-bin.000123, position 1000, and on startup it creates a new binary log, mysqld-bin.000124. If this happens, slave A cannot continue replicating, because the new master does not have mysqld-bin.000123, position 1500.

4. Complexity. For most people, installing and initializing Pacemaker and DRBD is not easy. Compared with other solutions, re-initializing DRBD and recreating the system partition is not easy either, and DBAs need sufficient skills at the DRBD and Linux kernel layers. A single wrong command (such as running drbdadm --overwrite-data-of-peer primary on the passive node) destroys the active data. Just as important, once something goes wrong at the disk I/O layer, most DBAs find the problem hard to handle.

MySQL Cluster

MySQL Cluster is a genuinely highly available solution, but it requires the NDB storage engine. If you use InnoDB, you cannot take advantage of MySQL Cluster.

Semi-synchronous replication

Semi-synchronous replication greatly reduces the risk that a binlog event exists only on the crashed master, which is very useful for avoiding data loss. However, semi-synchronization does not solve all consistency problems: it only guarantees that one (not every) slave has received the binlog events committed on the master, and the other slaves may not have received them all. Semi-synchronous replication by itself cannot apply the differential binlog events from the most up-to-date slave to the other slaves, nor can it guarantee consistency (a minimal setup sketch is given at the end of this section).

Global transaction ID

Global transaction IDs aim at the same goal as MHA, but cover more. MHA only handles two-tier replication, while global transaction IDs cover replication topologies of any depth, so even if second-tier replication fails, a DBA can still recover the third tier. See Google's global transaction ID project for details.
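Finally, a rough sketch of enabling the semi-synchronous replication discussed above, using the MySQL 5.5-era semisync plugins (added for illustration; not part of the original article):

    -- On the master
    INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
    SET GLOBAL rpl_semi_sync_master_enabled = 1;
    -- milliseconds to wait for a slave ACK before falling back to asynchronous replication
    SET GLOBAL rpl_semi_sync_master_timeout = 1000;

    -- On each slave
    INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
    SET GLOBAL rpl_semi_sync_slave_enabled = 1;
    -- restart the IO thread so the slave reconnects with semi-sync enabled
    STOP SLAVE IO_THREAD;
    START SLAVE IO_THREAD;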
Source: http://blog.csdn.net/wulantian/article/details/11770159
