Introduction to the MySQL high availability solution MHA

Overview

MHA is a MySQL failover solution written in Perl by a MySQL expert in Japan to keep a database system highly available. MHA completes failover within 10-30 seconds of downtime, avoids master-slave inconsistency, saves the cost of buying new servers, does not affect server performance, is easy to install, and requires no changes to the existing deployment.

Online switching is also supported. Switching from the currently running master to a new master takes only a short time (0.5-2 seconds). During that time only write operations are blocked and read operations are not affected, which makes host hardware maintenance easy.

MHA is useful in systems that require high availability, data consistency, and almost non-stop master maintenance.

Advantages

1. Automatic master monitoring and failover

In an existing master-slave replication environment, MHA can monitor the master for failures and fail over automatically.

Even if some slaves have not received the latest relay log events, MHA automatically identifies the differential relay log events from the latest slave and applies them to the other slaves, so all slaves end up consistent. MHA fails over in seconds: 9 to 12 seconds to detect the master failure, optionally 7 seconds to power off the master host to avoid split brain, then applying the differential relay logs and registering the slaves with the new master. Total downtime is usually 10-30 seconds. In addition, you can designate a slave in the configuration file as the preferred candidate for the new master. Because MHA fixes consistency between slaves, DBAs do not have to handle consistency problems themselves.

After the new master has been recovered, the other slaves are recovered in parallel. Even with a very large number of slaves, this does not affect the master recovery time, and the slaves are recovered quickly.

DeNA uses MHA in more than 150 master-slave environments. When one of the masters crashed, MHA completed failover in 4 seconds, which is not achievable with an active/passive cluster solution.
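The preferred candidate mentioned above is declared in the MHA Manager configuration file. The following is a minimal sketch, in Python, that generates such a file. The host names, account names, and work directory are hypothetical placeholders; the parameter names (`hostname`, `candidate_master`, `ping_interval`, and so on) are given as I understand the MHA configuration format and should be checked against the MHA documentation before use.

```python
import configparser
import io


def build_mha_config(servers, candidate):
    """Return MHA-style configuration text for the given hosts.

    servers   -- list of host names (hypothetical, for illustration only)
    candidate -- host that should be preferred as the new master
    """
    config = configparser.ConfigParser()
    # Settings shared by all servers; every value here is a placeholder.
    config["server default"] = {
        "user": "mha",                       # assumed MySQL admin account
        "ssh_user": "root",                  # assumed SSH account
        "manager_workdir": "/var/log/masterha/app1",
        "ping_interval": "3",                # seconds between master pings
    }
    for index, host in enumerate(servers, start=1):
        section = {"hostname": host}
        if host == candidate:
            # Give this slave priority when a new master is chosen.
            section["candidate_master"] = "1"
        config[f"server{index}"] = section

    buffer = io.StringIO()
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


CONFIG_TEXT = build_mha_config(["db1", "db2", "db3"], candidate="db2")
print(CONFIG_TEXT)
```

Only one server section carries the priority flag here; the other slaves remain eligible but are not preferred.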

2. Interactive (manual) master failover

MHA can be used for failover only, without monitoring the master. In this mode MHA serves purely as an interactively run failover tool.

3. Non-interactive failover

Non-interactive failover (the master is not monitored, but the failover itself is automatic) is also available. This is especially useful when other software already monitors the master. For example, Pacemaker (Heartbeat) can detect master failure and take over the virtual IP, while MHA handles master failover and slave promotion.

4. Switching the master to a different host online

It is often necessary to move the master to another host, for example to replace a RAID controller or to upgrade the master's hardware. This is not a master crash, but planned maintenance that has to be done. Planned maintenance causes downtime, and service must be restored as quickly as possible, so a fast master switch with graceful blocking of writes is required. MHA provides this: a graceful master switch blocks write operations for only 0.5-2 seconds. In many cases 0.5-2 seconds of write downtime is acceptable, even outside a scheduled maintenance window. This makes it easy for the DBA to act when a move to a faster machine or an upgrade to a newer version is needed.

5. A master crash does not cause master-slave data inconsistency

After the master crashes, MHA automatically identifies the differences between the slaves' relay log events and applies the missing events to the slaves that lack them, so that eventually all slaves are in sync. Combined with semi-synchronous replication, there is almost no data loss.
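The idea of identifying differential events can be illustrated with a small sketch. This is a simplified model, not MHA's actual implementation: it assumes each slave holds a prefix of the master's event stream and represents events as plain integers, whereas a real tool works with relay log files and binlog positions. The function names are hypothetical.

```python
def differential_events(slaves):
    """Find the latest slave and the events every slave is missing.

    slaves -- dict mapping slave name to the list of event ids it has
              received, in the order the master produced them.
    Returns (name_of_latest_slave, {slave_name: missing_events}).
    """
    # The latest slave is the one that has received the most events.
    latest = max(slaves, key=lambda name: len(slaves[name]))
    latest_events = slaves[latest]
    missing = {}
    for name, events in slaves.items():
        # Assumption: each slave holds a prefix of the event stream,
        # so its difference is the tail it has not received yet.
        missing[name] = latest_events[len(events):]
    return latest, missing


def apply_differences(slaves):
    """Return the state of every slave after the missing events are applied."""
    _, missing = differential_events(slaves)
    return {name: events + missing[name] for name, events in slaves.items()}


state = {"s1": [1, 2, 3, 4, 5], "s2": [1, 2, 3], "s3": [1, 2, 3, 4]}
latest, missing = differential_events(state)
print(latest, missing)
print(apply_differences(state))
```

After the differences are applied, every slave holds the same event list as the latest slave, which is the consistency property described above.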

Other high-availability solutions

6. Deploying MHA does not affect the current environment settings

One of the most important design principles of MHA is to be as simple to use as possible. MHA works with existing master-slave environments running MySQL 5.0 or later. Many other HA solutions require changes to the MySQL deployment settings; MHA does not force the DBA to make such changes, and it works with both asynchronous and semi-synchronous replication. Starting, stopping, upgrading, downgrading, installing, or uninstalling MHA requires no change to the MySQL master and slaves (not even starting or stopping them).

When you need to upgrade MHA to a new version, you do not need to stop MySQL. Just update MHA and then restart MHA Manager.

MHA supports MySQL 5.0, 5.1, and 5.5 (5.6 should also be supported; the MHA developers had not yet updated the documentation for 5.6 when this was translated). Some HA solutions require specific MySQL versions (such as MySQL Cluster or MySQL with global transaction ID), and you may not want to migrate applications just for master HA. In many cases a company already runs many traditional MySQL applications, and developers or DBAs do not want to spend much time migrating to a different storage engine or to newer bleeding-edge distributions.

7. No increase in server cost

MHA consists of MHA Manager and MHA Node. MHA Node runs on each MySQL server. The Manager can be deployed on a single separate machine and can monitor more than 100 masters, so the total number of servers does not increase significantly. Note that the Manager can also run on one of the slave machines.

8. No impact on performance

While monitoring the master, MHA only sends a ping packet every few seconds (3 seconds by default) and does not send large queries. Master-slave replication performance is not affected.
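The lightweight monitoring described above amounts to counting consecutive failed pings. The sketch below is a hedged illustration only: the threshold of three consecutive missed pings is an assumption, chosen because it is consistent with the 3-second default interval and the 9-second lower bound on detection time quoted earlier in this article. The function names are hypothetical.

```python
def master_is_dead(ping_results, max_missed=3):
    """Return True once `max_missed` consecutive pings have failed.

    ping_results -- iterable of booleans, True meaning the ping succeeded.
    max_missed   -- assumed threshold, not taken from the MHA source code.
    """
    missed = 0
    for ok in ping_results:
        # A successful ping resets the counter; a failure increments it.
        missed = 0 if ok else missed + 1
        if missed >= max_missed:
            return True
    return False


def detection_time(ping_interval=3, max_missed=3):
    """Seconds of consecutive ping failures needed to declare the master dead."""
    return ping_interval * max_missed


print(master_is_dead([True, False, False, False]))
print(detection_time())
```

Because each probe is a single ping rather than a query against user data, the monitoring load on the master stays negligible regardless of database size.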

9. Works with any storage engine

MHA is not limited to the transaction-safe InnoDB engine; it works with any storage engine that MySQL master-slave replication works with. Even a legacy MyISAM environment can use MHA without migration.

Comparison with other HA solutions

Doing everything manually

MySQL replication is asynchronous or semi-synchronous. When the master crashes, some slaves may not have received the latest relay log events, which means the slaves can be in different states. Fixing the consistency problems manually is not trivial, and without fixing them, replication may not be able to start (for example, because of duplicate key errors). It is not unusual for restarting master-slave replication manually to take more than an hour.

Single master and single slave

With one master and only one slave, the situation in which some slaves lag behind other slaves cannot occur. When the master crashes, the application can simply be pointed at the new master, so failover is simple.

One master, one candidate master, and multiple slaves (dual master, multiple slaves)

The architecture with one master, one candidate master, and multiple slaves is also common. If the master fails, the candidate master takes over as the new master. In some cases the two are configured as a multi-master pair.

      M(RW)-----M2(R)                        M(RW), promoted from M2
        |                                           |
   +----+----+       --(master crash)-->     +-x----+----x-+
  S(R)     S2(R)                            S(?)         S(?)

  (From which position should S restart replication?)

However, this does not work as a master failover solution. When the current master fails, the remaining slaves have not necessarily received all relay log events, so the consistency between slaves still has to be fixed.

This architecture is widely used, but not everyone fully understands the issue above. If the current master fails, the slaves become inconsistent, or they cannot replicate from the new master at all.

A dual-master setup in which one master is read-only and each master has at least one slave of its own may solve the problem.

   M(RW)--M2(R)
     |      |
   S(R)   S2(R)

Pacemaker + DRBD

Pacemaker (Heartbeat) + DRBD + MySQL is a common solution. However, it has the following problems:

1. Cost, especially when running a large number of master-slave environments. Pacemaker + DRBD is an active/passive solution, so it needs a passive master server that does not serve any application traffic. Basically four MySQL servers are required: one active master, one passive master, and two slaves.

2. Downtime. Pacemaker + DRBD is an active/passive cluster: when the active master fails, the passive master takes over. This may take a long time, especially when the InnoDB plugin is not used. Even with the InnoDB plugin, it can take several minutes before the passive master accepts connections. In addition to the recovery time, the warm-up time (filling the buffer pool with data) cannot be ignored, because the data/file cache on the passive master is empty. In practice, one or more additional slaves are needed to serve enough read traffic. During the warm-up period, the empty cache also lowers write performance.

3. Write performance or consistency problems. To make an active/passive cluster really work, the transaction logs (binary log and InnoDB log) must be flushed to disk at every commit; that is, innodb-flush-log-at-trx-commit=1 and sync-binlog=1 must be set. Setting sync-binlog=1 reduces write performance because fsync() calls are serialized (with sync-binlog=1, group commit does not work). For this reason sync-binlog=1 is not set in most cases. If sync-binlog=1 is not set and the active master crashes, some binary log events that have already been sent to the slaves may be lost on the new master (the previously passive server). Suppose the master crashes after slave A has received mysqld-bin.000123 up to position 1500, while the binlog data had been flushed to disk only up to position 1000. The new master then has mysqld-bin.000123 only up to position 1000 and creates a new binary log, mysqld-bin.000124, at startup. In this case slave A cannot continue replication, because the new master does not have mysqld-bin.000123 at position 1500.

4. Complexity. For most people, installing and initializing Pacemaker and DRBD is not easy. Unlike in other setups, initializing DRBD usually means re-creating system partitions, which is not easy either. DBAs need sufficient skills at the DRBD and Linux kernel layers. If the DBA runs a wrong command (for example, drbdadm -- --overwrite-data-of-peer primary on the passive node), the active data is destroyed. It is also important that once a problem occurs at the disk I/O layer, most DBAs cannot easily handle it.
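The binlog position example in item 3 can be checked with a few lines of Python. This is a minimal sketch with hypothetical names (`BinlogPosition`, `can_resume`); it models only the comparison of positions, under the assumption that binlog file names carry zero-padded numeric suffixes so that string order matches binlog order.

```python
from typing import NamedTuple


class BinlogPosition(NamedTuple):
    """A binary log coordinate: file name plus byte offset."""
    file: str
    offset: int


def can_resume(slave_received, master_durable):
    """True if the new master still holds the position the slave asks for.

    slave_received -- last position the slave received from the old master
    master_durable -- last position that was flushed to the shared disk
    """
    # Tuples compare field by field: file name first, then offset.
    return slave_received <= master_durable


# Values from the example above: slave A received up to 1500,
# but only the first 1000 bytes of the binlog reached the disk.
slave_a = BinlogPosition("mysqld-bin.000123", 1500)
new_master = BinlogPosition("mysqld-bin.000123", 1000)
print(can_resume(slave_a, new_master))  # False: position 1500 does not exist
```

A slave that had received events only up to position 800 would pass the same check, which is why the problem shows up only on slaves that were ahead of the flushed binlog.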

MySQL Cluster

MySQL Cluster is a true high-availability solution, but it requires the NDB storage engine. If you use InnoDB, you cannot take advantage of MySQL Cluster.

Semi-Synchronous Replication

Semi-synchronous replication greatly reduces the risk that a binlog event exists only on the crashed master, which is very useful for avoiding data loss. However, semi-synchronous replication does not solve all consistency problems. It only ensures that at least one slave (not all slaves) has received the binlog events of a commit on the master; other slaves may not have received all binlog events. Unless the differential events are applied from the latest slave to the other slaves, the slaves cannot be made consistent with each other.

Global Transaction ID

Global transaction ID serves the same purpose as MHA, but it covers more. MHA supports only two-tier replication, whereas global transaction ID covers replication environments of any depth: even if a second-tier slave fails, the DBA can still recover the third-tier slaves. See Google's global transaction ID project for details.
