MySQL high-availability mha-other high-availability solutions and issues

Source: Internet
Author: User
Tags failover mysql version

Some of the other high-availability solutions for MySQL were mentioned in the previous MHA introduction, just a little bit about the following, which is described in detail here.

MySQL replication is asynchronous or semi-synchronous. When Master fails, some slave may not receive the latest relay log, which means that each slave may be in a different state. It is trivial to handle these consistency issues manually because you cannot start copying without fixing these issues. However, it is not uncommon to fix these problems manually, which takes one hours or more.

One Master one from

If the schema is a master one from, there will be no part of the slave state behind the latest slave problem. When Master fails, the app's traffic can be sent to the new master (the original slave). Failover is easy to solve. But there are the following questions.

First, read traffic cannot be extended. In many cases, it is possible to run some important operations, such as backups, analytic queries, and batch processing. This can lead to slave performance issues. If only one slave is present, master must process all of these traffic when the slave fails.

Second, usability issues. If master fails, only one service is left (the original slave is the primary), which becomes a single point of failure. In order to create a new slave, an online backup is required, then stored on the new slave and started immediately slave. However, these operations typically take hours (even more than one day to complete replication). In some important applications, the database may not tolerate a single point of failure for such a long time. Also, online backup greatly increases the I/O load of master, so it is dangerous to back up at peak times.

Double Master multiple slave

Dual Master multi-Slave is also a common architecture. If the current master fails, the standby master becomes the new master. In much larger scenarios, the alternate master is configured to read-only.

However, this does not always run as a master failover solution. When the current master fails, the remaining slave may not receive the full relay log, so there is still a need for other solutions to resolve the consistency problem between slave.

If you can't tolerate consistency issues and want to start the service immediately. Just take the alternate master as the new master, and discard the remaining slave. Then create a new slave from this new master for online backup. But this approach has the same problem as the one that was mentioned earlier, the main one. The remaining slave cannot be read-extended and redundant.

In addition, it is possible to use a dual master (a read-only) and each master has at least one.

At least one from can be replicated if the current master fails. But in fact, many users do not adopt this architecture, because the biggest drawback is complexity. Three layers of replication are used in this architecture. Managing three-tier replication is not easy. For example, if the standby master fails, the slave of the standby master cannot proceed with replication. In many cases, the alternate master and its slave must be reconfigured. It is important that at least 4 servers be used in this architecture.

Heartbeat +DRBD

Using the Heartbeat (Heartbeat) +drbd+mysql is a very common ha solution. But there are some serious problems with this solution.

The first problem is overhead, especially when you want to run a large number of MySQL replication environments. The heartbeat +DRBD is a primary/standby solution, and therefore requires a passive (standby) master that does not handle any application traffic. Passive servers cannot be used for read extensions. Typically, you need at least 4 MySQL services, one active master, one passive (passive) master, and two slave.

The second problem is downtime. Because Heartbeat +heartbeat is a active/standby cluster, failure recovery occurs on passive server if Active server fails. This may take a long time, especially without the use of the InnoDB plugin. Even with the InnoDB plug-in, it is not uncommon to take only a few minutes to start receiving access connections in passive server. In addition to recovery time, it takes time to warm up (warm-up) (populate the data to the buffer pool) after a failure, because the database/file system cache is empty on passive. In practice, one or more additional slave are required to handle enough read traffic. During warm-up, write performance degrades significantly due to the fact that the coarse is empty.

The third problem is write performance degradation or consistency issues. In order to ensure that the active/passive high availability cluster runs, the transaction log (binary log and InnoDB log) must be flushed to disk at each commit, so you must set the innodb-flush-log-at-trx-cmmit= 1 and Sync-binlog=1. However, Sync-binlog=1 will degrade write performance because the current MySQL version fsync () is sequential (if Sync-binlog is 1, the group commits will break). In most cases, sync-binlog=1 is not set. However, if Sync-binlog=1 is not set, when active master fails, the new master (the previous passive server) may lose the binary log events that have been sent to slave. If, master fails, and slave a receives the 1500 position of the mysql-bin.00123. If the Binlog data is nearly flushed to 1000-bit to disk, the new master is only mysql-bin.00123 to 1000-bit and creates a new binary file mysql-bin.00124. If this occurs, slave a cannot replicate because the new master does not have a mysql-bin.00123 1500-bit position.

The fourth problem is complexity. For many users, it is not easy to install/configure the heartbeat and DRBD. In many deployment environments, configuring DRBD often requires rebuilding the system partition, which in many cases is not easy. In addition, you need to have enough experience with the DRBD and Linux kernel layers. It is very easy to corrupt production data if an incorrect command is executed (for example, Drbd–overwrite-data-of-peer is performed on the passive node). When using DRBD, there is a problem with the disk I/O layer, which is difficult for most DBAs to solve.

MySQL Cluster

MySQL the cluster really implements a highly available solution, but must use the NDB storage engine. In most cases, InnoDB is used, so the benefits of MySQL clustering cannot be used.

Semi-synchronous replication

Semi-synchronous replication greatly reduces the risk of "binlog only on Failure master". This is a great help to avoid the loss of data. However, semi-synchronous replication does not resolve the consistency issue. Semi-synchronous replication guarantees that at least one (not all) slave receive the Binlog event at Master commit. It is still possible that some slave have not received the Binlog event. If you do not apply the relay log on the latest slave to a non-up-to-date slave, slave cannot be in a consistent state.

MHA resolves the consistency issue, so that with the use of semi-synchronous replication and MHA, there is little data loss and slave retention is always possible.

Global transaction ID (GTID)

The purpose of the global transaction ID and MHA want to implement is basically the same, but the global transaction ID includes more. MHA can only support two-tier replication, but the global transaction ID can support many layers of replicated environments, so even if the second-tier replication fails, three-tier replication can still be resumed.

Since MySQL5.6 began to support the Gtid. Oracle's official tool, Mysqlfailover, supports master failover with Gtid. Starting with version 0.56 of MHA, Gtid-based failover is also supported. MHA will automatically detect if the mysqld is running in Gtid, if Gtid is on, MHA will implement a failover with Gtid, and if not enabled, MHA uses relay log-based failover.

As a result of the limitations of personal technology, you are also requested to point out. The author can be found through the following two groups.

Beijing Linux operation and maintenance recruitment group: 153677549

Linux OPS Development Group: 298324302

This article is from the "Chang Jie listen to the Wind People" blog, please make sure to keep this source http://kevinhao.blog.51cto.com/5204735/1773508

MySQL high-availability mha-other high-availability solutions and issues

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.