10 common MySQL high-availability schemes: interpretation and selection

Source: Internet
Author: User
Tags: failover, zookeeper

I. Overview

When designing a high-availability architecture for a MySQL database, we mainly consider the following aspects:

    1. If the database goes down or suffers an unexpected outage, its availability should be restored as quickly as possible to minimize downtime, so that the business is not interrupted by the database failure.

    2. Data on non-master nodes used as backups, read replicas, and so on should be consistent with the master node's data, either in real time or eventually.

    3. When the business switches from one database to another, the contents of the database before and after the switchover should be consistent, so that the business is not affected by missing or inconsistent data.

We do not discuss high-availability ratings in detail here; we only discuss the pros and cons of commonly used high-availability schemes and how to choose among them.

II. High-availability schemes

1. Master-slave or master-master semi-synchronous replication

Use a two-node database to build one-way or two-way semi-synchronous replication. In version 5.7 and later, thanks to new features such as lossless replication and logical multithreaded replication, MySQL's native semi-synchronous replication has become more reliable.

The common architecture is as follows:

This is usually combined with third-party software such as a proxy or keepalived, which can both monitor the health of the database and execute a series of administrative commands. If the master fails, the database remains usable after switching to the standby.
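The monitoring role described above can be illustrated with a minimal sketch. It is not how keepalived or any particular proxy is implemented; the function name `choose_target`, the `max_failures` threshold, and the idea of feeding in a sequence of health-check results are all assumptions made for illustration.

```python
from typing import Iterable


def choose_target(check_results: Iterable[bool], max_failures: int = 3) -> str:
    """Decide where traffic should go after observing a series of health checks.

    check_results -- True for a successful probe of the master, False for a failed one
    max_failures  -- hypothetical threshold of consecutive failures before switching
    """
    consecutive_failures = 0
    for ok in check_results:
        # A successful probe resets the counter; a failed one increments it.
        consecutive_failures = 0 if ok else consecutive_failures + 1
        if consecutive_failures >= max_failures:
            return "standby"  # master is considered down: switch over
    return "master"
```

Requiring several consecutive failures before switching is a common way to avoid failing over on a single dropped probe; the right threshold depends on how much downtime the business can tolerate.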

Advantages:

    • Simple architecture, using native semi-synchronous replication as the basis for data synchronization

    • With two nodes, there is no master election problem after the master goes down; you can switch over directly

    • With two nodes, resource requirements are low and deployment is simple

Disadvantages:

    • It depends entirely on semi-synchronous replication; if semi-synchronous replication degrades to asynchronous replication, data consistency cannot be guaranteed

    • The high availability of haproxy and keepalived themselves needs extra consideration

2. Semi-synchronous replication optimization

The semi-synchronous replication mechanism itself is reliable: as long as semi-synchronous replication stays in effect, the data can be considered consistent. However, network fluctuations and other external factors can cause semi-synchronous replication to time out and fall back to asynchronous replication, at which point data consistency can no longer be guaranteed. Keeping replication semi-synchronous as much of the time as possible therefore improves data consistency.
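To make the timeout-and-fallback behaviour concrete, here is a toy model in Python. It is a sketch, not MySQL's implementation: the class `SemiSyncMaster` and its methods are hypothetical, and slave latency is simulated by a number passed in by the caller. The `timeout_ms` field plays the role of the `rpl_semi_sync_master_timeout` setting in stock MySQL.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SemiSyncMaster:
    """Toy master that waits for a slave ACK before confirming a commit."""
    timeout_ms: int = 10_000
    semi_sync_active: bool = True  # False once degraded to asynchronous replication
    unacked: List[str] = field(default_factory=list)  # commits no slave has confirmed

    def commit(self, txn: str, ack_delay_ms: Optional[int]) -> str:
        """ack_delay_ms is the simulated ACK latency; None means no ACK arrives."""
        if self.semi_sync_active:
            if ack_delay_ms is not None and ack_delay_ms <= self.timeout_ms:
                return "semi-sync"  # a slave holds the event, so a switchover is safe
            self.semi_sync_active = False  # timed out: fall back to asynchronous mode
        self.unacked.append(txn)  # at risk if the master fails right now
        return "async"

    def slave_caught_up(self) -> None:
        """Simplified recovery: the slave has everything, so semi-sync resumes."""
        self.unacked.clear()
        self.semi_sync_active = True
```

In this model, every transaction that lands in `unacked` is one that could be missing on the slave after a failover, which is exactly the window the optimizations in this section try to shrink.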

This scheme also uses a two-node architecture, but builds functional optimizations on top of the original semi-synchronous replication to make the mechanism more reliable.

The following optimization approaches can serve as references:

Dual channel replication

When semi-synchronous replication breaks because of a timeout, two channels are established at the same time when replication is re-established. One is a semi-synchronous channel that replicates from the master's current position, so that the slave always knows how far the master has executed. The other is an asynchronous channel that backfills the data the slave has fallen behind on. Once the asynchronous channel catches up to the position where the semi-synchronous channel started, normal semi-synchronous replication is resumed.
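The dual-channel idea can be sketched as a small simulation. This is one reading of the description above, not code from any real patch: the binlog is modelled as a Python list, positions are list indexes, and the function name `dual_channel_catch_up` and its parameters are hypothetical.

```python
def dual_channel_catch_up(binlog, slave_pos, new_events, batch=2):
    """Simulate rebuilding replication over two channels.

    binlog     -- events already on the master when replication is rebuilt
    slave_pos  -- number of events the slave had applied before the break
    new_events -- events the master commits while the slave is catching up
    Returns (events applied on the slave, log of steps taken).
    """
    resume_pos = len(binlog)            # semi-sync channel starts at the master's current position
    applied = list(binlog[:slave_pos])  # what the slave already has
    received_live = []                  # events received (and ACKed) on the semi-sync channel
    pending = list(new_events)
    steps = []
    backfill_pos = slave_pos
    while backfill_pos < resume_pos:
        # Asynchronous channel: backfill the gap in batches.
        chunk = binlog[backfill_pos:backfill_pos + batch]
        applied.extend(chunk)
        backfill_pos += len(chunk)
        steps.append(f"async backfill to {backfill_pos}")
        # Semi-sync channel: meanwhile receive one live event, if there is one.
        if pending:
            received_live.append(pending.pop(0))
            steps.append(f"semi-sync received {received_live[-1]}")
    # Gap closed: apply the live events; later events use plain semi-sync again.
    applied.extend(received_live)
    applied.extend(pending)
    steps.append("semi-sync resumed")
    return applied, steps
```

The point of the second channel is visible in `received_live`: new transactions are acknowledged by the slave while the backlog is still being replayed, so the master does not have to run unprotected for the whole catch-up period.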

Binlog File Server

Build two semi-synchronous replication channels. The semi-synchronous channel to the file server is normally disabled; when network problems degrade the master-slave semi-synchronous replication, the semi-synchronous channel to the file server is started. When master-slave semi-synchronous replication recovers, the channel to the file server is closed again.

Advantages:

    • With two nodes, resource requirements are low and deployment is simple

    • Simple architecture with no master election problem; you can switch over directly

    • Compared with native replication, the optimized semi-synchronous replication does a better job of ensuring data consistency

Disadvantages:

    • Requires modifying the kernel source code or working with the MySQL communication protocol, which demands some understanding of the source code and the ability to do a certain amount of secondary development

    • Still relies on semi-synchronous replication, and does not fundamentally address data consistency issues

3. High-availability architecture optimization

Extend the two-node database to a multi-node database or a multi-node database cluster. Depending on your needs, you can choose one master with two slaves, one master with multiple slaves, or a multi-master, multi-slave cluster.

With semi-synchronous replication, a successful response from any one slave is enough for the replication to count as successful, so semi-synchronous replication to multiple slaves is more reliable than to a single slave. In addition, the probability of several nodes going down at the same time is lower than the probability of a single node going down, so a multi-node architecture can be considered, to some extent, more highly available than a two-node architecture.
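The reliability argument above can be checked with a little arithmetic. The sketch below assumes that slaves fail to acknowledge independently of one another and with the same probability, which is a simplification (a shared network fault would violate it); the function name and the sample probability are made up for illustration.

```python
def all_slaves_miss_ack(p_single: float, n_slaves: int) -> float:
    """Probability that none of n independent slaves acknowledges in time.

    p_single -- assumed probability that one slave misses the ACK window
    n_slaves -- number of semi-synchronous slaves
    """
    return p_single ** n_slaves


# Semi-sync only degrades when every slave misses, so more slaves help quickly.
for n in (1, 2, 3):
    print(n, all_slaves_miss_ack(0.01, n))
```

Under these assumptions, each extra slave multiplies the chance of degradation by `p_single`, which is why one master with two slaves is noticeably more robust than one master with one slave.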

However, with a larger number of databases, database management software is needed to keep them maintainable. You can choose MMM, MHA, or one of the various proxies. Common schemes include the following:

MHA + multi-node cluster

MHA Manager periodically probes the master node in the cluster. When the master fails, it automatically promotes the slave with the most recent data to be the new master and then repoints all other slaves to the new master. The entire failover process is completely transparent to the application.

MHA Node runs on each MySQL server. Its main role is to process the binary logs during a switchover, so that as little data as possible is lost.
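The "promote the slave with the most recent data" step can be sketched as follows. This is not MHA's code: the `Slave` class, its fields, and `plan_failover` are hypothetical, and "most recent" is approximated by comparing the master binlog file number and position each slave has read up to.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Slave:
    name: str
    master_log_file: str  # e.g. "mysql-bin.000012", last master binlog file read
    read_pos: int         # position read within that file


def binlog_coords(s: Slave) -> Tuple[int, int]:
    """Turn (file, position) into a comparable tuple using the numeric file suffix."""
    return (int(s.master_log_file.rsplit(".", 1)[1]), s.read_pos)


def plan_failover(slaves: List[Slave]) -> Tuple[Slave, List[str]]:
    """Pick the most up-to-date slave and list the actions a manager would take."""
    candidate = max(slaves, key=binlog_coords)
    plan = [f"promote {candidate.name}"]
    for s in slaves:
        if s is not candidate:
            plan.append(f"repoint {s.name} -> {candidate.name}")
    return candidate, plan
```

A real manager also has to bring the other slaves up to the candidate's position before repointing them, which is the binary log processing attributed to MHA Node above; that step is left out of this sketch.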

MHA can also be extended to the following multi-node cluster:

Advantages:

    • Can detect faults and fail over automatically

    • Good scalability; the number and structure of MySQL nodes can be expanded as needed

    • Compared with two-node MySQL replication, a three-node or multi-node MySQL cluster is less likely to become unavailable

Disadvantages:

    • Requires at least three nodes, and therefore more resources than a two-node setup

    • The logic is more complex, so troubleshooting and locating problems after a failure is more difficult

    • Data consistency is still guaranteed by native semi-synchronous replication, and there is still a risk of inconsistent data

    • Split-brain may occur during a network partition

ZooKeeper + proxy

ZooKeeper uses a distributed algorithm to keep the cluster's data consistent. Using ZooKeeper can effectively guarantee the high availability of the proxy and goes a long way toward avoiding split-brain during a network partition.
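The reason a majority-based coordinator helps against split-brain can be shown with a short sketch of the quorum rule. The function names are hypothetical and the model ignores everything except counting nodes; it only illustrates why two sides of a partition cannot both believe they are in charge.

```python
from typing import List


def has_quorum(reachable: int, ensemble_size: int) -> bool:
    """A group may act only if it contains a strict majority of the ensemble."""
    return reachable > ensemble_size // 2


def sides_allowed_to_lead(partition_sizes: List[int]) -> List[bool]:
    """For each side of a network partition, say whether it still has quorum."""
    ensemble_size = sum(partition_sizes)
    return [has_quorum(size, ensemble_size) for size in partition_sizes]
```

Because two disjoint groups cannot both hold a strict majority, at most one side of any partition keeps operating. An even split leaves no side with quorum, which is one reason odd ensemble sizes are usually preferred.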

Advantages:

    • Ensures high availability of the entire system, including the proxy and MySQL

    • Good scalability; can be extended to large-scale clusters

Disadvantages:

    • Data consistency still relies on native MySQL semi-synchronous replication

    • Introducing ZooKeeper makes the logic of the entire system more complex

4. Shared storage

Shared storage decouples the database server from the storage device. Data synchronization between databases no longer relies on MySQL's native replication; instead, data consistency is ensured through disk-level data synchronization.

SAN shared storage

The idea of a SAN is to give storage devices and processors (servers) a direct high-speed network connection (compared with a LAN), through which data is stored centrally. A common architecture is as follows:

With shared storage, the MySQL server mounts the file system and operates normally. If the master goes down, the standby can mount the same file system, ensuring that the master and standby use the same data.

Advantages:

    • Two nodes, simple deployment, simple switching logic

    • Provides good assurance of strong data consistency

    • No data inconsistency caused by MySQL logic errors

Disadvantages:

    • The high availability of the shared storage itself must be considered

    • Expensive

DRBD disk replication

DRBD is a software-based, network-based block replication storage solution, used mainly to mirror disks, partitions, logical volumes, and so on between servers. When a user writes data to the local disk, the data is also sent to the disk of another host on the network, so that the data on the local host (master node) and the remote host (standby node) stays synchronized in real time. A common architecture is as follows:

If the local host has a problem, the remote host still holds an identical copy of the data and can continue to be used, which keeps the data safe.

DRBD is a block-level synchronous replication technology implemented as a Linux kernel module, and it achieves the same shared-storage effect as a SAN.

Advantages:

    • Two nodes, simple deployment, simple switching logic

    • Lower cost than a SAN

    • Ensures strong data consistency

Disadvantages:

    • Significant impact on I/O performance

    • The standby cannot serve read operations

5. Distributed Protocol

Distributed protocols are a good solution to the data consistency problem. The more common schemes are as follows:

MySQL Cluster

MySQL Cluster is the official cluster deployment scheme. It achieves database high availability and data consistency by using the NDB storage engine to keep redundant copies of the data in real time.

Advantages:

    • Uses only official components and does not rely on third-party software

    • Can achieve strong data consistency

Disadvantages:

    • Relatively little adoption in China

    • Configuration is more complex, and it requires the NDB storage engine, which differs from the usual MySQL engines

    • At least three nodes

Galera

A MySQL high-availability cluster based on Galera is a multi-master data synchronization cluster solution for MySQL. It is simple to use, has no single point of failure, and offers high availability. A common architecture is as follows:

Advantages:

    • Multi-master writes and replication without delay, which can ensure strong data consistency

    • A mature community, with Internet companies using it at large scale

    • Automatic failover, with automatic addition and removal of nodes

Disadvantages:

    • Requires applying the wsrep patch to native MySQL nodes

    • Supports only the InnoDB storage engine

    • At least three nodes

Paxos

The Paxos algorithm solves the problem of how a distributed system can reach agreement on a value (a resolution). It is considered the most effective algorithm of its kind. Combining Paxos with MySQL makes it possible to achieve strong consistency of data in distributed MySQL. A common architecture is as follows:

Advantages:

    • Multi-master writes and replication without delay, which can ensure strong data consistency

    • A mature theoretical foundation

    • Automatic failover, with automatic addition and removal of nodes

Disadvantages:

    • Supports only the InnoDB storage engine

    • At least three nodes
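To illustrate what "agreeing on a value" means in the Paxos description above, here is a minimal single-decree Paxos sketch. It runs entirely in memory with no network, no failures, and no retries, so it only shows the two-phase rule; it is not how any MySQL-based product implements the protocol, and the class and function names are chosen for illustration.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1    # highest proposal number promised so far
        self.accepted = None  # (number, value) of the last accepted proposal, if any

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, value):
        """Phase 2: accept unless a higher-numbered proposal has been promised."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors, n, value):
    """Run one proposal round; return the chosen value, or None without a majority."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(n) for a in acceptors]
    granted = [acc for ok, acc in replies if ok]
    if len(granted) < majority:
        return None
    # If some acceptor already accepted a value, adopt the highest-numbered one.
    prior = [acc for acc in granted if acc is not None]
    if prior:
        value = max(prior, key=lambda nv: nv[0])[1]
    votes = sum(a.accept(n, value) for a in acceptors)
    return value if votes >= majority else None
```

With three acceptors, a first proposal such as `propose(acceptors, 1, "master=db1")` is chosen, and a later proposal with a higher number and a different value still returns `"master=db1"`: once a value is chosen, later rounds can only re-confirm it. The majority requirement in both phases is also why these schemes list at least three nodes as a prerequisite.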

Summary

As requirements for data consistency keep rising, more and more approaches are being tried to solve the distributed data consistency problem, such as optimizing MySQL itself, optimizing the MySQL cluster architecture, and introducing Paxos, Raft, 2PC, and other algorithms.

Using distributed algorithms to solve the MySQL data consistency problem is also gaining wider acceptance, and a series of mature products such as PhxSQL, MariaDB Galera Cluster, and Percona XtraDB Cluster are being used at large scale by more and more teams.

With the official MySQL Group Replication reaching GA, using distributed protocols to solve the data consistency problem has become the mainstream direction. We expect that more and more excellent solutions will be proposed and that MySQL high-availability problems will be solved better and better.
