Original address
Author Introduction
Wang Songlei currently works at UCloud on MySQL database kernel research and development. He is mainly responsible for kernel troubleshooting and new-feature development for UDB, UCloud's cloud database.
I. Overview
When designing a high-availability architecture for a MySQL database, we mainly consider the following aspects:
If the database goes down or suffers an unexpected outage, its availability should be restored as quickly as possible to reduce downtime, so that the business is not interrupted by a database failure.
The data on non-primary nodes, which serve as backups, read replicas, and so on, should be consistent with the master node's data, either in real time or eventually.
When the business switches from one database to another, the database contents before and after the switchover should be consistent, so that the business is not affected by missing or inconsistent data.
We do not discuss high-availability ratings in detail here. We only discuss the pros and cons of the commonly used high-availability schemes and how to choose among them.
II. High-availability schemes
1. Master-slave or master-master semi-synchronous replication
Use a two-node database to build one-way or two-way semi-synchronous replication. In version 5.7 and later, the introduction of new features such as lossless replication and logical multi-threaded replication has made MySQL's native semi-synchronous replication more reliable.
The common architecture is as follows:
This setup is usually combined with third-party software such as a proxy or keepalived, which can monitor the health of the database and execute a series of administrative commands. If the master fails, the database can still be used after switching to the standby.
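The monitor-then-switch behaviour described above can be sketched roughly as follows. This is only an illustration, not how keepalived or any particular proxy is implemented: the class FailoverMonitor, the node names, and the threshold of three consecutive failed checks are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FailoverMonitor:
    """Hypothetical two-node monitor that tracks which node serves traffic."""
    check: Callable[[str], bool]   # returns True if the named node is healthy
    active: str = "master"
    standby: str = "standby"
    max_failures: int = 3          # assumed threshold, not taken from the article
    failures: int = 0

    def probe(self) -> str:
        """Run one health check and switch over after repeated failures."""
        if self.check(self.active):
            self.failures = 0
            return self.active
        self.failures += 1
        if self.failures >= self.max_failures:
            # With only two nodes there is no election: switch directly.
            self.active, self.standby = self.standby, self.active
            self.failures = 0
        return self.active


# Usage: simulate a master that fails every health check.
monitor = FailoverMonitor(check=lambda node: node != "master")
history = [monitor.probe() for _ in range(4)]
print(history)
```

Requiring several consecutive failures before switching is one common way to avoid failing over on a single dropped probe; the right threshold depends on the deployment.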
Advantages:
The architecture is simple and uses native semi-synchronous replication as the basis for data synchronization
With two nodes, there is no master election problem after the master goes down; you can switch directly
With two nodes, resource requirements are low and deployment is simple
Disadvantages:
Fully dependent on semi-synchronous replication; if semi-synchronous replication degrades to asynchronous replication, data consistency cannot be guaranteed
The high availability of haproxy and keepalived themselves needs extra consideration
2. Semi-synchronous replication optimization
The semi-synchronous replication mechanism is reliable: as long as semi-synchronous replication stays in effect, the data can be considered consistent. However, network fluctuations and other external factors can cause semi-synchronous replication to time out and fall back to asynchronous replication, and then data consistency can no longer be guaranteed. Therefore, keeping replication in semi-synchronous mode as much as possible improves data consistency.
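To make the degradation concrete, here is a small simulation of a master that waits for a slave acknowledgement on each commit and gives up after a timeout. It is a simplified model under stated assumptions, not MySQL's actual code path: the function name and the delay values are hypothetical, and recovery back to semi-synchronous mode is deliberately left out.

```python
def replication_mode_per_commit(ack_delays_ms, timeout_ms):
    """Return the replication mode in effect for each commit, in order.

    ack_delays_ms: time the slave takes to acknowledge each commit
                   (None means the acknowledgement never arrives).
    timeout_ms:    how long the master waits before giving up.
    """
    modes = []
    degraded = False
    for delay in ack_delays_ms:
        if degraded:
            # Simplification: once degraded, stay asynchronous.
            modes.append("async")
        elif delay is not None and delay <= timeout_ms:
            modes.append("semi-sync")
        else:
            # The wait timed out: this commit proceeds without an acknowledgement.
            degraded = True
            modes.append("async")
    return modes


# Usage: the third commit never gets an acknowledgement.
modes = replication_mode_per_commit([5, 20, None, 3], timeout_ms=1000)
print(modes)
```

Every commit recorded as "async" here is one that could be lost if the master failed before the slave received it, which is why the schemes below try to shorten or avoid the degraded period.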
This scheme also uses a two-node architecture, but it enhances the original semi-synchronous replication so that the mechanism becomes more reliable.
You can refer to the following optimization approaches:
Dual channel replication
When semi-synchronous replication times out, replication is broken. When replication is re-established, two channels are set up at the same time. One is a semi-synchronous channel that replicates from the master's current position, so that the slave always knows the master's current progress. The other is an asynchronous channel that backfills the data the slave has fallen behind on. When the asynchronous channel catches up to the position where the semi-synchronous channel started, semi-synchronous replication is restored.
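The catch-up step can be illustrated with a toy simulation that uses integer binlog positions. All names and numbers here are hypothetical, and a real implementation would work with binlog files and events rather than a single counter.

```python
def dual_channel_catch_up(slave_pos, master_pos, batch=100):
    """Simulate the asynchronous channel backfilling the slave's gap.

    slave_pos:  last position the slave had applied when replication broke
    master_pos: master position at which the new semi-sync channel starts
    batch:      events the async channel applies per step (illustrative)
    Returns the slave's backfill position after each step.
    """
    semi_sync_start = master_pos     # channel 1 streams from here onward
    progress = []
    pos = slave_pos
    while pos < semi_sync_start:
        # Channel 2 replays the events the slave missed.
        pos = min(pos + batch, semi_sync_start)
        progress.append(pos)
    # Gap closed: semi-synchronous replication can be treated as restored.
    return progress


# Usage: the slave stopped at 750 and the master is at 1000.
steps = dual_channel_catch_up(750, 1000, batch=100)
print(steps)
```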
Binlog File Server
Set up two semi-synchronous replication channels. The semi-synchronous channel to the file server is normally disabled. When a network problem degrades master-slave semi-synchronous replication, the semi-synchronous channel to the file server is started. When master-slave semi-synchronous replication recovers, the channel to the file server is closed.
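The switching rule amounts to a two-state toggle. The sketch below only models which channel the master relies on for acknowledgements; the class BinlogAckRouter and its method are hypothetical names, and how the link state is detected is not modelled.

```python
class BinlogAckRouter:
    """Hypothetical model of which semi-sync channel is currently enabled."""

    def __init__(self):
        # The file-server channel is normally disabled.
        self.file_server_enabled = False

    def on_link_state(self, slave_link_ok: bool) -> str:
        """Update the channel state and return the current acknowledgement source."""
        # Enable the file-server channel only while the slave channel is degraded.
        self.file_server_enabled = not slave_link_ok
        return "file_server" if self.file_server_enabled else "slave"


# Usage: the master-slave link degrades for two observations, then recovers.
router = BinlogAckRouter()
sources = [router.on_link_state(ok) for ok in (True, False, False, True)]
print(sources)
```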
Advantages:
With two nodes, resource requirements are low and deployment is simple
The architecture is simple, there is no master election problem, and you can switch directly
Compared with native replication, optimized semi-synchronous replication better safeguards data consistency
Disadvantages:
Requires modifying the kernel source code or working with the MySQL communication protocol, so you need some understanding of the source code and the ability to do a certain amount of secondary development
Still relies on semi-synchronous replication and does not fundamentally solve the data consistency problem
3. High-Availability architecture optimization
Extend the two-node database to a multi-node database or a multi-node database cluster. Depending on your needs, choose one master with two slaves, one master with multiple slaves, or a multi-master, multi-slave cluster.
With semi-synchronous replication, a successful acknowledgement from any one slave counts as a successful semi-synchronous commit, so semi-synchronous replication with multiple slaves is more reliable than with a single slave. In addition, the probability that multiple nodes go down at the same time is lower than the probability that a single node goes down, so a multi-node architecture can, to some extent, be considered more highly available than a two-node architecture.
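A quick calculation shows the size of the effect. It assumes that slaves fail independently and with the same probability, which the article does not state and which real deployments only approximate; the 1% figure is purely illustrative.

```python
def all_slaves_unavailable(p_single: float, n_slaves: int) -> float:
    """Probability that no slave can acknowledge, assuming independent failures."""
    return p_single ** n_slaves


# Usage: each slave is assumed to be unavailable 1% of the time.
one_slave = all_slaves_unavailable(0.01, 1)
two_slaves = all_slaves_unavailable(0.01, 2)
print(one_slave, two_slaves)
```

Under these assumptions, adding a second slave makes a situation where no slave can acknowledge about a hundred times less likely. Correlated failures, such as a shared network outage, would reduce that benefit.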
However, because there are more database nodes, database management software is needed to keep them maintainable. You can choose MMM, MHA, or one of the various proxies. Common schemes include the following:
MHA + multi-node cluster
MHA Manager periodically probes the master node in the cluster. When the master fails, it automatically promotes the slave with the most recent data to be the new master and then repoints all other slaves to the new master. The entire failover process is completely transparent to the application.
MHA Node runs on each MySQL server. Its main role is to process binary logs during a switchover so that data loss is minimized.
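The promotion rule described above, pick the slave with the latest data and repoint the rest, can be sketched as follows. This is not MHA's code: the function name, the slave names, and the (binlog file index, position) tuples are hypothetical, and the binary log processing done by MHA Node is not modelled.

```python
def promote_latest_slave(slaves):
    """Choose a new master and repoint the remaining slaves.

    slaves: dict mapping slave name -> (binlog_file_index, binlog_position)
            already applied on that slave (illustrative representation).
    Returns (new_master, topology), where topology maps slave -> master.
    """
    # Tuples compare by file index first, then by position within the file.
    new_master = max(slaves, key=lambda name: slaves[name])
    topology = {name: new_master for name in slaves if name != new_master}
    return new_master, topology


# Usage: s2 has applied the most recent events.
applied = {"s1": (12, 400), "s2": (12, 950), "s3": (11, 9000)}
new_master, topology = promote_latest_slave(applied)
print(new_master, topology)
```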
MHA can also be extended to the following multi-node clusters:
Original article: 10 Common MySQL High-Availability Schemes: Selection and Interpretation