MySQL is the most critical data storage center for most applications, so how to make MySQL provide an HA (high availability) service is a problem we have to face. When the master goes down, we need to think about how to ensure that data is lost as little as possible. Here I will talk about the MySQL proxy and toolset work we have done during this period, and the MySQL HA solution we will use in our projects at this stage and in the future.
Replication
To ensure that MySQL data is not lost, replication is a good solution, and MySQL provides a powerful replication mechanism. But we need to know that, for performance reasons, replication works in asynchronous mode by default, which means written data may not have been synchronized to the slave yet; if the master goes down at this point, we may still face the risk of data loss.
To solve this problem, we can use semi-synchronous replication. The principle of semi-synchronous replication is very simple: when the master commits a transaction, it does not return until at least one slave that supports semi-sync has acknowledged receiving the event and written it to its relay log. This way, even if the master goes down, at least one slave holds the complete data.
However, semi-sync does not guarantee that no data will be lost: if the master crashes in the window between committing the transaction and sending it to a slave, data loss is still possible. Compared with traditional asynchronous replication, though, semi-synchronous replication greatly improves data safety. More importantly, it is not slow. The author of MHA has said that Facebook uses semi-sync in its production environment, so there is no real need to worry about its performance unless your business volume has completely exceeded that of Facebook or Google. In addition, starting from MySQL 5.7, Loss-Less semi-synchronous replication is used, so the probability of data loss is already very small.
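As a minimal sketch of how semi-sync is typically enabled, assuming the stock semisync plugins that ship with MySQL (the timeout value below is just an example):

-- On the master: install and enable the semi-sync master plugin.
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_timeout = 1000;  -- ms to wait before falling back to async

-- On each slave: install and enable the semi-sync slave plugin,
-- then restart the IO thread so it registers as a semi-sync slave.
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;

-- MySQL 5.7+ only: acknowledge before commit, i.e. loss-less semi-sync.
-- SET GLOBAL rpl_semi_sync_master_wait_point = 'AFTER_SYNC';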
If you really want to ensure that data will not be lost, a better option at this stage is Galera, a MySQL cluster solution, which guarantees no data loss by writing three copies simultaneously. I have no experience using Galera, but I know that some companies in the industry run it in production, and performance should not be a problem. However, Galera is highly invasive to the MySQL code, which may not suit people who are obsessive about clean code :-)
We can also use DRBD to replicate MySQL data; the official MySQL documentation describes this in detail, but I have not used this solution. The author of MHA has written about some problems with using DRBD, which is worth reading for reference.
In subsequent projects, I will give priority to the semi-synchronous replication solution; if the data is really important, we will consider using Galera.
Monitor
We mentioned above that we use the replication mechanism to ensure that data is lost as little as possible after the master goes down, but we cannot afford to find out about the problem only several minutes after the master has failed. Therefore, a good set of monitoring tools is essential.
When the master goes down, the monitor can quickly detect it and carry out subsequent processing, for example notifying the administrator by email, or notifying a daemon to perform failover quickly.
Usually, we use keepalived or heartbeat to monitor a service, so that when the master goes down, we can conveniently switch over to the standby machine. However, they still cannot detect service unavailability in a truly timely way. My company currently uses the keepalived approach, but going forward I prefer to use zookeeper to handle the monitoring and failover of the whole MySQL cluster.
For each MySQL instance, we have a corresponding agent program. The agent runs on the same machine as the MySQL instance and periodically sends a ping to the MySQL instance to check its availability; at the same time, the agent registers itself in zookeeper as an ephemeral node. With this, we can know whether MySQL has gone down, mainly in the following cases:
- The whole machine goes down: MySQL and the agent both die, and the agent's connection to zookeeper is naturally broken.
- MySQL fails: the agent finds that the ping no longer succeeds and actively disconnects from zookeeper.
- The agent dies, but MySQL does not.
In all three cases above, we can consider the MySQL machine to be in trouble, and zookeeper can perceive this immediately: the agent's connection to zookeeper drops, zookeeper triggers the corresponding children-changed event, and the control service watching this event can handle it accordingly. For example, in the first two cases, the control service can automatically perform failover; in the third case, it may choose to do nothing and wait for something like crontab or supervisord on that machine to restart the agent automatically.
The advantage of zookeeper is that it can conveniently monitor the whole cluster: it can instantly obtain change information for the whole cluster, notify the services interested in the corresponding events, and coordinate multiple services for related processing at the same time. These things are either impossible or very troublesome with keepalived or heartbeat.
The problems with zookeeper are that deployment is complicated, and that after a failover it is also troublesome for the application to obtain the latest database address.
For the deployment problem, we only need to ensure that each MySQL instance is paired with an agent; fortunately, with docker this becomes very simple. The second problem, updating the database address, is not really zookeeper's fault: we can notify the application to dynamically update its configuration, use a VIP, or use a proxy to solve it.
Although zookeeper has many advantages, if your setup is not complex, for example only one master and one slave, zookeeper may not be the best choice; keepalived may well be enough.
Failover
Through the monitor, we can conveniently keep watch over MySQL and notify the corresponding service to perform failover after MySQL goes down. Suppose we have such a MySQL cluster: A is the master, and B and C are its slaves. When A crashes, we need to fail over, so which of B and C should be selected as the new master?
The principle is very simple: whichever slave has the most recent data from the original master is chosen as the new master. We can use the show slave status command to find out which slave has the latest data. We only need to compare the two key fields Master_Log_File and Read_Master_Log_Pos; these two values indicate the position in the master's binlog that the slave has read up to. The slave with the larger binlog file index and the larger position can be promoted to master. We will not discuss the case where multiple slaves could be promoted here.
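As a hypothetical illustration (the file names and positions below are made up), the comparison could look like this:

-- Run on both B and C; the slave whose (Master_Log_File, Read_Master_Log_Pos)
-- pair is the largest has read the most data from the old master.
SHOW SLAVE STATUS\G
-- e.g. on B:  Master_Log_File: mysql-bin.000003   Read_Master_Log_Pos: 1840
-- e.g. on C:  Master_Log_File: mysql-bin.000003   Read_Master_Log_Pos: 1520
-- B is ahead of C, so B is promoted to be the new master.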
In the previous example, after B is promoted to master, we need to re-point C to the new master B and start replication. We reset C's master with CHANGE MASTER TO, but how do we know from which file and which position in B's binlog to start copying?
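For reference, a sketch of the classic non-GTID form shows exactly where the problem lies; the host, account, and coordinates below are placeholders:

STOP SLAVE;
CHANGE MASTER TO
    MASTER_HOST = 'B',                     -- the newly promoted master
    MASTER_USER = 'repl',                  -- placeholder replication account
    MASTER_PASSWORD = 'repl_password',     -- placeholder
    MASTER_LOG_FILE = 'mysql-bin.000003',  -- which file on B?
    MASTER_LOG_POS = 1840;                 -- which position on B?
START SLAVE;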
GTID
To solve this problem, MySQL 5.6 introduced the concept of GTID, in the form uuid:gid, where uuid is the globally unique uuid of the MySQL server and gid is an incrementing transaction id. With these two items, we can uniquely identify a transaction recorded in the binlog. With GTID, failover becomes very convenient.
In the preceding example, assume that the last GTID of A's data read by B is 3E11FA47-71CA-11E1-9E33-C80AA9429562:23, while C's is 3E11FA47-71CA-11E1-9E33-C80AA9429562:15. When C points to the new master B, it only needs to find the event with GTID 3E11FA47-71CA-11E1-9E33-C80AA9429562:15 in B's binlog, and then C can start copying from the position of the next event. Although searching the binlog is still sequential, a little inefficient and brute-force, it is much better than guessing which filename and position to use.
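In practice, once GTID is enabled everywhere, MySQL can perform this search for us through the replication protocol. A minimal sketch, with placeholder host and account:

STOP SLAVE;
CHANGE MASTER TO
    MASTER_HOST = 'B',                 -- the newly promoted master
    MASTER_USER = 'repl',              -- placeholder replication account
    MASTER_PASSWORD = 'repl_password', -- placeholder
    MASTER_AUTO_POSITION = 1;          -- negotiate the start point from the executed GTID sets
START SLAVE;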
Google also implemented a Global Transaction ID patch a long time ago, although it only uses an incrementing integer. LedisDB borrowed its idea to implement failover. However, Google now seems to be gradually migrating to MariaDB.
MariaDB's GTID implementation is different from MySQL 5.6's, which is really annoying: for my MySQL toolset go-mysql, it means writing two different sets of code to handle GTID. Whether to support MariaDB will be decided later.
Pseudo GTID
Although GTID is a good thing, it is limited to MySQL 5.6+. Currently, most businesses are still using versions earlier than MySQL 5.6 (my company is on 5.5), and these databases will not be upgraded to 5.6 for quite a long time. Therefore, we still need a good mechanism to select the master's binlog filename and position.
At first, I intended to study the implementation of MHA. It takes the approach of first copying relay logs to fill in the missing events, but I do not quite trust relay logs. In addition, MHA is written in perl, a language I could not fully understand, so I gave up further research.
Fortunately, I came across the orchestrator project, which is really a magical project. It adopts a Pseudo GTID approach, and the core code is as follows:
create database if not exists meta;
drop event if exists meta.create_pseudo_gtid_view_event;
delimiter ;;
create event if not exists
  meta.create_pseudo_gtid_view_event
  on schedule every 10 second starts current_timestamp
  on completion preserve
  enable
  do
    begin
      set @pseudo_gtid := uuid();
      set @_create_statement := concat('create or replace view meta.pseudo_gtid_view as select \'', @pseudo_gtid, '\' as pseudo_gtid_unique_val from dual');
      prepare st from @_create_statement;
      execute st;
      deallocate prepare st;
    end
;;
delimiter ;
set global event_scheduler := 1;
It creates an event on MySQL that writes a new uuid into a view every 10 seconds, and this statement is recorded in the binlog. Although we still cannot locate a single event directly the way GTID can, we can narrow it down to a 10-second range, so we only need to compare the binlogs of two MySQL servers within a very small range.
Continuing the example above, assume that the position of the last uuid appearing in C's binlog is s1. We find the same uuid in B's binlog, at position s2, and then compare the subsequent events one by one; if they are inconsistent, something may be wrong and we stop. Once we have traversed past C's last binlog event, we have the filename and position of B's next event, and we can point C at this position to start copying.
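A rough sketch of doing that scan by hand, with a hypothetical file name, position, and uuid:

-- On B: list the binlog files, then scan a file in chunks looking for the
-- pseudo GTID statement that matches the last uuid C executed.
SHOW BINARY LOGS;
SHOW BINLOG EVENTS IN 'mysql-bin.000003' FROM 4 LIMIT 1000;
-- In the output, find the Query event containing:
--   create or replace view meta.pseudo_gtid_view as select '<uuid last seen on C>' ...
-- The events after that point are the ones C still needs to verify and replay.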
Using Pseudo GTID requires the slaves to enable the log-slave-updates option; considering that GTID also requires this option, I find this completely acceptable.
The failover tool I implement later will adopt this Pseudo GTID approach.
In the book MySQL High Availability, the authors use another GTID-like approach: every commit also records a gtid in a table, and this gtid can then be used to find the corresponding position information. However, this approach requires support from the business's MySQL client, which I do not like very much, so I will not adopt it.
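For completeness, here is a minimal sketch of that idea; the table name, columns, and values are my own assumptions, not the book's exact schema:

-- The application writes a monotonically increasing id inside every
-- transaction; since the write goes through the binlog, each server can
-- later map the id back to a binlog position.
create table if not exists meta.global_trx_id (
    server_role varchar(32) not null primary key,
    trx_id      bigint      not null
);

start transaction;
-- ... normal business statements ...
replace into meta.global_trx_id (server_role, trx_id) values ('master', 12345);
commit;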
Postscript
MySQL HA has always been a fairly deep field. I have only listed some of the things I researched recently, and the related tools will be provided in go-mysql as far as possible.
Update
After a period of thinking and research, I have gained a lot of new insights about MySQL HA, and my design now differs quite a bit from the earlier one. Later, I found that the HA solution I designed is almost the same as the one in Facebook's article. I also recently chatted with people from Facebook and heard that they are vigorously rolling it out as well, so I feel I am heading in the right direction.
In the new HA design, I will fully embrace GTID; after all, it was introduced precisely to solve that pile of problems in the original replication, so I will no longer consider lower MySQL versions without GTID. Fortunately, our project has upgraded all MySQL instances to 5.6, which fully supports GTID.
Unlike the FB article, which modified mysqlbinlog to support the semi-sync replication protocol, I made the replication library of go-mysql support the semi-sync replication protocol, so that MySQL's binlog can be synchronized to another machine in real time. This is probably the only difference between my solution and FB's.
Synchronizing only the binlog is definitely faster than a native slave; after all, it skips the step of executing the events in the binlog. In addition, for the real slaves, we still use the original replication method, not semi-sync replication. We then use MHA to monitor the whole cluster and handle failover.
I used to think MHA was hard to understand, but it is actually a very powerful tool, and when I actually read the perl code, I found I could more or less follow it. MHA has been used in many companies' production environments and has stood the test; using it directly is definitely more cost-effective than writing one myself. So going forward, I will no longer consider the zookeeper approach or writing the agent myself.