This article looks at several scenarios for preventing MySQL data loss when a server goes down, drawing on practical experience with three things: replication, monitoring, and failover. Readers working with MySQL may find it a useful reference.
For most applications, MySQL is the most critical data store, so providing a highly available MySQL service is a problem we have to face. When the master goes down, how do we lose as little data as possible, and how do we detect the failure quickly and perform the corresponding failover? These are the questions we need to think about. In this article I will draw on my recent work on a MySQL proxy and toolset to describe the HA scheme we use at the current stage and the one we plan to adopt in future projects.
To keep MySQL data from being lost, replication is a good solution, and MySQL ships with a powerful replication mechanism. But we need to be aware that, for performance reasons, replication is asynchronous by default: writes are not synchronized to the slaves immediately, so if the master goes down at the wrong moment we may still lose data.
To address this we can use semi-synchronous replication. The principle is simple: when the master commits a transaction, it waits until at least one semi-synchronous slave acknowledges that it has received the events and written them to its relay log before returning to the client. So even if the master goes down, at least one slave holds a complete copy of the data.
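As a concrete sketch, semi-synchronous replication is enabled roughly as follows, assuming MySQL 5.5+ with the stock semisync plugins available in your build (plugin file names can differ per platform):

```sql
-- On the master:
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
-- How long (in ms) to wait for a slave ACK before falling back to async:
SET GLOBAL rpl_semi_sync_master_timeout = 1000;

-- On each slave:
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
-- Restart the I/O thread so the slave reconnects as a semi-sync slave:
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;
```

Note the timeout: if no slave acknowledges within it, the master silently degrades to asynchronous replication, which is exactly why semi-sync alone is not a 100% guarantee.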
However, semi-synchronous replication does not 100% guarantee that no data is lost: if the master crashes after committing a transaction but before sending it to any slave, the data can still be lost. Compared with traditional asynchronous replication, though, semi-synchronous replication greatly improves data safety. More importantly, it is not slow: the author of MHA has said that Facebook uses semi-synchronous replication in production, so performance is unlikely to be a concern unless your traffic completely dwarfs Facebook's or Google's. As has been noted elsewhere, MySQL 5.7 has since adopted loss-less semi-synchronous replication, which makes the probability of losing data very small.
If you really must guarantee that no data is lost, the better option at this stage is Galera, a MySQL clustering solution that achieves durability by synchronously committing each write on multiple nodes (typically three). I have no hands-on experience with Galera; I only know that companies do run it in production, so performance should not be a problem. But Galera is fairly intrusive to the MySQL code base, which may put off readers with a taste for clean code :-)
We can also replicate MySQL data with DRBD; the official MySQL documentation describes this in detail, but I have not used this approach myself. The author of MHA has written about some of the problems with using DRBD, which is worth reading for reference.
In subsequent projects I will prefer the semi-synchronous replication solution, and consider Galera if the data is truly critical.
We have talked about using replication to lose as little data as possible after the master goes down, but we cannot afford to learn about the failure only several minutes later. A good set of monitoring tools is therefore essential.
When the master goes down, the monitor should detect it quickly and follow up, for example by notifying the administrator via email, or by telling a daemon to fail over promptly.
Usually we monitor a service with keepalived or heartbeat, so that when the master goes down we can easily switch over to a standby machine. But these tools still cannot detect immediately that a service is unavailable. My company currently uses the keepalived approach, but going forward I lean toward using ZooKeeper to handle monitoring and failover for the whole MySQL cluster.
For every MySQL instance we run a corresponding agent on the same machine. The agent periodically sends a ping to the MySQL instance to check its availability, and at the same time registers itself in ZooKeeper as an ephemeral node. This lets us tell whether MySQL is down, mainly in the following situations:
The machine itself goes down: MySQL and the agent die together, and the agent's ZooKeeper connection naturally drops.
MySQL goes down: the agent's ping fails, so it actively closes its ZooKeeper connection.
The agent dies, but MySQL is still up.
In all three cases we can assume something is wrong with the MySQL machine, and ZooKeeper notices immediately: when the agent disconnects, ZooKeeper fires the corresponding children-changed event, and the control service watching that event can react accordingly. In the first two cases the control service can fail over automatically; in the third it may do nothing and simply wait for cron or supervisord on the machine to restart the agent.
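The control service's decision can be sketched as follows. This is a hypothetical illustration, not any real tool's code: when the ephemeral node vanishes, the control service probes MySQL directly to distinguish the "whole machine down" case from the "only the agent died" case.

```python
def decide_action(agent_alive: bool, mysql_alive: bool) -> str:
    """Map the three failure cases above to an action for the control service.

    agent_alive: whether the agent's ephemeral node still exists in ZooKeeper.
    mysql_alive: whether MySQL answers a direct ping from the control service.
    """
    if not agent_alive and not mysql_alive:
        # Case 1: the whole machine went down, MySQL and agent died together.
        return "failover"
    if agent_alive and not mysql_alive:
        # Case 2: MySQL went down; the agent noticed the failed ping and
        # closed its ZooKeeper session on purpose.
        return "failover"
    if not agent_alive and mysql_alive:
        # Case 3: only the agent died; leave MySQL alone and let
        # cron/supervisord restart the agent.
        return "wait_for_agent_restart"
    return "healthy"
```

The point of the sketch is that only the first two cases trigger failover; the third is deliberately left to the machine's local process supervisor.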
The advantage of ZooKeeper is that it can conveniently monitor the whole cluster: it gives us instant cluster-wide state, notifies the services that care, and coordinates multiple services for the follow-up processing. These are things keepalived or heartbeat either cannot do or can only do with great difficulty.
The problem with ZooKeeper is that it is more complex to deploy, and, after a failover, getting applications to the latest database address is a tricky problem.
For deployment, we just need to guarantee one agent per MySQL instance; fortunately, with Docker these days that is genuinely simple. As for the second problem, the address change, it exists even without ZooKeeper: we can push dynamic configuration updates to the applications, use a VIP, or put a proxy in front.
Although ZooKeeper has many benefits, if your setup is simple, say one master and one slave, ZooKeeper may not be the best choice; keepalived may be enough.
With the monitor in place we can watch MySQL conveniently and, after the master goes down, notify the right services to handle failover. Suppose we now have a cluster where A is the master and B and C are its slaves. When A goes down and we need to fail over, which of B and C do we choose as the new master?
The principle is simple: whichever slave has the most of the original master's data becomes the new master. We can find out which slave has the newest data with SHOW SLAVE STATUS: we only need to compare two key fields, Master_Log_File and Read_Master_Log_Pos, which together mark how far the slave has read into the master's binlog. The slave with the larger binlog file index, and then the larger position, is the one to promote. (We will not discuss here the case where several slaves are equally eligible for promotion.)
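The comparison can be sketched like this, assuming we have already run SHOW SLAVE STATUS on every surviving slave and collected the two fields into plain dictionaries (the data shapes here are illustrative, not from any real tool):

```python
def binlog_coords(status: dict) -> tuple:
    """Turn e.g. ('mysql-bin.000003', 120) into a comparable (index, pos) pair."""
    filename = status["Master_Log_File"]       # e.g. "mysql-bin.000003"
    index = int(filename.rsplit(".", 1)[1])    # numeric suffix of the binlog file
    return (index, status["Read_Master_Log_Pos"])

def pick_new_master(slaves: dict) -> str:
    """Return the slave that has read furthest into the old master's binlog:
    larger file index wins first, then larger position within the file."""
    return max(slaves, key=lambda name: binlog_coords(slaves[name]))

# Example: B is on a later binlog file than C, so B wins despite C's larger pos.
slaves = {
    "b": {"Master_Log_File": "mysql-bin.000003", "Read_Master_Log_Pos": 120},
    "c": {"Master_Log_File": "mysql-bin.000002", "Read_Master_Log_Pos": 900},
}
```

Tuple comparison does the right thing here: the file index is compared before the position, matching how binlog coordinates are ordered.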
Continuing the example, suppose B is promoted to master; C then needs to be pointed at the new master B and resume replicating. We reset C's master with CHANGE MASTER, but how do we know from which binlog file and which position on B to start copying?
To solve this problem, MySQL 5.6 introduced the concept of GTID, written uuid:gid: the UUID is the MySQL server's UUID and is globally unique, while the gid is an incrementing transaction ID. Together they uniquely identify a transaction recorded in the binlog, which makes failover handling very convenient.
Back to the example: suppose the last GTID B has read is 3e11fa47-71ca-11e1-9e33-c80aa9429562:23, while C's is 3e11fa47-71ca-11e1-9e33-c80aa9429562:15. When C is pointed at the new master B, GTID tells us everything we need: as soon as we find the event with GTID 3e11fa47-71ca-11e1-9e33-c80aa9429562:15 in B's binlog, C can start copying from the position of the next event. The search through the binlog is still sequential, a slightly inefficient brute-force approach, but it is far more convenient than guessing the filename and position ourselves.
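As a toy model of that comparison (in real MySQL 5.6 failover you would simply use CHANGE MASTER TO ... MASTER_AUTO_POSITION = 1 and let the server do the search; the helper names here are illustrative):

```python
def parse_gtid(gtid: str) -> tuple:
    """Split 'uuid:gid' into (uuid, int(gid)). The UUID itself contains
    dashes but no colon, so splitting on the last ':' is safe."""
    uuid, gid = gtid.rsplit(":", 1)
    return uuid, int(gid)

def missing_range(slave_gtid: str, master_gtid: str) -> range:
    """Transaction IDs the lagging slave still needs from the new master.
    Assumes both GTIDs originate from the same old master (same UUID)."""
    s_uuid, s_gid = parse_gtid(slave_gtid)
    m_uuid, m_gid = parse_gtid(master_gtid)
    assert s_uuid == m_uuid, "GTIDs from different servers are not comparable"
    return range(s_gid + 1, m_gid + 1)
```

With the GTIDs from the example, C is missing transactions 16 through 23, precisely the events after its last GTID up to B's newest one.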
Google also had a global transaction ID patch early on, but it used only an incrementing integer; ledisdb borrowed its ideas to implement failover. Google, however, seems to have begun gradually migrating to MariaDB.
MariaDB's GTID implementation differs from MySQL 5.6's, which is actually rather troublesome: for my MySQL toolset go-mysql, it would mean writing two different code paths to handle GTID. Whether MariaDB gets supported later will depend on how things go.
GTID is a good thing, but it is MySQL 5.6+ only. Most deployments today still run versions before 5.6 (my company is on 5.5), and those databases will not be upgraded for a long time at least. So we still need a good mechanism for choosing the master's binlog filename and position.
Initially I intended to study MHA's implementation, which first copies relay logs to fill in the missing events, but I do not trust relay logs, and MHA is written in Perl, a language I cannot fully understand, so I gave up on that line of research.
Fortunately, I came across the orchestrator project, which is truly a remarkable project. It adopts a pseudo-GTID approach, and its core SQL is as follows:
create database if not exists meta;
drop event if exists meta.create_pseudo_gtid_view_event;
delimiter ;;
create event if not exists
  meta.create_pseudo_gtid_view_event
  on schedule every 10 second starts current_timestamp
  on completion preserve
  enable
  do
    begin
      set @pseudo_gtid := UUID();
      set @_create_statement := concat('create or replace view meta.pseudo_gtid_view as select \'', @pseudo_gtid, '\' as pseudo_gtid_unique_val from dual');
      PREPARE st FROM @_create_statement;
      EXECUTE st;
      DEALLOCATE PREPARE st;
    end
;;
delimiter ;
set global event_scheduler := 1;
This creates an event on MySQL that writes a UUID into a view every 10 seconds, and that write is recorded in the binlog. We still cannot locate a single event directly the way GTID can, but we can narrow the location down to a 10-second window, so we only need to compare the two servers' binlogs within a very small range.
To continue the example: suppose the position where the UUID last appears in C's binlog is s1, and we find the same UUID in B's binlog at position s2. We then compare the subsequent events one by one; if any of them do not match, something is wrong and we stop copying. Once we have walked through the remaining binlog events on C, we know the filename and position of the next event on B; we point C at that location and start copying.
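The matching idea can be sketched over two binlog streams modeled as plain lists of event strings. This is an illustration of the algorithm as described above, not orchestrator's actual implementation:

```python
def last_pseudo_gtid(events, marker="pseudo_gtid_view"):
    """Index of the last pseudo-GTID injection event in a binlog stream."""
    for i in range(len(events) - 1, -1, -1):
        if marker in events[i]:
            return i
    raise ValueError("no pseudo-GTID marker found")

def match_position(c_events, b_events):
    """Find where C's binlog ends inside B's binlog.
    Returns the index in b_events from which C should start copying."""
    s1 = last_pseudo_gtid(c_events)          # last UUID injection seen by C
    uuid_event = c_events[s1]
    s2 = b_events.index(uuid_event)          # the same UUID located in B
    # Walk both streams in lockstep after the shared marker; any mismatch
    # means the slave diverged and copying must stop.
    tail = c_events[s1 + 1:]
    for offset, ev in enumerate(tail):
        if b_events[s2 + 1 + offset] != ev:
            raise ValueError("binlog mismatch after pseudo-GTID marker")
    return s2 + 1 + len(tail)                # first event C has not yet seen
```

A real implementation would of course compare parsed binlog events with file/position coordinates rather than strings and list indices, but the search structure is the same: anchor on the shared UUID, verify the overlap, resume at the first unseen event.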
Pseudo-GTID requires the slaves to enable the log-slave-updates option, but considering that GTID also requires that option, I find this perfectly acceptable.
The failover tool I implement later will use this pseudo-GTID approach.
In the book "MySQL High Availability", the author uses another GTID-like method: on every commit, record a gtid in a table, then use that gtid to look up the corresponding position information. But this approach needs the cooperation of the business's MySQL clients, which I do not like, so I did not adopt it.
MySQL HA has always been deep water. I have only listed some of my recent research here; some of the related tools I will try to implement in go-mysql.
Update: after a period of reflection and research, I have gained a lot of experience, and my MySQL HA design now differs in many ways from the earlier one. It turns out the HA scheme I designed is almost the same as the one in Facebook's article, and in a recent chat, people from Facebook mentioned they are doing the same sort of thing at scale, so it feels like the right direction.
In the new HA design I will fully embrace GTID, which appeared precisely to solve a pile of problems with the original replication, so I will not consider older MySQL versions without GTID. Fortunately, our project has upgraded MySQL to 5.6 with full GTID support.
Unlike the Facebook article, which retrofits mysqlbinlog to support the semi-sync replication protocol, I added semi-sync support to the replication library in go-mysql. This lets us synchronize the MySQL binlog to another machine in real time. That is probably the only difference between my design and Facebook's.
Synchronizing only the binlog is definitely faster than a real slave, since it skips executing the events in the binlog. The other, real slaves still replicate in the original way and do not use semi-sync replication. We then monitor the whole cluster and handle failover with MHA.
I used to think MHA was hard to understand, but it is a very powerful tool, and once I actually read the Perl I found it understandable. MHA has been used in production by many companies and has stood the test of time; using it directly is definitely more cost-effective than writing something of my own. So I will no longer consider ZooKeeper or writing my own agent.