How do you judge MySQL master/slave replication latency? This article introduces two methods: 1. Seconds_Behind_Master and 2. mk-heartbeat. Each is described in turn below.
In daily work, checking MySQL master-slave replication has two sides. First, we need to ensure that the replication setup as a whole is healthy; second, we need to check whether the data on the master and slave is consistent. For the former, we can monitor whether the replication threads are running normally and whether the master-slave latency is within a tolerable range. For the latter, we can verify that the MD5 checksums of the data in the master and slave tables match; the mk-table-checksum tool in the Maatkit toolkit can be used for this check.
Method 1:
Monitor the Seconds_Behind_Master value in the output of the SHOW SLAVE STATUS\G command to determine whether there is master-slave latency. The possible values are as follows:
NULL: either the io_thread or the sql_thread has a fault, that is, the thread's Running status is No rather than Yes.
0: the situation we most want to see. It indicates that master-slave replication is healthy and no lag is reported.
Positive value: the slave lags behind the master. The larger the number, the further the slave is behind.
Negative value: rarely seen; I have only heard of it from some senior DBAs. It is in fact a bug value. The parameter does not support negative values, so one should never appear.
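As a rough illustration of the four cases above, the following Python sketch maps a Seconds_Behind_Master reading to a short description. The function name and the returned messages are made up for this example, and the value is assumed to have already been fetched from SHOW SLAVE STATUS, with NULL arriving as None.

```python
from typing import Optional


def interpret_seconds_behind_master(value: Optional[int]) -> str:
    """Describe a Seconds_Behind_Master reading (hypothetical helper)."""
    if value is None:
        # NULL: io_thread or sql_thread is not running
        return "replication thread not running"
    if value == 0:
        return "no lag reported"
    if value > 0:
        return f"slave is {value} seconds behind the master"
    # Negative readings should not occur; treat them as a bug value
    return "negative value: treat as a bug"


for reading in (None, 0, 42, -3):
    print(reading, "->", interpret_seconds_behind_master(reading))
```

A monitoring script would typically alert on the first and last cases, and on positive values above whatever threshold the application can tolerate.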
The output of SHOW SLAVE STATUS\G is very rich and provides many parameters that are useful for monitoring. For example, Slave_IO_Running can serve as the monitoring item for the io_thread: Yes indicates that the io_thread is connected to the master normally and can copy events, while No indicates that communication with the master is abnormal, which in most cases is caused by the network between master and slave. Slave_SQL_Running indicates whether the sql_thread is healthy, specifically whether statements are being executed successfully; it turns to No when a statement fails, for example because of a duplicate primary key or a missing table.
Now for today's focus, Seconds_Behind_Master. This value is used as the indicator for judging master-slave latency. How is it obtained, and why has it been questioned by so many people?
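The thread checks described above could be sketched in Python roughly as follows. This assumes the SHOW SLAVE STATUS row has already been fetched into a dict keyed by column name (for instance by a client library's dictionary cursor); the function name and the messages are invented for illustration.

```python
from typing import Dict, List


def replication_thread_problems(status: Dict[str, str]) -> List[str]:
    """Return a message for each replication thread that is not reporting Yes."""
    problems = []
    # Anything other than "Yes" means the io_thread is not copying events
    if status.get("Slave_IO_Running") != "Yes":
        problems.append("io_thread not running: check the connection to the master")
    # Anything other than "Yes" means the sql_thread has stopped applying events
    if status.get("Slave_SQL_Running") != "Yes":
        problems.append("sql_thread not running: check for a failed statement")
    return problems


# Sample rows standing in for real SHOW SLAVE STATUS output
healthy = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes"}
broken = {"Slave_IO_Running": "No", "Slave_SQL_Running": "Yes"}
print(replication_thread_problems(healthy))
print(replication_thread_problems(broken))
```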
Seconds_Behind_Master compares the timestamp of the event being executed by the sql_thread with the timestamp (ts) of the event most recently copied by the io_thread, and reports the difference. The relay log has exactly the same content as the master's binlog, and each SQL statement is recorded together with the ts at that moment. Because both reference values come from the binlog, the master and slave do not need to be synchronized via NTP for this calculation; that is, their clocks do not need to agree.
You will also notice that the comparison actually takes place between the io_thread and the sql_thread, and only the io_thread is in contact with the master. This is where the problem arises: when the I/O load on the master is heavy or the network is congested, the io_thread cannot copy the binlog in time (even though replication has not been interrupted), while the sql_thread can still keep pace with the io_thread. In this case Seconds_Behind_Master is 0, so we believe there is no latency when in fact there is. This is why using this parameter to monitor replication delay is criticized. The value is not always inaccurate, however: if the network between the io_thread and the master is good, it is still very useful.
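The blind spot can be made concrete with a toy model in Python. This is only a sketch of the calculation as this article describes it, not MySQL's actual code; the function names and the timestamps are invented for the example.

```python
def reported_lag(last_copied_ts: int, last_executed_ts: int) -> int:
    """Lag as seen from the slave: io_thread's latest ts minus sql_thread's ts."""
    return last_copied_ts - last_executed_ts


def true_lag(master_latest_ts: int, last_executed_ts: int) -> int:
    """Lag measured against what the master has actually written."""
    return master_latest_ts - last_executed_ts


# The master has written events up to ts=1000, but a congested network
# means the io_thread has only copied events up to ts=940.
master_latest_ts = 1000
last_copied_ts = 940
# The sql_thread has applied everything the io_thread delivered.
last_executed_ts = 940

print(reported_lag(last_copied_ts, last_executed_ts))   # 0
print(true_lag(master_latest_ts, last_executed_ts))     # 60
```

The slave reports zero lag even though it is a minute behind, because the events it has not yet received never enter the comparison.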
Earlier we mentioned that Seconds_Behind_Master can take a negative value. We now know that the value is the difference between the ts of the latest event copied by the io_thread and the ts of the event executed by the sql_thread. The former should always be greater than or equal to the latter, so the only possibility is that the ts of some event is wrong and is smaller than that of the event before it. In that case a negative value can occur.
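To illustrate the condition just described, the short Python sketch below scans a sequence of event timestamps and reports any event whose ts is smaller than that of its predecessor, which is where a negative difference could come from. The function name and the sample timestamps are hypothetical.

```python
from typing import List, Tuple


def out_of_order_events(timestamps: List[int]) -> List[Tuple[int, int, int]]:
    """Return (index, previous_ts, ts) for each event older than the one before it."""
    return [
        (i, timestamps[i - 1], timestamps[i])
        for i in range(1, len(timestamps))
        if timestamps[i] < timestamps[i - 1]
    ]


# The third event (index 2) carries a ts smaller than the event before it.
events = [100, 105, 103, 110]
for index, previous_ts, ts in out_of_order_events(events):
    # If the io_thread has just copied this event while the sql_thread is
    # still executing the previous one, the difference is negative.
    print(index, ts - previous_ts)   # 2 -2
```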
Method 2:
mk-heartbeat, a tool in the Maatkit toolkit, is considered a way to determine replication latency accurately.
mk-heartbeat also works by comparing timestamps. It first requires that the clocks of the master and slave servers are consistent, with both synchronized against the same NTP server. A heartbeat table must be created on the master with at least two fields, id and ts: id is the server_id and ts is the current timestamp, NOW(). The table structure is replicated to the slave as well.
After the table is created, an update command runs as a background process on the master and refreshes the row at a regular interval, 1 second by default. At the same time, a monitoring command runs in the background on the slave and, at the same interval, compares the replicated ts value with the current time. If the difference is 0 there is no latency; the larger the difference, the more seconds of delay.
Because replication is asynchronous, the ts values never match exactly, so the tool allows a gap of half a second; a difference within that gap is ignored and treated as no delay. The tool cleverly piggybacks a timestamp on the live replication stream to check latency, which makes it very useful.
That covers the two methods for checking master/slave latency in MySQL. I hope this introduction is helpful to you.
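The heartbeat idea can be tried out without a replication setup. The Python sketch below is not mk-heartbeat itself: it uses an in-memory SQLite database as a stand-in for the slave's replicated copy of the heartbeat table, stores ts as epoch seconds rather than a MySQL timestamp, and passes the clock in as a parameter so the result is reproducible. All function names are invented for the example.

```python
import sqlite3
from typing import Optional

TOLERANCE_SECONDS = 0.5  # the half-second gap described above


def create_heartbeat_table(conn: sqlite3.Connection) -> None:
    # Two fields as in the article: id (the server_id) and ts
    conn.execute(
        "CREATE TABLE IF NOT EXISTS heartbeat (id INTEGER PRIMARY KEY, ts REAL NOT NULL)"
    )


def write_heartbeat(conn: sqlite3.Connection, server_id: int, now: float) -> None:
    # On a real master this would run once per second in the background
    conn.execute(
        "INSERT OR REPLACE INTO heartbeat (id, ts) VALUES (?, ?)", (server_id, now)
    )


def measure_lag(conn: sqlite3.Connection, server_id: int, now: float) -> Optional[float]:
    """Compare the stored ts with the current time; None if no heartbeat row exists."""
    row = conn.execute("SELECT ts FROM heartbeat WHERE id = ?", (server_id,)).fetchone()
    if row is None:
        return None
    lag = now - row[0]
    # Differences within the tolerance are treated as no delay
    return 0.0 if lag <= TOLERANCE_SECONDS else lag


slave_copy = sqlite3.connect(":memory:")
create_heartbeat_table(slave_copy)
write_heartbeat(slave_copy, server_id=1, now=1000.0)   # last heartbeat to arrive
print(measure_lag(slave_copy, server_id=1, now=1000.25))  # 0.0
print(measure_lag(slave_copy, server_id=1, now=1003.0))   # 3.0
```

Because the measurement compares a timestamp written on the master with a clock read on the slave, it only means something when both clocks agree, which is why the NTP requirement matters for this method.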