In daily work, for MySQL master-slave Replication
Check: on the one hand, we need to ensure that the overall structure of the replication is normal, and on the other hand, we need to check whether the Master/Slave Data is consistent. For the former, we can monitor whether the replication thread works normally and whether the Master/Slave latency is
Within the tolerable range, the latter can verify whether the MD5 code of the data in the master and slave tables is consistent to ensure data consistency. You can use MK-table-
Checksum tool to check. Here, I just want to discuss how to check the Master/Slave latency.
There are two methods to determine the master-slave latency: 1. seconds_behind_master vs 2. mk-heartbeat. The differences between the two are described below.
Method 1: monitor the value of the seconds_behind_master parameter output by the show slave status \ G command to determine whether there is a master-slave latency. The values are as follows:
Null-indicates that either io_thread or SQL _thread has a fault, that is, the running status of the thread is no, rather than yes.
0-the value is zero, which is a situation we are eager to see. It indicates that the master-slave replication is good and lag does not exist.
Positive Value: The Master/Slave database has a latency. A larger number indicates that the slave database lags behind the master database.
Negative value-rarely seen. I have heard from some senior DBAs. In fact, this is a bug value. This parameter does not support negative values, that is, it should not appear.
Show slave
Status \ G, the output result of this command is very rich, providing many meaningful parameters for our monitoring, such as: slave_io_running this parameter can be used
Yes indicates that the io_thread is connected to the master database normally and can be replicated. If no, the communication with the master database is abnormal. In most cases, it is caused by the master-slave network.
Problem; slave_ SQL _running this parameter indicates whether SQL _thread is normal, specifically whether the statement is executed successfully, and the primary key is repeated or a table does not exist. Below
Today's focus is on seconds_behind_master, which serves as an indicator for judging the master-slave latency. How does it get this value? At the same time, why does it suffer from many people?
Question?
Seconds_behind_master compares the timestamp and
The timestamp (TS) of the event copied by io_thread is compared, and the difference value is obtained. We all know the relay-log and
The content in the bin-log is exactly the same. When an SQL statement is recorded, the current ts is recorded. Therefore, the reference value comes from BINLOG. In fact, the master and slave nodes do not need to be the same as NTP.
Step. You will also find that io_thread is actually between io_thread and SQL _thread.
The database is associated, so the problem arises. When the I/O load of the master database is large or the network is blocked, io_thread cannot copy the BINLOG in time (without interruption, it is also being copied ).
SQL _thread can keep up with the io_thread script all the time. At this time, the value of seconds_behind_master is 0, that is, we think there is no latency. However,
Actually not, you know. This is why we need to criticize the use of this parameter to monitor whether the database is delayed, but this value is not always inaccurate. If io_thread and
If the master network is good, this value is also very valuable. (Like: Mom-son-daughter-in-law relationship, mom and son-in-law, daughter-in-law and son-in-law are also family members, not necessarily daughter-in-law and mother are very close. Open
Joke :-) before that, the parameter seconds_behind_master has a negative value. We already know that this value is the latest of io_thread and the new TS and
Difference between TS executed by SQL _thread. The former is always greater than the latter. The only difference is that the Ts of an event has an error, which is smaller than the previous one. In this case, negative
The value may appear.
Method 2. mk-heartbeat, a tool in the maatkit universal toolkit, is considered to be a method for accurately determining replication latency.
the implementation of MK-heartbeat is also achieved by comparing timestmp, it must first ensure that the master and slave servers must be consistent by synchronizing the clock with the same NTP
server. It needs to create a heartbeat table on the master database, which contains at least two fields: ID and Ts. The ID is server_id, and Ts is the current timestamp
now (), this structure will also be copied to the slave database. After the table is created, the update command will be executed in the later process mode of the master database to regularly insert data to the table, the default period is 1
seconds. At the same time, the slave database executes a monitoring command in the background to compare the period with that of the master database, the ts value of the copied record is the same as that of the master database. The difference of 0 indicates no delay. The larger the difference, the more seconds the latency
is. We all know that the Ts for asynchronous replication is not completely consistent, so the tool allows a gap of half a second, and the difference within this can be ignored and considered as no delay. This tool is based on a real-time
mechanism, cleverly borrow timestamp to check the latency, like!