In a MySQL replication environment, we usually judge slave delays only based on the value of Seconds_behind_master. This is acceptable in most cases, but not accurate enough to consider more factors.
First, let's look at the state of the Slave:
Copy Code code as follows:
yejr@imysql.com [(none)]> show Slave status\g
1. Row ***************************
Slave_io_state:waiting for Master to send event
***
master_log_file:mysql-bin.000327
read_master_log_pos:668711237
relay_log_file:mysql-relay-bin.002999
relay_log_pos:214736858
relay_master_log_file:mysql-bin.000327
Slave_io_running:yes
Slave_sql_running:yes
***
skip_counter:0
exec_master_log_pos:654409041
relay_log_space:229039311
***
seconds_behind_master:3296
***
You can see that the value of Seconds_behind_master is 3296, that is, the slave is delayed at least 3,296 seconds.
Let's take a look at the 2 replication process states on the slave:
Copy Code code as follows:
yejr@imysql.com [(None)]> Show full processlist\g
1. Row ***************************
Id:6
User:system User
Host:
Db:null
Command:connect
time:22005006
State:waiting for Master to send event
Info:null
2. Row ***************************
Id:7
User:system User
Host:
Db:null
Command:connect
time:3293
State:updating
Info:update * * SET * * WHERE * *
You can see that the SQL thread has been performing an update operation, noting that the value of time is 3293, and it looks like the update operation was performed for 3,293 seconds, a normal SQL, and certainly not so long.
In fact, in the replication process, the value of the time column might have several scenarios:
1, the current timestamp difference between the timestamp and IO threads in the binlog (actually relay log) that the SQL thread is currently executing, which is commonly considered seconds_behind_master value, Not the actual execution time of some SQL;
2, the SQL thread is currently without active SQL in the execution, the time value is the SQL thread idle time;
The time value of the IO thread is the total length of the thread since it was started (how many seconds), and may cause an exception, such as a negative number, or very large, if the system is modified after the IO thread has started.
Look at the following states:
#设置pager, view only a few status values of concern yejr@imysql.com [(none)]> Pager Cat | Egrep-i ' System user| exec_master_log_pos| seconds_behind_master| Read_master_log_pos ' #这是没有活跃SQL的情况, the time value is idle time and Seconds_behind_master is 0 yejr@imysql.com [(none)]> Show Processlist; Show Slave Status\g | 6 | System user | | NULL | Connect | 22004245 | Waiting for Master to send event | NULL | | 9 2 System user | | NULL | Connect | 13 | has read all relay log;** read_master_log_pos:445167889 exec_master_log_pos:445167889 seconds_behind_master:0 #和上面一样 y ejr@imysql.com [(None)]> show processlist; Show Slave Status\g | 6 | System user | | NULL | Connect | 22004248 | Waiting for Master to send event | NULL | | 9 2 System user | | NULL | Connect | 16 | has read all relay log;** read_master_log_pos:445167889 exec_master_log_pos:445167889 seconds_behind_master:0 #这时有活跃SQ L, the time value is the same as Seconds_behind_master, that is, the SQL thread is "slower" than the IO thread for 1 seconds yejr@imysql.com [(none)]> show processlist; Show Slave Status\g | 6 | System User| | NULL | Connect | 22004252 | Waiting for Master to send event | NULL | | 9 2 System user | | Floweradmin | Connect | 1 | Updating | UPDATE * * * read_master_log_pos:445182239 exec_master_log_pos:445175263 seconds_behind_master:1 #和上面一样 yejr@imysql.com [None]]> show processlist; Show Slave Status\g | 6 | System user | | NULL | Connect | 22004254 | Waiting for Master to send event | NULL | | 9 2 System user | | Floweradmin | Connect | 1 | Updating |
UPDATE * * * read_master_log_pos:445207174 exec_master_log_pos:445196837 seconds_behind_master:1
Well, finally, we say how to correctly judge the delay of slave:
1, first look at the Relay_master_log_file and master_log_file whether there are differences;
2, if Relay_master_log_file and master_log_file are the same words, and then look at the difference between Exec_master_log_pos and Read_master_log_pos, Compare the number of Binlog events that SQL threads are slower than IO threads;
3, if Relay_master_log_file and master_log_file are not the same, it indicates that the delay may be large, need to obtain Binlog status from master, to determine the current binlog and master gaps;
Therefore, the relatively more rigorous approach is:
On the third party monitoring node, the request for Master and SLAVE to launch the show BINARY logs and show SLAVE status\g at the same time, to determine the difference between Binlog and Exec_master_log_pos and Read_ Master_log_pos the difference.
For example:
The result of performing show BINARY LOGS on Master is:
+------------------+--------------+
| Log_name | file_size |
+------------------+--------------+
| mysql-bin.000009 | 1073742063 |
| mysql-bin.000010 | 107374193 |
+------------------+--------------+
The result of performing show SLAVE status\g on SLAVE is:
master_log_file:mysql-bin.000009
read_master_log_pos:668711237
relay_master_log_file:mysql-bin.000009
Slave_io_running:yes
Slave_sql_running:yes * * * * exec_master_log_pos:654409041 * * *
seconds_behind_master:3296
* *
At this point, the actual delay of the slave should be:
mysql-bin.000009 the difference delay between slave binlog read on binlog position 1073742063 and position in this binlog, namely:
1073742063-654409041 = 419,333,022 Binlog Event
and also to add mysql-bin.000010 this binlog has produced 107,374,193 Binlog event, a total
107374193 + 419333022 = 526,707,215 Binlog Event