Discuss Seconds_Behind_Master again, and discuss the master to clean up the aftermath
I wrote an article two years ago to explain the meaning of Seconds_Behind_Master and why it is inaccurate. Today, my colleague Gao asked another interesting question: how is Seconds_Behind_Master computed? Instructor Gao also made a special look at the source code to explain the problem. I found that my previous understanding was still different, so I did it myself, the following describes how it is calculated and why it cannot be completely trusted.
I usually read less MySQL source code. Generally, the source code is also a basic issue. For me, if I am not familiar with the key code location, the quick way is to cd to the source code root directory, then grep "Seconds_Behind_Master ". -R-n. The result is as follows:
./SQL/rpl_rli.cc: 1209: Seconds_Behind_Master-not critical ).
./SQL/slave. cc: 1345: "do not trust column Seconds_Behind_Master of SHOW"
./SQL/slave. cc: 1874: field_list.push_back (new Item_return_int ("Seconds_Behind_Master", 10,
./SQL/slave. cc: 1963: Seconds_Behind_Master: if SQL thread is running and I/O thread is
./SQL/slave. cc: 3254: alive and connected, this is going to make Seconds_Behind_Master be 0
./SQL/slave. cc: 3258: Seconds_Behind_Master grows. No big deal.
./SQL/slave. cc: 4666: We say in Seconds_Behind_Master that we have "caught up". Note that
./SQL/slave. cc: 4679: Seconds_Behind_Master wocould be zero only when master has no
From the search results, we mainly design SQL/slave. cc and SQL/rpl_rli.cc, and these two files can solve our doubts.
First, the first question is, how is Seconds_Behind_Master calculated?
Let's take a look at the explanations in the manual:
This field is an indication of how "late" the slave is:
When the slave is actively processing updates, this field shows the difference between the current timestamp on the slave and the original timestamp logged on the master for the event currently being processed on the slave.
When no event is currently being processed on the slave, this value is 0.
If slave is processing updates, the sbm calculation method is: slave's current timestamp-timestamp attached to the binlog event where the update is being executed. if slave does not process updates, then sbm = 0
Is that true? The answer is basically correct, but it does not explain some details. Then we can find the real formula in slave. cc:
1962/* 1963 Seconds_Behind_Master: if SQL thread is running and I/O thread is1964 connected, we can compute it otherwise show NULL (I. e. unknown ). 1965 */1966 if (mi-> slave_running = MYSQL_SLAVE_RUN_CONNECT) & 1967 mi-> rli. slave_running) 1968 {1969 long time_diff = (long) (time (0)-mi-> rli. last_master_timestamp) 1970-mi-> clock_diff_with_master); many comments 1991 protocol-> store (longlong) (mi-> rli. last _ Master_timestamp? 1992 max (0, time_diff): 0); 1993} 1994 else1995 {1996 protocol-> store_null (); 1997}
The key code is as follows:
1. long time_diff= ((long)(time(0) - mi->rli.last_master_timestamp) - mi->clock_diff_with_master); 2. protocol->store((longlong)(mi->rli.last_master_timestamp ? max(0, time_diff) : 0));
Here we finally see the legendary sbm calculation method. Here we will explain the meaning of each variable:
Time (0) // current system timestamp of the slave database, Linux System Function
Mi-> rli. last_master_timestamp // binlog event timestamp of the statement being executed by the current slave Database
Mi-> clock_diff_with_master // difference between the master and slave system timestamps, slave-master
Here we know that the description in the manual is not accurate, but a part of clock_diff_with_master is missing, because the timestamp of binlog event is recorded on the master database and the local timestamp time (0) calculated from the slave database) all of them call the system time function. If the master and slave time settings are inconsistent at this time, isn't this value completely meaningless? Therefore, to avoid this situation as much as possible, the master-slave timestamp is obtained every time a connection is established between the slave database and the master database, and a difference value is calculated and saved as a constant, this value will be subtracted from each sbm calculation. However, there is a problem. MySQL will only calculate this value when a master-slave replication connection is established and will not be updated in the future. Therefore, if the master-slave connection is established and the master-slave time is changed, for example, NTPD, such as set timestamp = xxx, then the sbm value is more unreliable.
Under normal circumstances, the first sentence is already the sbm value we see in show slave status. So what is the problem solved?
1. clock_diff_with_master is the difference between the master and slave SELECT UNIX_TIMESTAMP (). It is very likely that the master database executes 1 and the slave database executes 2 (because the master and slave databases cannot be executed at the same time ), in this case, clock_diff_with_master = 1. Assume that the value of time (0)-last_master_timestamp is 0, then 0-1 =-1, and-1 will give ambiguity to the user, in this case, the official system forces the negative value to 0.
2. the sbm value is divided into two situations. The SQL thread in the slave database has just explained in detail when processing the statement, when the SQL thread in the slave database does not process the statement, it sets this value to 0, so the second code implements this function. The ternary expression in the second sentence tells us that when mi-> rli. when the value of last_master_timestamp is 0, sbm = 0. So what is the update logic of mi-> rli. last_master_timestamp?
There are two main points:
First, rpl_rli.cc/Relay_log_info: stmt_done
Each time a binlog event is parsed from the relay log for execution, last_master_timestamp = event_creation_time;
Second, slave. cc/static Log_event * next_event (Relay_log_info * rli)
The prerequisite for entering the following logic is that the relay log has been executed and the SQL thread is waiting for updates to the relay log.
4686 time_t save_timestamp = rli-> last_master_timestamp; // save the last binlog event timestamp, after the binlog is pushed to the master database in the future, it can be used when sbm is calculated for the first time, this is also the reason why the first show slave status after the start slave shows that the sbm value is very large. 4687 rli-> last_master_timestamp = 0; // set the value to 0, then the sbm calculated by the preceding ternary expression is 0... partial code 4779 rli-> relay_log.wait_for_update_relay_log (rli-> SQL _thd) is omitted ); // wait for relay log updates 4780 // re-acquire data lock since we released it earlier4781 mysql_mutex_lock (& rli-> data_lock); 4782 rli-> last_master_timestamp = save_timestamp; // reset the original 4783 continue;
Then the second big question: Is Seconds_Behind_Master reliable? Is the value 0 indicating that the master and slave data are completely consistent?
The answer must be no.
1. sbm indicates the event delay in the relay log, while MySQL performs asynchronous replication by default. Therefore, the relay log may be executed from the database, however, the binlog of the master and slave databases is not pushed over the network, and we have encountered online sbm = 0 for network reasons, but the binlog of the master database is not pushed at all, generally, this illusion can be found through stop slave/start slave. Therefore, we usually do not directly use sbm to monitor the master-slave latency. Instead, the master-slave creates a heartbeat table and inserts a piece of data into the master database to determine the latency.
2. If the Master/Slave time is not synchronized, The sbm delay value may be inaccurate.
5.6 If multi-threaded replication is enabled, this value is even more inaccurate. I am also curious about how to monitor the master-slave latency if MTS is enabled in online 5.6?
-- EOF --