Troubleshooting MySQL semi-synchronous replication issues

Last Update:2017-07-03 Source: Internet

Author: User

Tags epoll


1. Issue background
By default, MySQL replication in our production environment is asynchronous, so in extreme cases there is some chance of losing master data during a switchover; after a switch we therefore rely on a rollback tool to make sure no data is lost. Semi-synchronous replication requires that, for every transaction the master executes, at least one standby confirms receipt of the log before the transaction is considered complete, which keeps master and standby strongly consistent. To get that consistency and reduce the risk of data loss, we tried enabling MySQL's semi-synchronous replication (semi-sync) feature in production. In practice, semi-sync ran normally on most instances, but a small number of instances could never enable it and stayed in ordinary asynchronous mode. Stranger still, of two instances on the same host, one could enable it and the other could not. The root cause turned out to be simple, but finding it took some effort; the rest of this article walks through the troubleshooting process.

2. Principle of semi-synchronous replication
The MySQL master and standby are kept consistent through the binlog. The master executes a transaction locally and returns to the user once the binlog has been flushed to disk; the standby replays the master's operations by pulling the master's binlog. By default there is no strict synchronization between the two, so there is some probability that the standby's data differs from the master's. Semi-synchronous replication exists to keep master and standby consistent at all times. Compared with asynchronous replication, it requires that after each transaction the master wait for the standby to confirm that it has received the log. The implementation is simple: after executing the transaction locally, the master waits for the standby's response message (which carries the latest binlog (file, pos) received by the standby), and only after receiving it does the master return to the user; only then is the transaction truly complete. On the master instance a dedicated thread (ack_receiver) receives the standby's response messages and notifies the waiting worker threads that the standby has received the log and that they can continue. For the detailed implementation, see another article, "MySQL semi-synchronous (Semi-sync) source code implementation".
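The wait-for-acknowledgement step can be pictured with a small Python sketch. It is only an illustration under assumptions: the class `AckWaiter` and its method names are hypothetical and are not MySQL's actual code, and a `threading.Condition` stands in for the notification mechanism between ack_receiver and the worker threads.

```python
import threading


class AckWaiter:
    """Toy model of a master waiting for a standby acknowledgement (hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._acked = ("", 0)  # latest (binlog file, pos) acknowledged by the standby

    def report_ack(self, file, pos):
        # Called by the receiver thread when a standby reply arrives.
        with self._cond:
            if (file, pos) > self._acked:
                self._acked = (file, pos)
            self._cond.notify_all()

    def wait_for(self, file, pos, timeout_s):
        # Called by a worker thread after the local commit.
        # True  -> the standby confirmed the position in time.
        # False -> timed out; the caller would fall back to asynchronous mode.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._acked >= (file, pos), timeout=timeout_s
            )
```

In this model a worker that gets `False` back corresponds to the master giving up on the standby and continuing in ordinary asynchronous mode.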

3. Problem Analysis
With the principle of semi-synchronous replication covered, let's look at the specific problem. After semi-sync was enabled on both master and standby, the status variable "Rpl_semi_sync_master_status" on the problem instance stayed OFF, which means replication was still running in ordinary asynchronous mode.
(1) Modify the rpl_semi_sync_master_timeout parameter
Among the semi-sync parameters, rpl_semi_sync_master_timeout controls how long the master waits for the standby's response. If the wait exceeds this value, the master assumes the standby has not received the log (the standby may be down, or may be running very slowly and lagging far behind the master), and replication falls back to ordinary asynchronous mode so that transactions on the master are not blocked for long. Our production default for this value is 50 ms. The first guess was simply that the value was too small, so we raised it to 10 s, but the problem remained.
(2) Print logs
The simplest, crudest way to troubleshoot is to print logs and see where things go wrong. The master and the standby have the parameters rpl_semi_sync_master_trace_level and rpl_semi_sync_slave_trace_level, respectively, which control semi-sync logging. We set both to 80 (64+16) to record detailed log information plus function entry and exit.
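The value 80 is a sum of bit flags. The sketch below decodes it, assuming the trace-level values given in the MySQL documentation (1 general, 16 detail, 32 net wait, 64 function); the helper name `decode_trace_level` is made up for illustration.

```python
# Assumed flag values for the semi-sync trace level (from the MySQL manual).
TRACE_FLAGS = {1: "general", 16: "detail", 32: "net wait", 64: "function"}


def decode_trace_level(level):
    """Return the names of the trace flags that are set in `level`."""
    return [name for bit, name in sorted(TRACE_FLAGS.items()) if level & bit]
```

Under these assumptions, `decode_trace_level(80)` yields the detail and function flags, matching the 64+16 used here.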


Master:
2016-01-04 18:00:30 13212 [Note] ReplSemiSyncMaster::updateSyncHeader: server(-1721062019), (mysql-bin.000006, 500717950) sync(1), repl(1)
2016-01-04 18:00:40 13212 [Warning] Timeout waiting for reply of binlog (file: mysql-bin.000006, pos: 500717950), semi-sync up to file , position 0.
2016-01-04 18:00:40 13212 [Note] Semi-sync replication switched OFF.

Slave:
2016-01-04 18:00:30 38932 [Note] ---> ReplSemiSyncSlave::slaveReply enter
2016-01-04 18:00:30 38932 [Note] ReplSemiSyncSlave::slaveReply: reply (mysql-bin.000006, 500717950)
2016-01-04 18:00:30 38932 [Note] <--- ReplSemiSyncSlave::slaveReply exit (0)


The master log shows that at 2016-01-04 18:00:30 the master set the semi-sync flag and began waiting for the standby's response. After waiting 10 s without a response, it treated the wait as timed out, switched semi-sync off, and fell back to ordinary replication. The slave log, however, shows that at 2016-01-04 18:00:30 the slave had already replied to the master with (mysql-bin.000006, 500717950), meaning it had received the log. So the master marked the log for semi-sync, the slave received it and sent its reply, and yet the master waited the full 10 s without receiving the reply packet and switched to ordinary replication. The question now becomes: why did the master not receive the reply?

(3) The select function
As mentioned earlier, a thread on the master instance (ack_receiver) is dedicated to receiving response packets. It listens on the sockets with the select function; when it finds a response from a slave, it reads the message and notifies the worker thread that it can continue. So is the problem in select? Because select is a system call, we had never suspected it, but having traced the problem this far, we had to look. Several important macro definitions and declarations are associated with select. They are mainly found in three files: /usr/include/bits/typesizes.h, /usr/include/bits/select.h and /usr/include/sys/select.h.


#define __FD_SETSIZE 1024

typedef long int __fd_mask;
#define __NFDBITS (8 * (int) sizeof (__fd_mask))
#define __FDMASK(d) ((__fd_mask) 1 << ((d) % __NFDBITS))
#define __FDELT(d) ((d) / __NFDBITS)

typedef struct
{
    __fd_mask __fds_bits[__FD_SETSIZE / __NFDBITS];
} fd_set;
#define __FDS_BITS(set) ((set)->__fds_bits)

#define __FD_SET(d, set) (__FDS_BITS (set)[__FDELT (d)] |= __FDMASK (d))
#define __FD_CLR(d, set) (__FDS_BITS (set)[__FDELT (d)] &= ~__FDMASK (d))
#define __FD_ISSET(d, set) ((__FDS_BITS (set)[__FDELT (d)] & __FDMASK (d)) != 0)


FD_SET marks the handles we want to listen on. The handle information is stored in the fd_set bit array, whose number of elements is __FD_SETSIZE/64 (on a 64-bit system, where a long int is 64 bits): for __FD_SETSIZE=1024 the whole array holds only 16 long ints. Each handle takes one bit, so the 1024 bits can represent 1024 handles. Suppose the handle value is 138: 138/64=2 and 138%64=10, so bit 10 of the long int at index 2 is set to 1. What happens if the handle value reaches 1024 or more; does it overflow? I read through the code carefully and found no bounds check, so a handle value of 1024 or more will certainly overflow the array. Because select walks the bits of the array to decide whether each handle is readable or writable, handles of 1024 and above are never examined, and the master never learns that the standby has sent a response packet.
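The index arithmetic can be checked with a few lines of Python. This is a sketch of the macro arithmetic only, assuming a 64-bit long int; the function names `fd_slot` and `fits_in_fd_set` are illustrative and do not exist in any library.

```python
FD_SETSIZE = 1024                 # value of __FD_SETSIZE
NFDBITS = 64                      # assumption: long int is 64 bits
WORDS = FD_SETSIZE // NFDBITS     # number of long ints in fd_set


def fd_slot(fd):
    """Return (array index, bit position), mirroring __FDELT and __FDMASK."""
    return fd // NFDBITS, fd % NFDBITS


def fits_in_fd_set(fd):
    """True when the handle's bit lies inside the fd_set array."""
    return 0 <= fd < FD_SETSIZE
```

For handle 138 this gives index 2, bit 10, as in the text. For a larger value such as 2344 the index is 36, well past the 16 elements the array actually has.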

(4) Verification
The above is only a theoretical analysis. If the handle value in the running instance is indeed above 1024, the problem is located.
1. Get the MySQL process ID (mysql-pid)
ps aux | grep mysqld | grep port
2. Attach gdb to the process
gdb -p mysql-pid
3. Find the ack_receiver thread and switch to it
info threads
thread thread_id
4. Print the socket value; here the fd value is 2344.
p m_slaves
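As a rough cross-check that does not need gdb, the open descriptors of a process can be listed from /proc on Linux. This is an alternative that the original troubleshooting did not use, and it only shows whether any fds at or above 1024 exist, not which one belongs to the semi-sync connection. The function name is hypothetical, and reading another process's /proc entry requires sufficient privileges.

```python
import os


def fds_at_or_above(pid, limit=1024):
    """List the open fd numbers of `pid` that are >= limit (Linux /proc only)."""
    fd_dir = f"/proc/{pid}/fd"
    fds = []
    for name in os.listdir(fd_dir):
        # Entries in /proc/<pid>/fd are named after the fd numbers.
        if name.isdigit() and int(name) >= limit:
            fds.append(int(name))
    return sorted(fds)
```

A non-empty result for the mysqld process would at least be consistent with the analysis above.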


(5) Solution
As we have seen, the definition of __FD_SETSIZE, typically 1024, means that select can monitor at most 1024 handles and that every handle value must be below 1024. The first option is to increase this value, but that requires recompiling the Linux kernel. Moreover, because select scans the set bit by bit on every call to check whether a message has arrived on each handle, a large value makes it very inefficient. select is an older I/O multiplexing mechanism; the more advanced poll and epoll offer similar functionality and are more powerful, with no limit on the total number of handles or on the maximum handle value. Replacing select with poll or epoll for this listening code solves the problem completely. For more on select, poll, epoll and related mechanisms, consult other resources; they are not discussed further here.
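The difference between the two interfaces can be seen from Python's `select` module, used here purely as an illustration and not as MySQL's code. On Linux, CPython refuses a descriptor number at or above FD_SETSIZE with a ValueError, whereas the C macros would silently write outside the array; `poll` takes descriptor numbers directly and has no such bound. The helper names are made up for this sketch.

```python
import os
import select


def select_accepts(fd):
    """Return False if Python's select() rejects fd as out of range."""
    try:
        select.select([fd], [], [], 0)
    except ValueError:
        return False          # fd is negative or >= FD_SETSIZE
    except OSError:
        pass                  # fd is in range but not open
    return True


def poll_readable(fd, timeout_ms=0):
    """Use poll(), which has no FD_SETSIZE bound, to test readability."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return any(events & select.POLLIN for _, events in poller.poll(timeout_ms))
```

A receiver built on `poll_readable`-style logic would keep working regardless of how large the socket's descriptor number is.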

Workaround: the options above require either recompiling the Linux kernel or changing the MySQL kernel code, so here is a temporary workaround. Run STOP SLAVE and START SLAVE on the slave to rebuild the socket connection between master and standby. As long as the handle values below 1024 on the master are not all in use, the new socket has a chance of getting an fd below 1024, in which case select works and semi-sync works again. If all fds below 1024 are held by long-lived connections, this method does not help.
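The workaround relies on the way descriptor numbers are handed out: a new descriptor gets the lowest number that is currently free. The sketch below illustrates this in a single-threaded Python process; it is a demonstration of the operating system's allocation rule, not of MySQL itself, and the function name is hypothetical.

```python
import os


def reopen_after_close():
    """Close a descriptor, open a new one, and return both numbers."""
    r, w = os.pipe()          # takes the two lowest free descriptor numbers
    os.close(r)               # frees the lower of the two
    new_fd = os.open(os.devnull, os.O_RDONLY)
    try:
        return r, new_fd
    finally:
        os.close(new_fd)
        os.close(w)
```

If nothing else in the process opens a file in between, the two returned numbers are the same, which is why a reconnect can land on a low fd once one has been freed.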

(6) Official version
I checked the latest official Oracle MySQL 5.7 source code on Git; this part is also implemented with select, so it has the same problem. Of course, because handle numbers are reused, fd > 1024 is unlikely when an instance has few connections or few long-lived connections, so the bug is not easy to trigger, but the problem is present everywhere this code is used.

(7) A related question
With the problem located, another question bothered me for quite a while. The MySQL kernel has three listening mechanisms: (1) select on the listening port, (2) epoll in the thread pool, and (3) select in the semi-sync listener. The binlog dump thread serving a slave is an ordinary worker thread, and worker threads' sockets are monitored by epoll, so the binlog dump socket would be monitored by both the semi-sync select listener and the thread-pool epoll at the same time. Wouldn't that cause chaos? After reading the code carefully, I found that the thread-pool epoll uses EPOLLONESHOT mode: after each message the socket is disarmed and has to be re-registered, so the same handle is never monitored by both mechanisms at once.
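The one-shot behaviour is easy to observe with Python's `select.epoll` on Linux. This is a minimal sketch of EPOLLONESHOT itself, using a pipe in place of a socket; it is not taken from the thread pool's code, and the function name is made up.

```python
import os
import select


def oneshot_demo():
    """Return how many events three successive polls report for one readable fd."""
    r, w = os.pipe()
    ep = select.epoll()
    try:
        os.write(w, b"x")                                    # make r readable
        ep.register(r, select.EPOLLIN | select.EPOLLONESHOT)
        first = ep.poll(0)     # the event is delivered once
        second = ep.poll(0)    # fd is now disarmed, although data is still unread
        ep.modify(r, select.EPOLLIN | select.EPOLLONESHOT)   # re-arm
        third = ep.poll(0)     # the event is delivered again
        return len(first), len(second), len(third)
    finally:
        ep.close()
        os.close(r)
        os.close(w)
```

Between the first poll and the re-arm, epoll reports nothing for the descriptor, which is the window in which another mechanism can watch it without interference.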

That concludes the troubleshooting process. The conclusion is simple, but locating the problem did take some effort. Because select is a widely used I/O multiplexing mechanism, anyone who uses the select function should keep this limitation in mind.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
