Troubleshooting MySQL Semi-Synchronous Replication and MySQL Synchronization

Source: Internet
Author: User

1. Background
By default, MySQL replication in our production environment is asynchronous. In extreme cases, when a master/slave switchover happens, there is some probability that the slave holds less data than the master, so after a switchover we use tools to roll back and replay the missing data and make sure nothing is lost. With semi-synchronous replication, the master does not complete a transaction until at least one slave has confirmed that it received that transaction. To keep master and slave data strongly consistent and reduce data loss, we decided to enable the semi-sync feature of MySQL replication in production.

In practice, most instances ran normally in semi-sync mode, but a small number of instances could never enter it and kept running in ordinary asynchronous mode. Stranger still, of two instances on the same host, one could enable semi-sync and the other could not. The root cause turned out to be simple, but tracking it down took some time. The troubleshooting process is described below.

2. Principles of semi-synchronous Replication
MySQL keeps the master and slave consistent through the binlog. After a transaction executes locally on the master and its binlog has been flushed to disk, the master returns to the client. The slave replays the master's binlog to apply the same operations. By default the master and slave are not strictly synchronized, so there is some probability that their data differs. The semi-sync feature was introduced to keep master and slave data consistent at all times. Compared with asynchronous replication, semi-synchronous replication requires that at least one slave has received the transaction's binlog before the master returns to the client. The implementation is straightforward: after local execution completes, the master waits for a response message from the slave, which contains the latest binlog (file, pos) the slave has received, and only then is the transaction considered complete. On the master instance, a dedicated thread (ack_receiver) receives the response messages from the slaves and uses a notification mechanism to tell the waiting sessions which logs the slave has received, so that they can continue. For more detail on the implementation, see the MySQL semi-sync source code.
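The wait-for-ACK step can be pictured with a small simulation. This is only a sketch of the idea, not MySQL's code: the function name commit_with_semi_sync and its parameters are made up for illustration, a timer stands in for the slave's reply reaching the ack_receiver thread, and the fallback on timeout mirrors what rpl_semi_sync_master_timeout controls.

```python
import threading


def commit_with_semi_sync(ack_delay_s, timeout_s):
    """Simulate one commit on the master.

    Returns "semi-sync" if the (simulated) slave ACK arrives before the
    timeout, otherwise "async" to model the fallback to normal replication.
    All names here are hypothetical; this is not MySQL's implementation.
    """
    ack_received = threading.Event()

    # Stand-in for the slave's reply arriving at the ack_receiver thread,
    # which then notifies the session waiting on this transaction.
    timer = threading.Timer(ack_delay_s, ack_received.set)
    timer.daemon = True
    timer.start()

    # The committing session blocks here until the ACK or the timeout.
    acked = ack_received.wait(timeout=timeout_s)
    timer.cancel()
    return "semi-sync" if acked else "async"


if __name__ == "__main__":
    print(commit_with_semi_sync(ack_delay_s=0.01, timeout_s=1.0))
    print(commit_with_semi_sync(ack_delay_s=1.0, timeout_s=0.05))
```

If the notification never reaches the waiting session, as in the second call, the commit behaves exactly as if the slave had not replied at all.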

3. Problem Analysis
With the principle of semi-synchronous replication covered, now for the specific problem. After the semi-sync switch was enabled on both master and slave, the status variable Rpl_semi_sync_master_status on the problematic instances stayed OFF, which means replication kept running in ordinary asynchronous mode.
(1) Modify the rpl_semi_sync_master_timeout parameter
Among the semi-sync parameters, rpl_semi_sync_master_timeout controls how long the master waits for the slave's response. If no response arrives within this time (the slave may never have received the transaction, or it may be running far behind the master), replication falls back to ordinary asynchronous mode so that transactions on the master do not wait too long. Our production default is 50 ms. Suspecting that this was too small, we changed it to 10 s, but the problem persisted.
(2) Print logs
The simplest, most brute-force way to troubleshoot is to turn on logging and see which step goes wrong. The master and slave have the parameters rpl_semi_sync_master_trace_level and rpl_semi_sync_slave_trace_level, respectively, which control what semi-sync replication writes to the log. We set both to 80 (64 + 16) to record detailed information and function calls.
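The value 80 is a bitmask. As a small illustration, the helper below (decode_trace_level is a made-up name) decodes a trace level into flag names, assuming the bit meanings given in the MySQL manual for these variables: 1 general, 16 detail, 32 net wait, 64 function entry and exit.

```python
# Bit values assumed from the MySQL manual's description of
# rpl_semi_sync_master_trace_level / rpl_semi_sync_slave_trace_level.
TRACE_FLAGS = {
    1: "general",
    16: "detail",
    32: "net wait",
    64: "function",
}


def decode_trace_level(level):
    """Return the names of the trace flags that are set in `level`."""
    return [name for bit, name in TRACE_FLAGS.items() if level & bit]


if __name__ == "__main__":
    print(decode_trace_level(80))
```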

master:
2016-01-04 18:00:30 13212 [Note] ReplSemiSyncMaster::updateSyncHeader: server(-1721062019), (mysql-bin.000006, 500717950) sync(1), repl(1)
2016-01-04 18:00:40 13212 [Warning] Timeout waiting for reply of binlog (file: mysql-bin.000006, pos: 500717950), semi-sync up to file , position 0.
2016-01-04 18:00:40 13212 [Note] Semi-sync replication switched OFF.

slave:
2016-01-04 18:00:30 38932 [Note] ---> ReplSemiSyncSlave::slaveReply enter
2016-01-04 18:00:30 38932 [Note] ReplSemiSyncSlave::slaveReply: reply (mysql-bin.000006, 500717950)
2016-01-04 18:00:30 38932 [Note] <--- ReplSemiSyncSlave::slaveReply exit (0)

The master log shows that at 18:00:30 on January 4 the master marked the event with the semi-sync flag and began waiting for the slave's response. Ten seconds later it still had not received a response, so the wait timed out, semi-sync mode was switched off, and replication fell back to ordinary mode. The slave log, however, shows that at 18:00:30 the slave had already replied to the master with (mysql-bin.000006, 500717950), indicating that it had received the log. In other words, the master set the semi-sync flag, the slave received the log and sent back its reply, yet the master waited the full 10 s without seeing that reply and switched to ordinary replication. The question now becomes: why did the master not receive it?
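The same reading of the logs can be done mechanically, which helps when there are many such entries. The sketch below hard-codes the log lines quoted in this article and uses regular expressions written for exactly those message formats; the helper names are made up, and other MySQL versions may word the messages differently.

```python
import re
from datetime import datetime

MASTER_LOG = """\
2016-01-04 18:00:30 13212 [Note] ReplSemiSyncMaster::updateSyncHeader: server(-1721062019), (mysql-bin.000006, 500717950) sync(1), repl(1)
2016-01-04 18:00:40 13212 [Warning] Timeout waiting for reply of binlog (file: mysql-bin.000006, pos: 500717950), semi-sync up to file , position 0.
2016-01-04 18:00:40 13212 [Note] Semi-sync replication switched OFF.
"""
SLAVE_LOG = """\
2016-01-04 18:00:30 38932 [Note] ---> ReplSemiSyncSlave::slaveReply enter
2016-01-04 18:00:30 38932 [Note] ReplSemiSyncSlave::slaveReply: reply (mysql-bin.000006, 500717950)
2016-01-04 18:00:30 38932 [Note] <--- ReplSemiSyncSlave::slaveReply exit (0)
"""

TS = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
SENT = TS + r" .*updateSyncHeader: .*\((\S+), (\d+)\) sync\(1\)"
TIMEOUT = TS + r" .*Timeout waiting for reply of binlog \(file: (\S+), pos: (\d+)\)"
REPLY = TS + r" .*slaveReply: reply \((\S+), (\d+)\)"


def first_match(pattern, log):
    """Return (timestamp, (binlog file, position)) for the first matching line."""
    m = re.search(pattern, log)
    when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    return when, (m.group(2), int(m.group(3)))


def analyze_logs():
    """Return (seconds the master waited, seconds until the slave replied,
    whether the slave replied for the position the master was waiting on)."""
    sent_at, sent_pos = first_match(SENT, MASTER_LOG)
    timeout_at, timeout_pos = first_match(TIMEOUT, MASTER_LOG)
    reply_at, reply_pos = first_match(REPLY, SLAVE_LOG)
    waited = (timeout_at - sent_at).total_seconds()
    reply_delay = (reply_at - sent_at).total_seconds()
    return waited, reply_delay, sent_pos == timeout_pos == reply_pos


if __name__ == "__main__":
    print(analyze_logs())
```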

(3) The select function
As mentioned above, a thread (ack_receiver) on the master instance receives the response packets. It listens on the sockets through the select function; when it detects a response message from a slave, it reads the message and notifies the worker thread to continue. Is the problem in the select function? select is a system call and would normally be above suspicion, but having come this far, it had to be checked. Several important macros are associated with select. They are defined mainly in /usr/include/bits/typesizes.h, /usr/include/bits/select.h, and /usr/include/sys/select.h.

FD_ZERO(fd_set *fdset): clears fdset so that it contains no file handles.
FD_SET(int fd, fd_set *fdset): adds the file handle fd to fdset.
FD_CLR(int fd, fd_set *fdset): removes the file handle fd from fdset.
FD_ISSET(int fd, fd_set *fdset): checks whether the file handle fd is set in fdset. After select returns, a nonzero result means that fd is ready for reading or writing.

typedef struct {
    __fd_mask __fds_bits[__FD_SETSIZE / __NFDBITS];   // 1024/64 = 16 (long int)
} fd_set;
#define __FD_SETSIZE 1024
typedef long int __fd_mask;                           // 8 bytes
#define __NFDBITS (8 * (int) sizeof (__fd_mask))      // 64 bits
#define __FDMASK(d) ((__fd_mask) 1 << ((d) % __NFDBITS))   // fd % 64 = N, set the Nth bit to 1
#define __FDELT(d) ((d) / __NFDBITS)                  // index of the long int
#define __FDS_BITS(set) ((set)->__fds_bits)
#define __FD_SET(d, set) (__FDS_BITS (set)[__FDELT (d)] |= __FDMASK (d))
#define __FD_CLR(d, set) (__FDS_BITS (set)[__FDELT (d)] &= ~__FDMASK (d))
#define __FD_ISSET(d, set) \
    ((__FDS_BITS (set)[__FDELT (d)] & __FDMASK (d)) != 0)

FD_SET registers each handle we want to monitor, and the handle information is stored in the fd_set bit array. The number of array elements is __FD_SETSIZE / 64; with __FD_SETSIZE = 1024 the whole array has only 16 long ints. Each handle occupies one bit, so the array can hold 1024 handles, numbered 0 to 1023. For example, for handle value 138, 138 / 64 = 2 and 138 % 64 = 10, so bit 10 of the long int at index 2 is set to 1. So what happens when the handle value is 1024 or more? I read the macro code carefully and found no bounds check at all: a handle value of 1024 or more writes past the end of the array. And because select only scans the bits inside the array to decide whether a handle is readable or writable, a handle of 1024 or more is never examined, so the master never learns that the slave has sent its response packet.
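The index arithmetic is easy to check by hand. The sketch below reproduces it in Python, assuming a 64-bit long int as in the header excerpt; fd_slot and fits_in_fd_set are illustrative names, not glibc functions.

```python
FD_SETSIZE = 1024              # __FD_SETSIZE in the header excerpt
NFDBITS = 64                   # 8 * sizeof(long int), assuming 64-bit Linux
WORDS = FD_SETSIZE // NFDBITS  # 16 long ints in fd_set


def fd_slot(fd):
    """Return (word index, bit index), computed the way __FDELT and __FDMASK do."""
    return fd // NFDBITS, fd % NFDBITS


def fits_in_fd_set(fd):
    """True if the word index for `fd` lies inside the 16-element array."""
    word, _ = fd_slot(fd)
    return fd >= 0 and word < WORDS


if __name__ == "__main__":
    for fd in (138, 1023, 1024, 2344):
        print(fd, fd_slot(fd), fits_in_fd_set(fd))
```

A handle of 2344 maps to word index 36, more than twice the size of the 16-element array, so setting its bit lands in whatever memory happens to follow the fd_set.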

(4) Verification
The above is only theoretical analysis. If the handle in the actual running instance is indeed 1024 or more, the problem is located.
1. Obtain the mysqld process ID (mysql-pid).
ps aux | grep mysqld | grep port
2. Attach gdb to the process.
gdb -p mysql-pid
3. Locate the ack_receiver thread and switch to it.
info threads
thread thread_id
4. Print the socket value. The fd value here is 2344.
p m_slaves
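Where attaching gdb to a production instance is not desirable, a rougher check is possible on Linux by listing the process's open handles under /proc. The sketch below is an assumption-laden alternative to the gdb steps, not a replacement: high_socket_fds is a made-up helper, and it only shows that the process holds socket handles of 1024 or more, not which of them belongs to the slave connection.

```python
import os


def high_socket_fds(fd_dir, limit=1024):
    """Return the sorted socket fd numbers >= limit found in a
    /proc/<pid>/fd-style directory, where each entry is a symlink
    named after the fd and sockets point at "socket:[inode]"."""
    found = []
    for name in os.listdir(fd_dir):
        if not name.isdigit():
            continue
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the fd was closed while we were scanning
        if target.startswith("socket:") and int(name) >= limit:
            found.append(int(name))
    return sorted(found)
```

A typical call would be high_socket_fds("/proc/<mysql-pid>/fd"), run as a user allowed to read that directory.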

(5) Solutions
As shown above, __FD_SETSIZE is generally defined as 1024, so select can monitor at most 1024 handles, and every handle value must be below 1024. The first option is to raise this limit, but the value is fixed at compile time in the system headers, so changing it requires a rebuild rather than a configuration change. Moreover, because of how select works, every bit has to be scanned on each call to see whether a handle has a message, so a larger setting makes it slower. select is an old I/O multiplexing mechanism. The newer poll and epoll provide similar functionality, are more capable, and limit neither the total number of handles nor the largest handle value, so reimplementing the listener with poll or epoll solves the problem completely. Plenty of material on select, poll, and epoll is available online, so they are not discussed further here.
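The difference between select and poll for large handle values can be observed from Python on a Unix system. The sketch below is an illustration, not MySQL code: compare_select_and_poll is a made-up name, and it first has to raise the process's open-file limit to obtain a handle of 1024 or more. Note one difference from the C macros: Python's select module checks the range and raises ValueError instead of writing past the array.

```python
import fcntl
import os
import resource
import select
import socket


def compare_select_and_poll(min_fd=1024):
    """Return (select_rejected, poll_saw_data) for a readable socket whose
    fd is >= min_fd, or None if such an fd cannot be created here."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    wanted = min_fd + 64
    if soft != resource.RLIM_INFINITY and soft < wanted:
        if hard != resource.RLIM_INFINITY and hard < wanted:
            return None
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
        except (ValueError, OSError):
            return None

    reader, writer = socket.socketpair()
    high_fd = None
    try:
        try:
            # Duplicate the reader onto the lowest free fd number >= min_fd.
            high_fd = fcntl.fcntl(reader.fileno(), fcntl.F_DUPFD, min_fd)
        except OSError:
            return None
        writer.send(b"ack")

        try:
            select.select([high_fd], [], [], 0.5)
            select_rejected = False
        except ValueError:
            # The fd does not fit in fd_set, so Python refuses it.
            select_rejected = True

        poller = select.poll()
        poller.register(high_fd, select.POLLIN)
        events = poller.poll(500)  # timeout in milliseconds
        poll_saw_data = any(
            fd == high_fd and bool(ev & select.POLLIN) for fd, ev in events
        )
        return select_rejected, poll_saw_data
    finally:
        if high_fd is not None:
            os.close(high_fd)
        reader.close()
        writer.close()
```

poll takes an explicit list of handles rather than a fixed-size bitmap, which is why the handle value itself does not matter.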

Temporary solution: the fixes above require either a rebuild with a larger limit or a change to the MySQL server code, so here is a stopgap. Run STOP SLAVE and START SLAVE on the slave to re-establish the socket connection between master and slave. As long as the handles below 1024 are not all in use, the new socket has a chance of getting a handle below 1024, in which case select works correctly and semi-sync can run normally. However, if all handles below 1024 are held by persistent connections, this method will not help.
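Reconnecting can help because POSIX systems hand out the lowest free handle number when a new file or socket is opened. The sketch below shows that rule with ordinary files; reopened_fd_matches_closed is a made-up name, and the demonstration assumes no other thread opens handles at the same moment.

```python
import os


def reopened_fd_matches_closed():
    """Open three descriptors, close the middle one, then open a new one.

    Because the lowest free number is always chosen, the new descriptor
    reuses the number that was just released.
    """
    fds = [os.open(os.devnull, os.O_RDONLY) for _ in range(3)]
    try:
        closed = fds.pop(1)
        os.close(closed)
        new_fd = os.open(os.devnull, os.O_RDONLY)
        fds.append(new_fd)
        return new_fd == closed
    finally:
        for fd in fds:
            os.close(fd)


if __name__ == "__main__":
    print(reopened_fd_matches_closed())
```

On a busy instance the freed low numbers are quickly taken by new client connections, which is why the restart only has a chance of landing below 1024.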

(6) Official version
I read the 5.7 source code in the latest official Oracle Git repository. It also uses select, so it has the same problem. Of course, because handle numbers are reused, an fd of 1024 or more is unlikely when an instance has few connections or few persistent connections, so the bug does not show up easily. The problem is nonetheless a general one.

(7) Extended question
After locating the problem, another question puzzled me for a long time. The MySQL server has three places that listen on sockets: the select on the listening port, the epoll in the thread pool, and the select in the semi-sync listener. The binlog dump thread that serves the slave is an ordinary worker thread, and worker-thread sockets are monitored by the thread pool's epoll, so the binlog dump socket would seem to be monitored by both the semi-sync select listener and the thread pool's epoll at the same time. Would that not cause confusion? Later, on reading the code carefully, I found that the thread pool's epoll uses EPOLLONESHOT mode: each time a message is received the handle is disarmed and must be re-registered, so the same handle is never monitored by both mechanisms at the same time.
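The one-shot behaviour is easy to observe on Linux. The sketch below is a standalone illustration with a socket pair, not the thread pool's code; oneshot_demo is a made-up name, and it returns None on systems without epoll.

```python
import select
import socket


def oneshot_demo():
    """Return the number of events seen on (first poll, second poll,
    poll after re-arming), or None where epoll is unavailable."""
    if not hasattr(select, "epoll"):
        return None
    reader, writer = socket.socketpair()
    ep = select.epoll()
    try:
        mask = select.EPOLLIN | select.EPOLLONESHOT
        ep.register(reader.fileno(), mask)
        writer.send(b"ack")

        first = len(ep.poll(0.5))   # event delivered, then the fd is disarmed
        second = len(ep.poll(0.1))  # data is still unread, but nothing is reported
        ep.modify(reader.fileno(), mask)  # re-arm the fd
        third = len(ep.poll(0.5))   # the pending data is reported again
        return first, second, third
    finally:
        ep.close()
        reader.close()
        writer.close()


if __name__ == "__main__":
    print(oneshot_demo())
```

Between the first event and the re-arm, epoll stays silent about the handle, which leaves room for another mechanism such as the semi-sync select listener to watch it without both firing on the same data.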

This concludes the troubleshooting. The conclusion is simple, but locating the problem took some effort. select is a widely used I/O multiplexing mechanism, so wherever it is used, its limitations deserve attention.

 
