MySQL 5.7 Semi-synchronous replication technology



I. History of the replication architecture


Before discussing this feature, let's look at the history of MySQL's replication architecture.



In 2000, MySQL 3.23.15 introduced replication. As a near-real-time synchronization method, replication came to be widely used. At that time the implementation involved two threads, one on the master and one on the slave. On the slave, the I/O and SQL work shared a single thread: each event fetched from the master was applied directly, with no relay log. As a result, the rate at which events were read was limited by the slave's replay speed, so when the slave lagged far behind, a large amount of binary log had not yet been transferred to the slave.



In 2002, MySQL 4.0.2 split the slave-side reading and execution of events into two independent threads (the IO thread and the SQL thread) and introduced the relay log. The IO thread reads events and writes them to the relay log; the SQL thread reads events from the relay log and executes them. This way, even if the SQL thread executes slowly, the master's binary log is transferred to the slave as completely as possible, and when the master goes down, switching to the slave does not lose a large amount of data.



This asynchronous replication was the only option until MySQL 5.5 was released in 2010. The master executes transactions without regard to the slave's synchronization progress; if the slave has fallen behind and the master crashes, data is lost. MySQL therefore introduced semi-synchronous replication in 5.5: the master must ensure that at least one slave has received the events and written them to its relay log before it answers the client that submitted the transaction. So does semi-synchronous replication guarantee that no data is lost? This is analyzed below.



In 2016, MySQL 5.7.17 introduced a new technology called InnoDB Group Replication. With it, MySQL officially offers a full-synchronization technology based on Group Replication, which brings stronger data consistency guarantees. It is likely to be an important direction for synchronization technology and is worth looking forward to.



The replication protocols mentioned above correspond to MySQL's three types of replication: asynchronous, semi-synchronous, and fully synchronous.





    • With asynchronous replication, the master writes the transaction's binlog events to the binlog file and notifies the dump thread to send the new binlog; the master then continues with the commit without any guarantee that the binlog will reach any slave node.
    • With fully synchronous replication, after the master commits a transaction, all slave nodes must receive, apply, and commit it before the master thread can continue with subsequent operations. An obvious drawback is that the time the master needs to complete a transaction is stretched and performance drops.
    • Semi-synchronous replication sits between fully synchronous and asynchronous replication: the master waits until at least one slave node has received the binlog and flushed it to its relay log file, and does not wait for feedback from all slaves. Moreover, the feedback only confirms receipt, not that the transaction has been fully executed and committed, which saves a lot of time.


II. Semi-synchronous replication technology


Today we discuss the second of these. Ordinary MySQL replication, that is, asynchronous replication, relies on the MySQL binary log (binlog) to replicate data. Consider two machines: one master and one slave.



1) Normal replication proceeds as follows: transaction T1 is written to the binlog buffer; the dumper thread notifies the slave that there is a new transaction T1; binlog buffer checkpoint; the slave IO thread receives T1 and writes it to its own relay log; the slave SQL thread applies it to the local database. At this point both master and slave can see the new transaction, so even if the master crashes, the slave can be promoted to the new master.



2) Abnormal replication proceeds as follows: transaction T1 is written to the binlog buffer; the dumper thread notifies the slave that there is a new transaction T1; binlog buffer checkpoint; because the network is unstable, the slave never receives T1; the master crashes and the slave is promoted to the new master; T1 is lost.



3) A bigger problem is that the master and slave do not apply updates in step: even with no network or other system faults, under heavy business load the slave must execute the master's batches of transactions sequentially, which results in a large delay.



To address these shortcomings, MySQL introduced semi-synchronous replication starting with 5.5. Compared with asynchronous replication it improves data integrity, because once a transaction commits successfully it is known to exist in at least two places. That is, after the master's dumper thread notifies the slave, an ACK (message acknowledgement) step is added, a flag indicating that T1 was received successfully: besides sending T1 to the slave, the dumper thread also takes on the job of receiving the slave's ACK. If an exception occurs and no ACK is received, replication is automatically downgraded to ordinary (asynchronous) replication until the exception is resolved, and then automatically returns to semi-synchronous replication.



Specific features of semi-synchronous replication:


    • When a slave connects to the master, it tells the master whether it is configured for semi-synchronous replication.
    • If semi-synchronous replication is enabled on the master and at least one semi-synchronous slave is connected, the master's transaction thread blocks at commit time and waits. There are two possible outcomes: either at least one slave notifies the master that it has received all binlog events of the transaction, or the wait exceeds the configured timeout, at which point semi-synchronous replication is automatically turned off and the master switches to asynchronous replication.
    • A slave notifies the waiting thread on the master only after all binlog events of a transaction have been received, written, and flushed to its relay log file.
    • If the wait exceeds the configured timeout without any slave acknowledging the current transaction, the master automatically switches to asynchronous replication; once at least one semi-synchronous slave has caught up, the master automatically switches back to semi-synchronous replication.
    • Semi-synchronous replication must be enabled on both the master and the slave. If it is not enabled on the master, or is enabled on the master but on no slave, the master replicates asynchronously.
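
The last point can be checked from a client session. The following is a minimal sketch, assuming the plugins were installed under their standard names (rpl_semi_sync_master and rpl_semi_sync_slave); it only reads state and changes nothing.

```sql
-- Run on both master and slave: are the semi-sync plugins installed and active?
SELECT PLUGIN_NAME, PLUGIN_STATUS
  FROM INFORMATION_SCHEMA.PLUGINS
 WHERE PLUGIN_NAME LIKE '%semi%';

-- Run on the master: enabled flag and timeout (in milliseconds).
SHOW GLOBAL VARIABLES LIKE 'rpl_semi_sync_master_%';

-- Run on the slave: enabled flag.
SHOW GLOBAL VARIABLES LIKE 'rpl_semi_sync_slave_enabled';
```

If the enabled flag is OFF on either side, the master replicates asynchronously.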


Potential issues with semi-synchronous replication:



Let's look at how semi-synchronous replication works:






The master writes each transaction to the binlog (sync_binlog=1) and sends it to the slave, which flushes it to disk (sync_relay_log=1), while the master commits the transaction. The master then waits for the slave to acknowledge that the events have reached its relay log, and only after receiving the ACK does it return the commit OK result to the client.



In MySQL 5.5 and 5.6, which use the AFTER_COMMIT mode, consider what happens if the master goes down after the client's transaction has been committed at the storage engine level but while the master is still waiting for the slave's confirmation. While the master waits for the slave ACK, the current client has not yet received a response, but the transaction is already committed and other clients can read it. If the slave has not yet read the transaction's events and the master crashes, service switches to the slave. The transaction that was previously read is then gone, and a phantom read occurs. The situation is shown in the figure, which is quoted from Loss-less semi-synchronous Replication on MySQL 5.7.2.






If the master never comes back up, a transaction that was successfully committed on the master cannot be found on the slave; in other words, data is lost, which is not what MySQL wants. MySQL 5.7 therefore added the AFTER_SYNC (lossless replication) wait point and made it the default semi-synchronous mode, which solves this data loss problem.


III. MySQL 5.6 semi-synchronous replication configuration


For the complete configuration, see: MySQL master-slave replication based on log position (II)



Master configuration



1) Install the semi-sync plugin and enable it (the plugin file is /usr/local/mysql/lib/plugin/semisync_master.so)


mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so';




  
mysql> set global rpl_semi_sync_master_enabled = 1;
mysql> set global rpl_semi_sync_master_timeout = 2000;


After installation, enable the plugin and change the timeout for master-slave connection errors from the default 10s to 2s; once the timeout is exceeded, replication is automatically downgraded to asynchronous. (To make these settings permanent, write them to the configuration file.)


# cat /etc/my.cnf
[mysqld]
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_master_timeout = 2000





Slave configuration



1) Install the semi-sync plugin and enable it


mysql> install plugin rpl_semi_sync_slave soname 'semisync_slave.so';
mysql> set global rpl_semi_sync_slave_enabled = 1;
mysql> show global variables like '%semi%';
+---------------------------------+-------+
| Variable_name                   | Value |
+---------------------------------+-------+
| rpl_semi_sync_slave_enabled     | ON    |
| rpl_semi_sync_slave_trace_level | 32    |
+---------------------------------+-------+
2 rows in set (0.00 sec)





2) The slave must reconnect to the master before semi-sync takes effect


mysql> stop slave io_thread;
mysql> start slave io_thread;





PS: To unload the semi-synchronous plugin, use UNINSTALL PLUGIN.
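
As a rough sketch of that cleanup, assuming the plugin names used in the installation steps above, semi-sync can be disabled and the plugins removed as follows. The order (disable, reconnect, then uninstall) is a cautious choice for illustration rather than a documented requirement.

```sql
-- On the slave: disable semi-sync, reconnect so the master notices, then unload.
SET GLOBAL rpl_semi_sync_slave_enabled = 0;
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;
UNINSTALL PLUGIN rpl_semi_sync_slave;

-- On the master: disable semi-sync, then unload.
SET GLOBAL rpl_semi_sync_master_enabled = 0;
UNINSTALL PLUGIN rpl_semi_sync_master;
```

Any rpl_semi_sync_* lines in my.cnf should be removed as well, since the server does not recognize those options once the plugin is gone.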



Check whether semi-sync is enabled on the master






To confirm that semi-sync is working properly, check that Rpl_semi_sync_master_clients is not 0 and that Rpl_semi_sync_master_status is ON. If Rpl_semi_sync_master_status is OFF, there is network delay or the slave IO thread is lagging.
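
A minimal way to read these two values, plus the matching status on the slave, is sketched below. The statements only read status variables provided by the semi-sync plugins.

```sql
-- Run on the master.
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_clients';  -- number of semi-sync slaves connected
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_status';   -- ON while semi-sync is in effect

-- Run on the slave.
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_slave_status';    -- ON once the IO thread has reconnected
```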



Next, verify that semi-sync automatically falls back to asynchronous replication after the timeout. Turn off semi-sync on the slave, then create a database on the master and see whether it is replicated to the slave.



Slave


# Turn off semi-synchronization;
mysql> set global rpl_semi_sync_slave_enabled = 0;
mysql> stop slave io_thread;
mysql> start slave io_thread; 





Master


mysql> create database dbtest;
Query OK, 1 row affected (2.01 sec)

mysql> create database dbtest01;
Query OK, 1 row affected (0.01 sec)





Creating the first database took 2.01 seconds, matching the 2-second timeout set earlier, while the second took 0.01 seconds, from which we conclude that after the timeout replication switched to asynchronous. On the master you can check whether the semi-sync status values Rpl_semi_sync_master_clients and Rpl_semi_sync_master_status are normal.


mysql> show global status like '%semi%';
+--------------------------------------------+-----------+
| Variable_name                              | Value     |
+--------------------------------------------+-----------+
| Rpl_semi_sync_master_clients               | 0         |
| Rpl_semi_sync_master_net_avg_wait_time     | 0         |
| Rpl_semi_sync_master_net_wait_time         | 0         |
| Rpl_semi_sync_master_net_waits             | 37490     |
| Rpl_semi_sync_master_no_times              | 3         |
| Rpl_semi_sync_master_no_tx                 | 197542    |
| Rpl_semi_sync_master_status                | OFF       |
| Rpl_semi_sync_master_timefunc_failures     | 0         |
| Rpl_semi_sync_master_tx_avg_wait_time      | 51351     |
| Rpl_semi_sync_master_tx_wait_time          | 362437445 |
| Rpl_semi_sync_master_tx_waits              | 7058      |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0         |
| Rpl_semi_sync_master_wait_sessions         | 0         |
| Rpl_semi_sync_master_yes_tx                | 7472      |
+--------------------------------------------+-----------+
14 rows in set (0.00 sec)





You can see that semi-sync has been switched off automatically. Note that when semi-sync is turned back on at the slave, or when the network delay between master and slave returns to normal, replication automatically switches back from asynchronous to semi-synchronous, which is fairly intelligent.



Another situation seen in practice: when the slave IO thread lags, the master automatically downgrades semi-synchronous replication to asynchronous, and when the slave IO thread catches up, the master upgrades back to semi-synchronous. This can be simulated with a stress test. In this case the master's status differs somewhat from when semi-sync is turned off on the slave directly: Rpl_semi_sync_master_clients still equals 1 while Rpl_semi_sync_master_status is OFF.
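
To observe such a fallback during a stress test, one option is to sample the relevant counters on the master before and after the run. This is only a sketch; the inline comments give each counter's meaning.

```sql
-- Run on the master before and after the stress test and compare the values.
SHOW GLOBAL STATUS
 WHERE Variable_name IN (
       'Rpl_semi_sync_master_status',    -- ON or OFF right now
       'Rpl_semi_sync_master_clients',   -- semi-sync slaves currently connected
       'Rpl_semi_sync_master_no_times',  -- times the master turned semi-sync off
       'Rpl_semi_sync_master_no_tx',     -- commits not acknowledged by a slave
       'Rpl_semi_sync_master_yes_tx');   -- commits acknowledged by a slave
```

A growing Rpl_semi_sync_master_no_times value indicates that the master has fallen back to asynchronous replication at least once in the interval.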



With the release of MySQL 5.7, the semi-synchronous replication technology has been upgraded to a new loss-less semi-synchronous replication architecture, with a significant increase in maturity, data consistency and execution efficiency.


IV. MySQL 5.7 semi-synchronous replication improvements


We now know that in a semi-synchronous environment the master waits for the slave ACK after the transaction has committed, which is what creates the data inconsistency problem. The point at which the master waits for the slave ACK is therefore a very important question. To address this, MySQL 5.7.2 introduced loss-less semi-synchronous replication: the master waits for the slave ACK after the binlog sync call and before the engine-layer commit. A transaction is thus committed only after it is confirmed that a slave has received its events. Waiting for the slave ACK before commit also lets transactions accumulate, which helps group commit and improves performance.



In MySQL 5.7, install the semi-sync plugin with the following command:


mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so';
Query OK, 0 rows affected (0.00 sec)





Take a look at the related variables


mysql> show global variables like '%semi%';
+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | OFF        |
| rpl_semi_sync_master_timeout              | 10000      |
| rpl_semi_sync_master_trace_level          | 32         |
| rpl_semi_sync_master_wait_for_slave_count | 1          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
+-------------------------------------------+------------+
6 rows in set (0.00 sec)




    • Support for lossless replication (loss-less semi-synchronous)


In loss-less semi-synchronous mode, the master waits for the slave ACK (an ACK from at least one slave node) after calling binlog sync and before the engine-layer commit. The master thus commits the transaction only after confirming that a slave has received its events, and then returns the result to the client. While the master is waiting, the transaction is not visible to other transactions. Because the master has not yet committed, the phantom read and data loss problems of AFTER_COMMIT mode are resolved. There is still one issue, however: if the master crashes before the storage engine commit, the transaction has clearly not succeeded, yet because the corresponding binlog has already been synced, the slave may have received that binlog and applied it successfully. The slave then holds more data than the master. This is also a problem, but extra data is generally not serious. It can be understood this way: without solving the distributed data consistency problem, MySQL can at least guarantee that no data is lost, and extra data is better than lost data.



Lossless replication adds the rpl_semi_sync_master_wait_point parameter to semi-sync. In semi-synchronous mode it controls the point at which the master waits for the slave ACK before reporting success to the session. The parameter has two values: AFTER_COMMIT and AFTER_SYNC.



First value: AFTER_COMMIT (the behavior in 5.6)



The master writes each transaction to the binlog (sync_binlog=1) and sends it to the slave, which flushes it to disk (sync_relay_log=1), and the master commits the transaction. The master then waits for the slave to acknowledge that the events have reached its relay log, and only after receiving the ACK does it return the commit OK result to the client.






Second value: AFTER_SYNC (the default in 5.7; this mode does not exist in 5.6)



The master writes each transaction to the binlog and sends it to the slave, which flushes it to disk (relay log). The master waits for the slave's ACK confirming receipt into the relay log before it commits the transaction and returns the commit OK result to the client. Even if the master crashes, every transaction that was committed on the master is guaranteed to have reached the slave's relay log.
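
The wait point can be inspected and switched at runtime on a 5.7 master. The sketch below assumes the rpl_semi_sync_master plugin is already installed; the AFTER_COMMIT line is left commented out and is shown only for comparison.

```sql
-- Run on the master: show the current wait point.
SHOW GLOBAL VARIABLES LIKE 'rpl_semi_sync_master_wait_point';

-- Lossless mode (the 5.7 default): wait for the ACK before the engine commit.
SET GLOBAL rpl_semi_sync_master_wait_point = 'AFTER_SYNC';

-- 5.6-style behavior: wait for the ACK after the engine commit.
-- SET GLOBAL rpl_semi_sync_master_wait_point = 'AFTER_COMMIT';
```

As with the other settings in this article, the value has to be written to the configuration file to survive a restart.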






Comparison of semi-synchronous replication and lossless replication



1.1 The ACK is awaited at a different point in time


    • Semi-synchronous replication (AFTER_COMMIT) waits for the ACK after the commit log has been written in the InnoDB layer; a master-slave switchover carries the risk of data loss.
    • Lossless replication (AFTER_SYNC) waits for the ACK after the binlog has been written in the MySQL server layer; a master-slave switchover carries the risk of extra data on the slave.


1.2 Master-Slave data consistency


    • With semi-synchronous replication, the newly committed transaction has already changed the database on the master and is visible to other transactions. Therefore, if the master crashes while waiting for the slave ACK, other transactions experience a phantom read and data is lost.
    • With lossless replication, the binlog is transmitted once it has been written, but the commit log has not yet been written, so the current transaction's changes to the database are not visible to other transactions. There is therefore no risk of phantom reads or data loss.


So the main benefit of the lossless replication (AFTER_SYNC) mode introduced in 5.7 is that it solves the data loss caused by a master crash under AFTER_COMMIT. With AFTER_SYNC, all committed data has already been replicated, so data consistency during failover is improved.


    • Improved performance: binlog sending and ACK receiving are now asynchronous


Older versions of semi-sync were limited by the dump thread, which performed two different and very frequent tasks: sending binlog to the slave and waiting for the slave's feedback. The two tasks were serial: the dump thread had to wait for the slave's reply before it could send the next transaction's events. The dump thread thus became the bottleneck for semi-synchronous performance, and in high-concurrency scenarios this mechanism limits the overall TPS of the database.






To solve this problem, the semi-sync framework in 5.7 splits off a separate ACK receiver thread dedicated to receiving the ACKs returned by slaves, so the sending and receiving work previously done by the dump thread is handled by two threads. The two threads on the master work independently and can send binlog to the slave and receive the slave's ACKs at the same time, which gives semi-synchronous replication a significant performance gain. This is what was presented as "Faster semi-sync replication" when MySQL 5.7 was released.






Before MySQL 5.7.17, however, this ACK receiver thread used select() to listen for the results returned by slaves, and select() can only monitor file descriptors numbered below 1024. When a descriptor number exceeds that limit, the user may see an error like the following in the MySQL log, and MySQL may even crash.



Semi-sync master failed on net_flush() before waiting for slave reply.



MySQL 5.7.17 fixed this bug by replacing select() with poll(), which avoids the problem. poll() is essentially no different from select(), except that it has no theoretical limit on the number of I/O handles, because they are stored in a linked list. It has drawbacks too: for example, the whole array of file descriptors is copied between user space and kernel address space on every call, whether or not the copy is useful. Another property of poll() is that it is level-triggered: if a file descriptor is reported and not handled, the next poll() call reports it again.



High-performance software in fact often uses another mechanism, epoll; representative examples such as nginx and haproxy use it. poll is simpler than epoll, though, and is probably sufficient for a single ACK receiver thread.


    • Performance improvement: controlling the number of slave acknowledgements the master waits for


MySQL 5.7 adds the rpl_semi_sync_master_wait_for_slave_count parameter, which controls how many slaves must acknowledge that they have received a transaction before the master proceeds, providing flexibility for high-availability failover architectures. When the value is 2, the master waits for ACKs from two slaves.
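
A minimal sketch of changing this setting at runtime, assuming a master with at least two semi-synchronous slaves attached (with fewer, commits would wait until the timeout):

```sql
-- Run on the master: require ACKs from two slaves per transaction.
SET GLOBAL rpl_semi_sync_master_wait_for_slave_count = 2;

-- Confirm the new value and how many semi-sync slaves are connected.
SHOW GLOBAL VARIABLES LIKE 'rpl_semi_sync_master_wait_for_slave_count';
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_clients';
```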





    • Performance improvement: binlog mutex improvement


In older versions of semi-synchronous replication, both the session writing to the binlog at commit time on the master and the dump thread reading the binlog took a mutex on the binlog, so reads and writes of the binlog file were serialized, which limited concurrency.






MySQL 5.7 optimizes binlog locking in the following two ways:



1. The dump thread no longer takes the binlog mutex.



2. A safe margin was added to guarantee that binlog reads are safe.






As the introduction of these replication features shows, MySQL has been steadily improving. At the same time, we can see that the current native MySQL master-slave replication can hardly deliver high availability and high performance while also guaranteeing data consistency.


V. The sync_binlog/sync_relay_log parameters and semi-synchronous replication


Configuration of sync_binlog



In fact, even with lossless replication there are cases that can leave master and slave data inconsistent, so that master-slave synchronization fails. See the sync_binlog configuration analysis below.



Source analysis


sql/binlog.cc ordered_commit
  update_binlog_end_pos_after_sync = (get_sync_period() == 1);
  ...
  // When sync_period (sync_binlog) is 1, update binlog end pos after sync
  if (!update_binlog_end_pos_after_sync)
    // Update the binlog end position; the dump thread will send the updated events
    update_binlog_end_pos();
  ...
  std::pair<bool, bool> result = sync_binlog_file(false);
  ...
  if (update_binlog_end_pos_after_sync)
  {
    ...
    update_binlog_end_pos(tmp_thd->get_trans_pos());
  }


sql/binlog.cc sync_binlog_file
  std::pair<bool, bool>
  MYSQL_BIN_LOG::sync_binlog_file(bool force)
  {
    bool synced = false;
    unsigned int sync_period = get_sync_period(); // sync_binlog value
    // If sync_period is 0, no sync is done; otherwise sync once the call counter reaches sync_period
    if (force || (sync_period && ++sync_counter >= sync_period))
    {





Configuration analysis



When sync_binlog is 0, syncing the binlog to disk is left to the operating system. When it is nonzero, the value is the number of binlog commit groups between syncs to disk. From the source we know that when sync_binlog is not 1, the binlog is made available for sending to the slave during the flush stage, whereas when it is 1, this happens after the sync stage. When sync_binlog is greater than 1, a sync stage does not necessarily write the binlog to disk. If the binlog has not reached disk and the master loses power before the transaction commits, the transaction is rolled back on recovery. The slave, however, may already have received and executed the transaction's events, so the slave ends up with more transactions than the master and master-slave synchronization fails. So to keep master and slave consistent, sync_binlog must be set to 1.



The position at which events are sent, as drawn in the AFTER_SYNC and AFTER_COMMIT diagrams, can also lead to inconsistent master and slave data and to synchronization failure. The diagrams in the rpl_semi_sync_master_wait_point analysis actually depict the case where sync_binlog is greater than 1. According to the source above, the flow is as shown: the master executes flush binlog, update binlog position, and sync binlog in turn. If the master crashes after updating the binlog position but before syncing the binlog, the transaction is rolled back when the master restarts. The slave, however, may already have fetched the events, which again leaves the slave with more data than the master, and master-slave synchronization fails.






For these reasons, when sync_binlog is set to 1, MySQL updates the binlog end position only after the sync. The flow is as shown. In this case every transaction requires a binlog sync, and because syncing the binlog and sending events over the network are serial, performance drops significantly.
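
A minimal sketch of applying this setting on a running master is shown below. It covers only the binlog side discussed here and should also be written to the configuration file to survive a restart.

```sql
-- Run on the master: sync the binlog to disk at every commit group.
SET GLOBAL sync_binlog = 1;

-- Confirm the value.
SHOW GLOBAL VARIABLES LIKE 'sync_binlog';
```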






Configuration of sync_relay_log



Source analysis


sql/rpl_slave.cc handle_slave_io

  if (queue_event(mi, event_buf, event_len))
  ...
  if (RUN_HOOK(binlog_relay_io, after_queue_event,
               (thd, mi, event_buf, event_len, synced)))

after_queue_event
->plugin/semisync/semisync_slave_plugin.cc repl_semi_slave_queue_event
->plugin/semisync/semisync_slave.cc ReplSemiSyncSlave::slaveReply

queue_event
->sql/binlog.cc MYSQL_BIN_LOG::append_buffer(const char* buf, uint len, Master_info *mi)
->sql/binlog.cc after_append_to_relay_log(mi);
->sql/binlog.cc flush_and_sync(0)
->sql/binlog.cc sync_binlog_file(force)





Configuration analysis



In the slave IO thread, get_sync_period returns the value of sync_relay_log, which controls syncing in the same way as sync_binlog. When sync_relay_log is not 1, the position that semi-sync reports back to the master may not yet have been synced to disk. Under gtid_mode, provided the two preceding settings are configured correctly, even if sync_relay_log is not 1, a crash of only the master or only the slave causes neither data loss nor master-slave synchronization failure. Suppose the slave has not synced its relay log, the master-side transaction commits, the client observes the commit, and then the slave crashes: the slave loses a transaction for which it has already sent an ACK to the master.






When the slave starts again, however, if the master crashes before the slave has had time to re-fetch the lost transaction's events from it, a user who accesses the slave will find that data is missing.






From this case it follows that, for MySQL semi-sync to guarantee no data loss when any single machine goes down at any moment, sync_relay_log must also be set to 1. The sync of the relay log happens in queue_event, once per event, so setting sync_relay_log to 1 affects transaction response time, and transactions that involve more data suffer considerably more latency.
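
The corresponding setting on the slave can be sketched as follows. It should also be written to the configuration file to survive a restart, and the latency cost described above applies.

```sql
-- Run on the slave: sync the relay log to disk after every event.
SET GLOBAL sync_relay_log = 1;

-- Confirm the value.
SHOW GLOBAL VARIABLES LIKE 'sync_relay_log';
```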



MySQL three-node deployment



In the master-slave semi-sync data consistency analysis above, high availability is sacrificed: service stops when network jitter or an outage occurs between master and slave. To achieve high availability, the natural idea is one master with two slaves, which handles the availability problem when there is network jitter or one machine goes down. However, the configuration requirements for data consistency described earlier still apply, so performance in normal operation does not improve. We also need to solve the problems of how to choose the new master when the master goes down and how to avoid a multi-master situation.






When choosing a new master, be sure to read both slaves to see which has the latest log; otherwise data may be lost. Such a three-node scheme resembles a distributed quorum mechanism: a write must reach a quorum of the three nodes, and a quorum must be read when the new master is chosen. A distributed consensus protocol (Paxos/Raft) can solve the data consistency, leader election, and multi-master problems, so in recent years many database teams in China have implemented three-node schemes based on Paxos/Raft. Recently MySQL also introduced Group Replication, in plugin form, to support multi-master clusters.
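
As a rough illustration of "see which slave has the latest log", the following sketch assumes gtid_mode is ON on all nodes. The UUID and ranges in the last statement are made-up placeholders standing in for the sets read from the two slaves.

```sql
-- Run on each surviving slave.
SELECT @@GLOBAL.gtid_executed;                     -- transactions already applied
SELECT RECEIVED_TRANSACTION_SET                    -- transactions received from the master
  FROM performance_schema.replication_connection_status;

-- Compare two sets: returns 1 if every GTID in the first set is also in the second.
SELECT GTID_SUBSET('3e11fa47-71ca-11e1-9e33-c80aa9429562:1-90',
                   '3e11fa47-71ca-11e1-9e33-c80aa9429562:1-100') AS first_is_subset;
```

The slave whose received set contains the other's is the safer candidate for promotion. A real failover tool would automate this comparison and also fence the old master.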






Transferred from: http://www.ywnds.com/?p=7023


