Perfect distributed transaction support for MySQL 5.7

Source: Internet
Author: User

Perfect distributed transaction support for MySQL 5.7

Two Phase Commit Protocol

Distributed transactions generally adopt the 2 PC Protocol, which is the full name of Two Phase Commitment Protocol. This Protocol is mainly used to solve the problem of data consistency between all nodes in the distributed database scenario. In a distributed transaction environment, the transaction commit becomes relatively complex. Because multiple nodes exist, some nodes may fail to commit, that is, the ACID feature of transactions must be ensured in each database instance. All in all, when a distributed commit fails, all nodes cannot be committed. The entire distributed transaction can be committed only when all nodes can be committed.

Distributed transactions are committed in two phases through the 2 PC protocol.

  1. Prepare;
  2. Commit/rollback

The prepare in the first phase is only used to check whether transactions of each node can be committed. Only when the transactions of all nodes are "Allowed" Can the commit of the second stage be implemented. Otherwise, rollback is enabled. Note that all the successful transactions of prepare must be committed.

MySQL distributed transactions

For a long time, MySQL databases support distributed transactions, but only limited support is supported, as shown in:

  • Transactions with prepare are rolled back when the client exits or the service is down.
  • After a Server failure is submitted, the corresponding Binlog is lost.

The problem exists in the MySQL database for decades and is not officially fixed until the MySQL-5.7.7 version.Although InnoSQL has already been fixed in version 5.5, compared with the official repair solution, we are not doing so elegantly.. The specific performance and official fixes for this issue will be detailed below, where verification is conducted using the official MySQL-5.6.27 version (unfixed) and the MySQL-5.7.9 version (fixed), respectively.

First, let's take a look at the problem. First, create a table as follows:

Create table t (
Id int auto_increment primary key,
A int
) Engine = innodb;

For the preceding tables, insert data using the following operations:

Mysql> xa start 'mysql56 ';
Mysql> insert into t VALUES (1, 1 );
Mysql> xa end 'mysql56 ';
Mysql> xa prepare 'mysql56'

Through the above operation, the user creates a distributed transaction and the prepare does not return an error, indicating that the distributed transaction can be committed. Run the xa recover command to view the following results:

Mysql> xa recover;
+ ---------- + -------------- + --------- +
| FormatID | gtrid_length | bqual_length | data |
+ ---------- + -------------- + --------- +
| 1 | 7 | 0 | mysql56 |
+ ---------- + -------------- + --------- +

If you re-connect after exiting the client, you can run the xa recover command to find that the created 2 PC transaction is missing.That is, the successful transactions of prepare are lost and do not comply with the 2PC protocol specification !!!

The main cause of the above problem is that the MySQL-5.6 version automatically rolls back the transaction that has already prepare when the client exits, so why does MySQL do this? This mainly depends on the internal implementation of MySQL, The MySQL-5.7 of previous versions, for prepare transactions,MySQL does not record binlog.(Reducing fsync officially plays an optimized role ). The preceding operations are written to binlog information only when the distributed transaction is committed. Therefore, for binlog,There is no difference between distributed transactions and common transactions.And the previous operation information of prepare is stored in the connected IO_CACHE. If the client exits at this time, the previous binlog information will be lost and can be submitted after re-connection, this will cause Binlog loss, resulting in inconsistency between the master and slave data. Therefore, when the client exits, the official system rolls back all transactions that have already been prepare!

The official practice seems to have done a good job. It sacrifices something standardized, at least to ensure the consistency of master-slave data. But in fact, if the user has already prepare and MySQL goes down before the client exits, what will happen at this time?

MySQL goes down after a successful prepare of a distributed transaction. The connection to the transaction before it goes down is not broken. At this time, transactions that have already been prepare will not be rolled back, therefore, after MySQL is restarted, the engine layer can recover the transaction through the recover mechanism. Of course, the Binlog of the transaction has been lost in the process of downtime. At this time, if committed, it will cause inconsistency between the master and slave data,That is, the submission does not record the Binlog, and the data is lost from the above.In this case, the official recommendation is to roll back the transactions that have already been prepare.

The above is the previous version of the MySQL-5.7 MySQL in the distributed transactions on a variety of issues, then the MySQL-5.7 version of the official do what improvements? This can get some information from the official WL #6860 description, we still in the absence of practice, there is no say in the attitude, from the specific operation to analyze the improvement method of MySQL-5.7:

Perform the following operations on the same table structure:

Mysql> xa start 'mysql57 ';
Mysql> insert into t VALUES (1, 1 );
Mysql> xa end 'mysql57 ';
Mysql> xa prepare 'mysql57'

At this time, we use mysqlbinlog to view the Binlog on the Master. The result is as follows:

The Relay log on Slave is also compared as follows:

Through the above operations, it is obvious that after prepare, the operations from xa start to xa prepare are recorded in the Binlog of the Master, and then transmitted to the Slave through the replication relationship. That is to say, the MySQL-5.7, MySQL for distributed transactions, in the prepare completed the write Binlog operation, by addingXA_prepare_log_eventThis is the main difference from previous versions (Binlog is not written in previous versions of prepare)

Of course, this alone is not enough, because we know that Slave uses SQL thread to play back Relay log information, because the prepare transaction can block the entire session, if only one SQL thread is played back (parallel playback is not considered), will the SQL thread be blocked by the prepare stage of the distributed transaction, resulting in a problem in the whole SQL thread playback? This is also the second problem officially to be solved: how can we enable SQL thread to play back the event in the prepare phase of distributed transactions without blocking the playback of the event in the future? In fact, this implementation is also very simple (in xa. cc: applier_reset_xa_trans), as long as the SQL thread is played back to the prepare, the client can be disconnected (the connection handle of the related cache and SQL thread is detached ). Finally, on the Slave server, you can find the following information by running the xa recover command:

Mysql> xa recover;
+ ---------- + -------------- + --------- +
| FormatID | gtrid_length | bqual_length | data |
+ ---------- + -------------- + --------- +
| 1 | 7 | 0 | mysql57 |
+ ---------- + -------------- + --------- +

When the preceding transaction is committed, the Server Load balancer will also be committed after xa commit 'mysql57 'is executed on the Master.

Summary

To sum up, MySQL 5.7's support for distributed transactions has become perfect, and a decades-long bug has been fixed, so there is another reason to upgrade to the MySQL-5.7 version.

This article permanently updates the link address:

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.