Concurrent Replication Series One: Binlog Group submission

Source: Internet
Author: User
Tags log log

http://blog.itpub.net/28218939/viewspace-1975809/


Ma Pengfei Technology MySQL database engineer

MySQL binary log was launched after MySQL version 5.1, which is mainly used for primary and standby replication, and we review how MySQL works when the Binary log function is turned on/off.

MySQL does not have Binary log turned on:

    • The InnoDB storage engine can use the redo and undo logs to crash recovery the database and, when the data crash recovery, apply all transactions that have been committed within the storage engine by redo log Redo Log recovery , all transactions that have been prepared but no commit will apply the undo Log do roll back. The client then connects to see that the data that has been committed exists in the database, and that uncommitted rollback data needs to be re-executed.

In the case of MySQL opening Binary log:

    • In order to ensure consistency between the storage engine and the binary logs on top of the MySQL database ( because the repositories replay the transactions committed by the main library through the binary log, assuming that the main repository storage engine has been committed and the binary logs are not consistent, the loss of the repository data will result in inconsistent primary and standby data ), Introduction of two-phase commit (phase commit or 2PC)
Figure 12 Phase Commit

MySQL two phase commit process:

Storage Engine (InnoDB) Transaction prepare stage: The SQL statement has successfully executed and generated a memory log of redo and undo

Binary Log Log Submission

        • Write () writes binary log memory log data to the file system cache
        • Fsync () permanently writes binary log file system cache log data to disk

Storage Engine (InnoDB) Internal submission

        • Commit phase commits within the storage engine (Innodb_flush_log_at_trx_commit control) causes undo and redo to be permanently written to disk

MySQL with binary log turned on crash recovery:

    • When the transaction is crashin the prepare phase , the transaction is not written to binary log when the database recovery and the storage engine is not committed and the transaction is put back.
    • crashWhen a transaction is Fsync () permanently written to a binary log log , but the storage engine is not in time commit, at which time the MySQL database recovery will retrieve the submitted information from the binary log's XID (MySQL database internal distributed transaction XA) and re-redo the transaction and Commit keeps the storage engine and the binary log consistent.

The above mentioned two-phase commit process for a single transaction ensures that the storage engine and binary log logs are consistent, but how do you ensure that the storage engine and binary log submissions are consistent in the concurrency scenario? When multiple transactions are committed concurrently, what is the impact if the binary log and the storage engine order are inconsistent ?

Figure 2 The order in which the InnoDB storage engine commits is different from the binary log sequence on top of MySQL

For example, the transaction starts in T1,T2,T3 Order, writes the binary log (in T1, T2, T3 Order) to the log file system cache, calls the Fsync () A group commit is made to permanently write the log files to disk, but the order of the storage engine commits is T2,T3,T1. when the T2, T3 commits a transaction after doing a on-line backup program to create a new slave to do replication, then the transaction T1 in the slave machine restore MySQL database when found not in Within the storage engine commits, the T1 transaction is roll back, when the master and standby data is inconsistent (when the slave is built, the log offset of change master to is recorded T3 after the transaction location).

Conclusion: The write order of the upper-level binary logs of the MySQL database is consistent with the transaction submission order of the storage Engine InnoDB layer, for backup and recovery needs, such as Xtrabackup and Innobackpex tools.
In order to solve the above problem, in the early MySQL version, the Prepare_commit_mutex lock ensures that the MYSQ database upper-level binary logs and the InnoDB storage engine layer are in the same order of transaction submission.

Figure 3 Ensuring that the storage engine and binary log order submissions are consistent by Prepare_commit_mutex

Figure 3 shows that in Prepare_commit_mutex, only when the last transaction commits to release the lock, the next transaction can perform Prepara operations, and in each transaction process Binary log does not the invocation of Fsync () . Due to the high cost of writing memory data to disk, the performance of permanently writing log data to the disk database will drop sharply if fsync () is frequent. At this point the MySQL database provides the Sync_binlog parameter to set how many Binlog logs are generated when the call is made Fsync () the binary log is brushed into the disk to improve overall performance, this parameter is set to function:

    • Sync_binlog=0, the operation of binary log Fsync () is based on the operating system.
    • Sync_binlog=1, each transaction commit invokes a Fsync (), at which point the data is guaranteed to be the most secure but with a large performance impact.
    • Sync_binlog=n, at least N-1 transactions will be lost when the database crash.

As shown in Figure 3, MySQL turns on binary log using Prepare_commit_mutex and Sync_log to ensure that the binary log and storage engine sequence are consistent (through Sync_binlog to control the log refresh rate), Prepare_commit The _mutex lock mechanism causes high concurrent commit transactions with poor performance and binary logs cannot be group commit.

So how do you make sure that MySQL logs the binary log in sequence and the storage engine commits in the same order and can make a Group commitof the binary log?

MySQL 5.6 introduces BLGC (binary log Group commit), the commit process of binary logs is divided into three stages, Flush stage, Sync stage, commit stage.

Then the transaction commit process is simplified to:

Storage Engine ( InnoDB) Prepare----> Database Upper (Binary Log) Flush Stage----> Sync stage----> tune storage Engine ( InnoDB ) Commit stage.

Each stage phase has its own queue, which queues the transactions for each session. When a thread registers an empty queue, the thread is treated as a leader of that queue, and the thread that is registered to that queue is follower,leader to control the behavior of follower in the queue. Leader also leads all the follower of the current queue to the next stage to execute, when the next stage is not an empty queue, at which point leader can become follower into this queue (note: Follower threads cannot become leader)

Figure 4: Binary Log three-phase commit process

On Flush Stage: All registered threads are written to the binary log cache

The data in sync stage:binary log cache will be sync to disk, and the binary log cache of all the queue transactions is permanently written to disk when sync_binlog=1

The commit Stage:leader invokes the storage engine to commit the transaction sequentially.

When a set of transactions is in the commit phase, other new transactions can be flush, allowing group commit to continue to take effect. So in order to increase the number of transactions in a set of queues in group commit, MySQL uses Binlog_max_flush_queue_time to control the wait time in the flush stage, Let the flush queue wait a little longer at this stage to increase the number of this set of transaction queues so that the queue can fysn () more transactions at a time in the sync phase.

MySQL 5.7 Parallel replication enables master-slave multithreaded replication based on the main library binary log group Commit and identifies the same set of transactions in the binary log log last_commited= N and the order of all transaction commits within the group transaction. In order to increase the number of transactions within a set of transactions, the concurrency of the standby group commits is introduced in Binlog_group_commit_sync_delay=n and Binlog_group_commit_sync_no_delay_count=n ( Note: binlog_max_flush_queue_time the 5.7.9 and later versions of MySQL are no longer valid) parameters , MySQL waits binlog_group_commit_sync_delay milliseconds until the number of Binlog_group_commit_sync_no_delay_count transactions is reached, and a group commit is made.

Reference:http://mysqlmusings.blogspot.kr/2012/06/binary-log-group-commit-in-mysql-56.html

Concurrent Replication Series One: Binlog Group submission

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.