Innodb two-stage log Submission mechanism, innodb log submission

Source: Internet
Author: User

Innodb two-stage log Submission mechanism, innodb log submission

ViewThe innodb_flush_log_at_trx_commit official website has some questions.For details about the innodb_flush_log_at_trx_commit parameter, refer to the official website:

Https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit

One of them is written as follows: With a value of 2, the contents ofInnoDBLog buffer are written to the log file after each transaction commit and the log file is flushed to disk approximately once per second.It means that if the innodb_flush_log_at_trx_commit value is set to 2, the content in the log buffer will be written to the log file each time it is submitted, and then the logfile will be flush to the disk. Because innodb's log file is ib_logfile on the hard disk as far as I know, I am confused that the log file is flushed to disk, is there a layer of structure between log buffer and disk that can cache log files? After reading a large amount of Chinese and English documents, I finally got a preliminary understanding. I. GlossaryIn the innodb Storage engine, there is a unique log file, that is, redo log file. Therefore, for the innodb Storage engine, there are two types of logfiles: redo log and binlog. Redo log: Ib_logfile0 and ib_logfile1 under the data Directory (the number is controlled by innodb_log_files_in_group). The innodb Storage engine is unique and has the corresponding redo log buffer in the memory. Therefore, the layer-3 structure for writing redo logs is: redo log buffer ---> redo logfile in the file system cache ---> redo log file on disk Binlog: The Path is in the data directory by default. You can also specify the path using the log_bin parameter. The file name is a file with the default 2. Two-Phase log Writing Process Source image: https://jin-yang.github.io/post/mysql-group-commit.html After binlog is enabled, if the session sends a commit request, before committed, a series of processes are as follows: 1. prepare stage:Write transaction changes and transaction commit information of log buffer into the redo log file in the file system cache. Note that log buffer and undo buffer (also called undo page) it is generated in real time during the transaction execution (undo is in the system tablespace by default, you can also specify an independent tablespace after 5.6), and whether the redo log in the file system cache is flushed to disk, depends on the innodb_flush_log_at_trx_commit parameter. Innodb_flush_log_at_trx_commit:
  • The value 0 indicates that the redo log buffer content is written to the redo log cached by the file system every second and flushed to the redo log file on the disk.
  • 1 indicates that the content of the redo log buffer will be written to the redo log cached by the file system during the transaction commit, and flush (solidified) to the redo log file on disk.
  • 2 indicates that the content of the redo log buffer will be written to the redo log cached by the file system during the transaction commit, and the redo log cached by the file system will be flushed once per second (solidified) to the redo log file on disk.
2. binlog writing stage:In this phase, two methods are called: write () and fsync (). The former is used to write the binlog in the file system cache, and the latter is used to write the binlog in the file system cache to the bin log on disk, the former will be called at this stage, and the latter's calling mechanism is controlled by the sync_binlog parameter. About Sync_binlogParameters:
  • Sync_binlog = 0: indicates that the fsync () call is completely handed over to the operating system, that is, whether the binlog in the file system cache is refreshed to the disk is completely controlled by the operating system.
  • Sync_binlog = 1: indicates that the binlog will be solidified to the disk when the transaction is committed.
  • Sync_binlog = N (N> 1): When the database crashes, may lose N-1 transactions, detailed principles also see https://jin-yang.github.io/post/mysql-group-commit.html
3. Final commit stage:This phase mainly includes: the server tells the storage engine that both binlog and redo log have been written (at least at the file system cache level). submit data according to the normal mechanism, then, return committed's confirmation submission information to the session. Iii. fault recovery Analysis1. If the transaction crashes after the first phase, it is considered a failed transaction because the binlog has not been written and the data is obviously not committed. You do not need to roll back or roll back. (For oracle databases, the situation is more complex. Some large transactions may have solidified data even if they are not committed, so rollback is required. It is not clear whether there is a mechanism for writing uncommitted data into the disk in advance in mysql large transactions.) 2. if the system crashes after the second stage, only one situation can ensure that data is not lost at all, that is, innodb_flush_log_at_trx_commit and sync_binlog are both set to 1. At this time, both the redo log and binlog are solidified to the disk, this ensures that data not written after commit is rolled forward and submitted during recovery. If any of them is not 1, binlog and redo log may be inconsistent, and transactions may be lost. Therefore, to ensure full consistency between the master and slave databases, both innodb_flush_log_at_trx_commit and sync_binlog of the master database must be set to 1. As for the three-phase group commit mechanism that emerged after 5.6 to solve the concurrent transaction commit exception, further research is required.

 

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.