An in-depth analysis of MySQL's commit non-regularity failure case

Source: Internet
Author: User

Case Description:

  A normal transaction commit, in the application will prompt the commit timeout, failure.

First, theoretical knowledge

1, about the commit principle, the transaction submission process

1, look for the modified data page:

1, if the data page in memory, it is directly memory read;

2, if the data page in memory, physical reading, from the disk into memory;

2. The undo page in the disk is transferred into memory;

3, first put the original data into undo, and then modify the data (data page into a dirty page);

4, modify the information of the data generated redo data into the Log_buffer (memory Buffer_pool a space, the default 16M);

Mysql>Show variables like '%log_buffer%';+------------------------+----------+|Variable_name|Value|+------------------------+----------+|Innodb_log_buffer_size| 16777216 |+------------------------+----------+1Rowinch Set(0.01Sec

5, log_buffer through the log thread (background thread, very diligent), continue to redo information written to disk innodb_log_file;

Mysql>Show variables like 'innodb_log_file%';+---------------------------+----------+|Variable_name|Value|+---------------------------+----------+|Innodb_log_file_size| 50331648 ||Innodb_log_files_in_group| 2        |+---------------------------+----------+2Rowsinch Set(0.01Sec

6, the transaction commits, deliberately trigger the log thread, the remaining log_buffer in the Redo data information to disk, the amount of data is not much left, completed the submission success.

Attention:

1, must write the log before the change record;

    "Log First", this is the basic principle of the database.

2, the transaction submission process, it is necessary to ensure that the log before the disk, in order to calculate the transaction submission completed.

3, accidental power loss, Memory dirty page is lost, but the disk innodb_log_file stored in the Redo log information, to restart the server, MySQL by reading the disk Log_files data, automatically change the data to run one side.

Q: Why is MySQL commit always fast, even though the amount of data modified by the transaction can be large?

A:

Because the transaction commits, not the disk data is modified, but the redo information to modify the data through the background log thread to the disk redo logfile, complete the MySQL commit, regardless of the amount of data modified by the transaction, the process is very fast.

In the memory of the dirty block, that is, the modified data page, normally by the background related write thread periodically to brush dirty page data into the disk, to ensure that the InnoDB buffer pool has enough clean blocks, free blocks.

2, about rollback principle, rollback process

1. MySQL reads the undo page information in memory

2, through the Undo information to find dirty pages, reverse the data to modify

3, do, undo the same time, and will be produced redo information

4. Transaction Submission

MySQL rollback processing mechanism:

If the thread is interrupted and the transaction is not committed, undo will log this information, wait for another session to connect, view the block data, and MySQL automatically roll back the data page to modify and then be read. In other words, in order to avoid the system because rollback is stuck, by directly killing the process, interrupt the transaction, waiting for the subsequent person to read the data information when the rollback, and then return the results.

Q:Why are rollback sometimes slow, rollback risk and risk avoidance methods?

A:

The time of rollback depends on the time the transaction modifies data before the rollback, the processing volume is long and the processing volume is small and the rollback time is short.

1, rollback risk: Easy to cause the system is hang live;

2, risk avoidance method: directly kill the session process or the MySQL process.

3. Storage Write Performance analysis

Q:mysql Commit, storage why write speed can be maintained at 0ms, very little 1ms situation?

A:

For storage, write performance is quite high: assuming that the storage cache always has free space, the transaction commits, writes the few remaining redo data in the log buffer to the storage cache, that is, to complete the MySQL commit, the process is quite fast (can be maintained at 0MS, Rarely occurs in 1ms), subsequent redo data written to disk by the cache is performed in the background.

4. Storage-level disaster preparedness (same-city disaster preparedness)

1. The synchronous process of disaster preparedness: Commit

1, Redo, Binlog write to the local storage cache;

2, through the network synchronization Binlog write to the remote synchronization server storage cache;

3, response to local database;

4, the transaction submitted successfully;

2. Risk:

Network problems (intermittent signal, cable broken), resulting in write hang, commit timeout failed.

3, Solve:

Through the time-out setting, the network outage exceeds the limit, automatically changes the synchronization to disaster recovery asynchronous, as little as possible to affect the business commit timeout failure.

Second, analysis and treatment

Memory write performance is poor, many times will reach 5ms, even more than 10ms

Note: The disaster recovery synchronization has been stopped in case.

1, storage in the BBU problem , there is a bug to monitor the BBU;

  Workaround: reboot the BBU and not update the BBU.

2. Cache is fully occupied

1, massive data write, commit data fills the cache;

2, hard disk I/O exception, abnormal SQL caused by massive physical reading;

 Resolution: index Optimization.

3. Poor storage performance

  Solution: Find the boss to pay, replacement of high-quality equipment.

An in-depth analysis of MySQL's commit non-regularity failure case

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.