Understanding the Checkpoint Technology in MySQL

Source: Internet
Author: User

Understanding the Checkpoint Technology in MySQL

1. Background generated by checkpoint
When the database adds, deletes, queries, and modifies data, it is completed in the buffer pool first. To improve the efficiency of transaction operations, the modified data in the buffer pool is used, the data is not immediately written to the disk, which may cause inconsistency between the data in the memory and the data in the disk.
One of the things requirements is Durability. When the buffer pool and disk data are inconsistent, a fault may cause data to be unable to be persisted.
To prevent data that has been modified in the memory but not written to the disk, transactions are not persistent after the data is restarted due to a fault) the first way to ensure.
Redo logs can be redone after a fault is restarted to ensure the persistence of things. However, the redo log space cannot be expanded without limit, data that has been modified in the memory but has not been submitted to the disk, that is, dirty pages, must also be written to the disk.
When and under which dirty pages are written to the disk in the memory are determined by various factors.
One of the tasks of checkpoint is to refresh the dirty pages to the disk under certain conditions for dirty pages in the memory.

2. checkpoint Classification
According to the checkpoint refresh method, the checkpoint in MySQL is divided into two types: sharp checkpoint and fuzzy checkpoint.
Sharp checkpoint: When the database is closed, all dirty pages in the buffer pool are refreshed to the disk.
Fuzzy checkpoint: When the database is running normally, some dirty pages are written to the disk at different times, and some dirty pages are refreshed to the disk, to avoid performance problems caused by refreshing all dirty pages at a time.

3. When the checkpoint occurs
Checkpoints refresh the dirty pages in the buffer pool to the disk. However, under different circumstances, the checkpoint is triggered in different ways, and the number of dirty pages written to the disk is also different.

3.1, Master Thread checkpoint
In the Master Thread, partial dirty pages are refreshed from memory to disk at a frequency of every second or every 10 seconds. This process is asynchronous. Normal user threads will not block data operations.

3.2, FLUSH_LRU_LIST checkpoint
FLUSH_LRU_LIST checkpoint is executed in a separate page cleaner thread.
MySQL manages the cache through the LRU list in the buffer pool. A certain number of idle pages must be retained in the LRU idle list, to ensure that there are enough idle pages in the buffer pool to respond to external requests to the database.
When the number of pages in this space is insufficient, FLUSH_LRU_LIST checkpoint occurs.
The number of idle pages is controlled by the innodb_lru_scan_depth parameter table. Therefore, when the number of idle list pages is less than the configured value, a checkpoint occurs, removing the pages at the end of some LRU lists.

3.3, Async/Sync Flush checkpoint
Async/Sync Flush checkpoint is executed in a separate page cleaner thread.
Async/Sync Flush checkpoint when redo logs are unavailable, refresh some dirty pages in the buffer pool to the disk. After the dirty pages are written to the disk, the redo log corresponding to the transaction can be released.
You can use innodb_log_file_size to configure the redo_log file size.

The execution of Async Flush checkpoint or Sync Flush checkpoint is determined by checkpoint_age, async_water_mark, and sync_water_mark.
Definition:
Checkpoint_age = redo_lsn-checkpoint_lsn, that is, checkpoint_age equals to the latest lsn minus the value of the lsn that has been refreshed to the disk
Async_water_mark = 75% * innodb_log_file_size
Sync_water_mark = 90% * innodb_log_file_size
1) When checkpoint_age <sync_water_mark, Flush checkpoint is not required. In other words, when the remaining space of the redo log exceeds 25%, Async/Sync Flush checkpoint is not required.
2) When async_water_mark <checkpoint_age <sync_water_mark, execute Async Flush checkpoint. That is to say, when there is less than 25% redo log space, but more than 10%, execute Async Flush checkpoint, refresh to meet condition 1
3) When checkpoint_age> sync_water_mark, execute sync Flush checkpoint. That is to say, when the remaining space of the redo log is less than 10%, execute Sync Flush checkpoint to refresh to meet condition 1.
After mysql 5.6, neither Async Flush checkpoint nor Sync Flush checkpoint will block the user's query process.

I personally think:
Since a disk is a relatively slow storage device, the interaction between memory and disk is relatively slow.
Because innodb_log_file_size defines a relatively large value, under normal circumstances, the first two checkpoints refresh the dirty pages to the disk, after the previous two checkpoints refresh the dirty pages to the disk, the redo log space corresponding to the dirty page is released immediately. Generally, Async/Sync Flush checkpoint does not occur. At the same time, we should also be aware that, in order to avoid frequent and low Async/Sync Flush checkpoint, innodb_log_file_size should be configured relatively larger.

3.4, Dirty Page too much Checkpoint
Dirty Page too much Checkpoint is implemented at a frequency every second in the Master Thread.
Dirty Page too much means that there are too many Dirty pages in the buffer pool. Execute the checkpoint Dirty Page to fl into the disk to ensure that there are enough available pages in the buffer pool.
The Dirty Page is configured by innodb_max_dirty_pages_pct. The default value of innodb_max_dirty_pages_pct is 1.0 before innodb 90%, and then 75%.

Summary:

MySQL database (of course, other relational data also has a similar mechanism). To improve the efficiency of transaction operations, the modified data is not immediately written to the disk after the transaction is committed, instead, the log first (write log ahead) method is used to ensure the durability of transactions.
For a data page that modifies a transaction, that is, a dirty page, it is refreshed to the disk asynchronously. The checkpoint is the implementer that asynchronously refreshes the dirty page to the disk.
Under different circumstances, different checkpoints will occur, and different numbers of dirty pages will be refreshed to the disk to reach the management memory (type 1, 2, 4) and redo log available space (3rd checkpoint.

Reference: MySQL technology insider Innodb Storage engine PDF download see

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.