MySQL Tutorial: The Checkpoint Mechanism
I. Introduction
Consider this scenario: if the redo log could grow without bound and the buffer pool were large enough, modified pages in the buffer pool would never need to be flushed to disk. After a crash, the redo log alone could restore the entire database to its state at the moment of the crash.
However, this rests on two prerequisites: 1. The buffer pool can cache all the data in the database; 2. The redo log can grow without bound.
Neither holds in practice, so the Checkpoint technique was introduced to solve the following problems: 1. Shorten database recovery time; 2. Flush dirty pages to disk when the buffer pool runs short; 3. Flush dirty pages when the redo log is unavailable.
- When the database crashes, it does not need to redo the whole log, because the pages modified before the Checkpoint have already been flushed back to disk. Only the redo log written after the Checkpoint has to be applied, which greatly shortens recovery time.
- When the buffer pool runs short, the LRU algorithm evicts the least recently used page. If that page is dirty, a Checkpoint must be forced to flush the dirty page, that is, the new version of the page, back to disk.
- When the redo log is unavailable: current transactional database systems use the redo log circularly rather than letting it grow without bound. A portion of the redo log that is no longer needed, meaning that crash recovery would not read it, can be overwritten and reused. If the portion to be reused is still needed, a Checkpoint must be forced so that pages in the buffer pool are flushed at least up to the current redo log position.
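The first point can be illustrated with a toy model. The sketch below is not InnoDB code: the `RedoRecord` layout and the `recover` function are hypothetical, chosen only to show that recovery replays just the records written after the checkpoint LSN.

```python
from typing import Dict, List, NamedTuple


class RedoRecord(NamedTuple):
    lsn: int      # log sequence number of this change
    page_id: int  # page the change applies to
    value: str    # new page content (toy stand-in for a real redo payload)


def recover(disk_pages: Dict[int, str],
            redo_log: List[RedoRecord],
            checkpoint_lsn: int) -> Dict[int, str]:
    """Replay only the redo records newer than the checkpoint."""
    pages = dict(disk_pages)
    for rec in sorted(redo_log, key=lambda r: r.lsn):
        if rec.lsn > checkpoint_lsn:
            pages[rec.page_id] = rec.value
    return pages


# Assumed state at crash time: everything up to LSN 200 is already on disk,
# the changes at LSN 300 and 400 exist only in the redo log.
disk = {1: "a1", 2: "b1", 3: "c0"}
log = [
    RedoRecord(100, 1, "a1"),
    RedoRecord(200, 2, "b1"),
    RedoRecord(300, 3, "c1"),
    RedoRecord(400, 1, "a2"),
]
recovered = recover(disk, log, checkpoint_lsn=200)
replayed = [rec.lsn for rec in log if rec.lsn > 200]
print(recovered, replayed)
```

Only two of the four records are replayed here; the further the checkpoint advances, the less log recovery has to read.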
The InnoDB storage engine uses the LSN (Log Sequence Number) to mark versions.
An LSN is an 8-byte number. Every page carries an LSN, the redo log has an LSN, and the Checkpoint has an LSN. You can observe them with the SHOW ENGINE INNODB STATUS command:
mysql> SHOW ENGINE INNODB STATUS\G
---
LOG
---
Log sequence number 34778380870
Log flushed up to 34778380870
Last checkpoint at 34778380870
0 pending log writes, 0 pending chkp writes
54020151 log i/o's done, 0.92 log i/o's/second
When a Checkpoint occurs, under what conditions, and which dirty pages it selects are all complex questions. What a Checkpoint does is always the same: it flushes dirty pages in the buffer pool back to disk. The variants differ in how many pages are flushed each time, where the dirty pages are taken from, and when the Checkpoint is triggered.
II. Checkpoint Classification
The InnoDB storage engine has two types of checkpoint: Sharp Checkpoint and Fuzzy Checkpoint.
A Sharp Checkpoint flushes all dirty pages back to disk when the database shuts down. This is the default behavior, corresponding to innodb_fast_shutdown = 1. If the database used a Sharp Checkpoint while running, however, availability would suffer badly. During normal operation the InnoDB storage engine therefore uses Fuzzy Checkpoints, which flush only part of the dirty pages rather than all of them.
Fuzzy Checkpoints come in four kinds: 1. Master Thread Checkpoint; 2. FLUSH_LRU_LIST Checkpoint; 3. Async/Sync Flush Checkpoint; 4. Dirty Page too much Checkpoint
1. Master Thread Checkpoint
Every second or every 10 seconds, the master thread flushes a certain percentage of pages from the buffer pool's dirty page list back to disk. The process is asynchronous: the InnoDB storage engine can carry on with other work, and user query threads are not blocked.
2. FLUSH_LRU_LIST Checkpoint
The InnoDB storage engine needs to keep about 100 free pages available in the LRU list. Before InnoDB 1.1.x, the check for sufficient free space in the LRU list ran in the user query thread, which obviously blocked the user's query. If fewer than 100 free pages are available, InnoDB removes pages from the tail of the LRU list. If any of those pages are dirty, a checkpoint is needed. Because these pages come from the LRU list, this is called the FLUSH_LRU_LIST Checkpoint.
From MySQL 5.6, that is, InnoDB 1.2.x, this check runs in a separate Page Cleaner thread, and the innodb_lru_scan_depth parameter controls the number of available pages in the LRU list. The default value is 1024, for example:
mysql> SHOW GLOBAL VARIABLES LIKE 'innodb_lru_scan_depth';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_lru_scan_depth | 1024  |
+-----------------------+-------+
3. Async/Sync Flush Checkpoint
This checkpoint covers the case where the redo log files are unavailable, so some pages must be forcibly flushed back to disk. The dirty pages are selected from the dirty page list (the flush list). If redo_lsn denotes the LSN already written to the redo log and checkpoint_lsn denotes the LSN up to which pages have been flushed back to disk, you can define:
checkpoint_age = redo_lsn - checkpoint_lsn
Also define the following variables:
async_water_mark = 75% * total_redo_log_file_size
sync_water_mark = 90% * total_redo_log_file_size
If each redo log file is 1 GB and two redo log files are defined, the total redo log size is 2 GB, so async_water_mark = 1.5 GB and sync_water_mark = 1.8 GB. Then:
When checkpoint_age < async_water_mark, no dirty pages need to be flushed to disk;
When async_water_mark < checkpoint_age < sync_water_mark, Async Flush is triggered: enough dirty pages are flushed from the flush list back to disk that checkpoint_age < async_water_mark holds afterward;
The case checkpoint_age > sync_water_mark rarely occurs, unless the redo log files are too small and a bulk insert such as LOAD DATA is running. Sync Flush is then triggered, and enough dirty pages are flushed from the flush list back to disk that checkpoint_age < async_water_mark holds afterward.
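The three cases above can be written as a small decision function. This is a sketch of the rule as stated in this article, not InnoDB source code; the function name `flush_mode` is made up for illustration, and the behavior exactly at a water mark (treated here as belonging to the higher case) is an assumption, since the text only gives strict inequalities.

```python
GB = 1024 ** 3


def flush_mode(redo_lsn: int, checkpoint_lsn: int,
               total_redo_log_file_size: int) -> str:
    """Return "none", "async" or "sync" for the current checkpoint age."""
    checkpoint_age = redo_lsn - checkpoint_lsn
    # Integer arithmetic avoids floating-point rounding on large sizes.
    async_water_mark = total_redo_log_file_size * 75 // 100
    sync_water_mark = total_redo_log_file_size * 90 // 100
    if checkpoint_age < async_water_mark:
        return "none"
    if checkpoint_age < sync_water_mark:
        return "async"
    return "sync"


# Two redo log files of 1 GB each, as in the example above.
total = 2 * GB
for age in (1 * GB, 16 * GB // 10, 19 * GB // 10):
    print(age, flush_mode(age, 0, total))
```

With a 2 GB log, an age of 1 GB needs no flush, 1.6 GB falls between the two marks and triggers Async Flush, and 1.9 GB triggers Sync Flush.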
As you can see, the Async/Sync Flush Checkpoint exists to keep the redo log usable. Before InnoDB 1.2.x, Async Flush Checkpoint blocked the user query thread that detected the problem, and Sync Flush Checkpoint blocked all user query threads until the dirty pages had been flushed. From InnoDB 1.2.x (MySQL 5.6), this flushing is also done in the separate Page Cleaner thread, so it no longer blocks user query threads.
The official MySQL version does not show whether a flushed page came from a flush list Checkpoint or an LRU list Checkpoint, nor how many Async/Sync Flushes the redo log has caused. The InnoSQL version, however, exposes this through the SHOW ENGINE INNODB STATUS command, for example:
mysql> SHOW ENGINE INNODB STATUS\G
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 2058485760; in additional pool allocated 0
Dictionary memory allocated 913470
Buffer pool size 122879
Free buffers 79668
Database pages 41957
Old database pages 15468
Modified db pages 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 15032929, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 15075936, created 366872, written 36656423
0.00 reads/s, 0.00 creates/s, 0.90 writes/s
Buffer pool hit rate 1000/1000, young-making rate 0/1000 not 0/1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 41957, unzip_LRU len: 0
I/O sum [39]: cur [0], unzip sum [0]: cur [0]
4. Dirty Page too much Checkpoint
Here the InnoDB storage engine forces a Checkpoint because there are too many dirty pages. The overall goal is to keep enough pages available in the buffer pool. It is controlled by the innodb_max_dirty_pages_pct parameter:
mysql> SHOW GLOBAL VARIABLES LIKE 'innodb_max_dirty_pages_pct';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| innodb_max_dirty_pages_pct | 75    |
+----------------------------+-------+
A value of 75 for innodb_max_dirty_pages_pct means that when dirty pages make up 75% of the buffer pool, a Checkpoint is forced and some dirty pages are flushed to disk. Before InnoDB 1.0.x the default value of this parameter was 90; from 1.0.x on, in the versions discussed here, it is 75.
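The threshold check itself is simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and using the total number of buffer pool pages as the denominator is an assumption, since the text does not say exactly which page count InnoDB divides by.

```python
def exceeds_dirty_limit(modified_db_pages: int,
                        buffer_pool_pages: int,
                        max_dirty_pages_pct: int = 75) -> bool:
    """True when dirty pages reach the configured share of the buffer pool.

    Assumption: the share is measured against the total number of buffer
    pool pages. The comparison uses integers to avoid rounding issues.
    """
    return modified_db_pages * 100 >= max_dirty_pages_pct * buffer_pool_pages


# "Modified db pages 0" and "Buffer pool size 122879" are the figures from
# the status output shown earlier; 100000 is a made-up value (about 81%).
print(exceeds_dirty_limit(0, 122879))
print(exceeds_dirty_limit(100000, 122879))
```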
III. Checkpoint Mechanism
InnoDB uses Fuzzy Checkpoints for its transaction log. Each time it takes a checkpoint, InnoDB obtains the LSN of the oldest modified page and records that LSN in the log file as the Checkpoint (the last checkpoint). This means that "the logs and data before this LSN have already been flushed to disk".
When MySQL crashes, InnoDB scans the redo log and applies it to the buffer pool starting from the last checkpoint, until the LSN of the last checkpoint equals the "Log flushed up to" LSN. At that point recovery is complete.
So how does recovery work?
As shown in the figure, an InnoDB transaction log record goes through four phases:
- Creation: the transaction creates a log record;
- Log flush: the log record is written to the log file on disk;
- Data flush: the dirty pages that the log record describes are written to the data files on disk;
- CKP writing: the log position is written to the log file as a Checkpoint;
Corresponding to these four phases, the system keeps four pieces of log-related information for use in other processing:
- Log sequence number (LSN1): the largest LSN in the system at present. The LSN of a new transaction log record is generated from it (LSN1 + size of the new log record);
- Log flushed up to (LSN2): the LSN up to which the log has been written to the log file;
- Oldest modified data log (LSN3): the LSN of the oldest dirty page at present. When a Checkpoint is written, this LSN is written directly to the log file;
- Last checkpoint at (LSN4): the LSN of the current Checkpoint;
For the system, these four LSNs are non-increasing in this order, that is, LSN1 >= LSN2 >= LSN3 >= LSN4.
For example, SHOW ENGINE INNODB STATUS\G shows three of them (the Oldest modified data log is not displayed):
LOG
---
Log sequence number 34822137537
Log flushed up to 34822137537
Last checkpoint at 34822133028
0 pending log writes, 0 pending chkp writes
54189288 log i/o's done, 3.00 log i/o's/second
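To turn such output into numbers, the three LSN lines can be extracted with regular expressions. This is a minimal sketch: the helper `parse_log_lsns` is hypothetical, it assumes the single-number LSN format shown in this article, and the sample text is copied from the output above rather than read from a live server.

```python
import re

# The three LSN lines from the status output above.
STATUS_LOG = """\
Log sequence number 34822137537
Log flushed up to 34822137537
Last checkpoint at 34822133028
"""


def parse_log_lsns(text: str) -> dict:
    """Extract the three LSNs printed in the LOG section."""
    patterns = {
        "log_sequence_number": r"Log sequence number\s+(\d+)",
        "log_flushed_up_to": r"Log\s*flushed up to\s+(\d+)",
        "last_checkpoint_at": r"Last checkpoint at\s+(\d+)",
    }
    lsns = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"missing line for {name}")
        lsns[name] = int(match.group(1))
    return lsns


lsns = parse_log_lsns(STATUS_LOG)
# Bytes of redo log that crash recovery would have to apply right now.
checkpoint_age = lsns["log_sequence_number"] - lsns["last_checkpoint_at"]
# The ordering LSN1 >= LSN2 >= LSN4 described in the text.
ordered = (lsns["log_sequence_number"]
           >= lsns["log_flushed_up_to"]
           >= lsns["last_checkpoint_at"])
print(checkpoint_age, ordered)  # 4509 True
```

Here the checkpoint lags the newest log record by 4509 bytes, which is the amount of redo log a crash at this moment would require recovery to apply.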
IV. Log Protection Mechanism
To cope with a MySQL crash, InnoDB has a log flushing mechanism, which is controlled by the innodb_flush_log_at_trx_commit parameter. This section looks at how InnoDB prevents the log from being overwritten and lost.
How exactly are the InnoDB checkpoint and the redo log related? Several terms need explaining first:
- Ckp age (moves dynamically): the range whose oldest dirty pages have not yet been flushed to the data files, that is, the range not yet covered by the last checkpoint.
- Buf age (moves dynamically): modified page information that has not yet been written to the log file but is already in the log buffer.
- Buf async (fixed point): 7/8 of the log space size. When buf age reaches buf async, the modified page information not yet written to the log is forcibly written to it, without blocking transactions.
- Buf sync (fixed point): 15/16 of the log space size. When a large volume of log is being written, buf age moves very quickly; once it reaches buf sync, transactions are blocked and the modified page information is forcibly written to the log. If transactions were not blocked, the redo log of the last checkpoint could be overwritten.
- Ckp async (fixed point): 31/32 of the log space size. When ckp age reaches ckp async, a last checkpoint is forced, without blocking transactions.
- Ckp sync (fixed point): the full log space size. When ckp age reaches ckp sync, a last checkpoint is forced and transactions are blocked, because there is a risk of the redo log being overwritten.
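The four fixed points are plain fractions of the log space, so their positions follow directly from the log size. The sketch below only restates the fractions listed above; the function name and the 2 GB log size are illustrative assumptions.

```python
def log_protection_points(log_space_size: int) -> dict:
    """Positions of the four fixed points for a given log space size."""
    return {
        "buf_async": log_space_size * 7 // 8,
        "buf_sync": log_space_size * 15 // 16,
        "ckp_async": log_space_size * 31 // 32,
        "ckp_sync": log_space_size,
    }


# Assumed example: 2 GB of total log space.
points = log_protection_points(2 * 1024 ** 3)
for name, position in points.items():
    print(f"{name:9s} {position:>12d} bytes")
```

Because buf_async < buf_sync < ckp_async < ckp_sync for any log size, the buf thresholds are always reached before the ckp thresholds.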
Next, consider four cases:
- Buf age is between buf async and buf sync
- Buf age is past buf sync (this cannot actually happen, because MySQL has a protection mechanism)
- Ckp age is between ckp async and ckp sync (this cannot happen)
- Ckp age is past ckp sync (this cannot happen)
Case 1:
When the write volume is large, buf age moves into the region between buf async and buf sync, which triggers writes to the log, and MySQL writes out as much log as it can. If the write volume then slows down, buf age moves back to the position shown in Figure 1. If writes outpace log flushing, buf age eventually coincides with buf sync. All transactions are then blocked and a flush of 2 * (buf age - buf async) is forced, during which I/O is very busy.
Case 2:
This situation cannot be allowed to occur, because if the redo log could be overwritten, data would be lost. With buf age past buf sync, buf age could exceed the log size, and when it came to flushing, the log as a whole would not be large enough to hold everything that buf age covers.
Cases 3 and 4 need no separate analysis:
Ckp age always lags behind buf age (to its left), because ckp age tracks the last checkpoint and is always catching up with buf age (by flushing as many modified pages to disk as possible). Buf age therefore always reaches buf sync first.
What, then, is the significance of ckp async and ckp sync?
By way of comparison, the page cache also has a high-water mark and a low-water mark. When dirty pages reach the low-water mark, the OS starts flushing dirty pages to disk. When the page cache reaches the high-water mark, all activity is blocked and the OS flushes dirty pages frantically; the disk becomes very busy and an I/O storm results.