First, Introduction
Consider this scenario : If the redo log can grow infinitely and the buffer pool is large enough, it is not necessary to flush the new version of the pages in the buffer pool back to disk. Because when a outage occurs, it is entirely possible to redo the log to recover the data from the entire database system to the time of the outage.
However, this requires two prerequisites: 1, the buffer pool can cache all the data in the database; 2. Redo log can grow infinitely
Therefore, the checkpoint (checkpoint) technology was born to solve the following problems: 1, shorten the recovery time of the database; 2. When the buffer pool is not enough, the dirty page is flushed to disk; 3, redo log is not available, Refresh dirty page.
When the database goes down, the database does not need to redo all the logs because the pages before checkpoint have been flushed back to disk. The database only needs to recover the redo log after checkpoint, which greatly shortens the time of recovery.
When the buffer pool is not enough, according to the LRU algorithm will overflow the least recently used pages, if this page is dirty page, then you need to enforce checkpoint, the dirty page is the new version of the page to brush back to the disk.
When redo logs are not available, because the current transactional database system is designed to recycle the redo log, and not to make it infinitely larger, the redo log can be reused in part that the redo logs are no longer needed, and when the database is down, the database recovery operation does not require this part of the redo log, So this part can be overwritten and reused. If the redo log also needs to be used, you must force checkpoint to flush the pages in the buffer pool at least to the current redo log location.
For the InnoDB storage engine, the version is marked with the LSN (Log Sequence number).
The LSN is a 8-byte number, each page has an LSN, and the redo log also has an LSN of lsn,checkpoint. You can observe by ordering the show ENGINE INNODB status:
Mysql> Show engine INNODB status \g---Log---log sequence number 34778380870Log flushed up to 34778380870Last CHECKP Oint at 347783808700 pending log writes, 0 pending CHKP writes54020151 log I/O ' s done, 0.92 log I/O ' S/second
The time, condition and the choice of dirty pages of checkpoint are very complicated. What checkpoint does is to brush the dirty pages of the buffer pool back to disk, except that each time you refresh the page to disk, where you fetch the dirty pages, and when to trigger checkpoint.
Ii. Classification of Checkpoint
Within the InnoDB storage engine, there are two types of Checkpoint, namely: Sharp Checkpoint, Fuzzy Checkpoint
Sharp Checkpoint occurs when all dirty pages are flushed back to disk when the database shuts down, which is the default mode of operation, that is, parameter innodb_fast_shutdown=1. However, if the database is running with sharp Checkpoint, the availability of the database will be greatly affected. Therefore, the InnoDB storage engine uses fuzzy checkpoint to refresh the page, that is, to refresh only a portion of the dirty pages, instead of flushing all the dirty pages back to the disk.
Fuzzy Checkpoint: 1, Master Thread checkpoint;2, flush_lru_list checkpoint;3, Async/sync FLUSH checkpoint;4, Dirty Page Too much Checkpoint
1. Master Thread Checkpoint
Refreshes a percentage of the paged disk from the list of dirty pages in the buffer pool at a rate of 10 seconds per second, and the process is asynchronous, when the InnoDB storage engine can do other things, and the user query thread does not block.
2, Flush_lru_list Checkpoint
Because the InnoDB storage engine needs to ensure that almost 100 free pages are available for use in the LRU list. Before the innodb1.1.x version, it is necessary to check that there are enough free space operations in the LRU list to occur in the user query thread, which obviously blocks the user's query operations. If there are no 100 free pages available, the InnoDB storage engine removes the page at the end of the LRU list. If there are dirty pages in these pages, then Checkpoint is required, and these pages are from the LRU list and are therefore called Flush_lru_list Checkpoint.
Starting with MySQL version 5.6, which is the innodb1.2.x version, this check is placed in a separate page cleaner thread, and the user can control the number of pages available in the LRU list via the parameter innodb_lru_scan_depth. The value defaults to 1024, such as:
mysql> SHOW GLOBAL VARIABLES like ' innodb_lru_scan_depth '; +-----------------------+-------+| Variable_name | Value |+-----------------------+-------+| innodb_lru_scan_depth | 1024x768 |+-----------------------+-------+
3, Async/sync Flush Checkpoint
Refers to situations where redo log files are not available, forcing some pages to be flushed back to disk, where dirty pages are selected from the list of dirty pages. If the LSN that has been written to the redo log is recorded as REDO_LSN, and the LSN that has been flushed back to the latest page of the disk is recorded as CHECKPOINT_LSN, you can define:
Checkpoint_age = Redo_lsn-checkpoint_lsn
Then define the following variables:
Async_water_mark = 75% * total_redo_log_file_size
Sync_water_mark = 90% * total_redo_log_file_size
If the size of each redo log file is 1GB and two redo log files are defined, the total size of the redo log file is 2GB. So ASYNC_WATER_MARK=1.5GB,SYNC_WATER_MARK=1.8GB. The
When Checkpoint_age<async_water_mark, you do not need to flush any dirty pages to disk;
When Async_water_mark<checkpoint_age<sync_water_mark triggers async Flush, flush enough dirty pages from the flush list back to the disk so that the checkpoint_age is satisfied after the refresh <async_water_mark;
Checkpoint_age>sync_water_mark This is rarely the case unless the Redo log file is set too small and is doing a bulk insert operation similar to load data. This triggers the Sync flush operation, flushing enough dirty pages from the flush list back to the disk so that the Checkpoint_age<async_water_mark is satisfied after the refresh.
As can be seen, the Async/sync Flush checkpoint is used to ensure the reuse of redo logs. Before the InnoDB 1.2.x version, Async Flush checkpoint blocked the user query thread that found the problem, and sync flush checkpoint blocked all user query threads and waited for the dirty page to be refreshed. Starting with the InnoDB 1.2.x version-that is, MySQL version 5.6-This part of the refresh operation is also placed in a separate page Cleaner thread, so the user query thread is not blocked.
The official MySQL version does not see whether the Refresh page is checkpoint from the flush list or from the LRU list, and does not know the number of Async/sync flushes generated by the redo log. However, the Innosql version provides methods that can be observed by commanding the show ENGINE INNODB status, such as:
Mysql> Show engine InnoDB status \gbuffer POOL and MEMORY----------------------total MEMORY allocated 2058485760; In additional pool allocated 0Dictionary memory allocated 913470Buffer pool size 122879Free buffers 79668Database pages 41957Old database pages 15468Modified db pages 0Pending reads 0Pending writes:lru 0, Flush Li St 0, single page 0Pages made young 15032929, isn't young 00.00 youngs/s, 0.00 Non-youngs/spages Read 15075936, created 3668 Written 366564230.00 reads/s, 0.00 creates/s, 0.90 Writes/sbuffer pool hits rate 1000/1000, young-making rate 0/100 0 not 0/1000pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/slru len:41957, unzip_lru le N:0I/O Sum[39]:cur[0], unzip sum[0]:cur[0]
4, Dirty Page too much
That is, the number of dirty pages is too large, causing the InnoDB storage engine to force checkpoint. The purpose is generally to ensure that there are enough pages available in the buffer pool. It can be controlled by parameter innodb_max_dirty_pages_pct:
Mysql> SHOW GLOBAL VARIABLES like ' innodb_max_dirty_pages_pct '; +----------------------------+-------+| Variable_name | Value |+----------------------------+-------+| innodb_max_dirty_pages_pct | |+----------------------------+-------+
A innodb_max_dirty_pages_pct value of 75 indicates that when the number of dirty pages in the buffer pool occupies 75%, checkpoint is forced to flush a portion of the dirty pages to disk. The default value for this parameter is 90 before the InnoDB 1.0.x version, and the subsequent version is 75.
Iii. mechanism of checkpoint
In the InnoDB transaction log, the fuzzy Checkpoint,innodb each fetch the oldest modified page (last Checkpoint) corresponding LSN, and then the LSN of this dirty page as a Checkpoint point to the log file, This means "the log and data for the LSN preceding this LSN have been flush to redo log
When MySQL crash, InnoDB scanned redo log, starting from last checkpoint apply redo log to buffer pool, until last checkpoint corresponding LSN equals log flushed up To corresponding LSN, the recovery is complete
So how exactly did it recover?
As shown, a transaction log for InnoDB has undergone a total of 4 stages:
Create phase: Transaction creates a log;
Log brush Disk: Log files written to disk;
Data brush Disk: The log corresponding to the dirty page data written to the data file on disk;
Write CKP: Log is written as checkpoint log file;
For these 4 stages, the System Records 4 log-related information for various other processing uses:
Log Sequence Number (LSN1): The current system LSN maximum, and the new transaction log LSN will be generated on this basis (lsn1+ the size of the new log);
Log flushed up to (LSN2): The LSN that is currently written to the log file;
Oldest modified data log (LSN3): The LSN of the current oldest dirty page data, which writes the LSN directly to the log file when the checkpoint is written;
Last checkpoint at (LSN4): LSN that is currently written to checkpoint;
For the system, the above 4 LSN is decremented, namely: Lsn1>=lsn2>=lsn3>=lsn4.
The specific example is as follows (using the Show InnoDB status \g command to view, oldest modified data log is not displayed):
Log---Log sequence number 34822137537Log flushed up to 34822137537Last checkpoint at 348221330280 pending log wri TES, 0 pending CHKP writes54189288 log I/O ' s done, 3.00 log I/O ' S/second
Iv. Log protection mechanism
MySQL crash, InnoDB has a log brush disk mechanism, can be controlled by innodb_flush_log_at_trx_commit parameters, here is how to prevent log overwrite cause log loss
What are the close relationships between InnoDB's checkpoint and redo log? There are several nouns that need to be explained:
CKP Age (Dynamic Movement): The oldest dirty page has not been flush to the data file, that is, the range of last checkpoint is not done
Buf age (Dynamic Move): Modified page information not written to log, but already in log buffer
Buf Async (fixed point): 7/8 of the log space size, when Buf age moves to the Buf async point, Forces the modified page information not written to log to start writing to log without blocking the transaction
Buf sync (fixed point): Log space size of 15/16, when written very large, Buf age moved very fast, all of a sudden to Buf sync point, blocking the transaction, forcing the modified page information to start writing to log. If the transaction is not blocked, the redo log of the last checkpoint is not covered at risk
CKP Async (fixed point): 31/32 of the log space size, when CKP age arrives Ckp async, forces the last checkpoint, does not block the transaction
CKP sync (fixed point): Log space size, when CKP age arrives at Ckp sync, forcing last checkpoint, blocking transactions, there is a danger of redo log coverage
Analysis of 4 different cases
If BUF age is between buf async and buf sync
If BUF age is behind BUF sync (this is not the case, MySQL has a protection mechanism)
If CKP age is between CKP async and Ckp sync (this is not the case)
If CKP age is behind CKP sync (this is not the case)
First case:
When the write volume is huge, buf age moves between buf async and buf sync, triggering writes out to log, MySQL writes as many logs as possible, and if the write volume slows down, buf age moves back to the "figure one" state. If the write volume is greater than the flush log speed, buf age will eventually overlap with buf sync, when all transactions are blocked, forcing the dirty page of the buf age-buf async to brush the disk, then IO will be busy.
Second case:
Of course this is not possible, because if redo log exists, the data will be lost. BUF Age crosses Log size,buf age may be more than log size, if you want to brush buf age, then the entire log size is not enough to accommodate all buf age.
There is no analysis of the third and fourth cases:
CKP age is always behind the buf age (left), because CKP age is the last checkpoint point, always chasing buf age (as many modified page flush to disk), so buf age must have reached buf sync first.
CKP Async and Ckp sync have any meaning?
The page cache in MySQL also has high water and low water, when the dirty page touches low water, the OS starts Flush dirty page to disk, and when the high water, it blocks all actions. OS will be madly Flush dirty page, the disk will be busy, there is IO Storm,
This article refers to:
http://blog.csdn.net/yah99_wolf/article/category/539408
Http://www.cnblogs.com/bamboos/p/3532150.html
http://tech.uc.cn/?p=716
http://www.mysqlperformanceblog.com/2011/04/04/innodb-flushing-theory-and-solutions/
"MySQL" on the checkpoint mechanism