Checkpoint queue and incremental checkpoint

Source: Internet
Author: User
Tags what dba


The main purpose of a checkpoint is to refresh dirty blocks in a way that has minimal impact on the daily operations of the database. Dirty blocks are constantly generated. How can I refresh the dirty blocks to the disk? Before 8i, Oracle regularly performs some modification operations on lock residences to refresh all dirty blocks in the Buffer cache. This method of refreshing dirty blocks is called a full checkpoint, which greatly affects the efficiency, after 9i, a full checkpoint occurs only when the database is closed.

Starting from 8i, Oracle has added the concept of incremental checkpoints. The main purpose of incremental checkpoints is to regularly refresh some dirty blocks. It is unreasonable to refresh the dirty blocks once because they are constantly generated and are not exhausted. Stop all user modification operations like a full checkpoint and refresh the dirty blocks before resuming. This will definitely affect the performance. Partial refreshing of all incremental checkpoints is the best solution to dirty blocks. What are the parts that are refreshed each time? According to the statistical study, the earliest dirty blocks are refreshed each time in the order of dirty blocks. This method is the most reasonable. To achieve this, Oracle creates another linked list in the Buffer cache, that is, the checkpoint queue. When a block is corrupted, it is linked to the end of the checkpoint queue. It's like waiting in the queue. People from will be in the first place, and those from will be in the second place. In the future, every person will be at the end of the team, this team is a queue arranged in the order of time. The checkpoint queue is like this. When the block is dirty, it is chained to the end. Therefore, the checkpoint queue queues are arranged in a queue according to the time sequence of dirty blocks.

650) this. width = 650; "src =" "width =" 492 "height =" 419 "border =" 0 "alt =" "/>


For example, each node in the checkpoint queue points to a dirty block. There is actually very little information in each node in the checkpoint queue, that is, to record the address of the corresponding block in the Buffer cache, And the redo record of the dirty block in the log file, the addresses of the previous and later nodes are also available. Check Point queues include LRU and dirty LRU, which are two-way linked lists. A two-way linked list records the addresses of the first and last nodes in a node.

The blocks in the header of the checkpoint queue are first corrupted. Therefore, Oracle regularly wakes up DBWn from the checkpoint queue header and refreshes the dirty blocks along the order of the checkpoint queue. While refreshing dirty blocks, new dirty blocks can be continuously linked to the tail of the checkpoint queue. This operation regularly wakes up DBWn to refresh dirty blocks. Oracle is called an incremental checkpoint.

650) this. width = 650; "src =" "width =" 566 "height =" 482 "border =" 0 "alt =" "/>


For example, the dirty blocks pointed to by nodes 1, 2, and 3 have been refreshed as clean blocks. At the same time, two blocks become corrupted and are linked to the end of the checkpoint queue. They are nodes 9 and 10.

The header of the Checkpoint queue, also known as the Checkpoint position and Checkpoint postion, does not need to be understood literally. In short, the checkpoint position is the checkpoint queue header. The information of the checkpoint queue header node, that is, the checkpoint position. Oracle records it frequently in the control file and records it frequently. Generally, there is a dedicated process CKPT every three seconds, and the checkpoint position is recorded in the control file.

650) this. width = 650; "src =" "width =" 589 "height =" 537 "border =" 0 "alt =" "/>


For example, the current checkpoint position is node 1 of the checkpoint queue. Three more seconds later, the CKPT process starts and records the new checkpoint location to the control file:


650) this. width = 650; "src =" "width =" 582 "height =" 510 "border =" 0 "alt =" "/>



The new checkpoint is node 4, which corresponds to the earliest dirty block of the current dirty time. Nodes 1, 2, and 3 have been removed from the checkpoint queue. Because their corresponding dirty blocks are no longer dirty. In general, the blocks after the checkpoint position in the control file are dirty blocks. However, there are also some exceptions, because the checkpoint position is updated every three seconds, just as the dirty blocks corresponding to nodes 1, 2, and 3 have been refreshed, however, because the three-second interval does not reach, the checkpoint still points to node 1. The position of the checkpoint is updated to node 4 only after three seconds.

We will talk about the checkpoint queue and the position of the checkpoint first. Before we fully introduce what is an incremental checkpoint, we will first talk about an important role of the checkpoint queue.

Let's first summarize what happened inside Oracle when users modify blocks:

1. If the block is not in the Buffer cache, read the block into the Buffer cache.

2. Redo the record, record it to the log cache, and write it to the log file when the user submits the record.

3. Modify the block in the Buffer cache

4. set the dirty block flag in the Buffer cache to change the block to a dirty block. At the same time, add a new node at the end of the checkpoint queue to record the information of the new dirty block, including: the location of the dirty block in the Buffer cache. The location of the redo record corresponding to the dirty block generated in step 2.

5. After the user submits the redo record, the corresponding redo record is written from the redo cache to the log file.

I now add the log to the figure above:

650) this. width = 650; "src =" "width =" 585 "height =" 400 "border =" 0 "alt =" "/>


For example, each node in the checkpoint queue stores the address of the dirty block and the number of the redo record corresponding to the dirty block. The location of the dirty block in the Buffer cache is random. You may not modify it. However, redo records are generated sequentially, just as they are arranged in the checkpoint queue. Because they are generated when the block is modified and dirty. Block A is first modified, and the redo record of Block A is placed in the front. Block B is modified, and the redo record of Block B is placed behind the redo record of block. The sequence is the same as that in the checkpoint. When the database is crashed due to exceptions, such as exceptions, power outages, and so on, there are many dirty blocks in the Buffer cache and they are written to the disk. For example, if a power failure occurs, there are still seven dirty blocks on the disk, which contain user-modified data, oracle has sent the feedback "Your modification is complete" to the user, and the user thinks that their modification is complete and will keep it in the database. However, the sudden power failure caused the data in these dirty blocks to be lost and they could not be written to the disk.

How Does Oracle solve this problem? When the database is restarted, Oracle only needs to read the checkpoint position from the control file. The redo record number is recorded in the checkpoint position, oracle can quickly locate redo record n in the log file, read the redo data in redo record n, and reproduce the user's modification operation to the database. Then, Oracle reads the redo data in the redo record n + 1 to reproduce the user's modifications. This process will continue following the log stream sequence and block the last redo record, in the example, the last redo record is n + 6. After this process is completed, all user modifications are reproduced and will not be lost at all. As long as your log file is complete and the log stream is complete, no information will be lost.

Some people may have a problem. After a redo record is generated, it is also sent to the redo cache first, and then written to the log file by the redo cache. Under such a mechanism, some redo records will be lost when they are not written to the log file, and the database suddenly becomes a machine. In this way, the dirty blocks corresponding to these redo records will not be restored. The user will still lose some data.

This situation does happen, but the lost information is useless. Why. Oracle writes the redo record corresponding to the dirty blocks modified by the transaction into the log file each time the user issues the submit command. Only when this operation is completed will the user receive the "submitted" message ", for a complete transaction, when the user sees that the commit is complete, it means that the corresponding redo record must be written to the log file, even if an exception occurs and crashes, it is also absolutely recoverable. The database crashes if the user does not commit or has no time to commit, and the transaction is incomplete. The transaction must be rolled back and no recovery is required. For such an incomplete transaction, the corresponding redo record may be lost, but this does not matter, because the incomplete transaction does not need to be recovered. That is to say, your modifications will not be lost only when your transactions are committed. However, there is another premise that the log file cannot be damaged. What DBA must do is ensure that the log file cannot be damaged. DBAs can use Disk Mirroring technology such as RAID1, or multiple backup log files. We have discussed this in the previous section.

The recovery we mentioned above is automatic and does not require DBA participation. It is called instance recovery.

The functions of the checkpoint queue and the incremental checkpoint are similar. Their main purpose is to refresh dirty blocks in the order of the checkpoint queue. In addition, the instance is restored.

Next, we will discuss how to set up incremental checkpoints.


The checkpoint settings mentioned here mainly refer to frequent incremental checkpoint settings. Note that an incremental checkpoint is just a term and does not need to be understood literally. When an incremental checkpoint occurs, Oracle will wake up DBWn to write dirty blocks along the checkpoint queue. This is the incremental checkpoint. So how long does an incremental check point occur? The frequency of this incremental checkpoint is very important. It basically controls how long DBWn will refresh the dirty block. If DBWn activity is too frequent, it will affect the overall performance of the database. If DBWn activity is too frequent, it will squeeze too many dirty blocks, which will also affect the performance. In addition, if an exception crashes, the instance needs to be restored. The more dirty blocks, the slower the instance recovery .. Before 9i, DBA mainly sets the frequency of incremental checkpoints by means of interval. For example, it can enable Oracle to perform incremental Checkpoints every 10 minutes. If this number is not set properly, the database performance will be greatly affected. In addition, the instance may be restored for a long time. After 9i, especially in 10g, checkpoints are already quite intelligent and rarely become the culprit of I/O problems. Set the fast_start_mttr_target parameter in 9i to the desired instance recovery time. The system automatically controls the frequency of incremental checkpoints. For example, if you want the instance to be restored within five minutes, you can set this parameter to 300, that is, 300.

If the value set by this parameter exceeds the actual hardware limit, for example, if you set it to 60, you expect that the database can recover the instance within one minute under any circumstances, however, the instance cannot be restored within one minute based on the dirty block generation speed of the database and the Write Performance of the storage device. At this time, Oracle will automatically set the appropriate fast_start_mttr_target parameter value. We can see the corrected parameter value in the parameter file, or the actual value in the Target_mttr column in the V $ instance_recovery view. For example:

For example)

We cannot set this value too small, because it is only accidental that the instance must be recovered. If fast_start_mttr_target is set to a small value to restore the instance as soon as possible, DBWn will be active frequently, which may cause performance problems. To avoid unreasonable incremental checkpoint frequency setting, if fast_start_mttr_target is set to 0 in 10 Gb, Oracle automatically adjusts the checkpoint Frequency Based on the speed at which dirty blocks are generated and the performance of storage hardware, make sure that the checkpoint frequency is not the cause of the I/O problem.

The main task of the checkpoint is to urge DBWn to refresh dirty blocks. If too many wait events occur when DBWn refreshes dirty blocks, It means too many dirty blocks and the writing speed of the storage device is too slow, or, the incremental checkpoint frequency is too high or too low. The wait event for DBWn to write dirty blocks is Db file parallel write. If your Incremental checkpoint frequency is low and you find this event, you should set the incremental checkpoint frequency to a higher level after eliminating the write performance problem of the storage device. If your Incremental checkpoint frequency is too high and the Db file parallel write event occurs, this indicates that the checkpoint frequency is too high.

In addition to this, there is also a wait event related to DBWn and incremental check. It is a Write complete waits event, when the current process needs to modify several blocks in the blocks to be written in batches by DBWn, this wait event will occur. This event is the frontend process that waits for DBWn to finish writing. There are too many waiting tasks, indicating that there is a problem with the write performance of the storage device, or the incremental checkpoint is too frequent.

We can see a lot of information about checkpoints in V $ instance_recovery:

If the Estimated_mttr column is too large, the check points are not frequent enough, and too many dirty blocks are generated. At the same time, in the V $ sysstat data view, there are two materials: background checkpoints started and background checkpoints completed. The first one is the number of background process checkpoint start times, and the last one is the number of background process checkpoint completion times. The significance of background process checkpoints is actually incremental checkpoints. Only incremental checkpoints are triggered by background processes. If you use the Alter system checkpoing command to allow the system to complete the full checkpoint, this is called the foreground checkpoint and the incremental checkpoint are irrelevant and will not be recorded in the two documents. If these two values are often different, for example, the checkpoint start times are less than 1 if they are greater than the completion times, this indicates that the checkpoint starts too many times but is not completed in time. This indicates that the checkpoint is too frequent or the checkpoint is too slow to complete.

For example, a large number of dirty blocks are generated and log files are smaller than 5 MB. The log file frequency is switched to trigger the checkpoint and check the wait event)

The checkpoint issue is actually caused by DBWn writing I/O in most cases. The wait event for DBWn writing dirty blocks is Db file parallel write, and the Write complete waits wait event, when the current process wants to modify several blocks in which DBWn is to be written in batches, this wait event will occur. This event is the frontend process that waits for DBWn to write. There are too many waiting tasks, which also indicates that DBWn has a problem.

NOTE: For the I/O problem of data files, in addition to waiting for events, we can also use the V $ filestat view in the previous sections to help identify the problem .)


Welcome to my ChinaUnix technical blog:



To be continued)

This article from "ye shaochen" blog, please be sure to keep this source

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.