When a checkpoint occurs in an Informix database, database applications are blocked until the checkpoint completes. This can significantly degrade database performance. This article introduces the principles and application of non-blocking checkpoints and the RTO (recovery time objective) policy in Informix 11, with the aim of giving a more complete picture of both.
Checkpoints are one of the most important operations of the database server. A checkpoint flushes all or part of the transactions and data in the buffer pool to disk, creating a point of consistency for the database server, so that after a failure the server can restart from that known point.
The purpose of a checkpoint is to periodically move the restart point in the logical log forward. If no checkpoint existed and a failure occurred, the database server would have to process every transaction recorded in the logical log since the system was started.
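The effect of moving the restart point forward can be illustrated with a small sketch. This is a toy model, not Informix code: the logical log is assumed to be a simple list of records, and the names Record and records_to_replay are hypothetical.

```python
from typing import List, Tuple

# Toy model (an assumption for illustration): each log record is a
# (kind, payload) pair, where kind is "txn" or "ckpt".
Record = Tuple[str, str]


def records_to_replay(log: List[Record]) -> List[Record]:
    """Return the transaction records written after the last checkpoint.

    If the log contains no checkpoint, every transaction record
    has to be replayed.
    """
    last_ckpt = -1
    for i, (kind, _) in enumerate(log):
        if kind == "ckpt":
            last_ckpt = i
    return [r for r in log[last_ckpt + 1:] if r[0] == "txn"]


log_without_ckpt = [("txn", "t1"), ("txn", "t2"), ("txn", "t3")]
log_with_ckpt = [("txn", "t1"), ("txn", "t2"), ("ckpt", ""), ("txn", "t3")]

print(len(records_to_replay(log_without_ckpt)))  # 3
print(len(records_to_replay(log_with_ckpt)))     # 1
```

With a checkpoint in the log, only the records after it need to be processed at restart, which is why a more recent checkpoint shortens recovery.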
Full checkpoints and fuzzy checkpoints
Before version 11, Informix had two types of checkpoint: the full (or sync) checkpoint and the fuzzy checkpoint. A full checkpoint flushes all modified data in the buffer pool to disk and establishes a point of consistency for the database server. While a full checkpoint is in progress, transactions in the database server are normally blocked until it completes, so a long-running checkpoint can have a significant impact on system performance. To reduce that impact, Informix introduced the fuzzy checkpoint in release 9.2. Like a full checkpoint, a fuzzy checkpoint flushes data from the buffer pool to disk, but data changed by insert, update, and delete operations on the database's built-in data types is not flushed. Each fuzzy checkpoint is therefore much shorter, which reduces the impact on system performance.
The system performs a fuzzy checkpoint when any of the following occurs:
The checkpoint interval elapses. The interval is normally set with the CKPTINTVL parameter in the onconfig configuration file.
The physical log reaches 75% of its total size.
An administrative event takes place, such as adding a dbspace or adding a chunk.
The onmode -c fuzzy command is run.
The system performs a full (sync) checkpoint when any of the following occurs:
A backup or restore is performed with the ontape or ON-Bar command.
The database server completes a fast recovery or a full recovery.
Once for each pass through the logical-log space: Informix Dynamic Server cannot overwrite the logical-log file that contains the most recent checkpoint, so it must trigger a checkpoint before moving into that log file.
The onmode -c command is run.
The database server is shut down normally.
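The two lists above can be summarized as a small decision function. This is only a sketch of the pre-version-11 rules as described in this article; the function name, the event labels, and the argument names are hypothetical and do not correspond to any Informix API.

```python
from typing import Optional

# From the text: a fuzzy checkpoint fires when the physical log is 75% full.
PHYSICAL_LOG_THRESHOLD = 0.75

# Hypothetical labels for the events listed in the article.
FUZZY_EVENTS = {"add_dbspace", "add_chunk", "onmode_c_fuzzy"}
FULL_EVENTS = {
    "backup_or_restore",
    "recovery_complete",
    "logical_log_wrap",
    "onmode_c",
    "shutdown",
}


def checkpoint_needed(
    elapsed_s: float,
    ckptintvl_s: float,
    plog_used_kb: int,
    plog_size_kb: int,
    event: Optional[str] = None,
) -> Optional[str]:
    """Return "full", "fuzzy", or None according to the rules above."""
    if event in FULL_EVENTS:
        return "full"
    if event in FUZZY_EVENTS:
        return "fuzzy"
    if elapsed_s >= ckptintvl_s:
        return "fuzzy"  # checkpoint interval (CKPTINTVL) has elapsed
    if plog_used_kb >= PHYSICAL_LOG_THRESHOLD * plog_size_kb:
        return "fuzzy"  # physical log is at least 75% full
    return None


print(checkpoint_needed(10, 300, 100, 1000))               # None
print(checkpoint_needed(10, 300, 800, 1000))               # fuzzy
print(checkpoint_needed(10, 300, 100, 1000, "onmode_c"))   # full
```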
When a full checkpoint or a fuzzy checkpoint occurs, the database server performs the following sequence of actions:
Block transactions.
Flush the physical log buffer to disk.
Flush the modified data in the buffer pool to disk. For a fuzzy checkpoint, the Informix page-cleaner threads flush the modified data except for changes made by insert, delete, and update operations, which are not flushed. For a full checkpoint, the page-cleaner threads flush all modified data.
Write the checkpoint-complete record to the logical log buffer and update the checkpoint information in the reserved pages.
Flush the logical log buffer to disk.
Logically empty the physical log.
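The sequence above, and the difference between the two checkpoint types in the third step, can be sketched as follows. This is a simplified simulation under stated assumptions, not the server's implementation: the Page class, its fuzzy_op flag, and run_checkpoint are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    page_id: int
    dirty: bool
    fuzzy_op: bool  # True if last changed by an insert, update, or delete


def run_checkpoint(pool: List[Page], fuzzy: bool) -> List[str]:
    """Walk through the six steps and return a trace of what happened."""
    trace = ["block transactions", "flush physical log buffer"]

    flushed = []
    for page in pool:
        if not page.dirty:
            continue
        if fuzzy and page.fuzzy_op:
            continue  # a fuzzy checkpoint leaves these pages in memory
        page.dirty = False
        flushed.append(page.page_id)
    trace.append(f"flush pages {flushed}")

    trace.append("write checkpoint record; update reserved pages")
    trace.append("flush logical log buffer")
    trace.append("logically empty physical log")
    return trace


pool = [Page(1, True, True), Page(2, True, False), Page(3, False, False)]
print(run_checkpoint(pool, fuzzy=True)[2])   # flush pages [2]
print(run_checkpoint(pool, fuzzy=False)[2])  # flush pages [1]
```

In this sketch the fuzzy checkpoint skips page 1, so a later full checkpoint still has to write it. That deferred work is one way to picture why recovery time after fuzzy checkpoints is harder to predict.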
Before Informix version 11, both full and fuzzy checkpoints blocked transactions and affected system performance, so users had to find ways to keep checkpoints as short as possible. Typically, the user set the LRU parameters to very small values, so that data in the buffer pool was flushed continuously and less remained to be written at checkpoint time. However, very small LRU values reduce write buffering, consume CPU resources, and increase contention for the buffer pool, which in turn hurts OLTP performance. Tuning the system this way is very difficult. In addition, when a large number of transactions are blocked, the duration of a fuzzy checkpoint is unpredictable, and with fuzzy checkpoints the recovery time of the database server after a failure also becomes unpredictable.
Before Informix version 11, checkpoints also forced users into a trade-off between response time and recoverability. With a short checkpoint interval, the system recovers quickly but transactions are blocked more often. With a long checkpoint interval, fewer transactions are blocked but recovery takes longer.
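The trade-off can be made concrete with a rough back-of-the-envelope model. Everything here is an assumption made for illustration: the checkpoint duration is treated as fixed, log records are generated and replayed at constant rates, and the function tradeoff and its parameters are hypothetical names. Real behavior depends on workload and configuration.

```python
from typing import Tuple


def tradeoff(
    interval_s: float,
    ckpt_duration_s: float,
    log_rate: float,
    replay_rate: float,
) -> Tuple[float, float]:
    """Return (fraction of time blocked, worst-case recovery time in seconds).

    Assumptions: every checkpoint blocks transactions for ckpt_duration_s,
    log records are produced at log_rate per second, and recovery replays
    them at replay_rate per second.
    """
    blocked_fraction = ckpt_duration_s / (interval_s + ckpt_duration_s)
    worst_recovery_s = interval_s * log_rate / replay_rate
    return blocked_fraction, worst_recovery_s


short_interval = tradeoff(60, 5, 1000, 4000)
long_interval = tradeoff(600, 5, 1000, 4000)

# A short interval blocks transactions for a larger share of the time,
# while a long interval leaves more log to replay after a failure.
print(short_interval[0] > long_interval[0])  # True
print(short_interval[1] < long_interval[1])  # True
```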