1. Concept
The redo log is the mechanism Oracle uses to ensure that committed transactions are not lost. It exists to serve two scenarios: instance recovery and media recovery.
The purpose of instance recovery is to ensure that changes held only in the buffer cache are not lost when the database instance fails, and that the failure does not leave the database inconsistent.
The purpose of media recovery is to make it possible to recover data when a data file is damaged or lost.
Although the two kinds of recovery use similar mechanisms, they are fundamentally different, and many DBAs confuse them.
Redo log data is organized by thread. A single-instance system has only one thread, while a RAC system may have several. Each database instance has its own set of redo log files and its own log buffer, and the changes made by an instance are recorded independently in the redo log files of its thread.
2. Recovery steps and differences
For both media recovery and instance recovery, the first step is to roll forward using the information in the redo log. During roll forward, the change vectors (CVs, introduced shortly) stored in the redo log files are applied to the relevant data files in SCN order, so that the state of the data files moves forward. Note that changes to the undo tablespace are also recorded in the redo log, so the data files of the undo tablespace are rolled forward as well. Once roll forward reaches the last available online or archived redo log, the database-level recovery work is complete. At this point the database contains every recorded change, some committed and some not. In the rolled-forward undo tablespace we can also see the transactions that have not yet been committed.
So the next thing the database needs to do is transaction-level processing, rolling back transactions that have not yet been committed to ensure database consistency.
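The two phases described above can be illustrated with a minimal Python sketch. It is a toy model, not Oracle's implementation: the names `RedoRecord` and `recover` are hypothetical, a dictionary stands in for the data files, and the before-image stored in each record stands in for the undo information that real roll forward rebuilds in the undo tablespace.

```python
from dataclasses import dataclass


@dataclass
class RedoRecord:
    scn: int   # system change number that orders the change
    txn: str   # transaction that made the change
    key: str   # stand-in for a data block address
    old: int   # before-image (stand-in for undo information)
    new: int   # after-image


def recover(datafile, redo, committed):
    """Roll forward every redo record, then roll back uncommitted transactions."""
    data = dict(datafile)
    applied = []
    # Phase 1: roll forward in SCN order, whether or not the change was committed.
    for rec in sorted(redo, key=lambda r: r.scn):
        data[rec.key] = rec.new
        applied.append(rec)
    # Phase 2: roll back uncommitted transactions, newest change first.
    for rec in reversed(applied):
        if rec.txn not in committed:
            data[rec.key] = rec.old
    return data


redo = [
    RedoRecord(101, "T1", "A", old=1, new=2),
    RedoRecord(102, "T2", "B", old=10, new=20),
    RedoRecord(103, "T1", "A", old=2, new=3),
]
# T1 committed before the failure, T2 did not.
result = recover({"A": 1, "B": 10}, redo, committed={"T1"})
```

In this sketch the committed change to `A` survives, while the uncommitted change to `B` is applied during roll forward and then undone.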
On a single-instance system, instance recovery usually happens when the database is restarted after an abnormal termination. If the database was shut down with SHUTDOWN ABORT, or went down because of an operating system or host failure, instance recovery is performed automatically during ALTER DATABASE OPEN. In a RAC environment, if one instance goes down, a surviving instance takes over and recovers the failed instance. If all instances are down, the first instance to execute ALTER DATABASE OPEN performs instance recovery. This is also why the redo log files must be stored on shared storage even though the redo log is private to an instance.
Oracle's caching mechanism is performance-oriented: to maximize database performance, writes from the cache to the data files are deferred for as long as possible. This greatly improves performance, but it can cause problems when an instance fails.
First, when an instance fails, some changes may not yet have been fully written to the data files on disk, so changes made by committed transactions may be missing from the disk files. Second, changes made by transactions that have not yet committed may already have been written to the data files. It is also possible that an atomic change has been only partly written, with some of its data on disk and some not. Instance recovery repairs all of these cases using the information recorded in the online redo log files. The process is fully automatic and requires no manual intervention.
This mechanism raises two problems. The first is how to ensure that committed transactions are not lost. The second is how to balance database performance against instance recovery time, so that performance does not degrade and the instance can still be recovered quickly.
The first problem is the simpler one. Oracle has a mechanism called log-force-at-commit: when a transaction commits, the redo data associated with it, including the commit record, must be written from the log buffer to the redo log file, and only then can the commit-successful signal be sent to the user process. This ensures that if the instance fails while some buffer cache changes of a committed transaction have not yet been written to the data files, the inconsistent data can be rolled forward from the redo log during instance recovery.
The second problem is solved by the checkpoint mechanism. In Oracle, the buffer cache is modified by foreground processes, but a foreground process only reads data blocks from the data files into the buffer cache; it does not write the buffer cache back to the data files. Writing the buffer cache to the data files is done by the background process DBWR. DBWR decides when to write a block back based on the system load and on whether the block is in use by other processes. As a result, the time at which a block is written back is somewhat random, and some blocks that were modified early may be written to the data files relatively late. The checkpoint mechanism complements this: when a checkpoint occurs, the CKPT process asks DBWR to write back to the data files all blocks modified before the checkpoint SCN. Once the checkpoint completes, every data change before that SCN has been saved, so if the instance fails, instance recovery only needs to start from the changes made after the last completed checkpoint. Changes before the checkpoint need not be considered.
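As a rough illustration of log-force-at-commit, the following Python sketch models an instance with an in-memory log buffer and buffer cache and an on-disk redo file and data file. The class `Instance` and its methods are hypothetical names chosen for this example, and recovery is simplified to applying only the changes of committed transactions instead of rolling forward and then rolling back.

```python
class Instance:
    def __init__(self):
        self.log_buffer = []    # redo in memory, lost when the instance fails
        self.redo_file = []     # redo on disk, survives the failure
        self.buffer_cache = {}  # modified blocks in memory
        self.datafile = {}      # blocks on disk

    def change(self, txn, key, value):
        # A change generates redo in the log buffer and modifies the cache only.
        self.log_buffer.append(("CHANGE", txn, key, value))
        self.buffer_cache[key] = value

    def commit(self, txn):
        self.log_buffer.append(("COMMIT", txn, None, None))
        # Force the log: the redo, including the commit record, reaches disk
        # before success is reported to the caller.
        self.redo_file.extend(self.log_buffer)
        self.log_buffer.clear()
        return "commit complete"

    def crash_and_recover(self):
        # Everything in memory is lost.
        self.log_buffer.clear()
        self.buffer_cache.clear()
        # Simplified recovery: replay the changes of committed transactions only.
        committed = {txn for kind, txn, _, _ in self.redo_file if kind == "COMMIT"}
        for kind, txn, key, value in self.redo_file:
            if kind == "CHANGE" and txn in committed:
                self.datafile[key] = value
        return self.datafile


inst = Instance()
inst.change("T1", "A", 5)
inst.change("T2", "B", 7)
inst.commit("T1")            # flushes the whole log buffer, including T2's redo
inst.change("T2", "C", 9)    # never reaches disk
recovered = inst.crash_and_recover()
```

Nothing was ever written to the data file before the failure, yet the committed change to `A` is recovered because its redo was forced to disk at commit time.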
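The effect of a checkpoint on recovery can be sketched in Python as follows. This is a toy model under stated assumptions: the function names are hypothetical, all redo is assumed to be on disk already, and each cached block remembers the SCN at which it first became dirty so that a checkpoint knows which blocks it must write.

```python
def apply_change(buffer_cache, redo_log, scn, key, value):
    """Record redo and modify the cached block, keeping its first-dirty SCN."""
    redo_log.append((scn, key, value))
    first_dirty_scn = buffer_cache[key][0] if key in buffer_cache else scn
    buffer_cache[key] = (first_dirty_scn, value)


def checkpoint(buffer_cache, datafile, checkpoint_scn):
    """Write back every block that became dirty at or before checkpoint_scn."""
    for key in list(buffer_cache):
        first_dirty_scn, value = buffer_cache[key]
        if first_dirty_scn <= checkpoint_scn:
            datafile[key] = value
            del buffer_cache[key]


def instance_recovery(datafile, redo_log, checkpoint_scn):
    """Replay only the redo generated after the last completed checkpoint."""
    applied = 0
    for scn, key, value in redo_log:
        if scn > checkpoint_scn:
            datafile[key] = value
            applied += 1
    return applied


cache, redo, datafile = {}, [], {}
apply_change(cache, redo, 100, "A", 1)
apply_change(cache, redo, 101, "B", 2)
checkpoint(cache, datafile, 101)
apply_change(cache, redo, 102, "A", 3)
apply_change(cache, redo, 103, "C", 4)
cache.clear()  # instance failure: the buffer cache is lost
applied = instance_recovery(datafile, redo, checkpoint_scn=101)
```

Of the four redo records, only the two generated after the checkpoint have to be replayed, which is how more frequent checkpoints shorten recovery at the cost of more write activity during normal operation.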
So far we have covered the basic principles of instance recovery and have a general picture of how the redo log works, but it is worth going a little deeper. From the introduction above, it may seem that instance recovery has been explained thoroughly, yet several questions remain unanswered. Some readers may ask: could a change already have been written to the data file while its redo information is still in the log buffer and not yet in the redo log file? How would recovery work in that case?
Here we need to introduce a term: write-ahead logging, meaning that the log is written first. Write-ahead logging consists of two rules. First, a modified buffer in the buffer cache may not be written to the data file until the change vectors for that modification have been written to the redo log file; this ensures that the data files never contain a change that is not recorded in the redo log files. Second, the modified buffer may not be written to the data file until the change vectors for the undo information of that change have also been written to the redo log.
Media recovery uses a mechanism similar to instance recovery. The difference is that media recovery takes place when a stored data file is damaged, and it is not automatic: it must be run manually with the RECOVER DATABASE or RECOVER DATAFILE command. Media recovery usually starts from a data file restored from backup, so it generally needs archived logs.
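The write-ahead rule can be sketched in Python as a check performed before a block is written. The class `WalInstance` and its methods are hypothetical, and the model treats undo change vectors as ordinary redo records, so the two rules collapse into one condition: the redo on disk must cover the block's latest change before the block itself is written.

```python
class WalInstance:
    def __init__(self):
        self.next_scn = 1
        self.log_buffer = []    # (scn, key, value) records not yet on disk
        self.redo_file = []     # redo records already on disk
        self.buffer_cache = {}  # key -> (last_change_scn, value)
        self.datafile = {}      # blocks on disk

    def change(self, key, value):
        scn = self.next_scn
        self.next_scn += 1
        self.log_buffer.append((scn, key, value))
        self.buffer_cache[key] = (scn, value)
        return scn

    def flushed_scn(self):
        # Highest SCN whose redo is already on disk (0 if none).
        return self.redo_file[-1][0] if self.redo_file else 0

    def flush_log(self):
        self.redo_file.extend(self.log_buffer)
        self.log_buffer.clear()

    def write_block(self, key):
        last_change_scn, value = self.buffer_cache[key]
        # Write-ahead: the redo for this block must be on disk first.
        if last_change_scn > self.flushed_scn():
            self.flush_log()
        self.datafile[key] = value


wal = WalInstance()
wal.change("A", 1)
redo_before_write = list(wal.redo_file)  # still empty: redo is only in memory
wal.write_block("A")                     # forces the redo out before the block
```

With this ordering, the situation raised above (a change in the data file with no matching redo on disk) cannot occur in the model.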
Author: 51cto Blog Oracle Little Bastard