Similarities and differences in Oracle Media recovery and instance recovery

Source: Internet
Author: User

1. Concept

? ? REDO Log is a mechanism established by Oracle to ensure that transactions that have been committed are not lost. In fact, the existence of the redo log is prepared for two scenarios:

    • Instance recovery (INSTANCE RECOVERY);

    • Medium recovery (media RECOVERY).

? ? The purpose of the instance recovery is to ensure that the data in the buffer cache is not lost in the event of a database failure and does not result in inconsistent database;

The purpose of media recovery is to recover data when a data file fails.

While the mechanisms used for these two recoveries are similar, the two recoveries are fundamentally different, as many DBAs often confuse.

REDO log data is organized according to thread, for a single-instance system, there is only one thread, for the RAC system, there may be multiple thread, each DB instance has a set of independent REDO log files, with a separate log BUFFER, Changes to an instance are recorded independently in a thread's redo log file.


2. Recovery steps

? ? For media recovery and instance recovery, the first step is to roll forward through the information of redo log, when rolling forward, through the redo log files recorded in the database change vector, according to the SCN comparison, submitted to the relevant data file, so that the state of the data file roll forward. It is important to note that changes to the undo table space are also recorded in the Redo log, so the undo Tablespace-related data files are also rolled forward. When you roll to the last available redo log or archive log, all the database recovery level work is done. At this time, the database contains all of the recorded changes, some of which have already been submitted, some of which have not yet been submitted. In the Undo table space of the latest state, we can also see some transactions that have not yet been committed.

So the next thing the database needs to do is transaction-level processing, rolling back those uncommitted transactions to ensure database consistency.

? ? For a single instance of the system, the instance recovery is generally in the database instance after the exception failure after the database restarts, when the database execution shutdown abort or due to the operating system, host, and other reasons for downtime, the ALTER DATABASE open, will automatically do the instance recovery. In a RAC environment, if an instance is down, the live instance will take over and restore the instance for the outage. Unless all instances are down, the first instance that executes the alter DATABASE Open will do the instance recovery. This is also the redo log is the private component of the instance, but the reason that the redo log file must reside on the shared storage.

? ? The cache mechanism of the Oracle database is performance-oriented, and the cache mechanism should maximize the performance of the database, so cache writes to the data file is always postponed as much as possible. This mechanism greatly improves the performance of the database, but there may be some problems when the instance fails.

The first is that when an instance fails, some changes to the data file may not be fully written to the disk, and some modifications to the data file that have been committed by the transaction may be missing from the disk file. Second, it is possible that some transactions that have not yet been committed have been written to the disk file for changes to the data file. It is also possible that some of the data for an atom change has been written to the file, and some data has not been written to the disk file. Instance recovery is to automatically complete the restoration of the above data through the information recorded in the online REDO log file. This process is completely automatic and does not require manual intervention.

In this mechanism, there are two problems to be solved:

The first is how to ensure that transactions that have been committed are not lost;

The second is how to balance the time required for database performance and instance recovery to ensure that database performance does not degrade and that instance recovery is fast.

? ? Solving the first problem is simple, and Oracle has a mechanism called log-force-at-commit, which means that the redo log data associated with the transaction, including the commit record, must be written from the log buffer at the time the transaction is committed redo Log file, at which time the transaction submits a successful signal to be sent to the user process. Through this mechanism, you can ensure that even if some of the buffer caches in the committed transaction have not been written to the data file, an instance failure occurs, and when the instance is restored, the inconsistent data can be rolled forward by redo log information.

? ? To solve the second problem, Oracle is implemented through a checkpoint mechanism. In the Oracle database, the modification of the buffer cahce is done by the foreground process, but the foreground process is only responsible for reading the data block from the data file into the buffer cache, and is not responsible for buffer cache writing to the data file. The operation of the BUFFER cache to write to the data file is done by the background process DBWR. DBWR can be used to write part of a chunk back into a data file, depending on the load of the system and whether the data block is being made available to other processes. In this mechanism, the time at which a block of data is written back to a file can be somewhat random, and some of the data blocks that are modified may be later written to the data file. And the checkpoint mechanism is an effective complement to this mechanism, checkpoint occurs, the CKPT process will require the DBWR process will be a SCN before all the modified blocks are written back to the data file. So once this checkpoint is finished, all the data changes before this SCN have been saved, and if there is an instance failure, then when the instance is restored, only the amount of change that the checkpoint has completed will begin. The changes before checkpoint do not need to be considered.

So far, we have learned some basic principles of the instance recovery mechanism, and we can generally understand the working mechanism of redo log. But I think we need to go a little deeper. Learn some more in-depth insider stories. In fact, through the above introduction, you may already feel that the instance recovery understanding is very thorough, and in fact, there are many problems we have not solved. Some readers might ask, is there any possibility that the changes in the data file have been written, but the redo log information is still in log buffer and is not written to redo log, how can this be recovered?

? ? Here we also introduce a noun: write-ahead-log, that is, log write priority. Log write precedence consists of two algorithms:

? ? The first aspect is that the modified buffer cache data is not allowed to be written to the data file until the modified change vector of the buffer cache has not been written to the redo log file, thus ensuring that the data file cannot contain the redo Changes recorded in the log file;

? ? The second aspect is that the buffer cache modification cannot be written to the data file until the change vector of the undo information for the data has not been written to redo log.

3. Difference

? ? The mechanism of media recovery and instance recovery is similar, the difference is that the media recovery is when the stored data file fails, the media recovery can not be automated, you must manually execute the recover database or recover datafile command to implement. In general, media recovery is a starting point for recovering from a recovered data file, so you need to use archived logs when doing media recovery.


Similarities and differences in Oracle Media recovery and instance recovery

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.