Oracle Recovery Internals: Instance Recovery

Last Update:2017-02-27 Source: Internet

Author: User


Instance recovery is used to recover both from a crash failure and from the failure of an instance in a parallel server environment. The term can therefore refer either to crash recovery or to parallel server instance recovery, in which a surviving instance recovers one or more failed instances.

The goal of instance recovery is to restore the data block changes that were in the buffer cache of the failed instance and to close the thread that was left open. Instance recovery uses only the online redo logs and the current online data files (no restored backups are involved). It recovers one thread at a time, starting from the thread's most recent thread checkpoint and continuing to the end of the thread.

5.1 Detecting Whether Instance Recovery Is Required

Instance recovery occurs automatically when the Oracle kernel detects that an instance has died and left its thread marked as open in the control file. This happens in the following two scenarios:

1. The database is opened for the first time after a crash (crash recovery).

2. Some, but not all, instances of a parallel server fail.
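The detection rule above can be sketched in a few lines of Python. This is only an illustrative model, not Oracle's implementation: the `ThreadRecord` class, its fields, and the function name are hypothetical stand-ins for the per-thread state kept in the control file.

```python
from dataclasses import dataclass


@dataclass
class ThreadRecord:
    """Hypothetical model of one redo thread's entry in the control file."""
    thread_id: int
    open_flag: bool       # thread is marked open in the control file
    instance_alive: bool  # the instance that owns the thread is still running


def threads_needing_instance_recovery(threads):
    """Return the ids of threads left open by an instance that has died."""
    return [t.thread_id for t in threads if t.open_flag and not t.instance_alive]


if __name__ == "__main__":
    control_file = [
        ThreadRecord(1, open_flag=True, instance_alive=True),    # healthy instance
        ThreadRecord(2, open_flag=True, instance_alive=False),   # crashed instance
        ThreadRecord(3, open_flag=False, instance_alive=False),  # cleanly closed
    ]
    print(threads_needing_instance_recovery(control_file))
```

In this model, crash recovery is the case where every open thread belongs to a dead instance, and parallel server instance recovery is the case where only some do.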

In a parallel server environment, a surviving instance detects that one or more failed instances require instance recovery as follows:

1. A foreground process of a surviving instance detects an "invalid block lock" condition when it reads a block from a data file into the buffer cache. This typically happens when another instance had read the block into its buffer cache, protected the possibly dirty block with the lock, and then failed.

2. The foreground process notifies the SMON process of its own instance, which then looks for failed instances.

3. The surviving instance discovers that another instance is dead if it is able to acquire the lock on that instance's thread-open flag.

The SMON process of the surviving instance obtains a list of the dead instances and a list of the invalid block locks. Note: once instance recovery is complete, the locks in these lists are cleaned up.

5.2 Thread-at-a-time Redo Application

Instance recovery processes one thread at a time, so it recovers one instance at a time. It applies all of a thread's redo (from the thread checkpoint to the end of the thread) to the data files before it moves on to the next thread. The correctness of this algorithm depends on the fact that only one instance at a time can hold a modified copy of a given block in its buffer cache. Between modifications by different instances, the block is written back to disk. Therefore a block read from disk during instance recovery needs redo from at most one thread: the thread that contains the most recent modifications to that block.

Instance recovery can always be completed as long as the online logs of the thread being recovered are intact. Crash recovery processes the thread with the lowest thread checkpoint SCN first and then recovers the remaining threads in order of increasing thread checkpoint SCN. This ensures that each recovered thread advances the database checkpoint.
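As a rough illustration of the thread-at-a-time ordering described above, the following Python sketch recovers threads in order of increasing thread checkpoint SCN and applies each thread's post-checkpoint redo in full before moving on. The `RedoRecord` and `RedoThread` classes, the dictionary of blocks, and the "apply everything after the checkpoint SCN" rule are simplifying assumptions made for this example, not Oracle structures.

```python
from dataclasses import dataclass, field


@dataclass
class RedoRecord:
    scn: int
    block: str   # hypothetical block identifier
    value: str   # new block contents described by this redo record


@dataclass
class RedoThread:
    thread_id: int
    checkpoint_scn: int
    records: list = field(default_factory=list)  # already in SCN order


def recover_thread_at_a_time(threads, blocks):
    """Apply each thread's post-checkpoint redo completely, one thread at a time.

    Threads are processed in order of increasing thread checkpoint SCN.
    Returns the order in which the threads were recovered.
    """
    order = []
    for thread in sorted(threads, key=lambda t: t.checkpoint_scn):
        order.append(thread.thread_id)
        for rec in thread.records:
            if rec.scn > thread.checkpoint_scn:  # skip redo before the checkpoint
                blocks[rec.block] = rec.value
    return order


if __name__ == "__main__":
    t1 = RedoThread(1, 100, [RedoRecord(90, "A", "old"), RedoRecord(110, "A", "a1")])
    t2 = RedoThread(2, 80, [RedoRecord(85, "B", "b1"), RedoRecord(120, "B", "b2")])
    blocks = {"A": "disk", "B": "disk"}
    print(recover_thread_at_a_time([t1, t2], blocks), blocks)
```

The example data respects the property the algorithm relies on: each block needs redo from only one thread, so the order in which whole threads are applied does not affect the final block contents.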

5.3 Current Online Data Files Only

The checkpoint counter is used to verify that a data file is the current online data file rather than a restored backup. If a data file has been restored from a backup, media recovery is required first.
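One way to picture this check is as a comparison between the counter stored in each data file header and the value recorded for that file in the control file. The text only states that the counter is used for validation, so the comparison rule, the function name, and the file names below are assumptions made for this sketch.

```python
def files_needing_media_recovery(controlfile_counters, header_counters):
    """Return the data files whose header counter lags the control file's value.

    Assumption for this sketch: a restored backup carries an older (smaller)
    checkpoint counter than the one recorded in the control file.
    """
    return sorted(
        name
        for name, expected in controlfile_counters.items()
        if header_counters[name] < expected
    )


if __name__ == "__main__":
    controlfile = {"system01.dbf": 40, "users01.dbf": 12}
    headers = {"system01.dbf": 40, "users01.dbf": 9}  # users01.dbf was restored
    print(files_needing_media_recovery(controlfile, headers))
```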

When a data file has been restored from a backup, media recovery is required even if the online redo logs alone would be enough to recover it. The reason is that crash recovery applies all of one thread's redo after the thread checkpoint before it starts on the next thread. Crash recovery can use this thread-at-a-time algorithm because each block needs redo from at most one thread.

When a backup has been restored, however, there is no way to limit the redo that a given block needs to a single thread, so the thread-at-a-time algorithm does not work. Recovering a backup requires thread-merged redo application: all redo after the data file checkpoint is applied, with the redo from all threads merged in order of increasing SCN. This thread-merged algorithm is used only by media recovery (see Section 6).
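The thread-merged order can be sketched with Python's standard `heapq.merge`, which lazily merges several already-sorted streams. The layout of `threads` (a dictionary mapping a thread id to `(scn, change)` pairs in SCN order) and the function name are hypothetical and chosen only for illustration.

```python
import heapq


def merge_redo_by_scn(threads, file_checkpoint_scn):
    """Yield (scn, thread_id, change) from all threads in increasing SCN order.

    Redo at or before the data file checkpoint is skipped. Each thread's
    records must already be sorted by SCN.
    """
    streams = [
        [(scn, thread_id, change) for scn, change in records]
        for thread_id, records in threads.items()
    ]
    for scn, thread_id, change in heapq.merge(*streams, key=lambda r: r[0]):
        if scn > file_checkpoint_scn:
            yield scn, thread_id, change


if __name__ == "__main__":
    threads = {
        1: [(101, "A=1"), (105, "A=3")],
        2: [(95, "A=0"), (103, "A=2")],
    }
    for record in merge_redo_by_scn(threads, file_checkpoint_scn=100):
        print(record)
```

In this example, block A is changed by both threads after the file checkpoint. Applying thread 1 in full and then thread 2 would leave the block at `A=2`, whereas the merged order ends with the latest change, `A=3`, which is why a restored backup cannot be recovered one thread at a time.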

Even if crash recovery used the thread-merged algorithm, it could not recover a backup, even one whose data file checkpoint matches the database checkpoint. The reason is that crash recovery starts each thread at its own thread checkpoint, so it would miss the redo that lies between the database checkpoint and the higher thread checkpoints. Media recovery, in contrast, starts applying redo at the data file checkpoint. Crash recovery could fail even if it did start applying redo at the data file checkpoint, because it looks only in the online redo logs, and the threads may already have archived and reused the logs that contain the required redo.

The STARTUP RECOVER command can be used when crash recovery fails because a data file requires media recovery. It automatically runs RECOVER DATABASE to perform media recovery before the database is opened.

5.4 Checkpoints

Instance recovery does not attempt to apply redo that precedes a data file's checkpoint. (The checkpoint SCN in the data file header is not used to decide whether instance recovery is required.)

Instance recovery reads the redo log from the thread checkpoint to the end of the thread and finds the highest SCN allocated by the thread. This SCN is used to close the thread and advance the thread checkpoint. The data file checkpoints and checkpoint counters are also advanced once instance recovery is complete.
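The bookkeeping at the end of a thread's recovery can be sketched as follows. The dictionary layout, the function name, and in particular the choice to advance each data file checkpoint to the thread's highest SCN are assumptions made for this example; the text only says that these values are advanced.

```python
def finish_thread_recovery(thread, datafiles):
    """Close a recovered thread and advance the related checkpoints.

    `thread` and each entry of `datafiles` are plain dictionaries used as
    hypothetical stand-ins for control file and data file header state.
    """
    redo_after_checkpoint = [
        scn for scn in thread["redo_scns"] if scn > thread["checkpoint_scn"]
    ]
    # Highest SCN allocated by the thread (falls back to the old checkpoint).
    highest_scn = max(redo_after_checkpoint, default=thread["checkpoint_scn"])

    thread["checkpoint_scn"] = highest_scn  # advance the thread checkpoint
    thread["open"] = False                  # close the thread

    for datafile in datafiles:
        # Assumed rule: move the file checkpoint forward, never backward.
        datafile["checkpoint_scn"] = max(datafile["checkpoint_scn"], highest_scn)
        datafile["checkpoint_counter"] += 1
    return highest_scn


if __name__ == "__main__":
    thread = {"checkpoint_scn": 100, "redo_scns": [90, 101, 130, 120], "open": True}
    datafiles = [{"checkpoint_scn": 95, "checkpoint_counter": 7}]
    print(finish_thread_recovery(thread, datafiles), thread, datafiles)
```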

5.5 Crash Recovery Completion

When crash recovery completes, the fuzzy bits of all data files (the online fuzzy bit, the hot backup fuzzy bit, and the media recovery fuzzy bit) are cleared, and a special redo record marking the end of crash recovery is written to the redo log. Media recovery uses this record to determine when it may clear the online fuzzy bit and the hot backup fuzzy bit of a data file.
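A minimal sketch of this final step, assuming the three fuzzy bits can be modeled as flags and the redo log as a simple list. The `Fuzzy` enum, the `END_CRASH_RECOVERY` marker, and the function name are hypothetical and do not correspond to actual Oracle record formats.

```python
from enum import Flag, auto


class Fuzzy(Flag):
    """Hypothetical model of the fuzzy bits kept for a data file."""
    ONLINE = auto()
    HOT_BACKUP = auto()
    MEDIA_RECOVERY = auto()


END_CRASH_RECOVERY = "END_CRASH_RECOVERY"  # hypothetical marker record


def complete_crash_recovery(datafile_bits, redo_log):
    """Clear every fuzzy bit on every data file and log the end marker."""
    for name in datafile_bits:
        datafile_bits[name] = Fuzzy(0)  # no fuzzy bits remain set
    redo_log.append(END_CRASH_RECOVERY)


if __name__ == "__main__":
    bits = {
        "system01.dbf": Fuzzy.ONLINE,
        "users01.dbf": Fuzzy.ONLINE | Fuzzy.HOT_BACKUP,
    }
    redo_log = []
    complete_crash_recovery(bits, redo_log)
    print(bits, redo_log)
```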


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
