Oracle Recovery Internal Principles (Various Recovery Features)

10.1 Parallel Recovery (v7.1)

The goal of parallel recovery is to use parallel computation and I/O to reduce the time required for crash recovery, single-instance recovery, and media recovery. When multiple data files on multiple disks are recovered at the same time, recovery time is effectively reduced.

10.1.1 Parallel Recovery Architecture

Parallel recovery divides the work into two tasks:

1. Read the redo log.

2. Apply the change vectors.

Task 1 is inherently serial: the redo logs must be read in order and, in media recovery, the redo from multiple threads must be merged. This task is therefore performed by a single process, the redo log reader.

Task 2 is well suited to parallelism, so the work of applying change vectors is delegated to a group of recovery slave processes. The log-reading process distributes change vectors to the slaves using the same IPC (inter-process communication) mechanism used in parallel query. Change vectors are distributed by a hash function on the data block address, so each recovery slave handles only the change vectors that fall into its bucket. A slave reads the affected data block into the cache and checks whether the change vector needs to be applied.

This architecture parallelizes both data block read I/O and change vector application. It allows log read I/O and data block read I/O to proceed concurrently, and it allows data block reads in different hash buckets to proceed in parallel. As long as the benefit of parallelism outweighs the cost of process management and communication, recovery time is effectively reduced.

10.1.2 Parallel Recovery Initialization Parameters

PARALLEL_RECOVERY_MAX_THREADS

PARALLEL_RECOVERY_MIN_THREADS

These two parameters control the number of recovery slave processes used during crash recovery or media recovery.

PARALLEL_INSTANCE_RECOVERY_THREADS

This parameter controls the number of recovery slave processes used during instance recovery.
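As a minimal sketch, the parameters above would be set in the initialization file. The parameter names are taken from the text; exact spellings and defaults may differ across Oracle versions, and the values shown here are illustrative only:

```
# init.ora fragment -- illustrative values, parameter names as given above
PARALLEL_RECOVERY_MAX_THREADS = 8
PARALLEL_RECOVERY_MIN_THREADS = 2
PARALLEL_INSTANCE_RECOVERY_THREADS = 4
```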

10.1.3 Media Recovery Syntax Changes

The RECOVER DATABASE command adds an optional clause specifying the number of recovery slave processes. If specified, it overrides the default taken from the PARALLEL_RECOVERY_MAX_THREADS parameter.

The RECOVER TABLESPACE command adds an optional clause specifying the number of recovery slave processes. If specified, it overrides the default taken from the PARALLEL_RECOVERY_MIN_THREADS parameter.

The RECOVER DATAFILE command adds an optional clause specifying the number of recovery slave processes. If specified, it overrides the default taken from the PARALLEL_RECOVERY_MIN_THREADS parameter.
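A hedged sketch of the syntax: the exact shape of the parallel clause varies across Oracle versions (later releases accept a bare `PARALLEL n`), and the degree, tablespace name, and file path below are illustrative:

```sql
-- Parallel media recovery; degree of 4 slaves is illustrative.
RECOVER DATABASE PARALLEL (DEGREE 4);
RECOVER TABLESPACE users PARALLEL (DEGREE 2);
RECOVER DATAFILE '/u01/oradata/db/users01.dbf' PARALLEL (DEGREE 2);
```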

10.2 Redo Log Checksums (v7.2)

Log checksums make it possible to detect corrupted log blocks before a log is archived. The goal is to prevent corrupted logs from being copied (archived). The feature works together with a new command, CLEAR LOGFILE, which clears a corrupted online log without archiving it.

A new initialization parameter, LOG_BLOCK_CHECKSUM, controls whether log checksums are enabled. If it is set, a checksum is computed and placed in the header of each log block before the block is written from the log buffer to disk. The checksum is verified during archiving and during recovery. If a log block's checksum is wrong, the system tries to read the block from another member of the log group (if there are other members). If all members have an unrecoverable checksum error, the log read fails.
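A minimal initialization-file fragment, assuming the parameter takes a boolean value as described above:

```
# init.ora fragment -- enable redo log block checksums
LOG_BLOCK_CHECKSUM = TRUE
```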

If a log cannot be archived because of an unrecoverable checksum error, it can never be reused. Eventually log switching will stall waiting for it, and if nothing is done the database will halt. The CLEAR LOGFILE command provides a way to get past archiving such a log.

10.3 Clearing Logs (v7.2)

Suppose all members of an online log group are lost or corrupted (for example, by a checksum error or a media failure). Redo generation may still proceed normally until that log group must be reused, which becomes possible once the checkpoints of all threads have advanced past the log. Reuse fails, however, in the following cases:

1. The log cannot be archived because of a checksum error, so it can never be reused.

2. Log switching fails because the log is unavailable (for example, due to media failure); the log may or may not have been archived.

The ALTER DATABASE CLEAR LOGFILE command recovers an inactive log file (that is, one not needed for crash recovery) from the situations above. CLEAR LOGFILE discards the contents of an inactive log and reinitializes it, much like dropping the log file and then adding it back. In many cases this command avoids an unnecessary database shutdown or a resetlogs operation.
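The basic form of the command, sketched here with a hypothetical group number and file path:

```sql
-- Clear an inactive log group by group number ...
ALTER DATABASE CLEAR LOGFILE GROUP 2;

-- ... or by member file name (path is illustrative).
ALTER DATABASE CLEAR LOGFILE '/u01/oradata/db/redo02a.log';
```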

Note: CLEAR LOGFILE cannot be used to clear an online log that is needed for crash recovery (that is, a log in the "current" or "active" state). If such a log is corrupted, the database must be shut down abnormally, incomplete recovery performed, and the database opened with the resetlogs option.

The UNARCHIVED option allows clearing to proceed even if the log file has not been archived; DROP LOGFILE is not permitted in that situation. CLEAR LOGFILE is also allowed in the following situations:

1. The thread has only two log groups.

2. All members of the log group are lost or damaged because of media failure.

3. The log to be cleared is the current log of a closed thread.

DROP LOGFILE is not permitted in any of these cases.

Clearing an unarchived log file makes existing backups unusable, because the cleared log would be required to recover from them. We therefore recommend backing up the database immediately after clearing unarchived logs. In addition, if a data file was offline at the time and the cleared log is needed to bring it back online, the UNRECOVERABLE DATAFILE option must be added to the CLEAR LOGFILE command. The offline data file and its tablespace must then be dropped from the database, because the redo needed to recover the data file has been cleared and no backup of it exists.
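A sketch of the unarchived case, with an illustrative group number:

```sql
-- Clear a corrupted, unarchived log; existing backups become unusable,
-- so take a fresh backup of the database immediately afterwards.
ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP 3;

-- If an offline data file would need this log's redo to come back online,
-- the extra option below is required; that data file and its tablespace
-- must then be dropped.
ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP 3 UNRECOVERABLE DATAFILE;
```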

When executing CLEAR LOGFILE, the foreground process performs the following steps:

1. Verify that the log file is not needed for crash recovery and can be cleared.

2. Mark the log as "being cleared" and "no archiving required" in the control file, so that it cannot be chosen at a log switch.

3. Create the new log file and zero it with many writes (a potentially long operation).

4. Clear the "being cleared" flag.

If the foreground process dies while executing CLEAR LOGFILE, the log remains unusable; redo generation may stall and the database may eventually halt at a log switch, waiting for a checkpoint or for archiving to complete. If the process really did die mid-clear, re-execute the command; alternatively, drop the partially cleared log. CLEAR LOGFILE can also fail with an I/O error while zeroing the log; the remedy is to drop that log and add another in its place.

Series of articles: Oracle internal recovery principles.
