Oracle Data block corruption and recovery detailed

Last Update:2014-10-15 Source: Internet

Author: User

Tags dba system log

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

1. What is block corruption:

The so-called corrupted block of data is that the block is not in the recognizable Oracle format or its contents are internally inconsistent. Typically, corruption is caused by a hardware failure or an operating system problem. The Oracle database identifies the corrupted block as "logical corruption" or "media corruption." If it is a logical corruption, it is an Oracle internal error. When an inconsistency is detected by the Oracle database, the logically corrupted block is marked as corrupted. If the media is corrupted, the block is malformed, and the block read from disk does not contain meaningful information.

You can repair a damaged block of media by recovering the block, or by deleting the database object that contains the damaged block (or both). If the media corruption is caused by a hardware failure, the problem can be resolved completely only after a hardware failure has been fixed.

Whenever a read or write operation is performed on a block, the following consistency checks are performed:
--Block version

--the result of the DBA (block address) value in the cache compared to the DBA value in the block buffer

--Block checksum (if enabled)

The corrupted blocks are identified as the following categories:

--Media damage

--Logical (or software) damage

2. Block damage Symptom: ORA-01578

ORA-01578 error: "ORACLE data block corrupted (file #%s, Block #%s)":

-Generate this information when a corrupted block is found

--Always return relative file number and block number

--Return to the session that issued the query (the query executes when corruption is found)

--displayed in the Alert.log file

In general, ORA-01578 errors are caused by a hardware problem. If the ORA-01578 error always returns the same parameter, the most likely cause is a damaged block medium.
If the parameters that are returned change every time, there may be a hardware problem. You should check the memory and page space and check the I/O subsystem to find the controller in question.
Note: ORA-01578 returns the relative file number, but the accompanying ORA-01110 error displays the absolute file number.

3. How to Handle damage

--Check the alert log and operating system log files.

--Use the available diagnostic tools to find the type of damage.

-Run the check function multiple times to determine if the error persists.

--Recover the data from the damaged object, as needed.

--Solve the hardware problem:
Memory Bars,
Disk controller,
Disk

--Restore or restore data from the damaged object, as needed.

Always try to determine whether the error persists. Run the ANALYZE command multiple times, or, if possible, perform a shutdown and restart operation, and then try the previously failed operation again. Find out if there are any other damage. If a damaged block is found, there may be other damaged blocks.

Hardware failures must be resolved immediately. When you experience a hardware problem, you should contact the vendor to continue working after you have checked and repaired your computer. You should run a full hardware diagnostic session at this time.

There may be many types of hardware failures:
--I/O hardware or firmware failure

--Operating system
--I/O or cache issues

--Memory or paging issues

--Disk Repair Utility

4. Real-time Verification block integrity: db_block_checking:

Database block checking can be enabled by setting the db_block_checking initialization parameter to TRUE. Whenever a block or index block is modified, the check checks the internal consistency of the block and the index block. Db_block_checking is a dynamic parameter that can be modified using the ALTER SYSTEM SET statement. Block checking is always enabled for system table spaces. Block checking typically results in a 1% to 10% overhead, depending on the workload. The more updates or inserts you are performing, the higher the overhead of performing a block check. The db_block_checking has the following four possible values:

--off: No block checks are performed on all tablespaces except SYSTEM.

--low: Performs a basic block check after the contents of the blocks in memory change (for example, after an UPDATE or INSERT statement is executed and after the disk is read).

--medium: Performs all low checks and performs a semantic block check on all table blocks that are not organized by index.

--full: Performs all low and MEDIUM checks and performs a semantic check on the index block.

Initialize parameter db_block_checking:

--the degree to which the control check is handled when self-consistency checks are performed on each block

--Protects against memory and data corruption

--can be set using the Alter SESSION command or the ALTER SYSTEM DEFERRED command

5. Block Media recovery

In most cases, the first time a corruption is encountered, the database marks the block as media corruption, and then writes it to disk. It is not possible to perform any subsequent read operations on the block until it has been restored. Block recovery can only be performed on blocks marked as corrupted or not checked for damage. RMAN RECOVER can be used ... Block media recovery is performed by the blocks command. By default, RMAN searches for a good block copy in the Flashback log, and then searches for blocks in a full or 0-level incremental backup. If RMAN finds a good copy, it restores the replicas and performs a media recovery on the block. Block media recovery can only use redo logs for media recovery, not incremental backups.

The V$database_block_corruption view shows blocks marked as corrupted by database components such as RMAN commands, ANALYZE, DBV, SQL queries, and so on. This view increases the corresponding row for the following types of corruption:

-Physical/Media corruption: The database does not recognize blocks: Checksum is invalid, the block content is all zero, or the size is incomplete. By default, physical corruption checking is enabled.

--Logical corruption: the checksum of the block is valid, and the size and the end of the block match, but the content is inconsistent. Block Media Recovery cannot repair logical block corruption. By default, the logical corruption check is disabled. You can enable logical corruption checking by specifying the check LOGICAL option for the BACKUP, RESTORE, RECOVER, and VALIDATE commands.

Block Media recovery:
--Reduced mean recovery time (MTTR)

--Increase availability during media recovery

--Data files remain online during recovery

--only the blocks being restored are inaccessible

--Using RMAN RECOVER ... BLOCK command Invocation

--Restore blocks using flashback logs and full or level 0 backup

--performing media recovery using redo logs

--v$database_block_corruption view shows blocks marked as broken

6. Prerequisites for Block Media recovery

--The target database must be in ARCHIVELOG mode

--the backup of the data file containing the damaged block must be a full backup or a level 0 backup.

--To use proxy replicas, you must first restore them to a nondefault location
--rman can only be recovered using archived redo logs

--To use the flashback log, the flashback database must be enabled

The following prerequisites apply to RECOVER ... BLOCK command:

--The target database must be running in ARCHIVELOG mode and must be open or loaded with the current control file.

--A data file backup that contains a damaged block must be a full or level 0 backup and cannot be a proxy replica. If only proxy replica backups exist, they can be restored to a nondefault location on disk, in which case RMAN considers them to be copies of the data files, where blocks are searched during block media recovery.

--rman can only be recovered using archived redo logs. RMAN cannot use a Level 1 incremental backup. Block media recovery cannot recover lost or inaccessible archive redo logs, but can sometimes recover lost redo records.

--The Flashback database must be enabled on the target database so that RMAN can search the flashback log for a good copy of the corrupted block. If Flashback event logging is enabled and this event records an older but undamaged version of the corrupted block, RMAN can use these blocks, which may increase the speed of recovery.

7.RECOVER ... BLOCK command

--Determine the backup that contains the block to be recovered

--Read the backup and accumulate the requested block into the memory buffer

--Manage the Block media recovery session by reading the archive log from the backup, if necessary

RECOVER datafile 6 BLOCK 3; Recover a single block

RECOVER RECOVER Multiple blocks

DataFile 2 BLOCK in multiple data files

DataFile 2 BLOCK 79

DataFile 6 BLOCK 183;

RECOVER corruption LIST; Recover all blocks logged in V$database_block_corruption

To recover a single block:

Before a block recovery, you must determine the damaged block. In general, block corruption is reported in the following locations:

--list FAILURE, VALIDATE or BACKUP ... Results of the VALIDATE command
--v$database_block_corruption View

--error message in standard output

--Alert log files and user trace files (identified in the V$diag_info view)
--sql results of the ANALYZE TABLE and ANALYZE INDEX commands
--dbverify Results of the utility program

For example, the following message may be found in the user trace file:

Ora-01578:oracle data Block Corrupted (file # 7, Block # 3)

Ora-01110:data file 7: '/ORACLE/ORADATA/ORCL/TOOLS01.DBF '

Ora-01578:oracle data Block corrupted (file # 2, Block # 235)

Ora-01110:data file 2: '/ORACLE/ORADATA/ORCL/UNDOTBS01.DBF '

--After determining the block, run RECOVER at the RMAN prompt ... The Block command, which specifies the file number and block number of the damaged block.

RECOVER

DataFile 7 BLOCK 3

DataFile 2 BLOCK 235;

Oracle Data block corruption and recovery detailed

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More