Oracle block damage and recovery
1. What is block damage:
Damaged data blocks refer to data blocks that are not in identifiable Oracle format or are inconsistent internally. In general, damage is caused by hardware faults or operating system problems. The Oracle Database identifies the damaged block as "logical damage" or "Media damage ". If the logic is corrupted, it is an Oracle internal error. After the Oracle database detects inconsistency, it marks the logically damaged block as corrupt. If the media is damaged, the block format is incorrect. The block read from the disk does not contain meaningful information.
You can restore a block or delete a database object that contains a damaged block (or both) to repair the damaged block. If the medium is damaged due to a hardware fault, the problem can be completely solved only after the hardware fault is fixed.
As long as you perform read or write operations on the block, the following consistency check is performed:
-- Block version
-- Result of comparing the DBA (data block address) value in the cache with the DBA value in the block buffer
-- Block checksum (if enabled)
Damaged blocks are identified in the following categories:
-- Medium damage
-- Logic (or software) Corruption
2. Block damage fault phenomenon: ORA-01578
ORA-01578 error: "ORACLE data block was upted (file # % s, block # % s )":
-- This information is generated when damaged data blocks are found.
-- Always returns the relative file number and block number
-- Returns the session to which the query is sent (this query is executed when corruption is found)
-- Display in the alert. log File
Typically, ORA-01578 errors are caused by hardware problems. If the same parameter is always returned for a ORA-01578 error, the most likely cause is Block Media corruption.
If the returned parameters change each time, there may be hardware problems. The memory and page space should be checked and the I/O subsystem should be checked to find problematic controllers.
Note: The ORA-01578 returns the relative file number, but the resulting ORA-01110 error displays the absolute file number.
3. How to Handle corruption
-- Check the warning log and operating system log files.
-- Use an available diagnostic tool to identify the damage type.
-- Run the check function multiple times to check whether the error persists.
-- Restore data from damaged objects as needed.
-- Solve hardware problems:
Memory stick,
Disk controller,
Disk
-- Recover or restore data from damaged objects as needed.
Always try to determine whether the error persists. Run the ANALYZE command multiple times. If possible, you can disable the operation and start the operation again, and then try the operation that experienced an earlier fault again. Check for any other damages. If a damaged block is found, other damaged blocks may exist.
The hardware fault must be resolved immediately. In case of hardware problems, contact the supplier and continue working after checking and fixing the computer. In this case, a Full Hardware diagnostic session should be run.
There may be many types of hardware faults:
-- I/O hardware or firmware faults
-- Operating System
-- I/O or high-speed cache Problems
-- Memory or paging Problems
-- Disk repair Utility
4. Real-time verification of block Integrity: DB_BLOCK_CHECKING:
You can enable database block check by setting DB_BLOCK_CHECKING initialization parameter to TRUE. As long as the data block or index block is modified, This check checks the internal consistency between the data block and the index block. DB_BLOCK_CHECKING is a dynamic parameter that can be modified using the alter system set statement. Block check is always enabled for system tablespace. Block checks usually produce 1% to 10% overhead, depending on the workload. The more update or insert operations being performed, the higher the overhead of block check execution. DB_BLOCK_CHECKING has the following four possible values:
-- OFF: All tablespaces except SYSTEM do not perform block checks.
-- LOW: After the block content in the memory is changed (for example, after the UPDATE or INSERT statement is executed and the disk is read), the basic block header check is executed.
-- MEDIUM: performs all LOW checks and performs semantic block checks on all table blocks that are not organized by indexes.
-- FULL: perform all LOW and MEDIUM checks to perform semantic checks on index blocks.
Initialization parameter DB_BLOCK_CHECKING:
-- Control the degree of processing when performing a self-consistency check on each block
-- Prevents memory and data corruption
-- You can use the alter session command or the alter system deferred command to set
5. Block Media recovery
In most cases, when a block is damaged for the first time, the database marks it as a medium and writes it to the disk. You cannot perform any subsequent read operations on the block before it is restored. You can only perform block recovery for blocks marked as damaged or that have not passed the damage check. You can use the rman recover... BLOCK command to restore the BLOCK Media. By default, RMAN searches for a block copy in the flash back log, and then searches for the block in the full backup or 0-level Incremental backup. If RMAN finds a good copy, the copies are restored and the block is restored. Block Media recovery can only use the redo log for media recovery, but cannot use Incremental backup.
The V $ DATABASE_BLOCK_CORRUPTION view displays damaged blocks marked by database components (such as RMAN commands, ANALYZE, dbv, and SQL queries. This view will add rows if the following types are damaged:
-- Physical/medium damage: the database cannot identify blocks: the checksum is invalid, the block content is all zero, or the block header is incomplete. By default, the physical damage check is enabled.
-- Logic corruption: the block checksum is valid. The block header matches the block end, but the content is inconsistent. Block Media recovery cannot repair Logical Block damage. By default, the logical corruption check is disabled. You can enable the LOGICAL corruption CHECK by specifying the check logical option of the BACKUP, RESTORE, RECOVER, and VALIDATE commands.
Block Media recovery:
-- MTTR)
-- Improve availability during media recovery
-- Data files remain online during recovery
-- Only the block being restored is inaccessible.
-- Use the rman recover... BLOCK command to call
-- Use flashback logs and full backup or zero-level backup to restore Blocks
-- Use the redo log to execute media Restoration
-- V $ database_block_partition uption view displays damaged blocks
6. Prerequisites for Block Media recovery
-- The target database must be in ARCHIVELOG mode.
-- The backup of data files containing damaged blocks must be full backup or zero-level backup.
-- To use proxy copies, you must first restore them to a non-default location
-- RMAN can only use archived redo logs for restoration.
-- To use the flashback log, you must enable the flashback database.
The following prerequisites apply to the RECOVER... BLOCK command:
-- The target database must run in ARCHIVELOG mode and must be opened or loaded using the current control file.
-- Backup of data files containing damaged blocks must be either full backup or zero-level backup, not a proxy copy. If only proxy copies are backed up, you can restore them to a non-default location on the disk. In this case, RMAN considers them as copies of data files, when the block media is restored, the block is searched.
-- RMAN can only use archived redo logs for restoration. RMAN cannot use level 1 Incremental backup. Block Media restoration cannot restore lost or inaccessible archived redo logs, but sometimes it can restore lost redo records.
-- Enable flashback on the target database so that RMAN can search for copies of damaged blocks in the flashback log. If a flashback event record is enabled and the event record contains older but not corrupted versions of the damaged block, RMAN can use these blocks, which may increase the recovery speed.
7. RECOVER... BLOCK command
-- Confirm the backup of the block to be restored.
-- Read the backup and accumulate the requested block into the memory buffer.
-- If necessary, you can manage Block Media recovery sessions by reading archived logs from the backup
Recover datafile 6 BLOCK 3; Recover a single block
RECOVER Recover multiple blocks
DATAFILE 2 BLOCK 43 in multiple data files
DATAFILE 2 BLOCK 79
DATAFILE 6 BLOCK 183;
RECOVER upload uption LIST; Recover all blocks logged in V $ database_block_reply uption
Recover a single block:
Before resuming block recovery, you must determine the damaged block. In general, block damage is reported in the following locations:
-- LIST the result of the FAILURE, VALIDATE, or BACKUP... VALIDATE command
-- V $ DATABASE_BLOCK_CORRUPTION View
-- Error messages in standard output
-- Alert Log File and user trace file (identified in V $ DIAG_INFO view)
-- SQL ANALYZE TABLE and ANALYZE INDEX Command results
-- Result of DBVERIFY Utility
For example, the following message may be found in the User tracking file:
ORA-01578: ORACLE data block upted (file #7, block #3)
ORA-01110: data file 7: '/oracle/oradata/orcl/tools01.dbf'
ORA-01578: ORACLE data block upted (file #2, block #235)
ORA-01110: data file 2: '/oracle/oradata/orcl/undotbs01.dbf'
-- After confirming the BLOCK, run the RECOVER... BLOCK command at the RMAN prompt to specify the file number and BLOCK number of the damaged BLOCK.
RECOVER
DATAFILE 7 BLOCK 3
DATAFILE 2 BLOCK 235;
8. Use the 10231 event for processing:
(If the block is damaged but there is no backup, there is no way to reply)
Run the following command in sqlplus:
Alter system set events = '10231 trace name context forever, level 10 ';
Then export the table:
Exp test/test file = t. dmp tables = t;
Delete the table from the database
Drop table t;
Then Import
Imp test/test file = t. dmp tables = t;
Finally, close the 10231 event:
Alter system set events = '10231 trace name context off ';