One quiet afternoon, an AIX small machine with an ORACLE database in the test environment was shut down due to an unexpected power failure. In the test environment, an engineer was assigned to solve the problem. Specifically, the server was restarted, it is found that the/app directory of oracle is not mounted. Then we fixed it through smitty fs, mounted the app, and started oracle.
The system logs of system.txt (obtained through errpt-a), alert_soa.log, and oracle trace logs trc are collected. The trc logs are analyzed as follows:
/App/oracle/product/10.2.0/admin/soa/bdump/soa_mmon_307366.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0-64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME =/app/oracle/product/10.2.0
System name: AIX
Node name: data2
Release: 3
Version: 5
Machine: 00CE993C4C00
Instance name: soa
Redo thread mounted by this instance: 1
Oracle process number: 11
Unix process pid: 307366, image: oracle @ data2 (MMON)
* ** 14:06:10. 308
* ** Service name :( SYS $ BACKGROUND) 14:06:10. 212
* ** Session id: (161.1) 14:06:10. 212
Hex dump of (file 3, block 49259)
Dump of memory from 0x07000000C5934000 to 0x07000000C5936000
700347c5934000 06A20000 00C0C06B 0178F614 00000104 [...... k. x ......]
700347c5934010 45A30000 010A0025 116224d 0178F614 [E ...... %... "M. x ..]
700w.c5934020 00000000 1F023200 00C0C069 00010003 [...... 2 ...... I ......]
Observe the other two files, found that there are more ORA-1578 ERROR and disk operation error.
Analysis: Generally, a shared disk is sometimes not mounted when the CLUSTER is switched to a dual-host, accidentally powered off, or in other cases. You need to use FSCK to repair the shared disk and then MOUNT it. after the restoration is completed, the database can be directly started; otherwise, "data blocks are damaged and the database cannot be started" will be reported during the database startup process. At this time, we can detect and fix Errors Based on different data block damage types and determine solutions to the problems.
I. Causes of data block corruption:
1. Hardware problems: disk controller problems or disk faults)
2. Physical block damage is usually caused by the previous cause)
3. The logical data block is damaged.
Ii. Analysis of Bad blocks:
Oracle Data blocks have fixed formats and structures, which are divided into three layers: Cache layer, Transaction layer, and Data layer.
Check the consistency when performing read/write operations on data blocks:
-Block type
-DBA
-Scn
-Header and tail
When inconsistency is found, it is marked as a bad block. There are two types of Bad blocks: Physical Bad blocks and logical Bad blocks. Impact of Bad blocks: Data Dictionary tables, rollback segment tables, temporary segments, and user data tables and indexes.
3. Determine the Fault Cause and corresponding solutions:
1. View alert. there are no other ORA-errors in the log file. If an error is reported to a file of different disks, it is a problem with the disk controller. For details, see V $ DATAFILE, to check which files are under this controller, you need to check whether the disk controller (usually the controller has two control A and control B) is normal.
2. If an error is reported to different files on the same disk, It is a disk issue. Check whether the disk has an alarm or LVM has an error.
3. If you point to the same file on the same disk, you can run the following statement to find the file name:
SELECT SEGMENT_NAME, SEGMENT_TYPE FROM DBA_EXTENTS WHERE FILE_ID = <file number> AND <block number> BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS-1;
The file number and block number can be found in the error log. If the query continuously points to a table or index, You can recreate them.
4. If the file is a SYSTEM tablespace or in NOARCHIVELOG mode, when the database is still running, EXP exports all the data, reconstructs the database, and then IMP injects the new database.
5. If the database is in ARCHIVELOG mode, you can use DBV to verify the Bad blocks, use RMAN to fix the Bad blocks, and then start the database.
Or another solution
Shut down the database. If the database cannot be closed, take the corresponding data file offline:
Alter database datafile 'file' OFFLINE;
Try to copy the data file to another disk. If the copy fails, the file will be lost.
Then startup mount;
Rename the data file to the name of the file successfully copied to another disk.
Alter database rename file 'old path filename 'TO 'new path filename ';
Alter database open;
Recover datafile file name;
Alter database datafile 'file name' ONLINE;
Iv. solutions in this example
In this case, the database has backup and archive and backup is available, so use the rman command to fix Bad blocks.
Check Bad blocks with DBV first
$ Show parameter db_block_size
$ Select BYTES/2048 from v $ datafile where FILE # = 3;
$ Dbv file =/app/oracle/product/10.2.0/oradata/soa/user01.dbf blocksize = 8192
$ Rman target/
Recovery MANAGER: Release 10.2.0.1.0-Production on Friday Aug 17 15:07:14 2013
Copyright (c) 1982,200 5, Oracle. All rights reserved.
Connect to the target database: soa (DBID = 1281151392)
RMAN> blockrecover datafile 3 blocks 49259;
Start blockrecover
Use the target database control file to replace the recovery directory
Allocated channel: ORA_DISK_1
Channel ORA_DISK_1: sid = 187 devtype = DISK
Channel ORA_DISK_1: Recovering Blocks
Channel ORA_DISK_1: Specifies the block to be restored from the backup set.
Recovering data file 049259 Blocks
Channel ORA_DISK_1: Reading the backup segment ORACLE \ FLASH_RECOVERY_AREA \ DB01 \ BACKUPSET
\ 2013_02_28 \ O1_MF_NNNDF_TAG201302287_3 \ YCS579G_.BKP
Channel ORA_DISK_1: recovered from backup segment 1
Channel ORA_DISK_1: block Recovery completed, time: 00:00:02
Restoring media...
Media Recovery completed. Time: 00:00:05
Blockrecover completed in 1-3-13
RMAN> exit
The recovery manager is complete.
SQL> select count (*) from buffer. t;
COUNT (*)
----------
3298
After the damaged block is repaired, the v $ database_block_uption will not be updated. It needs to be updated at the next backup.
SQL> select * from v $ database_block_corruption;
FILE # BLOCK # BLOCKS upload uption_change # upload uptio
---------------------------------------------------------
3, 49259, 1, 0, CHECKSUM
$ Rman target/
Recovery MANAGER: Release 10.2.0.1.0-Production on Sunday, September 11, March 1 16:09:43 2013
Copyright (c) 1982,200 5, Oracle. All rights reserved.
Connect to the target database: soa (DBID = 1281151392)
RMAN> backup validate datafile 3;
Start backup
Use the target database control file to replace the recovery directory
Allocated channel: ORA_DISK_1
Channel ORA_DISK_1: sid = 132 devtype = DISK
Channel ORA_DISK_1: Start all data file backup Sets
Channel ORA_DISK_1: Specifies the data file in the backup set.
Channel ORA_DISK_1: Backup set completed. elapsed time: 00: 00: 03
Backup completed in 1-3-13
RMAN> exit
The recovery manager is complete.
SQL> select * from v $ database_block_corruption;
Unselected row
Note: If the database is not backed up, you can use the dbms_repair package to remedy the problem, but the database will be lost.
So far, the database recovery is complete and the restart operation is normal.
This article is from the "dripping water and stone" blog and will not be reposted!