This is the data recovery case for a customer we've been working on recently, and the customer environment is windows,oracle version 9201. The database could not be opened because of a disk failure. After understanding, the customer in June on the fault, during the search for a number of manufacturers to recover, have not solved the problem. Finally, the customer recovered the business through an exp backup that was early in the morning, although the business is now normal, but the data for six months, so still need to recover.
Let's take a look at the alert log information for the database, as follows:
?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21st
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
Tue May 26 18:00:18 2015
Database Characterset is Us7ascii
Replication_dependency_tracking turned off (no async multimaster replication found)
Completed:alter DATABASE OPEN
Wed May 27 07:59:36 2015
Thread 1 advanced to log sequence 1059
Current log# 3 seq# 1059 mem# 0:d:\oracle\oradata\sjyykf\redo03. LOG
Wed May 27 09:39:21 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_5276.trc:
Wed May 27 09:49:09 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_5212.trc:
Ora-00600:internal error code, arguments: [QERRMOFBU1], [1013], [], [], [], [], [], []
Wed May 27 09:49:11 2015
Error 3120 trapped in 2PC on transaction 3.44.42846. Cleaning up.
Error Stack returned to User:
Ora-03120:two-task Conversion Routine:integer Overflow
Ora-02063:preceding Line from Ctais
Wed May 27 10:16:49 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_5212.trc:
Ora-00600:internal error code, arguments: [729], [12224], [space leak], [], [], [], [], []
Wed May 27 10:16:49 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_5212.trc:
Ora-00600:internal error code, arguments: [Ncodeffsheap], [], [], [], [], [], [], []
Ora-00600:internal error code, arguments: [729], [12224], [space leak], [], [], [], [], []
Wed May 27 11:40:16 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_2640.trc:
Ora-24343:user defined callback Error
Ora-02063:preceding Line from Ctais
Wed May 27 11:48:35 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_6096.trc:
Ora-24343:user defined callback Error
Ora-02063:preceding Line from Ctais
Wed May 27 13:57:57 2015
Error 2068 trapped in 2PC on transaction 7.42.40674. Cleaning up.
Error Stack returned to User:
Ora-02068:following Severe error from Ctais
Ora-03113:end-of-file on communication channel
Wed May 27 17:42:48 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_5604.trc:
Ora-02068:following Severe error from Ctais
Ora-03113:end-of-file on communication channel
。。。。。
Wed May 27 18:00:18 2015
Database Characterset is Us7ascii
Replication_dependency_tracking turned off (no async multimaster replication found)
Completed:alter DATABASE OPEN
Thu May 28 06:29:49 2015
Thread 1 advanced to log sequence 1060
Current log# 1 seq# 1060 mem# 0:d:\oracle\oradata\sjyykf\redo01. LOG
Thu May 28 10:59:05 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_5364.trc:
Ora-00600:internal error code, arguments: [QERRMOFBU1], [1013], [], [], [], [], [], []
Thu May 28 11:28:03 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_3340.trc:
Ora-00600:internal error code, arguments: [QERRMOFBU1], [1013], [], [], [], [], [], []
。。。。。
。。。。。
Mon June 01 14:43:50 2015
Error 2068 trapped in 2PC on transaction 6.24.44274. Cleaning up.
Error Stack returned to User:
Ora-02068:following Severe error from Ctais
Ora-03113:end-of-file on communication channel
Mon June 01 16:52:00 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_5424.trc:
Ora-00600:internal error code, arguments: [QERRMOFBU1], [1013], [], [], [], [], [], []
Mon June 01 16:52:01 2015
Error 3106 trapped in 2PC on transaction 1.25.40549. Cleaning up.
Error Stack returned to User:
Ora-03106:fatal two-task Communication protocol error
Ora-02063:preceding Line from Ctais
Tue June 02 02:52:39 2015
Thread 1 advanced to log sequence 1064
Current log# 2 seq# 1064 mem# 0:d:\oracle\oradata\sjyykf\redo02. LOG
Tue June 02 09:10:36 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_4596.trc:
Ora-02068:following Severe error from Ctais
Ora-03113:end-of-file on communication channel
Tue June 02 09:16:01 2015
Error 2068 trapped in 2PC on transaction 6.22.44357. Cleaning up.
Error Stack returned to User:
Ora-02068:following Severe error from Ctais
Ora-03113:end-of-file on communication channel
Shutting down Instance:further logons disabled
Shutting down instance (immediate)
From the above information can be seen, the customer's environment in fact, in May there have been many problems, alert log reported a lot of errors, which about ORA-00600 [QERRMOFBU1] This error, according to the MoS document, most likely the 9201 version of the bug. In addition, we can see that there is a ORA-00600 [729] error, which is usually related to memory leaks, according to which the customer's database environment may be less memory.
It turns out that the back to the scene to understand the situation found that the database environment in the virtual machine environment, the CPU and memory are relatively small
Let's continue to look at the most recent breakdown, as follows:
Tue June 02 18:00:21 2015
Smon:enabling Cache Recovery
Tue June 02 18:00:22 2015
Undo Segment 1 onlined
Undo Segment 2 onlined
Undo Segment 3 onlined
Undo Segment 4 onlined
Undo Segment 5 onlined
Undo Segment 6 onlined
Undo Segment 7 onlined
Undo Segment 8 onlined
Undo Segment 9 onlined
Undo Segment onlined
Successfully onlined Undo tablespace 1.
Tue June 02 18:00:22 2015
smon:enabling TX Recovery
Tue June 02 18:00:22 2015
Database Characterset is Us7ascii
Replication_dependency_tracking turned off (no async multimaster replication found)
Completed:alter DATABASE OPEN
***
Corrupt block relative dba:0x01058531 (file 4, block 361777)
Fractured block found during buffer read
Data in bad block-
Type:6 Format:2 rdba:0x01058531
Last Change scn:0x0d21.3c28cf0a seq:0x2 flg:0x04
Consistency value in tail:0x480a0601
Check value in blocks header:0xd46e, computed block checksum:0x8703
spare1:0x0, spare2:0x0, spare3:0x0
***
Reread of rdba:0x01058531 (file 4, block 361777) found same corrupted data
***
Corrupt block relative dba:0x010683c9 (file 4, block 426953)
Bad check value found during buffer read
Data in bad block-
Type:6 Format:2 RDBA:0X010683C9
Last Change scn:0x0d21.b84eab23 seq:0x2 flg:0x04
Consistency value in tail:0xab230602
Check value in blocks header:0xca31, computed block checksum:0x3454
spare1:0x0, spare2:0x0, spare3:0x0
***
Reread of rdba:0x010683c9 (file 4, block 426953) found same corrupted data
***
Corrupt block relative dba:0x010fdf29 (file 4, block 1040169)
Bad check value found during buffer read
Data in bad block-
Type:6 Format:2 rdba:0x010fdf29
Last Change scn:0x0d21.b851943f seq:0x2 flg:0x04
Consistency value in tail:0x943f0602
Check value in blocks header:0x92bd, computed block checksum:0x300a
spare1:0x0, spare2:0x0, spare3:0x0
***
Reread of rdba:0x010fdf29 (file 4, block 1040169) found same corrupted data
***
Corrupt block relative dba:0x010a4782 (file 4, block 673666)
Fractured block found during buffer read
Data in bad block-
Type:6 Format:2 rdba:0x010a4782
Last Change Scn:0x0d21.4c552c11 seq:0x2 flg:0x04
Consistency value in tail:0xeb550602
Check value in blocks header:0xb524, computed block checksum:0xc744
spare1:0x0, spare2:0x0, spare3:0x0
***
Reread of rdba:0x010a4782 (file 4, block 673666) found same corrupted data
Wed June 03 05:22:16 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_4916.trc:
Ora-00600:internal error code, arguments: [Rworupo.1], [2560], [60], [], [], [], [], []
***
Corrupt block relative dba:0x01051131 (file 4, block 332081)
Fractured block found during buffer read
Data in bad block-
Type:6 Format:2 rdba:0x01051131
Last Change Scn:0x0d21.3c252e20 seq:0x2 flg:0x04
Consistency value in tail:0x32840602
Check value in blocks header:0xcfd5, computed block CHECKSUM:0X1CA4
spare1:0x0, spare2:0x0, spare3:0x0
***
Reread of rdba:0x01051131 (file 4, block 332081) found same corrupted data
Wed June 03 05:22:56 2015
Thread 1 advanced to log sequence 1065
Current log# 3 seq# 1065 mem# 0:d:\oracle\oradata\sjyykf\redo03. LOG
Wed June 03 07:12:17 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_4748.trc:
Ora-00600:internal error code, arguments: [15851], [8], [8], [1], [2], [], [], []
Wed June 03 10:00:55 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_2120.trc:
***
Corrupt block relative dba:0x011e6669 (file 4, block 1992297)
Bad check value found during buffer read
Data in bad block-
Type:6 Format:2 rdba:0x011e6669
Last Change SCN:0X0D21.E670ECBC seq:0x2 flg:0x04
Consistency value in tail:0xecbc0602
Check value in blocks header:0xc93, computed block checksum:0x6090
spare1:0x0, spare2:0x0, spare3:0x0
***
Reread of rdba:0x011e6669 (file 4, block 1992297) found same corrupted data
Wed June 03 10:15:14 2015
Shutting down Instance:further logons disabled
Starting ORACLE instance (normal)
Dump file D:\oracle\admin\sjyykf\bdump\alert_sjyykf.log
Wed June 03 15:28:24 2015
ORACLE v9.2.0.1.0-production vsnsta=0
Vsnsql=12 vsnxtr=3
Windows Version 5.2 Service Pack 2, CPU type 586
Wed June 03 15:28:24 2015
Starting ORACLE instance (normal)
。。。。。。
。。。。。。
Wed June 03 17:48:25 2015
Database mounted in Exclusive Mode.
Completed:alter DATABASE MOUNT
Wed June 03 17:48:51 2015
ALTER DATABASE RECOVER datafile 2
Wed June 03 17:48:51 2015
Media Recovery Datafile:2
Media Recovery Start
warning! Recovering data file 2 from a fuzzy backup. It might be a online
Backup taken without entering the begin backup command.
Starting datafile 2 recovery in thread 1 sequence 1064
DataFile 2: ' D:\ORACLE\ORADATA\SJYYKF\UNDOTBS01. DBF '
Media Recovery Log
Recovery of Online Redo log:thread 1 Group 2 Seq 1064 Reading mem 0
mem# 0 errs 0:d:\oracle\oradata\sjyykf\redo02. LOG
***
Corrupt block relative dba:0x008002c9 (file 2, block 713)
Fractured block found during media/instance recovery
Data in bad block-
Type:2 Format:2 RDBA:0X008002C9
Last Change scn:0x0d21.4d3eaf6b seq:0x24 flg:0x04
Consistency value in tail:0x2a090207
Check value in blocks header:0x19e5, computed block checksum:0xfb49
spare1:0x0, spare2:0x0, spare3:0x0
***
Reread of RDBA:0X008002C9 (file 2, block 713) found same corrupted data
Errors with log.
Media Recovery failed with error 354
ORA-283 signalled During:alter DATABASE RECOVER datafile 2 ...
Wed June 03 17:50:41 2015
ALTER DATABASE DataFile 2 offline
Wed June 03 17:50:45 2015
ORA-1145 signalled During:alter database datafile 2 offline ...
Wed June 03 17:54:07 2015
ALTER DATABASE datafile 2 offline drop
Wed June 03 17:54:07 2015
Completed:alter database datafile 2 offline drop
Wed June 03 17:54:50 2015
ALTER DATABASE DataFile 2 offline
Wed June 03 17:54:50 2015
Completed:alter Database datafile 2 offline
Wed June 03 17:56:01 2015
ALTER DATABASE Open
Wed June 03 17:56:01 2015
Beginning crash recovery of 1 threads
Wed June 03 17:56:01 2015
Started the scan
Wed June 03 17:56:01 2015
Completed the scan
3030 Redo blocks read, 797 data blocks need
Wed June 03 17:56:01 2015
Started recovery at
Thread 1:logseq 1065, Block 2, SCN 3361.3097022559
Recovery of Online Redo log:thread 1 Group 3 Seq 1065 Reading Mem 0
mem# 0 errs 0:d:\oracle\oradata\sjyykf\redo03. LOG
***
Corrupt block relative dba:0x01108411 (file 4, block 1082385)
Fractured block found during crash/instance recovery
Data in bad block-
Type:32 Format:2 rdba:0x01108411
Last Change Scn:0x0d21.4caf8ef6 seq:0x2 flg:0x04
Consistency value in tail:0xa1c72001
Check value in blocks header:0x8274, computed block checksum:0x2f32
spare1:0x0, spare2:0x0, spare3:0x0
***
Reread of rdba:0x01108411 (file 4, block 1082385) found same corrupted data
***
Corrupt block relative Dba:0x0119bc11 (file 4, block 1686545)
Fractured block found during crash/instance recovery
Data in bad block-
Type:32 Format:2 Rdba:0x0119bc11
Last Change Scn:0x0d21.4cc759c7 seq:0x1 flg:0x04
Consistency value in tail:0xba332001
Check value in blocks HEADER:0XDC04, computed block checksum:0xe3f4
spare1:0x0, spare2:0x0, spare3:0x0
***
Reread of Rdba:0x0119bc11 (file 4, block 1686545) found same corrupted data
Wed June 03 17:56:02 2015
Ended Recovery at
Thread 1:logseq 1065, Block 3032, SCN 3361.3097685436
803 data blocks read, 2 data blocks written, 3030 redo blocks Read
Crash Recovery completed successfully
Wed June 03 17:56:02 2015
Thread 1 advanced to log sequence 1066
Thread 1 opened at log sequence 1066
Current log# 1 seq# 1066 mem# 0:d:\oracle\oradata\sjyykf\redo01. LOG
Successful open of Redo thread 1.
Wed June 03 17:56:02 2015
Smon:enabling Cache Recovery
Wed June 03 17:56:02 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_3728.trc:
Ora-00600:internal error code, arguments: [4000], [6], [], [], [], [], [], []
Wed June 03 17:56:03 2015
Errors in file d:\oracle\admin\sjyykf\udump\sjyykf_ora_3728.trc:
Ora-00704:bootstrap Process Failure
Ora-00704:bootstrap Process Failure
Ora-00600:internal error code, arguments: [4000], [6], [], [], [], [], [], []
Wed June 03 17:56:03 2015
Error 704 happened during DB Open, shutting down database
User:terminating instance due to error 704
Instance terminated by USER, PID = 3728
ORA-1092 signalled during:alter database open ...
Wed June 03 18:06:48 2015
Starting ORACLE instance (normal)
license_max_session = 0
license_sessions_warning = 0
SCN Scheme 2
Using log_archive_dest Parameter Default value
License_max_users = 0
SYS Auditing is disabled
Starting up ORACLE RDBMS version:9.2.0.1.0.
System parameters with Non-default values:
We can see that after the problem, some engineers tried to recover several times, including the offline datafile action. The ORA-00600 [4000] error occurred when the database was last opened more than once, and it was stuck on this error, causing the next step to fail. Because of this error, many manufacturers have failed to resolve the problem. Let's first look at the trace file contents of the error:
* * 2015-06-04 11:42:37.468
Ksedmp:internal or fatal error
Ora-00600:internal error code, arguments: [4000], [6], [], [], [], [], [], []
Current SQL statement for this session:
Select CTime, Mtime, stime from obj$ where obj# =: 1
-----Call Stack Trace-----
Calling call entry argument values in hex
Location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
_ksedmp+327 Callrel _ksedst+0
_ksfdmp.108+14 Callrel _ksedmp+0 3
_kgeriv+137 Callreg 00000000 49101d0 3
_kgeasi+189 callrel _kgeriv+0 49101d0 341219C FA0 1 490b1d4
_ktudba+541 callrel _kgeasi+0 49101d0 341219C FA0 2 1 4 6
__vinfreq__ktrget+1 Callrel _ktudba+0 6 490b69c 0 0
647
_ktrgtc+339 Callrel _ktrgcm.98+0
_kdsgrp.51+432 callrel _ktrgtc+0 342cad0 342ca24 599948
490b800 200 598608 599598
_kdsfbrcb+420 Callrel _kdsgrp.51+0 342CACC 0 342CACC
.. 1.15_3.filter.81+ callrel _kdsfbrcb+0 342CACC 342ce3c 0 1 0 0
949 342ca24 599f28 490b90c 0
_kpofrws+223 Call??? 00000000 3b79d128 5240BC 490bb74 1
.. 1.3_1.filter.30+7 callrel _kpofrws+0 343c11c 3b79eca0 5240BC
1 490bb74
_opifch+76 Callrel _opifch2+0 5 490BCAC
_opiodr+1398 Callreg 00000000 5 2 490C3BC
_rpidrus.43+153 Callrel _opiodr+0 5 2 490C3BC 5
_skgmstack+113 Callreg 00000000 490bf84
_rpidru+109 callrel _skgmstack+0 490bf9c 4910058 F618
490bf84
_rpiswu2+382 Callreg 00000000 490c2c0
_rpidrv+298 Callrel _rpiswu2+0
_rpifch+28 Callrel _rpidrv+0 5 5 490C3BC 8
_kqdpts+178 Callrel _rpifch+0 5 5
.. 1.56_6.filter.13+ Callrel _kqdpts+0
43
_kqlbplc+113 Callrel _kqrlfc+0
_kqlblfc+181 Callrel _kqlbplc+0 0
_adbdrv+10013 callrel _kqlblfc+0 0 490d590 490d590
.. 1.5_1.filter.29+6 Callrel _adbdrv+0
55
_opiosq0+2507 Callrel _opiexe+0 4 0 490d9c8
_kpooprx+236 Callrel _opiosq0+0 3 E 490da60 24
_kpoal8+561 Callrel _kpooprx+0 490E338 490E278 13 1 0 24
_opiodr+1398 Callreg 00000000 5E 490E334
_ttcpip+3024 Callreg 00000000 5E 490E334 0
_opitsk+1907 Callrel _ttcpip+0
_opiino+1480 callrel _opitsk+0 0 0 49164b0 441784C F4 0
_opiodr+1398 Callreg 00000000 3C 4 490fbd8
_opidrv+563 Callrel _opiodr+0 3C 4 490fbd8 0
_sou2o+25 Callrel _opidrv+0
_opimai+266 Callrel _sou2o+0
_oraclethreadstart@ Callrel _opimai+0
4+961
7C82482C Callreg 00000000
We can see that in the Open database, Oracle recursive SQL failed, causing the database to not open. This error is essentially due to the need to use undo to construct a consistent read when executing recursive SQL, but the undo is found to be corrupted. Because it is a non-archive environment, it is not possible to recover by conventional means.
In this error, [6] indicates the rollback segment number, which means that the undo block that accesses the error belongs to the 6th number rollback segment.
One might say that if you force the offline 6th rollback segment with suppressed parameters, can you open the database? It's not really going to work here.
Let's take a look at why the SQL-accessed block of data needs to construct consistency read?
Block Header dump:0x0040007a
Object ID on block? Y
seg/obj:0x12 csc:0xd21.e67ade95 itc:1 FLG:-Typ:1-DATA
fsl:0 fnx:0x0 ver:0x01
Itl Xid Uba Flag Lck SCN/FSC
0x01 0x0006.00e.0000ad3c 0x008005d8.1e96.3c--u-1 FSC 0x0000.e67ade96
Data_block_dump,data header at 0x10fc0044
===============
Tsiz:0x1fb8
Hsiz:0xea
pbl:0x10fc0044
bdba:0x0040007a
76543210
flag=--------
Ntab=1
nrow=108
Frre=-1
Fsbo=0xea
fseo=0x110
avsp=0x35c
tosp=0x35c
0xe:pti[0] nrow=108 offs=0
As you can see from the above information, there is indeed a block on the object id=18, the transaction state is U, although the transaction has been commit, but there is a row of locked records. Oracle's block cleanup is divided into 2 types: fast block cleanup and delay block cleanup. Obviously, this is the case for fast block cleanup, where most of the data block cleanup is quick cleanup because of the high efficiency.
If you want to put the database open, then in fact, through bbed modify the block ITL information, marking transactions for submission, while modifying lck and other information can be.
As I checked the main business data sheet space file through DBV, I found hundreds of bad blocks, and dBV stopped after detecting a certain location, which could be a problem with physical bad, as follows:
D:\ORACLE\ORADATA\SJYYKF>DBV File=sjyy_dat. DBF blocksize=8192
Dbverify:release 9.2.0.8.0-production on Mon Aug 10 14:31:35 2015
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Dbverify-verification starting:file = Sjyy_dat. Dbf
Page 332081 is influx-most likely media corrupt
***
Corrupt block relative dba:0x01051131 (file 4, block 332081)
Fractured block found during DBV:
Data in bad block-
Type:6 Format:2 rdba:0x01051131
Last Change Scn:0x0d21.3c252e20 seq:0x2 flg:0x04
Consistency value in tail:0x32840602
Check value in blocks header:0xcfd5, computed block CHECKSUM:0X1CA4
spare1:0x0, spare2:0x0, spare3:0x0
***
Page 361777 is influx-most likely media corrupt
***
Corrupt block relative dba:0x01058531 (file 4, block 361777)
Fractured block found during DBV:
Data in bad block-
Type:6 Format:2 rdba:0x01058531
Last Change scn:0x0d21.3c28cf0a seq:0x2 flg:0x04
Consistency value in tail:0x480a0601
Check value in blocks header:0xd46e, computed block checksum:0x8703
spare1:0x0, spare2:0x0, spare3:0x0
***
Page 387785 is marked corrupt
***
Corrupt block relative dba:0x0105eac9 (file 4, block 387785)
Bad check value found during DBV:
Data in bad block-
Type:6 Format:2 RDBA:0X0105EAC9
Last Change scn:0x0d21.b83e4972 seq:0x2 flg:0x04
Consistency value in tail:0x49720602
Check value in blocks header:0x49ea, computed block CHECKSUM:0XF2C2
spare1:0x0, spare2:0x0, spare3:0x0
***
。。。。。。。
。。。。。。。
Page 517425 is influx-most likely media corrupt
***
Corrupt block relative dba:0x0107e531 (file 4, block 517425)
Fractured block found during DBV:
Data in bad block-
Type:6 Format:2 rdba:0x0107e531
Last Change scn:0x0d21.b85cf83a seq:0x1 flg:0x04
Consistency value in tail:0xf3ca0602
Check value in blocks header:0x1010, computed block checksum:0xbf3
spare1:0x0, spare2:0x0, spare3:0x0
***
Dbv-00102:file I/O error on File (Sjyy_dat. DBF) during verification read operation (-2)
D:\oracle\oradata\sjyykf>
At the same time considering the amount of data is small, less than 30G, so directly through the ODU recovery, because the system check found only 1 bad blocks, the data dictionary is intact.
Note: The physical backup of the database is required, otherwise it will be disastrous if there is a problem.